Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

In this assignment, you will have a chance to implement the hash-join algorithm

ID: 3689783 • Letter: I

Question

In this assignment, you will have a chance to implement the hash-join algorithm for a DBMS. You can use C, C++, Java or any other high-level programming language to write the program. However, your program should include sufficient comments to make it readable. You need to turn in a brief report/description about your program design and implementation (e.g., high-level program diagram and data/file structures) and program usage; your program source code; proof of compilation (e.g., the screen snapshot of a successful compilation); and sample execution outputs. Please assemble all the above required contents in a single Word or PDF file for your submission. The program specificiation is given as follows. Let R_1 (a_1, a_2, a_3) and R_2 (b_1, b_2, b_3, b_4) be two relations with all integer attributes. Tuples in these two relations are sequentially stored in two data files, respectively. Use the hash-join algorithm to implement a join (equijoin) of R_1 and R_2. Assume that the hash function is f(k) = k mod N, where N is the number of buckets allowed in your hash structure/table. Your program should allow a user to choose the joining attributes from the two relations, i.e., performing R_1 R_1 middot a_i = R_2 middot b_j R_2 for any chosen pair of a_i and b_j, where a_i is the i-th attribute in R_1 (1 lessthanorequalto i lessthanorequalto 3) and b_j is the j-th attribute in R_2 (1 lessthanorequalto j lessthanorequalto 4). For example, a user may want to perform R_1 R_1 middot a_2 = R_2 middot b_3 R_2. Your program should display the join result and output the selectivity of the join. You may request a user to interactively input the necessary parameters, such as the data file names for R_1 and R_2, the number of tuples in each relation, and the joining attributes (e.g., 1 for the 1st attribute, 3 for the 3rd attribute). Use your program to perform several joins for different relation instances of R_1 and R_2.

Explanation / Answer

Hash Join algorithm

for each tuple t in T do
let h = hash on join attributes t(b)
place t in hash table Th in bucket keyed by hash value h
for each tuple x in X do
let h = hash on join attributes x(a)
if h indicates a nonempty bucket (B) of hash table Th
if h matches any t in B
concatenate x and t
place relation in R

Tables to be joined using hash join algorithm

Code in java

import java.util.*;

public class HashJoinDBMS {

public static void main(String[] args) {
String[][] tab1 = {{"27", "John"}, {"18", "Alice"}, {"28", "Gloria"},
{"18", "Patrick"}, {"28", "Alice"}};

String[][] tab2 = {{"John", "Willy"}, {"John", "Spidy"},
{"Alice", "Gummy"}, {"Alice", "Zompy"}, {"Gloria", "Betty"},
{"Bob", "foo"}};

HashJoinDBMS(tab1, 1, tab2, 0).stream()
.forEach(s -> System.out.println(Arrays.deepToString(s)));
}

static List<String[][]> HashJoinDBMS(String[][] rec1, int id1,
String[][] rec2, int id2) {

List<String[][]> res = new ArrayList<>();
m<String,List<String[]>> m=new HashMap<>();

for (String[] rec : rec1) {
List<String[]> v = m.getOrDefault(rec[id1], new ArrayList<>());
v.add(rec);
m.put(rec[id1], v);
}

for (String[] rec : rec2) {
List<String[]> list = m.get(rec[id2]);
if (list != null) {
list.stream().forEach(s -> {
res.add(new String[][]{s, rec});
});
}
}

return res;
}
}

Age Name 27 John 18 Alice 28 Gloria 18 Patrick 28 Alice Name Nemesis John Willy John Spidy Alice Gummy Alice Zompy Gloria Betty
Hire Me For All Your Tutoring Needs
Integrity-first tutoring: clear explanations, guidance, and feedback.
Drop an Email at
drjack9650@gmail.com
Chat Now And Get Quote