


Question

DS 220 Midterm

1. (a) You declared a table with one phone-number column. Now you have customers who want to use three phone numbers in their account settings. What do you do if the data is stored in (a) a relational database and (b) a document database such as MongoDB?

(b) You want to enable atomic updates to the phone numbers and addresses of users in MongoDB. What should you do in your design so that you can support that?

(c) What is eventual consistency? What is the advantage of having eventual consistency?

(d) Your most frequent query is as follows: "Return all comments made in response to comment#, posted by a student". Show how you would store your data and what you would index, using which type of indexing (B+-tree or hash tree).

2. (a) Provide three clear examples using two transactions, the statements of the two transactions, and a timing diagram to show a write-write conflict, a read-write conflict, and a write-read conflict where the result is not equal to a serializable schedule. Show the initial value of the data and the final value in the three cases.

(b) Data warehouses use denormalized schema design. Show a small example that uses a denormalized schema. What are the advantages and disadvantages of denormalization?

(c) What does information gain capture? When do you stop expanding a branch in a decision tree?

(d) Consider the database D depicted in Table 1, containing five transactions, each containing several items. Consider minsup = 60% and minconf = 80%.

Table 1: Database D of transactions to be analyzed.

TID     Items
T100    {B, O, N, E, C, O}
T200    {B, O, N, E, C, A}
T300    {C, A, N, E, C, A}
T400    {F, A, N, E, C, A}
T500    {F, A, C, A}

(i) Find all frequent 4-itemsets and 3-itemsets in the database.
(ii) Find association rules from these itemsets that satisfy the support and confidence thresholds stated above.

(e) For what type of problems do you enable replication and for which do you enable sharding? Identify situations when replication and sharding can reduce downtime (state two scenarios, one for each).

(f) When do you use a relational database, a key-value database, and a document database? Clearly identify the characteristics of the data, schema, and queries that you will consider to make such a choice.

3. (a) What is the difference between ACID and BASE?

(b) What is a journal in MongoDB? Why do we need it?

Explanation / Answer

a) Here the phone number is a multivalued attribute, and a relational table in first normal form does not allow multivalued attributes.

To solve this, we create a new table that holds the multivalued phone-number attribute together with the primary key of the table in which the phone number was originally defined. The new table therefore has only two attributes. The non-phone attribute is a foreign key that refers to the primary key of the original table. The primary key of the new table is a composite key made up of both of its attributes.
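The two-table design described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the table names `customer` and `customer_phone`, the column names, and the sample phone numbers are hypothetical and do not come from the exam or the answer.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Original table (hypothetical names): one row per customer.
conn.execute(
    "CREATE TABLE customer ("
    "  customer_id INTEGER PRIMARY KEY,"
    "  name TEXT NOT NULL)"
)

# New table for the multivalued attribute: exactly two columns,
# a foreign key to customer and a composite primary key over both.
conn.execute(
    "CREATE TABLE customer_phone ("
    "  customer_id INTEGER NOT NULL REFERENCES customer(customer_id),"
    "  phone_number TEXT NOT NULL,"
    "  PRIMARY KEY (customer_id, phone_number))"
)

conn.execute("INSERT INTO customer VALUES (1, 'Alice')")
conn.executemany(
    "INSERT INTO customer_phone VALUES (?, ?)",
    [(1, "555-0100"), (1, "555-0101"), (1, "555-0102")],
)

# All phone numbers of customer 1, one row per number.
phones = [
    row[0]
    for row in conn.execute(
        "SELECT phone_number FROM customer_phone "
        "WHERE customer_id = ? ORDER BY phone_number",
        (1,),
    )
]
print(phones)
```

With this layout a customer can have any number of phone numbers, the composite key rejects the same number being stored twice for one customer, and the foreign key rejects numbers that belong to no customer.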

b) A NoSQL document database such as MongoDB allows multivalued attributes, so the three phone numbers can be stored as an array field inside the customer's document.
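As a rough sketch of what such a document might look like, the snippet below models it as a plain Python dictionary rather than talking to a real MongoDB server. The field names (`_id`, `name`, `phone_numbers`), the sample values, and the helper `add_phone` are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical customer document: the multivalued attribute is an
# array embedded directly in the document, so no second table is needed.
customer = {
    "_id": 1,
    "name": "Alice",
    "phone_numbers": ["555-0100"],
}


def add_phone(doc, number):
    """Append a number to the embedded array unless it is already present."""
    if number not in doc["phone_numbers"]:
        doc["phone_numbers"].append(number)
    return doc


# The customer registers two more numbers, then repeats one by mistake.
add_phone(customer, "555-0101")
add_phone(customer, "555-0102")
add_phone(customer, "555-0101")  # duplicate, ignored

print(json.dumps(customer, indent=2))
```

In MongoDB itself, the `$addToSet` update operator gives similar add-if-absent behavior on an array field. Keeping all the numbers inside one document also means they are read and written together with the rest of the customer record.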