
Question

4. You are given a training set of 50 million tuples with 25 attributes, each taking 4 bytes of space. One attribute is a class label with two distinct values, whereas each of the other attributes has 30 distinct values. You have only a laptop with 512 MB of main memory. Outline a method that constructs decision trees efficiently, and answer the following questions explicitly: (1) How many scans of the database does your algorithm take if the maximal depth of the derived decision tree is 5? (2) What is the maximum memory space your algorithm will use during tree induction?

Explanation / Answer

Here is a solution outline for the given criteria:

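(1) The full data set is 50 million × 100 bytes = 5 GB, far more than the 512 MB of main memory, so it has to be read from disk in sequential scans, one tuple at a time. During a scan we only update class counts: each non-class attribute has 30 distinct values and the class label has 2, so an attribute needs 30 × 2 = 60 counters per tree node. If the counts for all nodes on one level of the tree are collected in the same pass, each level costs at most one scan of all 50 million tuples, so a tree of maximal depth 5 takes at most 5 scans.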
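The idea of one pass per tree level can be sketched in a few lines of Python. This is only an illustration under assumptions the answer does not spell out: that the tree is grown level by level and that each pass fills a table of (attribute value, class) counts for every node on the current level, in the style of the RainForest family of algorithms. The names `ScanCounter`, `count_level`, and `route` are hypothetical, and the four-row data set is a toy stand-in for the real table on disk.

```python
from collections import defaultdict


class ScanCounter:
    """Toy stand-in for the on-disk table; counts how many full passes are made."""

    def __init__(self, rows):
        self.rows = rows
        self.scans = 0

    def __call__(self):
        self.scans += 1
        return iter(self.rows)


def count_level(scan, route, n_attrs):
    """One sequential pass that fills the count tables of every node on one level.

    scan  : callable returning a fresh iterator over (attributes, label) pairs
    route : callable mapping an attribute tuple to the id of the node it falls
            into on the current level, or None if it ends in a finished leaf
    """
    # counts[node][attribute index][(value, label)] -> number of tuples
    counts = defaultdict(lambda: [defaultdict(int) for _ in range(n_attrs)])
    for attrs, label in scan():  # only one tuple is held in memory at a time
        node = route(attrs)
        if node is None:
            continue
        tables = counts[node]
        for i, value in enumerate(attrs):
            tables[i][(value, label)] += 1
    return counts


# Toy data: two attributes and a binary class label.
source = ScanCounter([((0, 1), 0), ((0, 2), 1), ((1, 1), 0), ((1, 2), 1)])

# Level 1: every tuple belongs to the root.
level1 = count_level(source, lambda attrs: "root", n_attrs=2)

# Level 2: assume the root was split on attribute 0, one child per value.
level2 = count_level(source, lambda attrs: ("root", attrs[0]), n_attrs=2)

print(source.scans)
```

Each call to `count_level` reads the data exactly once, however many nodes the level contains, which is why the number of scans grows with the depth of the tree and not with the number of nodes.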

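(2) The scan processes a single tuple at a time, so the input buffer only has to hold the 25 attributes of one tuple.
Each attribute takes 4 bytes, so one tuple needs 25 × 4 = 100 bytes.
The rest of the memory goes to the per-node tables that count how often each (attribute value, class) pair occurs: 24 attributes × 30 values × 2 classes = 1,440 counters per node, or 5,760 bytes with 4-byte counters.
The maximum is this figure multiplied by the number of nodes whose counts are collected in the same scan, which depends on whether the splits are binary or multiway.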
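The arithmetic can be checked with a short script. It uses the figures from the question and adds assumptions the answer does not state: that the algorithm keeps, for each tree node being expanded, a table of 4-byte counters indexed by (attribute, value, class), and that every node on a level is fully populated and still considers all 24 non-class attributes. Under those assumptions the results are upper bounds. The constant names and the `level_bytes` helper are illustrative.

```python
N_TUPLES = 50_000_000
N_ATTRS = 25          # 24 predictor attributes plus the class label
BYTES_PER_ATTR = 4
N_VALUES = 30         # distinct values per predictor attribute
N_CLASSES = 2
COUNTER_BYTES = 4     # assumption: each count is stored as a 4-byte integer

tuple_bytes = N_ATTRS * BYTES_PER_ATTR        # buffer for one tuple
dataset_bytes = N_TUPLES * tuple_bytes        # whole training set on disk
memory_bytes = 512 * 1024**2                  # 512 MB of main memory

# Count table for one tree node: one counter per (attribute, value, class).
counters_per_node = (N_ATTRS - 1) * N_VALUES * N_CLASSES
node_table_bytes = counters_per_node * COUNTER_BYTES


def level_bytes(fanout, level):
    """Upper bound on table memory when all nodes on `level` are counted in
    one scan (root is level 1, every split produces `fanout` children)."""
    return fanout ** (level - 1) * node_table_bytes


print("one tuple:", tuple_bytes, "bytes")
print("data set fits in memory:", dataset_bytes <= memory_bytes)
print("count table per node:", node_table_bytes, "bytes")
print("binary splits, level 5:", level_bytes(2, 5), "bytes")
print("30-way splits, level 4:", level_bytes(30, 4), "bytes")
print("30-way splits, level 5:", level_bytes(30, 5), "bytes")
```

With binary splits the tables stay tiny next to 512 MB. With 30-way splits the bound for a fully populated fifth level exceeds the available memory, so in that case the nodes of that level would have to be counted in batches over more than one scan.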

Using multicore parallel processing can speed up the database scans, but it does not reduce the memory that is needed.
