ID: 3857859 • Letter: 4
Question
4. Given a training set of 50 million tuples with 25 attributes, each taking 4 bytes of space. One attribute is a class label with two distinct values, whereas each of the other attributes has 30 distinct values. You have only a laptop with 512 MB of main memory. Outline a method that constructs decision trees efficiently, and answer the following questions explicitly: (1) how many scans of the database does your algorithm take if the maximal depth of the decision tree derived is 5? (2) what is the maximum memory space your algorithm will use in your tree induction?
Explanation / Answer
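Here is an outline of a solution for the given criteria:
Method: The raw training set occupies 50,000,000 x 25 x 4 bytes = 5 GB, which is far more than the 512 MB of main memory, so the tree cannot be built by loading the data. A scalable approach such as RainForest avoids this. Tuples are read from disk one at a time (one tuple is 25 x 4 = 100 bytes), and for each tree node only an AVC-set is kept in memory: a table of counts for every (attribute, value, class label) combination. These counts are all that a split criterion such as information gain or the Gini index needs.
(1) A single scan of the database can fill the AVC-sets of all nodes on one level of the tree, provided those AVC-sets fit in memory together. The algorithm therefore needs one scan per level at which a split is chosen, so at most 5 scans for a tree of maximal depth 5.
(2) The largest AVC-set is the one for the root, where all 24 predictor attributes are still available: 24 attributes x 30 values x 2 classes = 1,440 counters. With 4-byte counters (enough, since 50 million is below 2^31) that is 1,440 x 4 = 5,760 bytes per node.
The memory in use during a scan is the sum of the AVC-sets for the level being processed, plus the 100-byte buffer for the current tuple. Assuming binary splits, the fifth level has at most 2^4 = 16 nodes, so the maximum is about 16 x 5,760 = 92,160 bytes (roughly 90 KB), far below 512 MB.
If multiway splits with up to 30 branches are used instead, a deep level can hold more nodes than fit in memory at once, and that level then has to be processed in batches at the cost of extra scans.
Multicore parallel processing can speed up the counting, but it does not reduce the memory requirement.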
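The tuple-at-a-time processing and the memory arithmetic can be made concrete with a small sketch. This is a minimal illustration of RainForest-style AVC-set counting, not a full tree-induction implementation. The names avc_set_bytes, build_avc_sets and route_to_node are hypothetical, the 4-byte counter size is an assumption, and the Python dictionary is used only for readability (a real implementation would use a fixed-size integer array to reach the stated byte counts).

```python
from collections import defaultdict

NUM_PREDICTORS = 24     # 25 attributes minus the class label
VALUES_PER_ATTR = 30    # distinct values per predictor attribute
NUM_CLASSES = 2         # distinct class labels
BYTES_PER_COUNTER = 4   # assumption: 4-byte integer counters


def avc_set_bytes(num_attrs=NUM_PREDICTORS):
    """Size in bytes of one node's AVC-set, stored as a dense array of counters."""
    return num_attrs * VALUES_PER_ATTR * NUM_CLASSES * BYTES_PER_COUNTER


def build_avc_sets(tuples, route_to_node):
    """Fill the AVC-sets for one tree level in a single pass over the data.

    tuples        -- iterable of (attrs, label), streamed one tuple at a time
    route_to_node -- hypothetical function mapping attrs to the id of the
                     node (on the current level) that the tuple falls into
    """
    avc = defaultdict(lambda: defaultdict(int))
    for attrs, label in tuples:
        counts = avc[route_to_node(attrs)]
        for attr_index, value in enumerate(attrs):
            # One counter per (attribute, value, class label) combination.
            counts[(attr_index, value, label)] += 1
    return avc


# Memory arithmetic from the answer.
raw_data_bytes = 50_000_000 * 25 * 4      # whole training set on disk
root_bytes = avc_set_bytes()              # AVC-set of the root node
level5_bytes = 2 ** 4 * avc_set_bytes()   # 16 nodes, assuming binary splits

# Toy usage: three tuples with two attributes, all routed to the root.
toy = [((0, 1), 0), ((0, 2), 1), ((0, 1), 0)]
toy_avc = build_avc_sets(toy, lambda attrs: "root")
```

Each call to build_avc_sets corresponds to one database scan, so a tree with splits on 5 levels calls it at most 5 times, with route_to_node updated after every level.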
Thank you.