Question
1. Discuss why data partitioning is needed in data mining.
2. Based on the logistic regression concepts, discuss whether this particular logistic regression model should be used to classify "Loan Default", based on the derived model. Then, discuss the general usefulness of logistic regression for classification purposes.
3. Based on the k-Nearest Neighbors concepts, discuss whether this particular k-Nearest Neighbors model should be used to classify "Loan Default", based on the derived model. Then, discuss the general usefulness of k-Nearest Neighbors for classification purposes.
Explanation / Answer
Data partitioning is the formal process of determining which data subjects, data occurrence groups, and data characteristics are needed at each data site. It is an orderly process for allocating data to data sites, carried out within a single common data architecture.
It greatly facilitates the efficient and effective management of a highly available relational data warehouse.
The single biggest benefit of a data partitioning approach is easy yet efficient maintenance. As an organization grows, so does the data in its database. Critical data must stay highly available, yet the window allowed for database maintenance must stay small. Data partitioning meets this need in a very large business organization: the database decomposes large tables into smaller partitions, each of which can be managed on its own. Data partitioning also results in faster data loading, easier monitoring of ageing data, and more efficient data retrieval.
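The idea of decomposing a large table into smaller partitions can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular database implements partitioning: the row layout (`id`, `created`, `amount`), the choice of a monthly partition key, and the helper names `partition_by_month`, `drop_older_than`, and `query_month` are all hypothetical.

```python
from collections import defaultdict
from datetime import date


def partition_by_month(rows):
    """Group rows into partitions keyed by the (year, month) of 'created'."""
    partitions = defaultdict(list)
    for row in rows:
        key = (row["created"].year, row["created"].month)
        partitions[key].append(row)
    return dict(partitions)


def drop_older_than(partitions, cutoff):
    """Discard whole partitions whose (year, month) key is before cutoff.

    Ageing data is removed one partition at a time, without scanning rows.
    """
    return {key: rows for key, rows in partitions.items() if key >= cutoff}


def query_month(partitions, year, month):
    """Read only the partition that can contain the requested month."""
    return partitions.get((year, month), [])


# Hypothetical sample data standing in for a large table.
rows = [
    {"id": 1, "created": date(2023, 1, 15), "amount": 100},
    {"id": 2, "created": date(2023, 1, 20), "amount": 250},
    {"id": 3, "created": date(2023, 2, 3), "amount": 75},
    {"id": 4, "created": date(2023, 3, 9), "amount": 310},
]

partitions = partition_by_month(rows)
recent = drop_older_than(partitions, (2023, 2))
january = query_month(partitions, 2023, 1)
```

In this sketch, retiring old data means dropping a dictionary key rather than filtering every row, and a query for one month touches only that month's partition, which mirrors the maintenance and retrieval benefits described above.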