The author mentions that at the USDA their goal is to build a model that would p

Question

The author mentions that at the USDA their goal was to build a model that would predict the loan classification based on information about the loan, borrower, and property. Often, to maximize processing and results-generating efficiency, they use several algorithms together. Explain how the algorithms were used.

Article excerpt:

[...] preceding three months have a 23 percent probability of closing their accounts next month.

Speed

[...] processes associated with predictive modeling emphasize speed: the time it takes to train a model and make predictions about new cases. The k-nearest-neighbor algorithm's zero training time makes it the fastest trainer, but the model makes predictions extremely slowly. The three other major algorithms make predictions just as quickly as each other (and much more quickly than k-nearest-neighbor), but vary significantly in their training time, which lengthens in proportion to the number of passes each must make through the training data. Naive Bayes trains fastest of the three because it takes only one pass through the data. Decision trees vary but typically require 20 to 40 data passes. Neural networks may need to pass over the data 100 to 1,000 times or more.

PUTTING ALGORITHMS TO WORK

At the USDA our goal was to build a model that would predict the loan classification based on information about the loan, borrower, and property. Often, to maximize our processing and results-generating efficiency, we use several algorithms together. Because of Naive Bayes' speed and interpretability, we use it for initial explorations, then follow up with decision tree or neural network models [...] increase [...]

Self-taught tools

To build a predictive model, a data mining tool needs examples: data that contains known outcomes. The tool will use these examples in a process (variously named learning, induction, or training) to teach itself how to predict the outcome of a given process or transaction. The column of data that contains the known outcomes (the value we eventually hope to predict) also has various names: the dependent, target, label, or output variable. Finally, all other variables are variously called features, attributes, or the independent or input variables. Data mining's eclectic nature fostered this inconsistency in naming: the field encompasses contributions from statistics, artificial intelligence, and database management; each field has chosen different names for the same concept.

The dependent or output variable we used for the USDA loan classification model has five values: problemless, substandard, loss, unclassified, and not available. Approximately 80 percent of loans fell into the problemless category. For each of the 12,000 mortgages in the sample, we knew in advance and included the correct loan classification.

Data mining consists of a cycle of generating, testing, and evaluating many models. The data mining cycle for our USDA project illustrates this process and highlights some common problems that modelers face in mining data.

Building models and a test database

We built the models using two thirds of the data (8,000 rows) and set aside the remaining data as an independent data set for testing the models. Testing reveals how well a model predicts the target variable, in this case loan classification. During testing, we apply the model to the test data and predict the loan classification for each borrower. Because we also know the actual loan classification, we can compare the predicted value to the actual value for all 4,000 cases. From this data we can easily compute an accuracy score, the predictive accuracy.

The first model we built performed poorly, giving a pre[...]

Sidebar: Descriptive Modeling Techniques

Most descriptive modeling tools use one or more of the following techniques:

Clustering. A descriptive technique that groups similar entities and allocates dissimilar entities to different groups, clustering can find groups [...] patients with similar profiles, and [...] Clustering techniques include a special type of neural net called a Kohonen net, as well as k-means and [...] requires using a distance measure, like the nearest [...] completely on the distance measure used, the number of ways you can cluster the data can be as high as the number of data miners doing the clustering. Thus clustering always requires significant involvement from a business or domain expert who must judge whether the resulting clusters are useful.

Association and sequencing. Using these techniques can help you uncover customer buying patterns you can use to structure promotions, [...] Association helps you understand what products or services customers tend to purchase at the same time, while sequencing reveals which products customers [...] later as follow-up purchases. [...] basket analysis, these techniques generate descriptive models that discover rules for drawing relationships between the purchase of one product and the purchase of one or more others.

To read more about descriptive modeling techniques, check the Data Mining section, Technology [...] of Exclusive Ore's home page at http://www.xore.com.

Source: IT Pro, November/December 1999, p. 19

Explanation / Answer

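At the USDA, the models were built with self-taught tools: a predictive model was constructed from data with known outcomes, using data from the USDA's data warehouse to predict the loan classification from information about the loan, borrower, and property. Several algorithms were used together. Because Naïve Bayes is fast and interpretable, it was used for initial explorations, and decision tree or neural network models were built as a follow-up. Predictive models of this kind can also address other scenarios, such as the likelihood that a long-distance customer will switch to a competitor. The article also covers descriptive models, mainly clustering and association.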
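
The excerpt in the question describes building models on two thirds of the 12,000 mortgages (8,000 rows) and scoring them on the remaining 4,000 by comparing predicted and actual loan classifications. The following is a minimal sketch of that holdout procedure, not the authors' actual tooling: the rows are synthetic stand-ins, the function names `holdout_split` and `accuracy` are hypothetical, and the "model" is only a majority-class baseline used to show how the accuracy score is computed.

```python
import random
from collections import Counter


def holdout_split(rows, train_fraction=2 / 3, seed=0):
    """Shuffle a copy of the rows and split it into training and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(predicted, actual):
    """Fraction of cases whose predicted label equals the actual label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Synthetic stand-in data (an assumption, not the USDA sample):
# 12,000 rows, about 80 percent labeled "problemless".
rows = [
    ({"loan_id": i}, "problemless" if i % 5 else "substandard")
    for i in range(12000)
]

train, test = holdout_split(rows)

# Baseline "model": always predict the most common training label.
majority = Counter(label for _, label in train).most_common(1)[0][0]
predictions = [majority for _ in test]
score = accuracy(predictions, [label for _, label in test])

print(len(train), len(test), majority, round(score, 3))
```

Because roughly 80 percent of the loans are problemless, a baseline that always answers "problemless" already scores close to 0.8, so any real model has to be judged against that figure rather than against zero.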

The next step was implementing the algorithms. Classification models were implemented with algorithms such as k-nearest-neighbor and Naïve Bayes. In k-nearest-neighbor, the data itself serves as the model: for each new case, the method finds the group of most similar cases and uses their predominant outcome as the predicted value. Naïve Bayes uses conditional probabilities calculated from observed frequencies.
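
The two classifiers described above can be sketched in a few lines each. This is an illustrative sketch under stated assumptions, not the USDA implementation: the feature names, the toy rows, Euclidean distance for k-nearest-neighbor, and the add-one smoothing in Naïve Bayes are all assumptions introduced here. Note that `knn_predict` does no training at all (the data is the model), while `train_naive_bayes` makes a single counting pass over the rows.

```python
import math
from collections import Counter, defaultdict


def knn_predict(train, query, k=3):
    """Return the predominant label among the k training cases nearest to query."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def train_naive_bayes(rows):
    """Count label and per-label feature-value frequencies in one pass."""
    label_counts = Counter()
    value_counts = defaultdict(Counter)  # (label, feature) -> value frequencies
    feature_values = defaultdict(set)    # feature -> distinct values seen
    for features, label in rows:
        label_counts[label] += 1
        for name, value in features.items():
            value_counts[(label, name)][value] += 1
            feature_values[name].add(value)
    return label_counts, value_counts, feature_values


def nb_predict(model, features):
    """Pick the label with the highest log prior plus summed log likelihoods."""
    label_counts, value_counts, feature_values = model
    total = sum(label_counts.values())
    scores = {}
    for label, count in label_counts.items():
        log_score = math.log(count / total)
        for name, value in features.items():
            seen = value_counts[(label, name)][value]
            vocab = len(feature_values[name])
            # Add-one smoothing (an assumption) avoids log(0) for unseen values.
            log_score += math.log((seen + 1) / (count + vocab))
        scores[label] = log_score
    return max(scores, key=scores.get)


# Hypothetical numeric cases for k-nearest-neighbor.
points = [
    ((1.0, 1.0), "problemless"),
    ((1.2, 0.9), "problemless"),
    ((0.9, 1.1), "problemless"),
    ((5.0, 5.0), "substandard"),
    ((5.2, 4.8), "substandard"),
]
print(knn_predict(points, (1.1, 1.0)))

# Hypothetical categorical cases for Naive Bayes.
loans = [
    ({"late_payments": "none", "equity": "high"}, "problemless"),
    ({"late_payments": "none", "equity": "low"}, "problemless"),
    ({"late_payments": "none", "equity": "high"}, "problemless"),
    ({"late_payments": "many", "equity": "low"}, "substandard"),
    ({"late_payments": "many", "equity": "high"}, "substandard"),
]
model = train_naive_bayes(loans)
print(nb_predict(model, {"late_payments": "many", "equity": "low"}))
```

The sort inside `knn_predict` touches every stored case on every prediction, which is why the excerpt calls k-nearest-neighbor the fastest trainer but a very slow predictor.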

Decision trees and neural networks, which can handle regression as well as classification, were used for the follow-up models; here the target, loan classification, is categorical. A decision tree uses a tree-like graphical representation built from rules of the form "if condition then outcome". Neural networks are often described as black boxes and are based on an early model of human brain function.
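
To make the "if condition then outcome" idea concrete, here is a small hand-written tree and the loop that walks it. This is only a sketch: the tree is not learned from data, and the field names (`months_delinquent`, `loan_to_value`) and thresholds are hypothetical values chosen for illustration, not rules from the USDA model.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    outcome: str


@dataclass
class Node:
    description: str
    condition: Callable[[dict], bool]
    if_true: Union["Node", Leaf]
    if_false: Union["Node", Leaf]


def classify(tree, case):
    """Follow conditions from the root to a leaf; return the outcome and rule path."""
    path = []
    while isinstance(tree, Node):
        holds = tree.condition(case)
        path.append(tree.description if holds else f"not ({tree.description})")
        tree = tree.if_true if holds else tree.if_false
    return tree.outcome, path


# Hypothetical rules: each internal node is one "if condition" test.
tree = Node(
    "months_delinquent >= 3",
    lambda c: c["months_delinquent"] >= 3,
    if_true=Node(
        "loan_to_value > 0.9",
        lambda c: c["loan_to_value"] > 0.9,
        if_true=Leaf("loss"),
        if_false=Leaf("substandard"),
    ),
    if_false=Leaf("problemless"),
)

outcome, path = classify(tree, {"months_delinquent": 4, "loan_to_value": 0.95})
print(outcome, "because", " and ".join(path))
```

Returning the path alongside the outcome shows why decision trees are easier to interpret than neural networks: every prediction can be read back as a chain of conditions.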
