A large number of insurance records are to be examined to develop a model for pr


ID: 3260206 • Letter: A

Question

A large number of insurance records are to be examined to develop a model for predicting fraudulent claims. Of the claims in the historical database, 1% were judged to be fraudulent (class 1).

A sample database is taken to develop a model, and oversampling is used to provide a balanced sample in light of the very low response rate. When applied to this sample database (total number of records, N = 800), the model correctly classifies 310 frauds and 270 non-frauds. It misses 90 frauds and incorrectly classifies 130 non-fraudulent records as frauds.

Assume the number of positive (fraudulent) records is held fixed at 400 and that the ratio of fraudulent to non-fraudulent records (positive to negative) in the original database is 1:99.

1. What is the adjusted misclassification rate (error rate) that should be in the original non-oversampled database?

2. What is the total number of false positive records that should be in the original non-oversampled database?

Explanation / Answer

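a) In the oversampled sample, 220 of the 800 records were misclassified, so the sample misclassification rate is 220/800 = 0.275 = 27.5%. This is not the rate to expect in the original database, because non-frauds are heavily underrepresented in the balanced sample.

With the 400 frauds held fixed and a fraud to non-fraud ratio of 1:99, the original database contains 400 × 99 = 39,600 non-frauds, for 40,000 records in total. The model misclassifies 130/400 = 32.5% of non-frauds and 90/400 = 22.5% of frauds, so the expected errors are 0.325 × 39,600 = 12,870 non-frauds and 90 frauds.

adjusted misclassification rate = (12,870 + 90)/40,000 = 12,960/40,000 = 0.324 = 32.4%

b) False positives are non-frauds classified as frauds. The sample contains 130 of them; the 90 other misclassified records are missed frauds, that is, false negatives. Scaled to the original database:

total number of false positive records = 0.325 × 39,600 = 12,870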

                         Frauds   Non-Frauds   Total
Correctly classified        310          270     580
Incorrectly classified       90          130     220
Total                       400          400     800
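The reweighting that the question asks for can be checked with a short script. This is only a sketch under the question's stated assumptions (frauds held fixed at 400, 99 non-frauds per fraud in the original database); the function name `adjust_for_oversampling` and its parameter names are hypothetical and chosen for illustration.

```python
def adjust_for_oversampling(tp, fn, tn, fp, neg_per_pos):
    """Rescale confusion-matrix counts from a balanced sample to the original class ratio.

    Positives (frauds) are kept fixed; negatives (non-frauds) are scaled so that
    there are `neg_per_pos` negatives for every positive.
    """
    n_pos = tp + fn                      # frauds in the sample (kept as-is)
    n_neg_sample = tn + fp               # non-frauds in the balanced sample
    n_neg_original = n_pos * neg_per_pos # non-frauds at the original ratio
    scale = n_neg_original / n_neg_sample

    fp_original = fp * scale             # expected false positives
    tn_original = tn * scale             # expected true negatives
    total = n_pos + n_neg_original
    error_rate = (fn + fp_original) / total

    return {
        "total_records": total,
        "false_positives": fp_original,
        "true_negatives": tn_original,
        "error_rate": error_rate,
    }


# Counts taken from the question: 310 frauds caught, 90 missed,
# 270 non-frauds correct, 130 non-frauds flagged as fraud.
result = adjust_for_oversampling(tp=310, fn=90, tn=270, fp=130, neg_per_pos=99)
print(f"Total records:       {result['total_records']}")
print(f"False positives:     {result['false_positives']:.0f}")
print(f"Adjusted error rate: {result['error_rate']:.1%}")
```

Because non-frauds make up 99% of the original database, the adjusted error rate is dominated by the model's 32.5% error rate on non-frauds rather than by the 27.5% rate seen in the balanced sample.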
