


Question

Email spam filters are based on statistical analysis. Consider a simple spam filter that obtains a sample of size n words from an email. It then compares the sample to a list of questionable words. If more than 75% of the sample appears in the list, the email is determined to be spam. Below are two sampling implementations.

(a) Sampling Method 1: Put all the words in a "bin" and randomly select n words. Or, number all of the words in the email and then use a random number generator to select words. Every word or every group of words of size n has an equally likely chance of being selected.

True or False: The above sampling method is an example of simple random sampling.

(b) Sampling Method 2: Separate the words in the email into two "bins" or strata, based on word length. Consider small words to consist of 3 or fewer letters and big words to consist of 4 or more letters. Pick a simple random sample from each bin corresponding to the proportion of small and big words. For example, if 40% of the words in the email are small, then randomly choose 0.4 * n of the small words and 0.6 * n of the big words.

True or False: The above sampling method is an example of stratified sampling.

Explanation / Answer

a) True. Every word has the same probability of being selected, and every group of n words is equally likely to be the sample. That is the definition of simple random sampling.
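To make Sampling Method 1 concrete, here is a minimal Python sketch. The function names `simple_random_sample` and `is_spam`, the example email, and the list of questionable words are all made up for illustration and are not part of the question. It assumes words are selected by position, without replacement.

```python
import random

def simple_random_sample(words, n, seed=None):
    """Draw n words without replacement.

    random.Random.sample picks positions uniformly, so every group of
    n word positions is equally likely to be chosen.
    """
    rng = random.Random(seed)
    return rng.sample(words, n)

def is_spam(sample, questionable, threshold=0.75):
    """Flag the email if MORE than `threshold` of the sample is on the list."""
    hits = sum(1 for w in sample if w.lower() in questionable)
    return hits / len(sample) > threshold

# Hypothetical data, for illustration only.
email = "win free cash now click here to claim your free prize today".split()
questionable = {"win", "free", "cash", "click", "claim", "prize"}

sample = simple_random_sample(email, n=6, seed=0)
print(sample, is_spam(sample, questionable))
```

Note that the rule is strict: a sample with exactly 75% questionable words is not flagged, because the question says "more than 75%".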

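b) True. The words are divided into two strata (small and big) by length, and a simple random sample is drawn from each stratum in proportion to its share of the email. That is stratified sampling with proportional allocation.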
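Sampling Method 2 can be sketched the same way. This is only an illustration under a few assumptions: the function name `stratified_sample` and the example email are hypothetical, and the question does not say how to handle a fractional 0.4 * n, so the sketch rounds the small-word share and gives the remainder to the big words.

```python
import random

def stratified_sample(words, n, seed=None):
    """Proportionally allocated stratified sample by word length.

    Strata follow the question: small = 3 or fewer letters,
    big = 4 or more letters.
    """
    if n > len(words):
        raise ValueError("n cannot exceed the number of words")
    rng = random.Random(seed)
    small = [w for w in words if len(w) <= 3]
    big = [w for w in words if len(w) >= 4]
    # Assumption: round the small-word allocation; the rest goes to big words.
    n_small = round(n * len(small) / len(words))
    n_big = n - n_small
    # Simple random sample within each stratum.
    return rng.sample(small, n_small), rng.sample(big, n_big)

# Hypothetical email: 4 of its 10 words (40%) are small.
email = "you can win big cash prizes today just click here".split()
small_part, big_part = stratified_sample(email, n=5, seed=1)
print(small_part, big_part)
```

With n = 5 and 40% small words, the sample always contains 2 small words and 3 big words. A simple random sample of the whole email would not guarantee that split, which is the practical difference between the two methods.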
