Question
Suppose you have the following collection of SPAM and HAM emails:
SPAM {buy, car, nigeria, profit}
SPAM {money, profit, home}
SPAM {nigeria, bank, check, wire}
HAM {money, bank, home, car}
HAM {home, fly, nigeria}
We’ll assume that the probability of particular words appearing in a message are independent given the category. How would a Bayesian Spam Filter classify the following emails if we assume that we classify a message as SPAM if p(SPAM | message) > p(HAM | message) and classify the message as HAM otherwise?
(a) message = {home, money}
(b) message = {nigeria, bank}
Explanation / Answer
A Bayesian spam filter can look at many characteristics of a message: the words or sentences in its body, its addresses (senders and message paths, for example), other aspects such as HTML/CSS code (colors and other formatting), and word pairs, phrases and meta information (where a particular phrase appears).
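In our question only the words of each message are given, so we apply Bayes' rule with the word probabilities treated as independent given the category:
p(SPAM | message) = p(message | SPAM) p(SPAM) / [p(message | SPAM) p(SPAM) + p(message | HAM) p(HAM)]
The denominator is the same for p(HAM | message), so it is enough to compare p(message | SPAM) p(SPAM) with p(message | HAM) p(HAM). From the collection, p(SPAM) = 3/5 and p(HAM) = 2/5. We assume p(word | category) is estimated as the fraction of that category's emails that contain the word.
(a) message = {home, money}
home appears in 1 of 3 SPAM emails and 2 of 2 HAM emails; money appears in 1 of 3 SPAM emails and 1 of 2 HAM emails.
p(message | SPAM) p(SPAM) = (1/3)(1/3)(3/5) = 1/15
p(message | HAM) p(HAM) = (1)(1/2)(2/5) = 1/5
p(SPAM | message) = (1/15) / (1/15 + 1/5) = 1/4
Since 1/4 < 3/4 = p(HAM | message), the message is categorized as HAM, not SPAM. Both words appear in SPAM and in HAM, but they are more frequent in HAM.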
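(b) message = {nigeria, bank}
Using p(SPAM | message) = p(message | SPAM) p(SPAM) / [p(message | SPAM) p(SPAM) + p(message | HAM) p(HAM)] with p(SPAM) = 3/5 and p(HAM) = 2/5:
nigeria appears in 2 of 3 SPAM emails and 1 of 2 HAM emails; bank appears in 1 of 3 SPAM emails and 1 of 2 HAM emails.
p(message | SPAM) p(SPAM) = (2/3)(1/3)(3/5) = 2/15
p(message | HAM) p(HAM) = (1/2)(1/2)(2/5) = 1/10
p(SPAM | message) = (2/15) / (2/15 + 1/10) = 4/7
Since 4/7 > 3/7 = p(HAM | message), this message is categorized as SPAM. nigeria occurs more often in SPAM than in HAM, and together with the larger prior p(SPAM) this outweighs bank being more frequent in HAM.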
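The arithmetic for both messages can be checked with a short script. This is only a minimal sketch, not a production filter: it assumes p(word | category) is the fraction of that category's emails containing the word, uses no smoothing, and multiplies only over the words present in the message. The function names word_prob, score and classify are made up for this illustration. Exact fractions are used so the results can be compared by hand.

```python
from fractions import Fraction

# Training data copied from the question.
SPAM = [
    {"buy", "car", "nigeria", "profit"},
    {"money", "profit", "home"},
    {"nigeria", "bank", "check", "wire"},
]
HAM = [
    {"money", "bank", "home", "car"},
    {"home", "fly", "nigeria"},
]


def word_prob(word, emails):
    """Fraction of emails in the class that contain the word (assumed estimate, no smoothing)."""
    return Fraction(sum(word in e for e in emails), len(emails))


def score(message, emails, total):
    """Unnormalized posterior: p(class) times the product of p(word | class)."""
    result = Fraction(len(emails), total)
    for word in message:
        result *= word_prob(word, emails)
    return result


def classify(message):
    """Return (label, spam_score, ham_score); SPAM only if its score is strictly larger."""
    total = len(SPAM) + len(HAM)
    spam = score(message, SPAM, total)
    ham = score(message, HAM, total)
    label = "SPAM" if spam > ham else "HAM"
    return label, spam, ham


for name, msg in [("a", {"home", "money"}), ("b", {"nigeria", "bank"})]:
    label, spam, ham = classify(msg)
    posterior = spam / (spam + ham)  # p(SPAM | message) after normalizing
    print(f"({name}) spam={spam} ham={ham} p(SPAM|message)={posterior} -> {label}")
```

Under these assumptions the script reports p(SPAM | message) = 1/4 for (a), giving HAM, and 4/7 for (b), giving SPAM. A real filter would normally add smoothing so that a word never seen in one category does not force its score to zero.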