Graduation Year

2004

Document Type

Thesis

Degree

M.S.C.S.

Degree Granting Department

Computer Science

Major Professor

Eugene Fink, Ph.D.

Committee Member

Dewey Rundus, Ph.D.

Committee Member

Alan Hevner, Ph.D.

Keywords

spam filter, machine learning, feature extraction

Abstract

The problem of junk mail, also called spam, has reached epic proportions and various efforts are underway to fight spam. Junk mail classification using machine learning techniques is a key method to fight spam. We have devised a machine learning algorithm where features are created from individual sentences in the subject and body of a message by forming all possible word-pairings from a sentence. Weights are assigned to the features based on the strength of their predictive capabilities for spam/legitimate determination. The predictive capabilities are estimated by the frequency of occurrence of the feature in spam/legitimate collections as well as by application of heuristic rules. During classification, total spam and legitimate evidence in the message is obtained by summing up the weights of extracted features of each class and the message is classified into whichever class accumulates the greater sum.

We compared the algorithm against the popular naïve Bayes algorithm (in [8]) and found that its performance exceeded that of naïve Bayes, both in catching spam and in reducing false positives.
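
The approach outlined in the abstract (word-pair features per sentence, per-feature weights, and a comparison of summed evidence) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function names, the tokenization and sentence-splitting rules, the weight formula (difference of relative frequencies), and the tie-breaking toward "legitimate" are all assumptions made for illustration, and the heuristic rules mentioned in the abstract are omitted.

```python
import re
from collections import Counter
from itertools import combinations


def split_sentences(text):
    # Assumption: sentences end at '.', '!', '?' or a line break.
    return [s for s in re.split(r"[.!?\n]+", text) if s.strip()]


def word_pairs(sentence):
    # All unordered pairs of distinct words in one sentence.
    words = sorted(set(re.findall(r"[a-z0-9']+", sentence.lower())))
    return set(combinations(words, 2))


def extract_features(message):
    # Union of word-pair features over every sentence of the message.
    features = set()
    for sentence in split_sentences(message):
        features |= word_pairs(sentence)
    return features


def train(spam_messages, legit_messages):
    # Count, per feature, how many messages of each class contain it.
    spam_counts = Counter(f for m in spam_messages for f in extract_features(m))
    legit_counts = Counter(f for m in legit_messages for f in extract_features(m))
    weights = {}
    for feature in set(spam_counts) | set(legit_counts):
        spam_freq = spam_counts[feature] / max(len(spam_messages), 1)
        legit_freq = legit_counts[feature] / max(len(legit_messages), 1)
        # Assumed weight: positive means spam evidence, negative means
        # legitimate evidence. The thesis may use a different formula.
        weights[feature] = spam_freq - legit_freq
    return weights


def classify(message, weights):
    # Sum the evidence for each class and pick the larger total.
    spam_score = 0.0
    legit_score = 0.0
    for feature in extract_features(message):
        weight = weights.get(feature, 0.0)
        if weight > 0:
            spam_score += weight
        else:
            legit_score += -weight
    # Assumption: ties go to "legitimate" to limit false positives.
    return "spam" if spam_score > legit_score else "legitimate"


if __name__ == "__main__":
    spam = ["Buy cheap pills now. Cheap pills online!",
            "Win money now. Buy cheap watches."]
    legit = ["Meeting moved to Monday. Please bring the report.",
             "The report looks good. See you Monday."]
    model = train(spam, legit)
    print(classify("Buy cheap pills today.", model))
    print(classify("Please bring the report Monday.", model))
```

Pairing words within a sentence, rather than across the whole message, keeps the number of features manageable and ties each feature to words that actually occur close together.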

Scholar Commons Citation

Malkhare, Rohan V., "Scavenger: A Junk Mail Classification Program" (2003). USF Tampa Graduate Theses and Dissertations.
https://digitalcommons.usf.edu/etd/1145

Included in

American Studies Commons

USF Tampa Graduate Theses and Dissertations

Scavenger: A Junk Mail Classification Program
