Exploiting Transport-Level Characteristics of Spam

In the arms race to secure electronic mail users and servers from unsolicited messages (spam), the most successful solutions employ techniques that are difficult for spammers to circumvent. This research investigates the transport-layer characteristics of email in order to provide a new, novel and robus...

Bibliographic Details
Main Authors: Beverly, Robert; Sollins, Karen
Other Authors: Karen Sollins
Published: 2008
Online Access: http://hdl.handle.net/1721.1/40287
_version_ 1826209255724679168
author Beverly, Robert
Sollins, Karen
author2 Karen Sollins
author_facet Karen Sollins
Beverly, Robert
Sollins, Karen
author_sort Beverly, Robert
collection MIT
description In the arms race to secure electronic mail users and servers from unsolicited messages (spam), the most successful solutions employ techniques that are difficult for spammers to circumvent. This research investigates the transport-layer characteristics of email in order to provide a new, novel and robust defense against spam. We find that spam SMTP flows exhibit TCP behavior consistent with traffic competing for link access, large round trip times and resource constrained hosts. Thus, SMTP flow characteristics provide sufficient statistical power to differentiate between spam and legitimate mail (ham). We build "SpamFlow" to learn and exploit these differences. Using machine learning feature selection we identify the most discriminatory flow properties and effect greater than 90% spam classification accuracy without content or reputation analysis. SpamFlow correctly identifies 78% of the false negatives generated by a popular content filtering application -- demonstrating the power in combining SpamFlow with existing techniques. Finally, we argue that SpamFlow is not easily subvertible due to economic and practical constraints inherent in sourcing spam.
first_indexed 2024-09-23T14:19:57Z
id mit-1721.1/40287
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T14:19:57Z
publishDate 2008
record_format dspace
spelling mit-1721.1/40287 2019-04-12T09:31:48Z Exploiting Transport-Level Characteristics of Spam Beverly, Robert Sollins, Karen Karen Sollins Advanced Network Architecture In the arms race to secure electronic mail users and servers from unsolicited messages (spam), the most successful solutions employ techniques that are difficult for spammers to circumvent. This research investigates the transport-layer characteristics of email in order to provide a new, novel and robust defense against spam. We find that spam SMTP flows exhibit TCP behavior consistent with traffic competing for link access, large round trip times and resource constrained hosts. Thus, SMTP flow characteristics provide sufficient statistical power to differentiate between spam and legitimate mail (ham). We build "SpamFlow" to learn and exploit these differences. Using machine learning feature selection we identify the most discriminatory flow properties and effect greater than 90% spam classification accuracy without content or reputation analysis. SpamFlow correctly identifies 78% of the false negatives generated by a popular content filtering application -- demonstrating the power in combining SpamFlow with existing techniques. Finally, we argue that SpamFlow is not easily subvertible due to economic and practical constraints inherent in sourcing spam. 2008-02-19T13:45:28Z 2008-02-19T13:45:28Z 2008-02-15 MIT-CSAIL-TR-2008-008 http://hdl.handle.net/1721.1/40287 Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory 12 p. application/pdf application/postscript
spellingShingle Beverly, Robert
Sollins, Karen
Exploiting Transport-Level Characteristics of Spam
title Exploiting Transport-Level Characteristics of Spam
title_sort exploiting transport level characteristics of spam
url http://hdl.handle.net/1721.1/40287