Exploiting Transport-Level Characteristics of Spam
In the arms race to secure electronic mail users and servers from unsolicited messages (spam), the most successful solutions employ techniques that are difficult for spammers to circumvent. This research investigates the transport-layer characteristics of email in order to provide a new, novel and robus...
Main Authors: | Beverly, Robert; Sollins, Karen |
---|---|
Other Authors: | Karen Sollins |
Published: | 2008 |
Online Access: | http://hdl.handle.net/1721.1/40287 |
_version_ | 1826209255724679168 |
---|---|
author | Beverly, Robert Sollins, Karen |
author2 | Karen Sollins |
author_facet | Karen Sollins Beverly, Robert Sollins, Karen |
author_sort | Beverly, Robert |
collection | MIT |
description | In the arms race to secure electronic mail users and servers from unsolicited messages (spam), the most successful solutions employ techniques that are difficult for spammers to circumvent. This research investigates the transport-layer characteristics of email in order to provide a new, novel and robust defense against spam. We find that spam SMTP flows exhibit TCP behavior consistent with traffic competing for link access, large round trip times and resource constrained hosts. Thus, SMTP flow characteristics provide sufficient statistical power to differentiate between spam and legitimate mail (ham). We build "SpamFlow" to learn and exploit these differences. Using machine learning feature selection we identify the most discriminatory flow properties and effect greater than 90% spam classification accuracy without content or reputation analysis. SpamFlow correctly identifies 78% of the false negatives generated by a popular content filtering application -- demonstrating the power in combining SpamFlow with existing techniques. Finally, we argue that SpamFlow is not easily subvertible due to economic and practical constraints inherent in sourcing spam. |
first_indexed | 2024-09-23T14:19:57Z |
id | mit-1721.1/40287 |
institution | Massachusetts Institute of Technology |
last_indexed | 2024-09-23T14:19:57Z |
publishDate | 2008 |
record_format | dspace |
spelling | mit-1721.1/40287 2019-04-12T09:31:48Z Exploiting Transport-Level Characteristics of Spam Beverly, Robert Sollins, Karen Karen Sollins Advanced Network Architecture In the arms race to secure electronic mail users and servers from unsolicited messages (spam), the most successful solutions employ techniques that are difficult for spammers to circumvent. This research investigates the transport-layer characteristics of email in order to provide a new, novel and robust defense against spam. We find that spam SMTP flows exhibit TCP behavior consistent with traffic competing for link access, large round trip times and resource constrained hosts. Thus, SMTP flow characteristics provide sufficient statistical power to differentiate between spam and legitimate mail (ham). We build "SpamFlow" to learn and exploit these differences. Using machine learning feature selection we identify the most discriminatory flow properties and effect greater than 90% spam classification accuracy without content or reputation analysis. SpamFlow correctly identifies 78% of the false negatives generated by a popular content filtering application -- demonstrating the power in combining SpamFlow with existing techniques. Finally, we argue that SpamFlow is not easily subvertible due to economic and practical constraints inherent in sourcing spam. 2008-02-19T13:45:28Z 2008-02-19T13:45:28Z 2008-02-15 MIT-CSAIL-TR-2008-008 http://hdl.handle.net/1721.1/40287 Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory 12 p. application/pdf application/postscript |
spellingShingle | Beverly, Robert Sollins, Karen Exploiting Transport-Level Characteristics of Spam |
title | Exploiting Transport-Level Characteristics of Spam |
title_full | Exploiting Transport-Level Characteristics of Spam |
title_fullStr | Exploiting Transport-Level Characteristics of Spam |
title_full_unstemmed | Exploiting Transport-Level Characteristics of Spam |
title_short | Exploiting Transport-Level Characteristics of Spam |
title_sort | exploiting transport level characteristics of spam |
url | http://hdl.handle.net/1721.1/40287 |
work_keys_str_mv | AT beverlyrobert exploitingtransportlevelcharacteristicsofspam AT sollinskaren exploitingtransportlevelcharacteristicsofspam |