The Development of the Open Machine-Learning-Based Anti-Spam (Open-MaLBAS)
Spam e-mails are unsolicited e-mails received by users of the e-mail service. Spam e-mails cause serious harm to organizations, for they waste, among other things, their computational and networking resources. To reduce the damage caused by them, organizations use anti-spams. Anti-spams are software...
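Main Authors: | Isaac C. Ferreira, Marcelo V. C. Aragao, Edvard M. Oliveira, Bruno T. Kuehne, Edmilson M. Moreira, Otavio A. S. Carpinteiro |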
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2021-01-01 |
Series: | IEEE Access |
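Subjects: | Electronic mail (e-mail); internet; machine learning; network security; open source software; simple mail transfer protocol (SMTP) |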
Online Access: | https://ieeexplore.ieee.org/document/9565223/ |
_version_ | 1818735629364625408 |
---|---|
author | Isaac C. Ferreira; Marcelo V. C. Aragao; Edvard M. Oliveira; Bruno T. Kuehne; Edmilson M. Moreira; Otavio A. S. Carpinteiro |
collection | DOAJ |
description | Spam e-mails are unsolicited messages received by users of an e-mail service. They cause serious harm to organizations because they waste, among other things, computational and networking resources. To reduce this damage, organizations use anti-spams: software systems that classify e-mails in order to separate legitimate messages from spam. The best current commercial and open-source anti-spams, in particular the well-known commercial anti-spam CanIt-PRO, use various techniques, such as blacklists and/or SMTP extensions, to classify e-mails. Unfortunately, both blacklists and SMTP extensions have serious drawbacks, such as low scalability and high computational and network costs. This paper introduces the Open Machine-Learning-Based Anti-Spam (Open-MaLBAS). Unlike the best current anti-spams, Open-MaLBAS uses neither blacklists nor SMTP extensions, relying only on machine learning models for e-mail classification. Open-MaLBAS was compared to CanIt-PRO in a series of experiments on a database of 862,227 real e-mails collected over three months at the Federal University of Itajubá, Brazil. The e-mails had previously been classified by CanIt-PRO. In the experiments, Open-MaLBAS correctly classified 81.48% and 98.13% of the e-mails in the database using the Multi-Layer Perceptron and Random Forest models, respectively. In addition, it classified all e-mails in the database in up to 88% less time than CanIt-PRO. Open-MaLBAS is implemented in Java and released under a free software license. It is available on GitHub. |
first_indexed | 2024-12-18T00:24:18Z |
format | Article |
id | doaj.art-209603a806694194be8fd401fd8a81fa |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-18T00:24:18Z |
publishDate | 2021-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-209603a806694194be8fd401fd8a81fa; 2022-12-21T21:27:16Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2021-01-01; vol. 9, pp. 138618-138632; DOI 10.1109/ACCESS.2021.3118901; IEEE document 9565223. The Development of the Open Machine-Learning-Based Anti-Spam (Open-MaLBAS). Authors: Isaac C. Ferreira (TRICOD Equipamentos Eletrônicos Indústria e Comércio LTDA, Itajubá, Brazil); Marcelo V. C. Aragao, https://orcid.org/0000-0001-8999-8169 (National Institute of Telecommunications, Santa Rita do Sapucaí, Brazil); Edvard M. Oliveira (Research Group on Systems and Computer Engineering, Federal University of Itajubá, Itajubá, Brazil); Bruno T. Kuehne, https://orcid.org/0000-0003-2529-225X (Research Group on Systems and Computer Engineering, Federal University of Itajubá, Itajubá, Brazil); Edmilson M. Moreira, https://orcid.org/0000-0001-5059-9080 (Research Group on Systems and Computer Engineering, Federal University of Itajubá, Itajubá, Brazil); Otavio A. S. Carpinteiro, https://orcid.org/0000-0002-7490-9255 (Research Group on Systems and Computer Engineering, Federal University of Itajubá, Itajubá, Brazil). Online access: https://ieeexplore.ieee.org/document/9565223/. Keywords: Electronic mail (e-mail); internet; machine learning; network security; open source software; simple mail transfer protocol (SMTP) |
title | The Development of the Open Machine-Learning-Based Anti-Spam (Open-MaLBAS) |
topic | Electronic mail (e-mail); internet; machine learning; network security; open source software; simple mail transfer protocol (SMTP) |
url | https://ieeexplore.ieee.org/document/9565223/ |