LR-SDiscr: a novel and scalable merging and splitting discretization framework using a lexical generator
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | Taylor & Francis Group, 2019-04-01 |
Series: | Journal of Information and Telecommunication |
Subjects: | |
Online Access: | http://dx.doi.org/10.1080/24751839.2018.1552647 |
Summary: | In this paper, we propose a novel supervised discretization method named LR-SDiscr. It is based on a Left to Right (LR) scanning technique that automatically partitions the input stream into intervals. Its originality lies in handling both merging and splitting operations in the same process, which lets it draw on a wide range of statistical measures. The second strength of our proposal is that it reduces the complexity of discretization by processing the data in a single pass. Extensive experiments were conducted on various cut-point functions using public benchmarks that include small, large and medical datasets. In these experiments, LR-SDiscr outperforms several classical and recent discretization methods. |
ISSN: | 2475-1839 2475-1847 |