mbonsai: Application Package for Sequence Classification by Tree Methodology
In many applications such as transaction data analysis, the classification of long chains of sequences is required. For example, brand purchase history in customer transaction data is in a form like AABCABAA, where A, B, and C are brands of a consumer product. The decision tree-based package mbonsai...
Main Authors: | Yukinobu Hamuro, Masakazu Nakamoto, Stephane Cheung, Edward H. Ip |
---|---|
Format: | Article |
Language: | English |
Published: | Foundation for Open Access Statistics, 2018-09-01 |
Series: | Journal of Statistical Software |
Subjects: | decision tree; sequence; classification; alphabet indexing |
Online Access: | https://www.jstatsoft.org/index.php/jss/article/view/2603 |
_version_ | 1819173390088404992 |
---|---|
author | Yukinobu Hamuro; Masakazu Nakamoto; Stephane Cheung; Edward H. Ip |
author_facet | Yukinobu Hamuro; Masakazu Nakamoto; Stephane Cheung; Edward H. Ip |
author_sort | Yukinobu Hamuro |
collection | DOAJ |
description | In many applications such as transaction data analysis, the classification of long chains of sequences is required. For example, brand purchase history in customer transaction data is in a form like AABCABAA, where A, B, and C are brands of a consumer product. The decision tree-based package mbonsai is designed to handle sequence data of varying lengths using one or multiple variables of interest as predictor variables. This software package uses tree growing and pruning strategies adopted from C4.5 and CART algorithms, and includes new features for handling sequence data and indexing for classification purpose. The software uses a simple command line program for learning and predicting processes, and has the ability to generate user-friendly graphics depicting decision trees. The underlying C++ codes are designed to efficiently process large data sets in ASCII files. Two examples from transaction data sets are used to illustrate the application of mbonsai. |
first_indexed | 2024-12-22T20:22:19Z |
format | Article |
id | doaj.art-3dbeb6c5beb54d069c04ae3ce9950bab |
institution | Directory Open Access Journal |
issn | 1548-7660 |
language | English |
last_indexed | 2024-12-22T20:22:19Z |
publishDate | 2018-09-01 |
publisher | Foundation for Open Access Statistics |
record_format | Article |
series | Journal of Statistical Software |
spelling | doaj.art-3dbeb6c5beb54d069c04ae3ce9950bab; 2022-12-21T18:13:49Z; eng; Foundation for Open Access Statistics; Journal of Statistical Software; 1548-7660; 2018-09-01; 861130; 10.18637/jss.v086.i06; 1238; mbonsai: Application Package for Sequence Classification by Tree Methodology; Yukinobu Hamuro; Masakazu Nakamoto; Stephane Cheung; Edward H. Ip; In many applications such as transaction data analysis, the classification of long chains of sequences is required. For example, brand purchase history in customer transaction data is in a form like AABCABAA, where A, B, and C are brands of a consumer product. The decision tree-based package mbonsai is designed to handle sequence data of varying lengths using one or multiple variables of interest as predictor variables. This software package uses tree growing and pruning strategies adopted from C4.5 and CART algorithms, and includes new features for handling sequence data and indexing for classification purpose. The software uses a simple command line program for learning and predicting processes, and has the ability to generate user-friendly graphics depicting decision trees. The underlying C++ codes are designed to efficiently process large data sets in ASCII files. Two examples from transaction data sets are used to illustrate the application of mbonsai.; https://www.jstatsoft.org/index.php/jss/article/view/2603; decision tree; sequence; classification; alphabet indexing |
title | mbonsai: Application Package for Sequence Classification by Tree Methodology |
topic | decision tree; sequence; classification; alphabet indexing |
url | https://www.jstatsoft.org/index.php/jss/article/view/2603 |