Probabilistic algorithm for mining frequent sequences
Frequent sequence mining in large volume databases is important in many areas, e.g., biological, climate, financial databases. Exact frequent sequence mining algorithms usually read the whole database many times, and if the database is large enough, then frequent sequence mining is very long or requ...
Main Authors: | Julija Pragarauskaitė, Gintautas Dzemyda |
---|---|
Format: | Article |
Language: | English |
Published: | Vilnius University Press, 2010-12-01 |
Series: | Lietuvos Matematikos Rinkinys |
Subjects: | frequent sequence mining; probabilistic algorithm; data mining |
Online Access: | https://www.journals.vu.lt/LMR/article/view/17841 |
_version_ | 1828897734747226112 |
---|---|
author | Julija Pragarauskaitė, Gintautas Dzemyda |
author_sort | Julija Pragarauskaitė |
collection | DOAJ |
description | Frequent sequence mining in large volume databases is important in many areas, e.g., biological, climate, financial databases. Exact frequent sequence mining algorithms usually read the whole database many times, and if the database is large enough, then frequent sequence mining is very long or requires supercomputers. A new probabilistic algorithm for mining frequent sequences is proposed. It analyzes a random sample of the initial database. The algorithm makes decisions about the initial database according to the random sample analysis results and performs much faster than the exact mining algorithms. The probability of errors made by the probabilistic algorithm is estimated using statistical methods. |
first_indexed | 2024-12-13T15:03:06Z |
format | Article |
id | doaj.art-2e97c1fb14734b53b2bd465a2b027cd4 |
institution | Directory Open Access Journal |
issn | 0132-2818 2335-898X |
language | English |
last_indexed | 2024-12-13T15:03:06Z |
publishDate | 2010-12-01 |
publisher | Vilnius University Press |
record_format | Article |
series | Lietuvos Matematikos Rinkinys |
spelling | doaj.art-2e97c1fb14734b53b2bd465a2b027cd4; 2022-12-21T23:41:05Z; eng; Vilnius University Press; Lietuvos Matematikos Rinkinys; 0132-2818; 2335-898X; 2010-12-01; 51; proc. LMS; 10.15388/LMR.2010.57; Probabilistic algorithm for mining frequent sequences; Julija Pragarauskaitė [0]; Gintautas Dzemyda [1]; Matematikos ir informatikos institutas; Matematikos ir informatikos institutas; Frequent sequence mining in large volume databases is important in many areas, e.g., biological, climate, financial databases. Exact frequent sequence mining algorithms usually read the whole database many times, and if the database is large enough, then frequent sequence mining is very long or requires supercomputers. A new probabilistic algorithm for mining frequent sequences is proposed. It analyzes a random sample of the initial database. The algorithm makes decisions about the initial database according to the random sample analysis results and performs much faster than the exact mining algorithms. The probability of errors made by the probabilistic algorithm is estimated using statistical methods.; https://www.journals.vu.lt/LMR/article/view/17841; frequent sequence mining; probabilistic algorithm; data mining |
title | Probabilistic algorithm for mining frequent sequences |
topic | frequent sequence mining; probabilistic algorithm; data mining |
url | https://www.journals.vu.lt/LMR/article/view/17841 |
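
The description in this record outlines the general idea of sampling-based mining: analyze a random sample, decide about the full database from the sample, and estimate the error probability statistically. The record does not give the authors' actual procedure, so the following is only a minimal illustrative sketch under stated assumptions: a "sequence" is a Python string, a pattern counts as present when it occurs as a contiguous substring, and the error estimate uses Hoeffding's inequality, which is a substitute chosen here and not necessarily the statistical method used in the article. All function names are hypothetical.

```python
import math
import random


def sample_support(database, pattern, sample_size, rng):
    """Estimate the fraction of sequences containing `pattern` from a random sample.

    Assumption: containment means contiguous substring match.
    """
    sample = rng.choices(database, k=sample_size)  # sampling with replacement
    hits = sum(1 for seq in sample if pattern in seq)
    return hits / sample_size


def hoeffding_error_bound(sample_size, margin):
    """Upper bound on P(|estimate - true support| >= margin), capped at 1.

    Hoeffding's inequality is an assumption of this sketch, not the
    article's stated method.
    """
    return min(1.0, 2.0 * math.exp(-2.0 * sample_size * margin ** 2))


def is_probably_frequent(database, pattern, min_support, sample_size, seed=0):
    """Decide from a sample whether `pattern` meets `min_support`.

    Returns (decision, estimated_support).
    """
    rng = random.Random(seed)
    estimate = sample_support(database, pattern, sample_size, rng)
    return estimate >= min_support, estimate


if __name__ == "__main__":
    # Toy database: 80% of the sequences contain "ab".
    db = ["abcab"] * 80 + ["xyz"] * 20
    decision, estimate = is_probably_frequent(db, "ab", 0.5, 200)
    print(decision, estimate, hoeffding_error_bound(200, 0.1))
```

The bound depends only on the sample size and the chosen margin, not on the database size, which is why a sample-based decision can be much cheaper than repeated full scans. The price is that patterns whose true support lies within the margin of the threshold may be misclassified.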