Probabilistic algorithm for mining frequent sequences

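Frequent sequence mining in large databases is important in many areas, e.g., for biological, climate, and financial data. Exact frequent sequence mining algorithms usually read the whole database many times, so for a sufficiently large database mining takes a very long time or requires supercomputers. A new probabilistic algorithm for mining frequent sequences is proposed. It analyzes a random sample of the initial database, makes decisions about the initial database based on the results of that analysis, and runs much faster than exact mining algorithms. The probability of errors made by the probabilistic algorithm is estimated using statistical methods.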
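
The sampling idea in the abstract can be illustrated with a minimal Python sketch. This record does not give the article's actual procedure, so everything below is an assumption made for illustration: the helper names `is_subsequence`, `estimate_support`, and `classify`, the use of simple random sampling without replacement, and the normal-approximation error bound with z = 1.96. It estimates the support of one candidate sequence from a random sample and labels it frequent, rare, or uncertain relative to a minimum-support threshold.

```python
import math
import random


def is_subsequence(pattern, sequence):
    """Return True if the items of pattern occur in sequence in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so matches must appear in order.
    return all(item in it for item in pattern)


def estimate_support(database, pattern, sample_size, rng):
    """Estimate the support of pattern from a random sample of the database.

    Returns (p_hat, se): the sample proportion of sequences containing the
    pattern and its standard error under a normal approximation (assumed here).
    """
    sample = rng.sample(database, sample_size)  # sampling without replacement
    hits = sum(is_subsequence(pattern, seq) for seq in sample)
    p_hat = hits / sample_size
    se = math.sqrt(p_hat * (1.0 - p_hat) / sample_size)
    return p_hat, se


def classify(p_hat, se, min_support, z=1.96):
    """Decide about the full database from the sample estimate.

    z = 1.96 corresponds to a roughly 95% two-sided interval (an assumption).
    """
    if p_hat - z * se >= min_support:
        return "frequent"
    if p_hat + z * se < min_support:
        return "rare"
    return "uncertain"


# Hypothetical toy database: 60% of the sequences contain A followed by C.
database = [list("ABCD")] * 600 + [list("DCBA")] * 400
rng = random.Random(0)
p_hat, se = estimate_support(database, list("AC"), sample_size=200, rng=rng)
label = classify(p_hat, se, min_support=0.3)
```

Only the sample is scanned, so the cost depends on the sample size rather than on the database size; the price is a nonzero chance of a wrong label, which the interval width makes explicit.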

Bibliographic Details
Main Authors: Julija Pragarauskaitė, Gintautas Dzemyda
Format: Article
Language: English
Published: Vilnius University Press 2010-12-01
Series: Lietuvos Matematikos Rinkinys
Subjects: frequent sequence mining; probabilistic algorithm; data mining
Online Access: https://www.journals.vu.lt/LMR/article/view/17841
_version_ 1828897734747226112
author Julija Pragarauskaitė
Gintautas Dzemyda
author_facet Julija Pragarauskaitė
Gintautas Dzemyda
author_sort Julija Pragarauskaitė
collection DOAJ
description Frequent sequence mining in large databases is important in many areas, e.g., for biological, climate, and financial data. Exact frequent sequence mining algorithms usually read the whole database many times, so for a sufficiently large database mining takes a very long time or requires supercomputers. A new probabilistic algorithm for mining frequent sequences is proposed. It analyzes a random sample of the initial database, makes decisions about the initial database based on the results of that analysis, and runs much faster than exact mining algorithms. The probability of errors made by the probabilistic algorithm is estimated using statistical methods.
first_indexed 2024-12-13T15:03:06Z
format Article
id doaj.art-2e97c1fb14734b53b2bd465a2b027cd4
institution Directory Open Access Journal
issn 0132-2818
2335-898X
language English
last_indexed 2024-12-13T15:03:06Z
publishDate 2010-12-01
publisher Vilnius University Press
record_format Article
series Lietuvos Matematikos Rinkinys
spelling doaj.art-2e97c1fb14734b53b2bd465a2b027cd4 | 2022-12-21T23:41:05Z | eng | Vilnius University Press | Lietuvos Matematikos Rinkinys | 0132-2818 | 2335-898X | 2010-12-01 | 51 | proc. LMS | 10.15388/LMR.2010.57 | Probabilistic algorithm for mining frequent sequences | Julija Pragarauskaitė (0), Gintautas Dzemyda (1) | (0) Matematikos ir informatikos institutas | (1) Matematikos ir informatikos institutas | https://www.journals.vu.lt/LMR/article/view/17841 | frequent sequence mining | probabilistic algorithm | data mining
title Probabilistic algorithm for mining frequent sequences
topic frequent sequence mining
probabilistic algorithm
data mining
url https://www.journals.vu.lt/LMR/article/view/17841