Finding co-authorship patterns by frequent patterns

Bibliographic Details
Main Author: Tan, Jomain Zi Hao
Other Authors: He Bingsheng
Format: Final Year Project (FYP)
Language: English
Published: 2015
Subjects:
Online Access: http://hdl.handle.net/10356/63056
Description
Summary: With the growing emphasis on big data, where data volumes are constantly increasing, mining that data has become an essential task in recent years for generating useful new information that can improve our daily lives. Frequent pattern mining, which identifies relationships in data, remains one of the most important tasks in data mining. It reveals useful information that was previously hidden in large datasets and supports a broad range of real-life applications, such as customer behavioral analysis. Apriori and FP-Growth are two basic, well-known algorithms for frequent pattern mining. Google Scholar, a search engine by Google, indexes an extensive collection of published papers, and the general public can use it to find related papers by author name or paper title. While Google Scholar returns accurate and reliable results, some specific kinds of search are not implemented, including the one of interest in this project. This project explores and compares two frequent itemset mining algorithms, Apriori and FP-Growth, by searching for co-authorship patterns.
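
Illustrative sketch (not taken from the project itself): one way to frame co-authorship mining as frequent itemset mining is to treat each paper's author list as a transaction and look for author sets that appear together in at least a minimum number of papers. The Apriori-style code below assumes that framing. The function name `frequent_coauthor_sets`, the `min_support` parameter, and the author names are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_coauthor_sets(papers, min_support):
    """Apriori-style sketch: map each frequent author set to its paper count.

    `papers` is a list of author lists (hypothetical input format);
    `min_support` is the minimum number of papers a set must appear in.
    """
    transactions = [frozenset(p) for p in papers]

    # Level 1: count individual authors and keep the frequent ones.
    counts = Counter(frozenset([a]) for t in transactions for a in t)
    current = {s: n for s, n in counts.items() if n >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-sets whose union has exactly k authors.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        # Count step: count the papers that contain each surviving candidate.
        counts = Counter(c for t in transactions for c in candidates if c <= t)
        current = {s: n for s, n in counts.items() if n >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Made-up author lists, for illustration only.
papers = [
    ["Alice", "Bob", "Carol"],
    ["Alice", "Bob"],
    ["Alice", "Carol"],
    ["Bob", "Dave"],
    ["Alice", "Bob", "Carol"],
]
result = frequent_coauthor_sets(papers, min_support=2)
for authors, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(authors), count)
```

FP-Growth should find the same frequent sets without generating candidates level by level; it is omitted here to keep the sketch short.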