Using language-based search in mining large software repositories

The language component plays an important role in data/information retrieval. In software engineering, data retrieval is often hindered by the difficulty of obtaining data from commercial software. The emergence of open source repositories has contributed tremendously to the collection of software da...

Bibliographic Details
Main Author: Awang Abu Bakar, Normi Sham
Format: Proceeding Paper
Language: English
Published: 2011
Subjects: QA75 Electronic computers. Computer science
Online Access: http://irep.iium.edu.my/8451/1/PACLING_AwangAbuBakar.pdf
collection IIUM
description The language component plays an important role in data/information retrieval. In software engineering, data retrieval is often hindered by the difficulty of obtaining data from commercial software. The emergence of open source repositories has contributed tremendously to the collection of software data. This paper highlights a data retrieval method for mining software from a vast open source software repository, SourceForge. To automate data retrieval from the repository, a parser based on a pattern matching algorithm was written in the Python programming language. The retrieved data were later used to estimate the quality of the open source software.
first_indexed 2024-03-05T22:41:29Z
id oai:generic.eprints.org:8451
institution International Islamic University Malaysia
last_indexed 2024-03-05T22:41:29Z
record_format dspace
spelling oai:generic.eprints.org:8451 2011-12-20T05:51:21Z http://irep.iium.edu.my/8451/ 2011-12-17 Proceeding Paper PeerReviewed application/pdf en Awang Abu Bakar, Normi Sham (2011) Using language-based search in mining large software repositories. In: Pacific Association for Computational Linguistics (PACLING 2011), 19-21 July 2011, Kuala Lumpur. http://www.sciencedirect.com/science/article/pii/S1877042811024219
topic QA75 Electronic computers. Computer science