Searching relevant papers for software engineering secondary studies: Semantic Scholar coverage and identification role

Abstract Searching relevant papers is a fundamental task for the elaboration of secondary studies. This task is known to be tedious and time‐consuming when it is made manually, especially with the presence of several academic repositories. Recently, Semantic Scholar has emerged as a new artificial i...

Bibliographic Details
Main Author: Abdelhakim Hannousse
Format: Article
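Language: English
Published: Hindawi-IET 2021-02-01
Series: IET Software
Subjects: artificial intelligence; ontologies (artificial intelligence); query formulation; query processing; search engines; search problems
Online Access: https://doi.org/10.1049/sfw2.12011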
_version_ 1797429273773998080
author Abdelhakim Hannousse
author_facet Abdelhakim Hannousse
author_sort Abdelhakim Hannousse
collection DOAJ
description Abstract Searching relevant papers is a fundamental task for the elaboration of secondary studies. This task is known to be tedious and time‐consuming when it is made manually, especially with the presence of several academic repositories. Recently, Semantic Scholar has emerged as a new artificial intelligence‐based search engine enabling a set of valuable features. The present study investigates the role of Semantic Scholar in retrieving relevant papers for performing secondary studies in software engineering. For this sake, an examination is performed to check the ability of Semantic Scholar to locate included papers in recent and well‐established secondary studies. Afterwards, a hybrid and automatic search strategy is introduced making use of Semantic Scholar as a sole search engine and it incorporates: automatic search, snowballing, and use of Computer Science Ontology (CSO) and Software Engineering Body of Knowledge (SWEBOK) for refining queries. The proposed strategy is validated by replicating the search of high‐quality secondary studies in the software engineering field. To guarantee objectivity, a systematic search is conducted of recent secondary studies published in the field since 2015. For the coverage test, Semantic Scholar is examined to locate primary papers of selected secondary studies and identify missing venues. The proposed search strategy is used to check the ability to retrieve primary papers of each secondary study. The systematic search yielded 20 high‐quality secondary studies with 1337 distinct primary papers. The coverage test revealed that Semantic Scholar covers 98.88% of the papers. The proposed search strategy enabled the full replication of 13 studies and more than 90% for the 7 remaining studies.
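The abstract names two measurable ideas: snowballing from a set of retrieved papers, and checking how many of a secondary study's primary papers a search recovers. The following is a minimal Python sketch of both on toy data, not the author's implementation. The function names `snowball` and `coverage`, the `max_rounds` parameter, and the paper IDs are all hypothetical, and the citation map stands in for whatever a search engine such as Semantic Scholar would return.

```python
def coverage(found, expected):
    """Return the share of expected paper IDs present in found (0.0 to 1.0)."""
    expected = set(expected)
    if not expected:
        return 0.0
    return len(expected & set(found)) / len(expected)


def snowball(seeds, cites, max_rounds=2):
    """Expand a seed set by backward and forward snowballing.

    cites maps a paper ID to the IDs it references (hypothetical toy input).
    Backward snowballing follows a paper's references; forward snowballing
    follows the papers that cite it.
    """
    # Invert the reference map so forward snowballing is a dictionary lookup.
    cited_by = {}
    for source, references in cites.items():
        for ref in references:
            cited_by.setdefault(ref, set()).add(source)

    seen = set(seeds)
    frontier = set(seeds)
    for _ in range(max_rounds):
        candidates = set()
        for paper in frontier:
            candidates |= set(cites.get(paper, ()))   # backward step
            candidates |= cited_by.get(paper, set())  # forward step
        candidates -= seen
        if not candidates:
            break  # nothing new was reached, so stop early
        seen |= candidates
        frontier = candidates
    return seen


# Toy citation map with made-up IDs: A cites B and C, D cites A, and so on.
toy_cites = {"A": ["B", "C"], "D": ["A"], "E": ["D"], "F": ["Z"]}
retrieved = snowball({"A"}, toy_cites)

# Hypothetical list of primary papers from a secondary study being replicated.
primary_papers = {"A", "B", "D", "E", "F"}
print(sorted(retrieved), coverage(retrieved, primary_papers))
```

In this toy run, paper F is never reached because nothing links it to the seed, which is the kind of gap a query-refinement step would have to close in a real search.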
first_indexed 2024-03-09T09:10:48Z
format Article
id doaj.art-3c8de03c4bb841a896c264ecf44728a1
institution Directory of Open Access Journals
issn 1751-8806
1751-8814
language English
last_indexed 2024-03-09T09:10:48Z
publishDate 2021-02-01
publisher Hindawi-IET
record_format Article
series IET Software
spelling doaj.art-3c8de03c4bb841a896c264ecf44728a1
2023-12-02T08:45:47Z
eng
Hindawi-IET
IET Software
1751-8806
1751-8814
2021-02-01
Volume 15, Issue 1, pp. 126-146
10.1049/sfw2.12011
Searching relevant papers for software engineering secondary studies: Semantic Scholar coverage and identification role
Abdelhakim Hannousse (Department of Computer Science, Université 8 Mai 1945, Guelma, Algeria)
https://doi.org/10.1049/sfw2.12011
artificial intelligence
ontologies (artificial intelligence)
query formulation
query processing
search engines
search problems
spellingShingle Abdelhakim Hannousse
Searching relevant papers for software engineering secondary studies: Semantic Scholar coverage and identification role
IET Software
artificial intelligence
ontologies (artificial intelligence)
query formulation
query processing
search engines
search problems
title Searching relevant papers for software engineering secondary studies: Semantic Scholar coverage and identification role
title_full Searching relevant papers for software engineering secondary studies: Semantic Scholar coverage and identification role
title_fullStr Searching relevant papers for software engineering secondary studies: Semantic Scholar coverage and identification role
title_full_unstemmed Searching relevant papers for software engineering secondary studies: Semantic Scholar coverage and identification role
title_short Searching relevant papers for software engineering secondary studies: Semantic Scholar coverage and identification role
title_sort searching relevant papers for software engineering secondary studies semantic scholar coverage and identification role
topic artificial intelligence
ontologies (artificial intelligence)
query formulation
query processing
search engines
search problems
url https://doi.org/10.1049/sfw2.12011
work_keys_str_mv AT abdelhakimhannousse searchingrelevantpapersforsoftwareengineeringsecondarystudiessemanticscholarcoverageandidentificationrole