Exploratory and directed search strategies at a social science data archive

Researchers need to be able to find, access, and use data to participate in open science. To understand how users search for research data, we analyzed textual queries issued at a large social science data archive, the Inter-university Consortium for Political and Social Research (ICPSR). We collec...

Bibliographic Details
Main Authors: Sara Lafia, A.J. Million, Libby Hemphill
Format: Article
Language: English
Published: International Association for Social Science Information Service and Technology 2024-03-01
Series: IASSIST Quarterly
Subjects: research data; information search; query log analysis; user behavior; web analytics
Online Access: https://iassistquarterly.com/index.php/iassist/article/view/1087
collection DOAJ
description Researchers need to be able to find, access, and use data to participate in open science. To understand how users search for research data, we analyzed textual queries issued at a large social science data archive, the Inter-university Consortium for Political and Social Research (ICPSR). We collected unique user queries from 988,475 user search sessions over four years (2012-16). Overall, we found that only 30% of site visitors entered search terms into the ICPSR website. We analyzed search strategies within these sessions by extending existing dataset search taxonomies to classify a subset of the 1,554 most popular queries. We identified five categories of commonly-issued queries: keyword-based (e.g., date, place, topic); name (e.g., study, series); identifier (e.g., study, series); author (e.g., institutional, individual); and type (e.g., file, format). While the dominant search strategy used short keywords to explore topics, directed searches for known items using study and series names were also common. We further distinguished exploratory browsing from directed search queries based on their page views, refinements, search depth, duration, and length. Directed queries were longer (i.e., they had more words), while sessions with exploratory queries had more refinements and associated page views. By comparing search interactions at ICPSR to other natural language interactions in similar web search contexts, we conclude that dataset search at ICPSR is underutilized. We envision how alternative search paradigms, such as those enabled by recommender systems, can enhance dataset search.
institution Directory of Open Access Journals
issn 2331-4141
volume 48
issue 1
doi 10.29173/iq1087
orcid A.J. Million https://orcid.org/0000-0002-8909-153X
affiliation Sara Lafia: ICPSR, University of Michigan
A.J. Million: ICPSR, University of Michigan
Libby Hemphill: ICPSR and UMSI, University of Michigan
topic research data
information search
query log analysis
user behavior
web analytics