A Fuzzy-Match Search Engine for Physician Directories

Background: A search engine to find physicians’ information is a basic but crucial function of a health care provider’s website. Inefficient search engines, which return no results or incorrect results, can lead to patient frustration and potential customer loss. A search engine that can handle misspe...


Bibliographic Details
Main Authors: Rastegar-Mojarad, Majid, Kadolph, Christopher, Ye, Zhan, Wall, Daniel, Murali, Narayana, Lin, Simon
Format: Article
Language: English
Published: JMIR Publications 2014-11-01
Series: JMIR Medical Informatics
Online Access: http://medinform.jmir.org/2014/2/e30/
author Rastegar-Mojarad, Majid
Kadolph, Christopher
Ye, Zhan
Wall, Daniel
Murali, Narayana
Lin, Simon
collection DOAJ
description Background: A search engine to find physicians’ information is a basic but crucial function of a health care provider’s website. Inefficient search engines, which return no results or incorrect results, can lead to patient frustration and potential customer loss. A search engine that can handle misspellings and spelling variations of names is needed, as the United States (US) has culturally, racially, and ethnically diverse names. Objective: The Marshfield Clinic website provides a search engine for users to search for physicians’ names. The current search engine provides an auto-completion function, but it requires an exact match. We observed that 26% of all searches yielded no results. The goal was to design a fuzzy-match algorithm to help users find physicians more easily and quickly. Methods: Instead of an exact-match search, we used a fuzzy algorithm to find similar matches for searched terms. The algorithm addresses three types of search engine failure: “Typographic”, “Phonetic spelling variation”, and “Nickname”. To resolve these mismatches, we used a customized Levenshtein distance calculation that incorporated Soundex coding and a lookup table of nicknames derived from US census data. Results: Using the “Challenge Data Set of Marshfield Physician Names,” we evaluated the accuracy of fuzzy-match engine–top ten (90%) and compared it with exact match (0%), Soundex (24%), Levenshtein distance (59%), and fuzzy-match engine–top one (71%). Conclusions: We designed a fuzzy-match search engine for physician directories, created a reference implementation, and evaluated it. The open-source code is available at the codeplex website and a reference implementation is available for demonstration at the datamarsh website.
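The Methods summary above names three ingredients: Levenshtein distance, Soundex coding, and a nickname lookup table. The following is a minimal Python sketch of how such pieces could be combined. It is not the authors' implementation: the abstract does not say how the customized distance weights Soundex, so the rule used here (halve the edit distance when Soundex codes agree) is an assumption, and `NICKNAMES`, `DIRECTORY`, `score`, and `search` are hypothetical names with illustrative data rather than the census-derived table or the Marshfield directory.

```python
from typing import Dict, List, Set


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insert, delete, and substitute each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


_GROUPS = (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
           ("L", "4"), ("MN", "5"), ("R", "6"))
_CODES = {ch: digit for letters, digit in _GROUPS for ch in letters}


def soundex(name: str) -> str:
    """American Soundex: first letter followed by three digits."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    digits: List[str] = []
    prev = _CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(c, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # vowels reset prev, so a repeated code counts again
    return (letters[0] + "".join(digits) + "000")[:4]


def score(query: str, candidate: str, nicknames: Dict[str, Set[str]]) -> float:
    """Lower is better. ASSUMPTION: a Soundex match halves the edit distance."""
    q, c = query.lower(), candidate.lower()
    variants = {q} | nicknames.get(q, set())  # expand nicknames to formal names
    best = float(min(levenshtein(v, c) for v in variants))
    if any(soundex(v) == soundex(c) for v in variants):
        best *= 0.5
    return best


def search(query: str, directory: List[str],
           nicknames: Dict[str, Set[str]], k: int = 10) -> List[str]:
    """Return the k best-scoring directory names (ties broken alphabetically)."""
    return sorted(directory, key=lambda n: (score(query, n, nicknames), n))[:k]


# Hypothetical data, for illustration only.
NICKNAMES = {"bill": {"william"}, "bob": {"robert"}, "liz": {"elizabeth"}}
DIRECTORY = ["William", "Robert", "Smith", "Schmidt", "Elizabeth"]

if __name__ == "__main__":
    print(search("Bill", DIRECTORY, NICKNAMES, k=1))   # nickname case
    print(search("Smyth", DIRECTORY, NICKNAMES, k=1))  # phonetic variation case
```

In this sketch, returning the top `k` candidates rather than a single best match mirrors the abstract's distinction between the top-one and top-ten evaluations; a real deployment would need to tune the weighting against its own directory.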
first_indexed 2024-12-14T02:51:04Z
format Article
id doaj.art-c3dc28df1e2b402a90384769aeafe1eb
institution Directory Open Access Journal
issn 2291-9694
language English
last_indexed 2024-12-14T02:51:04Z
publishDate 2014-11-01
publisher JMIR Publications
record_format Article
series JMIR Medical Informatics
spelling doaj.art-c3dc28df1e2b402a90384769aeafe1eb | 2022-12-21T23:19:46Z | eng | JMIR Publications | JMIR Medical Informatics | 2291-9694 | 2014-11-01 | 2(2):e30 | doi:10.2196/medinform.3463 | http://medinform.jmir.org/2014/2/e30/
title A Fuzzy-Match Search Engine for Physician Directories
url http://medinform.jmir.org/2014/2/e30/