A Fuzzy-Match Search Engine for Physician Directories
Background: A search engine to find physicians’ information is a basic but crucial function of a health care provider’s website. Inefficient search engines, which return no results or incorrect results, can lead to patient frustration and potential customer loss. A search engine that can handle misspe...
Main Authors: | Majid Rastegar-Mojarad, Christopher Kadolph, Zhan Ye, Daniel Wall, Narayana Murali, Simon Lin |
---|---|
Format: | Article |
Language: | English |
Published: | JMIR Publications 2014-11-01 |
Series: | JMIR Medical Informatics |
Online Access: | http://medinform.jmir.org/2014/2/e30/ |
_version_ | 1818382475779375104 |
---|---|
author | Rastegar-Mojarad, Majid; Kadolph, Christopher; Ye, Zhan; Wall, Daniel; Murali, Narayana; Lin, Simon |
author_facet | Rastegar-Mojarad, Majid; Kadolph, Christopher; Ye, Zhan; Wall, Daniel; Murali, Narayana; Lin, Simon |
author_sort | Rastegar-Mojarad, Majid |
collection | DOAJ |
description | Background: A search engine to find physicians’ information is a basic but crucial function of a health care provider’s website. Inefficient search engines, which return no results or incorrect results, can lead to patient frustration and potential customer loss. A search engine that can handle misspellings and spelling variations of names is needed, as the United States (US) has culturally, racially, and ethnically diverse names.
Objective: The Marshfield Clinic website provides a search engine for users to search for physicians’ names. The current search engine provides an auto-completion function, but it requires an exact match. We observed that 26% of all searches yielded no results. The goal was to design a fuzzy-match algorithm to help users find physicians more easily and quickly.
Methods: Instead of an exact-match search, we used a fuzzy algorithm to find similar matches for searched terms. The algorithm addresses three types of search engine failure: “Typographic”, “Phonetic spelling variation”, and “Nickname”. To resolve these mismatches, we used a customized Levenshtein distance calculation that incorporated Soundex coding and a lookup table of nicknames derived from US census data.
Results: Using the “Challenge Data Set of Marshfield Physician Names,” we evaluated the accuracy of fuzzy-match engine–top ten (90%) and compared it with exact match (0%), Soundex (24%), Levenshtein distance (59%), and fuzzy-match engine–top one (71%).
Conclusions: We designed and evaluated a fuzzy-match search engine for physician directories and created a reference implementation. The open-source code is available at the CodePlex website, and a reference implementation is available for demonstration at the datamarsh website. |
first_indexed | 2024-12-14T02:51:04Z |
format | Article |
id | doaj.art-c3dc28df1e2b402a90384769aeafe1eb |
institution | Directory Open Access Journal |
issn | 2291-9694 |
language | English |
last_indexed | 2024-12-14T02:51:04Z |
publishDate | 2014-11-01 |
publisher | JMIR Publications |
record_format | Article |
series | JMIR Medical Informatics |
spelling | doaj.art-c3dc28df1e2b402a90384769aeafe1eb 2022-12-21T23:19:46Z eng JMIR Publications JMIR Medical Informatics 2291-9694 2014-11-01 2 2 e30 10.2196/medinform.3463 A Fuzzy-Match Search Engine for Physician Directories Rastegar-Mojarad, Majid; Kadolph, Christopher; Ye, Zhan; Wall, Daniel; Murali, Narayana; Lin, Simon http://medinform.jmir.org/2014/2/e30/ |
spellingShingle | Rastegar-Mojarad, Majid Kadolph, Christopher Ye, Zhan Wall, Daniel Murali, Narayana Lin, Simon A Fuzzy-Match Search Engine for Physician Directories JMIR Medical Informatics |
title | A Fuzzy-Match Search Engine for Physician Directories |
title_full | A Fuzzy-Match Search Engine for Physician Directories |
title_fullStr | A Fuzzy-Match Search Engine for Physician Directories |
title_full_unstemmed | A Fuzzy-Match Search Engine for Physician Directories |
title_short | A Fuzzy-Match Search Engine for Physician Directories |
title_sort | fuzzy match search engine for physician directories |
url | http://medinform.jmir.org/2014/2/e30/ |
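The Methods summary in the description field names three ingredients: Levenshtein distance, Soundex coding, and a nickname lookup table. The sketch below shows one way these could be combined into a ranked top-k search. It is not the authors' implementation: the record does not say how the distance was customized, so the 0.5 discount for matching Soundex codes, the tiny `NICKNAMES` table, and the function names `canonical`, `fuzzy_score`, and `search` are all illustrative assumptions.

```python
from typing import Dict, List

# Hypothetical nickname table; the article derived its table from US census data.
NICKNAMES: Dict[str, str] = {"bill": "william", "bob": "robert", "liz": "elizabeth"}

# Standard American Soundex digit assignments.
_SOUNDEX_DIGITS = {
    letter: digit
    for digit, letters in {
        "1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
        "4": "L", "5": "MN", "6": "R",
    }.items()
    for letter in letters
}


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions, substitutions cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def soundex(name: str) -> str:
    """American Soundex code: first letter followed by three digits."""
    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    last = _SOUNDEX_DIGITS.get(letters[0], "")
    for ch in letters[1:]:
        digit = _SOUNDEX_DIGITS.get(ch, "")
        if digit:
            if digit != last:
                code += digit
            last = digit
        elif ch not in "HW":
            last = ""  # vowels separate repeated digits; H and W do not
    return (code + "000")[:4]


def canonical(name: str) -> str:
    """Lowercase the name and expand a known nickname to its formal form."""
    key = name.strip().lower()
    return NICKNAMES.get(key, key)


def fuzzy_score(query: str, candidate: str) -> float:
    """Lower is better. Assumed rule: halve the distance on a Soundex match."""
    q, c = canonical(query), canonical(candidate)
    distance = float(levenshtein(q, c))
    if soundex(q) == soundex(c):
        distance *= 0.5  # illustrative discount for phonetic variants
    return distance


def search(query: str, directory: List[str], top_k: int = 10) -> List[str]:
    """Return the top_k directory names ranked by fuzzy score, then by name."""
    ranked = sorted(directory, key=lambda name: (fuzzy_score(query, name), name))
    return ranked[:top_k]
```

Under these assumptions, `search("bill", ["William", "Walter", "Wilma"], top_k=1)` resolves the nickname first and returns `["William"]`. Setting `top_k` to 1 or 10 corresponds to the "top one" and "top ten" settings reported in the Results.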