A Fuzzy-Match Search Engine for Physician Directories

BackgroundA search engine to find physicians’ information is a basic but crucial function of a health care provider’s website. Inefficient search engines, which return no results or incorrect results, can lead to patient frustration and potential customer loss. A search engine that can handle misspe...

Full description

Bibliographic Details
Main Authors: Rastegar-Mojarad, Majid, Kadolph, Christopher, Ye, Zhan, Wall, Daniel, Murali, Narayana, Lin, Simon
Format: Article
Language:English
Published: JMIR Publications 2014-11-01
Series:JMIR Medical Informatics
Online Access:http://medinform.jmir.org/2014/2/e30/
Description
Summary:BackgroundA search engine to find physicians’ information is a basic but crucial function of a health care provider’s website. Inefficient search engines, which return no results or incorrect results, can lead to patient frustration and potential customer loss. A search engine that can handle misspellings and spelling variations of names is needed, as the United States (US) has culturally, racially, and ethnically diverse names. ObjectiveThe Marshfield Clinic website provides a search engine for users to search for physicians’ names. The current search engine provides an auto-completion function, but it requires an exact match. We observed that 26% of all searches yielded no results. The goal was to design a fuzzy-match algorithm to aid users in finding physicians easier and faster. MethodsInstead of an exact match search, we used a fuzzy algorithm to find similar matches for searched terms. In the algorithm, we solved three types of search engine failures: “Typographic”, “Phonetic spelling variation”, and “Nickname”. To solve these mismatches, we used a customized Levenshtein distance calculation that incorporated Soundex coding and a lookup table of nicknames derived from US census data. ResultsUsing the “Challenge Data Set of Marshfield Physician Names,” we evaluated the accuracy of fuzzy-match engine–top ten (90%) and compared it with exact match (0%), Soundex (24%), Levenshtein distance (59%), and fuzzy-match engine–top one (71%). ConclusionsWe designed, created a reference implementation, and evaluated a fuzzy-match search engine for physician directories. The open-source code is available at the codeplex website and a reference implementation is available for demonstration at the datamarsh website.
ISSN:2291-9694