SilkMoth: an efficient method for finding related sets with maximum matching constraints

Determining if two sets are related - that is, if they have similar values or if one set contains the other - is an important problem with many applications in data cleaning, data integration, and information retrieval. For example, set relatedness can be a useful tool to discover whether columns fr...

Bibliographic Details
Main Authors: Deng, Dong; Kim, Albert; Madden, Samuel R; Stonebraker, Michael
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: English
Published: VLDB Endowment 2019
Online Access: https://hdl.handle.net/1721.1/121341