Lazo: A Cardinality-Based Method for Coupled Estimation of Jaccard Similarity and Containment

© 2019 IEEE. Data analysts often need to find datasets that are similar (i.e., have high overlap) or that are subsets of one another (i.e., one contains the other). Exactly computing such relationships is expensive because it entails an all-pairs comparison between all values in all datasets, an O(n...

Bibliographic Details
Main Authors: Castro Fernandez, Raul; Min, Jisoo; Nava, Demitri; Madden, Samuel
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers (IEEE) 2022
Online Access: https://hdl.handle.net/1721.1/143769