Lazo: A Cardinality-Based Method for Coupled Estimation of Jaccard Similarity and Containment

© 2019 IEEE. Data analysts often need to find datasets that are similar (i.e., have high overlap) or that are subsets of one another (i.e., one contains the other). Exactly computing such relationships is expensive because it entails an all-pairs comparison between all values in all datasets, an O(n...

Bibliographic Details
Main Authors: Castro Fernandez, Raul, Min, Jisoo, Nava, Demitri, Madden, Samuel
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers (IEEE) 2022
Online Access: https://hdl.handle.net/1721.1/143769