Seqpare: a novel metric of similarity between genomic interval sets [version 2; peer review: 2 approved]

Bibliographic Details
Main Authors: Selena C. Feng, Nathan C. Sheffield, Jianglin Feng
Format: Article
Language: English
Published: F1000 Research Ltd 2021-01-01
Series: F1000Research
Online Access: https://f1000research.com/articles/9-581/v2
Description
Summary: Searching genomic interval sets produced by sequencing methods has been widely and routinely performed; however, existing metrics for quantifying similarities among interval sets are inconsistent. Here we introduce Seqpare, a self-consistent and effective metric of similarity and tool for comparing sequences based on their interval sets. With this metric, the similarity of two interval sets is quantified by a single index, the ratio of their effective overlap over the union: an index of zero indicates unrelated interval sets, and an index of one means that the interval sets are identical. Analysis and tests confirm the effectiveness and self-consistency of the Seqpare metric.
ISSN:2046-1402
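
Illustrative sketch: The summary describes the index as the ratio of effective overlap over the union, ranging from zero (unrelated) to one (identical). This record does not define "effective overlap", so the Python below is only one plausible reading, not the Seqpare tool itself. It assumes that each interval pair is scored by overlap length over union length, that only mutually best-matching pairs contribute, and that the union is the total interval count minus the effective overlap. The names pair_score, best_match, and similarity_index are hypothetical, and the brute-force search is for clarity, not speed.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome (assumption)


def pair_score(a: Interval, b: Interval) -> float:
    """Overlap length divided by union length for two intervals (0.0 if disjoint)."""
    inter = min(a[1], b[1]) - max(a[0], b[0])
    if inter <= 0:
        return 0.0
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union


def best_match(x: Interval, others: List[Interval]) -> int:
    """Index of the highest-scoring overlapping interval in `others`, or -1 if none."""
    best_j, best_s = -1, 0.0
    for j, y in enumerate(others):
        s = pair_score(x, y)
        if s > best_s:
            best_j, best_s = j, s
    return best_j


def similarity_index(set1: List[Interval], set2: List[Interval]) -> float:
    """Assumed index: effective overlap / (n1 + n2 - effective overlap)."""
    if not set1 and not set2:
        return 0.0  # convention chosen for this sketch; the ratio is undefined here
    fwd = [best_match(a, set2) for a in set1]
    rev = [best_match(b, set1) for b in set2]
    # Only pairs that are each other's best match count toward the effective overlap.
    effective = sum(
        pair_score(set1[i], set2[j])
        for i, j in enumerate(fwd)
        if j >= 0 and rev[j] == i
    )
    return effective / (len(set1) + len(set2) - effective)


if __name__ == "__main__":
    a = [(0, 10), (20, 30)]
    print(similarity_index(a, a))                    # identical sets
    print(similarity_index([(0, 10)], [(20, 30)]))   # no overlap
    print(similarity_index([(0, 10)], [(5, 15)]))    # partial overlap
```

Under these assumptions, identical sets give an index of one and non-overlapping sets give zero, matching the endpoints stated in the summary. Intermediate values depend on the assumed pair score.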