Tree-Based Co-Clustering Identifies Chromatin Accessibility Patterns Associated With Hematopoietic Lineage Structure

Chromatin accessibility, as measured by ATACseq, varies between hematopoietic cell types in different lineages of the hematopoietic differentiation tree, e.g. T cells vs. B cells, but methods that associate variation in chromatin accessibility with the lineage structure of the differentiation tree are...


Bibliographic Details
Main Authors: Thomas B. George, Nathaniel K. Strawn, Sivan Leviyang
Format: Article
Language: English
Published: Frontiers Media S.A. 2021-10-01
Series: Frontiers in Genetics
Subjects: chromatin accessibility, hematopoiesis, clustering, tree (graphs), ATACseq, epigenetics
Online Access: https://www.frontiersin.org/articles/10.3389/fgene.2021.707117/full
author Thomas B. George
Nathaniel K. Strawn
Sivan Leviyang
collection DOAJ
description Chromatin accessibility, as measured by ATACseq, varies between hematopoietic cell types in different lineages of the hematopoietic differentiation tree, e.g. T cells vs. B cells, but methods that associate variation in chromatin accessibility with the lineage structure of the differentiation tree are lacking. Using an ATACseq dataset recently published by the ImmGen consortium, we construct associations between chromatin accessibility and hematopoietic cell types using a novel co-clustering approach that accounts for the structure of the hematopoietic differentiation tree. Under a model in which all loci and cell types within a co-cluster share an accessibility state, we show that roughly 80% of cell-type-associated accessibility variation can be captured through 12 cell type clusters and 20 genomic locus clusters, with the cell type clusters reflecting coherent components of the differentiation tree. Using publicly available ChIPseq datasets, we show that our clustering reflects transcription factor binding patterns with implications for regulation across cell types. We show that traditional methods such as hierarchical and k-means clustering lead to cell type clusters that are more dispersed on the tree than those produced by our tree-based algorithm. We provide a Python package, chromcocluster, that implements the algorithms presented.
first_indexed 2024-12-22T02:05:42Z
format Article
id doaj.art-868aa20def57415ca76a7eef616a9d7c
institution Directory Open Access Journal
issn 1664-8021
language English
last_indexed 2024-12-22T02:05:42Z
publishDate 2021-10-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Genetics
title Tree-Based Co-Clustering Identifies Chromatin Accessibility Patterns Associated With Hematopoietic Lineage Structure
topic chromatin accessibility
hematopoiesis
clustering
tree (graphs)
ATACseq
epigenetics
url https://www.frontiersin.org/articles/10.3389/fgene.2021.707117/full