FOCS: A novel method for analyzing enhancer and gene activity patterns infers an extensive enhancer–promoter map
Recent sequencing technologies enable joint quantification of promoters and their enhancer regions, allowing inference of enhancer–promoter links. We show that current enhancer–promoter inference methods produce a high rate of false positive links. We introduce FOCS, a new inference method, and by benchmarking against ChIA-PET, HiChIP, and eQTL data show that it results in lower false discovery rates and at the same time higher inference power.
Hait et al. Genome Biology (2018) 19:56
https://doi.org/10.1186/s13059-018-1432-2

METHOD, Open Access

FOCS: a novel method for analyzing enhancer and gene activity patterns infers an extensive enhancer–promoter map

Tom Aharon Hait1,3, David Amar1,2, Ron Shamir1*† and Ran Elkon3,4*†

* Correspondence: rshamir@tau.ac.il; ranel@tauex.tau.ac.il
† Equal contributors
1 Blavatnik School of Computer Science, Tel Aviv University, 69978 Tel Aviv, Israel
3 Department of Human Molecular Genetics & Biochemistry, Sackler School of Medicine, Tel Aviv University, 69978 Tel Aviv, Israel
Full list of author information is available at the end of the article

© The Author(s). 2018 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Abstract

Recent sequencing technologies enable joint quantification of promoters and their enhancer regions, allowing inference of enhancer–promoter links. We show that current enhancer–promoter inference methods produce a high rate of false positive links. We introduce FOCS, a new inference method, and by benchmarking against ChIA-PET, HiChIP, and eQTL data show that it results in lower false discovery rates and at the same time higher inference power. By applying FOCS to 2630 samples taken from ENCODE, Roadmap Epigenomics, FANTOM5, and a new compendium of GRO-seq samples, we provide extensive enhancer–promoter maps (http://acgt.cs.tau.ac.il/focs). We illustrate the usability of our maps for deriving biological hypotheses.

Keywords: Enhancers, Promoters, Gene regulation, ENCODE, Roadmap, FANTOM5, GRO-seq, eRNA, ChIA-PET, eQTL

Background

Deciphering the regulatory role of the noncoding part of the human genome is a major challenge. With the completion of the sequencing of the genome, efforts have shifted over the past decade towards understanding the epigenome. These efforts aim at understanding regulatory mechanisms outside the protein-coding sequences that allow the production of thousands of different cell types from the same DNA blueprint. Enhancer elements that distally control the activity of target promoters play critical roles in this process. Consequently, large-scale epigenomic projects set out to identify all the cis-regulatory elements that are encoded in the genome. Prominent among them is the ENCODE consortium [1, 2], which applied a variety of epigenomics techniques to a large panel of human cell lines. By profiling epigenetic marks of regulatory activity (including DHS-seq profiling of DNase I hypersensitive sites (DHSs), which is accepted as a common feature of all active elements), ENCODE collectively identified hundreds of thousands of putative regulatory elements in the genome [2]. As ENCODE analyses were mainly applied to cancer cell lines, a follow-up project, the Roadmap Epigenomics, applied similar analyses to a large collection of human primary cells and tissues, in order to establish more physiological maps of common and cell type-specific putative regulatory elements [3].

Given the plethora of candidate enhancer regions called by these projects, the next pressing challenge is to identify which of them are actually functional and to map them to the genes they regulate. A naïve approach that is still widely used in genomic studies links enhancers to their nearest genes. Yet, emerging indications suggest that up to 50% of enhancers cross over their most proximal gene and control a more distal one [4]. A common approach that improves this naïve enhancer–promoter (E–P) mapping is based on pairwise correlation between activity patterns of promoters (P) and putative enhancers (E), and identifies E–P pairs, located within a distance limit, that show highly correlated patterns across many samples [2, 3]. However, this approach does not take into account interactions among multiple enhancers that control the same target promoter. Furthermore, Pearson correlation, which is typically applied for this task, is highly sensitive to outliers and thus prone to false positives.
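The point about outlier sensitivity can be illustrated with a small sketch. It is not the FOCS method and does not use the article's data: the activity values are invented toy numbers, the helper names (`pearson`, `ranks`, `spearman`) are hypothetical, and the rank-based (Spearman) correlation is included only as an assumed point of contrast. The sketch shows how a single sample in which both the enhancer and the promoter are extremely active can turn an essentially uncorrelated E–P pair into one that looks strongly linked under Pearson correlation.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))

def ranks(x):
    """Ranks 1..n of the values in x (assumes no ties, as in the toy data)."""
    order = np.argsort(x)
    r = np.empty(len(order), dtype=float)
    r[order] = np.arange(1, len(order) + 1)
    return r

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# Toy activity levels of one enhancer and one promoter across 10 samples.
# The values are invented and chosen so that the two are nearly uncorrelated.
enhancer = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
promoter = [5, 9, 2, 7, 1, 10, 4, 8, 3, 6]

# The same data plus one extra sample where both elements are extremely active.
enhancer_out = enhancer + [100]
promoter_out = promoter + [100]

r_clean = pearson(enhancer, promoter)            # close to zero
r_outlier = pearson(enhancer_out, promoter_out)  # close to one
rho_outlier = spearman(enhancer_out, promoter_out)  # stays low

print(f"Pearson without outlier:  {r_clean:.3f}")
print(f"Pearson with outlier:     {r_outlier:.3f}")
print(f"Spearman with outlier:    {rho_outlier:.3f}")
```

In this toy case, a correlation threshold applied to the Pearson value would call the pair a link on the strength of one sample alone, which is the kind of false positive the text warns about.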