Summary: | Fanconi anemia (FA) is a heterogeneous recessive disorder associated with a markedly elevated risk of developing cancer. To date, sixteen FA genes have been identified, three of which predispose heterozygous mutation carriers to breast cancer. The FA proteins work together in a genome maintenance pathway, the so-called FA/BRCA pathway, which is important during the S phase of the cell cycle. Since not all FA patients can be assigned to one of the sixteen known complementation groups, new FA genes remain to be identified. In addition, the complex FA network remains to be further unravelled. One of the FA genes, FANCI, was identified via a combination of genetic linkage and bioinformatic techniques exploiting FA protein properties. The aim of this study was to develop an approach for prioritizing proteins from the entire human proteome that potentially interact with the FA/BRCA pathway or are encoded by novel candidate FA genes. To this end, we combined the original bioinformatics approach, based on the properties of the first thirteen FA proteins identified, with publicly available tools for protein-protein interactions, literature mining (Nermal), and protein function prediction (FuncNet). Importantly, the three newest FA proteins, FANCO/RAD51C, FANCP/SLX4, and XRCC2, displayed scores in the range of the already known FA proteins. Consistent with this, a prime candidate FA gene that emerged from next-generation sequencing but received a very low score was subsequently disproven as a cause of the FA phenotype by functional studies. Furthermore, the approach strongly enriches for GO terms such as DNA repair and response to DNA damage stimulus, and for cell cycle-regulated genes. Additionally, overlaying the top 150 candidates with a haploinsufficiency probability score tailors the approach more closely to identifying breast cancer-related genes. This approach may be useful for prioritizing putative novel FA or breast cancer genes from next-generation sequencing efforts.
|