Summary: | Thirty fosmids were randomly selected from a library of subsp. (cv. Pawtuckaway) DNA. Each fosmid clone was sheared individually, and fragments of ∼4 to 5 kb were subcloned. For each fosmid, the subclones on a single 384-well plate were sequenced bidirectionally. Assembly of these data yielded 12 completely sequenced fosmid inserts, 14 inserts assembled as 2 to 3 contiguous sequences (contigs), and 4 inserts assembled as 5 to 9 contigs. In most cases, a single unambiguous contig order and orientation could be determined, so no further finishing was required to identify genes and their relative arrangement. One hundred fifty-eight genes were identified in the ∼1.0 Mb of nuclear genomic DNA that was assembled. Because the fosmids were chosen at random, these results could be extrapolated to the entire ∼200 Mb genome, which is predicted to contain about 30,500 protein-encoding genes plus more than 4700 truncated gene fragments. The genes are mostly arranged in gene-rich regions and are intermixed, to a variable degree, with transposable elements (TEs). The most abundant TEs were long terminal repeat (LTR) retrotransposons, which comprised about 13% of the DNA analyzed. More than 30 new repeat families, mostly TEs, were discovered, and the total TE content of the genome is predicted to be at least 16%.
|
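The genome-wide gene estimate in the summary rests on a simple density extrapolation from a random sample. The sketch below illustrates that reasoning only; the function name and the straight linear scaling are assumptions for illustration, not the authors' stated procedure. With the rounded inputs from the summary (158 genes, ∼1.0 Mb sampled, ∼200 Mb genome) it gives a figure in the same range as the reported ∼30,500, and the difference is plausibly due to rounding of the sampled length or to corrections not described here.

```python
def extrapolate_gene_count(genes_in_sample, sample_mb, genome_mb):
    """Scale a gene count from a random sample up to the whole genome.

    Assumes (hypothetically) that gene density in the randomly chosen
    fosmids is representative of the genome as a whole.
    """
    if sample_mb <= 0:
        raise ValueError("sample_mb must be positive")
    genes_per_mb = genes_in_sample / sample_mb  # observed gene density
    return genes_per_mb * genome_mb


# Rounded values taken from the summary text.
estimate = extrapolate_gene_count(genes_in_sample=158, sample_mb=1.0, genome_mb=200.0)
print(f"Naive genome-wide estimate: {estimate:.0f} genes")
```

Because the estimate scales inversely with the sampled length, a small change in the ∼1.0 Mb figure shifts the result by several hundred genes, so the rounded inputs should not be expected to reproduce the published number exactly.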