Subject: Re: genome sequence coverages
Date: 14 Nov 2006
To: drospege@eugenes.org

|I am trying to find out what the relative sequence coverage was for
|specific species in the 12 Drosophila genome project. Specifically
|we would like to know the sequence coverage in D. simulans, D.
|sechellia, D. yakuba and D. erecta. Can you tell me if and where
|this information exists on the web.

Your question on sequence coverage could mean different things. One
meaning is how many times, on average, the whole genome shotgun (WGS)
sequencing and assembly overlapped or covered genome regions. For most
of the Dros. genomes this is 8 to 9 fold coverage. That includes all
the Agencourt assemblies (Dere, Dana, Dvir, Dmoj, Dgri), and Dyak
(WUSTL). These had about 4X WGS coverage: Dsec (4.86X, Broad), Dper
(4.10X, Broad), Dwil (Venter). Dpse (Baylor) was 8X I believe, as was
Dmel (probably). Dsim (WUSTL) is a special case: a mosaic assembly of
several strains, one strain at 4X coverage and 3 or 4 others at 1X
coverage. This Dsim assembly shows phylogenetic oddities because of
that.

If you mean how well the species genomes cover or match the Dmel
genome, here is an answer from DroSpeGe. In Figure 1 on this page, the
middle green line displays DNA genome coverage in summary form:

http://insects.eugenes.org/species/news/genome-summaries/dnacoverage.html

This green line shows "genome coverage" in the sense of overall
matching of DNA segments. It is based on BLAST matches of genome DNA,
which is one way, but not necessarily the most precise way, to measure
this. These BLAST match statistics count the matching base range, not
identical bases, so it is a rough measure.

Here is another way to look at the same BLAST match of genomes: of the
regions that align well, what percent of bases are identical? This
doesn't count regions that don't align well but can still have some
matching bases. Note also that there are multiple, duplicate
alignments, so the species nearest Dmel have more than a full genome
of matches.
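As a rough illustration of the two ratios discussed here, a minimal
Python sketch follows. It is not part of the DroSpeGe analysis; the
function names are made up for this example, and the numbers are
copied from the table in this message (only the four species asked
about are included).

```python
# Dmel genome size used as the reference, from the table in this message.
DMEL_BASES = 118357599

# species: (aligned bases, identical bases), copied from the same table.
ALIGNED = {
    "dsec": (167573291, 156630658),
    "dsim": (214792784, 202084159),
    "dyak": (115449735, 105385696),
    "dere": (107867313, 98257255),
}


def identity_ratio(aligned, identical):
    """Fraction of aligned bases that are identical (the I/A ratio)."""
    return identical / aligned


def aligned_fraction(aligned, dmel_bases=DMEL_BASES):
    """Aligned bases relative to the Dmel genome size.

    Values above 1.0 reflect multiple, duplicate alignments.
    """
    return aligned / dmel_bases


for species, (aligned, identical) in ALIGNED.items():
    print(
        f"{species}  I/A={identity_ratio(aligned, identical):.4f}  "
        f"aligned/Dmel={aligned_fraction(aligned):.2f}"
    )
```

An aligned/Dmel value above 1.0, as for dsec and dsim, is the "more
than a full genome of matches" effect mentioned above.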
BLAST-aligned to Dmel genome (118357599 Dmel bases)

        Aligned     Identical   I/A
        bases       bases       ratio
dsec    167573291   156630658   0.9347
dsim    214792784   202084159   0.9408
dyak    115449735   105385696   0.9128
dere    107867313    98257255   0.9109
dana     40494218    35796226   0.8840
dper     51626464    45858199   0.8883
dpse     26131078    23267705   0.8904
dwil     18893458    16534575   0.8751
dmoj     14231831    12445329   0.8745
dvir     18432531    16072818   0.8720
dgri     15129933    13232344   0.8746

- Don Gilbert
-- d.gilbert--bioinformatics--indiana-u--bloomington-in-47405
-- gilbertd@indiana.edu--http://marmot.bio.indiana.edu/