Arthropod EST map errors by Assembly unit size
ESTs from the arthropod species listed here are mapped to their genome
assemblies, and the mapping errors are counted. Error density is plotted against
assembly unit (scaffold) bins ordered from large to small. Each bin holds approximately
the same number of bases, so small scaffolds fall in the higher bins (14..25).
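The binning described above can be sketched as follows. This is a minimal illustration, not the script used for these plots: the function names, the input shapes (a dict of scaffold lengths, a list of per-EST hits) and the number of bins are all assumptions.

```python
def bin_scaffolds(scaffold_lengths, n_bins):
    """Assign scaffolds to bins of roughly equal total bases, largest first.

    scaffold_lengths: dict of scaffold id -> length in bases (hypothetical input).
    Returns a dict of scaffold id -> bin number, where bin 1 holds the largest scaffolds.
    """
    total = sum(scaffold_lengths.values())
    target = total / n_bins  # bases each bin should hold, approximately
    ordered = sorted(scaffold_lengths.items(), key=lambda kv: kv[1], reverse=True)
    bins = {}
    cumulative = 0
    for name, length in ordered:
        # The bin is chosen from the bases accumulated before this scaffold.
        bins[name] = min(int(cumulative // target) + 1, n_bins)
        cumulative += length
    return bins


def error_fraction_by_bin(bins, est_hits):
    """Fraction of error ESTs per bin.

    est_hits: iterable of (scaffold id, is_error) pairs, one per EST (hypothetical input).
    """
    counts = {}
    for scaffold, is_error in est_hits:
        b = bins[scaffold]
        ok, err = counts.get(b, (0, 0))
        counts[b] = (ok + (not is_error), err + bool(is_error))
    return {b: err / (ok + err) for b, (ok, err) in sorted(counts.items())}
```

Because whole scaffolds are never split, bins only approximate equal size, and the first bin may hold a single very large scaffold.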
This tests whether small scaffolds carry a higher proportion of errors (H1)
or the same proportion as large scaffolds (null hypothesis). Small scaffolds
tend to have lower assembly coverage and more missing and mis-called bases.
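One way to compare H1 with the null hypothesis numerically is a one-sided two-proportion z-test between the small-scaffold bins and the large-scaffold bins. The source does not say which test, if any, was applied, so this is a substituted technique offered as a sketch; the function name and the grouping of bins into two classes are assumptions.

```python
import math


def two_proportion_z(err_small, n_small, err_large, n_large):
    """One-sided z-test: is the error fraction higher on small scaffolds?

    err_*: error EST counts; n_*: total EST counts (errors plus OK) in each group.
    Returns (z, p_value), where p_value is the upper-tail normal probability.
    """
    p_small = err_small / n_small
    p_large = err_large / n_large
    # Pooled error fraction under the null hypothesis of one common rate.
    pooled = (err_small + err_large) / (n_small + n_large)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_small + 1 / n_large))
    z = (p_small - p_large) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z > z) for a standard normal
    return z, p_value


# Hypothetical counts for illustration only, not taken from the tables here.
z, p = two_proportion_z(err_small=300, n_small=1000, err_large=200, n_large=1000)
```

A small p_value would favour H1. With EST counts in the hundreds of thousands, even minor differences become significant, so the size of the difference is worth reporting alongside the test.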
Errors measured (red lines) versus uniquely, perfectly mapped ESTs (black lines):
1. ESTs with duplicate locations (GMAP alignment with > 95% identity in 2+ locations)
2. EST mapping failures (poor alignment, < 90% identity)
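The two error classes can be expressed as a small classifier over the percent identities of one EST's alignment locations. This is a sketch under assumptions: the function name and input shape are hypothetical, parsing of GMAP output is not shown, and the source does not say how an EST whose best hit falls between 90% and 95% is treated, so that case is returned separately instead of guessed.

```python
def classify_est(identities, dup_min=95.0, fail_max=90.0):
    """Classify one EST from the percent identity of each of its alignment locations.

    identities: list of floats, one per location; an empty list means no alignment.
    Thresholds follow the text: > 95% in 2+ locations is a duplicate,
    < 90% (or no alignment) is a mapping failure.
    """
    good = [i for i in identities if i > dup_min]
    if len(good) >= 2:
        return "duplicate"
    if not identities or max(identities) < fail_max:
        return "failure"
    if len(good) == 1:
        return "unique"  # assumption: one location above the duplicate threshold
    return "unclassified"  # best hit between the two thresholds; not defined in the source


print(classify_est([99.0, 97.5]))
print(classify_est([85.0]))
```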
Species: pea aphid, Daphnia pulex (water flea),
Nasonia vitripennis (jewel wasp), Ixodes scapularis (tick),
and the fruit flies Drosophila erecta, Drosophila mojavensis, Drosophila grimshawi
Don Gilbert, Nov. 2009
EST low Identity Errors
Species       OK  Error  pError
aphid    136041  21869  0.138
bombyx   209121  30964  0.129
daphnia  114128  31450  0.216
ixodes   130902  54881  0.295
nasonia  147382  20441  0.122
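The pError column is consistent with Error / (OK + Error). That formula is inferred from the numbers and is not stated in the source. A short check, with the counts copied from the table and a hypothetical helper name:

```python
# (OK, Error) counts per species, copied from the low-identity table.
low_identity = {
    "aphid": (136041, 21869),
    "bombyx": (209121, 30964),
    "daphnia": (114128, 31450),
    "ixodes": (130902, 54881),
    "nasonia": (147382, 20441),
}


def p_error(ok, error):
    """Fraction of ESTs that map with low identity."""
    return error / (ok + error)


for species, (ok, error) in low_identity.items():
    print(f"{species:8s} {p_error(ok, error):.3f}")
```

Rounded to three decimals, the printed values reproduce the pError column.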
EST Duplicate locations
Species     One   Diff  Split  Same    Err  p.Diff   p.Split   p.Same    p.Err
aphid    132559  10955   1535   729  14698  0.0683   0.00957   0.00454   0.0916
bombyx   209897  16456   1578  3516   7764  0.0688   0.00660   0.01470+  0.0325
daphnia  115809   6112    462  1594  24820  0.0411   0.00310   0.01071+  0.1668+
ixodes   149662   9914   5388   630  20614  0.0532   0.02894+  0.00338   0.1107
nasvit   156288   3499   1698   197  11268  0.0202   0.00982   0.00114   0.0652
Duplicate key: One = EST has one unique location; Diff = EST duplicates on 2 scaffolds;
Split = EST is split between scaffolds; Same = EST duplicates on the same scaffold;
Err = EST fails to map (low or zero identity)
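Each p. column in the duplicate table matches the class count divided by the row total (One + Diff + Split + Same + Err). As with pError, this is inferred from the numbers and not stated in the source. A minimal check, using a hypothetical helper name:

```python
# Class counts per species, copied from the duplicate-location table.
dup_counts = {
    "aphid": {"One": 132559, "Diff": 10955, "Split": 1535, "Same": 729, "Err": 14698},
    "bombyx": {"One": 209897, "Diff": 16456, "Split": 1578, "Same": 3516, "Err": 7764},
    "daphnia": {"One": 115809, "Diff": 6112, "Split": 462, "Same": 1594, "Err": 24820},
    "ixodes": {"One": 149662, "Diff": 9914, "Split": 5388, "Same": 630, "Err": 20614},
    "nasvit": {"One": 156288, "Diff": 3499, "Split": 1698, "Same": 197, "Err": 11268},
}


def class_fractions(counts):
    """Fraction of all ESTs in each non-unique class (Diff, Split, Same, Err)."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items() if name != "One"}


for species, counts in dup_counts.items():
    fractions = class_fractions(counts)
    print(species, {name: round(f, 5) for name, f in fractions.items()})
```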
[Figure: EST Duplicates (red) by Scaffold size. Panels: Aphid, Bombyx, Daphnia, Ixodes, Nasonia.
Scaffold sizes (X axis) are from Large (left) to small (right).]
[Figure: EST low Identity Errors (red) by Scaffold size. Panels: Aphid, Bombyx, Daphnia, Ixodes, Nasonia.
Scaffold sizes (X axis) are from Large (left) to small (right).]