
Gene variation by Gene Ontology group in Drosophila genomes

Version 8 (May 2006). See Version 9 analysis (Dec 2006)

Deviations in gene match counts by GO category, for each species genome.

These deviations may indicate where species' genes differ in functional categories. Statistically significant deviations are brightly colored. Low counts or 'missing genes' may reflect sequence divergence rather than true absence; extra gene matches suggest that additional matching sequence is present in that genome.

The "gene match counts" here are High-scoring Segment Pair (HSP) groupings, and include various events: gene duplications, alternate splice exons within genes, new genes that appear to be composed of exons from other genes, as well as computational artifacts (see notes below). The detail pages provide links to GBrowse genome map views showing all secondary HSPs.
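
The notes below say that matches are kept at probability <= 1e-3 and that low-score matches contained in the location of better matches are removed. A minimal Python sketch of that filtering step follows. It is not the code used for this analysis: the `Hsp` record, its field names, and the choice of bit score as the measure of a "better" match are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hsp:
    """Hypothetical HSP record; field names are illustrative only."""
    query: str     # query protein id
    target: str    # genome scaffold or chromosome
    start: int     # start on target, inclusive
    end: int       # end on target, inclusive
    evalue: float  # match probability as reported by BLAST
    bits: float    # bit score


def filter_hsps(hsps, max_evalue=1e-3):
    """Keep HSPs at or below max_evalue, dropping any HSP whose target
    interval lies entirely inside an already kept, higher-scoring HSP."""
    candidates = [h for h in hsps if h.evalue <= max_evalue]
    kept = []
    # Visit best matches first so that weaker, contained ones are dropped.
    for h in sorted(candidates, key=lambda x: x.bits, reverse=True):
        contained = any(
            k.target == h.target and k.start <= h.start and h.end <= k.end
            for k in kept
        )
        if not contained:
            kept.append(h)
    return kept


if __name__ == "__main__":
    # Invented example values, not taken from the data tables.
    demo = [
        Hsp("protA", "scaf1", 100, 500, 1e-50, 200.0),
        Hsp("protB", "scaf1", 150, 300, 1e-5, 45.0),   # inside protA's match
        Hsp("protA", "scaf2", 10, 400, 1e-20, 120.0),  # second location
        Hsp("protC", "scaf1", 900, 1000, 0.5, 20.0),   # fails the cutoff
    ]
    for h in filter_hsps(demo):
        print(h.query, h.target, h.start, h.end)
```

Matches that only partially overlap a better match are kept by this sketch, which is consistent with the note that a secondary HSP group for one paralog can partially overlap primary matches to another.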

GO Class                using Fruitfly genes           using Mouse, Worm, Yeast genes
GO Molecular Function   12-GO_Function-fruitfly        12-GO_Function-notDM
GO Biological Process   12-GO_Process-fruitfly         12-GO_Process-notDM
GO Cell Location        12-GO_CellLocation-fruitfly    12-GO_CellLocation-notDM

Genome views of example duplications

  • Alternate exons (Dvir)

  • New gene, parts of 4 reproductive genes (Dsec)

  • New gene 2, parts of 4 reproductive genes (Dsec)

  • Duplicate gene (Dpse and Dper)

  • Duplicate region (Dper, not Dpse)

View Summary ... see also summary with C.elegans, Daphnia pulex and 4 Fruitflies


  • Genome averages of gene count and other protein match statistics are given in the genome-mean figures listed below (PNG and PDF).

  • GO-Slim groupings are used for Biological Process, Molecular Function and Cell Location (~125 categories). The file "euprot4go.tab.gz", listed below, gives the correspondence between MOD gene ids, GO primary ids, and the GO-slim grouping ids used here. Chris Mungall's GO map2slim software was used for this mapping. The GO associations current at the time of analysis were used for the genes in the BLAST analyses. Counts of the GO associations available for each proteome, and thus of the genes identified with GO groupings, are given below.

  • All tBLASTn protein matches with probability <= 1e-3 are included, duplicate matches among them. Low-score matches contained in the location of better matches are removed.

  • Gene counts are based on High-scoring Segment Pair (HSP) groupings, where a group is determined from the overlap of query protein parts and the overlap of target genome locations. Included are HSP groups that are distinct protein parts in the same gene region (alternate exons), as well as protein parts found at distinct genome locations. The data include computational artifacts, especially where paralogs exist: a secondary HSP group for paralog-A can partially overlap primary HSP matches to paralog-B.

  • Proteome source subsets are those organisms with extensive GO annotations: Dmel, Mouse, Worm, Yeast.

  • Target genomes analyzed include Drosophila species along with the outgroup species Anopheles gambiae, Daphnia pulex and C. elegans.

  • Data tables used in this analysis, extracted from BLAST output, are below:
      modDM.gob5stats.gz : fruitfly, 8109 GO genes of 13472 in proteome
      modMM.gob5stats.gz : mouse, 12732 GO genes of 18941 in proteome
      modCE.gob5stats.gz : worm, 8812 GO genes of 19764 in proteome
      modSC.gob5stats.gz : yeast, 5758 GO genes of 5777 in proteome

    Table fields match those in the genome-mean-wfmgenes8 figures:
      species DB query : genome target and gene source
      align eval bits exonHSP intronGap len1 : values of best gene match
      nparalog : number of distinct "gene" matches (HSP groups)
      dist12 len2 : distance to and length of 1st duplicate
      dist13 len3 : distance to and length of 2nd duplicate
      GOC GOID : GO class and GO-slim ID
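
For readers who want to tally gene match counts per species and GO-slim group from one of these tables, the following Python sketch shows one possible approach. It assumes the files are tab-delimited with columns in exactly the order of the field list above, and it treats the sum of `nparalog` over query genes as the "gene match count" for a group. Both points are assumptions that should be checked against the actual files; the function names are hypothetical.

```python
import csv
import gzip
from collections import defaultdict

# Assumed column order, copied from the field list above. The real files
# may have a header row or extra columns; verify before relying on this.
FIELDS = [
    "species", "DB", "query",
    "align", "eval", "bits", "exonHSP", "intronGap", "len1",
    "nparalog",
    "dist12", "len2",
    "dist13", "len3",
    "GOC", "GOID",
]


def read_rows(handle):
    """Yield one dict per data row from an open tab-delimited text handle."""
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < len(FIELDS) or row[0].startswith("#"):
            continue  # skip blank, short, or comment lines
        yield dict(zip(FIELDS, row))


def match_counts(rows):
    """Sum nparalog per (species, GOC, GOID) key."""
    totals = defaultdict(int)
    for row in rows:
        try:
            n = int(row["nparalog"])
        except ValueError:
            continue  # non-numeric value, for example a header line
        totals[(row["species"], row["GOC"], row["GOID"])] += n
    return dict(totals)


def load_counts(path):
    """Read a gzip-compressed stats table and return the summed counts."""
    with gzip.open(path, "rt", newline="") as handle:
        return match_counts(read_rows(handle))
```

A call such as `load_counts("modDM.gob5stats.gz")` would then return a dictionary keyed by species, GO class and GO-slim ID, which can be compared across species to look for the deviations described at the top of this page.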

Don Gilbert, May 2006
      Name                                   Last modified       Size  Description

[DIR] Parent Directory                       02-Jan-2007 17:51      -
[DIR] ngenes-12-GO_CellLocation-notDM-v8/    13-May-2006 15:47      -
[DIR] duplgene-examples/                     10-May-2006 12:14      -
[DIR] ngenes-6-GO_CellLocation-notDM-v8/     09-May-2006 20:42      -
[DIR] ngenes-6-GO_CellLocation-fruitfly-v8/  09-May-2006 20:41      -
[DIR] ngenes-12-GO_CellLocation-fruitfly-v8/ 09-May-2006 20:41      -
[DIR] ngenes-6-GO_Process-notDM-v8/          09-May-2006 20:40      -
[DIR] ngenes-6-GO_Process-fruitfly-v8/       09-May-2006 20:40      -
[DIR] ngenes-12-GO_Process-notDM-v8/         09-May-2006 20:40      -
[DIR] ngenes-12-GO_Process-fruitfly-v8/      09-May-2006 20:40      -
[DIR] ngenes-6-GO_Function-notDM-v8/         09-May-2006 20:39      -
[DIR] ngenes-6-GO_Function-fruitfly-v8/      09-May-2006 20:39      -
[DIR] ngenes-12-GO_Function-notDM-v8/        09-May-2006 20:39      -
[DIR] ngenes-12-GO_Function-fruitfly-v8/     09-May-2006 20:38      -
[TXT] 6-GO-summary.html                      09-May-2006 20:35    67k
[TXT] 12-GO-summary.html                     09-May-2006 20:07    67k
[IMG] genome-mean-dmelgene8.png              07-May-2006 19:25    12k
[   ] genome-mean-dmelgene8.pdf              07-May-2006 19:24    43k
[IMG] genome-mean-wfmgenes8.png              07-May-2006 19:18    16k
[   ] genome-mean-wfmgenes8.pdf              07-May-2006 19:18    84k
[   ] modSC.gob7stats.gz                     07-May-2006 17:33   1.8M
[   ] modCE.gob7stats.gz                     07-May-2006 17:33   6.0M
[   ] modMM.gob7stats.gz                     07-May-2006 17:33   6.8M
[   ] modDM.gob7stats.gz                     07-May-2006 17:32   5.6M
[IMG] dspp-genes-go-summary.png              05-May-2006 15:42   250k
[   ] euprot4go.tab.gz                       22-Apr-2006 19:23   942k

Developed at the Genome Informatics Lab of Indiana University Biology Department