Dear aphid genome annotators,

Here is a brief tutorial on how to search effectively for your genes of interest in the computed annotations at
  http://insects.eugenes.org/aphid/

As noted earlier, I produced a computationally annotated set of gene models, many of which are the same as or similar to NCBI's Gnomon/RefSeq set. However, about 4,000 models are not in NCBI's Gnomon set. Of the remaining 30,000 new Augustus-predicted models, about 1/2 are the same or nearly the same as NCBI's set, but the other 1/2 differ enough in protein that, if better than NCBI's, they would be worth an expert's time to look at.

These models are annotated with functional data from best matches to the UniProt and NCBI-RefProt genes. You can search for protein functions and find gene groups in your research area.

There is a basic keyword search at
  http://insects.eugenes.org/aphid/
For serious work, though, you may want to use the field search form to refine your search. See the Search [?] help page, or use this page:
  http://insects.eugenes.org/lucegene_aphid/searchfields.jsp

- Don Gilbert, July 2008

Here is an example from your list: CLIP Serine Proteases (SPHs)
---------------
http://insects.eugenes.org/lucegene_aphid/searchfields.jsp

Aphid.euGenes: Field Search

  Find in [ Aphid Gene Pages ] library records where
  [all of] the following conditions are met:   [Add Condition]
      (reset the above to [at least one] of the following conditions)

    The field [any] contains [all of the words] : CLIP Serine Protease
    The field [any] contains [at least one of the words] : SPH

  [Search]

Query: aphidgenexml-(+all:clip +all:serine +all:protease) all:sph
No. matches = 18 of 33716 documents

GeneID                Genome_map                  Description  docid
DGIL_AUG5s7550g3t1    SCAFFOLD7550:20386-23375    serine protease  eugenes:DGIL_AUG5s7550g3t1
DGIL_AUG5s4599g1t1    SCAFFOLD4599:154-766        serine protease  eugenes:DGIL_AUG5s4599g1t1
DGIL_AUG5s7537g4t1    SCAFFOLD7537:62126-63532    CLIP-domain serine protease subfamily D (AGAP002813-PB)  eugenes:DGIL_AUG5s7537g4t1
   .. this one is a tandem partial duplicate of RefProt XP_001944315 or DGIL_AUG5s7537g2t4 below
DGIL_AUG5s2502g14t1   SCAFFOLD2502:123907-139755  PREDICTED: similar to CLIP-domain serine protease subfamily A (AGAP006954-PA)  eugenes:DGIL_AUG5s2502g14t1
DGIL_AUG5s14589g2t1   SCAFFOLD14589:16443-19465   serine protease 1  eugenes:DGIL_AUG5s14589g2t1
DGIL_AUG5s10088g1t1   SCAFFOLD10088:4084-10365    PREDICTED: similar to coagulation factor-like protein 1  eugenes:DGIL_AUG5s10088g1t1
DGIL_AUG5s17035g1t1   SCAFFOLD17035:813-1159      serine protease  eugenes:DGIL_AUG5s17035g1t1
DGIL_AUG5s2013g11t1   SCAFFOLD2013:92598-96425    PREDICTED: similar to snake CG7996-PA  eugenes:DGIL_AUG5s2013g11t1
DGIL_AUG5s7537g2t4    SCAFFOLD7537:22526-43783    transmembrane protease, serine 9  eugenes:DGIL_AUG5s7537g2t4
   .. partial duplicate of DGIL_AUG5s7537g4t1 above
DGIL_AUG5s12507g3t1   SCAFFOLD12507:32261-36475   PREDICTED: similar to snake CG7996-PA  eugenes:DGIL_AUG5s12507g3t1
DGIL_AUG5s12507g10t1  SCAFFOLD12507:94417-99805   PREDICTED: similar to snake CG7996-PA  eugenes:DGIL_AUG5s12507g10t1
DGIL_AUG5s4220g1t1    SCAFFOLD4220:4246-8180      PREDICTED: similar to snake CG7996-PA  eugenes:DGIL_AUG5s4220g1t1
DGIL_AUG5s11537g2t1   SCAFFOLD11537:3082-7135     PREDICTED: similar to CG13318-PA  eugenes:DGIL_AUG5s11537g2t1
DGIL_AUG5s2013g16t1   SCAFFOLD2013:143255-147655  PREDICTED: similar to snake CG7996-PA  eugenes:DGIL_AUG5s2013g16t1
DGIL_AUG5s2013g17t1   SCAFFOLD2013:149911-161225  AGAP003627-PA  eugenes:DGIL_AUG5s2013g17t1
DGIL_AUG5s11082g5t1   SCAFFOLD11082:53416-56767   PREDICTED: similar to CG9372-PA  eugenes:DGIL_AUG5s11082g5t1
DGIL_AUG5s11082g6t1   SCAFFOLD11082:59816-64157   PREDICTED: similar to CG9372-PA  eugenes:DGIL_AUG5s11082g6t1
DGIL_AUG5s8498g1t1    SCAFFOLD8498:1-794          transcriptional regulator, AraC family  eugenes:DGIL_AUG5s8498g1t1
-------------

The key groups in the table of Stress, Immunity, and Defense genes at
  https://dgc.cgb.indiana.edu/display/aphid/Stress%2C+Immunity%2C+and+Defense
can be found with the following queries.

RECOGNITION:

  Peptidoglycan recognition proteins (PGRPs)
    Query: aphidgenexml-all:peptidoglycan all:pgrp
    No. matches = 10

  Necrotic (NECs)
    Query: aphidgenexml-all:necrotic all:nec
    No. matches = 2

  Persephone (PSHs)

  Gram negative binding proteins (GNBPs)
    Query: aphidgenexml-(+all:gram +all:negative +all:binding) all:gnb
    No. matches = 1

  Galectins
    Query: aphidgenexml-all:galectin
    No. matches = 2

  EGF motif genes (Eater, etc.)
    Query: aphidgenexml-all:egf all:eater
    No. matches = 287

SIGNALLING:

  CLIP Serine Proteases (SPHs)
    Query: aphidgenexml-(+all:clip +all:serine +all:protease) all:sph
    No. matches = 18

  Serpins (SPNs)
    Query: aphidgenexml-all:serpin all:spn
    No. matches = 21

  Toll Pathway
    Query: aphidgenexml-+all:toll +all:pathway
    No. matches = 5

  IMD/JNK Pathways
    Query: aphidgenexml-all:jnk all:imd
    No. matches = 11

  JAK/STAT Pathway
    Query: aphidgenexml-all:jak all:stat
    No. matches = 11

EFFECTORS:

  Antimicrobial Peptides
    Query: aphidgenexml-all:antimicrobial
    No. matches = 1

  Lysozymes
    Query: aphidgenexml-all:lysozyme
    No. matches = 0

  Melanization/Prophenoloxidase (PPOact, PPO)
    Query: aphidgenexml-all:melanization all:prophenoloxidase all:ppoact all:ppo
    No. matches = 9

  Thiolester containing protein (TEPs)
    Query: aphidgenexml-all:thiolester all:tep
    No. matches = 0

  Turandot (Tot)
    Query: aphidgenexml-all:turandot all:tot
    No. matches = 0

  Heat Shock Proteins (HSPs)
    Query: aphidgenexml-(+all:heat +all:shock +all:protein) all:hsp
    No. matches = 42

  Glutathione S transferase (GSTs)
    Query: aphidgenexml-(+all:glutathione +all:transferase) all:gst
    No. matches = 30

  Alarm pheromone related genes
    Query: aphidgenexml-+all:alarm +all:phermone
    No. matches = 0

  * DSCAM
    Query: aphidgenexml-all:dscam
    No. matches = 6

  * oxidative stress specific genes
    Query: aphidgenexml-+all:oxidative +all:stress
    No. matches = 1

  * encapsulation specific genes
    Query: aphidgenexml-all:encapsulation
    No. matches = 0
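
The "Query:" strings echoed by the search form follow a visible pattern: words from an [all of the words] condition get a leading "+", words from an [at least one of the words] condition do not, and a multi-word "+" group is parenthesized when it is combined with other conditions. The Python sketch below reproduces that pattern so you can preview a query before filling in the form. It is only an illustration inferred from the example queries in this message, not the server's own code; the function name build_query and its parameters are hypothetical, and it assumes the top-level setting is [at least one] of the conditions and that the form field [any] appears as "all:" in the echoed query.

```python
def build_query(conditions, field="all", library="aphidgenexml"):
    """Build a query string in the style echoed by the field search form.

    conditions: list of (mode, words) pairs, where mode is
      "all" for [all of the words] or "any" for [at least one of the words].
    The output format is inferred from the example queries (an assumption).
    """
    groups = []
    for mode, words in conditions:
        if mode not in ("all", "any"):
            raise ValueError("mode must be 'all' or 'any'")
        # One lowercase field:term entry per word.
        terms = [f"{field}:{word.lower()}" for word in words.split()]
        if mode == "all":
            # Required terms carry a leading '+'.
            terms = ["+" + term for term in terms]
        group = " ".join(terms)
        # Parenthesize a multi-word required group only when it is
        # combined with other conditions, as in the examples.
        if mode == "all" and len(terms) > 1 and len(conditions) > 1:
            group = f"({group})"
        groups.append(group)
    return f"{library}-" + " ".join(groups)


if __name__ == "__main__":
    print(build_query([("all", "CLIP Serine Protease"), ("any", "SPH")]))
    print(build_query([("all", "Toll Pathway")]))
```

The first call corresponds to the CLIP Serine Proteases example and the second to the Toll Pathway entry, so their output can be compared with the query strings listed above.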