Possible RefSeq discrepancies - (Dec/22/2010)
Hello,
I extracted the record for a particular RefSeq accession from http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/refgene.txt.tgz
i.e.
908 NM_000419 chr17 - 42449549 42466873 42449731 42466841 30 42449549,42451721,42452026,42452365,42452958,42453200,42453452,42453675,42454376,42455065,42455729,42456010,42457056,42457369,42457613,42457753,42457967,42458246,42460860,42461261,42461452,42461678,42461905,42462315,42462531,42462653,42462918,42463191,42463377,42466653, 42449791,42451838,42452128,42452479,42453084,42453353,42453552,42453756,42454456,42455158,42455877,42456078,42457182,42457521,42457669,42457858,42458013,42458429,42461072,42461314,42461506,42461722,42461953,42462444,42462577,42462703,42463084,42463289,42463499,42466873, 0 ITGA2B cmpl cmpl 0,0,0,0,0,0,2,2,0,0,2,0,0,1,2,2,1,1,2,0,0,1,1,1,0,1,0,1,2,0,
This reports the first exon as 42449549 to 42449791, which is 242 bases long.
However, the record for NM_000419 (http://www.ncbi.nlm.nih.gov/nuccore/88758614) has exon 1 as 1->220.
Has anyone else seen such discrepancies? Or have I missed something obvious?
Thanks.
I think it is quite normal to see such discrepancies in gene annotation across different databases (UCSC, NCBI and Ensembl). For RefSeq, I tend to trust NCBI more, because RefSeq records are annotated by people at NCBI.
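Before putting this down to a database discrepancy, it may be worth checking the strand. The record is on the `-` strand, and the exon lists are given in genomic (ascending) order, so the first genomic block would be the transcript's last exon. Below is a minimal Python sketch of that check. It assumes the usual UCSC refGene column order (bin, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, ...) and 0-based, half-open coordinates, so an exon's length is end minus start. The function name `exon_lengths_in_transcript_order` is made up for illustration.

```python
# Record copied from the post. Fields are space-separated here;
# the UCSC file itself is tab-delimited.
RECORD = (
    "908 NM_000419 chr17 - 42449549 42466873 42449731 42466841 30 "
    "42449549,42451721,42452026,42452365,42452958,42453200,42453452,"
    "42453675,42454376,42455065,42455729,42456010,42457056,42457369,"
    "42457613,42457753,42457967,42458246,42460860,42461261,42461452,"
    "42461678,42461905,42462315,42462531,42462653,42462918,42463191,"
    "42463377,42466653, "
    "42449791,42451838,42452128,42452479,42453084,42453353,42453552,"
    "42453756,42454456,42455158,42455877,42456078,42457182,42457521,"
    "42457669,42457858,42458013,42458429,42461072,42461314,42461506,"
    "42461722,42461953,42462444,42462577,42462703,42463084,42463289,"
    "42463499,42466873, "
    "0 ITGA2B cmpl cmpl "
    "0,0,0,0,0,0,2,2,0,0,2,0,0,1,2,2,1,1,2,0,0,1,1,1,0,1,0,1,2,0,"
)


def exon_lengths_in_transcript_order(record):
    """Return exon lengths ordered 5'->3' along the transcript.

    Assumes refGene column order and 0-based, half-open coordinates.
    """
    fields = record.split()
    strand = fields[3]
    # The comma-separated lists end with a trailing comma, so drop empties.
    starts = [int(x) for x in fields[9].split(",") if x]
    ends = [int(x) for x in fields[10].split(",") if x]
    lengths = [end - start for start, end in zip(starts, ends)]
    # Exons are listed in genomic order; on the minus strand the
    # transcript reads them in the opposite direction.
    if strand == "-":
        lengths.reverse()
    return lengths


lengths = exon_lengths_in_transcript_order(RECORD)
print("transcript exon 1:", lengths[0])
print("transcript exon 30:", lengths[-1])
```

Under those assumptions, the 242-base block (42449549 to 42449791) comes out as the last exon of the transcript, and the block 42466653 to 42466873 comes out as exon 1 with a length of 220, which would agree with the 1->220 in the NCBI record.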