As the doctor gone rogue

July 2, 2012

UCSC genome browser human gene location?

Filed under: data management, genetics — hypotheses @ 4:16 pm

To download UCSC RefSeq gene info with the HUGO gene name, the coding region start and end, and the transcription start and end sites, locate the refFlat.txt.gz file in the annotation database:
– Hg18: http://hgdownload.cse.ucsc.edu/goldenPath/hg18/database/refFlat.txt.gz
– Hg19: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/refFlat.txt.gz
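The refFlat.txt.gz file contains the following tab-separated columns, as described in the refFlat.sql file:
– 1) HUGO gene name
– 2) RefSeq transcript accession
– 3) Chromosome
– 4) Strand (+/-)
– 5) Transcription start position
– 6) Transcription end position
– 7) Coding region start
– 8) Coding region end
– 9) Number of exons
– 10) Exon start positions (comma-separated)
– 11) Exon end positions (comma-separated)
Start positions are 0-based and end positions are 1-based, as in other UCSC database tables.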
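As a rough illustration of reading this file, here is a minimal Python sketch. It assumes the 11-field layout given in refFlat.sql (gene name, RefSeq transcript accession, chromosome, strand, transcription start and end, coding start and end, exon count, exon starts, exon ends), tab-separated. The names `Transcript`, `parse_refflat` and `read_refflat` are made up for this example and are not part of any UCSC tool.

```python
import gzip
from typing import Iterable, Iterator, NamedTuple, Tuple


class Transcript(NamedTuple):
    gene: str                      # gene name
    accession: str                 # RefSeq transcript accession
    chrom: str
    strand: str                    # "+" or "-"
    tx_start: int                  # transcription start (0-based)
    tx_end: int                    # transcription end
    cds_start: int                 # coding region start (0-based)
    cds_end: int                   # coding region end
    exon_starts: Tuple[int, ...]
    exon_ends: Tuple[int, ...]


def _int_list(field: str) -> Tuple[int, ...]:
    # Exon lists are comma-separated and usually end with a trailing comma.
    return tuple(int(x) for x in field.split(",") if x)


def parse_refflat(lines: Iterable[str]) -> Iterator[Transcript]:
    """Yield one Transcript per well-formed refFlat line."""
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 11:
            continue  # skip blank or malformed lines
        yield Transcript(
            fields[0], fields[1], fields[2], fields[3],
            int(fields[4]), int(fields[5]), int(fields[6]), int(fields[7]),
            _int_list(fields[9]), _int_list(fields[10]),
        )


def read_refflat(path: str) -> Iterator[Transcript]:
    """Read a gzipped refFlat file (assumed to be downloaded already)."""
    with gzip.open(path, "rt") as handle:
        yield from parse_refflat(handle)
```

With a local copy of the file, `for t in read_refflat("refFlat.txt.gz"): print(t.gene, t.chrom, t.tx_start, t.tx_end)` would list one line per transcript. A gene with several transcripts appears several times.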

This information should come in handy when you have to map the location of a genetic marker to genes. I will post the script I wrote to do this mapping later if anyone is interested.
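In the meantime, here is one way such a mapping could look. This is only a sketch, not the script mentioned above. It assumes marker positions are 1-based (as in most SNP lists) and that transcript starts are 0-based with 1-based ends, so a marker at position `pos` falls inside a transcript when `tx_start < pos <= tx_end`. The function names and the gene names in the usage note are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (tx_start, tx_end, gene) for one transcript
Interval = Tuple[int, int, str]


def build_index(
    records: Iterable[Tuple[str, str, int, int]]
) -> Dict[str, List[Interval]]:
    """Group (gene, chrom, tx_start, tx_end) records by chromosome."""
    index: Dict[str, List[Interval]] = defaultdict(list)
    for gene, chrom, tx_start, tx_end in records:
        index[chrom].append((tx_start, tx_end, gene))
    return index


def genes_at(index: Dict[str, List[Interval]], chrom: str, pos: int) -> List[str]:
    """Return the sorted, de-duplicated gene names whose transcribed
    region contains the 1-based position `pos` on `chrom`."""
    hits = {
        gene
        for tx_start, tx_end, gene in index.get(chrom, [])
        if tx_start < pos <= tx_end
    }
    return sorted(hits)
```

For example, `genes_at(build_index([("GENE_A", "chr1", 100, 200)]), "chr1", 150)` returns `["GENE_A"]`. The lookup scans every transcript on the chromosome, which is fine for a few thousand markers; for very large marker sets a sorted list or an interval tree would be a better fit.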
