"""
Python script that reads a reference genome of RefSeq IDs and outputs several tab-delimited BED (text) files, suitable for use with bedtools coverage, for counting ChIP-seq reads that map to various gene features.
All output files have the structure expected by bedtools, namely:
CHROM POSITION1 POSITION2 REFSEQ_ID
Possible output files include:
1. distal promoter (transcription start [-5KB,-1KB]; KB means kilobase pairs, not kilobytes)
2. proximal promoter (transcription start [-1KB,1KB])
3. gene body (anywhere between transcription start and transcription end)
4. transcript (anywhere in an exon); outputs each exon as a separate line
5. first 1/3 of transcript; outputs each exon as a separate line
