| #!/usr/bin/env python3 | |
| """ | |
| Filter sequences from a multifasta file based on an exclusion list. | |
| Usage: | |
| python filter_fasta.py -i input.fasta -e exclude_ids.txt -o output.fasta | |
| The exclude list should have one sequence ID per line (without the leading '>'). | |
| Matching is done against the first word of each FASTA header. | |
| """ |
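The docstring above describes the filtering rule, but the script body is not shown here. A minimal sketch of that rule, using only the standard library, might look like the following. The function names `load_exclude_ids` and `filter_fasta` are hypothetical and are not taken from the original script, which may well be implemented differently (for example with Biopython).

```python
import io


def load_exclude_ids(lines):
    """Collect one ID per non-empty line; a leading '>' is tolerated and dropped."""
    ids = set()
    for line in lines:
        name = line.strip()
        if name:
            ids.add(name.lstrip(">"))
    return ids


def filter_fasta(in_handle, out_handle, exclude_ids):
    """Copy records whose first header word is not in exclude_ids.

    Returns a (kept, dropped) tuple of record counts.
    """
    kept = dropped = 0
    keep = False  # lines before the first header are not copied
    for line in in_handle:
        if line.startswith(">"):
            words = line[1:].split()
            seq_id = words[0] if words else ""
            keep = seq_id not in exclude_ids
            if keep:
                kept += 1
            else:
                dropped += 1
        if keep:
            out_handle.write(line)
    return kept, dropped


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    fasta = io.StringIO(">seq1 first\nACGT\n>seq2\nGGNN\nTT\n>seq3 third\nCCCC\n")
    out = io.StringIO()
    counts = filter_fasta(fasta, out, load_exclude_ids(["seq2\n"]))
    print(counts)
    print(out.getvalue(), end="")
```

Streaming line by line keeps memory use flat for large multifasta files, and holding the IDs in a set makes each header lookup constant time.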
| grep -w -F -f taxon.txt *report.txt | awk 'BEGIN{OFS="\t"}{ print $1,$2}' > tb.txt | |
| # taxon.txt contains the taxon name(s) to summarize, one per line, e.g. Mycobacterium tuberculosis complex | |
| # *report.txt is the wildcard that selects multiple Kraken report files | |
| # tb.txt is the output TSV file |
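The same summary can be sketched in Python for cases where exact matching on the name column is preferable to `grep -w -F`, which matches the taxon anywhere in the line. This assumes the standard tab-separated Kraken report layout, with the percentage in the first column and the indented taxon name in the last. The function name `summarize` and the sample rows are hypothetical.

```python
def summarize(reports, taxon):
    """Return (report_name, percent) pairs for rows whose name column equals taxon.

    `reports` maps a report name to an iterable of its lines.
    """
    rows = []
    for name, lines in reports.items():
        for line in lines:
            fields = line.rstrip("\n").split("\t")
            # Assumption: at least six columns, taxon name in the last one.
            if len(fields) >= 6 and fields[-1].strip() == taxon:
                rows.append((name, fields[0].strip()))
    return rows


if __name__ == "__main__":
    # Made-up report rows, for illustration only.
    reports = {
        "s1.report.txt": [
            " 80.00\t800\t0\tG\t1763\t  Mycobacterium\n",
            " 12.50\t125\t5\tG1\t77643\t    Mycobacterium tuberculosis complex\n",
        ],
    }
    for name, percent in summarize(reports, "Mycobacterium tuberculosis complex"):
        print(name, percent, sep="\t")
```

With real files, each value in `reports` could simply be an open file handle, since the function only iterates over lines.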
| #!/usr/bin/env python | |
| from Bio import SeqIO | |
| fasta = "the_fasta_file.fasta" | |
| for record in SeqIO.parse(fasta, "fasta"): | |
|     print("ID: %s" % record.id) | |
|     print("Sequence length: %s" % len(record)) | |
|     print("Number of Ns: %s" % record.seq.count('N')) |
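| # Mean depth over the positions samtools reports; zero-coverage positions are omitted unless -a is given | |
| samtools depth CIV3724802_ref_bwa_sorted.bam | awk '{sum+=$3} END { if (NR > 0) print "Average = ", sum/NR }' |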
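The averaging step can also be sketched in Python, which makes the choice of denominator explicit. This assumes `samtools depth` output for a single BAM file, i.e. three tab-separated columns (reference name, position, depth). The function name `mean_depth` and its `genome_length` parameter are hypothetical.

```python
def mean_depth(depth_lines, genome_length=None):
    """Average the third column of `samtools depth` output.

    If genome_length is given, divide by it instead of by the number of
    reported positions, so unreported (zero-coverage) positions count as 0.
    """
    total = 0
    positions = 0
    for line in depth_lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        total += int(parts[2])
        positions += 1
    denominator = genome_length if genome_length else positions
    return total / denominator if denominator else 0.0


if __name__ == "__main__":
    import sys

    # Example: samtools depth sample.bam | python mean_depth.py
    print("Average =", mean_depth(sys.stdin))
```

Dividing by the number of reported positions reproduces the awk one-liner; passing the reference length gives the mean over the whole genome instead.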
| for i in $(cat TEXT_FILE)  # TEXT_FILE lists the record file IDs | |
| do | |
|     echo "$i" | |
|     SHELL COMMAND [OPTIONS] "$i"  # command run with the file ID | |
| done |
| arr=( *.fastq )  # FASTQ files in the current directory, sorted by name | |
| total_files=${#arr[@]}  # count the same files that the loop indexes | |
| echo "mapping started" >> map.log | |
| echo "---------------" >> map.log | |
| for ((i=0; i<$total_files; i+=2)) | |
| { | |
|     ref_genome=../ref.gb | |
|     sample_name=`echo ${arr[$i]} | awk -F "_" '{print $1}'` | |
|     echo "[mapping running for] $sample_name" |