@zorbax
Created June 2, 2021 09:57
#!/usr/bin/env python3
from Bio.Blast.Applications import NcbiblastnCommandline
import pandas as pd
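# Run blastn on NCBI's servers and capture the tabular (outfmt 6) report from stdout.
# NOTE: db='nt' is an assumption; a remote search needs a database and the original call omitted one.
cline = NcbiblastnCommandline(query='insulin.fasta', db='nt', remote=True, outfmt=6, out='-')
output = cline()[0].strip()
# outfmt 6 is tab-delimited, so split on tabs rather than on any whitespace.
rows = [line.split('\t') for line in output.splitlines()]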
cols = ['qseqid', 'sseqid', 'pident', 'length',
        'mismatch', 'gapopen', 'qstart', 'qend',
        'sstart', 'send', 'evalue', 'bitscore']
data_types = {'pident': float, 'length': int, 'mismatch': int,
              'gapopen': int, 'qstart': int, 'qend': int,
              'sstart': int, 'send': int, 'evalue': float,
              'bitscore': float}
df = pd.DataFrame(rows, columns=cols).astype(data_types)
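
The parsing step above can also be tried without running a BLAST search. The following is a minimal sketch, not the gist author's method: it assumes the tabular text is already in a string and lets `pandas.read_csv` do the splitting and type conversion. The helper name `parse_blast_tabular` and the two sample hits are hypothetical, made up purely for illustration.

```python
import io

import pandas as pd

# Default column order of BLAST tabular output (outfmt 6), as in the script above.
COLS = ['qseqid', 'sseqid', 'pident', 'length',
        'mismatch', 'gapopen', 'qstart', 'qend',
        'sstart', 'send', 'evalue', 'bitscore']

# Force float for columns whose values may look like integers (e.g. bitscore 616).
FLOAT_COLS = {'pident': float, 'evalue': float, 'bitscore': float}


def parse_blast_tabular(text: str) -> pd.DataFrame:
    """Parse tab-delimited BLAST outfmt 6 text into a typed DataFrame.

    Hypothetical helper for illustration only.
    """
    if not text.strip():
        # read_csv raises on empty input, so return an empty frame instead.
        return pd.DataFrame(columns=COLS)
    return pd.read_csv(io.StringIO(text), sep='\t', names=COLS,
                       dtype=FLOAT_COLS, comment='#')


# Hypothetical sample output: two invented hits, not real BLAST results.
SAMPLE = (
    "query1\tsubj_A\t100.000\t333\t0\t0\t1\t333\t60\t392\t2.5e-172\t616\n"
    "query1\tsubj_B\t98.799\t333\t4\t0\t1\t333\t60\t392\t1.1e-165\t593\n"
)

hits = parse_blast_tabular(SAMPLE)
print(hits[['sseqid', 'pident', 'evalue', 'bitscore']])
```

Passing `comment='#'` means the same helper should also tolerate the header lines that `outfmt 7` adds, although that case is not exercised here.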