Nawshad Farruque (nawshad), GitHub Gists
nawshad / collect-timeline.py
Created October 15, 2020 07:59 — forked from emallson/collect-timeline.py
Iterating over user timelines with Twython
from twython import Twython, TwythonRateLimitError, TwythonError
from glob import glob
from util import sleep_until
from csv import DictReader, DictWriter
import os
APP_KEY = ''
ACCESS_TOKEN = ''
tw = Twython(APP_KEY, access_token=ACCESS_TOKEN)
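The gist's collection loop is cut off in this preview. A minimal sketch of the max_id-based cursoring it would need, assuming valid credentials; `fetch_timeline` and `next_max_id` are illustrative names, not from the gist, and the rate-limit handling stands in for the gist's `sleep_until` helper:

```python
import time

def next_max_id(tweets):
    """Cursor for the next page: max_id is inclusive in the Twitter API,
    so subtract one from the lowest tweet id already collected."""
    return min(t["id"] for t in tweets) - 1

def fetch_timeline(tw, screen_name, pages=5):
    """Collect up to pages * 200 tweets for one user, sleeping until the
    rate-limit window resets when Twitter pushes back."""
    from twython import TwythonRateLimitError  # deferred so the sketch loads without twython
    collected, max_id = [], None
    for _ in range(pages):
        try:
            kwargs = {"screen_name": screen_name, "count": 200}
            if max_id is not None:
                kwargs["max_id"] = max_id
            page = tw.get_user_timeline(**kwargs)
        except TwythonRateLimitError as e:
            # retry_after holds the reset timestamp (assumption based on twython's docs)
            time.sleep(max(0, float(e.retry_after) - time.time()) + 1)
            continue
        if not page:
            break
        collected.extend(page)
        max_id = next_max_id(page)
    return collected
```

Cursoring by max_id rather than page number avoids skipping or duplicating tweets when new ones arrive mid-collection.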
nawshad / text_preprocessing.py
Created September 25, 2020 09:23 — forked from MrEliptik/text_preprocessing.py
A Python script to preprocess text (remove URLs, lowercase, tokenize, etc.)
import re, string, unicodedata
import nltk
import contractions
import inflect
from nltk import word_tokenize, sent_tokenize
from nltk.corpus import stopwords
from nltk.stem import LancasterStemmer, WordNetLemmatizer
def replace_contractions(text):
    """Replace contractions in string of text"""
    return contractions.fix(text)
nawshad / pad_packed_demo.py
Created April 3, 2020 04:08 — forked from HarshTrivedi/pad_packed_demo.py
Minimal tutorial on packing (pack_padded_sequence) and unpacking (pad_packed_sequence) sequences in pytorch.
import torch
from torch import LongTensor
from torch.nn import Embedding, LSTM
from torch.autograd import Variable  # deprecated since PyTorch 0.4; tensors can be used directly
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence
## We want to run LSTM on a batch of 3 character sequences ['long_str', 'tiny', 'medium']
#
# Step 1: Construct Vocabulary
# Step 2: Load indexed data (list of instances, where each instance is list of character indices)
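Steps 1 and 2 above can be sketched without torch; reserving index 0 for padding is an assumption carried over to the later packing call, and `lengths` is exactly what `pack_padded_sequence` needs:

```python
seqs = ['long_str', 'tiny', 'medium']

# Step 1: construct the character vocabulary (index 0 reserved for padding)
vocab = {'<pad>': 0}
for ch in sorted(set("".join(seqs))):
    vocab[ch] = len(vocab)

# Step 2: index each sequence, then pad every row to the longest length
indexed = [[vocab[ch] for ch in s] for s in seqs]
lengths = [len(s) for s in indexed]      # true lengths, needed by pack_padded_sequence
max_len = max(lengths)
padded = [s + [0] * (max_len - len(s)) for s in indexed]
```

Packing then lets the LSTM skip the trailing zeros entirely, so the padding never contaminates the hidden states.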