@JasonKessler
Last active March 9, 2018 19:42
Positive vs. Negative ICLR Reviews LORIDP
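import pandas as pd
import spacy
import scattertext as st

# Load the ICLR 2018 reviews and parse each one with spaCy (dependency parser disabled).
reviews_df = pd.read_csv('https://github.com/JasonKessler/ICLR18ReviewVis/raw/master/iclr2018_reviews.csv.bz2')
reviews_df['parse'] = reviews_df['review'].apply(spacy.load('en', parser=False))

# Build the corpus over all decisions, then drop workshop papers from the comparison.
full_corpus = st.CorpusFromParsedDocuments(reviews_df, category_col='decision', parsed_col='parse').build()
corpus = full_corpus.remove_categories(['Workshop'])

# Use term frequencies from every category (including Workshop) as the informative prior.
priors = (st.PriorFactory(full_corpus, term_ranker=st.OncePerDocFrequencyRanker)
          .use_all_categories()
          .align_to_target(corpus)
          .get_priors())

# Score Accept vs. Reject terms; grey_threshold is the z-score cutoff for greying out terms.
html = st.produce_frequency_explorer(
    corpus,
    category='Accept',
    not_categories=['Reject'],
    term_ranker=st.OncePerDocFrequencyRanker,
    term_scorer=st.LogOddsRatioInformativeDirichletPrior(priors, 1),
    grey_threshold=1.64,
    metadata=corpus.get_df()['metadata'])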
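
For readers unfamiliar with the scorer named in the title, here is a minimal sketch of the log-odds-ratio z-score with an informative Dirichlet prior (Monroe et al., 2008) for a single term. It illustrates the general formula only and is not Scattertext's implementation, which may scale the priors differently. The function name, argument names, and all counts are hypothetical.

```python
import math

def log_odds_z_score(count_a, total_a, count_b, total_b, prior, prior_total):
    """z-score of the prior-smoothed log-odds ratio for one term (illustrative only)."""
    # Smoothed log-odds of the term within each category.
    log_odds_a = math.log((count_a + prior) / (total_a + prior_total - count_a - prior))
    log_odds_b = math.log((count_b + prior) / (total_b + prior_total - count_b - prior))
    delta = log_odds_a - log_odds_b
    # Approximate variance of the log-odds difference.
    variance = 1.0 / (count_a + prior) + 1.0 / (count_b + prior)
    return delta / math.sqrt(variance)

# Hypothetical counts: a term seen in 40 of 1000 "Accept" reviews and 10 of 1000 "Reject" reviews,
# with a prior count of 5 for the term out of a prior total of 500.
z = log_odds_z_score(count_a=40, total_a=1000, count_b=10, total_b=1000, prior=5, prior_total=500)

# Mirrors the grey_threshold used above: terms with |z| under 1.64 would be treated as uninformative.
is_grey = abs(z) < 1.64
```

A positive z-score means the term leans toward the first category, a negative one toward the second.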