@ftrivino
Forked from joragupra/stop_words.py
Created June 14, 2017 16:40
Using stop words when creating the classifier
from sklearn.feature_extraction.text import TfidfVectorizer

# Spanish function words to exclude from the TF-IDF vocabulary
prepositions = ['a','ante','bajo','cabe','con','contra','de','desde','en','entre','hacia','hasta','para','por','según','sin','so','sobre','tras']
prep_alike = ['durante','mediante','excepto','salvo','incluso','más','menos']
adverbs = ['no','si','sí']
articles = ['el','la','los','las','un','una','unos','unas','este','esta','estos','estas','aquel','aquella','aquellos','aquellas']
aux_verbs = ['he','has','ha','hemos','habéis','han','había','habías','habíamos','habíais','habían']
# Pass the combined list as stop words when building the vectorizer
tfid = TfidfVectorizer(stop_words=prepositions + prep_alike + adverbs + articles + aux_verbs)
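To see what a stop word list does to a sentence before any TF-IDF weighting, here is a rough pure-Python sketch of the filtering step. It is not scikit-learn's implementation: `filter_tokens` is a hypothetical helper, the stop word set is only an illustrative subset of the lists above, and the regular expression is assumed to match the vectorizer's documented default `token_pattern` (tokens of two or more word characters, after lowercasing).

```python
import re

# Illustrative subset of the Spanish stop words (assumption: not the full lists).
stop_words = {'a', 'de', 'en', 'para', 'por', 'sin', 'no',
              'el', 'la', 'los', 'un', 'una', 'ha', 'han'}

# Assumed to mirror the default token_pattern: two or more word characters.
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")


def filter_tokens(text, stop_words):
    """Lowercase the text, split it into tokens, and drop stop words."""
    tokens = TOKEN_PATTERN.findall(text.lower())
    return [t for t in tokens if t not in stop_words]


print(filter_tokens("El gato no ha salido de la casa", stop_words))
# prints ['gato', 'salido', 'casa']
```

Under that token pattern, single-character words such as 'a' are never produced as tokens in the first place, so listing them as stop words is harmless but has no effect.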