CODE EXAMPLE FOR PYTHON

spacy remove stop words

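import spacy
nlp = spacy.load('en_core_web_sm')
# Parse each text so its tokens can be looped through individually
# (yourdataframe / yourtextcolumn are placeholders for your own pandas data)
docs = list(nlp.pipe(yourdataframe['yourtextcolumn'].apply(str)))
# Iterate over the parsed Doc objects and collect one token list per document
filtered_docs = []
for doc in docs:
    # Remove stop words by keeping only the non-stop-word tokens
    tokens = [token for token in doc if not token.is_stop]
    filtered_docs.append(tokens)
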
# Or you can remove stop words in your CountVectorizer step
import spacy
from sklearn.feature_extraction.text import CountVectorizer
nlp = spacy.load('en_core_web_sm')
# nlp.Defaults.stop_words is a set; recent scikit-learn versions expect a list here
cv = CountVectorizer(max_df=0.95, min_df=2, stop_words=list(nlp.Defaults.stop_words))
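
To try the token filter without a DataFrame or a model download, here is a minimal sketch. It assumes a blank English pipeline (spacy.blank) is enough, since is_stop is a lexical attribute that does not depend on a trained model. The sample texts and the remove_stop_words helper are hypothetical names used only for illustration.

```python
import spacy

# Assumption: a blank English pipeline (tokenizer only) is sufficient,
# because is_stop is looked up from the language's stop-word list.
nlp = spacy.blank("en")

# Hypothetical sample texts standing in for a DataFrame text column
texts = ["This is a sentence about the cat", "The dog and the bird are here"]

def remove_stop_words(texts):
    """Return, for each text, its non-stop-word tokens as plain strings."""
    return [
        [token.text for token in doc if not token.is_stop]
        for doc in nlp.pipe(texts)
    ]

filtered = remove_stop_words(texts)
print(filtered)
```

Returning token.text instead of Token objects keeps the result easy to join back into strings or pass to other libraries.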
 
Tagged: #spacy #remove #stop #words