deduplication jaccard python

s1 = "what's the flight time from Berlin to Helsinki?"
s2 = "how long does it take to fly from Berlin to Helsinki?"

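# Build the set of character 4-grams (shingles) for each string
k = 4
shingles1 = {s1[i:i + k] for i in range(len(s1) - k + 1)}
shingles2 = {s2[i:i + k] for i in range(len(s2) - k + 1)}

# Jaccard similarity: size of the intersection divided by size of the union
print(len(shingles1 & shingles2) / len(shingles1 | shingles2))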
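
The snippet above only scores a single pair of strings. The following is a minimal sketch of how that score might be used to deduplicate a list, assuming a greedy "first seen wins" policy and a similarity threshold of 0.5. Both choices, and the helper names `shingles`, `jaccard`, and `deduplicate`, are illustrative assumptions and not part of the original snippet.

```python
def shingles(text, k=4):
    """Return the set of character k-grams of text.

    Texts shorter than k are treated as a single shingle (an assumption
    made here so that short strings still compare sensibly).
    """
    if len(text) < k:
        return {text}
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def deduplicate(texts, threshold=0.5, k=4):
    """Keep each text only if it is below the threshold against all kept texts."""
    kept = []  # list of (text, shingle set) pairs, in first-seen order
    for text in texts:
        current = shingles(text, k)
        if all(jaccard(current, other) < threshold for _, other in kept):
            kept.append((text, current))
    return [text for text, _ in kept]


if __name__ == "__main__":
    samples = ["hello world", "hello world!", "completely different"]
    print(deduplicate(samples))
```

This compares every new text against every kept text, so it is quadratic in the number of inputs. That is usually fine for a few thousand short strings; for larger collections an approximate method such as MinHash would likely be a better fit.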