SCRIPT & CODE EXAMPLE
 

PYTHON

Dropping Duplicates

# Duplicates are identified by two columns, the dog's name and its breed, so two
# dogs can share a name as long as their breeds differ.

df.drop_duplicates(subset=["name", "breed"])

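# Without keeping the original index: drop_duplicates() has no `index` argument,
# so use ignore_index=True (pandas 1.0+) to relabel the result 0..n-1
df.drop_duplicates(subset=["name", "breed"], ignore_index=True)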
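
A minimal, self-contained sketch of the two calls above. The `dogs` DataFrame and its values are hypothetical sample data, not part of the original snippet; it only illustrates that rows count as duplicates when both `name` and `breed` match.

```python
import pandas as pd

# Hypothetical sample data: "Rex" the Labrador appears twice
dogs = pd.DataFrame({
    "name": ["Rex", "Rex", "Rex", "Bella"],
    "breed": ["Labrador", "Poodle", "Labrador", "Beagle"],
    "age": [3, 5, 4, 2],
})

# Keeps the first row for each (name, breed) pair; original index labels survive
unique_dogs = dogs.drop_duplicates(subset=["name", "breed"])

# Same rows, but the index is relabelled 0..n-1
reindexed = dogs.drop_duplicates(subset=["name", "breed"], ignore_index=True)

print(unique_dogs)
print(reindexed)
```

Both results hold the same three rows. Only the index labels differ, because the second call discards the original index.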

.drop_duplicates()

import pandas as pd

df = pd.DataFrame({"Date": ["2022", "2022", "2021", "2021", "2020", "2020"], "Time": ["20:00", "20:00", "20:00", "21:00", "22:00", "22:00"]})
df.drop_duplicates()

#output
#	Date	Time
#	2022	20:00
#	2021	20:00
#	2021	21:00
#	2020	22:00
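
To check which rows would be dropped before dropping them, one option is `DataFrame.duplicated()`, which returns a boolean mask. The sketch below reuses the same data; the variable names `mask` and `deduped` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["2022", "2022", "2021", "2021", "2020", "2020"],
    "Time": ["20:00", "20:00", "20:00", "21:00", "22:00", "22:00"],
})

# True marks a row that repeats an earlier row in every column
mask = df.duplicated()
print(mask.tolist())

# Filtering on the inverted mask keeps the same rows as drop_duplicates()
deduped = df[~mask]
print(deduped)
```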

Remove Duplicate Files

from pathlib import Path
import hashlib
import os

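def remove_duplicate(path):
    # Maps each content hash to the first file seen with that content
    unique = {}
    for file in Path(path).rglob('*'):
        if file.is_file():
            # read_bytes() closes the file again, so it is not open when removed
            filehash = hashlib.md5(file.read_bytes()).hexdigest()
            if filehash not in unique:
                unique[filehash] = file
            else:
                # Test print before removing: target the duplicate, not the kept original
                print(f'Removing --> {file} (duplicate of {unique[filehash]})')
                #os.remove(file)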

if __name__ == '__main__':
    path = r'C:foo'
    remove_duplicate(path)
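
A hedged way to try the same hashing idea without touching real files is to run it against a temporary directory. In this sketch `find_duplicates` is a hypothetical helper (not part of the snippet above) that only reports duplicates instead of removing them. The file names and contents are made up, and the traversal is sorted so the result is predictable.

```python
import hashlib
import tempfile
from pathlib import Path

def find_duplicates(path):
    """Return files whose contents match an earlier file (hypothetical helper)."""
    seen = {}
    duplicates = []
    # Sort so that the file kept as the "original" is deterministic
    for file in sorted(Path(path).rglob('*')):
        if file.is_file():
            filehash = hashlib.md5(file.read_bytes()).hexdigest()
            if filehash in seen:
                duplicates.append(file)
            else:
                seen[filehash] = file
    return duplicates

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'a.txt').write_text('same content')
    (root / 'b.txt').write_text('same content')
    (root / 'c.txt').write_text('different content')
    found = [f.name for f in find_duplicates(root)]
    print(found)
```

Returning a list rather than deleting inside the loop lets you review the candidates first and then pass each one to `os.remove` once the output looks right.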
