PYTHON

Pivot a Spark DataFrame using Python

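from pyspark.sql.functions import avg

# Assumes a pyspark shell or notebook where `sqlContext` is already defined.
# Load the CSV with a header row, infer column types, and drop rows containing nulls.
flights = (sqlContext
    .read
    .format("csv")
    .options(inferSchema="true", header="true")
    .load("flights.csv")
    .na.drop())

# Register the DataFrame as a table and cache it so repeated queries reuse the loaded data.
flights.registerTempTable("flights")
sqlContext.cacheTable("flights")

gexprs = ("origin", "dest", "carrier")  # grouping columns
aggexpr = avg("arr_delay")              # aggregate computed for each pivoted cell

# Run an action once so the cached table is populated before timing.
flights.count()
## 336776

# %timeit is an IPython magic, so this line only works in IPython or Jupyter.
%timeit -n10 flights.groupBy(*gexprs).pivot("hour").agg(aggexpr).count()
## 10 loops, best of 3: 1.03 s per loop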
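
To make it easier to see what groupBy(...).pivot("hour").agg(avg("arr_delay")) produces, here is a minimal plain-Python sketch of the same reshaping. It does not use Spark at all; the sample rows and the helper name pivot_avg are hypothetical and chosen only for illustration. Each distinct hour becomes a column, each (origin, dest, carrier) group becomes a row, and a group with no flights in a given hour gets None, which mirrors the null that Spark puts in such cells.

```python
from collections import defaultdict

# Hypothetical sample rows: (origin, dest, carrier, hour, arr_delay)
rows = [
    ("JFK", "LAX", "AA", 8, 10.0),
    ("JFK", "LAX", "AA", 8, 20.0),
    ("JFK", "LAX", "AA", 9, 5.0),
    ("EWR", "ORD", "UA", 8, -3.0),
]


def pivot_avg(rows):
    """Group by (origin, dest, carrier), spread hour into columns, average arr_delay."""
    # sums[group][hour] holds [running total, row count] for that cell
    sums = defaultdict(lambda: defaultdict(lambda: [0.0, 0]))
    for origin, dest, carrier, hour, delay in rows:
        cell = sums[(origin, dest, carrier)][hour]
        cell[0] += delay
        cell[1] += 1

    # Every distinct hour becomes one output column
    hours = sorted({r[3] for r in rows})
    return {
        key: {h: (cells[h][0] / cells[h][1] if h in cells else None) for h in hours}
        for key, cells in sums.items()
    }


result = pivot_avg(rows)
for key, cols in result.items():
    print(key, cols)
```

The number of keys in result corresponds to what the trailing .count() measures in the Spark snippet: one row per distinct grouping key.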