melt() Method: Unpivot a DataFrame¶

Date published: 2018-11-25

Category: Data Analysis

Subcategory: Data Wrangling

Tags: unpivot, melt, python, pandas

In pandas, we can "unpivot" a DataFrame - turn it from a wide format - many columns - to a long format - few columns but many rows. We can accomplish this with the pandas melt() method. This can be helpful for further analysis of our new unpivoted DataFrame.

Import Module¶

In [20]:

                
                    Copied!
                    
import pandas as pd
import pandas as pd

Example: Pivot Tesla Car Acceleration Details¶

Here is fictional acceleration tests for three popular Tesla car models. In order to verify acceleration of the cars, I figured a third-party may make three runs to test the three models alongside one another.

In [21]:

                
                    Copied!
                    
                        
                        
                    
                    

            
s = 'Tesla Model S P100D'
x = 'Tesla Model X P100D'
three = 'Tesla Model 3 AWD Dual Motor'

s_data = [s, 2.5, 2.51, 2.54]
x_data = [x, 2.92, 2.91, 2.93]
three_data = [three, 3.33, 3.31, 3.35]

data = [s_data, x_data, three_data] 
df = pd.DataFrame(data, columns=['car_model', 'Sept 1 9am', 'Sept 1 10am', 'Sept 1 11am'])
df
s = 'Tesla Model S P100D'
x = 'Tesla Model X P100D'
three = 'Tesla Model 3 AWD Dual Motor'

s_data = [s, 2.5, 2.51, 2.54]
x_data = [x, 2.92, 2.91, 2.93]
three_data = [three, 3.33, 3.31, 3.35]

data = [s_data, x_data, three_data] 
df = pd.DataFrame(data, columns=['car_model', 'Sept 1 9am', 'Sept 1 10am', 'Sept 1 11am'])
df

Out[21]:

	car_model	Sept 1 9am	Sept 1 10am	Sept 1 11am
0	Tesla Model S P100D	2.50	2.51	2.54
1	Tesla Model X P100D	2.92	2.91	2.93
2	Tesla Model 3 AWD Dual Motor	3.33	3.31	3.35

Notice how this DataFrame features four columns, one for the car model name, and three for acceleration runs of a car. If they were to continue with this trend of data collection and do far more runs, this dataset would have lots of columns - perhaps making it daunting to visualize and analyze.

I want to "unpivot" this data from a wide format to a long format using the pandas melt() method.

On the df DataFrame, we'll call the melt() method and set the following arguments:

id_vars to ['car_model'] since each row from df is identified by the car model name
var_name to 'date' since this new column needs a name
value_name to '0-60mph_in_seconds' since this new column needs a name

In [22]:

                
                    Copied!
                    
df_unpivoted = df.melt(id_vars=['car_model'], var_name='date', value_name='0-60mph_in_seconds')
df_unpivoted
df_unpivoted = df.melt(id_vars=['car_model'], var_name='date', value_name='0-60mph_in_seconds')
df_unpivoted

Out[22]:

	car_model	date	0-60mph_in_seconds
0	Tesla Model S P100D	Sept 1 9am	2.50
1	Tesla Model X P100D	Sept 1 9am	2.92
2	Tesla Model 3 AWD Dual Motor	Sept 1 9am	3.33
3	Tesla Model S P100D	Sept 1 10am	2.51
4	Tesla Model X P100D	Sept 1 10am	2.91
5	Tesla Model 3 AWD Dual Motor	Sept 1 10am	3.31
6	Tesla Model S P100D	Sept 1 11am	2.54
7	Tesla Model X P100D	Sept 1 11am	2.93
8	Tesla Model 3 AWD Dual Motor	Sept 1 11am	3.35

After this "unpivot", we can easily calculate the minimum (essentially the fastest) 0-60 time that we'd publish in a final report. To do so, we take our df_unpivoted DataFrame, group by the car_model column, and find the minimum value in the 0-60mph_in_seconds column.

In [23]:

                
                    Copied!
                    
df_unpivoted.groupby('car_model')['0-60mph_in_seconds'].min()
df_unpivoted.groupby('car_model')['0-60mph_in_seconds'].min()

Out[23]:

car_model
Tesla Model 3 AWD Dual Motor    3.31
Tesla Model S P100D             2.50
Tesla Model X P100D             2.91
Name: 0-60mph_in_seconds, dtype: float64