Histogram Plot using Pandas
Date published: 2018-03-04
Category: Data Visualizations
Subcategory: Pandas Plot
Tags: histogram plot
Import Modules
In [1]:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
%matplotlib inline
Read in Tips Dataset from URL
In [2]:
df_tips = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/tips.csv')
Preview the Data
Preview the top 5 rows
In [3]:
df_tips.head()
Out[3]:
| | total_bill | tip | sex | smoker | day | time | size |
|---|---|---|---|---|---|---|---|
| 0 | 16.99 | 1.01 | Female | No | Sun | Dinner | 2 |
| 1 | 10.34 | 1.66 | Male | No | Sun | Dinner | 3 |
| 2 | 21.01 | 3.50 | Male | No | Sun | Dinner | 3 |
| 3 | 23.68 | 3.31 | Male | No | Sun | Dinner | 2 |
| 4 | 24.59 | 3.61 | Female | No | Sun | Dinner | 4 |
View the count of values per column and data types
In [4]:
df_tips.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 244 entries, 0 to 243
Data columns (total 7 columns):
total_bill    244 non-null float64
tip           244 non-null float64
sex           244 non-null object
smoker        244 non-null object
day           244 non-null object
time          244 non-null object
size          244 non-null int64
dtypes: float64(2), int64(1), object(4)
memory usage: 13.4+ KB
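The same per-column checks can also be pulled out as values rather than printed text. Here is a minimal sketch of that idea; it assumes a small stand-in frame (the name `preview` is mine) built from only the five preview rows shown earlier, so it runs without downloading the full file.

```python
import io

import pandas as pd

# Stand-in data: only the five rows from the preview table, not the full 244-row file
csv_text = """total_bill,tip,sex,smoker,day,time,size
16.99,1.01,Female,No,Sun,Dinner,2
10.34,1.66,Male,No,Sun,Dinner,3
21.01,3.50,Male,No,Sun,Dinner,3
23.68,3.31,Male,No,Sun,Dinner,2
24.59,3.61,Female,No,Sun,Dinner,4
"""
preview = pd.read_csv(io.StringIO(csv_text))

# Number of non-null values in each column
non_null_counts = preview.count()
# Inferred data type of each column
column_dtypes = preview.dtypes

print(non_null_counts)
print(column_dtypes)
```

Note that the size column has to be accessed as `preview['size']`, because `preview.size` is a DataFrame attribute that returns the total number of elements.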
Plot a Simple Histogram of Tip Amounts
We access the tip column, call the plot method, and pass 'hist' to the kind argument to output a histogram plot.
In [5]:
df_tips['tip'].plot(kind='hist');
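To make concrete what a histogram counts, here is a rough sketch of equal-width binning in plain Python. The helper `equal_width_counts` is a hypothetical name of mine, not a pandas function, and it only approximates what the plotting library does internally. The input values are the first five total_bill amounts from the preview table.

```python
def equal_width_counts(values, bins=10):
    """Count how many values fall into each of `bins` equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; bin width would be zero")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The last bin is closed on the right, so the maximum lands in it
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# First five total_bill values from the preview table
total_bills = [16.99, 10.34, 21.01, 23.68, 24.59]
counts = equal_width_counts(total_bills, bins=3)
print(counts)  # [1, 1, 3]
```

Each bar in the plot corresponds to one of these counts, and the bar's width corresponds to the interval it covers.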
Plot a Simple Histogram of Total Bill Amounts
We access the total_bill column, call the plot method, and pass 'hist' to the kind argument to output a histogram plot.
See the pandas hist method documentation page for the available options.
In [6]:
df_tips['total_bill'].plot(kind='hist');
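One option worth knowing is bins, which controls how many bars the histogram has (pandas uses 10 by default). The sketch below is only an illustration: it assumes a stand-in Series made from the five total_bill values in the preview table, and it switches to the Agg backend so it can run as a plain script outside a notebook.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend for scripts; not needed in a notebook

import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for df_tips['total_bill']: the first five values from the preview table
total_bill = pd.Series([16.99, 10.34, 21.01, 23.68, 24.59], name="total_bill")

fig, (ax_left, ax_right) = plt.subplots(1, 2)

# Two equivalent ways to request a histogram with 4 bins
total_bill.plot(kind="hist", bins=4, ax=ax_left)
total_bill.plot.hist(bins=4, ax=ax_right)

# Each bar is a rectangle patch on its axes
print(len(ax_left.patches), len(ax_right.patches))
```

With the full dataset, `df_tips['total_bill'].plot(kind='hist', bins=20)` would follow the same pattern. Fewer bins give a smoother but coarser picture; more bins show finer detail but can look noisy.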
Adjust Plot Styles
Below, I'll increase the font scale and figure size and add axis labels and a title so the plot is easier to interpret.
In [9]:
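# Apply seaborn's default theme with larger fonts
sns.set(font_scale=1.4)
# Draw the histogram on a 10x10 inch figure
df_tips['total_bill'].plot(kind='hist', figsize=(10, 10))
# Label the axes and title the plot
plt.xlabel("Total Bill Amount ($)", labelpad=14)
plt.ylabel("Frequency", labelpad=14)
plt.title("Distribution of Total Bill Amounts ($)", y=1.015, fontsize=22);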
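The plot method returns a matplotlib Axes object, so the same labels can also be set on that object directly instead of going through plt. This is just an alternative sketch, not a replacement for the cell above. It assumes a stand-in Series built from the five total_bill values in the preview table, leaves out the seaborn styling, and uses the Agg backend so it runs as a script.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend for scripts; not needed in a notebook

import pandas as pd

# Stand-in for df_tips['total_bill']: the first five values from the preview table
total_bill = pd.Series([16.99, 10.34, 21.01, 23.68, 24.59], name="total_bill")

# plot() returns the Axes the histogram was drawn on
ax = total_bill.plot(kind="hist", figsize=(10, 10))

# Set labels and title on that specific Axes
ax.set_xlabel("Total Bill Amount ($)", labelpad=14)
ax.set_ylabel("Frequency", labelpad=14)
ax.set_title("Distribution of Total Bill Amounts ($)", y=1.015, fontsize=22)

print(ax.get_xlabel(), "|", ax.get_title())
```

Working with the Axes object makes it explicit which plot is being modified, which helps once a figure contains more than one subplot.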