Let's prepare some more advanced plots. Here, we'll look at adding two columns of data to a single plot, each sharing the same horizontal axis. We will use two external libraries: numpy and matplotlib.
import numpy as np
import matplotlib.pyplot as plt
We can grab a small CSV file from the City of New York. It has water consumption values for the last 40 years, along with population.
path = "https://data.cityofnewyork.us/api/views/ia2d-e54m/rows.csv?accessType=DOWNLOAD"
data = np.genfromtxt(path, delimiter=',', names=True)
data.dtype.names
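The names reported by `data.dtype.names` are not the raw CSV headers: with `names=True`, `genfromtxt` replaces spaces with underscores and drops characters such as parentheses. Here is a minimal sketch of that behavior. The header text is an assumption inferred from the field names used in this tutorial, and the rows are made-up placeholders rather than values from the real file.

```python
import io
import numpy as np

# Hypothetical header and rows: the header text is a guess inferred from the
# sanitized field names, and the numbers are placeholders, not real NYC data.
csv_text = (
    "Year,New York City Population,"
    "NYC Consumption(Million gallons per day),"
    "Per Capita(Gallons per person per day)\n"
    "2000,8000000,1200,150\n"
    "2001,8100000,1215,150\n"
)

sample = np.genfromtxt(io.StringIO(csv_text), delimiter=",", names=True)

# Spaces become underscores and the parentheses are removed.
print(sample.dtype.names)
# Fields are then accessed by the sanitized name.
print(sample["New_York_City_Population"])
```

Printing the names first is a quick way to find the exact keys to use when pulling columns out of the array.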
While not necessary, we will make some arrays from the columns so that we can do some math on them if we want (e.g., the population is scaled to millions of people).
year = data['Year']
NYCPopulation = data['New_York_City_Population']*1e-6
NYCConsumption = data['NYC_ConsumptionMillion_gallons_per_day']
NYCConsumptionPerCapita = data['Per_CapitaGallons_per_person_per_day']
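Having the columns as plain arrays makes quick checks easy. For instance, because consumption is in millions of gallons per day and the population has been scaled to millions of people, their ratio comes out in gallons per person per day. Here is a small sketch with placeholder numbers (hypothetical values, not taken from the data set):

```python
import numpy as np

# Placeholder values for illustration only, not real NYC figures.
population_millions = np.array([8.0, 8.1, 8.2])        # millions of people
consumption_mgd = np.array([1200.0, 1134.0, 1066.0])   # million gallons/day

# The millions cancel: (1e6 gal/day) / (1e6 people) = gal/person/day.
per_capita = consumption_mgd / population_millions
print(per_capita)
```

With the real arrays, the same idea would be comparing `NYCConsumption / NYCPopulation` against `NYCConsumptionPerCapita`. Small differences could appear if the published per-capita column is rounded.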
If we just wanted a simple plot of one of the columns over time, we could do this:
fig, ax = plt.subplots()
ax.scatter(year, NYCConsumption, s=8,color='steelblue')
ax.set_xlabel('Year')
ax.set_ylabel('Million gallons per day')
ax.set_title('NYC Water Consumption')
ax.grid()
plt.show()
But it might be more interesting to see how water consumption compares to the population. To do this, we have to create two axes objects in the same figure and link their horizontal axes:
fig, ax1 = plt.subplots(figsize=(8, 5))
# make the first axis
# maybe a bar graph is most appropriate here
ax1.set_xlabel('Year')
ax1.set_ylabel('Gallons per day [Millions]', color="steelblue")
ax1.bar(year, NYCConsumption, width=1, edgecolor="white", linewidth=0.7, color="steelblue",label='Gallons per Day')
ax1.tick_params(axis='y', labelcolor="steelblue")
ax1.set_title('NYC Water Consumption and Population')
# instantiate a second axes that shares the same x-axis
ax2 = ax1.twinx()
# this one can just be a regular plot with lines and markers
ax2.set_ylabel('Population [Millions of People]', color='darkred') # we already handled the x-label with ax1
ax2.plot(year, NYCPopulation, color='darkred', marker='o', markersize=4, label='Population')
ax2.tick_params(axis='y', labelcolor='darkred')
# otherwise the right y-label is slightly clipped
fig.tight_layout()
# add a legend that uses the label arguments in the bar and plot lines.
# and put it in a nice place
fig.legend(loc='lower right', bbox_to_anchor=(0.8, 0.2))
plt.show()
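As an alternative to `fig.legend`, the handles and labels from both axes can be collected and passed to a single `legend` call on one of the axes, which keeps the legend inside the plotting area without needing `bbox_to_anchor`. Here is a sketch with placeholder data (the arrays below are hypothetical, not the NYC values):

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data for illustration only.
year = np.arange(2000, 2010)
consumption = np.linspace(1200, 1000, year.size)   # million gallons/day
population = np.linspace(8.0, 8.4, year.size)      # millions of people

fig, ax1 = plt.subplots(figsize=(8, 5))
ax1.bar(year, consumption, width=1, edgecolor="white",
        color="steelblue", label="Gallons per Day")
ax1.set_xlabel("Year")

ax2 = ax1.twinx()  # second y-axis sharing the same x-axis
ax2.plot(year, population, color="darkred", marker="o", label="Population")

# Gather the labeled artists from each axes and draw one combined legend.
handles1, labels1 = ax1.get_legend_handles_labels()
handles2, labels2 = ax2.get_legend_handles_labels()
ax2.legend(handles1 + handles2, labels1 + labels2, loc="lower right")

fig.tight_layout()
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the result. The legend is attached to `ax2` here because that axes is drawn last, so the legend box sits on top of both the bars and the line.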
This tutorial also exists as a Colab Notebook. You can find it here: Plotting Two Data Sets with Python (Colab)