Data Visualization with Seaborn
Introduction to Seaborn
Seaborn is a Python visualization library built on top of Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics and integrates closely with pandas data structures.
Getting Started
Before using Seaborn, you need to install it. You can do so with pip:
pip install seaborn
Once installed, you can import Seaborn in your Python script:
import seaborn as sns
Loading Data
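Seaborn provides several example datasets that you can use to practice. The load_dataset() function downloads them from an online repository, so it needs an internet connection the first time a dataset is requested. Let's load the 'tips' dataset, which contains information about the tips received by waiters in a restaurant: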
tips = sns.load_dataset("tips")
You can view the first few rows of the dataset using the head() method:
print(tips.head())
Basic Plotting with Seaborn
Scatter Plot
A scatter plot is used to display the relationship between two numerical variables. Let's create a scatter plot of 'total_bill' vs 'tip':
sns.scatterplot(x="total_bill", y="tip", data=tips)
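To illustrate how a scatter plot can also encode a third, categorical variable, here is a minimal sketch using the hue parameter. The small DataFrame is a made-up stand-in that borrows column names from the tips dataset so the example runs without a download; its values and the plot title are assumptions for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up stand-in data using the same column names as the tips dataset.
demo = pd.DataFrame({
    "total_bill": [10.5, 21.0, 15.3, 30.2, 18.7, 25.1],
    "tip": [1.5, 3.5, 2.0, 5.0, 3.0, 4.2],
    "time": ["Lunch", "Dinner", "Lunch", "Dinner", "Lunch", "Dinner"],
})

fig, ax = plt.subplots()
# hue colors each point by the value of the "time" column and adds a legend.
sns.scatterplot(data=demo, x="total_bill", y="tip", hue="time", ax=ax)
ax.set_title("Tip vs. total bill")
```

In a standalone script with an interactive backend, calling plt.show() after the plotting code opens the figure window.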
Line Plot
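A line plot connects data points in order along the x-axis, which makes it useful for showing trends such as those in time series data. When several observations share the same x value, Seaborn plots their mean by default and shades a confidence interval around it. Here, the line shows the average total bill for each party size: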
sns.lineplot(x="size", y="total_bill", data=tips)
Advanced Plotting with Seaborn
Histogram
A histogram is used to display the distribution of a single numerical variable:
sns.histplot(tips["total_bill"], bins=20, kde=True)
Box Plot
A box plot is used to display the distribution of a numerical variable across different categories:
sns.boxplot(x="day", y="total_bill", data=tips)
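Each box spans the first to third quartile of its category, with a line at the median. As a rough sketch of what the plot summarizes, the quartiles can be computed directly with pandas and compared with the boxes. The data below is made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up bills for two days.
demo = pd.DataFrame({
    "day": ["Thur"] * 5 + ["Sun"] * 5,
    "total_bill": [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 25.0, 30.0, 35.0, 40.0],
})

# First quartile, median, and third quartile per day (one row per day).
quartiles = (
    demo.groupby("day")["total_bill"]
    .quantile([0.25, 0.5, 0.75])
    .unstack()
)
print(quartiles)

fig, ax = plt.subplots()
sns.boxplot(data=demo, x="day", y="total_bill", ax=ax)
```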
Heatmap
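A heatmap displays a matrix of values as a grid of colored cells. A common use is to show the correlations between numerical variables. Because the tips dataset also contains categorical columns, restrict the correlation to the numeric ones (the numeric_only argument requires pandas 1.5 or later):
sns.heatmap(tips.corr(numeric_only=True), annot=True, cmap="coolwarm")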
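Correlation is only defined for numeric columns, so one option is to select them explicitly before building the matrix. The sketch below does this with select_dtypes() and pins the color scale to the range -1 to 1 so that zero correlation sits at the midpoint of the colormap. The data is made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up data mixing numeric and text columns.
demo = pd.DataFrame({
    "total_bill": [10.5, 21.0, 15.3, 30.2, 18.7, 25.1],
    "tip": [1.5, 3.5, 2.0, 5.0, 3.0, 4.2],
    "size": [1, 2, 2, 4, 3, 3],
    "day": ["Thur", "Fri", "Sat", "Sun", "Sat", "Sun"],
})

# Keep only numeric columns, then compute pairwise correlations.
corr = demo.select_dtypes(include="number").corr()

fig, ax = plt.subplots()
# vmin/vmax fix the color scale to the full possible range of a correlation.
sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1, ax=ax)
```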
Customizing Seaborn Plots
Seaborn provides various ways to customize the appearance of your plots. You can change the color palette, style, and more:
Changing Color Palette
sns.set_palette("pastel")
sns.scatterplot(x="total_bill", y="tip", data=tips)
Customizing Plot Style
sns.set_style("whitegrid")
sns.scatterplot(x="total_bill", y="tip", data=tips)
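Both set_palette() and set_style() change global defaults, so they affect every plot drawn afterwards. If you only want a style for a single figure, one option is to use axes_style() as a context manager, as in this minimal sketch. The data is made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up data.
demo = pd.DataFrame({
    "total_bill": [10.5, 21.0, 15.3, 30.2],
    "tip": [1.5, 3.5, 2.0, 5.0],
})

grid_before = plt.rcParams["axes.grid"]

# The style applies only to Axes created inside the with block.
with sns.axes_style("whitegrid"):
    grid_inside = plt.rcParams["axes.grid"]
    fig, ax = plt.subplots()
    sns.scatterplot(data=demo, x="total_bill", y="tip", ax=ax)

# The previous setting is restored once the block exits.
grid_after = plt.rcParams["axes.grid"]
```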
Saving Your Plots
Once you have created your plot, you might want to save it to a file. You can do this with Matplotlib's savefig() function:
import matplotlib.pyplot as plt
sns.scatterplot(x="total_bill", y="tip", data=tips)
plt.savefig("scatter_plot.png")
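For more control over the saved file, you can call savefig() on the Figure object itself and set the resolution and margins. The sketch below writes to a temporary directory. The figure size, the dpi value, and the output location are assumptions chosen for illustration, and the data is made up.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up data.
demo = pd.DataFrame({
    "total_bill": [10.5, 21.0, 15.3, 30.2],
    "tip": [1.5, 3.5, 2.0, 5.0],
})

# Hypothetical output location inside a fresh temporary directory.
out_path = Path(tempfile.mkdtemp()) / "scatter_plot.png"

fig, ax = plt.subplots(figsize=(6, 4))
sns.scatterplot(data=demo, x="total_bill", y="tip", ax=ax)

# dpi sets the resolution; bbox_inches="tight" trims surplus whitespace.
fig.savefig(out_path, dpi=150, bbox_inches="tight")
plt.close(fig)  # release the figure once it has been written
```

Saving before calling plt.show() is the safer order in a script, since the figure may be cleared once its window is closed.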
Conclusion
Seaborn is a powerful library for creating statistical graphics in Python. This tutorial covered loading data, drawing basic and statistical plots, customizing their appearance, and saving the results to a file. We hope it has given you a good overview of how to use Seaborn for data visualization. Happy plotting!