Python for ML: NumPy & Pandas
1. NumPy Introduction
NumPy is a powerful library for numerical computing in Python. It provides support for arrays, matrices, and a collection of mathematical functions to operate on these data structures.
Key features of NumPy include:
- Supports multi-dimensional arrays and matrices.
- Offers a variety of mathematical operations.
- Facilitates linear algebra, Fourier transforms, and random number generation.
2. Using NumPy
To use NumPy, you need to install it first. You can install it using pip:
pip install numpy
Here is a simple example of creating an array and performing operations:
import numpy as np
# Create an array
array = np.array([1, 2, 3, 4, 5])
# Perform operations
mean = np.mean(array)
sum_array = np.sum(array)
print("Mean:", mean)
print("Sum:", sum_array)
3. Pandas Introduction
Pandas is a data manipulation and analysis library that provides data structures such as Series and DataFrame, making it easy to work with structured data.
Key features of Pandas include:
- DataFrame for handling tabular data.
- Easy data manipulation and cleaning.
- Integration with many data formats (CSV, Excel, SQL, etc.).
4. Using Pandas
To use Pandas, install it via pip:
pip install pandas
Example of creating a DataFrame and performing basic operations:
import pandas as pd
# Create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [24, 30, 22]}
df = pd.DataFrame(data)
# Display the DataFrame
print(df)
# Calculate mean age
mean_age = df['Age'].mean()
print("Mean Age:", mean_age)
FAQ
What is the difference between NumPy and Pandas?
NumPy is mainly for numerical data and provides array functionalities, while Pandas is used for data manipulation and analysis, particularly with tabular data.
Can I use NumPy and Pandas together?
Yes, they are often used together in data science projects, with NumPy handling numerical operations and Pandas managing data structures.
How do I handle missing data in Pandas?
You can use methods like dropna()
to remove missing values or fillna()
to fill them with a specific value.