NumPy
Description
NumPy (Numerical Python) is a foundational Python library for numerical computing. It provides support for multi-dimensional arrays, a collection of high-performance mathematical functions, and tools for array manipulation, broadcasting, and indexing. It's widely used in Data Science, Machine Learning, AI, and scientific computing.
Prerequisites
- Basic Python syntax
- Understanding of lists and loops
- Familiarity with functions and basic math operations
Examples
Here's a simple example of creating NumPy arrays and applying basic operations to them:
import numpy as np
# Creating a NumPy array
arr = np.array([1, 2, 3])
print("Array:", arr)
# Array operations
print("Added 5:", arr + 5) # Broadcasting: adds 5 to every element
print("Squared:", arr ** 2) # Element-wise square
# 2D Array
matrix = np.array([[1, 2], [3, 4]])
print("Matrix:\n", matrix)
# Indexing
print("First row:", matrix[0])
print("Element at (1,1):", matrix[1, 1])  # Single [row, column] index
# Broadcasting in 2D
print("Add 1 to all:", matrix + 1)
Real-World Applications
Data Science & Machine Learning
- Handling datasets efficiently (fast numerical computation)
- Feature scaling and transformation
- Building mathematical models with large arrays
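As a minimal sketch of the feature scaling mentioned above, the snippet below standardizes each column of a small dataset to zero mean and unit variance. The data values are made up for illustration, and standardization is only one of several possible scaling choices.

```python
import numpy as np

# Hypothetical dataset: 4 samples (rows) x 2 features (columns)
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0],
              [4.0, 800.0]])

# Per-column statistics; axis=0 reduces over the rows
mean = X.mean(axis=0)
std = X.std(axis=0)

# Broadcasting applies the per-column values to every row
X_scaled = (X - mean) / std

print(X_scaled.mean(axis=0))  # approximately 0 for each column
print(X_scaled.std(axis=0))   # approximately 1 for each column
```

No explicit loop over rows or columns is needed: the `(2,)`-shaped statistics broadcast against the `(4, 2)` data.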
Image & Signal Processing
- Representing pixel data as arrays
- Applying filters via convolution
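To illustrate filtering via convolution, here is a small sketch that smooths a 1D signal with a 3-point moving-average kernel using `np.convolve`. The signal values are invented for the example. Real image filtering would typically use a 2D kernel, which NumPy alone does not provide a dedicated function for.

```python
import numpy as np

# Hypothetical 1D signal (e.g., one row of pixel intensities)
signal = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# 3-point moving-average kernel: each output is the mean of 3 neighbors
kernel = np.ones(3) / 3.0

# mode="valid" keeps only positions where the kernel fully overlaps the signal
smoothed = np.convolve(signal, kernel, mode="valid")

print(smoothed)        # approximately [2. 3. 4.]
print(smoothed.shape)  # (3,)
```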
Finance
- Large-scale numerical simulations
- Vectorized calculations for pricing models
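One simple instance of a vectorized pricing calculation is discounting a stream of cash flows to its present value. The sketch below assumes a flat annual discount rate and made-up cash flows. It is an illustration of the vectorized style, not a production pricing model.

```python
import numpy as np

# Hypothetical inputs: three yearly payments and a flat 5% discount rate
cash_flows = np.array([100.0, 100.0, 100.0])
years = np.arange(1, 4)  # [1, 2, 3]
rate = 0.05

# One discount factor per year, computed for all years at once
discount_factors = 1.0 / (1.0 + rate) ** years

# Element-wise product followed by a sum replaces an explicit loop
present_value = np.sum(cash_flows * discount_factors)

print(round(present_value, 2))
```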
Where NumPy Is Applied
Finance
- Risk and return modeling using matrix operations
- Portfolio simulations
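A minimal sketch of risk and return modeling with matrix operations: the expected return of a portfolio is a dot product of weights and asset returns, and its variance is the quadratic form of the weights with the covariance matrix. All numbers below are hypothetical.

```python
import numpy as np

# Hypothetical two-asset portfolio
weights = np.array([0.6, 0.4])
expected_returns = np.array([0.08, 0.12])
covariance = np.array([[0.04, 0.01],
                       [0.01, 0.09]])

# Expected portfolio return: w . r
portfolio_return = weights @ expected_returns

# Portfolio variance: w^T * Sigma * w
portfolio_variance = weights @ covariance @ weights
portfolio_volatility = np.sqrt(portfolio_variance)

print(portfolio_return, portfolio_variance, portfolio_volatility)
```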
Healthcare
- Image-based analysis (e.g., CT, MRI using arrays)
- Patient data modeling and transformation
E-commerce
- Efficient recommendation algorithms using matrix multiplication
- Data normalization and batch processing
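As a rough illustration of recommendation scoring with matrix multiplication, the sketch below multiplies a user-factor matrix by an item-factor matrix to get one score per user-item pair. The factor values are invented. A real system would learn them from interaction data.

```python
import numpy as np

# Hypothetical latent factors: rows are users / items, columns are features
user_factors = np.array([[1.0, 0.2],
                         [0.2, 1.0]])
item_factors = np.array([[1.0, 0.0],
                         [0.0, 1.0],
                         [0.5, 0.5]])

# scores[u, i] is the dot product of user u's and item i's factors
scores = user_factors @ item_factors.T  # shape (2 users, 3 items)

# Index of the highest-scoring item for each user
best_item = np.argmax(scores, axis=1)

print(scores)
print(best_item)
```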
Machine Learning
- Underlying numerical operations in models like linear regression, PCA
- Data preprocessing and augmentation
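To show the kind of numerical operation that underlies linear regression, here is a small sketch that fits a line by least squares with `np.linalg.lstsq`. The data is synthetic and noise-free (generated from y = 2x + 1), so the fit should recover those coefficients.

```python
import numpy as np

# Synthetic data generated from y = 2x + 1 (an assumption for illustration)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

# Design matrix with a column of ones for the intercept term
A = np.column_stack([x, np.ones_like(x)])

# Solve min ||A @ coeffs - y||^2
coeffs, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = coeffs

print(slope, intercept)  # approximately 2.0 and 1.0
```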
Robotics
- Coordinate transformations and movement control using arrays
- Sensor data processing with broadcasting
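A coordinate transformation can be sketched as a rotation matrix applied to a batch of points, followed by a translation added through broadcasting. The angle, points, and offset below are hypothetical values chosen so the result is easy to check by hand.

```python
import numpy as np

# Rotate by 90 degrees counter-clockwise
theta = np.pi / 2
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

# Each row is one 2D point
points = np.array([[1.0, 0.0],
                   [0.0, 1.0]])

# Multiplying by the transpose rotates every row in one operation
rotated = points @ rotation.T

# The (2,) offset is broadcast across all rows
translation = np.array([2.0, 3.0])
transformed = rotated + translation

print(np.round(transformed, 6))
```

Here (1, 0) rotates to (0, 1) and (0, 1) rotates to (-1, 0) before the offset is added.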
Resources
Data Science topic PDF
Harvard Data Science Course
Free online course from Harvard covering data science foundations
Interview Questions
➤ NumPy is a Python library used for numerical operations with support for arrays, broadcasting, and mathematical functions. It offers high speed and memory efficiency, making it ideal for scientific computing.
➤ A Python list is a generic container that can hold heterogeneous data types. A NumPy array is homogeneous (every element shares one dtype), supports element-wise operations, and computes faster thanks to its optimized C backend.
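The difference is easy to see with the `*` operator, which repeats a list but multiplies an array element-wise. A minimal sketch:

```python
import numpy as np

py_list = [1, 2, 3]
np_arr = np.array([1, 2, 3])

# On a list, * repeats the sequence
list_result = py_list * 2

# On an array, * multiplies every element
arr_result = np_arr * 2

print(list_result)  # [1, 2, 3, 1, 2, 3]
print(arr_result)   # [2 4 6]
```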
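➤ Broadcasting is the mechanism that lets NumPy operate on arrays of different shapes. Shapes are compared from the trailing dimension backward, and any dimension of size 1 (or a missing leading dimension) is virtually stretched to match the other array, without copying data. If two dimensions differ and neither is 1, the operation raises an error.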
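A short sketch of broadcasting in practice: a `(3, 1)` column combined with a `(4,)` row yields a `(3, 4)` result, while shapes that cannot be aligned raise a `ValueError`.

```python
import numpy as np

column = np.array([[0], [10], [20]])  # shape (3, 1)
row = np.array([1, 2, 3, 4])          # shape (4,)

# (3, 1) and (4,) broadcast to (3, 4)
result = column + row
print(result.shape)  # (3, 4)
print(result)

# Shapes (3,) and (4,) are incompatible: neither trailing dimension is 1
error_raised = False
try:
    np.ones(3) + np.ones(4)
except ValueError:
    error_raised = True
print("Incompatible shapes raised an error:", error_raised)
```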
➤ Commonly used functions include np.array(), np.zeros(), np.ones(), np.arange(), and np.linspace() for creating arrays, np.reshape() for changing shape, and np.mean() and np.dot() for computation.
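A quick tour of these functions, with small inputs chosen so the results are easy to verify:

```python
import numpy as np

zeros = np.zeros((2, 3))         # 2x3 array filled with 0.0
ones = np.ones(3)                # length-3 array filled with 1.0
steps = np.arange(0, 10, 2)      # [0 2 4 6 8], stop value excluded
grid = np.linspace(0.0, 1.0, 5)  # 5 evenly spaced values, endpoints included

# Reshape six consecutive integers into 2 rows x 3 columns
reshaped = np.reshape(np.arange(6), (2, 3))

average = np.mean(reshaped)      # mean of 0..5
dot = np.dot(np.array([1, 2, 3]), np.array([4, 5, 6]))  # 1*4 + 2*5 + 3*6

print(steps, grid, average, dot)
```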
➤ NumPy supports multi-dimensional indexing, slicing, and even boolean indexing, making it much more powerful than standard list indexing.
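The three indexing styles can be sketched on a small 3x3 array:

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

# Multi-dimensional indexing: row 1, column 2
element = data[1, 2]

# Slicing: all rows of column 0, then a 2x2 sub-block
first_column = data[:, 0]
sub_block = data[:2, 1:]

# Boolean indexing: keep only the even values
evens = data[data % 2 == 0]

print(element)       # 6
print(first_column)  # [1 4 7]
print(sub_block)
print(evens)         # [2 4 6 8]
```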