NumPy for Trading

NumPy is the fundamental library for numerical computing in Python. It provides the fast array operations, mathematical functions, and vectorized calculations that algorithmic trading code relies on.

Difficulty: Beginner

Category: Data & Analysis

🧱 Foundation Library

Installation

NumPy ships with scientific Python distributions such as Anaconda, but it is not part of the Python standard library. Install or upgrade it with pip.

# Install NumPy

$ pip install numpy

# Verify installation

$ python -c "import numpy; print(numpy.__version__)"

Key Features

Blazing Fast

Array operations run in compiled C code and are often 10-100x faster than equivalent pure Python loops for numerical computations.

N-Dimensional Arrays

Efficient multi-dimensional arrays perfect for storing OHLCV data and correlation matrices.

Mathematical Functions

Comprehensive math library: statistics, linear algebra, random numbers, and more.

Broadcasting

Apply operations across arrays of different shapes without explicit loops.
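As a small illustration of broadcasting, the sketch below combines arrays of different shapes without writing a loop. The OHLC values and variable names are made up for the example.

```python
import numpy as np

# Hypothetical OHLC bars: one row per bar, columns = open, high, low, close
ohlc = np.array([
    [1.1000, 1.1050, 1.0980, 1.1020],
    [1.1020, 1.1080, 1.1010, 1.1070],
    [1.1070, 1.1090, 1.1030, 1.1040],
])

# Shape (3, 4) divided by shape (3, 1): each row is scaled by its own open
opens = ohlc[:, [0]]
relative_to_open = ohlc / opens - 1

# Shape (3, 4) minus shape (4,): each column is centered on its own mean
centered = ohlc - ohlc.mean(axis=0)

print(relative_to_open.round(4))
print(centered.round(4))
```

In both cases NumPy stretches the smaller array along the missing or length-1 dimension instead of copying it.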

Code Examples

NumPy Arrays for Price Data

Work with price data using NumPy arrays

Python

import numpy as np

# Create price array
prices = np.array([1.1234, 1.1245, 1.1238, 1.1252, 1.1260])

# Basic statistics
print(f"Mean: {np.mean(prices):.4f}")
print(f"Std Dev: {np.std(prices):.4f}")
print(f"Min: {np.min(prices):.4f}")
print(f"Max: {np.max(prices):.4f}")

# Simple returns between consecutive prices
returns = np.diff(prices) / prices[:-1]
print(f"Daily returns (%): {np.round(returns * 100, 4)}")

# Cumulative returns
cum_returns = np.cumprod(1 + returns) - 1
print(f"Cumulative return: {cum_returns[-1] * 100:.2f}%")

Calculate Moving Averages

Efficient moving average using convolution

Python

import numpy as np

def sma(prices, period):
    """Simple Moving Average using convolution"""
    weights = np.ones(period) / period
    return np.convolve(prices, weights, mode='valid')

def ema(prices, period):
    """Exponential Moving Average"""
    alpha = 2 / (period + 1)
    ema_values = np.zeros(len(prices))
    ema_values[0] = prices[0]
    for i in range(1, len(prices)):
        ema_values[i] = alpha * prices[i] + (1 - alpha) * ema_values[i-1]
    return ema_values

# Example usage
prices = np.random.randn(100).cumsum() + 100
sma_20 = sma(prices, 20)
ema_20 = ema(prices, 20)

print(f"SMA(20) latest: {sma_20[-1]:.2f}")
print(f"EMA(20) latest: {ema_20[-1]:.2f}")
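With mode='valid', np.convolve returns len(prices) - period + 1 values, so the SMA is shorter than the price series it came from. One possible way to keep the two aligned is to pad the front with NaN, sketched below. The helper name sma_aligned is hypothetical.

```python
import numpy as np

def sma_aligned(prices, period):
    """SMA padded with NaN so it has the same length as prices (hypothetical helper)."""
    prices = np.asarray(prices, dtype=float)
    out = np.full(len(prices), np.nan)
    if len(prices) >= period:
        # The first full window ends at index period - 1
        out[period - 1:] = np.convolve(prices, np.ones(period) / period, mode='valid')
    return out

prices = np.arange(1.0, 11.0)  # 1, 2, ..., 10
sma_3 = sma_aligned(prices, 3)
print(sma_3)
```

Index i of the result then refers to the same bar as index i of the prices, which makes later comparisons between price and average simpler.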

Calculate Volatility Metrics

Historical volatility and ATR calculation

Python

import numpy as np

def historical_volatility(prices, period=20):
    """Calculate annualized historical volatility"""
    log_returns = np.diff(np.log(prices))
    # Annualize the daily standard deviation, assuming 252 trading days per year
    volatility = np.std(log_returns[-period:]) * np.sqrt(252)
    return volatility

def atr(high, low, close, period=14):
    """Average True Range indicator (simple moving average of the true range)"""
    tr1 = high[1:] - low[1:]
    tr2 = np.abs(high[1:] - close[:-1])
    tr3 = np.abs(low[1:] - close[:-1])
    true_range = np.maximum(np.maximum(tr1, tr2), tr3)
    atr_values = np.convolve(true_range, np.ones(period)/period, mode='valid')
    return atr_values

# Example with random OHLC data
np.random.seed(42)
close = np.random.randn(100).cumsum() + 100
high = close + np.abs(np.random.randn(100) * 0.5)
low = close - np.abs(np.random.randn(100) * 0.5)

print(f"Historical Volatility: {historical_volatility(close):.2%}")
print(f"ATR(14) latest: {atr(high, low, close)[-1]:.4f}")
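historical_volatility above reports a single number for the most recent window. A rolling version, one value per window, could be sketched with sliding_window_view, which requires NumPy 1.20 or later. The helper name rolling_volatility and the synthetic price series are assumptions made for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def rolling_volatility(prices, period=20, periods_per_year=252):
    """Annualized volatility for every full window of log returns (hypothetical helper)."""
    log_returns = np.diff(np.log(prices))
    # Read-only view of shape (n_returns - period + 1, period); no data is copied
    windows = sliding_window_view(log_returns, period)
    return windows.std(axis=1) * np.sqrt(periods_per_year)

# Synthetic, strictly positive price series for illustration
rng = np.random.default_rng(42)
close = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 100)))

vol = rolling_volatility(close)
print(f"Windows: {vol.shape[0]}, latest volatility: {vol[-1]:.2%}")
```

The last element corresponds to the same window that the single-value function uses.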

Monte Carlo Simulation

Simulate future price paths

Python

import numpy as np

def monte_carlo_simulation(initial_price, mu, sigma, days, simulations):
    """
    Generate Monte Carlo price simulations

    Parameters:
    - initial_price: Starting price
    - mu: Expected daily return
    - sigma: Daily volatility
    - days: Number of days to simulate
    - simulations: Number of simulation paths
    """
    dt = 1  # Daily

    # Generate random returns
    random_returns = np.random.normal(
        mu * dt,
        sigma * np.sqrt(dt),
        (simulations, days)
    )

    # Calculate price paths
    price_paths = initial_price * np.cumprod(1 + random_returns, axis=1)
    return price_paths

# Simulate 1000 paths for 252 trading days
paths = monte_carlo_simulation(
    initial_price=1.1000,
    mu=0.0001,  # 0.01% daily return
    sigma=0.008,  # 0.8% daily volatility
    days=252,
    simulations=1000
)

print(f"Final price range: {paths[:, -1].min():.4f} - {paths[:, -1].max():.4f}")
print(f"Expected final price: {paths[:, -1].mean():.4f}")
print(f"Probability of profit: {(paths[:, -1] > 1.1000).mean():.2%}")
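The minimum and maximum of the final prices are driven by single extreme paths, so percentiles often describe the simulated distribution better. The sketch below reuses the illustrative parameters from the example above and swaps in a seeded np.random.default_rng generator so the run is repeatable. It is one possible summary, not a full risk model.

```python
import numpy as np

rng = np.random.default_rng(42)  # seeded generator for repeatable results

initial_price = 1.1000
# Same illustrative parameters as above: 1000 paths, 252 days
daily_returns = rng.normal(0.0001, 0.008, size=(1000, 252))
paths = initial_price * np.cumprod(1 + daily_returns, axis=1)

final_prices = paths[:, -1]
p5, p50, p95 = np.percentile(final_prices, [5, 50, 95])

print(f"5th percentile:  {p5:.4f}")
print(f"Median:          {p50:.4f}")
print(f"95th percentile: {p95:.4f}")
```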

Correlation Analysis

Calculate correlation matrix for pairs trading

Python

import numpy as np

# Simulated returns for multiple currency pairs
np.random.seed(42)
eurusd_returns = np.random.randn(100) * 0.01
gbpusd_returns = eurusd_returns * 0.8 + np.random.randn(100) * 0.005  # Correlated
usdjpy_returns = np.random.randn(100) * 0.01  # Less correlated

# Stack returns into matrix
returns_matrix = np.column_stack([
    eurusd_returns,
    gbpusd_returns,
    usdjpy_returns
])

# Calculate correlation matrix
correlation_matrix = np.corrcoef(returns_matrix.T)

pairs = ['EURUSD', 'GBPUSD', 'USDJPY']

print("Correlation Matrix:")
print("-" * 50)
for i, pair1 in enumerate(pairs):
    for j, pair2 in enumerate(pairs):
        print(f"{pair1}-{pair2}: {correlation_matrix[i,j]:.3f}", end=" ")
    print()

# Find most correlated pair (upper triangle only, excluding the diagonal).
# Masked-out entries are set to -inf so they can never win the argmax,
# even when every off-diagonal correlation is negative.
mask = np.triu(np.ones_like(correlation_matrix, dtype=bool), k=1)
masked = np.where(mask, correlation_matrix, -np.inf)
max_idx = np.unravel_index(np.argmax(masked), masked.shape)
print(f"\nMost correlated: {pairs[max_idx[0]]} & {pairs[max_idx[1]]}")
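When more than the single best pair is of interest, the upper triangle of the correlation matrix can be flattened and sorted. The sketch below uses np.triu_indices for that. The series a, b, c and their labels are synthetic stand-ins, not real market data.

```python
import numpy as np

# Synthetic return series: b is built to follow a, c is independent
rng = np.random.default_rng(0)
a = rng.normal(0, 0.01, 250)
b = 0.8 * a + rng.normal(0, 0.005, 250)
c = rng.normal(0, 0.01, 250)
names = ['A', 'B', 'C']

corr = np.corrcoef(np.vstack([a, b, c]))  # rows are variables

# Row/column indices of every unique pair above the diagonal
rows, cols = np.triu_indices(len(names), k=1)
pair_corr = corr[rows, cols]

# Sort pairs from most to least correlated
order = np.argsort(pair_corr)[::-1]
ranked = [(names[rows[i]], names[cols[i]], float(pair_corr[i])) for i in order]

for first, second, value in ranked:
    print(f"{first}-{second}: {value:.3f}")
```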

Portfolio Optimization

Mean-variance portfolio search with NumPy, using randomly sampled weights

Python

import numpy as np

def optimize_portfolio(returns, num_portfolios=10000):
    """
    Find optimal portfolio weights using random sampling
    """
    num_assets = returns.shape[1]
    mean_returns = np.mean(returns, axis=0)
    cov_matrix = np.cov(returns.T)

    results = np.zeros((num_portfolios, 3))
    weights_record = np.zeros((num_portfolios, num_assets))

    for i in range(num_portfolios):
        # Random weights
        weights = np.random.random(num_assets)
        weights /= weights.sum()
        weights_record[i] = weights

        # Portfolio return (annualized)
        portfolio_return = np.dot(weights, mean_returns) * 252

        # Portfolio volatility (annualized)
        portfolio_std = np.sqrt(np.dot(weights.T, np.dot(cov_matrix, weights))) * np.sqrt(252)

        # Sharpe ratio (risk-free rate taken as zero)
        sharpe = portfolio_return / portfolio_std

        results[i] = [portfolio_return, portfolio_std, sharpe]

    # Find optimal portfolio (max Sharpe)
    max_sharpe_idx = np.argmax(results[:, 2])
    return weights_record[max_sharpe_idx], results[max_sharpe_idx]

# Example with 3 assets
np.random.seed(42)
returns = np.random.randn(252, 3) * np.array([0.01, 0.015, 0.02])

optimal_weights, metrics = optimize_portfolio(returns)

print(f"Optimal Weights: {optimal_weights.round(3)}")
print(f"Expected Return: {metrics[0]:.2%}")
print(f"Expected Volatility: {metrics[1]:.2%}")
print(f"Sharpe Ratio: {metrics[2]:.2f}")
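The sampling loop above can also be written without a Python loop by drawing all weight vectors at once. The following is a sketch of that idea under the same assumptions (zero risk-free rate, 252 trading days, synthetic returns). It uses np.einsum to evaluate the quadratic form w·Σ·w for every row of weights.

```python
import numpy as np

rng = np.random.default_rng(42)
returns = rng.normal(0, 1, (252, 3)) * np.array([0.01, 0.015, 0.02])

mean_returns = returns.mean(axis=0)
cov_matrix = np.cov(returns, rowvar=False)  # columns are assets

# Draw all candidate portfolios at once; each row sums to 1
weights = rng.random((10_000, 3))
weights /= weights.sum(axis=1, keepdims=True)

# Annualized return and volatility for every row of weights
port_returns = weights @ mean_returns * 252
port_vols = np.sqrt(np.einsum('ij,jk,ik->i', weights, cov_matrix, weights) * 252)

sharpe = port_returns / port_vols  # risk-free rate taken as zero
best = np.argmax(sharpe)

print(f"Best weights: {weights[best].round(3)}")
print(f"Sharpe ratio: {sharpe[best]:.2f}")
```

Random sampling only explores the weight space. It does not guarantee the true maximum-Sharpe portfolio.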

Vectorized Signal Generation

Fast signal generation without loops

Python

import numpy as np

def generate_signals(close, fast_period=10, slow_period=30):
    """
    Generate trading signals using vectorized operations
    """
    # Calculate trailing SMAs (entries before index period - 1 are partial and not used)
    fast_sma = np.convolve(close, np.ones(fast_period)/fast_period, mode='full')[:len(close)]
    slow_sma = np.convolve(close, np.ones(slow_period)/slow_period, mode='full')[:len(close)]

    # Initialize signals
    signals = np.zeros(len(close))

    # Valid index (after slow period)
    valid_idx = slow_period - 1

    # Vectorized signal generation
    signals[valid_idx:] = np.where(
        fast_sma[valid_idx:] > slow_sma[valid_idx:],
        1,  # Buy
        np.where(
            fast_sma[valid_idx:] < slow_sma[valid_idx:],
            -1,  # Sell
            0  # Neutral
        )
    )

    # Detect crossovers (signal changes): +2 is a flip from short to long, -2 from long to short
    crossovers = np.diff(signals, prepend=0)
    return signals, crossovers

# Example
np.random.seed(42)
close = np.random.randn(100).cumsum() + 100

signals, crossovers = generate_signals(close)

print(f"Buy signals: {np.sum(crossovers == 2)}")
print(f"Sell signals: {np.sum(crossovers == -2)}")
print(f"Current position: {'Long' if signals[-1] == 1 else 'Short' if signals[-1] == -1 else 'Flat'}")
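A signal array like this can be turned into a rough vectorized backtest by multiplying each bar's return by the position held going into that bar. The sketch below is a simplified illustration: the price series is synthetic, the signal is a stand-in (price above or below its 20-bar average), and transaction costs and slippage are ignored.

```python
import numpy as np

rng = np.random.default_rng(42)
close = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 200)))

# Stand-in signal: +1 above the 20-bar average, -1 below, 0 until enough data
period = 20
sma = np.convolve(close, np.ones(period) / period, mode='full')[:len(close)]
signals = np.zeros(len(close))
signals[period - 1:] = np.sign(close[period - 1:] - sma[period - 1:])

# Shift signals by one bar: a signal computed at the close of bar t
# is only traded from bar t + 1, which avoids look-ahead bias
positions = np.concatenate(([0.0], signals[:-1]))

# Simple per-bar returns, with 0 for the first bar
asset_returns = np.concatenate(([0.0], np.diff(close) / close[:-1]))

strategy_returns = positions * asset_returns
equity = np.cumprod(1 + strategy_returns)

print(f"Final equity multiple: {equity[-1]:.3f}")
```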

Calculate Maximum Drawdown

Efficient drawdown calculation

Python

import numpy as np

def calculate_drawdown(equity_curve):
    """
    Calculate drawdown series and maximum drawdown
    """
    # Running maximum
    running_max = np.maximum.accumulate(equity_curve)

    # Drawdown as a fraction of the running peak (0 at a new high, negative otherwise)
    drawdown = (equity_curve - running_max) / running_max

    # Maximum drawdown (most negative value)
    max_drawdown = np.min(drawdown)

    return drawdown, max_drawdown

# Example equity curve
np.random.seed(42)
returns = np.random.randn(252) * 0.02
equity = 10000 * np.cumprod(1 + returns)

drawdown, max_dd = calculate_drawdown(equity)

print(f"Maximum Drawdown: {max_dd:.2%}")
print(f"Current Drawdown: {drawdown[-1]:.2%}")
print(f"Days in drawdown: {np.sum(drawdown < 0)}")
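The count above is the total number of days spent below a prior peak. The longest single stretch is a different quantity, and one way to compute it without a loop is sketched below. The helper name longest_drawdown_duration and the small equity series are hypothetical.

```python
import numpy as np

def longest_drawdown_duration(equity_curve):
    """Length, in bars, of the longest stretch below a prior peak (hypothetical helper)."""
    equity_curve = np.asarray(equity_curve, dtype=float)
    underwater = equity_curve < np.maximum.accumulate(equity_curve)

    # Pad with False so every underwater run has both a start and an end edge
    padded = np.concatenate(([False], underwater, [False])).astype(int)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)

    return int((ends - starts).max()) if starts.size else 0

equity = np.array([100, 105, 103, 101, 106, 104, 107, 107, 102, 100, 99, 108])
print(longest_drawdown_duration(equity))  # → 3
```

In this series the underwater runs last 2, 1, and 3 bars, so the longest duration is 3.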

Common Use Cases

Price array manipulation

Technical indicator calculation

Statistical analysis

Monte Carlo simulations

Portfolio optimization

Correlation analysis

Vectorized backtesting

Risk metrics calculation

Random number generation

Linear algebra operations

Best Practices & Common Pitfalls

Use Vectorization

Replace Python loops with NumPy operations for massive speed improvements in backtesting.

Pre-allocate Arrays

Create arrays with np.zeros() or np.empty() before filling them. Growing an array with np.append() copies the whole array on every call.

Use Views, Not Copies

Understand when NumPy creates views vs copies to manage memory efficiently with large datasets.

Watch for NaN Propagation

NaN values propagate through calculations. Use np.nanmean(), np.nanstd() for handling missing data.

Indexing Returns Views

Basic slicing returns a view, so modifying a slice modifies the original array. Use .copy() if you need independent data. Boolean and integer-array indexing return copies instead.

Float Precision

Be aware of floating-point precision issues when comparing prices. Use np.isclose() for comparisons.
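Several of the pitfalls above can be seen in a few lines. The sketch below uses made-up prices to show a slice acting as a view, .copy() producing independent data, NaN propagating through np.mean, and np.isclose succeeding where exact equality fails.

```python
import numpy as np

prices = np.array([1.10, 1.11, 1.12, 1.13, 1.14])

# Basic slicing returns a view: writing to it changes the original
window = prices[1:3]
window[0] = 9.99
view_changed_original = bool(prices[1] == 9.99)

# .copy() gives independent data
prices[1] = 1.11  # restore the original value
safe = prices[1:3].copy()
safe[0] = 9.99
copy_left_original = bool(prices[1] == 1.11)

# NaN propagates through ordinary reductions; the nan* variants skip it
with_gap = np.array([1.10, np.nan, 1.12])
plain_mean = np.mean(with_gap)
nan_aware_mean = np.nanmean(with_gap)

# Exact float equality is fragile; np.isclose compares within a tolerance
exact = (0.1 + 0.2) == 0.3
close_enough = bool(np.isclose(0.1 + 0.2, 0.3))

print(view_changed_original, copy_left_original)
print(plain_mean, nan_aware_mean)
print(exact, close_enough)
```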

Additional Resources

Official Documentation

Trading-Specific Resources

Vectorized Backtesting with NumPy
Monte Carlo Methods in Finance
Portfolio Optimization Techniques

Next Steps

NumPy is the foundation. Now explore pandas for data manipulation and specialized trading libraries:

pandas - DataFrames
TA-Lib - Indicators
vectorbt - Fast Backtesting