{
"cells": [
{
"cell_type": "markdown",
"id": "fb3aacb0",
"metadata": {},
"source": [
"# =============================================================================\n",
"# Quantitative Trading — Strategy Development & Backtesting Demo\n",
"# 量化交易 — 策略开发与回测演示\n",
"\n",
"# =============================================================================\n",
"#\n",
"# 本文件是数据管道 (quant_data_pipeline_demo.py) 的续集。\n",
"# This file is the sequel to the data pipeline demo.\n",
"#\n",
"# Topics covered / 涵盖主题:\n",
"#   1. Technical Indicators   技术指标 (MA, RSI, MACD, Bollinger Bands)\n",
"#   2. Signal Generation      信号生成 (entry & exit rules)\n",
"#   3. Two Demo Strategies    两个示范策略:\n",
"#        A. Dual Moving Average Crossover  双均线金叉死叉策略\n",
"#        B. RSI Mean Reversion             RSI 均值回归策略\n",
"#   4. Vectorized Backtest Engine  向量化回测引擎\n",
"#   5. Performance Metrics    绩效指标\n",
"#        (Sharpe, Sortino, Max Drawdown, Win Rate …)\n",
"#   6. Visualization          可视化\n",
"#\n",
"# Prerequisites / 前置条件:\n",
"#   pip install numpy pandas matplotlib scipy\n",
"#\n",
"# Running / 运行方式:\n",
"#   python quant_strategy_backtest_demo.py\n",
"\n",
"# ============================================================================="
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "156c36ec",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"import matplotlib.gridspec as gridspec\n",
"from scipy import stats\n",
"import warnings\n",
"warnings.filterwarnings('ignore')\n",
"\n",
"# 中文字体配置 / Chinese font config\n",
"plt.rcParams['font.sans-serif'] = ['WenQuanYi Zen Hei', 'Arial Unicode MS', 'SimHei', 'DejaVu Sans']\n",
"plt.rcParams['axes.unicode_minus'] = False\n",
"\n",
"np.random.seed(42)\n",
"print(\"=\" * 70)\n",
"print(\" 量化交易策略开发与回测演示\")\n",
"print(\" Quantitative Trading: Strategy Development & Backtesting Demo\")\n",
"print(\"=\" * 70)"
]
},
{
"cell_type": "markdown",
"id": "62cbe290",
"metadata": {},
"source": [
"# =============================================================================\n",
"# SECTION 0: Synthetic Price Data 合成价格数据\n",
"\n",
"# -----------------------------------------------------------------------------\n",
"# We simulate a single stock using Geometric Brownian Motion (几何布朗运动),\n",
"# the classical model that underlies the Black-Scholes formula.\n",
"#\n",
"# GBM formula:\n",
"#   dS = μ·S·dt + σ·S·dW\n",
"#\n",
"# Discrete form (what we actually compute each day):\n",
"#   S_t = S_{t-1} · exp( (μ - σ²/2)·dt + σ·√dt·ε )\n",
"#\n",
"# where:\n",
"#   μ  = drift / 年化漂移率 (expected annual return)\n",
"#   σ  = volatility / 年化波动率\n",
"#   dt = 1/252 (one trading day as a fraction of a year)\n",
"#   ε  ~ N(0,1) (standard normal random shock / 标准正态随机扰动)\n",
"\n",
"# ============================================================================="
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "04ed6429",
"metadata": {},
"outputs": [],
"source": [
"def generate_price_series(\n",
"    n_days: int = 1500,\n",
"    mu: float = 0.10,      # 年化预期收益率 / annual expected return\n",
"    sigma: float = 0.25,   # 年化波动率 / annual volatility\n",
"    s0: float = 100.0,     # 初始价格 / initial price\n",
"    seed: int = 42,\n",
") -> pd.Series:\n",
"    \"\"\"\n",
"    Generate a synthetic daily price series via GBM.\n",
"    用几何布朗运动生成合成日线价格序列。\n",
"    \"\"\"\n",
"    np.random.seed(seed)\n",
"    dt = 1.0 / 252                     # 每个交易日占一年的比例 / one trading day as a fraction of a year\n",
"    epsilon = np.random.randn(n_days)  # 每日随机冲击 / daily random shocks\n",
"    log_returns = (mu - 0.5 * sigma ** 2) * dt + sigma * np.sqrt(dt) * epsilon\n",
"    # 对数收益累加后取指数 → 价格路径 / exponentiate the cumulative log returns → price path\n",
"    prices = s0 * np.exp(np.cumsum(log_returns))\n",
"\n",
"    # 生成工作日日期序列 / generate business-day date index\n",
"    dates = pd.bdate_range(start=\"2019-01-02\", periods=n_days)\n",
"    return pd.Series(prices, index=dates, name=\"close\")\n",
"\n",
"\n",
"price = generate_price_series()\n",
"print(f\"\\n[数据] 生成模拟股票价格: {len(price)} 个交易日\")\n",
"print(f\" 价格区间: {price.min():.2f} ~ {price.max():.2f}\")"
]
},
{
"cell_type": "markdown",
"id": "ccda8a1f",
"metadata": {},
"source": [
"# =============================================================================\n",
"# SECTION 1: Technical Indicators 技术指标\n",
"\n",
"# -----------------------------------------------------------------------------\n",
"# Technical indicators transform raw price/volume data into signals.\n",
"# 技术指标将原始价格/成交量数据转化为交易信号。\n",
"#\n",
"# They are divided into two broad families:\n",
"# 主要分为两大类:\n",
"#\n",
"# ① Trend-following indicators 趋势跟随指标\n",
"#    → Moving Averages (MA), MACD\n",
"#    → Work well in trending markets (趋势市中效果好)\n",
"#\n",
"# ② Oscillators / Mean-reversion indicators 震荡/均值回归指标\n",
"#    → RSI, Bollinger Bands\n",
"#    → Work well in range-bound / choppy markets (震荡市中效果好)\n",
"\n",
"# =============================================================================\n",
"\n",
"# ── 1-A Simple Moving Average 简单移动平均线 (SMA) ──────────────────────────\n",
"#\n",
"#   SMA_n(t) = (P_{t} + P_{t-1} + … + P_{t-n+1}) / n\n",
"#\n",
"# The SMA smooths out daily noise to reveal the underlying trend.\n",
"# SMA 平滑日内噪音,揭示潜在趋势。\n",
"# A longer window → smoother, but lags more behind recent price action.\n",
"# 窗口越长 → 越平滑,但对价格变化的反应越滞后。"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8ec5acb4",
"metadata": {},
"outputs": [],
"source": [
"def sma(prices: pd.Series, window: int) -> pd.Series:\n",
"    \"\"\"Simple Moving Average / 简单移动平均线\"\"\"\n",
"    return prices.rolling(window=window).mean()\n",
"\n",
"\n",
"# ── 1-B Exponential Moving Average 指数移动平均线 (EMA) ───────────────────\n",
"#\n",
"# EMA gives MORE weight to recent prices (recent data matters more).\n",
"# EMA 给予近期价格更高权重(近期数据更重要)。\n",
"#\n",
"#   EMA_t = α · P_t + (1 - α) · EMA_{t-1}\n",
"#   where α = 2 / (n + 1) (smoothing factor / 平滑因子)\n",
"#\n",
"# EMA reacts faster than SMA to price changes.\n",
"# EMA 对价格变动的反应比 SMA 更灵敏。\n",
"\n",
"def ema(prices: pd.Series, span: int) -> pd.Series:\n",
"    \"\"\"Exponential Moving Average / 指数移动平均线\"\"\"\n",
"    return prices.ewm(span=span, adjust=False).mean()\n",
"\n",
"\n",
"# ── 1-C RSI 相对强弱指数 (Relative Strength Index) ─────────────────────────\n",
"#\n",
"# RSI measures the speed and magnitude of recent price changes.\n",
"# RSI 衡量近期价格变动的速度和幅度。\n",
"#\n",
"# Formula:\n",
"#   RS  = average_gain / average_loss (over last n days)\n",
"#   RSI = 100 - 100 / (1 + RS)\n",
"#\n",
"# Interpretation / 指标解读:\n",
"#   RSI > 70 → Overbought 超买 (price may be due for a pullback / 价格可能回调)\n",
"#   RSI < 30 → Oversold 超卖 (price may be due for a bounce / 价格可能反弹)\n",
"#   RSI = 50 → Neutral 中性\n",
"\n",
"def rsi(prices: pd.Series, window: int = 14) -> pd.Series:\n",
"    \"\"\"\n",
"    Compute Wilder's RSI.\n",
"    计算 Wilder 平滑法 RSI。\n",
"    \"\"\"\n",
"    delta = prices.diff()          # 每日价格变化 / daily price change\n",
"    gain = delta.clip(lower=0)     # 只保留上涨部分 / keep only up-days\n",
"    loss = -delta.clip(upper=0)    # 只保留下跌部分 / keep only down-days\n",
"\n",
"    # Wilder's smoothing is an EMA with alpha = 1/n (the same as span = 2*n - 1)\n",
"    avg_gain = gain.ewm(alpha=1.0 / window, adjust=False).mean()\n",
"    avg_loss = loss.ewm(alpha=1.0 / window, adjust=False).mean()\n",
"\n",
"    rs = avg_gain / avg_loss       # 相对强弱值 / relative strength\n",
"    return 100 - (100 / (1 + rs))  # 转换为 0~100 范围 / map to the 0-100 range\n",
"\n",
"\n",
"# ── 1-D MACD 指数平滑异同移动平均线 ────────────────────────────────────────\n",
"#\n",
"# MACD reveals the relationship between two EMAs.\n",
"# MACD 揭示两条 EMA 之间的关系。\n",
"#\n",
"# Components / 构成:\n",
"#   MACD Line   MACD线 = EMA(12) - EMA(26) (fast minus slow / 快线减慢线)\n",
"#   Signal Line 信号线 = EMA(9) of MACD Line (trigger line / 触发线)\n",
"#   Histogram   柱状图 = MACD Line - Signal Line\n",
"#\n",
"# Trading rules / 交易规则:\n",
"#   MACD crosses above Signal → Bullish (金叉, buy signal / 买入信号)\n",
"#   MACD crosses below Signal → Bearish (死叉, sell signal / 卖出信号)\n",
"\n",
"def macd(prices: pd.Series,\n",
"         fast: int = 12, slow: int = 26, signal: int = 9\n",
"         ) -> pd.DataFrame:\n",
"    \"\"\"\n",
"    Compute MACD, Signal line, and Histogram.\n",
"    计算 MACD线、信号线和柱状图。\n",
"    \"\"\"\n",
"    ema_fast = ema(prices, fast)\n",
"    ema_slow = ema(prices, slow)\n",
"    macd_line = ema_fast - ema_slow        # MACD 线 (DIF) / MACD line\n",
"    signal_line = ema(macd_line, signal)   # 信号线 (DIF的EMA) / signal line: EMA of the MACD line\n",
"    histogram = macd_line - signal_line    # 柱状图 (MACD Bar) / histogram\n",
"    return pd.DataFrame({\n",
"        \"macd\": macd_line,\n",
"        \"signal\": signal_line,\n",
"        \"histogram\": histogram,\n",
"    })\n",
"\n",
"\n",
"# ── 1-E Bollinger Bands 布林带 ─────────────────────────────────────────────\n",
"#\n",
"# Bollinger Bands place upper/lower envelopes around a moving average.\n",
"# 布林带在移动平均线上下各画一条\"包络线\"。\n",
"#\n",
"# Formula:\n",
"#   Middle Band 中轨 = SMA(n)\n",
"#   Upper Band  上轨 = SMA(n) + k·σ_n (k = 2 by default / 默认 k=2)\n",
"#   Lower Band  下轨 = SMA(n) - k·σ_n\n",
"#\n",
"# where σ_n is the rolling standard deviation / 滚动标准差\n",
"#\n",
"# When price touches the lower band → oversold area (超卖区域)\n",
"# When price touches the upper band → overbought area (超买区域)\n",
"# Band width (带宽) often contracts before large moves (波动收窄常预示突破)\n",
"\n",
"def bollinger_bands(prices: pd.Series, window: int = 20, k: float = 2.0\n",
"                    ) -> pd.DataFrame:\n",
"    \"\"\"\n",
"    Compute Bollinger Bands.\n",
"    计算布林带(上轨、中轨、下轨)。\n",
"    \"\"\"\n",
"    mid = sma(prices, window)            # 中轨 (SMA) / middle band\n",
"    std = prices.rolling(window).std()   # 滚动标准差 / rolling standard deviation\n",
"    upper = mid + k * std                # 上轨 / upper band\n",
"    lower = mid - k * std                # 下轨 / lower band\n",
"    # %B indicator: where is the current price within the band? (0 = lower band, 1 = upper band)\n",
"    # %B 指标:当前价格在带宽中的位置 (0=下轨, 1=上轨)\n",
"    pct_b = (prices - lower) / (upper - lower)\n",
"    return pd.DataFrame({\n",
"        \"upper\": upper, \"mid\": mid, \"lower\": lower, \"pct_b\": pct_b\n",
"    })\n",
"\n",
"\n",
"# Compute all indicators on our simulated price series\n",
"# 对模拟价格序列计算所有指标\n",
"sma20 = sma(price, 20)    # 20日均线 / 20-day SMA\n",
"sma60 = sma(price, 60)    # 60日均线 / 60-day SMA (longer trend)\n",
"rsi14 = rsi(price, 14)    # 14日RSI / 14-day RSI\n",
"macd_df = macd(price)     # MACD (12/26/9)\n",
"bb = bollinger_bands(price, window=20, k=2.0)\n",
"\n",
"print(\"\\n[指标] 技术指标计算完成:\")\n",
"print(f\" SMA20 — 首个有效值日期: {sma20.first_valid_index().date()}\")\n",
"print(f\" SMA60 — 首个有效值日期: {sma60.first_valid_index().date()}\")\n",
"print(f\" RSI14 — 首个有效值日期: {rsi14.first_valid_index().date()}\")\n",
"print(f\" MACD — 首个有效值日期: {macd_df['macd'].first_valid_index().date()}\")\n",
"print(f\" BollingerBands — 首个有效值日期: {bb['mid'].first_valid_index().date()}\")"
]
},
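{
"cell_type": "markdown",
"id": "example-wilder-alpha",
"metadata": {},
"source": [
"The `rsi` function smooths gains and losses with `alpha = 1/n`. As a rough check, and only as a sketch on made-up numbers, the snippet below compares that with a pandas EMA of `span = 2n - 1`; pandas maps a span to `alpha = 2 / (span + 1)`, so the two should agree. The series `x` and the names `wilder`, `by_span` and `max_gap` are illustrative and not part of the notebook.\n",
"\n",
"```python\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"n = 14\n",
"# Illustrative random data only (not the notebook's price series)\n",
"x = pd.Series(np.random.default_rng(0).normal(size=200))\n",
"\n",
"wilder = x.ewm(alpha=1.0 / n, adjust=False).mean()     # alpha = 1/n\n",
"by_span = x.ewm(span=2 * n - 1, adjust=False).mean()   # alpha = 2 / (span + 1) = 1/n\n",
"\n",
"# Largest absolute difference between the two smoothed series\n",
"max_gap = float((wilder - by_span).abs().max())\n",
"print(max_gap)\n",
"```\n",
"\n",
"Passing `alpha` directly, as `rsi` does, keeps the link to Wilder's 1/n weighting visible in the code."
]
},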
{
"cell_type": "markdown",
"id": "739084bb",
"metadata": {},
"source": [
"# =============================================================================\n",
"# SECTION 2: Strategy A — Dual Moving Average Crossover\n",
"# 策略 A — 双均线金叉/死叉策略\n",
"\n",
"# -----------------------------------------------------------------------------\n",
"# One of the oldest and most intuitive trend-following strategies.\n",
"# 最古老也最直观的趋势跟随策略之一。\n",
"#\n",
"# Logic / 逻辑:\n",
"#   Golden Cross (金叉): short MA crosses ABOVE long MA → BUY (做多)\n",
"#   Death Cross (死叉): short MA crosses BELOW long MA → SELL (平仓)\n",
"#\n",
"# Rationale / 原理:\n",
"#   When the short-term average rises above the long-term average, it signals\n",
"#   that recent momentum is stronger than the historical trend → bullish.\n",
"#   短期均线上穿长期均线,意味着近期动能强于历史趋势 → 看涨。\n",
"#\n",
"# Parameters / 参数:\n",
"#   SHORT_WIN = 20 (fast line / 快线)\n",
"#   LONG_WIN  = 60 (slow line / 慢线)\n",
"\n",
"# ============================================================================="
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a99d5899",
"metadata": {},
"outputs": [],
"source": [
"SHORT_WIN = 20   # 短期均线窗口 / short-term MA window\n",
"LONG_WIN = 60    # 长期均线窗口 / long-term MA window\n",
"\n",
"ma_short = sma(price, SHORT_WIN)\n",
"ma_long = sma(price, LONG_WIN)\n",
"\n",
"# ── Signal generation 信号生成 ───────────────────────────────────────────────\n",
"#\n",
"# Signal (信号) = +1 when we should be LONG (持多仓), 0 when out of market (空仓)\n",
"#\n",
"# Step 1: raw_signal = 1 whenever short MA > long MA (short MA above long MA)\n",
"# Step 2: a crossover is any day where raw_signal differs from the previous day's value\n",
"#\n",
"# We use a \"position\" approach: hold the position until the signal reverses,\n",
"# so crossovers never need to be extracted explicitly.\n",
"# 使用\"持仓\"方式 — 持有直到信号翻转。\n",
"\n",
"# raw_signal: 1 = short above long (看多区域), 0 = short below long (看空区域)\n",
"# During the warm-up period the MAs are NaN, so the comparison is False and the signal is 0.\n",
"raw_signal = (ma_short > ma_long).astype(int)\n",
"\n",
"# Align signals: use yesterday's signal to trade today (avoid lookahead bias)\n",
"# 用昨天的信号决定今天的仓位,避免\"未来数据偷窥\" (前视偏差 / lookahead bias)\n",
"ma_signal = raw_signal.shift(1).fillna(0)\n",
"\n",
"print(\"\\n[策略A] 双均线信号生成完成\")\n",
"print(f\" 多头持仓天数 (Signal=1): {int(ma_signal.sum())} 天\")\n",
"print(f\" 空仓天数 (Signal=0): {int((ma_signal == 0).sum())} 天\")"
]
},
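{
"cell_type": "markdown",
"id": "example-crossover-events",
"metadata": {},
"source": [
"The strategy cell holds a 0/1 position and never lists the individual golden and death crosses. If the crossover dates themselves are wanted (for plotting markers, say), one possible sketch is to difference the 0/1 series. The `regime` values below are hypothetical stand-ins for `raw_signal`, and the names `change`, `golden_cross` and `death_cross` are illustrative.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical 0/1 regime series (1 = short MA above long MA), for illustration only\n",
"regime = pd.Series([0, 0, 1, 1, 1, 0, 0, 1])\n",
"\n",
"change = regime.diff().fillna(0)   # +1 on a 0 -> 1 flip, -1 on a 1 -> 0 flip\n",
"golden_cross = change == 1         # short MA moved above long MA\n",
"death_cross = change == -1         # short MA moved below long MA\n",
"\n",
"print(list(regime.index[golden_cross]))   # [2, 7]\n",
"print(list(regime.index[death_cross]))    # [5]\n",
"```\n",
"\n",
"Applied to `raw_signal`, the same two masks would select the dates on which the position changes one day later, after the `shift(1)`."
]
},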
{
"cell_type": "markdown",
"id": "dd6312ac",
"metadata": {},
"source": [
"# =============================================================================\n",
"# SECTION 3: Strategy B — RSI Mean Reversion\n",
"# 策略 B — RSI 均值回归策略\n",
"\n",
"# -----------------------------------------------------------------------------\n",
"# This is a contrarian strategy: buy when the market seems \"too weak\",\n",
"# sell when it seems \"too strong\".\n",
"# 这是一个逆势策略:市场\"跌过头\"时买入,\"涨过头\"时卖出。\n",
"#\n",
"# Logic / 逻辑:\n",
"#   RSI drops below oversold level (超卖线, default 30) → go LONG (做多)\n",
"#   RSI rises above overbought level (超买线, default 70) → go SHORT (做空)\n",
"#   RSI crosses back through 50 → close the position (平仓)\n",
"#\n",
"# This exploits mean reversion (均值回归): extreme prices tend to revert.\n",
"# 利用均值回归特性:极端价格倾向于回归均值。\n",
"#\n",
"# Risk / 风险:\n",
"#   In a strong trend, RSI can stay oversold/overbought for long stretches,\n",
"#   which leads to a run of losing trades.\n",
"#   在强趋势中,RSI 可以长时间停留在超卖/超买区域,造成连续亏损。\n",
"\n",
"# ============================================================================="
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c428d624",
"metadata": {},
"outputs": [],
"source": [
"RSI_OVERSOLD = 30     # 超卖线 / oversold threshold\n",
"RSI_OVERBOUGHT = 70   # 超买线 / overbought threshold\n",
"\n",
"def rsi_signal(rsi_series: pd.Series,\n",
"               oversold: float = 30,\n",
"               overbought: float = 70) -> pd.Series:\n",
"    \"\"\"\n",
"    Generate long/short/flat signals from RSI.\n",
"    根据 RSI 生成多空平信号。\n",
"\n",
"    Returns a Series of:\n",
"        +1 → Long (做多)\n",
"        -1 → Short (做空)\n",
"         0 → Flat (空仓, no position)\n",
"    \"\"\"\n",
"    position = pd.Series(0, index=rsi_series.index, dtype=float)\n",
"    current_pos = 0   # 当前持仓状态 / current position state\n",
"\n",
"    for i in range(1, len(rsi_series)):\n",
"        r = rsi_series.iloc[i]\n",
"        if pd.isna(r):\n",
"            position.iloc[i] = 0\n",
"            continue\n",
"\n",
"        # Entry rules / 入场规则\n",
"        if r < oversold and current_pos == 0:\n",
"            current_pos = 1    # 超卖 → 做多 / oversold → go long\n",
"\n",
"        elif r > overbought and current_pos == 0:\n",
"            current_pos = -1   # 超买 → 做空 / overbought → go short\n",
"\n",
"        # Exit rules / 出场规则\n",
"        # Exit long when RSI recovers above 50 (回到中性区域 / back to neutral)\n",
"        elif current_pos == 1 and r > 50:\n",
"            current_pos = 0\n",
"\n",
"        # Exit short when RSI falls below 50\n",
"        elif current_pos == -1 and r < 50:\n",
"            current_pos = 0\n",
"\n",
"        position.iloc[i] = current_pos\n",
"\n",
"    return position\n",
"\n",
"\n",
"rsi_pos = rsi_signal(rsi14, RSI_OVERSOLD, RSI_OVERBOUGHT)\n",
"\n",
"# Lag the position by 1 day to avoid lookahead bias / 滞后一天避免前视偏差\n",
"rsi_signal_shifted = rsi_pos.shift(1).fillna(0)\n",
"\n",
"print(\"\\n[策略B] RSI信号生成完成\")\n",
"print(f\" 多头持仓天数 (Signal=+1): {int((rsi_signal_shifted == 1).sum())} 天\")\n",
"print(f\" 空头持仓天数 (Signal=-1): {int((rsi_signal_shifted == -1).sum())} 天\")\n",
"print(f\" 空仓天数 (Signal= 0): {int((rsi_signal_shifted == 0).sum())} 天\")"
]
},
{
"cell_type": "markdown",
"id": "3bb7e0d4",
"metadata": {},
"source": [
"# =============================================================================\n",
"# SECTION 4: Vectorized Backtest Engine 向量化回测引擎\n",
"\n",
"# -----------------------------------------------------------------------------\n",
"# A backtest (回测) simulates how a strategy would have performed\n",
"# on historical data. It is the primary tool for validating a strategy\n",
"# before risking real money.\n",
"# 回测是在历史数据上模拟策略表现的工具,是真实投资前验证策略的主要手段。\n",
"#\n",
"# Two main backtest styles / 两种主要回测方式:\n",
"#\n",
"# ① Vectorized backtest 向量化回测\n",
"#    - Compute all positions & P&L as array operations at once (numpy/pandas)\n",
"#    - Very fast; good for strategy exploration\n",
"#    - 所有仓位和盈亏一次性用数组运算计算,速度极快,适合策略探索\n",
"#\n",
"# ② Event-driven backtest 事件驱动回测\n",
"#    - Simulate time step-by-step, reacting to each market event\n",
"#    - More realistic (handles fills, slippage, latency, order queuing), but slower\n",
"#    - 逐笔模拟市场事件,更真实(考虑成交、滑点、延迟等),速度较慢\n",
"#\n",
"# We use the vectorized approach here for clarity and speed.\n",
"# 此处使用向量化方式,兼顾清晰度和速度。\n",
"#\n",
"# Cost model 交易成本模型:\n",
"#    - Commission (佣金): charged each time you trade (per trade)\n",
"#    - Slippage (滑点): the difference between the expected fill price and\n",
"#      the actual fill price (price moves against you)\n",
"#    We approximate both together as a fixed percentage of the trade value.\n",
"#    两者合并近似为交易金额的固定比例。\n",
"\n",
"# ============================================================================="
]
},
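{
"cell_type": "markdown",
"id": "example-cost-arithmetic",
"metadata": {},
"source": [
"To make the cost model concrete, here is a small worked example under the same rule the backtester applies: the daily cost is the absolute change in position times `cost_per_trade`. The position path is hypothetical and the names `position`, `turnover` and `cost` are illustrative.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"cost_per_trade = 0.001   # 0.1% one-way, the notebook's default\n",
"# Hypothetical position path: flat -> long -> long -> flat -> short -> long\n",
"position = pd.Series([0, 1, 1, 0, -1, 1], dtype=float)\n",
"\n",
"turnover = position.diff().fillna(0).abs()   # 1 for an entry or exit, 2 for a short-to-long flip\n",
"cost = turnover * cost_per_trade             # fraction of capital paid on each day\n",
"\n",
"print(turnover.tolist())      # [0.0, 1.0, 0.0, 1.0, 1.0, 2.0]\n",
"print(round(cost.sum(), 6))   # 0.005\n",
"```\n",
"\n",
"A direct flip from short to long counts as two units of turnover, so it is charged twice the one-way cost, which matches closing one position and opening the opposite one."
]
},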
{
"cell_type": "code",
"execution_count": null,
"id": "1d6b6a55",
"metadata": {},
"outputs": [],
"source": [
"class VectorizedBacktester:\n",
"    \"\"\"\n",
"    A simple vectorized backtesting engine.\n",
"    简单的向量化回测引擎。\n",
"\n",
"    Assumptions / 假设:\n",
"      • Long-only or long/short positions\n",
"      • The caller passes a signal that is already lagged by one day, so signal[t]\n",
"        (decided at the close of day t-1) earns the close-to-close return of day t.\n",
"        This stands in for a next-day-open fill (用下一天开盘价成交的近似),\n",
"        because no open prices are simulated.\n",
"      • Round-trip cost (往返交易成本) = 2 × cost_per_trade\n",
"        (pay cost on entry AND exit / 进出各收一次)\n",
"      • No leverage (无杠杆), position size is 100% of capital when in trade\n",
"    \"\"\"\n",
"\n",
"    def __init__(\n",
"        self,\n",
"        prices: pd.Series,\n",
"        signal: pd.Series,\n",
"        cost_per_trade: float = 0.001,          # 0.1% one-way / 单向 0.1% (含佣金+滑点)\n",
"        initial_capital: float = 1_000_000.0,   # 初始资金 / initial capital\n",
"        name: str = \"Strategy\",\n",
"    ):\n",
"        self.prices = prices\n",
"        self.signal = signal.reindex(prices.index).fillna(0)\n",
"        self.cost_per_trade = cost_per_trade\n",
"        self.initial_capital = initial_capital\n",
"        self.name = name\n",
"        self._run()\n",
"\n",
"    def _run(self):\n",
"        \"\"\"Core backtesting logic. 核心回测逻辑。\"\"\"\n",
"        prices = self.prices\n",
"        signal = self.signal\n",
"\n",
"        # ── Daily price return 日收益率 ────────────────────────────────────\n",
"        daily_ret = prices.pct_change().fillna(0)\n",
"\n",
"        # ── Strategy return (before costs) 策略日收益率(扣除成本前)─────────\n",
"        # Strategy return = signal × market return\n",
"        # 策略当日收益率 = 持仓方向 × 市场当日收益率\n",
"        strat_ret_gross = signal * daily_ret\n",
"\n",
"        # ── Transaction cost 交易成本 ──────────────────────────────────────\n",
"        # Detect position changes (signal changes from one day to the next)\n",
"        # 检测仓位变化(信号从一天到下一天发生变化)\n",
"        position_change = signal.diff().fillna(0).abs()  # >0 means we traded\n",
"        # Cost is charged each time position changes\n",
"        # 每次仓位变化时扣除成本\n",
"        cost = position_change * self.cost_per_trade\n",
"\n",
"        # ── Net strategy return 策略净收益率 ───────────────────────────────\n",
"        strat_ret_net = strat_ret_gross - cost\n",
"\n",
"        # ── Equity curve 净值曲线 ───────────────────────────────────────────\n",
"        # The equity curve tracks how the initial capital grows over time.\n",
"        # 净值曲线追踪初始资金随时间的增长。\n",
"        # (1 + daily_net_return) compounded every day\n",
"        equity = self.initial_capital * (1 + strat_ret_net).cumprod()\n",
"        equity_bh = self.initial_capital * (1 + daily_ret).cumprod()  # Buy & Hold benchmark\n",
"\n",
"        # ── Drawdown 回撤 ──────────────────────────────────────────────────\n",
"        # Drawdown measures how far we are from the peak at any point in time.\n",
"        # 回撤衡量当前净值距离历史最高点的跌幅。\n",
"        rolling_max = equity.cummax()\n",
"        drawdown = (equity - rolling_max) / rolling_max  # always <= 0\n",
"\n",
"        # Store results for later analysis\n",
"        self.daily_ret = daily_ret\n",
"        self.strat_ret = strat_ret_net\n",
"        self.equity = equity\n",
"        self.equity_bh = equity_bh\n",
"        self.drawdown = drawdown\n",
"        self.n_trades = int((position_change > 0).sum())\n",
"        self.total_cost = cost.sum()\n",
"\n",
"    # ── Performance metrics 绩效指标 ──────────────────────────────────────────\n",
"    #\n",
"    # A well-rounded strategy evaluation uses multiple metrics, because\n",
"    # no single number captures the full picture.\n",
"    # 全面的策略评估需要多个指标,因为单一数字无法描述全貌。\n",
"    #\n",
"    # Key metrics / 关键指标:\n",
"    #   Total Return  总收益率 — how much did we make in total?\n",
"    #   CAGR          年化复合增长率 — annualized compounded growth rate\n",
"    #   Sharpe Ratio  夏普比率 — return per unit of total risk (risk-adjusted)\n",
"    #   Sortino Ratio 索提诺比率 — return per unit of DOWNSIDE risk only\n",
"    #   Max Drawdown  最大回撤 — worst peak-to-trough decline\n",
"    #   Calmar Ratio  卡玛比率 — CAGR / Max Drawdown (reward vs worst loss)\n",
"    #   Win Rate      胜率 — fraction of days with positive P&L (this demo counts days, not trades)\n",
"    #   Profit Factor 盈亏比 — total profit / total loss\n",
"\n",
"    def metrics(self) -> dict:\n",
"        \"\"\"Compute and return a dictionary of performance metrics.\n",
"        计算并返回绩效指标字典。\"\"\"\n",
"        r = self.strat_ret\n",
"        eq = self.equity\n",
"        n = len(r)\n",
"        years = n / 252.0  # approximate years in sample / 样本年数估算\n",
"\n",
"        # Total return / 总收益率\n",
"        total_return = (eq.iloc[-1] / self.initial_capital) - 1\n",
"\n",
"        # CAGR 年化复合增长率\n",
"        # CAGR = (EndValue / StartValue)^(1/years) - 1\n",
"        cagr = (1 + total_return) ** (1 / years) - 1\n",
"\n",
"        # Annualized volatility 年化波动率\n",
"        ann_vol = r.std() * np.sqrt(252)\n",
"\n",
"        # Sharpe Ratio 夏普比率\n",
"        # Sharpe = (Mean excess return) / StdDev(return) × √252\n",
"        # Excess return = strategy return - risk-free rate\n",
"        # 超额收益率 = 策略收益率 - 无风险利率\n",
"        # We use 0 as risk-free rate for simplicity (or assume it's netted out)\n",
"        risk_free = 0.0\n",
"        sharpe = (r.mean() - risk_free / 252) / r.std() * np.sqrt(252) if r.std() > 0 else 0\n",
"\n",
"        # Sortino Ratio 索提诺比率\n",
"        # Like Sharpe but only penalizes DOWNSIDE volatility\n",
"        # 类似夏普,但只惩罚下行波动率(亏损波动率)\n",
"        # Simplification used here: CAGR divided by the annualized std of negative daily\n",
"        # returns. The textbook downside deviation is sqrt(mean(min(r, 0)^2)) instead.\n",
"        downside = r[r < 0]\n",
"        downside_std = downside.std() * np.sqrt(252) if len(downside) > 0 else 1e-9\n",
"        sortino = (cagr - risk_free) / downside_std if downside_std > 0 else 0\n",
"\n",
"        # Maximum Drawdown 最大回撤\n",
"        max_dd = self.drawdown.min()  # most negative value (最大负值)\n",
"\n",
"        # Calmar Ratio 卡玛比率\n",
"        # Calmar = CAGR / |Max Drawdown|\n",
"        calmar = cagr / abs(max_dd) if max_dd != 0 else 0\n",
"\n",
"        # Win rate 胜率 (fraction of trading days with positive return;\n",
"        # flat days have a return of 0 and therefore count as non-wins)\n",
"        win_rate = (r > 0).mean()\n",
"\n",
"        # Profit factor 盈亏比\n",
"        # = Sum of positive returns / |Sum of negative returns|\n",
"        gross_profit = r[r > 0].sum()\n",
"        gross_loss = abs(r[r < 0].sum())\n",
"        profit_factor = gross_profit / gross_loss if gross_loss > 0 else np.inf\n",
"\n",
"        return {\n",
"            \"总收益率 Total Return\": f\"{total_return:.2%}\",\n",
"            \"年化收益率 CAGR\": f\"{cagr:.2%}\",\n",
"            \"年化波动率 Ann. Volatility\": f\"{ann_vol:.2%}\",\n",
"            \"夏普比率 Sharpe Ratio\": f\"{sharpe:.3f}\",\n",
"            \"索提诺比率 Sortino Ratio\": f\"{sortino:.3f}\",\n",
"            \"最大回撤 Max Drawdown\": f\"{max_dd:.2%}\",\n",
"            \"卡玛比率 Calmar Ratio\": f\"{calmar:.3f}\",\n",
"            \"胜率 Win Rate\": f\"{win_rate:.2%}\",\n",
"            \"盈亏比 Profit Factor\": f\"{profit_factor:.3f}\",\n",
"            \"交易次数 # Trades\": str(self.n_trades),\n",
"            \"总成本 Total Cost\": f\"{self.total_cost:.4%}\",\n",
"        }\n",
"\n",
"    def print_metrics(self):\n",
"        \"\"\"Pretty-print the performance report. 格式化打印绩效报告。\"\"\"\n",
"        print(f\"\\n{'=' * 55}\")\n",
"        print(f\" 策略绩效报告 / Performance Report: {self.name}\")\n",
"        print(f\"{'=' * 55}\")\n",
"        for k, v in self.metrics().items():\n",
"            print(f\" {k:<35} {v}\")\n",
"        print(f\"{'=' * 55}\")"
]
},
{
"cell_type": "markdown",
"id": "ea411c63",
"metadata": {},
"source": [
"# =============================================================================\n",
"# SECTION 5: Run Backtests 执行回测\n",
"\n",
"# =============================================================================\n",
"\n",
"# ── Strategy A: MA Crossover 双均线策略 ──────────────────────────────────────"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e13a487c",
"metadata": {},
"outputs": [],
"source": [
"bt_ma = VectorizedBacktester(\n",
"    prices=price,\n",
"    signal=ma_signal,        # +1 = long, 0 = flat\n",
"    cost_per_trade=0.001,    # 0.1% per trade (reasonable for liquid stocks)\n",
"    name=\"双均线策略 (MA Crossover 20/60)\",\n",
")\n",
"\n",
"# ── Strategy B: RSI Mean Reversion RSI均值回归策略 ──────────────────────────\n",
"bt_rsi = VectorizedBacktester(\n",
"    prices=price,\n",
"    signal=rsi_signal_shifted,   # +1 = long, -1 = short, 0 = flat\n",
"    cost_per_trade=0.001,\n",
"    name=\"RSI均值回归策略 (RSI Mean Reversion 14)\",\n",
")\n",
"\n",
"# ── Benchmark: Buy & Hold 基准:买入并持有 ───────────────────────────────────\n",
"# Buy & Hold (买入持有) is always our benchmark: simply hold the asset forever.\n",
"# It requires zero skill and zero effort — any strategy must beat this to\n",
"# justify the extra complexity and transaction costs.\n",
"# 买入持有是永远的基准策略:无需技能、零成本。任何策略都必须超越它才有意义。\n",
"bt_bh = VectorizedBacktester(\n",
"    prices=price,\n",
"    signal=pd.Series(1, index=price.index, dtype=float),   # always long / 始终做多\n",
"    cost_per_trade=0.0,                                    # no trading costs / 无交易成本\n",
"    name=\"Buy & Hold 基准 (买入持有)\",\n",
")\n",
"\n",
"bt_ma.print_metrics()\n",
"bt_rsi.print_metrics()\n",
"bt_bh.print_metrics()"
]
},
{
"cell_type": "markdown",
"id": "9d601da1",
"metadata": {},
"source": [
"# =============================================================================\n",
"# SECTION 6: Visualization 可视化\n",
"\n",
"# ============================================================================="
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e6b6bfc8",
"metadata": {},
"outputs": [],
"source": [
"fig = plt.figure(figsize=(16, 22))\n",
"gs = gridspec.GridSpec(6, 2, figure=fig, hspace=0.45, wspace=0.3)\n",
"\n",
"# ── Plot 1: Price + MA signals 价格 + 均线信号 ────────────────────────────────\n",
"ax1 = fig.add_subplot(gs[0, :])  # span full width\n",
"ax1.plot(price, color=\"#1f77b4\", linewidth=1, label=\"价格 Price\")\n",
"ax1.plot(ma_short, color=\"orange\", linewidth=1.2, label=f\"SMA{SHORT_WIN} (快线)\")\n",
"ax1.plot(ma_long, color=\"red\", linewidth=1.2, label=f\"SMA{LONG_WIN} (慢线)\")\n",
"\n",
"# Shade long periods (持多仓的区间着色)\n",
"ax1.fill_between(\n",
"    price.index, price.min(), price.max(),\n",
"    where=(ma_signal == 1).values,\n",
"    alpha=0.12, color=\"green\", label=\"多头持仓区间 Long Period\"\n",
")\n",
"ax1.set_title(\"策略A — 双均线信号 (MA Crossover Signals)\", fontsize=13, fontweight=\"bold\")\n",
"ax1.legend(loc=\"upper left\", fontsize=8)\n",
"ax1.set_ylabel(\"价格 Price\")\n",
"ax1.grid(alpha=0.3)\n",
"\n",
"# ── Plot 2: RSI RSI指标 ────────────────────────────────────────────────────\n",
"ax2 = fig.add_subplot(gs[1, :])\n",
"ax2.plot(rsi14, color=\"purple\", linewidth=1)\n",
"ax2.axhline(RSI_OVERBOUGHT, color=\"red\", linestyle=\"--\", linewidth=1, label=f\"超买线 {RSI_OVERBOUGHT}\")\n",
"ax2.axhline(RSI_OVERSOLD, color=\"green\", linestyle=\"--\", linewidth=1, label=f\"超卖线 {RSI_OVERSOLD}\")\n",
"ax2.axhline(50, color=\"gray\", linestyle=\":\", linewidth=0.8)\n",
"ax2.fill_between(rsi14.index, RSI_OVERSOLD, rsi14,\n",
"                 where=(rsi14 < RSI_OVERSOLD), alpha=0.25, color=\"green\",\n",
"                 label=\"超卖区域 Oversold\")\n",
"ax2.fill_between(rsi14.index, rsi14, RSI_OVERBOUGHT,\n",
"                 where=(rsi14 > RSI_OVERBOUGHT), alpha=0.25, color=\"red\",\n",
"                 label=\"超买区域 Overbought\")\n",
"ax2.set_ylim(0, 100)\n",
"ax2.set_title(f\"策略B指标 — RSI({14}) 均值回归信号 (RSI Mean Reversion)\", fontsize=13, fontweight=\"bold\")\n",
"ax2.set_ylabel(\"RSI\")\n",
"ax2.legend(loc=\"upper left\", fontsize=8, ncol=2)\n",
"ax2.grid(alpha=0.3)\n",
"\n",
"# ── Plot 3: Bollinger Bands 布林带 ────────────────────────────────────────────\n",
"ax3 = fig.add_subplot(gs[2, :])\n",
"ax3.plot(price, color=\"#1f77b4\", linewidth=1, label=\"价格 Price\")\n",
"ax3.plot(bb[\"mid\"], color=\"orange\", linewidth=1.2, label=\"中轨 Middle (SMA20)\")\n",
"ax3.plot(bb[\"upper\"], color=\"red\", linewidth=1, linestyle=\"--\", label=\"上轨 Upper (+2σ)\")\n",
"ax3.plot(bb[\"lower\"], color=\"green\", linewidth=1, linestyle=\"--\", label=\"下轨 Lower (-2σ)\")\n",
"ax3.fill_between(price.index, bb[\"upper\"], bb[\"lower\"], alpha=0.07, color=\"blue\")\n",
"ax3.set_title(\"布林带 (Bollinger Bands 20, 2σ)\", fontsize=13, fontweight=\"bold\")\n",
"ax3.set_ylabel(\"价格 Price\")\n",
"ax3.legend(loc=\"upper left\", fontsize=8, ncol=2)\n",
"ax3.grid(alpha=0.3)\n",
"\n",
"# ── Plot 4: MACD ─────────────────────────────────────────────────────────────\n",
"ax4 = fig.add_subplot(gs[3, :])\n",
"ax4.plot(macd_df[\"macd\"], color=\"blue\", linewidth=1, label=\"MACD line (DIF)\")\n",
"ax4.plot(macd_df[\"signal\"], color=\"orange\", linewidth=1, label=\"Signal line (DEA)\")\n",
"# Green bars for non-negative histogram values, red otherwise (NaN falls into red)\n",
"colors = [\"green\" if v >= 0 else \"red\" for v in macd_df[\"histogram\"]]\n",
"ax4.bar(macd_df.index, macd_df[\"histogram\"], color=colors, alpha=0.5, width=1, label=\"Histogram\")\n",
"ax4.axhline(0, color=\"black\", linewidth=0.8)\n",
"ax4.set_title(\"MACD (12/26/9): Trend Confirmation\", fontsize=13, fontweight=\"bold\")\n",
"ax4.set_ylabel(\"MACD\")\n",
"ax4.legend(loc=\"upper left\", fontsize=8)\n",
"ax4.grid(alpha=0.3)\n",
"\n",
"# ── Plot 5: Equity Curves ────────────────────────────────────────────────────\n",
"ax5 = fig.add_subplot(gs[4, :])\n",
"ax5.plot(bt_ma.equity, color=\"blue\", linewidth=1.5, label=\"Strategy A: MA Crossover\")\n",
"ax5.plot(bt_rsi.equity, color=\"purple\", linewidth=1.5, label=\"Strategy B: RSI Mean Reversion\")\n",
"ax5.plot(bt_bh.equity, color=\"gray\", linewidth=1.2, linestyle=\"--\", label=\"Benchmark: Buy & Hold\")\n",
"ax5.set_title(\"Equity Curve Comparison\", fontsize=13, fontweight=\"bold\")\n",
"ax5.set_ylabel(\"Portfolio Value (CNY)\")\n",
"ax5.legend(loc=\"upper left\", fontsize=9)\n",
"# Show tick labels in thousands of yuan\n",
"ax5.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f\"¥{x/1e3:,.0f}k\"))\n",
"ax5.grid(alpha=0.3)\n",
"\n",
"# ── Plot 6: Drawdown ─────────────────────────────────────────────────────────\n",
"ax6 = fig.add_subplot(gs[5, :])\n",
"ax6.fill_between(bt_ma.drawdown.index, bt_ma.drawdown, 0, alpha=0.5, color=\"blue\", label=\"Strategy A\")\n",
"ax6.fill_between(bt_rsi.drawdown.index, bt_rsi.drawdown, 0, alpha=0.5, color=\"purple\", label=\"Strategy B\")\n",
"ax6.fill_between(bt_bh.drawdown.index, bt_bh.drawdown, 0, alpha=0.3, color=\"gray\", label=\"Buy & Hold\")\n",
"ax6.set_title(\"Drawdown Curves\", fontsize=13, fontweight=\"bold\")\n",
"ax6.set_ylabel(\"Drawdown\")\n",
"ax6.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f\"{x:.0%}\"))\n",
"ax6.legend(loc=\"lower left\", fontsize=9)\n",
"ax6.grid(alpha=0.3)\n",
"\n",
"plt.suptitle(\n",
"    \"Quantitative Trading: Strategy Development & Backtesting\",\n",
"    fontsize=15, fontweight=\"bold\", y=1.005,\n",
")\n",
"plt.savefig(\"strategy_backtest_demo.png\", dpi=120, bbox_inches=\"tight\")\n",
"plt.show()\n",
"print(\"\\n[Chart] Saved to strategy_backtest_demo.png\")"
]
},
{
"cell_type": "markdown",
"id": "ff8384a5",
"metadata": {},
"source": [
"# =============================================================================\n",
"# SECTION 7: Walk-Forward Validation\n",
"\n",
"# -----------------------------------------------------------------------------\n",
"# A critical warning for all new quant traders:\n",
"#\n",
"# In-sample overfitting is the #1 trap in backtesting.\n",
"#\n",
"# If you test 100 different parameter sets on the same data and pick the best,\n",
"# that \"best\" result will almost certainly NOT hold out of sample, because the\n",
"# winner was selected partly for how well it fits the noise in that one sample.\n",
"# This is called data snooping bias, a form of p-hacking.\n",
"#\n",
"# Walk-forward validation helps guard against this:\n",
"# ┌──────────────────────────────────────────────────────────────────────┐\n",
"# │ Window 1: [TRAIN period 1] → optimize params → TEST on next period   │\n",
"# │ Window 2: [TRAIN period 2] → optimize params → TEST on next period   │\n",
"# │ …repeat, always training on the past and testing on the future       │\n",
"# └──────────────────────────────────────────────────────────────────────┘\n",
"# Only report the concatenated OUT-OF-SAMPLE (OOS) test results.\n",
"#\n",
"# Below: a simplified version that makes a single chronological train / test\n",
"# split (80/20) instead of rolling windows.\n",
"\n",
"# ============================================================================="
]
},
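The data snooping warning above can be illustrated with a small simulation. This is a minimal sketch, not part of the original demo: it uses only the standard library, and the sizes (`N_STRATEGIES`, `N_DAYS`), the seed, and the helper `annualized_sharpe` are assumptions chosen for illustration. Every "strategy" is pure noise with zero true edge, so any in-sample Sharpe the winner shows comes from selection alone.

```python
import random
import statistics


def annualized_sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio of daily returns (risk-free rate assumed to be 0)."""
    sd = statistics.stdev(returns)
    if sd == 0:
        return 0.0
    return statistics.mean(returns) / sd * periods_per_year ** 0.5


rng = random.Random(42)            # fixed seed so the run is reproducible
N_STRATEGIES, N_DAYS = 100, 500    # hypothetical sizes, for illustration only

# Each "strategy" is pure noise: zero expected return, 1% daily volatility.
in_sample = [[rng.gauss(0.0, 0.01) for _ in range(N_DAYS)] for _ in range(N_STRATEGIES)]
out_sample = [[rng.gauss(0.0, 0.01) for _ in range(N_DAYS)] for _ in range(N_STRATEGIES)]

# Pick the strategy with the best in-sample Sharpe, as a naive optimizer would.
is_sharpes = [annualized_sharpe(r) for r in in_sample]
best = max(range(N_STRATEGIES), key=lambda i: is_sharpes[i])

best_is_sharpe = is_sharpes[best]
best_oos_sharpe = annualized_sharpe(out_sample[best])

print(f"Best in-sample Sharpe among {N_STRATEGIES} noise strategies: {best_is_sharpe:.2f}")
print(f"Same strategy, out of sample: {best_oos_sharpe:.2f}")
```

The in-sample winner is the maximum of 100 noisy estimates, so it looks attractive by construction, while its out-of-sample Sharpe is just one more draw centered on zero. The same selection effect applies to the best cell of a parameter grid.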
{
"cell_type": "code",
"execution_count": null,
"id": "5e99677e",
"metadata": {},
"outputs": [],
"source": [
"TRAIN_RATIO = 0.8\n",
"split_idx = int(len(price) * TRAIN_RATIO)\n",
"split_date = price.index[split_idx]  # first date of the test period\n",
"\n",
"price_train = price.iloc[:split_idx]\n",
"price_test = price.iloc[split_idx:]\n",
"\n",
"print(f\"\\n{'=' * 55}\")\n",
"print(\"  Train / Test Split (simplified walk-forward)\")\n",
"print(f\"{'=' * 55}\")\n",
"print(f\"  Train: {price_train.index[0].date()} → {price_train.index[-1].date()} ({len(price_train)} days)\")\n",
"print(f\"  Test : {price_test.index[0].date()} → {price_test.index[-1].date()} ({len(price_test)} days)\")\n",
"\n",
"# ── Optimize MA windows on TRAIN set ─────────────────────────────────────────\n",
"#\n",
"# Grid search: try all combinations in the parameter space.\n",
"# This is the simplest optimization method and works well for small parameter spaces.\n",
"\n",
"print(\"\\n[Optimize] Searching for the best MA parameters on the training set...\")\n",
"print(\"  (search space: short=[5,10,15,20,30], long=[30,40,50,60,80,100])\")\n",
"\n",
"best_sharpe = -np.inf\n",
"best_short = SHORT_WIN\n",
"best_long = LONG_WIN\n",
"results_grid = []\n",
"\n",
"for sw in [5, 10, 15, 20, 30]:\n",
"    for lw in [30, 40, 50, 60, 80, 100]:\n",
"        if sw >= lw:\n",
"            continue  # short window must be shorter than long window\n",
"        ma_s = sma(price_train, sw)\n",
"        ma_l = sma(price_train, lw)\n",
"        # shift(1): act on the bar after the signal to avoid lookahead bias\n",
"        sig = (ma_s > ma_l).astype(int).shift(1).fillna(0)\n",
"        bt = VectorizedBacktester(price_train, sig, cost_per_trade=0.001, name=\"grid\")\n",
"        m = bt.metrics()\n",
"        # Dictionary key exactly as returned by bt.metrics()\n",
"        sharpe_val = float(m[\"夏普比率 Sharpe Ratio\"])\n",
"        results_grid.append({\"short\": sw, \"long\": lw, \"sharpe\": sharpe_val})\n",
"        if sharpe_val > best_sharpe:\n",
"            best_sharpe = sharpe_val\n",
"            best_short = sw\n",
"            best_long = lw\n",
"\n",
"print(f\"\\n  Best parameters (training set, in-sample): short={best_short}, long={best_long}\")\n",
"print(f\"  In-sample Sharpe: {best_sharpe:.3f}\")\n",
"\n",
"# ── Apply best params on TEST set ────────────────────────────────────────────\n",
"# Compute the MAs on the full series and then slice to the test dates, so the\n",
"# test period does not begin with a warm-up gap of NaNs. Each MA value uses only\n",
"# past prices, so this adds no lookahead.\n",
"ma_s_full = sma(price, best_short)\n",
"ma_l_full = sma(price, best_long)\n",
"sig_full = (ma_s_full > ma_l_full).astype(int).shift(1).fillna(0)\n",
"sig_test = sig_full.loc[price_test.index]\n",
"\n",
"bt_test = VectorizedBacktester(price_test, sig_test, cost_per_trade=0.001,\n",
"                               name=f\"MA({best_short}/{best_long}), test set (OOS)\")\n",
"bt_test.print_metrics()\n",
"\n",
"print(\"\\n\" + \"=\" * 55)\n",
"print(\"  ⚠️ WARNING:\")\n",
"print(\"  In-sample Sharpe is typically HIGHER than out-of-sample Sharpe.\")\n",
"print(\"  A large drop (Sharpe decay) is a classic sign of overfitting.\")\n",
"print(\"=\" * 55)"
]
},
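The cell above makes a single 80/20 cut. The rolling scheme described in Section 7 could be sketched as follows. This is an illustration under stated assumptions rather than the notebook's own method: `walk_forward_windows` is a hypothetical helper, and the sizes (1000 observations, 500 for training, 100 for testing) are made up. It produces only integer positions, so it has no dependencies.

```python
def walk_forward_windows(n_obs, train_size, test_size):
    """Yield (train_start, train_end, test_end) positions for rolling windows.

    Training covers [train_start, train_end); testing covers [train_end, test_end).
    Windows advance by test_size, so the test segments tile the data without overlap.
    """
    start = 0
    while start + train_size + test_size <= n_obs:
        train_end = start + train_size
        yield start, train_end, train_end + test_size
        start += test_size


# Hypothetical sizes: 1000 daily bars, 500-bar training window, 100-bar test window
windows = list(walk_forward_windows(n_obs=1000, train_size=500, test_size=100))
for train_start, train_end, test_end in windows:
    print(f"train [{train_start}, {train_end})  test [{train_end}, {test_end})")
```

With the notebook's objects, each window would run the grid search on `price.iloc[train_start:train_end]`, evaluate the chosen parameters on `price.iloc[train_end:test_end]`, and append that segment's strategy returns to one out-of-sample series. Only that concatenated series is then used for the reported metrics.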
{
"cell_type": "markdown",
"id": "95c33a90",
"metadata": {},
"source": [
"# =============================================================================\n",
"# SECTION 8: Return Distribution Analysis\n",
"\n",
"# -----------------------------------------------------------------------------\n",
"# Before trusting your Sharpe ratio, check whether the return distribution\n",
"# violates the normality assumption.\n",
"#\n",
"# Real returns typically show:\n",
"#   Fat tails (leptokurtosis): extreme events are more frequent than under a normal distribution\n",
"#   Negative skew: crashes are larger than rallies\n",
"#\n",
"# A high Sharpe ratio on a fat-tailed distribution can be misleading, because\n",
"# the Sharpe ratio measures risk by standard deviation alone and ignores tail risk.\n",
"\n",
"# ============================================================================="
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e74d07b8",
"metadata": {},
"outputs": [],
"source": [
"fig2, axes = plt.subplots(1, 2, figsize=(14, 5))\n",
"\n",
"# ── Return distribution histogram ────────────────────────────────────────────\n",
"ax = axes[0]\n",
"r_ma = bt_ma.strat_ret.dropna()\n",
"r_bh = bt_bh.strat_ret.dropna()\n",
"\n",
"ax.hist(r_bh, bins=80, alpha=0.5, color=\"gray\", density=True, label=\"Buy & Hold\")\n",
"ax.hist(r_ma, bins=80, alpha=0.5, color=\"blue\", density=True, label=\"Strategy A: MA Crossover\")\n",
"\n",
"# Overlay a normal distribution fitted to the Buy & Hold returns for comparison\n",
"x_range = np.linspace(r_bh.min(), r_bh.max(), 300)\n",
"ax.plot(x_range, stats.norm.pdf(x_range, r_bh.mean(), r_bh.std()),\n",
"        color=\"red\", linewidth=1.5, linestyle=\"--\", label=\"Normal dist. (Buy & Hold fit)\")\n",
"ax.set_title(\"Return Distribution\", fontsize=12)\n",
"ax.set_xlabel(\"Daily Return\")\n",
"ax.set_ylabel(\"Density\")\n",
"ax.legend(fontsize=8)\n",
"ax.grid(alpha=0.3)\n",
"\n",
"# Print distribution stats (pandas kurtosis() is excess kurtosis, so normal = 0)\n",
"print(\"\\n[Distribution] Strategy A daily return statistics:\")\n",
"print(f\"  Skewness : {r_ma.skew():.3f} (negative = fatter left tail)\")\n",
"print(f\"  Kurtosis : {r_ma.kurtosis():.3f} (>0 means fatter tails than normal)\")\n",
"\n",
"# ── Monthly returns heatmap ──────────────────────────────────────────────────\n",
"ax = axes[1]\n",
"# Note: pandas >= 2.2 deprecates the \"M\" alias in favor of \"ME\" (month end)\n",
"monthly = bt_ma.strat_ret.resample(\"M\").apply(lambda x: (1 + x).prod() - 1)\n",
"monthly_df = pd.DataFrame({\n",
"    \"year\": monthly.index.year,\n",
"    \"month\": monthly.index.month,\n",
"    \"ret\": monthly.values,\n",
"})\n",
"pivot = monthly_df.pivot(index=\"year\", columns=\"month\", values=\"ret\")\n",
"# Make sure all 12 month columns exist even if the data skips some calendar months\n",
"pivot = pivot.reindex(columns=range(1, 13))\n",
"pivot.columns = [\"Jan\",\"Feb\",\"Mar\",\"Apr\",\"May\",\"Jun\",\"Jul\",\"Aug\",\"Sep\",\"Oct\",\"Nov\",\"Dec\"]\n",
"\n",
"import matplotlib.colors as mcolors\n",
"cmap = mcolors.LinearSegmentedColormap.from_list(\"rg\", [\"#d73027\",\"#ffffff\",\"#1a9850\"])\n",
"im = ax.imshow(pivot.values, cmap=cmap, aspect=\"auto\",\n",
"               vmin=-0.15, vmax=0.15)\n",
"ax.set_xticks(range(12))\n",
"ax.set_xticklabels(pivot.columns, fontsize=8)\n",
"ax.set_yticks(range(len(pivot.index)))\n",
"ax.set_yticklabels(pivot.index, fontsize=9)\n",
"# Annotate each cell; months without data (NaN) are left blank\n",
"for i in range(len(pivot.index)):\n",
"    for j in range(12):\n",
"        v = pivot.values[i, j]\n",
"        if not np.isnan(v):\n",
"            ax.text(j, i, f\"{v:.1%}\", ha=\"center\", va=\"center\", fontsize=6,\n",
"                    color=\"black\" if abs(v) < 0.08 else \"white\")\n",
"ax.set_title(\"Strategy A Monthly Return Heatmap\", fontsize=11)\n",
"plt.colorbar(im, ax=ax, format=plt.FuncFormatter(lambda x, _: f\"{x:.0%}\"))\n",
"\n",
"plt.tight_layout()\n",
"plt.savefig(\"return_distribution.png\", dpi=120, bbox_inches=\"tight\")\n",
"plt.show()\n",
"print(\"[Chart] Saved to return_distribution.png\")"
]
},
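The cell above prints skewness and kurtosis but does not test normality formally. One common way to combine the two into a single test is the Jarque-Bera statistic, $JB = \frac{n}{6}\left(S^2 + \frac{K^2}{4}\right)$, where $S$ is the sample skewness and $K$ the excess kurtosis. The sketch below is an illustration using only the standard library, not the notebook's own code: the function `jarque_bera`, the seed, and the two synthetic samples are assumptions made for the example.

```python
import math
import random


def jarque_bera(returns):
    """Return (JB statistic, p-value) for a sequence of returns.

    Uses biased (population) moment estimators, as in the textbook JB statistic.
    pandas' skew() and kurtosis() apply small-sample corrections, so their values
    differ slightly from the ones computed here.
    """
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skew ** 2 + excess_kurt ** 2 / 4.0)
    # Under normality JB is asymptotically chi-square with 2 degrees of freedom,
    # whose survival function is exp(-x / 2).
    return jb, math.exp(-jb / 2.0)


rng = random.Random(0)
# Synthetic sample 1: normally distributed daily returns
normal_like = [rng.gauss(0.0, 0.01) for _ in range(2000)]
# Synthetic sample 2 (hypothetical): mostly calm days plus occasional 5x-volatility days
fat_tailed = [rng.gauss(0.0, 0.05 if rng.random() < 0.05 else 0.01) for _ in range(2000)]

jb_normal, p_normal = jarque_bera(normal_like)
jb_fat, p_fat = jarque_bera(fat_tailed)
print(f"normal-like: JB={jb_normal:.2f}, p={p_normal:.4f}")
print(f"fat-tailed : JB={jb_fat:.2f}, p={p_fat:.4g}")
```

A small p-value rejects normality, which is the case where a Sharpe ratio deserves extra scrutiny. In the notebook itself, `stats.jarque_bera(r_ma)` from the already imported `scipy.stats` should give the library version of the same test.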
{
"cell_type": "markdown",
"id": "4dfb5f58",
"metadata": {},
"source": [
"# =============================================================================\n",
"# SECTION 9: Summary & Next Steps\n",
"\n",
"# ============================================================================="
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7bac6b3d",
"metadata": {},
"outputs": [],
"source": [
"print(f\"\"\"\n",
"{'=' * 70}\n",
"  Summary\n",
"{'=' * 70}\n",
"  This demo walked through a complete strategy development and backtesting workflow:\n",
"\n",
"  ① Technical Indicators\n",
"     SMA / EMA / RSI / MACD / Bollinger Bands\n",
"\n",
"  ② Signal Generation\n",
"     Strategy A: MA Crossover (golden cross / death cross), trend following\n",
"     Strategy B: RSI Reversion (overbought / oversold), mean reversion\n",
"\n",
"  ③ Vectorized Backtester\n",
"     Accounts for transaction costs (commission + slippage) and lookahead bias\n",
"\n",
"  ④ Performance Metrics\n",
"     Sharpe / Sortino / Max Drawdown / Calmar / Win Rate / Profit Factor\n",
"\n",
"  ⑤ Walk-Forward Validation\n",
"     Optimize parameters on the training set → validate on the test set → guard against overfitting\n",
"\n",
"  ⑥ Return Distribution\n",
"     Skewness / kurtosis check, monthly return heatmap\n",
"\n",
"  Next Steps:\n",
"  ──────────────────────────────────────────────────────────────\n",
"   • Alpha Factor Models: Fama-French, Momentum\n",
"   • Portfolio Optimization: Mean-Variance, Risk Parity\n",
"   • Event-Driven Backtesting: more realistic execution\n",
"   • ML-based Signals: XGBoost, LSTM for return prediction\n",
"   • Risk Management: Position sizing, Stop-loss, VaR\n",
"{'=' * 70}\n",
"\"\"\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (trading)",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.11.0"
}
},
"nbformat": 4,
"nbformat_minor": 5
}