trading/quant_strategy_backtest_dem...

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "fb3aacb0",
   "metadata": {},
   "source": [
    "# =============================================================================\n",
    "# Quantitative Trading — Strategy Development & Backtesting Demo\n",
    "# 量化交易 — 策略开发与回测演示\n",
    "\n",
    "# =============================================================================\n",
    "#\n",
    "# 本文件是数据管道 (quant_data_pipeline_demo.py) 的续集。\n",
    "# This file is the sequel to the data pipeline demo.\n",
    "#\n",
    "# Topics covered / 涵盖主题:\n",
    "#   1. Technical Indicators  技术指标  (MA, RSI, MACD, Bollinger Bands)\n",
    "#   2. Signal Generation     信号生成  (entry & exit rules)\n",
    "#   3. Two Demo Strategies   两个示范策略:\n",
    "#        A. Dual Moving Average Crossover  双均线金叉死叉策略\n",
    "#        B. RSI Mean Reversion             RSI 均值回归策略\n",
    "#   4. Vectorized Backtest Engine  向量化回测引擎\n",
    "#   5. Performance Metrics         绩效指标\n",
    "#        (Sharpe, Sortino, Max Drawdown, Win Rate …)\n",
    "#   6. Visualization               可视化\n",
    "#\n",
    "# Prerequisites / 前置条件:\n",
    "#   pip install numpy pandas matplotlib scipy\n",
    "#\n",
    "# Running / 运行方式:\n",
    "#   python quant_strategy_backtest_demo.py\n",
    "\n",
    "# ============================================================================="
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "156c36ec",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt\n",
    "import matplotlib.gridspec as gridspec\n",
    "from scipy import stats\n",
    "import warnings\n",
    "warnings.filterwarnings('ignore')\n",
    "\n",
    "# 中文字体配置 / Chinese font config\n",
    "plt.rcParams['font.sans-serif'] = ['WenQuanYi Zen Hei', 'Arial Unicode MS', 'SimHei', 'DejaVu Sans']\n",
    "plt.rcParams['axes.unicode_minus'] = False\n",
    "\n",
    "np.random.seed(42)\n",
    "print(\"=\" * 70)\n",
    "print(\"  量化交易策略开发与回测演示\")\n",
    "print(\"  Quantitative Trading: Strategy Development & Backtesting Demo\")\n",
    "print(\"=\" * 70)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "62cbe290",
   "metadata": {},
   "source": [
    "# =============================================================================\n",
    "# SECTION 0: Synthetic Price Data  合成价格数据\n",
    "\n",
    "# -----------------------------------------------------------------------------\n",
    "# We simulate a single stock using Geometric Brownian Motion (几何布朗运动),\n",
    "# the classical model that underlies the Black-Scholes formula.\n",
    "#\n",
    "# GBM formula:\n",
    "#   dS = μ·S·dt + σ·S·dW\n",
    "#\n",
    "# Discrete form (what we actually compute each day):\n",
    "#   S_t = S_{t-1} · exp( (μ - σ²/2)·dt + σ·√dt·ε )\n",
    "#\n",
    "# where:\n",
    "#   μ  = drift / 年化漂移率 (expected annual return)\n",
    "#   σ  = volatility / 年化波动率\n",
    "#   dt = 1/252   (one trading day as a fraction of a year)\n",
    "#   ε  ~ N(0,1) (standard normal random shock / 标准正态随机扰动)\n",
    "\n",
    "# ============================================================================="
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "04ed6429",
   "metadata": {},
   "outputs": [],
   "source": [
    "def generate_price_series(\n",
    "    n_days: int = 1500,\n",
    "    mu: float = 0.10,       # 年化预期收益率 / annual expected return\n",
    "    sigma: float = 0.25,    # 年化波动率 / annual volatility\n",
    "    s0: float = 100.0,      # 初始价格 / initial price\n",
    "    seed: int = 42,\n",
    ") -> pd.Series:\n",
    "    \"\"\"\n",
    "    Generate a synthetic daily price series via GBM.\n",
    "    用几何布朗运动生成合成日线价格序列。\n",
    "    \"\"\"\n",
    "    np.random.seed(seed)\n",
    "    dt = 1.0 / 252                                       # 每个交易日占一年的比例\n",
    "    epsilon = np.random.randn(n_days)                    # 每日随机冲击\n",
    "    log_returns = (mu - 0.5 * sigma ** 2) * dt + sigma * np.sqrt(dt) * epsilon\n",
    "    prices = s0 * np.exp(np.cumsum(log_returns))         # 累积乘积 → 价格路径\n",
    "\n",
    "    # 生成工作日日期序列 / generate business-day date index\n",
    "    dates = pd.bdate_range(start=\"2019-01-02\", periods=n_days)\n",
    "    return pd.Series(prices, index=dates, name=\"close\")\n",
    "\n",
    "\n",
    "price = generate_price_series()\n",
    "print(f\"\\n[数据] 生成模拟股票价格: {len(price)} 个交易日\")\n",
    "print(f\"       价格区间: {price.min():.2f} ~ {price.max():.2f}\")"
   ]
  },
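  {
   "cell_type": "code",
   "execution_count": null,
   "id": "sketch-gbm-check",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ── Sketch: sanity-checking the GBM discretization ───────────────────────────\n",
    "#\n",
    "# A minimal check added for illustration, assuming the discrete form given in\n",
    "# SECTION 0. Every name ending in _chk is hypothetical and not used elsewhere.\n",
    "# If the discretization is right, daily log returns should have\n",
    "#   mean = (mu - sigma^2 / 2) * dt   and   std = sigma * sqrt(dt).\n",
    "import numpy as np\n",
    "\n",
    "rng_chk = np.random.default_rng(0)   # local generator; leaves np.random's global seed untouched\n",
    "mu_chk, sigma_chk, dt_chk = 0.10, 0.25, 1.0 / 252\n",
    "eps_chk = rng_chk.standard_normal(200_000)\n",
    "log_ret_chk = (mu_chk - 0.5 * sigma_chk ** 2) * dt_chk + sigma_chk * np.sqrt(dt_chk) * eps_chk\n",
    "\n",
    "theory_mean_chk = (mu_chk - 0.5 * sigma_chk ** 2) * dt_chk\n",
    "theory_std_chk = sigma_chk * np.sqrt(dt_chk)\n",
    "print(f'mean: sample={log_ret_chk.mean():.6f}  theory={theory_mean_chk:.6f}')\n",
    "print(f'std:  sample={log_ret_chk.std():.6f}  theory={theory_std_chk:.6f}')"
   ]
  },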
  {
   "cell_type": "markdown",
   "id": "ccda8a1f",
   "metadata": {},
   "source": [
    "# =============================================================================\n",
    "# SECTION 1: Technical Indicators  技术指标\n",
    "\n",
    "# -----------------------------------------------------------------------------\n",
    "# Technical indicators transform raw price/volume data into signals.\n",
    "# 技术指标将原始价格/成交量数据转化为交易信号。\n",
    "#\n",
    "# They are divided into two broad families:\n",
    "# 主要分为两大类:\n",
    "#\n",
    "#   ① Trend-following indicators  趋势跟随指标\n",
    "#       → Moving Averages (MA), MACD\n",
    "#       → Work well in trending markets (趋势市中效果好)\n",
    "#\n",
    "#   ② Oscillators / Mean-reversion indicators  震荡/均值回归指标\n",
    "#       → RSI, Bollinger Bands\n",
    "#       → Work well in range-bound / choppy markets (震荡市中效果好)\n",
    "\n",
    "# =============================================================================\n",
    "\n",
    "# ── 1-A  Simple Moving Average  简单移动平均线 (SMA) ──────────────────────────\n",
    "#\n",
    "# SMA_n(t) = (P_{t} + P_{t-1} + … + P_{t-n+1}) / n\n",
    "#\n",
    "# The SMA smooths out daily noise to reveal the underlying trend.\n",
    "# SMA 平滑日内噪音，揭示潜在趋势。\n",
    "# A longer window → smoother, but lags more behind recent price action.\n",
    "# 窗口越长 → 越平滑，但对价格变化的反应越滞后。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8ec5acb4",
   "metadata": {},
   "outputs": [],
   "source": [
    "def sma(prices: pd.Series, window: int) -> pd.Series:\n",
    "    \"\"\"Simple Moving Average / 简单移动平均线\"\"\"\n",
    "    return prices.rolling(window=window).mean()\n",
    "\n",
    "\n",
    "# ── 1-B  Exponential Moving Average  指数移动平均线 (EMA) ───────────────────\n",
    "#\n",
    "# EMA gives MORE weight to recent prices (recent data matters more).\n",
    "# EMA 给予近期价格更高权重（近期数据更重要）。\n",
    "#\n",
    "# EMA_t = α · P_t + (1 - α) · EMA_{t-1}\n",
    "# where  α = 2 / (n + 1)   (smoothing factor / 平滑因子)\n",
    "#\n",
    "# EMA reacts faster than SMA to price changes.\n",
    "# EMA 对价格变动的反应比 SMA 更灵敏。\n",
    "\n",
    "def ema(prices: pd.Series, span: int) -> pd.Series:\n",
    "    \"\"\"Exponential Moving Average / 指数移动平均线\"\"\"\n",
    "    return prices.ewm(span=span, adjust=False).mean()\n",
    "\n",
    "\n",
    "# ── 1-C  RSI  相对强弱指数 (Relative Strength Index) ─────────────────────────\n",
    "#\n",
    "# RSI measures the speed and magnitude of recent price changes.\n",
    "# RSI 衡量近期价格变动的速度和幅度。\n",
    "#\n",
    "# Formula:\n",
    "#   RS  = average_gain / average_loss  (over last n days)\n",
    "#   RSI = 100 - 100 / (1 + RS)\n",
    "#\n",
    "# Interpretation / 指标解读:\n",
    "#   RSI > 70  →  Overbought  超买  (price may be due for a pullback / 价格可能回调)\n",
    "#   RSI < 30  →  Oversold   超卖  (price may be due for a bounce  / 价格可能反弹)\n",
    "#   RSI = 50  →  Neutral    中性\n",
    "\n",
    "def rsi(prices: pd.Series, window: int = 14) -> pd.Series:\n",
    "    \"\"\"\n",
    "    Compute Wilder's RSI.\n",
    "    计算 Wilder 平滑法 RSI。\n",
    "    \"\"\"\n",
    "    delta = prices.diff()                      # 每日价格变化 / daily price change\n",
    "    gain = delta.clip(lower=0)                 # 只保留上涨部分 / keep only up-days\n",
    "    loss = -delta.clip(upper=0)                # 只保留下跌部分 / keep only down-days\n",
    "\n",
    "    # Wilder uses EMA with span = 2*n - 1 (equivalent to 1/n smoothing)\n",
    "    avg_gain = gain.ewm(alpha=1.0 / window, adjust=False).mean()\n",
    "    avg_loss = loss.ewm(alpha=1.0 / window, adjust=False).mean()\n",
    "\n",
    "    rs = avg_gain / avg_loss                   # 相对强弱值 / relative strength\n",
    "    return 100 - (100 / (1 + rs))             # 转换为 0~100 范围\n",
    "\n",
    "\n",
    "# ── 1-D  MACD  指数平滑异同移动平均线 ────────────────────────────────────────\n",
    "#\n",
    "# MACD reveals the relationship between two EMAs.\n",
    "# MACD 揭示两条 EMA 之间的关系。\n",
    "#\n",
    "# Components / 构成:\n",
    "#   MACD Line  MACD线  = EMA(12) - EMA(26)   (fast minus slow / 快线减慢线)\n",
    "#   Signal Line 信号线 = EMA(9) of MACD Line  (trigger line / 触发线)\n",
    "#   Histogram  柱状图  = MACD Line - Signal Line\n",
    "#\n",
    "# Trading rules / 交易规则:\n",
    "#   MACD crosses above Signal  →  Bullish (金叉, buy signal  / 买入信号)\n",
    "#   MACD crosses below Signal  →  Bearish (死叉, sell signal / 卖出信号)\n",
    "\n",
    "def macd(prices: pd.Series,\n",
    "         fast: int = 12, slow: int = 26, signal: int = 9\n",
    "         ) -> pd.DataFrame:\n",
    "    \"\"\"\n",
    "    Compute MACD, Signal line, and Histogram.\n",
    "    计算 MACD线、信号线和柱状图。\n",
    "    \"\"\"\n",
    "    ema_fast   = ema(prices, fast)\n",
    "    ema_slow   = ema(prices, slow)\n",
    "    macd_line  = ema_fast - ema_slow           # MACD 线\n",
    "    signal_line = ema(macd_line, signal)       # 信号线 (DIF的EMA)\n",
    "    histogram  = macd_line - signal_line       # 柱状图 (MACD Bar)\n",
    "    return pd.DataFrame({\n",
    "        \"macd\": macd_line,\n",
    "        \"signal\": signal_line,\n",
    "        \"histogram\": histogram,\n",
    "    })\n",
    "\n",
    "\n",
    "# ── 1-E  Bollinger Bands  布林带 ─────────────────────────────────────────────\n",
    "#\n",
    "# Bollinger Bands place upper/lower envelopes around a moving average.\n",
    "# 布林带在移动平均线上下各画一条\"包络线\"。\n",
    "#\n",
    "# Formula:\n",
    "#   Middle Band  中轨  = SMA(n)\n",
    "#   Upper Band   上轨  = SMA(n) + k·σ_n     (k = 2 by default / 默认 k=2)\n",
    "#   Lower Band   下轨  = SMA(n) - k·σ_n\n",
    "#\n",
    "# where σ_n is the rolling standard deviation / 滚动标准差\n",
    "#\n",
    "# When price touches the lower band → oversold area (超卖区域)\n",
    "# When price touches the upper band → overbought area (超买区域)\n",
    "# Band width (带宽) contracts before explosive moves (波动收窄常预示突破)\n",
    "\n",
    "def bollinger_bands(prices: pd.Series, window: int = 20, k: float = 2.0\n",
    "                    ) -> pd.DataFrame:\n",
    "    \"\"\"\n",
    "    Compute Bollinger Bands.\n",
    "    计算布林带（上轨、中轨、下轨）。\n",
    "    \"\"\"\n",
    "    mid    = sma(prices, window)               # 中轨 (SMA)\n",
    "    std    = prices.rolling(window).std()      # 滚动标准差\n",
    "    upper  = mid + k * std                     # 上轨\n",
    "    lower  = mid - k * std                     # 下轨\n",
    "    # %B indicator: where is the current price within the band?\n",
    "    # %B 指标：当前价格在带宽中的位置 (0=下轨, 1=上轨)\n",
    "    pct_b  = (prices - lower) / (upper - lower)\n",
    "    return pd.DataFrame({\n",
    "        \"upper\": upper, \"mid\": mid, \"lower\": lower, \"pct_b\": pct_b\n",
    "    })\n",
    "\n",
    "\n",
    "# Compute all indicators on our simulated price series\n",
    "# 对模拟价格序列计算所有指标\n",
    "sma20  = sma(price, 20)       # 20日均线 / 20-day SMA\n",
    "sma60  = sma(price, 60)       # 60日均线 / 60-day SMA (longer trend)\n",
    "rsi14  = rsi(price, 14)       # 14日RSI  / 14-day RSI\n",
    "macd_df = macd(price)         # MACD (12/26/9)\n",
    "bb     = bollinger_bands(price, window=20, k=2.0)\n",
    "\n",
    "print(\"\\n[指标] 技术指标计算完成:\")\n",
    "print(f\"  SMA20   — 首个有效值日期: {sma20.first_valid_index().date()}\")\n",
    "print(f\"  SMA60   — 首个有效值日期: {sma60.first_valid_index().date()}\")\n",
    "print(f\"  RSI14   — 首个有效值日期: {rsi14.first_valid_index().date()}\")\n",
    "print(f\"  MACD    — 首个有效值日期: {macd_df['macd'].first_valid_index().date()}\")\n",
    "print(f\"  BollingerBands — 首个有效值日期: {bb['mid'].first_valid_index().date()}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "739084bb",
   "metadata": {},
   "source": [
    "# =============================================================================\n",
    "# SECTION 2: Strategy A — Dual Moving Average Crossover\n",
    "#            策略 A — 双均线金叉/死叉策略\n",
    "\n",
    "# -----------------------------------------------------------------------------\n",
    "# One of the oldest and most intuitive trend-following strategies.\n",
    "# 最古老也最直观的趋势跟随策略之一。\n",
    "#\n",
    "# Logic / 逻辑:\n",
    "#   Golden Cross (金叉): short MA crosses ABOVE long MA  → BUY  (做多)\n",
    "#   Death Cross  (死叉): short MA crosses BELOW long MA  → SELL (平仓)\n",
    "#\n",
    "# Rationale / 原理:\n",
    "#   When the short-term average rises above the long-term average, it signals\n",
    "#   that recent momentum is stronger than the historical trend → bullish.\n",
    "#   短期均线上穿长期均线，意味着近期动能强于历史趋势 → 看涨。\n",
    "#\n",
    "# Parameters / 参数:\n",
    "#   SHORT_WINDOW = 20  (fast line / 快线)\n",
    "#   LONG_WINDOW  = 60  (slow line / 慢线)\n",
    "\n",
    "# ============================================================================="
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a99d5899",
   "metadata": {},
   "outputs": [],
   "source": [
    "SHORT_WIN = 20   # 短期均线窗口 / short-term MA window\n",
    "LONG_WIN  = 60   # 长期均线窗口 / long-term MA window\n",
    "\n",
    "ma_short = sma(price, SHORT_WIN)\n",
    "ma_long  = sma(price, LONG_WIN)\n",
    "\n",
    "# ── Signal generation  信号生成 ───────────────────────────────────────────────\n",
    "#\n",
    "# Signal (信号) = +1 when we should be LONG (持多仓), 0 when out of market (空仓)\n",
    "#\n",
    "# Step 1: raw_signal = 1 whenever short MA > long MA (short MA above long MA)\n",
    "# Step 2: detect crossovers (cross = today's signal ≠ yesterday's signal)\n",
    "#\n",
    "# We use a \"position\" approach — hold the position until it reverses.\n",
    "# 使用\"持仓\"方式 — 持有直到信号翻转。\n",
    "\n",
    "# raw_signal: 1 = short above long (看多区域), 0 = short below long (看空区域)\n",
    "raw_signal = (ma_short > ma_long).astype(int)\n",
    "\n",
    "# Align signals: use yesterday's signal to trade today (avoid lookahead bias)\n",
    "# 用昨天的信号决定今天的仓位，避免\"未来数据偷窥\" (前视偏差 / lookahead bias)\n",
    "ma_signal = raw_signal.shift(1).fillna(0)\n",
    "\n",
    "print(\"\\n[策略A] 双均线信号生成完成\")\n",
    "print(f\"  多头持仓天数 (Signal=1): {int(ma_signal.sum())} 天\")\n",
    "print(f\"  空仓天数     (Signal=0): {int((ma_signal == 0).sum())} 天\")"
   ]
  },
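  {
   "cell_type": "code",
   "execution_count": null,
   "id": "sketch-crossover-dates",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ── Sketch: locating the crossover dates (Step 2) ────────────────────────────\n",
    "#\n",
    "# The strategy cell holds raw_signal as a position and never lists the individual\n",
    "# golden/death crosses. This is one possible way to locate them, shown on a toy\n",
    "# series so it runs on its own. Names ending in _demo are illustrative only.\n",
    "import pandas as pd\n",
    "\n",
    "fast_demo = pd.Series([1.0, 2.0, 3.0, 2.0, 1.0, 2.0, 4.0])   # stands in for the short MA\n",
    "slow_demo = pd.Series([2.0, 2.0, 2.0, 2.5, 2.5, 2.5, 2.5])   # stands in for the long MA\n",
    "\n",
    "regime_demo = (fast_demo > slow_demo).astype(int)   # 1 while the fast line is above the slow line\n",
    "cross_demo = regime_demo.diff().fillna(0)           # +1 = golden cross, -1 = death cross, 0 = no change\n",
    "\n",
    "golden_idx = cross_demo[cross_demo == 1].index.tolist()\n",
    "death_idx = cross_demo[cross_demo == -1].index.tolist()\n",
    "print('golden crosses at positions', golden_idx)\n",
    "print('death crosses at positions ', death_idx)"
   ]
  },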
  {
   "cell_type": "markdown",
   "id": "dd6312ac",
   "metadata": {},
   "source": [
    "# =============================================================================\n",
    "# SECTION 3: Strategy B — RSI Mean Reversion\n",
    "#            策略 B — RSI 均值回归策略\n",
    "\n",
    "# -----------------------------------------------------------------------------\n",
    "# This is a contrarian strategy: buy when the market seems \"too weak\",\n",
    "# sell when it seems \"too strong\".\n",
    "# 这是一个逆势策略：市场\"跌过头\"时买入，\"涨过头\"时卖出。\n",
    "#\n",
    "# Logic / 逻辑:\n",
    "#   RSI drops below oversold level (超卖线, default 30)  →  BUY  signal\n",
    "#   RSI rises above overbought level (超买线, default 70) →  SELL signal\n",
    "#\n",
    "# This exploits mean reversion (均值回归): extreme prices tend to revert.\n",
    "# 利用均值回归特性：极端价格倾向于回归均值。\n",
    "#\n",
    "# Risk / 风险:\n",
    "#   In a strong trend, RSI can stay oversold/overbought for long stretches.\n",
    "#   在强趋势中，RSI 可以长时间停留在超卖/超买区域，造成连续亏损。\n",
    "\n",
    "# ============================================================================="
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c428d624",
   "metadata": {},
   "outputs": [],
   "source": [
    "RSI_OVERSOLD  = 30   # 超卖线 / oversold threshold\n",
    "RSI_OVERBOUGHT = 70  # 超买线 / overbought threshold\n",
    "\n",
    "def rsi_signal(rsi_series: pd.Series,\n",
    "               oversold: float = 30,\n",
    "               overbought: float = 70) -> pd.Series:\n",
    "    \"\"\"\n",
    "    Generate long/short/flat signals from RSI.\n",
    "    根据 RSI 生成多空平信号。\n",
    "\n",
    "    Returns a Series of:\n",
    "      +1  →  Long  (做多)\n",
    "      -1  →  Short (做空)\n",
    "       0  →  Flat  (空仓, no position)\n",
    "    \"\"\"\n",
    "    position = pd.Series(0, index=rsi_series.index, dtype=float)\n",
    "    current_pos = 0   # 当前持仓状态 / current position state\n",
    "\n",
    "    for i in range(1, len(rsi_series)):\n",
    "        r = rsi_series.iloc[i]\n",
    "        if pd.isna(r):\n",
    "            position.iloc[i] = 0\n",
    "            continue\n",
    "\n",
    "        # Entry rules / 入场规则\n",
    "        if r < oversold and current_pos == 0:\n",
    "            current_pos = 1     # 超卖 → 做多 / oversold → go long\n",
    "\n",
    "        elif r > overbought and current_pos == 0:\n",
    "            current_pos = -1    # 超买 → 做空 / overbought → go short\n",
    "\n",
    "        # Exit rules / 出场规则\n",
    "        # Exit long when RSI recovers above 50 (回到中性区域 / back to neutral)\n",
    "        elif current_pos == 1 and r > 50:\n",
    "            current_pos = 0\n",
    "\n",
    "        # Exit short when RSI falls below 50\n",
    "        elif current_pos == -1 and r < 50:\n",
    "            current_pos = 0\n",
    "\n",
    "        position.iloc[i] = current_pos\n",
    "\n",
    "    return position\n",
    "\n",
    "\n",
    "rsi_pos = rsi_signal(rsi14, RSI_OVERSOLD, RSI_OVERBOUGHT)\n",
    "\n",
    "# Shift by 1 day to avoid lookahead bias / 前移一天避免前视偏差\n",
    "rsi_signal_shifted = rsi_pos.shift(1).fillna(0)\n",
    "\n",
    "print(\"\\n[策略B] RSI信号生成完成\")\n",
    "print(f\"  多头持仓天数 (Signal=+1): {int((rsi_signal_shifted == 1).sum())} 天\")\n",
    "print(f\"  空头持仓天数 (Signal=-1): {int((rsi_signal_shifted == -1).sum())} 天\")\n",
    "print(f\"  空仓天数     (Signal= 0): {int((rsi_signal_shifted == 0).sum())} 天\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3bb7e0d4",
   "metadata": {},
   "source": [
    "# =============================================================================\n",
    "# SECTION 4: Vectorized Backtest Engine  向量化回测引擎\n",
    "\n",
    "# -----------------------------------------------------------------------------\n",
    "# A backtest (回测) simulates how a strategy would have performed\n",
    "# on historical data. It is the primary tool for validating a strategy\n",
    "# before risking real money.\n",
    "# 回测是在历史数据上模拟策略表现的工具，是真实投资前验证策略的主要手段。\n",
    "#\n",
    "# Two main backtest styles / 两种主要回测方式:\n",
    "#\n",
    "#   ① Vectorized backtest  向量化回测\n",
    "#       - Compute all positions & P&L as array operations at once (numpy/pandas)\n",
    "#       - Very fast; good for strategy exploration\n",
    "#       - 所有仓位和盈亏一次性用数组运算计算，速度极快，适合策略探索\n",
    "#\n",
    "#   ② Event-driven backtest  事件驱动回测\n",
    "#       - Simulate time step-by-step, reacting to each market event\n",
    "#       - More realistic (handles fills, slippage, latency, order queuing)\n",
    "#       - 逐笔模拟市场事件，更真实（考虑成交、滑点、延迟等），速度较慢\n",
    "#\n",
    "# We use the vectorized approach here for clarity and speed.\n",
    "# 此处使用向量化方式，兼顾清晰度和速度。\n",
    "#\n",
    "# Cost model  交易成本模型:\n",
    "#   - Commission (佣金): charged each time you trade (per trade)\n",
    "#   - Slippage   (滑点): the difference between the expected fill price and\n",
    "#                        the actual fill price (price moves against you)\n",
    "#   We approximate both as a percentage of the trade value.\n",
    "#   两者合并近似为交易金额的固定比例。\n",
    "\n",
    "# ============================================================================="
   ]
  },
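  {
   "cell_type": "code",
   "execution_count": null,
   "id": "sketch-event-loop",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ── Sketch: a minimal event-driven loop, for contrast ────────────────────────\n",
    "#\n",
    "# A hypothetical illustration of style ②, not the engine used in this demo.\n",
    "# Bars are processed one at a time; an order decided on bar t is filled at the\n",
    "# price of bar t+1, minus a proportional fee. The function name, its arguments,\n",
    "# and the all-in/all-out sizing are assumptions made for this sketch.\n",
    "def run_event_loop(prices, targets, fee=0.001, cash=1000.0):\n",
    "    '''Step through bars; targets[t] is the desired position (0 or 1) decided at bar t.'''\n",
    "    units = 0.0\n",
    "    pending = None                        # order waiting to be filled on the next bar\n",
    "    for px, target in zip(prices, targets):\n",
    "        if pending is not None:           # fill the previous bar's order at this bar's price\n",
    "            if pending == 1 and units == 0.0:\n",
    "                units = cash * (1 - fee) / px\n",
    "                cash = 0.0\n",
    "            elif pending == 0 and units > 0.0:\n",
    "                cash = units * px * (1 - fee)\n",
    "                units = 0.0\n",
    "            pending = None\n",
    "        holding = 1 if units > 0.0 else 0\n",
    "        if target != holding:\n",
    "            pending = target              # queue an order for the next bar\n",
    "    return cash + units * prices[-1]      # mark any open position at the last price\n",
    "\n",
    "\n",
    "final_value_demo = run_event_loop([100.0, 100.0, 110.0, 110.0], [1, 1, 0, 0], fee=0.0)\n",
    "print(f'final value: {final_value_demo:.2f}')"
   ]
  },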
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1d6b6a55",
   "metadata": {},
   "outputs": [],
   "source": [
    "class VectorizedBacktester:\n",
    "    \"\"\"\n",
    "    A simple vectorized backtesting engine.\n",
    "    简单的向量化回测引擎。\n",
    "\n",
    "    Assumptions / 假设:\n",
    "      • Long-only or long/short positions\n",
    "      • Trade at next-day's open (用下一天开盘价成交) — conservative assumption\n",
    "        We approximate this by using the same day's close shifted by 1 day.\n",
    "      • Round-trip cost (单次交易成本) = 2 × cost_per_trade\n",
    "        (pay cost on entry AND exit / 进出各收一次)\n",
    "      • No leverage (无杠杆), position size is 100% of capital when in trade\n",
    "    \"\"\"\n",
    "\n",
    "    def __init__(\n",
    "        self,\n",
    "        prices: pd.Series,\n",
    "        signal: pd.Series,\n",
    "        cost_per_trade: float = 0.001,  # 0.1% one-way / 单向 0.1% (含佣金+滑点)\n",
    "        initial_capital: float = 1_000_000.0,  # 初始资金 / initial capital\n",
    "        name: str = \"Strategy\",\n",
    "    ):\n",
    "        self.prices = prices\n",
    "        self.signal = signal.reindex(prices.index).fillna(0)\n",
    "        self.cost_per_trade = cost_per_trade\n",
    "        self.initial_capital = initial_capital\n",
    "        self.name = name\n",
    "        self._run()\n",
    "\n",
    "    def _run(self):\n",
    "        \"\"\"Core backtesting logic.  核心回测逻辑。\"\"\"\n",
    "        prices = self.prices\n",
    "        signal = self.signal\n",
    "\n",
    "        # ── Daily price return  日收益率 ────────────────────────────────────\n",
    "        daily_ret = prices.pct_change().fillna(0)\n",
    "\n",
    "        # ── Strategy return (before costs)  策略日收益率（扣除成本前）─────────\n",
    "        # Strategy return = signal × market return\n",
    "        # 策略当日收益率 = 持仓方向 × 市场当日收益率\n",
    "        strat_ret_gross = signal * daily_ret\n",
    "\n",
    "        # ── Transaction cost  交易成本 ──────────────────────────────────────\n",
    "        # Detect position changes (signal changes from one day to the next)\n",
    "        # 检测仓位变化（信号从一天到下一天发生变化）\n",
    "        position_change = signal.diff().fillna(0).abs()  # >0 means we traded\n",
    "        # Cost is charged each time position changes\n",
    "        # 每次仓位变化时扣除成本\n",
    "        cost = position_change * self.cost_per_trade\n",
    "\n",
    "        # ── Net strategy return  策略净收益率 ───────────────────────────────\n",
    "        strat_ret_net = strat_ret_gross - cost\n",
    "\n",
    "        # ── Equity curve  净值曲线 ───────────────────────────────────────────\n",
    "        # The equity curve tracks how 1 unit of capital grows over time.\n",
    "        # 净值曲线追踪单位资本随时间的增长。\n",
    "        # (1 + daily_net_return) compounded every day\n",
    "        equity = self.initial_capital * (1 + strat_ret_net).cumprod()\n",
    "        equity_bh = self.initial_capital * (1 + daily_ret).cumprod()  # Buy & Hold benchmark\n",
    "\n",
    "        # ── Drawdown  回撤 ──────────────────────────────────────────────────\n",
    "        # Drawdown measures how far we are from the peak at any point in time.\n",
    "        # 回撤衡量当前净值距离历史最高点的跌幅。\n",
    "        rolling_max = equity.cummax()\n",
    "        drawdown = (equity - rolling_max) / rolling_max  # always <= 0\n",
    "\n",
    "        # Store results for later analysis\n",
    "        self.daily_ret    = daily_ret\n",
    "        self.strat_ret    = strat_ret_net\n",
    "        self.equity       = equity\n",
    "        self.equity_bh    = equity_bh\n",
    "        self.drawdown     = drawdown\n",
    "        self.n_trades     = int((position_change > 0).sum())\n",
    "        self.total_cost   = cost.sum()\n",
    "\n",
    "    # ── Performance metrics  绩效指标 ──────────────────────────────────────────\n",
    "    #\n",
    "    # A well-rounded strategy evaluation uses multiple metrics, because\n",
    "    # no single number captures the full picture.\n",
    "    # 全面的策略评估需要多个指标，因为单一数字无法描述全貌。\n",
    "    #\n",
    "    # Key metrics / 关键指标:\n",
    "    #   Total Return    总收益率  — how much did we make in total?\n",
    "    #   CAGR            年化复合增长率 — annualized compounded growth rate\n",
    "    #   Sharpe Ratio    夏普比率  — return per unit of total risk (risk-adjusted)\n",
    "    #   Sortino Ratio   索提诺比率 — return per unit of DOWNSIDE risk only\n",
    "    #   Max Drawdown    最大回撤  — worst peak-to-trough decline\n",
    "    #   Calmar Ratio    卡玛比率  — CAGR / Max Drawdown (reward vs worst loss)\n",
    "    #   Win Rate        胜率     — fraction of days (or trades) with positive P&L\n",
    "    #   Profit Factor   盈亏比   — total profit / total loss\n",
    "\n",
    "    def metrics(self) -> dict:\n",
    "        \"\"\"Compute and return a dictionary of performance metrics.\n",
    "           计算并返回绩效指标字典。\"\"\"\n",
    "        r = self.strat_ret\n",
    "        eq = self.equity\n",
    "        n  = len(r)\n",
    "        years = n / 252.0          # approximate years in sample / 样本年数估算\n",
    "\n",
    "        # Total return / 总收益率\n",
    "        total_return = (eq.iloc[-1] / self.initial_capital) - 1\n",
    "\n",
    "        # CAGR  年化复合增长率\n",
    "        # CAGR = (EndValue / StartValue)^(1/years) - 1\n",
    "        cagr = (1 + total_return) ** (1 / years) - 1\n",
    "\n",
    "        # Annualized volatility  年化波动率\n",
    "        ann_vol = r.std() * np.sqrt(252)\n",
    "\n",
    "        # Sharpe Ratio  夏普比率\n",
    "        # Sharpe = (Mean excess return) / StdDev(return) × √252\n",
    "        # Excess return = strategy return - risk-free rate\n",
    "        # 超额收益率 = 策略收益率 - 无风险利率\n",
    "        # We use 0 as risk-free rate for simplicity (or assume it's netted out)\n",
    "        risk_free = 0.0\n",
    "        sharpe = (r.mean() - risk_free / 252) / r.std() * np.sqrt(252) if r.std() > 0 else 0\n",
    "\n",
    "        # Sortino Ratio  索提诺比率\n",
    "        # Like Sharpe but only penalizes DOWNSIDE volatility\n",
    "        # 类似夏普，但只惩罚下行波动率（亏损波动率）\n",
    "        downside = r[r < 0]\n",
    "        downside_std = downside.std() * np.sqrt(252) if len(downside) > 0 else 1e-9\n",
    "        sortino = (cagr - risk_free) / downside_std if downside_std > 0 else 0\n",
    "\n",
    "        # Maximum Drawdown  最大回撤\n",
    "        max_dd = self.drawdown.min()           # most negative value (最大负值)\n",
    "\n",
    "        # Calmar Ratio  卡玛比率\n",
    "        # Calmar = CAGR / |Max Drawdown|\n",
    "        calmar = cagr / abs(max_dd) if max_dd != 0 else 0\n",
    "\n",
    "        # Win rate  胜率 (fraction of trading days with positive return)\n",
    "        win_rate = (r > 0).mean()\n",
    "\n",
    "        # Profit factor  盈亏比\n",
    "        # = Sum of positive returns / |Sum of negative returns|\n",
    "        gross_profit = r[r > 0].sum()\n",
    "        gross_loss   = abs(r[r < 0].sum())\n",
    "        profit_factor = gross_profit / gross_loss if gross_loss > 0 else np.inf\n",
    "\n",
    "        return {\n",
    "            \"总收益率   Total Return\":    f\"{total_return:.2%}\",\n",
    "            \"年化收益率 CAGR\":            f\"{cagr:.2%}\",\n",
    "            \"年化波动率 Ann. Volatility\":  f\"{ann_vol:.2%}\",\n",
    "            \"夏普比率  Sharpe Ratio\":     f\"{sharpe:.3f}\",\n",
    "            \"索提诺比率 Sortino Ratio\":   f\"{sortino:.3f}\",\n",
    "            \"最大回撤  Max Drawdown\":     f\"{max_dd:.2%}\",\n",
    "            \"卡玛比率  Calmar Ratio\":     f\"{calmar:.3f}\",\n",
    "            \"胜率      Win Rate\":         f\"{win_rate:.2%}\",\n",
    "            \"盈亏比    Profit Factor\":    f\"{profit_factor:.3f}\",\n",
    "            \"交易次数  # Trades\":         str(self.n_trades),\n",
    "            \"总成本    Total Cost\":       f\"{self.total_cost:.4%}\",\n",
    "        }\n",
    "\n",
    "    def print_metrics(self):\n",
    "        \"\"\"Pretty-print the performance report.  格式化打印绩效报告。\"\"\"\n",
    "        print(f\"\\n{'=' * 55}\")\n",
    "        print(f\"  策略绩效报告 / Performance Report: {self.name}\")\n",
    "        print(f\"{'=' * 55}\")\n",
    "        for k, v in self.metrics().items():\n",
    "            print(f\"  {k:<35} {v}\")\n",
    "        print(f\"{'=' * 55}\")"
   ]
  },
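  {
   "cell_type": "code",
   "execution_count": null,
   "id": "sketch-trade-win-rate",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ── Sketch: a per-trade win rate ─────────────────────────────────────────────\n",
    "#\n",
    "# metrics() computes the win rate per day. This is one possible per-trade\n",
    "# variant, offered as an assumption-laden sketch: a trade is taken to be a run of\n",
    "# consecutive days with the same non-zero position. Costs booked on the first\n",
    "# flat day after an exit are not attributed to the trade here.\n",
    "# trade_win_rate and the *_demo series are hypothetical names.\n",
    "import pandas as pd\n",
    "\n",
    "def trade_win_rate(strat_ret: pd.Series, signal: pd.Series) -> float:\n",
    "    '''Fraction of holding periods whose compounded return is positive.'''\n",
    "    in_market = signal != 0\n",
    "    # The counter increases every time the position value changes\n",
    "    trade_id = (signal != signal.shift(1)).cumsum()[in_market]\n",
    "    per_trade = (1 + strat_ret[in_market]).groupby(trade_id).prod() - 1\n",
    "    return float((per_trade > 0).mean()) if len(per_trade) else float('nan')\n",
    "\n",
    "\n",
    "sig_demo = pd.Series([0, 1, 1, 0, 1, 1, 0], dtype=float)\n",
    "ret_demo = pd.Series([0.0, 0.02, -0.01, 0.0, -0.03, 0.01, 0.0])\n",
    "win_rate_demo = trade_win_rate(ret_demo, sig_demo)   # two trades: one winner, one loser\n",
    "print(f'per-trade win rate: {win_rate_demo:.2f}')"
   ]
  },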
  {
   "cell_type": "markdown",
   "id": "ea411c63",
   "metadata": {},
   "source": [
    "# =============================================================================\n",
    "# SECTION 5: Run Backtests  执行回测\n",
    "\n",
    "# =============================================================================\n",
    "\n",
    "# ── Strategy A: MA Crossover  双均线策略 ──────────────────────────────────────"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e13a487c",
   "metadata": {},
   "outputs": [],
   "source": [
    "bt_ma = VectorizedBacktester(\n",
    "    prices=price,\n",
    "    signal=ma_signal,      # +1 = long, 0 = flat\n",
    "    cost_per_trade=0.001,  # 0.1% per trade (reasonable for liquid stocks)\n",
    "    name=\"双均线策略 (MA Crossover 20/60)\",\n",
    ")\n",
    "\n",
    "# ── Strategy B: RSI Mean Reversion  RSI均值回归策略 ──────────────────────────\n",
    "bt_rsi = VectorizedBacktester(\n",
    "    prices=price,\n",
    "    signal=rsi_signal_shifted,   # +1 = long, -1 = short, 0 = flat\n",
    "    cost_per_trade=0.001,\n",
    "    name=\"RSI均值回归策略 (RSI Mean Reversion 14)\",\n",
    ")\n",
    "\n",
    "# ── Benchmark: Buy & Hold  基准：买入并持有 ───────────────────────────────────\n",
    "# Buy & Hold (买入持有) is always our benchmark: simply hold the asset forever.\n",
    "# It requires zero skill and zero effort — any strategy must beat this to\n",
    "# justify the extra complexity and transaction costs.\n",
    "# 买入持有是永远的基准策略：无需技能、零成本。任何策略都必须超越它才有意义。\n",
    "bt_bh = VectorizedBacktester(\n",
    "    prices=price,\n",
    "    signal=pd.Series(1, index=price.index, dtype=float),  # always long / 始终做多\n",
    "    cost_per_trade=0.0,   # no trading costs / 无交易成本\n",
    "    name=\"Buy & Hold 基准 (买入持有)\",\n",
    ")\n",
    "\n",
    "bt_ma.print_metrics()\n",
    "bt_rsi.print_metrics()\n",
    "bt_bh.print_metrics()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9d601da1",
   "metadata": {},
   "source": [
    "# =============================================================================\n",
    "# SECTION 6: Visualization  可视化\n",
    "\n",
    "# ============================================================================="
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e6b6bfc8",
   "metadata": {},
   "outputs": [],
   "source": [
    "fig = plt.figure(figsize=(16, 22))\n",
    "gs  = gridspec.GridSpec(6, 2, figure=fig, hspace=0.45, wspace=0.3)\n",
    "\n",
    "# ── Plot 1: Price + MA signals  价格 + 均线信号 ────────────────────────────────\n",
    "ax1 = fig.add_subplot(gs[0, :])   # span full width\n",
    "ax1.plot(price, color=\"#1f77b4\", linewidth=1, label=\"价格 Price\")\n",
    "ax1.plot(ma_short, color=\"orange\", linewidth=1.2, label=f\"SMA{SHORT_WIN} (快线)\")\n",
    "ax1.plot(ma_long,  color=\"red\",    linewidth=1.2, label=f\"SMA{LONG_WIN}  (慢线)\")\n",
    "\n",
    "# Shade long periods (持多仓的区间着色)\n",
    "ax1.fill_between(\n",
    "    price.index, price.min(), price.max(),\n",
    "    where=(ma_signal == 1).values,\n",
    "    alpha=0.12, color=\"green\", label=\"多头持仓区间 Long Period\"\n",
    ")\n",
    "ax1.set_title(\"策略A — 双均线信号  (MA Crossover Signals)\", fontsize=13, fontweight=\"bold\")\n",
    "ax1.legend(loc=\"upper left\", fontsize=8)\n",
    "ax1.set_ylabel(\"价格 Price\")\n",
    "ax1.grid(alpha=0.3)\n",
    "\n",
    "# ── Plot 2: RSI  RSI指标 ────────────────────────────────────────────────────\n",
    "ax2 = fig.add_subplot(gs[1, :])\n",
    "ax2.plot(rsi14, color=\"purple\", linewidth=1)\n",
    "ax2.axhline(RSI_OVERBOUGHT, color=\"red\",   linestyle=\"--\", linewidth=1, label=f\"超买线 {RSI_OVERBOUGHT}\")\n",
    "ax2.axhline(RSI_OVERSOLD,   color=\"green\", linestyle=\"--\", linewidth=1, label=f\"超卖线 {RSI_OVERSOLD}\")\n",
    "ax2.axhline(50,             color=\"gray\",  linestyle=\":\",  linewidth=0.8)\n",
    "ax2.fill_between(rsi14.index, RSI_OVERSOLD, rsi14,\n",
    "                  where=(rsi14 < RSI_OVERSOLD), alpha=0.25, color=\"green\",\n",
    "                  label=\"超卖区域 Oversold\")\n",
    "ax2.fill_between(rsi14.index, rsi14, RSI_OVERBOUGHT,\n",
    "                  where=(rsi14 > RSI_OVERBOUGHT), alpha=0.25, color=\"red\",\n",
    "                  label=\"超买区域 Overbought\")\n",
    "ax2.set_ylim(0, 100)\n",
    "ax2.set_title(f\"策略B指标 — RSI({14})  均值回归信号  (RSI Mean Reversion)\", fontsize=13, fontweight=\"bold\")\n",
    "ax2.set_ylabel(\"RSI\")\n",
    "ax2.legend(loc=\"upper left\", fontsize=8, ncol=2)\n",
    "ax2.grid(alpha=0.3)\n",
    "\n",
    "# ── Plot 3: Bollinger Bands  布林带 ────────────────────────────────────────────\n",
    "ax3 = fig.add_subplot(gs[2, :])\n",
    "ax3.plot(price,      color=\"#1f77b4\", linewidth=1,   label=\"价格 Price\")\n",
    "ax3.plot(bb[\"mid\"],  color=\"orange\",  linewidth=1.2, label=\"中轨 Middle (SMA20)\")\n",
    "ax3.plot(bb[\"upper\"],color=\"red\",     linewidth=1,   linestyle=\"--\", label=\"上轨 Upper (+2σ)\")\n",
    "ax3.plot(bb[\"lower\"],color=\"green\",   linewidth=1,   linestyle=\"--\", label=\"下轨 Lower (-2σ)\")\n",
    "ax3.fill_between(price.index, bb[\"upper\"], bb[\"lower\"], alpha=0.07, color=\"blue\")\n",
    "ax3.set_title(\"布林带 (Bollinger Bands 20, 2σ)\", fontsize=13, fontweight=\"bold\")\n",
    "ax3.set_ylabel(\"价格 Price\")\n",
    "ax3.legend(loc=\"upper left\", fontsize=8, ncol=2)\n",
    "ax3.grid(alpha=0.3)\n",
    "\n",
    "# ── Plot 4: MACD  MACD指标 ────────────────────────────────────────────────────\n",
    "ax4 = fig.add_subplot(gs[3, :])\n",
    "ax4.plot(macd_df[\"macd\"],   color=\"blue\",   linewidth=1,   label=\"MACD线 (DIF)\")\n",
    "ax4.plot(macd_df[\"signal\"], color=\"orange\", linewidth=1,   label=\"信号线 (DEA)\")\n",
    "colors = [\"green\" if v >= 0 else \"red\" for v in macd_df[\"histogram\"]]\n",
    "ax4.bar(macd_df.index, macd_df[\"histogram\"], color=colors, alpha=0.5, width=1, label=\"柱状图 Histogram\")\n",
    "ax4.axhline(0, color=\"black\", linewidth=0.8)\n",
    "ax4.set_title(\"MACD (12/26/9) — 趋势确认指标  (Trend Confirmation)\", fontsize=13, fontweight=\"bold\")\n",
    "ax4.set_ylabel(\"MACD\")\n",
    "ax4.legend(loc=\"upper left\", fontsize=8)\n",
    "ax4.grid(alpha=0.3)\n",
    "\n",
    "# ── Plot 5: Equity Curves  净值曲线 ────────────────────────────────────────────\n",
    "ax5 = fig.add_subplot(gs[4, :])\n",
    "ax5.plot(bt_ma.equity,   color=\"blue\",   linewidth=1.5, label=\"策略A: 双均线  MA Crossover\")\n",
    "ax5.plot(bt_rsi.equity,  color=\"purple\", linewidth=1.5, label=\"策略B: RSI 均值回归  RSI Reversion\")\n",
    "ax5.plot(bt_bh.equity,   color=\"gray\",   linewidth=1.2, linestyle=\"--\", label=\"基准: 买入持有  Buy & Hold\")\n",
    "ax5.set_title(\"净值曲线对比  (Equity Curve Comparison)\", fontsize=13, fontweight=\"bold\")\n",
    "ax5.set_ylabel(\"账户价值  Portfolio Value (元)\")\n",
    "ax5.legend(loc=\"upper left\", fontsize=9)\n",
    "ax5.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f\"¥{x/1e4:.0f}万\"))\n",
    "ax5.grid(alpha=0.3)\n",
    "\n",
    "# ── Plot 6: Drawdown  回撤曲线 ────────────────────────────────────────────────\n",
    "ax6 = fig.add_subplot(gs[5, :])\n",
    "ax6.fill_between(bt_ma.drawdown.index,  bt_ma.drawdown,  0, alpha=0.5, color=\"blue\",   label=\"策略A\")\n",
    "ax6.fill_between(bt_rsi.drawdown.index, bt_rsi.drawdown, 0, alpha=0.5, color=\"purple\", label=\"策略B\")\n",
    "ax6.fill_between(bt_bh.drawdown.index,  bt_bh.drawdown,  0, alpha=0.3, color=\"gray\",   label=\"Buy & Hold\")\n",
    "ax6.set_title(\"回撤曲线  (Drawdown Curves)\", fontsize=13, fontweight=\"bold\")\n",
    "ax6.set_ylabel(\"回撤幅度  Drawdown\")\n",
    "ax6.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f\"{x:.0%}\"))\n",
    "ax6.legend(loc=\"lower left\", fontsize=9)\n",
    "ax6.grid(alpha=0.3)\n",
    "\n",
    "plt.suptitle(\n",
    "    \"量化交易策略开发与回测演示\\nQuantitative Trading: Strategy Development & Backtesting\",\n",
    "    fontsize=15, fontweight=\"bold\", y=1.005,\n",
    ")\n",
    "plt.savefig(\"strategy_backtest_demo.png\", dpi=120, bbox_inches=\"tight\")\n",
    "plt.show()\n",
    "print(\"\\n[图表] 已保存至 strategy_backtest_demo.png\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ff8384a5",
   "metadata": {},
   "source": [
    "# =============================================================================\n",
    "# SECTION 7: Walk-Forward Validation  滚动前向验证\n",
    "\n",
    "# -----------------------------------------------------------------------------\n",
    "# A critical warning for all new quant traders / 对所有量化新手的重要警告:\n",
    "#\n",
    "# In-sample overfitting (样本内过拟合) is the #1 trap in backtesting.\n",
    "# 样本内过拟合是回测中最大的陷阱。\n",
    "#\n",
    "# If you test 100 different parameter sets on the same data and pick the best,\n",
    "# that \"best\" result will almost certainly NOT hold out of sample.\n",
    "# 如果在同一份数据上测试100组参数并选最好的，这个\"最优\"结果在样本外几乎必然失效。\n",
    "# This is called data snooping bias / 数据窥探偏差 or p-hacking.\n",
    "#\n",
    "# Walk-Forward Validation (滚动前向验证) helps guard against this:\n",
    "# ┌──────────────────────────────────────────────────────────────────────┐\n",
    "# │ Window 1:  [TRAIN period 1] → optimize params → TEST on period 1+   │\n",
    "# │ Window 2:  [TRAIN period 2] → optimize params → TEST on period 2+   │\n",
    "# │  …repeat, always training on past, testing on future                │\n",
    "# │  始终用过去数据训练，用未来数据测试                                          │\n",
    "# └──────────────────────────────────────────────────────────────────────┘\n",
    "# Only report the concatenated OUT-OF-SAMPLE test results.\n",
    "# 只汇报样本外（OOS）的测试结果。\n",
    "#\n",
    "# Below: a simplified version — we just split into train / test (80/20).\n",
    "# 下面是简化版：直接按 80/20 切分训练集和测试集。\n",
    "\n",
    "# ============================================================================="
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5e99677e",
   "metadata": {},
   "outputs": [],
   "source": [
    "TRAIN_RATIO = 0.8\n",
    "split_idx = int(len(price) * TRAIN_RATIO)\n",
    "split_date = price.index[split_idx]\n",
    "\n",
    "price_train = price.iloc[:split_idx]\n",
    "price_test  = price.iloc[split_idx:]\n",
    "\n",
    "print(f\"\\n{'=' * 55}\")\n",
    "print(f\"  滚动前向验证 / Walk-Forward Split\")\n",
    "print(f\"{'=' * 55}\")\n",
    "print(f\"  训练期 Train: {price_train.index[0].date()} → {price_train.index[-1].date()}  ({len(price_train)} 天)\")\n",
    "print(f\"  测试期 Test : {price_test.index[0].date()}  → {price_test.index[-1].date()}  ({len(price_test)} 天)\")\n",
    "\n",
    "# ── Optimize MA windows on TRAIN set  在训练集上优化均线参数 ────────────────────\n",
    "#\n",
    "# Grid search (网格搜索): try all combinations in the parameter space.\n",
    "# This is the simplest optimization method — good for small parameter spaces.\n",
    "# 网格搜索：遍历参数空间内的所有组合。适合参数空间小的情形。\n",
    "\n",
    "print(\"\\n[优化] 在训练集上搜索最优均线参数...\")\n",
    "print(\"  (搜索空间: short=[5,10,15,20,30], long=[30,40,50,60,80,100])\")\n",
    "\n",
    "best_sharpe = -np.inf\n",
    "best_short  = SHORT_WIN\n",
    "best_long   = LONG_WIN\n",
    "results_grid = []\n",
    "\n",
    "for sw in [5, 10, 15, 20, 30]:\n",
    "    for lw in [30, 40, 50, 60, 80, 100]:\n",
    "        if sw >= lw:\n",
    "            continue   # short must be shorter than long / 短期必须小于长期\n",
    "        ma_s = sma(price_train, sw)\n",
    "        ma_l = sma(price_train, lw)\n",
    "        sig  = (ma_s > ma_l).astype(int).shift(1).fillna(0)\n",
    "        bt   = VectorizedBacktester(price_train, sig, cost_per_trade=0.001, name=\"grid\")\n",
    "        m    = bt.metrics()\n",
    "        sharpe_val = float(m[\"夏普比率  Sharpe Ratio\"])\n",
    "        results_grid.append({\"short\": sw, \"long\": lw, \"sharpe\": sharpe_val})\n",
    "        if sharpe_val > best_sharpe:\n",
    "            best_sharpe = sharpe_val\n",
    "            best_short  = sw\n",
    "            best_long   = lw\n",
    "\n",
    "print(f\"\\n  最优参数 (训练集 in-sample): short={best_short}, long={best_long}\")\n",
    "print(f\"  训练集夏普比率 In-sample Sharpe: {best_sharpe:.3f}\")\n",
    "\n",
    "# ── Apply best params on TEST set  将最优参数应用于测试集 ──────────────────────\n",
    "ma_s_test = sma(price_test, best_short)\n",
    "ma_l_test = sma(price_test, best_long)\n",
    "sig_test  = (ma_s_test > ma_l_test).astype(int).shift(1).fillna(0)\n",
    "\n",
    "bt_test   = VectorizedBacktester(price_test, sig_test, cost_per_trade=0.001,\n",
    "                                  name=f\"MA({best_short}/{best_long}) — 测试集 OOS\")\n",
    "bt_test.print_metrics()\n",
    "\n",
    "print(\"\\n\" + \"=\" * 55)\n",
    "print(\"  ⚠️  注意 / WARNING:\")\n",
    "print(\"  训练集(in-sample)夏普 通常高于 测试集(out-of-sample)夏普\")\n",
    "print(\"  In-sample Sharpe is typically HIGHER than out-of-sample.\")\n",
    "print(\"  夏普衰减 (Sharpe decay) 是策略过拟合的典型信号。\")\n",
    "print(\"  Sharpe decay is a classic sign of overfitting.\")\n",
    "print(\"=\" * 55)"
   ]
  },
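  {
   "cell_type": "markdown",
   "id": "wf-rolling-sketch-md",
   "metadata": {},
   "source": [
    "# -----------------------------------------------------------------------------\n",
    "# Sketch: rolling walk-forward loop  滚动前向验证示意\n",
    "#\n",
    "# The 80/20 split above uses a single window. As a rough illustration of the\n",
    "# rolling scheme in the diagram, the next cell re-optimizes the MA windows on\n",
    "# each training window and keeps only the returns of the window that follows.\n",
    "# Assumptions (not from this notebook's engine): the wf_* helpers, the window\n",
    "# lengths (500 / 125 bars), the parameter grids, and the synthetic random-walk\n",
    "# prices are all hypothetical and chosen for illustration only. The cell does\n",
    "# not use VectorizedBacktester, so its numbers are not comparable to the\n",
    "# results above.\n",
    "# -----------------------------------------------------------------------------"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "wf-rolling-sketch-code",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical helpers (wf_* names) on synthetic prices: illustration only.\n",
    "def wf_net_returns(px, short, long, cost):\n",
    "    # Long when fast SMA > slow SMA; shift(1) trades on the next bar (no lookahead).\n",
    "    sig = (px.rolling(short).mean() > px.rolling(long).mean()).astype(int).shift(1).fillna(0)\n",
    "    return sig * px.pct_change().fillna(0) - sig.diff().abs().fillna(0) * cost\n",
    "\n",
    "def wf_sharpe(ret):\n",
    "    # Annualized Sharpe, assuming 252 trading days and a zero risk-free rate.\n",
    "    sd = ret.std()\n",
    "    return float(ret.mean() / sd * np.sqrt(252)) if sd > 0 else 0.0\n",
    "\n",
    "def wf_run(px, train_len=500, test_len=125,\n",
    "           shorts=(5, 10, 20), longs=(30, 50, 100), cost=0.001):\n",
    "    # SMAs are backward-looking, so precomputing each candidate once leaks nothing.\n",
    "    nets = {(s, l): wf_net_returns(px, s, l, cost) for s in shorts for l in longs if s < l}\n",
    "    oos, picks, start = [], [], 0\n",
    "    while start + train_len + test_len <= len(px):\n",
    "        split = start + train_len\n",
    "        # Pick the pair with the best Sharpe on the training window only.\n",
    "        best = max(nets, key=lambda k: wf_sharpe(nets[k].iloc[start:split]))\n",
    "        oos.append(nets[best].iloc[split:split + test_len])   # trade the next window\n",
    "        picks.append((px.index[split].date(), best))\n",
    "        start += test_len                                      # roll forward\n",
    "    return pd.concat(oos), picks\n",
    "\n",
    "wf_rng = np.random.default_rng(0)\n",
    "wf_dates = pd.bdate_range(\"2018-01-01\", periods=1500)\n",
    "wf_price = pd.Series(100 * np.exp(np.cumsum(wf_rng.normal(0.0003, 0.01, 1500))), index=wf_dates)\n",
    "\n",
    "wf_oos, wf_picks = wf_run(wf_price)\n",
    "for first_day, (s, l) in wf_picks:\n",
    "    print(f\"test window from {first_day}: short={s}, long={l}\")\n",
    "print(f\"OOS days: {len(wf_oos)}, concatenated OOS Sharpe: {wf_sharpe(wf_oos):.3f}\")"
   ]
  },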
  {
   "cell_type": "markdown",
   "id": "95c33a90",
   "metadata": {},
   "source": [
    "# =============================================================================\n",
    "# SECTION 8: Return Distribution Analysis  收益率分布分析\n",
    "\n",
    "# -----------------------------------------------------------------------------\n",
    "# Before trusting your Sharpe ratio, check if the return distribution\n",
    "# violates the normality assumption.\n",
    "# 在相信夏普比率之前，检验收益率分布是否违背正态假设。\n",
    "#\n",
    "# Real returns typically show:\n",
    "# 真实收益率通常呈现:\n",
    "#   Fat tails (厚尾 / leptokurtosis):  extreme events more frequent than normal\n",
    "#   Negative skew (负偏态):            crashes are larger than rallies\n",
    "#\n",
    "# A high Sharpe ratio on a fat-tailed distribution can be misleading.\n",
    "# 厚尾分布下的高夏普比率可能具有误导性。\n",
    "\n",
    "# ============================================================================="
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e74d07b8",
   "metadata": {},
   "outputs": [],
   "source": [
    "fig2, axes = plt.subplots(1, 2, figsize=(14, 5))\n",
    "\n",
    "# ── Return distribution histogram  收益率直方图 ─────────────────────────────\n",
    "ax = axes[0]\n",
    "r_ma  = bt_ma.strat_ret.dropna()\n",
    "r_bh  = bt_bh.strat_ret.dropna()\n",
    "\n",
    "ax.hist(r_bh, bins=80, alpha=0.5, color=\"gray\",  density=True, label=\"Buy & Hold\")\n",
    "ax.hist(r_ma, bins=80, alpha=0.5, color=\"blue\",  density=True, label=\"策略A: MA Crossover\")\n",
    "\n",
    "# Overlay a normal distribution for comparison / 叠加正态分布对比\n",
    "x_range = np.linspace(r_bh.min(), r_bh.max(), 300)\n",
    "ax.plot(x_range, stats.norm.pdf(x_range, r_bh.mean(), r_bh.std()),\n",
    "        color=\"red\", linewidth=1.5, linestyle=\"--\", label=\"正态分布 Normal Dist.\")\n",
    "ax.set_title(\"收益率分布  (Return Distribution)\", fontsize=12)\n",
    "ax.set_xlabel(\"日收益率  Daily Return\")\n",
    "ax.set_ylabel(\"频率密度  Density\")\n",
    "ax.legend(fontsize=8)\n",
    "ax.grid(alpha=0.3)\n",
    "\n",
    "# Print distribution stats\n",
    "print(f\"\\n[分布] 策略A — 日收益率统计:\")\n",
    "print(f\"  偏度 Skewness : {r_ma.skew():.3f}  (负值=左尾更厚 fat left tail)\")\n",
    "print(f\"  峰度 Kurtosis : {r_ma.kurtosis():.3f}  (>0 表示厚尾 fat tails vs normal)\")\n",
    "\n",
    "# ── Monthly returns heatmap  月度收益热力图 ─────────────────────────────────\n",
    "ax = axes[1]\n",
    "monthly = bt_ma.strat_ret.resample(\"M\").apply(lambda x: (1 + x).prod() - 1)\n",
    "monthly_df = pd.DataFrame({\n",
    "    \"year\":  monthly.index.year,\n",
    "    \"month\": monthly.index.month,\n",
    "    \"ret\":   monthly.values,\n",
    "})\n",
    "pivot = monthly_df.pivot(index=\"year\", columns=\"month\", values=\"ret\")\n",
    "pivot.columns = [\"Jan\",\"Feb\",\"Mar\",\"Apr\",\"May\",\"Jun\",\"Jul\",\"Aug\",\"Sep\",\"Oct\",\"Nov\",\"Dec\"]\n",
    "\n",
    "import matplotlib.colors as mcolors\n",
    "cmap = mcolors.LinearSegmentedColormap.from_list(\"rg\", [\"#d73027\",\"#ffffff\",\"#1a9850\"])\n",
    "im = ax.imshow(pivot.values, cmap=cmap, aspect=\"auto\",\n",
    "                vmin=-0.15, vmax=0.15)\n",
    "ax.set_xticks(range(12))\n",
    "ax.set_xticklabels(pivot.columns, fontsize=8)\n",
    "ax.set_yticks(range(len(pivot.index)))\n",
    "ax.set_yticklabels(pivot.index, fontsize=9)\n",
    "for i in range(len(pivot.index)):\n",
    "    for j in range(12):\n",
    "        v = pivot.values[i, j]\n",
    "        if not np.isnan(v):\n",
    "            ax.text(j, i, f\"{v:.1%}\", ha=\"center\", va=\"center\", fontsize=6,\n",
    "                    color=\"black\" if abs(v) < 0.08 else \"white\")\n",
    "ax.set_title(\"策略A月度收益热力图\\n(Monthly Return Heatmap)\", fontsize=11)\n",
    "plt.colorbar(im, ax=ax, format=plt.FuncFormatter(lambda x, _: f\"{x:.0%}\"))\n",
    "\n",
    "plt.tight_layout()\n",
    "plt.savefig(\"return_distribution.png\", dpi=120, bbox_inches=\"tight\")\n",
    "plt.show()\n",
    "print(\"[图表] 已保存至 return_distribution.png\")"
   ]
  },
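  {
   "cell_type": "markdown",
   "id": "normality-test-sketch-md",
   "metadata": {},
   "source": [
    "# -----------------------------------------------------------------------------\n",
    "# Sketch: testing normality with Jarque-Bera  正态性检验示意\n",
    "#\n",
    "# Skewness and excess kurtosis can be combined into one formal test. The next\n",
    "# cell is a minimal sketch using scipy.stats.jarque_bera on two synthetic\n",
    "# samples (Gaussian vs. Student-t with 3 degrees of freedom). The samples and\n",
    "# the nt_* names are assumptions for illustration only. A small p-value rejects\n",
    "# normality. To test the strategy itself, pass bt_ma.strat_ret.dropna() instead.\n",
    "# -----------------------------------------------------------------------------"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "normality-test-sketch-code",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from scipy import stats\n",
    "\n",
    "# Synthetic samples (illustration only): Gaussian vs. fat-tailed Student-t (df=3).\n",
    "nt_rng = np.random.default_rng(42)\n",
    "nt_samples = {\n",
    "    \"normal\": nt_rng.normal(0.0, 0.01, 2000),\n",
    "    \"student-t(3)\": nt_rng.standard_t(3, 2000) * 0.01,\n",
    "}\n",
    "\n",
    "for nt_label, nt_x in nt_samples.items():\n",
    "    # jarque_bera returns (statistic, p-value); H0 = skew 0 and excess kurtosis 0.\n",
    "    nt_stat, nt_p = stats.jarque_bera(nt_x)\n",
    "    print(f\"{nt_label:>13}: skew={stats.skew(nt_x):+.3f}  \"\n",
    "          f\"excess kurt={stats.kurtosis(nt_x):+.3f}  JB={nt_stat:.1f}  p={nt_p:.4f}\")"
   ]
  },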
  {
   "cell_type": "markdown",
   "id": "4dfb5f58",
   "metadata": {},
   "source": [
    "# =============================================================================\n",
    "# SECTION 9: Summary & Next Steps  总结与后续\n",
    "\n",
    "# ============================================================================="
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7bac6b3d",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(f\"\"\"\n",
    "{'=' * 70}\n",
    "  总结 Summary\n",
    "{'=' * 70}\n",
    "  本 Demo 演示了量化策略开发与回测的完整流程：\n",
    "\n",
    "  ① 技术指标计算  Technical Indicators\n",
    "     SMA / EMA / RSI / MACD / Bollinger Bands\n",
    "\n",
    "  ② 信号生成  Signal Generation\n",
    "     策略A: 双均线金叉死叉 (MA Crossover) — 趋势跟随\n",
    "     策略B: RSI 超买超卖  (RSI Reversion) — 均值回归\n",
    "\n",
    "  ③ 向量化回测引擎  Vectorized Backtester\n",
    "     考虑了交易成本(佣金+滑点)和前视偏差(lookahead bias)\n",
    "\n",
    "  ④ 绩效指标  Performance Metrics\n",
    "     Sharpe / Sortino / Max Drawdown / Calmar / Win Rate / Profit Factor\n",
    "\n",
    "  ⑤ 前向验证  Walk-Forward Validation\n",
    "     训练集优化参数 → 测试集验证 → 防止过拟合\n",
    "\n",
    "  ⑥ 收益率分布  Return Distribution\n",
    "     偏度/峰度检验，月度热力图\n",
    "\n",
    "  下一步学习方向  Next Steps:\n",
    "  ──────────────────────────────────────────────────────────────\n",
    "  • 因子选股策略 (Alpha Factor Models) — Fama-French, Momentum\n",
    "  • 组合优化     (Portfolio Optimization) — Mean-Variance, Risk Parity\n",
    "  • 事件驱动回测 (Event-Driven Backtesting) — more realistic execution\n",
    "  • 机器学习信号 (ML-based Signals) — XGBoost, LSTM for return prediction\n",
    "  • 风险管理     (Risk Management) — Position sizing, Stop-loss, VaR\n",
    "{'=' * 70}\n",
    "\"\"\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (trading)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.11.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}