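Chapter 5 Quantitative Modeling and Backtesting: From Model Specification to Strategy Validation
The quantmod package provides a fluent, modular framework for statistical modeling of financial data. Its workflow forms a closed loop that runs from model specification through parameter estimation to strategy backtesting. This design breaks complex quantitative analysis into a series of reusable function calls, which improves both efficiency and flexibility.
5.1 Model-Building Workflow
5.1.1 Specifying the Model Form
Model building starts with a mathematical expression of the relationship between the target variable and the explanatory variables. With the specifyModel function, users define the model structure using the same formula syntax as linear regression in R. For example, Next(OpCl(AAPL)) ~ Lag(OpHi(AAPL),0:3) + Hi(IXIC)
specifies a model that predicts Apple's next-day open-to-close change, expressed as a proportion of the opening price.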
## [1] "IXIC" "AAPL"
##
## quantmod object: Build date:
##
## Model Specified:
## Next(OpCl(AAPL)) ~ Lag(OpHi(AAPL), 0:3) + Hi(IXIC)
##
## Model Target: Next.OpCl.AAPL Product: AAPL
## Model Inputs:
##
## Fitted Model:
##
## None Fitted
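The model uses Apple's open-to-high change on the current day and on each of the previous three days (lags 0 to 3, four values in total), together with the same-day high of the NASDAQ Composite index. The formula states the relationship between the variables explicitly, and the Lag function introduces time-lagged terms so that the model can capture momentum-like persistence in the market.
5.1.2 Inspecting the Model Data
Once the model is specified, the modelData function returns the dataset that corresponds to it. This is especially useful when debugging: by examining the generated data structure, users can confirm that each variable is computed as expected.
Note that quantmod aligns the time series and handles missing values when it assembles this dataset. For example, the na.rm=TRUE argument (the default in specifyModel and buildData) removes rows that contain missing values, which is why the data below start on 2007-01-08: the earliest rows lack the three lagged values.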
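The specification step can be sketched as follows. This is a minimal illustration, not the chapter's own code: the chapter's data presumably come from getSymbols, whereas here the helper sim_ohlc (hypothetical, not part of quantmod) simulates OHLC bars so that the example runs offline. The prices, dates and seeds are arbitrary assumptions.

```r
library(quantmod)

# Hypothetical helper (not part of quantmod): simulate daily OHLC bars.
sim_ohlc <- function(name, start_price, seed, n = 200) {
  set.seed(seed)
  dates <- seq(as.Date("2013-06-01"), by = "day", length.out = n)
  cl <- start_price * cumprod(1 + rnorm(n, sd = 0.01))
  op <- cl * (1 + rnorm(n, sd = 0.005))
  hi <- pmax(op, cl) * (1 + abs(rnorm(n, sd = 0.003)))
  lo <- pmin(op, cl) * (1 - abs(rnorm(n, sd = 0.003)))
  x <- xts(cbind(op, hi, lo, cl), order.by = dates)
  colnames(x) <- paste(name, c("Open", "High", "Low", "Close"), sep = ".")
  x
}

# specifyModel() resolves symbols by name, so the simulated series are
# placed in the global environment under the names used in the formula.
assign("AAPL", sim_ohlc("AAPL", 60, seed = 1), envir = .GlobalEnv)
assign("IXIC", sim_ohlc("IXIC", 3600, seed = 2), envir = .GlobalEnv)

# Target: next-day open-to-close change; inputs: lags 0..3 of the
# open-to-high change plus the same-day index high.
q.model <- specifyModel(Next(OpCl(AAPL)) ~ Lag(OpHi(AAPL), 0:3) + Hi(IXIC))
show(q.model)
```

With real data, the two assign() calls would be replaced by a download step, and the rest of the sketch would stay the same.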
## Next.OpCl.AAPL
## 2007-01-08 0.0707922
## 2007-01-09 0.0237467
## 2007-01-10 -0.0014593
## 2007-01-11 0.0003174
## 2007-01-12 0.0148410
## 2007-01-16 -0.0267530
## Lag.OpHi.AAPL.0.3.Lag.0
## 2007-01-08 0.006631
## 2007-01-09 0.075535
## 2007-01-10 0.032190
## 2007-01-11 0.008755
## 2007-01-12 0.004969
## 2007-01-16 0.016409
## Lag.OpHi.AAPL.0.3.Lag.1
## 2007-01-08 0.005013
## 2007-01-09 0.006631
## 2007-01-10 0.075535
## 2007-01-11 0.032190
## 2007-01-12 0.008755
## 2007-01-16 0.004969
## Lag.OpHi.AAPL.0.3.Lag.2
## 2007-01-08 0.022606
## 2007-01-09 0.005013
## 2007-01-10 0.006631
## 2007-01-11 0.075535
## 2007-01-12 0.032190
## 2007-01-16 0.008755
## Lag.OpHi.AAPL.0.3.Lag.3 Hi.IXIC
## 2007-01-08 0.003361 2446
## 2007-01-09 0.022606 2450
## 2007-01-10 0.005013 2461
## 2007-01-11 0.006631 2489
## 2007-01-12 0.075535 2503
## 2007-01-16 0.032190 2509
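To generate the model dataset directly from a formula, the buildData function offers a more concise route. Its result is a plain time-series data object rather than a quantmod object, so it is suited to data inspection and cannot be passed directly to buildModel.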
## Next.OpCl.AAPL
## 2007-01-08 0.0707922
## 2007-01-09 0.0237467
## 2007-01-10 -0.0014593
## 2007-01-11 0.0003174
## 2007-01-12 0.0148410
## 2007-01-16 -0.0267530
## Lag.OpHi.AAPL.0.3.Lag.0
## 2007-01-08 0.006631
## 2007-01-09 0.075535
## 2007-01-10 0.032190
## 2007-01-11 0.008755
## 2007-01-12 0.004969
## 2007-01-16 0.016409
## Lag.OpHi.AAPL.0.3.Lag.1
## 2007-01-08 0.005013
## 2007-01-09 0.006631
## 2007-01-10 0.075535
## 2007-01-11 0.032190
## 2007-01-12 0.008755
## 2007-01-16 0.004969
## Lag.OpHi.AAPL.0.3.Lag.2
## 2007-01-08 0.022606
## 2007-01-09 0.005013
## 2007-01-10 0.006631
## 2007-01-11 0.075535
## 2007-01-12 0.032190
## 2007-01-16 0.008755
## Lag.OpHi.AAPL.0.3.Lag.3 Hi.IXIC
## 2007-01-08 0.003361 2446
## 2007-01-09 0.022606 2450
## 2007-01-10 0.005013 2461
## 2007-01-11 0.006631 2489
## 2007-01-12 0.075535 2503
## 2007-01-16 0.032190 2509
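The two ways of obtaining the dataset can be compared side by side. The sketch below assumes simulated data produced by the hypothetical helper sim_ohlc (not part of quantmod), so the numbers differ from the listings above. It only illustrates that modelData and buildData yield the same values while returning different kinds of objects.

```r
library(quantmod)

# Hypothetical helper (not part of quantmod): simulate daily OHLC bars.
sim_ohlc <- function(name, start_price, seed, n = 200) {
  set.seed(seed)
  dates <- seq(as.Date("2013-06-01"), by = "day", length.out = n)
  cl <- start_price * cumprod(1 + rnorm(n, sd = 0.01))
  op <- cl * (1 + rnorm(n, sd = 0.005))
  hi <- pmax(op, cl) * (1 + abs(rnorm(n, sd = 0.003)))
  lo <- pmin(op, cl) * (1 - abs(rnorm(n, sd = 0.003)))
  x <- xts(cbind(op, hi, lo, cl), order.by = dates)
  colnames(x) <- paste(name, c("Open", "High", "Low", "Close"), sep = ".")
  x
}
assign("AAPL", sim_ohlc("AAPL", 60, seed = 1), envir = .GlobalEnv)
assign("IXIC", sim_ohlc("IXIC", 3600, seed = 2), envir = .GlobalEnv)

# Route 1: specify the model, then read the data stored in the object.
q.model <- specifyModel(Next(OpCl(AAPL)) ~ Lag(OpHi(AAPL), 0:3) + Hi(IXIC))
md <- modelData(q.model)

# Route 2: build the dataset straight from the formula.
bd <- buildData(Next(OpCl(AAPL)) ~ Lag(OpHi(AAPL), 0:3) + Hi(IXIC))

head(md)
class(bd)   # a time-series class, not "quantmod"
```

Because bd carries no model specification, a quantmod object from specifyModel is still needed for the estimation step.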
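5.1.3 Estimating Model Parameters
Parameter estimation is the core of model training. The buildModel function supports several modeling methods, including linear regression (lm) and generalized linear models (glm). Users specify the training period when fitting; for example, training.per=c('2013-08-01','2013-09-30') restricts estimation to data within that window. Because the window is an explicit argument, the model can be refitted over successive historical windows to assess how stable it is over time. After training, the print and summary functions report the key statistics of the model, including coefficient estimates, significance levels and goodness of fit, which help users judge model quality.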
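## A simple linear model (q.model is the specification from section 5.1.1)
bM <- buildModel(q.model, method = 'lm', training.per = c('2013-08-01', '2013-09-30'))
## Print the fitted object
print(bM, width = 50)
## Summarize the fitted model (coefficients, residuals, R-squared)
summary(bM)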
## <S4 Type Object>
## attr(,"model.id")
## [1] "lm1752147173.54446"
## attr(,"model.spec")
## Next(OpCl(AAPL)) ~ Lag(OpHi(AAPL), 0:3) + Hi(IXIC)
## attr(,"model.formula")
## Next.OpCl.AAPL ~ Lag.OpHi.AAPL.0.3.Lag.0 + Lag.OpHi.AAPL.0.3.Lag.1 +
## Lag.OpHi.AAPL.0.3.Lag.2 + Lag.OpHi.AAPL.0.3.Lag.3 + Hi.IXIC
## <environment: 0x7f8f5b9f2300>
## attr(,"model.target")
## [1] "Next.OpCl.AAPL"
## attr(,"model.inputs")
## [1] "Lag.OpHi.AAPL.0.3.Lag.0"
## [2] "Lag.OpHi.AAPL.0.3.Lag.1"
## [3] "Lag.OpHi.AAPL.0.3.Lag.2"
## [4] "Lag.OpHi.AAPL.0.3.Lag.3"
## [5] "Hi.IXIC"
## attr(,"build.inputs")
## [1] "Lag.OpHi.AAPL.0.3.Lag.0"
## [2] "Lag.OpHi.AAPL.0.3.Lag.1"
## [3] "Lag.OpHi.AAPL.0.3.Lag.2"
## [4] "Lag.OpHi.AAPL.0.3.Lag.3"
## [5] "Hi.IXIC"
## attr(,"symbols")
## [1] "AAPL" "IXIC"
## attr(,"product")
## [1] "AAPL"
## attr(,"price.levels")
## `\001NULL\001`
## attr(,"training.data")
## [1] "2013-08-01" "2013-08-02" "2013-08-05"
## [4] "2013-08-06" "2013-08-07" "2013-08-08"
## [7] "2013-08-09" "2013-08-12" "2013-08-13"
## [10] "2013-08-14" "2013-08-15" "2013-08-16"
## [13] "2013-08-19" "2013-08-20" "2013-08-21"
## [16] "2013-08-22" "2013-08-23" "2013-08-26"
## [19] "2013-08-27" "2013-08-28" "2013-08-29"
## [22] "2013-08-30" "2013-09-03" "2013-09-04"
## [25] "2013-09-05" "2013-09-06" "2013-09-09"
## [28] "2013-09-10" "2013-09-11" "2013-09-12"
## [31] "2013-09-13" "2013-09-16" "2013-09-17"
## [34] "2013-09-18" "2013-09-19" "2013-09-20"
## [37] "2013-09-23" "2013-09-24" "2013-09-25"
## [40] "2013-09-26" "2013-09-27" "2013-09-30"
## attr(,"build.date")
## [1] "2025-07-10 19:32:53.544324"
## attr(,"fitted.model")
##
## Call:
## lm(formula = quantmod@model.formula, data = training.data)
##
## Coefficients:
## (Intercept)
## 6.36e-02
## Lag.OpHi.AAPL.0.3.Lag.0
## 2.01e-01
## Lag.OpHi.AAPL.0.3.Lag.1
## -8.97e-02
## Lag.OpHi.AAPL.0.3.Lag.2
## -1.06e-01
## Lag.OpHi.AAPL.0.3.Lag.3
## 7.86e-02
## Hi.IXIC
## -1.77e-05
##
## attr(,"model.data")
## Next.OpCl.AAPL Lag.OpHi.AAPL.0.3.Lag.0
## 2007-01-08 0.0707922 0.0066310
## 2007-01-09 0.0237467 0.0755349
## 2007-01-10 -0.0014593 0.0321898
## 2007-01-11 0.0003174 0.0087555
## 2007-01-12 0.0148410 0.0049689
## 2007-01-16 -0.0267530 0.0164087
## 2007-01-17 -0.0328992 0.0004098
## 2007-01-18 -0.0014669 0.0001086
## 2007-01-19 -0.0263629 0.0115086
## 2007-01-22 -0.0003501 0.0002246
## ...
## 2025-06-25 -0.0021347 0.0110201
## 2025-06-26 -0.0040121 0.0060071
## 2025-06-27 0.0156428 0.0065878
## 2025-06-30 0.0055645 0.0266324
## 2025-07-01 0.0168972 0.0170320
## 2025-07-02 0.0065991 0.0212053
## 2025-07-03 -0.0128362 0.0117841
## 2025-07-07 -0.0004284 0.0166918
## 2025-07-08 0.0076839 0.0063303
## 2025-07-09 NA 0.0085907
## Lag.OpHi.AAPL.0.3.Lag.1
## 2007-01-08 0.0050134
## 2007-01-09 0.0066310
## 2007-01-10 0.0755349
## 2007-01-11 0.0321898
## 2007-01-12 0.0087555
## 2007-01-16 0.0049689
## 2007-01-17 0.0164087
## 2007-01-18 0.0004098
## 2007-01-19 0.0001086
## 2007-01-22 0.0115086
## ...
## 2025-06-25 0.0041957
## 2025-06-26 0.0110201
## 2025-06-27 0.0060071
## 2025-06-30 0.0065878
## 2025-07-01 0.0266324
## 2025-07-02 0.0170320
## 2025-07-03 0.0212053
## 2025-07-07 0.0117841
## 2025-07-08 0.0166918
## 2025-07-09 0.0063303
## Lag.OpHi.AAPL.0.3.Lag.2
## 2007-01-08 0.0226056
## 2007-01-09 0.0050134
## 2007-01-10 0.0066310
## 2007-01-11 0.0755349
## 2007-01-12 0.0321898
## 2007-01-16 0.0087555
## 2007-01-17 0.0049689
## 2007-01-18 0.0164087
## 2007-01-19 0.0004098
## 2007-01-22 0.0001086
## ...
## 2025-06-25 0.0033229
## 2025-06-26 0.0041957
## 2025-06-27 0.0110201
## 2025-06-30 0.0060071
## 2025-07-01 0.0065878
## 2025-07-02 0.0266324
## 2025-07-03 0.0170320
## 2025-07-07 0.0212053
## 2025-07-08 0.0117841
## 2025-07-09 0.0166918
## Lag.OpHi.AAPL.0.3.Lag.3 Hi.IXIC
## 2007-01-08 0.0033608 2446
## 2007-01-09 0.0226056 2450
## 2007-01-10 0.0050134 2461
## 2007-01-11 0.0066310 2489
## 2007-01-12 0.0755349 2503
## 2007-01-16 0.0321898 2509
## 2007-01-17 0.0087555 2497
## 2007-01-18 0.0049689 2476
## 2007-01-19 0.0164087 2454
## 2007-01-22 0.0004098 2455
## ...
## 2025-06-25 0.0174535 20053
## 2025-06-26 0.0033229 20187
## 2025-06-27 0.0041957 20312
## 2025-06-30 0.0110201 20418
## 2025-07-01 0.0060071 20339
## 2025-07-02 0.0065878 20397
## 2025-07-03 0.0266324 20625
## 2025-07-07 0.0170320 20512
## 2025-07-08 0.0212053 20481
## 2025-07-09 0.0117841 20645
## attr(,"quantmod.version")
## numeric(0)
## attr(,"class")
## [1] "quantmod"
## attr(,"class")attr(,"package")
## [1] "quantmod"
##
## quantmod object: lm1752147173.54446 Build date: 2025-07-10 19:32:53.544324
##
## Model Specified:
## Next(OpCl(AAPL)) ~ Lag(OpHi(AAPL), 0:3) + Hi(IXIC)
##
## Model Target: Next.OpCl.AAPL Product: AAPL
## Model Inputs: Lag.OpHi.AAPL.0.3.Lag.0, Lag.OpHi.AAPL.0.3.Lag.1, Lag.OpHi.AAPL.0.3.Lag.2, Lag.OpHi.AAPL.0.3.Lag.3, Hi.IXIC
##
## Fitted Model:
##
## Modelling procedure: lm
## Training window: 42 observations from 2013-08-01 to 2013-09-30
##
## Call:
## lm(formula = quantmod@model.formula, data = training.data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.02207 -0.00766 -0.00028 0.00689 0.03572
##
## Coefficients:
## Estimate Std. Error
## (Intercept) 6.36e-02 1.40e-01
## Lag.OpHi.AAPL.0.3.Lag.0 2.01e-01 2.28e-01
## Lag.OpHi.AAPL.0.3.Lag.1 -8.97e-02 2.29e-01
## Lag.OpHi.AAPL.0.3.Lag.2 -1.06e-01 2.34e-01
## Lag.OpHi.AAPL.0.3.Lag.3 7.86e-02 2.31e-01
## Hi.IXIC -1.77e-05 3.75e-05
## t value Pr(>|t|)
## (Intercept) 0.45 0.65
## Lag.OpHi.AAPL.0.3.Lag.0 0.88 0.38
## Lag.OpHi.AAPL.0.3.Lag.1 -0.39 0.70
## Lag.OpHi.AAPL.0.3.Lag.2 -0.46 0.65
## Lag.OpHi.AAPL.0.3.Lag.3 0.34 0.74
## Hi.IXIC -0.47 0.64
##
## Residual standard error: 0.0132 on 36 degrees of freedom
## Multiple R-squared: 0.0391, Adjusted R-squared: -0.0943
## F-statistic: 0.293 on 5 and 36 DF, p-value: 0.914
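Refitting over successive windows, as described at the start of this section, might look like the sketch below. It assumes simulated data from the hypothetical helper sim_ohlc (not part of quantmod) and two arbitrarily chosen two-month windows. The point is only the pattern of calling buildModel once per training.per and comparing the coefficients.

```r
library(quantmod)

# Hypothetical helper (not part of quantmod): simulate daily OHLC bars.
sim_ohlc <- function(name, start_price, seed, n = 200) {
  set.seed(seed)
  dates <- seq(as.Date("2013-06-01"), by = "day", length.out = n)
  cl <- start_price * cumprod(1 + rnorm(n, sd = 0.01))
  op <- cl * (1 + rnorm(n, sd = 0.005))
  hi <- pmax(op, cl) * (1 + abs(rnorm(n, sd = 0.003)))
  lo <- pmin(op, cl) * (1 - abs(rnorm(n, sd = 0.003)))
  x <- xts(cbind(op, hi, lo, cl), order.by = dates)
  colnames(x) <- paste(name, c("Open", "High", "Low", "Close"), sep = ".")
  x
}
assign("AAPL", sim_ohlc("AAPL", 60, seed = 1), envir = .GlobalEnv)
assign("IXIC", sim_ohlc("IXIC", 3600, seed = 2), envir = .GlobalEnv)

q.model <- specifyModel(Next(OpCl(AAPL)) ~ Lag(OpHi(AAPL), 0:3) + Hi(IXIC))

# Two consecutive training windows (arbitrary choices for illustration).
windows <- list(c("2013-08-01", "2013-09-30"),
                c("2013-10-01", "2013-11-30"))

# Fit one lm per window and collect the coefficient vectors column-wise.
coefs <- sapply(windows, function(w) {
  m <- buildModel(q.model, method = "lm", training.per = w)
  coef(fittedModel(m))
})
colnames(coefs) <- sapply(windows, paste, collapse = " to ")
round(coefs, 4)
```

Large swings in sign or magnitude between columns would suggest that the relationship is not stable across periods.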
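5.1.4 Extracting Model Results
Extracting results links training to application. The getModelData function refreshes the dataset held by a specified or fitted model and returns the updated quantmod object, as the output below shows. Quantities such as fitted values and residuals come from the fitted model stored inside that object, and they can feed into strategy construction or risk analysis. For example, examining the distribution of the residuals can reveal abnormal market moves that the model fails to capture, which points to ways of improving it.
Extracting the results of the model above: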
##
## quantmod object: lm1752147173.54446 Build date: 2025-07-10 19:32:53.544324
##
## Model Specified:
## Next(OpCl(AAPL)) ~ Lag(OpHi(AAPL), 0:3) + Hi(IXIC)
##
## Model Target: Next.OpCl.AAPL Product: AAPL
## Model Inputs: Lag.OpHi.AAPL.0.3.Lag.0, Lag.OpHi.AAPL.0.3.Lag.1, Lag.OpHi.AAPL.0.3.Lag.2, Lag.OpHi.AAPL.0.3.Lag.3, Hi.IXIC
##
## Fitted Model:
##
## Modelling procedure: lm
## Training window: 42 observations from 2013-08-01 to 2013-09-30
##
## Call:
## lm(formula = quantmod@model.formula, data = training.data)
##
## Coefficients:
## (Intercept)
## 6.36e-02
## Lag.OpHi.AAPL.0.3.Lag.0
## 2.01e-01
## Lag.OpHi.AAPL.0.3.Lag.1
## -8.97e-02
## Lag.OpHi.AAPL.0.3.Lag.2
## -1.06e-01
## Lag.OpHi.AAPL.0.3.Lag.3
## 7.86e-02
## Hi.IXIC
## -1.77e-05
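Residuals of the kind discussed in this section can be pulled from the fitted lm object held by the quantmod object. The sketch below uses the fittedModel accessor and assumes simulated data from the hypothetical helper sim_ohlc (not part of quantmod), so its numbers are unrelated to the AAPL results above.

```r
library(quantmod)

# Hypothetical helper (not part of quantmod): simulate daily OHLC bars.
sim_ohlc <- function(name, start_price, seed, n = 200) {
  set.seed(seed)
  dates <- seq(as.Date("2013-06-01"), by = "day", length.out = n)
  cl <- start_price * cumprod(1 + rnorm(n, sd = 0.01))
  op <- cl * (1 + rnorm(n, sd = 0.005))
  hi <- pmax(op, cl) * (1 + abs(rnorm(n, sd = 0.003)))
  lo <- pmin(op, cl) * (1 - abs(rnorm(n, sd = 0.003)))
  x <- xts(cbind(op, hi, lo, cl), order.by = dates)
  colnames(x) <- paste(name, c("Open", "High", "Low", "Close"), sep = ".")
  x
}
assign("AAPL", sim_ohlc("AAPL", 60, seed = 1), envir = .GlobalEnv)
assign("IXIC", sim_ohlc("IXIC", 3600, seed = 2), envir = .GlobalEnv)

q.model <- specifyModel(Next(OpCl(AAPL)) ~ Lag(OpHi(AAPL), 0:3) + Hi(IXIC))
bM <- buildModel(q.model, method = "lm",
                 training.per = c("2013-08-01", "2013-09-30"))

fit <- fittedModel(bM)   # the underlying lm object
res <- residuals(fit)    # in-sample residuals over the training window

summary(res)             # location and spread of the residuals
res[abs(res) > 2 * sd(res)]   # observations the model fits poorly
```

The last line uses a two-standard-deviation cutoff purely as an illustrative threshold for flagging unusual days.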
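5.2 Strategy Backtesting and Evaluation
Backtesting is the key step in putting a quantitative model into practice. The tradeModel function provides basic backtesting: it generates trading signals from a trained model and simulates the resulting trades. Backtests use 1x leverage by default, and the leverage argument changes this; setting it to 2, for example, runs the backtest with 2x leverage, which helps evaluate the strategy under different money-management schemes. The results report key performance figures, such as the compound annual growth rate (C.A.G.R.), the holding-period return (H.P.R.) and summaries of returns by period, which reflect the strategy's profitability and risk. Of the two summaries below, the first corresponds to the default leverage and the second to leverage = 2.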
##
## Model: lm1752147173.54446
##
## C.A.G.R.: -13.24% H.P.R.: -97.78%
##
## Returns by period summary:
##
## weekly monthly quarterly
## Max. 11.42% 16.82% 26.20%
## 3rd Qu. 1.58% 2.65% 2.05%
## Mean -0.35% -1.52% -4.40%
## Median -0.37% -1.44% -4.15%
## 2rd Qu. -2.22% -6.33% -12.44%
## Min. -22.08% -24.44% -23.94%
## yearly
## Max. 60.86%
## 3rd Qu. -12.42%
## Mean -15.96%
## Median -19.75%
## 2rd Qu. -26.73%
## Min. -45.64%
##
## Period to date returns:
##
## weekly monthly quarterly
## 0.55% -2.35% -2.35%
## yearly
## -19.28%
##
## Model: lm1752147173.54446
##
## C.A.G.R.: -28.12% H.P.R.: -99.99%
##
## Returns by period summary:
##
## weekly monthly quarterly
## Max. 23.77% 35.70% 56.36%
## 3rd Qu. 3.08% 4.72% 3.62%
## Mean -0.70% -3.09% -8.79%
## Median -0.80% -3.10% -9.71%
## 2rd Qu. -4.44% -12.48% -24.67%
## Min. -41.26% -45.38% -43.48%
## yearly
## Max. 139.66%
## 3rd Qu. -27.31%
## Mean -28.67%
## Median -40.87%
## 2rd Qu. -48.21%
## Min. -77.62%
##
## Period to date returns:
##
## weekly monthly quarterly
## 1.08% -4.70% -4.70%
## yearly
## -39.33%
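The two summaries show that doubling leverage does not simply double the compounded figures (H.P.R. moves from -97.78% to -99.99%). The base R sketch below illustrates why, under the simplifying assumption that leverage multiplies each period's return before compounding. It is not tradeModel's internal code, and the function perf and the simulated returns are hypothetical.

```r
set.seed(42)
r <- rnorm(250, mean = -0.0005, sd = 0.01)   # hypothetical daily strategy returns

# Holding-period return and annualized growth for a leveraged return series.
perf <- function(ret, leverage = 1, periods_per_year = 250) {
  lev <- leverage * ret        # leverage scales each period's return
  growth <- prod(1 + lev)      # compounded growth of one unit of capital
  c(HPR  = growth - 1,
    CAGR = growth^(periods_per_year / length(ret)) - 1)
}

rbind(`leverage 1` = perf(r, 1),
      `leverage 2` = perf(r, 2))
```

A two-period case makes the effect concrete: returns of +10% and -10% compound to -1% at 1x leverage, but +20% and -20% compound to -4% at 2x, four times the loss rather than twice.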