Chapter 5 Quantitative Modeling and Backtesting: From Model Specification to Strategy Validation

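The quantmod package provides a modular workflow for statistical modeling of financial data, running from model specification through parameter estimation to strategy backtesting. Each step is a separate function call, so a complex quantitative analysis breaks down into a series of reusable pieces.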

5.1 Model Building Workflow

5.1.1 Specifying the Model

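Model building starts with a mathematical statement of how the target variable relates to the explanatory variables. The specifyModel function accepts the same formula syntax that R uses for linear regression. For example, Next(OpCl(AAPL)) ~ Lag(OpHi(AAPL),0:3) + Hi(IXIC) specifies a model that predicts Apple's next-day open-to-close percentage change.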

# Load the package
library(quantmod)
# Download the data
getSymbols(c("^IXIC","AAPL"))
## [1] "IXIC" "AAPL"
# Specify the model
q.model <- specifyModel(Next(OpCl(AAPL)) ~ Lag(OpHi(AAPL),0:3) + Hi(IXIC))
# Inspect the model
q.model
## 
## quantmod object:     Build date:   
## 
## Model Specified: 
##      Next(OpCl(AAPL)) ~ Lag(OpHi(AAPL), 0:3) + Hi(IXIC) 
## 
## Model Target:  Next.OpCl.AAPL         Product:  AAPL 
## Model Inputs:   
## 
## Fitted Model: 
## 
##  None Fitted

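The model uses Apple's open-to-high percentage change for the current day and each of the previous three days (lags 0 through 3), together with the Nasdaq Composite's high for the current day. The formula states the relationship between the variables explicitly, and the Lag function brings in lagged terms so that the model can pick up short-term persistence in the market.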
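The transformations in the formula can be sketched in base R on a handful of made-up prices. The helpers pct_change, lag_k and next_1 are hypothetical stand-ins written for this illustration; they mirror what OpCl(), OpHi(), Lag() and Next() compute but are not quantmod functions.

```r
# Toy open/high/close prices (made-up numbers, not market data)
open  <- c(100, 102, 101, 105, 104, 106)
high  <- c(103, 104, 106, 107, 105, 108)
close <- c(102, 101, 105, 104, 105, 107)

# Hypothetical helpers, written only for this sketch
pct_change <- function(from, to) (to - from) / from          # like OpCl / OpHi
lag_k  <- function(x, k) if (k == 0) x else c(rep(NA, k), head(x, -k))  # like Lag
next_1 <- function(x) c(x[-1], NA)                           # like Next

op_cl <- pct_change(open, close)  # open-to-close percentage change
op_hi <- pct_change(open, high)   # open-to-high percentage change

# One row per day: tomorrow's target next to today's and earlier inputs
design <- data.frame(
  target = next_1(op_cl),
  lag0   = lag_k(op_hi, 0),
  lag1   = lag_k(op_hi, 1),
  lag2   = lag_k(op_hi, 2),
  lag3   = lag_k(op_hi, 3)
)

# Rows with an NA (the first three days and the last day) cannot be used
complete <- design[complete.cases(design), ]
print(complete)
```

Lagging by three periods costs the first three rows and the one-step-ahead target costs the last row, so only the rows in between are usable for fitting.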

5.1.2 Inspecting the Model Data

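Once the model is specified, the modelData function shows the data set behind it. This is most useful when debugging: looking at the generated columns confirms whether each variable was computed as intended.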

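Note that quantmod aligns the series by date when it assembles the data. Both specifyModel and buildData take an na.rm argument (TRUE by default) that drops rows containing missing values.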

head(modelData(q.model))
##            Next.OpCl.AAPL
## 2007-01-08      0.0707922
## 2007-01-09      0.0237467
## 2007-01-10     -0.0014593
## 2007-01-11      0.0003174
## 2007-01-12      0.0148410
## 2007-01-16     -0.0267530
##            Lag.OpHi.AAPL.0.3.Lag.0
## 2007-01-08                0.006631
## 2007-01-09                0.075535
## 2007-01-10                0.032190
## 2007-01-11                0.008755
## 2007-01-12                0.004969
## 2007-01-16                0.016409
##            Lag.OpHi.AAPL.0.3.Lag.1
## 2007-01-08                0.005013
## 2007-01-09                0.006631
## 2007-01-10                0.075535
## 2007-01-11                0.032190
## 2007-01-12                0.008755
## 2007-01-16                0.004969
##            Lag.OpHi.AAPL.0.3.Lag.2
## 2007-01-08                0.022606
## 2007-01-09                0.005013
## 2007-01-10                0.006631
## 2007-01-11                0.075535
## 2007-01-12                0.032190
## 2007-01-16                0.008755
##            Lag.OpHi.AAPL.0.3.Lag.3 Hi.IXIC
## 2007-01-08                0.003361    2446
## 2007-01-09                0.022606    2450
## 2007-01-10                0.005013    2461
## 2007-01-11                0.006631    2489
## 2007-01-12                0.075535    2503
## 2007-01-16                0.032190    2509
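The alignment and missing-value handling described above can be sketched in base R. The two small data frames below hold made-up numbers, and the column names are assumptions chosen for this example; the sketch shows the idea, not quantmod's internal code.

```r
# Two series observed on partly different dates (made-up values)
aapl <- data.frame(
  date  = as.Date(c("2024-01-02", "2024-01-03", "2024-01-04", "2024-01-05")),
  op_hi = c(0.010, 0.004, NA, 0.007)
)
ixic <- data.frame(
  date = as.Date(c("2024-01-03", "2024-01-04", "2024-01-05", "2024-01-08")),
  hi   = c(14800, 14750, 14900, 14950)
)

# Align on the date index; dates missing from one side become NA
aligned <- merge(aapl, ixic, by = "date", all = TRUE)

# Drop every row that still contains a missing value
cleaned <- na.omit(aligned)

print(aligned)
print(cleaned)
```

Only dates on which every column has a value survive, which is why a model data set can start later and end earlier than the raw series.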

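To generate the model data set directly from a formula, buildData is a shorter route. It returns a plain data object rather than a quantmod model object, so the result cannot be passed to buildModel; it is meant for inspecting the data or for use outside the quantmod modeling workflow.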

bD <- buildData(Next(OpCl(AAPL)) ~ Lag(OpHi(AAPL),0:3) + Hi(IXIC))
head(bD)
##            Next.OpCl.AAPL
## 2007-01-08      0.0707922
## 2007-01-09      0.0237467
## 2007-01-10     -0.0014593
## 2007-01-11      0.0003174
## 2007-01-12      0.0148410
## 2007-01-16     -0.0267530
##            Lag.OpHi.AAPL.0.3.Lag.0
## 2007-01-08                0.006631
## 2007-01-09                0.075535
## 2007-01-10                0.032190
## 2007-01-11                0.008755
## 2007-01-12                0.004969
## 2007-01-16                0.016409
##            Lag.OpHi.AAPL.0.3.Lag.1
## 2007-01-08                0.005013
## 2007-01-09                0.006631
## 2007-01-10                0.075535
## 2007-01-11                0.032190
## 2007-01-12                0.008755
## 2007-01-16                0.004969
##            Lag.OpHi.AAPL.0.3.Lag.2
## 2007-01-08                0.022606
## 2007-01-09                0.005013
## 2007-01-10                0.006631
## 2007-01-11                0.075535
## 2007-01-12                0.032190
## 2007-01-16                0.008755
##            Lag.OpHi.AAPL.0.3.Lag.3 Hi.IXIC
## 2007-01-08                0.003361    2446
## 2007-01-09                0.022606    2450
## 2007-01-10                0.005013    2461
## 2007-01-11                0.006631    2489
## 2007-01-12                0.075535    2503
## 2007-01-16                0.032190    2509

5.1.3 Estimating Model Parameters

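Parameter estimation is the core of model training. The buildModel function supports several modeling methods, including linear regression (lm) and generalized linear models (glm). The training period must be specified: for example, training.per=c('2013-08-01','2013-09-30') restricts estimation to data within that date range. Refitting the same specification over successive windows gives a way to check how stable the model is over time. After training, print and summary show the key statistics, including coefficient estimates, significance levels, and goodness of fit, which help in judging the quality of the model.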

# Simple linear model
bM <- buildModel(q.model,method='lm',training.per=c('2013-08-01','2013-09-30'))
# Show the result
print(bM,width = 50)
## <S4 Type Object>
## attr(,"model.id")
## [1] "lm1752147173.54446"
## attr(,"model.spec")
## Next(OpCl(AAPL)) ~ Lag(OpHi(AAPL), 0:3) + Hi(IXIC)
## attr(,"model.formula")
## Next.OpCl.AAPL ~ Lag.OpHi.AAPL.0.3.Lag.0 + Lag.OpHi.AAPL.0.3.Lag.1 + 
##     Lag.OpHi.AAPL.0.3.Lag.2 + Lag.OpHi.AAPL.0.3.Lag.3 + Hi.IXIC
## <environment: 0x7f8f5b9f2300>
## attr(,"model.target")
## [1] "Next.OpCl.AAPL"
## attr(,"model.inputs")
## [1] "Lag.OpHi.AAPL.0.3.Lag.0"
## [2] "Lag.OpHi.AAPL.0.3.Lag.1"
## [3] "Lag.OpHi.AAPL.0.3.Lag.2"
## [4] "Lag.OpHi.AAPL.0.3.Lag.3"
## [5] "Hi.IXIC"                
## attr(,"build.inputs")
## [1] "Lag.OpHi.AAPL.0.3.Lag.0"
## [2] "Lag.OpHi.AAPL.0.3.Lag.1"
## [3] "Lag.OpHi.AAPL.0.3.Lag.2"
## [4] "Lag.OpHi.AAPL.0.3.Lag.3"
## [5] "Hi.IXIC"                
## attr(,"symbols")
## [1] "AAPL" "IXIC"
## attr(,"product")
## [1] "AAPL"
## attr(,"price.levels")
## `\001NULL\001`
## attr(,"training.data")
##  [1] "2013-08-01" "2013-08-02" "2013-08-05"
##  [4] "2013-08-06" "2013-08-07" "2013-08-08"
##  [7] "2013-08-09" "2013-08-12" "2013-08-13"
## [10] "2013-08-14" "2013-08-15" "2013-08-16"
## [13] "2013-08-19" "2013-08-20" "2013-08-21"
## [16] "2013-08-22" "2013-08-23" "2013-08-26"
## [19] "2013-08-27" "2013-08-28" "2013-08-29"
## [22] "2013-08-30" "2013-09-03" "2013-09-04"
## [25] "2013-09-05" "2013-09-06" "2013-09-09"
## [28] "2013-09-10" "2013-09-11" "2013-09-12"
## [31] "2013-09-13" "2013-09-16" "2013-09-17"
## [34] "2013-09-18" "2013-09-19" "2013-09-20"
## [37] "2013-09-23" "2013-09-24" "2013-09-25"
## [40] "2013-09-26" "2013-09-27" "2013-09-30"
## attr(,"build.date")
## [1] "2025-07-10 19:32:53.544324"
## attr(,"fitted.model")
## 
## Call:
## lm(formula = quantmod@model.formula, data = training.data)
## 
## Coefficients:
##             (Intercept)  
##                6.36e-02  
## Lag.OpHi.AAPL.0.3.Lag.0  
##                2.01e-01  
## Lag.OpHi.AAPL.0.3.Lag.1  
##               -8.97e-02  
## Lag.OpHi.AAPL.0.3.Lag.2  
##               -1.06e-01  
## Lag.OpHi.AAPL.0.3.Lag.3  
##                7.86e-02  
##                 Hi.IXIC  
##               -1.77e-05  
## 
## attr(,"model.data")
##            Next.OpCl.AAPL Lag.OpHi.AAPL.0.3.Lag.0
## 2007-01-08      0.0707922               0.0066310
## 2007-01-09      0.0237467               0.0755349
## 2007-01-10     -0.0014593               0.0321898
## 2007-01-11      0.0003174               0.0087555
## 2007-01-12      0.0148410               0.0049689
## 2007-01-16     -0.0267530               0.0164087
## 2007-01-17     -0.0328992               0.0004098
## 2007-01-18     -0.0014669               0.0001086
## 2007-01-19     -0.0263629               0.0115086
## 2007-01-22     -0.0003501               0.0002246
##        ...                                       
## 2025-06-25     -0.0021347               0.0110201
## 2025-06-26     -0.0040121               0.0060071
## 2025-06-27      0.0156428               0.0065878
## 2025-06-30      0.0055645               0.0266324
## 2025-07-01      0.0168972               0.0170320
## 2025-07-02      0.0065991               0.0212053
## 2025-07-03     -0.0128362               0.0117841
## 2025-07-07     -0.0004284               0.0166918
## 2025-07-08      0.0076839               0.0063303
## 2025-07-09             NA               0.0085907
##            Lag.OpHi.AAPL.0.3.Lag.1
## 2007-01-08               0.0050134
## 2007-01-09               0.0066310
## 2007-01-10               0.0755349
## 2007-01-11               0.0321898
## 2007-01-12               0.0087555
## 2007-01-16               0.0049689
## 2007-01-17               0.0164087
## 2007-01-18               0.0004098
## 2007-01-19               0.0001086
## 2007-01-22               0.0115086
##        ...                        
## 2025-06-25               0.0041957
## 2025-06-26               0.0110201
## 2025-06-27               0.0060071
## 2025-06-30               0.0065878
## 2025-07-01               0.0266324
## 2025-07-02               0.0170320
## 2025-07-03               0.0212053
## 2025-07-07               0.0117841
## 2025-07-08               0.0166918
## 2025-07-09               0.0063303
##            Lag.OpHi.AAPL.0.3.Lag.2
## 2007-01-08               0.0226056
## 2007-01-09               0.0050134
## 2007-01-10               0.0066310
## 2007-01-11               0.0755349
## 2007-01-12               0.0321898
## 2007-01-16               0.0087555
## 2007-01-17               0.0049689
## 2007-01-18               0.0164087
## 2007-01-19               0.0004098
## 2007-01-22               0.0001086
##        ...                        
## 2025-06-25               0.0033229
## 2025-06-26               0.0041957
## 2025-06-27               0.0110201
## 2025-06-30               0.0060071
## 2025-07-01               0.0065878
## 2025-07-02               0.0266324
## 2025-07-03               0.0170320
## 2025-07-07               0.0212053
## 2025-07-08               0.0117841
## 2025-07-09               0.0166918
##            Lag.OpHi.AAPL.0.3.Lag.3 Hi.IXIC
## 2007-01-08               0.0033608    2446
## 2007-01-09               0.0226056    2450
## 2007-01-10               0.0050134    2461
## 2007-01-11               0.0066310    2489
## 2007-01-12               0.0755349    2503
## 2007-01-16               0.0321898    2509
## 2007-01-17               0.0087555    2497
## 2007-01-18               0.0049689    2476
## 2007-01-19               0.0164087    2454
## 2007-01-22               0.0004098    2455
##        ...                                
## 2025-06-25               0.0174535   20053
## 2025-06-26               0.0033229   20187
## 2025-06-27               0.0041957   20312
## 2025-06-30               0.0110201   20418
## 2025-07-01               0.0060071   20339
## 2025-07-02               0.0065878   20397
## 2025-07-03               0.0266324   20625
## 2025-07-07               0.0170320   20512
## 2025-07-08               0.0212053   20481
## 2025-07-09               0.0117841   20645
## attr(,"quantmod.version")
## numeric(0)
## attr(,"class")
## [1] "quantmod"
## attr(,"class")attr(,"package")
## [1] "quantmod"
# Show more detailed results
print(summary(bM),width = 50)
## 
## quantmod object:   lm1752147173.54446    Build date:  2025-07-10 19:32:53.544324 
## 
## Model Specified: 
##      Next(OpCl(AAPL)) ~ Lag(OpHi(AAPL), 0:3) + Hi(IXIC) 
## 
## Model Target:  Next.OpCl.AAPL         Product:  AAPL 
## Model Inputs:  Lag.OpHi.AAPL.0.3.Lag.0, Lag.OpHi.AAPL.0.3.Lag.1, Lag.OpHi.AAPL.0.3.Lag.2, Lag.OpHi.AAPL.0.3.Lag.3, Hi.IXIC 
## 
## Fitted Model: 
## 
##  Modelling procedure:  lm 
##  Training window:  42  observations from  2013-08-01 to 2013-09-30
## 
## Call:
## lm(formula = quantmod@model.formula, data = training.data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.02207 -0.00766 -0.00028  0.00689  0.03572 
## 
## Coefficients:
##                          Estimate Std. Error
## (Intercept)              6.36e-02   1.40e-01
## Lag.OpHi.AAPL.0.3.Lag.0  2.01e-01   2.28e-01
## Lag.OpHi.AAPL.0.3.Lag.1 -8.97e-02   2.29e-01
## Lag.OpHi.AAPL.0.3.Lag.2 -1.06e-01   2.34e-01
## Lag.OpHi.AAPL.0.3.Lag.3  7.86e-02   2.31e-01
## Hi.IXIC                 -1.77e-05   3.75e-05
##                         t value Pr(>|t|)
## (Intercept)                0.45     0.65
## Lag.OpHi.AAPL.0.3.Lag.0    0.88     0.38
## Lag.OpHi.AAPL.0.3.Lag.1   -0.39     0.70
## Lag.OpHi.AAPL.0.3.Lag.2   -0.46     0.65
## Lag.OpHi.AAPL.0.3.Lag.3    0.34     0.74
## Hi.IXIC                   -0.47     0.64
## 
## Residual standard error: 0.0132 on 36 degrees of freedom
## Multiple R-squared:  0.0391, Adjusted R-squared:  -0.0943 
## F-statistic: 0.293 on 5 and 36 DF,  p-value: 0.914
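The training-window idea can be sketched with base R alone: restrict the data to a date range, fit lm on that subset, and repeat for another range to compare coefficients. The data below are synthetic, and fit_window is a hypothetical helper written for this example, not part of quantmod.

```r
set.seed(42)

# Synthetic daily data with a known linear relationship
dates <- seq(as.Date("2013-07-01"), by = "day", length.out = 120)
dat <- data.frame(date = dates, x1 = rnorm(120), x2 = rnorm(120))
dat$y <- 0.5 * dat$x1 - 0.2 * dat$x2 + rnorm(120, sd = 0.1)

# Hypothetical helper: fit the same formula on one date window
fit_window <- function(data, start, end) {
  in_window <- data$date >= as.Date(start) & data$date <= as.Date(end)
  lm(y ~ x1 + x2, data = data[in_window, ])
}

fit_a <- fit_window(dat, "2013-08-01", "2013-09-30")
fit_b <- fit_window(dat, "2013-09-01", "2013-10-28")

# Coefficients that stay close across windows suggest a stable relationship
print(rbind(window_a = coef(fit_a), window_b = coef(fit_b)))
print(summary(fit_a)$r.squared)
```

With real market data the coefficients typically move far more between windows than they do here, which is exactly what this kind of comparison is meant to reveal.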

5.1.4 Extracting Model Results

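Result extraction links training to application. Note that getModelData does not itself return predictions or residuals: it loads the data behind a model specification and returns the quantmod object, which is why the call below prints the same model summary as before. Fitted values and residuals come from the underlying fitted object, which fittedModel(bM) returns, and they can then feed into strategy construction or risk analysis. For example, the distribution of the residuals can reveal unusual market moves that the model fails to capture, pointing to ways of improving it.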

Calling getModelData on the model fitted above:

getModelData(bM)
## 
## quantmod object:   lm1752147173.54446    Build date:  2025-07-10 19:32:53.544324 
## 
## Model Specified: 
##      Next(OpCl(AAPL)) ~ Lag(OpHi(AAPL), 0:3) + Hi(IXIC) 
## 
## Model Target:  Next.OpCl.AAPL         Product:  AAPL 
## Model Inputs:  Lag.OpHi.AAPL.0.3.Lag.0, Lag.OpHi.AAPL.0.3.Lag.1, Lag.OpHi.AAPL.0.3.Lag.2, Lag.OpHi.AAPL.0.3.Lag.3, Hi.IXIC 
## 
## Fitted Model: 
## 
##  Modelling procedure:  lm 
##  Training window:  42  observations from  2013-08-01 to 2013-09-30
## 
## Call:
## lm(formula = quantmod@model.formula, data = training.data)
## 
## Coefficients:
##             (Intercept)  
##                6.36e-02  
## Lag.OpHi.AAPL.0.3.Lag.0  
##                2.01e-01  
## Lag.OpHi.AAPL.0.3.Lag.1  
##               -8.97e-02  
## Lag.OpHi.AAPL.0.3.Lag.2  
##               -1.06e-01  
## Lag.OpHi.AAPL.0.3.Lag.3  
##                7.86e-02  
##                 Hi.IXIC  
##               -1.77e-05
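The residual analysis mentioned above can be sketched in base R. The example fits lm to synthetic data with two injected shocks, then uses residuals() and fitted() to flag observations the model explains poorly. The data, the variable names, and the two-standard-deviation cutoff are assumptions made for this illustration.

```r
set.seed(1)

# Synthetic data: a linear relationship plus small noise
n <- 100
x <- rnorm(n)
y <- 0.3 * x + rnorm(n, sd = 0.05)

# Inject two large shocks that the linear model cannot explain
y[c(20, 75)] <- y[c(20, 75)] + c(0.5, -0.5)

fit  <- lm(y ~ x)
res  <- residuals(fit)  # observed minus fitted
pred <- fitted(fit)     # in-sample predictions

# Flag observations whose residual exceeds two standard deviations
threshold <- 2 * sd(res)
outliers  <- which(abs(res) > threshold)
print(outliers)
```

The flagged positions are candidates for closer inspection, for instance dates with news events that the explanatory variables do not reflect.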

5.2 Strategy Backtesting and Evaluation

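Backtesting is the step that takes a quantitative model toward practical use. The tradeModel function provides a basic backtest: it generates trading signals from the fitted model and simulates trading on them. The default leverage is 1, and the leverage argument changes it; setting it to 2, for example, runs the backtest at 2x leverage, which helps in judging how the strategy behaves under different money-management schemes. The output reports the compound annual growth rate (C.A.G.R.), the holding period return (H.P.R.), and summaries of weekly, monthly, quarterly and yearly returns, which together give a direct view of the strategy's profitability and risk.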

# Basic backtest
tradeModel(bM)
## 
##   Model:  lm1752147173.54446 
## 
##   C.A.G.R.:  -13.24%     H.P.R.:  -97.78% 
## 
##   Returns by period summary:
## 
##              weekly monthly quarterly
##     Max.     11.42%  16.82%    26.20%
##     3rd Qu.   1.58%   2.65%     2.05%
##     Mean     -0.35%  -1.52%    -4.40%
##     Median   -0.37%  -1.44%    -4.15%
##     2rd Qu.  -2.22%  -6.33%   -12.44%
##     Min.    -22.08% -24.44%   -23.94%
##              yearly
##     Max.     60.86%
##     3rd Qu. -12.42%
##     Mean    -15.96%
##     Median  -19.75%
##     2rd Qu. -26.73%
##     Min.    -45.64%
## 
##   Period to date returns:
## 
##              weekly monthly quarterly
##               0.55%  -2.35%    -2.35%
##               yearly
##              -19.28%
# Backtest with leverage
tradeModel(bM, leverage=2)
## 
##   Model:  lm1752147173.54446 
## 
##   C.A.G.R.:  -28.12%     H.P.R.:  -99.99% 
## 
##   Returns by period summary:
## 
##              weekly monthly quarterly
##     Max.     23.77%  35.70%    56.36%
##     3rd Qu.   3.08%   4.72%     3.62%
##     Mean     -0.70%  -3.09%    -8.79%
##     Median   -0.80%  -3.10%    -9.71%
##     2rd Qu.  -4.44% -12.48%   -24.67%
##     Min.    -41.26% -45.38%   -43.48%
##              yearly
##     Max.    139.66%
##     3rd Qu. -27.31%
##     Mean    -28.67%
##     Median  -40.87%
##     2rd Qu. -48.21%
##     Min.    -77.62%
## 
##   Period to date returns:
## 
##              weekly monthly quarterly
##               1.08%  -4.70%    -4.70%
##               yearly
##              -39.33%
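How leverage feeds into the headline figures can be sketched with a few lines of base R. This is a simplified illustration under stated assumptions, not tradeModel's internal logic: the position is taken as the sign of the predicted return, each period's strategy return is position times realized return times leverage, and backtest is a hypothetical helper. All numbers are made up.

```r
# Hypothetical helper: sign-based signal, leveraged returns, HPR and CAGR
backtest <- function(predicted, realized, leverage = 1, periods_per_year = 252) {
  signal <- sign(predicted)               # +1 long, -1 short, 0 flat
  strat  <- signal * realized * leverage  # per-period strategy return
  hpr    <- prod(1 + strat) - 1           # holding period return
  cagr   <- (1 + hpr)^(periods_per_year / length(strat)) - 1
  list(returns = strat, hpr = hpr, cagr = cagr)
}

# Made-up predicted and realized daily returns
predicted <- c(0.01, -0.02, 0.005, -0.01)
realized  <- c(0.02,  0.01, -0.01, -0.03)

bt1 <- backtest(predicted, realized)                # 1x leverage
bt2 <- backtest(predicted, realized, leverage = 2)  # 2x leverage

print(c(hpr_1x = bt1$hpr, hpr_2x = bt2$hpr))
```

Doubling the leverage doubles each period's return, but because returns compound, the holding period return does not simply double. The same effect is visible in the two tradeModel outputs above, where the loss at 2x leverage is not exactly twice the loss at 1x.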