Operations Research and Management Science ›› 2024, Vol. 33 ›› Issue (3): 184-190.DOI: 10.12005/orms.2024.0096

• Application Research •

Short-term Volatility Prediction of Gold Futures Based on High-frequency Data and EN-LSTM

QIU Dongyang1, DING Ling1, HE Yifu2   

  1. School of Economics and Finance, Chongqing University of Technology, Chongqing 400054, China;
    2. Faculty of Science, University of Hong Kong, Hong Kong 999077, China
  • Received: 2021-08-19 Online: 2024-03-25 Published: 2024-05-20

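  • Corresponding author: DING Ling (b. 1996), female, from Zhoukou, Henan; master's student; research interests: econometric methods, financial markets.
  • About the authors: QIU Dongyang (b. 1970), male, from Chongqing; PhD, professor, master's supervisor; research interests: volatility of China's financial markets, financial risk. HE Yifu (b. 2000), male, from Chongqing; undergraduate student; research interest: decision analysis.
  • Supported by:
    Key Project of the National Philosophy and Social Science Foundation of China (17AJY028)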

Abstract: The sudden outbreak of the COVID-19 pandemic in 2020 triggered a deep integration of the internet, big data, and artificial intelligence into financial markets. This integration has transformed how futures are traded and settled and how information about them is disseminated, and it has noticeably increased instability and uncertainty in the gold futures market. Understanding the patterns of gold futures price volatility under these new conditions is essential for giving all participants in the gold futures market early warning of “black swan” risks and helping them guard against such risks.
   This paper makes two marginal contributions. On the modeling side, it uses an EN-LSTM combination to predict the high-frequency volatility of gold futures from characteristics extracted from high-frequency data, and shows that the integrated model predicts significantly better than the LSTM model alone. On the empirical side, it achieves real-time out-of-sample forecasting of high-frequency data with a dynamically contracting rolling time window, which makes financial time series forecasting more practical.
   The paper integrates an Elastic Net (EN) with a Long Short-Term Memory (LSTM) network, referred to as EN-LSTM, to predict high-frequency volatility in gold futures. Drawing on the existing practice of combining LASSO with LSTM, it adds a penalty term to the traditional linear regression model and modifies the LASSO penalty to form the EN model. The EN model is then used for variable shrinkage, reducing overfitting through variable selection and regularization. The result is a novel integrated prediction model, EN-LSTM.
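   For reference, one common parameterization of the elastic net objective is written out below. The abstract does not state the exact form used in the paper, so the symbols here (response y, design matrix X, coefficients β, sample size n, penalty strength λ, and mixing weight α) are assumptions for illustration.

```latex
\hat{\beta} = \arg\min_{\beta}
  \left\{ \frac{1}{2n} \lVert y - X\beta \rVert_2^2
  + \lambda \left[ \alpha \lVert \beta \rVert_1
  + \frac{1-\alpha}{2} \lVert \beta \rVert_2^2 \right] \right\},
  \qquad \lambda \ge 0,\ 0 \le \alpha \le 1
```

   Setting α = 1 recovers the LASSO penalty and α = 0 gives ridge regression. For intermediate values the L1 term can set some coefficients exactly to zero, which is what makes the model usable for variable selection.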
   The sample is the standard continuous main contract of gold futures on the Shanghai Futures Exchange, covering January 2, 2019, to December 31, 2020. The raw high-frequency data come from the Tonghuashun database. The paper first uses the EN model to shrink the 20-dimensional input variables, then feeds the selected variables into the LSTM prediction model for training, which outputs the high-frequency returns of Shanghai gold futures. The absolute value of the differenced returns serves as a proxy for short-term volatility changes in Shanghai gold futures.
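   The volatility proxy can be sketched in a few lines of Python. This is a minimal illustration under two assumptions that the abstract does not confirm: returns are taken as log returns, and the proxy is read as |r_t - r_{t-1}|. The function names and the price values are hypothetical.

```python
import math


def log_returns(prices):
    """Log return between consecutive prices: r_t = ln(P_t / P_{t-1})."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def abs_return_difference(returns):
    """Absolute first difference of returns, |r_t - r_{t-1}|.

    Assumption: this is one reading of the proxy described in the abstract.
    """
    return [abs(curr - prev) for prev, curr in zip(returns, returns[1:])]


# Hypothetical 1-minute closing prices, for illustration only (not data from the paper).
prices = [380.0, 380.4, 380.2, 381.0, 380.9]
returns = log_returns(prices)
proxy = abs_return_difference(returns)
```

   Each step shortens the series by one observation, so five prices yield four returns and three proxy values.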
   The empirical results support three conclusions. First, with respect to data frequency, prediction accuracy is higher with high-frequency data. Second, with respect to training time steps, prediction performance is best with 15 time steps. Third, a comparison of returns before and after the onset of the pandemic confirms that the EN-LSTM prediction model accurately captures the changes brought about by the pandemic, reflecting the micro-level effects of macro-environmental shifts in a timely manner.
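   The role of the training time steps and the rolling out-of-sample scheme can be sketched as follows. This is a simplified illustration, not the paper's implementation: the window length is fixed rather than dynamically adjusted, the function names are hypothetical, and a plain average of the most recent steps stands in for the trained EN-LSTM model.

```python
def make_windows(series, time_steps=15):
    """Split a series into (inputs, targets) pairs.

    Each input holds `time_steps` consecutive values and the target is
    the value that immediately follows them.
    """
    inputs, targets = [], []
    for end in range(time_steps, len(series)):
        inputs.append(series[end - time_steps:end])
        targets.append(series[end])
    return inputs, targets


def rolling_forecast(series, window_size, fit_predict):
    """One-step-ahead out-of-sample forecasts with a fixed-length rolling window.

    At each time t only the previous `window_size` observations are passed
    to `fit_predict`, so no future information leaks into the forecast.
    """
    forecasts = []
    for t in range(window_size, len(series)):
        history = series[t - window_size:t]
        forecasts.append(fit_predict(history))
    return forecasts


def last_steps_mean(history, time_steps=15):
    """Placeholder predictor (assumption): mean of the last `time_steps` values."""
    recent = history[-time_steps:]
    return sum(recent) / len(recent)


# Synthetic series, for illustration only.
series = [float(i % 7) for i in range(60)]
X, y = make_windows(series, time_steps=15)
preds = rolling_forecast(series, window_size=30, fit_predict=last_steps_mean)
```

   In a real pipeline `fit_predict` would refit or update the model on each window, which is far more expensive than the placeholder used here.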
   Further research is needed to determine whether the EN model is applicable to the analysis and prediction of daily, monthly, and other data. The real-time dynamic contraction of the rolling time window also requires further development.

Key words: high-frequency data, machine learning, EN-LSTM model, gold futures

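Abstract (translated from Chinese): Taking 1-minute high-frequency trading data for Shanghai gold futures from 2019 to 2020 as the sample, this paper selects the EN-LSTM model, which combines variable selection with long short-term memory, and uses rolling-time-window out-of-sample forecasting to compare how well short-term volatility prediction models at different data frequencies characterize and predict volatility. The empirical results show that EN-LSTM can fit the volatility characteristics of high-frequency trading in Shanghai gold futures. Data frequency significantly affects the prediction of short-term volatility in Shanghai gold futures, and prediction accuracy at the 1-minute frequency is markedly higher than at lower frequencies. The conclusions can help participants in the gold futures market diversify and mitigate financial risk.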

Key words (translated from Chinese): high-frequency data, machine learning, EN-LSTM, gold futures

CLC Number: