---
metrics:
- mae
---

# XGBoost Model for Pricing European Options

The paper where this model was used is available on arXiv: https://arxiv.org/abs/2307.00476.

The GitHub repository with the source code for this project is available here: https://github.com/juan-esteban-berger/Options_Pricing_AutoML_TensorFlow_XGBoost

## Model Architecture

The XGBoost model uses gradient-boosted decision trees with a maximum depth of 10. It is trained to learn complex patterns from historical data on an option's underlying asset. Unlike the Black-Scholes model, it does not rely on implied volatility as its key input; instead, it learns the necessary relationships from the other feature variables and the past 20 lags of the underlying security's closing prices (a minimal sketch of this lag-feature setup appears at the end of this README).

## Installation

```sh
pip install xgboost
```

## Usage

```python
import xgboost as xgb
import pandas as pd
import numpy as np

# Load the trained XGBoost model
model = xgb.Booster()
model.load_model('path-to-your-model-file')

# Define the feature names
feature_names = [
    'strike_price',
    'implied_volatility',
    'zero_coupon_rate',
    'index_dividend_yields',
    'option_type',                    # 1 for call, 0 for put
    'time_to_maturity',
    'underlying_asset_current_price'
] + [f'lag_{i}' for i in range(1, 21)]  # past 20 days' closing prices

# Example data (replace this with actual data)
example_data = [
    100,    # strike_price
    0.2,    # implied_volatility
    0.03,   # zero_coupon_rate
    0.01,   # index_dividend_yields
    1,      # option_type (1 for call)
    30,     # time_to_maturity (in days)
    105     # underlying_asset_current_price
] + list(np.random.random(20))          # dummy data for past 20 days' closing prices

# Convert the example data to DMatrix format
data = pd.DataFrame([example_data], columns=feature_names)
dmatrix_data = xgb.DMatrix(data)

# Make prediction
prediction = model.predict(dmatrix_data)

# Output the prediction
print(f'Predicted option price: {prediction[0]}')
```

## Results

The XGBoost model with a maximum depth of 10 outperformed both TensorFlow and Google's AutoML Regressor in terms of mean absolute error and mean absolute percentage error, and it trained in a fraction of the time required by its closest competitor, Google's AutoML Regressor. Given this performance, it is easy to see why XGBoost is a popular choice among machine learning engineers and data scientists. As large volumes of data and computing resources become widely available, machine learning and deep learning methods become a viable approach to pricing securities.
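
For reference, the two error metrics used in this comparison can be computed as shown in the minimal sketch below. The arrays `y_true` and `y_pred` are illustrative placeholders for observed and predicted option prices, not results from the paper.

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """MAE: average absolute difference between observed and predicted prices."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean(np.abs(y_true - y_pred))

def mean_absolute_percentage_error(y_true, y_pred):
    """MAPE: average absolute error relative to the observed price, in percent."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

# Dummy values (replace with actual held-out option prices and model predictions)
y_true = np.array([10.5, 3.2, 7.8])
y_pred = np.array([10.1, 3.5, 7.6])
print(f'MAE:  {mean_absolute_error(y_true, y_pred):.4f}')
print(f'MAPE: {mean_absolute_percentage_error(y_true, y_pred):.2f}%')
```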
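
Finally, as a complement to the Model Architecture section, the sketch below illustrates how the 20 lagged closing prices could be built and how a gradient-boosted regressor with a maximum depth of 10 could be trained. It is illustrative only: the price series is simulated, the regression target is synthetic, the hyperparameters other than `max_depth` are assumptions, and the full feature set from the Usage section is omitted; the actual training code lives in the GitHub repository linked above.

```python
import numpy as np
import pandas as pd
import xgboost as xgb

# Illustrative only: simulate a closing-price series for the underlying asset
# (replace with real historical data).
rng = np.random.default_rng(0)
closing_prices = pd.Series(100 + rng.normal(0, 1, 500).cumsum())

# Build the past 20 lags of the closing price as feature columns.
lags = pd.concat(
    {f'lag_{i}': closing_prices.shift(i) for i in range(1, 21)}, axis=1
).dropna()

# Synthetic regression target standing in for observed option prices.
option_price = closing_prices.loc[lags.index] * 0.05

# Train a gradient-boosted regressor with a maximum tree depth of 10.
dtrain = xgb.DMatrix(lags, label=option_price)
params = {
    'objective': 'reg:squarederror',  # regression on option prices
    'max_depth': 10,                  # maximum tree depth used by this model
    'eta': 0.1,                       # learning rate (illustrative value)
}
booster = xgb.train(params, dtrain, num_boost_round=200)
booster.save_model('xgboost_options_model.json')
```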