Monte Carlo in securities forecasting
Looking for the Spanish version? Follow this link.
Have you ever tried to go into the investment world? If so, you have probably seen articles or comments from experts explaining why you should invest in this or that stock. Or, on some investment platform, you may have seen an illustrative chart showing how your savings would behave if you invested them today, with a curve growing over time. Here we give a brief introduction to how this forecasting works, the challenges it faces in being accurate, and an alternative method that, for its simplicity compared to more traditional approaches, is worth analyzing.
What subjects will we discuss?
- Trying to estimate the performance of securities
- Monte Carlo simulation as an alternative
- Simplifying modeling with asset classes
- Refining the estimates with correlations
Trying to estimate the performance of securities
Photo by Adam Śmigielski on Unsplash
Given all the variables that influence it, the financial market is complex and dynamic by nature: environmental variables, such as macroeconomic factors, international events, political changes, or government regulations, may impact it. As part of this ecosystem, securities are exposed to these impacts as well as to their own, and every change in them in turn affects the market. For instance, due to company decisions or quality changes to a product or service, the stock value of a particular company may change drastically in a short period, which could also impact the stock value of companies in the same industry.
Due to the dynamism and exposure to multiple factors, forecasting the performance of securities is a non-trivial problem that does not have a universal solution that brings accurate results for every use case. Nevertheless, having good predictions helps individuals, investors, and organizations make better financial decisions. This is why, even with all the challenges this problem faces, we have a large number of methods that try to simulate the performance and fluctuations of the financial ecosystem, or at least part of it, with different levels of complexity and accuracy in the results.
These methods can be categorized as follows:
- Fundamental analysis: Review of specific economic factors that may influence the value of a particular security (like financial health or growth potential). These methods require a comprehensive analysis of the data of each security, limiting the reuse and versatility of the results.
- Qualitative Technical Analysis: Predict future values based on historical data of a particular security, mostly based on a visual analysis of the chart of that data. This approach assumes that observed historical patterns will also be observed in the future.
- Quantitative Technical Analysis: Predict future values based on historical data of a particular security, using traditional statistical models built on time-series analysis, including trends and seasonality. Since the models are built using historical data, some observed patterns will carry over into the predictions, although this can be minimized by using larger volumes of data.
- Machine learning: Training machine learning algorithms using historical data. If data from a particular security is used, the results will most likely be accurate for that specific security or similar ones. If the algorithm is trained with multiple securities instead, significant data preprocessing is required to avoid correlation cancellation or other effects that limit the generalization of the results. Also, since these are black-box algorithms, explaining the reason behind the results may be impossible.
All these approaches, although they are capable of generating results that fit some specific behaviors, significantly depend on historical data for each of the securities they want to simulate. This dependency makes it hard to generate accurate predictions on securities with limited data (like new stocks or growing markets), and also assumes that all observed patterns will happen in the future, ignoring sudden spikes in volatility, economic changes, or black swan events.
Monte Carlo simulation as an alternative
Photo by Kaja Sariwating on Unsplash
The Monte Carlo simulation, or Monte Carlo algorithm, is a method that uses random simulations to estimate possible outcomes of a phenomenon. The basic idea is simple: generate a collection of simulations for a particular event of the phenomenon (for instance, the return of a security on a particular day), each with random variations, and then use them to estimate the most probable outcome for this event. The more simulations are performed, the better the estimate, given that the standard error of the estimate shrinks on the order of $1/\sqrt{N}$, where $N$ is the number of simulations.
The simulations performed for the Monte Carlo algorithm are not pure chaos: they must follow a probability distribution that has been calculated beforehand, usually by analyzing historical data of the phenomenon that we want to simulate. For securities, the Geometric Brownian Motion (GBM) model is commonly accepted for modeling stocks and other securities in the financial market, given that its results follow a log-normal distribution (so values cannot fall below zero) and that it takes key market parameters into account by combining:
- Drift (average return): The general trend, upward or downward.
- Volatility (standard deviation): How much the values fluctuate.
- Random shocks (normal-distributed noise): Simulate unpredictable market factors.
Nevertheless, it is important to note that the GBM model is not perfect. It is mainly criticized for assuming a strict log-normal distribution in the results, and for not taking into account fat tails (extreme returns, much higher or much lower, that occur more often than a normal distribution predicts) or jumps (sudden, unpredictable, and often significant changes in returns). Even with these limitations, it is a flexible model that generates a reasonable simulation of the financial market, and we will use it to illustrate how to use the Monte Carlo algorithm.
Executing a single simulation
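Using the formula of the GBM model, we can perform a single simulation to get the gross return for a given period of time, with:
$$R_{\Delta t} = \exp\left(\left(\mu - \frac{\sigma^2}{2}\right)\Delta t + \sigma\sqrt{\Delta t}\,Z\right)$$
Where:
- $R_{\Delta t}$: Gross return for a time period $\Delta t$. We can get the percentage return with $R_{\Delta t} - 1$.
- $\mu$: Average return (drift).
- $\sigma$: Standard deviation (volatility).
- $Z$: Random number drawn from a standard normal distribution (random shock).
We can simplify the formula by considering $\Delta t = 1$ (one day), to get a single daily gross return $R$:
$$R = \exp\left(\mu - \frac{\sigma^2}{2} + \sigma Z\right)$$
Then, to calculate the compounded gross return $R_T$ for $T$ future days, we use a product of multiple daily gross returns:
$$R_T = \prod_{t=1}^{T} \exp\left(\mu - \frac{\sigma^2}{2} + \sigma Z_t\right)$$
As we can see, the only variations between the factors of the product are the random shocks $Z_t$, each of which is unique and represents the unpredictable changes in the market.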
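As a minimal sketch of a single simulation, assuming the standard GBM daily gross return $\exp(\mu - \sigma^2/2 + \sigma Z_t)$, the compounded gross return could be computed with NumPy as follows. The function name and the parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np

def simulate_compounded_return(mu, sigma, days, rng):
    """Compounded gross return of one simulated GBM path over `days` days."""
    # One independent standard-normal shock per simulated day.
    shocks = rng.standard_normal(days)
    # Daily gross returns: exp(mu - sigma^2 / 2 + sigma * Z_t).
    daily_gross = np.exp(mu - 0.5 * sigma**2 + sigma * shocks)
    # Compounding is the product of the daily gross returns.
    return float(np.prod(daily_gross))

rng = np.random.default_rng(seed=42)
# Hypothetical daily drift and volatility, not estimated from real data.
gross = simulate_compounded_return(mu=0.0004, sigma=0.01, days=252, rng=rng)
print(f"Gross return: {gross:.4f} (percentage return: {(gross - 1) * 100:.2f}%)")
```

Each call produces a different path, which is exactly why a single simulation tells us little on its own.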
Using multiple simulations
Now that we know how to run a single simulation to obtain the daily gross return for a specific day in the future, we complete the Monte Carlo algorithm by running multiple simulations for that same day until we have enough data to estimate a result. Remember that the standard error of Monte Carlo decreases as $1/\sqrt{N}$, where $N$ is the number of simulations.
Figuring out the optimal number of simulations required to estimate a reasonable result strongly depends on the problem we are dealing with. For the forecasting of securities, there is no clear consensus: some propose that 10,000 or more simulations are required to get accurate results, while others estimate that the results converge beyond 1,000 simulations.
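To get a feel for how the number of simulations affects precision, the following sketch estimates the standard error of the mean daily gross return for several values of $N$. It assumes the standard GBM daily return and hypothetical parameter values; each tenfold increase in $N$ should shrink the standard error by roughly $\sqrt{10}$.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
mu, sigma = 0.0004, 0.01  # hypothetical daily drift and volatility

std_errors = {}
for n_sims in (100, 1_000, 10_000, 100_000):
    shocks = rng.standard_normal(n_sims)
    daily_gross = np.exp(mu - 0.5 * sigma**2 + sigma * shocks)
    # Standard error of the Monte Carlo mean: sample std / sqrt(N).
    std_errors[n_sims] = daily_gross.std(ddof=1) / np.sqrt(n_sims)
    print(f"N={n_sims:>7}: mean={daily_gross.mean():.6f}, "
          f"std error={std_errors[n_sims]:.2e}")
```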
Regardless of the number of simulations we define for our particular case of interest, the process continues in the same way:
- We can estimate an approximation of gross return by calculating the median of the compounded gross returns we have calculated.
- We can estimate the worst and best cases by calculating the percentiles that we deem appropriate. For instance, we could use the 10th and 90th percentiles, respectively, ensuring that these percentiles exclude outliers from the results.
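The two steps above can be sketched as follows, assuming the standard GBM daily return and hypothetical parameter values. The function name `monte_carlo_returns` is illustrative, and the 10th and 90th percentiles are just the example thresholds mentioned above.

```python
import numpy as np

def monte_carlo_returns(mu, sigma, days, n_sims, rng):
    """Compounded gross returns of `n_sims` independent GBM paths."""
    # One row per simulation, one column per day.
    shocks = rng.standard_normal((n_sims, days))
    daily_gross = np.exp(mu - 0.5 * sigma**2 + sigma * shocks)
    # Compound along the day axis to get one gross return per simulation.
    return daily_gross.prod(axis=1)

rng = np.random.default_rng(seed=7)
returns = monte_carlo_returns(mu=0.0004, sigma=0.01, days=252, n_sims=10_000, rng=rng)

median = np.median(returns)                      # central estimate
worst, best = np.percentile(returns, [10, 90])   # pessimistic / optimistic cases
print(f"median={median:.4f}, worst (p10)={worst:.4f}, best (p90)={best:.4f}")
```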
Simplifying modeling with asset classes
If we use the Monte Carlo simulation to estimate values from the distribution of each security independently, the method feeds on historical data just like the quantitative technical analysis methods mentioned earlier, and therefore shares similar limitations, although the GBM model lets it account for the unpredictability of the market. Nevertheless, it is possible to adapt this method to calculate values for all securities in a market using only data from broad asset classes, the number of which is usually small and whose definitions tend to be stable.
Asset classes are commonly used to group securities, but if we choose a finite collection of asset classes, we could decompose all securities of a market based on how much they are exposed to each asset class in that collection (i.e., what percentage of the instrument is categorized in one asset class or another). Then, instead of simulating each security independently, we can represent the gross return $R_s$ of a security as a weighted sum of the gross returns $R_i$ of each asset class $i$:
$$R_s = \sum_{i=1}^{n} w_i R_i$$
Where:
- $n$: The number of broad asset classes that we are using.
- $w_i$: The weight (proportion) of each asset class $i$ in this particular security. Note that the weights must comply with $\sum_{i=1}^{n} w_i = 1$.
This approach greatly decreases the number of distributions that we need to use Monte Carlo for the universe of securities in a specific market, and could even give us better results, given that:
- The asset classes have more stable statistical properties than individual securities.
- All securities can be described as a weighted sum of a finite number of asset classes.
- The approach reduces computational complexity and allows for better risk modeling.
Now, it is possible to use the Monte Carlo algorithm just to calculate the gross returns of each broad asset class, and then reuse these simulations in the weighted sum that we just described. Then, if we replace the gross return of each asset class with its compounded gross return for $T$ future days in the weighted sum formula, we get:
$$R_{s,T} = \sum_{i=1}^{n} w_i \prod_{t=1}^{T} \exp\left(\mu_i - \frac{\sigma_i^2}{2} + \sigma_i Z_{i,t}\right)$$
Where:
- $T$: Number of days in the future that we are simulating.
- $n$: Number of broad asset classes that we are using.
- $w_i$: Weight (proportion) of the asset class $i$ in this particular security.
- $\mu_i$: Average return (drift) of the asset class $i$.
- $\sigma_i$: Standard deviation (volatility) of the asset class $i$.
- $Z_{i,t}$: Random number drawn from a standard normal distribution (random shock), unique for the asset class $i$ on the day $t$.
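A minimal sketch of this weighted-sum formula, assuming the standard GBM daily return for every asset class and independent shocks between classes, could look like this. The function name, the two asset classes, and all parameter values are hypothetical.

```python
import numpy as np

def simulate_security_returns(weights, mus, sigmas, days, n_sims, rng):
    """Compounded gross returns of a security described by asset-class weights."""
    weights = np.asarray(weights, dtype=float)
    mus = np.asarray(mus, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    if not np.isclose(weights.sum(), 1.0):
        raise ValueError("asset-class weights must sum to 1")
    # Independent shocks with shape (simulation, day, asset class).
    shocks = rng.standard_normal((n_sims, days, weights.size))
    daily_gross = np.exp(mus - 0.5 * sigmas**2 + sigmas * shocks)
    # Compound over days: one gross return per simulation and asset class.
    class_returns = daily_gross.prod(axis=1)
    # Weighted sum over asset classes: one gross return per simulation.
    return class_returns @ weights

rng = np.random.default_rng(seed=3)
# Hypothetical security: 70% stocks, 30% bonds, with made-up daily parameters.
security = simulate_security_returns(
    weights=[0.7, 0.3], mus=[0.0005, 0.0001], sigmas=[0.012, 0.003],
    days=252, n_sims=2_000, rng=rng,
)
print(f"median gross return: {np.median(security):.4f}")
```

Because the asset-class simulations do not depend on the weights, `class_returns` could be computed once and reused for every security in the market.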
Refining the estimates with correlations
So far, our method looks pretty good: It allows us to generate predictions for the universe of securities in a financial market using broad asset classes, while enabling us to simulate risk and unpredictable variables in the market without major complications.
But if we only generate returns that vary randomly for each asset class independently, we would be ignoring the historical relations that can be observed between them. In the reality of the financial markets, the asset classes do not change randomly: Some tend to go upward and downward together (high correlation), while others change in opposite directions (negative correlation).
To model these phenomena correctly, we must use a correlation matrix that describes how the asset classes are related to each other, and use the Cholesky decomposition to transform our uncorrelated random shocks into correlated ones.
Cholesky decomposition
Most tools that allow us to manipulate financial data have a built-in way to apply the Cholesky decomposition without effort, but if you are not familiar with the technique and want to understand how to apply it, here is a simple example to help you.
If we only have two broad asset classes, let's say stocks and bonds, and they are historically correlated by 60%, then this would be our correlation matrix $C$:
$$C = \begin{pmatrix} 1 & 0.6 \\ 0.6 & 1 \end{pmatrix}$$
Then, we use the Cholesky decomposition to get a lower triangular matrix $L$ (also known as the Cholesky matrix), such that the equality $C = LL^{T}$ holds. For our example, the matrix $L$ would be:
$$L = \begin{pmatrix} 1 & 0 \\ 0.6 & 0.8 \end{pmatrix}$$
Having our Cholesky matrix, we generate $n$ random shocks for a particular day (where $n$ is the number of broad asset classes that we are using, 2 in this example) in an $n \times 1$ matrix $Z$, as follows:
$$Z = \begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix}$$
Now, if we calculate the matrix product between our Cholesky matrix $L$ and our random shocks $Z$, we get the correlated shocks $\tilde{Z}$ for a particular day:
$$\tilde{Z} = LZ = \begin{pmatrix} Z_1 \\ 0.6\,Z_1 + 0.8\,Z_2 \end{pmatrix}$$
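The two-asset example can be checked numerically with NumPy's `np.linalg.cholesky`, which returns the lower triangular factor. In this sketch, the sample size is an arbitrary choice, used only to verify that the transformed shocks show a correlation close to 0.6.

```python
import numpy as np

# Correlation matrix for the stocks/bonds example (60% correlation).
corr = np.array([[1.0, 0.6],
                 [0.6, 1.0]])

# Lower triangular Cholesky factor, so that chol @ chol.T equals corr.
chol = np.linalg.cholesky(corr)
print(chol)

rng = np.random.default_rng(seed=1)
# Each column holds the two uncorrelated shocks of one simulated day.
shocks = rng.standard_normal((2, 100_000))
# Matrix product with the Cholesky factor yields correlated shocks.
correlated = chol @ shocks

empirical = np.corrcoef(correlated)[0, 1]
print(f"empirical correlation: {empirical:.3f}")
```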
Updating the formula with the correlations
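Now that we understand how to apply correlations to the unpredictable market changes on a specific day, we can extend their use to the compounded gross return calculation for $T$ future days, where for each day $t$ we will have to:
- Generate $n$ random shocks in an $n \times 1$ matrix $Z_t$, where $n$ is the number of broad asset classes that we are using.
- Calculate the correlated shocks with $\tilde{Z}_t = L Z_t$, where $L$ is the Cholesky matrix of the correlation matrix.
If we replace the random shocks $Z_{i,t}$ that we have been using so far with the correlated shocks $\tilde{Z}_{i,t}$, we get:
$$R_{s,T} = \sum_{i=1}^{n} w_i \prod_{t=1}^{T} \exp\left(\mu_i - \frac{\sigma_i^2}{2} + \sigma_i \tilde{Z}_{i,t}\right)$$
Where $\tilde{Z}_{i,t}$ is the correlated shock for the asset class $i$ on the day $t$.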
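Putting the pieces together, the following is a minimal sketch of the full method under the assumptions used so far: standard GBM daily returns per asset class, and shocks correlated through the Cholesky factor of the correlation matrix. The function name and all parameter values are hypothetical, not estimates from real data.

```python
import numpy as np

def simulate_correlated_security(weights, mus, sigmas, corr, days, n_sims, rng):
    """Compounded gross returns of a security using correlated asset-class shocks."""
    weights, mus, sigmas = (np.asarray(a, dtype=float) for a in (weights, mus, sigmas))
    # Lower triangular Cholesky factor of the correlation matrix.
    chol = np.linalg.cholesky(np.asarray(corr, dtype=float))
    # Uncorrelated shocks with shape (simulation, day, asset class).
    shocks = rng.standard_normal((n_sims, days, weights.size))
    # Row-vector form of L @ Z, applied to every (simulation, day) pair.
    correlated = shocks @ chol.T
    daily_gross = np.exp(mus - 0.5 * sigmas**2 + sigmas * correlated)
    # Compound over days, then take the weighted sum over asset classes.
    return daily_gross.prod(axis=1) @ weights

rng = np.random.default_rng(seed=5)
corr = [[1.0, 0.6], [0.6, 1.0]]  # stocks/bonds example from above
results = simulate_correlated_security(
    weights=[0.7, 0.3], mus=[0.0005, 0.0001], sigmas=[0.012, 0.003],
    corr=corr, days=252, n_sims=2_000, rng=rng,
)
worst, median, best = np.percentile(results, [10, 50, 90])
print(f"p10={worst:.4f}, median={median:.4f}, p90={best:.4f}")
```

With positive weights, a higher correlation between asset classes widens the spread of outcomes, since the classes tend to rise and fall together instead of offsetting each other.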
Conclusions
The Monte Carlo algorithm is a simulation technique that draws from parameters estimated using historical data (similar to quantitative approaches), allowing for the modeling of key aspects of financial markets. In addition, we found that it is possible to overcome the limitations of needing historical data for each security by using correlated broad asset classes to describe them, also simplifying the calculations by requiring a finite number of statistical models to simulate the entire universe of securities in a financial market.
In a future article, we will present a simple implementation of the method described here, using Python and some specific libraries. Also, it would be interesting to evaluate simulations using machine learning algorithms, considering the same adaptations discussed in this article.