How to sample a conditional multivariate random variable?

There’s a multivariate random variable, the prices of several assets 5 years from now:

X = [Gold, Silver, SP500]

There are historical prices for X available for the last 50 years. It’s possible to fit the historical prices to get a multivariate probability distribution of the future prices:

P(X)

How do I fit the multivariate conditional probability distribution instead? It should give a better prediction, since (let’s suppose it is so) the current prices have predictive power for the future prices.

P(X|CurrentX)

I don’t need the distribution itself, just the ability to sample X given CurrentX. If it helps, the individual prices have a Pareto distribution.

Use Case

Below is the time series of Gold prices, normalised in some way. We can bin those prices into 5 bins marked with different colors (yellow when gold prices are highest).

[Figure: normalised gold price time series, with points coloured by bin]

As we can see, today’s gold price (grey line) is in the second-lowest bin. In other words, 75% of the time over the last 90 years the gold price was higher than it is now.

We want to know what the price of gold will be 5 years from now. We can sample it from the past data: take all the points in bin 2 (all the blue dots, there are 5 such regions) and, for every such point, look up the value 5 years later. This gives a set of future gold prices $\{g_i\}$, which we can then aggregate into a histogram of the gold price 5 years ahead.
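
The procedure above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the post: the bins are taken at equally spaced quantiles, the series is a synthetic random walk standing in for the real gold data, and the names `bin_index` and `conditional_future_samples` are made up for the example.

```python
import random

def bin_index(value, edges):
    """Index of the bin that `value` falls into, given sorted inner edges."""
    return sum(value >= e for e in edges)

def conditional_future_samples(prices, horizon, n_bins=5):
    """Collect the prices `horizon` steps after every past point in today's bin."""
    ordered = sorted(prices)
    # Inner bin edges at equally spaced quantiles (an assumption).
    edges = [ordered[len(ordered) * k // n_bins] for k in range(1, n_bins)]
    today_bin = bin_index(prices[-1], edges)
    return [
        prices[t + horizon]
        for t in range(len(prices) - horizon)
        if bin_index(prices[t], edges) == today_bin
    ]

# Synthetic stand-in for the normalised gold series (hypothetical data):
# 90 years of monthly points following a random walk.
random.seed(0)
gold = [1.0]
for _ in range(1079):
    gold.append(gold[-1] * (1 + random.gauss(0.0, 0.05)))

futures = conditional_future_samples(gold, horizon=60)  # 60 months = 5 years
# Drawing one element of `futures` at random is a sample from the
# empirical conditional distribution (if any past point matched).
draw = random.choice(futures) if futures else None
```

Sampling from `futures` with `random.choice` is the "ability to sample" without ever writing the distribution down; the histogram is just a picture of the same list.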

Now we can almost feel the money in our pocket, but we can’t just buy gold now: while the general trend for gold seems to be up, there’s a chance it goes down, ruining our investment.

So we need to play it safe and consider also buying, let’s say, some silver. We can do all the same calculations for silver and get its future prices $\{s_i\}$.

But there’s a problem: this way we sample future prices for gold $\{g_i\}$ and for silver $\{s_i\}$ separately. What we want is to sample both at the same time, as pairs $\{(g_i, s_i)\}$, in a way that respects the correlation between them.
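
One way to get paired samples from the historical data is to pick past dates once and read off all assets at the same date, so each tuple is something that actually happened together. The sketch below is an assumption-laden illustration, not an established method from this thread: it conditions on every asset being in the same quantile bin as today, and the function names are hypothetical.

```python
def quantile_edges(series, n_bins):
    """Inner bin edges at equally spaced quantiles of `series`."""
    ordered = sorted(series)
    return [ordered[len(ordered) * k // n_bins] for k in range(1, n_bins)]

def bin_index(value, edges):
    """Index of the bin that `value` falls into, given sorted inner edges."""
    return sum(value >= e for e in edges)

def joint_future_samples(series_by_asset, horizon, n_bins=5):
    """Return tuples of future values, one tuple per matching past date.

    `series_by_asset` is a list of equally long price lists aligned in time.
    A past date matches when every asset is in the same bin as it is today.
    """
    edges = [quantile_edges(s, n_bins) for s in series_by_asset]
    today = [bin_index(s[-1], e) for s, e in zip(series_by_asset, edges)]
    n = len(series_by_asset[0])
    samples = []
    for t in range(n - horizon):
        state = [bin_index(s[t], e) for s, e in zip(series_by_asset, edges)]
        if state == today:
            # Using the same index t for every asset keeps the
            # historical co-movement inside each tuple.
            samples.append(tuple(s[t + horizon] for s in series_by_asset))
    return samples
```

Note that the number of matching dates shrinks quickly as assets are added, because the joint state has `n_bins` to the power of the number of assets possible values; with three assets and 5 bins that is already 125 cells.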

This doesn’t seem to be related to Julia specifically; it is a general problem. There are two simple ways to do this, and probably an infinite number of hard ways.

First, you could sample from the historical data using a block bootstrap. From what you’ve said so far, in your case, I’d probably do this. See https://github.com/colintbowers/DependentBootstrap.jl
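
To make the block bootstrap idea concrete, here is a minimal moving-block sketch in plain Python. It is not the API of the linked package; `moving_block_bootstrap` and its parameters are names invented for this illustration, and the block length is something you would have to choose for your data.

```python
import random

def moving_block_bootstrap(rows, block_len, n_out, rng=random):
    """Resample `n_out` rows by concatenating randomly chosen blocks
    of `block_len` consecutive rows.

    Each row holds all assets at one date, e.g. (gold, silver, sp500)
    returns, so cross-asset dependence is kept within a row and
    short-range time dependence is kept within a block.
    """
    if not 1 <= block_len <= len(rows):
        raise ValueError("block_len must be between 1 and len(rows)")
    out = []
    while len(out) < n_out:
        start = rng.randrange(len(rows) - block_len + 1)
        out.extend(rows[start:start + block_len])
    return out[:n_out]
```

For example, with monthly return rows, `moving_block_bootstrap(rows, block_len=12, n_out=60)` would give one simulated 5-year path; repeating the call gives a set of paths to aggregate.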

Second, you could fit a statistical model to all the data that you do have and then use it to forecast/simulate. This sort of seems to be what you are asking about. Suppose you say that returns for gold, silver, stocks, etc follow a Multivariate Normal distribution. You can estimate the parameters and then just draw random vectors from that distribution. See https://github.com/JuliaStats/Distributions.jl

A simple example of sampling correlated normally distributed R.V.s: https://github.com/mcreel/Econometrics/blob/master/Examples/GLS/cholesky.jl
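
The idea behind the Cholesky approach, sketched here in standard-library Python rather than copied from the linked file: factor the covariance matrix as L·Lᵀ and map independent standard normals z to mean + L·z. The mean vector and covariance matrix in the usage lines are hypothetical numbers, not estimates from real data.

```python
import math
import random

def cholesky(cov):
    """Lower-triangular L with L * L^T == cov, for a symmetric
    positive-definite matrix given as a list of lists."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L

def sample_mvn(mean, cov, rng=random):
    """Draw one vector from N(mean, cov) as mean + L z, z standard normal."""
    L = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(L[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mean)]

# Hypothetical 5-year return parameters for [Gold, Silver, SP500].
mean = [0.10, 0.08, 0.30]
cov = [[0.04, 0.03, 0.00],
       [0.03, 0.09, 0.01],
       [0.00, 0.01, 0.06]]
draw = sample_mvn(mean, cov)
```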

There are many other possibilities, depending on exactly what you need.

Thanks for the advice!

@mcreel Cholesky seems to be close to what I’m looking for, will check it out.

@tbeason About fitting a multivariate statistical model: maybe I’m missing something. The fitted model would provide the multivariate probability distribution:

P(X) where X = [Gold, Silver, SP500]

But I need a conditional multivariate probability distribution

P(X|CurrentX) where X = [Gold, Silver, SP500]

Can I do that with a statistical model?

P.S.

They say returns follow a Pareto distribution with $\alpha \approx 2$, not a Normal one.

I never said that it was a good assumption. I just offered that as an example. If you say you need a conditional distribution, then obviously you need some sort of statistical model. That is well beyond drawing correlated random variables though.
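
For reference, the one case where the conditional distribution comes for free is the jointly Gaussian one. If the stacked vector of future values $X$ and current values $C$ is assumed to be multivariate normal (the same example assumption as above, with all the caveats about heavy tails), then conditioning on $C = c$ gives another normal distribution. The block notation $\mu_X, \mu_C, \Sigma_{XX}, \Sigma_{XC}, \Sigma_{CX}, \Sigma_{CC}$ is introduced here only for this illustration:

```latex
\begin{aligned}
\begin{pmatrix} X \\ C \end{pmatrix}
  &\sim \mathcal{N}\!\left(
    \begin{pmatrix} \mu_X \\ \mu_C \end{pmatrix},
    \begin{pmatrix} \Sigma_{XX} & \Sigma_{XC} \\ \Sigma_{CX} & \Sigma_{CC} \end{pmatrix}
  \right) \\
X \mid C = c
  &\sim \mathcal{N}\!\left(
    \mu_X + \Sigma_{XC}\,\Sigma_{CC}^{-1}(c - \mu_C),\;
    \Sigma_{XX} - \Sigma_{XC}\,\Sigma_{CC}^{-1}\,\Sigma_{CX}
  \right)
\end{aligned}
```

Sampling X given CurrentX is then an ordinary multivariate normal draw with the conditional mean and covariance. Outside the Gaussian case no such closed form should be expected, which is where an explicit statistical model comes in.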
