[ANN] PortfolioOptimisers.jl: Ape together strong

PortfolioOptimisers.jl

Editorial note: If you read this post and then go to the docs’ homepage or the package’s readme and they seem familiar, it’s because I started writing this before I had a proper readme.

Portfolio optimisation is the science of reducing investment risk by being clever about how you distribute your money. Ironically, some of the most robust ways to minimise risk are to distribute your money equally among a portfolio of proven assets. There exists, however, a rather large number of methods, risk measures, constraints, prior statistics estimators, etc., which give a huge number of combinations.

PortfolioOptimisers.jl is an attempt to provide as many of these as possible, and to make it easy to add more by leveraging Julia’s type system.

The feature list is quite large and under active development. New features will be added over time. Check out the examples and API documentation for details.

Please feel free to file issues and/or start discussions if you have problems using the library, or if I haven’t got round to writing docs/examples for something you need. That way I know what to prioritise.

Caveats

Documentation

  • Mathematical formalism: I’ve got API documentation for a lot of features, but the mathematical formalisms aren’t yet thoroughly explained. It’s more of a high level view.
  • Citation needed: I haven’t gone over all the citations for the docs because stabilising the API, adding new features, and writing the API docs has taken priority.
  • Docstring examples: some features require setup steps, and I haven’t had the patience to write those out. The examples are still mostly for doctesting my implementation of Base.show for my types, and showcasing low-hanging functionality.

API

  • Unstable: there will likely be breaking changes as I figure out better, more general ways to do things, or better naming conventions.

Internals

  • Dependencies: some dependencies are only used for small things; I may end up removing them in favour of implementing just the bit of functionality the package needs. I’m very open to replacement suggestions.

Features

The feature list is rather large, so I will attempt to summarise it via interpretative dance as best I can. There are also some experimental features (some tracking risk measures) that I’m not sure would perform well, but they’re interesting nonetheless, especially when used in clustering optimisations. Luckily, those haven’t been documented yet, so I haven’t had to reckon with the consequences of my actions just yet.

Without further ado, here is a summary of the features in this package.

Price data

Every optimisation but the finite allocation works off of returns data. Some optimisations may use price data in the future.

  • Preprocessing to drop highly correlated and/or incomplete data. These are not well integrated yet, but the functions exist.
  • Computing returns, validating and cleaning up data.

Co-moment matrix processing

Price data is often noisy and follows general macroeconomic trends. Every optimisation model is at risk of overfitting the data. In particular, those which rely on summary statistics (moments), such as a covariance matrix, can be overly sensitive to the input data. It is therefore important to have methods that increase the robustness of their estimation.

  • Positive definite projection.
  • Matrix denoising.
    • Spectral, shrunk, fixed.
  • Matrix detoning.
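
To illustrate the first item, here is a minimal, self-contained sketch of positive definite projection via eigenvalue clipping, written in plain Python for a 2x2 matrix (the package operates on full Julia matrices; this is just the underlying idea, and `psd_project_2x2` is a name I made up for the example).

```python
import math

def psd_project_2x2(a, b, c, eps=0.0):
    """Project the symmetric 2x2 matrix [[a, b], [b, c]] onto the
    positive semidefinite cone by clipping negative eigenvalues."""
    # Eigenvalues of a symmetric 2x2 matrix in closed form.
    mean = (a + c) / 2.0
    rad = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    l1, l2 = mean + rad, mean - rad
    # Eigenvector for l1 (handle the diagonal case separately).
    if b == 0:
        v1 = (1.0, 0.0) if a >= c else (0.0, 1.0)
    else:
        n = math.hypot(l1 - c, b)
        v1 = ((l1 - c) / n, b / n)
    v2 = (-v1[1], v1[0])  # orthogonal complement
    l1, l2 = max(l1, eps), max(l2, eps)  # clip negative eigenvalues
    # Reconstruct A = l1*v1*v1' + l2*v2*v2'.
    a2 = l1 * v1[0] ** 2 + l2 * v2[0] ** 2
    b2 = l1 * v1[0] * v1[1] + l2 * v2[0] * v2[1]
    c2 = l1 * v1[1] ** 2 + l2 * v2[1] ** 2
    return a2, b2, c2

# A "correlation-like" matrix with an invalid off-diagonal entry (|1.2| > 1).
print(psd_project_2x2(1.0, 1.2, 1.0))  # → (1.1, 1.1, 1.1)
```

The same clip-and-reconstruct idea generalises to any dimension via a full eigendecomposition.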

Moment estimation

Many of these can be used in conjunction. For example, some covariance estimators use expected returns, or variance estimators in their calculation, and some expected returns use the covariance in turn. Also, some accept weight vectors.

  • Expected returns.
    • Arithmetic expected returns.
    • Shrunk expected returns.
      • James-Stein, Bayes-Stein, Bodnar-Okhrin-Parolya. All of them with Grand Mean, Volatility Weighted, Mean Squared Error targets.
    • Equilibrium expected returns.
    • Excess expected returns.
  • Variance.
  • Covariance/Correlation matrix.
    • Custom: estimator + processing pipeline.
    • Pearson: weighted, unweighted, any StatsBase.CovarianceEstimator.
      • Full.
      • Semi.
    • Gerber.
      • Gerber 0, 1, 2. Standardised and unstandardised.
    • Smyth-Broby.
      • Smyth-Broby 0, 1, 2. Standardised and unstandardised.
      • Smyth-Broby-Gerber 0, 1, 2. Standardised and unstandardised.
    • Distance covariance.
    • Lower tail dependence.
    • Kendall.
    • Spearman.
    • Mutual information.
      • Predefined, Hacine-Gharbi-Ravier, Knuth, Scott, Freedman-Diaconis bin widths.
    • Coskewness.
      • Full.
      • Semi.
    • Cokurtosis.
      • Full.
      • Semi.
    • Implied volatility.
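
As an example of the shrunk expected returns idea with a Grand Mean target, here is a plain-Python sketch. The shrinkage intensity is taken as a fixed input here for simplicity; the James-Stein-style estimators listed above derive it from the data, and `shrink_to_grand_mean` is an illustrative name, not the package's API.

```python
def shrink_to_grand_mean(mu, alpha):
    """Shrink a vector of sample expected returns toward their grand
    mean with intensity alpha in [0, 1] (alpha = 1 -> equal means)."""
    g = sum(mu) / len(mu)  # grand mean target
    return [(1 - alpha) * m + alpha * g for m in mu]

mu = [0.10, 0.02, -0.04]
print(shrink_to_grand_mean(mu, 0.5))
```

Note that shrinking toward the grand mean preserves the cross-sectional average while pulling extreme estimates toward the centre, which is what makes it robust to estimation noise.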

Regression

Factor models and implied volatility use regression in their estimation.

  • Stepwise.
    • Forward and Backward.
      • P-value, Corrected and “vanilla” Akaike info, Bayesian info, R-squared, and Adjusted R-squared criteria.
  • Dimensional reduction.
    • Principal Component.
    • Probabilistic Principal Component.

Ordered weights array and Linear moments

  • Ordered weights arrays.
    • Gini Mean Difference.
    • Conditional Value at Risk.
    • Weighted Conditional Value at Risk.
    • Tail Gini.
    • Worst Realisation.
    • Range.
    • Conditional Value at Risk Range.
    • Weighted Conditional Value at Risk Range.
    • Tail Gini Range.
  • Linear Moments Convex Risk Measure: linear moments can be combined using different minimisation targets.
    • Normalised Constant Relative Risk Aversion.
    • Minimum Squared Distance.
    • Minimum Sum Squares.
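
To make the ordered weights array idea concrete, here is a plain-Python sketch of the standard identity that expresses Conditional Value at Risk as an OWA: weights are placed on the worst observations when losses are sorted from worst to best. The function names are illustrative, not the package's API.

```python
def cvar_owa_weights(T, alpha):
    """OWA weights that reproduce CVaR at level alpha when applied to
    losses sorted from worst to best: mass 1/(alpha*T) on the worst
    floor(alpha*T) observations, with any remainder on the next one."""
    k = int(alpha * T)  # number of fully weighted tail observations
    w = [0.0] * T
    for i in range(k):
        w[i] = 1.0 / (alpha * T)
    if k < T:
        w[k] = 1.0 - k / (alpha * T)  # fractional remainder
    return w

def owa(losses, w):
    """Apply OWA weights to losses sorted from worst (largest) to best."""
    return sum(wi * li for wi, li in zip(w, sorted(losses, reverse=True)))

losses = [5.0, 1.0, 3.0, 2.0]  # larger is worse
w = cvar_owa_weights(len(losses), 0.5)
print(owa(losses, w))  # → 4.0, the mean of the two worst losses
```

The other OWA risk measures above follow the same recipe with different weight vectors (e.g. all weight on the first entry for Worst Realisation).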

Distance

Distance matrices are used for clustering. They are related to correlation matrices, but are non-negative with a zero diagonal.

  • Distance: these compare pairwise relationships.
  • Distance of distances: these are computed by applying a distance metric to every pair of columns/rows of the distance matrix. They compare the entire space and often give more stable clusters.

Individual entries can be raised to an integer power and scaled according to whether that power is even or odd. The following methods can be used to compute distance matrices.

  • Simple.
  • Absolute.
  • Logarithmic.
  • Correlation.
  • Variation of Information.
      • Predefined, Hacine-Gharbi-Ravier, Knuth, Scott, Freedman-Diaconis bin widths.
  • Canonical: depends on the covariance/correlation estimator used.
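Here is a minimal plain-Python sketch of the simple correlation distance and of the "distance of distances" idea described above (function names are illustrative, not the package's API):

```python
import math

def corr_to_dist(C):
    """Map a correlation matrix to the simple correlation distance
    d_ij = sqrt((1 - c_ij) / 2): zero diagonal, non-negative entries."""
    n = len(C)
    return [[math.sqrt(max(0.0, (1 - C[i][j]) / 2)) for j in range(n)]
            for i in range(n)]

def dist_of_dist(D):
    """Distance of distances: Euclidean distance between every pair of
    columns of D, comparing whole columns rather than single entries."""
    n = len(D)
    return [[math.sqrt(sum((D[k][i] - D[k][j]) ** 2 for k in range(n)))
             for j in range(n)] for i in range(n)]

C = [[1.0, 0.8, 0.1],
     [0.8, 1.0, 0.2],
     [0.1, 0.2, 1.0]]
D = corr_to_dist(C)
print(D[0][1])             # sqrt(0.1) ~ 0.316
print(dist_of_dist(D)[0][1])
```

Because the second transformation compares entire columns, two assets are close only if they relate similarly to every other asset, which tends to stabilise the resulting clusters.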

Phylogeny

These define asset relationships. They can be used to set constraints on and/or compute the relatedness of assets in a portfolio.

  • Clustering.
    • Optimal number of clusters:
      • Predefined, Second order difference, Standardised silhouette scores.
    • Hierarchical clustering.
    • Direct Bubble Hierarchy Trees.
      • Local Global sparsification of the inverse covariance/correlation matrix.
  • Phylogeny matrices.
    • Network (Minimum Spanning Tree) adjacency.
    • Clustering adjacency.
  • Centrality vectors and average centrality.
    • Betweenness, Closeness, Degree, Eigenvector, Katz, Pagerank, Radiality, Stress centrality measures.
  • Asset phylogeny score.
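
As a sketch of the network phylogeny matrix idea, here is a plain-Python Minimum Spanning Tree adjacency computed with Prim's algorithm over a small distance matrix (illustrative only; the package works with its own distance and graph machinery):

```python
def mst_adjacency(D):
    """Adjacency matrix of the minimum spanning tree of a distance
    matrix (Prim's algorithm) - one way to build a phylogeny matrix."""
    n = len(D)
    in_tree = [False] * n
    in_tree[0] = True
    A = [[0] * n for _ in range(n)]
    for _ in range(n - 1):
        best = None
        # Find the shortest edge crossing from the tree to the rest.
        for i in range(n):
            if not in_tree[i]:
                continue
            for j in range(n):
                if in_tree[j]:
                    continue
                if best is None or D[i][j] < D[best[0]][best[1]]:
                    best = (i, j)
        i, j = best
        A[i][j] = A[j][i] = 1
        in_tree[j] = True
    return A

D = [[0.0, 0.2, 0.9],
     [0.2, 0.0, 0.4],
     [0.9, 0.4, 0.0]]
print(mst_adjacency(D))  # chain 0-1-2; the long 0-2 edge is dropped
```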

Constraint generation

These let users easily define optimisation constraints, manually or programmatically.

  • Equation parsing.
  • Linear weights.
  • Risk budget.
  • Asset set matrices.
  • Phylogeny.
    • Phylogeny matrix.
      • Semi definite.
      • Mixed integer programming.
    • Centrality.
  • Weight bounds.
  • Buy-in threshold.
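
To give a flavour of what equation parsing into linear weight constraints looks like, here is a deliberately toy plain-Python version (the package's actual parser is far more general; `parse_linear` and its grammar are invented for this sketch):

```python
def parse_linear(eq, assets):
    """Toy equation parsing: turn 'AAPL + 2*MSFT - GOOG <= 0.5' into a
    coefficient vector over `assets` and a right-hand side."""
    lhs, rhs = eq.split("<=")
    coeffs = [0.0] * len(assets)
    # Normalise '-' into '+-' so we can split on '+' alone.
    for term in lhs.replace("-", "+-").split("+"):
        term = term.strip()
        if not term:
            continue
        if "*" in term:
            c, name = term.split("*")
            coeffs[assets.index(name.strip())] += float(c)
        else:
            sign = -1.0 if term.startswith("-") else 1.0
            coeffs[assets.index(term.lstrip("-").strip())] += sign
    return coeffs, float(rhs)

print(parse_linear("AAPL + 2*MSFT - GOOG <= 0.5", ["AAPL", "MSFT", "GOOG"]))
# → ([1.0, 2.0, -1.0], 0.5)
```

The resulting `(coefficients, rhs)` pair is exactly the shape a linear weights constraint needs: `coefficients' * w <= rhs`.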

Prior statistics

As previously mentioned, every optimisation but the finite allocation works off of returns data. These returns can be adjusted and summarised using these estimators. Like the moment estimators, these can be mixed in various ways.

  • Empirical.
  • Factor model.
  • High order moments (coskewness and cokurtosis).
  • Black-Litterman.
    • Vanilla.
    • Bayesian.
    • Factor model.
    • Augmented.
  • Entropy pooling.
  • Opinion pooling.
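
As a sketch of the opinion pooling idea, here is the simplest (linear) pooling rule in plain Python: several "experts" assign probabilities to the same historical scenarios, and the pooled prior is their weighted average. This is illustrative only; the package supports richer pooling schemes.

```python
def linear_opinion_pool(priors, weights):
    """Linear opinion pooling: a weighted average of scenario
    probability vectors from several experts. Weights sum to one, so
    the pooled vector is again a valid probability distribution."""
    T = len(priors[0])
    return [sum(w * p[t] for w, p in zip(weights, priors)) for t in range(T)]

p1 = [0.25, 0.25, 0.25, 0.25]  # uninformative view
p2 = [0.40, 0.30, 0.20, 0.10]  # view overweighting the first scenario
print(linear_opinion_pool([p1, p2], [0.5, 0.5]))
```

The pooled probabilities then reweight the historical returns when computing moments, which is how scenario views flow into the prior statistics.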

Uncertainty sets

These sets can be used to make some optimisations more robust. Namely, there exist uncertainty sets on expected returns and covariance. They can be used in any optimisation that uses either of these quantities.

  • Box.
    • Delta.
    • Normally distributed returns.
    • Autoregressive Conditional Heteroskedasticity.
      • Circular, moving, stationary bootstrap.
  • Ellipse uncertainty sets.
    • Normally distributed returns.
    • Autoregressive Conditional Heteroskedasticity.
      • Circular, moving, stationary bootstrap.
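
To see why box uncertainty sets make an optimisation more robust, consider the worst-case expected return when each asset's mean can move within mu_i ± delta_i. A plain-Python sketch of that worst case (illustrative, not the package's API):

```python
def worst_case_return(w, mu, delta):
    """Worst-case portfolio return over a box uncertainty set
    mu_i ± delta_i: each long position suffers the lower bound of its
    mean, each short position the upper bound."""
    return sum(wi * (m - d if wi >= 0 else m + d)
               for wi, m, d in zip(w, mu, delta))

print(worst_case_return([0.6, 0.4], [0.08, 0.05], [0.02, 0.01]))  # → 0.052
```

A robust optimiser maximises this pessimistic quantity instead of the nominal return, so the solution cannot lean too hard on any single uncertain estimate.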

Turnover (Rebalancing)

These penalise moving away from a benchmark vector of weights.

  • Risk measure (experimental).
  • Constraints.
  • Fees.

Fees

These encode various types of fees, which can be used in portfolio optimisation and analysis.

  • Relative long.
  • Relative short.
  • Fixed long.
  • Fixed short.
  • Turnover (rebalance).

Tracking

These can be used to track the performance of an index, indicator, or portfolio.

  • Risk measure (experimental).
  • Constraints.

There are four things that can be tracked.

  • Returns via L1 or L2 norm.
    • Asset weights.
    • Returns vector.
  • Risk tracking via asset weights.
    • Dependent variables (experimental).
    • Independent variables.

Risk measures

Different optimisations support different risk measures; most measures can also be used to quantify a portfolio’s risk-return characteristics.

  • Variance.
  • Risk Contribution Variance.
    • Asset risk contribution.
    • Factor risk contribution.
  • Uncertainty set variance.
  • Standard deviation.
  • First lower moment.
  • Second lower moment.
    • Semi variance.
    • Semi deviation.
  • Second central moment (historical returns, no covariance matrix).
    • Variance.
    • Standard deviation.
  • Mean absolute deviation.
  • Third lower moment (historical returns, no coskewness matrix).
    • Standardised (semi skewness).
    • Unstandardised.
  • Fourth lower moment (historical returns, no cokurtosis matrix).
    • Standardised (semi kurtosis).
    • Unstandardised.
  • Third central moment (historical returns, no coskewness matrix).
    • Standardised (skewness).
    • Unstandardised.
  • Fourth central moment (historical returns, no cokurtosis matrix).
    • Standardised (kurtosis).
    • Unstandardised.
  • Square root kurtosis.
    • Full.
    • Semi.
  • Negative skewness.
    • Full.
    • Semi (experimental).
  • Negative quadratic skewness.
    • Full.
    • Semi (experimental).
  • Value at Risk.
  • Conditional Value at Risk.
  • Distributionally Robust Conditional Value at Risk.
  • Entropic Value at Risk.
  • Relativistic Value at Risk.
  • Value at Risk Range.
  • Conditional Value at Risk Range.
  • Distributionally Robust Conditional Value at Risk Range.
  • Entropic Value at Risk Range.
  • Relativistic Value at Risk Range.
  • Drawdown at Risk.
    • Absolute (simple returns).
    • Relative (compounded returns).
  • Conditional Drawdown at Risk.
    • Absolute (simple returns).
    • Relative (compounded returns).
  • Entropic Drawdown at Risk.
    • Absolute (simple returns).
    • Relative (compounded returns).
  • Relativistic Drawdown at Risk.
    • Absolute (simple returns).
    • Relative (compounded returns).
  • Ordered Weights Array risk measure.
  • Ordered Weights Array range risk measure.
  • Average Drawdown.
  • Ulcer Index.
  • Maximum Drawdown.
  • Brownian Distance Variance.
  • Worst Realisation.
  • Range.
  • Equal risk.
  • Turnover risk.
  • Tracking risk.
  • Mean return risk.
  • Ratio of measures.
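
As a concrete example of two of the measures above, here is a plain-Python sketch of historical Value at Risk and Conditional Value at Risk (the package computes these from its own return types; the function names here are illustrative):

```python
def value_at_risk(returns, alpha):
    """Historical VaR at level alpha: the smallest loss among the
    worst alpha fraction of observations."""
    losses = sorted((-r for r in returns), reverse=True)
    k = max(int(len(returns) * alpha), 1)
    return losses[k - 1]

def conditional_value_at_risk(returns, alpha):
    """Historical CVaR at level alpha: the average loss over the
    worst alpha fraction of observations."""
    losses = sorted((-r for r in returns), reverse=True)
    k = max(int(len(returns) * alpha), 1)
    return sum(losses[:k]) / k

rets = [0.02, -0.05, 0.01, -0.03, 0.04, -0.01, 0.03, -0.02, 0.00, 0.05]
print(value_at_risk(rets, 0.2), conditional_value_at_risk(rets, 0.2))
# → 0.03 0.04 (two worst losses are 0.05 and 0.03)
```

CVaR is always at least as large as VaR at the same level, since it averages the whole tail rather than reporting its boundary.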

Portfolio statistics

These are used to summarise a portfolio’s risk and return characteristics.

  • Expected returns.
    • Arithmetic.
    • Kelly (Logarithmic).
  • Risk-adjusted return ratio.
    • Vanilla.
    • Sharpe ratio information criterion.
  • Risk contribution.
    • Asset risk contribution.
    • Factor risk contribution.

Optimisation

There are many different optimisation methods, each with different characteristics and configurable options, including exclusive constraint types and risk measures. All of them, however, have an optional fallback method in case the optimisation fails.

  • Clustering.
    • Hierarchical Risk Parity.
    • Hierarchical Equal Risk Contribution.
    • Nested Clustered Optimisation.
    • Schur Complement Hierarchical Risk Parity.
  • JuMP-based.
    • Mean Risk.
    • Factor Risk Contribution.
    • Near Optimal Centering.
    • Risk Budgeting.
      • Asset risk budgeting.
      • Factor risk budgeting.
    • Relaxed Risk Budgeting.
  • Stacking.
  • Naive.
    • Inverse volatility.
    • Equal weighted.
    • Random weighted.
  • Finite Allocation.
    • Discrete.
    • Greedy.
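
The naive optimisers are simple enough to sketch in a few lines. Here is the inverse volatility rule in plain Python (illustrative, not the package's API):

```python
import math

def inverse_volatility_weights(variances):
    """Naive inverse-volatility portfolio: weights proportional to
    1/sigma_i, normalised to sum to one."""
    inv = [1.0 / math.sqrt(v) for v in variances]
    s = sum(inv)
    return [x / s for x in inv]

print(inverse_volatility_weights([0.04, 0.01, 0.16]))
# → roughly [2/7, 4/7, 1/7]: the least volatile asset gets the most weight
```

Despite their simplicity, rules like this (and the equal weighted portfolio) are surprisingly hard benchmarks to beat out of sample, which is why they are worth having alongside the heavier machinery.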

Optimisation constraints

Many of these use the various constraint generation mechanisms mentioned above. These constrain the optimisation so the results meet the user’s requirements. Some have specific requirements like a Mixed Integer Programming capable solver, others cannot be used in conjunction with each other, and there exist combinations that make problems infeasible.

  • JuMP-based.
    • Risk constraints.
      • Maximum risk for all supported measures (can be simultaneously provided).
    • Return constraints.
      • Minimum return.
      • Expected return uncertainty set.
    • Pareto front/surface/hypersurface (efficient frontier 2D, 3D, ND).
      • Via risk constraints.
      • Via return constraints.
    • Objective functions.
      • Minimum risk.
      • Maximum utility.
      • Maximum risk-adjusted return ratio.
      • Maximum return.
    • Budget constraints.
      • Long and/or short budget.
        • Exact.
        • Upper and lower bounds.
      • Cost budget.
      • Market impact budget.
    • Weight bounds.
    • Linear weights.
    • Cardinality.
      • Asset.
      • Set.
    • Group cardinality.
      • Asset.
      • Set.
    • Long and short buy-in threshold.
    • Turnover.
    • Fees.
    • Tracking.
    • Phylogeny.
    • Centrality.
    • Regularisation.
      • L1.
      • L2.
    • Custom: via subtyping and multiple dispatch.
      • Constraint.
      • Objective.
      • Objective penalty.
  • Non-JuMP-based.
    • Weight bounds.
      • Upper.
      • Lower.
    • Weight finaliser.
  • Optimisers without a fixed risk measure.
    • Scalarisers for multiple simultaneous risk measures.
      • Weighted sum.
      • Max risk.
      • LogSumExp.
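
The scalarisers above collapse several risk values into one objective. A plain-Python sketch of the weighted sum and LogSumExp variants (illustrative names; the smoothing parameter `gamma` is an assumption of this sketch):

```python
import math

def weighted_sum(risks, weights):
    """Weighted-sum scalariser for multiple risk measures."""
    return sum(w * r for w, r in zip(weights, risks))

def log_sum_exp(risks, gamma=10.0):
    """LogSumExp scalariser: a smooth upper bound on max(risks) that
    tightens as gamma grows (computed in a numerically stable way)."""
    m = max(risks)
    return m + math.log(sum(math.exp(gamma * (r - m)) for r in risks)) / gamma

risks = [0.10, 0.30, 0.20]
print(weighted_sum(risks, [0.2, 0.5, 0.3]))  # → 0.23
print(log_sum_exp(risks))                    # slightly above max(risks) = 0.30
```

LogSumExp sits between the weighted sum and the hard max: it penalises the largest risk most, but stays differentiable, which keeps the optimisation well behaved.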

Plotting

  • Simple or compound cumulative returns.
    • Portfolio.
    • Assets.
  • Portfolio composition.
    • Single portfolio.
    • Multi portfolio.
      • Stacked bar.
      • Stacked area.
  • Risk contribution.
    • Asset risk contribution.
    • Factor risk contribution.
  • Asset dendrogram.
  • Asset clusters + optional dendrogram.
  • Simple or compound drawdowns.
  • Portfolio returns histogram + density.
  • 2/3D risk measure scatter plots.

I’ve done a fair bit of work in this area and can testify that:

attempt at providing as many as possible

you weren’t exaggerating! That is a very comprehensive list of techniques. This is a great resource, I look forward to digging into it. Thank you!

Glad you like it.

I’m very open to suggestions and contributions.

I’d also like to add stuff like synthetic returns, but I have to wait for Copulas.jl to add vines. Adding cross validation would also be nice.

PortfolioOptimisers.jl v0.14.2 Release

Breaking changes

  • Lots of quality-of-life breaking changes since the inaugural post.
    • Better names for abstract and concrete types.
    • Shorter and unambiguous fieldnames.

Bug fixes

  • Lots of minor and quality-of-life bug fixes. Most don’t affect results, aside from disallowing certain types/values that previously errored further down the chain rather than at instantiation.
  • One pretty big bug relating to high order prior statistics using low order factor prior statistics.

New features

  • New risk measures and formulations.
  • Enable existing measures on more optimisation types.

Docs

  • Lots of new API docs, including instructions on defining interfaces for new estimators and algorithms.
  • A few new examples.
  • Better readmes.

Project maturity

  • Stabler API, but still expect breaking changes with v0.X releases.

Roadmap

I don’t have a timeline or a serious plan, but there are a few features I’d like the package to have. In order of desirability:

  1. Cross validation prediction.
  2. Pipelines, i.e. data validation → optimisation → prediction → post-processing.
  3. Report generation.
    3a. Better plots.
    3b. Perhaps excel output.
  4. A non-shitpost logo (though tbh I kinda like that it’s a shitpost).

Feedback and contributions welcome

The package is quite functional, but it is missing a lot of docs and sample use. If anyone out there is using/wants to use the package I’d love to hear about how it can be improved and would love some contributions/help with implementation and design.

PortfolioOptimisers.jl v0.17.0

Breaking changes

  • Renamed a few internal fields to make it clear they are incompatible with more advanced features like cross validation and the outer optimisation on nested clusters and stacking optimisers. This is because they need to generate returns data dynamically according to the cross validation folds/inner optimisation results.

New features

Cross validation and hyper parameter tuning come with a simple example each.

  • Cross validation.
    • KFold: basic splitting into training and testing sets.
    • Walk Forward: enables dynamic optimisation with periodic rebalancing.
      • Index based.
      • Date based: it’s now possible to rebalance on weird dates like the 2nd Tuesday of every 3 weeks or something equally arbitrary.
    • Combinatorial: combinatorially generated paths for much more robustness.
    • Multiple Randomised: temporal and cross sectional (sampling the asset universe) walk forward path randomisation for much more robustness but less generality.
  • Ability to define which benchmark-weight constraints can carry forward previous optimisation weights in walk forward and multiple randomised cross validation.
  • Time-dependent constraints to be used in walk forward and multiple randomised cross validation. None have been implemented, as I didn’t have the creativity, but the interface exists.
  • NestedClusters and Stacking optimisers can now use cross validation to predict the returns to be used in their outer estimators.
  • Hyper parameter tuning: it’s now possible to evaluate out-of-sample values for all the different non-finite allocation optimisers, which eliminates much of the guesswork.
    • Grid search cross validation.
    • Randomised search cross validation.
  • Prediction machinery. This came with the cross validation; predictions are the out-of-sample performance of an optimisation.
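
The index-based walk-forward splitting described above can be sketched in plain Python; the `expand` flag mirrors the difference between rolling and expanding training windows (function and keyword names here are illustrative, not the package's API):

```python
def walk_forward_splits(T, train, test, expand=False):
    """Index-based walk-forward cross validation: successive
    (train, test) index windows over T observations. `expand=True`
    keeps the full history in each training window instead of
    rolling it forward."""
    splits = []
    start = 0
    while start + train + test <= T:
        lo = 0 if expand else start
        tr = list(range(lo, start + train))
        te = list(range(start + train, start + train + test))
        splits.append((tr, te))
        start += test  # advance by one test window = one rebalance
    return splits

for tr, te in walk_forward_splits(10, train=4, test=2):
    print(tr, te)
```

Each `(train, test)` pair corresponds to one rebalance: optimise on the training window, then record out-of-sample returns on the test window.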

Plans

I’ve been burning the candle at both ends so I’ll ease off from my anxiety-induced freneticism to make the package “feature complete”. I classify the remaining features as nice-to-haves. For now the top priority would be the API docs and improved examples.

Features

  • Pipelines: most of the hard work is done. This isn’t a game-changing feature, more of a nice-to-have, so I’ll implement them eventually.
  • Synthetic returns prior. I need a package to implement vine copulas.
  • Expand the hyperparameter tuning to do adaptive refinement of parameters.
  • Adding more quality of life features like summaries, reports, post-processing bells and whistles (intersects heavily with plots).

Documentation

  • API: I just discovered DocStringExtensions.jl, but I’m having trouble implementing my own templates (if anyone has an example please help a brother out and point me to it). I’d have to redo a bunch of the docs I already have, but I think it’s worth it. Especially for a package like this one.
  • Examples: more examples with more plots and fewer tables would be nice. This means I have to implement the plots though.
  • Formalisms and citations. I really need to keep adding these; there are just so many features that it’s overwhelming at times, but I’ll be passively adding them as patch releases.

Plotting

  • More plots in general with nicer APIs. Especially now that cross validation and prediction features have been added to the package, it would be nice to have plots that show how the predictions behave as a function of time.

Future breaking changes

  • I never expect more breaking changes, but there always are. However, they should slow down now. I’m liking how the APIs for the important parts have been turning out. However, there’s always a better name for a property or a type and I never hesitate to make the change if I think it’ll make interfacing with the package more predictable and logical.

Feedback and contributions always welcome

As always, ideas, contributions and criticisms are always welcome.

PortfolioOptimisers.jl v0.19.0

Highlights

New features

Breaking

  • Added benchmark prices. They can be specified per asset (as a matrix), or as an index (as a vector). This enables optimising on excess returns which is a form of tracking. It’s compatible with every optimiser, cross validation, and prediction. Breaking because I had to add a flag to all optimisation estimation structs even though it didn’t break existing code or APIs.
  • Improved the consistency of various moment estimators which internally compute the variance. They also used to contain expected returns estimators, but we now use the ones inside the variance estimator because the expected returns estimator was computing the mean used to compute the variance.
  • expend_train -> expand_train keyword in WalkForwardEstimator subtypes to make it consistent with reduce_test.

Non-breaking

  • SubsetResampling optimiser. This takes random, unique samples of the asset universe and optimises each one in turn. The final weights are the (optionally weighted) average weights of each asset across the optimisations. If an asset wasn’t sampled for a given optimisation, its weight is taken to be zero. It interpolates between the equal weighted portfolio and the pure internal optimiser. It’s quite cool; I like this optimiser a lot because it adds noise that reduces overfitting for mid assets, buries shit ones, and lets great ones shine through. It’s like a targeted yet stochastic combination of the L_n penalty families.
  • LogRiskBudgeting: what used to be the normal risk budgeting algorithm can now accept an orthant vector, which means risk budgeting portfolios can now take on negative weights, but the orthant has to be specified a priori. Limited to asset risk budgeting; factor risk budgeting already allows for negative asset weights. It might be possible to generalise, but I haven’t thought it through.
  • MixedIntegerRiskBudgeting explores all orthants to get a low risk portfolio that meets the risk budget threshold. It’s very expensive because there are 2^N sign combinations, where N is the number of assets. You don’t have to specify the orthant vector; it is obtained on the fly. It also doesn’t guarantee the lowest risk portfolio, simply because the MIP optimiser might give up depending on your settings. Limited to asset risk budgeting; factor risk budgeting already allows for negative asset weights. It might be possible to generalise, but I haven’t thought it through.
  • Windowed variants of all moment estimators. They simply wrap one of the already provided estimators. You can provide a window, which can be an integer to take the last `window` entries, or a vector to index specific entries. The moments will be computed based on that window rather than the entire history. They also support weighted estimators by currying the outer weights inward with the right indexing.
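
The SubsetResampling averaging scheme described above can be sketched in a few lines of plain Python (function names and the equal-weight inner optimiser are illustrative assumptions, not the package's API):

```python
import random

def subset_resampled_weights(n_assets, subset_size, n_samples, optimise, seed=0):
    """Sketch of subset resampling: optimise random subsets of the
    asset universe and average the resulting weights. Assets missing
    from a sample contribute zero weight for that sample."""
    rng = random.Random(seed)
    avg = [0.0] * n_assets
    for _ in range(n_samples):
        subset = rng.sample(range(n_assets), subset_size)
        w = optimise(subset)  # weights over the subset, summing to one
        for a, wa in zip(subset, w):
            avg[a] += wa / n_samples
    return avg

# With an equal-weight inner optimiser the result stays fully invested.
w = subset_resampled_weights(5, 3, 200, lambda s: [1.0 / len(s)] * len(s))
print(round(sum(w), 10))  # → 1.0
```

Because each full-budget sample is averaged in, the aggregate weights still sum to one, while assets that rarely earn weight inside their subsets get shrunk toward zero.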

Docs

  • Added a quick start guide.
  • I’ve added more docstring examples and defined some of the necessary interfaces one would need to implement to add custom behaviour.
  • I caved and unleashed the clankers on the API docs. I only asked for API documentation of function signatures, and explanations of variables, types and functions. Almost 18k lines later, here we are. It seems to have done a fairly good job, but one does not simply read 18k lines and find all issues at once. At least the APIs are now documented, so people can actually use the entire library without looking at the code. I will methodically go over the new docs to improve their quality.

Bug fixes

  • Fixed a bug in risk budgeting optimisers where constraints were being built from the wrong variable, which meant unnecessary coupling between systems that also needlessly made certain constraint combinations impossible to define.

Future plans

  • Improve the documentation.
    • The clanker-made docs are competent, but not to the standard I’d like.
    • They’re missing interface definitions, sources, language standardisation, and nicer mathematical notation, and not all methods are documented (mainly internal ones).
    • Split the API docs into public and private.
    • More examples and entries to the user guide.
  • Plots, this is a sore point for me.
    • I have had trouble with Makie.jl due to its less than stellar support for dendrograms.
    • Plots has the problem that I have to load GraphRecipes.jl and StatsPlots.jl to load the extension, which is less than ideal.
    • I’d like to have interactive plots with mouse-over labels, but that would lock me into Plotly.
    • The API of my plotting functions sucks and I haven’t figured out a way to make it better. Recipes seem fragile but I’ll maybe ask a clanker to figure it out for me.
    • Need plots for prediction results, not just optimisation results.
  • Cross sectional factor prior.
  • Copula based synthetic returns. Needs vine copulas, I’ve not found any Julia implementation.
  • Remove python dependencies. Needs bootstrapping and bin width computation functions, but I’ve found no suitable Julia library that does either of these.
  • Pipelines.
  • Online statistical methods, maybe? Though I’m not sure how most of the current ones could be updated incrementally. I’m inclined to let it ride as is; it may not be worth it. Unless someone else wants to give it a stab and open a PR.

You can now do ridiculous things like this. I don’t see why anyone would, but you can. Seriously, don’t try this, it’ll probably run for hours. This is barely scratching the surface of all the features in the library.

using PortfolioOptimisers, StatsPlots, GraphRecipes, Clarabel, HiGHS, Pajarito, JuMP,
      YFinance, TimeSeries, DataFrames

# Use MIP solvers because we will use buy in threshold constraints.
Highs_resetGlobalScheduler(1)
slv = [Solver(; name = :mip1,
              solver = optimizer_with_attributes(Pajarito.Optimizer,
                                                 JuMP.MOI.Silent() => false,
                                                 "oa_solver" => optimizer_with_attributes(HiGHS.Optimizer,
                                                                                          JuMP.MOI.Silent() => false,
                                                                                          "threads" => 6),
                                                 "conic_solver" => optimizer_with_attributes(Clarabel.Optimizer,
                                                                                             JuMP.MOI.Silent() => false)),
              check_sol = (; allow_local = true, allow_almost = true)),
       Solver(; name = :mip2,
              solver = optimizer_with_attributes(Pajarito.Optimizer,
                                                 JuMP.MOI.Silent() => false,
                                                 "oa_solver" => optimizer_with_attributes(HiGHS.Optimizer,
                                                                                          JuMP.MOI.Silent() => false,
                                                                                          "threads" => 6),
                                                 "conic_solver" => optimizer_with_attributes(Clarabel.Optimizer,
                                                                                             JuMP.MOI.Silent() => false,
                                                                                             "max_step_fraction" => 0.95)),
              check_sol = (; allow_local = true, allow_almost = true)),
       Solver(; name = :mip3,
              solver = optimizer_with_attributes(Pajarito.Optimizer,
                                                 JuMP.MOI.Silent() => false,
                                                 "oa_solver" => optimizer_with_attributes(HiGHS.Optimizer,
                                                                                          JuMP.MOI.Silent() => false,
                                                                                          "threads" => 6),
                                                 "conic_solver" => optimizer_with_attributes(Clarabel.Optimizer,
                                                                                             JuMP.MOI.Silent() => false,
                                                                                             "max_step_fraction" => 0.90)),
              check_sol = (; allow_local = true, allow_almost = true)),
       Solver(; name = :mip4,
              solver = optimizer_with_attributes(Pajarito.Optimizer,
                                                 JuMP.MOI.Silent() => false,
                                                 "oa_solver" => optimizer_with_attributes(HiGHS.Optimizer,
                                                                                          JuMP.MOI.Silent() => false,
                                                                                          "threads" => 6),
                                                 "conic_solver" => optimizer_with_attributes(Clarabel.Optimizer,
                                                                                             JuMP.MOI.Silent() => false,
                                                                                             "max_step_fraction" => 0.85)),
              check_sol = (; allow_local = true, allow_almost = true)),
       Solver(; name = :mip5,
              solver = optimizer_with_attributes(Pajarito.Optimizer,
                                                 JuMP.MOI.Silent() => false,
                                                 "oa_solver" => optimizer_with_attributes(HiGHS.Optimizer,
                                                                                          JuMP.MOI.Silent() => false,
                                                                                          "threads" => 6),
                                                 "conic_solver" => optimizer_with_attributes(Clarabel.Optimizer,
                                                                                             JuMP.MOI.Silent() => false,
                                                                                             "max_step_fraction" => 0.80)),
              check_sol = (; allow_local = true, allow_almost = true)),
       Solver(; name = :mip6,
              solver = optimizer_with_attributes(Pajarito.Optimizer,
                                                 JuMP.MOI.Silent() => false,
                                                 "oa_solver" => optimizer_with_attributes(HiGHS.Optimizer,
                                                                                          JuMP.MOI.Silent() => false,
                                                                                          "threads" => 6),
                                                 "conic_solver" => optimizer_with_attributes(Clarabel.Optimizer,
                                                                                             JuMP.MOI.Silent() => false,
                                                                                             "max_step_fraction" => 0.75)),
              check_sol = (; allow_local = true, allow_almost = true)),
       Solver(; name = :mip7,
              solver = optimizer_with_attributes(Pajarito.Optimizer,
                                                 JuMP.MOI.Silent() => false,
                                                 "oa_solver" => optimizer_with_attributes(HiGHS.Optimizer,
                                                                                          JuMP.MOI.Silent() => false,
                                                                                          "threads" => 6),
                                                 "conic_solver" => optimizer_with_attributes(Clarabel.Optimizer,
                                                                                             JuMP.MOI.Silent() => false,
                                                                                             "max_step_fraction" => 0.7)),
              check_sol = (; allow_local = true, allow_almost = true)),
       Solver(; name = :mip8,
              solver = optimizer_with_attributes(Pajarito.Optimizer,
                                                 JuMP.MOI.Silent() => false,
                                                 "oa_solver" => optimizer_with_attributes(HiGHS.Optimizer,
                                                                                          JuMP.MOI.Silent() => false,
                                                                                          "threads" => 6),
                                                 "conic_solver" => optimizer_with_attributes(Clarabel.Optimizer,
                                                                                             JuMP.MOI.Silent() => false,
                                                                                             "max_step_fraction" => 0.6,
                                                                                             "max_iter" => 1500,
                                                                                             "tol_gap_abs" => 1e-4,
                                                                                             "tol_gap_rel" => 1e-4,
                                                                                             "tol_ktratio" => 1e-3,
                                                                                             "tol_feas" => 1e-4,
                                                                                             "tol_infeas_abs" => 1e-4,
                                                                                             "tol_infeas_rel" => 1e-4,
                                                                                             "reduced_tol_gap_abs" => 1e-4,
                                                                                             "reduced_tol_gap_rel" => 1e-4,
                                                                                             "reduced_tol_ktratio" => 1e-3,
                                                                                             "reduced_tol_feas" => 1e-4,
                                                                                             "reduced_tol_infeas_abs" => 1e-4,
                                                                                             "reduced_tol_infeas_rel" => 1e-4)),
              check_sol = (; allow_local = true, allow_almost = true))];

# Successive fallback entries progressively loosen Clarabel's
# "max_step_fraction", and the final entry also relaxes its solve tolerances,
# to rescue numerically difficult problems.

# Convert YFinance data to TimeArray.
function stock_price_to_time_array(x)
    # Skip the first two keys ("ticker" and "timestamp"); the rest are the price columns.
    coln = collect(keys(x))[3:end]
    # Convert the dictionary into a matrix.
    m = hcat([x[k] for k in coln]...)
    return TimeArray(x["timestamp"], m, Symbol.(coln), x["ticker"])
end
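# The `hcat([x[k] for k in coln]...)` idiom above splats the per-key vectors
# into the columns of a matrix. A minimal sketch on a toy Dict (illustrative
# values, not market data):
toy = Dict("a" => [1.0, 2.0], "b" => [3.0, 4.0]);
M = hcat([toy[k] for k in ["a", "b"]]...)
# M == [1.0 3.0; 2.0 4.0]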

# We'll use PCA regression for factor modelling.
assets = sort!(["JCI", "TGT", "CMCSA", "CPB", "MO", "APA", "MRSH", "JPM", "ZION", "PSA",
                "BAX", "BMY", "LUV", "PCAR", "TXT", "TMO", "DE", "MSFT", "HPQ", "SEE", "VZ",
                "CNP", "NI", "T", "BA"]);
factors = sort!(["MTUM", "QUAL", "VLUE", "SIZE", "USMV"]);

Date_0 = "2024-01-01";
Date_1 = "2026-10-05";

# Download the price data using YFinance.
prices = get_prices.([assets; factors]; startdt = Date_0, enddt = Date_1);
prices = stock_price_to_time_array.(prices);
prices = hcat(prices...);
cidx = colnames(prices)[occursin.(r"adj", string.(colnames(prices)))];
prices = prices[cidx];
TimeSeries.rename!(prices, Symbol.([assets; factors]));
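
# The adjusted-close selection above is ordinary broadcasting: build a Bool
# mask over the column names and index with it. On a toy vector of names
# (hypothetical tickers, for illustration only) it behaves like:
toy_cols = [:MSFT_adjclose, :MSFT_high, :TGT_adjclose, :TGT_low];
toy_sel = toy_cols[occursin.(r"adj", string.(toy_cols))]
# toy_sel == [:MSFT_adjclose, :TGT_adjclose]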

# Compute returns.
rd = prices_to_returns(prices[Symbol.(assets)], prices[Symbol.(factors)]);

# Asset sets.
sets = AssetSets(; dict = Dict("nx" => rd.nx));

# Covariance estimator.
ce = PortfolioOptimisersCovariance(; ce = SmythBrobyCovariance());

# Expected returns estimator.
me = ShrunkExpectedReturns(; ce = ce);

# Clusters estimator.
cle = ClustersEstimator(; ce = ce);

# Cross validation.
n_folds, n_test_folds = optimal_number_folds(length(rd.ts), 60, 10);
cv = OptimisationCrossValidation(;
                                 cv = CombinatorialCrossValidation(; n_folds = n_folds,
                                                                   n_test_folds = n_test_folds,
                                                                   purged_size = 3,
                                                                   embargo_size = 7));

# When used in optimisations, vectors of risk measures are scalarised via one
# of three methods (the default is a weighted sum).
# Expensive JuMP-compatible risk measures.
jr0 = [NegativeSkewness(; settings = RiskMeasureSettings(; scale = 20)),
       RelativisticDrawdownatRisk()];
jr1 = [Kurtosis(; settings = RiskMeasureSettings(; scale = 30)), RelativisticValueatRisk()];

# Clustering risk measures.
hr0 = [MedianAbsoluteDeviation(; settings = HierarchicalRiskMeasureSettings(; scale = 15)),
       RelativeEntropicDrawdownatRisk()];

# Prior statistics.
pe = HighOrderFactorPriorEstimator(;
                                   pe = FactorPrior(;
                                                    pe = EmpiricalPrior(; me = me, ce = ce),
                                                    re = DimensionReductionRegression()));
# Weight bounds.
wb = WeightBounds(; lb = -1, ub = 1);

# Buy-in threshold (MIP constraint).
st = Threshold(; val = 0.1);
lt = ThresholdEstimator(; val = UniformValues());

# Base optimisers.
jopt = JuMPOptimiser(; pe = pe, slv = slv, wb = wb, sbgt = 1, lt = lt, st = st, sets = sets);
hopt = HierarchicalOptimiser(; pe = pe, cle = cle, cle_pr = false, slv = slv);

# JuMP optimisers.
# Risk-free rate: 4.2 % annualised, converted to a per-trading-day decimal.
jopt0 = NearOptimalCentering(; r = jr0, obj = MaximumRatio(; rf = 4.2 / 252 / 100),
                             opt = jopt);
jopt1 = MeanRisk(; r = jr1, obj = MaximumRatio(; rf = 4.2 / 252 / 100), opt = jopt);

# Hierarchical optimisers.
hopt0 = HierarchicalEqualRiskContribution(; ri = hr0, ro = jr0, opt = hopt);

# Clustering optimiser.
nopt = NestedClustered(; pe = pe, wb = wb, cle = cle, cle_pr = false, opti = jopt0,
                       opto = jopt0, cv = cv);

# Stacking optimiser.
sopt = Stacking(; pe = pe, wb = wb, opto = jopt0, opti = [jopt0, jopt1, hopt0, nopt],
                cv = cv);

# Subset resampling optimiser.
rsopt = SubsetResampling(; pe = pe, wb = wb, opt = sopt, subset_size = 0.9, n_subsets = 10);

# Optimise.
res = optimise(rsopt, rd)