If you are interested just in derivatives for your comparative statics, you can always use adjoint rules for constrained optimization. It is basically the envelope condition, and something that I think they are trying to add to optimizers like Optim etc. so it is automatic (i.e. you can differentiate the solver itself). @ChrisRackauckas might be able to tell you where that sort of stuff lies. I will say, though, that JuMP is pretty difficult to work with since it is a DSL and doesn’t really work with vectors the same way other optimizers in Julia do.
Yes, I want e.g. \frac{\partial c}{\partial \rho}
I don’t know what that means. Can you show w/ my example above?
@pkofod Did anyone ever make an rrule for Optim using ChainRulesCore? Maybe this is easier to see with existing code rather than talking in the abstract?
But the principle, @Albert_Zevelev, is pretty easy. If you have the calculation of a function (which is the forward or primal calculation in ML-speak), then you can look at derivatives around that solution using the implicit function theorem and other calculus tricks. I think in the case of your function doing constrained optimization it is just the envelope theorem (Envelope theorem - Wikipedia), but maybe it is more subtle. In ML-speak, that is the “reverse” calculation which brings you back to the gradient from the underlying calculation of the function. ChainRules (Introduction · ChainRules) is the package that we can all use to define these rules. And once defined, they can compose in any Julia code.
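The envelope-theorem idea can be checked numerically in a few lines. Here is a minimal sketch (in Python with NumPy/SciPy for portability; the same logic carries over to Julia). The objective f(x, ρ) = (x − ρ)² + ρx is a made-up toy example, not anything from Optim or ChainRules:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Toy value function: V(rho) = min_x f(x, rho)
def f(x, rho):
    return (x - rho) ** 2 + rho * x

def V(rho):
    return minimize_scalar(lambda x: f(x, rho)).fun

rho = 0.7
xstar = minimize_scalar(lambda x: f(x, rho)).x  # optimizer x*(rho) = rho / 2

# Envelope theorem: dV/drho is just df/drho evaluated at x*(rho) --
# no need to differentiate through the solver's iterations.
envelope = -2.0 * (xstar - rho) + xstar

# Check against a central finite difference of the value function itself.
h = 1e-5
finite_diff = (V(rho + h) - V(rho - h)) / (2 * h)
print(envelope, finite_diff)  # both ~ 3*rho/2 = 1.05
```

For this quadratic example the closed form is V(ρ) = 3ρ²/4, so dV/dρ = 3ρ/2, and both numbers agree with it.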
Nice to see Frank Ramsey’s 1928 classic in Julia! Good job!
I’m not exactly sure I know what would have to be done (for example, I don’t understand the syntax in the example in the docs)
Assuming local convexity, you can write an rrule that differentiates through the KKT conditions directly, without backpropagating through the solver’s iterations. This isn’t implemented anywhere yet, and one of the things it would need is a way to skip this trick if local convexity does not hold (well enough).
It’s not convexity per se that’s required; more like uniqueness, continuity, and sub-differentiability of the optimal solution. Convexity doesn’t guarantee this, because the solution isn’t necessarily unique, so the optimal solution may not even be a function for a derivative to make sense. Conversely, a non-convex problem solved globally can have a global optimal solution that is a function, continuous and differentiable w.r.t. the parameters.
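To make the “derivative of the optimal solution” concrete: when the first-order condition pins down a unique, smooth x*(ρ), the implicit function theorem gives its derivative without differentiating the solver. A one-dimensional sketch (Python/SciPy; the objective exp(x) − ρx is my own illustration, not from any of the packages discussed):

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Illustrative objective: f(x, rho) = exp(x) - rho * x.
# The first-order condition f_x = exp(x) - rho = 0 has the unique root
# x*(rho) = log(rho), so x* is a well-defined, differentiable function.
rho = 2.0
xstar = np.log(rho)

# Implicit function theorem on f_x(x*(rho), rho) = 0:
#   dx*/drho = -f_xrho / f_xx
f_xx = np.exp(xstar)   # = rho > 0 (local strong convexity)
f_xrho = -1.0          # cross-partial of f_x with respect to rho
ift = -f_xrho / f_xx   # = 1/rho = 0.5

# Compare with finite differences of the numerically computed minimizer.
def xstar_num(r):
    res = minimize_scalar(lambda x: np.exp(x) - r * x,
                          bounds=(0.0, 5.0), method="bounded",
                          options={"xatol": 1e-10})
    return res.x

h = 1e-3
finite_diff = (xstar_num(rho + h) - xstar_num(rho - h)) / (2 * h)
print(ift, finite_diff)  # both ~ 0.5
```

This is exactly the kind of rule an rrule for a solver would encode: the reverse pass only needs derivatives of the optimality condition at the solution, not the solver’s internal steps.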
Anyways, I have this feature planned for Nonconvex.jl and you can already do some form of it using DiffOpt.jl.
We’re planning on adding it at the GalacticOptim level and the sources in ChainRules support · Issue #5 · SciML/Optimization.jl · GitHub seem to all want local convexity IIRC.
Convexity alone is not enough; strong convexity is, because it guarantees uniqueness.
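A tiny illustration of why uniqueness is the real requirement (my own toy example, in Python): a convex but not strictly convex objective can have a whole interval of minimizers, so “the” solution is not a function that can be differentiated.

```python
import numpy as np

# f is convex but flat on [-1, 1], so every point there is a minimizer.
f = lambda x: np.maximum(np.abs(x) - 1.0, 0.0)

xs = np.linspace(-2.0, 2.0, 401)
vals = f(xs)
minimizers = xs[np.isclose(vals, vals.min())]
print(minimizers.min(), minimizers.max())  # the whole interval [-1, 1]
```

Strong convexity rules this flat region out, which is why it (rather than plain convexity) is the natural sufficient condition here.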
Cool new paper by Pablo Guerron & co, “Parallel Computation of Sovereign Default Models”, where they solve using Julia & GPUs:
we want to bypass the steep learning and implementation costs of languages like C++ CUDA (Compute Unified Device Architecture) in economic research. To this end, we propose a framework for efficient parallel computing with the modern language Julia. The paper offers detailed analysis of parallel computing, Julia-style acceleration tricks, and coding advice. The benchmark implementation in Julia with CUDA shows a substantial speed up over 1,000 times compared to standard Julia. We provide an accompanying Github repository with the codes and the benchmarks.
The version of the paper they have in the repo is very old. I couldn’t find an updated version of the paper; could you? They have both versions there.
I wish researchers doing structural econometrics would also switch to Julia. I know some people still using Gauss.
Hi all,
I came across this issue, which started with a kind of negative tone. After having used Matlab for 25 years, I migrated to Julia some two years ago. Furthermore, before sticking with Julia, I tried Python as well. Here are my two cents.
In the five or six years since Julia started to be more widely used, more information has already accumulated about using Julia to solve problems in economics and finance than in 30 years of Matlab. The following set of lectures/books/packages is a good illustration of my point:
- QuantEcon here (slow to update but still very useful)
- Florian Oswald’s PhD lecture notes here
- Paul Soderlind’s MSc and PhD lecture notes for economics and finance here
- Takeki Sunakawa’s PhD lecture notes here (in Japanese, but any good translator solves the problem quickly)
- Alisdair McKay’s notes (Computational Notes on Heterogeneous-Agent Macroeconomics) here. They are so well done that, after initially being written in Julia, someone added a Python version as well.
- Richard Dennis has written a series of small packages for approximation (Smolyak, PiecewiseLinear, Chebyshev) and a larger package (SolveDSGE) that are extremely useful. They can be found here.
- Mykel Kochenderfer & colleagues have finished a recent book (a broad introduction to algorithms for optimal decision making under uncertainty), most of which can be used in economics and finance. It can be found here. The algorithms are already available, and I was told that the notebooks would come next.
- Daisuke Oyama has maintained a Julia package for game theory here.
I think I can stop here, but some more could be added.
Finally, I hope that something like Dynare will not be implemented in Julia. Dynare seems to be very useful, but it is a horrible black-box type of tool. Many PhD students use it without knowing what they are doing. They just learn the Mass in Latin and recite it. Solving a DSGEM with Dynare is not the same as solving a system of differential equations with DifferentialEquations.jl.
https://frbny-dsge.github.io/DSGE.jl/latest/
And I’m not sure of the status of the actual Dynare Julia implementation:
@nilshg, I did not mention the FRB New York stuff because the list was becoming quite long and that package is quite well known.
And, as mentioned in my previous post, I just hope that nobody comes up with the idea of doing something similar to Dynare in Julia. For Ph.D. students, Dynare is the worst thing that we can find. Instead of forcing students to learn (and be able to formulate a critical perspective of what they are doing), it allows students to pretend to have knowledge about something they know nothing about. Simply put: it’s a black box.
Dynare is a pretty big and complex project, and it’s been developed for many years now, so understanding all of it would be a lot of work. However, the code is free and ready for inspection, and the methods are well documented, so it’s far from being a black box. I’ve had some interactions with several people in the Dynare community over the years, and they were exceptionally ready to explain and give help. Ph.D. students can learn as much as they want to about Dynare and the methods it offers, they only need to put in the time to do so.
It’s true that Dynare can be used by people who don’t fully understand what it does, or why. I myself fell into that group when I was using it. But there was no reason I couldn’t have learned more about it, if I had wanted to.
I respectfully disagree on Dynare. It isn’t a black box at all… even if the economics of those sorts of models often are.
Dynare itself is an outstanding piece of software, with an enormous amount of testing and engineering embedded in it. Sure, you hit the occasional bug, but nothing like what you would have with custom code (where you might not even notice a bug until someone else tries to replicate your results). Unless you are pushing the limits of what it was designed for, I would trust Dynare’s results over any custom code. Re-implementing Dynare features on their own is a distraction that PhD students and practitioners should avoid - to let them focus on the interesting underlying economics. Maybe implement a perturbation solution once as practice and then throw out your code.
There is a separate issue though: Dynare is almost too good at what it does! The fact that Dynare makes it so easy to write models may have led people to use DSGE-style models where they aren’t appropriate from an economic perspective, or may be leading people down the wrong modeling path because it is a huge effort to break out of its framework. But we can’t blame Dynare if PhD students are using it for economic models that don’t make any sense, or don’t understand how constraining perturbation solutions might actually be… that is the fault of their economics education, not the tool that implements it.
The broader issue with dynare is related: it can’t support cutting-edge methodological concepts in its current Matlab form - and would have limitations even if designed carefully with julia, so for many applications it just can’t help. But for the right economic models that fit into its structure, it is risky to rewrite things.
I believe that all of the work on the Julia side of Dynare is at DynareJulia · GitLab. My guess is that the direction it will take in Julia is that it will be componentized to whatever extent it can be, so that for the cases where you can’t use the full Dynare stack, you can write custom code that uses pieces of it.
In essence, I do not disagree with the two previous entries in this issue. I never mentioned that the Dynare code was something inaccessible. It is a black box in the sense that someone who does not have any significant knowledge of DSGEM can solve a model, can come up with fancy diagrams and tables as a signal of knowledge that unfortunately does not exist. This is the unfortunate side of Dynare.
I can provide two practical examples. One is the Sisyphean punishment inflicted upon those who have to teach Ph.D. students to solve DSGEM. This punishment is inflicted because students have to learn the concept of Rational Expectations, solve simple equations with unconditional expectations, linearize more or less complicated models, learn the Jordan decomposition, and finish with the QZ decomposition. Then they still have to learn how to put all those ingredients into some computational framework. Moreover, I am skipping the part about global approximation (projection), which shows students that what they do in linearization is just voodoo economics in many situations. And then comes the crucial part: students come and say, “why do I need to know all that stuff if there is Dynare out there that does everything for me, even the linearization?” Someone with good common sense would say: “the computer cannot be a substitute for your knowledge”. Nevertheless, most students are practical people and want to show performance rather than knowledge.
The second practical example is rather sad. I know the case of someone who graduated from a highly respected European university and knew quite well the old song in DSGEM associated with the Blanchard-Kahn stability conditions: “n control variables, n eigenvalues >|1|; m state variables, m eigenvalues <|1|”. As a teacher, he criticized a student who correctly solved/simulated the standard Real Business Cycle Model using the Jordan decomposition. This method guarantees stability if the B-K conditions are reversed in this particular model (the left-hand side matrix is singular, and to decouple the system, it has to be premultiplied by the inverse of the right-hand side matrix, …).
The two examples above are not the responsibility of Dynare. They are the result of people taking advantage of what that package allows. It does everything from a simple model written in structural form. People may know nothing about computation and yet be able to simulate a model using a computational method. They may know nothing about a DSGEM and yet solve such a model with a sophisticated mathematical method. This is not simply unfortunate; it is very dangerous. When one deals with a DSGEM, there are assumptions that have to be made, and they can only be stipulated if people know what they are talking about. Can we imagine someone publishing a paper about differential equations just because he/she used the package DifferentialEquations.jl to do the computational work but knows nothing about differential equations?
There is one point I disagree with @jlperla.
“I respectfully disagree on Dynare. It isn’t a black box at all… even if the economics of those sorts of models often are.”
In what context can we say that the economics of a DSGEM is a black box? In such types of models, everything is as clear as water. The assumptions are clearly spelled out. The mathematical results can be easily refuted if they are wrong. The computation part of their solution is very robust (there are several ways to approach the problem, and all give the same solution).
We may not like the assumptions. That’s OK. We can change them and see if we can do better. If there is something completely clear about its content in macroeconomics, it is a DSGEM (all assumptions made are on the table). We may not like them, but that is another point—nothing to do with them being a black box.
I can imagine someone writing a paper using differential equations, but not about differential equations. But I think that knowledge of this stuff starts with the theory, not with the actual implementation. I think it is totally fine if someone asks on Discourse for the best ODE algorithm given some statement of the problem structure, and then has only a minimal understanding of the details of the algorithm (except enough to know where its tradeoffs lie).
If people need more training, it is more linear algebra (and basic numerical analysis) training, given they are going to be using differential equations as a central part of their work. If they are doing nonstandard models and have stiff equations, maybe training on conditioning, etc. (see https://julia.quantecon.org/tools_and_techniques/iterative_methods_sparsity.html for example). They also need to have a mental model in order to form heuristics about the right algorithms in the right circumstances. But I am not sure there are any analogies to that in the standard DSGE, linearized perturbation approaches unless you are trying to do something nonstandard. There aren’t a lot of variations in the standard algorithms where tradeoffs are required.
I completely understand what you mean, and agree with you. I am just saying that you can’t blame Dynare for this. And to me, the worst of all worlds is people rewriting code that someone else has already tested. I would rather they use Dynare and learn more economics (or maybe about eigenvalues and/or fixed points in function spaces if they have limited knowledge).
I agree completely with you on all of this stuff except the conclusions. To me, the failure is that people are not learning enough linear algebra, not that they are spending too little time programming standard algorithms. If you understand saddle-path equilibria, what a perturbation method is actually doing, the BK conditions, etc., then I think a student can figure out how to convert between the Klein and SGU canonical forms, etc., if they really need to. Otherwise, let them use Dynare for implementation where it works, and quiz them on the theory (e.g. make them log-linearize on paper and work a simple example where the BK conditions fail). Global methods are a different animal, of course, and for those they need to actually program things up themselves, because no two models are the same.
That is a point you have every right to disagree with, and one I feel less confident about (as it is not my area of research, and it is an economics question rather than a software one). My point is that Dynare is so easy to use that it can lead people to layer on all sorts of moving parts in a model, which makes interpretation very difficult. I am thinking of the examples with dozens of variables, shocks, parameters, etc. If you poke the DSGE solution (i.e. it is the properties of the solution which are a black box, not the code or algorithm itself) with an impulse or a change in parameters, mortals like me have trouble interpreting the output. For simple models, this rarely applies, of course. But this is a methodological discussion where I am somewhat ignorant and could be completely wrong. Narrowly on the Dynare question, though, I don’t blame Dynare, since I think it does what it can to generate output for interpretation. At least with Dynare I am confident that any odd behavior is due to the solution itself, rather than some software bug the researcher didn’t see.
@jlperla, I agree with all your points, but there is a little issue that you are missing… because you know a lot about computation. I am a user, and I sit on the opposite side. Dynare – instead of fostering the use of computational tools, instead of increasing students’ appetite to learn a little bit of computation – has exactly the opposite effect. It is not about algebra, where students feel very comfortable; it is about computation, where they feel very insecure. There is a compulsory course on algebra in undergraduate economics studies (I believe everywhere), but no compulsory course on programming.
It is extremely easy (and with notebooks even easier) to write down a set of code lines and simulate a DSGEM (using the Jordan or QZ decompositions). Students become fascinated when they finish the course and realize how easy it is to simulate such models. When that happens, they start using the routines to simulate models as if they were in a sausage factory. But they know that a 1% shock is OK, while a 10% shock does not make much sense. They know they are linearizing, not just pushing something down into the stomach of Dynare.
The problem is the beginning: they have to learn a little bit of Julia (they tremble), they have to adapt the mathematical syntax to the computational language syntax (they tremble), and they make small mistakes in adapting pieces of code (which is perfectly natural, but they tremble). Through this journey, they have, at the back of their heads, the message from all those who do not know anything about programming: “uuuse Dynare”. Yep, Dynare is the greatest obstacle to graduate students getting motivated to learn a little bit of computation. It has a devastating effect even on faculty staff.