Early stage econ PhD student here focusing on micro theory but considering a shift/expansion into applied micro IO/health. I’ve previously learned some basic Python from a more traditional software engineering viewpoint (still beginner). Once I started to look into the scientific computing stuff (e.g. numpy) it seemed a bit awkward (np.this.that etc.) and I started looking into Julia. So far Julia seems really great for numerics but I’m still super basic and now trying to see whether to invest further in it or go back to Python.
I want to develop proficiency in microeconometric coding e.g. for health econ (and eventually data science of various sorts) as well as general numerical analysis that might come in handy for game theory. Based on the IO class I just took and this QuantEcon post from last year, it seems like the applied micro packages might be better developed in Python. Looking long term, I’d say I’m 50-50 academia vs. industry.
Probably I’m getting a biased take on this forum, but: any thoughts on Python vs. Julia? Thanks.
Not sure if this is relevant to you, but for Monte Carlo style simulations, Julia will be way better. In general, Julia is much better than Python for problems where you need to write a loop. If you can vectorize everything, numpy will work OK (although the code is often kind of ugly), but some problems are basically impossible to solve efficiently in Python.
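To make the loop point concrete, here is a minimal sketch in plain Python (standard library only) of the kind of simulation I have in mind. The process, the parameter values, and the function names are all made up for illustration. Each period depends on the previous one through a nonlinear update, which is what makes it awkward to write as a single vectorized array operation over time.

```python
import random
import statistics


def simulate_path(rho, sigma, n_periods, rng):
    """Simulate one path of x[t] = max(0, rho * x[t-1] + eps[t]), starting from 0."""
    x = 0.0
    path = []
    for _ in range(n_periods):
        # The update needs the previous value and truncates at zero,
        # so the time dimension is naturally written as an explicit loop.
        x = max(0.0, rho * x + rng.gauss(0.0, sigma))
        path.append(x)
    return path


def monte_carlo_mean_max(rho, sigma, n_periods, n_reps, seed=0):
    """Average, over n_reps simulated paths, of the largest value each path reaches."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    maxima = [max(simulate_path(rho, sigma, n_periods, rng)) for _ in range(n_reps)]
    return statistics.fmean(maxima)


estimate = monte_carlo_mean_max(rho=0.9, sigma=1.0, n_periods=200, n_reps=500)
print(f"Estimated expected maximum of the path: {estimate:.3f}")
```

In pure Python every iteration of the inner loop goes through the interpreter, so runtime grows quickly with the number of periods and replications. The same loop written in Julia is compiled, which is where the speed difference comes from.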
If you want to do structural models, which you might well do in IO/health, then the speed advantages which come w/ Julia can be quite significant. In my own structural IO/health work, I found this to be a big benefit vis-a-vis Matlab. I also found it very easy to parallelize important parts of my estimation and counterfactuals, which is also obviously a huge plus if you have access to a Big Machine somewhere at your university.
By revealed preference, the fact that a bunch of senior people in the field are investing their time in learning Julia is a good signal. (There are also juniors and JMCs who are working in the language too.)
In the end if you decide to do something “really structural” in an IO sense (BBL, a Rust model, whatever), you are going to have to program a lot of it from scratch yourself, so there is not much chance you’ll find a well-developed package for any part of the estimation in Python or any language. (BLP is an exception now that PyBLP exists, but you probably aren’t going to do ‘plain vanilla’ BLP in your JMP, so you may need to write your own programs there anyway.) In that case the speed of development, speed of debugging and speed of execution of what you write are all going to matter a lot and that’s an area where I think Julia really shines.
I have found most of the ML I need in pre-written Julia packages which run well and work. There might be (I really don’t know) more available in Python or things might be available “sooner”, but whatever you need I think you can find. Or, frankly, it is easy enough to write your own.
But to another point you made: yes, you will absolutely get a biased sample from this forum! (It is possible to be biased and correct though!)
If you need a specific Python package, it is quite easy to call Python code from Julia or vice-versa using PyCall.jl (or pyjulia).
I guess my question would be “Why do you want to develop proficiency in microeconometric coding?” Is this just for something fun to do or to make the professor happy? Then Julia would probably be fine, and the code would probably execute faster.
Are you trying to build your resume? Then Python might be better, at least more people have probably heard of python. I’m not in that field, so I don’t know if Julia has made serious inroads there.
You might just want to look at what packages are available, play with those packages to see which are better for you, and choose the language based on that. I don’t think coding in Python is any easier or harder than coding in Julia.
If you’re interested in doctoral studies, then you’ll most likely end up programming yourself more than using packages. For that Julia is certainly an excellent and highly productive language. For masters level studies focused on applied work, using packages is perhaps the way to go, and it may be that other languages still have an advantage, depending on what sort of work you’re interested in doing.
I have been using Julia for teaching and research for quite a while now, and the examples I present to the students are all in Julia (https://github.com/mcreel/Econometrics). Just today I had a good time programming a new example.
I’m in a PhD program (sorry “grad student” is ambiguous: I’ll update the OP)
This really depends on the amount of programming you do. This is not really field-specific: in either health econ or IO, people do all kinds of diverse stuff.
If you mostly use canned MX algorithms, Stata or R is still probably the best choice, but if you want to work with structural models, especially fit them to data, then I would recommend investing in Julia.
In this context, Python would be the worst compromise in my opinion: not as fast and versatile as Julia for numerical work, but not comparable to Stata or R when it comes to the library ecosystem.
I agree with Tamas that Python vs. Julia is not the relevant comparison, unless you specifically do stuff for which PyBLP is the answer.
When you say “applied micro” I’m thinking (quasi-)linear models for causal inference of the Angrist/Pischke mould, and for that indeed I would probably say R is your best bet, followed by Stata.
One thing to keep in mind though is that interoperability between Julia and other languages is excellent - I’m using RCall all the time in a workflow like:
- Parse or web scrape data in Julia with CSV.jl, Gumbo and friends…
- Turn into an estimation dataset using DataFrames.jl
- Create all sorts of descriptives plots in StatsPlots.jl
- Estimate some statistical models using GLM.jl
- Find out that there is some new specific estimator that exactly fits my problem, which someone invented in a paper published recently and implemented in R → use RCall to send my estimation dataset to R, estimate the model using one line of R, go back to Julia
There is also StataCall, so presumably one could do something similar with that, but I don’t have a working Stata license anymore.
With this, I get the benefit of all the recent advances produced in the R ecosystem, without actually having to remember how many layers of square brackets to use [[]] in order to index some vector which isn’t actually a vector but a list or something.