Areas for new users to contribute to?

I am new to Julia and am extremely impressed with the language. I used to be an active R developer contributing to that community, and I am now looking for ways to contribute to Julia.

I am very interested in scientific computing, specifically in:

Machine learning
Causal inference
Numerical optimization
Reproducible research

I have about 7 years of experience developing in R and Rcpp. Are there any areas that are suitable for a newcomer to contribute to? What are the priorities in the community?


Just browse the topics here (for example, “List of most desired features for Julia v1.x” and the “Specific Domains” category), or the issues on GitHub (JuliaLang/julia).

From my own perspective, the priorities are getting the package ecosystem up to date with Julia v0.6, working on infrastructure (tooling around the language), contributing to testing, and consolidating packages.

Welcome! In Julia itself, many issues have been marked as “intro issue”, indicating that we know how to fix them and that the solution is not too hard to implement.

You can also start by contributing to packages, which is an easier first step. In particular, the JuliaStats packages would certainly benefit from help in many different areas (our team is quite small currently). You can have a look at the open issues, or see what you consider to be missing pieces in the API.

Finally, since I see you’re the author of several R packages, you could create your own Julia package implementing features which are still lacking. This will likely lead you to contribute to fundamental packages once you realize some pieces are missing there to integrate your package into the wider ecosystem.


For optimization you could look at Optim.jl and BlackBoxOptim.jl. One issue is that the two packages don’t use the same interface; it would be worth investigating whether a more streamlined, common interface could be implemented. There are also algorithms that could be implemented (e.g. recent CMA-ES variations). I don’t think Optim.jl supports parallel evaluation of the objective function, which is probably quite useful in some cases.

I also started implementing the Black-Box Optimization Benchmarking (BBOB) functions for benchmarking. BlackBoxOptim.jl has some benchmarks, but this could be made into a more general, separate package.


Isn’t that what JuMP is for?

JuMP is a DSL. MathProgBase is the interface. It would be nice if everything were standardized on it, but MathProgBase is pretty complicated. Also, MathProgBase is being replaced with MathOptInterface. I hope the new interface is designed with the intention of making it easy for Optim.jl and BlackBoxOptim.jl to join in.

@nalimilan This looks like a good place for me to start. I have been a data scientist in industry for 5 years and can probably contribute to this area.

Being a new user, I think one of the best areas to contribute to in the short term is writing a tutorial that navigates all the packages from a new user’s point of view. It seems DataFrames is the current choice, but the future might be DataTables or IterableTables. Also, a lot of plotting tutorials are written for Gadfly, but Plots seems to be the better way to go these days. A lot of the information in blog posts and tutorials is out of date, and admittedly, without centralization it will be hard to keep it up to date at the speed Julia is changing.

If I want to organize a subdomain for JuliaTutorials, what should I do? I’ve noticed that every domain in Julia has its own GitHub organization. Are these officially registered, or can I open one myself?

Tutorials would certainly be useful, though individual packages also often lack basic documentation (docstrings, minimal Documenter manual…). Organizations are mainly there to allow a group of people to coordinate their efforts. In your case, you could just start a project under your own account and move it to an organization when the need for one becomes obvious (there is no registration process).

I’ve been using your fastVAR package at work! Julia doesn’t appear to have much in the way of time series models. There’s been work here and recently here on a time series data structure. The only package I’ve seen for any time series models was this one, which appears to be abandoned. I look forward to seeing what you develop in Julia.