Adding a build step for CmdStan inside a Julia package

I’m looking for a way to ship a working CmdStan with a Julia project.
Building it via BinaryBuilder seems to be off the table (see https://github.com/JuliaPackaging/Yggdrasil/issues/1023), but for my use case it would be totally fine to build it locally.

The necessary dependencies are a C++ compiler (g++ or clang) and make (mingw32-make for Windows). Would it be possible to use artifacts for these? There seems to be an LLVM_jll, but I’m unsure if it is possible to make make available via an artifact.

Could someone provide some input on whether or not this is going to work? I’m not really familiar with all the BinaryBuilder stuff yet, so it would be great to know that what I want to do is possible before spending too much time on it.

Thank you!

My apologies that no one is answering your question. I took another look to see if I could come up with a better answer than in the BinaryBuilder-related issue, but once more came up empty-handed. With the recent interest in CmdStanR and CmdStanPy, which take pretty much the same approach as CmdStan.jl, I’d hoped broader interest would emerge, but I haven’t seen that yet.

I wonder if asking this directly on the Stan Discourse is a possible next step. I think that, with JuliaPackaging, Julia might be in a better position to solve this issue. I’ll link your question above (C++ & make) to JuliaPackaging/Yggdrasil ( https://github.com/JuliaPackaging/Yggdrasil/issues/1023#issue-617171438 ).

Thank you for your reply! It was a very specific question, so I didn’t really expect an answer in such a short time. I agree that the approach of bundling CmdStan like in CmdStan.jl and CmdStanR is the way forward. The Stan development team had lots of issues with RStan (it’s still stuck on 2.19) and there were quite a lot of installation issues on macOS Catalina.

I really hope that we have a relocatable CmdStan binary one day (I faintly remember reading about this issue on the Stan Discourse), but in the meantime maybe it is possible to leverage the JuliaPackaging infrastructure to provide a more convenient way to include CmdStan.

I don’t think there is a way around building it locally (I suspect the Intel TBB stuff for multithreading and OpenCL configuration issues are showstoppers), but maybe it is possible to provide most dependencies via the _jll system and then add a build.jl script on top of that.

Daniel, just thinking a bit more about your suggestion. I’m not sure what platform you use to test (in my case macOS).

Your basic assumption is that the C++ compiler and make are the only dependencies. If this is correct, one way to test it is to use Homebrew to install both gcc and make and see if the makefile can be adjusted with a make/local file to use those in the make build step (step 4 below).

I don’t think a user without a toolchain will have git installed, so in my basic build steps:

1. git clone https://github.com/stan-dev/cmdstan.git --recursive
2. cd cmdstan
3. Copy ./make/local from previous install
4. make -j9 build
5. make examples/bernoulli/bernoulli
6. ./examples/bernoulli/bernoulli sample data file=examples/bernoulli/bernoulli.data.json
7. ls -l output.csv
8. bin/stansummary output.csv

steps 1 to 3 should be replaced by downloading an artifact. Separate Pkg.build() methods are probably needed for Unix, macOS and Windows.

I’m not sure if this idea would work for the MinGW toolchain on Windows.
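
The build steps above could be wrapped in a small shell script along these lines. This is only a sketch: it assumes a CmdStan source tree has already been unpacked into the directory named by CMDSTAN_DIR (a hypothetical variable standing in for the downloaded artifact), and the helper names pick_make and build_cmdstan, as well as the JOBS variable, are made up for illustration. The make program is chosen by platform, since MinGW on Windows provides mingw32-make.

```shell
#!/bin/sh
# Sketch only. CMDSTAN_DIR is a hypothetical variable pointing at an
# already unpacked CmdStan source tree (this replaces steps 1 to 3).
CMDSTAN_DIR="${CMDSTAN_DIR:-$HOME/cmdstan}"
JOBS="${JOBS:-4}"   # number of parallel make jobs (hypothetical knob)

# Choose the make program from a `uname -s` style system name:
# mingw32-make in MinGW/MSYS shells on Windows, plain make elsewhere.
pick_make() {
    case "$1" in
        MINGW*|MSYS*) echo "mingw32-make" ;;
        *)            echo "make" ;;
    esac
}

# Steps 4 to 8: build CmdStan, compile and run the bernoulli example,
# then summarise the draws. The body runs in a subshell so the caller's
# working directory is left unchanged.
build_cmdstan() (
    make_cmd="$(pick_make "$(uname -s)")"
    cd "$CMDSTAN_DIR" || exit 1
    "$make_cmd" -j"$JOBS" build || exit 1
    "$make_cmd" examples/bernoulli/bernoulli || exit 1
    ./examples/bernoulli/bernoulli sample \
        data file=examples/bernoulli/bernoulli.data.json || exit 1
    ls -l output.csv
    bin/stansummary output.csv
)

# To run the whole sequence, call: build_cmdstan
```

On Windows the example target and executable carry an .exe suffix, which this sketch does not handle.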

I’m trying my luck with it now and will report back if something works. I don’t have much experience with building software though, so me giving up is a possible outcome.

Best regards,

Daniel

Just a quick update: I got a working make binary through BinaryBuilder, but when using clang++ from Clang_jll I get this error on make build:

In file included from src/cmdstan/stansummary.cpp:1:
In file included from src/cmdstan/stansummary_helper.hpp:4:
In file included from stan/src/stan/mcmc/chains.hpp:4:
In file included from stan/src/stan/io/stan_csv_reader.hpp:4:
In file included from stan/lib/stan_math/lib/boost_1.72.0/boost/algorithm/string.hpp:18:
In file included from stan/lib/stan_math/lib/boost_1.72.0/boost/algorithm/string/std_containers_traits.hpp:18:
In file included from stan/lib/stan_math/lib/boost_1.72.0/boost/config.hpp:44:
stan/lib/stan_math/lib/boost_1.72.0/boost/config/detail/select_stdlib_config.hpp:18:12: fatal error: 'cstddef' file not found

I don’t understand enough about compiling C++ code yet, but it seems like boost tries to compile with the wrong standard library or something?

Funnily enough, adding CXXFLAGS+=-stdlib=libc++ in make/local gets boost to compile, but then I get a linker error:

/usr/bin/ld: stan/lib/stan_math/lib/boost_1.72.0/stage/lib/libboost_program_options.a(cmdline.o): in function `void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) [clone .constprop.0]':
cmdline.cpp:(.text+0x89): undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long)'
/usr/bin/ld: cmdline.cpp:(.text+0xb5): undefined reference to `std::__throw_logic_error(char const*)'
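
One possible reading of this linker error: the std::__cxx11::basic_string symbols belong to GNU libstdc++ (libc++ keeps its symbols under std::__1), so the Boost archive was probably compiled against libstdc++ while the final link used libc++. A rough way to check which standard library an object file or archive expects is to look at its demangled symbol names. The helper name classify_stdlib below is made up for this sketch, and the check is only a heuristic.

```shell
#!/bin/sh
# Heuristic sketch: guess the C++ standard library from demangled
# symbol names read on stdin. classify_stdlib is a made-up helper name.
classify_stdlib() {
    symbols="$(cat)"
    case "$symbols" in
        *'std::__1::'*)     echo "libc++" ;;     # libc++ inline namespace
        *'std::__cxx11::'*) echo "libstdc++" ;;  # libstdc++ C++11 ABI tag
        *)                  echo "unknown" ;;
    esac
}

# Example with a symbol taken from the error message above:
echo 'std::__cxx11::basic_string<char>::_M_create(unsigned long&, unsigned long)' \
    | classify_stdlib
```

On a real build the input would come from a symbol dump of the archive, for example GNU nm with the -C (demangle) option run on libboost_program_options.a.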

Daniel, I wonder if this has to do with the fact that some parts are built with c++ (which I think might be c++11) and some with c++1y. I don’t recall exactly whether 1y == 14 or something else.

On my system though, using the above CXXFLAGS+=-stdlib=libc++, it seems to compile, build and run, so I wonder if the make artifact and in particular the Clang_jll artifact are identical. Do you mind sharing your setup?

Also, I am on Big Sur and have found in some cases that the Xcode tools work better.

I also noticed the message below during the build:

The Boost C++ Libraries were successfully built!

The following directory should be added to compiler include paths:

    /Users/rob/Projects/StanSupport/cmdstan.25/stan/lib/stan_math/lib/boost_1.72.0

The following directory should be added to linker library paths:

    /Users/rob/Projects/StanSupport/cmdstan.25/stan/lib/stan_math/lib/boost_1.72.0/stage/lib

Do you see those in your make build output? (cmdstan.25 is just my test version/location)

The steps I’m using right now are:

using LLVM_full_jll

LLVM_full_jll.clang() do exe
    cd("/path/to/cmdstan")
    run(`make CXX=$(joinpath(LLVM_full_jll.PATH, "clang++")) build`)
end

So I’m just using the system make right now. I’m going to try building some other programs to see if it is a CmdStan-specific issue.

It’s interesting that you can get it to run with CXXFLAGS+=-stdlib=libc++. Maybe it has something to do with clang++ already being the default compiler on macOS.
Unfortunately it takes me pretty long to figure everything out (I never did any C or C++ programming), but I have not lost hope yet.

Hi Daniel,

After some experimenting I got stuck at exactly the same point ( code in https://github.com/goedman/InstallCmdStan.jl ). It looks like ‘borrowing’ clang++ from LLVM requires additional updates, maybe even to the cmdstan makefile?

Best, Rob

Hi @goedman,

I’ve also done some more digging. It seems like CmdStan does not like an absolute path for CXX, but this is fine because running

LLVM_full_jll.clang() do exe
    run(`clang++ --version`)
end

adapts the PATH to the custom clang. Adding CXX=clang++ and CXXFLAGS+=-stdlib=libc++ to make/local does work for me when compiling a Stan model (progress!), but it fails when compiling Stan utilities like stansummary.
The issue is that this still requires a system libc++, which may not be available.
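
The make/local settings described here can be written out like this. A minimal sketch: CMDSTAN_DIR is a hypothetical variable for the CmdStan source directory, and it falls back to a throwaway temporary directory only so that the snippet runs without a real checkout.

```shell
#!/bin/sh
# Sketch only. CMDSTAN_DIR is a hypothetical variable naming the CmdStan
# source directory; the temporary-directory fallback is just for trying
# the snippet out.
CMDSTAN_DIR="${CMDSTAN_DIR:-$(mktemp -d)}"
mkdir -p "$CMDSTAN_DIR/make"

# The quoted heredoc delimiter writes the lines verbatim, without any
# shell expansion.
cat > "$CMDSTAN_DIR/make/local" <<'EOF'
CXX=clang++
CXXFLAGS+=-stdlib=libc++
EOF

# Show what was written.
cat "$CMDSTAN_DIR/make/local"
```

Because CXX is a bare compiler name rather than an absolute path, the custom clang++ has to come first on PATH when make runs, which is what the do-block above arranges.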

I still have some ideas to try out, but maybe we just have to call in the cavalry and do a post on the Stan discourse and just see what they have to say.

Great work, here. Thanks to you all. I look forward to what this effort brings.

I can’t speak to all of the Stan tools/utilities you might be interested in, but I have much of stansummary written in Julia, albeit untested, here. The code could use some attention and some cleaning up, but it’s a good start. This could be a reasonable path forward, if you all can’t get stansummary to compile correctly.

This isn’t the way much of the Stan team wants interfaces to go, each interface coding up their own sample statistics tools, but it is the way the R and Python interfaces currently operate.

Let me know if this code is of any interest. I’d be happy to help out the StanJulia effort.

Hi @daniel,

On my system, using LLVM’s clang, it never compiles anything that needs a library (e.g. stdio.h, iostream). I wonder what’s the best way to turn this question into a MWE, maybe point to InstallCmdStan.jl?

Hi @roualdes,

Very interesting, I hadn’t seen your work. From “albeit untested” I assume you haven’t compared it to stansummary or the read_summary(...) DataFrame in StanJulia? Or the corresponding MCMCChains.jl summary? If we can compile and run a Stan model, I agree with you, we’re 90% there (although I think we should target 100%).

A little embarrassed to admit it, but I hadn’t seen MCMCChains.jl before. That looks great. Thanks for pointing it out.

Agreed, 100% is a good goal.

Hi @roualdes, I might have another use case for your code. In StanSample.jl (and probably a few other packages in the Stan…jl collection in StanJulia), for the upcoming v3, I’m switching to NamedTuples as the default output from read_samples(). Combined NamedTuples are just a nicer way to work with multilevel models. ‘Combined’ means that if Stan returns a.1, a.2, a.3, ... as multilevel parameters, the NamedTuple contains the Vector or Matrix a. Your code might be useful to create a stansummary from such a NamedTuple…

Sounds good. I’ll file an issue at StanSample.jl, and we can discuss further.

I am wondering if you found a solution for this. I need Stan for a CI script in StanRun.jl, and building it is quite time-consuming.

Not sure if your question is for Daniel or me, but I have not found a solution other than in the GitHub CI workflows, e.g. for Stan.jl, which is basically what you came up with long ago.

Recently there was another attempt to turn CmdStan into an artifact, but I think that also stalled.

I am currently working on this; there’s an open PR for CmdStan on Yggdrasil. I just finished up GNUMake to bundle with it, and I’m circling back to finish up CmdStan in the next few days.
