I’m looking for a way to ship a working CmdStan with a Julia project.
Building it via BinaryBuilder seems to be off the table (see https://github.com/JuliaPackaging/Yggdrasil/issues/1023), but for my use-case it would be totally fine to build it locally.
The necessary dependencies are a C++ compiler (g++ or clang) and make (mingw32-make for Windows). Would it be possible to use artifacts for these? There seems to be an LLVM_jll, but I’m unsure if it is possible to make the make tool available via an artifact.
Could someone provide some input on whether or not this is going to work? I’m not really familiar with all the BinaryBuilder stuff yet, so it would be great to know that what I want to do is possible before spending too much time on it.
My apologies that no one is answering your question. I took another look to see whether I could come up with a better answer than in the BinaryBuilder-related issue, but once more came up empty-handed. With the recent interest in CmdstanR and CmdstanPy, which take pretty much the same approach as CmdStan.jl, I’d hoped broader interest would emerge, but I haven’t seen that yet.
I wonder if asking this directly on Stan Discourse is a possible next step. I think with JuliaPackaging Julia might be in a better position to solve this issue. I’ll link your question above (C++ & make) to JuliaPackaging/Yggdrasil ( https://github.com/JuliaPackaging/Yggdrasil/issues/1023#issue-617171438 ).
Thank you for your reply! It was a very specific question so I didn’t really expect an answer in such a short time. I agree that the approach of bundling CmdStan like in CmdStan.jl and CmdstanR is the way forward. The Stan development team had lots of issues with RStan (it’s still stuck on 2.19) and there were quite a lot of installation issues on macOS Catalina.
I really hope that we have a relocatable CmdStan binary one day (I faintly remember reading about this issue on the Stan discourse), but in the meantime maybe it is possible to leverage the JuliaPackaging infrastructure to provide a more convenient way to include CmdStan.
I don’t think there is a way around building it locally (I suspect the IntelTBB stuff for multithreading and the OpenCL configuration issues are show stoppers), but maybe it is possible to provide most dependencies via the _jll system and then add a build.jl script on top of that.
Daniel, just thinking a bit more about your suggestion. I’m not sure what platform you use to test (in my case MacOS).
Your basic assumption is that the C++ compiler & make are the dependencies. If this is correct, one way to test this is to use homebrew to install both gcc and make and see if the makefile can be adjusted with a make/local file to use those in the make build step (step 4 below).
I don’t think a user without a toolchain will have git installed, so in my basic build steps:
1. git clone https://github.com/stan-dev/cmdstan.git --recursive
2. cd cmdstan
3. Copy ./make/local from previous install
4. make -j9 build
5. make examples/bernoulli/bernoulli
6. ./examples/bernoulli/bernoulli sample data file=examples/bernoulli/bernoulli.data.json
7. ls -l output.csv
8. bin/stansummary output.csv
Steps 1 to 3 should be replaced by downloading an artifact. Probably separate Pkg.build() methods are needed for Unix, macOS and Windows.
Not sure if this idea would work for the mingw toolchain on Windows.
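For what it's worth, steps 4 and 5 could be driven from a deps/build.jl along these lines. This is only a sketch: cmdstan_build_commands is a hypothetical helper (not part of CmdStan.jl), it assumes the CmdStan sources were already unpacked into cmdstan_dir (for example from an artifact), and it only constructs the Cmd objects so they can be inspected before anything is run.

```julia
# Hypothetical helper: builds the commands for steps 4 and 5 of the list above.
# Nothing is executed here; the caller decides when to `run` them.
function cmdstan_build_commands(cmdstan_dir::AbstractString;
                                jobs::Integer = 4,
                                make::AbstractString = Sys.iswindows() ? "mingw32-make" : "make")
    # Example target from step 5. On Windows an .exe suffix may be needed;
    # that case is not handled in this sketch.
    example = joinpath("examples", "bernoulli", "bernoulli")
    steps = [
        `$make -j$jobs build`,   # step 4
        `$make $example`,        # step 5
    ]
    # Attach the working directory so the commands run inside the source tree.
    return [Cmd(step; dir = cmdstan_dir) for step in steps]
end
```

Running them would then be something like foreach(run, cmdstan_build_commands(dir; jobs = 9)), with make replaced by whatever binary ends up being shipped.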
I’m trying my luck with it now and will report back if something works. I don’t have much experience with the whole building stuff though, so me giving up is a possible scenario.
Just a quick update: I got a working make binary through BinaryBuilder, but when using clang++ from Clang_jll I get this error:
In file included from src/cmdstan/stansummary.cpp:1:
In file included from src/cmdstan/stansummary_helper.hpp:4:
In file included from stan/src/stan/mcmc/chains.hpp:4:
In file included from stan/src/stan/io/stan_csv_reader.hpp:4:
In file included from stan/lib/stan_math/lib/boost_1.72.0/boost/algorithm/string.hpp:18:
In file included from stan/lib/stan_math/lib/boost_1.72.0/boost/algorithm/string/std_containers_traits.hpp:18:
In file included from stan/lib/stan_math/lib/boost_1.72.0/boost/config.hpp:44:
stan/lib/stan_math/lib/boost_1.72.0/boost/config/detail/select_stdlib_config.hpp:18:12: fatal error: 'cstddef' file not found
I don’t understand enough about compiling C++ code yet, but it seems like boost tries to compile with the wrong standard library or something?
Funnily enough, adding CXXFLAGS+=-stdlib=libc++ to make/local gets boost to compile, but then I get some weird linker error:
/usr/bin/ld: stan/lib/stan_math/lib/boost_1.72.0/stage/lib/libboost_program_options.a(cmdline.o): in function `void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) [clone .constprop.0]':
cmdline.cpp:(.text+0x89): undefined reference to `std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long)'
/usr/bin/ld: cmdline.cpp:(.text+0xb5): undefined reference to `std::__throw_logic_error(char const*)'
Daniel, I wonder if this has to do with the fact that some parts are built with c++ (which I think might be c++11) and some with c++1y. Don’t recall exactly if 1y == 14 or what.
On my system though, using the above CXXFLAGS+=-stdlib=libc++, it seems to compile, build and run, so I wonder if the make artifact and in particular the Clang_jll artifact are identical. Do you mind sharing your setup?
Also, I am on Big Sur and have found in some cases that the Xcode tools work better.
Also noted below message during the build:
The Boost C++ Libraries were successfully built!
The following directory should be added to compiler include paths:
The following directory should be added to linker library paths:
Do you see those in your make build output? (cmdstan.25 is just my test version/location)
The steps I’m using right now are
using LLVM_full_jll
LLVM_full_jll.clang() do exe
    run(`make CXX=$(joinpath(LLVM_full_jll.PATH, "clang++")) build`)
end
So I’m just using the system make right now. I’m going to try building some other programs to see if it is a CmdStan-specific issue.
It’s interesting that you can get it to run with CXXFLAGS+=-stdlib=libc++. Maybe it has something to do with clang++ already being the default compiler on macOS.
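If it helps with reproducing this, the flag can be written to make/local from Julia before calling make. A minimal sketch, where write_make_local is a made-up helper name and cmdstan_dir is assumed to point at a CmdStan source tree. Whether -stdlib=libc++ is the right setting on a given platform is exactly the open question here, so the default below is only the value mentioned in this thread.

```julia
# Hypothetical helper: (over)writes <cmdstan_dir>/make/local with the given lines.
function write_make_local(cmdstan_dir::AbstractString;
                          settings::Vector{String} = ["CXXFLAGS+=-stdlib=libc++"])
    path = joinpath(cmdstan_dir, "make", "local")
    mkpath(dirname(path))            # create make/ if it does not exist yet
    open(path, "w") do io
        for line in settings
            println(io, line)        # one makefile setting per line
        end
    end
    return path
end
```

Keeping the settings as a plain vector of strings makes it easy to try different flags per platform without editing the file by hand.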
Unfortunately it takes me pretty long to figure everything out (never did any C or C++ programming), but I have not lost hope yet.
After some experimenting I got stuck at exactly the same point (code in goedman/InstallCmdStan.jl on GitHub, a test to see if cmdstan can be used as an artifact). It looks like ‘borrowing’ clang++ from LLVM requires additional updates, maybe even to the cmdstan makefile?
I’ve also done some more digging. It seems like CmdStan does not like an absolute path for CXX, but this is fine because running LLVM_full_jll.clang() do exe adapts the PATH to the custom clang. Adding
make/local does work for me when compiling a Stan model (progress!), but it fails when compiling Stan utilities like
The issue is that this still requires a system libc++, which may not be available.
I still have some ideas to try out, but maybe we just have to call in the cavalry and do a post on the Stan discourse and just see what they have to say.
Great work, here. Thanks to you all. I look forward to what this effort brings.
I can’t speak to all of the Stan tools/utilities you might be interested in, but I have much of stansummary written in Julia, albeit untested, here. The code could use some attention and some cleaning up, but it’s a good start. This could be a reasonable path forward if you all can’t get stansummary to compile correctly.
This isn’t the way much of the Stan team wants interfaces to go, each interface coding up their own sample statistics tools, but it is the way the R and Python interfaces currently operate.
Let me know if this code is of any interest. I’d be happy to help out the StanJulia effort.
On my system, using LLVM’s clang, it never compiles anything that needs a library (e.g. stdio.h, iostream). I wonder what’s the best way to turn this question into a MWE, maybe point to InstallCmdStan.jl?
Very interesting, I hadn’t seen your work. From “albeit untested” I assume you haven’t compared it to stansummary or the read_summary(...) DataFrame in StanJulia? Or the corresponding MCMCChains.jl summary? If we can compile and run a Stan model, I agree with you, we’re 90% there (although I think we should target 100%).
A little embarrassed to admit it, but I hadn’t seen MCMCChains.jl before. That looks great. Thanks for pointing it out.
Agreed, 100% is a good goal.
Hi @roualdes, I might have another use case for your code. In StanSample.jl (and probably a few other packages in the Stan…jl collection in StanJulia), for the upcoming v3, I’m switching to NamedTuples as the default output from read_samples(). Combined NamedTuples are just a nicer way to handle multilevel models. ‘Combined’ means that if Stan returns a.1, a.2, a.3, ... as multilevel parameters, the NamedTuple contains the Vector or Matrix a. Your code might be useful to create a stansummary from such a NamedTuple…
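To make ‘combined’ concrete, here is a rough sketch of the idea. It is not the StanSample.jl implementation: combine_columns is a hypothetical name, and it assumes the draws arrive as a matrix with one column per Stan output name and that indexed parameters use a single a.i suffix.

```julia
# Hypothetical sketch: group columns named a.1, a.2, ... under one key :a.
function combine_columns(colnames::Vector{String}, draws::AbstractMatrix)
    groups = Dict{Symbol, Vector{Int}}()   # base name => column indices
    order = Symbol[]                       # base names in first-seen order
    for (j, name) in enumerate(colnames)
        base = Symbol(first(split(name, '.')))
        if !haskey(groups, base)
            groups[base] = Int[]
            push!(order, base)
        end
        push!(groups[base], j)
    end
    vals = map(order) do base
        cols = groups[base]
        # A single column stays a Vector of draws; several become a Matrix.
        length(cols) == 1 ? draws[:, cols[1]] : draws[:, cols]
    end
    return NamedTuple{Tuple(order)}(Tuple(vals))
end

colnames = ["lp__", "a.1", "a.2", "a.3", "sigma"]
draws = reshape(collect(1.0:20.0), 4, 5)   # 4 draws, 5 columns of dummy values
nt = combine_columns(colnames, draws)
```

With this dummy input, nt.a is a 4×3 Matrix (draws by index) and nt.sigma is a plain Vector, which is the shape a summary function would have to handle.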
Sounds good. I’ll file an issue at StanSample.jl, and we can discuss further.
I am wondering if you found a solution for this. I need Stan for a CI script in StanRun.jl, and building it is quite time-consuming.
Not sure if your question is for Daniel or me, but I have not found a solution other than in the GitHub CI workflows, e.g. for Stan.jl, which is basically what you came up with long ago.
Recently there was another attempt to turn cmdstan into an artifact, but I think that also stalled.
I am currently working on this, there’s an open PR for CmdStan on Yggdrasil. I just finished up GNUMake to bundle with it. I’m circling back to finish up CmdStan in the next few days.