Using cppyy for julia/C++ interoperability

Browsing the julia universe, I found interpolations.jl, which offers b-spline interpolation. I am the author of a comprehensive C++ b-spline library, so I thought I’d post a greeting issue to the colleagues and point them to my work. In this issue, a discussion about interfacing C++ and julia ensued: It transpired that usage of C++ in julia currently has unresolved issues, like the unavailability of Cxx.jl for newer julia versions. This gave me the idea to wrap the experimental python module I made for my library instead. This module is wrapped using cppyy. Importing the python module with pycall worked, and shortly after this initial success it dawned upon me that importing cppyy itself into julia would be a viable option to interface arbitrary C++ code with julia.

I posted my findings and ideas to the issue mentioned above, and there was a vivid echo, including a few posts from cppyy’s author, whom I’d also written to, to tell him about what I tought was an exciting discovery. I’d like to invite you over to the issue on Interpolation.jl’s tracker to see how the discussion went so far - I’ve written this here post following the suggestion of @mkitti. This site also seems to be a more appropriate location for the discussion than Interpolation.jl’s issue tracker where it happened to originate.

Just to give you a rough idea: cppyy is a well-established package based on cling, which offers a wide selection of tools for python/C++ interoperability. Other languages able to bind python code seem to have found it an option to interface with C++ (wlav mentions Perl and Go), so my idea wasn’t really novel, it’s just that noone seems to have thought of it on the julia side. I think it might be an interesting route to explore - even without any further integration, the power of this approach shows with a simple three-liner, which I quote here:

julia> using PyCall; cppyy = pyimport("cppyy")
PyObject <module 'cppyy' from '.../python3.8/site-packages/cppyy/__init__.py'>

julia> cppyy.cppdef(
        """#include <iostream>
            template<typename T>
            void blah ( const T& arg )
            { std::cout << "hello " << arg << std::endl ; }""")
true

julia> cppyy.gbl.blah(1.234)
hello 1.234

Note the RTTI triggered by the call’s argument, which is quite a mouthful for a wrappee, and the inclusion of the entire iostream.h header, which, incidentally, makes it’s entire content usable by the wrapper, even though this isn’t exploited here. As wlav pointed out, cppyy offers far more than just wrapping C++, he mentions

cross-language inheritance and callbacks (crucial for component frameworks, e.g. GUIs), basic automatic memory management, transparent smartptr handling, and in upstream’s use automatic I/O

and in another post he invites julians to cooperate. I hope this may trigger a mutually beneficial cooperation. I wrote an initial posting here which has proper cross-references to everything, but the site’s policy rejected it because it has more than two links. I posted the fully xreffed version to the item on Interpolations.jl, sorry for the inconvenience.

With regards

Kay F. Jahnke

15 Likes

Does that mean it’s easy to include Python dependencies?

No.

People have done what they can with conda.jl and others, but the short answer is that python should be avoided when not strictly necessary.

Even the new vscode jupyter notebook stuff will have no python anywhere in the stack.

1 Like

Asking as someone without any C++ experience - what is the advantage of Cxx.jl compared to the built-in ccall?
For me it seems that the unavaliability of Cxx.jl is not a showstopper, there is a large number of wrapped C/C++ libraries available in Julia.

2 Likes

It’s not possible to call arbitrary C++ routines, e.g. with templates or classes or static overloading, using ccall — you really need to hook into the C++ compiler somehow to get the correct ABI.

Cxx.jl is similar to cppyy in that it accomplishes this by invoking Clang at runtime. (cppyy does so via Cling.)

Alternatively, you can wrap a plain C interface around your C++ code and call that. That’s what e.g. CxxWrap.jl does (and is also how SWIG and some other wrapper generators in Python work).

8 Likes

cling is very impressive. They’ve also recreated a fast, interactive environment for C++ like Pluto.

I definitely prefer julia’s syntax to C++. However, Vassilev et al. has created an excellent environment for rapid development using existing high-performance C++ libraries.

1 Like

I knew I was gonna see ROOT…sigh*

1 Like

Manpower is all you need.®

3 Likes

exactly, we lack manpower! so imagine if we were not writing, whatever this is:

> ls
awkward-1.0  numba  numpy
> gocloc .
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                        1559         102441         118551         383907
C                              163          40681          70887         165182
C++                            319           7496           1370          93560
ReStructuredText               511          25159              0          84318
C Header                       286           7929          11588          43236

(for comparison, Julia base + all stdlib is 300k)

(ROOT is an amazing and mega project that has 2832913 line of C++ and 253056 line of C, excluding headers which are an extra 51651+899063. Honestly, all Julia pkgs I ever used probably don’t even add up to that

4 Likes

Just chiming in here as somebody who has wrapped a couple of HEP C++ libraries. I’ve used CxxWrap, because it works. I actually tried in Cxx first, because it looked like it had a significantly lower threshold, but I just got stuck.
Often I’m just a casual user and just want to try things out. In that case, implementing a CxxWrap binding is too much work for me, but if I could just use Cxx or cppyy, I’d try it.

It’s clear that for new Julia packages, the easiest is to use pure Julia, but for interfacing legacy packages, @jling’s language breakdown points exactly to the issue: It’s a lot of work replacing all of that code, so any packages that gets us part of the way there is a win in my book.

5 Likes

@jlperla: I think that simply ‘avoiding python’ is defeatist. Look at it as an opportunity instead. To get back to my b-spline library, the python wrapper I did with cppyy adds a fair amount of functionality, namely the pythonization layer. With this layer (and cppyy in the back) I can, at run time, pass template arguments to the construction of objects, and cppyy instantiates templates it hasn’t yet ready and the JIT creates code which is as fast as precompiled binary. The python layer is genuinely useful - if it weren’t there I had to write this part in julia to produce the same functionality, which would require an ‘intimate’ wrap with deep ties to the C++ code.

The different layers are quite well decoupled: the (julia) caller can set up arrays and parameters and then calls into the python layer. The python layer inspects the parametrization and builds code if necessary, then it delegates to the C++ layer which eventually does the number-crunching. It seems like a good way to split up the work, and the wrap of the already functional python module in julia can take over the python functionality pretty much 1:1, because in julia you wouldn’t really want a different interface.

So for the concrete example, and to answer @jstrube’s post, I’ll simply go ahead and see if it works - beyond my initial proof of concept. All the yes-butting might be silenced by a pudding proof. Succeeding, this is certainly much more exciting than just yet another b-spline module. If using cppyy doesn’t work for every C++/python module, then the code and/or tools can be worked over until they do.

Note that you will need my bspline module and your python3 must have cppyy installed for this to work. Here we go:

julia> using PyCall

julia> bspline = pyimport("bspline")
PyObject <module 'bspline' from '/home/kfj/.local/lib/python3.8/site-packages/bspline/__init__.py'>

julia> src=bspline.make_array((1000,1000),"double",3)
PyObject <bspline.array object at 0x7f9f66d3c460>

julia> trg=bspline.make_array((1000,1000),"double",3)
PyObject <bspline.array object at 0x7f9f66d3fb50>

julia> bspl=bspline.bspline("double", 3, 2, [1000,1000])
PyObject <bspline.bspline object at 0x7f9f66d3fbb0>

julia> set!(src, (5,5), (1.0,2.0,3.0))
(1.0, 2.0, 3.0)

julia> bspl.prefilter(src)

julia> bspline.index_transform(bspl,trg)

julia> get(trg,(5,5))
(0.9999999999999997, 1.9999999999999993, 2.9999999999999996)

This is a demonstration that the higher-level functions in the bspline module can be accessed from julia. I’ve resorted to creating the source and target arrays with the bspline module for this example - the arrays are normal numpy arrays (you can get this ‘aspect’ by accessing src.as_np, for example). If one were to ‘julianize’ the module, suitable array-to-array conversion would be nice-to-have, especially when it comes to multi-channel data.

So I create two arrays src and trg, each with a million tuples of three doubles. To make sure that the spline ‘works’ I assign (1,2,3) to the tuple at coordinates (5,5), which would be (6,6) in julia. I also create a bspline object, passing the number of channels (3), the dimensionality (2) and the shape (1000,1000). Note that I might have passed more arguments to, e.g. to get a spline of different degree (up to 45). The magic here is that the data type, number of channels and the dimensionality, which are template arguments in C++ (and therefore have to be known at compile-time) are run-time arguments to the python module, and are used to create suitable code on-the-fly if necessary.

Next I call ‘prefilter’, passing the ‘src’ array. This takes the values in src as knot point values for the spline, performs the prefiltering, and deposits the spline coefficients inside the b-spline object bspl, where they are held during the object’s scope, and can be accessed by referring to, e.g., bspl.core.as_np, which exposes them as a numpy array.

Now I call index_transform, passing the spline as a functor (the spline’s built-in default evaluator will be used) and trg, the target array. index_transform passes discrete indices into the target array to the functor passed as it’s first argument and writes the result to the target at that very index. This function is not a common feature, but it’s an ideal strting point for stencil code, where you fill a target array by means of a functor which is invoked with the target coordinates and takes it from there (an inverse transformation).

Finally, I expect that the index_transform should have deposited the evaluations of the spline in the target array, so I expect to find the tuple (1,2,3) at trg[6.6], which is indeed the case, minus a bit of error which has accrued along the way. I refrain from showing the result for the other 999999 tuples, they should all be near-zero :wink:

Beyond the mere proof of viability, this little demonstration should reveal that the process is quite fast - after all the arrays involved are reasonably large (you may have to repeat the process to see the true speed). The python layer has added benefits like accessing the module’s documentation, try:

py"help"(bspline)

Which is not what you would get “for free” when wrapping the C++ code - I hope this shows that having the python layer is indeed an added benefit. What you will notice is that the array handling is a bit clumsy, but this is due to my inadequacy using julia - NumPy arrays and julia arrays are compatible at a deep level (via python’s array interface), the trouble is to get the parametrization right, hence my ‘chickening out’ and using the bspline module to create suitable arrays. As these arrays are transparently accessible by julia, this is not really a problem either.

To conclude this lengthy post, and as an answer to @jstrube, I think that adding a little julia wrapper to ‘grok’ the python module would be much less work than producing an equivalent by wrapping the C++ code. How about giving that a shot? I’d also like to invite mkitti (I’d like to invite you via an at reference, but julialang.org forbids me to mention more than two other users because I’m new) , who has kindly offered to help wrapping my code for julia to consider the path via python - it may be a bit circuitous and pull in a lot of collateral code, but the result is extremely versatile and fast, as I hope to have demonstrated above - and it’s almost there. Wrapping the C++ code would - as it stands - not give us run-time template instantiation, all we could hope for is to wrap already instantiated templates with CxxWrap with additional intrusion into the C++ code - or waiting for Cxx to maybe become available. And with a julia wrap of the python module, a rich body of code would be immediately available to all other julia code requiring cardinal b-splines. I don’t know of another package with this scope.

1 Like

What are the installation instructions for Python?

There appears to be another bspline package in PyPI:

This is exactly the weakness of this approach.

The Python interoperability from Julia is a great tool that I use all the time at work, both with internal and public packages. It has the drawback that you need to manage the Python packages but it’s only marginally worse than using Python packages from a Python program, so it’s manageable. It’s nowhere at the level of a pure Julia solution when it comes to reproducibility and just working aspects though.

So would I use this package from a project (assuming I had a need for splines and it solves the problem better than alternative Julia packages)? If I already had other Python dependencies, almost certainly. If not, maybe, depending on what trade-offs would be involved in using another package.

Would I use this package as a dependency of another public package? Absolutely not. I would never subject my package users to indirect Python dependencies.

In conclusion this approach will produce a package that may be useful, or even very useful. But it will in some aspects be a second class package.

3 Likes

Sorry, but mine is quite experimental and not (yet) in PyPI, it’s the one I’ve linked to in the other thread:

https://bitbucket.org/kfj/python-vspline.

In a nutshell, install vigraimpex-dev from your distro, install NumPy, pip-install cppyy, clone my repo and import the bspline folder in it. In long:

So far I’ve not uploaded my module it to PyPI because I couldn’t quite figure out how to formulate the dependency from the two C++ libraries involved, short of copying (a large part of) them in whole. So the python module requires an install of vigra (libvigraimpex-dev package on debian, this is the C++ library, not it’s python module). If this is problematic, I may be able to add a small set of vigra headers to the python module - all I use from vigra is it’s array handling and small aggregates.

If you want to invoke explicit SIMD code, it will also require the vc-dev package, or better, a build of Vc from source, picking the 1.4 branch and subsequently installing it. Vc is optional - vspline will fall back to it’s own SIMD backend if Vc is not present, which is a bit slower.

On top of that, cppyy needs to be installed on your system, I’d recommend pip-installing that into your python3 environment or using a virtualenv for it - cppyy is still evolving, and getting the latest via pip will probably be miles ahead of what your distro offers. And, of course, you’ll need NumPy.

Then you can git-clone the python-vspline repo to wherever you like; if you want to have the python module ready to import, either run your session from the repo’s root or copy the ‘bspline’ folder to where your python3 can import it. To work out if the module is readily accessible, try

import bspline

in a python3 session (if your python has the ‘bspline’ module from PyPI, rename the ‘bspline’ folder to something else and import that, or uninstall the module from PyPI first), and if this succeeds, my example julia session should also work.

The python module is configurable: see config.py in the bspline folder, where you can set a few options in the first few lines. As it comes from the repo, it will not require Vc, and it won’t rely on precompiled binary code, which you might obtain by running the makefile in the bspline folder, which will deposit the shared libs right there. Precompiled binary will reduce warm-up times, slotting in machine code rather than JITting the C++ code with cling. Use of Vc will be faster, but not dramatically so - you might reserve that for speed-testing if you want to see vspline’s full potential.

Sorry for the complicated setup, consider that ‘teething problems’. If my instructions don’t work for you, please get back to me.

I agree that setting this up is not comfortable, and my code is still experimental. Nevertheless I’ve worked quite hard at reducing the ‘pain’, and I hope to get it honed to a point where a simple ‘pip3 install vspline’ will do the trick, grabbing all the necessary dependencies. My package hasn’t generated much interest yet, so I’ve not invested overly much energy in making it comfortable to access. I don’t know how good julia’s pip integration is - this might be something a julia vspline package could simply do during setup, failing if pip can’t be had - I trust julia can invoke other programs to the effect. Having the module in pip would automatically draw in cppyy as a dependency, and as I have explained in my response to @mkitti, I could include a bunch of vigra headers to avoid the need to install vigraimpex-dev.

So, yes, accessing it is second class right now. But if you have read my post closely, you’ll have to agree that the benefits are absolutely first class. And the question is, how can we leverage the first-class content? Can we maybe make it painless to even the casual user? I have proposed to use cppyy to directly wrap C++ code, but the follow-up post should make a good point for using a python layer in between.

What I’m most interested now is to compare my code with pure julia performance-wise. I have not managed so far to pass julia arrays in and create the arrays python-side (maybe it’s due to the index ordering, I’m still trying to figure it out). My challenge is there: erect a cubic b-spline over a 1000X1000 array of tuples of three doubles, and evaluate it at 1000000 coordinate pairs, storing the result in another array. I am not making any exotic demands - I could ask for a heptic b-spline over a 5D array of C++ long doubles - so that extant b-spline code in julia can compete. I expect that for this simple example julia should perform similarly well, but that remains to be seen.

vspline has seen several man-years of development and it’s very comprehensive and fast. I think it could be a good thing to have in julia, and it may serve as a pointer at how to achieve the wrap of a complex template metacode construct via cppyy, without sacrificing too much functionality because it’s ‘hard to wrap’.

… and, finally, I’ve replied to both @GunnarFarneback and @mkitti, but julialang.org has notified me that it has flagged my postings as spam and hidden them, because they look like ‘advertisements’. If you can’t see my posts, please let me know. I’m sorry I have to refer to my work with hyperlinks, but I can’t really think of how else to do it.

At the moment they are hidden by default but you can click to view them. I’m sure this is something the discourse moderators can and will fix by marking it as a false positive, and I expect they were automatically notified of the flagging.

Okay, cool, that’s good enough I suppose. Thanks for getting back to me!

I very narrowly mean “avoiding python” as an interoperability layer if it isn’t strictly necessary. Python itself has many great uses, and I spend half my time using it. But at any point in time, there is a 50/50 chance that my conda setup is broken or outdated. When I am working on a python project it is annoying but fine because I can setup virtual environments, etc. to make things reproducible. Then my CI instantiates that environment/requirements.txt, etc. and it is pretty reproducible. But anyone who has done this with python called from Julia knows how fragile it is to have both working on machines, continuous integration servers, etc. This is all python setup and environment fragility and has nothing to do with Julia.

All of that is worthwhile if there is a package which is inherently connected to python which makes that burden worthwhile. But this package is for splines, and there are already lots of packages in Julia which have splines (even if they aren’t as good as yours). If yours is better, then I would love to try it out, but not if it means adding in a whole python environment and its ensuring maintenance/reproducibility issues just to do so.

Anyways, that is my last comment on this. I would absolutely love to have another spline/interpolation package in the julia ecosystem, and I really hope your wrapper is successful.

Yeah, I’m sorry about that. We’ve rejected the automated flags and I’ve added bitbucket to our list of allowed domains alongside GitHub and gitlab since we’re happy for folks to post their code here (even if they’re relatively new) so this shouldn’t happen again for others — and I’d expect you’ll rapidly hit the requirements for the next trust level which will completely remove these limits.

Thanks for coming here to share this!

4 Likes