Ccall c++ sort vector of String

Hi,

Would someone be able to provide an example of how to:

1/ pass a vector of strings to ccall
2/ sort the vector of strings using C++ std::sort
3/ get the sorted string vector back in Julia

Thank you


C++ is really hard to call from any language other than C++ unless you write C-callable wrappers. That’s what CxxWrap.jl does. You can’t use ccall directly with C++, only with C-like (extern "C") functions. See e.g. this talk.

There was an experimental package to call C++ directly from within Julia by invoking the C++ compiler (LLVM) as needed, called Cxx.jl, but it is currently defunct. See e.g. this talk.
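To make the extern "C" point concrete, here is a minimal sketch of a C-callable wrapper around a C++ function. The names `cpp_mean` and `c_mean` are made up for illustration; the idea is only that the wrapper exposes plain C types and an unmangled symbol that a ccall could look up.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Ordinary C++ function: it takes a C++ container, so its symbol is
// mangled and its argument type has no C equivalent.
double cpp_mean(const std::vector<double>& v) {
    if (v.empty()) return 0.0;
    return std::accumulate(v.begin(), v.end(), 0.0) /
           static_cast<double>(v.size());
}

// C-callable wrapper (hypothetical name): only plain C types cross the
// boundary, and extern "C" keeps the symbol name exactly "c_mean".
extern "C" double c_mean(const double* data, std::size_t n) {
    std::vector<double> v(data, data + n);  // copy into a C++ container
    return cpp_mean(v);
}

// Small self-check showing how a C caller would use the wrapper.
inline void check_c_mean() {
    const double xs[] = {1.0, 2.0, 3.0, 6.0};
    assert(c_mean(xs, 4) == 3.0);
}
```

From Julia, a function like this could then be reached with something along the lines of `ccall((:c_mean, "libmean"), Cdouble, (Ptr{Cdouble}, Csize_t), xs, length(xs))`, where `libmean` is a placeholder library name.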


R via Rcpp makes it very easy to call C++. I think it would actually take no more than 5 lines of code to do. I know of Cxx.jl and CxxWrap.jl, but they are either not maintained, not compatible with the latest version of Julia, or difficult to use (from my point of view). I actually managed to call simple C++ functions that return a double or an int by using extern "C", but I can’t figure out how to do slightly more complex things like processing a vector of strings.

You piqued my interest because I’ve experienced the pain of wrapping C++ code for other languages in the past, so I looked it up. https://cran.r-project.org/web/packages/Rcpp/vignettes/Rcpp-modules.pdf describes several ways that Rcpp can generate those C-callable wrappers mentioned by @stevengj. All require explicit code modifications.


Yes and it remains very easy to read.
Check this gallery of examples:

A piece of art, I wish something like this existed for Julia.

Actually, we are using CxxWrap.jl to embed satellite computer software inside a Julia simulation. To be honest, it was very easy to do. You just need to create wrapper functions to send/receive data from C++ and everything just works. It may be different from what you are used to in R, but Julia is not R. I strongly suggest that you set Rcpp aside for a moment and try to understand the design concept of CxxWrap. It is amazing!


If you’re not gonna wrap a large C++ library, exposing a C-callable function is fair enough.

@Gnimuc Ultimately the goal is to be able to wrap a large C++ library, but given that I can’t even call a simple C++ function I don’t see how I am going to do that. I usually try to start small and make my way up.

@Ronis_BR, I certainly understand that Julia is not R, and I would be happy to believe you when you say that CxxWrap.jl is amazing; unfortunately my experience tells me otherwise. I don’t know, maybe I am too dumb to understand how to use it. Maybe you can educate me?

I think explaining how to solve the problem in my original post would be a good step towards that…

At this moment (Julia 1.7+), there are three options:

  1. Manually write and build a C wrapper library over that C++ library and directly use the C library in Julia.
  2. Use CxxWrap.jl. (If you don’t know how to use it, you could ask questions about which part you don’t understand on Discourse/GitHub/Slack.)
  3. Use CxxInterface.jl. (This is a new package with a much simpler design.)
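For option 1, a common pattern is to hide each C++ object behind an opaque pointer and export free functions with extern "C" linkage. The following is only a sketch under that assumption: `StringBag` and the `bag_*` functions are invented names standing in for a real library's classes and for the wrapper you would write around them.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Stand-in for a class from the C++ library being wrapped (hypothetical).
class StringBag {
public:
    void push(const std::string& s) { items_.push_back(s); }
    void sort() { std::sort(items_.begin(), items_.end()); }
    std::size_t size() const { return items_.size(); }
    const std::string& at(std::size_t i) const { return items_.at(i); }
private:
    std::vector<std::string> items_;
};

// C wrapper: the object crosses the boundary only as an opaque pointer.
extern "C" {
void* bag_new() { return new StringBag(); }
void bag_free(void* bag) { delete static_cast<StringBag*>(bag); }
void bag_push(void* bag, const char* s) { static_cast<StringBag*>(bag)->push(s); }
void bag_sort(void* bag) { static_cast<StringBag*>(bag)->sort(); }
std::size_t bag_size(void* bag) { return static_cast<StringBag*>(bag)->size(); }

// Returns nullptr when i is out of range, so no exception escapes into C.
// The pointer is owned by the bag; the caller should copy the string out.
const char* bag_get(void* bag, std::size_t i) {
    auto* b = static_cast<StringBag*>(bag);
    return i < b->size() ? b->at(i).c_str() : nullptr;
}
}

// Usage as a C caller would see it; strings may differ in length.
inline void check_bag() {
    void* bag = bag_new();
    bag_push(bag, "zxy");
    bag_push(bag, "abc");
    bag_push(bag, "a longer string");
    bag_sort(bag);
    assert(bag_size(bag) == 3);
    assert(std::strcmp(bag_get(bag, 0), "a longer string") == 0);
    assert(std::strcmp(bag_get(bag, 2), "zxy") == 0);
    assert(bag_get(bag, 3) == nullptr);
    bag_free(bag);
}
```

The design choice here is that Julia never needs to know the layout of `StringBag`; it only holds a `Ptr{Cvoid}` and passes it back to the wrapper functions.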

I’m also working on another package called ClangCompiler.jl. At this moment, you can use it to compile a cpp source file with a C wrapper and ccall the C API within Julia.

julia> using ClangCompiler

julia> using ClangCompiler.LLVM

julia> args = get_compiler_args();

shell> cat test.cpp
#include <algorithm>  // std::sort, std::copy
#include <iostream>
#include <string>
#include <vector>

#ifdef __cplusplus
extern "C" {
#endif

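// Sorts exactly three strings of equal length and overwrites the
// caller's buffers in place with the sorted contents.
int std_sort_3_strs_with_same_length_c_wrapper(char *x[]) {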
  std::vector<std::string> strs;
  strs.emplace_back(x[0]);
  strs.emplace_back(x[1]);
  strs.emplace_back(x[2]);

  std::sort(strs.begin(), strs.end());

  std::copy(strs[0].begin(), strs[0].end(), x[0]);
  std::copy(strs[1].begin(), strs[1].end(), x[1]);
  std::copy(strs[2].begin(), strs[2].end(), x[2]);

  return 0;
}

#ifdef __cplusplus
}
#endif

julia> cc = CXCompiler(IRGenerator("./test.cpp", args), LLJIT(;tm=JITTargetMachine()));

julia> link_process_symbols(cc)

julia> compile(cc)

julia> addr = lookup(cc.jit, "std_sort_3_strs_with_same_length_c_wrapper")
OrcTargetAddress(0x00000001280db000)

julia> @eval sort3strs(x) = ccall($(pointer(addr)), Cint, (Ptr{Ptr{UInt8}},), x)
sort3strs (generic function with 1 method)

julia> x = ["zxy", "abc", "def"]
3-element Vector{String}:
 "zxy"
 "abc"
 "def"

julia> sort3strs(x)
0

julia> x
3-element Vector{String}:
 "abc"
 "def"
 "zxy"

julia> dispose(cc)
Ptr{LLVM.API.LLVMOpaqueError} @0x0000000000000000

In theory, option 1 will always work. But it sounds like you don’t have much experience with calling a C function from Julia via ccall, so I’d like to explain a little more here:

I think you could find the answer in the official Julia doc.

https://docs.julialang.org/en/v1/manual/calling-c-and-fortran-code/#man-bits-types

As the input argument type of ccall in the first step should be C-compatible, you need to find a way to convert a C char* array to a C++ container type that can be passed to std::sort.

Basically, this is nothing new. The sorted results cannot be returned directly to Julia, so you need to find a way to convert the result and move it back to Julia.
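The three steps above (C array in, C++ container, result back out) can be sketched without the fixed-count, same-length restriction of the earlier example. This version reorders the array of pointers instead of overwriting the string bytes, so strings of any length work. The name `sort_c_strings` is hypothetical, and the sketch assumes NUL-terminated strings.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Sorts an array of n C strings by reordering the pointers in place.
// Returns 0 on success and -1 on a null array with n > 0.
extern "C" int sort_c_strings(const char** strs, std::size_t n) {
    if (strs == nullptr && n > 0) return -1;

    // Step 1: convert the C array into a C++ container.
    std::vector<const char*> ptrs(strs, strs + n);

    // Step 2: sort with std::sort, comparing string contents byte-wise
    // rather than comparing the pointer values themselves.
    std::sort(ptrs.begin(), ptrs.end(),
              [](const char* a, const char* b) { return std::strcmp(a, b) < 0; });

    // Step 3: move the result back into the caller's array.
    std::copy(ptrs.begin(), ptrs.end(), strs);
    return 0;
}

// Usage from the C side, with strings of different lengths.
inline void check_sort_c_strings() {
    const char* xs[] = {"pear", "fig", "banana", "apple"};
    assert(sort_c_strings(xs, 4) == 0);
    assert(std::strcmp(xs[0], "apple") == 0);
    assert(std::strcmp(xs[1], "banana") == 0);
    assert(std::strcmp(xs[2], "fig") == 0);
    assert(std::strcmp(xs[3], "pear") == 0);
}
```

On the Julia side, one would presumably pass a `Vector{Ptr{UInt8}}` built from the strings, keep the original strings alive with `GC.@preserve` during the call, and read each reordered pointer back with `unsafe_string`. That part is an assumption about how the caller is written, not something shown in this thread.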


First you need to create the C++ file that does what you want and expose it to Julia. Let’s call it sort.cpp:

#include <algorithm>  // std::sort
#include <string>
#include <vector>

#include "jlcxx/jlcxx.hpp"

void sort_string(std::vector<std::string>& v)
{
    std::sort(v.begin(), v.end());
}

JLCXX_MODULE define_julia_module(jlcxx::Module &mod)
{
    // This exposes the function `sort_string` to the module.
    mod.method("sort_string", &sort_string);
}

Now, you need to compile it. On my machine, I can run the following command:

clang++ \
    --std=c++17 -shared -undefined dynamic_lookup \
    -o libsort.dylib \
    -I../../.julia/artifacts/c598c0cf1873e3cacf620287193aaf12bf5f8ace/include/ \
    -I/Applications/Julia-1.7.app/Contents/Resources/julia/include/julia -L/Applications/Julia-1.7.app/Contents/Resources/julia/lib/Julia \
    sort.cpp

In your case, you will need to change the paths. After that, I get the wrapper library libsort.dylib.

Now, we need to create a module in Julia that contains the wrapped functions. This is very simple and we will handle the type conversions between Julia arrays and std::vector internally:

module Sort
    using CxxWrap

    @wrapmodule("libsort")

    function sort(v::Vector{String})
        # We need to convert the Julia vector into the
        # `std::vector<std::string>`, which is already wrapped by CxxWrap.
        svec = StdVector(CxxRef.(StdString.(v)))

        # Call the function that will sort the vector using `std::sort`.
        sort_string(svec)

        # Now, just convert the sorted vector back to a Julia array.
        return String[svec...]
    end

    function __init__()
        @initcxx
    end
end

Finally, you just need to include this module and the sort can be performed as follows:

julia> v = ["C", "X", "X", "W", "R", "A", "P"]
7-element Vector{String}:
 "C"
 "X"
 "X"
 "W"
 "R"
 "A"
 "P"

julia> Sort.sort(v)
7-element Vector{String}:
 "A"
 "C"
 "P"
 "R"
 "W"
 "X"
 "X"

Of course, this example is just the simplest way I found to do what you described. There are probably a hundred alternatives that are better and faster.


For both R and Julia the foreign function interface, .Call in the case of R and ccall for Julia, uses C semantics. In the case of R the .Call interface is rather simple because only one type of object can be passed as an argument or returned as a function value. This object is a pointer to a C struct called SEXPREC - see section 5.10 of the “Writing R Extensions” manual. This C struct contains a union (in the C language sense), which is what makes different types of vectors in R possible.
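As a rough illustration of the "one struct type containing a union" idea, here is a deliberately simplified tagged union. It is not R's actual SEXPREC definition; the `Kind` tags, the `Value` struct and `sum_value` are all invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type tags; R's real type codes differ.
enum class Kind { Integer, Real };

// One struct type for every value: a tag, a length, and a union of
// payload pointers, only one of which is meaningful at a time.
struct Value {
    Kind kind;
    std::size_t length;
    union {
        const int* ints;
        const double* reals;
    } data;
};

// A function with a single argument type can still handle several
// vector types by checking the tag before touching the union.
double sum_value(const Value& v) {
    double total = 0.0;
    for (std::size_t i = 0; i < v.length; ++i) {
        total += (v.kind == Kind::Integer) ? v.data.ints[i] : v.data.reals[i];
    }
    return total;
}

inline void check_sum_value() {
    const int is[] = {1, 2, 3};
    const double ds[] = {0.5, 1.5};
    Value a{Kind::Integer, 3, {}};
    a.data.ints = is;
    Value b{Kind::Real, 2, {}};
    b.data.reals = ds;
    assert(sum_value(a) == 6.0);
    assert(sum_value(b) == 2.0);
}
```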

Rcpp, which as you say is an impressive piece of software, provides C++ wrappers around the various types of SEXPREC and interfaces with parts of the R environment to provide compilation, linkage, etc., more or less automatically. But eventually every package using Rcpp ends up exposing functions with C calling semantics to the R runtime.

My point here is that the majority of the work for interfacing takes place on the C++ side, because the objects that can be passed back and forth through .Call are relatively simple.

By contrast ccall in Julia can handle pretty much any interface that can be specified in a C header file. You don’t need to write, compile and debug “glue” code to rearrange the arguments from the Julia side to those needed from the C side. You just dlopen the library and call a function with C semantics.

Consider the RCall package in Julia. It allows you to run and interact with an R process from within Julia. The package is 100% Julia code. There is not a single line of glue code written in, say, C. All of the adapting of one language to the approach of the other is performed in the Julia code. And it does this by building a model in Julia of all the externally exposed structs and functions in R. I have, not entirely frivolously, suggested that if someone wants to understand the internals of R they should study the Julia code in RCall - I think it is much easier to understand than the source code for R.

Would it be possible to do things the other way - build a model of the Julia language and run-time system in R? No. (And I am not completely talking off the top of my head here - I have contributed to both R and Julia and even S - the language on which R was based.)

As for creating an “Rcpp for Julia”, the first question is what would it be used for? My impression is that the majority of R packages that use Rcpp do so to speed up code that can’t be easily vectorized. Certainly that was my motivation when I rewrote parts of the lme4 package for R to use Rcpp and RcppEigen. You don’t need to rewrite Julia code in C, C++, or Fortran to speed it up. Often you can get better performance out of the Julia code than you could by rewriting it. The MixedModels.jl package, which is 100% Julia code, is usually much faster than the lme4 package for R. Having been the principal author of both I can tell you that it was much more pleasant developing MixedModels.jl than lme4. I could try things in MixedModels.jl that I wouldn’t have tried in lme4 because it was just too difficult.

The other reason for wanting to interface between Julia and C++ is to gain access to the facilities of an existing and supported C++ library. A lot of effort goes into creating and maintaining such libraries and it may be needless duplication of effort to re-implement such code in a new language. However, experience in the Julia community is that the balance between re-implementing and interfacing is moving more and more in the direction of re-implementing. Several companies and groups like the Climate Modeling Alliance create and maintain Julia code bases of considerable complexity. Consider the Arrow project which provides implementations of a columnar storage format and analysis capability accessible from many different languages. The Julia implementation is the only one that does not rely on the C/C++ library. It implements the Arrow specification in Julia code. (It helps when @quinnj is interested in the project and does the heavy lifting.) This would be too complicated in most other high level languages.

Nevertheless, it would have been possible to interface to the Arrow library through the C interface if we had wanted to do that. The reason that it is difficult to interface with C++ libraries is that the interface looks so weird (name mangling, etc.). As others have suggested, if you have a C++ library that you really want to access from within Julia, the best way is to write a C-callable wrapper. I would say, again not entirely frivolously, that the awkwardness in the interface is not because Julia is deficient but because C++ is weird.


Thank you @Ronis_BR, what is supposed to be in the following folder?

-I../../.julia/artifacts/c598c0cf1873e3cacf620287193aaf12bf5f8ace/include/

I don’t have it on my side.

You’re welcome!

Oops, sorry :sweat_smile: I should have mentioned that. This folder contains libcxxwrap, which is downloaded automatically when you install CxxWrap. You can check your directory by running:

julia -e "using CxxWrap; @show CxxWrap.prefix_path()"

Thanks.
I get the following error when I compile :

sort.cpp:1:10: fatal error: 'jlcxx/jlcxx.hpp' file not found
#include "jlcxx/jlcxx.hpp"
         ^~~~~~~~~~~~~~~~~
1 error generated.

Thank you for the code and links.
I tried to use ClangCompiler.jl but I have the following error:

julia> using ClangCompiler
[ Info: Precompiling ClangCompiler [06fc9500-c033-43bc-8ca2-e20da63309d9]
ERROR: LoadError: LoadError: InitError: could not load library "/Users/gitboy/.julia/artifacts/d0af4887d2b2068d2248af9b298aa9bb9da4d251/lib/libclangex.dylib"
dlopen(/Users/gitboy/.julia/artifacts/d0af4887d2b2068d2248af9b298aa9bb9da4d251/lib/libclangex.dylib, 1): Library not loaded: @rpath/libclang-cpp.dylib
  Referenced from: /Users/gitboy/.julia/artifacts/d0af4887d2b2068d2248af9b298aa9bb9da4d251/lib/libclangex.dylib
  Reason: image not found

Your code is, however, very useful for understanding how one can wrap C++ code and call it from Julia.

@Ronis_BR I figured out what was wrong.
It is working now. Thank you very much for that. Your example is great.
You should consider a PR to CxxWrap to improve the documentation with that example.
It will benefit other people as well.


@dmbates , I’ll try to answer your questions. If I miss some please do not hesitate to tell me.
Q: As for creating an “Rcpp for Julia”, the first question is what would it be used for?
A: It would be used to call existing code or libraries from Julia. Some people might have the skills and the time to reinvent the wheel and test that it actually works like a wheel. I personally do not. Julia itself exists because C/C++ and other projects written in those languages exist. These languages have been around forever and will be around for the next 50 years. So in my view being able to interface smoothly with these languages is key. Julia exists because it is standing on the shoulders of giants.

You: If you have a C++ library that you really want to access from within Julia, the best way is to write a C-callable wrapper.
Me: Yes, that is exactly what I am trying to do; unfortunately I did not find many examples to learn from. I think that the examples from @Ronis_BR and @Gnimuc are very good and should be advertised much more. I actually thank them for sharing their knowledge. I wish there were more examples out there.

You: I would say, again not entirely frivolously, that the awkwardness in the interface is not because Julia is deficient but because C++ is weird.
Me: I am not going to comment on “C++” being weird. I just find your comment poor and sad. I feel sorry that you cannot appreciate how amazing that language is. I believe all languages are unique and amazing in their own way, whether it is R, Python, C, C++ or Julia.

I should have been more careful with my last comment about C++ being weird. I was thinking of the linkage, which is how different languages must communicate with each other. If you have a C or Fortran header file you can determine how to call a function. To do the same with a C++ header file you essentially need to have a C++ compiler, because of name mangling.
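A small sketch of what name mangling means in practice, assuming the Itanium C++ ABI that GCC and Clang use on Linux (other platforms and compilers encode names differently). The function names are made up for illustration.

```cpp
#include <cassert>

// Two C++ overloads share the source-level name "add". The compiler
// encodes the argument types into each symbol; under the Itanium C++ ABI
// these become _Z3addii and _Z3adddd. A foreign caller would have to
// reproduce that encoding to find them.
int add(int a, int b) { return a + b; }
double add(double a, double b) { return a + b; }

// extern "C" turns mangling off, so overloading is no longer possible:
// each function needs a unique name, and that name is exactly what
// appears in the library's symbol table.
extern "C" int add_int(int a, int b) { return add(a, b); }
extern "C" double add_double(double a, double b) { return add(a, b); }

inline void check_add() {
    assert(add_int(2, 3) == 5);
    assert(add_double(0.5, 0.25) == 0.75);
}
```

Listing the symbols of the compiled object file (for example with `nm`) should show the mangled names next to the plain `add_int` and `add_double`, which is why a header file alone is enough for C but not for C++.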


I think it’s pretty much optimal. Note that it is also possible to just continue using the (C++) StdVector and convert to (Julia) Vector only when needed. The trade-off is that conversion to Vector copies the data, but index operations become much faster than on a StdVector. So the best approach depends on what you need to do with the result.


I have tried on a Windows machine and I get the following error.
Any idea how to solve that?

D:\Code\test>clang++ --std=c++17 -shared -undefined dynamic_lookup -o libsort.dll 
-I C:\Users\gitboy\.julia\artifacts\ddaec29ab3e1def8127a1757203ee4a75c1bbead\include\
-ID:\julia-1.7.1-win64\julia-1.7.1\include\julia
-LD:\julia-1.7.1-win64\julia-1.7.1\lib\julia sort.cpp

In file included from sort.cpp:1:
In file included from C:\Users\gitboy\.julia\artifacts\ddaec29ab3e1def8127a1757203ee4a75c1bbead\include\jlcxx/jlcxx.hpp:13:
In file included from C:\Users\gitboy\.julia\artifacts\ddaec29ab3e1def8127a1757203ee4a75c1bbead\include\jlcxx/julia_headers.hpp:15:
In file included from D:\julia-1.7.1-win64\julia-1.7.1\include\julia\julia.h:12:
In file included from D:\julia-1.7.1-win64\julia-1.7.1\include\julia\julia_fasttls.h:14:
  D:\julia-1.7.1-win64\julia-1.7.1\include\julia/dirpath.h:10:9: warning: 'PATH_MAX' macro
    redefined [-Wmacro-redefined]
  #define PATH_MAX MAX_PATH
          ^
  D:\llvm-mingw-20211002-msvcrt-x86_64\llvm-mingw-20211002-msvcrt-x86_64\include\limits.h:20:9:
      previous definition is here
  #define PATH_MAX   260
          ^
  1 warning generated
ld.lld: error: undefined symbol: __declspec(dllimport) jl_symbol
>>> referenced by C:\Users\gitboy\AppData\Local\Temp\2\sort-3bb21c.o: (jlcxx::FunctionWrapperBase& jlcxx::
Module::method<void, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator
<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1
...etc