My application is currently written in Matlab with few very specific and time consuming functions implemented as a MEX extensions in C++. I am thinking about switching to Julia or Python, and here is the problem: how to integrate C++ code with the high-level language like Julia. I presume that a complete example of integration would be interesting for many researchers who need ultimate performance but lack of understanding of low level details.
I would appreciate if somebody provided me with a sample C++ code that implements the following typical scenario:

Suppose an algorithm is implemented as a function within a shared or static library complied with Julia specific settings and headers (example?).

The function takes dense and sparse matrices as an input and returns the result in, say, a dense matrix.

The function should be able to get matrix sizes and reallocate the output matrix if necessary.

The function is supposed to be invoked many times and should be able to overwrite the elements of the external, persistent output matrix directly (once it has a correct size) rather than returning a temporary matrix, in order to void extra reallocations.
The most interesting point is how to traverse the elements of Julia dense and sparse matrices.
The sample code could implement a dense matrix times sparse matrix multiplication, for example.
Say, KxM dense matrix times MxN sparse matrix assuming column major layout.
Thank you in advance.

(A bit off topic: checkout https://github.com/JuliaInterop/MATLAB.jl, which allows you to call matlab from Julia. So you can call your MEX files like this. Probably not the best performance wise but good for testing and a step-by-step migration.)

Are you asking the forum for help in working out how to do this yourself - or offering a salaried assignment with a set of predefined outcomes? I am sure there are quite many developers on the forum who would be interested in taking on a salaried assignment.

I wish I could. We tried different vectorization and parallelization strategies using plain Matlab. However, due to specific nature of the algorithm, C++/Mex was far better solution.

Thatâ€™s why rewriting in Julia is an excellent option. What technical requirements are preventing this?
Is your code available? Otherwise it is basically impossible to provide assistance.

Input:
dense matrices A, C : k x n
sparse matrix B : (m*n) x n

I need to compute the resultant m x k matrix:
result = kron(eye(m), ones(1,n)) * (repmat(transpose(A), m, 1) .* (B * transpose(C)))

The C++ code does not make repmat(â€¦) explicitly but uses a modular arithmetic to
expand the matrix A virtually. Otherwise, explicit construction of a huge
(m*n) x k matrix would be both time consuming (poor caching) and memory wasteful.
Also, the C++ implementation does quite advance parallel job partition, which
avoids large intermediate memory buffers and any inter-thread synchronization,
besides a barrier at the very end.

This why I need direct access to (sparse) matrix elements.

This should be trivial to implement in Julia at +/- the same speed as C++. Just try it and report back with a â€śWhy is this slowâ€¦â€ť post, usually people will be very helpful in getting it tuned.

Julia is made to loop the same way C++ does. Copy the same algorithm over to Julia and you should be fine. It should be a pretty quick translation if there arenâ€™t many library calls in there. It should end up with about the same speed (using similar compiler settings).

The interesting thing about Julia is that you can write things â€ślike C/FORTRANâ€ť but you donâ€™t have to if you donâ€™t want to, and if you do, itâ€™ll compile to very similar (or the same) code as youâ€™d get from those lower level languages.