Julia call from Python3 running in single core

Sorry to bother you. Here are my tests so far:

  1. I have downloaded Julia to my directory /home/harish/julia
  2. I see the above ".so" file in /home/harish/julia/lib/julia; I don't see it anywhere else.
  3. Now I changed your code like this; correct me if I am wrong:
    import ctypes
    JLPATH = "/home/harish/julia"
    jl = ctypes.PyDLL(JLPATH + "/lib/julia/libjulia.so", ctypes.RTLD_GLOBAL)
    jl.jl_init(JLPATH + "/bin/")
    jl.jl_eval_string(""" addprocs(1) """)
    jl.jl_eval_string(""" println(nprocs()) """)

ERROR: System image file "//…/lib/julia/sys.ji" not found

  1. I don't see the above-mentioned file under /usr/lib/ (I am not sure whether it should be under the Julia directory or the system /usr directory)

That workaround probably won’t work on 0.3.

The code above works fine with Julia 0.5 and Python 2.7, but fails in Python 3.4 with the error below.

ERROR: System image file "//…/lib/julia/sys.ji" not found

Can I know the Python version?

Finding out how to add processes when calling Julia from Python could be of interest by itself, but I would reiterate that doing so will not help in fitting a mixed-effects model using the MixedModels package. The package only uses one process.

I did point out in one of the forums to which you posted that much of the work in fitting such a model is dense numeric linear algebra. Each case is a little different about exactly which BLAS or LAPACK routines are called, which is why I asked if you could post the formula for the model and the characteristics of the data. I would still appreciate it if you could do that. The inputFormula in your original post would fail on invalid syntax, I think, and I haven’t been able to see where it gets modified, if it does.

The benefit from having multiple threads is only in the BLAS/LAPACK calls in the evaluation of the objective function to be optimized.
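One way to see whether the embedded process is limiting BLAS threads is to inspect the standard thread-count environment variables before Julia is initialized. This is only a diagnostic sketch; the variable names below are the ones OpenBLAS, MKL and OpenMP conventionally consult, and whether any of them is set depends on your environment.

```python
import os

# Thread-count environment variables consulted by OpenBLAS, MKL and OpenMP.
# If one of these is set low in the Python process, any BLAS library loaded
# into that process (including the one Julia picks up) inherits the limit.
blas_env = {var: os.environ.get(var, "<unset>")
            for var in ("OPENBLAS_NUM_THREADS",
                        "MKL_NUM_THREADS",
                        "OMP_NUM_THREADS")}
for var, value in blas_env.items():
    print(var, "=", value)
```

If any of these is set to 1 in the Python session but unset when Julia runs standalone, that alone could explain a large gap in dense linear-algebra performance.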

Here is a draft of the formula. It's not exactly the same as mine, but similar, with changed column names. I am fitting mixed models with random and fixed effects.

formula = 'volume ~ 1 + logprice + col1 + col2 + col3 + price1 + price2 + price3 + price4 + ( ( 0 + logprice ) | employee ) + ( ( 0 + col1 ) | employee ) + ( ( 0 + col2 ) | employee ) + ( ( 0 + col3 ) | employee )'

My question is: this takes 820 seconds when I execute it in Julia alone, but 2–3 hours when I call the same code from pyjulia. I don't think it's due to an internal parallelism issue (in any mixed-model package). It may be due to the Python-to-Julia communication.

This is an important point. Looks like I was on the wrong track here (though it did reveal a different bug). There might be a few ways BLAS could end up performing differently in the embedded pyjulia situation.

For example, if NumPy is also compiled against OpenBLAS and loads it first, then Julia might pick up the wrong shared library. From what I can tell, Julia’s BLAS ccalls don’t use a hard path, so dlopen could be defaulting to the existing handle because everything is in the same address space. If that library was previously initialized by NumPy with a lower number of threads, then performance would be different.
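The "defaulting to the existing handle" behavior is easy to demonstrate without BLAS at all: dlopen() returns the handle of an already-loaded library instead of loading a second copy. Here libm stands in for a BLAS shared library; this is a sketch of the mechanism, not of the pyjulia setup itself.

```python
import ctypes
import ctypes.util

# dlopen() hands back the existing handle when a library is already loaded
# in the process, so two loads of the same .so share one instance. This is
# how an embedded Julia could end up using the copy of OpenBLAS that NumPy
# loaded (and configured) first.
name = ctypes.util.find_library("m")
first = ctypes.CDLL(name)
second = ctypes.CDLL(name)
print(first._handle == second._handle)  # → True
```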

@Harish_Kumar what Python distribution are you using? Anaconda comes with MKL so it shouldn’t be an issue there, but other Python distributions probably compile against OpenBLAS.

Python 2.7 from the Anaconda distribution.

Yes, we tried with Python 2.7 too, and addprocs works fine (with the workaround). Currently we are using Python 3.4, where it fails with the 'ERROR: System image file "//…/lib/julia/sys.ji" not found' error. Any help on this?

Try JLPATH = b"/home/harish/julia" and JLPATH + b"/lib/julia/libjulia.so". Note the b prefix at the beginning; this is probably a unicode/ctypes issue, since in Python 3 a C char* argument must be bytes, not str.
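A small self-contained illustration of why the b prefix matters, using libc's strlen as a stand-in for the libjulia entry points (so no Julia install is needed): in Python 3, ctypes refuses to pass a str where a C char* is expected.

```python
import ctypes
import ctypes.util

libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# Python 3 separates text (str) from bytes; a C char* parameter needs bytes.
print(libc.strlen(b"/home/harish/julia"))  # → 18

try:
    libc.strlen("/home/harish/julia")      # str is rejected in Python 3
except ctypes.ArgumentError:
    print("str argument rejected")
```

In Python 2 a plain "…" literal was already a byte string, which is why the same code worked there without the prefix.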

Do you get the expected performance?

You may be right that the bottleneck is in the communication of data between Python and Julia.

One way to move a pandas dataframe to Julia is to write a feather file in Python/pandas and read it in Julia using the Feather package. The Python code is something like

import feather
import pandas as pd

# df is a pandas DataFrame (here read from a CSV exported from Spark)
df = pd.read_csv('/home/hpcuser/test.csv')
feather.write_dataframe(df, 'test.feather')

If you know that there are no missing data values in the data frame, I recommend calling Feather.read in Julia as

using Feather, DataFrames, MixedModels

df = Feather.read("test.feather", nullable = false)

It happens that the particular formula you use is not handled as efficiently in the current (0.7.0) release of MixedModels. In the numeric representation of the model, the random-effects terms should be amalgamated into a single term with a special structure. I know how to do the algebra; I just haven't worked out a good way of specifying the model. It is possible to work backwards, reassembling the terms that were specified separately, but it may be more effective to allow another argument instead, so that the model is specified as

volume ~ 1 + logprice + col1 + col2 + col3 + price1 + price2 + price3 + price4 +
    (1 + logprice + col1 + col2 + col3 + price1 + price2 + price3 + price4 | employee)

along with an indication that the unconditional covariance of the random effects is diagonal.

I am having some trouble collecting the data back after the Julia call. Can I know the return type of jl.jl_eval_string()? I always get an integer back, but my code returns a dictionary.

calcLME = jl.jl_eval_string(juliaCode)
result = calcLME(inputData)

TypeError: 'int' object is not callable

I tried this option as well → jl_value_t *result
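jl_eval_string returns a jl_value_t* (an opaque C pointer), and ctypes assumes every foreign function returns a C int unless told otherwise; that is why you get a plain integer back, and why calling it raises TypeError. A minimal sketch of the mechanism, using libc's malloc as a stand-in for jl_eval_string so it runs without Julia:

```python
import ctypes
import ctypes.util

libc = ctypes.CDLL(ctypes.util.find_library("c"))

# ctypes assumes a C int return type unless restype is set, so a function
# that really returns a pointer (as jl_eval_string returns jl_value_t*)
# comes back as a plain, possibly truncated, Python int.
print(type(libc.malloc(16)))   # → <class 'int'>
# (this first allocation is deliberately not freed: its handle may be truncated)

# Declaring the true return type preserves the full pointer width.
libc.malloc.restype = ctypes.c_void_p
ptr = libc.malloc(16)
libc.free(ctypes.c_void_p(ptr))
print(isinstance(ptr, int))    # → True
```

In the embedded case you would likewise set jl.jl_eval_string.restype = ctypes.c_void_p and then extract the value through the Julia C API (e.g. jl_unbox_float64 for a scalar); for a whole dictionary, writing the result to a file from the Julia side, as with the feather suggestion above, is probably the simpler route.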

Or is feather the workaround?

This is not working. Same issue.

We’ve provided several suggestions: number of processors, number of BLAS threads, and communication between Python and Julia. Without a minimal example demonstrating the problem, including data (generated or otherwise), we are all just guessing, so continuing this thread is not productive.

Sure, let's close this thread. Thanks everyone for the help. Here are the conclusions from this thread:

  1. Still working on the Python 3.5 and PyDLL issue; it is not adding the base URL
  2. The new MixedModels is not letting me specify my own formula for the random-effects estimates

Thanks again; I will update this thread once I get all these things working.
