How to run an entire Python script in PyCall

Hi there @stevengj
I understand completely, I’ll go back and revisit my python scripts to see what I can do to be more structured in my approach. I don’t want to start off on the wrong foot.

@lungben
certainly worth a try and thanks for the suggestion. @stevengj pointed out that the real approach should be to call the functions not the script. I’ll go back and revisit my python scripts to see what I can do to make it more modular.

tried that but didn’t work. I think I need to go back and REALLY look into the aims and aspirations of PyCall. I suspect that this is a massive misread on my part. I “thought” calling the python script would populate the julia space with the python variables ( dataframes, variables) and it does PERFECTLY if you call the function. I understand, NOW, that calling the whole script would be a poor idea and I’ll go back and revisit my approach.

to all that tried to help, thank you all so much. I read through the splendid PyCall docs and pondered my options. In the end I think it’s a better approach for me to leave the existing python scripts mostly unchanged and consider an interprocess communication architecture, unidirectional from python to julia. The python scripts log in to a financial data feed and gather real time data. Normally I would do all the data manipulation in python ( numpy, pandas et al) and use dash plotly to represent the results. It wouldn’t take too much for the python scripts to “write” the datastream to some form of IPC channel ( I use Linux) and have the julia script read it.
What I used in PyCall was excellent and I can see its value but in my use case I think that the IPC route would be a better option.
thank you all for trying to help a poor noob, I certainly learnt some things.
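
For anyone weighing the same IPC idea, here is a minimal sketch of the Python-side half, assuming newline-delimited JSON as the message format. The function names and the sample ticks are made up for illustration, and `socket.socketpair()` stands in for whatever channel (named pipe, Unix socket, TCP) would really connect the Python producer to a Julia reader.

```python
import json
import socket


def send_records(sock, records):
    """Write each record as one JSON line, then signal end-of-stream."""
    with sock.makefile("w", encoding="utf-8") as stream:
        for record in records:
            stream.write(json.dumps(record) + "\n")
    # Closing the write side lets the reader see EOF.
    sock.shutdown(socket.SHUT_WR)


def read_records(sock):
    """Read JSON lines until the writer closes its side."""
    with sock.makefile("r", encoding="utf-8") as stream:
        return [json.loads(line) for line in stream]


# Hypothetical data; in the real setup the reader would be the Julia process.
producer, consumer = socket.socketpair()
ticks = [
    {"symbol": "SPX", "price": 3700.5},
    {"symbol": "AAPL", "price": 131.25},
]
send_records(producer, ticks)
received = read_records(consumer)
producer.close()
consumer.close()
```

One line per message keeps the reader simple: it only has to split on newlines and parse each piece, with no state machine needed.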

hi @lungben
I don’t like giving up on things, especially when I think my adventures could help others. That said I would like to investigate your concept of wrapping the code below in a function and calling the whole thing from PyCall ( awesome package by the way). Are there any gotchas that you can see in this approach?

The “problem” with julia ( like linux or python) is that there are so many possible approaches and history dictates I will take ALL of the poor ones before stumbling on the right one. My usual approach is to just jump in, make mistakes and then spend too much time duct taping when I should just take time to start off on the right foot. Any tips you might have would be appreciated.

import pandas as pd
import numpy as np
import uuid

env_dir = '/home/dave/j_sandbox/'

dev_env_csv = env_dir + '/csv/'  # where to get the expert symbols from

expert_symbols_csv = "tontine_symbols.csv"  # the symbols table with SPX index added AND the category stuff added 4 9 20


def get_stuff(csv_file):
    df_symbol_to_process_list = pd.read_csv(csv_file)
    return df_symbol_to_process_list


if __name__ == '__main__':
    mac_address = hex(uuid.getnode())
    df_expert_symbols = get_stuff(dev_env_csv + expert_symbols_csv)
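
A rough sketch of the "wrap it in a function" idea, so the results come back as return values instead of script globals. To keep the sketch runnable without third-party packages it uses the stdlib `csv` module in place of `pd.read_csv`; the function name `load_expert_symbols`, the column names, and the throwaway CSV are assumptions for illustration only.

```python
import csv
import os
import tempfile
import uuid


def load_expert_symbols(csv_path):
    """Return (mac_address, rows) rather than leaving them as globals."""
    mac_address = hex(uuid.getnode())
    with open(csv_path, newline="") as handle:
        rows = list(csv.DictReader(handle))  # stand-in for pd.read_csv
    return mac_address, rows


# Demo against a temporary file with hypothetical columns.
with tempfile.TemporaryDirectory() as tmp:
    demo_csv = os.path.join(tmp, "tontine_symbols.csv")
    with open(demo_csv, "w", newline="") as handle:
        handle.write("symbol,category\nSPX,index\n")
    mac_address, rows = load_expert_symbols(demo_csv)
```

With a function like this defined on the Python side, the caller only needs one call and gets both values back, and nothing depends on the `__name__` guard.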

This seems way more complicated to me. Why not just run the script as-is (note: the latest PyCall release has a @pyinclude("test.py") macro) and just fetch the global(s) you want with py"someglobal"?

The advice about not using scripts full of globals (writing functions and other structured-programming constructs instead), is just general software-engineering advice; it’s not necessary for calling Python from Julia.
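
As a plain-Python analogy (not PyCall's implementation), "run the script as-is and fetch the globals you want" looks roughly like this with the stdlib `runpy` module. The script contents and the names `env_dir` and `symbol_count` are made up for the demo.

```python
import os
import runpy
import tempfile

# A throwaway script that only sets module-level globals.
SCRIPT = """
env_dir = '/tmp/demo/'
symbol_count = 6 * 7
"""

with tempfile.TemporaryDirectory() as tmp:
    script_path = os.path.join(tmp, "test.py")
    with open(script_path, "w") as handle:
        handle.write(SCRIPT)
    # run_path executes the file and returns its globals as a dict.
    script_globals = runpy.run_path(script_path)

symbol_count = script_globals["symbol_count"]
env_dir = script_globals["env_dir"]
```

The point is that the script's variables live in a Python namespace after it runs, and the caller has to ask for them by name.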

PS. Note that I don’t think __name__ == '__main__' will be True in Python if you execute a file with exec as we are doing here, so you could just replace that with if True:

Hi @stevengj

thanks for sticking with me on this. I think I mentioned that if there was a complicated and boneheaded approach, that would be my first port of call :slight_smile: so instead of buckling down and REALLY considering what everyone has been saying I concentrated ( last night) on my goto approach of getting the python scripts to generate a stream of text messages which I planned on writing a finite state machine to consume. You see what you are dealing with here!

I sat back this morning and considered @lungben and your observations ( thanks for not laughing by the way) and put aside my whiteboarding to revisit PyCall ( outstanding package by the way).

For some reason my attempt to run the package as is yielded no results as I couldn’t access

df_expert_symbols
mac_address

In the Pluto cell after I had run the python script. That threw me off. Right now I can’t post the code as I don’t know how to do that but I’ll figure it out. Though it’s in the image ( sorry) I sent earlier on.

AHA you see you are giving me an out. the lure of globals beckons and I have to resist! At any time I am one step away from a GOTO, tempt me not :slight_smile:

To be honest I would like to just dump the __name__ == '__main__' as it serves no purpose here. It’s only there just in case I wanted to send args to the cronjob that runs the python script. Thanks for pointing that out.

I am determined to get this working with PyCall given the more I sit down and read about it the more impressed I am.

As I’ve said repeatedly, these are not Julia globals, they are Python globals. You would access them with py"df_expert_symbols" and py"mac_address" in Julia (after ditching the __name__ == '__main__'). (Also, update to the latest PyCall and use @pyinclude("test.py").)

Hi @stevengj and I did try py"df_expert_symbols" BUT that was before your observation that the __name__ functionality needs to go. Sorry for the confusion. When I get back to the machine I’ll make the changes and see what happens. Thanks again for sticking with this.

Actually, __name__ == '__main__' is true by default, so it should be fine in PyCall:

julia> py"__name__"
"__main__"

julia> py"__name__ == '__main__'"
true

But, as I said above, if you are in Pluto, you should be sure to update to the latest version of PyCall and use the built-in @pyinclude("test.py") macro rather than the original pyinclude function I posted, since the latter may use the wrong module’s globals for Pluto.
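
A small plain-Python illustration of why the `__name__` guard can go either way when code is run with `exec`: the check only looks at whatever `__name__` the executing namespace holds. This is a sketch of the general Python behaviour, not of PyCall internals; the snippet and the namespace names are invented for the demo.

```python
GUARDED = """
ran_main = False
if __name__ == "__main__":
    ran_main = True
"""

# Namespace that identifies itself as __main__: the guarded block runs.
as_main = {"__name__": "__main__"}
exec(GUARDED, as_main)

# Namespace with any other name: the guarded block is skipped.
as_other = {"__name__": "some_module"}
exec(GUARDED, as_other)

guard_results = (as_main["ran_main"], as_other["ran_main"])
```

So the REPL output above, where `__name__` evaluates to "__main__", is what makes the guarded block run under PyCall.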

hi there
will do on all counts. Trying to dump all globals, Thanks for the advice and guidance. wish me luck :wink:

By the way, there is also a problem with doing this kind of thing in Pluto, because Pluto doesn’t know about the dependencies implied by using Python globals — it doesn’t know that a Python global depends on running your Python script, so it may execute the cells in the wrong order unless you run the Python script in the same cell as where you try to fetch the global.

(You could alternatively run it in the REPL or in a Jupyter notebook.)

noted and thanks. I AM using Pluto exclusively for my development so any gotchas are welcome so I can watch out for them. My understanding of Pluto is that it uses in order cell dependencies so I’d expect it to always execute the cells in the order they appear on the gui. I’ll watch out for out of order completion though!

avoiding the REPL not because it isn’t great I just like simple things, for example in coding python I have always used Idle3. I don’t really want to use vstudio or Atom.
ALSO I am going to investigate calling Pluto NB from the webapi functionality so I’ll have a julia script running cells and using the features of Pluto to alter the cell’s behavior.

thanks again for all your help

Nope, it analyzes the code to figure out dependencies, so that if you have a cell x = 3 and then change it to x = 4, it will re-run all cells that depend on x (even if they are listed previously in the GUI). See the Pluto FAQ:

Pluto.jl figures out the dependency graph between cells, and knows exactly which cells to re-evaluate, in which order. A cell never executes twice.

However, Pluto can only analyze dependencies in Julia code, and has no way of inferring the correct dependencies for Python code that you execute via py"..." or @pyinclude AFAIK.

So the order in which Pluto executes code that has hidden Python dependencies might be unreliable. In contrast, Jupyter notebooks execute cells only when you run them explicitly, so you manually control the order of execution.
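
To make the hidden-dependency problem concrete, here is a toy model of dependency-ordered execution using the stdlib `graphlib` module (Python 3.9+). It is only an illustration of the idea, not how Pluto is implemented, and the cell names are hypothetical.

```python
from graphlib import TopologicalSorter

# Each cell maps to the set of cells it is known to depend on.
visible = {
    "define_x": set(),
    "use_x": {"define_x"},
    "run_python_script": set(),
    # This cell really needs run_python_script first, but the scheduler
    # cannot see that, so no edge is recorded.
    "fetch_python_global": set(),
}
order = list(TopologicalSorter(visible).static_order())

# Once the dependency is made explicit, the ordering is guaranteed.
explicit = dict(visible, fetch_python_global={"run_python_script"})
explicit_order = list(TopologicalSorter(explicit).static_order())
```

In the first graph nothing forces `run_python_script` to come before `fetch_python_global`, so any order that happens to work is luck. In the second graph the edge exists and the order is enforced.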

when I said cells I meant Pluto. It makes sense that Pluto can only build the dependency graph in julia world and not python. I figured as much but stated it poorly. Thanks for the heads up. I will be VERY careful. I’m hoping that this will keep me under control and avoid GOTO mindset :slight_smile:

I’m not going to be using Jupyter as I have settled on Pluto.

When you always give Julia names to Python objects, Pluto should get the dependencies correctly.
Example:

my_func(a, 9) # my_func and a are defined later, but Pluto should figure this out

begin
py"""
def my_func(a, b):
    return a*b
"""
my_func = py"my_func" # give the Python function a Julia name
end

begin
py"""
a = 42*7
"""
a = py"a"
end

Edit: as @stevengj pointed out, the assignment to the Julia name must be in the same Pluto cell as its definition on Python side.

(If you do so in the same cell where the Python objects are defined, e.g. in the same cell that calls py"def ..." or @pyinclude.)

Right, I forgot to point that out, fading memories, thanks!

thank you for the example really helps.

Hi all
not given up on this, just a sync and stabilize period. The more I read about julia the more there is to like. I don’t want to mess this one up so I’m spending a few days reading books and watching videos like this one on developing julia packages, which is excellent. I want to make sure I start off on the right path. Thank you all for the guidance. The PyCall package is wonderful and I want to make sure I don’t mess up. When I get this working I’ll post my “code” so you can all have a good laugh :slight_smile:

have a happy christmas and a safe new year.