I want to speed up a Julia script. Some of the code is executed rarely and requires heavy libraries.
The problem is:
ERROR: LoadError: syntax: "import" expression not at top level.
Eval doesn’t work either; it complains about some world counter.
Any solution?
using PyCall, DataFrames

function cached(get::Function, id::AbstractString)
    get() # In reality it will be serialised to disk
end

data = cached("data") do
    import Pandas # <== Error
    py"""
    import numpy as np
    import pandas as pd
    """
    df_py = pyeval("""pd.DataFrame({ "id": np.arange(1, 11), "value": np.random.randn(10), })""")
    DataFrame(Pandas.Pandas.DataFrame(df_py))
end

println(data)
P.S. In this specific case it’s possible to avoid using Pandas, e.g. going Python → CSV → Julia, etc. But it’s a common pattern, and it would be nice to know how to solve it.
eval(expr)
Evaluate an expression in the global scope of the containing module. Every
Module (except those defined with baremodule) has its own 1-argument
definition of eval, which evaluates expressions in that module.
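For instance, a minimal illustration of that 1-argument, module-scoped eval (M here is just a throwaway module for the demo):

module M end
M.eval(:(x = 41))   # evaluates in M’s global scope, defining the global M.x
@assert M.x + 1 == 42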
Thanks, I found an easier way to convert the data from Python:
df = data_py.load().reset_index(drop=true)
data = DataFrame(df.to_dict(orient="list"))
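For reference, a self-contained sketch of that conversion; the data_py.load() part is whatever produces the Python object, so the pandas frame is built inline here instead:

using PyCall, DataFrames

py"""
import numpy as np
import pandas as pd
"""
df = py"pd.DataFrame({'id': np.arange(1, 11), 'value': np.random.randn(10)})"
# to_dict(orient="list") returns a plain dict of columns, which PyCall
# converts to a Julia Dict of vectors; DataFrame accepts such a Dict.
data = DataFrame(df.to_dict(orient="list"))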
As for eval, the whole block had to be put in eval:
using PyCall, DataFrames

function cached(get::Function, id::AbstractString)
    get() # In reality it will be serialised to disk
end

data = cached("data") do
    # @eval import Pandas # <= Doesn’t work; the whole block has to be put in eval
    @eval begin
        import Pandas
        py"""
        import numpy as np
        import pandas as pd
        """
        df_py = pyeval("""pd.DataFrame({ "id": np.arange(1, 11), "value": np.random.randn(10), })""")
        DataFrame(Pandas.Pandas.DataFrame(df_py))
    end
end

println(data)
Imports are not just how you load packages; in fact, loading only happens the first time a package is imported in a session, so subsequent imports are much cheaper. Imports bind names into global scopes in Julia; eval worked to an extent because it evaluates code in the global scope. Unlike Python, names can’t be bound into local scopes this way.
The world age counter is a byproduct of optimizing JIT compilation. Dispatched methods are compiled before they are called, so your method was compiled before the @eval import Pandas could run. The compiled method had no idea what Pandas.Pandas.DataFrame meant, let alone what methods it had, so it compiled to throwing a MethodError. By the time it executes, the Pandas.Pandas.DataFrame object does exist thanks to the import, but the compiled method still doesn’t know about its methods. If you executed the method a 2nd time interactively, it would work, but the method would still lag behind in the previous world age’s methods and often be recompiled. To use the methods immediately and avoid that repeated recompilation, you can use invokelatest on the call, but that sacrifices optimizations, similar to how executing code in the global scope does. Depending on what you’re doing, that sacrifice might be acceptable.
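For example, a minimal sketch of the invokelatest escape hatch (Statistics here is just a hypothetical stand-in for any lazily imported package):

function lazy_mean(xs)
    @eval import Statistics   # runs in the global scope at call time
    # This method was compiled in a world age before Statistics was
    # loaded, so calling Statistics.mean directly would throw a
    # world-age MethodError on the first call; invokelatest dispatches
    # in the newest world age instead.
    Base.invokelatest(Statistics.mean, xs)
end

lazy_mean([1, 2, 3])  # works on the first call, at the cost of dispatch optimizations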
There could be improvements depending on how you’re conditionally evaluating imports and code. Your example only shows an unconditional higher-order function call, so there’s no opportunity to omit anything and your @eval import and calls are a needless sacrifice. Is it possible to make a slightly bigger example with a condition?
Your example only shows an unconditional higher-order function call
The code is conditional; the real code for cached skips the computation if a cache exists. I omitted it in the example to keep it short.
using Dates, Serialization

function cached(get::Function, id::AbstractString)
    date_s = Dates.format(Dates.now(), dateformat"yyyy-mm-dd")
    path = "./tmp/cache/$(id)-$(date_s).jls"
    if isfile(path)
        @info "cache loading" id
        return deserialize(path)
    end
    @info "cache calculating" id
    result = get()
    mkpath(dirname(path))
    serialize(path, result)
    result
end
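A usage sketch (the closure is a hypothetical stand-in for the slow work): the first call on a given day computes and serialises the result; repeated calls deserialise it from disk instead:

result = cached("expensive") do
    sleep(2)                  # stand-in for a slow download or computation
    sum(abs2, randn(10^6))
end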
About the @eval: is such usage, to avoid compilation of a conditional code block, undesirable? Like making the optimisation of other code worse? The code in the eval block doesn’t have to be fast, but, say, would it affect the performance of some unrelated fit_model() function?
data = cached("data") do
    # @eval import Pandas # <= Doesn’t work; the whole block has to be put in eval
    @eval begin
        import Pandas
        py"""
        import numpy as np
        import pandas as pd
        """
        df_py = pyeval("""pd.DataFrame({ "id": np.arange(1, 11), "value": np.random.randn(10), })""")
        DataFrame(Pandas.Pandas.DataFrame(df_py))
    end
end
The way cached is written there, the get input function itself cannot be conditional (in your call, that’s the part in the do block); it’s only the get() call that is. That’s why you had to put the actual work of the do block inside @eval: to make it entirely conditional on the call instead of being compiled inside the get input function, with the associated drawbacks.
Also worth mentioning that, with what is written so far, you still only define the cached function to call it once, so all of this is far easier in the global scope without a function at all (for example, you can do conditional imports in top-level if blocks, as sketched below). But I assume you might want to call cached many times in the same script (maybe a loop over many ids), and eval (and related things like include) in a function is pretty much how you’d rerun a script without dealing with method compilation.
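For instance, a minimal sketch of a top-level conditional import (cache_path is hypothetical); this is legal because the if block itself sits at the top level, not inside a function:

cache_path = "./tmp/cache/data.jls"  # hypothetical cache location
if !isfile(cache_path)
    import Pandas   # the heavy package is only loaded when the cache is missing
end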
As in Python, you should avoid conditional import schemes in a longer-lived process with enough repeated calls, because they seriously hamper maintaining more complex code for increasingly negligible or unlikely savings. For example, if you’re iterating through many ids with scattered caches, you should load everything needed to deal with the uncached ones from the start. On the other hand, if you’re iterating over ids that either all have caches (maybe a whole directory of caches) or have none at all (filling an empty directory), you could use different scripts for the two cases to begin with, instead of dealing with the unnecessary overhead of repeated conditional checks.
It really depends on how much Julia code there is. In your example as stated, the @eval block doesn’t directly do much in Julia, so there isn’t much to optimize anyway. Your cached function call won’t know much about the input get method, but there isn’t much code afterward that needs to know. Again, I’d only be concerned about repeated calls in longer-lived processes; short scripts can afford to be a little rougher.
Maybe I’m missing something; the way cached is written is a standard and well-known pattern: a slow_get function that could be called at any time, plus some sort of cache. The cache doesn’t know about slow_get, slow_get doesn’t know about the cache, and the main script doesn’t know about the internals of those two either; it’s supposed to be that way.
Suppose that slow_get is some slow call downloading data from the network, or a slow computation (as it is in this case, done by some legacy Python code). I just wrap slow_get in the cache, and it solves the problem and makes the slow operation instant.