When running `using Spark`, I get this error:
julia> using Spark
ERROR: InitError: JavaCall.JavaCallError("JULIA_COPY_STACKS should not be set on Windows.")
Stacktrace:
[1] assertroottask_or_goodenv
@ C:\Users\joel\.julia\packages\JavaCall\MlduK\src\jvm.jl:233 [inlined]
[2] _init(opts::Vector{String})
@ JavaCall C:\Users\joel\.julia\packages\JavaCall\MlduK\src\jvm.jl:285
[3] init()
@ JavaCall C:\Users\joel\.julia\packages\JavaCall\MlduK\src\jvm.jl:277
[4] init(; log_level::String)
@ Spark C:\Users\joel\.julia\packages\Spark\89BUd\src\init.jl:56
[5] init
@ C:\Users\joel\.julia\packages\Spark\89BUd\src\init.jl:16 [inlined]
...
during initialization of module Spark
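From the message, I'm guessing JULIA_COPY_STACKS is set somewhere in my environment, and JavaCall refuses to start the JVM with it set on Windows. A sketch of the workaround I'd try (clearing the variable before launching Julia; I haven't confirmed this fixes it, and the Windows equivalents are in the comments):

```shell
# Make sure JULIA_COPY_STACKS is not set before launching Julia.
# (Windows cmd:       set JULIA_COPY_STACKS=
#  PowerShell:        Remove-Item Env:JULIA_COPY_STACKS)
unset JULIA_COPY_STACKS
echo "${JULIA_COPY_STACKS:-unset}"   # verify it is gone before starting julia
```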
But then if I run `using Spark` a second time, it goes through. When I then try to create a session:
julia> spark = SparkSession.builder.appName("Main").master("local").getOrCreate()
ERROR: JavaCall.JavaCallError("Class Not Found org/apache/spark/sql/SparkSession")
Stacktrace:
[1] _metaclass
@ C:\Users\joel\.julia\packages\JavaCall\MlduK\src\core.jl:383 [inlined]
[2] metaclass(class::Symbol)
@ JavaCall C:\Users\joel\.julia\packages\JavaCall\MlduK\src\core.jl:389
[3] jcall(::Type{JavaCall.JavaObject{Symbol("org.apache.spark.sql.SparkSession")}}, ::String, ::Type, ::Tuple{})
@ JavaCall C:\Users\joel\.julia\packages\JavaCall\MlduK\src\core.jl:225
[4] getproperty(#unused#::Type{SparkSession}, prop::Symbol)
@ Spark C:\Users\joel\.julia\packages\Spark\89BUd\src\session.jl:48
[5] top-level scope
@ REPL[4]:1
I have JDK 11 and mvn on my PATH, and JAVA_HOME points to the JDK 11 install.
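The "Class Not Found org/apache/spark/sql/SparkSession" error makes me suspect the Spark jars never made it onto the JVM classpath, i.e. the package's Maven build step failed at install time. A sketch of what I'd try first (assuming `Pkg.build` re-runs that build step):

```julia
# Rebuild Spark.jl; this should re-run the bundled Maven build that
# produces the jar Spark.jl places on the JVM classpath at init time.
using Pkg
Pkg.build("Spark")
```

and then restart Julia before running `using Spark` again, so the fresh jar is picked up by a clean JVM.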
Side question: will Spark.jl let me read a Parquet file from HDFS running in Kubernetes (with the Spark master also running in Kubernetes, if that's needed)?