Databricks and Julia

Hi,

At my job we are exploring the data lakehouse possibilities in Databricks. We will probably also need to rewrite existing code (currently in PL/SQL) in a language that is supported in the Databricks environment.

Question:
I am new to Databricks. I have seen that Spark and Julia should work together, but I don't see any explicit integration between Julia and Databricks. Does that mean I can't use Julia in Databricks, or, because Spark is the underlying technology used by Databricks, can I use Julia there? If yes, what are your experiences with Databricks and Julia?

best,

As far as I know, Databricks uses its own Spark cluster manager, which is not available in the open-source version, so Spark.jl is unlikely to work out of the box there. However, if you manage to link Spark.jl against Databricks' libraries instead of building against the open-source version, I'd expect the APIs to be compatible.

@AlexanderChen I know it's a few years later, but TidierDB.jl now supports Databricks as a backend for querying. It works through the REST API. Further documentation can be found in the TidierDB.jl docs.

Connecting and querying is as simple as:

using TidierDB

instance_id = "string_id"
token = "string_token"
warehouse_id = "e673cd4f387f964a"
con = connect(:databricks, instance_id, token, "DEMODB", "PUBLIC", warehouse_id)
# After the connection is established, you may begin querying.

@chain db_table(con, "mtcars") begin
    @select(wt)
    @mutate(test = wt * 2)
    @aside @show_query _
    @collect
end

WITH cte_2 AS (
SELECT  wt, wt * 2 AS test
        FROM tidierdb.default.mtcars)  
SELECT *
        FROM cte_2
32×2 DataFrame
 Row │ wt       test
     │ Float64  Float64
─────┼──────────────────
   1 │   2.62     5.24
   2 │   2.875    5.75
   3 │   2.32     4.64
   4 │   3.215    6.43
  ⋮  │    ⋮        ⋮
  29 │   3.17     6.34
  30 │   2.77     5.54
  31 │   3.57     7.14
  32 │   2.78     5.56
         24 rows omitted
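
For anyone curious what "works through the REST API" can look like underneath, here is a minimal sketch of a request to the Databricks SQL Statement Execution API (`POST /api/2.0/sql/statements`). This is not TidierDB.jl's actual implementation, which I have not reproduced here. The function name `build_statement_request`, the `host` argument (assumed to be the full workspace hostname), the placeholder host string, and the default catalog and schema are all hypothetical choices for illustration. The block only builds the request; it sends nothing.

```julia
# Sketch only: build (but do not send) a request for the Databricks
# SQL Statement Execution API. `build_statement_request` and `host`
# are hypothetical names, not part of TidierDB.jl.
function build_statement_request(host::AbstractString, token::AbstractString,
                                 warehouse_id::AbstractString, sql::AbstractString;
                                 catalog::AbstractString = "tidierdb",
                                 schema::AbstractString = "default")
    # Endpoint of the SQL Statement Execution API on the workspace.
    url = "https://$(host)/api/2.0/sql/statements"

    # A personal access token is passed as a bearer token.
    headers = [
        "Authorization" => "Bearer $(token)",
        "Content-Type"  => "application/json",
    ]

    # Request body: which SQL warehouse runs the statement, and in which
    # catalog/schema unqualified table names are resolved.
    payload = Dict(
        "warehouse_id" => warehouse_id,
        "statement"    => sql,
        "catalog"      => catalog,
        "schema"       => schema,
        "wait_timeout" => "30s",  # wait up to 30 seconds for a result
    )

    return (url = url, headers = headers, payload = payload)
end

req = build_statement_request(
    "example-workspace.cloud.databricks.com",  # placeholder host (assumption)
    "string_token",
    "e673cd4f387f964a",
    "SELECT wt, wt * 2 AS test FROM tidierdb.default.mtcars",
)

# Sending the request needs the HTTP.jl and JSON3.jl packages (not loaded here):
#   using HTTP, JSON3
#   resp   = HTTP.post(req.url, req.headers, JSON3.write(req.payload))
#   result = JSON3.read(resp.body)
```

Keeping the request construction separate from the network call makes it easy to inspect the payload before running anything against a real warehouse.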