Dash datatable using a nested dict anyone done it?

anon69491625 · June 13, 2022, 11:02pm

Hi there
I’m looking for an example I can examine that loads a nested dict into a data table. I can’t seem to find any. I see many for dataframes but none for dict like this one.

data_flow =
Dict{String, Any}("AMZN" => Dict{String, Any}("hv10" => "45.99", "price" => "122.16", "iv_%" => "83.4", "hv20" => "52.14", "iv" => "40.64", "hv5" => "41.21", "prc_%" => "7.51"), "VZ" => Dict{String, Any}("hv10" => "13.87", "price" => "51.27", "iv_%" => "65.61", "hv20" => "12.71", "iv" => "17.62", "hv5" => "10.32", "prc_%" => "19.37"), "C" => Dict{String, Any}("hv10" => "21.26", "price" => "51.78", "iv_%" => "72.73", "hv20" => "42.75", "iv" => "31.75", "hv5" => "20.79", "prc_%" => "11.86"), "IEX" => Dict{String, Any}("hv10" => "19.16", "price" => "195.55", "iv_%" => "70.36", "hv20" => "24.97", "iv" => "27.62", "hv5" => "18.77"))

I want to end up with a dash table like this one

       HV_10   PRICE   IV_%   hv20   iv     hv5    prc_%
AMZN   45.99   122.16  83.4   52.14  40.64  41.21  7.51
VZ     13.87   51.27   65.61  12.71  17.62  10.32  19.37
C      21.26   51.78   72.73  42.75  31.75  20.79  11.86
IEX    19.16   195.55  70.36  24.97  27.62  18.77

Thanks to a great deal of help from the community, I have the columns:

columns=[Dict("name" =>i,"id" => i) for i in  collect(keys(first(values(data_flow))))]

but the data load has me stumped.

if I try

data = Dict.(pairs.(Values) )

I don’t get the row labels (e.g. “AMZN”), but the values land in the right place. I’m almost there, but I can’t figure out how to “insert” each key of the outer Dict alongside its values.

here is where I am at right now.

app.layout = dash_datatable(
              id="table",
              columns=[Dict("name" =>i,"id" => i) for i in  collect(keys(first(values(data_flow))))],
              data = Dict.(pairs.(Values) )
              )
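
One way to carry the outer key into each row without leaving Dict space is to merge it into the inner Dict under its own column id. The sketch below is a minimal, base-Julia illustration on a shortened copy of the data; the column id "stock" and the names rows and fields are assumptions chosen for the example, and the stocks are sorted only because Dict iteration order is not guaranteed.

```julia
# Shortened copy of the nested Dict from the post (IEX has no "prc_%" entry).
data_flow = Dict{String, Any}(
    "AMZN" => Dict{String, Any}("hv10" => "45.99", "price" => "122.16", "prc_%" => "7.51"),
    "IEX"  => Dict{String, Any}("hv10" => "19.16", "price" => "195.55"),
)

# One Dict per table row; the outer key is copied into a "stock" entry.
symbols = sort!(collect(keys(data_flow)))
rows = [merge(Dict{String, Any}("stock" => sym), data_flow[sym]) for sym in symbols]

# Union of all inner keys, so a field missing for one stock is still a column.
fields = Set{String}()
for inner in values(data_flow)
    union!(fields, keys(inner))
end

# "stock" first, then the remaining fields in sorted order.
columns = [Dict("name" => c, "id" => c) for c in ["stock"; sort!(collect(fields))]]
```

Here rows and columns would take the place of data and columns in the dash_datatable call above. Taking the union of the inner keys, rather than the keys of the first inner Dict, keeps prc_% as a column even though IEX has no entry for it.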

bkamins · June 14, 2022, 4:15pm

Is this what you want (I am writing a one-liner):

julia> insertcols!(reduce(vcat, DataFrame.(values(data_flow)), cols=:union), :stock => collect(keys(data_flow)))
4×8 DataFrame
 Row │ hv10    hv20    hv5     iv      iv_%    prc_%    price   stock
     │ String  String  String  String  String  String?  String  String
─────┼─────────────────────────────────────────────────────────────────
   1 │ 45.99   52.14   41.21   40.64   83.4    7.51     122.16  AMZN
   2 │ 13.87   12.71   10.32   17.62   65.61   19.37    51.27   VZ
   3 │ 21.26   42.75   20.79   31.75   72.73   11.86    51.78   C
   4 │ 19.16   24.97   18.77   27.62   70.36   missing  195.55  IEX

or (a bit longer but maybe easier to understand)

julia> df = DataFrame()
0×0 DataFrame

julia> foreach(row -> push!(df, row, cols=:union), values(data_flow))

julia> df.stock .= keys(data_flow)
4-element Vector{String}:
 "AMZN"
 "VZ"
 "C"
 "IEX"

julia> df
4×8 DataFrame
 Row │ hv10    price   iv_%    hv20    iv      hv5     prc_%    stock
     │ String  String  String  String  String  String  String?  String
─────┼─────────────────────────────────────────────────────────────────
   1 │ 45.99   122.16  83.4    52.14   40.64   41.21   7.51     AMZN
   2 │ 13.87   51.27   65.61   12.71   17.62   10.32   19.37    VZ
   3 │ 21.26   51.78   72.73   42.75   31.75   20.79   11.86    C
   4 │ 19.16   195.55  70.36   24.97   27.62   18.77   missing  IEX

bkamins · June 14, 2022, 4:16pm

The key thing to note here is that you can easily convert a Dict to a DataFrame, and what follows should be easy (once you are in the DataFrames.jl realm).

anon69491625 · June 14, 2022, 6:32pm

thank you so much, I was going to recode the whole thing to avoid Dict entirely and only use df. I am more comfortable with dataframes. This is an excellent solution and avoids me having to spend any more time in Dict space.

In Python, what I want to do took 30 minutes from installing Dash to a working prototype, with a LOT of help from YouTube. Using Dash in Julia was not even close to the same experience; I would suggest more worked examples, starting with a nested Dict.

bkamins · June 14, 2022, 7:33pm

I might be opinionated here, but you can currently expect the easiest “data transformation” experience if you learn DataFrames.jl (JuliaData/DataFrames.jl on GitHub, in-memory tabular data in Julia) and DataFramesMeta.jl (JuliaData/DataFramesMeta.jl on GitHub, metaprogramming tools for DataFrames); the second package is for convenience.

The reason is that there were years of adding different functionalities to this ecosystem.

While it is possible to handle everything using more basic types (and most likely it will be faster to run) it usually will require much more knowledge of Julia to do it correctly (for example I think you already know how many options push! in DataFrames.jl allows for).

One of the good things about DataFrames.jl is that you can then convert to many different data formats that support the Tables.jl table interface. Here is a short example:

julia> df = DataFrame(a=1:3, b=11:13)
3×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1     11
   2 │     2     12
   3 │     3     13

julia> Tables.rowtable(df)
3-element Vector{NamedTuple{(:a, :b), Tuple{Int64, Int64}}}:
 (a = 1, b = 11)
 (a = 2, b = 12)
 (a = 3, b = 13)

julia> Tables.columntable(df)
(a = [1, 2, 3], b = [11, 12, 13])

julia> Dict.(pairs.(eachrow(df)))
3-element Vector{Dict{Symbol, Int64}}:
 Dict(:a => 1, :b => 11)
 Dict(:a => 2, :b => 12)
 Dict(:a => 3, :b => 13)

julia> Dict(pairs(eachcol(df)))
Dict{Symbol, AbstractVector} with 2 entries:
  :a => [1, 2, 3]
  :b => [11, 12, 13]

So as you can see once you have your “final” data frame ready it is easy to convert it to many target formats.
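
For the Dash table in this thread, the target format is a vector with one Dict per row, plus a columns list built from the column names. The following is a minimal sketch, assuming a small made-up data frame (the column prc and its values are placeholders, not the thread's data) and using string keys so that the ids in columns match the keys in data:

```julia
using DataFrames

# Placeholder data frame; in the thread this would be the "final" df.
df = DataFrame(stock = ["AMZN", "IEX"],
               price = ["122.16", "195.55"],
               prc   = ["7.51", missing])

# Column descriptors: one Dict per column, keyed the way the table expects.
columns = [Dict("name" => n, "id" => n) for n in names(df)]

# Row records: one Dict per row, with String keys taken from names(df).
data = [Dict(n => row[n] for n in names(df)) for row in eachrow(df)]
```

How a missing value ends up being displayed depends on how Dash serializes it, which this sketch does not check.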

anon69491625 · June 14, 2022, 7:54pm

Hi Professor
right from day one I wanted to use dataframes but a prior discourse thread led me to Dict. I’ll go through the links you were kind enough to provide.

thank you for the great worked example which made sense to me right out of the gate. I suppose I have a built in resistance to Dict, I have no idea why.

In my case Dict is useful because I can build out a structure from a ZMQ stream. I can start with nothing in the Dict and build it out as new stock symbols come in. As new “column” data comes in for a stock, I just add the key/value pair, and I constantly update the existing columns as data comes in. So “AMZN” “price” would be constantly updated in place.

thank you again for all your help.

theakson

So the ZMQ data stream looks like this, where each line equates to one ZMQ message buffer (one line per loop iteration):

"IND~SPX~LAST~4382.46",
    "IND~SPX~CLOSE~4412.53",
    "IND~SPX~OPTION_IMPLIED_VOL~20.82",
    "STK~AAPL~LAST~167.38",
    "STK~AAPL~VOLUME~671420.0",
    "STK~AAPL~CLOSE~165.75",
    "STK~AAPL~HALTED~0.0",
    "STK~AAPL~OPTION_IMPLIED_VOL~31.28",
    "STK~AAPL~IV~31.27",
    "STK~AAPL~IV_PERCENTILE~86.61",
    "STK~AMZN~IV~40.64",
    "STK~AMZN~IV_PERCENTILE~96.06",
    "IND~SPX~IV~20.48",
    "IND~SPX~IV_PERCENTILE~85.83",
    "IND~VIX~IV~116.36",
    "IND~VIX~IV_PERCENTILE~70.47",
    "STK~AAPL~CLOSE~167.97",
    "STK~AAPL~HV20~25.9",
    "STK~AAPL~HV10~24.93",
    "STK~AAPL~HV5~25.06",
    "STK~AAPL~PRICE_PERCENTILE~76.77",
    "IND~SPX~CLOSE~4397.95",
    "IND~SPX~HV20~17.76",
    "IND~SPX~HV10~13.78",
    "IND~SPX~HV5~12.76",
    "IND~SPX~PRICE_PERCENTILE~43.3"
"END"

and the Dict build out code looks like this ( remember I’m a noob!)

using DataFrames # https://docs.juliahub.com/DataFrames/AR9oZ/0.21.7/man/getting_started/
using ZMQ

context = Context()
in_socket = Socket(context, PULL)
ZMQ.bind(in_socket, "tcp://*:5555")

d_dash = Dict{String,Any}()

zmq_dash = Dict("LAST" => "price","CLOSE" => "price","OPTION_IMPLIED_VOL" => "iv",
                         "VOLUME"  => "VOLUME","IV" => "iv","IV_PERCENTILE" => "iv_%" ,"HV20" => "hv20",
                         "HV10" => "hv10","HV5" => "hv5" ,"PRICE_PERCENTILE" => "prc_%")

function update_d_dash(msg_symbol, field, value)
    # Fetch the inner Dict for this symbol, creating it on first sight.
    bucket_dict = get!(d_dash, msg_symbol) do
        Dict{String, Any}()
    end
    bucket_dict[field] = value
end

# MAIN loop

while true
    message = String(ZMQ.recv(in_socket))

    println("Received request: $message")

    if message == "END"
       println("dying")
       break
    end

    source, sym_in, field_in, value_in = split(message, "~")

    # Map the feed's field name to a table column, e.g. "OPTION_IMPLIED_VOL" => "iv".
    field_out = get(zmq_dash, field_in, nothing)
    if field_out === nothing
        println("field_in : ", field_in, " not in cols")
    else
        update_d_dash(sym_in, field_out, value_in)
    end
end
ZMQ.close(in_socket)
ZMQ.close(context)
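
The per-message logic of the loop above can be tried without a ZMQ socket by pulling it into a function that takes the message as a string. This is only a sketch: apply_message!, field_map and store are hypothetical names, the mapping is a subset of zmq_dash, and returning true/false for stored/skipped is a choice made for the example.

```julia
# Subset of the field mapping from the post above.
field_map = Dict("LAST" => "price", "CLOSE" => "price", "IV" => "iv", "HV20" => "hv20")

# Apply one "SOURCE~SYMBOL~FIELD~VALUE" message to the nested Dict `store`.
# Returns true if the value was stored, false if the message was skipped.
function apply_message!(store::Dict{String, Any}, field_map, message::AbstractString)
    parts = split(message, '~')
    length(parts) == 4 || return false        # malformed message
    _, symbol, field_in, value = parts
    field_out = get(field_map, field_in, nothing)
    field_out === nothing && return false     # field is not a table column
    # Fetch (or create) the inner Dict for this symbol, then update in place.
    bucket = get!(() -> Dict{String, Any}(), store, String(symbol))
    bucket[field_out] = String(value)
    return true
end

store = Dict{String, Any}()
messages = ["STK~AAPL~LAST~167.38", "STK~AAPL~HALTED~0.0",
            "STK~AAPL~CLOSE~165.75", "IND~SPX~IV~20.48"]
results = [apply_message!(store, field_map, m) for m in messages]
```

Because LAST and CLOSE both map to "price", the later CLOSE message overwrites the value written by LAST, which matches the update-in-place behavior described above.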

bkamins · June 15, 2022, 6:13am

Maybe something like this would be useful for you?

using DataFrames
input = ["IND~SPX~LAST~4382.46",
         "IND~SPX~CLOSE~4412.53",
         "IND~SPX~OPTION_IMPLIED_VOL~20.82",
         "STK~AAPL~LAST~167.38",
         "STK~AAPL~VOLUME~671420.0",
         "STK~AAPL~CLOSE~165.75",
         "STK~AAPL~HALTED~0.0",
         "STK~AAPL~OPTION_IMPLIED_VOL~31.28",
         "STK~AAPL~IV~31.27",
         "STK~AAPL~IV_PERCENTILE~86.61",
         "STK~AMZN~IV~40.64",
         "STK~AMZN~IV_PERCENTILE~96.06",
         "IND~SPX~IV~20.48",
         "IND~SPX~IV_PERCENTILE~85.83",
         "IND~VIX~IV~116.36",
         "IND~VIX~IV_PERCENTILE~70.47",
         "STK~AAPL~CLOSE~167.97",
         "STK~AAPL~HV20~25.9",
         "STK~AAPL~HV10~24.93",
         "STK~AAPL~HV5~25.06",
         "STK~AAPL~PRICE_PERCENTILE~76.77",
         "IND~SPX~CLOSE~4397.95",
         "IND~SPX~HV20~17.76",
         "IND~SPX~HV10~13.78",
         "IND~SPX~HV5~12.76",
         "IND~SPX~PRICE_PERCENTILE~43.3"]
df_long = DataFrame(stock=String[], variable=String[], value=Float64[])
for obs in input
    _, stock, variable, value_str = split(obs, '~')
    push!(df_long, (stock, variable, parse(Float64, value_str)))
end
df = unstack(df_long, :variable, :value, allowduplicates=true)

(In the last line I pass allowduplicates=true because your data contains duplicate entries.)
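
With allowduplicates=true, the last value seen for each (stock, variable) pair is the one that is kept. The same "last update wins" rule can be made explicit by collapsing duplicates before unstacking, as in this sketch on three made-up rows (df_last is a name introduced for the example). As far as I know, newer DataFrames.jl releases deprecate allowduplicates in favor of a combine keyword (for example combine=last), so it is worth checking the unstack docstring for the installed version.

```julia
using DataFrames

# Three made-up observations; AAPL/CLOSE appears twice.
df_long = DataFrame(stock    = ["AAPL", "AAPL", "SPX"],
                    variable = ["CLOSE", "CLOSE", "IV"],
                    value    = [165.75, 167.97, 20.48])

# Keep only the last value for every (stock, variable) pair.
df_last = combine(groupby(df_long, [:stock, :variable]), :value => last => :value)

# No duplicates remain, so a plain unstack is enough.
df = unstack(df_last, :variable, :value)
```

Cells with no observation, such as IV for AAPL here, come out as missing.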

anon69491625 · June 15, 2022, 12:31pm

Hi Professor

I WAS going to settle in this morning to attack that very problem in my recode ( I am dumping Dict). Thank you so much for AGAIN giving me food for thought.

They aren’t duplicates: in my use case the entries in the input array are actually a stream of updates. Each one comes in and updates the existing df entry, using stock as the row and variable as the column; the value updates the intersection of the two. Something for me to look into today.

There is so much here for me to unpack and it’s REALLY REALLY helpful to have a worked example. I just ran it and it’s a great foundation for me to start my day. Thanks again.
theakson
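
The update pattern described in this last post (stock selects the row, variable selects the column, value overwrites the cell) could be sketched directly on a wide data frame as below. This is an illustration under assumptions, not code from the thread: upsert! is a hypothetical helper, the three messages are taken from the sample stream, and a linear search for the row is used for simplicity.

```julia
using DataFrames

# Hypothetical helper: write `value` into the (stock, variable) cell,
# adding the column and/or the row the first time they are seen.
function upsert!(df::DataFrame, stock::String, variable::String, value::Float64)
    if !(variable in names(df))
        # New variable: add a column that can hold missing values.
        df[!, variable] = Vector{Union{Missing, Float64}}(missing, nrow(df))
    end
    row = findfirst(==(stock), df.stock)
    if row === nothing
        # New stock: append a row; all other columns are filled with missing.
        push!(df, (; stock = stock); cols = :subset)
        row = nrow(df)
    end
    df[row, variable] = value
    return df
end

df = DataFrame(stock = String[])
for msg in ["STK~AAPL~CLOSE~165.75", "STK~AMZN~IV~40.64", "STK~AAPL~CLOSE~167.97"]
    _, stock, variable, value = split(msg, '~')
    upsert!(df, String(stock), String(variable), parse(Float64, value))
end
```

After the loop the second AAPL CLOSE message has overwritten the first, and the cells that never received an update remain missing.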
