Freezing problems with GENIE/Stipple whilst updating using ZMQ feed

Not sure whether I got it right. But I’m still using the ZMQ socket.
I just feed the PULL socket from another terminal where I set up a PUSH socket.

The refactoring is in price calc and removal of global vars

Hi there

thanks for the refactoring, a nice approach.

Did you run the code? I just ran it (remember, I’m in a meeting, working from tmux on my phone :-))) and
got this error:

```
ERROR: LoadError: UndefVarError: n not defined
```

so I just bodged it using a global so as to maintain most of your code.

```julia
n = 1
while true
    print(lpad("$n", 5, '0'), "> ")
    receive!(socket, df_table) == 0 || break
    global n += 1   # the bodge: without `global`, n is not defined inside the loop
end
```
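For what it’s worth, the `UndefVarError` comes from Julia’s scoping rules: when a file is run as a script, a `while` loop at top level opens a new scope, so `n += 1` is treated as an assignment to a new local `n` unless it is marked `global`. A minimal sketch of one way to avoid the global is to put the loop inside a function. The names `run_loop` and `receive_one` are made up for illustration; `receive_one` stands in for `() -> receive!(socket, df_table)`.

```julia
# Sketch: keep the message counter local by wrapping the loop in a function.
# `receive_one` is a hypothetical zero-argument callable that returns 0 to
# continue and anything else to stop, mirroring receive!(socket, df_table).
function run_loop(receive_one; io::IO = stdout)
    n = 1                                   # local to run_loop, no `global` needed
    while true
        print(io, lpad(string(n), 5, '0'), "> ")
        receive_one() == 0 || break
        n += 1
    end
    return n                                # messages handled, including the final one
end

# In the script this could be called as (assuming `socket` and `df_table` exist):
# run_loop(() -> receive!(socket, df_table))
```

Because the counter lives inside the function, the loop behaves the same whether the file is run as a script or pasted into the REPL.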

We can try to put it in place next week as we’re in meetings from now until Monday. We’ll see what happens with the mods we made tomorrow. We’ve got someone to keep an eye on it.

We think the best course, especially for your sanity, is for GENIE to come up with an MWE that accepts ZMQ data and uses it to update a data structure in real time as a proof of concept. How about something from Yahoo Finance? That way you can add to your offering AND we can just mod it.

sound good?

Yes, I like this idea :slight_smile:

FYI @CiPa @Pere

Using ZMQ would help in the sale to sensor equipment (labs), IoT (manufacturing) and anyone else who wants a reliable and easy-to-use data transport.

hi there @hhaensel
Just got a message from the person watching the screen: the on-screen display froze after 10 minutes. We’ll get your refactored code in place for Monday 5/11/23.

have an excellent weekend.

If the app freezes, please open the DevTools (right-click the browser window → Inspect), choose “Console” in the top tab bar and take a screenshot.


There’s probably some information on the root cause.
It might be that you have errors in the websocket. I experienced this when I chose a higher update rate.
Is it possible that two events follow very closely one after another?
In that case we might need to throttle the events.

@essenciary Would be great if we could queue the websocket messages.
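A minimal sketch of what throttling could look like on the Julia side, independent of Stipple and ZMQ: the data is still updated on every message, but the expensive "push to the browser" step runs at most once per interval. `Throttle`, `maybe_fire!`, `flush_pending!` and the 0.25 s interval used in the comment are hypothetical names and values for illustration, not part of Stipple’s API.

```julia
# Sketch of a time-based throttle. Nothing here is Stipple API; in the script,
# the throttled action would be something like `() -> notify(df_table)`.
mutable struct Throttle
    min_interval::Float64   # minimum seconds between two firings
    last_fire::Float64      # time of the last firing, in seconds
    pending::Bool           # true if an update was skipped since the last firing
end

Throttle(min_interval::Real) = Throttle(Float64(min_interval), -Inf, false)

# Run `action()` if enough time has passed; otherwise remember that an update is pending.
function maybe_fire!(action, t::Throttle; t_now::Real = time())
    if t_now - t.last_fire >= t.min_interval
        t.last_fire = t_now
        t.pending = false
        action()
        return true
    end
    t.pending = true
    return false
end

# Call this from a timer or at the end of a burst so the last update is not lost.
function flush_pending!(action, t::Throttle; t_now::Real = time())
    t.pending || return false
    t.last_fire = t_now
    t.pending = false
    action()
    return true
end

# Possible use inside the receive loop (assuming `df_table` exists):
# throttle = Throttle(0.25)
# maybe_fire!(() -> notify(df_table), throttle)
```

The `pending` flag matters: a plain "drop everything inside the interval" throttle would leave the table showing a stale value after a burst ends, so something has to flush the last skipped update.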

Hi there

UPDATE. We decided to run the live system from the CLI and caught the freeze event. We got the inspect snapshot, hit browser refresh (which woke it up), took another inspect snapshot, waited a few seconds to see it running, and took a third snapshot of it running properly. Hope this is useful.

all snapshots in next message as they didn’t seem to want to upload into this message

Can’t do it today, BUT we’ll install the code you wrote into the cron process for 5/8/23. Please note I modified your code slightly, as the n variable was defined outside the while loop. I made it global (I know, I know, bad, bad); I didn’t want to mess up your code.

Thanks for the instructions about the inspect. We keep ourselves ignorant about all things web-browserish, which is ironic really, as our boss funded Sun Microsystems and formed the first Java fund :wink:

In answer to your question: YEP, the event timing can vary dramatically. For this functionality we can introduce a delay in the ZMQ feed if we have to. You might have already introduced a delay with the print statement in your loop. We’ll have to see what happens when we remove it, but for now we’ll leave it in.

A delay is not a problem, as 1 sec for this data is fine. I think it would be useful to track this down for YOUR purposes, so we’ll just let the data flow at the normal rate. We know that the data is flowing correctly and that the dataframe (in our code and yours) updates effectively. The only thing not working consistently is the model.

have an excellent one

ps here is the code we will be using on 5/8/23 for the run. Please note the global bodge for n in the loop. We know it’s a no no but we didn’t want to mess your code up. Let us know if you have a newer version you want us to try.

```julia
using Stipple
using StippleUI
using StipplePlotly
using CSV, DataFrames, Dates , Logging
using ZMQ

log_date = Dates.format(now(),"yyyy_mm_dd_HH_MM")
log_name = "/home/dave/tontine_2022/data/logs/tontine2_log_" * log_date * ".log"
io = open( log_name, "w+")
logger = SimpleLogger(io)
global_logger(logger)


dash_columns = ["sym","close","price","sdmove","hv20","hv10","hv5","iv","iv%ile","prc%ile","ern_days"]

df = DataFrame([col => (col == "sym" ? String : Float64)[] for col in dash_columns])
df_table = Observable(df)

zmq_dash = Dict("LAST" => "price","CLOSE" => "close","OPTION_IMPLIED_VOL" => "iv","OPTION_HISTORICAL_VOL" => "iv",
                         "VOLUME"  => "volume","IV" => "iv","IV_PERCENTILE" => "iv%ile" ,"HV20" => "hv20",
                         "HV10" => "hv10","HV5" => "hv5" ,"PRICE_PERCENTILE" => "prc%ile","EARNDAYS" => "ern_days")


@vars TontineModel begin
    tontine_data::R{DataTable} = DataTable(df_table[])
    tontine_data_pagination::DataTablePagination = DataTablePagination(rows_per_page=100) #9/11/22
end

function ui(model::TontineModel)
    page(
        model, class="container", title="title TONTINE2 ", head_content=Genie.Assets.favicon_support(),

        [
        heading( "heading Tontine2 5555 9/11/22 from 9_11_22_stpl_pull.jl pag = 100 px = 3000"  )

        row([
            cell(class="st-module", [
            h5("h5 tontine data")
            table(:tontine_data;
            style="height: 3000px;",
            pagination=:tontine_data_pagination) # 9/11/22 added pagination
            ])
        ])
        ]
    )
end

function handlers(model)
    on(model.isready) do isready
        isready || return
        model.tontine_data[] = DataTable(df_table[])
    end

    on(df_table) do new_table
        model.isready[] || return
        model.tontine_data[] = DataTable(new_table)
    end

    model
end

route("/") do #9/12/22 https://discourse.julialang.org/t/noob-needs-help-debugging-and-understanding-reactive-models/87038/12
    global model
    model = init(TontineModel)
    model |> handlers |> ui |> html #  = has lowest precedence
end

up()       # up(9000; async = true, server = Stipple.bootstrap())


# it is good practice to append a '!' to the function name if that function changes the content of one or more variables
function price_calcs!(df, sym_in)

    #println( "entering price_calcs " ) 
    row = df[findfirst(==(sym_in), df.sym), :]
    expected_move = round(row.iv / 19.896 , digits = 2)
    change = round((row.price - row.close) / row.close * 100  , digits = 2)
    sdmove = round(change / expected_move, digits = 2)

    isinf(sdmove) && (sdmove = 999.99)
    
    try 
        # row is only a view on the DataFrame, so this updates the DataFrame!
        row.sdmove = sdmove
    catch y
        # println("ADDING SDMOVE CATCH something went wrong with sdmove into df_table : ", y)
        @info "adding SDMOVE something went wrong error : " y 
        # fallback value; the column is called `sdmove` (there is no `sd_move` column)
        row.sdmove = 999.00
    end

end

function receive!(socket, df_table)  
    message = String(ZMQ.recv(socket))
    @info "message : $(now()) : " message    # 9/11/22 https://julialogging.github.io/tutorials/logging-basics/
    flush(io)

    println("Received request: $message")

    if message == "END"
        println("dying")
        println(df_table)
        @info "=================================================================================================================> $(now())  got end dying"
        @info df_table

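        try 
            ZMQ.close(socket)
            ZMQ.close(context)
        catch zmq_error
            @info "*********************************************************==> something went wrong with closing zmq sockets => " zmq_error ZMQ.zmq_errno()
        end
        # always tell the caller to stop, even if closing failed; otherwise
        # execution falls through and tries to split "END" into four fields
        return 1

    end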

    in_source, sym_in, field_in, value_in = split( message , "~")
    value_fl = parse(Float64, value_in) # convert to Float64
    
    field_out = zmq_dash[field_in] # ie field_in "OPTION_IMPLIED_VOL" => field_out "iv"
    try
        #println("message to add : ", in_source," ",sym_in," ",field_out," " ,value_in)
        sym_in in df_table[].sym || push!(df_table[], (sym_in, 0.0, 0.0, 0.0, 0.0 , 0.0, 0.0 , 0.0 , 0.0 , 0.0 , 0.0))

        df_table[][findfirst(==(sym_in), df_table[].sym), Symbol(field_out)] = value_fl

        if field_out == "price" 
            price_calcs!(df_table[], sym_in)
        end

        try 
            notify(df_table)
        catch f
            #println(" NOTIFY CATCH  something went wrong with df_table : ", f)
            #println("sym_in :", sym_in," field_out : ", field_out)
            @info " $(now())    NOTIFY CATCH  something went wrong with df_table :" f
        end
    catch e
        @error "** PROBLEM WITH DF UPDATE >  "  field_out  e 
    end
    return 0
end

# ---------- Main ---------

empty!(df_table[])
notify(df_table)
context = Context()
socket = Socket(context, PULL)
ZMQ.bind(socket, "tcp://*:5555")

@info " $(now())  ===>>>    starting IN Socket 5555" 
flush(io)

n = 1
while true
    print(lpad("$n", 5, '0'), "> ")
    receive!(socket, df_table) == 0 || break
    global n += 1
end
```
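Since the freeze is hard to reproduce, it may help to pull the pure logic out of `receive!` so it can be checked without ZMQ or Stipple. The following is only a sketch under that assumption: `parse_tick` and `sd_move` are hypothetical helper names, the field layout `source~symbol~field~value` is taken from the log lines in this thread, and the constant 19.896 is copied from `price_calcs!` as is. One deliberate difference from the script: `sd_move` also caps `NaN` (from 0/0), whereas the original only caps infinite values.

```julia
# Hypothetical helpers, not part of the original script.

# Split "STK~NVDA~LAST~289.28" into its parts. Returns `nothing` for anything
# that does not have exactly four fields or whose value is not a number
# (this includes the "END" sentinel).
function parse_tick(message::AbstractString)
    parts = split(message, '~')
    length(parts) == 4 || return nothing
    value = tryparse(Float64, parts[4])
    value === nothing && return nothing
    return (source = String(parts[1]), sym = String(parts[2]),
            field = String(parts[3]), value = value)
end

# The arithmetic from price_calcs!, on plain numbers instead of a DataFrame row.
function sd_move(price::Real, close::Real, iv::Real)
    expected_move = round(iv / 19.896, digits = 2)
    change = round((price - close) / close * 100, digits = 2)
    sdmove = round(change / expected_move, digits = 2)
    # x/0 gives ±Inf and 0/0 gives NaN; both are replaced by the sentinel value
    return isfinite(sdmove) ? sdmove : 999.99
end
```

With helpers like these, a malformed message can be skipped with a log entry before it ever touches `df_table`, and the price arithmetic can be checked against hand-computed values.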

Here are the snapshots; see the file names for the event each one captured.



From your UI it looks as if the update frequency is not the problem.
If the freeze problem persists, I’d put a field in the UI where you write the ZMQ messages to. And I’d add a textfield that updates another field via the server, in order to see whether the server is still reactive.
Alternatively, you could add an admin console. I’ll let you know how to do that …

why do that? we can see the ZMQ messages are being processed properly in the log. They get sent and received as per plan. We don’t see ANY data loss.

The dataframe is updated in every code variant (yours and ours), so it’s not a logic problem. We don’t log that, as it sometimes gets updated quite frequently. For example (and this might be slow):

```
┌ Info: message : 2023-05-01T15:20:13.122 : 
│   message = STK~NVDA~LAST~289.28
└ @ Main /home/dave/tontine_2022/2022_live/9_11_22_stpl_pull.jl:151
┌ Info: message : 2023-05-01T15:20:13.622 : 
│   message = STK~NVDA~LAST~289.36
```
so the only thing NOT happening is the representation of the update of the model in stipple.jl. Basically what we saw in 9/22/22.

we don’t want to add anything more to this ( thanks for the admin offer) as we can just switch to python/dash/pandas. We know it works and it’s easy to do. We can wait to see if GENIE puts up a MVP of ZMQ update of a datastructure in stipple and we can mod that.

We’ve done what we said we would do and put your code into the cron process to run on 5/8/23. This will be using live data (ZMQ PUSH/PULL, simplest form, no throttle) and we’ve got someone to monitor it.

We are working on the principle that your code is the best shot at getting stipple.jl functionality working, and we are deploying your code. The ZMQ feeds are working exactly as we expect, and bogumil’s DataFrames.jl works in either code base (yours and ours), so that’s not the problem.

we’ll let you know what happened on 5/8/23 and will provide the inspect screen shot as requested should it freeze again.

here is an example of the way we log the ZMQ traffic inside the stipple script.

```
┌ Info: message : 2023-05-01T15:20:07.113 : 
│   message = STK~SPY~LAST~415.61
└ @ Main /home/dave/tontine_2022/2022_live/9_11_22_stpl_pull.jl:151
┌ Info: message : 2023-05-01T15:20:10.369 : 
│   message = STK~SBUX~LAST~114.59
└ @ Main /home/dave/tontine_2022/2022_live/9_11_22_stpl_pull.jl:151
┌ Info: message : 2023-05-01T15:20:13.122 : 
│   message = STK~NVDA~LAST~289.28
└ @ Main /home/dave/tontine_2022/2022_live/9_11_22_stpl_pull.jl:151
┌ Info: message : 2023-05-01T15:20:13.622 : 
│   message = STK~NVDA~LAST~289.36
└ @ Main /home/dave/tontine_2022/2022_live/9_11_22_stpl_pull.jl:151
┌ Info: message : 2023-05-01T15:20:14.373 : 
│   message = STK~IEFA~LAST~68.66
└ @ Main /home/dave/tontine_2022/2022_live/9_11_22_stpl_pull.jl:151
┌ Info: message : 2023-05-01T15:20:17.127 : 
│   message = STK~QQQ~LAST~322.19
└ @ Main /home/dave/tontine_2022/2022_live/9_11_22_stpl_pull.jl:151
┌ Info: message : 2023-05-01T15:20:23.386 : 
│   message = STK~IEFA~LAST~68.67
```
I understand the situation well. But since I cannot debug your computer, I can only give you hints on how to do it yourself, if you are willing.
Why the fields? With them you could check whether the server-client interaction is still functional. It is an easy check and can be implemented in 5 minutes. It’s no more than copying the hello world example into your code.
The admin console can do a lot more. You can, e.g., inspect df_table or ask for the number of attached listeners (to check whether they got lost).
For the admin console to work, make sure no one can misuse the computer, because this simple version does not have any user check and will allow full control over the server.
Add the following lines to your module:

```julia
using Stipple
using StippleUI

@vars Console begin
    input = ""
    output = ""
    error = ""
    enter = false
end

function ui_console(model)
    page(model, class = "container", [
        heading(join([a(img(class = "st-logo", src = "/img/logo.png", style = "height: 40px;"), href = "/"), "Console"]), style = "width: 100%;"),
        row(cell(class = "st-module", [
            cell(class = "st-br", [
                h5("Input"),
                textfield("t1", :input, "", placeholder = "type your input", label = "Input", :outlined, :filled, type = "textarea",
                    @on("keydown.shift.enter.prevent", "enter = true"), @on("keydown.ctrl.enter.prevent", "enter = true")
                )
            ]) |> row,
            cell(class = "st-br", [
                h5("Output"),
                textfield("t2", :output, "", placeholder = "no output yet ...", label = "Output", :outlined, :filled, type = "textarea")
            ]) |> row,
            cell(class = "st-br", [
                h5("Error"),
                textfield("t3", :error, "", placeholder = "no error yet ...", label = "Error", :outlined, :filled, type = "textarea")
            ]) |> row,
            row([
                cell(class="st-br st-ph", btn("Enter", color = "primary", @click("enter=true"), loading = :enter)),
                cell(class="st-br st-ph", uploader(
                    label = "File Upload",
                    url   = "/upload",
                    style = "max-width: 500px")
                )
            ]),
        ]))
    ], title = "Console")
end

# capture output, std_out, std_err and error information in case of failure
macro capture(expr)
    quote
        ex_string = ""
        original_stdout = stdout
        original_stderr = stderr
        (so_rd, so_wr) = redirect_stdout();
        (se_rd, se_wr) = redirect_stderr();

        out = try
            eval($(esc(expr)))
        catch ex
            ex_string = string(ex)
            ""
        end
        
        redirect_stdout(original_stdout)
        redirect_stderr(original_stderr)
        close(so_wr)
        close(se_wr)

        so = String(read(so_rd))
        se = String(read(se_rd))

        (out, so, se, ex_string)
    end
end

function console_handlers(console)
    onbutton(console.enter) do
        out, so, se, ex = @capture string(eval(Meta.parse("begin\n $(console.input[]) \nend")))
        !isempty(so) && (out = "standard out:\n $so\n---------------------------------------------\n$out")
        !isempty(se) && (ex = "standard error:\n $se\n---------------------------------------------\n$ex")
        console.output[] = out
        console.error[] = ex
    end

    console
end

route("/console") do
    @info "User entering debug mode!"
    console = init(Console)
    console |> console_handlers |> ui_console |> html
end
```

and browse to http://localhost:8000/console

There you can enter Julia code and execute it either by pressing the button or by hitting Ctrl + Enter (Shift + Enter is bound as well).

hi @hhaensel

I wanted to state the situation as we see it, just to make sure that we haven’t missed something. It seems that we agree that the logic of updating the dataframe in a timely and accurate manner is functional and not part of this issue. We can easily show that the dataframe has been updated consistently over weeks. This doesn’t seem to be a computer configuration issue nor a core logic issue.

It seems unfair to us for you to dig any deeper into this. You have been very kind and provided us with a complete set of code that we can run. We stick to our long-term suggestion that someone should write an MVP for the representation of streaming data in Stipple using ZMQ. We think it would be a boon for the GENIE 2.0 launch as well. We don’t have the Julia skills to do it. We’ve dabbled with Julia over the last year or so and decided to really try it out in April and May.

We have hired a temp to watch the run on 5/8/23 and briefed her on the inspect requirements. We don’t want to put the admin console code into the test as we want to just keep the situation as close to real life as we can.

We have put the code you provided into the cron process for running on Monday. We’ll update this thread with the results. If it doesn’t work then so be it. You can’t replicate the problem, so the issue must be at our end, and we’ll make the call to progress with Stipple or not based on the results.

Stipple is your project and this is your code. We know that the ZMQ data flow works (we have GBs of logs) and we know that the dataframe is updated correctly. We now have the best Stipple code attempt for our functionality in your code. We just finished a dry run with the static data, which worked, and a few minutes of market data that flowed in from after-hours trading, which updated the streaming columns properly. There was no real volume in the trading, so it didn’t stress the system at all. We’re concerned that this is a throughput issue. If that’s the case then it’s an easy fix: we can put a delay into the system upstream. One of the great things about queue theory is the flexibility it gives you to change things without messing with the core code.
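To make the queueing idea concrete: one way to decouple the ZMQ receive rate from the UI update rate, without delaying the feed itself, is a buffer between the two. The receiver pushes every message into a `Channel`; a consumer drains whatever has accumulated and keeps only the latest value per symbol and field, so the UI would be notified once per batch rather than once per message. This is a sketch, not a tested fix for the freeze; `drain!`, `latest_per_key` and the buffer size are hypothetical, and it uses only Julia’s standard library.

```julia
# Block until at least one message is available, then take everything else
# that is already queued. Returns the batch in arrival order.
function drain!(ch::Channel{String})
    batch = String[take!(ch)]
    while isready(ch)
        push!(batch, take!(ch))
    end
    return batch
end

# Keep only the most recent value for each (symbol, field) pair in a batch.
# Messages are assumed to look like "STK~NVDA~LAST~289.28"; others are skipped.
function latest_per_key(batch::Vector{String})
    latest = Dict{Tuple{String,String},Float64}()
    for message in batch
        parts = split(message, '~')
        length(parts) == 4 || continue
        value = tryparse(Float64, parts[4])
        value === nothing && continue
        latest[(String(parts[2]), String(parts[3]))] = value
    end
    return latest
end

# Example wiring (the receiving side would call put!(queue, message)):
queue = Channel{String}(10_000)          # hypothetical buffer size
for m in ("STK~NVDA~LAST~289.28", "STK~SPY~LAST~415.61", "STK~NVDA~LAST~289.36")
    put!(queue, m)
end
updates = latest_per_key(drain!(queue))  # 2 entries; NVDA keeps the later price
```

In the script, the consumer would then write `updates` into `df_table[]` and call `notify(df_table)` once per batch. Sub-second feeds keep their full rate into the dataframe; only the number of pushes to the browser is reduced.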

Thank you for all your help in this. We’ll keep our fingers crossed on 5/8/23. We’d like to stay and watch but we’re in planning meetings all week. Next week it’s Carbon week so that should be fun :slight_smile:

Sad to report that the test failed almost immediately. The person watching saw about 2 minutes of traffic and then the UI froze. We have some non-intrusive testing running to look at traffic levels, to see if there is a correlation with the volume of messages. We’ll try again tomorrow.

We want to help out but right now we are offsite and in meetings all day. We’ll look at implementing the console code as well.

Sorry to report that stipple UI froze again.

We had a physical machine built out

72 core 128gb ram 8tb hd.

using the latest linux debian, fully updated and upgraded.

We built the whole Julia environment from scratch

```
$ juliaup st
 Default  Channel  Version                Update 
-------------------------------------------------
       *  release  1.8.5+0.x64.linux.gnu         
```

```
(stipple_2023) pkg> st
Status `~/Desktop/tontine_2023/stipple_2023/Project.toml`
  [336ed68f] CSV v0.10.10
  [a93c6f00] DataFrames v1.5.0
  [4acbeb90] Stipple v0.26.8
  [a3c5d34a] StippleUI v0.22.3
  [c2297ded] ZMQ v1.2.2
  [ade2ca70] Dates
  [56ddb016] Logging

(stipple_2023) pkg> 
```

made sure that julia was the only app running (other than routine os tasks).

We did this to make sure that there were no other issues that could affect the test.

The Firefox browser was default

firefox 112.0.2 (64-bit)

We asked that the script be run from the cli so we could see the output of your println which carried on printing even though the stipple UI was frozen.

```
195877> Received request: STK~XLK~LAST~150.24
195878> Received request: STK~UPS~LAST~173.03
195879> Received request: STK~QCOM~LAST~106.08
195880> Received request: STK~C~LAST~46.25
195881> Received request: STK~TSLA~LAST~167.61
195882> Received request: STK~GOOGL~LAST~108.34
```

When the Stipple UI froze, it could NOT be refreshed. The following message was observed:

http://127.0.0.1:8000/#wsconnectionalert

The upshot of this is that we can’t continue with Stipple.jl at this time. We have decided to send the data, via ZMQ, to a python/dash environment and will revisit Stipple.jl when a MWE for streaming data is available.

Thank you for your help in this matter @hhaensel. Without your code we would not have gotten anywhere close to where this is.

We treat the web as a utility that we plug into and we have no web developers in our company, that is why GENIE.jl @essenciary and Stipple remain so attractive to us. Our ONLY graphical representation goals are augmented reality. Having seen the utility of the next generation of headsets ( tethered and untethered) we aim to bypass the web stack component completely. We are even considering no screens and the ramifications of that tech. All interesting stuff.

Julia was designed to be a single-language solution, and it seems that it isn’t with regard to the web. We can’t use Pluto (though we LOVE the idea) with PlutoHooks.jl hosted on a web site, and we can’t use Stipple.jl because it freezes (probably due to pacing issues). We discussed a 1 second delay, but that isn’t going to work for some other applications we have (lab feeds, telemetry from IoT devices), which are sub-second.

We think that one reason you are not seeing the freeze is the delay you introduced in the test code you posted:

```
sleep(0.05)
```

That is a delay which we cannot accept at this time. We hope this is the case, as many people, we think, would use Stipple.jl with a delay of that duration. Just not us.

We don’t think the console code you gave us would help this situation as the MWE is more suited to the greater good.

thanks again