I am currently tracking progress during iterative processes in two different domains: numerical solution of nonlinear equations, and machine learning. So far I have simply been pushing rows to a DataFrame inside the iterative loop.
However, I have just come across JuliaML/ValueHistories.jl and JuliaLogging/TensorBoardLogger.jl. Such a logging approach looks like it was made specifically for what I am doing, which makes me feel I should look into it more.
On the other hand, plotting the DataFrame after training is really simple. There is a noticeable difference in overhead between the two tracking approaches:
```julia
julia> @btime begin
           mvh = MVHistory()
           for i in 1:100
               push!(mvh, :squared, i, i^2)
               push!(mvh, :cubed, i, i^3)
           end
       end
  11.200 μs (193 allocations: 10.92 KiB)

julia> @btime begin
           df = DataFrame(i=[], squared=[], cubed=[])
           for i in 1:100
               push!(df, [i i^2 i^3])
           end
       end
  44.500 μs (309 allocations: 17.98 KiB)

julia> @btime begin
           df = DataFrame(i=Int64[], squared=Int64[], cubed=Int64[])
           for i in 1:100
               push!(df, [i i^2 i^3])
           end
       end
  34.500 μs (309 allocations: 18.08 KiB)
```
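To be concrete about "really simple": after the loop finishes, plotting the tracked columns is a one-liner. This is a sketch assuming Plots.jl (any plotting package would work similarly), reusing the column names from the benchmark above:

```julia
using DataFrames, Plots  # Plots.jl is an assumption here, not part of the tracking itself

df = DataFrame(i=Int64[], squared=Int64[], cubed=Int64[])
for i in 1:100
    push!(df, [i i^2 i^3])
end

# plot both tracked series against the iteration index in one call
plot(df.i, [df.squared df.cubed], label=["i^2" "i^3"], xlabel="iteration")
```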
But this timescale is so small that it is not a problem for my application. Setting up TensorBoardLogger.jl, or extracting data from the types defined in ValueHistories.jl, seems like a bit of extra work. So before I do something like that, I was wondering:
What are the potential benefits of logging, as opposed to pushing to a DataFrame?
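For reference, the "extra work" I have in mind is along these lines. This is a sketch based on the documented APIs of the two packages; the log directory name is illustrative:

```julia
using ValueHistories

mvh = MVHistory()
for i in 1:100
    push!(mvh, :squared, i, i^2)
end
# extraction: get returns the iteration indices and values as two vectors
iters, vals = get(mvh, :squared)

# TensorBoardLogger.jl side: log through the standard Logging macros
using TensorBoardLogger, Logging
lgr = TBLogger("tb_logs/run")  # directory name is illustrative
with_logger(lgr) do
    for i in 1:100
        @info "progress" squared = i^2 cubed = i^3
    end
end
```

Viewing the TensorBoardLogger.jl output then also requires launching TensorBoard itself, which is part of the setup overhead I am weighing against the DataFrame approach.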