Initializing a dataframe

Here are some further experiments, getting closer to what I really want. At this point, it is no longer about my original problem, but about me being obsessed with getting a specific result.

struct A end
struct B end

function getCurrents(::A)
    random = rand()
    println("A")
    J_δ = [8, 6]
    J_β = [3, 23]
    return (J_δ = J_δ, J_β = J_β, c = random)
end

function getCurrents(::B)
    dict = Dict()
    random = rand()
    dict[:J_δ] = [8, 6]
    dict[:J_β] = [3, 25]
    return dict
end

a = A();  output_named_tuple = getCurrents(a)
for (k,v) in zip(keys(output_named_tuple), output_named_tuple)
  # eval evaluates in the global context
  eval(:($k = $v))
end

b = B(); dict = getCurrents(b)

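# Iterating a Dict yields key => value pairs, so destructure them directly;
# zip(keys(dict), dict) would bind v to the whole pair instead of the value.
for (k, v) in dict
  # eval evaluates in the global context
  eval(:($k = $v))  # Defines J_β, J_δ
end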

My function getCurrents returns either a NamedTuple or a dictionary. When it returns a dictionary, J_δ and J_β each appear only once. In the original solution, J_β and J_δ each appear three times, which violates the DRY principle. Lower down, I have two loops, one over the NamedTuple and one over the dictionary. In both cases, the variables J_β and J_δ are instantiated in the global space (perhaps the wrong term in Julia), in the sense that in any function I could type something like

  z = J_δ + J_β

and I would get a result.

My real objective is to specify the variables I wish to track only ONCE, have my callback collect the values of these variables once per time step of an ODE solver, and store the results in a DataFrame. Of course it is possible. Anything is possible: after all, that is what packages and modules are all about. They create easy-to-use functionality that is not simple with the current structure of Julia and package it for the user.
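A minimal sketch of that objective, using only the standard library. The names currents and record! are hypothetical, the formulas are placeholders, and a plain loop stands in for the ODE solver and its callback. The tracked names are written once, as the field names of the returned NamedTuple.

```julia
# Hypothetical stand-in for the quantities computed in the right-hand side.
# The tracked names appear only here, as the NamedTuple's field names.
currents(t) = (t = t, J_δ = 8.0 * t, J_β = 3.0 + t)

# One row per time step; the element type is taken from a sample call.
rows = Vector{typeof(currents(0.0))}()

# Stand-in for a solver callback: a real solver would call this once per step.
record!(t) = push!(rows, currents(t))

for t in 0.0:0.5:2.0   # plain loop in place of an ODE solver
    record!(t)
end

# Column access by the names declared above.
J_δ_history = [r.J_δ for r in rows]
```

A vector of NamedTuples is a row table in the Tables.jl sense, so with DataFrames.jl loaded, DataFrame(rows) should turn the collected rows into a DataFrame whose column names are t, J_δ and J_β.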

I am open to any suggestions you might have. This little excursion has taught me about NamedTuples, Structures, zip, loops of various kinds and the most basic form of metaprogramming.

One question: I wonder how efficient or inefficient my approach is. Note that efficiency is not the point here, but feasibility.

When using DifferentialEquations.jl, if I must unpack a dictionary every time the right-hand side is invoked, there might be a penalty. Those are experiments I might drive myself to run.
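Such an experiment could be sketched roughly as follows. This is a crude timing with @elapsed, not a careful benchmark; the helper names use_nt, use_dict and repeat_calls are hypothetical, and the compiler may optimize the two loops differently, so the numbers are only indicative. BenchmarkTools.jl would be the better tool for a serious measurement.

```julia
# Same data, once as a NamedTuple and once as a concretely typed Dict.
nt = (J_δ = [8, 6], J_β = [3, 23])
d  = Dict{Symbol,Vector{Int}}(:J_δ => [8, 6], :J_β => [3, 23])

# "Unpack" both entries and do a trivial amount of work with them.
use_nt(x)   = x.J_δ[1] + x.J_β[1]
use_dict(x) = x[:J_δ][1] + x[:J_β][1]

# Call f(x) n times, as a right-hand side would be called repeatedly.
function repeat_calls(f, x, n)
    s = 0
    for _ in 1:n
        s += f(x)
    end
    return s
end

# Warm up first so that compilation time is not measured.
repeat_calls(use_nt, nt, 1)
repeat_calls(use_dict, d, 1)

t_nt   = @elapsed repeat_calls(use_nt, nt, 10^6)
t_dict = @elapsed repeat_calls(use_dict, d, 10^6)
println("NamedTuple: ", t_nt, " s, Dict: ", t_dict, " s")
```

Note that the Dict() returned by getCurrents(::B) is a Dict{Any,Any}, so its values are not concretely typed; the typed Dict used here is likely the more favorable case for the dictionary.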

Another issue: in the right-hand side routine, without the callback, J_β and J_δ are variables local to the method. In the current approach, J_δ and J_β are defined in the global space, which is never a good idea. So one question to answer is whether it is possible to apply a macro to create a variable in a local context of some kind.
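One macro-free possibility, sketched here under the assumption of Julia 1.7 or later, is property destructuring of the returned NamedTuple. The functions currents and rhs_sum are hypothetical stand-ins, not the original right-hand side.

```julia
# Hypothetical stand-in for getCurrents(::A), with a fixed value for c.
currents() = (J_δ = [8, 6], J_β = [3, 23], c = 0.5)

function rhs_sum()
    # Property destructuring (Julia 1.7 or later): J_δ and J_β become
    # locals of rhs_sum; nothing is written to global scope.
    (; J_δ, J_β) = currents()
    return J_δ + J_β
end

z = rhs_sum()
```

The names are still repeated once at the point of use, but they stay local to the function, and no eval is involved.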

I have read the following three links:

Thank you,

Gordon

I suspect that there is a simple and idiomatic solution, but I have to admit that with this meandering topic I no longer have a clear idea what the problem is.

If you want to collect the arguments a function was called with, you could define a container and make it callable. E.g.

struct CollectingVector{T}
    vector::Vector{T}
end

CollectingVector{T}() where T = CollectingVector(Vector{T}())

function (cv::CollectingVector)(x)
    push!(cv.vector, x)
    nothing
end

julia> cv = CollectingVector{typeof((J_δ = 1.0, c =  1.0))}(); # template for type

julia> cv((J_δ = 1, c = 2))

julia> cv.vector
1-element Array{NamedTuple{(:J_δ, :c),Tuple{Float64,Float64}},1}:
 (J_δ = 1.0, c = 2.0)

You almost certainly don’t need metaprogramming, but instead of learning about very basic things like loops and composite types (struct) in the course of solving the problem, you may benefit from just working through the manual first.

Julia is a powerful language, but you won’t be able to harness that power without making an initial investment in some structured form.

There is:

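using DataFrames

colnames = Symbol.('A':'Z')
# One empty, typed vector per column fixes each column's element type up front.
# (Older DataFrames releases also accepted DataFrame(fill(Int, length(colnames)), colnames).)
df = DataFrame([Int[] for _ in colnames], colnames)

# now you can push your data row by row onto your df
for line in eachrow(rand(1:100, 100, length(colnames)))
    push!(df, Tuple(line))
end

The trick is the array of empty typed vectors, which fixes the element type of each column.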

Thank you, everybody, I have learned a lot from you, beyond my initial studies of the language. I appreciate it.

Just for reference, I have done lots of reading and experimentation, but have formed only first impressions of the language. Another possible approach I had not considered is using what may be the right tool for the job, if efficiency is not an issue: calling Python code. I will certainly keep reading all the great information out there and close the issue, which has certainly meandered.
