Suppose I wanna create some routine econ variables:
using FredData;
const f = Fred("Personal api_key");
#
d = get_data(f, "GDP"); y=d.df[:,4]
d = get_data(f, "PCEC"); c=d.df[:,4]
d = get_data(f, "GPDI"); i=d.df[:,4]
d = get_data(f, "GCE"); g=d.df[:,4]
d = get_data(f, "NETEXP"); nx=d.df[:,4]
Q: what is the best Julian way to automate this in a loop or something?
I had something in mind along the lines of:
vars = ["GDP", "PCEC", "GPDI", "GCE", "NETEXP"]
labs = [y, c, i, g, nx]
for (ii, dd) in enumerate(vars)
    d = get_data(f, dd)                  # get data, e.g. "GDP"
    labs[ii] = d.df[:,4]                 # create var, e.g. y = "GDP"
    filter!(x -> !isnan(x), labs[ii])    # remove NaN from var
end
dict = Dict{Symbol, Any}()
vars = Dict{String, Symbol}("GDP" => :y,
                            "PCEC" => :c,
                            "GPDI" => :i,
                            "GCE" => :g,
                            "NETEXP" => :nx)
for (k, v) in vars
    d = get_data(f, k)
    dict[v] = d.df[:, 4]  # fourth column, as in the question
end
# example of using variable
dict[:nx] # some value
# Or even
dict[vars["NETEXP"]] # the same as dict[:nx]
If you know types in advance it's better to use them instead of Any.
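To illustrate the point about concrete types, here is a minimal sketch. It assumes the series are plain Float64 vectors, and `fetch_series` is a hypothetical stand-in for `get_data(f, id).df[:, 4]` so that it runs without an API key; the names `series` and `data` are mine, not from the code above.

```julia
# Hypothetical stand-in for `get_data(f, id).df[:, 4]`; returns made-up numbers.
fetch_series(id::AbstractString) = Float64[length(id), NaN, 2 * length(id)]

# FRED series ID => short variable name
series = Dict("GDP" => :y, "PCEC" => :c, "GPDI" => :i, "GCE" => :g, "NETEXP" => :nx)

# Concrete value type instead of Any
data = Dict{Symbol, Vector{Float64}}()
for (id, name) in series
    data[name] = filter(!isnan, fetch_series(id))  # drop NaN entries
end

data[:nx]  # the NETEXP series
```

With the value type fixed, the compiler knows that `data[:nx]` is a `Vector{Float64}`, which it cannot know for a `Dict{Symbol, Any}`.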
Does anyone here know how to generate (non-dictionary) variables in a loop? (my original question)
It's very, very easy in Stata, where we do it all the time:
clear*
set obs 10
gen yr = _n
gen treated = 1*(yr >= 3)
* dummies for each year post treatment
forvalues y = 0(1)7 {
    gen treated_p`y' = 1*(yr == 3 + `y' )
}
* dummies for each year pre treatment
forvalues y = 1(1)2 {
    gen treated_m`y' = 1*(yr == 3 - `y' )
}
Remember that Stata doesn't have the same constraints as Julia. There is never any dispute about what the variable x represents: it's always a column in the data set.
Of course, it is easy to do this in DataFrames. Part of the switch to using Strings as column names is to make this kind of Stata workflow easier.
for y in 0:7
    # 1 if yr == 3 + y, else 0 (mirrors the Stata loop above)
    df[!, "treated_p$y"] = 1 .* (df.yr .== 3 + y)
end
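For anyone who wants to run it, here is a sketch of the whole Stata example ported to DataFrames, pre-treatment dummies included. It assumes a reasonably recent DataFrames.jl (one that accepts String column names) and builds `df` from scratch; `Int.(...)` plays the role of Stata's `1*(...)`.

```julia
using DataFrames

df = DataFrame(yr = 1:10)            # set obs 10; gen yr = _n
df.treated = Int.(df.yr .>= 3)       # gen treated = 1*(yr >= 3)

# dummies for each year post treatment
for y in 0:7
    df[!, "treated_p$y"] = Int.(df.yr .== 3 + y)
end

# dummies for each year pre treatment
for y in 1:2
    df[!, "treated_m$y"] = Int.(df.yr .== 3 - y)
end
```

The new columns are then available as `df.treated_p0`, `df.treated_m1`, and so on.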
I think that, in this case, writing a macro may be your best option, unless you are satisfied with a solution like:
function get_fourth_column(d) # unnecessary but simplifies
    return d.df[:, 4]
end
fields = ["GDP", "PCEC", "GPDI", "GCE", "NETEXP"]
y, c, i, g, nx = get_fourth_column.(get_data.((f,), fields))
EDIT: Although realistically I would probably rather have a more general solution so that it's easier to construct many different arguments inside a transform call.
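As for the macro route, one possible reading is to use `@eval` in a loop, which is the closest thing to the Stata pattern of generating named variables. This is only a sketch: `fetch_series` is a hypothetical stand-in for `get_data(f, id).df[:, 4]`, and `@eval` always evaluates at global scope, so this creates global variables and will not create locals inside a function.

```julia
# Hypothetical stand-in for `get_data(f, id).df[:, 4]`; returns made-up numbers.
fetch_series(id::AbstractString) = Float64[length(id), NaN, 2 * length(id)]

names_and_ids = [(:y, "GDP"), (:c, "PCEC"), (:i, "GPDI"), (:g, "GCE"), (:nx, "NETEXP")]

for (name, id) in names_and_ids
    # Builds and evaluates `y = filter(!isnan, fetch_series("GDP"))`, and so on.
    @eval $name = filter(!isnan, fetch_series($id))
end
```

After the loop, `y`, `c`, `i`, `g` and `nx` exist as ordinary globals. For anything beyond a script, the dictionary or the broadcast-and-destructure versions are probably easier to reason about.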