How can an if/else condition be applied to a data frame?
I want to use the first two columns as conditions and get an output at the end.
For example:
if "col A" == 4 && "col B" == 2
    print("Yes")
else
    print("No")
end
Please check this out.
A column of a dataframe is just a vector, so
for (a, b) in zip(df.A, df.B)
    # println puts each answer on its own line
    println((a == 4) && (b == 2) ? "Yes" : "No")
end
Or using broadcasting:
f(a,b) = (a==4) & (b==2) ? "Yes" : "No"
res = f.(df.A, df.B)
println(res)
How do you interpret the function broadcast?
df.A and df.B are arrays with the same length. The function f is applied to each “row” of the data frame separately, and its output is written into the array res.
Hope this answers your question!
Does (df.A, df.B) == zip(df.A, df.B)?
No. zip(df.A, df.B) gives a lazy iterator of 2-element tuples, one per row (call collect on it to get an array), whereas (df.A, df.B) is a single tuple of two arrays.
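The difference can be checked on plain vectors. This is only a sketch: A and B are made-up data standing in for df.A and df.B, and the names pairs_iter, pairs_vec and cols are chosen here for illustration.

```julia
# Hypothetical data standing in for the two data frame columns.
A = [4, 1, 4]
B = [2, 2, 3]

pairs_iter = zip(A, B)            # lazy iterator over (a, b) tuples
pairs_vec  = collect(pairs_iter)  # materialized vector of tuples
cols       = (A, B)               # one tuple that holds two vectors

f(a, b) = (a == 4) && (b == 2) ? "Yes" : "No"

# Broadcasting over the two columns...
res_broadcast = f.(A, B)
# ...gives the same result as looping over the zipped pairs.
res_zip = [f(a, b) for (a, b) in zip(A, B)]
```

Both res_broadcast and res_zip hold one string per row, so either form works; the broadcast version is simply shorter.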
So f.(df.A, df.B) applies f to every pair of arguments that it extracts from each row of columns df.A and df.B, and returns a vector of “Yes” or “No” strings?
Correct
Example:
using DataFrames
df = DataFrame(a=(1:10).+2, b=1:10)   # a runs from 3 to 12, b from 1 to 10
f(a,b) = (a==4) & (b==2) ? "Yes" : "No"
res = f.(df.a, df.b)                  # only the second row gives "Yes"
A few more details: the . after a function name is syntactic sugar. All dotted calls nested in a single expression are fused into one function and applied element-wise to their arguments using the higher-order function broadcast. So
sin.(cos.(xs) .+ 1) == broadcast(x -> sin(cos(x) + 1), xs)
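That equivalence can be tried out directly. A small sketch, assuming xs is any numeric vector (the values here are made up); the @. macro is shown as a third spelling that adds the dots for you.

```julia
xs = [0.0, 0.5, 1.0]   # hypothetical input values

fused    = sin.(cos.(xs) .+ 1)                    # dotted calls, fused into one loop
explicit = broadcast(x -> sin(cos(x) + 1), xs)    # the same thing written by hand
viamacro = @. sin(cos(xs) + 1)                    # @. dots every call and operator
```

All three produce a vector with the same length as xs, computed in a single pass without temporary arrays for the intermediate cos.(xs).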
The function broadcast is similar to map, but it also expands singleton array dimensions to match the other arguments, e.g.
julia> ones(10,2) .+ (1:10)
10×2 Array{Float64,2}:
2.0 2.0
3.0 3.0
4.0 4.0
5.0 5.0
6.0 6.0
7.0 7.0
8.0 8.0
9.0 9.0
10.0 10.0
11.0 11.0
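To make the dimension expansion concrete, here is a small sketch with made-up arrays; the names M, v, R, m and S are only illustrative.

```julia
M = ones(10, 2)
v = 1:10

# v behaves like a 10×1 column that is stretched across both columns of M.
R = M .+ v

# map pairs elements one-to-one when the shapes already match.
m = map(+, [1, 2, 3], [10, 20, 30])

# A 10-element vector combined with a 1×2 row expands to a 10×2 matrix.
S = v .+ [100 200]
```

In R both columns are identical, because the same vector v was added to each column of M.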
Strange that nobody mentioned ifelse.
julia> new_vector = ifelse.((df[!, "col_A"] .== 4) .& (df[!, "col_B"] .== 2), "Yes", "No")
julia> println(new_vector)
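The same idea on plain vectors, as a sketch: col_A and col_B are made-up data standing in for the two columns, and mask and labels are names picked for this example.

```julia
# Hypothetical data standing in for df.col_A and df.col_B.
col_A = [4, 4, 1]
col_B = [2, 3, 2]

# .== compares element-wise; .& combines the two Boolean masks element-wise.
mask = (col_A .== 4) .& (col_B .== 2)

# ifelse is an ordinary function, so both "Yes" and "No" are evaluated
# before the call; that is harmless for constants like these.
labels = ifelse.(mask, "Yes", "No")
println(labels)
```

Unlike the ternary operator, ifelse does not short-circuit, so it is best suited to cases like this one where both outcomes are cheap, already-known values.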