How to df[:newConcatenatedCol] = df[:stringCol] * string(df[:intCol])?

question

#1

How can I convert a column of integers (or floats) into a column of strings and concatenate it with other string column(s)?
e.g.:

df = DataFrame(a=["aa","ab","ac"],year=[2015,2016,2017])
df[:c] = df[:a] * " " * string(df[:year])

That doesn’t work (string(df[:year]) returns "[2015,2016,2017]" and not ["2015","2016","2017"])
From here I learned I can do:

df = DataFrame(a=["aa","ab","ac"],year=[2015,2016,2017])
df[:yearString] = [string(x) for x in df[:year]] 

but, besides being longer, I still get a MethodError when I then try to concatenate the columns:

df[:c] = df[:a] * " " * df[:yearString]

(btw, df[:c] = df[:a] * " " alone works…)
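
The difference between converting the whole vector and converting each element can be reproduced on plain vectors, without DataFrames. This is a minimal sketch, assuming a recent Julia (1.x) where the dot-call syntax `f.(x)` is available; the names `whole` and `per_element` are only illustrative:

```julia
year = [2015, 2016, 2017]

# string applied to the vector itself yields a single String
# that represents the whole array
whole = string(year)

# broadcasting string over the vector converts each element,
# which is equivalent to the comprehension [string(x) for x in year]
per_element = string.(year)
```

Here `whole` is one `String`, while `per_element` is a `Vector{String}` with one entry per row, which is the shape needed for a new column.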


#2

Found it… to concatenate a column with a constant I need to use *, while to concatenate between columns I need to use the vectorised version .*:

df = DataFrame(a=["aa","ab","ac"],year=[2015,2016,2017])
df[:yearString] = [string(x) for x in df[:year]] 
df[:c] = df[:a] * " " .* df[:yearString] 

or

df = DataFrame(a=["aa","ab","ac"],year=[2015,2016,2017])
df[:c] = map((x,y) -> string(x, " ", y), df[:a], df[:year])
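
The `map` variant can also be written as a broadcast call to `string`, which converts and concatenates in one step. This is a sketch on plain vectors, assuming a recent Julia (1.x); the variable names `via_map` and `via_broadcast` are hypothetical:

```julia
a = ["aa", "ab", "ac"]
year = [2015, 2016, 2017]

# element-wise concatenation with map, as in the post above
via_map = map((x, y) -> string(x, " ", y), a, year)

# the same operation as a broadcast: string is called once per row,
# and the scalar " " is reused for every row
via_broadcast = string.(a, " ", year)
```

Both forms avoid the intermediate string column, since `string` accepts the integer directly.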

#3

You should use .* in both cases since you want to operate element-wise.
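
As an illustration of using `.*` throughout, here is a minimal sketch. It assumes a recent Julia (1.x) and plain `Vector`s rather than the DataArrays-backed columns used in this thread, so it may not behave the same way on Julia 0.5:

```julia
a = ["aa", "ab", "ac"]
year = [2015, 2016, 2017]

# every operation is element-wise: string.() converts each year,
# and .* concatenates row by row, treating " " as a scalar
c = a .* " " .* string.(year)
```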


#4

@nalimilan Actually I did try, but you get a MethodError when you use df[:a] .* " " :

df = DataFrame(a=["aa","ab","ac"],year=[2015,2016,2017])
df[:yearString] = [string(x) for x in df[:year]] 
df[:c] = df[:a] .* " " .* df[:yearString]

MethodError: no method matching .*(::String, ::String)
Closest candidates are:
.*{T<:AbstractString}(!Matched::Array{T<:AbstractString,1}, ::AbstractString) at strings/basic.jl:85
.*{T<:AbstractString}(::AbstractString, !Matched::Array{T<:AbstractString,1}) at strings/basic.jl:86
.*(!Matched::DataArrays.DataArray{T,N}, ::AbstractString) at /home/lobianco/.julia/v0.5/DataArrays/src/operators.jl:240

in macro expansion at /home/lobianco/.julia/v0.5/DataArrays/src/operators.jl:244 [inlined]
in macro expansion at /home/lobianco/.julia/v0.5/DataArrays/src/utils.jl:17 [inlined]
in .*(::DataArrays.DataArray{String,1}, ::String) at /home/lobianco/.julia/v0.5/DataArrays/src/operators.jl:242
in include_string(::String, ::String) at ./loading.jl:441
in eval(::Module, ::Any) at ./boot.jl:234
in (::Atom.##67#70)() at /home/lobianco/.julia/v0.5/Atom/src/eval.jl:40
in withpath(::Atom.##67#70, ::Void) at /home/lobianco/.julia/v0.5/CodeTools/src/utils.jl:30
in withpath(::Function, ::Void) at /home/lobianco/.julia/v0.5/Atom/src/eval.jl:46
in macro expansion at /home/lobianco/.julia/v0.5/Atom/src/eval.jl:109 [inlined]
in (::Atom.##66#69)() at ./task.jl:60


#5

Ah, sorry. That was a bug in DataArrays, which seems to be fixed on the latest master with Julia 0.6.


#6

You can try DataFramesMeta.jl.

using DataFramesMeta
@transform(df, c = mapslices(x -> join(x), [:a :year], [2]) |> vec)

You might also want to check Query.jl.