How does replace.() work on a string vector containing missing?

Both replace() and replace.() work on a string vector that contains no missing values:

julia> vs=["1.","2","3.","1"]
4-element Vector{String}:
 "1."
 "2"
 "3."
 "1"

julia> replace(vs, "1"=>"999")
4-element Vector{String}:
 "1."
 "2"
 "3."
 "999"

julia> replace.(vs, "1"=>"999")
4-element Vector{String}:
 "999."
 "2"
 "3."
 "999"

julia> vsm=["1.","2","3.",missing,"1"]
5-element Vector{Union{Missing, String}}:
 "1."
 "2"
 "3."
 missing
 "1"

julia> replace(vsm, "1"=>"999")
5-element Vector{Union{Missing, String}}:
 "1."
 "2"
 "3."
 missing
 "999"

# With a vector of strings that contains missing, replace.() cannot handle the missing value. Why?

julia> replace.(vsm, "1"=>"999")
ERROR: MethodError: no method matching similar(::Missing, ::Type{Any})

A workaround?

julia> vsm=["1.","2","3.",[missing],"1"]
5-element Vector{Any}:
 "1."
 "2"
 "3."
 [missing]
 "1"

julia> replace(vsm, "1"=>"999")
5-element Vector{Any}:
 "1."
 "2"
 "3."
 [missing]
 "999"

julia> replace.(vsm, "1"=>"999")
5-element Vector{Any}:
 "999."
 "2"
 "3."
 Union{Missing, String}[missing]
 "999"

Does this mean that the error is caused not by the missing value itself but by the fact that the missing is not wrapped in a vector?

I’m not sure why this is surprising. The issue is that replace is not defined for missing, but it is defined for Vectors.

This has nothing to do with broadcasting.

julia> replace(missing, "a" => "b")
ERROR: MethodError: no method matching similar(::Missing, ::Type{Any})

julia> replace([missing], "a" => "b")
1-element Vector{Union{Missing, String}}:
 missing
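
To make the connection to the broadcast error explicit, here is a minimal sketch that applies replace to each element separately, which is what replace.() does, and records which calls fail. The helper name `tryreplace` is made up for this illustration, and the exact exception type may depend on the Julia version.

```julia
# Hypothetical helper: apply replace to a single element and return the
# exception instead of throwing, so one bad element does not hide the rest.
function tryreplace(x, pair)
    try
        return replace(x, pair)
    catch err
        return err
    end
end

vsm = ["1.", "2", "3.", missing, "1"]

# Same element-by-element calls that replace.(vsm, "1" => "999") would make.
results = [tryreplace(x, "1" => "999") for x in vsm]

for (x, r) in zip(vsm, results)
    println(repr(x), " => ", r isa Exception ? "error: $(typeof(r))" : repr(r))
end
```

Only the call on the bare missing element should fail; the string elements go through the string method of replace as usual.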

I would suggest using passmissing

julia> using Missings

julia> passmissing(replace)(missing, "a" => "b")
missing

More than anything else, it is a curiosity that came up while trying to answer this request without using flow control.

It looks like just what was missing from the previous proposals for that problem.

tocurr3(e)=passmissing(parse)(Int,passmissing(replace)(e, r"\$?(\d),?(\d+)( )?(USD)?"=>s"\1\2"))

df3=transform!(df, "Accommodation cost"=>ByRow(tocurr3)=>:ac_cost)

Feels like using passmissing once is neater:

tocurr3(e)=passmissing(e->parse(Int,replace(e, r"\$?(\d),?(\d+)( )?(USD)?"=>s"\1\2")))(e)
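
For anyone wondering what the pattern does on its own, here is a small sketch using only Base. The regex and substitution string are copied from the post above; the sample strings are made up and are not from the original data.

```julia
# Pattern and substitution taken from the post; inputs below are invented.
pat = r"\$?(\d),?(\d+)( )?(USD)?" => s"\1\2"

samples = ["\$1,200 USD", "\$987", "2,500"]

# Strip the currency symbol, thousands separator and "USD" suffix.
cleaned = [replace(s, pat) for s in samples]

# Convert the cleaned strings to integers.
costs = parse.(Int, cleaned)

println(cleaned)
println(costs)
```

Note that the pattern needs at least two digits to match, since it has one `(\d)` group followed by `(\d+)`.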

my question is precisely about this. Why does replace() work in one case and not in the other?

replace(::String,...) is a special case. You’ll find replace(1, 1=>999) doesn’t work either. Because what is the point of a replace method working on scalars - unless those scalars have some internal structure where substitution is meaningful like strings?

Okay, I had guessed this aspect. What I am missing is the reason (and I think there is one) why the behavior differs in the following two cases, where no strings or string vectors are involved, only missing and [missing]:


julia> replace(missing, "a" => "b")
ERROR: MethodError: no method matching similar(::Missing, ::Type{Any})

julia> replace([missing], "a" => "b")
1-element Vector{Union{Missing, String}}:
 missing

Does this help?

julia> replace("abc", "a" => "b")
"bbc"

julia> replace(["abc", "a", "b"], "a" => "b")
3-element Vector{String}:
 "abc"
 "b"
 "b"

Because the primary use of replace is to iterate a collection and replace any elements that equal e by s.
replace([1, 2, 3], 1=>999) == [999, 2, 3]. You can’t do this with scalars because it doesn’t make sense to iterate them. Strings are an exception because they are also a collection of substrings.

No, because in this case (i.e. without missing elements) replace.() also works:

julia> replace.(["abc", "a", "b"], "a" => "b")
3-element Vector{String}:
 "bbc"
 "b"
 "b"

I probably framed the question badly (the whole context was not clear to me at first).
What I wanted to clarify is what @pdeffebach pointed out,
which I rewrite below so as to take the specifics of strings (scalar and iterable at the same time) out of the question:

julia> replace(missing, 1 => 2)
ERROR: MethodError: no method matching similar(::Missing, ::Type{Any})

julia> replace([missing], 1 => 2)
1-element Vector{Union{Missing, Int64}}:
 missing

Right, but that’s completely analogous to

julia> replace(1, 1=>999)
ERROR...

julia> replace([1], 1=>999)
[999]

replace doesn’t work on scalars, only collections. missing isn’t a collection, but strings are, in a sense.

Note in your examples of replace(::Vector{String}) vs replace.(::Vector{String}), that they both “work”, but they don’t do the same thing. The first one treats each string as a scalar, and the second as a collection, so the results are different.
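
A short sketch of that difference, reusing the vector from the first post. The variable names `whole` and `each` are just for illustration.

```julia
vs = ["1.", "2", "3.", "1"]

# Collection method: an element is replaced only if it equals "1" as a whole.
whole = replace(vs, "1" => "999")

# Broadcast: the string method runs on every element and substitutes
# each occurrence of the substring "1" inside it.
each = replace.(vs, "1" => "999")

# The broadcast is the same as calling replace on each element in turn.
by_hand = [replace(s, "1" => "999") for s in vs]

println(whole)
println(each)
println(each == by_hand)

# In the collection method, missing is just another element that can be matched.
println(replace([1, missing], missing => 0))
```

So a missing inside a vector is fine for replace(v, ...), because it is only compared against the old value, while replace.(v, ...) hands the missing itself to replace as the thing to search in.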


mmmmh… maybe this clarifies my point better.
It is not so much being missing that generates the error as being an immutable scalar(!?).
So if there were a passONE analogous to passMISSING, we would get this result…

using Missings
passmissing(replace)(missing,1=>999)
#hand made
using Ones
passone(replace)(1,2=>999)
1

At this point I wonder whether it is correct for passmissing() to change this behavior in the case of missing.

I mean yeah, but that has nothing to do with replace specifically. For instance, there is no sort(scalar) method (I assume, not at a computer right now), so sort(missing) is an error. But passmissing(sort)(missing) is missing. passmissing doesn’t care whether its argument function has a method for missing, if any of that argument function’s arguments is missing, passmissing returns missing.

Yes I have seen passmissing() do this.
I wonder if this isn’t a stretch in cases where the wrapped functions are explicitly not applicable to that specific situation (a single missing with replace or sort, for example).

That’s exactly what is documented:

Return a function that returns missing if any of its positional arguments are 
missing (even if their number or type is not consistent with any of the 
methods defined for f) and otherwise applies f to these arguments.
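
To illustrate that documented behavior, here is a rough sketch of a wrapper with the same semantics, written with Base only. The name `passmissing_sketch` is made up, and this is not the Missings.jl implementation, just one way the described behavior could be reproduced.

```julia
# Hypothetical stand-in for passmissing: the wrapped function f is never
# called when any positional argument is missing, so it does not matter
# whether f has a method that accepts missing.
function passmissing_sketch(f)
    return function (args...; kwargs...)
        any(ismissing, args) && return missing
        return f(args...; kwargs...)
    end
end

println(passmissing_sketch(replace)(missing, "a" => "b"))
println(passmissing_sketch(replace)("abc", "a" => "b"))
println(passmissing_sketch(sort)(missing))
println(passmissing_sketch(sort)([3, 1, 2]))
```

Because the check happens before f is touched, the wrapper short-circuits for replace, sort, or any other function in the same way.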