Question regarding how replace! works:

Hi all, I am having some issues using replace! in my DataFrame. I get the following error when trying to change some strings in a column, so when I create a boxplot they are better labeled:

ArgumentError: string too large (8) to convert to InlineStrings.String7

This is the code that I’m using:

df_A.col_1 .= replace!(df_A.col_1, "Amylo" => "Aymloids")

But when I try the function with a short string, e.g. "A", it works just fine.

I assume that it is a simple problem, but I can’t find why it is happening.

Thanks in advance,

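Likely you read your data in with CSV.jl, which for better performance stores short strings as InlineStrings (fixed-size types such as String7, which holds at most 7 bytes) rather than regular Julia Strings. replace! modifies the column in place, so every new value must still fit the column's element type, and the 8-byte "Aymloids" does not fit in a String7. That is also why a short replacement like "A" works.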
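To make the failure concrete, here is a minimal sketch that reproduces it without any file. It assumes the column was inferred as String7, as the error message suggests; the two-element vector `col` is a made-up stand-in for the real column.

```julia
using InlineStrings  # the package that provides String7 and friends

# Hypothetical stand-in for a column that CSV.jl inferred as String7.
col = String7["Amylo", "Ctrl"]

# In-place replace must keep the element type String7,
# so the 8-byte replacement cannot be stored and an error is thrown.
err = try
    replace!(col, "Amylo" => "Aymloids")
    nothing
catch e
    e
end
println(err isa ArgumentError)

# A 1-byte replacement fits in a String7, so the in-place call succeeds.
replace!(col, "Amylo" => "A")
println(col)
```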

There are a couple of possible fixes. First, you could drop the in-place replace! and use the broadcast, non-mutating replace.() instead, assigning the result back to the column. Alternatively, CSV.jl has a keyword argument for reading strings in as regular Strings. EDIT: It is called stringtype, so passing stringtype=String when reading the file would work.
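A rough sketch of both fixes, assuming the data is read with CSV.read into a DataFrame. The inline `csv_text` is invented for illustration and stands in for the real file; `df_B` is just a second copy used to show the other option.

```julia
using CSV, DataFrames

# Made-up two-row table standing in for the real file.
csv_text = "col_1\nAmylo\nCtrl\n"

# Default read: short strings typically come back as an InlineStrings type.
df_A = CSV.read(IOBuffer(csv_text), DataFrame)

# Fix 1: broadcast the non-mutating replace over the column and assign the result.
# The new column holds regular Strings, so the length no longer matters.
df_A.col_1 = replace.(df_A.col_1, "Amylo" => "Aymloids")

# Fix 2: ask CSV.jl for regular Strings up front; replace! then works in place.
df_B = CSV.read(IOBuffer(csv_text), DataFrame; stringtype=String)
replace!(df_B.col_1, "Amylo" => "Aymloids")
```

Fix 1 leaves the way the file is read untouched, while Fix 2 avoids the issue for every string column at once, at the cost of the InlineStrings optimization.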

Oh, I swore I tried with replace.()! It worked great, and thanks for the explanation. I will also try using stringtype, and read the documentation.

Thanks!

cc @quinnj

One fix for this is to call df_A.col_1 = string.(df_A.col_1) to make it a “normal” string type. The current type is an optimization to reduce the memory footprint, make gc easier, and read strings in faster.
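As a small illustration of that conversion, here is a sketch on a bare vector rather than a DataFrame column. The vector `col` is a hypothetical stand-in for a String7 column.

```julia
using InlineStrings

# Hypothetical stand-in for a String7 column.
col = String7["Amylo", "Ctrl"]

# Convert every element to a regular String.
plain = string.(col)

# With a plain String column, the in-place replace accepts a longer value.
replace!(plain, "Amylo" => "Aymloids")
println(plain)
```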
