I have a database, Microsoft Access (Yes I know but I don’t know SQL well enough yet) format, of woody plant stem growth. There are 5 plants with 5 shoots each measured for a total of 25. The measurement of the same shoot was repeated every 4 days for 7 periods. These shoots were measured starting before the bud broke so there are string indicators in the database to signify the stage the bud was at the time of measuring. BB=Bud Break, BS=Bud Swell, LO=Leaf out and N/A means the shoot was missing for what ever reason (usually eaten). These codes are there to signify what stage of growth the bud is in and will be used in later analysis so I cannot just change them to 0 in the database.
When I use ODBC.jl to import the dataset by querying the database with an SQL query, It returns with the length measured in a string format, and places it into a DataFrame as a string. Not surprising given all those LO and BB and the BS but for the the purposes of this program, each one of them is a zero.
In order to convert the LO’s, BB’s, BS’s and N/A’s to zero I have used this block of code which seems terribly inelegent and redundant but I cannot get the ‘|’ or ‘||’ operator to work. If anyone has a better more elegant way please share.
replace!(df.Length,"lo"=>"0") replace!(df.Length,"bb"=>"0") replace!(df.Length,"bs"=>"0") replace!(df.Length,"n/a"=>"0")
However, as it want to happen sometimes the shoot didn’t get measured. It either got knocked off, or was not visible to or was just plain missed that trip. And this is the rub.
I want to convert the string values in the DataFrame column to Float64 but the missings that are there throws Julia off which causes her to throw an error. I want to maintain the missings in the Length column so I can use regression to estimate what that length would have been has the measurement actually been and replace the missing with a calculated value. I cannot figure out how to do the conversion from String to Float64 while maintaining the “missings”