I want to get the TotalCharges of females in ascending order so that I can plot them, but the sort function is not ordering the values numerically: as you can see, 102 comes after 1017. What should I do?
Read the data as floats in the first place. The blanks " " should be missing. That’s what it is for.
From where and how are you reading the data?
I am using CSV to read the data from a file. The blanks were in rows where the gender column is Male, so they went away on filtering.
How do I read the data as floats?
Try
chern = CSV.read(...path..., DataFrame; missingstring = " ")
if that doesn’t work just do
parse.(Float64, female.TotalCharges)
If that fails you’ll be able to work out what sort of non-numeric data you’ve got in the csv file.
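To illustrate why 102 ends up after 1017, here is a minimal sketch using only Base Julia. The sample values are made up, and it assumes the column was read as strings, which is what the sort order in the question suggests. It uses tryparse rather than parse so that the blank entry does not throw.

```julia
# Hypothetical sample values mimicking a column that was read as text.
charges = ["1017.5", "102", "99.9", " "]

# Strings are compared character by character, so "1017.5" sorts before "102".
sorted_as_text = sort(charges)

# tryparse returns nothing for entries that are not numbers (here the blank).
parsed = tryparse.(Float64, charges)

# Keep only the entries that parsed, then sort them numerically.
sorted_charges = sort(Float64[p for p in parsed if p !== nothing])
```

Entries for which tryparse returns nothing are the ones worth inspecting in the csv file.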
CSV.jl is usually very good at determining the proper column type. But you can force it to adopt a certain column type, as shown here: Examples · CSV.jl
Although you should try Nils’ and my suggestion (below) first.
blanks were places occupied by Male in gender column so on filtering
Okay, but ideally you would have a missing in the full dataframe wherever a value is absent. I suspect it’s because the CSV file has " " in those places instead of "". But you can customize which values are handled as missing: Examples · CSV.jl.
Can you try loading the file with CSV.read(pathtofile, DataFrame; missingstring=["", " "])
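For reference, a minimal sketch of that call on a small in-memory stand-in for the file. The csv_text contents are hypothetical, and the column type noted in the comment is what I would expect from CSV.jl's type detection rather than something checked against the real file.

```julia
using CSV, DataFrames

# Hypothetical miniature of the file: the Male row has " " in TotalCharges.
csv_text = "gender,TotalCharges\nFemale,1017.5\nMale, \nFemale,102\n"

# Treat both the empty string and a single space as missing.
df = CSV.read(IOBuffer(csv_text), DataFrame; missingstring=["", " "])

# With the blank read as missing, the column should be numeric
# (Union{Missing, Float64}) instead of a string column.
coltype = eltype(df.TotalCharges)

# skipmissing drops the missing entries before sorting.
sorted_charges = sort(collect(skipmissing(df.TotalCharges)))
```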
It is showing an error.
using Plots,DataFrames,CSV
chern=CSV.read(joinpath(dirname("/home/raman/Downloads/"),"a83a246242e4a760c6c4078e93ad481a0fcc66c973fe6a1bec4ff68f85fb9445_Telco-Customer-Churn.csv"),DataFrame ) ;missingstrings=' '
male=filter(:gender => ==("Male"),chern);
female=filter(:gender => ==("Female"),chern);
p1=plot(sort(male.TotalCharges) ,seriestype=:scatter);
sort(female.TotalCharges)
p2=plot(sort(female.TotalCharges) ,seriestype=:scatter, );
plot!(p1,p2)
The keyword is missingstring (singular), or if you use missingstrings (plural) you have to supply a vector of missing strings to match like @skleinbo has shown above.
How do I calculate the number of females in the file?
sum(chern.gender .== "Female")
but you might want to look at some introductory Julia and/or DataFrames tutorials which in the medium term is probably vastly more productive than asking loads of relatively basic questions here.
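A small sketch of the counting step, assuming the column is called gender as in the filter calls earlier in the thread. The vector here is a hypothetical stand-in for chern.gender.

```julia
# Hypothetical stand-in for chern.gender.
gender = ["Female", "Male", "Female", "Female"]

# Broadcasting the comparison gives a vector of Bool; summing counts the trues.
n_female_sum = sum(gender .== "Female")

# count with a predicate gives the same number without the temporary vector.
n_female = count(==("Female"), gender)
```

With the filtered dataframe from above, nrow(female) should give the same number.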