using CSV, HTTP, DataFrames, Statistics, Plots, Dates
# Get Data
link = "https://github.com/azev77/Synthetic_Control_in_Julia/raw/main/HP_q.csv"
r = HTTP.get(link)
df = CSV.read(r.body, DataFrame, missingstring="NA")
y = df[(df.loc .== "adelaide"), "HPGyy"]
time = unique(df.Date)
plot(legend = :topleft)
plot!(time, y, lab="adelaide", color="green")
xaxis!(xrotation=45)
This gives a plot whose x-axis ticks show the full date strings.
However, I only want the year, e.g. “2012” and not “01jun2012”.
The following does not work because Plots.jl doesn’t like repeated x-values:
ttt = SubString.(time, 6, 9) # String of years for plots...
plot(legend = :topleft)
plot!(ttt, y, lab="adelaide", color="green")
xaxis!(xrotation=45)
I can’t test this right now, but that range is supposed to define the positions of the ticks. So make it a subset of your time vector, and use that subset converted to strings as the labels.
Edit: I see now that your x-axis is string-only. To play it safe, I would suggest the following:
plot(legend = :topleft)
plot!(eachindex(time), y, lab="adelaide", color="green")
xaxis!(xrotation=45, xticks=(1:4:41, [time[i][end-3:end] for i in 1:4:41]))
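If the length of the series is not known in advance, the hard-coded `1:4:41` could be derived from the data instead. A minimal sketch, assuming quarterly date strings like the ones in the question (the `time` values below are made up for illustration, not taken from the CSV):

```julia
# Made-up quarterly date strings in the same "ddmonyyyy" style as the question.
time = ["01mar2010", "01jun2010", "01sep2010", "01dec2010",
        "01mar2011", "01jun2011", "01sep2011", "01dec2011",
        "01mar2012"]

# Every fourth index, i.e. one tick per year for quarterly data.
positions = 1:4:length(time)

# Last four characters of each selected string, i.e. the year.
labels = [last(time[i], 4) for i in positions]

# With Plots loaded, these would be passed as xticks = (positions, labels).
```

This only lines up with calendar years if the data really is quarterly with no gaps, so treat it as a quick fix rather than a general solution.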
My gut feeling is that it is more correct to use a regex, maybe
match.(r"\d{4}$", time)
It will give the same results, but feels somehow more correct, and will return nothing if the format is wrong.
Unfortunately, you have to access the actual matches by reading the fields of the match objects (m.match), which complicates broadcasting. If you create your own function
matchstring(m::RegexMatch) = m.match
you can do
years = matchstring.(match.(r"\d{4}$", time))
Too bad there are no default access methods for regex matches; this sets regexes apart from the rest of the base language.
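One caveat with a helper that only accepts `RegexMatch`: `match` returns `nothing` when the format is wrong, and broadcasting the helper over such a result would throw a `MethodError`. A small sketch that handles both cases in one function (the name `trailing_year` and the sample strings are hypothetical, just for illustration):

```julia
# Made-up inputs: two well-formed dates and one malformed entry.
dates = ["01mar2012", "01jun2012", "bad-value"]

# Return the trailing four-digit year as a String,
# or `nothing` if the string does not end in four digits.
function trailing_year(s::AbstractString)
    m = match(r"\d{4}$", s)
    return m === nothing ? nothing : String(m.match)
end

years = trailing_year.(dates)
```

Keeping the `nothing` in the result makes malformed rows easy to spot with `findall(isnothing, years)` before plotting.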
Don’t work with strings at all! Use the Dates standard library.
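For example, strings in the “01jun2012” style can be parsed into `Date` values, and the year then comes out as an integer. A minimal sketch, assuming every entry follows the day, abbreviated English month, four-digit year pattern (the `raw` vector is made up for illustration):

```julia
using Dates

# Made-up sample in the same format as the Date column in the question.
raw = ["01mar2012", "01jun2012", "01sep2012"]

# dd = two-digit day, uuu = abbreviated month name, yyyy = four-digit year.
fmt = dateformat"dduuuyyyy"

dates = Date.(raw, fmt)   # Vector{Date}
years = year.(dates)      # Vector{Int}
```

Unlike the substring approach, parsing throws an error on a malformed entry instead of silently producing a wrong label.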
Also, the line
time = unique(df.Date)
is very scary. How do you know the data is already sorted?
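To make the concern concrete: `unique` keeps values in order of first appearance, so its output is only chronological if the input already was. Sorting the raw strings does not help either, since they sort alphabetically rather than by date. A small sketch with made-up, deliberately shuffled dates:

```julia
using Dates

# Made-up, out-of-order sample with one duplicate.
raw = ["01jun2012", "01mar2012", "01jun2012", "01sep2011"]
dates = Date.(raw, dateformat"dduuuyyyy")

# unique() preserves first-appearance order, so this is not chronological.
as_found = unique(dates)

# Sorting real Date values gives the chronological order.
chronological = sort(unique(dates))

# Sorting the strings is alphabetical, which is not chronological here.
string_sorted = sort(unique(raw))

was_already_sorted = issorted(as_found)
```

A cheap safeguard is to check `issorted` on the parsed dates before plotting, or simply to sort the whole table by date first.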
Try this
using Chain, DataFramesMeta, CSV, Dates, HTTP, Plots # please put your usings in MWEs
# Get Data
link = "https://github.com/azev77/Synthetic_Control_in_Julia/raw/main/HP_q.csv"
r = HTTP.get(link)
df = CSV.read(r.body, DataFrame, missingstring="NA")
p = @chain df begin
@transform :Date = Dates.Date.(:Date, dateformat"dduuuyyyy")
@subset :loc .== "adelaide"
@orderby :Date
begin
y = _.HPGyy
global t = _.Date
plot(legend = :topleft)
plot!(t, y, lab = "Adelaide", color = "green")
xaxis!(xrotation = 45)
end
end
It doesn’t solve your original problem entirely, since it doesn’t just show the year. But the defaults are nicer when using an actual date type.
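To get year-only labels while keeping a real date axis, one option is to compute the tick positions and labels yourself. The sketch below uses only the Dates standard library; `year_ticks` is a hypothetical helper, and the sample range stands in for the real `t`. It assumes Plots places `Date` values on the axis by their `Dates.value`, which is my understanding but worth checking against your Plots version.

```julia
using Dates

# Hypothetical helper: one tick per calendar year,
# placed at the first observation of that year.
function year_ticks(dates::AbstractVector{Date})
    sorted = sort(unique(dates))
    firsts = Date[]
    for d in sorted
        # Keep a date only when its year differs from the last kept one.
        if isempty(firsts) || year(d) != year(last(firsts))
            push!(firsts, d)
        end
    end
    return Dates.value.(firsts), string.(year.(firsts))
end

# Made-up quarterly dates standing in for the real series.
t = collect(Date(2010, 3, 1):Month(3):Date(2012, 12, 1))
positions, labels = year_ticks(t)

# With Plots loaded, something like:
# plot(t, y; xticks = (positions, labels), xrotation = 45)
```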