Only plot a subset of x

using CSV, HTTP, DataFrames, Statistics, Plots, Dates

# Get Data 
link = "https://github.com/azev77/Synthetic_Control_in_Julia/raw/main/HP_q.csv"
r = HTTP.get(link)
df = CSV.read(r.body, DataFrame, missingstring="NA")

y = df[(df.loc .== "adelaide"), "HPGyy"]
time = unique(df.Date)
plot(legend = :topleft)
plot!(time, y, lab="adelaide", color="green") 
xaxis!(xrotation=45)

This gives a plot with the full date strings as x tick labels.
However, I only want the year, e.g. “2012” and not “01jun2012”.

The following does not work because Plots.jl doesn’t like repeated x-values:

ttt = SubString.(time, 6, 9)        # String of years for plots...
plot(legend = :topleft)
plot!(ttt, y, lab="adelaide", color="green") 
xaxis!(xrotation=45)
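
A quick way to see the problem, as a sketch only: it assumes Plots.jl treats a vector of strings as categories, so identical labels end up at the same x position. The short `time_strs` vector is made-up stand-in data, not the real CSV.

```julia
# Made-up stand-in for the quarterly date strings in the CSV.
time_strs = ["01mar2012", "01jun2012", "01sep2012", "01dec2012", "01mar2013"]

# Same slicing as above: characters 6 to 9 hold the year.
ttt = SubString.(time_strs, 6, 9)

# Count the observations and the distinct x labels that survive the slicing.
n_obs = length(time_strs)
n_labels = length(unique(ttt))
println("observations: ", n_obs, ", distinct x labels: ", n_labels)
```

With fewer distinct labels than observations, the points cannot each get their own x position.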

I lost a few hours on this. Please help.

So you want all the data points, but only want the x ticks to be the year? Why not just modify xticks?

Yes. How?

This gives it, I think:

plot(time, y, xticks = (1:4:41, ttt[1:4:41]), lab="adelaide", color="green")
plot!([14], seriestype = :vline, lab="Tax", color="black")


There is a problem: the positions of the xticks do not correspond to the appropriate indices:

plot(time, y, xticks = (1:4:41, ttt[1:4:41]), lab="adelaide", color="green")
plot!([1:4:41], seriestype = :vline, lab="1:4:41", color="black")
plot!([0:4:40], seriestype = :vline, lab="0:4:40", color="blue")
plot!([0.5:4:40.5], seriestype = :vline, lab="0.5:4:40.5", color="orange")


Does it work if you put some filter of time in the place of the range in the definition of xticks?

What do you mean?

I can’t test now. But that range is supposed to define the positions of the ticks. So make it a subset of your time vector, and make the labels that same subset converted to strings.

Edit: I see now that you have an x-axis which is string-only. To play it safe, I would suggest the following:

plot(legend = :topleft)
plot!(eachindex(time), y, lab="adelaide", color="green")
xaxis!(xrotation=45,xticks=(1:4:41,[ time[i][end-3:end] for i in 1:4:41 ]))
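
If the series does not start in the first quarter, or has gaps, the fixed range 1:4:41 can drift off the year boundaries. A rough sketch of one alternative is to put a tick at the first observation of each year. The names `time_strs`, `years`, `pos` and `labels` are illustrative only, and the data is made up.

```julia
# Made-up stand-in for the sorted date strings of one location.
time_strs = ["01sep2011", "01dec2011", "01mar2012", "01jun2012",
             "01sep2012", "01dec2012", "01mar2013"]

# The last four characters of each string are the year.
years = last.(time_strs, 4)

# Index of the first observation in each year, used as the tick position.
pos = [findfirst(==(yr), years) for yr in unique(years)]
labels = years[pos]

# With Plots loaded, these would be passed as:
#   plot!(eachindex(time_strs), y, xticks = (pos, labels))
println(pos, " => ", labels)
```

Because the x values are plain indices here, tick positions and data positions share the same scale.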


My gut feeling is that it is more correct to use regexes, maybe

match.(r"\d{4}$", time)

It will give the same results, but feels somehow more correct, and will return nothing if the format is wrong.

Unfortunately, you have to access the actual matches by reading the fields of the match objects (m.match), which complicates broadcasting. If you create your own function

matchstring(m::RegexMatch) = m.match

you can do

years = matchstring.(match.(r"\d{4}$", time))

Too bad there are no default access methods for regex matches, it sets regexes apart from the rest of the base language.
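
One way to keep the broadcast from throwing when a string does not match, sketched under the assumption that a bad entry should become `missing` rather than an error. `year_or_missing` is a hypothetical helper, not something from Base, and the input strings are made up.

```julia
# Hypothetical helper: return the trailing 4-digit year, or `missing`
# when the string does not end in four digits.
function year_or_missing(s::AbstractString)
    m = match(r"\d{4}$", s)
    return m === nothing ? missing : String(m.match)
end

# Made-up inputs: two well-formed strings and one malformed one.
years = year_or_missing.(["01jun2012", "01sep2013", "n/a"])
println(years)
```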

Don’t work with strings at all! Use the Dates standard library.

Also, the line

time = unique(df.Date)

is very scary. How do you know the data is already sorted?
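
A cheap check for that worry, as a sketch with made-up strings: parse to Date and ask `issorted`. Sorting the raw strings instead would order them alphabetically, not chronologically.

```julia
using Dates

# Made-up date strings in the same format as the CSV, deliberately out of order.
raw = ["01jun2012", "01mar2012", "01sep2011"]
dates = Date.(raw, dateformat"dduuuyyyy")

# Check chronological order on the parsed dates, then sort them properly.
println("chronological? ", issorted(dates))
sorted_dates = sort(dates)

# Sorting the strings compares characters, so the month names decide the order.
sorted_raw = sort(raw)
```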

Try this

using Chain, DataFramesMeta, CSV, Dates, HTTP, Plots # please put your usings in MWEs
# Get Data 
link = "https://github.com/azev77/Synthetic_Control_in_Julia/raw/main/HP_q.csv"
r = HTTP.get(link)
df = CSV.read(r.body, DataFrame, missingstring="NA")
p = @chain df begin 
    @transform :Date = Dates.Date.(:Date, dateformat"dduuuyyyy")  # parse strings like "01jun2012"
    @subset :loc .== "adelaide"  # @subset replaces the older @where
    @orderby :Date
    begin 
        y = _.HPGyy
        global t = _.Date
        plot(legend = :topleft)
        plot!(t, y, lab = "Adelaide", color = "green")
        xaxis!(xrotation = 45)
    end
end

It doesn’t solve your original problem entirely, since it doesn’t just show the year. But the defaults are nicer when using an actual date type.

@Albert_Zevelev, you can try the workflow below.

using Dates, CSV, DataFrames, Plots; gr()
# link = "https://github.com/azev77/Synthetic_Control_in_Julia/raw/main/HP_q.csv"
df = CSV.read("HP_q.csv", DataFrame, missingstring="NA")
y = df[(df.loc .== "adelaide"), "HPGyy"]
t = df[(df.loc .== "adelaide"), "Date"]
t1 = Date.(t, dateformat"dduuuyyyy")
YrTick = year(minimum(t1)):year(maximum(t1))

plot(t1, y, lab="adelaide", color="green", xticks=false, legend=:topright)
plot!(xticks=(Date.(YrTick),YrTick),xtickfontsize=8,xlabel="Year", ylabel="HPGyy")
xaxis!(xrotation=45)

NB:

  • The year tick marks are placed at 01-Jan-YYYY, but this can be changed to any other date within the year.
  • Simplified date parsing using dateformat"dduuuyyyy" as per @pdeffebach’s example above
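
For instance, moving the ticks to mid-year could look roughly like this. It is a sketch with a made-up quarterly `t1` standing in for the parsed CSV dates; `mid_ticks` and `tick_labels` are illustrative names.

```julia
using Dates

# Made-up quarterly dates standing in for the parsed `t1` above.
t1 = collect(Date(2010, 3, 1):Month(3):Date(2013, 12, 1))
YrTick = year(minimum(t1)):year(maximum(t1))

jan_ticks = Date.(YrTick)         # 01-Jan of each year, as in the post
mid_ticks = Date.(YrTick, 7, 1)   # 01-Jul of each year instead
tick_labels = string.(YrTick)     # year labels as strings

# With Plots loaded, the mid-year version would be passed as:
#   plot!(xticks = (mid_ticks, tick_labels))
println(first(mid_ticks), " ... ", last(mid_ticks))
```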