By not passing an explicit mean to GP, as in:
f = GP(Matern52Kernel())
you’re declaring a zero-mean Gaussian process prior. See this line. Since your GP prior assumes a zero mean, the posterior mean reverts to 0 wherever there are not enough nearby data points.
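To make the reversion concrete, here is a minimal hand-rolled sketch in plain Julia (only `LinearAlgebra`, deliberately not using AbstractGPs). The helper names `matern52` and `posterior_mean`, the lengthscale, the noise variance, and the synthetic data are all my own assumptions for illustration. It evaluates the standard posterior mean formula far from the data, once with a zero prior mean and once with a constant prior mean set to the sample average.

```julia
using LinearAlgebra

# Matern 5/2 kernel with unit variance and lengthscale ℓ (illustrative choice).
function matern52(x, y; ℓ=1.0)
    r = abs(x - y) / ℓ
    return (1 + sqrt(5) * r + 5 * r^2 / 3) * exp(-sqrt(5) * r)
end

# Posterior mean of a GP with constant prior mean m and noise variance σ²:
#   μ(x*) = m + k(x*, X) (K + σ² I)⁻¹ (y - m)
function posterior_mean(xtest, x, y; m=0.0, σ²=1e-4)
    K = [matern52(a, b) for a in x, b in x] + σ² * I
    α = K \ (y .- m)
    Ks = [matern52(a, b) for a in xtest, b in x]
    return m .+ Ks * α
end

x = collect(0.0:0.5:5.0)
y = 3.0 .+ sin.(x)      # synthetic data centered near 3, not 0
xfar = [50.0]           # far away from every observation

ybar = sum(y) / length(y)
μ_zero = posterior_mean(xfar, x, y; m=0.0)    # reverts to the zero prior mean
μ_const = posterior_mean(xfar, x, y; m=ybar)  # reverts to the constant prior mean
```

Far from the data the cross-covariance term vanishes, so the prediction is whatever prior mean you chose. In AbstractGPs I believe the equivalent is to pass a mean as the first argument to `GP`, but please check the package docs for the exact constructor.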
The prediction from any zero-mean process will eventually asymptote to zero, but if you want the extrapolation to “last” a bit longer before smoothly heading to zero, you might consider a covariance function corresponding to a long-memory process, which I think is most cleanly defined as a covariance function whose spectral density has an (integrable) singularity at the origin. Fractional ARIMA (ARFIMA) processes are an example, and I bet there is at least one time series package that fits ARFIMA models.
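For concreteness, one common way to write that condition down (the notation $S$, $d$, $C$, $\sigma^2$ is mine, not from the posts above) is that the spectral density blows up at the origin like a power law, which for ARFIMA(0, d, 0) takes an explicit form:

$$
S(\omega) \sim C\,|\omega|^{-2d} \quad \text{as } \omega \to 0, \qquad 0 < d < \tfrac{1}{2},
$$

$$
S_{\mathrm{ARFIMA}(0,d,0)}(\omega) = \frac{\sigma^2}{2\pi}\,\bigl|2\sin(\omega/2)\bigr|^{-2d}.
$$

The restriction $d < 1/2$ keeps the singularity integrable, and the corresponding autocovariance decays only polynomially, roughly like $|h|^{2d-1}$ at lag $h$, which is why predictions take longer to fall back to the mean than with an exponentially decaying kernel.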
Extrapolation with GPs is a subtle matter that needs special treatment. One usually needs covariance kernels that are able to extrapolate, as pointed out by @cgeoga. See, for example, this paper and this paper by Andrew Wilson, which discusses a combination of kernels that works well for extrapolation in many cases. Additional resources can be found on this webpage.
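As a rough illustration of why the kernel choice matters for extrapolation, here is a small plain-Julia sketch (again hand-rolled, not the AbstractGPs or KernelFunctions API, and not necessarily the kernel combination from the linked papers). The kernels, the period, the noise level, and the synthetic sine data are assumptions I picked for the example. It compares a Matern 5/2 kernel on its own against the sum of a periodic kernel and a Matern 5/2 kernel, predicting far beyond the observed range.

```julia
using LinearAlgebra

# Stationary kernels written as functions of the lag r = x - x′.
matern52(r; ℓ=1.0) = (s = sqrt(5) * abs(r) / ℓ; (1 + s + s^2 / 3) * exp(-s))
periodic(r; p=1.0, ℓ=1.0) = exp(-2 * sin(π * r / p)^2 / ℓ^2)

# Zero-mean GP posterior mean: μ(x*) = k(x*, X) (K + σ² I)⁻¹ y
function posterior_mean(k, xtest, x, y; σ²=1e-4)
    K = [k(a - b) for a in x, b in x] + σ² * I
    Ks = [k(a - b) for a in xtest, b in x]
    return Ks * (K \ y)
end

x = collect(0.0:0.1:4.0)
y = sin.(2π .* x)       # synthetic signal with period 1
xfar = [20.25]          # well outside [0, 4]; the true signal equals 1 here

k_sum(r) = periodic(r) + matern52(r)   # sum of kernels is again a valid kernel

μ_matern = posterior_mean(matern52, xfar, x, y)  # decays back to the zero mean
μ_sum = posterior_mean(k_sum, xfar, x, y)        # periodic part keeps extrapolating
```

The Matern-only model has forgotten the data by the time it reaches `xfar`, while the periodic component of the sum kernel carries the learned pattern forward. This only works because the period is assumed known here; in practice it would be a hyperparameter you have to fit.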