But if I set xlims and ylims, the image will be clipped to the range 1 to 4, whereas the original plot spans 0.5 to 4.5. I know in Python you can set extent to avoid this.
This question is a bit more general, and both limits, [0.5 4.5] and [1 4], are correct. It all depends on what we want. In GMT this is called gridline vs pixel registration, and in GeoTIFF it is known as Pixel-is-point vs Pixel-is-area. The next figure and the GMT link above describe these two ways of storing grid and image coordinates in more detail.
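To make the two conventions concrete, here is a minimal Python sketch of where the two sets of limits come from. It assumes four equally spaced samples whose centers sit at 1, 2, 3 and 4, as in the example discussed here; the function names `node_extent` and `cell_extent` are hypothetical helpers made up for illustration, not part of GMT or any plotting library.

```python
def node_extent(first, last):
    """Gridline (pixel-is-point) view: limits are the outer node coordinates."""
    return (first, last)


def cell_extent(first, last, n):
    """Pixel (pixel-is-area) view: limits are the outer edges of the cells.

    Each of the n samples is treated as the center of a cell one step wide,
    so the limits extend half a step beyond the first and last centers.
    """
    step = (last - first) / (n - 1)  # spacing between neighboring centers
    half = step / 2
    return (first - half, last + half)


# Four samples centered at 1, 2, 3, 4 (assumed values from the example).
print(node_extent(1, 4))     # (1, 4)
print(cell_extent(1, 4, 4))  # (0.5, 4.5)
```

Setting the axis limits to the first tuple cuts half a cell off each side of the image, while the second tuple keeps every cell whole, which is the role the extent setting mentioned in the question plays.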