Shapefile.jl writing .shp without coordinates

I loaded a Shapefile in Julia, transformed it to a datafame and then did some data cleaning, getting rid of some unneeded rows of data. I saved the shapefile again, with a new name, using “Shapefile.write” and tried to open the file in ArcGIS Pro. The layer did open, but it didn’t display on the map, saying that it didn’t have a coordinate system.

I tried to solve that in Julia, and I’m assuming that the transformation to Dataframe might have stripped that information and I need to recover it, but I don’t know how.

So, I’m guessing I have two options:

  1. Is it possible to delete the rows of data without transforming the shapefile to a Dataframe? I’m basically a data analyst and I’m very familiar with DF, so that’s why I did it, but since the Shapefile seems compatible with the “Tables.jl” interface, I might be able to skip the transformation in some way.
  2. Or is it possible to add a coordinate system to the file when I’m saving the dataframe as a shapefile? And if so, how do I know I’m using the correct one?

I’m sure this might be a fairly easy problem, but I’m new to GIS and I’m not still very familiar with all the ins and outs of shapefiles.

I’ll appreciate any insight someone can give me.

Welcome @alejandromerchan, are you able to post the shapefile and a MWE?

As a quick fix you can always assign a coordinate system of a shapefile once it’s loaded into ArcGIS.

The file I’m using comes from this [website]. (California Public Land Survey System - California Pesticide Information Portal) Is the statewide file, on the lower left corner, but I’m assuming any of them would have the same issue (I haven’t checked).

MWE

using DataFrames, Shapefile
table = Shapefile.Table(“./Statewide_CA_PLSS_NAD83AlbersCA_20240718.shp”);
plss = DataFrame(table)
# Filter data where column SECTION has “OE”, “TA” or “EA” (Sections from the Delta, Tahoe and Salton Sea)
subset!(plss, :SECTION => ByRow(x → x ∉ [“EA”, “OE”, “TA”]));
# Save file as shapefile, with a different name, on the same folder
Shapefile.write(“plss_modified.shp”, plss)

You may want to try GMT.jl. When it reads a shapefile (D = gmtread("file.shp")) data is returned a simple tabular type (a GMTdataset) (or a vector of them). Numeric data is stored in the matrix field data so you can do your matrix manipulations and at the save back to .shp with gmtwrite("filenmae.shp", D).

Thanks for the suggestion, I’ve used GMT.jl in the past, for some plotting, but never for data manipulation. I’ll take a look.

See also
https://www.generic-mapping-tools.org/GMTjl_doc/documentation/utilities/getbyattrib/index.html#getbyattrib
&
https://www.generic-mapping-tools.org/GMTjl_doc/tutorials/vector_shp/vector_shp/

OK, this issue is that `Shapefile.Table() does not retain the coordinate info. The simple solution is to use GeoDataFrames.

use: df = GeoDataFrames.read(path2shp) then everything should work as expected.

If you want to manipulate or create geometries using the JuliaGeo ecosystem
there is a helpful tutorial here: Creating Geometry | GeometryOps.jl

1 Like

Thanks @alex-s-gardner but not quite there yet, I’m probably missing something. The files opened in ArcGIS without coordinates again, I might have to dig some more. I do like that with GeoDatFrames.jl I get the DF automatically and can do the manipulations with the same syntax.

@alejandromerchan the problem is that .shp files do not require CRS information in their original specification. Later on, people started to include .prj files alongside the .shp, which include a CRS string.

Currently, Shapefile.jl assumes that this string is a ESRI WKT1 string when it encounters a .prj file (even though that is not necessarily the case), and it is probably not writing the .prj file in the write function for some reason?

I suggest that you open an issue in Shapefile.jl if that is the case.

It might, but this also seems pretty simple (I renamed the shape file to make the commands shorter)

using GMT

D = gmtread("Statewide_CA_PLSS_NAD83AlbersCA_20240718.shp.zip");
D2 = filter(D, SECTION=["OE", "TA", "EA"]);
gmtwrite("OETAEA.shp", D2)
viz(D2, name="OEATEA.png")

This works for me:

using Shapefile
using Downloads: download
using GeoDataFrames
using ArchGDAL
using DataFrames

#https://juliageo.org/GeometryOps.jl/dev/tutorials/creating_geometry

# paths 2 files 
url = "https://calpip.cdpr.ca.gov/content/groundwater/shapefiles/Statewide_CA_PLSS_NAD83AlbersCA_20240718.zip"
local_zip = "/Users/gardnera/Downloads/junk.zip";

path2shp = "/Users/gardnera/Downloads/junk/Statewide_CA_PLSS_NAD83AlbersCA_20240718.shp"
path2shp_new = replace(path2shp, ".shp" => "_new.shp")

# download and unzip
download(url, local_zip)
run(`unzip $local_zip`)

df = GeoDataFrames.read(path2shp)
df = subset!(df, "SECTION" => ByRow(x -> x ∉ ["EA", "OE", "TA"]));

# there is one geometry that does not match the others
same_geometry_index = typeof(df.geometry[1]) .== typeof.(df.geometry)
Shapefile.write(path2shp_new , df[same_geometry_index,:];force=true)

1 Like

@joa-quim GMT is a great option as well.

That’s right: subset shp while preserve proj · Issue #116 · JuliaGeo/Shapefile.jl (github.com)

You could keep it manually using GeoInterface.crs(handle), and then supply it to the Shapefile writer manually when writing out again.

Thanks. I think I missed understanding that the OP wanted those NOT in ["OE", "TA", "EA"]. I don’t have that implemented in GMT.jl, but it should not be difficult to do.

Yes, @juliohm, the problem is somewhere there. I took the original .prj file, renamed it, moved it to the folder with my modified .shp file and it all worked. I’ll file an issue with Shapefile.jl.

1 Like

@visr how would this work? I can’t seem to extract the crs from the shapefile.

Done in master.

1 Like

We should probably embed the crs in all the Shapefile.jl geometry objects so it can’t get lost.

Not difficult to do.

The other discussed alternative is DataAPI.metadata returning the crs fir a DataFrame. But it won’t work for e.g. a NamedTuple table, so embedding the crs in Polygon and other geoms is probably the best option.

If DataFrames’ table materializer preserves metadata then we may just want to hook Shapefile.Table up to DataAPI for a quick fix…

Its not very hard tecnically, but its a commitment to updating the ecosystem to handle the metadata.

We would need to document the metadata names in the GeoInterface.jl docs and implement checking them everywhere.