Deserializing "Untrusted" Data

question

#1

How “safe” is deserializing data (à la deserialize) from an untrusted source? I’m okay with the process crashing if the data isn’t valid, but not okay with arbitrary code execution.

For context, I’m trying to run student code in a sandbox, serialize the result, and then deserialize it to examine and grade it.


#2

I lean towards:
Not safe at all.
Serialization is designed for communication between processes that are built together and are part of the same system.
I believe it has no concern for security.
I suspect that the ability to trigger arbitrary code execution during deserialization is potentially required, or at least impossible to prevent.
And things like buffer-overflow attacks are certainly quite possible.

I would suggest using some lowest-common-denominator format for transferring data from an untrusted source.
Like CSV, JSON, or perhaps BSON or HDF5.
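
To make the JSON suggestion concrete, here is a minimal sketch in Python (chosen only for illustration; the same idea applies in any language). It assumes the sandbox writes its result as a JSON object with a single numeric `answer` field. The function name `load_untrusted_result`, the `answer` field, and the `MAX_BYTES` cap are all hypothetical. The point is that a plain data format only ever yields inert values (dicts, lists, strings, numbers), and the grader then checks the shape explicitly before using it.

```python
import json

# Hypothetical cap on payload size, so a hostile result cannot exhaust memory.
MAX_BYTES = 1_000_000


def load_untrusted_result(raw: bytes) -> dict:
    """Parse a sandbox result and keep only the fields we expect.

    Raises ValueError on anything malformed; it never executes the payload.
    """
    if len(raw) > MAX_BYTES:
        raise ValueError("payload too large")

    # json.loads only builds plain data (dict, list, str, numbers, bool, None).
    # Decoding and parsing errors are both subclasses of ValueError.
    data = json.loads(raw.decode("utf-8"))

    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    answer = data.get("answer")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(answer, bool) or not isinstance(answer, (int, float)):
        raise ValueError("'answer' must be a number")

    # Return a fresh dict so unexpected extra keys are dropped.
    return {"answer": answer}


if __name__ == "__main__":
    print(load_untrusted_result(b'{"answer": 42, "extra": "ignored"}'))
```

Invalid input still makes the call fail, which matches the "okay with crashing" requirement, but the failure happens in the grader's own validation code rather than in anything the student controls.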