I have a workflow where I repeatedly need to overwrite files with new content. I think the easiest way to overwrite a file is to use write(filename,newcontent).
My question is how safe this function is against data loss when the Julia process calling it is interrupted unexpectedly during the write (e.g. someone hitting CTRL-C, or a cluster manager killing the process because of a timeout). I basically want to make sure that the file always contains either the old content or the new content; it should never end up in a state where both are lost. Is it safer to do something like:
The short version is indeed not safe against data loss (but it is very convenient when that is not a significant concern).
The usual approach to improve safety is to write the new content into a temporary file and once that has succeeded (and possibly been verified by reading it back or computing a hash), move it into its target location.
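A minimal sketch of that pattern in Julia follows. The function name write_via_tempfile is made up for illustration. It assumes the temporary file should live in the same directory as the target, so that the final move stays on one filesystem. Note that flush only empties Julia's buffer and does not force the data to disk, and whether mv with force=true replaces the target atomically depends on the Julia version and the OS, so treat this as narrowing the window for data loss rather than as a guarantee.

```julia
# Hypothetical helper: write `content` to a temp file next to `path`,
# then move the temp file over `path`.
function write_via_tempfile(path::AbstractString, content)
    dir = dirname(abspath(path))
    # Temp file in the target directory; cleanup=false because we move it ourselves.
    tmp, io = mktemp(dir; cleanup=false)
    try
        write(io, content)
        flush(io)   # empties Julia's buffer; this is not an fsync
        close(io)
        # Assumption: this replacement may or may not be atomic, depending on
        # the Julia version and platform.
        mv(tmp, path; force=true)
    catch
        isopen(io) && close(io)
        rm(tmp; force=true)   # do not leave a half-written temp file behind
        rethrow()
    end
    return path
end
```

If the process dies while the temporary file is being written, the target still holds the old content and only a stray temporary file is left behind.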
To make everything a bit messier, both of the methods sgaure suggests above will sadly not work on Windows, as rename does not overwrite and mv does not exist.
On Windows you can instead use run(`cmd /C MOVE /Y $old_filename $new_filename`)
I’m not sure if this is in fact atomic though, as a quick search yields contradictory information.
Another possible strategy is to put your contents into a database, for example SQLite with file-based storage (DuckDB also qualifies, I guess). Databases are often chosen for handling larger amounts of content with non-trivial structure, and may feel like overkill for simpler data. But another advantage is frequently overlooked: SQLite provides safe inserts and updates (ACID transactions) across all supported OSs and is, owing to its enormous popularity, very thoroughly tested. Trying to overwrite files safely but “manually” is to some extent reinventing the wheel.
A possible low-tech workaround is to do a two-step mv dance where the original file is temporarily moved to a backup name and then removed after the new file is in place.
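A rough sketch of that two-step dance, for illustration only: the function name replace_with_backup and the ".new" and ".bak" suffixes are made up here, not an established convention. The idea is that at every point in time either the old content (under the original or the backup name) or the complete new content exists on disk. After an interruption you may still have to restore the backup by hand.

```julia
# Hypothetical helper: replace `path` with `content`, keeping the old file
# under a backup name until the new file is in place.
function replace_with_backup(path::AbstractString, content)
    tmp = path * ".new"      # assumed name for the freshly written file
    backup = path * ".bak"   # assumed name for the temporary backup
    write(tmp, content)      # step 0: new content is complete before anything moves
    had_old = isfile(path)
    had_old && mv(path, backup; force=true)  # step 1: old file -> backup name
    mv(tmp, path)                            # step 2: new file -> target name
    had_old && rm(backup; force=true)        # cleanup once the new file is in place
    return path
end
```

If the process is killed between step 1 and step 2, the target name is missing, but the old content is still in the ".bak" file and the new content is in the ".new" file.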