The open(filename::AbstractString; lock = true, keywords...) function has a keyword lock. According to the official documentation, “The lock keyword argument controls whether operations will be locked for safe multi-threaded access.” Even after explicitly setting the lock keyword to true, I can open the same file in a different process (for testing I used 2 Julia REPLs) and edit it from both processes. Now, my questions are:
What does lock keyword actually do?
Is there any way to open a file exclusively in a process and make other processes wait for it to prevent data race?
Even with lock=true, can the file get corrupted if two processes try to write to it at the same time?
What does "safe multi-threaded access" mean in this context?
I am using Julia 1.7.3 on Linux Mint 20.3.
Hi,
I’m glad you posted here, because there’s a related question I wanted to ask after seeing your StackOverflow post, and it might help answer this.
My question to others here is: is Base.Filesystem.open supposed to be internal, or is it part of the user API? It’s neither documented nor exported, but the docstring for tempname says “Open the file with JL_O_EXCL …”, and Base.Filesystem.open is the only place we can pass these JL_O_ flags. So it seems like it’s intended to be part of the API? (I’ll add an answer below assuming it is so.)
(For this answer, I’ll assume Base.Filesystem.open is just missing a docstring, and is supposed to be part of the API.)
The lock keyword is about access to the file from different threads within the same Julia process. Note that the documentation only specifies “multi-threaded access”; it says nothing about other processes.
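To make the thread-level meaning concrete, here is a minimal sketch (the scratch file and the number of writes are made up for illustration). Several threads share one stream opened with lock = true, and each write call takes the stream's internal lock, so the lines should not get interleaved. Start Julia with more than one thread (e.g. julia -t 4) for this to exercise anything. It does nothing to stop a second process from opening the same file.

```julia
# Sketch: many threads writing to ONE stream opened with lock = true (the default).
path = tempname()   # hypothetical scratch file, just for this demo
nwrites = 1000

open(path, "w"; lock = true) do io
    Threads.@threads for i in 1:nwrites
        # each individual write call acquires the stream's internal lock
        write(io, "entry $i\n")
    end
end

# every line should have arrived whole, in some thread-dependent order
lines = readlines(path)
```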
Is there any way to open a file exclusively in a process and make other processes wait for it to prevent data race?
For process-level file locking, you need OS support, and for that you’ll need to use Base.Filesystem.open instead of a normal open call, in both Julia processes. The exact function call depends on what you’re trying to do:
- Does the file already exist?
- If one process has written to it, does the other process want to append to that?
Thank you for answering.
I was not aware of Base.Filesystem.open. Please explain how to use it in this context, or point me to any documentation if possible. I am using Linux Mint.
Here are the answers to your questions.
Does the file already exist?
May or may not, sometimes it will exist.
If one process has written to it, does the other process want to append to that?
Overwriting is OK, but appending would be better. Some processes will only read the file. Exclusivity is absolutely necessary. To be more specific, the file should not get corrupted if two processes try to write to it, i.e. exclusivity for writing is a must, and exclusivity for reading is OK to have.
Since you’re looking for write-exclusivity, you need to use a lockfile here. The basic idea is described briefly in this SO answer (the couple paragraphs after ‘The main reason why this flag exists are “lock files”’).
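As a rough sketch of the primitive that lock files rely on: creating a file with JL_O_CREAT | JL_O_EXCL succeeds only if the file does not exist yet, and the check and the creation happen as one step in the OS. The path below is a throwaway one made up for the demo, and this assumes Base.Filesystem.open is usable as API, as discussed above.

```julia
using Base.Filesystem  # brings the JL_O_* flags and the `Filesystem` name into scope

# hypothetical lock file path inside a fresh temporary directory
lockfilename = joinpath(mktempdir(), "demo.lock")

# First attempt: the file does not exist yet, so exclusive creation succeeds.
handle = Filesystem.open(lockfilename, JL_O_CREAT | JL_O_EXCL, 0o600)

# Second attempt: the file exists now, so the call throws an IOError
# whose code is UV_EEXIST instead of opening the file.
second_attempt_failed = try
    Filesystem.open(lockfilename, JL_O_CREAT | JL_O_EXCL, 0o600)
    false
catch err
    err isa Base.IOError && err.code == Base.UV_EEXIST
end

# clean up: close our handle and delete the lock file
close(handle)
Filesystem.unlink(lockfilename)
```

Whichever process gets the first call through "owns" the lock until it deletes the file; everyone else sees the EEXIST error and has to wait.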
First, we should decide on a location for the lockfile. This should be some constant location that both your processes know about, and somewhere you have write permission. On Linux Mint, you should be able to create and use a folder under XDG_RUNTIME_DIR:
# either in global scope or inside a `main` function if you have one
const lockfile_location = joinpath(ENV["XDG_RUNTIME_DIR"], "ritulahkar")
mkpath(lockfile_location)  # returns the path, e.g. "/run/user/1000/ritulahkar"
This should be done near the start of your program (mkpath doesn’t error if the folder already exists, so it’s ok to do this from both processes).
Wherever either of your processes wants to write to the shared file, it should first check for the lock file. It should only write to the main shared file if the lock file doesn’t already exist, and in that case it should create the lock file to notify that it’s currently writing to the shared file.
using FileWatching: watch_file
using Base.Filesystem

function write_to_shared_file(sharedfilename, contenttowrite)
    lockacquired = false
    # joinpath adds the path separator between the folder and the lock file name
    lockfilename = joinpath(lockfile_location, basename(sharedfilename) * ".lock")
    local lockfilehandle
    while !lockacquired
        while isfile(lockfilename)
            # watch_file will notify if the file status changes, waiting until then
            # here we want to wait for the file to get deleted
            watch_file(lockfilename)
        end
        try
            # try to acquire the lock by creating lock file with JL_O_EXCL (exclusive)
            lockfilehandle = Filesystem.open(lockfilename, JL_O_CREAT | JL_O_EXCL, 0o600)
            lockacquired = true
        catch err
            # in case the file was created between our `isfile` check above and the
            # `Filesystem.open` call, we'll get an IOError with error code `UV_EEXIST`.
            # In that case, we loop and try again.
            # (IOError is not exported from Base, hence the `Base.` prefix)
            if err isa Base.IOError && err.code == Base.UV_EEXIST
                continue
            else
                rethrow()
            end
        end
    end
    # now that the lock is acquired, we can safely write to
    # the actual shared file we want to write to
    open(sharedfilename, append = true) do sharedfile
        # write to the sharedfile here just as usual
        write(sharedfile, contenttowrite)
    end
    # free up the lock so that the other process can acquire it if it needs
    close(lockfilehandle)
    Filesystem.unlink(lockfilename)
end
This function should exist in both the processes, and whenever either process wants to write to the shared file, it should go through this function.
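One thing to watch out for with this pattern: if the write throws, the lock file is never deleted and the other process waits forever. A possible variation, sketched here with a made-up helper name (with_lockfile) and simple polling with sleep instead of watch_file, wraps the protected work in try/finally so the lock is released either way. Treat it as a starting point rather than a hardened implementation.

```julia
using Base.Filesystem

# Hypothetical helper (not part of Julia): run `f()` while holding the lock
# file, and always remove the lock file afterwards, even if `f` throws.
function with_lockfile(f, lockfilename; poll_interval = 0.05)
    local handle
    while true
        try
            # atomic "create only if it does not exist yet"
            handle = Filesystem.open(lockfilename, JL_O_CREAT | JL_O_EXCL, 0o600)
            break
        catch err
            # anything other than "already exists" is a real error
            (err isa Base.IOError && err.code == Base.UV_EEXIST) || rethrow()
            sleep(poll_interval)  # someone else holds the lock; retry shortly
        end
    end
    try
        return f()
    finally
        close(handle)
        Filesystem.unlink(lockfilename)
    end
end

# usage with throwaway paths made up for the demo
demo_dir = mktempdir()
shared = joinpath(demo_dir, "shared.txt")
lockfile = shared * ".lock"

with_lockfile(lockfile) do
    open(io -> write(io, "hello\n"), shared, append = true)
end
```

Note that finally does not help if the process is killed outright; in that case a stale lock file is left behind and has to be removed by hand.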
Thank you for your help. I will try to implement it.
I was also doing some reading. Would it be a good idea to implement the lock with a ccall to flock or fcntl?
Yeah, if you only need this to work on this system (i.e. no worries about OS portability) and on the local filesystem (i.e. no network storage), then flock or fcntl on the shared file is an easier option that should work equally well.
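For reference, a minimal sketch of the flock route could look like the following. The helper name append_with_flock is made up, and the LOCK_EX / LOCK_UN values are the ones from sys/file.h on Linux, so this is Linux-specific as written. flock locks are advisory: they only protect against other processes that also call flock on the file.

```julia
# Assumed constants: values from <sys/file.h> on Linux
const LOCK_EX = Cint(2)   # exclusive lock
const LOCK_UN = Cint(8)   # unlock

# Hypothetical helper: append `content` to `path` while holding an
# exclusive flock on the file itself (no separate lock file needed).
function append_with_flock(path, content)
    open(path, "a") do io
        # blocks until no other process holds a conflicting flock on this file
        rc = ccall(:flock, Cint, (Cint, Cint), fd(io), LOCK_EX)
        systemerror("flock", rc != 0)
        try
            write(io, content)
            flush(io)  # push buffered data to the OS before releasing the lock
        finally
            ccall(:flock, Cint, (Cint, Cint), fd(io), LOCK_UN)
        end
    end
end

# usage with a throwaway file made up for the demo
demo_path = tempname()
append_with_flock(demo_path, "first\n")
append_with_flock(demo_path, "second\n")
```

The flush before unlocking matters because the stream is buffered: without it, the data might only reach the file after the lock has already been released.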