Hello.
I’m wondering what the best Julian way to copy files is. My use case: I have a large number of files (of varying size) and I’m automating their data cleaning and backup process. Without going into great detail, the writing/copying step of the backup is the bottleneck. A plain broadcast of cp in Julia is slower than a manual “copy & paste” in Windows, so I’m looking for alternative solutions.
Manual select-all copy takes x time.
Broadcasting Julia’s `cp` takes ~5x as long as manual: `cp.(source_files, backup_files)`
Julia’s `asyncmap` takes ~4x as long as manual: `asyncmap(cp, source_files, backup_files)`
Julia’s `Threads.@spawn` macro takes ~1.5x as long as manual: `paired_files = zip(source_files, backup_files); tasks = [Threads.@spawn(cp(file[1], file[2])) for file in paired_files]`
I will say that running `length(Sys.cpu_info())` gives 4, which is how many threads I run this script with, so I’d assume it’s “even” with any multi-threading the manual copy & paste does.
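For reference, here is a minimal sketch of the `Threads.@spawn` variant wrapped in `@sync`, so that any timing covers the copies themselves rather than just task creation. `copy_all` is a hypothetical helper name, not something from Base, and the sketch assumes both collections have the same length and that the destination paths do not exist yet.

```julia
# Hypothetical helper (not part of Base): copy each source file to its
# matching destination, one task per pair, and block until all are done.
# Assumes the destinations do not exist yet, since `cp` refuses to
# overwrite unless `force=true` is passed.
function copy_all(source_files, backup_files)
    @sync for (src, dst) in zip(source_files, backup_files)
        # Each copy becomes its own task; the scheduler spreads the tasks
        # over the threads Julia was started with (see Threads.nthreads()).
        Threads.@spawn cp(src, dst)
    end
    return nothing
end
```

Wrapping a call such as `@time copy_all(source_files, backup_files)` should then measure the full copy, since `@sync` only returns once every spawned task has finished.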
Any suggestions for efficiently writing/copying multiple files quickly? I’m happy with the performance of the `Threads.@spawn` approach but would like to see others’ ideas. Thanks.