Fastest way to write to and delete from the end of a file and for general editing


#1

Hello

A significant part of my program communicates data back and forth through files, and I would like to make editing a file as fast as possible. Most of the time I will be writing new lines at the end of the file, or appending to the end of a line in the middle of the file. Is there any particular trick to speed up this part of the program?

Thank you.


#2

Manipulate everything in memory and write to files only at the end?
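For illustration, a minimal sketch of that idea in Julia, assuming the whole file fits comfortably in memory. The file name and the model lines are made up for the example.

```julia
# Hypothetical example: keep the file as a vector of lines, edit it in
# memory, and touch the disk only once, when the file is actually needed.
path = joinpath(mktempdir(), "model.txt")   # made-up file name
lines = ["var 1..10: x;", "constraint x > 3;", "solve satisfy;"]

# In-memory edits cost no file I/O.
lines[2] *= " % edited"               # append to a line in the middle
push!(lines, "constraint x != 5;")    # add a new line at the end
pop!(lines)                           # delete the last line again

# One write for all accumulated edits.
write(path, join(lines, "\n") * "\n")
```

Whether this helps depends on how often another program has to see the file between edits.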


#3

Yes, but then I will have to write 1000 different files for the 1000 different single-line changes I want to make throughout the program. If the file has 100,000 lines, that’s 100 million lines the program will write in order to change maybe just O(1000) words during the life of the program. So I am wondering: if the modifications are all at the end, can that help in any way to speed up the editing, both writing and deleting? Also, I wonder if there exists any special file division and hashing mechanism (behind the scenes) that enables quick search through a large file, so that point changes can be made without disturbing the conventional reading and writing streams of the file.
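To make the end-of-file case concrete, here is a rough sketch using Julia's `open` in append mode and `truncate`. It assumes the program remembers the exact text it appended last; the file name and contents are invented for the example, and no timing claims are made.

```julia
path = joinpath(mktempdir(), "constraints.txt")   # made-up file name
write(path, "constraint x > 3;\nconstraint y < 7;\n")

# Append: only the new bytes are written; existing content is not rewritten.
last_line = "constraint x + y = 9;"
open(path, "a") do io
    println(io, last_line)
end

size_with_line = filesize(path)

# Delete from the end: shrink the file by the byte length of the last line
# plus its newline. Assumes the caller still knows what it appended.
open(path, "r+") do io
    truncate(io, size_with_line - (ncodeunits(last_line) + 1))
end
```

Neither operation reads or rewrites the earlier lines, which is the property being asked about.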


#4

If nothing like that exists in Julia, maybe a library from another language, or a text editor driven through CLI commands, may do the trick.

Phrased differently, let’s say that I know the length of each line, can I go to the nth character in a file directly and perhaps overwrite the characters in place with some semantically harmless characters, as opposed to just deleting them? That’s assuming that appending to the end of a file does not create a performance issue (need confirmation), and that order of lines doesn’t matter, e.g. when dealing with constraints in a constraint program.
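A sketch of what I mean, assuming the line lengths are known, lines end in a single `\n`, and a line of spaces is harmless to whatever reads the file (I have not verified that last point for my case). The helper name `blank_line!` and the file contents are made up for illustration.

```julia
path = joinpath(mktempdir(), "model.txt")   # made-up file name
lines = ["var 1..10: x;", "constraint x > 3;", "constraint x < 8;", "solve satisfy;"]
write(path, join(lines, "\n") * "\n")

# Byte offset of the start of each line (0-based, as `seek` expects),
# computed from the known line lengths plus one newline byte per line.
offsets = cumsum([0; [ncodeunits(l) + 1 for l in lines[1:end-1]]])

# Hypothetical helper: overwrite line `n` with spaces, keeping its newline,
# so the file length and all other offsets stay unchanged.
function blank_line!(path, offsets, lines, n)
    open(path, "r+") do io
        seek(io, offsets[n])
        write(io, " "^ncodeunits(lines[n]))
    end
end

blank_line!(path, offsets, lines, 3)
```

Since the blanked line keeps its length, the same offsets remain valid for later edits.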


#5

It sounds like you may want some kind of proper database rather than just a bunch of files, but I am no expert in that area.


#6

There are seek and write, but I’m not sure if they can do what you’re asking.
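A small sketch of how I understand them to behave: `seek` moves to a 0-based byte offset and `write` overwrites from there, without inserting or shifting anything. The file names and contents are made up.

```julia
dir = mktempdir()

# Same-length replacement: works cleanly.
path = joinpath(dir, "data.txt")
write(path, "alpha\nbeta\ngamma\n")
open(path, "r+") do io      # "r+" opens for reading and writing without truncating
    seek(io, 6)             # "alpha\n" is 6 bytes, so offset 6 is the start of "beta"
    write(io, "BETA")       # overwrites exactly 4 bytes in place
end
same_length = read(path, String)

# Longer replacement: the extra bytes clobber what follows.
path2 = joinpath(dir, "data2.txt")
write(path2, "alpha\nbeta\ngamma\n")
open(path2, "r+") do io
    seek(io, 6)
    write(io, "BETA!!")     # 6 bytes: also overwrites the newline and the "g"
end
too_long = read(path2, String)
```

So in-place edits look safe only when the replacement has the same byte length as what it replaces; anything else needs padding or a rewrite of the rest of the file.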


#7

It really sounds like you need to rethink your use of files for communication. Files are ill-suited to rapid exchange of many small bits of information. There are many alternatives, but it is hard to give a specific suggestion without knowing more about what you are trying to do.


#8

Thank you for your comments. “seek” does the in-place replacement but I need to test it on a large example to see how efficient it is.

I am trying to use MiniZinc to parse my constraint programming model’s file and send it to Gecode, which solves the model. The model is written and modified in Julia multiple times during the course of the program, and MiniZinc is called through an OS subprocess call.

I realize now there exists minizinc-julia which enables passing the model as a Julia string. I will check that as an alternative. Thanks again.