I try to better understand how to write to and read from xlsx-files. But also to better understand the syntax and idea behind it.
For writing to a file the function is given:
XLSX.openxlsx(wb_path, mode=“rw”) do xf
… end
For reading from a file is given:
XLSX.openxlsx(“myfile.xlsx”, enable_cache=false) do f
… end
What do ‘xf’ and ‘f’ stand for? Maybe a silly question, but I could not find an explanation sofar. Hope you can clarify this.
In short, do blocks are an alternative syntax for creating and passing an anonymous function as the first argument to another function. The following are equivalent:
foo(x -> bar(x), y) # anonymous function with `->` syntax
# is equivalent to
foo(
function (x) # notice that we didn't give this function a name
bar(x)
end,
y) # anonymous function with `function` syntax
# is equivalent to
foo(y) do x
bar(x)
end # anonymous function created and passed via `do` syntax
There isn’t any significance to the name x here, it’s just the name given to the argument of the anonymous function. The xf and f in your case are similarly just names.
In this case, based on (what I’m guessing about) the XLSX.openxlsx function, I’m betting that they are file handles that can be used to access the contents of the file. The reason that an anonymous function is used here is to ensure that the file gets closed once the code is done reading it (or, importantly, if it stops reading because it throws an error).
You can use any variable name you like, but xf likely stands for “Excel file” and f for “file”. You then pass the variable name for the file, defined after the word do, to the other functions in the do block to lay out what operations you want to perform on the file while it is open.
Yeah, well, other way around. The functions in the do block refer to output of the openxlsx command which is often named xf. (I name it file below.)
Here is an alternative and not recommended method that may illustrate what is happening better:
using DataFrames, XLSX
df = DataFrame(
A = [1, 2, 3, 4, 1],
B = ["M", "F", "F", "M", "O"],
)
filepath = "./xlsxtest.xlsx"
file = XLSX.openxlsx(filepath, mode="w")
sheet = file[1]
XLSX.writetable!(sheet, df1)
XLSX.writexlsx(filepath, file) # this line is not needed in the `do` block version
close(file) # this line is not needed in the `do` block version
The do version defines all the commands in the middle as an anonymous function and then passes that anonymous function into the openxlsx function. That way the opening, reading/writing, and closing all happens in one step. It is less error-prone and faster that way.