How to Loop through excel files in a folder and update external links

Hi everyone, I am just starting out learning coding and essentially just use it to make ends meet at work at this stage, so apolagies in advance for being uneducated and illiterate in the matter.

I use the below code in python to loop through all xlsx files in a folder to update external links and save.

#import packages
from pathlib import Path
from openpyxl import workbook, load_workbook
import win32com.client
from tqdm import tqdm

#get excel files
SOURCE_DIR = r"K:\06_Lab\14_Mix Data & Submissions\Devon St Designs\Current Designs\Infrequently Used Designs"
excel_files = list(Path(SOURCE_DIR).glob("*.xlsx"))
number_of_excel_files = len(excel_files)
print(number_of_excel_files)
excel_files

#loop through files and update links
for excel_file in tqdm(excel_files):
    File = win32com.client.Dispatch("Excel.Application")
    File.Visible = 0
    wb = File.Workbooks.Open(excel_file, UpdateLinks=True)
    wb.Close(SaveChanges=1)

It is extremely slow - (100%|██████████| 62/62 [38:44<00:00, 37.49s/it])

I was wondering if someone could help me with the equivalent code in Julia or atleast point me in the right direction on the assumtion it would be much faster than python?

Thanks in advance for any infor or advice!

It will only be much faster than Python if Julia somehow has a very fast library to work with Excel files, what I doubt. Programmatically accessing Excel files is slow and terrible, I say by experience.

1 Like

Okay thank you very much!

It doesnt help they are clunky poorly designed sheets saved on a network folder =\

The most widely used excel reader and writer in Julia is XLSX.jl and I’d be interested to see how it performs in comparison, but I’m not sure it has the functionality that you’re after (updating links).

1 Like

Python isn’t doing the heavy work in your example, it is using the Excel Com Object itself to perform the update.

Excel the application is a UI over the top of the Excel Object.

So when the file is opened in Python, it is performing similar operations to opening the Excel application from cold via Explorer, which as I’m sure you know, is not quick as add-ins etc are loaded too.

“Update Links” involves also opening dependent files and reading data via Query Builder etc.

Opening Excel files programatically via XLSX.jl etc. gives you access to the values and objects in individual worksheets and charts within the file. It does not recalculate the sheet or do any updates.

The two operations are quite different.

You may have success, depending on your actual sheets but I suspect it will not do what you hope.

3 Likes

This is only tangentially related:

Is the K drive a network drive?
I wonder how much of this is latency related (maybe very little).

I am not sure if you can test the timings locally. If you copy all files to a local folder, you would need to have all linked files locally too (and all the links in the files would need to point to those). Depending on your files that may be difficult or cumbersome to achieve.

If you have the option to use OneDrive thar might be worth a try to. The files would be present on your ssd and still be accessible to colleagues. Might help with latency (and throughput in case you have large files)