[ANN] XLSX.jl v0.11.0 now released, bringing significant new functionality

This release introduces significant new functionality as set out below.

There are almost no changes in existing functional APIs in v0.11.0 compared with v0.10.4. Those changes that have been made are described briefly here.

This version drops support for Julia v1.6, and requires at least Julia v1.8.

Breaking changes

There is only one breaking change in this version:

  • infer_eltypes now defaults to true (e.g. in gettable and readtable). This is the more common use case but, if it is not yours, you will need to set infer_eltypes = false explicitly in the relevant function calls.

All other changes either introduce new functionality (documented elsewhere) or relate to internals only.
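
As a rough illustration of the breaking change, here is a minimal sketch that writes a small workbook and reads it back both ways. The file name, sheet name and column labels are made up for the example, and it assumes that the positional form of writetable (columns, then labels) and the data field of the returned table behave as they did in v0.10.4.

```julia
using XLSX

# Write a small workbook to a temporary file so the example is self-contained.
# The file name, sheet name and labels are illustrative only.
path = joinpath(mktempdir(), "example.xlsx")
columns = Any[[1, 2, 3], ["a", "b", "c"]]
labels = ["id", "name"]
XLSX.writetable(path, columns, labels; overwrite=true, sheetname="Sheet1")

# Under v0.11.0 this call infers column element types by default.
inferred = XLSX.readtable(path, "Sheet1")

# Pass the keyword explicitly to keep the previous default behaviour.
untyped = XLSX.readtable(path, "Sheet1"; infer_eltypes=false)

# Compare the element types of the columns returned by each call.
println(eltype.(inferred.data))
println(eltype.(untyped.data))
```

Code that already passes infer_eltypes explicitly should be unaffected by the new default.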

New Functions

A number of new functions have been added compared with v0.10.4.

These include 18 new functions to support formatting of cells and cell values together with functions to copy or delete a sheet, to merge cells and to add new defined names for cells or cell ranges. In addition, it is now also possible to assign AnnotatedStrings (from StyledStrings.jl) to cells to create content using Excel’s rich text formatting.

A new function, XLSXFile, is provided that takes a Tables.jl compatible table and creates a new XLSXFile object for writing and which can act as a sink for functions such as CSV.read.

A new function, renamesheet!, replaces rename! for consistency in naming with addsheet!, copysheet! and deletesheet!, and to avoid potential name conflicts when exported (e.g. with DataFrames.rename!). However, the existing function XLSX.rename! is retained (but not exported) to avoid a breaking change.
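
A minimal sketch of the rename, assuming renamesheet! takes the same arguments as the existing XLSX.rename! (a worksheet followed by the new name); check the docstring for the exact signature. The file name and sheet name are made up for the example.

```julia
using XLSX

# Create a new workbook in a temporary location (illustrative path).
path = joinpath(mktempdir(), "rename.xlsx")

XLSX.openxlsx(path, mode="w") do xf
    sheet = xf[1]  # a newly created workbook starts with a single sheet
    # Assumed signature, mirroring XLSX.rename!(sheet, name).
    XLSX.renamesheet!(sheet, "Report")
end

# Read the file back and list its sheet names.
names = XLSX.sheetnames(XLSX.readxlsx(path))
println(names)
```

The XLSX. prefix is kept here only so that the snippet cannot clash with rename functions from other packages; since renamesheet! is exported, the unqualified name should work as well.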

Two new functions, gettransposedtable and readtransposedtable, mirror gettable and readtable for worksheet tables that have data organised in rows rather than columns.

Some additional convenience functions have also been added to streamline functions that were already available (such as newxlsx, savexlsx).

A wider range of indexing options is now supported. Most functions now accept vectors, ranges and step ranges for indexing rows and columns, and will also accept a colon.

Exported Functions

Most useful functions are now public, and can be used without the XLSX. prefix. The following function names are now exported:

  • Files and worksheets
    XLSXFile, readxlsx, openxlsx, opentemplate, newxlsx, writexlsx, savexlsx,
    Worksheet, sheetnames, sheetcount, hassheet,
    addsheet!, renamesheet!, copysheet!, deletesheet!

  • Cells & data
    CellRef, row_number, column_number, eachtablerow,
    readdata, getdata, gettable, readtable, readto,
    gettransposedtable, readtransposedtable,
    writetable, writetable!,
    setFormula,
    addDefinedName

  • Formats
    setFormat, setFont, setBorder, setFill, setAlignment,
    setUniformFormat, setUniformFont, setUniformBorder, setUniformFill, setUniformAlignment, setUniformStyle,
    setConditionalFormat,
    RichTextString, RichTextRun,
    setColumnWidth, setRowHeight,
    getMergedCells, isMergedCell, getMergedBaseCell, mergeCells

The iterator XLSX.eachrow has retained the XLSX prefix to avoid making a breaking change. However, Base.eachrow now refers to XLSX.eachrow for XLSX.Worksheet data types, meaning that eachrow can be used without qualification, too.
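
A minimal sketch of the unqualified iterator, assuming the Base.eachrow method described above is available and that row_number and integer column indexing on a row work as in v0.10.4. The file, sheet and column names are made up for the example.

```julia
using XLSX

# Build a small workbook to iterate over (illustrative names only).
path = joinpath(mktempdir(), "rows.xlsx")
XLSX.writetable(path, Any[[1, 2, 3], ["a", "b", "c"]], ["id", "name"];
                overwrite=true, sheetname="Sheet1")

rows = Int[]
XLSX.openxlsx(path) do xf
    sheet = xf["Sheet1"]
    # Unqualified eachrow relies on the Base.eachrow method for worksheets.
    for r in eachrow(sheet)
        push!(rows, XLSX.row_number(r))
        println(XLSX.row_number(r), ": ", r[1])  # value in column 1 of this row
    end
end
```

On v0.10.4 the loop would need to be written as XLSX.eachrow(sheet).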

Fixed issues

This release addresses the following issues:

Full list


Documentation

The documentation for this package has been extended substantially to cover the new functionality and all changes are (should be) reflected therein. In particular, a detailed guide to using the new formatting functions has been added.

Internal changes

A number of changes have been made to package internals. Specifically, the following structs have changed:

  • SheetRowStreamIteratorState
  • WorksheetCacheIteratorState
  • WorksheetCache
  • XLSXFile
  • Workbook
  • Worksheet
  • SheetRow
  • Cell

In particular, the internal memory configuration of an XLSXFile object and its components has been changed significantly, nearly halving the package’s memory footprint.

Changed dependencies

v0.11.0 has now fully migrated to ZipArchives.jl, whereas v0.10.4 relied upon both this and ZipFile.jl. In addition, XML support is now provided by XML.jl rather than EzXML.jl.

The use of AnnotatedStrings is supported through a package extension. This requires StyledStrings.jl to be in the active environment.

New functionality that has been added has brought the following additional dependencies compared with v0.10.4:

  • Colors.jl
  • UUIDs.jl
  • Random.jl

In addition, the test suite now has dependencies on CSV.jl, Distributions.jl and StyledStrings.jl.

Precompilation

v0.11.0 now makes use of PrecompileTools.jl (initially only in a small way).


Fantastic work, Tim! XLSX.jl has been my favorite interface to Excel files, and I’m happy to see my feature request make it into the package. Do you have a roadmap for future releases?

I haven’t looked into it, but could plotting support be added? (Edit) I see that this has been requested in the past: Add plotted figures to existing excel file · Issue #134 · JuliaData/XLSX.jl · GitHub. I’ll see if I can do some groundwork and will take any hints that you might have :slight_smile:


In terms of a roadmap, I’m not currently looking very far ahead. My next task will be to address a significant proposed change in XML.jl. This is going to take a bit of work, I think. I’ve done some preliminary work but need to resume now that v0.11 is released.

(I added a comment on issue #134)


I am happy to see this happening @TimG ! :tada:

Thank you very much for reviving and maintaining this important package.

Cheers!


I am also a happy user of XLSX.jl. For another project I use OdsIO.jl, which calls a Python program. It works well, but it would be neat to see its functionality in pure Julia. I was curious whether that would be possible in this package, or if the file types are just too different.

Glad you’re happy!

My feeling is that it is probably best to keep XLSX.jl focused on Excel files. There is plenty of opportunity for further development in that space without widening the scope to include .ods files, too. I have no experience with .ods files, so I don’t know how similar they are to Excel files.
