Workbooks.jl / Julia and spreadsheets

Part I

Hello, long-time lurker here. This has been sitting with me for a long time, and now I’m not sure how to start. I think the core idea of this post is this manifesto (in the paraphrased words of someone smarter than me):

We need new Spreadsheet Application (SA) for this new era of computing. We need a SA that dispatches in the multiple ways we think. We need a SA that scales exponentially like our problems. We need a SA that integrates with our package ecosystem, letting people extend parts and contribute back to the core library all in one language. We need a modern SA in a modern language.

Yes, this is about spreadsheets. Wait, don’t go! I want to use this part of the post to make the case that spreadsheets are worth your time (if you already know, skip to part II). Hoping that someone who is more skilled than me will take an interest and get involved.

This is not just about Excel (which superceded/killed Lotus 1-2-3), or about Lotus 1-2-3 (which popularised the IBM PC, and killed VisiCalc), or about VisiCalc (which single-handedly popularised the Apple II). It’s about scientists having to rename genes to avoid date functionality from the early nineties. It’s about formatting errors leading to the secret service eavesdropping on the wrong telephone numbers. It’s about “sophisticated” risk analysis relying on copy-and-paste letting the London whale make a $6 bn loss on his own.

I think spreadsheets get used so much in the real world because they are incredibly convenient, for some kinds of work. In my mind, it’s an alternative that’s complementary to the linear style exemplified by a Jupyter or Pluto notebook. Some things are best understood in a top-to-bottom line-by-line format, other things seem to be clearer in a spreadsheet, laid out in a couple of pages of tables and charts. But then why are matrices not first-class objects in a software literally living in array-land? Why do we have to use VBA, with a performance penalty for custom functions? Why can’t we just plug in our favourite packages for graphing, for profiling, for differential equations?

Or to put it another way, I have to use spreadsheets every day for work; why does it have to be Excel? Why can’t I use a Jupyter-quality Julia-based spreadsheet?

Part II

I made a package! GitHub - felixrehren/Workbooks.jl: Spreadsheets with the power of Julia
Here’s what it can do:

  1. It can make a spreadsheet from the command-line and edit cells from the command-line
  2. The spreadsheet updates automatically! Look at cell B1 below
  3. It can teach me how spreadsheets work, and the simple algorithms behind them
  4. err … that’s it

image

Ok, so it falls way short of the dreams outlined in Part I. There’s some more ideas in there, e.g. about using the Pkg environment to create reproducible spreadsheets with user-defined code, but I’m struggling to take it further. Maybe this project was too greedy, but I at least wanted to share the dream. Would love it if anyone wants to get involved or do it better. Maybe one day there will be a good Julia-based spreadsheet application – I think it would be a game-changer. Thanks for reading, and thanks to everyone who contributes to the Julia ecosystem. It’s inspiring!
Felix

21 Likes

looks cool. looks like you are not using any of the reactive libraries when I looked at the Project.toml

Not sure if anyone remembers, but there was this company called Revolver one or Resolver one which was a python & spreadsheet hybrid.

1 Like

Yea I’ll piggyback on this comment. I’d guess that reactivity like Pluto.jl has would be beneficial here.

Would be cool I guess to have a table pop up like FloatingTableView.jl where you can interact with your worksheet, yet it is all still attached to the Julia session running in your REPL.

2 Likes

not just the visual aspect. I mean Excel is a reactive language meaning if A1=fn(B1, C1) then A1 and B1 changing will automatically change A1. So yeah, given you don’t use things like Reactive.jl, in your code if someone changes C1 you use a graph to keep track of C1 impacts A1 and so A1 gets updated. But I mean using a reactive library might save you that code (until you need to optimize i guess).

1 Like

That’s exactly what I meant. Isn’t that how Pluto works?

3 Likes

Maybe the way forward is again to separate data, formulae and notes like the Improv people did. I’m still sorry that they didn’t succeed with their product.

5 Likes

Interesting discussion, Excel is both awesome and a complete mess :wink:
I agree with @pbayer that the entanglement of logic and data is one of the biggest issues with Excel.
Maybe one way would be to have editable tables in Pluto for showing and input of data, but keep the formulas separately in different Pluto cells?

CC @fonsp @pankgeorg

2 Likes

u r right

Sharing a related package for future readers: GitHub - felipenoris/XLSX.jl: Excel file reader and writer for the Julia language.

2 Likes

I’ve been studying this problem on how teams actually use spreadsheets for a while. I came to the observation that there are actually two factors.

  1. Data Entry. Having Excel as a data-entry medium is easy and great and requires little programming insight.

  2. Reactive programming. This is the main advantage of interfaces like Pluto etc.

I’ve been thinking quite hard whether one can get a Google sheet like interface to address (1) in Pluto, but falls short in two areas.

  1. We’re still lacking bind2 functionality, which means that the only output from reactive components in Pluto must be Javascript literals or objects. We can get around this by passing JSONs around, but I’ll like to avoid that.

  2. Fundamentally, there is no easy way to save outputs of reactive components in Pluto. We need to work around some file reading capabilities to save the data entry components. With the way Pluto works it sometimes require us to restart the notebook if we mess up on the notebook. This may lead to loss in time if we have no easy way to save what our outputs is.

Fundamentally, my opinion is that something like Workbooks.jl won’t change that much, because people who are already programmers can easily write code to do things. I have a dream that someday I can get a Google Sheets like interface, and allow me to write Julia code everywhere to do things, but my current goal is to improve Pluto until that goal becomes doable.

4 Likes

An interactive Tables.jl-compatible table editor for data entry in IJulia and Pluto would be nice - we don’t really have a full solution for that yet, do we?

2 Likes

In some sense, we require quite a few changes to make this work.

If you can display Tables.jl tables, with a data entry that allows formulas, then I think together with Workbooks.jl you basically have a spreadsheet application? and maybe not too far away from “Google Sheets like interface [allowing] to write Julia code everywhere”!

The reactive, no-side-effect philosophy of Pluto is a good fit for spreadsheets, I think. While I’m clueless about interface work, very happy to chat if combining forces could be helpful

It is important to clearly define the inputs and outputs for an editable table so that the Pluto dependency tree is constructed correctly.
If we consider an editable table as a standard input object (like slider), you set its initial state in the constructor of the table. This is done only once unless the initial state itself changes, in this case a new slider is constructed and all previous user input lost.
The result of the user input would be @bind to a Julia variable. Re-execution of downstream code would on default be triggered on every change.
Persistency of user input could be archived e.g. by saving the table state into a file after each change. When opening the notebook again, this file is loaded and its input passed to the table constructor.

1 Like

That would work nicely for many use cases, I think.

1 Like

This doesn’t have the front end of a spreadsheet, but here’s an older Lisp implementation of the reactive dataflow model. https://github.com/kennytilton/cells

1 Like