Help testing new features of PrettyTables.jl

Hi guys!

First of all, I would like to apologize. When I started to develop PrettyTables.jl, I thought it would be a very small project just to replicate the functionality in this website. However, it became much bigger, with three back-ends, highlighting options, formatters, etc. Because of this, I made some very bad decisions in the past that I need to correct now. Unfortunately, these corrections will lead to breaking changes in the next release.

Thus, I would like to ask if people who are using this package can help me test the version that is in master and give me feedback on what can be improved in the API.

So, let’s see the first breaking change so far.

I decided to change the behavior of the hlines keyword in the text back-end. In the past, it was used to draw horizontal lines in the table body. However, there was demand for an option to remove the first and last lines of the table. I did this by adding a field to the structure that handles the table format (TextFormat). As a result, many people were creating new structures just to remove those lines. This is bad… Now, hlines can be used to remove those lines, and the previous behavior was moved to body_hlines. In the end, we can now easily do things like:

julia> using PrettyTables

julia> using Statistics

julia> data = [ 100 200 300 400
                235 452 332 201
                433 29  123 223 ];

julia> pretty_table([data mean(data, dims = 2)],
                    ["Experiment 1" "Experiment 2" "Experiment 3" "Experiment 4" "Mean"
                     "[Unit]"       "[Unit]"       "[Unit]"       "[Unit]"       "[Unit]"],
                    row_names       = ["Set A", "Set B", "Set C"],
                    hlines          = [:header],
                    vlines          = [1,5],
                    header_crayon   = crayon"bold yellow",
                    row_name_crayon = crayon"bold green",
                    highlighters    = Highlighter( (data,i,j)->j == 5,
                                                   foreground = :blue,
                                                   bold       = true))

I would appreciate it if someone could take a look at the keywords hlines, body_hlines, and vlines of the Text back-end and give me some tips on whether this is a good API. This will help to reduce the number of breaking changes in the future until we reach the 1.0 milestone :slight_smile:

By the way, the LaTeX back-end also has breaking changes, but since it was marked as beta I think this is not a problem :slight_smile:

If you are considering making breaking changes, I wonder if some of the interface can be simplified?

A couple weeks ago I was making an HTML table, and I found it difficult to get PrettyTables to format things as I wanted. It also seemed like each bit of formatting required learning a new syntax. For example, the keyword argument formatter wants a dictionary of the form Dict( col => (value, row) -> ...) while cell_alignment wants a dictionary of the form Dict( (row, col) => ... ), and highlighters wants a tuple of HTMLHighlighters which each want a function of the form (data, i, j) -> Bool and an HTMLDecoration. All of these describe how to style and print a cell, but each requires a different syntax / data structure to convey that information.

I ended up writing my own HTML-table writing code, a simple loop over rows and columns, which just takes a style function (row, col, value) -> css style string and a format function (row, col, value) -> formatted value, with the convention that row 0 is the header. For me, it was both simpler and more customizable (since I can apply arbitrary CSS to each cell). My solution here probably isn’t for everyone since you have to write CSS directly, but I found it easier to just look up how to do something in CSS than to figure out which keyword argument I need from PrettyTables and what kind of input it expects. It also feels conceptually simpler to have two functions that just get called for every cell than to have various dictionaries full of functions, where you have to reason about when each one will be called and what it should do.
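The two-function design described above could look roughly like the sketch below. It is only an illustration of the idea, not the code from the linked versions: the name html_table and its keyword arguments are made up here, and nothing in it comes from PrettyTables.jl.

```julia
# Hypothetical helper, not part of PrettyTables.jl: write an HTML table by
# looping over rows and columns. Row 0 is the header by convention.
function html_table(header::AbstractVector, data::AbstractMatrix;
                    style  = (row, col, value) -> "",
                    format = (row, col, value) -> string(value))
    io = IOBuffer()
    println(io, "<table>")
    for row in 0:size(data, 1)
        tag = row == 0 ? "th" : "td"
        println(io, "  <tr>")
        for col in 1:size(data, 2)
            value = row == 0 ? header[col] : data[row, col]
            css   = style(row, col, value)      # CSS string for this cell
            attr  = isempty(css) ? "" : " style=\"$css\""
            println(io, "    <$tag$attr>", format(row, col, value), "</$tag>")
        end
        println(io, "  </tr>")
    end
    println(io, "</table>")
    return String(take!(io))
end

data = [1.0 -2.5; 3.14 4.0]

# Bold header, negative numbers in red, one decimal place in the body.
html = html_table(["a", "b"], data;
    style  = (row, col, value) -> row == 0 ? "font-weight: bold" :
                                  value < 0 ? "color: red" : "",
    format = (row, col, value) -> row == 0 ? value : string(round(value, digits = 1)))

print(html)
```

Both callbacks are called once per cell, header included, so there is only one signature to remember.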

If you’re curious, my original PrettyTables version is here and the new version is here.

I’m just a casual user, but I found the following workflow great: (1) enter the data into an Excel spreadsheet (a small amount of data, typed in by hand), (2) save the spreadsheet as CSV, (3) use CSV.read to read the data into a Julia DataFrame and do computations, (4) use PrettyTables to convert the data into LaTeX code, (5) copy and paste the result into LaTeX (I used LyX).

There are probably even simpler ways to do this, but gone are the days of manually copying and pasting tabular data into LaTeX tables.

Thanks for your work!

Thanks! This is what I was searching for :slight_smile: You are absolutely right. I will see if I can change everything to use a tuple of functions with the format (data,i,j).

Awesome! I am glad it is useful :slight_smile:

@ericphanson

Well, I think I will not be able to make the API exactly the same. For example, for highlighters, it makes sense that the function that selects the highlighting options has access to the entire table:

f(data, i, j)

where data is a reference to the entire table. This is required if you want to highlight, for example, the lowest element in a row.

However, for formatters, it does not make sense for the functions to have access to the entire table. A formatter should change the formatting of a single element. Thus, in this case, the API should be:

f(value, i, j)

where value is the value of the cell (i,j).
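To make the difference between the two signatures concrete, here is a small sketch in plain Julia. The function names are hypothetical and nothing here calls PrettyTables.jl; it only shows why one function needs the whole table and the other does not.

```julia
# Hypothetical predicate with the (data, i, j) signature: true when cell (i, j)
# holds the lowest value of row i. It needs the whole table to decide.
is_row_minimum(data, i, j) = data[i, j] == minimum(@view data[i, :])

# Hypothetical formatter with the (value, i, j) signature: it only sees one cell.
two_decimals(value, i, j) = string(round(value, digits = 2))

data = [3.0 1.0 2.0
        0.5 4.0 6.0]

# Evaluate the predicate for every cell, as a back-end would do internally.
mask = [is_row_minimum(data, i, j) for i in axes(data, 1), j in axes(data, 2)]
```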

Makes sense?

I think there’s a big advantage to a uniform API. One option would be to always pass both. Another is to just have the user pass closures, like I did in my second link (the formatter uses the variable tab which is not an argument to the function) to get access to whatever they need. Actually, by using closures the function could just have the arguments i and j and have the user close over the rest but that seems a little inconvenient.
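As a rough sketch of the two options mentioned above (all names are illustrative, and the table here is just a stand-in for the variable tab in the linked code):

```julia
tab = [1 2; 3 4]    # illustrative table, standing in for `tab`

# Option 1: a closure over `tab`. The function itself only takes the indices.
is_column_maximum = (i, j) -> tab[i, j] == maximum(@view tab[:, j])

# Option 2: a uniform signature that always receives both the table and the value.
show_share(data, value, i, j) =
    string(value, " (", round(Int, 100 * value / sum(data)), "%)")

is_column_maximum(2, 1)            # uses `tab` without receiving it as an argument
show_share(tab, tab[2, 2], 2, 2)   # gets everything through its arguments
```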

I understand your point, but I think it will bring confusion.

For example, if you have 3 formatters:

formatters = (f1,f2,f3)

They will be applied in order like:

v = f1(v,i,j)
v = f2(v,i,j)
v = f3(v,i,j)

This makes sense. However, if we also pass the data (a reference to the table), then data[i,j] == v holds in the first call but not necessarily in the later ones, because v has already been modified by the previous formatters.
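A minimal sketch of this chaining in plain Julia, with made-up formatters f1, f2, f3 and a hypothetical helper apply_formatters (this is not how PrettyTables.jl implements it internally):

```julia
# Hypothetical formatters with the (value, i, j) signature.
f1(v, i, j) = round(v, digits = 1)          # number in, number out
f2(v, i, j) = string(v)                     # number in, string out
f3(v, i, j) = j == 1 ? "[" * v * "]" : v    # decorate the first column only

# Apply the formatters in order to one cell, feeding each output to the next.
apply_formatters(formatters, v, i, j) =
    foldl((acc, f) -> f(acc, i, j), formatters; init = v)

data = [1.26 2.0; 3.0 4.44]
formatters = (f1, f2, f3)

out = apply_formatters(formatters, data[1, 1], 1, 1)
# After f1 the running value is no longer equal to data[1, 1],
# and after f2 it is not even a number anymore.
```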

Thus, I think I prefer to leave this as I described for formatters. For highlighters and cell alignments it will be the same.

If it’s getting to be a big thing, have you considered adding American Psychological Association standard tables and charts?

I hope that one day PrettyTables.jl will have the features needed to print tables that follow that standard :slight_smile: There is still a long way to go. The most challenging thing will be merging cells in the text back-end.

Ah I see, I didn’t think of the case of multiple formatters.

Agreed, I find I’m going back and forth between Excel and Julia a lot. Same with R, which might still be easier for some statistics tasks, but I’m finding more stuff in Julia all the time.

@ericphanson

Done! I submitted the changes. Now, I think the API is pretty consistent. Everything is based on (data,i,j) or (data,i) (for the filters). data is the entire matrix for everything except formatters, which receive only the value that needs to be formatted.

Furthermore, I also modified the Highlighters API to be the same in all back-ends. You have an activation function, which tells if the cell must be highlighted or not, and a function that returns what must be applied. In this case, nothing is breaking because I built some wrappers to avoid that.

The good side is that we can now do conditional highlighting in the Text back-end. See this example that shows a possible integration between PrettyTables.jl and ColorSchemes.jl:

julia> using PrettyTables

julia> using ColorSchemes

julia> data = [ sind(x)*cosd(y) for x in 0:10:180, y in 0:10:180 ]

julia> hl = Highlighter((data,i,j)->true,
                        (h,data,i,j)->begin
                             color = get(colorschemes[:coolwarm], data[i,j], (-1,1))
                             return Crayon(foreground = (round(Int,color.r*255),
                                                         round(Int,color.g*255),
                                                         round(Int,color.b*255)))
                         end)

julia> pretty_table(data, ["y = $(y)°" for y = 0:10:180],
                    row_names = ["x = $(x)°" for x = 0:10:180],
                    highlighters = hl,
                    formatters = ft_printf("%.2f"))
