I am building a new version of PrettyTables.jl with the features necessary to replace the LaTeX and HTML backends in DataFrames.jl. However, I am facing some problems.
Most of them are related to one feature in PrettyTables: the filters! Currently, you can pass a set of functions to filter the rows or columns that will be printed. I originally added this to filter satellite telemetry that could indicate a problem. However, since DataFrames now uses PrettyTables to print tables to stdout, we can just use all the powerful mechanisms in DataFrames to perform the filtering.
So, I have the following question: Is anyone actively using the filters in PrettyTables? I am seriously considering removing them in the next release. It will make adding new features much easier.
I am not using the filters in PrettyTables. If a need to filter arises, I would prefer to use DataFrames operations. Making life simpler for the package maintainer is +1.
+1 from me too! Composing functionality (e.g. filter then print) provides a simpler and more flexible interface than overburdening a function with too much responsibility.
I actually do use the filters when I have data that is not in a DataFrame.
I am in favor of removing filters from PrettyTables, but would you share a proposed workflow @Ronis_BR that does not use filtering within PrettyTables?
Thanks!
Speaking as a PrettyTables user who basically never encounters DataFrames, I’m also all for removing those filters. It’s trivial and more convenient to filter a table beforehand anyway, thanks to Julia’s map() and filter() functions, which work with many table types.
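To make the map() and filter() suggestion concrete, here is a minimal sketch using only Base Julia. The data, the field names (`id`, `temp`, `status`), and the variable names are made up for illustration; they are not from PrettyTables.

```julia
# Hypothetical row table: a Vector of NamedTuples (no DataFrames involved).
rows = [
    (id = 1, temp = 21.5, status = "ok"),
    (id = 2, temp = 88.0, status = "alarm"),
    (id = 3, temp = 19.9, status = "ok"),
    (id = 4, temp = 91.2, status = "alarm"),
]

# Row filtering: keep only the rows flagged as alarms.
alarms = filter(r -> r.status == "alarm", rows)

# Column selection: keep only the `id` and `temp` fields of each row.
selected = map(r -> (id = r.id, temp = r.temp), alarms)
```

A vector of named tuples is a Tables.jl row table, so `selected` should be something you can hand to `pretty_table` afterwards. Note that both steps allocate new vectors.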
How is this decision related to DataFrames at all?
PrettyTables consumes more data types than its filters can handle, as they seem to apply only to matrix data?
Filtering is best done outside the package, but I wouldn’t say it’s trivial. Examples would be welcome, even if they would be outside the scope of the package.
It seems we can remove the filters. It should be easy to provide examples that perform the same filtering outside PrettyTables. The only drawback is that PrettyTables does not duplicate data to print the filtered content. Without the built-in filters, you will have to filter the table beforehand, creating another table that is then printed.
Maybe this feature could live in a new package that does the filtering without copying.
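For plain matrix data, one way to filter without copying the values is a Base `view`. The following is only a sketch under that assumption, with a made-up matrix and threshold; it is not how the PrettyTables filters work internally.

```julia
# Hypothetical telemetry-like matrix: column 1 is an id, column 2 is a reading.
data = [1 10.5;
        2 99.0;
        3 12.1;
        4 87.3]

# Row filter: indices of the rows whose reading (column 2) exceeds a threshold.
keep_rows = [i for i in axes(data, 1) if data[i, 2] > 50.0]

# Column filter: indices of the columns to keep.
keep_cols = [1, 2]

# `view` builds a SubArray that references `data`; only the index
# vectors are allocated, the matrix values are not copied.
filtered = view(data, keep_rows, keep_cols)
```

`filtered` is an `AbstractMatrix`, so it should be printable like any other matrix. For other table types, whether a non-copying filter is available depends on the type.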
How is this decision related to DataFrames at all?
It is only because DataFrames now uses PrettyTables, so you can use DataFrames to filter the data if required. That would be overkill in most cases if you are not already using DataFrames.
PrettyTables consumes more data types than its filters can handle, as they seem to apply only to matrix data?
Yes, this is true.
I am in favor of removing filters from PrettyTables, but would you share a proposed workflow @Ronis_BR that does not use filtering within PrettyTables?
Thanks! I am starting to realize that it is one of those features that are used only in very specific situations. I could leave it there, no problem! However, it is also one of those features that make almost everything else very difficult.
For example, I need to print in HTML how many rows are hidden in case the user wants to render only part of the table. However, if there is a filter, I need to filter the entire table to check how many rows would actually be displayed… This is just one example.
Regarding tables that are not DataFrames, I wanted to mention the TableOperations package, which has filter and select methods for arbitrary tables. I haven’t checked the details, but I imagine those should cover most of the filtering functionality in PrettyTables.
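A minimal sketch of what that could look like, following the usage pattern shown in the TableOperations README. The table and its column names (`id`, `temp`) are made up for illustration, and I have not checked how the result interacts with every PrettyTables option.

```julia
using Tables
using TableOperations

# Hypothetical column table: a NamedTuple of vectors, not a DataFrame.
table = (id = [1, 2, 3, 4], temp = [21.5, 88.0, 19.9, 91.2])

# Row filtering: keep rows with temp > 50. The filter is applied lazily,
# and `Tables.columntable` materializes the result.
filtered = table |>
    TableOperations.filter(row -> Tables.getcolumn(row, :temp) > 50.0) |>
    Tables.columntable

# Column selection: keep only the `id` column.
ids = table |> TableOperations.select(:id) |> Tables.columntable
```

Both results are ordinary column tables, so they can be passed on to whatever prints Tables.jl sources.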
This is good. I had been using the filters at times, because it was there. I think filtering ahead of the PrettyTable process makes more sense, and now you have taken away an option that I needed to think about.