[ANN] StringManipulation.jl v0.2.0 and TerminalPager.jl v0.3.0

Hi!

I just tagged StringManipulation.jl v0.2.0 and TerminalPager.jl v0.3.0 :partying_face:

StringManipulation.jl

I realized that some of the packages I built had a lot of duplicated code related to processing strings with decorations (escape sequences). For example, in TerminalPager.jl, I need to compute the printable size of a string to check where I can break it to fit the screen. I also need to track the decorations to keep the output consistent when the lines are drawn on the screen. All of this must take into account UTF-8 characters that can have a width greater than 1.

All this duplicated code is now inside StringManipulation.jl. The good side (besides not having to maintain the same thing over and over again) is that I could test almost everything related to string manipulations :slight_smile:
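
To make the problem concrete, here is a minimal sketch using only Base Julia. It is not the implementation used by StringManipulation.jl: the names `ANSI_CSI`, `strip_decorations`, and `printable_width` are made up for this illustration, and the regex only covers simple CSI escape sequences.

```julia
# Matches simple ANSI CSI sequences such as "\e[1m" or "\e[38;5;196m".
# Simplified, hypothetical pattern; not the one used by StringManipulation.jl.
const ANSI_CSI = r"\e\[[0-9;]*[A-Za-z]"

# Remove the escape sequences, leaving only the text that is actually printed.
strip_decorations(str::AbstractString) = replace(str, ANSI_CSI => "")

# Number of terminal columns the string occupies once printed.
printable_width(str::AbstractString) = textwidth(strip_decorations(str))

decorated = "\e[1mJulia\e[0m 日本語"

length(decorated)           # also counts the characters of the escape sequences
printable_width(decorated)  # 5 + 1 + 3 * 2 = 12 columns
```

The difference between `length` and the printable width is what makes it hard to decide where a decorated line can be broken to fit the screen.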

This package has some nice features that can be helpful in some contexts, such as highlighting search matches:

julia> using StringManipulation

julia> str = """
       # In sed validum spumis una quam habet

       ## Cum ignes tibi

       Lorem markdownum [sine](http://huichonesta.net/positamquetransire)! Infans
       Sipylumque: venit sui *in* ambiguum petunt regnabat Cerealia quercus! Iussae sic
       superas **relinquunt tinguit iustis** quae, **spes iterum precando**. Qua acuta,
       vitiataque albas hastilibus Etruscam, ex classe furorem eheu et
       [menti](http://www.thermodontiacoaltius.org/) adspexit humilesque similemque
       nomen. Ut *vero velant ad* tibi receptae classis manantem laceri iterum, litore.

       ## Sunt est obsedit virgo

       Saepe obvia gentis, pervia Medea genetrix mori currum pyra viri, formosior
       quidem Viderat! Quas Alcimedon exuit! Domos biformis virginitas secundum Et
       dictis ad annos coniecit: suci uteri, de.

       > Satus cur imbris, tum in bracchia digiti *populusque luctante nam* materiam
       > spectare: oppidaque potiunda gravidi genitore raptas. Dryantaque **vertere
       > metuo**.
       """;

julia> highlight_search(str, r"qu") |> println
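
For readers curious about the basic idea behind this kind of highlighting, the sketch below wraps each regex match in "reverse video" escape sequences using only Base Julia. `naive_highlight` is a hypothetical helper written for this illustration, not part of StringManipulation.jl, and it assumes the input has no decorations of its own.

```julia
# Wrap every regex match in reverse-video escape sequences (\e[7m ... \e[27m).
# Hypothetical helper: it ignores any decorations already present in `str`.
function naive_highlight(str::AbstractString, pattern::Regex)
    return replace(str, pattern => m -> "\e[7m" * m * "\e[27m")
end

naive_highlight("quam quercus", r"qu") |> println
```

A naive version like this breaks down as soon as the text already carries escape sequences, which is exactly the decoration tracking problem described earlier.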

TerminalPager.jl

The new version of TerminalPager contains a lot of improvements. First, it now uses StringManipulation instead of the custom rendering engine. We now have a much cleaner codebase, which will hopefully help people contribute to the project :slight_smile:

I also changed many things to improve performance. Some runtime dispatches were fixed thanks to two AMAZING tools: SnoopCompile.jl and JET.jl.

Unfortunately, there are breaking changes… sorry about that :frowning: If you use the pager without modifications, everything is fine. However, if you customize keybindings or use some internal options, you will need to adapt your code.

Next steps

The next step will be to apply the experience I gained and adapt PrettyTables.jl to use StringManipulation.jl. After that, I will perform the same analysis with SnoopCompile and JET to try to improve its performance. In parallel, I am working to finish the HTML and LaTeX backends of PrettyTables.jl so that it can be used by default by DataFrames (sorry @bkamins , I had more problems than I thought I would have :smiley: , but I am working on it).


This is really fantastic. Can you please also comment on how TerminalPager.jl now handles very wide/tall tables?


Thanks @bkamins !

It works very badly :sweat_smile: Currently, the system must render the entire table. However, I had a wonderful idea to solve this problem once and for all (that’s why I started this modification in TerminalPager.jl):

  1. Create an interface (something like Tables.jl) to define a custom API for types that need to print data.
  2. TerminalPager.jl will ask for a view (initial row and column and number of rows and columns).
  3. The interface must render a chunk of data and send it to TerminalPager.jl as requested. If data outside the chunk is required, the interface must render it and free the memory of the previous chunk.
  4. It is also possible to request some initial information to render the header and row numbers at any point.

In this case, we will be able to show and analyze very large DataFrames. However, this interface must be data-aware: the code for a DataFrame will not work for a matrix. Still, we can create interface packages like DataFramesPager, etc.
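
A rough sketch of how the four steps above could look for a plain matrix is shown below. Everything in it is hypothetical: `ViewRequest` and `render_chunk` are names invented for this illustration, and nothing here is an existing TerminalPager.jl API.

```julia
# All names below are hypothetical; this is not an existing TerminalPager.jl API.

# A view request: first row/column and the number of rows/columns to show.
struct ViewRequest
    first_row::Int
    first_col::Int
    nrows::Int
    ncols::Int
end

# Each data type would provide its own method of `render_chunk`, returning the
# lines of text for the requested window only.
function render_chunk end

function render_chunk(m::AbstractMatrix, v::ViewRequest)
    rows = v.first_row:min(v.first_row + v.nrows - 1, size(m, 1))
    cols = v.first_col:min(v.first_col + v.ncols - 1, size(m, 2))
    return [join((string(m[i, j]) for j in cols), "  ") for i in rows]
end

# Only a 3 × 4 window of the 1000 × 1000 matrix is converted to text.
big = reshape(1:10^6, 1000, 1000)
lines = render_chunk(big, ViewRequest(1, 1, 3, 4))
foreach(println, lines)
```

A DataFrame would need its own `render_chunk` method (for example in a separate package), which is the data-aware part mentioned above.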

The only downside is that you will lose the ability to search inside TerminalPager. However, we can turn this feature on and off depending on the size of the text to be shown.


Bravo! I think it is worth working towards such an interface in the long run.
In the short run I think LaTeX + HTML backends are better priorities.
(of course it is up to you what you like to work on :smile: - these are just my suggestions respecting the fact that most likely you have a limited time budget)


Yes, for sure! I agree 100%, that is my next priority now :slight_smile:
