Display more decimals in DataFrame

dataframes
#1

How do I make a DataFrame display more decimals in the REPL? For example:

x = DataFrame(a=[1.142300004,1.142300051])

2×1 DataFrame
│ Row │ a       │
│     │ Float64 │
├─────┼─────────┤
│ 1   │ 1.1423  │
│ 2   │ 1.1423  │

#2

This may not be best practice, but the following seems to work (at the expense of changing the display format of every Float64):

julia> using Printf

julia> Base.show(io::IO, t::Float64) = @printf io "%1.9f" t

julia> x = DataFrame(a=[1.142300004,1.142300051])
2×1 DataFrame
│ Row │ a           │
│     │ Float64     │
├─────┼─────────────┤
│ 1   │ 1.142300004 │
│ 2   │ 1.142300051 │


#3

There is a good package for displaying tables in the terminal, PrettyTables.jl. An example:

using DataFrames, PrettyTables
x = DataFrame(a=[1.142300004,1.142300051])
pretty_table(x)
┌─────────────┐
│           a │
│     Float64 │
├─────────────┤
│ 1.142300004 │
│ 1.142300051 │
└─────────────┘


#4

Yes! And if you use PrettyTables.jl, you can use the formatters to print the numbers any way you like. If you feel something is missing, please let me know :slight_smile:


#5

@longemen3000 and @Ronis_BR - thanks for the suggestions.

Tweaking the Base.show function is a nice hack. I'm worried about side effects elsewhere, though. Nonetheless, it solves the problem.
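
One way to limit those side effects, sketched here as an assumption rather than anything DataFrames.jl provides, is to leave `Float64` alone and wrap the values in a small type with its own `show` method. The name `Fixed9` is made up for this example:

```julia
using Printf

# Hypothetical wrapper type: only values wrapped in Fixed9 get the new format.
struct Fixed9
    value::Float64
end

# Print exactly nine digits after the decimal point.
Base.show(io::IO, f::Fixed9) = @printf(io, "%.9f", f.value)

# Wrap a column's values; every other Float64 keeps its default display.
wrapped = Fixed9.([1.142300004, 1.142300051])
foreach(println, wrapped)
```

Because the method is defined on a type the example owns, nothing else in the session changes. A column of wrapped values should then display with nine decimals, though that is untested here against any particular DataFrames.jl version.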

The pretty_table function works, but it's slow for large tables: it took 20 seconds to display my 6-million-row × 2-column table.

Some more thoughts:

I thought IOContext could come to the rescue, as in the following example:

julia> show(IOContext(stdout, :compact => true), 1.142300004)
1.1423
julia> show(IOContext(stdout, :compact => false), 1.142300004)
1.142300004
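
The same comparison can be captured as strings with `sprint` and its `context` keyword, which makes the effect of `:compact` easy to check in a script. This is a minimal sketch using only Base; the helper name `show_string` is made up for illustration:

```julia
# Hypothetical helper: render a value to a String under a given :compact setting.
show_string(x; compact::Bool) = sprint(show, x; context = :compact => compact)

short = show_string(1.142300004; compact = true)
full = show_string(1.142300004; compact = false)

println(short)  # compact form
println(full)   # full-precision form
```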

But the same trick doesn't work for DataFrames:

julia> show(IOContext(stdout, :compact => false), x)
2×1 DataFrame
│ Row │ a       │
│     │ Float64 │
├─────┼─────────┤
│ 1   │ 1.1423  │
│ 2   │ 1.1423  │

It would be nice if the context were passed all the way down to the values. Even better would be a :precision setting that could be passed via the context. @bkamins what do you think?


#6
  1. It is very easy to make DataFrames.jl respect the :compact property. If you think it is worth adding, please open an issue on GitHub. I can implement the change and it can be discussed there.
  2. If something more sophisticated were needed, I would rather have a :renderer (or similarly named) property where you could pass a custom function that replaces the default way DataFrames.jl converts values into strings for display purposes.
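
To make the second idea concrete, here is a rough sketch of what such a renderer function might look like. Nothing here is DataFrames.jl API: `render_value` and `render_column` are hypothetical names, and the :renderer property itself is only a proposal at this point.

```julia
using Printf

# Hypothetical renderer: turns one cell value into the string to display.
render_value(v::AbstractFloat) = @sprintf("%.9f", v)
render_value(v) = string(v)  # fallback for every other type

# Apply a renderer to each value of a column (stand-in for the display code).
render_column(renderer, column) = [renderer(v) for v in column]

println(render_column(render_value, [1.142300004, 1.142300051]))
```

A function-valued property like this would cover the precision case and anything else (units, thousands separators) without adding one context key per formatting option.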

#7

Both sound like great ideas! It makes me greedier now :slight_smile:

I can open an issue there for further discussions.


#8

Hi @tk3369,

Actually it was kind of a bug in PrettyTables.jl.

The problem is this: to provide a lot of formatting capabilities, we need to process the entire table before printing it, i.e. convert every entry to a string. If you are printing a table with 6,000,000 rows, it will be very slow because we need to check the size of each element, apply the formats, etc. However, for default printing (cropping the output to fit the screen, like the default behavior in Julia), we do not need to convert the entire matrix, only those rows and columns that will appear.
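
The idea can be illustrated with a small sketch. This is not the actual PrettyTables.jl implementation; `visible_cells` is a hypothetical helper that converts only the rows that would fit on screen:

```julia
# Hypothetical helper: convert only the first `max_rows` rows to strings,
# instead of stringifying the whole matrix up front.
function visible_cells(A::AbstractMatrix, max_rows::Integer = displaysize(stdout)[1])
    nrows = min(size(A, 1), max_rows)
    return [string(A[i, j]) for i in 1:nrows, j in 1:size(A, 2)]
end

A = randn(6_000_000, 2)
cells = visible_cells(A, 45)  # work is proportional to 45 rows, not 6,000,000
println(size(cells))
```

The cost now depends on the number of displayed rows rather than on the size of the table, which is why cropped output can be fast even for very large inputs.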

Hence, I have tagged a new version of PrettyTables.jl (v0.4.1) that fixes this problem. Now, printing a table with 6,000,000 rows takes less than 0.2 s:

julia> A = randn(6_000_000,2);

julia> @time pretty_table(A)
┌──────────────────────┬──────────────────────┐
│               Col. 1 │               Col. 2 │
├──────────────────────┼──────────────────────┤
│  -1.5708105270195982 │   0.5849721303958696 │
│ -0.18240834262933073 │    2.297072454147525 │
│    3.038226193453734 │   -1.053995983297294 │
│  -0.4521717844067924 │  0.24068452887469385 │
│  -1.0036693148726834 │   1.3231192378011667 │
│  -1.0912654369677268 │  0.07918736934992252 │
│  -1.7374054279253734 │ -0.48331939530398493 │
│    0.711235697419586 │  0.13195834920782662 │
│   0.5730496383457973 │  0.31378900621944894 │
│  -0.2852634109648003 │ -0.41608014915233066 │
│   1.1752714990244069 │  -0.6651023540551017 │
│ -0.01302333291340418 │  -0.7285514485011084 │
│    0.892405345306186 │   0.5782248658188986 │
│ -0.34061548058754576 │    -2.36534519685229 │
│  -0.6444713255802805 │   1.2357514702560206 │
│  -0.1637680099888764 │ -0.24256742808025103 │
│   -0.442464606793289 │   1.4713960791728553 │
│  0.24643436155394485 │   0.6565996153774774 │
│  -1.3875142117660895 │   0.5376088505913476 │
│  -0.4771557567644743 │   -1.648645221341626 │
│   0.5300907816993856 │   1.2681887723331997 │
│   2.4628192182971866 │  -0.9386710121889111 │
│   0.7932936634888795 │   -1.539282914842811 │
│  -0.7819474124799908 │  0.17471592555458548 │
│   1.4122772642519938 │   1.7101655089219512 │
│    0.544239586271703 │   1.5501441864921355 │
│  -0.8011567261844883 │   0.5573234867787404 │
│   0.5466673104345015 │   -1.445852038822783 │
│  -2.6435832343401464 │    1.058476049459767 │
│    0.835091258446735 │  -1.1831995330500964 │
│ -0.08963293149512457 │  -1.5665514952683073 │
│   0.6917000348450123 │    1.745435162680872 │
│   2.0383430047135245 │  -0.5926896235283556 │
│   0.9125870984077767 │  -0.7465508196307007 │
│   0.7137596621623102 │  -0.6798703096596058 │
│  -0.9962745074964077 │   0.6635460632776334 │
│   1.0114147162983365 │  -0.5128772869544962 │
│  -0.5739645602650018 │ -0.16668185162965135 │
│  -0.0671281071488013 │  -1.3142292562497255 │
│ -0.15487193175125383 │   1.4284413652774037 │
│  -1.1900934971232457 │  -0.5883720211137512 │
│   0.9355445352841825 │     1.09238485599259 │
│  -2.7938345777067863 │  -0.9465073717450541 │
│ 0.023128382966333787 │  -0.6316322682344331 │
│  0.23656405073467132 │   1.5190501597674642 │
│          ⋮           │          ⋮           │
└──────────────────────┴──────────────────────┘
  0.080995 seconds (3.74 k allocations: 51.669 MiB, 67.51% gc time)

Of course, if you want to actually print the entire table formatted with PrettyTables, it will still take a long time.

I think we can improve that in the future by letting the user skip the initial processing, either by passing a fixed column width or by having it guessed.
