Hi, I am working with data sets that have very many columns. One small example has dimensions (1452, 66584) and is about 2 GB; when converted to the Feather format, its size drops to 778 MB. The problem is that Feather.jl is unexpectedly slow on the first read, with a fairly large memory allocation (I failed several times on ACF due to memory issues), and unexpectedly fast after that. Here is the output from a successful run on my local machine:
```
julia> using Feather

julia> @time al = Feather.read("DO_gm_ofa_unadj_alpr_ch1.feather");
1850.519921 seconds (17.67 G allocations: 363.156 GiB, 1.80% gc time)

julia> @time al = Feather.read("DO_gm_ofa_unadj_alpr_ch1.feather");
3.102820 seconds (17.18 M allocations: 575.066 MiB, 7.20% gc time)
```
```
Julia Version 1.0.5
Commit 3af96bcefc (2019-09-09 19:06 UTC)
OS: Linux (x86_64-pc-linux-gnu)
CPU: Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
LLVM: libLLVM-6.0.0 (ORCJIT, haswell)
```
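The slow-then-fast pattern looks like Julia compiling specialized reading code for the 66,584-column schema on the first call. Below is a minimal sketch that should reproduce it on a smaller scale; the column counts, row count, and file name are all made up for illustration:

```julia
# Sketch to reproduce the first-call slowdown, assuming it is compilation:
# build a wide table, write it out, and time two reads to separate the
# one-time compilation cost from the steady-state I/O cost.
using DataFrames, Feather

ncols = 5_000                          # scaled down from 66_584
df = DataFrame()
for j in 1:ncols
    df[!, Symbol("c", j)] = rand(100)  # 100 rows of random Float64s per column
end
Feather.write("wide_test.feather", df)

@time Feather.read("wide_test.feather");  # first call: includes compilation
@time Feather.read("wide_test.feather");  # second call: steady-state read
```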
This is not the only file I need to work with; it is one of several that I work with together. Do you have any ideas for reading data with this many columns quickly?
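One workaround I have been considering is sketched below, under the assumption that all columns are numeric (the `_t` output file name is made up): since the table has far more columns than rows, storing it transposed gives Feather a 1,452-column schema instead of a 66,584-column one, which should shrink the per-column work on the first read of later sessions.

```julia
# Hedged workaround sketch: pay the slow read once, then store the data
# transposed so that later sessions only see 1_452 columns. Assumes every
# column is numeric.
using DataFrames, Feather

al  = Feather.read("DO_gm_ofa_unadj_alpr_ch1.feather")  # slow first read
mat = permutedims(Matrix{Float64}(al))                  # now 66_584 × 1_452

alt = DataFrame()
for j in 1:size(mat, 2)
    alt[!, Symbol("s", j)] = mat[:, j]  # one column per original row
end
Feather.write("DO_gm_ofa_unadj_alpr_ch1_t.feather", alt)
```

The original column names are lost in the transpose, so they would have to be saved separately if they are needed downstream.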
Do you mean that you couldn't install it, or that the features are not up to scratch in terms of speed or usage? I am interested to know what is failing, if you would be so kind as to volunteer your time to answer my question.