I converted Stata data to Julia, taking care to map each column to the equivalent data type wherever possible: Stata's byte type to Int8 in Julia, int to Int16, long to Int32, and so on. What I found was that a data set of about 600 megabytes in Stata uses roughly ten times as much memory in Julia. Even after all columns were converted to their equivalent types, the Julia data set used more than 6 gigabytes of memory (judging by the Windows Task Manager).
My questions are:
- Has anyone else experienced the same problem?
- Is this a mistake on my end, or overhead inherent in DataFrames + NullableArrays?
- Either way, is there a way to work around this memory problem?
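For reference, here is a minimal sketch of how I would try to isolate the overhead per column, rather than relying on Task Manager (which also counts Julia's own runtime and any temporaries not yet garbage-collected). This assumes Julia 0.6+ for `Base.summarysize` and the NullableArrays package; the variable names are just for illustration:

```julia
using NullableArrays

n = 1_000_000

# A plain Int8 column: should be about 1 byte per element.
plain = rand(Int8, n)

# The same data wrapped in a NullableArray: stores the values
# plus a Bool mask marking missing entries, so roughly 2 bytes
# per element is expected, not 10x.
nullable = NullableArray(plain)

println(Base.summarysize(plain))     # approx. n bytes
println(Base.summarysize(nullable))  # approx. 2n bytes (values + mask)
```

If `summarysize` on each column adds up to far less than what Task Manager reports, the blowup may come from intermediate copies made during conversion rather than from the final DataFrame itself; calling `gc()` after the conversion and re-checking would help distinguish the two.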