JLD load error for Dict

katdeane · November 24, 2020, 8:03pm

I am using dictionaries to store matrices of data under a series of tags such as the following:

"GroupIdentifier"=>Dict{Any,Any}("SubjectNumber"=>Dict{Any,Any}("StimulusType"=>Dict{Any,Any}("RecordingPlace"=>([dat1],[dat2]))))

I can save this full Dict as a .jld file, but when I try to load it, I get the following error message:

ERROR: stored type JLD.AssociativeWrapper{Core.Any,Core.Any,Base.Dict{Core.Any,Core.Any}} does not match currently loaded type

I tried to recreate this error in example code for you all, but when I made a small, simple Dict, save and load from the JLD package just worked.

As the data are matrices, I don’t think I can use DataFrames or any tables. I also think Arrow won’t work, as it would try to put each value in its own column and row instead of inserting the full matrix into a cell.

Can someone help me decipher what this error message might mean and/or point me towards a better way of either storing or saving my data? Thanks!

stillyslalom · November 24, 2020, 8:27pm

Instead of using a nested dict, could you flatten the data structure, e.g.

Dict((GroupID=3, SubjectNumber=201, StimulusType=:green) => (rand(3), rand(3)))

That allows you to store it in a DataFrame and should make the types concrete for easier serialization/deserialization.
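A minimal sketch of the flattening idea, using only Base Julia. The `flatten` helper and the field names `group`, `subject`, and `stim` are made up for illustration. It assumes exactly three levels of nesting with String tags and a tuple of two Float64 matrices at the bottom, as in the structure described in the question.

```julia
# Hypothetical helper: flatten Group => Subject => Stimulus => data
# into one Dict keyed by a NamedTuple of the three tags.
function flatten(nested::AbstractDict)
    K = NamedTuple{(:group, :subject, :stim), NTuple{3, String}}
    V = Tuple{Matrix{Float64}, Matrix{Float64}}
    flat = Dict{K, V}()   # concrete key and value types
    for (g, subjects) in nested, (s, stims) in subjects, (st, data) in stims
        flat[(group = g, subject = s, stim = st)] = data
    end
    return flat
end

# Small nested example in the shape described in the original question.
nested = Dict("1" => Dict("001" => Dict("green" => (rand(2, 4), rand(2, 4)),
                                        "blue"  => (rand(2, 4), rand(2, 4)))),
              "2" => Dict("001" => Dict("green" => (rand(2, 4), rand(2, 4)))))

flat = flatten(nested)

# "Chunking" becomes a filter over the keys, here: all green entries.
green = filter(p -> p.first.stim == "green", flat)
```

With a flat structure like this, regrouping by subject or group is the same kind of filter on a different key field, so the nesting order no longer dictates how the data can be sliced.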

katdeane · November 24, 2020, 8:55pm

Thanks for your response. I need to be able to chunk the data in different ways for analysis as I go on, though. Isn’t there a way to simply save the Dict as is with JLD, or is there another structure that would work better initially? As far as I can tell, the series of info tags wouldn’t be a problem in a DataFrame or whatnot, but what would be a problem is a dataset that takes more than one column and row.

It does seem that JLD2 and FileIO solve the problem, though, as the Dict looks the same going in and coming out with

using JLD2, FileIO
save("datafile.jld2","variname",Dictionary)
Variname = load("datafile.jld2")["variname"]

stillyslalom · November 24, 2020, 9:18pm

I’m not sure what you mean by “a dataset that takes more than one column and row”. Could you give an example?

charshaw · November 24, 2020, 9:20pm

I got a similar error yesterday when using JLD to save a DataFrame object. The weird thing is that this used to work just fine a few weeks ago.

Here’s a short reproducible example where I get an error:

using DataFrames
using JLD

ex_df = DataFrame(A = 1:4, B = ["M", "F", "F", "M"])
save("ex-file.jld", "ex_df", ex_df)

d = load("ex-file.jld")

I get an error message, which seems very similar to the error that @katdeane got regarding a stored type not matching the currently loaded type:

ERROR: stored type DataFrames.DataFrame does not match currently loaded type

What is weird is that I can get this error to happen within one REPL session so I’m not sure how to understand this error. I am not familiar with the inner workings of JLD, but I’m guessing that it must be a problem in a package update? Does anyone with more knowledge of JLD have a sense of what’s going on here? Maybe @tim.holy would have a better sense of this?

I should point out that using JLD2 and FileIO, as @katdeane suggested above, works fine and I do not receive the same error. So, I might just switch over to using JLD2. But it would be good to know where this error is coming from and what to do.

katdeane · November 25, 2020, 8:02am

I’m coming from Matlab (trying to get away from it) and might be thinking in terms of the structures I used there. I am also not a vet programmer and use this for neuro data analysis so I’m sorry for lack of clarity.

In Matlab I could create a struct where 1 column would hold cells of matrices of data in relation to the other identifier tags. If I try to export that as a csv for example, the formatting takes the matrix in the cell and spreads it out over rows and columns. So if I have Data(1).Stimtype = “green” and Data(1).Dat1 = {20 x 800 double} then changing it to a table or csv format would make Dat1 take 20 rows and 800 columns instead of just 1x1. Let me know if I’m just completely missing something, anything to simplify storing and chunking data is appreciated.
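For comparison with the Matlab struct array described above, here is a minimal sketch in Base Julia where one field of each record holds a whole matrix. The field names `Stimtype` and `Dat1` follow the Matlab example; the stimulus list and the random values are placeholders.

```julia
# Each record keeps its tag next to a full matrix; the matrix stays a
# single element of the record rather than being spread over rows and columns.
Data = [(Stimtype = stim, Dat1 = rand(20, 800)) for stim in ["green", "blue", "red"]]

size(Data[1].Dat1)   # (20, 800)

# Select records by tag, similar to logical indexing on a struct array.
green = filter(r -> r.Stimtype == "green", Data)
```

The spreading over rows and columns only happens on export to a text table such as CSV; in memory, an array can sit in a single field.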

Tamas_Papp · November 25, 2020, 9:10am

I am still not sure I understand the data structure (matrices in nested dictionaries?), but perhaps JSON could work.

Providing an MWE that generates example data would make it easier to suggest a solution.

katdeane · November 25, 2020, 10:26am

Of course, my bad. I couldn’t recreate the error with an MWE before, but I finally managed to once I also recreated the full data structure setup:

GroupID = ["1" "2"]
Subject = ["001" "002" "003"]
StimList = ["green" "blue" "red"]

Group = Dict()
for iGr = 1:length(GroupID)
    SubjectNumber = Dict()
    for iSu = 1:length(Subject)
        Stimtype = Dict()
        for iSt = 1:length(StimList)
            dat1 = rand(2,4)
            dat2 = rand(2,4)
            Stimtype[StimList[iSt]] = dat1, dat2
        end
        SubjectNumber[Subject[iSu]] = Stimtype
    end
    Group[GroupID[iGr]] = SubjectNumber
end

using JLD, HDF5
save("Group.jld",Group)
loadGroup = load("Group.jld")

Tamas_Papp · November 25, 2020, 11:56am

Thanks for the MWE. I found that JLD2 writes it out, but you may want to test it.

For this kind of data, I would consider just using the filesystem + CSV, e.g., save a table in 1/001/green.csv etc. in some subdirectory.
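A minimal sketch of that layout, assuming one matrix per file and using the DelimitedFiles standard library rather than CSV.jl. The temporary root directory and the example tags are placeholders.

```julia
using DelimitedFiles  # standard library; CSV.jl would be an alternative

# Hypothetical layout: <root>/<group>/<subject>/<stimulus>.csv, one matrix per file.
root = mktempdir()
dat = rand(2, 4)

path = joinpath(root, "1", "001", "green.csv")
mkpath(dirname(path))          # create 1/001/ if it does not exist
writedlm(path, dat, ',')       # write the matrix as comma-separated text

# Read it back as a Float64 matrix.
restored = readdlm(path, ',', Float64)
```

Since each stimulus in the MWE holds a tuple of two matrices, this approach would need two files per stimulus (for example with a suffix in the file name) or an extra directory level.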

charshaw · November 25, 2020, 3:46pm

Thanks for the replies, @Tamas_Papp. Yes, it seems that the errors that @katdeane and I are having are fixed with JLD2.

I am still confused about the error message I’m getting from JLD when trying to save and load a DataFrame. See my reply above for a very short MWE. The weird thing is that I can get this error to occur in a single REPL session. Any thoughts on this?

(P.S. sorry if I’m hijacking this thread too much. I can open a new thread or go to the JLD github page if that’s easier!)

katdeane · November 26, 2020, 8:23am

No worries @charshaw, I was also saving and loading in the same REPL session and the error message totally confused me. How could it not match a loaded type when it was sitting in my workspace? My issue is solved but I am still curious about how to interpret this.
