JSON3 to struct, how to improve deserialization performance?

Hi,

I’m trying to create an application to distribute. This application loads a JSON configuration file, and since I wanted a one-to-one binding between the application data and the JSON, I’m using the JSON3 package.

Here is my performance test for serialization and deserialization:

using TestJson.Models
using Test

objects = [
    AppleData(Vector3D(10.0, 5.0, 3.0)),
    CarData(Vector3D(7.0, 0.0, 0.0))
]
s = SceneData(objects)

# Call the serialize
json = serialize(s, true)
println(json)

# Encapsulate time into a function
time_deserialize(x) = @time deserialize(x, SceneData)

# Call the deserialize
s1 = time_deserialize(json)
println(s1)

The output is here :

{
    "name": "Default Scene",
    "objects": [
        {
            "type": "apple",
            "position": {
                "x": 10,
                "y": 5,
                "z": 3
            },
            "weight": 0.1
        },
        {
            "type": "car",
            "position": {
                "x": 7,
                "y": 0,
                "z": 0
            },
            "weight": 1000,
            "model": "Julia model S",
            "electric": false
        }
    ]
}
 5.498082 seconds (2.20 M allocations: 127.961 MiB, 0.80% gc time, 99.99% compilation time)
SceneData("Default Scene", AbstractObject[AppleData("apple", [10.0, 5.0, 3.0], 0.1), CarData("car", [7.0, 0.0, 0.0], 1000.0, "Julia model S", false)])

So it’s taking 5 seconds to convert the string to a SceneData. I noticed that compilation accounts for 99.99% of that time, so I guess something is not defined for some types.
However, I feel like I don’t yet have the expertise to understand which types are not resolved here, or how I could precompile every variant so that the first execution is quicker.

Here are all the models I’m using in this example:

########## AbstractObject.jl ###############
using StructTypes

abstract type AbstractObject end
StructTypes.StructType(::Type{AbstractObject}) = StructTypes.AbstractType()
StructTypes.subtypekey(::Type{AbstractObject}) = :type

 ############### Apple.jl  ###############
using JSON3

struct AppleData <: AbstractObject
    type::String
    position::Vector3D
    weight::Float64
end

AppleData(position::Vector3D) = AppleData("apple", position, 0.1)
JSON3.StructType(::Type{AppleData}) = JSON3.Struct()

 ############### Car.jl  ###############
using JSON3

struct CarData <: AbstractObject
    type::String
    position::Vector3D
    weight::Float64
    model::String
    electric::Bool
end

CarData(position::Vector3D) = CarData("car", position, 1000, "Julia model S", false)
JSON3.StructType(::Type{CarData}) = JSON3.Struct()

 ############### Vector3D.jl  ###############
using StructTypes
using StaticArrays

struct Vector3D <: FieldVector{3, Float64}
    x::Float64
    y::Float64
    z::Float64
end

Vector3D() = Vector3D(0.0, 0.0, 0.0)
StructTypes.StructType(::Type{Vector3D}) = StructTypes.Struct()

 ############### SceneData.jl  ###############
using JSON3

struct SceneData
    name::String
    objects::Vector{AbstractObject}
end

SceneData(objects::Vector{AbstractObject}) = SceneData("Default Scene", objects)
JSON3.StructType(::Type{SceneData}) = JSON3.Struct()

Then I define all the AbstractObject subtypes supported by JSON3 like this:

# Defining supported object type
StructTypes.subtypes(::Type{AbstractObject}) = (
    apple=AppleData,
    car=CarData
)
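To make the role of the `type` key concrete, here is a plain-Julia sketch of the lookup idea, without JSON3 or StructTypes. A NamedTuple maps the key value to a concrete type, which is then built from the remaining fields. The names `Item`, `Fruit`, `Vehicle`, `KNOWN_KINDS` and `build` are hypothetical, and this is not how StructTypes is implemented internally; it only illustrates that the concrete type is chosen at run time from the data, so the code for each concrete type has to be compiled before it can run.

```julia
abstract type Item end

struct Fruit <: Item
    weight::Float64
end

struct Vehicle <: Item
    weight::Float64
    electric::Bool
end

# Hypothetical lookup table, analogous in spirit to StructTypes.subtypes
const KNOWN_KINDS = (fruit = Fruit, vehicle = Vehicle)

# Pick the concrete type from the "type" entry, then fill its fields by name
function build(d::Dict{String,Any})
    T = KNOWN_KINDS[Symbol(d["type"])]
    args = (convert(fieldtype(T, n), d[String(n)]) for n in fieldnames(T))
    return T(args...)
end

a = build(Dict{String,Any}("type" => "fruit", "weight" => 0.1))
b = build(Dict{String,Any}("type" => "vehicle", "weight" => 1000, "electric" => false))
println(a)
println(b)
```

Because `T` is only known once the dictionary has been read, the compiler cannot specialize `build` on it ahead of time, which is one plausible reason why a first call over several subtypes spends most of its time compiling.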

Here are the serialize and deserialize functions:

# serialize data to string (quick enough)
function serialize(data, pretty::Bool=false)::String
    if (pretty)
        io = IOBuffer()
        JSON3.pretty(io, data)
        String(take!(io))
    else
        JSON3.write(data)
    end
end
# deserialize a string to the proper struct - way too slow
function deserialize(json_data::String, ::Type{T})::T where {T}
    JSON3.read(json_data, T)
end

Finally I try to precompile the deserialize function for SceneData like so :

# the second argument is the type itself, so its type is Type{SceneData}
precompile(deserialize, (String, Type{SceneData}))
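One detail worth checking in a `precompile` call: each entry of the tuple is the type of an argument, so an argument that is itself a type is written `Type{T}`. The sketch below uses a hypothetical stand-in function, `parse_as`, rather than the `deserialize` from this post, and prints what `precompile` returns (a `Bool`) for both spellings.

```julia
# Hypothetical stand-in for a function that takes a type as its second argument
parse_as(s::String, ::Type{T}) where {T} = parse(T, s)

# Describes a call like parse_as("42", 42): the second argument is an Int value
as_value = precompile(parse_as, (String, Int))

# Describes a call like parse_as("42", Int): the second argument is the type itself
as_type = precompile(parse_as, (String, Type{Int}))

println("(String, Int)       -> ", as_value)
println("(String, Type{Int}) -> ", as_type)
```

Even with the matching signature, `precompile` only covers what inference can reach from that signature, so code that is selected at run time may still be compiled on the first real call.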

Everything is defined in a package TestJson, which includes a module TestJson.Models that contains every struct definition.

If someone has some leads for optimizing the first call of deserialize, it would help a lot :slight_smile: !

This is actually a very simplified version of the object structures I want to support in my application. With a full configuration, the first execution takes up to 30 seconds. Since the configuration is loaded only once at startup, this is a real issue: there are no subsequent calls to deserialize that could benefit from the compiled code.

I hope this concrete example can help me learn more about Julia performance, and maybe some of you will be able to spot obvious mistakes in this implementation :sweat_smile:

This is probably the recommended approach: PrecompileTools.jl


Hi, thank you for the recommendation, I will take a look at it!

OK, it’s way better than before using this approach :grinning:

Here is the resulting code :

module TestJson

include("Models/Models.jl")

using PrecompileTools: @setup_workload, @compile_workload    # this is a small dependency
using .Models

@setup_workload begin
    # Putting some things in `@setup_workload` instead of `@compile_workload` can reduce the size of the
    # precompile file and potentially make loading faster.

    objects = [
        # Vector3D as defined above is not parametric, so it is called without {Float64}
        CarData(Vector3D(7.0, 0.0, 0.0)),
        AppleData(Vector3D(10.1, 5.0, 3.0))
    ]
    s = SceneData(objects)

    # Call the serialize
    json = serialize(s)

    @compile_workload begin
        # all calls in this block will be precompiled, regardless of whether
        # they belong to your package or not (on Julia 1.8 and higher)

        deserialize(json, SceneData)
    end
end

end # module TestJson

The first call is now way faster than before (it was 5 seconds):
[image: timing output of the first call]

But the second call is still way faster than the first:
[image: timing output of the second call]

On the first call there are still a lot of allocations, and compilation still accounts for 96.23% of the execution time.

Is there a way to see a breakdown of this 96% of compilation time?

That would help me understand what exactly is being compiled on the first call, because right now the code inside “@setup_workload / @compile_workload” is exactly the same as the code in the test. I would have expected that, with the exact same code, everything would already be compiled at this point.

From a performance point of view this is good enough for my application, so in that sense the issue is solved. But from an understanding point of view, I feel like I’m missing something. Are there tools or practices you could recommend for profiling and analyzing Julia execution in general?
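As a first, coarse check before reaching for dedicated tools, the first and second call of a function can be timed separately; the gap is roughly what compilation costs in the current session. This is a minimal sketch with a hypothetical workload, `sum_of_squares`, not the deserialize call from this post.

```julia
# Hypothetical workload; any function not yet called in this session would do
sum_of_squares(v) = sum(abs2, v)

data = collect(1.0:1000.0)

first_call  = @elapsed sum_of_squares(data)   # includes any compilation
second_call = @elapsed sum_of_squares(data)   # reuses the compiled code

println("first call:  ", first_call, " s")
println("second call: ", second_call, " s")

# To list the method signatures compiled at run time, Julia can be started with:
#   julia --trace-compile=stderr script.jl
```

The signatures printed by `--trace-compile` are the ones that were not available from precompilation, which gives a starting point for deciding what to add to a compile workload.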

Thanks @cjdoris!


If you want to delve deeper, there is SnoopCompile.jl (GitHub - timholy/SnoopCompile.jl: Making packages work faster with more extensive precompilation).

I’m no expert, but I believe precompilation doesn’t always compile everything you ever need because of method invalidation - as you load more code (modules, packages, etc) some of the methods you already compiled become invalidated because some function down in the call stack got a new method.
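A tiny illustration of invalidation, under the assumption that it behaves here the same way it would across packages: code compiled for `report(::Int)` relies on the only `describe` method that exists at the time, and adding a more specific method afterwards forces that code to be compiled again. The function names `describe` and `report` are hypothetical.

```julia
# Hypothetical functions used only for this illustration
describe(x) = "generic"
report(x) = describe(x)

before = report(1)   # compiles report(::Int) against the generic method

# A new, more specific method arrives later (as when another package is loaded)
describe(x::Int) = "integer"

after = report(1)    # the earlier compiled code is no longer valid and is recompiled

println(before)
println(after)
```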

Though TBH I wouldn’t worry too much about 8ms of compilation time each time you load your module - that seems pretty negligible.

Thanks, I will have a look at this package. I consider this issue solved given the current timing!