How to use WriteVTK properly when saving continous simulation results?

Hello!

I would just like to check that I am doing this right. I use WriteVTK a lot and really love it.

I run a simulation in which every N time steps I output a .vtp file. The ‘pseudo’ code I use looks like this:

# This function uses WriteVTK to produce a simple ParaView file for visualization
    # Make sure to use "Point Gaussian" and select something other than "Solid Color" to see the particles!
    function create_vtp_file(SimulationMetaData,InputData,SimulationData)
        @unpack SaveLocation,SimulationName,Iteration = SimulationMetaData
        @unpack ρ₀,c₀,γ = InputData

        # Convert the particle positions and densities into the format required by the vtk_grid function:
        points = reduce(hcat,SimulationData.Position)  # Concatenate the particle positions into a single matrix
        polys = empty(MeshCell{WriteVTK.PolyData.Polys,UnitRange{Int64}}[])
        verts = empty(MeshCell{WriteVTK.PolyData.Verts,UnitRange{Int64}}[])

        # Note: the order of verts, lines, polys and strips is not important.
        # One doesn't even need to pass all of them.
        all_cells = (verts, polys)

        filename  = SaveLocation*"/"*SimulationName*"_"*lpad(Iteration,4,"0")
        # Create a .vtp file with the particle positions and densities:
        vtk_grid(filename, points, all_cells..., compress = false) do vtk

            # Add the particle densities as a point data array:
            vtk_point_data(vtk, SimulationData.Kernel, "Wi")
            vtk_point_data(vtk, SimulationData.KernelGradient, "Wg")
            vtk_point_data(vtk, SimulationData.Density, "Density")
            vtk_point_data(vtk, Pressure.(SimulationData.Density,c₀,γ,ρ₀), "Pressure")
            vtk_point_data(vtk, SimulationData.Acceleration, "Acceleration")
            vtk_point_data(vtk, SimulationData.Velocity, "Velocity")
        end
    end

    function OutputVTP(SimulationMetaData,SimulationConstants,FinalResults)
        #@printf "Iteration %i | dt = %.5e \n" SimulationMetaData.Iteration SimulationMetaData.CurrentTimeStep
        if SimulationMetaData.Iteration % SimulationMetaData.OutputIteration == 0
            create_vtp_file(SimulationMetaData,SimulationConstants,FinalResults)
        end
    end

When I profile my code I see that outputting the .vtp file takes up some time.

The arrays I save are either StaticArrays or vector{float}. Since I only output a file every N = 50 iteration in a 10000 iteration simulation, I feel that it taking 7% of total time is a bit much no?

Maybe I am just a bit ‘greedy’ for performance, which would also be good to know.

Kind regards

Writing data to disk can take a good amount of time in a simulation :slight_smile:

In your example, you might be able to directly pass SimulationData.Position (assuming it’s a vector of SVector) and avoid some allocations, but I don’t think that will give you a huge performance boost.

There may be some possible performance improvements in WriteVTK, but we would need to profile in more detail to know which WriteVTK functions can be optimised. Feel free to open an issue where we can look at this.

For now, the only suggestion I can give you is to replace your create_vtp_file function with a function that dumps the same data to binary files (or HDF5), and see how performance differs. If the difference is small, it means that performance is really limited by I/O, and there’s not much that can be done in WriteVTK.

1 Like

Hi!

Yes, SimulationData.Position is a vector of SVector - what do you mean by directly pass? Do you mean that I can generate a vtkgrid once and then afterwards update data arrays how I want? If yes, would you kindly show me how.

Good point about profiling WriteVTK it self.

Interesting point about HDF5/binary, I was thinking (mistakenly) that the data in WriteVTK was already prescribed to be in binary format. The vtk suite is just a bit too good with Paraview etc. so even if I save to HDF5 I am not sure it will have much use for me… thanks for the suggestion though and I have to think a bit.

Kind regards :slight_smile:

I just meant that you should be able to do:

vtk_grid(filename, SimulationData.Position, all_cells..., compress = false) do vtk

without needing to create an intermediate points array. But that shouldn’t make a big difference in performance.

Sorry if I wasn’t clear. By default WriteVTK does write binary data. I suggested using other binary formats just as a way of detecting performance issues. If dumping the same data to raw binary files is considerably faster than saving it to VTK, then it means that there may be improvements to be made in WriteVTK.

1 Like

I was indeed able to do: vtk_grid(filename, SimulationData.Position, all_cells..., compress = false) do vtk

Great catch! It didn’t do too much for performance:

But cleaner code is always much appreciated.

Got it, thanks for the further explanation of binary/fileformat.

Kind regards

Hello!

I am trying to see if I can make a version which allocates less. I’ve made a test-script here:

#https://www.analytech-solutions.com/analytech-solutions/blog/binary-io.html
using XML
using XML: Document, Declaration, Element, Text
using Base64: base64encode
using StaticArrays
using WriteVTK
const IS_LITTLE_ENDIAN = ENDIAN_BOM == 0x04030201

N              = 1
Points = [SVector{3,Float64}(1,2,3)]
#Points         = rand(SVector{3,Float64},N)
# Kernel         = rand(Float64,N)
# KernelGradient = rand(SVector{3,Float64},N)

xml_doc = Document(Declaration(version=1.0,encoding="utf-8"))
vtk_file = Element("VTKFile")
vtk_file.attributes["type"]        = "PolyData"
vtk_file.attributes["version"]     = "1.0"
vtk_file.attributes["byte_order"]  = "LittleEndian"
vtk_file.attributes["header_type"] = "UInt64"
push!(xml_doc,vtk_file)

polydata  = Element("PolyData")
piece     = Element("Piece")
piece.attributes["NumberOfPoints"] = string(N)
# piece.attributes["NumberOfVerts"] = "0"
# piece.attributes["NumberOfPolys"] = "0"
points    = Element("Points")
dataarray = Element("DataArray")
dataarray.attributes["type"]                = "Float64"
dataarray.attributes["Name"]                = "Points"
dataarray.attributes["NumberOfComponents"]  = "3"
dataarray.attributes["format"]              = "appended"
dataarray.attributes["offset"]              = "0"

push!(points,dataarray)
push!(piece,points)
push!(polydata,piece)
push!(vtk_file,polydata)


pointdata  = Element("PointData")
dataarray1 = Element("DataArray")
dataarray1.attributes["type"]                = "Float64"
dataarray1.attributes["Name"]                = "Kernel"
dataarray1.attributes["NumberOfComponents"]  = "1"
dataarray1.attributes["format"]              = "appended"
dataarray1.attributes["offset"]              = "0"

dataarray2 = Element("DataArray")
dataarray2.attributes["type"]                = "Float64"
dataarray2.attributes["Name"]                = "KernelGradient"
dataarray2.attributes["NumberOfComponents"]  = "3"
dataarray2.attributes["format"]              = "appended"
dataarray2.attributes["offset"]              = "0"

# push!(pointdata, dataarray1, dataarray2)
# push!(piece,pointdata)

appendeddata = Element("AppendedData")
appendeddata.attributes["encoding"] = "raw"

push!(vtk_file,appendeddata)

        # Initialize containers for VTK data structure
        polys = empty(MeshCell{WriteVTK.PolyData.Polys, UnitRange{Int64}}[])
        verts = empty(MeshCell{WriteVTK.PolyData.Verts, UnitRange{Int64}}[])
        all_cells = (verts, polys)

io = IOBuffer()
write(io,"\n")
write(io,"_")
write(io,Char(24))
write(io,Char(0)^7)
write(io,Points)
write(io,"\n")
v = join(Char.(take!(io)))
t = Text(v)
print("This is the result from custom writer:    ", v[2:end]) #Skip newline
push!(appendeddata,t)

XML.write(raw"E:\SPH\TestOfFile.vtp",xml_doc)

# Overwrite a bit of WriteVTK to produce simplified working file

function WriteVTK.vtk_grid(dtype::WriteVTK.VTKPolyData, filename::AbstractString,
        points::WriteVTK.UnstructuredCoords,
        cells::Vararg{AbstractArray{<:WriteVTK.PolyCell}}; kwargs...)
        Npts = WriteVTK.num_points(dtype, points)
        Ncls = sum(length, cells)

        xvtk = WriteVTK.XMLDocument()
        vtk = WriteVTK.DatasetFile(dtype, xvtk, filename, Npts, Ncls; kwargs...)

        xroot = WriteVTK.vtk_xml_write_header(vtk)
        xGrid = WriteVTK.new_child(xroot, vtk.grid_type)

        xPiece = WriteVTK.new_child(xGrid, "Piece")
        WriteVTK.set_attribute(xPiece, "NumberOfPoints", Npts)

        xPoints = WriteVTK.new_child(xPiece, "Points")
        WriteVTK.data_to_xml(vtk, xPoints, points, "Points", 3)

        vtk
end


function WriteVTK.data_to_xml_appended(vtk::WriteVTK.DatasetFile, xDA::WriteVTK.XMLElement, data)
        @assert vtk.appended
    
        buf = vtk.buf    # append buffer
        buf_check = IOBuffer()    # append buffer
        #compress = vtk.compression_level > 0
        compress = false
    
        # DataArray node
        WriteVTK.set_attribute(xDA, "format", "appended")
        WriteVTK.set_attribute(xDA, "offset", position(buf))
    
        # Size of data array (in bytes).
        nb = WriteVTK.sizeof_data(data)
    
        # if compress
        #     initpos = position(buf)
    
        #     # Write temporary data that will be replaced later with the real header.
        #     let header = ntuple(d -> zero(WriteVTK.HeaderType), Val(4))
        #         write(buf, header...)
        #     end
    
        #     # Write compressed data.
        #     zWriter = WriteVTK.ZlibCompressorStream(buf, level=vtk.compression_level)
        #     WriteVTK.write_array(zWriter, data)
        #     write(zWriter, WriteVTK.TranscodingStreams.TOKEN_END)
        #     flush(zWriter)
        #     WriteVTK.TranscodingStreams.finalize(zWriter.codec) # Release allocated resources (issue #43)
    
        #     # Go back to `initpos` and write real header.
        #     endpos = position(buf)
        #     compbytes = endpos - initpos - 4 * sizeof(WriteVTK.HeaderType)
        #     let header = WriteVTK.HeaderType.((1, nb, nb, compbytes))
        #         seek(buf, initpos)
        #         write(buf, header...)
        #         seek(buf, endpos)
        #     end
        # else
            write(buf, WriteVTK.HeaderType(nb))  # header (uncompressed version)
            nb_write = WriteVTK.write_array(buf, data)
            @assert nb_write == nb

            # buf_check to output
            write(buf_check, WriteVTK.HeaderType(nb))  # header (uncompressed version)
            nb_write = WriteVTK.write_array(buf_check, data)
            @assert nb_write == nb
            println("This is the result from WriteVTK:         _", join(Char.(take!(buf_check))))
        # end
    
        xDA
    end
    

# Initialize containers for VTK data structure
# Create a .vtp file with the specified positions
vtk_grid(raw"E:\SPH\TestOfFileWriteVTK.vtp", Points, all_cells...) do vtk
end;

I am focusing on PolyData and appended format, since that is my use case. I am having the issue that I am struggling on how to exactly write the raw data in AppendedTag. When I look at both your and mine implementation in Julia it looks good:

image

When I look at in NotePad++ remembering to use the same encoding (ANSI, just to visualize):

There is then a difference. As a test, I am trying to get Vector(SVector{3,Float64}(1,2,3)) to work.

Could you help me just getting this step done, then I should be able to test my less allocating idea.

In the file script one must specify save locations for the WriteVTK file and the custom save file.

Kind regards

Hi, I don’t know if you found a solution, but the difference may be explained by a workaround used in WriteVTK to write binary data to XML files (which wasn’t possible with the LightXML.jl package when the workaround was written, which is many years ago). I don’t know what XML.jl does in this case, I haven’t tried.

You can see the details here. Basically, instead of directly writing the XML file to disk, we convert everything but the binary data to a string. Then we write a new file where we write this string, but also dumping the binary data inside an <AppendedData> section.

1 Like

Hey!

Yes, I ended up getting it to work, I included an example script here, perhaps of interest:

Summary

Same script

#https://www.analytech-solutions.com/analytech-solutions/blog/binary-io.html
using XML
using XML: Document, Declaration, Element, Text
using StaticArrays

### Functions=================================================
# Function to create a DataArray element for VTK files
function create_data_array_element(name::String, data::AbstractVector{T}, offset::Int) where T
    # Create the DataArray elements
    dataarray = Element("DataArray")
    
    # Set attributes based on the input vector's type
    dataarray.attributes["type"]               = string(eltype(first(data)))
    dataarray.attributes["Name"]               = name  
    dataarray.attributes["NumberOfComponents"] = string(Int(sizeof(first(data))/sizeof(eltype(first(data)))))
    dataarray.attributes["format"]             = "appended"
    dataarray.attributes["offset"]             = string(offset)
    return dataarray
end

###===========================================================
# Points         = [SVector{3,Float64}(1,2,3), SVector{3,Float64}(4,5,6)]
# Kernel         = Float64.([100, 200]) 
# KernelGradient = [SVector{3,Float64}(-1,1,0), SVector{3,Float64}(1,-1,0)]
N              = 6195
Points         = rand(SVector{3,Float64},N) * 10
Kernel         = rand(Float64,N) * 1000
KernelGradient = rand(SVector{3,Float64},N) * 100


function PolyDataTemplate(filename::String, points, variable_names, args...)
        # Generate the XML document and then put in some fixed values
        xml_doc = Document(Declaration(version=1.0,encoding="utf-8"))
        vtk_file = Element("VTKFile")
        vtk_file.attributes["type"]        = "PolyData"
        vtk_file.attributes["version"]     = "1.0"
        vtk_file.attributes["byte_order"]  = "LittleEndian"
        vtk_file.attributes["header_type"] = "UInt64"

        # PolyData is the main section, filling it out
        polydata  = Element("PolyData")
        piece     = Element("Piece")
        N = length(points)
        piece.attributes["NumberOfPoints"] = string(N)

        # This Points element and its associated DataArray has to be constructed individually
        points_element    = Element("Points")
        point_dataarray = create_data_array_element("Points",points,0)
        point_dataarray["offset"] = 0

        
        # Generate appended data element
        appendeddata = Element("AppendedData")
        appendeddata.attributes["encoding"] = "raw"

        # Start writing the file and generating the correct dataarrays with the right offsets in the loop
        NB = 0
        io = IOBuffer()
        write(io,"\n_")
        UncompressedHeaderN  = N * length(first(points)) *  sizeof(typeof(first(points)))
        NB += write(io, UncompressedHeaderN)
        NB += write(io, points)

        # Generate XML tags for kwargs data
        pointdata  = Element("PointData")
        dataarrays = Vector{XML.Node}(undef,length(args))

        for i in eachindex(args)
            arg           = args[i]
            dataarrays[i] = create_data_array_element(variable_names[i],arg,NB)

            A             = typeof(first(arg))
            T             = eltype(A)
            Ni            = length(arg)
            Tsz           = sizeof(T)
            Nc            = Int( sizeof(A) / Tsz )
            HowManyBytes  = Tsz*Nc*Ni + Tsz

            NB           += HowManyBytes

            write(io, NB)
            write(io, arg)
        end

        # Take the result from the buffer, turn to string and write it
        v = take!(io)
        t = Text(String(v))
        write(io,"\n")
        push!(appendeddata,t)
        close(io)

        # Glue all xml pieces together
        push!(xml_doc,vtk_file)
        push!(points_element,point_dataarray)
        push!(piece,points_element)
        push!(polydata,piece)
        push!(vtk_file,polydata)
        map(x -> push!(pointdata,x), dataarrays)
        push!(piece,pointdata)
        push!(vtk_file,appendeddata)

        XML.write(filename,xml_doc)
end

save_location = raw"E:\SPH\TestOfFile.vtp"

d = @report_opt target_modules=(@__MODULE__,) PolyDataTemplate(save_location, Points, ["Kernel", "KernelGradient"], Kernel, KernelGradient)
println(d)

@profview PolyDataTemplate(save_location, Points, ["Kernel", "KernelGradient"], Kernel, KernelGradient)

b = @benchmark PolyDataTemplate($save_location, $Points, $(["Kernel", "KernelGradient"]), $Kernel, $KernelGradient)
display(b)

@code_warntype PolyDataTemplate(save_location, Points, ["Kernel", "KernelGradient"], Kernel, KernelGradient)

PolyDataTemplate(save_location, Points, ["Kernel", "KernelGradient"], Kernel, KernelGradient)

It is about 10% faster than WriteVTK, with same memory foot-print, but lot less flexibility etc. only this one format the way I want :slight_smile: I was not able to figure out how to nicely preallocate the whole file and buffer and then directly update the buffer to avoid allocs when using XML.jl so I took a break from it…

Let me know what you think, quite curious about your opinion

Kind regards