Reliably make a clean copy of an XML node

I am using XML.jl.

I have a basic XML node that I use as a base from which I wish to construct many slightly different instances. On each occasion, I want to copy the basic node and build from there. The Readme page for XML.jl says:

Node is an immutable type. However, you can easily create a copy with one or more field values changed by using the Node(::Node; kw...) constructor where kw are the fields you want to change.

I don’t want to change any fields, so I just tried new_node = XML.Node(old_node).

I then build up the rest of the new_node as I want it.

The consequence is that the changes I make to new_node also propagate back to old_node, specifically to its child nodes.

I have experimented with many different ways of specifying the kw... options in XML.Node but to no avail. For example, I tried variations on new_node=XML.Node(old_node, children=[x for x in XML.children(old_node)]) to try to copy the child nodes, too.
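The symptoms above are what one would expect if the copy and the original end up holding the same children vector. The sketch below reproduces that pattern with a hypothetical stand-in type (`MiniNode`, `shallow` and `onelevel` are made up for illustration and are not part of XML.jl). It also shows why copying only the top-level vector is not enough: the child nodes are still the same objects, so their own children remain shared.

```julia
# Hypothetical stand-in for an immutable tree node; NOT XML.jl's actual type.
struct MiniNode
    tag::String
    children::Vector{MiniNode}
end

leaf(tag) = MiniNode(tag, MiniNode[])

# Builds a new struct but reuses the very same children vector.
shallow(n::MiniNode) = MiniNode(n.tag, n.children)

# Builds a fresh top-level vector, but the child nodes inside it are reused.
onelevel(n::MiniNode) = MiniNode(n.tag, MiniNode[c for c in n.children])

base1 = MiniNode("rule", [MiniNode("iconSet", [leaf("cfvo")])])
a = shallow(base1)
push!(a.children, leaf("extra"))
length(base1.children)               # 2: the push is visible through base1

base2 = MiniNode("rule", [MiniNode("iconSet", [leaf("cfvo")])])
b = onelevel(base2)
push!(b.children, leaf("extra"))
length(base2.children)               # 1: the top level is now independent
push!(b.children[1].children, leaf("cfvo"))
length(base2.children[1].children)   # 2: the grandchildren are still shared
```

An immutable struct only guarantees that its fields cannot be rebound; a mutable vector stored in a field can still be modified through any reference to it.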

Two ways I have found to make this work are to try deepcopy(old_node) and XML.parse(Node, XML.write(old_node))[1].

The first of these seems not to be recommended (“A brief conversation of deepcopy vs copy in Julia and why deepcopy should be avoided unless you know you have a good reason to use it (serialization-like)” (here)).

The second is very inefficient, because it serializes the node to a string and then parses that string back into a node.
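For reference, the difference between copy and deepcopy in Base Julia can be seen on plain arrays, with no XML involved. This is only a small illustration of the two functions, not a statement about how XML.jl nodes behave.

```julia
inner = [1, 2]
outer = [inner, inner]   # two references to the same inner vector

c = copy(outer)          # new outer vector, same inner vectors
d = deepcopy(outer)      # everything reachable from outer is duplicated

c[1] === inner           # true: copy is shallow
d[1] === inner           # false: deepcopy made a new inner vector
d[1] === d[2]            # true: deepcopy preserves sharing inside the result
```

To preserve sharing like this, deepcopy has to keep track of every object it has already copied, which is probably part of why it costs more than a hand-written recursive copy.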

MWE below the fold
using XML
copynode(o)=Node(o)
deepcopynode(o) = deepcopy(o)
parsenode(o) = XML.parse(Node, XML.write(o))[1]
function createRule()
    rule = XML.Element("x14:cfRule", type="iconSet", priority="1", id="XXXX-xxxx-XXXX")
    icon = XML.Element("x14:iconSet", iconSet="3Arrows", custom="1")
    cfvo = XML.Element("x14:cfvo", type="percent")
    push!(cfvo, XML.Element("xm:f", XML.Text("0")))
    push!(icon, cfvo)
    push!(rule, icon)
    return rule
end
function createCfx(rule, f)
    cfx=f(rule)
    cfvo = XML.Element("x14:cfvo", type="percent")
    push!(cfvo, XML.Element("xm:f", XML.Text("dummy")))
    push!(cfx[1], f(cfvo))
    push!(cfx[1], f(cfvo))
    push!(cfx[1], f(cfvo))
    cfx[1]["iconSet"] = "4Arrows"
    return cfx
end

println("\nusing XML.Node()")
rule = createRule()
println("\n `rule` before:\n",XML.write(rule))
cfx=createCfx(rule, copynode)
println("\n `cfx`:\n",XML.write(cfx))
println("\n `rule` after:\n",XML.write(rule))

println("\nusing deepcopy()")
rule = createRule()
println("\n `rule` before:\n",XML.write(rule))
cfx=createCfx(rule, deepcopynode)
println("\n `cfx`:\n",XML.write(cfx))
println("\n `rule` after:\n",XML.write(rule))

println("\nusing XML.parse()")
rule = createRule()
println("\n `rule` before:\n",XML.write(rule))
cfx=createCfx(rule, parsenode)
println("\n `cfx`:\n",XML.write(cfx))
println("\n `rule` after:\n",XML.write(rule))

Output:

using XML.Node()

 `rule` before:
<x14:cfRule type="iconSet" priority="1" id="XXXX-xxxx-XXXX">
  <x14:iconSet iconSet="3Arrows" custom="1">
    <x14:cfvo type="percent">
      <xm:f>0</xm:f>
    </x14:cfvo>
  </x14:iconSet>
</x14:cfRule>

 `cfx`:
<x14:cfRule type="iconSet" priority="1" id="XXXX-xxxx-XXXX">
  <x14:iconSet iconSet="4Arrows" custom="1">
    <x14:cfvo type="percent">
      <xm:f>0</xm:f>
    </x14:cfvo>
    <x14:cfvo type="percent">
      <xm:f>dummy</xm:f>
    </x14:cfvo>
    <x14:cfvo type="percent">
      <xm:f>dummy</xm:f>
    </x14:cfvo>
    <x14:cfvo type="percent">
      <xm:f>dummy</xm:f>
    </x14:cfvo>
  </x14:iconSet>
</x14:cfRule>

 `rule` after:
<x14:cfRule type="iconSet" priority="1" id="XXXX-xxxx-XXXX">
  <x14:iconSet iconSet="4Arrows" custom="1">
    <x14:cfvo type="percent">
      <xm:f>0</xm:f>
    </x14:cfvo>
    <x14:cfvo type="percent">
      <xm:f>dummy</xm:f>
    </x14:cfvo>
    <x14:cfvo type="percent">
      <xm:f>dummy</xm:f>
    </x14:cfvo>
    <x14:cfvo type="percent">
      <xm:f>dummy</xm:f>
    </x14:cfvo>
  </x14:iconSet>
</x14:cfRule>

using deepcopy()

 `rule` before:
<x14:cfRule type="iconSet" priority="1" id="XXXX-xxxx-XXXX">
  <x14:iconSet iconSet="3Arrows" custom="1">
    <x14:cfvo type="percent">
      <xm:f>0</xm:f>
    </x14:cfvo>
  </x14:iconSet>
</x14:cfRule>

 `cfx`:
<x14:cfRule type="iconSet" priority="1" id="XXXX-xxxx-XXXX">
  <x14:iconSet iconSet="4Arrows" custom="1">
    <x14:cfvo type="percent">
      <xm:f>0</xm:f>
    </x14:cfvo>
    <x14:cfvo type="percent">
      <xm:f>dummy</xm:f>
    </x14:cfvo>
    <x14:cfvo type="percent">
      <xm:f>dummy</xm:f>
    </x14:cfvo>
    <x14:cfvo type="percent">
      <xm:f>dummy</xm:f>
    </x14:cfvo>
  </x14:iconSet>
</x14:cfRule>

 `rule` after:
<x14:cfRule type="iconSet" priority="1" id="XXXX-xxxx-XXXX">
  <x14:iconSet iconSet="3Arrows" custom="1">
    <x14:cfvo type="percent">
      <xm:f>0</xm:f>
    </x14:cfvo>
  </x14:iconSet>
</x14:cfRule>

using XML.parse()

 `rule` before:
<x14:cfRule type="iconSet" priority="1" id="XXXX-xxxx-XXXX">
  <x14:iconSet iconSet="3Arrows" custom="1">
    <x14:cfvo type="percent">
      <xm:f>0</xm:f>
    </x14:cfvo>
  </x14:iconSet>
</x14:cfRule>

 `cfx`:
<x14:cfRule type="iconSet" priority="1" id="XXXX-xxxx-XXXX">
  <x14:iconSet iconSet="4Arrows" custom="1">
    <x14:cfvo type="percent">
      <xm:f>0</xm:f>
    </x14:cfvo>
    <x14:cfvo type="percent">
      <xm:f>dummy</xm:f>
    </x14:cfvo>
    <x14:cfvo type="percent">
      <xm:f>dummy</xm:f>
    </x14:cfvo>
    <x14:cfvo type="percent">
      <xm:f>dummy</xm:f>
    </x14:cfvo>
  </x14:iconSet>
</x14:cfRule>

 `rule` after:
<x14:cfRule type="iconSet" priority="1" id="XXXX-xxxx-XXXX">
  <x14:iconSet iconSet="3Arrows" custom="1">
    <x14:cfvo type="percent">
      <xm:f>0</xm:f>
    </x14:cfvo>
  </x14:iconSet>
</x14:cfRule>

Does anyone have any suggestions?

Thank you for the suggestion @Pamela329Lac.

I did try copynode(o) = XML.Node(o, children=[XML.Node(x) for x in o.children]), but this gives

 `rule` after:
<x14:cfRule type="iconSet" priority="1" id="XXXX-xxxx-XXXX">
  <x14:iconSet iconSet="4Arrows" custom="1">
    <x14:cfvo type="percent">
      <xm:f>0</xm:f>
    </x14:cfvo>
    <x14:cfvo children="Node[Node Element <xm:f> (1 child)]" type="percent">
      <xm:f>dummy</xm:f>
    </x14:cfvo>
    <x14:cfvo children="Node[Node Element <xm:f> (1 child)]" type="percent">
      <xm:f>dummy</xm:f>
    </x14:cfvo>
    <x14:cfvo children="Node[Node Element <xm:f> (1 child)]" type="percent">
      <xm:f>dummy</xm:f>
    </x14:cfvo>
  </x14:iconSet>
</x14:cfRule>

This suggests that Node thinks that children=... is a keyword argument specifying a children attribute - which it isn’t.

copynode(o)=XML.Node(o, [XML.Node(x) for x in o.children]) does no better:

 `rule` after:
<x14:cfRule type="iconSet" priority="1" id="XXXX-xxxx-XXXX">
  <x14:iconSet iconSet="3Arrows" custom="1">
    <x14:cfvo type="percent">
      <xm:f>0</xm:f>
    </x14:cfvo>
    <x14:cfvo type="percent">
      <xm:f>dummy</xm:f>
      Node[Node Element <xm:f> (1 child)]
    </x14:cfvo>
    <x14:cfvo type="percent">
      <xm:f>dummy</xm:f>
      Node[Node Element <xm:f> (1 child)]
    </x14:cfvo>
    <x14:cfvo type="percent">
      <xm:f>dummy</xm:f>
      Node[Node Element <xm:f> (1 child)]
    </x14:cfvo>
  </x14:iconSet>
</x14:cfRule>

Tim, I have no answer to your question but because this has happened to you for the second time in the last couple of days I thought I’d mention that these plausible-sounding replies from Firstname[Number]Lastname users are GenAI spam, so there’s little point engaging with them.

(If you do find them helpful of course you might want to try simply talking to one of the better models directly before posting here!)


Understood! The like was to try to be welcoming :slightly_smiling_face:


On a random walk through options, I seem to have hit upon

copynode(o)=XML.Node(o.nodetype, o.tag, o.attributes, o.value, isnothing(o.children) ? nothing : [copynode(x) for x in o.children])

This recursively copies a node and all of its children.
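The same recursive idea can be sketched and checked without XML.jl, using a hypothetical stand-in type (`TreeNode`, `copytree` and `sametree` are illustrative names, not library functions). The sketch also copies the attribute dictionary; that is an extra precaution taken here and says nothing about how the XML.Node constructor treats the attributes it is given.

```julia
# Hypothetical stand-in with the general shape of an XML element; NOT XML.jl's type.
struct TreeNode
    tag::String
    attributes::Dict{String,String}
    children::Vector{TreeNode}
end

# Recursive copy: a fresh attribute dict and a fresh children vector at every level.
copytree(n::TreeNode) =
    TreeNode(n.tag, copy(n.attributes), TreeNode[copytree(c) for c in n.children])

# Structural equality, used to compare the result with deepcopy.
sametree(a::TreeNode, b::TreeNode) =
    a.tag == b.tag && a.attributes == b.attributes &&
    length(a.children) == length(b.children) &&
    all(sametree(x, y) for (x, y) in zip(a.children, b.children))

original = TreeNode("iconSet", Dict("iconSet" => "3Arrows"),
                    [TreeNode("cfvo", Dict("type" => "percent"), TreeNode[])])

matches_deepcopy = sametree(copytree(original), deepcopy(original))   # true

dup = copytree(original)
push!(dup.children, TreeNode("cfvo", Dict("type" => "percent"), TreeNode[]))
dup.attributes["iconSet"] = "4Arrows"

length(original.children)        # 1: the original is untouched
original.attributes["iconSet"]   # "3Arrows"
```

Unlike deepcopy, a hand-written recursion like this does not preserve sharing or handle cycles, which is acceptable for a tree where each node has exactly one parent.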

As far as I can tell, this gives the same result as deepcopy but is quite a bit faster. Here, I’m just using an XML file representing an Excel worksheet (nested a few levels deep).

julia> using XML, BenchmarkTools, XLSX

julia> copynode(o)=XML.Node(o.nodetype, o.tag, o.attributes, o.value, isnothing(o.children) ? nothing : [copynode(x) for x in o.children])
copynode (generic function with 1 method)

julia> deepcopynode(o) = deepcopy(o)
deepcopynode (generic function with 1 method)

julia> function createCfx(rule, f)
           cfx=f(rule)
           cfvo = XML.Element("x14:cfvo", type="percent")
           push!(cfvo, XML.Element("xm:f", XML.Text("dummy")))
           push!(cfx[end], f(cfvo))
           push!(cfx[end], f(cfvo))
           push!(cfx[end], f(cfvo))
           cfx[1]["iconSet"] = "4Arrows"
           return cfx
       end
createCfx (generic function with 1 method)

julia> f=XLSX.opentemplate(raw"C:\Users\tim\OneDrive\Documents\Julia\XLSX\iconKey.xlsx")
XLSXFile("C:\Users\tim\OneDrive\Documents\Julia\XLSX\iconKey.xlsx") containing 1 Worksheet
            sheetname size          range
-------------------------------------------------
               Sheet1 4x13          A1:M4

julia> rule = XLSX.get_worksheet_xml_document(f[1])
Node Document (2 children)

julia> @benchmark cfx=createCfx($rule, $copynode)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):   86.100 μs …   7.979 ms  ┊ GC (min … max): 0.00% … 97.99%
 Time  (median):      93.900 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   109.910 μs ± 153.160 μs  ┊ GC (mean ± σ):  6.75% ±  5.06%

  86.1 μs          Histogram: frequency by time          159 μs <

 Memory estimate: 205.73 KiB, allocs estimate: 4520.

julia> rule2 = XLSX.get_worksheet_xml_document(f[1])
Node Document (2 children)

julia> cfx2=createCfx(rule2, deepcopynode)
Node Document (2 children)

julia> rule2 = XLSX.get_worksheet_xml_document(f[1])
Node Document (2 children)

julia> @benchmark cfx2=createCfx($rule2, $deepcopynode)
BenchmarkTools.Trial: 9872 samples with 1 evaluation per sample.
 Range (min … max):  438.400 μs …   4.798 ms  ┊ GC (min … max): 0.00% … 87.00%
 Time  (median):     456.900 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   505.863 μs ± 259.013 μs  ┊ GC (mean ± σ):  4.52% ±  7.66%

  438 μs        Histogram: log(frequency) by time        698 μs <

 Memory estimate: 537.81 KiB, allocs estimate: 12190.

julia> cfx=createCfx(rule, copynode)
Node Document (2 children)

julia> rule = XLSX.get_worksheet_xml_document(f[1])
Node Document (2 children)

julia> cfx=createCfx(rule, copynode)
Node Document (2 children)

julia> rule2 = XLSX.get_worksheet_xml_document(f[1])
Node Document (2 children)

julia> cfx2=createCfx(rule2, copynode)
Node Document (2 children)

julia> cfx==cfx2
true