Counting the number of paragraphs ("p") between containers ("div")

Just to be perfectly clear: if the structure is like this

**subhead**

subrow
  p
  p
  p

**subhead**

subrow
  p
  p
subrow
  p
subrow
  p

**subhead**

subrow
  p
subrow
  p

the desired output is [3,4,2]? Indentation means “child of”.
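To pin down the counting rule before touching any HTML, here is a minimal sketch on a flat encoding of the outline above. The `outline` vector and the name `count_per_subhead` are made up for illustration; each subrow simply carries the number of `p` children it has.

```julia
# Hypothetical flat encoding of the outline: a subhead starts a group,
# and each subrow records how many <p> children it contains.
outline = [
    (:subhead, 0), (:subrow, 3),
    (:subhead, 0), (:subrow, 2), (:subrow, 1), (:subrow, 1),
    (:subhead, 0), (:subrow, 1), (:subrow, 1),
]

function count_per_subhead(items)
    counts = Int[]
    for (kind, nparagraphs) in items
        if kind == :subhead
            push!(counts, 0)            # open a new group
        elseif !isempty(counts)
            counts[end] += nparagraphs  # add to the current group
        end
    end
    return counts
end

count_per_subhead(outline)  # [3, 4, 2]
```

Rows that appear before the first subhead are ignored here, which is one possible reading of the requirement.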

Maybe

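using Cascadia, Gumbo

function countp(body::HTMLNode)
   i = 0
   counts = Int[]
   for e in body.children
     # A "ttt-subhead" div opens a new group with its own counter.
     if e isa HTMLElement{:div} && get(e.attributes, "class", "")=="ttt-subhead"
       push!(counts, 0)
       i += 1
     end
     # Skip anything that comes before the first subhead.
     i == 0 && continue
     # Add the <p> children of any "ttt-row" div found in this node.
     counts[i] += length(eachmatch(Selector("div.ttt-row > p"), e))
   end
   counts
end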

I appreciate your help, but the code output is empty.

Could it be because there is a misunderstanding? The “ttt-subhead” divs are essentially “islands”, and so are the clusters of “div.ttt-row > p”.

Maybe because you passed in the root node, not the body? My bad. Poor naming.

HTML
myhtml = """
<!DOCTYPE html>
<html lang="en">

<head>
</head>

<body style="text-align: center;">

	 <div class="ttt-subhead"><span>8:00</span> Breakfast</div>


        <div class="ttt-row">
		<p>
			Eggs.
		</p>
		<p>
			Bacon.
		</p>
        </div>


	  <div class="ttt-subhead"><span>12:00</span> Lunch</div>

        <div class="ttt-row">
		<p>
			Burger.
		</p>
		<p>
			Fries.
		</p>
		<p>
			Coke.
		</p>
        </div>
        <div class="ttt-row">
		<p>
			Burger.
		</p>
		<p>
			Fries.
		</p>
		<p>
			Coke.
		</p>
        </div>

	  <div class="ttt-subhead"><span>18:00</span> Dinner</div>

        <div class="ttt-row">
		<p>
			Salad.
		</p>
        </div>

</body>

</html>
"""
dom = parsehtml(myhtml)
body = dom.root[2]

countp(body)
3-element Vector{Int64}:
 2
 6
 1
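Indexing with `dom.root[2]` relies on `<body>` being the second child of `<html>`. As a hedged alternative, the body can be looked up by tag name with a Cascadia selector. This is a minimal sketch on a shortened, made-up document, not the full page above.

```julia
using Cascadia, Gumbo

# Shortened stand-in for the real page (illustrative only).
html = """
<html><head></head><body>
<div class="ttt-subhead">Breakfast</div>
<div class="ttt-row"><p>Eggs.</p><p>Bacon.</p></div>
</body></html>
"""
dom = parsehtml(html)

# Look the <body> element up by tag name instead of by position.
body = first(eachmatch(sel"body", dom.root))

tag(body)                                      # :body
length(eachmatch(sel"div.ttt-row > p", body))  # 2
```

The resulting `body` can then be passed to `countp` in place of `dom.root[2]`.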