Hello Julians!
Welcome to the first Sunday Small challenge.
I will post 3 lil’ problems. Whoever can solve them in Julia (in serial, all architectures allowed) on the input(s) repeated 1000x wins all the internet points. Each problem is its own category.
Challenge ends next Sunday.
Remove HTML Tags
# Input - assume the string is ASCII
str = Vector{UInt8}("<div>Hello <b>JuliaCon2022!</b></div>")
# Output satisfies
unhtml(str) == Vector{UInt8}("Hello JuliaCon2022!")
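For reference, here is a minimal scalar sketch that satisfies the check above. It is only a baseline, not a competitive entry, and it assumes ASCII input in which `<` and `>` appear only as tag delimiters (a stray `>` outside a tag would simply be dropped).

```julia
# Scalar baseline: copy every byte that lies outside a <...> tag.
# Assumption: '<' and '>' only ever delimit tags (no escaping, no comments).
function unhtml(input::AbstractVector{UInt8})
    output = Vector{UInt8}(undef, length(input))
    n = 0            # number of bytes written so far
    intag = false    # true while between '<' and its closing '>'
    for b in input
        if b == UInt8('<')
            intag = true
        elseif b == UInt8('>')
            intag = false
        elseif !intag
            n += 1
            @inbounds output[n] = b
        end
    end
    return resize!(output, n)   # shrink to the bytes actually kept
end

str = Vector{UInt8}("<div>Hello <b>JuliaCon2022!</b></div>")
result = unhtml(str)
```

The output buffer is allocated once at the input length, since removing tags can only shrink the string.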
Hamming distance - count the number of corresponding unequal elements in 2 ordered collections of equal size.
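A straightforward, non-allocating sketch of that definition is below. The name `hamming` and the sample inputs are illustrative only; the challenge's actual inputs are not shown here.

```julia
# Count positions where the two collections differ.
# `hamming` is an illustrative name, not part of the challenge statement.
function hamming(a, b)
    length(a) == length(b) ||
        throw(DimensionMismatch("collections must have equal length"))
    # The generator yields one Bool per position; count sums the `true`s.
    return count(x != y for (x, y) in zip(a, b))
end

d1 = hamming("karolin", "kathrin")
d2 = hamming([1, 0, 1, 1], [0, 0, 1, 0])
```

Iterating with `zip` keeps the function generic over strings, vectors, and tuples, at the cost of leaving vectorization to the compiler.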
using VectorizationBase
function vunhtml!(output::Vector{UInt8}, input::AbstractVector{UInt8})
    N = length(input)
    resize!(output, N)  # worst case: nothing is removed
    n = 0               # number of bytes written to output so far
    W = VectorizationBase.pick_vector_width(UInt8)
    GC.@preserve output input begin
        pinput = VectorizationBase.zstridedpointer(input)
        pout = pointer(output)
        i = 0
        while i < N
            m = VectorizationBase.mask(W, i, N)
            md = VectorizationBase.data(m)
            # @show i md
            v = vload(pinput, (MM{Int(W)}(i),), m)
            ml = v == UInt8('<')  # lanes holding '<'
            mu = v == UInt8('>')  # lanes holding '>'
            muu = VectorizationBase.data(mu)
            mlu = VectorizationBase.data(ml)
            i += 64  # hard-coded stride; appears to assume W == 64 (e.g. AVX-512)
            m2 = VectorizationBase.Mask(m)
            if mlu > muu
                # The last '<' has no '>' after it in this chunk: drop it and
                # everything after it, and start the next chunk at that '<'.
                lz = leading_zeros(mlu)
                truncflag = (one(UInt) << (8sizeof(UInt) - 1 - lz))
                mlu -= truncflag
                m2 &= VectorizationBase.Mask{Int(W)}((truncflag-one(truncflag)))
                i -= 1 + lz
            end
            # Set the bits from each '<' through its matching '>', then invert,
            # so only bytes outside tags remain selected.
            cmu = ~((muu - mlu) + muu)
            cm = VectorizationBase.Mask{Int(W)}(cmu) & m2
            cmd = VectorizationBase.data(cm)
            # @show muu mlu cmu
            VectorizationBase.compressstore!(pout + n, v, cm)
            n += count_ones(cm)
        end
    end
    resize!(output, n)
    output
end
vunhtml!(output, input::String) = vunhtml!(output, codeunits(input))
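The mask arithmetic `~((muu - mlu) + muu)` is easier to follow on plain integers. The sketch below reproduces it without SIMD on a single chunk of at most 64 bytes. The helper `bitmask` is hypothetical (it stands in for the vector comparison), and the sketch assumes the chunk starts outside a tag and that every `<` is closed within the chunk, which is what the truncation branch in `vunhtml!` arranges.

```julia
# Hypothetical helper: bit k-1 is set when byte k of `bytes` equals `c`.
function bitmask(bytes::AbstractVector{UInt8}, c::Char)
    m = zero(UInt64)
    for (k, b) in enumerate(bytes)
        b == UInt8(c) && (m |= one(UInt64) << (k - 1))
    end
    return m
end

chunk = Vector{UInt8}("a<b>cd<i>e")   # at most 64 bytes, all tags closed
mlu = bitmask(chunk, '<')             # bits 1 and 6
muu = bitmask(chunk, '>')             # bits 3 and 8

# For a pair (open, close), 2^close - 2^open sets bits open..close-1.
# Tags do not overlap, so the subtraction handles all pairs at once,
# and adding muu sets each closing bit as well.
tagbits = (muu - mlu) + muu

# Invert, then mask off the bits beyond the end of the chunk.
valid = (one(UInt64) << length(chunk)) - one(UInt64)
keep = ~tagbits & valid

kept = [chunk[k] for k in eachindex(chunk) if (keep >> (k - 1)) & 1 == 1]
result = String(kept)
```

Here `keep` selects bytes 0, 4, 5 and 9 of the chunk, which is the role the compress-store mask plays in the vectorized version.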
If the source string is a known constant, you don’t need any regexes: the answer is already known. If it is not exactly that and can be some other valid HTML:
Here’s one for parens depth that doesn’t allocate, for a 10x speedup, and handles strings that aren’t just ‘(’ and ‘)’:
function pdepth(str)
    current = 0
    maxdepth = -1
    for c in str
        current += (c == '(') - (c == ')')
        maxdepth = max(maxdepth, current)
    end
    return maxdepth
end