# Sum or fold over fields in a struct

Given an array of RGB{UInt8} values, how can we collect the sums of their independent :r, :g, and :b 8-bit components, without overflowing?

We can write a `for` loop and do this manually:

``````for v in arr
r += v.r
g += v.g
b += v.b
end
``````

We can use the `sum` function:

``````r = sum(v -> v.r, arr)
g = sum(v -> v.g, arr)
b = sum(v -> v.b, arr)
``````

but this is likely to be much less efficient than the for loop, because it has to iterate three times, correct?

If we write:

``````rgbsum = sum(arr)
``````

we overflow if arr is an array of RGB structs with 8-bit fields.

Is there a way to ask the `sum` function to accumulate its sum into an RGB{Float64}, for example, so that `rgbsum = sum(arr)` would work without overflow and as efficiently as the explicit `for` loop?

Not sure which RGB type we are talking about (it doesn’t seem to be the one from Colors.jl) but you could try something like:

``````foldl(+, arr; init=RGB{Float64}(0.0,0.0,0.0))
``````

That is fold/reduce the array via addition with a `RGB{Float64}` initial value.

Example with a simple fake RGB type:

``````julia> struct RGB{T}
r::T
g::T
b::T
end

julia> Base.:+(x::RGB, y::RGB) = RGB(x.r + y.r, x.g + y.g, x.b + y.b)

julia> arr = RGB{UInt8}.(rand(UInt8,10), rand(UInt8,10), rand(UInt8,10));

julia> typeof(arr)
Array{RGB{UInt8},1}

julia> foldl(+, arr; init=RGB{Float64}(0,0,0))
RGB{Float64}(894.0, 1321.0, 1086.0)
``````
2 Likes

Another option, maybe a bit clearer, would be a typecast in `sum`:

``````sum(RGB{UInt16}, arr)
``````

But yes, it would help if the example code included the `using` statement that imported the `RGB` type.

Ah, thank you. The `init` argument was what I needed.

I’m sorry for the type confusion. This RGB type comes from the Images package, but I was intending to ask a more general question, which you’ve both answered.

I like the typecast approach of `sum(RGB{Float64}, arr)`, but it seems to perform much worse than the `foldl(+, arr, init=RGB{Float64}(0,0,0))` approach. 20ms vs 0.25ms for an array of size (460, 374).

It’ll be faster to retain the underlying `UInt` representation using a `N24f8` or similar instead of doing a floating-point conversion:

``````julia> using BenchmarkTools, Images

julia> arr = rand(RGB{N0f8}, 460, 374);

julia> @btime foldl(+, \$arr; init=RGB{Float64}(0, 0, 0))
158.500 μs (0 allocations: 0 bytes)
RGB{Float64}(86118.69411764605,85819.37254901891,86040.69019608037)

julia> @btime foldl(+, \$arr; init=RGB{N24f8}(0, 0, 0))
29.900 μs (0 allocations: 0 bytes)
RGB{N24f8}(86118.7,85819.4,86040.7)
``````

This is only an option if you know the sum for each channel will be less than `typemax(N24f8)`

``````julia> typemax(N24f8) |> Float64
1.6843009e7
``````

…but you can get around this with `N56f8` if need be:

``````julia> @btime foldl(+, \$arr; init=RGB{N56f8}(0, 0, 0))
54.800 μs (78 allocations: 2.73 KiB)
RGB{N56f8}(86118.7,85819.4,86040.7)
``````
1 Like

That’s worth raising as an issue, unless the difference is compilation time or something like that. Could you please post a complete example, either here or on Github, with the code you ran that printed those times?

At least on v1.6, it’s actually faster than the `foldl` approach:

``````julia> @btime sum(RGB{Float64}, \$arr)
76.800 μs (0 allocations: 0 bytes)
RGB{Float64}(86118.69411764707,85819.37254901961,86040.69019607845)
``````

**edit: can confirm that it’s much slower on v1.5.

``````julia> @btime sum(RGB{Float64}, \$arr)
17.857 ms (516119 allocations: 13.13 MiB)
``````
4 Likes