If, instead of an abstract material type, you use a union type, as in:
begin
struct NoMaterial end
const _no_material = NoMaterial()
const _y_up = @SVector[0f0,1f0,0f0]
end
struct Metal
albedo::SVector{3,Float32}
fuzz::Float32 # how big the sphere used to generate fuzzy reflection rays. 0=none
Metal(a,f=0.0) = new(a,f)
end
struct Lambertian
albedo::SVector{3,Float32}
end
struct Dielectric
ir::Float32 # index of refraction, i.e. η.
end
const Material = Union{NoMaterial,Metal,Lambertian,Dielectric}
and make all structs immutable, you get almost no allocations in the first render
function:
julia> c = default_camera()
Camera(Float32[0.0, 0.0, 0.0], Float32[-1.7777778, -1.0, -1.0], Float32[3.5555556, 0.0, 0.0], Float32[0.0, 2.0, 0.0], Float32[1.0, 0.0, 0.0], Float32[0.0, 1.0, 0.0], Float32[0.0, 0.0, 1.0], 0.0f0)
julia> s = scene_4_spheres()
HittableList(Hittable[Sphere(Float32[0.0, 0.0, -1.0], 0.5f0, Lambertian(Float32[0.7, 0.3, 0.3])), Sphere(Float32[0.0, -100.5, -1.0], 100.0f0, Lambertian(Float32[0.8, 0.8, 0.0])), Sphere(Float32[-1.0, 0.0, -1.0], 0.5f0, Metal(Float32[0.8, 0.8, 0.8], 0.3f0)), Sphere(Float32[1.0, 0.0, -1.0], 0.5f0, Metal(Float32[0.8, 0.6, 0.2], 0.8f0))])
julia> @benchmark render($s,$c, 96, 16)
BenchmarkTools.Trial: 187 samples with 1 evaluation.
Range (min … max): 26.035 ms … 36.614 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 26.559 ms ┊ GC (median): 0.00%
Time (mean ± σ): 26.749 ms ± 929.174 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
26 ms Histogram: frequency by time 29.2 ms <
Memory estimate: 60.80 KiB, allocs estimate: 2.
This almost certainly means that we got rid of dynamic dispatch and the type instabilities. The performance in this small example does not improve (it is actually worse here, with a median of 26.6 ms versus 20.3 ms for the abstract type), but for a larger problem the gain may be much bigger. That remains to be tested.
Using a union instead of an abstract type is another workaround that nudges the compiler into automatic union splitting, but of course it is again less elegant.
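To make the difference concrete, here is a minimal sketch of the two field typings side by side. The names `AbstractMat`, `MatA`, `MatB`, `weight`, `ObjAbstract`, `ObjUnion`, `total` and `measure` are hypothetical stand-ins, not the types from the ray tracer above. I print the allocation counts rather than stating them, because whether the abstract version allocates depends on the Julia version and on how many methods the compiler sees.

```julia
# Hypothetical stand-ins for the material types; not the code from this thread.
abstract type AbstractMat end
struct MatA <: AbstractMat
    w::Float32
end
struct MatB <: AbstractMat
    w::Float32
end

weight(m::MatA) = m.w
weight(m::MatB) = 2f0 * m.w

const MatUnion = Union{MatA,MatB}

struct ObjAbstract
    mat::AbstractMat  # abstract field: the concrete type is only known at run time
end
struct ObjUnion
    mat::MatUnion     # small union: the compiler can branch over the listed types
end

# Same loop for both containers; only the field type differs.
function total(objs)
    acc = 0f0
    for o in objs
        acc += weight(o.mat)
    end
    return acc
end

# Measure inside a function so that global-scope overhead is not counted.
measure(objs) = @allocated total(objs)

objs_abstract = [ObjAbstract(MatA(1f0)), ObjAbstract(MatB(1f0))]
objs_union    = [ObjUnion(MatA(1f0)), ObjUnion(MatB(1f0))]

measure(objs_abstract); measure(objs_union)  # warm up so compilation is not measured
println("abstract field: ", measure(objs_abstract), " bytes")
println("union field:    ", measure(objs_union), " bytes")
```

Running `@code_warntype total(objs_union)` on such a sketch is a quick way to check how the compiler infers `o.mat` in each case.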
For the record, with the abstract material type, I had:
julia> @benchmark render($s,$c, 96, 16)
BenchmarkTools.Trial: 240 samples with 1 evaluation.
Range (min … max): 19.864 ms … 29.133 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 20.285 ms ┊ GC (median): 0.00%
Time (mean ± σ): 20.910 ms ± 1.521 ms ┊ GC (mean ± σ): 1.83% ± 4.79%
19.9 ms Histogram: log(frequency) by time 25.9 ms <
Memory estimate: 6.40 MiB, allocs estimate: 138409.
With the union type it is slower in this small test, but note the huge difference in allocations: 2 allocations (60.80 KiB) versus 138409 allocations (6.40 MiB).