replacing the interpolation. The conclusion: > 60% of the time is spent in the interpolation. So there is no dramatic efficiency gain to be had from rearranging the code around it (though I did manage to get a 10% gain by fiddling a bit).
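For reference, one way to double-check where the time goes is the Profile standard library (just a sketch; loaddata and its arguments as defined in your script):
using Profile
Profile.clear()
@profile loaddata(Bp, cx_all, cy_all, cz_all, cxe_all, cye_all, cze_all,
                  MeshBlockSize, blocks, xmin, xmax, ymin, ymax, zmin, zmax)
Profile.print(format = :flat, sortedby = :count)  # interpolation-related frames should dominate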
You’re still using the global variables xmin etc. in loaddata. After that’s corrected, there still seems to be a type instability in the indexing of e.g. x_all.
The evaluation ranges (r_eval etc.) could also be built more cheaply, by precomputing / lazily computing / using Base.LogRange / …. But I imagine that fundamentally at this point really only the for ii ... loop matters (the body of which will in total be executed 1.4 billion times). @JADekker has already given some ideas for that above. If you split the for ..., ..., ... into three nested loops, you might also be able to reuse some calculations, as in the sketch below. Likely the only way to get a major performance boost (if at all possible) is to try to make the interpolation more efficient somehow.
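For concreteness, a minimal sketch of that loop splitting, written as a drop-in for the innermost part of your loaddata (it reuses your names r_eval, θ_eval, ϕ_eval, ext_itp, a, clp, etc., hoists the per-r and per-θ work out of the innermost loop, and skips the X/Y/Z/C scratch arrays, which were only used to pass values to clp):
@inbounds for ii in 1:nxs
    rr = r_eval[ii]
    logr = log(rr)              # computed once per r instead of once per (r, θ, ϕ)
    for jj in 1:nys
        th = θ_eval[jj]
        sθ, cθ = sincos(th)     # computed once per (r, θ)
        z = rr * cθ
        for kk in 1:nzs
            ph = ϕ_eval[kk]
            sϕ, cϕ = sincos(ph)
            x = (rr * cϕ + a * sϕ) * sθ
            y = (rr * sϕ - a * cϕ) * sθ
            c = ext_itp(logr, th, ph)
            clp(x, y, z, c, xmin, ymin, zmin, sx, sy, sz, nx, ny, nz, field)
        end
    end
end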
I would not expect such massive variations (excluding the first run), nor can I observe them:
julia> include("code.jl")
x ∈ [-1200.0, 1200.0], y ∈ [-1200.0, 1200.0], z ∈ [-1200.0, 1200.0]
55.999274 seconds (19.50 M allocations: 1.082 GiB, 0.84% gc time, 36.19% compilation time)
GLMakie.Screen(...)
julia> include("code.jl")
x ∈ [-1200.0, 1200.0], y ∈ [-1200.0, 1200.0], z ∈ [-1200.0, 1200.0]
62.853175 seconds (1.19 M allocations: 143.744 MiB, 0.08% gc time, 5.48% compilation time)
GLMakie.Screen(...)
julia> include("code.jl")
x ∈ [-1200.0, 1200.0], y ∈ [-1200.0, 1200.0], z ∈ [-1200.0, 1200.0]
53.756447 seconds (1.15 M allocations: 141.678 MiB, 0.46% gc time, 7.29% compilation time)
GLMakie.Screen(...)
julia> include("code.jl")
x ∈ [-1200.0, 1200.0], y ∈ [-1200.0, 1200.0], z ∈ [-1200.0, 1200.0]
57.703731 seconds (1.07 M allocations: 137.641 MiB, 0.33% gc time, 10.25% compilation time)
GLMakie.Screen(...)
(Note that for some reason your output doesn’t include bound’s printlns.)
So no idea. If I were you, I would definitely test again without the volume! call, though as long as you don’t interact with the plot (within the @time, e.g. via a wait(display(fig))), it still shouldn’t cause the variability.
I think a GPU would improve the performance of this code, but how do I use one? The KernelAbstractions.jl documentation is not detailed enough. Could you please show a GPU version of this code? It takes about half an hour to run if I substitute nxs,nys,nzs = 2nx,2ny,2nz.
Could you first explain what the code is actually doing / trying to do?
On a high level you’re performing some quantised interpolation and conversion from spherical to cartesian coordinates. But what is e.g. Bp? In particular, what’s up with the fourth axis (also in cx_all, etc.)? Why doesn’t every b-iteration overwrite some (or all) of the field[ix, iy, iz] values from a ‘previous’ iteration? (The notion of ‘previous’ here is complicated by multithreading.) Why do we need log.(r) in interpolate? Why ‘smooth’ θ_eval, but not ϕ_eval? What is a? Do we need the extrapolate? …
You should use more descriptive variable names and add comments/documentation.
Hi @eldee, I am trying to make a 3D plot of black hole simulation data. I am converting Kerr-Schild coordinates to Cartesian coordinates. Bp is the magnetic field ratio (poloidal/toroidal). I need log.(r) in interpolate because r (a geometric series) is not uniformly spaced, so I have to make it uniform.
I could not notice any improvement in the plot when making changes to ϕ_eval. a is the spin of the black hole. Yes, we need extrapolate: the interpolant is built on cx_all, which are the locations where the color values are assigned, and it is evaluated on ranges derived from cxe_all. The extrapolation can be skipped if I tolerate a little error/approximation in the color positions and evaluate directly on cx_all.
That I too don’t understand, but it gives the correct plot, as I have matched it with a scatter plot and every feature looks the same.
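As a small aside, a sketch illustrating the log.(r) point (the inner radius and ratio are taken from the RootGridX1 attribute quoted later in this thread; the actual per-block grid may differ):
r0, ratio = 1.6, 1.0963050292915717  # inner radius and geometric ratio (cf. RootGridX1 below)
r = r0 .* ratio .^ (0:17)            # a geometric radial grid: the spacing grows with r
diff(r)                              # non-uniform steps
diff(log.(r))                        # constant step == log(ratio), i.e. a uniform grid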
I agree that the documentation is a bit too sparse for me (last time I checked, a couple of months ago). So below is a CUDA.jl implementation instead. It’s about 30x faster than the multithreaded CPU version (for an Intel i7-7700K, NVIDIA RTX 3070):
About 90% of the time is spent in the interpolation. I’m pretty sure it’s possible to speed that up significantly via better memory patterns (e.g. using shared memory), but this requires more effort than I’m willing to put in.
Code
using HDF5, GLMakie, Interpolations, Statistics, Profile, Base.Threads, BenchmarkTools
using CUDA, StaticArrays, Adapt  # Adapt provides adapt(), used below to move the interpolant to the GPU
using CUDA: i32
function readhdf(h5file::String)
h5 = h5open(h5file, "r")
B = h5["B"]
@views Br::Array{Float32, 4}, Bθ::Array{Float32, 4}, Bϕ::Array{Float32, 4} = B[:,:,:,:,1], B[:,:,:,:,2], B[:,:,:,:,3]
Bp = @. hypot(Br, Bθ) / Bϕ
x_all, y_all, z_all = h5["x1v"], h5["x2v"], h5["x3v"]
xe_all, ye_all, ze_all = h5["x1f"], h5["x2f"], h5["x3f"]
# Don't need MeshBlockSize or blocks, as this is embedded in the shape of Bp (and the x_all, ...)
cx_all::Matrix{Float32} = reshape(collect(x_all), size(x_all))
cy_all::Matrix{Float32} = reshape(collect(y_all), size(y_all))
cz_all::Matrix{Float32} = reshape(collect(z_all), size(z_all))
cxe_all::Matrix{Float32} = reshape(collect(xe_all), size(xe_all))
cye_all::Matrix{Float32} = reshape(collect(ye_all), size(ye_all))
cze_all::Matrix{Float32} = reshape(collect(ze_all), size(ze_all))
return Bp, cx_all, cy_all, cz_all, cxe_all, cye_all, cze_all
end
function find_bounds(cxe_all, cye_all, cze_all)
# No need to collect the cartesian coordinates
mins = @SVector fill(Inf32, 3)
maxs = @SVector fill(-Inf32, 3)
for b = axes(cxe_all, 2)
@views r, θ, ϕ = cxe_all[:, b], cye_all[:, b], cze_all[:, b]
# Don't add a type declaration, as this will force a conversion from SubArray to Array
# i.e.
# @views r::Vector{Float32} = cxe_all[:, b]
# still copies, and will be slower than
# r = cxe_all[:, b]
# For reference, non-separated approach:
# for rr in r, th in θ, ph in ϕ
# # No a here?
# x = rr * sin(th) * cos(ph)
# y = rr * sin(th) * sin(ph)
# z = rr * cos(th)
# xyz = SA[x, y, z]
# mins = min.(mins, xyz)
# maxs = max.(maxs, xyz)
# end
# Cf. abraemer's compute_bounds
extr_r = extrema(r)
extr_sin_th = extrema(sin, θ)
extr_cos_th = extrema(cos, θ)
extr_sin_ph = extrema(sin, ϕ)
extr_cos_ph = extrema(cos, ϕ)
minmaxs = extrema.((
(rr * sth * cph for rr in extr_r, sth in extr_sin_th, cph in extr_cos_ph),
(rr * sth * sph for rr in extr_r, sth in extr_sin_th, sph in extr_sin_ph),
(rr * cth for rr in extr_r, cth in extr_cos_th)
))
mins = min.(mins, first.(minmaxs))
maxs = max.(maxs, last.(minmaxs))
end
println(" " * join(("$v ∈ [$m, $M]" for (v, m, M) in zip(("x", "y", "z"), mins, maxs)), ", "))
return mins, maxs
end
function loaddata_gpu(Bp, cx_all, cy_all, cz_all, cxe_all, cye_all, cze_all, mins, maxs)
field_dims = 10 .* (size(Bp)[1:3] .+ 1)
eval_dims = field_dims #(200, 200,200)
a = 0.0f0 #0.98f0
cu_field = CUDA.fill(NaN32, field_dims...)
voxel_sizes = (field_dims .- 1f0) ./ (maxs .- mins)
# We need explicit Vectors as on the GPU it seems we require the grid (Abstract)Vectors
# to be the same type (not a mix of CuArray and SubArray{..., CuArray})
# (Bp_block can be of a different type)
logr = Vector{Float32}(undef, size(cx_all, 1))
θ = Vector{Float32}(undef, size(cy_all, 1))
ϕ = Vector{Float32}(undef, size(cz_all, 1))
cu_θ_eval = CuVector{Float32}(undef, eval_dims[2])
for b = axes(Bp, 4)
logr .= log.(view(cx_all, :, b))
θ .= view(cy_all, :, b)
ϕ .= view(cz_all, :, b)
Bp_block = view(Bp, :, :, :, b)
itp = interpolate((logr, θ, ϕ), Bp_block, Gridded(Linear()))
# First on CPU, then adapt to GPU (cf. https://juliamath.github.io/Interpolations.jl/stable/devdocs/#GPU-Support)
cu_itp = adapt(CuArray, itp)
cu_ext_itp = extrapolate(cu_itp, Interpolations.Line())
@views re, θe, ϕe = cxe_all[:, b], cye_all[:, b], cze_all[:, b]
r_eval = Base.LogRange(re[begin], re[end], eval_dims[1]) # isbits
cu_θ_eval .= acos.(LinRange(cos(θe[begin]), cos(θe[end]), eval_dims[2])) # used cos and acos for smooth sampling
# (Could create an isbits struct with lazy getindex, if we want)
ϕ_eval = LinRange(ϕe[begin], ϕe[end], eval_dims[3])
kernel = @cuda launch=false _fill_field_kernel!(cu_field, r_eval, cu_θ_eval, ϕ_eval, cu_ext_itp, mins, maxs, voxel_sizes, a)
config = launch_configuration(kernel.fun)
nb_eval_elems = prod(eval_dims)
nb_threads = min(nb_eval_elems, config.threads)
nb_blocks = clamp(cld(nb_eval_elems, nb_threads), config.blocks, 2^31 - 1)
kernel(
cu_field, r_eval, cu_θ_eval, ϕ_eval, cu_ext_itp, mins, maxs, voxel_sizes, a;
threads = nb_threads, blocks = nb_blocks
)
end
return Array(cu_field)
end
function _fill_field_kernel!(field, r_eval, θ_eval, ϕ_eval, ext_itp, mins, maxs, voxel_sizes, a)
idx = threadIdx().x + (blockIdx().x - 1i32) * blockDim().x
CI = CartesianIndices((length(r_eval), length(θ_eval), length(ϕ_eval)))
@inbounds while idx <= length(CI)
ridx, θidx, ϕidx = Tuple(CI[idx])
rr = r_eval[ridx]; th = θ_eval[θidx]; ph = ϕ_eval[ϕidx]
sθ, cθ = sincos(th)
sϕ, cϕ = sincos(ph)
x = (rr * cϕ + a * sϕ) * sθ
y = (rr * sϕ - a * cϕ) * sθ
z = rr * cθ
field_ids = trunc.(Int32, (SA[x, y, z] .- mins) .* voxel_sizes) .+ 1
field_ids = clamp.(field_ids, 1, size(field))
# (Is this clamping necessary, and if so does it make sense?)
field[field_ids...] = ext_itp(log(rr), th, ph)
idx += blockDim().x * gridDim().x
end
return
end
function tdplot(mins, maxs, field)
finite_vals = filter(!isnan, vec(field))
q_low, q_high = quantile(finite_vals, (0.05, 0.95))
fig = Figure(nan_color=(:white, 0.5))
ax = LScene(fig[1,1], show_axis=false)
volume!(ax, (m..M for (m, M) in zip(mins, maxs))..., field, transparency=true; colorrange=(q_low, q_high))
display(fig)
end
function main()
Bp, cx_all, cy_all, cz_all, cxe_all, cye_all, cze_all = readhdf("sane00.prim.01800.athdf")
mins, maxs = find_bounds(cxe_all, cye_all, cze_all)
field = loaddata_gpu(Bp, cx_all, cy_all, cz_all, cxe_all, cye_all, cze_all, mins, maxs)
tdplot(mins, maxs, field)
end
Why do you want to extrapolate the data? I mean, in your data e.g. cxe_all[1, 1] < cx_all[1, 1], so you would indeed need extrapolate. But why do you want to evaluate outside of the data range, where you have less information to rely on to get a good estimate?
Like JM_Beckers in the topic you linked to, I don’t think this is the way to go. If you want to construct a dense field, instead of adding more interpolated samples in spherical space, just work backwards. I.e. for every voxel in field find the corresponding spherical (or Kerr-Schild) coordinates. Then interpolate in Bp to obtain a value. (The mysterious fourth Bp axis does make this a bit more confusing, though.)
@eldee Thank you very much, although I have an Intel GPU.
Because it gives the full sphere; otherwise I get a crack in the sphere. cx_all are the grid cell body-center positions, while cxe_all are the cell interface positions along the x direction.
The fourth Bp axis is the block number, which we iterate over in the @threads for loop.
I couldn’t understand what you are suggesting.
I have made it linear by using log.(r) because I am using linear interpolation, Gridded(Linear()).
I believe this is the one thing that you must understand to make any real progress.
Like many others, I have a hard time understanding your code, but as far as I can tell you:
Have data in a polar coordinate system.
Interpolate in the polar coordinate system to get more data points.
Project each point forward to a Cartesian grid and write each data value to the grid.
As a result some grid points may get written multiple times and whatever happened to be written last is your output, whereas other grid points may not be hit at all.
If my understanding is correct, there is a big problem because this is a fundamentally flawed way of resampling data.
The correct way is to:
Loop over the Cartesian output grid.
Project each point backwards to the polar coordinate system.
Interpolate or extrapolate from your data to these points.
There may be more intricacies to how you do step 3 best, but this is the general approach you should use.
If this is still unclear I would strongly recommend you to find a textbook on the topic of resampling and interpolation. (Sorry, can’t give a recommendation because I studied this a long time ago.)
using GLMakie, Interpolations
function get_spherical_data()
rs = Float32[1.677044, 1.8385518, 2.0156136, 2.2097273, 2.4225352, 2.6558375, 2.911608, 3.1920104, 3.499417, 3.8364286, 4.205896, 4.6109447, 5.055002, 5.5418243, 6.0755296, 6.660634, 7.3020864, 8.005314];
θs = Float32[0.049087387, 0.14726216, 0.24543692, 0.3436117];
ϕs = Float32[0.19634955, 0.5890486, 0.9817477, 1.3744467, 1.7671459, 2.1598449, 2.552544, 2.9452431, 3.3379421, 3.7306414, 4.12334, 4.5160394, 4.9087386, 5.3014374, 5.6941366, 6.086836];
spherical_values = rand(18, 4, 16)
return rs, θs, ϕs, spherical_values
end
function cartesian_to_spherical(x, y, z)
r = hypot(x, y, z)
θ = r == 0 ? 0.0 : acos(z / r)
ϕ = atan(y, x) + pi # In [0, 2pi) (like ϕs)
return r, θ, ϕ
end
function spherical_to_cartesian(r, θ, ϕ)
x = r * sin(θ) * cos(ϕ)
y = r * sin(θ) * sin(ϕ)
z = r * cos(θ)
return x, y, z
end
function find_cartesian_bounds(rs, θs, ϕs)
extr_r = extrema(rs)
extr_sin_th = extrema(sin, θs)
extr_cos_th = extrema(cos, θs)
extr_sin_ph = extrema(sin, ϕs)
extr_cos_ph = extrema(cos, ϕs)
minmaxs = extrema.((
(r * sth * cph for r in extr_r, sth in extr_sin_th, cph in extr_cos_ph),
(r * sth * sph for r in extr_r, sth in extr_sin_th, sph in extr_sin_ph),
(r * cth for r in extr_r, cth in extr_cos_th)
))
return first.(minmaxs), last.(minmaxs)
end
function create_cartesian_field(mins, maxs, resolution, rs, θs, ϕs, spherical_values)
# mins, maxs determine the extent of the cartesian grid
# resolution is the number of grid cells per axis
cartesian_values = Array{Float32, 3}(undef, resolution)
steps = (maxs .- mins) ./ (resolution .- 1)
# There's bound to be a proper way to do periodic interpolation (not based on boundary points,
# as it is in https://juliamath.github.io/Interpolations.jl/stable/iterate/#Periodic).
# You could always create a wrapper class (same for the log, if you want).
# But here's an ugly workaround, where we first interpolate to find values for ϕ = 0
itp_ϕ = interpolate((rs, θs, [ϕs[end] - 2pi, ϕs[begin]]), spherical_values[:, :, [end, begin]], Gridded(Linear()))
ϕs = [0.f0; ϕs; Float32(2*pi)]
spherical_values_ϕ0 = reshape([itp_ϕ(r, θ, 0.f0) for r in rs, θ in θs], length(rs), length(θs), 1)
spherical_values = cat(spherical_values_ϕ0, spherical_values, spherical_values_ϕ0, dims=3)
itp = extrapolate(interpolate((rs, θs, ϕs), spherical_values, Gridded(Linear())), 0.f0)
# Don't extrapolate: 0.f0 is transparent
# Could use a single for I = CartesianIndices(cartesian_values) loop,
# but then we won't be able to reuse the index calculations
# (though I doubt this will make a measurable impact on performance)
z = mins[3]
for k = 1:resolution[3]
y = mins[2]
for j = 1:resolution[2]
x = mins[1]
for i = 1:resolution[1]
r, θ, ϕ = cartesian_to_spherical(x, y, z)
cartesian_values[i, j, k] = itp(r, θ, ϕ)
x += steps[1]
end
y += steps[2]
end
z += steps[3]
end
return Colon().(mins, steps, maxs)..., cartesian_values # xs, ys, zs, cartesian_values
end
function main()
rs, θs, ϕs, spherical_values = get_spherical_data()
mins, maxs = find_cartesian_bounds(rs, θs, ϕs)
resolution = (100, 100, 100)
xs, ys, zs, cartesian_values = create_cartesian_field(
mins, maxs, resolution, rs, θs, ϕs, spherical_values
)
fig = Figure()
ax = LScene(fig[1,1], show_axis=false)
volume!(ax, (m..M for (m, M) in zip(mins, maxs))..., cartesian_values; transparency=true)
# Apparently volume!(ax, xs, ys, zs, cartesian_values) is deprecated.
# Since the xs are spaced uniformly, the extrema do suffice
display(fig) # (or wait(...), when used in a script)
end
@eldee Thanks, I have difficulty understanding this interpolation part.
Please improve the code and try it with this different data; it fills a lot of RAM when iterating over the blocks.
Volume code
using GLMakie, Interpolations, HDF5, Base.Threads
@time begin
function readhdf(h5file::String)
h5 = h5open(h5file, "r")
B = h5["B"]
@views Br, Bθ, Bϕ = B[:,:,:,:,1], B[:,:,:,:,2], B[:,:,:,:,3]
Bp = @. hypot(Br, Bθ) / Bϕ
x_all, y_all, z_all = h5["x1v"], h5["x2v"], h5["x3v"]
xe_all, ye_all, ze_all = h5["x1f"], h5["x2f"], h5["x3f"]
MeshBlockSize = read((HDF5.attributes(h5))["MeshBlockSize"])
blocks::Int32 = 1 #read(HDF5.attributes(h5),"NumMeshBlocks")
return Bp, x_all, y_all, z_all, xe_all, ye_all, ze_all, MeshBlockSize, blocks
end
Bp, x_all, y_all, z_all, xe_all, ye_all, ze_all, MeshBlockSize, blocks = readhdf("sane98.prim.01900.athdf")
function cartesian_to_spherical(x, y, z)
r = hypot(x, y, z)
θ = r == 0 ? 0.0 : acos(z/r)
ϕ = atan(y, x) + pi # In [0, 2pi) (like ϕs)
return r, θ, ϕ
end
function spherical_to_cartesian(r, θ, ϕ)
x = r * sin(θ) * cos(ϕ)
y = r * sin(θ) * sin(ϕ)
z = r * cos(θ)
return x, y, z
end
function find_cartesian_bounds(rs, θs, ϕs)
extr_r = extrema(rs)
extr_sin_th = extrema(sin, θs)
extr_cos_th = extrema(cos, θs)
extr_sin_ph = extrema(sin, ϕs)
extr_cos_ph = extrema(cos, ϕs)
minmaxs = extrema.(((r * sth * cph for r in extr_r, sth in extr_sin_th, cph in extr_cos_ph),
(r * sth * sph for r in extr_r, sth in extr_sin_th, sph in extr_sin_ph),
(r * cth for r in extr_r, cth in extr_cos_th)))
return first.(minmaxs), last.(minmaxs)
end
function create_cartesian_field(mins, maxs, resolution, rs, θs, ϕs, Bp_block)
# mins, maxs determine the extent of the cartesian grid
# resolution is the number of grid cells per axis
cartesian_values = fill(NaN, resolution)
steps = (maxs .- mins) ./ (resolution .- 1)
# There's bound to be a proper way to do periodic interpolation (not based on boundary points,
# as it is in https://juliamath.github.io/Interpolations.jl/stable/iterate/#Periodic).
# You could always create a wrapper class (same for the log, if you want).
# But here's an ugly workaround, where we first interpolate to find values for ϕ = 0
itp_ϕ = interpolate((rs, θs, [ϕs[end] - 2pi, ϕs[begin]]), Bp_block[:, :, [end, begin]], Gridded(Linear()))
ϕs = [0.f0; ϕs; Float32(2*pi)]
Bp_block_ϕ0 = reshape([itp_ϕ(r, θ, 0.f0) for r in rs, θ in θs], length(rs), length(θs), 1)
Bp_block = cat(Bp_block_ϕ0, Bp_block, Bp_block_ϕ0, dims=3)
itp = extrapolate(interpolate((rs, θs, ϕs), Bp_block, Gridded(Linear())), NaN)
# I replaced 0.f0 with NaN.
# Could use a single for I = CartesianIndices(cartesian_values) loop,
# but then we won't be able to reuse the index calculations
# (though I doubt this will make a measurable impact on performance)
z = mins[3]
for k = 1:resolution[3]
y = mins[2]
for j = 1:resolution[2]
x = mins[1]
for i = 1:resolution[1]
r, θ, ϕ = cartesian_to_spherical(x, y, z)
cartesian_values[i, j, k] = itp(r, θ, ϕ)
x += steps[1]
end
y += steps[2]
end
z += steps[3]
end
return Colon().(mins, steps, maxs)..., cartesian_values # xs, ys, zs, cartesian_values
end
fig = Figure()
ax = LScene(fig[1,1], show_axis=false)
for b in 2000 #1:blocks
@views r, θ, ϕ = x_all[:, b], y_all[:, b], z_all[:, b]
@views re, θe, ϕe = xe_all[:, b], ye_all[:, b], ze_all[:, b]
@views Bp_block = Bp[:, :, :, b]
mins, maxs = find_cartesian_bounds(re, θe, ϕe)
resolution = (100, 100, 100)
xs, ys, zs, cartesian_values = create_cartesian_field(mins, maxs, resolution, r, θ, ϕ, Bp_block)
volume!(ax, (m..M for (m, M) in zip(mins, maxs))..., cartesian_values; transparency=true)
end
display(fig)
end
We have to extrapolate at re, θe, ϕe = xe_all[:, b], ye_all[:, b], ze_all[:, b]. Is there still an error hidden somewhere? I have doubts about the extrapolation parts.
I noticed that for this different data the scatter plot doesn't match the volume plot, so there must be an error somewhere.
Scatter code
using HDF5, GLMakie, LinearAlgebra, Statistics
GLMakie.activate!(ssao = true)
GLMakie.closeall()
function plothdf_scatter(h5file::String)
h5 = h5open(h5file, "r")
num_blocks::Int32 = read(HDF5.attributes(h5), "NumMeshBlocks")
x_all::Matrix{Float32} = read(h5["x1v"])
y_all::Matrix{Float32} = read(h5["x2v"])
z_all::Matrix{Float32} = read(h5["x3v"])
B::Array{Float32, 5} = read(h5["B"])
close(h5)
Br, Bθ, Bϕ = @views B[:,:,:,:,1], B[:,:,:,:,2], B[:,:,:,:,3]
Bp = @. hypot(Br, Bθ) / (Bϕ + eps(Float32)) # avoid divide-by-zero
fig = Figure(size = (1000, 800))
ax = LScene(fig[1, 1]; show_axis = false)
for b in 2000#:num_blocks
r = x_all[:, b]
θ = y_all[:, b]
ϕ = z_all[:, b]
Nx, Ny, Nz = length(r), length(θ), length(ϕ)
Ntot = Nx * Ny * Nz
# preallocate fixed-size Float32 arrays
X, Y, Z, V = ntuple(_ -> Vector{Float32}(undef, Ntot), 4)
idx = 1
for i in 1:Nx, j in 1:Ny, k in 1:Nz
ri, th, ph = r[i], θ[j], ϕ[k]
X[idx] = ri * sin(th) * cos(ph)
Y[idx] = ri * sin(th) * sin(ph)
Z[idx] = ri * cos(th)
V[idx] = abs(Bp[i, j, k, b]) + eps(Float32)
idx += 1
end
cr = extrema(V)
V ./= maximum(abs, V) # normalize per block
scatter!(ax, X, Y, Z; markersize = 50, color = V, colormap = :hawaii, colorrange = cr, colorscale = log10, transparency = true)
end
fig
end
@time plothdf_scatter("sane98.prim.01900.athdf")
Meanwhile, my earlier code, which @JM_Beckers termed conceptually wrong, gives a volume plot similar to the scatter plot.
My earlier conceptually wrong code
using HDF5, GLMakie, Interpolations, Statistics, Base.Threads, BenchmarkTools
@time begin
function readhdf(h5file::String)
h5 = h5open(h5file, "r")
B = h5["B"]
@views Br::Array{Float32, 4}, Bθ::Array{Float32, 4}, Bϕ::Array{Float32, 4} = B[:,:,:,:,1], B[:,:,:,:,2], B[:,:,:,:,3]
Bp = @. hypot(Br, Bθ) / Bϕ
x_all, y_all, z_all = h5["x1v"], h5["x2v"], h5["x3v"]
xe_all, ye_all, ze_all = h5["x1f"], h5["x2f"], h5["x3f"]
MeshBlockSize::Vector{Int32} = read((HDF5.attributes(h5))["MeshBlockSize"])
blocks::Int32 = read(HDF5.attributes(h5),"NumMeshBlocks")
cx_all::Matrix{Float32} = reshape(collect(x_all), size(x_all))
cy_all::Matrix{Float32} = reshape(collect(y_all), size(y_all))
cz_all::Matrix{Float32} = reshape(collect(z_all), size(z_all))
cxe_all::Matrix{Float32} = reshape(collect(xe_all), size(xe_all))
cye_all::Matrix{Float32} = reshape(collect(ye_all), size(ye_all))
cze_all::Matrix{Float32} = reshape(collect(ze_all), size(ze_all))
return Bp, cx_all, cy_all, cz_all, cxe_all, cye_all, cze_all, MeshBlockSize, blocks
end
Bp, cx_all, cy_all, cz_all, cxe_all, cye_all, cze_all, MeshBlockSize, blocks = readhdf("sane98.prim.01900.athdf")
function bound(cxe_all, cye_all, cze_all, MeshBlockSize, blocks)
XX,YY,ZZ = ntuple(_ -> zeros(Float32, 1* prod(MeshBlockSize.+1)), 3)
icx=1
for b in 2000 #1:blocks
@views r::Vector{Float32}, θ::Vector{Float32}, ϕ::Vector{Float32} = cxe_all[:, b], cye_all[:, b], cze_all[:, b]
for rr in r, th in θ, ph in ϕ
XX[icx] = rr * sin(th) * cos(ph)
YY[icx] = rr * sin(th) * sin(ph)
ZZ[icx] = rr * cos(th)
icx += 1
end
end
xmin,xmax = extrema(XX); ymin,ymax = extrema(YY); zmin,zmax = extrema(ZZ)
println(" x ∈ [$xmin, $xmax], y ∈ [$ymin, $ymax], z ∈ [$zmin, $zmax]")
return xmin, xmax, ymin, ymax, zmin, zmax
end
xmin, xmax, ymin, ymax, zmin, zmax = bound(cxe_all, cye_all, cze_all, MeshBlockSize, blocks)
function clp(X,Y,Z,C,xmin,ymin,zmin,sx,sy,sz,nx,ny,nz,field)
xi = trunc(Int, (X-xmin) * sx) + 1
yi = trunc(Int, (Y-ymin) * sy) + 1
zi = trunc(Int, (Z-zmin) * sz) + 1
ix = clamp(xi, 1, nx)
iy = clamp(yi, 1, ny)
iz = clamp(zi, 1, nz)
field[ix, iy, iz] = C
end
function kerr2cart(iex,a,ii,jj,kk,r_eval,θ_eval,ϕ_eval,ext_itp,X,Y,Z,C)
rr, th, ph = r_eval[ii], θ_eval[jj], ϕ_eval[kk]
sθ, cθ = sincos(th)
sϕ, cϕ = sincos(ph)
X[iex] = (rr*cϕ + a*sϕ) * sθ
Y[iex] = (rr*sϕ - a*cϕ) * sθ
Z[iex] = rr * cθ
C[iex] = ext_itp(log(rr), th, ph)
end
function loaddata(Bp, cx_all, cy_all, cz_all, cxe_all, cye_all, cze_all, MeshBlockSize, blocks, xmin, xmax, ymin, ymax, zmin, zmax)
nx,ny,nz = 10(MeshBlockSize.+1)
nxs,nys,nzs = nx,ny,nz #300, 200,300
a::Float32 = 0.0 #0.98
field = fill(NaN32, nx,ny,nz)
sx::Float32 = (nx-1)/(xmax-xmin); sy::Float32 = (ny-1)/(ymax-ymin); sz::Float32 = (nz-1)/(zmax-zmin)
X, Y, Z, C = ntuple(_ -> zeros(Float32, nxs*nys*nzs), 4)
@threads for b in 2000 #1:blocks
@views r, θ, ϕ = cx_all[:, b], cy_all[:, b], cz_all[:, b]
@views re, θe, ϕe = cxe_all[:, b], cye_all[:, b], cze_all[:, b]
@views Bp_block = Bp[:, :, :, b]
@views size(Bp_block)
ext_itp = extrapolate(interpolate((log.(r), θ, ϕ), Bp_block, Gridded(Linear())), Interpolations.Line())
r_eval = exp.(LinRange(log(re[1]), log(re[end]), nxs))
θ_eval = acos.(LinRange(cos(θe[1]), cos(θe[end]), nys)) # used cos and acos for smooth sampling
ϕ_eval = LinRange(ϕe[1], ϕe[end], nzs)
iex::Int32 = 1
@inbounds for ii in 1:nxs, jj in 1:nys, kk in 1:nzs
kerr2cart(iex,a,ii,jj,kk,r_eval,θ_eval,ϕ_eval,ext_itp,X,Y,Z,C)
clp(X[iex],Y[iex],Z[iex],C[iex],xmin,ymin,zmin,sx,sy,sz,nx,ny,nz,field)
iex += 1
end
end
return field
end
field = loaddata(Bp, cx_all, cy_all, cz_all, cxe_all, cye_all, cze_all, MeshBlockSize, blocks, xmin, xmax, ymin, ymax, zmin, zmax)
function tdplot(xmin, xmax, ymin, ymax, zmin, zmax, field)
finite_vals = filter(!isnan, vec(field))
q_low, q_high = quantile(finite_vals, (0.05, 0.95))
fig = Figure(nan_color=(:white, 0.5))
ax::LScene = LScene(fig[1,1], show_axis=false)
volume!(ax, xmin..xmax, ymin..ymax, zmin..zmax, field, transparency=true; colorrange=(q_low, q_high))
display(fig)
end
tdplot(xmin, xmax, ymin, ymax, zmin, zmax, field)
end
Honestly, this sounds very entitled to me. You don’t understand the code, therefore it must be bad, and thus it must be fixed (by others).
In any case, the ‘problem’ is that we have periodicity (the angles ϕ and ϕ + 2kπ are the same for any integer k), which is not known by Interpolations.jl. So if we want to interpolate for (e.g.) ϕ = 0, which falls ‘in between’ ϕs[end] == 6.086836f0 and ϕs[begin] == 0.19634955f0, we would get a BoundsError. You could circumvent this using extrapolation, but this is silly, as we do have information on both sides. So instead, we explicitly use ϕs[end] - 2pi ≈ -0.196 and ϕs[begin] to get the values at ϕ = 0. Then we enlarge ϕs with 0 and 2pi, and spherical_values with these extra interpolated values. At this point we no longer have issues with later interpolation, since every ϕ we get in the loop lies between 0 and 2pi.
On that note, my line
ϕ = atan(y, x) + pi # In [0, 2pi) (like ϕs)
in cartesian_to_spherical was incorrect, and should instead read
ϕ = mod2pi(atan(y, x)) # In [0, 2pi) (like ϕs)
Now, as I mentioned in a comment, the periodic interpolation approach is still a workaround (which needlessly allocates an extra, potentially big, Array). A proper solution would be to either implement this type of periodicity directly in Interpolations.jl, or to create a wrapper which, when called, reduces ϕ mod 2π and, if necessary (when you fall below the first or above the last reference domain value), also considers the first and last reference function values.
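To make that wrapper idea concrete, here is a rough sketch of what it could look like (my own illustration, not an Interpolations.jl feature; it only handles the ϕ direction, assumes ϕs is increasing and contained in [0, 2π), and leaves any r/θ extrapolation to whatever inner already does):
using Interpolations

struct PeriodicPhiItp{I,T}
    inner::I   # e.g. a Gridded(Linear()) interpolant over (rs, θs, ϕs), possibly wrapped in extrapolate
    ϕ_lo::T    # ϕs[begin]
    ϕ_hi::T    # ϕs[end]
end

function (p::PeriodicPhiItp)(r, θ, ϕ)
    ϕ = mod2pi(ϕ)
    p.ϕ_lo <= ϕ <= p.ϕ_hi && return p.inner(r, θ, ϕ)   # the ordinary in-range case
    # ϕ lies in the wrap-around gap between ϕ_hi and ϕ_lo + 2π:
    # blend linearly between the last and first ϕ samples.
    gap = 2π - p.ϕ_hi + p.ϕ_lo
    t = ϕ > p.ϕ_hi ? (ϕ - p.ϕ_hi) / gap : (ϕ + 2π - p.ϕ_hi) / gap
    return (1 - t) * p.inner(r, θ, p.ϕ_hi) + t * p.inner(r, θ, p.ϕ_lo)
end

# Usage sketch (names as in get_spherical_data / create_cartesian_field, before any padding):
# itp  = interpolate((rs, θs, ϕs), spherical_values, Gridded(Linear()))
# pitp = PeriodicPhiItp(itp, ϕs[begin], ϕs[end])
# pitp(r, θ, 0.0f0)   # no BoundsError, and no padded copy of the data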
The only significant amount of memory you really need (which cannot be freed) is the one from cartesian_values. Unless you want to start playing around with e.g. sparse octrees, you cannot get around this. Of course, the resolution parameter allows you to tune the size of cartesian_values.
Why? If you just want to visualise your data, you don’t need this extrapolation. So what do you even want to achieve? And, again, why? How do we know you’re not just asking us to solve your homework or whatever? Also note that we’ve strayed far from the topic, namely
You should really first know where this data comes from, and what it all means, in particular the final axis. Also, either this is publicly available data, in which case you should refer us to something like a project page instead of your personal university sharepoint. Or it is not freely available, and you should probably not share it.
There was an error in cartesian_to_spherical. But also, I’m not sure what I would expect to get when overlaying multiple volume plots in Makie.jl. Do you? Ideally, (for algorithm = :mip) I suppose it should take the (pixel/raywise) maximum of all individual plots, but from a quick test, I doubt it does. (Also, what would (you want to) happen if you mix different algorithms?)
Coming back to what the data means, does it consist of point samples, or of (averaged) volumetric samples? Should you visualise them as points, or as a volume? (How) are we supposed to reduce over the final axis? If there is a conceptually sound way to reduce, you only need to keep one cartesian_values field, which will presumably solve your memory issues (assuming volume and not scatter is the way to go).
I want to get a volume plot. I would like to use the :mip and :iso algorithms, etc. This is Athena++ data and it is a cell-centered dataset. Yes, I only need to keep one cartesian_values field. I think it would be better to calculate the cartesian bounds of all the blocks and create one cartesian_values field that I then plot, instead of calculating a cartesian_values field for each block separately.
Is the above statement correct? 0 ∉ [0.19634955f0, 6.086836f0]
In my data ϕ always lies between 0 and 2pi and it never goes beyond 2pi.
Because re, θe, ϕe (cell interface positions) contain points which lie outside r, θ, ϕ (cell-center positions). Notice the difference between the r, θ, ϕ and re, θe, ϕe values. If you plot more than one block, you will notice a gap left between them.
There is another way to find the bounds of all the data, as the extreme values are stored in the file's attributes. But it will still be better to calculate them per block, as I may want to plot blocks separately if needed.
RootGridX1 = read((HDF5.attributes(h5))["RootGridX1"]) = [1.6, 1200.0, 1.0963050292915717] #last quantity is Geometric ratio in that direction.
RootGridX2 = read((HDF5.attributes(h5))["RootGridX2"]) = [0.0, 3.141592653589793, 1.0]
RootGridX3 = read((HDF5.attributes(h5))["RootGridX3"]) = [0.0, 6.283185307179586, 1.0]
Assuming that the blocks correspond to disjoint subsets of the (spherical) space, indeed, just create a big Cartesian grid. Every voxel here will correspond to at most one point within the interpolated spherical data. In particular if the blocks form a (disjoint) tiling you immediately know in which block (or blocks, if you’re on the boundary) to look for the interpolation. But if this were the case, why have blocks at all?
However, if the blocks overlap you can (and probably should?) perform your interpolation in multiple blocks, and need to combine these somehow, in a way that makes sense (which again depends on what the blocks mean).
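As a rough sketch of the single-grid approach under that disjointness assumption (my own illustration; it presumes one precomputed Gridded interpolant per block in itps, the corresponding spherical extents in block_ranges, and the cartesian_to_spherical from earlier in the thread):
function fill_global_field!(cartesian_values, mins, maxs, itps, block_ranges)
    # cartesian_values: one big output grid covering the bounds of all blocks (e.g. filled with NaN32);
    # itps[b] is the interpolant for block b, block_ranges[b] = ((rlo, rhi), (θlo, θhi), (ϕlo, ϕhi))
    steps = (maxs .- mins) ./ (size(cartesian_values) .- 1)
    for I in CartesianIndices(cartesian_values)
        x, y, z = mins .+ (Tuple(I) .- 1) .* steps
        r, θ, ϕ = cartesian_to_spherical(x, y, z)
        for (itp, ((rlo, rhi), (θlo, θhi), (ϕlo, ϕhi))) in zip(itps, block_ranges)
            if rlo <= r <= rhi && θlo <= θ <= θhi && ϕlo <= ϕ <= ϕhi
                cartesian_values[I] = itp(r, θ, ϕ)
                break   # disjoint blocks: at most one block contains this voxel
            end
        end
    end
    return cartesian_values
end
# The linear search over blocks is the simple-but-slow part; for a regular tiling you could
# instead compute the block index directly from (r, θ, ϕ). Voxels hit by no block keep their
# initial (e.g. NaN) value.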
Yes, that’s the point. Draw these angles on a unit circle.
ϕ has been used for both a Vector and a scalar, so I’m not sure what you mean here. But the scalar version, corresponding to some point in Cartesian coordinates, can lie outside of the range of the reference Vector version (which only spans a strict subset of [0, 2π)). This means the interpolation is not straightforward any more, cf. supra.
I don’t know what the concepts of cell-interface or cell-center positions mean. But the question remains, if all you want to do is visualise the data, why do you apparently need to get values at specific predetermined positions (the _e versions)?
Presumably because you are not interpolating across block boundaries.