Radon and iradon transforms


Gorgeous.

I followed your advice on imresize and found that, if one adds the following final rescale at the end of the iradon function:

  N_ = Int64(round(N/sqrt(2.)));
  I = imresize(I,N_,N_);

then, at least for this example, the size of the reconstruction matches the size of the original Shepp-Logan phantom.
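A guess at where the sqrt(2) comes from (my inference, not stated in the thread): radon pads the image so the detector axis covers the full diagonal, making the sinogram's spatial dimension about sqrt(2) times the image side, and this rescale simply undoes that. A quick check of the arithmetic, in Python for convenience:

```python
import math

def rescaled_side(N):
    # Undo the sqrt(2) diagonal padding: N is the sinogram's spatial
    # size, and the result is the side length of the rescaled image.
    return int(round(N / math.sqrt(2.0)))

# With the N = 1415 used later in this thread:
print(rescaled_side(1415))  # -> 1001
```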

iradon is still about 4x slower than radon, and the reason is the triple loop:
by commenting it out, I found that the FFT part takes only 1.3% of the processing time.

Hi again Tim,

I just now tried, naively, to “improve” the brackets in iradon following your advice.
With these compiler directives it is now:

  @simd for m=1:N
    @simd for n=1:N
      @inbounds for t=1:nAngles
        r = XCOS[m,t] + XSIN[n,t];
        index = Int64(round((r/sqrt(2.0) + 0.5)*N)); # sqrt(2) magnified
        if index>0 && index<=N
          I[m,n] += A[index,t];
        end
      end
    end
  end

Amazingly, that gave a 19% speed-up, so the Julia/Matlab time ratio is now around 17, but that is still quite high.
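A side note on that bounds check (my own sketch, not from the thread): the index formula maps r in [-1/sqrt(2), 1/sqrt(2)] onto [0, N], so rounding can land exactly on 0, which is why the `index>0 && index<=N` guard is needed. The same mapping in Python:

```python
import math

def to_index(r, N):
    # Same mapping as the Julia code: sqrt(2)-magnified bin index.
    return int(round((r / math.sqrt(2.0) + 0.5) * N))

N = 1415
lo, hi = -1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)
print(to_index(lo, N), to_index(hi, N))  # -> 0 1415, hence the guard
```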

I’m not touching multi-threading and other hardware-related things for now, just restructuring the code.

BTW, options like GPU and/or running on a cluster are of separate interest.
In Matlab we use the GPU speed-up with practically no meaningful coding required; it is a very nice system they have there.

You’re traversing that loop in a manner that’s not cache-efficient (see this nice blog post). Try something like this:

    @inbounds for t=1:nAngles
        for n=1:N
            xs = xsin[n,t]
            for m=1:N
                r = xcos[m,t] + xs
                index = Int64(round((r/sqrt(2.0) + 0.5)*N)); # sqrt(2) magnified
                # index = Int64(round((r + 0.5)*N)); # to keep pixel resolution
                if index>0 && index<=N
                    dest[m,n] += src[index,t];
                end
            end
        end
    end

For me that gives something like a 3x speedup.
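To illustrate that the loop interchange is safe (a Python sketch of my own, with made-up toy inputs, so it demonstrates only correctness, not the cache timing): both nest orders accumulate exactly the same terms into each dest[m,n], in the same order over t, so the results match exactly.

```python
import math, random

def backproject(src, xcos, xsin, N, nAngles, angle_outer):
    dest = [[0.0] * N for _ in range(N)]
    # Two nest orders: angle-outer (the cache-friendly rewrite) and
    # pixel-outer (the original order); t stays ascending per cell.
    if angle_outer:
        order = ((t, n, m) for t in range(nAngles) for n in range(N) for m in range(N))
    else:
        order = ((t, n, m) for m in range(N) for n in range(N) for t in range(nAngles))
    for t, n, m in order:
        r = xcos[m][t] + xsin[n][t]
        index = int(round((r / math.sqrt(2.0) + 0.5) * N))
        if 0 < index <= N:
            dest[m][n] += src[index - 1][t]  # 1-based index as in the Julia code
    return dest

random.seed(0)
N, nA = 8, 5  # toy sizes, not the real N = 1415
xcos = [[0.1 * m * math.cos(a) for a in range(nA)] for m in range(N)]
xsin = [[0.1 * n * math.sin(a) for a in range(nA)] for n in range(N)]
src = [[random.random() for _ in range(nA)] for _ in range(N)]
assert backproject(src, xcos, xsin, N, nA, True) == backproject(src, xcos, xsin, N, nA, False)
```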

Which version of Matlab are you running? I have 2016a installed, and if I just snip out that inner loop it’s glacially slow, so much so that I killed it after one minute. For reference,

function dest = stuff2(src, xcos, xsin)
    [N, nAngles] = size(src);
    dest = zeros(N, N);
    assert(all(size(dest) == [N,N]))
    assert(all(size(xcos) == size(xsin)))
    assert(all(size(xcos) == [N,nAngles]))
    for t=1:nAngles
        for n=1:N
            xs = xsin(n,t);
            for m=1:N
                r = xcos(m,t) + xs;
                index = int64(round((r/sqrt(2.0) + 0.5)*N));
                if index>0 && index<=N
                    dest(m,n) = dest(m,n) + src(index,t);
                end
            end
        end
    end
end

Prep script:

N = 1415;
angles = (0:1:359)/360*2*pi;
XCOS = zeros(N,length(angles));
XSIN = zeros(N,length(angles));
x = linspace(-0.5,0.5,N);
for k=1:N
    for a=1:length(angles)
        XCOS(k,a)=x(k)*cos(angles(a));
        XSIN(k,a)=x(k)*sin(angles(a));
    end
end
src = rand(N, length(angles));

I = stuff2(src, XCOS, XSIN);

Certainly a direct translation will be much slower; Julia rules, there is no doubt.
However, Matlab’s iradon function is ridiculously fast.
It is a wrapper for something really brutal, written in either C or assembler, one can only guess.

I will try your suggestion and report back tomorrow.
Many thanks.

Ah. When you said things like “slow brackets”, I had the impression you were testing the two programming languages and had come to the conclusion that Julia was slower on the same exact task. Given that’s not the case, unless you know that the algorithm in Julia was the same as what Matlab uses, these are not at all comparable—I can write a really slow algorithm in a fast language and it will still be slow, while a fast algorithm in a slow language might be really fast. It also seems possible (I won’t look at their code to keep from being “contaminated” by license issues) that the core part of Matlab’s algorithm is written in C.

I understand that from a pragmatic standpoint, what you need is a fast radon/iradon pair, but that’s not as much a language issue as a question of writing the right code. I polished up radon but don’t have time to tackle iradon for a bit; if you’re in a hurry, you might find some papers on “fast radon transformations” and see if there are any tricks they recommend.

Confirmed: I’ve got a 3.18x speed-up.

So for now, the improved Julia/Matlab time ratios are:
radon: 3.67
iradon: 6.31

miracle!

Sure, but even as it is now, it is approaching being useful.

Pragmatically, we need fast radon/iradon and Bioformats, and also faster numerics for iterative restorations like TwIST, where both radon and iradon are used; here Julia offers some hope.
The next issues are GPU support and how to run this on an HPC cluster.
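To show why the iterative use case needs both operators (a minimal sketch of my own in Python with plain lists; TwIST itself is a more elaborate algorithm): each iteration applies the forward projector A (radon) and its adjoint A^T (backprojection, the unfiltered core of iradon) in a Landweber-style update x <- x + lam * A^T (b - A x).

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def rmatvec(A, y):
    # A^T y: the adjoint, playing the role of unfiltered backprojection.
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]

def landweber(A, b, lam, iters):
    x = [0.0] * len(A[0])
    for _ in range(iters):
        r = [bi - yi for bi, yi in zip(b, matvec(A, x))]   # residual b - A x
        g = rmatvec(A, r)                                  # backproject the residual
        x = [xi + lam * gi for xi, gi in zip(x, g)]
    return x

# Tiny stand-in for a projection matrix: recover x_true from its projections.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, 2.0, 3.0]          # consistent data for x_true = [1, 2]
x = landweber(A, b, lam=0.2, iters=200)
print([round(v, 3) for v in x])  # -> [1.0, 2.0]
```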

I will also look at Python, where this exists as a package: https://bitbucket.org/alexlib/tomographylab.
It implements radon/iradon and also some other algorithms (SART/SIRT); however, my suspicion is that it isn’t as fast as needed.

The test program in the post is using:

angles = (0:1:359)/360*2pi; # [rad]

But as the transform at an angle and at that angle + 180° are identical up to a flip of the detector axis, this is redundant work. This is also why Matlab’s default angles go from 0:179. So this is another simple 2x improvement with literally no change in the code. :slight_smile:
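A sketch of that redundancy (my own check, in Python): the detector coordinate of a point (x, y) is r(theta) = x*cos(theta) + y*sin(theta), and r(theta + pi) = -r(theta), so for an ideal radon the theta + 180° projection is just the theta projection mirrored along the detector axis.

```python
import math

def proj_coord(x, y, theta):
    # Detector coordinate of point (x, y) for projection angle theta.
    return x * math.cos(theta) + y * math.sin(theta)

for theta in [0.3, 1.1, 2.5]:
    for (x, y) in [(0.2, -0.4), (-0.1, 0.3)]:
        r1 = proj_coord(x, y, theta)
        r2 = proj_coord(x, y, theta + math.pi)
        assert abs(r1 + r2) < 1e-12  # r(theta + pi) == -r(theta)
print("theta and theta+180deg projections are mirror images")
```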

That is true formally, but actual data are often acquired over 360 degrees as well.
The point is that in OPT an object is far from being simply symmetry-transformed when imaged from opposite sides: there are effects like excitation/emission scattering, depth of field, and the PSF, so the angle “redundancy” actually allows improving image quality.

The comment was regarding radon, and there it should apply in any case.
iradon is a different story: with iradon the output resolution and filter can vary, and I haven’t looked at it.

You are right, but in practice radon isn’t used very often on its own, mostly in iterative reconstruction.
Frankly, we never thought about the possibility of exploiting this redundancy; angles are always angles.

With iradon, it is a typical situation that at angle theta the nearest half of the sample is in focus and the rest is out of focus; at theta + 180° they change places, so in order to get the whole volume more or less in focus, one needs to process both projections.