Compute intensive function rewrite

BioinfoBuggaBoo · August 15, 2023, 1:42am

I have been using an R package that really only has one main function, which does a lot behind the scenes. A portion of that work is prohibitively slow: on data of the size I need, the run would take far too long to ever produce an output. I want to use this as an opportunity to learn, and I’m hopeful the community is willing to guide me in my inexperience. My plan is as follows:

1. Profile the existing code to ascertain exactly what is taking the most time (currently narrowed down to two functions, which I am now going through).
2. Write out the process identified above in plain language, ignoring the objects/structures used to implement it in R and focusing on what is actually being executed.
3. Find existing Julia code that is closest to what I need, based on the outcome of steps 1 and 2.
4. Modify that code in small steps, using tests to ensure I get the expected results, until the process described in step 2 is achieved. Use the same input data in R and in the rewrite to confirm the output is the same.
5. Investigate whether the types initially used in Julia can be optimised for serial execution, and adjust accordingly.
6. Implement in parallel on CPU.
7. Any hope of executing on GPU?
From those of you who have done this and more before, is this an advisable way to proceed?
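
To make the "same input, same output" check in the plan concrete, here is a minimal sketch using Julia's standard `Test` library. Everything in it is an assumption for illustration: `centre_columns` is a hypothetical stand-in for one ported step, and `expected_from_r` is hard-coded where the real values would be exported from R for the same input.

```julia
using Test

# Hypothetical stand-in for one ported step: subtract each column's mean.
function centre_columns(X::AbstractMatrix{<:Real})
    return X .- sum(X; dims=1) ./ size(X, 1)
end

# Small fixed input shared by both implementations.
X = [1.0 2.0; 3.0 6.0]

# Reference output that would normally be exported from R for the same input
# (hard-coded here so the sketch runs on its own).
expected_from_r = [-1.0 -2.0; 1.0 2.0]

@testset "port matches R reference" begin
    # isapprox tolerates small floating-point differences between the two runtimes.
    @test isapprox(centre_columns(X), expected_from_r; atol=1e-8)
end
```

Comparing with a tolerance rather than `==` is deliberate, since the two languages may order floating-point operations differently. One such test per ported step keeps each change small enough to debug.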

mrufsvold · August 15, 2023, 3:09am

If you’re willing to specify what the R package is doing or even just its domain, there are lots of folks here who can help you track down related and/or similar Julia code!

BioinfoBuggaBoo · August 15, 2023, 5:08am

Yes, certainly. I was just hoping to pinpoint the exact slow code first, out of respect for everyone’s time. The package uses convex analysis, crudely as follows:
data → perspective projection → outliers removed → clustering (kmeans) → quickhull → identify points around vertices as cluster centres.
The slow culprit (or culprits) is somewhere in the final three steps.
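
For anyone wanting something concrete to poke at, below is a rough sketch of the last two stages in plain Julia, under loud assumptions: the points are 2D, the hull is computed with Andrew's monotone chain rather than quickhull, and "points around vertices" is read as "points within a fixed radius of a hull vertex". The names `cross2`, `convex_hull`, and `points_near` are made up for this sketch and are not taken from the R package.

```julia
# z-component of (a - o) x (b - o); positive when o -> a -> b turns counter-clockwise.
cross2(o, a, b) = (a[1] - o[1]) * (b[2] - o[2]) - (a[2] - o[2]) * (b[1] - o[1])

# Convex hull of 2D points via Andrew's monotone chain (assumption: not quickhull).
# Returns the hull vertices in counter-clockwise order.
function convex_hull(points::Vector{NTuple{2,Float64}})
    pts = sort(unique(points))            # tuples sort lexicographically
    length(pts) <= 2 && return pts
    lower = NTuple{2,Float64}[]
    for p in pts
        # Discard points that would make a clockwise or straight turn.
        while length(lower) >= 2 && cross2(lower[end-1], lower[end], p) <= 0
            pop!(lower)
        end
        push!(lower, p)
    end
    upper = NTuple{2,Float64}[]
    for p in reverse(pts)
        while length(upper) >= 2 && cross2(upper[end-1], upper[end], p) <= 0
            pop!(upper)
        end
        push!(upper, p)
    end
    # The last point of each half repeats the first point of the other half.
    return vcat(lower[1:end-1], upper[1:end-1])
end

# Indices of the points lying within `radius` of a given hull vertex.
function points_near(points, vertex, radius)
    return [i for (i, p) in enumerate(points)
            if hypot(p[1] - vertex[1], p[2] - vertex[2]) <= radius]
end

points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5), (0.9, 0.9)]
hull = convex_hull(points)                    # the four corners of the unit square
near = points_near(points, (1.0, 1.0), 0.2)   # indices of (1.0, 1.0) and (0.9, 0.9)
```

Timing each function separately (for example with `@time`) on realistic data sizes would show which of these stages dominates before any effort goes into parallelising.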
