I wrote a function which inspects a data frame and, if its content meets certain requirements, sends it to a server. So that (new) users can easily see what needs to change when the upload fails, I added an explicit error for the case where the input is not a data frame.
function my_fun(df_in)
    if typeof(df_in) != DataFrame
        error("Input must be a data frame")
    end
    ...
end
However, when running the Juno profiler on the function (I am new to measuring performance, so please tell me if this is a terrible approach), I noticed that this line appears to be very slow: it accounts for 50% of the elapsed time.
I assume that declaring the function as my_fun(df_in::DataFrame) would improve performance, but then a wrong input would throw a MethodError instead of my specific error message.
Is there a more performance-friendly way to check whether an object is of type DataFrame?
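For reference, the dispatch-based variant mentioned above can keep the custom message by pairing the typed method with an untyped fallback. This is only a minimal sketch: MyFrame is a hypothetical stand-in for DataFrame so that it runs without the DataFrames package, and the :ok return value is a placeholder for the real checks and upload.

```julia
# Hypothetical stand-in for DataFrames.DataFrame (assumption, not the real type).
struct MyFrame end

# Method selected by dispatch when the argument has the expected type.
function my_fun(df_in::MyFrame)
    # ... content checks and upload would go here ...
    return :ok
end

# Untyped fallback: reached for any other argument type, so the caller
# gets a specific message instead of a MethodError.
my_fun(df_in) = throw(ArgumentError("Input must be a data frame, got $(typeof(df_in))"))
```

Because Julia compiles a specialized version of a function for each concrete argument type, an explicit typeof check inside the body is normally resolved at compile time as well, so the two variants should differ in style rather than in speed.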
This seems like an issue with the profiler. When I tried the following:
function my_fun(df_in)
    if typeof(df_in) != DataFrame
        a = 1
    end
end
@btime my_fun(test_data)
4.583 ns (0 allocations: 0 bytes)
@btime my_fun(5)
0.036 ns (0 allocations: 0 bytes)
There are a few nanoseconds of overhead from the if statement, but that’s it. I couldn’t actually time the error function in the profiler, but in manual testing it was practically instant, so this check should be fast in all cases.
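One caveat on these numbers: a sub-nanosecond result such as 0.036 ns usually means the compiler folded the whole call away, and benchmarking a non-constant global without interpolation adds lookup overhead of its own. A hedged sketch of a more careful measurement follows, assuming BenchmarkTools is installed; MyFrame and type_check are hypothetical names standing in for DataFrame and the check in my_fun.

```julia
using BenchmarkTools

# Hypothetical stand-in for DataFrames.DataFrame (assumption, not the real type).
struct MyFrame end

# Same shape as the check above, but returning a value so the branch is observable.
function type_check(df_in)
    if typeof(df_in) != MyFrame
        return false
    end
    return true
end

test_data = MyFrame()

# `$` interpolates the global into the benchmark, so the lookup of a
# non-constant global is not included in the timing.
@btime type_check($test_data)

# Wrapping the argument in a Ref and dereferencing it inside the benchmark
# keeps the compiler from constant-folding the call on a literal.
@btime type_check($(Ref(5))[])
```

Either way, the conclusion should not change: a type check like this costs at most a few nanoseconds.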
You’re right. I just commented out the type check, and now the profiler shows the next line as accounting for 50% of the elapsed time. (I might be misinterpreting the flame graph.) I then used @btime to measure performance before and after commenting out all input checks, and the difference is negligible.