I’m trying to profile the following piece of Julia code, taken from ProfileView.jl, using the external Intel VTune Profiler. Concretely, the code is
# vtune.jl
function profile_test(n)
    for i = 1:n
        A = randn(100,100,20)
        m = maximum(A)
        Am = mapslices(sum, A; dims=2)
        B = A[:,:,5]
        Bsort = mapslices(sort, B; dims=1)
        b = rand(100)
        C = B.*b
    end
end
profile_test(1)
profile_test(10)
As described in the relevant section of the Julia documentation, I compiled Julia (release-1.3) with USE_INTEL_JITEVENTS=1 in the Make.user file and set the environment variable ENABLE_JITPROFILING=1. Specifically, since I’m on Windows, I cross-compiled Julia using the recommended Cygwin-to-MinGW path.
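Before launching VTune it can help to confirm that the variable is actually visible to the Julia process. A minimal sketch, assuming only the variable name given above; the helper name `jitprofiling_enabled` is hypothetical, and the check says nothing about whether the Julia build itself was compiled with USE_INTEL_JITEVENTS=1:

```julia
# Hypothetical helper: reports whether ENABLE_JITPROFILING is set to "1"
# in the given environment (defaults to the current process environment).
jitprofiling_enabled(env = ENV) = get(env, "ENABLE_JITPROFILING", "") == "1"

# Warn early instead of discovering the problem in an empty profile.
if !jitprofiling_enabled()
    @warn "ENABLE_JITPROFILING is not set to 1 in this process"
end
```

Passing a plain `Dict` instead of `ENV` makes the helper easy to try out without touching the real environment.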
To profile the code I fired up VTune and created a simple analysis (inspired by the Python tutorial here):
Inspecting the “Bottom-up” analysis results, I see all kinds of low-level jl_* stuff but cannot find anything about my vtune.jl file or the profile_test function it contains:
Is there something else I need to do? Is VTune expected to work with Julia? If yes, to what extent? In the Python tutorial (linked above), both the script file runtime.py and the function black_scholes could be clearly identified as bottlenecks.
I’d appreciate any comments/hints on how to get better profiling results with VTune + Julia!
I am using the “instrumentation and tracing technology (ITT) API” to profile just the f01() function in
include(joinpath(@__DIR__,"IntelITT.jl"))
using Main.IntelITT
N = 256
A = rand(UInt32,1024*1024*N) .% UInt32(17)
function f01()
    B = cumsum(A)
end
precompile(f01,())
__itt_resume()
f01()
__itt_detach()
There seems to be a library for the Instrumentation and Tracing Technology (ITT) API, which lets an application pass debug information to the VTune profiler:
You can start an application with VTune in “pause” mode
and let the application call __itt_resume(), which makes the profiler start collecting data.
You can pass string annotations describing what is going on to the API; the profiler can use them to group events in a later analysis.
The application can call __itt_detach(), which tells VTune to finish the data collection.
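The resume/detach workflow described above can be wrapped so that the detach call always happens, even if the profiled function throws. A minimal sketch: `with_collection` is a hypothetical helper, and its `resume` and `detach` arguments default to no-op stand-ins for whatever binding exposes __itt_resume() and __itt_detach(), so the snippet runs without VTune installed.

```julia
# Hypothetical helper: run `f` between a resume call and a detach call.
# `resume` and `detach` default to no-ops, so the sketch runs without VTune.
function with_collection(f; resume = () -> nothing, detach = () -> nothing)
    resume()            # profiler (started in "pause" mode) begins collecting
    try
        return f()      # the region that should appear in the profile
    finally
        detach()        # profiler finishes data collection, even on error
    end
end

# Usage with stand-ins that only record the call order.
calls = String[]
result = with_collection(resume = () -> push!(calls, "resume"),
                         detach = () -> push!(calls, "detach")) do
    push!(calls, "work")
    sum(1:10)
end
```

With a real binding one would presumably pass `resume = __itt_resume` and `detach = __itt_detach`. Since detaching finishes the data collection, this pattern suits a single profiled region per run.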
I just found out about this myself, hacked together a Julia binding for the libittnotify library, and uploaded it to GitHub for you. I haven’t tested anything more specific, but it seems to work at least for this simple case.
… Although I am not quite sure where this bug report belongs. Is it the LLVMBuilder repository, the BinaryBuilder.jl repository, or the Julia repository?
Do you have any advice on which one to choose?
EDIT: Further, as far as I understand, this feature of LLVM is based on the JIT Profiling API, and unfortunately the disassembly seems broken. I can open the disassembly section in VTune, but it displays the same single assembly instruction on every line. This might be a bug in LLVM not reporting the assembly correctly, or it is not intended to work at all, or it is a misconfiguration on my side. So this needs some further investigation.
Unfortunately, building with USE_BINARYBUILDER=0 in Cygwin on Windows fails for me, so I can’t get it to work. I’m wondering whether compiling LLVM with LLVM_USE_INTEL_JITEVENTS would be safe as a default (does it come with a performance penalty or something similar?).
I opened an issue to discuss whether it makes sense to set the flag for the prebuilt binaries.
I think this was already the intention, but it fails due to a typo: USE_INTEL_JITEVENTS is used instead of LLVM_USE_INTEL_JITEVENTS (which is why the pre-build-cmake log contains this warning).
Not adding anything to the discussion, but using Intel VTune with Julia is impressive! It’s something I thought about recently. I hope to get hands-on with some bigger systems soon, so I could help with this.
Also Yggdrasil was the original Linux distribution on floppy disks. I thought it was long dead. Is the Julia builder related in some way?
With the latest VTune, I also need to set the following environment variable for the JIT-compiled code to show up in the profile: INTEL_JIT_BACKWARD_COMPATIBILITY=1
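One way to make sure both variables from this thread reach the profiled process is to attach them to the command that gets launched. A minimal sketch using Base's `addenv` (available from Julia 1.6, so not in the release-1.3 build mentioned earlier); the script name vtune.jl is from the first post, and launching `julia` directly rather than through VTune is an assumption made to keep the sketch self-contained:

```julia
# Environment variables named in this thread.
jit_env = ["ENABLE_JITPROFILING" => "1",
           "INTEL_JIT_BACKWARD_COMPATIBILITY" => "1"]

# Build (but do not run) a command that inherits the current environment
# and adds the two variables. Requires Julia 1.6+ for `addenv`.
cmd = addenv(`julia vtune.jl`, jit_env...)

# Show only the entries the sketch added, as the child process would see them.
wanted = ("ENABLE_JITPROFILING=", "INTEL_JIT_BACKWARD_COMPATIBILITY=")
println(filter(e -> any(p -> startswith(e, p), wanted), cmd.env))
```

When the application is started from the VTune GUI instead, the same two variables would have to be set in the environment VTune uses to launch it.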
I tried to work with the Intel VTune profiler, but I did not know how to connect it to Julia (run via VS Code).
Is there any step-by-step documentation for it? Or could anyone help me with this, please?
Thank you very much!