Although the originaldescription of the algorithm advises using a Fibonacci Heap asits internal queue, it has been noted that in practice, a binary(ord-ary) heap implementation is significantly faster
Have you tried just using a binary heap from datastructures?
You probably care about runtime, not provable complexity class. Constant factors are likely to swamp log factors for the heap, and your linked algorithm never merges two large heaps.