Optimizing the use of Blocks, Threads vs. Array Indexing

Sorry to necro-bump this thread, but did anyone get anywhere with a GPU-friendly sorting algorithm?