As a library maintainer, what are some good idioms for managing single-threaded and multi-threaded implementations of the same algorithm?
Yesterday I implemented an algorithm three ways: with channels, with locks, and with thread-local arrays. For my algorithm (isosurface extraction), locks give the best overall performance on multiple threads. However, in the single-threaded case I see 15% to 50% slower performance. My first thought was to implement two loops, one for the single-threaded case and one for the multi-threaded case. If I do this, I am left with two implementations to maintain.
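To make the "two loops" option concrete, here is a minimal sketch of what I mean. It is not my actual isosurface code: `process_cell`, `extract_serial`, `extract_threaded`, and `extract` are hypothetical names, and the per-item work is a stand-in. The idea is a lock-free serial loop, a lock-guarded threaded loop, and one entry point that picks between them based on `Threads.nthreads()`.

```julia
using Base.Threads

# Hypothetical per-item kernel standing in for the real work:
# returns zero or more results for item i.
process_cell(i) = i % 3 == 0 ? [i] : Int[]

# Serial path: plain loop, no lock overhead.
function extract_serial(n)
    out = Int[]
    for i in 1:n
        append!(out, process_cell(i))
    end
    return out
end

# Threaded path: a lock guards the shared output vector.
# Note that the order of results is not deterministic here.
function extract_threaded(n)
    out = Int[]
    lk = ReentrantLock()
    @threads for i in 1:n
        cells = process_cell(i)
        if !isempty(cells)
            lock(lk) do
                append!(out, cells)
            end
        end
    end
    return out
end

# Single entry point: choose the loop at run time.
extract(n) = nthreads() == 1 ? extract_serial(n) : extract_threaded(n)
```

Both loops share the same kernel, so the duplicated code is limited to the loop and the accumulation step, but it is still two code paths to test.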
What is the expectation for package maintainers? Is it reasonable to assume that users will run Julia with more than one thread in the future, and therefore keep the code terse with a single threaded implementation? Or is it worth maintaining two implementations to keep good single-threaded performance?