I’ve already written a (somewhat) satisfying version of the algorithm here.
It’s a normal single-threaded program (I don’t have a multicore server or GPU).
Typically other people may claim they do “parallel computing”. (But there are actually a lot of small issues to watch out for in this context.)
Actually, if parallel computing is available, this is a lesser issue, since all subproblem blocks can be solved in parallel and I can then quickly pick one according to its violation level and return to the master problem. So it would be fine.
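
To make the parallel variant concrete, here is a minimal Python sketch of that selection step. It is not my actual code: `solve_block`, `TOY_DEMAND`, and `most_violated_block` are hypothetical names, and the "subproblem" is a toy stand-in that just measures how far a block's demand exceeds the master's current value. It also assumes that the largest violation is the one to pick and that a thread pool is adequate (a real solver call might need processes instead).

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy data: one "demand" per subproblem block.
TOY_DEMAND = {0: 3.0, 1: 5.0, 2: 4.0}

def solve_block(block_id, master_solution):
    """Stand-in for one subproblem solve; returns (block_id, violation)."""
    # Toy rule (an assumption): violation is the shortfall of the master value.
    violation = max(0.0, TOY_DEMAND[block_id] - master_solution[block_id])
    return block_id, violation

def most_violated_block(master_solution, block_ids, tol=1e-6):
    """Evaluate all blocks concurrently and return the most violated one."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda b: solve_block(b, master_solution), block_ids))
    block_id, violation = max(results, key=lambda r: r[1])
    # None signals that no block is violated, so the master need not be updated.
    return (block_id, violation) if violation > tol else None

if __name__ == "__main__":
    master = {0: 3.0, 1: 2.0, 2: 3.5}
    print(most_violated_block(master, [0, 1, 2]))
```

In the single-threaded setting the same loop runs the blocks one after another, so the only difference is the wall-clock time spent before returning to the master problem.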
I think my single-threaded code is also fine.
I’m going to try some other decomposition algorithms and make comparisons later.
Although that problem might also admit a Benders decomposition, that method would not embody the block-decomposition spirit. Therefore my investigation is finished.