tackling a problem that appears to have a fixed handful of cutoffs and a performance metric,
What is the specific problem? I assume the “fixed handful of cutoffs” are constraints? Performance metric is probably the objective function. If you’re solving optimizations with constraints, then probably you need to use JuMP-like workflow (you need an underlying solver).
These if-else cutoffs are not differentiable
Maybe you can present the specific problem. To me it sounds like integer programming, e.g. in a unit commitment problem if the generator is ON, then it’s output power is e.g. [2MW, 10MW], if it’s OFF, then it’s output power is 0MW.
rather decisions in a loop
I didn’t get this either. What does loop here mean? Do you mean reinforcement learning (aka approximate dynamic programming)?
Does this mean smoothing approximations are not possible and I need to focus on derivative-free optimization?
No. There’re two layers, the outer layer is branch-and-bound, the inner layer is continuous optimization where you apply effective optimization methods. It’s not quite accurate to partition the methods into derivative-free methods or others. For example, the well-studied simplex algorithm for linear programming—it is not a typical derivative-driven algorithm, but it’s very effective.
downsides to derivative-free optimization like poor scaling, slow convergence, and more likely non-global optima.
You need to distinguish optimization problems and algorithms. The primary question is: is my optimization convex? If yes, then it is easy, otherwise pursuing global optimum is hard. If it’s a nonconvex differentiable nonlinear programming, then derivative-based methods are usable, which is based on descent direction and stepsize selection (But they may converge to local optimum, so they essentially don’t have any guarantee about the quality of the solution. Or, say, different local optimization algorithms are only relatively comparable). So the core issue is not about “using derivatives or not”.
What do I do when out-of-sample testing suggests the model is bad?
Are you doing stochastic optimization? In stochastic optimization, there are two models: the model of the uncertainty, and the model of physic system. For example, if you are solving a stochastic unit commitment, then you need a model for wind generation, maybe an ARIMA model for instance. This part is irrelevant to optimization techniques.