If you do not have an explicit Hessian, it is usually not worth coding one. Quasi-Newton methods don’t need one anyway and are rather efficient in real-life problems. Theory characterizes local convergence (quadratic vs superlinear), but that is not the pimary concern in practice unless your problem is super well-behaved.
Generally, the curvature variation in your problem is at least as important as the dimension, if not more. A simple probit model in 10³–10⁴ dimensions should be manageable with either BFGS or even CG.