Is it possible to decrease the learning rate for those algorithms? That would be my first attempt. Also, since you are only using two features now, you can visualize the data in a scatter plot. Does it appear linearly separable? If not, then adding another layer or two the NN will help the most.