A simple logit or probit model is a possibility, maximizing the sum or average of the log likelihood:
function logit(theta, s, x)
p = 1.0./(1.0 .+ exp.(-x*theta))
obj = s.*log.(p) .+ (log.(1.0 .- p)).*(1.0 .- s)
end
Another possibility is to use nonparametric kernel regression, rather than kernel density. The probability that s=1|X=x is also the conditional mean of s, given that X=x. Running a kernel regression might be a little easier.