After successfully implementing a small function to generate correlated data, I wish now to simulate data for logistic regression (with a 0/1 outcome) with specific coefficients.
# R version # https://stats.stackexchange.com/questions/46523/how-to-simulate-artificial-data-for-logistic-regression # > x1 = rnorm(1000) # some continuous variables # > x2 = rnorm(1000) # > z = 1 + 2*x1 + 3*x2 # linear combination with a bias # > pr = 1/(1+exp(-z)) # pass through an inv-logit function # > y = rbinom(1000,1,pr) # bernoulli response variable # Julia attempt # add https://github.com/neuropsychology/Psycho.jl.git using Psycho, Statistics, Random, GLM coefs = [0.2] data = simulate_data_correlation(coefs; n=100) cor(data[:y], data[:Var1]) # 0.2 data[:y] = 1 ./ (1 .+ exp.(-1 * data[:y])) # pass through an inv-logit function response = rand.([Distributions.Binomial.(1, x) for x in data[:y]], 1) data[:y] = [x for x in response] glm(@formula(y ~ Var1), data, Binomial(), LogitLink()) # Doesn't return the correct coef
I am not sure where the errors are… Any advice?
PS: I am not trying to copy the R version per se, but rather just to simulate logistic data with pre-specified coefficients. There could be a better approach for this problem, but I only found these answers for R.