I am trying to write a function that will calculate deviance residuals for a general linear model (specified using GLM.jl
). The function extracts predictions from the fitted model (pred
), a vector containing the response variable (response_y
) and the error distribution specified for the fitted model (distType
). I then use a conditional if-elseif-else statement to calculate the deviance residuals depending on the output of distType
.
# Write function
using Test
using Distributions
using Random
using DataFrames
using StatsBase
using GLM
using Gadfly
using Statistics
using Compose
# Define function to calculate deviance residuals
function calcDevResids(;model, df)
###################################################
# Section #1: Extract response vector and model residuals
###################################################
# Extract fitted/predicted values from model object
pred = GLM.predict(model)
# Extract vector containing response variable
response_y = df.y
###################################################
# Section #2: Calculate deviance residuals
# - requires conditional statements to
# loop through different data distributions
###################################################
# Extract statistical distribution specified in model
distType = typeof(model).parameters[1].parameters[1].parameters[2]
# Use if-elseif-else statements to calculate deviance residuals depending
# on which errors were specified during model specification
# If model was specified with Poisson errors
if distType == "Poisson"
sign.(response_y .- pred) .* sqrt.(GLM.devresid.(Distributions.Poisson(), response_y, pred))
# If model was specified with Gamma errors
elseif distType == "Gamma"
return sign.(response_y .- pred) .* sqrt.(GLM.devresid.(Distributions.Gamma(), response_y, pred))
# If model was specified with Bernoulli errors
elseif distType == "Bernoulli"
sign.(response_y .- pred) .* sqrt.(GLM.devresid.(Distributions.Bernoulli(), response_y, pred))
# If model was specified with Binomial errors
else distType == "Binomial"
sign.(response_y .- pred) .* sqrt.(GLM.devresid.(Distributions.Binomial(), response_y, pred))
end
end
And here is a basic model to test the function:
# Set reproducible seed
Random.seed!(123)
# Simulate n = 100 data points from a Poisson distribution (lambda = 7)
y = rand(Poisson(7), 100)
# Assign grouping variables (5 groups of 20)
x = repeat([1, 2, 3, 4, 5],
inner = 20,
outer = 1)
df = hcat(y, x)
df = DataFrame(df)
colnames = ["y","x"]
rename!(df, Symbol.(colnames))
# Run model
modPoisson = fit(GeneralizedLinearModel,
@formula(y ~ 1 + x),
df,
Poisson(),
GLM.LogLink())
However, I keep getting the follow error when I run the function:
ERROR: DomainError with -10.0: log1p will only return a complex result if called with a complex argument. Try log1p(Complex(x)).
But, when I run the code outside of the function call (see below), I get the expected output and no error.
pred = GLM.predict(modPoisson)
response_y = df.y
dv = sign.(response_y .- pred) .* sqrt.(GLM.devresid.(Distributions.Poisson(), response_y, pred))
julia> dv = sign.(response_y .- pred) .* sqrt.(GLM.devresid.(Distributions.Poisson(), response_y, pred)) 100-element Array{Float64,1}: 0.9061633419878498 1.5493622873332258 0.9061633419878498 0.217165948855438 -0.9383362226732821 -0.9383362226732821 â‹® 0.07437940612764843 0.07437940612764843 0.8013566375722013 0.07437940612764843 1.4748443746174682 -0.7265187364238832
I have tried adding complex()
and Complex()
to the terms in the code to create dv
, with no success.
Any pointers or advice would be much appreciated. I am a Julia-newbie (< 3 months experience), so I am sure I am missing something quite simple.