Indicator matrix for categorical data in GLM.jl with DataFrames.jl


#1

I am working with a large data set and want to run a logit regression on monthly data. For this I create a DataFrame and use the GLM package in Julia. My code looks something like this:

f = glm(@formula(Y ~ Age + Duration + Gender + Nationality + MonthIn), Data2000, Binomial(), LogitLink())

My question: since I have monthly data, I want to create dummy variables for the 12 months, or eleven if I include a constant. MonthIn is just a column that holds the month number (e.g. 3 for March).

When I tried to find out how this is done, I only learned that in R this is built in, so that it can create the monthly dummies automatically. One guess of mine would be to use the data-pooling function built into DataFrames.jl to create an indicator matrix, but I am not sure how this or something similar would be done.

I would highly appreciate any help, and please feel free to ask if my question is not clear.

Cheers

Link: http://stackoverflow.com/questions/43708635/indicator-matrix-for-categorical-data-in-glm-jl-with-dataframes-jl


#2

This topic is a test of a new service discussed here. Please do not reply to it; instead, use the link to StackOverflow posted at the end.


#3

Did you ever get an answer to this? I am also having some trouble using categorical variables.


#4

It looks like the poster found their own solution and wrote it up as an answer at the StackOverflow link in the original post.
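
For anyone who lands here without following the link, here is a minimal sketch of one way this can be done with recent package versions. It is not necessarily what the linked answer does. It assumes CategoricalArrays.jl is installed alongside DataFrames.jl and GLM.jl. The toy table `df` is a hypothetical stand-in for Data2000 with only a few of the original columns. The idea is to mark MonthIn as categorical, so that the formula expands it into indicator columns rather than treating it as a single numeric regressor.

```julia
using DataFrames, GLM, CategoricalArrays, Random

Random.seed!(1)                         # reproducible toy data
n = 240

# Hypothetical stand-in for Data2000 (made-up values, subset of columns)
df = DataFrame(
    Y       = rand(0:1, n),             # binary outcome
    Age     = rand(20:60, n),
    MonthIn = repeat(1:12, div(n, 12)), # month number, 3 = March
)

# Mark the month column as categorical so the formula machinery
# builds indicator columns instead of fitting one numeric slope
df.MonthIn = categorical(df.MonthIn)

f = glm(@formula(Y ~ Age + MonthIn), df, Binomial(), LogitLink())

# Intercept, Age, and one indicator per non-baseline month
println(coefnames(f))
```

With an intercept in the model, the first level (month 1 here) serves as the baseline, so the twelve months should show up as eleven indicator coefficients, which matches the "eleven when I want to use a constant" case from the question.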