How to sample from a logistic regression model

Yes, I guessed that memory layout is why. It’s interesting that nearly all the ML textbooks stick to the convention that the design matrix X is N x D. The only exception I know of is the Gaussian process book by Rasmussen and Williams (IIRC). It actually simplifies some of the equations (as well as the code) if X is D x N, but it will look unfamiliar (to some) to write the OLS estimator as (X X’)^{-1} X y instead of the more familiar (X’X)^{-1} X’ y. I am tempted to switch to the D x N convention for v2 of my own book (“Machine learning: a probabilistic perspective”), as I make the switch from Matlab to Julia :slight_smile:
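To make the two forms concrete, here is a quick NumPy sketch (not from the book; the sizes, seed, and `w_true` are made up for illustration) that checks both conventions give the same OLS estimate. It uses `np.linalg.solve` on the normal equations rather than forming an explicit inverse.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D = 50, 3                            # hypothetical sizes, illustration only
X_nd = rng.normal(size=(N, D))          # textbook layout: one row per example
w_true = np.array([1.0, -2.0, 0.5])     # made-up weights used to generate y
y = X_nd @ w_true + 0.01 * rng.normal(size=N)

# Textbook form (X'X)^{-1} X'y, with X of shape N x D
w_nd = np.linalg.solve(X_nd.T @ X_nd, X_nd.T @ y)

# Transposed form (X X')^{-1} X y, with X of shape D x N (one column per example)
X_dn = X_nd.T
w_dn = np.linalg.solve(X_dn @ X_dn.T, X_dn @ y)

same = np.allclose(w_nd, w_dn)          # both layouts should agree
print(same)
```

Note that NumPy arrays are row-major by default while Julia arrays are column-major, so "one example per column" is the layout that keeps each example contiguous in Julia, whereas "one example per row" does that in NumPy.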
