I would like to fit a model to predict a future outcome from a given time (t=1 in the attached plot) using covariates X.

However, many values of X are missing at that time, so I have thought about also using past measurements of X.

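I have designed the following model, which uses X regardless of whether it was measured at t=0 or t=1 (T is an indicator equal to 1 if X was measured at t=1 and 0 if it was measured at t=0) and uses a trick to avoid multicollinearity: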

*Y ~ a + b·(1-T)·X + c·T·X + d·T*
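To make the coding concrete, here is a minimal sketch in Python of the design matrix this formula implies. The data are simulated, the coefficient values are made up, and ordinary least squares is used only as a stand-in, since the final model type is not decided:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical data: T = 1 if X was measured at t=1, T = 0 if at t=0.
T = rng.integers(0, 2, size=n)
X = rng.normal(size=n)  # the single available X value per person

# Simulated outcome with made-up coefficients a=1.0, b=0.5, c=2.0, d=0.3.
Y = 1.0 + 0.5 * (1 - T) * X + 2.0 * T * X + 0.3 * T + rng.normal(scale=0.1, size=n)

# Columns: intercept (a), (1-T)*X (b), T*X (c), T (d).
design = np.column_stack([np.ones(n), (1 - T) * X, T * X, T])

# For each person exactly one of (1-T)*X and T*X is non-zero,
# so the two X columns are orthogonal by construction.
overlap = np.dot(design[:, 1], design[:, 2])

coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
a, b, c, d = coef
```

Here `b` is the slope of X for people measured at t=0 and `c` the slope for people measured at t=1, while `d` shifts the intercept between the two groups.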

Or, if X_{T=0} and X_{T=1} are never both observed for the same person, just:

*Y ~ a + b·X _{T=0} + c·T·X_{T=1} + d·T*
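As a small check of how the two ways of writing the model relate, the sketch below starts from a hypothetical wide layout (one column per measurement time, `NaN` where X was not measured) and builds both sets of columns. It assumes that a missing measurement is coded as 0 in its column, which is an assumption of this sketch and not something stated above:

```python
import numpy as np

# Hypothetical wide-format data: one row per person, NaN = not measured.
X0 = np.array([0.4, np.nan, -1.2, np.nan, 0.7])
X1 = np.array([np.nan, 1.5, np.nan, -0.3, np.nan])

# T = 1 when the t=1 measurement is the one available.
T = (~np.isnan(X1)).astype(float)

# Assumption: a missing measurement contributes 0 to its column.
X0_filled = np.nan_to_num(X0, nan=0.0)
X1_filled = np.nan_to_num(X1, nan=0.0)

# Columns of the formula with separate variables: intercept, X_{T=0}, T*X_{T=1}, T.
design_wide = np.column_stack([np.ones(len(T)), X0_filled, T * X1_filled, T])

# Columns of the formula with a single X: intercept, (1-T)*X, T*X, T.
X = np.where(T == 1, X1_filled, X0_filled)
design_long = np.column_stack([np.ones(len(T)), (1 - T) * X, T * X, T])

same = np.array_equal(design_wide, design_long)
```

Under that zero-coding assumption, and as long as nobody has both measurements, the two design matrices coincide, so the two formulas describe the same fit.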

For each person I measure *Y* once, but X may have been measured at two different times. This model lets me use the value of X at t=0 or at t=1, but not both simultaneously.

(We also need to decide whether to use logistic regression, a survival model… but this is not important for now.)

This leads to my question.

Can this be considered “repeated measures”, so that I need to use random effects, such as…?

*Y ~ (1-T)·X + T·X + T + (1|ID)*

Or is this not applicable here because there is just one Y measurement per person? Does “repeated measures” refer only to repeated Y, or also to repeated X?

How would you write the model so that it also accepts people with measurements of X at both t=0 and t=1, while still avoiding multicollinearity?

Or would you rewrite the model in a completely different way, maybe with latent variables, so that it can handle different measurement periods and missing values?

Another alternative would be to assign different weights depending on how old the data are. However, different covariates for the same patient may have been measured at different periods, so a single patient can mix old and recent values.