Constructing Markov Transition Frequency Matrix with DataFrames

samerb · February 9, 2021, 3:01am

I am cross-posting this question from StackOverflow since that seems a bit slow.

I want to use Julia DataFrames to construct a 3x3 Markov transition matrix i.e. a frequency matrix that tells me the likelihood of transitioning from each of 3 states to the others. I am trying to learn data frames and I would like to learn the best way to do this. This is more for general learning than about this particular example.

Here’s some code I tried so far with some example data but I am not really familiar enough with how to think about dataframes to know how to proceed.

Any suggestions? Thank you.


using DataFrames

state = [2,2,3,1,1,3,3,2,1,1,3,1,2,3,2,3,1,2,3,3,1]
statelag = [1,2,2,3,1,1,3,3,2,1,1,3,1,2,3,2,3,1,2,3,3]
df = DataFrame(state=state, statelag=statelag)

markov = combine(groupby(df, [:statelag, :state]), nrow => :cat_countmar)
sort!(markov, [:statelag, :state]) # this gives the number of occurrences of each transition

total = combine(groupby(df, :statelag), nrow => :cat_count)
# this gives the number of occurrences of each (lagged) state


trans = Array{Float64}(undef, (3,3))
# trans should give probability of transitioning between different states

I basically need to “divide” cat_countmar by cat_count, so that I’m dividing the number of occurrences of a transition from state i to state j by the number of occurrences of state i. This will give the desired transition frequency. But I don’t see how to put markov and total together in one data frame and easily carry out this computation.
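One possible way to combine the two tables is a join on statelag, followed by an elementwise division. The sketch below assumes DataFrames.jl 1.x and reuses the column names from the code above; the names joined and prob are made up for illustration, and this is only one of several reasonable approaches.

```julia
using DataFrames

state    = [2,2,3,1,1,3,3,2,1,1,3,1,2,3,2,3,1,2,3,3,1]
statelag = [1,2,2,3,1,1,3,3,2,1,1,3,1,2,3,2,3,1,2,3,3]
df = DataFrame(state=state, statelag=statelag)

# Counts per (from, to) pair and per "from" state, as in the post.
markov = combine(groupby(df, [:statelag, :state]), nrow => :cat_countmar)
total  = combine(groupby(df, :statelag), nrow => :cat_count)

# Attach the row total to every transition count, then divide.
joined = innerjoin(markov, total, on = :statelag)
joined.prob = joined.cat_countmar ./ joined.cat_count

# Fill the 3x3 matrix; pairs that never occur stay at zero.
trans = zeros(3, 3)
for row in eachrow(joined)
    trans[row.statelag, row.state] = row.prob
end
```

Each row of trans should then sum to 1, with trans[i, j] holding the estimated probability of moving from state i to state j.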

PharmCat · November 7, 2023, 1:36am

Hi all!

Is there any package to build a Markov transition matrix from data?

gdalle · November 7, 2023, 6:16pm

What does your data look like?

PharmCat · November 7, 2023, 6:31pm

Something like this:

Subj,Period,Group
1, V1, G1
1, V2, G3
2, V1, G2
2, V2, G3
2, V3, G4
3, V1, G3
3, V2, G1
3, V3, G1
4, V1, G1
4, V2, G1
4, V3, G3
4, V4, G2

gdalle · November 7, 2023, 6:33pm

So each subject independently follows a trajectory over the groups, and you want to learn the Markov chain governing it?

PharmCat · November 7, 2023, 6:38pm

Yes. First I want to know the overall probabilities of changing group. After that, it would be good to know whether those probabilities are period-dependent, or whether the previous state influences the next transition.

gdalle · November 7, 2023, 6:48pm

Well for the simplest problem, if you forget about the dataframes and just encode your trajectories as vectors of integers, here’s what it could look like:

function estimate_transitions(trajectories::Vector{Vector{Int}}, N)
    A = zeros(N, N)
    # count transitions
    for traj in trajectories
        for t in 1:length(traj)-1
            A[traj[t], traj[t+1]] += 1.0
        end
    end
    # normalize rows
    @views for i in 1:N
        A[i, :] ./= sum(A[i, :])
    end
    return A
end

Demo with your data:

julia> trajectories = [
           [1, 3],
           [2, 3, 4],
           [3, 1, 1],
           [1, 1, 3, 2]
       ];

julia> estimate_transitions(trajectories, 4)
4×4 Matrix{Float64}:
   0.5         0.0         0.5    0.0
   0.0         0.0         1.0    0.0
   0.333333    0.333333    0.0    0.333333
 NaN         NaN         NaN    NaN
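The last row is NaN because no trajectory ever leaves state 4, so that row is 0/0 after normalization. To get from the long Subj/Period/Group table to integer trajectories, something like the following sketch might work. It assumes DataFrames.jl 1.x, that sorting the Period labels as strings gives the right order (true for V1 to V9, not for V10 and beyond), and that leaving unobserved rows at zero is acceptable; the names labels and index are made up for illustration.

```julia
using DataFrames

# Sample data copied from the table posted above.
df = DataFrame(
    Subj   = [1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4],
    Period = ["V1", "V2", "V1", "V2", "V3", "V1", "V2", "V3", "V1", "V2", "V3", "V4"],
    Group  = ["G1", "G3", "G2", "G3", "G4", "G3", "G1", "G1", "G1", "G1", "G3", "G2"],
)

# Map group labels to integer states 1..N in sorted label order.
labels = sort(unique(df.Group))
index = Dict(g => i for (i, g) in enumerate(labels))

# One integer trajectory per subject, ordered by period.
trajectories = [
    [index[g] for g in sort(sub, :Period).Group]
    for sub in groupby(df, :Subj)
]

# Count transitions, then normalize only the rows that were observed,
# so states with no outgoing transitions stay at zero instead of NaN.
N = length(labels)
A = zeros(N, N)
for traj in trajectories, t in 1:length(traj)-1
    A[traj[t], traj[t+1]] += 1
end
for i in 1:N
    s = sum(A[i, :])
    s > 0 && (A[i, :] ./= s)
end
```

Whether an all-zero row is preferable to NaN depends on what is done with the matrix afterwards; NaN at least makes it obvious that the row was never estimated.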

PharmCat · November 7, 2023, 9:47pm

Thank you very much, I will try this code.
