I’m attempting to implement group convolutions, which apply group transformations to the incoming filters and then apply a convolution.
To do so, I create a set of transformation indices and then build an array in which each of these index sets is applied to the original filters, giving a group-transformed set of filters. The mapping below is where the issue arises when calculating the gradient:
function apply_transformation_group_to_group(m_transformation_set, channels_in, filters, in_group_size, filter_size)
    # Note: this overwrites the `in_group_size` argument with the size taken from the transformation set
    no_filters, in_group_size, transform_group_size = size(m_transformation_set)
    # This will hold the set of filters after the group transformations are applied
    filter_result = Array{Any}(undef, (filter_size..., in_group_size, channels_in, size(filters, 1), transform_group_size))
    for filter in 1:size(filters, 1)
        for group in 1:in_group_size # for each of the group transformations applied to the group filters
            for channel in 1:channels_in
                # permute which transformations are applied to which group filter
                for g_filter in 1:transform_group_size
                    perm = circshift(1:in_group_size, g_filter)
                    # get the set of transformation indices for this filter, group, group action
                    m_transformation_set_for_filter_set = m_transformation_set[filter, group, g_filter][2]
                    # apply the transformation through array indexing (this in-place write is what the gradient rejects)
                    filter_result[:, :, perm[group], channel, filter, g_filter] = filters[filter, group, :, :][m_transformation_set_for_filter_set]
                end
            end
        end
    end
    return filter_result
end
I’m running into the error that mutating arrays is not supported when taking the gradient. I’m a Julia noob, so I’m not sure how to reframe the approach to get around this. All the values in the multi-dimensional array are just re-indexed from the filters stored in P4GroupConv.w, shown below.
struct P4GroupConv{N,M,F,A,V,S,R}
    w::A
    bias::V
    σ::F
    stride::NTuple{N,Int}
    pad::NTuple{M,Int}
    dilation::NTuple{N,Int}
    channels_in::Int
    transformation_set::Array{Array{Any,S},R}
end
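To make the "just re-indexed" point concrete, here is a minimal sketch of the same idea written as a single gather instead of element-wise writes into a preallocated array. It is an illustration under assumptions, not the code above: `rotation_index_table` and `expand_filters` are hypothetical names, the filter is a single square matrix, and the group is assumed to be the four 90° rotations. The index table is built once from integers only, so no array is mutated while the filter bank is produced.

```julia
# Hypothetical helper: build, once, a table of linear indices into an n×n filter.
# Slice k of the table describes the filter rotated by (k-1)*90 degrees.
function rotation_index_table(n::Int)
    base = reshape(collect(1:n*n), n, n)            # base[p] == p (linear index)
    return cat((rotl90(base, k) for k in 0:3)...; dims=3)
end

# Hypothetical helper: one indexing operation, no writes into an existing array.
# Indexing with an integer array returns an array with the shape of the index.
expand_filters(w, idx) = w[idx]

w    = [1.0 2.0; 3.0 4.0]          # stand-in for one stored filter
idx  = rotation_index_table(2)     # 2×2×4 table of indices
bank = expand_filters(w, idx)      # 2×2×4 bank of rotated copies of w
```

The same pattern would extend to the six-dimensional result above by giving the index table the full output shape, provided the table itself is computed outside the code being differentiated.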
Is this kind of re-indexing, with the gradient contributions summed back onto the originally indexed filters, possible in Julia?
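For clarity, this is the gradient behaviour I have in mind, written by hand in plain Julia as a sketch. `gather_pullback` is a hypothetical name, and the small `w` and `idx` are made-up stand-ins. The forward pass is `y[k] = w[idx[k]]`, so the reverse pass should add every `dy[k]` back onto the entry of `w` it was read from, and entries that are read more than once receive the sum of their contributions. The loop below mutates `dw`, which should be fine here because it is the gradient rule itself rather than code being differentiated.

```julia
# Hand-written reverse rule for the gather y = w[idx] (illustrative only).
function gather_pullback(w, idx, dy)
    dw = zeros(eltype(dy), size(w))
    for k in eachindex(idx)
        dw[idx[k]] += dy[k]        # repeated indices accumulate
    end
    return dw
end

w   = [1.0 2.0; 3.0 4.0]           # stand-in for the stored filters
idx = [1 4 1; 2 4 3]               # linear indices into w; 1 and 4 are each used twice
y   = w[idx]                       # forward gather, shape 2×3
dw  = gather_pullback(w, idx, ones(size(y)))
```

With an upstream gradient of all ones, each entry of `dw` is simply the number of times that filter value was read, which is the summing of contributions asked about above.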