# Counting Exchanges To Order Array with Repeats

I am asking this question to avoid reinventing the wheel (also because I am lazy). I have two arrays of integers, each in normal order with no repeats within the array. I then generate a new array by concatenating the two. This array can have out-of-order elements and can have repeated elements. For example

[1,3,4][2,3,5] => [1,3,4,2,3,5]

How many exchanges to get [1,2,3,3,4,5]?

Do you have any suggestions or links before I start using brute force?

Here is a related problem. Consider the following rules for exchanging elements of an array [a_1,…,a_i,a_(i+1),…,a_n]:

If a_i == a_(i+1) then

[a_1,…,a_i,a_(i+1),…,a_n] => g(a_i,a_i)[a_1,…,a_(i-1),a_(i+2),…,a_n]

where the g(a_i,a_i) is a coefficient associated with the reduced array.
If a_i != a_(i+1) then

[a_1,…,a_i,a_(i+1),…,a_n] => 2g(a_i,a_(i+1)) [a_1,…,a_(i-1),a_(i+2),…,a_n]
- [a_1,…,a_(i+1),a_i,…,a_n]

Again, the 2g(a_i,a_(i+1)) and -1 are coefficients associated with the arrays (think of each array as a basis vector). The objective is to take the array [a_1,…,a_i,a_(i+1),…,a_n] and reduce it to a linear combination of coefficients and arrays, where each array is in normal order with no repeated elements.
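The two rules can be sketched as a recursive reduction. This is only an illustration, not a tested implementation: the function name `reduce_term!` is made up, and `g` is assumed to be supplied by the caller as a two-argument function. Each term is reduced until it is in normal order with no repeats, and its coefficient is accumulated in a Dict keyed by the array.

```julia
# Sketch: reduce c * [a_1,…,a_n] to a linear combination of
# normal-ordered, repeat-free arrays, accumulating coefficients in `acc`.
# `g(i, j)` is assumed to be a caller-supplied metric function.
function reduce_term!(acc::Dict{Vector{Int},Float64}, v::Vector{Int}, c::Float64, g)
    for i in 1:length(v)-1
        if v[i] == v[i+1]
            # rule 1: contract the equal pair, picking up g(a_i, a_i)
            return reduce_term!(acc, [v[1:i-1]; v[i+2:end]], c * g(v[i], v[i]), g)
        elseif v[i] > v[i+1]
            # rule 2: split into the contracted term and the swapped term
            reduce_term!(acc, [v[1:i-1]; v[i+2:end]], 2c * g(v[i], v[i+1]), g)
            w = copy(v)
            w[i], w[i+1] = w[i+1], w[i]
            return reduce_term!(acc, w, -c, g)
        end
    end
    acc[v] = get(acc, v, 0.0) + c   # already in normal order, no repeats
    return acc
end
```

With the Euclidean metric g(i,j) = δ_ij, reducing [2,1] yields -1·[1,2], and reducing [1,1] yields the scalar 1 (the empty array).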

If you have two sorted arrays and you want to merge them into a single sorted array, you shouldn't concatenate and then sort. You should do a merge operation, which takes only linear time.

There is a package MergeSorted.jl that gives you a function `mergesorted(a,b)` that does this.

(Or are you trying to pose some theoretical combinatorics question? That would be a bit offtopic for this forum.)


Thank you. MergeSorted does exactly what I want, except I also need to know how many exchanges were required to sort the merge. I will ask the author of MergeSorted if there is a way of extracting that information.

Merging two sorted lists does not require any "exchanges", and it doesn't require any separate "sort" operation distinct from the merge.

You can just copy directly into the destination array from the source arrays, without any permutation, by simply copying the smaller of the two elements from the current position in each array. See the explanation under Merging two lists on Wikipedia, for example.

What are you trying to accomplish? Why do you need the number of "exchanges"?

I am calculating the product of two bases of a geometric (Clifford) algebra. The bases are geometric products of basis vectors, say e_{i}. For an orthogonal basis, e_{i}e_{j} = -e_{j}e_{i}, and if i = j then e_{i}e_{i} = g_{i,i}, a scalar (g_{i,j} is the metric tensor). A base is of the form e_{i_1}…e_{i_r}, where the i_j's are all distinct and in normal order. The product of two bases is e_{i_1}…e_{i_r}e_{j_1}…e_{j_s}. This needs to be put in normal order, and each time basis vectors are exchanged the sign changes. If consecutive basis vectors are identical, say e_{i_r}e_{j_1}, then i_r = j_1 and e_{i_1}…e_{i_r}e_{j_1}…e_{j_s} = g_{i_r,i_r} e_{i_1}…e_{i_(r-1)}e_{j_2}…e_{j_s}, reducing the number of basis vectors by two. If the basis vectors are not orthogonal, you need to use the relationship e_{i}e_{j} = 2g_{i,j} - e_{j}e_{i} to normal-order terms, and things get more complicated. In geometric algebra you can add scalars and bases just like the real and imaginary parts of complex numbers. In fact, if e_{i}e_{i} = 1 (Euclidean metric), then (e_{i}e_{j})^2 = -1 for all i != j.
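For the orthogonal case, that reduction amounts to a bubble sort with a sign flip per swap and a contraction per equal adjacent pair. A minimal sketch (the name `normal_order` is hypothetical, and the metric is assumed to be passed as a vector of diagonal entries g_{i,i}):

```julia
# Sketch, assuming an orthogonal metric given as a vector of g_{i,i}.
# Normal-orders the product e_{v[1]}…e_{v[n]}:
# returns (coefficient, sorted index list with repeated pairs contracted).
function normal_order(v::Vector{Int}, g::Vector{<:Number})
    v = copy(v)
    coeff = one(eltype(g))
    i = 1
    while i < length(v)
        if v[i] == v[i+1]            # e_i e_i = g_{i,i}, a scalar
            coeff *= g[v[i]]
            deleteat!(v, i:i+1)
            i = max(i - 1, 1)
        elseif v[i] > v[i+1]         # e_i e_j = -e_j e_i for i != j
            v[i], v[i+1] = v[i+1], v[i]
            coeff = -coeff
            i = max(i - 1, 1)
        else
            i += 1
        end
    end
    return coeff, v
end
```

For example, with the Euclidean metric, e1 e3 e4 e2 e3 e5 takes three exchanges to reach e1 e2 e3 e3 e4 e5, and the e3 e3 pair contracts to 1, giving -e1 e2 e4 e5.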

The number of exchanges can be tracked easily while doing the merge, since you keep track of your indices anyway. When you copy an element to its final destination, you can easily compute the number of swaps it needed.
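To make that concrete, here is a minimal sketch (the name `mergecount` is made up; it is not part of MergeSorted.jl). Whenever an element of `b` is copied before the remaining elements of `a`, it must jump over all of them, so that count is added to the running tally:

```julia
# Sketch: merge two sorted arrays while counting the adjacent
# transpositions needed to sort their concatenation [a; b].
function mergecount(a::Vector{T}, b::Vector{T}) where T
    out = similar(a, length(a) + length(b))
    i = j = k = 1
    swaps = 0
    while i <= length(a) && j <= length(b)
        if a[i] <= b[j]                      # ties taken from `a`: no swap
            out[k] = a[i]; i += 1
        else
            # b[j] jumps over the length(a) - i + 1 remaining elements of a
            swaps += length(a) - i + 1
            out[k] = b[j]; j += 1
        end
        k += 1
    end
    while i <= length(a); out[k] = a[i]; i += 1; k += 1; end
    while j <= length(b); out[k] = b[j]; j += 1; k += 1; end
    return out, swaps
end
```

For the example in the original question, [1,3,4] and [2,3,5] merge to [1,2,3,3,4,5] with 3 exchanges (the 2 jumps over 3 and 4, and b's 3 jumps over the 4).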

Do you only care about Euclidean / orthonormal bases?

Are you in the situation where `N` is quite fixed and smallish (e.g. <=64, <= 256 or <= 512)?

Do you care about performance or is clean and flexible code more important?

You should think about using bit patterns if that is the case, i.e. represent each basis as e.g. `NTuple{NN, VecElement{UInt64}}`, where the i-th bit is set if and only if the basis contains `e_{i}`. This kind of code tends to be very fast, allocation-free and SIMD-friendly; e.g. up to the parity sign, a product can be done with a simple `xor`, which is 1 cycle on AVX2 for NN = 4 / N = 256. The requisite type instability in NN (i.e. the size of your universe) can be put behind a function barrier.

If you want to try that, we can together figure out a fast way of getting the parity of a product.
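One standard way to get the parity with a bitmask representation is sketched below (assumptions: Euclidean metric, `UInt8` blades with bit i set iff e_{i+1} is present, and made-up function names). Each set bit of `a` above position 0 must jump over every lower-position set bit of `b`, which `count_ones` tallies one shift at a time:

```julia
# Sketch: sign from the number of transpositions needed to
# interleave two blades stored as bitmasks (Euclidean metric assumed).
function reorder_sign(a::UInt8, b::UInt8)
    a >>= 1
    s = 0
    while a != 0
        s += count_ones(a & b)   # swaps contributed at this shift distance
        a >>= 1
    end
    return isodd(s) ? -1 : 1
end

# Geometric product of two Euclidean blades: shared factors cancel
# to scalars, so the result blade is just the xor of the bitmasks.
blade_product(a::UInt8, b::UInt8) = (reorder_sign(a, b), xor(a, b))
```

For example, e2 e1 (masks 0b10 and 0b01) gives sign -1 and result mask 0b11, i.e. -e1 e2, while e1 e1 gives +1 and the empty mask, i.e. the scalar 1.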

Which N are you talking about? If there are N basis vectors, there are 2^N bases. For example, if the basis vectors are ex, ey, and ez, the bases are 1, ex, ey, ez, exey, eyez, ezex, and exeyez. The number of basis vectors will usually be 8 or less. Usually the basis vectors will be orthogonal, so you want that case to be fast. The other case is special, and I can do it with brute force in separate functions. Also, the orthogonal case can be calculated on the fly and the results saved in a table for future use. In the nonorthogonal case you need to calculate the entire multiplication table at once, because the two different types of products (the geometric product and the ^ outer product) require it.

N is the number of basis vectors, such that 1 \le i \le N for each e_i that appears in your normalized bases e_{i_1}\ldots e_{i_k} with 1 \le i_1 < i_2 < \ldots < i_k \le N.

So, if your N=8, then you only need 8 bits, i.e. a UInt8, to refer to a basis (there are 2^8 = 256 bases). Instead of computing anything at all, you could just precompute a multiplication table with 2^16 entries; each is only one bit wide, so 8 kB, which fits well into L1.

What I'm saying is that you should consider using this dense representation as `UInt8` instead of e.g. `Vector{Int}`, and then use bit-fiddling and/or multiplication tables.

If you go up to larger N, e.g. N=32 to N=64, then multiplication tables become very unwieldy, and you want bit-fiddling. For N=128 to N=2048, you probably want SIMD-ified bit-fiddling, and above that you probably should consider a vector/sparse representation (I'm not entirely sure where the cutoff lies).

It is possible to use bit-fiddling plus a lookup table, such that it has size N*2^N instead of 2^(2N), which makes N=16 reasonable for lookup tables (…fits in L2). It's probably possible to shrink that to fit into L1, using two lookups.

The general goal here would be to obtain a mostly branch/loop-free algorithm.

For now I am going to do it the stupid way and see how long it takes, for the following reason: the coefficients of the basis vectors and bases are symbolic, and for the values of N I have, the scalar symbolic algebra on the coefficients probably takes much longer than the multiplication of the bases (I will check that out when the code gets that far along). For my use, what would probably make the biggest difference in speed would be using multiple CPUs to distribute the scalar algebraic calculations, especially the simplification of the coefficients. Your idea would be perfect for purely numerical calculations.
