A Tree? Multi-dimensional Array? hmm

theboiwholived · November 9, 2022, 7:50am

Hey all, could really use your expertise in determining if what i need here is a tree… and how i would go about doing that

At the moment my code basically determines out the difference of all trades between currencies.
c.Ai represents the name of a currency, example USD or EUR, represented numerically.

Lets say i have a 10 currencies
[1,2,3,4,5,6,7,8,9,10]

The code below takes all of my exchanges and computes the grand total difference.

    for (Δ, Λ, c) in zip(r.Δs, r.Λs, r.exchanges)
        difference[c.Ai] += Λ - Δ
    end

an example result:
[ 100, 0, 4, 50, 70000, 450, 20, 1, 42, .01 ]

coinciding to the final output of currency 1 - 10

What i need to do is instead of grab just total difference, id like to be create a path of what was exchanged. Tracking more or less what my loop is doing
ex

basically a tree i think? Essentially say from currency 10, user converted to 5 then 8 then 9. Id also like to like highlight in the event a user took a currency and split it into more than one currency, ex:

[1,2......10]
 3 7       5
 6 1     8  4 
 2 4     9  3

which is makes me think a tree is valid here. Is there a smart clear cut way to accomplish this in Julia without introducing a ton of code debt weighing me down? Similar to how i have the for loop that just rips through the zip for totals?

Can trees store multi values? Like in addition to the path, also the difference at each step?

extra question out of curiosity:
was doing some research on for loops with multi threads for enhanced speed, or GPU based Arrays with CUDA - should i look into multi thread here?

SteffenPL · November 9, 2022, 8:37am

How large is your dataset, is it like millions of entries or just a few. Depending on that different approaches might be good.

Naively, I would create for every user a 10 by N matrix which represents after every step hoe much in each currency the user owns. E.g. in percentage or total amount.

Doing a tree seems more complicated for following computations.

theboiwholived · November 10, 2022, 1:27am

Hey Steffen, thanks for the reply !
Less about users and more on the currencies exchanged. So lets just say for one user.

Its really about < 5000 for currencies in the array c.Ai, ex [1,2,3,4,5…4560] - I was hoping i could somehow wrap it in to the :

difference[c.Ai] += Λ - Δ

for one spread or DF or tree or something. To then iterate upon if need be -

my concern would be in instances where they take a number of currency X and split it into two different currencies , i would lose out on that vector of accuracy for what really happened to find the total difference.

“Naively” How would you go about it? I must admit any all help would be huge at this point. Quite stuck.

Thanks!

SteffenPL · November 10, 2022, 7:17am

First of all, there might be packages for your task, for example:

For conversion between currencies: GitHub - JuliaPackageMirrors/Currencies.jl: Julia package mirror.
It could also be nice to use DataFrames!

That having said, this is how I would do it with plain Julia. I use sparse matrices to represent the transitions in each step. That allows you to have full control, you can even convert into multiple currencies at the same time… To get the final amount one has to multiply all transition matrices with the initial vector. (Of course, I assume there is no fee between transitions, but you could also extend it to that case)

using LinearAlgebra, SparseArrays

# initial currencies 
n_currencies = 500
n_steps = 2

transactions = [ sparse(1.0I(n_currencies)) for i in 1:n_steps ]

rates = sparse(1.0I(n_currencies))

# I don't know, some random data, say [usd, jpy and yuan]
rates[1:3,1:3] .= [1.0 1/146.0 1/7.25; 146.0 1.0 1/0.05; 7.25 0.05 1.0]

# sent in first step money from currency #1 to currency #2  
transactions[1][1:2, 1] .= [0.0, 1.0] .* rates[1:2,1]

# sent in second step money from currency #2: 30% to currency #1 and 50$ to #3, 20% we keep
transactions[2][1:3, 2] .= [0.3, 0.2, 0.5] .* rates[1:3,2]

init_x = zeros(n_currencies)
init_x[1:3] .= [100.0, 0.0, 0.0]

x = init_x[:]

for transaction in transactions
    x .= transaction * x
end

x[1:3]  # -> [29.999999999999996, 2920.0, 365.0]

theboiwholived · November 10, 2022, 10:45pm

@SteffenPL , legend! This is awesome! massively helpful.

I think I’m in line with you across the board outside of the rates portion.

And particular to your example but just want to make sure i entirely understand the low down of the knowledge your providing.

What are we emulating here from the rates changing in rates[1:2,1] to rates[1:3,2] ? Just want to understand the example if there was a reason you switched from just doing rates[1:2,2] for transaction 2. Is this just to demonstrate a possible change in rate data from like day 1 or 2 ?

Separate to that How then would you suggest i track the path of the swapped currencies held with the transactions sparsearrays?

Is there a way we could combine transaction 1 and 2 for a total path of what was exchanged?

julia> transactions[1][1:2, 1]
2-element SparseVector{Float64, Int64} with 2 stored entries:
  [1]  =  0.0
  [2]  =  146.0

julia> transactions[2][1:3, 2]
3-element SparseVector{Float64, Int64} with 3 stored entries:
  [1]  =  0.00205479
  [2]  =  0.2
  [3]  =  0.025

I imagine there’s a cleaner way than a for loop to push or append the numbers at position 1 in transaction[x ] to a new array for a net profit/loss path? Ideally it would be this for all 500. To somehow from all transactions print just the column of change for a position, ex 1 for usd?

[1]  =  0.0                   (tx 1)
[2]  =  0.00205479      (tx 2)

Lastly, i need to track the change of what currencies were converted per tx in the path rather than just overall net change.

So when you wrote “sent in first step money from currency #1 to currency #2” - how could we take this example to track that exact path, so at the end we could print the path from say USD or currency 1 just like how we track net change per step in transactions[ x ]?
ex:

[1] = 1 
[2] = 2
[3] = 10 
etc..

Thanks again Steffen, learning worlds from this already.

SteffenPL · November 11, 2022, 6:21am

Oh, I think there is a unlucky coincidence in the example that the i th transaction happens in the i th column of the matrices…

With the approach of the example, in each step the transaction matrix transactions[1] can represent any kind of transfer from any currency to any other currency (all at once). If I write Ai = transactions[i] then the entry Ai[n,m] represents the amount of transfer (in percentage) from currency m to currency n.
(But there could be another transfer in the other direction at the same time, which might balance out.)

These Ai matrices could be seen as adjacency matrices of the graph representing the flow of money in the i th step.

You can get the transactions between the initial and the final state via

total_transaction = prod( transactions )   # EDIT: is should be prod(reverse(transactions))

here, the indices total_transaction[i,j] again indicate the transfer from currency j to currency i.

In the given example

prod(transactions)[1:3,1:3]                 # EDIT: it should be reverse(transactions)...
3×3 SparseMatrixCSC{Float64, Int64} with 6 stored entries:
   0.0  0.0     ⋅ 
 146.0  0.5     ⋅ 
    ⋅   0.025  1.0

and if you want to know how an initial amount of money propagates, you can use total_transactions * init_x and similar, the j th column gives you where the j th currency went too.

(I know, it’s a bit the opposite way of indices compared to what one would expect. Otherwise you have to transpose the matrix or use row-vectors from the left side, instead of the matrix-vector produce.)

EDIT: by the way, you can also use normal matrices instead of sparse ones. I think sparse onces are only needed for very large problems.
Or, if you know that in each step only one currency is converted, you can just just (idx = 2, targets = [0.3,0.2,0.5]). Many possibilities.

theboiwholived · November 12, 2022, 12:47am

visibly makes all the sense, amazing!!! Is there a good way without reiterating on the list the final path it traversed? Or maybe i need to iterate again on total_transaction which is okay, but curious how to capture in code the final path list of steps in a more human readable format
ex: total_transaction[1:3,1:3] = or prod(transactions)[1:3,1:3] =

prod(transactions)[1:3,1:3]
3×3 SparseMatrixCSC{Float64, Int64} with 6 stored entries:
   0.0  0.0     ⋅ 
 146.0  0.5     ⋅ 
    ⋅   0.025  1.0

prod(transactions) translates to a transaction path of USD to step 1 to step 2, being

totalpath =

[1] = JPY  
[2] = USD, JPY,  YUAN

or as we defined earlier as numbers, currency 1 2 and 3

[1] = 2 
[2] = 1, 2 ,3

or perhaps

[[ 1] , [1,2,3] ]

I’d like to be able to capture this list for a more human readable output to keep as record, and then also be able to iterate on everything else like you so graciously taught me.

Lastly just a general question on the result here, where is the 0.5 coming from in column 2 row 2 for prod(transactions)[1:3,1:3] ? Should that number be 0.2 ? Or am I misunderstanding that value and where its coming from ? And the # above that in column 2 row 1 be 0.00205479 ?

julia> transactions[2][1:3, 1:3]
3×3 SparseMatrixCSC{Float64, Int64} with 5 stored entries:
 1.0  0.00205479   ⋅
  ⋅   0.2          ⋅
  ⋅   0.025       1.0

julia> transactions[1][1:3, 1:3]
3×3 SparseMatrixCSC{Float64, Int64} with 4 stored entries:
   0.0   ⋅    ⋅
 146.0  1.0   ⋅
    ⋅    ⋅   1.0

julia> prod(transactions)[1:3,1:3]
3×3 SparseMatrixCSC{Float64, Int64} with 6 stored entries:
   0.0  0.0     ⋅
 146.0  0.5     ⋅
    ⋅   0.025  1.0

You have no idea how grateful i am of the help here . Thanks so much again @SteffenPL !!! Once i understand that human readable output part i think we can consider this absolutely more than solved <3

SteffenPL · November 12, 2022, 1:52pm

Oh! First about the 0.2. That was a very stupid mistake in my code… You see that we compute T x to apply the transactions T to a vector x. To combine transactions T_1,\dots,T_n we need to compute T_n \cdots T_1 x, but wrongly I computed prod(transactions) which is T_1 \cdots T_n, i.e. in the wrong order.

The corrected code is

total_transaction = prod(reverse(transactions))

which then yields

julia> total_transaction[1:3, 1:3]
3×3 SparseMatrixCSC{Float64, Int64} with 7 stored entries:
  0.3   0.00205479   ⋅ 
 29.2   0.2          ⋅ 
  3.65  0.025       1.0

Sorry again for the mistake!

About the output, here is a quick code which gives your output. For sure, it could be done better but I guess it doesn’t matter too much:

using Printf

function print_transaction(T, rates, names = string.(1:size(T,1)))
    A = Matrix(T) 

    for col in 1:size(A, 2)
        targets = A[:, col]

        if any(targets .!= 0)
            printstyled("Transactions from [$(names[col])]:\n", color = :green)
            for row in 1:size(A,1)
                v = targets[row]
                p = targets[row] / rates[row, col] * 100
                if p != 0.0
                    @printf "  %.2f → %s       " p names[row] 
                    @printf "\t(1 %s → %.4f %s)\n" names[col] v names[row]
                end
            end
        end
    end
end

print_transaction(total_transaction, rates, ["USD", "JPY", "YUAN"])

gives

Transactions from [USD]:
  30.00 → USD           (1 USD → 0.3000 USD)
  20.00 → JPY           (1 USD → 29.2000 JPY)
  50.34 → YUAN          (1 USD → 3.6500 YUAN)
Transactions from [JPY]:
  30.00 → USD           (1 JPY → 0.0021 USD)
  20.00 → JPY           (1 JPY → 0.2000 JPY)
  50.00 → YUAN          (1 JPY → 0.0250 YUAN)
Transactions from [YUAN]:
  100.00 → YUAN         (1 YUAN → 1.0000 YUAN)

Note that the 50.34 percent are because the rates I got from google do not match exactly, i.e. you gain money when you convert a few times back and forth. With correct numbers that problem shouldn’t be there.

theboiwholived · November 13, 2022, 9:22am

Absolute legend! Cheers @SteffenPL

beyond grateful for the help and time you’ve spent here teaching me. Really your guidance couldn’t be any more appreciated.

Absolutely solved. Did just want to validate column one with the new answer

julia> total_transaction[1:3, 1:3]
3×3 SparseMatrixCSC{Float64, Int64} with 7 stored entries:
  0.3   0.00205479   ⋅ 
 29.2   0.2          ⋅ 
  3.65  0.025       1.0

Should column one reflect similarly transaction[1] as column two the answers from the initial transaction[2] ?

julia> transactions[1][1:3, 1:3]
3×3 SparseMatrixCSC{Float64, Int64} with 4 stored entries:
   0.0   ⋅    ⋅
 146.0  1.0   ⋅
    ⋅    ⋅   1.0

Curious how total_transactions column 2 is the same as transaction[2] but not transaction[1] with column 1

for ex total_transaction column 2 is the same as transaction 2

transactions[2][1:3, 2]
3-element SparseVector{Float64, Int64} with 3 stored entries:
  [1]  =  0.00205479
  [2]  =  0.2
  [3]  =  0.025

I could be groggy overthinking the initial math, but regardless. Thanks again, wishing you and yours the merriest of holidays.

SteffenPL · November 13, 2022, 11:02am

Thanks, I’m very happy if replies are useful.

If some columns of transaction matrices occur in the final one depends on the transactions before and afterwards.
For example, the second transaction is the first time JPY is converted into something else and there is no transaction afterwards which further converts the initial JPY into something else. That’s why the column in the total transaction matrix is the same. But in general the total transaction matrix can be completely different than the individual ones.

In the example, the first column of total transactions is actually a multiple of the second column. Since we convert all USD into JPY and then we spread JPY in the second step. Since 1USD is 146JPY, you see that the first column of total transactions if 146 times the second [anyway, hope it does lead to more confusion.]

Topic		Replies	Views
How to best parallelize custom decision tree / forest? Machine Learning question	0	260	May 4, 2023
Why I love Julia Community announcement	34	4820	October 11, 2019
Help with binary trees benchmark games example General Usage benchmarks	53	3672	May 6, 2021
Adaptation of Julia for production and execution in trading Finance and Economics	7	3186	August 19, 2021
New blog post on arrays Community images , indexing , discretization	9	2076	April 24, 2017

A Tree? Multi-dimensional Array? hmm

Related topics