Checking if two values are identical

I am attempting to check if two values are identical and then do stuff if they are not. My check involves:

stock[id,3] != flow[v,3]

Unfortunately, I can get the following

Any[0]

and

0

and Julia interprets that as being different.

What beginner’s trap have I stumbled into?

julia> typeof(Any[0])
Array{Any,1}

julia> typeof(0)
Int64

These can’t be equal. Only the single element of your array can be equal to an Int64.
There is no good solution if you insist on comparing two different types.

Clearly

julia> Any[0][1]==0
true

julia> Any[0]==[0]
true

but this is trivial and hardly suitable for anything that calls itself a program.
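To make the point concrete, here is a minimal sketch of how `==` and `!=` behave when one side is a one-element array and the other a plain integer. The variable names are made up for illustration, and `only` assumes Julia 1.4 or later.

```julia
a = Any[0]   # a one-element Vector{Any}
b = 0        # a plain Int

# A container and a scalar never compare equal with ==
container_vs_scalar = (a == b)

# ... so != reports a difference even though the contents match
reported_different = (a != b)

# Pull the single element out first, then compare;
# only() throws unless the array holds exactly one element
element_matches = (only(a) == b)

# Or compare elementwise with broadcasting and reduce the result
all_match = all(a .== b)
```

Here `container_vs_scalar` is `false` and the other three are `true`, which is the behaviour described in the question.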

What are you trying to do in general? (I expect this to be related to the other two questions about stock and flow) :grinning:

The script that is causing problems is shown below. I start by getting information from an API and create a “flow” array and a “stock” array that is equal to flow. Then I gather more information from the API (I use a loop, with i running from 1 to updates, and call sleep(seconds) to query the API at set intervals). Armed with the new information, I create a new flow and look for differences between this new flow and stock, updating stock where the two differ. Rows are matched through the first column of the arrays, which holds an identifier for a particular sports match. If the new flow contains an entirely new line not contained in stock (i.e., a new match is included in the new flow but not in stock), I add that line to stock.

That is the basic idea. But in checking to see differences, I get the trouble stated above. So, I wonder why the type of numbers in stock and flow can be different at all…

for i = 1:updates
    # ...
    # The API is called. Information is gathered and placed in arrays.
    # ...

    # An aggregated array (flow) is assembled after calling the API: 
    flow = [marketId price1]
    # If this is the first iteration of the update loop, flow is duplicated and named "stock":
    if i == 1
       global stock = flow
    end
    # After i > 1, stock is compared with the new flow to look for differences and update stock:
    for v = 1:length(flow[:,1])
        if isempty(findall(x->x==flow[v,1], stock[:,1])) == false
           id = findall(x->x==flow[v,1], stock[:,1])
           if stock[id,2] != flow[v,2]
              stock[id,2] .= flow[v,2]
           end
        end
        if isempty(findall(x->x==flow[v,1], stock[:,1])) == true
           stock = [permutedims(flow[v,:]);stock]
        end
    end

sleep(60)

end

I mean, you are checking if two different things (an array and an integer) are equal and Julia tells you no. There isn’t really much more to say… You probably have some error in your code, so you end up comparing the wrong things.

Can you see if my code (in principle) should work if the types are the same?

It’s hard to understand what the code does. It would be best to reduce your code to the smallest possible example and then ask the question about that code.

My guess is that you are mixing up the dimensions of stock and flow.
First you do

stock = flow

Later you do

stock = [permutedims(flow[v,:]);stock]

This looks fishy to me.

You should create some example data with the same dimensions, and similar entries, as the data you receive from the API, i.e. an example for flow with more than one row (or column, it’s unclear what you receive).

You can post this example here or you may try again implementing your algorithm with the example.

To me it looks like you should be using a dictionary, not an array. It actually looks like you made a very inefficient implementation of Dict.

Something like this maybe

stock = Dict{String, Float64}() # or some appropriate types
for i in axes(flow, 1)
    key, value = flow[i, 1], flow[i, 2] 
    stock[key] = value
end

?
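Fleshing the Dict idea out a little, here is a rough sketch that also reports which matches were new or changed. The sample data, the `String`/`Float64` types and the helper name `update!` are all assumptions made for illustration; the real API data may look different.

```julia
# Hypothetical sample data: column 1 is the match identifier, column 2 the price
flow_old = Any["m1" 1.5; "m2" 2.0]
flow_new = Any["m1" 1.5; "m2" 2.4; "m3" 3.1]

stock = Dict{String, Float64}()

# update! is a hypothetical helper; it returns the ids that were added or changed
function update!(stock, flow)
    changed = String[]
    for i in axes(flow, 1)
        key, value = flow[i, 1], flow[i, 2]
        # get returns `nothing` for a missing key, so new matches count as changes
        if get(stock, key, nothing) != value
            stock[key] = value
            push!(changed, key)
        end
    end
    return changed
end

first_pass = update!(stock, flow_old)    # everything is new on the first call
second_pass = update!(stock, flow_new)   # only the changed and the new match
```

With this layout there is no findall at all: the lookup by identifier is what the Dict does for you, and the comparison is always scalar against scalar.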

This is a very peculiar way of writing the loop, which makes it hard to see what’s going on, but it seems like you are basically trying to replace values in stock with values in flow if they are different now but were the same before?
Some comments:

for v = 1:length(flow[:,1])

Here it would be idiomatic to use size(flow, 1), or axes(flow, 1), since you are looping over the rows.

if isempty(findall(x->x==flow[v,1], stock[:,1])) == false
    ...
end
if isempty(findall(x->x==flow[v,1], stock[:,1])) == true
    ...
end

Here you are unnecessarily performing the findall check twice, when it can only be either true or false. You could just use if ... else ... end to write:

if !isempty(findall(x->x==flow[v,1], stock[:,1]))
    ...
else
    ...
end

id = findall(x->x==flow[v,1], stock[:,1])

Here you are performing the check for a third time, which tells me you should have just saved the result the first time around.

I would probably write something like this (note this hasn’t been tested of course!)

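for v = 1:size(flow, 1)   # loop over the rows of flow

    # Boolean mask of the rows in stock with the same identifier
    matches = flow[v, 1] .== stock[:, 1]

    if any(matches)
        stock[matches, 2] .= flow[v, 2]
    else
        # No match: prepend the new row to stock
        stock = [permutedims(flow[v,:]);stock]
    end
end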

Again, don’t use arrays for this.

I think I have found the main problem (there were others, as the advice in the thread revealed)

id = findall(x->x==flow[v,1], stock[:,1])

does not produce an integer, but an array.

Hence, when I compare

if stock[id,3] != flow[v,3]

I get an array in the first case, but an integer in the second, since v is an integer.

You could use findfirst if you wanted to just grab the first match. That’ll return an integer.

findall returns an array because there might be more than one match.
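A small sketch of the difference, using made-up identifier and price columns. Note that findfirst returns `nothing` when there is no match, so that case needs its own check.

```julia
ids = ["m1", "m2", "m3"]    # hypothetical identifier column
prices = [1.5, 2.0, 3.1]    # hypothetical price column

all_hits = findall(x -> x == "m2", ids)      # Vector{Int}: [2]
first_hit = findfirst(x -> x == "m2", ids)   # Int: 2
no_hit = findfirst(x -> x == "zz", ids)      # nothing, since there is no match

# Indexing with a vector of indices gives back a vector ...
by_vector = prices[all_hits]     # [2.0]
# ... while indexing with an integer gives back the element itself
by_integer = prices[first_hit]   # 2.0

vector_equals_scalar = (by_vector == 2.0)     # false: array vs number
integer_equals_scalar = (by_integer == 2.0)   # true: number vs number
```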

You can use findfirst instead of findall.

But can you say why you have to use an array instead of a dictionary? I’m starting to feel invisible here.

Sorry, DNF. It is my background in MATLAB that makes it easier for me to keep things looking like MATLAB. Not a great excuse, I know…

I am primarily a Matlab user too. Dictionaries are like Matlab’s containers.Map, if that helps.

I must admit, I never used containers.Map before. But I will certainly examine Julia’s Dict to become familiar with it. Thank you for your advice.

Using a global variable inside a for loop reduces performance due to type instability.
You should put your code into a function and define stock inside the function, not global.
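As a rough sketch of what that could look like if you keep the matrix layout: the function names `merge_flow` and `run_updates` and the sample data are hypothetical, and the API call and sleep are left out, so treat this as an outline and not a drop-in replacement.

```julia
# merge_flow is a hypothetical helper: it updates matching rows of stock in place,
# prepends rows whose identifier is new, and returns the resulting matrix
function merge_flow(stock, flow)
    for v in axes(flow, 1)
        id = findfirst(x -> x == flow[v, 1], stock[:, 1])
        if id === nothing
            # Identifier not in stock yet: add the whole row
            stock = [permutedims(flow[v, :]); stock]
        elseif stock[id, 2] != flow[v, 2]
            # Scalar vs scalar comparison, because id is an Int here
            stock[id, 2] = flow[v, 2]
        end
    end
    return stock
end

# run_updates is a hypothetical driver; stock is a local variable, not a global
function run_updates(flows)
    stock = copy(first(flows))   # copy so the first flow is not modified
    for flow in flows
        stock = merge_flow(stock, flow)
    end
    return stock
end

# Hypothetical sample data standing in for two successive API calls
flows = [Any["m1" 1.5; "m2" 2.0], Any["m2" 2.4; "m3" 3.1]]
final_stock = run_updates(flows)
```

For this sample data `final_stock` has three rows: the new match "m3" first, then "m1" unchanged and "m2" with its updated price.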
