On the behavior of reversed iterators

That’s Iterators.reverse, I was talking about Base.reverse.

1 Like

But Base.reverse works by default as it collects the iterator? Thats also what Seq.rev does in F#

I’m not sure if it’s because I haven’t updated as much as possible, but this is what happens when I attempt it:

julia> Base.reverse(zip(1:10, 1:2))
ERROR: MethodError: no method matching reverse(::Base.Iterators.Zip{Tuple{UnitRange{Int64}, UnitRange{Int64}}})

Ahh youre right its not you have to call collect - but it doesn’t really change my point on F#, its just doing that automatically for you.

It is lazy in the sense that the statement Seq.rev mySeq will not be evaluated until you actually start consuming elements from the reversed sequence.

Imagine the following:

let rand = System.Random()
let someSeq = 
    seq { 
            for i in 1 .. rand.Next(5, 10) do 
                if i % 2 = 0 
                then yield i + 1
                else
                    yield i + 1; 
                    yield i + 2 }

someSeq |> Seq.rev |> Seq.take 1

At least to me, it seems that the nature of sequences demands this kind of consumption.

Also, although an F# antipattern, you could generate a sequence using a mutable object (which does not support lazy evaluation) - so the only way to ensure correctness would be to actually evaluate all seq elements before even being able to know the last one (this is kinda similar with the actual snippet I posted).

However, I am really curious about Haskell implementation (because a similar approach can be used in F# to ensure the same result).

Are those snippets posted somewhere?

Iterators.reverse in julia is actually for iterators that can be reversed lazily, like a range or vector - they are not collected, they are just iterated in reverse.

Also, although an F# antipattern, you could generate a sequence using a mutable object (which does not support lazy evaluation) - so the only way to ensure correctness would be to actually evaluate all seq elements before even being able to know the last one.

This sounds similar to whats happening in the juila bug in question.

The disagreement is precisely on what is lazily reversible according to Iterators.reverse. Some types like ranges just do eager reversals because it’s cheap, so we don’t have to worry about those. The fallback Reverse iteration looks sensible, it uses getindex over a reverseible eachindex, which could be a problem if getindex is impure, e.g. depends on how many times it was called. The following methods iterate backwards from the ends, which can be different from forwards iteration:

reverse(G::Generator) = Generator(G.f, reverse(G.iter))
reverse(z::Zip) = Zip(Base.map(reverse, z.is)) # n.b. we assume all iterators are the same length
reverse(f::Filter) = Filter(f.flt, reverse(f.itr))
reverse(it::Cycle) = Cycle(reverse(it.xs))
reverse(p::ProductIterator) = ProductIterator(Base.map(reverse, p.iterators))
reverse(f::Flatten) = Flatten(reverse(itr) for itr in reverse(f.it))

Besides the comment assuming zip’s iterators are all the same length, it seems like a deliberate deviation from the eager versions. Taking the RollingMax(0) example from the issue, the eager filter and reverse do not commute, but the lazy versions were defined so. All the possible combinations here:

julia> Iterators.filter(RollingMax(0), Iterators.reverse([1,3,2,5,4])) |> collect
2-element Vector{Int64}:
 4
 5

julia> Iterators.filter(RollingMax(0), reverse([1,3,2,5,4])) |> collect
2-element Vector{Int64}:
 4
 5

julia> filter(RollingMax(0), reverse([1,3,2,5,4]))
2-element Vector{Int64}:
 4
 5

julia> Iterators.reverse(Iterators.filter(RollingMax(0), [1,3,2,5,4])) |> collect
2-element Vector{Int64}:
 4
 5

julia> Iterators.reverse(filter(RollingMax(0), [1,3,2,5,4])) |> collect
3-element Vector{Int64}:
 5
 3
 1

julia> reverse(filter(RollingMax(0), [1,3,2,5,4]))
3-element Vector{Int64}:
 5
 3
 1
1 Like

Yes, but it is worth mentioning that F# does not allow this kind of bug: Seq.rev is lazy in the sense that it only consumes the complete input sequence when you attempt to consume elements from the reversed sequence (otherwise digesting is not taking place).

My understanding is this is motivated by the versatile ways in which you can create Seqs in F#.

Also, I think we are kinda comparing apples to oranges here: F# Seq associated functions are not merely iterators over collections (Seq.rev included). This opens a whole new chapter that is not really relevant to the discussed subject.

A reversed lazy iterator will produce the same issue for F# as the one we observe in Julia: however, if one is aware that the impure filter function is dependent on the evaluation order (which is by default an F# antipattern), the implementation would be done using continuation passing style (if we are to be loyal to functional programming approach) - and the bug would be averted.

It seems to me that the bug obtained in Julia is only possible in F# (and likely Haskell) if one is actually trying to produce the bug in the first place. Otherwise, being aware of what you want to achieve (and especially the impure function), the correctness issue would never occur.

1 Like

For the record, here are the code snippets I had used for my table:

C++23

#include <ranges>
#include <iostream>
#include <vector>

int main() {
  auto even = [](int i) { return 0 == i % 2; };
  int m = 0;
  auto rollmax = [&m](int i) {
    if (i > m) {
      m = i;
      return true;
    } else {
      return false;
    }
  };
  std::vector<int> v = {1,3,2,5,4};
  auto u = std::views::iota(1, 10);

  for (auto e : std::views::zip(u, v)
	 | std::views::reverse)
    std::cout << '(' << std::get<0>(e) << ',' << std::get<1>(e) << ')' << ' ';
  std::cout << '\n';

  for (float x : std::views::filter(v, even)
	 | std::views::reverse)
    std::cout << x << ' ';
  std::cout << '\n';

  // The following segfaults on g++ -std=c++2b
  for (float x : std::views::filter(v, rollmax)
	 | std::views::reverse)
    std::cout << x << ' ';
  std::cout << '\n';  
}

Clojure

(def v [1 3 2 5 4])
(def u (range 1 10))

(def zip (partial map vector))
  
(println (reverse (zip u v)))
(println (reverse (filter even? v)))

(defn rollmax [s]
  (let [m (atom s)]
    (fn [i]
      (if (> i @m)
        (do (reset! m i)
            true)
        false))))

(println (reverse (filter (rollmax 0) v)))

Haskell

import Control.Monad as CM
import Control.Monad.State as CMS

v = [1,3,2,5,4]
u = [1..9]

rollmax i =
  CMS.get >>= \m ->
  if i > m
  then CMS.put i >>
       return True
  else return False

main = do
  putStrLn $ show $ reverse $ zip u v
  putStrLn $ show $ reverse $ filter even v
  putStrLn $ show $ runState (fmap reverse $ CM.filterM rollmax v) 0

Yes, Haskell essentially forces you to write impure functions in continuation-passing style – with the do-notation providing a bit of syntactic sugar. The type system also tracks where this is needed and usually you have to adapt the program when including impure parts, e.g., use filterM (which is designed to sequentially chain side-effects) instead of filter (which requires pure predicates).
Due to this, I don’t think it would actually be possible to create that bug in Haskell – even if trying to.

3 Likes

I’m not sure what consume means here but if that means the reversed sequence collects the input sequence and iterates backwards, then the element allocation makes it incomparable to the reversal of iterators here, where only explicit collects can allocate all the elements.

Just checked that also Clojure’s reverse is not lazy, i.e., collect all elements.
On the other hand, a lazy reverse would only be of interest if you don’t want to realize any elements. As soon as you request a single element from the reverse iterator, all elements of the original need to be traversed and potentially stored somewhere – using iterator state boils down to building something like a linked list and will probably be less efficient than simply collecting the elements right away.
Am I missing anything here?

1 Like

From the source code of Julia 1.9.1 is appears that certain algebraic laws are expected to hold for reversed iterators:

reverse(f::Filter) = Filter(f.flt, reverse(f.itr))

I.e., this assumes the law (in Haskell notation): reverse . filter f == filter f . reverse which does hold for pure functions, but breaks otherwise.

reverse(z::Zip) = Zip(Base.map(reverse, z.is)) # n.b. we assume all iterators are the same length

I.e. assumes that reverse . uncurry zip == uncurry zip . fmap reverse which only holds if all argument iterators have the same length. This should not be implicitly assumed – especially if not documented except in this comment – but rather checked, i.e., using broadcast instead of map which requires same length arguments (formally interpreting scalars as infinite sequences which is the correct behavior according to the applicative instance of ZipList in Haskell).

1 Like

There is another hidden assumption: that the iterator itself is not stateful & actually reversible. In the general case, you can’t get around collecting the entire iterator, no matter whether the filtering function itself is free of sideeffects or not.

Unfortunately, this kind of thing happens often (though I don’t think other languages are better at this either):

1 Like

It is correct to say that in F# collect takes place, and the reverse function is actually returning elements after the filter function was called in the order dictated by the original order of elements.

let sq = 
    seq { for i in 1 .. 10 do 
                    printfn "i: %A" i
                    yield i }

let revsq = sq |> Seq.rev

// no evaluation/allocation above this point

// but this will trigger the complete allocation 
// of `sq` before applying `Seq.rev` and returning the first
// element of reversed sequence
revsq |> Seq.take 1
1 Like

So it looks like most other languages dont let you lazily reverse filtered iterators.

  • Haskel does it properly because monads intrinsically separate mutating and pure functions
  • C++ lets you and segfaults for mutating functions
  • julia works for the pure function case and gives the wrong answer for mutating functions
  • Edit: and Rust does what Julia does, allowing the same bug

@Sukera its actually a thing to rely on zip not being in order… which would break this use case anyway Reported length of zip of stateful iterators wrong · Issue #47790 · JuliaLang/julia · GitHub

2 Likes

All developers writing impure filter functions deserve to be punished: Julia is ensuring the punishment fits the crime :slight_smile:

5 Likes

wait, what?

it should iterate through its constituent iterators in order until one of them halts, right?

But you can iterate the last iterator before the first one… it doesn’t say anywhere that they happen in argument order.

1 Like

ahh, ok. that makes sense given that argument order evaluation isn’t guaranteed in any context afaik