A confusing problem with the pipe operator

This is my code:

split("hello-world julia", [' ', '-']) .|> 
    x -> filter(isletter, x) .|>
    first .|> 
    uppercase |> 
    join 

I want to get the result "HWJ", but instead I get:

3-element Vector{String}:
 "H"
 "W"
 "J"

When I use the Pipe package, the code works fine and the result is "HWJ":

using Pipe
@pipe split("hello-world julia", [' ', '-']) .|> 
    filter(isletter, _) .|>
    first .|> 
    uppercase |> 
    join 

And the following code always throws an error, no matter how many parentheses I add:

@pipe split(phrase, [' ', '-']) .|> 
    filter(isletter, _) |> 
    filter(x -> length(x) > 0, _) .|>
    first .|> 
    uppercase |> 
    join 

I’m very confused.

julia> split("hello-world julia", [' ', '-']) .|>
           (x -> filter(isletter, x)) .|>
           first .|>
           uppercase |>
           join
"HWJ"

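The anonymous function arrow (->) has lower precedence than the pipe operators, so without parentheses everything to the right of -> becomes part of the function body.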
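
To make the precedence point concrete, here is a minimal sketch in plain Base Julia (no packages). The first expression spells out how the unparenthesized version groups, the second is the parenthesized fix. The last part is only an illustration of the "drop empty pieces" variant written in the same style; `phrase` is a made-up input, and this is not a diagnosis of the `@pipe` error.

```julia
words = split("hello-world julia", [' ', '-'])

# Without parentheses, the lambda body swallows the rest of the pipeline,
# so the original expression groups like this:
unparenthesized = words .|> (x -> (filter(isletter, x) .|> first .|> uppercase |> join))

# With parentheses, the lambda ends where intended and join sees the whole vector.
parenthesized = words .|> (x -> filter(isletter, x)) .|> first .|> uppercase |> join

# Variant that also drops empty pieces (hypothetical input with doubled separators).
phrase = "hello--world  julia"
acronym = split(phrase, [' ', '-']) .|>
    (x -> filter(isletter, x)) |>
    (v -> filter(!isempty, v)) .|>
    first .|>
    uppercase |>
    join

println(unparenthesized)  # a 3-element vector of one-letter strings
println(parenthesized)    # HWJ
println(acronym)          # HWJ
```

Note the plain `|>` (not `.|>`) in front of the second filter: that step has to see the whole vector, not each element.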

I think the Chain.jl syntax is cleaner:

using Chain: @chain

@chain "hello-world julia" begin
    split([' ', '-'])
    filter.(isletter, _)
    first.()
    uppercase.()
    join
end

Thanks for your help!

There was recently an announcement of a new piping package, DataPipes.jl, by @aplavin.

I'm wondering whether the code below is a proper way to write the requested task:

using DataPipes 
@p begin
    "hello-world julia" |>
    split(_, [' ', '-'])
    filter.(isletter,↑)
    first.(↑)
    uppercase.(↑)
    join
end

The result seems correct, at least: "HWJ"

Nice. How do you type the \uparrow?

I am completely confused by this example

image

No idea what’s going on.

I think it’s the same as map(x -> (a=x, b=x^2, c=1:x), [1,2,3,4]), which returns a vector of named tuples,

so you can skip the x ->. I don’t find this very intuitive. After skimming through the documentation, it doesn’t seem to explain that.
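
For reference, the plain Base form mentioned above can be checked directly in the REPL. This sketch only uses `map` from Base and says nothing about how DataPipes.jl rewrites the shorter form; the variable name `rows` is just for illustration.

```julia
# Explicit-lambda version: one named tuple per input element.
rows = map(x -> (a = x, b = x^2, c = 1:x), [1, 2, 3, 4])

println(length(rows))  # 4
println(rows[3])       # (a = 3, b = 9, c = 1:3)
println(rows[3].c)     # 1:3
```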

The following snippet would be slightly more in the spirit of DataPipes.jl:

@p begin
    split("hello-world julia", [' ', '-'])
    filter.(isletter,↑)
    map(first)
    map(uppercase)
    join
end

I haven’t really considered broadcasting as pipeline steps yet, so unlike regular function calls they don’t enjoy automatic substitution of the previous step results.

Just type \uparrow<tab>, or \upar<tab><tab>.

This is the whole point of DataPipes.jl: to get rid of code overhead when using common data processing functions. Such functions typically take a function as their first argument (or arguments, if multiple), and the dataset as the last argument: think map/filter/sort/... in Base or group/join/... in SplitApplyCombine. So, map(_ + 1) translates to map(x -> x + 1, <previous step result>) and so on.
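
As a rough illustration of that translation, here is the task from this thread written out by hand in plain Base, one step per line, with the previous result passed as the last argument. This is a hand-written sketch of the idea, not the actual macro expansion; the `step1` ... `step4` names are made up.

```julia
step1 = split("hello-world julia", [' ', '-'])
step2 = filter.(isletter, step1)   # broadcast: previous result passed explicitly
step3 = map(first, step2)          # what a map(first) step would stand for
step4 = map(uppercase, step3)      # what a map(uppercase) step would stand for
result = join(step4)

# The map(_ + 1) example from the text, written with an explicit lambda:
incremented = map(x -> x + 1, [1, 2, 3])

println(result)       # HWJ
println(incremented)  # [2, 3, 4]
```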

I agree that the documentation is far from perfect; I just found it easiest to put in examples showcasing the main features instead of an extended description.
