See possible explanation here.
Thanks, that makes sense!
@aplavin FWIW I often use the date of the last change to assess if a repo is still being maintained, and skipped this one in favor of others because I thought it wasn’t being changed. Maybe I’m an outlier? If not, it might be worth “rounding” by month / quarter rather than year, or adding a note to the Readme of the date of the latest release, or including the release on GitLab.
Maximilian, indeed the linked explanation is correct. I was also thinking about finer rounding, or just assigning all commit dates to “now” - release times are effectively public anyway. Will probably go with this approach…
Anyway, JuliaHub shows the correct release dates on the package page, so maybe I should just link to JuliaHub instead of the repo itself.
Announcing another update that should help with writing and debugging long pipelines:
The `@pDEBUG` macro
Its intended usage is when your pipeline doesn’t work as expected, possibly throwing an error somewhere:
julia> @p begin
1:5
map(_ ^ 2)
filter(_ > 3)
only
end
ERROR: ArgumentError: Collection has multiple elements, must contain exactly 1 element
Replace (temporarily) `@p` with `@pDEBUG`, and it’ll export all intermediate results up to the first error into the `_pipe` variable:
julia> @pDEBUG begin
1:5
map(_ ^ 2)
filter(_ > 3)
only
end
ERROR: ArgumentError: Collection has multiple elements, must contain exactly 1 element
julia> _pipe
3-element Vector{Any}:
1:5
[1, 4, 9, 16, 25]
[4, 9, 16, 25]
`_pipe` is a vector of all intermediate pipe results, in order.
Now you can clearly see why the error in `only` happened.
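To make the idea concrete, here is a hand-written sketch in plain Julia of recording each step the way `_pipe` does. This is not how `@pDEBUG` is implemented; the `steps` and `err` names are made up for illustration.

```julia
# Plain Julia, no DataPipes: keep every intermediate result in a vector.
steps = Any[]
push!(steps, 1:5)
push!(steps, map(x -> x^2, steps[end]))       # [1, 4, 9, 16, 25]
push!(steps, filter(x -> x > 3, steps[end]))  # [4, 9, 16, 25]

# The last step fails because `only` needs exactly one element;
# capture the exception instead of letting it propagate.
err = try
    only(steps[end])
    nothing
catch e
    e
end
```

Looking at `steps[end]` right before the failing call shows the four-element vector that `only` rejects.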
Further, all intermediate pipe variables are also exported:
julia> @pDEBUG begin
1:5
x = map(_ ^ 2)
...
end
julia> x
[1, 4, 9, 16, 25]
Note: `@pDEBUG` exports to the global scope, so the variables are accessible even if the pipe is inside a function.
Sorry for being off-topic, but consider updating your clock
That’s my hello from near future (:
What’s a little `floor` vs `ceil` between friends…
A new version of `DataPipes` is now released: `0.3.0`.
README and docs are significantly improved. Unfortunately, the first post in this thread isn’t editable anymore.
Nothing breaks in the core piping functionality. The only breaking change in the package is the removal of convenience functions that were previously defined in `DataPipes`: `filtermap`, `mutate`, and a few more.
Now, `DataPipes` just implements its piping syntax and does nothing else. There are no dependencies anymore, and the loading time is less than 1 ms. These changes make `DataPipes` itself a no-brainer to include as a dependency, even in very lightweight projects.
Of course, those removed data processing functions are useful in their own right. Over time, more and more such functions accumulated in DataPipes, which didn’t really make sense conceptually. I’m making them more general and performant, and plan to release them as a separate package soonish. If you also use them, feel free to stay on `DataPipes@0.2` for now: basically the only difference between the versions is the removal of those functions in `0.3`.
There are also a couple of minor fixes/improvements in the core piping functionality since the previous announcement here. I haven’t encountered any serious issues for a long time in my pretty heavy usage of `DataPipes`. So, recent changes addressed some remaining corner cases:
- implicit inner pipes (that start with `__`) now work everywhere they make sense, including kwargs
- `@p let ... end` and `@p begin ... end` forms generate corresponding blocks, `let` or `begin`, making variable scoping consistent with plain Julia
- `function (x) ... end` is treated exactly the same as `x -> ...`
- qualified function calls also work without brackets, as in regular pipes: `@p data |> Iterators.flatten`
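As a reminder of the plain-Julia behaviour the last two items line up with, here is a small sketch using only Base, no DataPipes. The variable names are made up for illustration.

```julia
# The two anonymous-function spellings define the same behaviour in plain Julia.
square_long = function (x)
    x^2
end
square_short = x -> x^2

same = map(square_long, 1:3) == map(square_short, 1:3)

# A qualified function used in a regular pipe, without brackets.
flat = [[1, 2], [3]] |> Iterators.flatten |> collect
```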
Let me announce another release of `DataPipes`, `v0.3.5`, already registered in `General`.
The main highlight since the last update is
Pipe broadcast: .|>
Sometimes operations are more natural to write as a `map` call, sometimes as a broadcast. To make this even more convenient, `DataPipes` now supports the broadcasted pipe, `.|>`.
For the regular pipe `|>`, the `__` placeholder gets replaced with the result of the previous step. Likewise, for the `.|>` broadcasted pipe, `__` means a single element of the previous step result.
When this placeholder is not used, `DataPipes` implicitly appends it to the function arguments, for both `|>` and `.|>`.
Some examples:
julia> @p "1, 2, 3, 4" |> eachmatch(r"(\d)") .|> __.captures[1]
4-element Vector{SubString{String}}:
"1"
"2"
"3"
"4"
julia> @p [[1, 2], [3]] .|> __ .+ 1
2-element Vector{Vector{Int64}}:
[2, 3]
[4]
julia> @p 1:10 |> group(_ % 3) .|> map(_ ^ 2)
3-element Dictionaries.Dictionary{Int64, Vector{Int64}}
1 │ [1, 16, 49, 100]
2 │ [4, 25, 64]
0 │ [9, 36, 81]
The last case especially demonstrates that `.|>` is convenient for processing nested datasets in a single line. Alternatives, such as nested maps, are noisier for simple one-liner pipes.
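For comparison, here is roughly what the nested-map alternative looks like in plain Base Julia, without DataPipes or `group`. This is only a sketch: the manual grouping into a `Dict` is a stand-in for `group`, and the variable names are made up.

```julia
# Plain Base Julia; nothing here comes from DataPipes.
nested = [[1, 2], [3]]

# Gives the same result as the `.|> __ .+ 1` example: add 1 inside every inner vector.
plus_one = map(v -> v .+ 1, nested)

# Hand-rolled grouping of 1:10 by remainder mod 3 (stand-in for `group`).
groups = Dict{Int, Vector{Int}}()
for x in 1:10
    push!(get!(groups, x % 3, Int[]), x)
end

# Nested map: the outer generator walks the groups, the inner map squares the elements.
squared = Dict(k => map(x -> x^2, v) for (k, v) in groups)
```

The extra lambdas and the explicit outer iteration are the noise that the one-line `.|>` form avoids.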