Looks very useful. Is there anything similar in Julia?
I am puzzled why this was packaged up. It basically amounts to
where the real content is in is_valid_data, which one would define anyway on a per-dataset basis.
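A minimal sketch of the kind of one-liner I mean. The body of is_valid_data here is purely illustrative (the specific checks are my assumptions, not anything taken from the package):

```julia
# Hypothetical per-dataset predicate: the checks below are illustrative
# assumptions; a real project would encode its own criteria here.
is_valid_data(data) = !isempty(data) && all(isfinite, data) && issorted(data)

data = [1.0, 2.5, 4.0]

# The entire "validation step" in the pipeline is this single line.
@assert is_valid_data(data)
```

All of the project-specific knowledge lives in the predicate; the call site stays the same whatever the criteria are.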
The example given was when developing a data science pipeline. Things might change and this acts as a test every time a new pipeline is run. It just looked a useful feature to me.
I am not questioning the usefulness of validating data, just pointing out that
- the actual validation criteria are usually dependent on the dataset, and thus hard to generalize in a package,
- but it can be wrapped in a routine for a particular project, and called in a single line.
I don’t see what a package would add or what it could look like. Most of the building blocks are already defined in Julia, e.g. issorted, or can be trivially implemented, e.g.
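To illustrate what "trivially implemented" could look like, here is a rough sketch. The helper names in_range and no_missing are hypothetical ones I made up for this post; issorted, allunique, ismissing, all and any are from Base:

```julia
# Illustrative helpers (hypothetical names, not from any package).
# True when every element lies in the closed interval [lo, hi].
in_range(xs, lo, hi) = all(x -> lo <= x <= hi, xs)

# True when no element is `missing`.
no_missing(xs) = !any(ismissing, xs)

ages = [23, 35, 41]

# Built-in building blocks combined with the helpers above.
@assert issorted(ages)
@assert allunique(ages)
@assert in_range(ages, 0, 120)
@assert no_missing(ages)
```

Each check is a line or two, so a per-project validation routine is mostly a matter of listing the ones that apply to the dataset at hand.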