Library of string validators and sanitizers

Hi everyone,

I was inspired by validator.js, which I’ve used a lot in JavaScript. I didn’t find anything similar in Julia, so I started developing validator.jl. I think it could be useful, especially as Julia’s adoption grows.
So I’m here to ask what you think and whether this could be useful.
I’ve only just begun development and it will be a long journey. If anyone is interested in contributing, please do!

I think you could use one file with all the functions properly documented instead of one file per function.

Your code is overly restrictive about the types it accepts:

julia> a = "12345"
"12345"

julia> z = @view a[2:3]
"23"

julia> typeof(z)
SubString{String}

julia> function isEmpty(str::String, ignore_whitespace::Bool=false)::Bool 
           return (ignore_whitespace == true ? length(strip(str)) : length(str)) === 0
       end
isEmpty (generic function with 2 methods)

julia> isEmpty(z)
ERROR: MethodError: no method matching isEmpty(::SubString{String})
Closest candidates are:
  isEmpty(::String) at REPL[6]:1
  isEmpty(::String, ::Bool) at REPL[6]:1
Stacktrace:
 [1] top-level scope
   @ REPL[7]:1

julia> function isEmpty2(str, ignore_whitespace=false)
           return (ignore_whitespace == true ? length(strip(str)) : length(str)) === 0
       end
isEmpty2 (generic function with 2 methods)

julia> isEmpty2(z)
false

although perhaps that is too permissive

julia> isEmpty2(2)
false

julia> function isEmpty3(str::AbstractString, ignore_whitespace=false)
           return (ignore_whitespace == true ? length(strip(str)) : length(str)) === 0
       end
isEmpty3 (generic function with 2 methods)

julia> isEmpty3(2)
ERROR: MethodError: no method matching isEmpty3(::Int64)
Closest candidates are:
  isEmpty3(::AbstractString) at REPL[12]:1
  isEmpty3(::AbstractString, ::Any) at REPL[12]:1
Stacktrace:
 [1] top-level scope
   @ REPL[15]:1

https://github.com/iskyd/validator.jl/blob/7fcbcd8a0838a756f3edd163f45159fab93ded35/src/isAscii.jl#L3

don’t use non-const globals, also there’s already:


help?> isascii("123")
  isascii(c::Union{AbstractChar,AbstractString}) -> Bool


  Test whether a character belongs to the ASCII character set, or whether this is true for all elements of a string.

  Examples
  ≡≡≡≡≡≡≡≡≡≡

  julia> isascii('a')
  true
  
  julia> isascii('α')
  false
  
  julia> isascii("abc")
  true
  
  julia> isascii("αβγ")
  false
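
To illustrate the non-const global point, here is a minimal sketch. The names `ASCII_RE` and `is_ascii_regex` are made up for this example and are not taken from the package; in practice the built-in `isascii` makes the regex unnecessary.

```julia
# Hypothetical names, for illustration only.
# Declaring the pattern `const` tells the compiler its type will never change,
# so functions that read it stay type-stable.
const ASCII_RE = r"^[\x00-\x7F]*$"

is_ascii_regex(s::AbstractString) = occursin(ASCII_RE, s)

is_ascii_regex("abc")   # agrees with isascii("abc")
is_ascii_regex("αβγ")   # agrees with isascii("αβγ")
```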

You also need to name your package Validator.jl.

https://github.com/iskyd/validator.jl/blob/7fcbcd8a0838a756f3edd163f45159fab93ded35/src/isAfter.jl#L5

is this also string validation?


In general, for performance, don’t use regex matching to check if a string contains something
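
As a rough sketch of what that can look like, assuming the check is for a fixed substring, a prefix or suffix, or a character class, Base already covers it without a regex. The variable `s` and the sample strings are only for illustration.

```julia
s = "hello world"

occursin("world", s)      # fixed substring search, no regex involved
startswith(s, "hello")    # prefix check
endswith(s, "world")      # suffix check
all(isdigit, "12345")     # every character satisfies a predicate
```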

What’s the advantage of doing that?

Thanks for the suggestion. Using AbstractString is for sure a better idea!

Thanks for pointing out the existing isascii function and the non-const global regex.
This package is inspired by validator.js, so I’m trying to port all of its functionality, isAfter included. I’m not sure why it was added to validator.js, but I don’t think it’s a bad idea to include it. What do you think?

Julia is very much not JavaScript, and the community is strongly opposed to anything near the “single-function package” end of the spectrum. If all the functions are just utilities that can be written idiomatically in Base Julia, I personally don’t think it’s worth making a package.

This is not to say I want to stop you; it’s totally fine. But at the same time, I don’t think people want Julia packages to be like JavaScript in terms of organization and fragmentation of functionality.

https://github.com/iskyd/validator.jl/blob/7fcbcd8a0838a756f3edd163f45159fab93ded35/src/isEmpty.jl#L3

there’s already an isempty() function, and to ignore whitespace you just need isempty(strip(x))

for country code, there’s:
https://github.com/JuliaFinance/Countries.jl but I guess they don’t have BIC check?

https://github.com/iskyd/validator.jl/blob/7fcbcd8a0838a756f3edd163f45159fab93ded35/src/isBIC.jl#L10

you can use contains() for this type of pattern:

julia> contains("apple", r"app.e")
true

I didn’t want to be all negative and “we don’t do that here” but I don’t think there is much of a use case as a package.

It’s the sort of thing I would copy & paste code from if I needed a specific function and knew it was in there.

I have some packages like that myself, e.g. just examples of ccall to crib from

What’s the benefit of copy-pasting if there is a library that does it? I agree there are use cases where you just need one simple function, and not every field Julia is used in (data science, for example) needs something like this. But for web development, things change. As a web dev, I don’t think there is a single project where I didn’t use some validators (and I’m not talking just about email validation).

I don’t mean to say there is no value in it.

although tbh validator.js looks more like a way for people to boost their GitHub score

“how do I validate a German car registration number” is not something that comes up much in my life :)

It could be that it has no value; that’s why I’m trying to understand all your ideas.
validator.js has more than 6 million weekly downloads, so it seems to be used in the JS world (which of course could be totally different from Julia; that’s why I’m asking).

It’s funny to read about license plates, because I recently worked on a computer vision project where plate validation was needed.

BTW thanks for all the suggestions!

Colors has 26m downloads per week

And we know what happened there

but again, go for it! if nothing else you will learn some stuff :)

A few comments on your package:

In general more code is always good. But I’m not sure the scope of the package makes a lot of sense. What exactly is it validating? Every possible string format? Social security numbers? License plates? That is way too broad a scope for a package.

Some minor comments

  • Stylistically, modules are usually CamelCase, and functions are snake_case in Julia (or lower case in one word)
  • There is already an isascii and isempty in Julia Base for strings. Having another, different implementation is confusing.
  • Be careful with string indexing like you do in isBIC.jl: it can throw errors if the input contains non-ASCII characters. Instead, use functions like nextind, or validate that the string is ASCII before indexing. You can read more in the Julia documentation on strings.
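
A small sketch of the indexing pitfall, using a made-up Greek string rather than anything from the package. Julia strings are indexed by byte offset, so an index that lands in the middle of a multi-byte character is invalid.

```julia
s = "αβγ"                 # each of these characters takes 2 bytes in UTF-8

i = firstindex(s)         # 1
j = nextind(s, i)         # index of the second character, which is 3 here, not 2
second_char = s[j]        # 'β'

isvalid(s, 2)             # false: byte 2 is in the middle of 'α', so s[2] would throw
first(s, 2)               # "αβ": take two characters without manual index arithmetic
ncodeunits(s), length(s)  # (6, 3): bytes versus characters
```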

Not Julia, but applicable beyond its domain: https://www.youtube.com/watch?v=PAAkCSZUG1c&t=568s

all(isspace, x) is much faster than isempty(strip(x)).
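
For reference, a quick sketch showing that the two spellings agree on a few sample inputs. `is_blank_strip` and `is_blank_all` are hypothetical names used only here, and the snippet checks equivalence, not speed.

```julia
# Hypothetical helper names, for illustration only.
is_blank_strip(x::AbstractString) = isempty(strip(x))
is_blank_all(x::AbstractString)   = all(isspace, x)

# Both return the same answer for empty, whitespace-only, and non-blank strings.
for x in ["", "   ", "\t\n", " a ", "abc"]
    @assert is_blank_strip(x) == is_blank_all(x)
end
```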

Yes indeed, but that’s beside the point.