Let’s say I have these two files:
Is there an easy way to check that these two files are “the same”, in the sense that they both define identical objects apart from comments, whitespace and docstrings?
I tried a filtered Meta.parseall, but it seemed ugly, so I thought I’d ask here first.
I would write tests that check the behavior is the same, then run that test suite on both files.
Hmm, thanks, but that won’t work: the files are not generated by me, and I can’t make assumptions about what they may define.
What should work is a big regex that removes docstrings, comments and whitespace, followed by a check that the resulting strings match. But I was wondering if there’s something nicer that could be done.
This is the rough solution I have; it seems to do what I want. Feedback to make it better is welcome.
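is_code_equal(s1::AbstractString, s2::AbstractString) =
    is_code_equal(Meta.parseall(s1), Meta.parseall(s2)) # assumption: both strings are parsed with Meta.parseall
is_code_equal(c1, c2) = (c1 == c2)
# heads are compared too, so that e.g. `a && b` and `a || b` are not deemed equal
is_code_equal(e1::Expr, e2::Expr) = e1.head == e2.head && is_code_equal(e1.args, e2.args)
function is_code_equal(a::Vector, b::Vector)
    a, b = trim_args.((a, b))
    length(a) == length(b) || return false
    for (ai, bi) in zip(a, b)
        is_code_equal(ai, bi) || return false
    end
    return true
end
# expand (and remove) docstring blocks, remove linenumbernode
function trim_args(a::Vector)
    r = []
    for e in a
        if e isa Expr && e.head == :macrocall
            append!(r, e.args)
        else
            push!(r, e)
        end
    end
    return filter(
        x -> !(typeof(x) in (LineNumberNode, GlobalRef, String)),
        r)
end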
s1 = """
s2 = """
s3 = """
s4 = """
is_code_equal(s1, s2) # true
is_code_equal(s1, s3) # false (type signature difference)
is_code_equal(s1, s4) # true (docstrings are ignored)
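One thing to watch with filtering on type: dropping every String also drops ordinary string literals, so two files that differ only in a literal would compare as equal. A possible stricter variant, as a rough sketch only, is to normalise each parsed expression and then compare with ==. The names strip_trivia, is_docstring and same_code below are made up for illustration, and the sketch assumes the parser represents a docstring as a macrocall on Core.@doc whose last argument is the documented expression.

```julia
# Sketch only: `is_docstring`, `strip_trivia` and `same_code` are hypothetical names.

# True for the `Core.@doc` macrocall that the parser emits for a docstring.
function is_docstring(ex::Expr)
    ex.head === :macrocall || return false
    m = ex.args[1]
    return m isa GlobalRef && m.mod === Core && m.name === Symbol("@doc")
end

strip_trivia(x) = x  # symbols, literals (including strings) are kept as they are
function strip_trivia(ex::Expr)
    # a docstring wraps the documented expression, which is its last argument
    is_docstring(ex) && return strip_trivia(ex.args[end])
    # drop line-number nodes, recurse into everything else
    args = Any[strip_trivia(a) for a in ex.args if !(a isa LineNumberNode)]
    return Expr(ex.head, args...)
end

# `==` on Expr compares heads and arguments recursively.
same_code(s1::AbstractString, s2::AbstractString) =
    strip_trivia(Meta.parseall(s1)) == strip_trivia(Meta.parseall(s2))

# Made-up sample inputs: plain, with comment + docstring + spacing, with a type annotation.
plain      = "f(x) = x + 1\n"
documented = "# a comment\n\"doc for f\"\nf(x)   =   x + 1\n"
annotated  = "f(x::Int) = x + 1\n"

println(same_code(plain, documented))
println(same_code(plain, annotated))
```

Because only line-number nodes and the docstring wrapper are removed, anything else that differs (heads, literals, macro names) makes the comparison fail, which errs on the conservative side.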
If I tried to do something like this, I would start by checking whether the CSTParser package could be of help.
Could you evaluate the files into different modules and then check the setdiff of the lists of names in the respective module namespaces, plus maybe check fieldnames(typeof()) of those defined objects (or something similar, depending on your needs)?
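For what it's worth, the module idea could look roughly like the sketch below. The helper names defined_names and name_diff are hypothetical, and symdiff is used instead of setdiff so that names present in either file are reported. Two caveats: this actually runs the code, and it only catches added or removed names, so a changed function body would go unnoticed unless further checks are added.

```julia
# Sketch only: `defined_names` and `name_diff` are hypothetical helpers.
# Unlike a parse-only comparison, this evaluates the code.

function defined_names(code::AbstractString)
    m = Module()                  # fresh anonymous module
    include_string(m, code)       # evaluate the code inside it
    # skip compiler-generated names such as Symbol("#f")
    return Set(n for n in names(m; all=true) if !startswith(string(n), "#"))
end

# Names defined by one piece of code but not by the other.
name_diff(code1, code2) = symdiff(defined_names(code1), defined_names(code2))

# Made-up sample inputs: the second one defines an extra function `g`.
old_code = "f(x) = 1\nstruct A end\n"
new_code = "f(x) = 2\nstruct A end\ng() = 3\n"
println(name_diff(old_code, new_code))
```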
I’d like something very lightweight, which can be conservative (it’s fine if it returns false on code that would actually have the same effect, but not the other way around).
The context for why I’m bothering with this is that in Franklin, when the server is running, modifying a specific file (utils.jl) triggers a full build, which may be slow. If the change in utils.jl is frivolous (e.g. a docstring, whitespace or comment change), it’s better to avoid that full rebuild.
So I’d like something that can fairly quickly (i.e. not much more than the time it takes to read the file) assess whether the change might be significant or not.
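To make the intended use concrete, here is a minimal sketch of how such a check could gate the rebuild. RebuildGate and needs_rebuild! are hypothetical names and not part of Franklin; the comparison is passed in as a function, and naive_same is only a stand-in for the demo (it compares the raw parse results, so it tolerates comments and spacing on a line but is thrown off by any change in line numbers).

```julia
# Sketch only: `RebuildGate`, `needs_rebuild!` and `naive_same` are hypothetical
# and not part of Franklin.

mutable struct RebuildGate
    last_source::String   # contents of the file at the last check
end

# `same_code(old, new)` is any conservative comparison that returns true only
# when the two sources are known to define the same thing.
function needs_rebuild!(gate::RebuildGate, new_source::AbstractString, same_code)
    new_source == gate.last_source && return false  # byte-identical, nothing to do
    significant = !same_code(gate.last_source, new_source)
    gate.last_source = new_source
    return significant
end

# Stand-in comparison for the demo: equality of the raw parse results.
naive_same(a, b) = Meta.parseall(a) == Meta.parseall(b)

gate = RebuildGate("f(x) = 1")
println(needs_rebuild!(gate, "f(x)   =   1  # comment", naive_same))
println(needs_rebuild!(gate, "f(x) = 2", naive_same))
```

The cost is one string comparison plus, at worst, two parses of the file, which should stay in the same ballpark as reading it.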