The huge difference of two functions for the same goal inside and outside of function

That is because the vectorized (broadcasted) operation introduces a function barrier, which confines the type instability to the “surface” of the code instead of the loop. When you use a broadcasted operation like:

x .= 1.0

it is as if you were doing

function f(x)
    # The loop runs inside a function, so it is compiled for the concrete type of x
    for i in eachindex(x)
        x[i] = one(eltype(x))
    end
    return x
end
f(x)

So the function is implicit in the broadcast. This is roughly the same mechanism that makes vectorized code fast and loops slow in other languages: the vectorized operations call specialized, compiled versions of the operations. The difference is that in Julia you can write those functions in Julia itself. You only need to guarantee that the code can be compiled for the concrete types of the variables involved, which essentially means the code has to sit inside a type-stable block (a function, for instance).
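To make the comparison concrete, here is a minimal sketch of the three variants side by side. The names `fill_ones!` and `data` are made up for illustration, and I am assuming `data` is a non-constant global, which seems to be the situation in the original question.

```julia
# Hypothetical helper: the explicit "function barrier" version of `x .= 1.0`.
function fill_ones!(x)
    for i in eachindex(x)
        x[i] = one(eltype(x))   # compiled for the concrete type of x
    end
    return x
end

data = zeros(1_000)   # non-constant global: top-level code cannot rely on its type

# 1) Loop in global scope: every access to `data` is dynamically dispatched.
for i in eachindex(data)
    data[i] = 1.0
end

# 2) Function barrier: one dynamic dispatch at the call, then a specialized loop.
fill_ones!(data)

# 3) Broadcast: the loop lives inside library functions, so the effect is the same as 2).
data .= 1.0
```

If you want to measure the difference, something like `@time` (or `@btime` from BenchmarkTools.jl) on each variant should show the global-scope loop allocating and running noticeably slower, though the exact numbers will depend on your Julia version and machine.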