I also tried some other design methods, but I always saw a shift in phase. My question is, how can MATLAB guarantee no phase shift with lowpass, and how can I reproduce that using DSP.jl?

I found that there is another function filtfilt. If I use filtfilt instead of filt, it seems to have zero phase distortion. Then when should I use filt if it always creates a shift? In the options of FIRWindows, there is often a zerophase optional argument that confuses me.

Hello
The phase shift is completely normal: have a look at the frequency response of your Butterworth filter at the frequency of your signal.
When you use filtfilt, the filter is applied twice, first in the usual forward direction and then in reverse, which cancels the phase. Note that in practical systems (such as real-time filtering of a time series) you cannot use filtfilt.

The filt function applies a causal filter, i.e. one that cannot look into the future (a prerequisite for real-time processing). Therefore it can only react to its input with some delay. However, there is a simple trick: applying such a filter twice to a signal that is already completely available in memory, once in the forward direction and once in the backward direction, causes these delays to cancel each other out. The filtfilt function does exactly that for you. But keep in mind that it does apply the filter twice, so the magnitude response is squared: this effectively doubles the filter order (makes the transition steeper) and doubles the attenuation in dB at every frequency, including the cutoff. The result is a non-causal filter (output samples will be affected by “future” input samples).
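To make the forward-backward trick concrete, here is a minimal sketch in plain Python (not DSP.jl or MATLAB code). The one-pole lowpass, the coefficient a = 0.8 and the helper names are assumptions chosen for illustration. Library implementations of filtfilt typically also extend the signal at both ends to reduce edge transients, which this sketch leaves out.

```python
import math

def one_pole_lowpass(x, a=0.8):
    """Causal first-order IIR lowpass: y[n] = (1 - a) * x[n] + a * y[n - 1]."""
    y, prev = [], 0.0
    for sample in x:
        prev = (1.0 - a) * sample + a * prev
        y.append(prev)
    return y

def forward_backward(x, a=0.8):
    """Filter forward, then filter the time-reversed result and reverse it back."""
    forward = one_pole_lowpass(x, a)
    backward = one_pole_lowpass(forward[::-1], a)
    return backward[::-1]

def gain_and_phase(y, freq, start, stop):
    """Amplitude and phase (radians) of y relative to sin(2*pi*freq*n).

    Correlates y with sin and cos over [start, stop); the window should
    cover a whole number of periods and avoid the edge transients.
    """
    w = 2.0 * math.pi * freq
    s = sum(y[n] * math.sin(w * n) for n in range(start, stop))
    c = sum(y[n] * math.cos(w * n) for n in range(start, stop))
    m = stop - start
    return 2.0 * math.hypot(s, c) / m, math.atan2(c, s)

freq = 0.05  # cycles per sample, i.e. a period of 20 samples
x = [math.sin(2.0 * math.pi * freq * n) for n in range(400)]

causal_gain, causal_phase = gain_and_phase(one_pole_lowpass(x), freq, 100, 300)
fb_gain, fb_phase = gain_and_phase(forward_backward(x), freq, 100, 300)

print("causal:           gain", causal_gain, "phase", causal_phase)
print("forward-backward: gain", fb_gain, "phase", fb_phase)
```

The causal pass shows a clearly negative phase (a delay), while the forward-backward result has a phase of essentially zero and a gain equal to the square of the single-pass gain.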

Other methods to obtain non-causal filters exist, for example filtering in the frequency domain: pad the signal, apply the FFT, attenuate some frequencies as desired, then apply the inverse FFT and remove the padding (and boundary effects). (If you use that method, never forget that the discrete Fourier transform always operates on a single period of a periodic signal, i.e. its last and first samples are neighbours. Hence the need for padding to separate them. The padding width should be at least the length of the impulse response of your filter.)
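The steps above can be sketched as follows, again in plain Python with a naive DFT standing in for a real FFT routine. The Gaussian gain curve, the cutoff value and the function names are assumptions made for this illustration only. Because the gain applied to each bin is real and symmetric, the filter has zero phase, and the example also shows what the padding is for.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(n^2) DFT; fine for a short demo, use an FFT library in practice."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def gaussian_gain(k, n, cutoff):
    """Real gain for bin k of an n-point DFT, symmetric in +/- frequency."""
    f = min(k, n - k) / n  # frequency in cycles per sample, 0 to 0.5
    return math.exp(-0.5 * (f / cutoff) ** 2)

def fft_lowpass(x, cutoff=0.05, pad=0):
    """Zero-phase lowpass: pad, transform, scale each bin, invert, crop."""
    padded = list(x) + [0.0] * pad
    n = len(padded)
    spectrum = dft(padded)
    shaped = [spectrum[k] * gaussian_gain(k, n, cutoff) for k in range(n)]
    y = dft(shaped, inverse=True)
    return [v.real for v in y[:len(x)]]

# An impulse in the middle: a zero-phase filter spreads it symmetrically.
centre = [0.0] * 64
centre[32] = 1.0
smoothed = fft_lowpass(centre, pad=32)

# An impulse at the very end: without padding it wraps around to the start.
edge = [0.0] * 64
edge[63] = 1.0
unpadded = fft_lowpass(edge, pad=0)
padded = fft_lowpass(edge, pad=32)

print("first sample without padding:", unpadded[0])
print("first sample with padding:   ", padded[0])
```

Without padding, the impulse at the last sample leaks into the first sample, because the DFT treats them as neighbours. With 32 samples of zero padding, which is longer than the effective impulse response here, the first sample stays at essentially zero.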

It depends entirely on the application. The most common choice is zero padding, in particular if what you are filtering with the FFT is just one block out of a sequence of blocks that you are going to stitch back together afterwards. Whatever overflowed into the padding region is then added to the corresponding samples in the next block, so nothing gets lost (this is the overlap-add method), as shown on slide 86 in DSP lecture recording 21: FFT based convolution.
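Here is a rough sketch of that block-wise scheme in plain Python, with a naive DFT in place of an FFT. The block length, the example impulse response and the function names are assumptions for illustration, and this is not taken from the lecture slides. Each block is zero-padded by len(h) - 1 samples, filtered in the frequency domain, and its overflow is added onto the start of the next block.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(n^2) DFT; fine for a short demo, use an FFT library in practice."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def direct_convolve(x, h):
    """Reference: ordinary linear convolution in the time domain."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def overlap_add(x, h, block=8):
    """Block-wise frequency-domain convolution with zero padding."""
    n = block + len(h) - 1  # transform size: block plus room for the overflow
    h_spec = dft(list(h) + [0.0] * (n - len(h)))
    y = [0.0] * (len(x) + len(h) - 1)
    for start in range(0, len(x), block):
        chunk = list(x[start:start + block])
        chunk += [0.0] * (n - len(chunk))  # zero padding
        spec = dft(chunk)
        seg = dft([a * b for a, b in zip(spec, h_spec)], inverse=True)
        for i, v in enumerate(seg):
            if start + i < len(y):
                y[start + i] += v.real  # overflow adds into the next block
    return y

x = [math.sin(0.3 * i) + 0.1 * i for i in range(30)]
h = [0.1, 0.2, 0.4, 0.2, 0.1]

reference = direct_convolve(x, h)
stitched = overlap_add(x, h, block=8)
max_error = max(abs(a - b) for a, b in zip(reference, stitched))
print("largest difference from direct convolution:", max_error)
```

The stitched result matches direct convolution up to rounding error, which is the sense in which nothing gets lost.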