In my previous job I had to work with Foster/Cauer models for circuits. I am going to explain what I did, although it may not apply to your case.
My models were purely reactive (only LC elements in the Foster/Cauer model) and thus it had resonances (singularities), which made it very difficult/impossible to extract the parameters from a curve fitting tool. I got the results from a numerical simulation in terms of an S-Matrix in the frequency domain. I transformed it to a Z-matrix and worked from there.
First, I extracted an initial guess for the pole locations using Foster reactance theorem and transformations back and forth between admittance and impedance form. When I had the initial values, I created a function that calculated the immittance (admittance/impedance) with them. Then, I created another function that transformed the immittance of the two-port network back to an S-matrix, and created a cost function based on the Frobenius norm of the difference between my data and my model:
S # original S-matrix
Ŝ # reconstruction from model
f(x) = norm(S(x) - Ŝ(x))
This function is then fed to a global optimizer with the appropriate parameters and, some time later, voilà.
In your case
First you need to know the exact number of terms needed in your Foster series. These may come from an analytical expression or you may have to perform several guesses.
If you want to work on the time domain, create initial guesses for your coefficients (r_i, tau_i) and create a function that accepts a vector of these and evaluates the expression on your time points.
Then create a cost function using the norm of the difference between your data vectors (original and estimation).
Feed these to the global optimizer.
Maybe you could work directly with curve fitting packages, but since it did not work for me I cannot say how they work.
My package is here: GitHub - nandoconde/BranchLineCouplers.jl: Synthesize and simulate branch line couplers
Explore it at will