# How to understand filtering in topology optimization?

I’m trying to understand the role of filtering in the tutorial about topology optimization of `gridap`. It claims that filtering process will help to binarize the result. Well…

I first imagine an optimization problem without filtering, over a density `ρ`. The problem is then to minimize an objective function `O(ρ)` constrained only by the physics (here, Maxwell’s equations) and the feasible set. Suppose `O(ρ)` is minimized when `ρ=ρ0`.

Then, I go back to the case with a filtering function `f:ρ↦ρ'`. The objective function becomes `O(ρ')`, which is minimized when `ρ'=ρ0`. So regardless of the filtering, the final result is the same.

Following this logic, I don’t see how filtering contributes to binarizing the result. What am I missing?

The filtering (i.e. convolving with a smoothing kernel or “diffusing” the degrees of freedom) is not there to binarize the result; it is there to regularize the problem by imposing a minimum lengthscale `[*]` on the solution. Otherwise, the solution may not converge with resolution. (This is not necessary in all problems, in principle, but is generally a good idea.)
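As a concrete illustration of such a smoothing kernel (a generic 1D sketch, not code from the Gridap tutorial; the weights and names here are invented), a linear “cone” filter averages each density value with its neighbors, so no feature narrower than the filter radius survives:

```python
# Hypothetical 1D density filter (illustration only, not Gridap code).
def cone_filter(rho, radius):
    """Weighted moving average with linearly decaying (cone) weights."""
    n = len(rho)
    filtered = []
    for i in range(n):
        num, den = 0.0, 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            w = radius + 1 - abs(i - j)  # cone weight, largest at j == i
            num += w * rho[j]
            den += w
        filtered.append(num / den)
    return filtered

# A single-cell spike (the finest possible feature) gets smeared out
# over roughly the filter radius -- it cannot stay a one-cell feature:
rho = [0.0] * 5 + [1.0] + [0.0] * 5
rho_f = cone_filter(rho, radius=2)
```

Note how the filtered spike peaks well below 1: any design feature finer than the radius is blurred away, which is exactly the imposed lengthscale.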

By filtering, however, even if you start with a binary density, the smoothed density will have lots of intermediate values — you are prohibiting a nontrivial binarized result. So, to allow the optimization to get back to a binarized result, you need to pass the smoothed density through a step-function projection. Unfortunately, a step function is not differentiable `[**]`, so instead the projection is done with a smoothed step function that is gradually made “steeper” as the optimization progresses (at the cost of making the optimization stiffer `[**]`).
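A smoothed step function widely used for this projection in the density-based topology optimization literature (whether the Gridap tutorial uses exactly this form is an assumption on my part) is the tanh-based Heaviside approximation, where `β` controls the steepness and `η` the threshold:

```python
import math

# Smoothed Heaviside ("tanh") projection (generic illustration).
# beta is raised gradually during optimization to approach a true step.
def tanh_projection(rho, beta, eta=0.5):
    """Differentiable approximation of a step at threshold eta,
    normalized so that rho = 0 maps to 0 and rho = 1 maps to 1."""
    num = math.tanh(beta * eta) + math.tanh(beta * (rho - eta))
    den = math.tanh(beta * eta) + math.tanh(beta * (1.0 - eta))
    return num / den

# For small beta the projection is gentle; for large beta, intermediate
# densities are pushed hard toward 0 or 1:
gentle = tanh_projection(0.3, beta=1.0)
steep = tanh_projection(0.3, beta=32.0)
```

The derivative of this projection grows with `β`, which is the “stiffness” cost mentioned above: very steep projections make the optimization landscape badly conditioned.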

There are many review articles on this filter/project style of density-based topology optimization; arguably it is the most widely used approach at the moment. See, for example, Jensen (2011).

`[*]` Although the filtering sets a minimum lengthscale in a certain sense, it does not guarantee that the result satisfies manufacturing constraints. That requires additional constraints; see e.g. Hammond et al (2021) and references therein.

`[**]` We should be posting a manuscript later this year on how to do (essentially) step function projection while being everywhere differentiable and remaining non-stiff (i.e. avoiding an ill-conditioned Hessian for the objective function).


One way to make sense of filters and how they impact optimization is to consider their effect on the gradient vector of the objective with respect to the pseudo-densities/decision variables. Say `O(ρ)` is your unfiltered objective, where `ρ` is the unfiltered design and `dO(ρ)/dρ` is the gradient of `O(ρ)` with respect to `ρ`. The common filters can essentially be written as a linear operator `ρ_f = A * ρ` for some filter matrix `A`. If we instead define the filtered objective as `O(ρ_f) = O(A * ρ)`, its gradient with respect to `ρ` will be `A' * dO(ρ_f)/d(ρ_f)` by the chain rule, and `A'` also has a local averaging effect. So even if `ρ_f` in the new formulation were equal to `ρ` in the old formulation (giving the same objective value), the gradients of the two objectives would not be equal!
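This chain rule can be checked numerically with a toy objective and an invented filter matrix (all values here are made up purely for illustration):

```python
# Toy check that d/drho of O(A @ rho) equals A' @ dO/drho_f.
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

# Simple quadratic "objective" O(rho_f) = sum(rho_f**2), gradient 2*rho_f.
def O(rho_f):
    return sum(v * v for v in rho_f)

A = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]   # a small averaging filter (invented numbers)
rho = [0.2, 0.9, 0.4]

rho_f = matvec(A, rho)
grad_wrt_rho_f = [2.0 * v for v in rho_f]
grad_wrt_rho = matvec(transpose(A), grad_wrt_rho_f)  # A' * dO/drho_f

# Finite-difference check of one component of the pulled-back gradient:
h = 1e-6
rho_pert = rho[:]
rho_pert[1] += h
fd = (O(matvec(A, rho_pert)) - O(matvec(A, rho))) / h
assert abs(fd - grad_wrt_rho[1]) < 1e-4
```

The finite-difference value agrees with the chain-rule gradient, confirming that the filter’s transpose is what propagates sensitivities back to the design variables.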

As Steven mentioned, filters have a regularizing effect because they essentially average out the local partial derivatives. Due to numerical errors in the FEA, a chequerboard pattern sometimes naturally emerges as the “optimal” design: one element is pushed to a pseudo-density of 1 while its immediate neighbors are pushed to 0, driven by “high frequency” changes in the gradient’s sign. Averaging out the partial derivatives makes it less likely that two neighboring elements will move in opposite directions when following the gradient in gradient-based optimization, because positive and negative partial derivatives cancel out; the filter essentially reduces the average norm of the gradient (acting like a low-pass filter). This increases the regularity of the shapes that come out of the optimization. I also generally take the filtered design `ρ_f = A * ρ` (or its Heaviside projection) as the “final design” instead of `ρ`, even though the optimization happens in terms of `ρ`, because `A * ρ` is the design for which the true objective function `O` and constraints were evaluated.
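A tiny numeric illustration (toy numbers, not from any real FEA) of this low-pass effect: applying the transposed filter nearly annihilates a checkerboard-like gradient with alternating signs, while leaving a smooth gradient intact.

```python
# Illustration only: the "A'" of a simple uniform filter is itself a
# moving average, which damps sign-alternating sensitivities.
def average_filter(g):
    """Three-point moving average (shrinks windows at the boundaries)."""
    n = len(g)
    out = []
    for i in range(n):
        window = g[max(0, i - 1): i + 2]
        out.append(sum(window) / len(window))
    return out

oscillating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # "checkerboard" sensitivities
smooth = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

damped = average_filter(oscillating)  # positive/negative entries cancel
kept = average_filter(smooth)         # smooth gradient passes through
```

The alternating gradient shrinks to at most a third of its original magnitude, so neighboring cells no longer receive strong pushes in opposite directions.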


Thank you for the explanation. So the key is to treat the filter as a projection. Then, when implementing a gradient method with a filter, the iteration is `ρ_f -= α * dO(ρ_f)/d(ρ_f)`, not `ρ -= α * dO(ρ_f)/d(ρ)`.
I thought it was the latter case.

Optimization always happens in terms of `ρ`, not `ρ_f`. `ρ_f` is just an intermediate variable I defined above for the filtered design.
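A minimal sketch of that loop, with a toy quadratic objective and an invented 3×3 filter matrix (none of this is Gridap code): the optimizer updates `ρ`, while the objective and its gradient are evaluated on `ρ_f = A * ρ` and pulled back through `A'`.

```python
# Gradient descent on rho, with the objective evaluated on the
# filtered design rho_f = A @ rho (toy example, invented numbers).
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]
rho = [0.9, 0.1, 0.8]      # design variables: what the optimizer owns
target = [0.2, 0.2, 0.2]   # O(rho_f) = sum((rho_f - target)**2)
alpha = 0.5                # step size

for _ in range(200):
    rho_f = matvec(A, rho)                                  # filter
    grad_f = [2 * (v - t) for v, t in zip(rho_f, target)]   # dO/drho_f
    grad = matvec(transpose(A), grad_f)                     # A' * dO/drho_f
    rho = [r - alpha * g for r, g in zip(rho, grad)]        # update rho, not rho_f

final = matvec(A, rho)  # the "final design" is the filtered field
```

After convergence, the filtered field `final` matches the target even though the raw `ρ` need not, which is why the filtered design (or its projection) is the one to report.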

This is not a question of numerical errors. If you don’t have something limiting the minimum lengthscale, in many physical problems the optimum design can theoretically be an unphysical structure with a lattice of arbitrarily fine features. In a discretized simulation, this will lead to features that get finer and finer (e.g. ρ that oscillates faster and faster) as you increase resolution. It doesn’t converge … unless you regularize the problem with filtering or similar.

See the discussion here https://www.researchgate.net/publication/225129218_A_Q4Q4_continuum_structural_optimization_implementation. The discretization scheme appears to be a factor.