The derivative operators seem less efficient than they could be, and there appears to be a faster way to apply them. Please correct me if I am wrong.

For example, given:

```
using DiffEqOperators, BenchmarkTools
using LinearAlgebra  # for Tridiagonal, used further down

dx = 0.01
x = dx:dx:0.3
Δ1 = UpwindDifference(1, 1, dx, length(x), -0.1)
Δ2 = CenteredDifference(2, 2, dx, length(x))
bc = RobinBC((0.1, 1E-4, 0.1), (0., 1., 0.), dx, 2)
u0 = @. exp(1-x)
f1(u) = Δ1*bc*u + Δ2*bc*u
```

The time needed is:

```
@benchmark f1(u0)
BenchmarkTools.Trial:
memory estimate: 1.36 KiB
allocs estimate: 7
--------------
minimum time: 658.642 ns (0.00% GC)
median time: 960.494 ns (0.00% GC)
mean time: 3.880 μs (73.80% GC)
maximum time: 1.536 ms (99.91% GC)
--------------
samples: 8005
evals/sample: 162
```

If I instead precompose the operators with the boundary condition:

```
d1 = Δ1*bc
d2 = Δ2*bc
f2(u) = d1*u+d2*u
```

the time is essentially unchanged:

```
@benchmark f2(u0)
BenchmarkTools.Trial:
memory estimate: 1.36 KiB
allocs estimate: 7
--------------
minimum time: 698.387 ns (0.00% GC)
median time: 1.034 μs (0.00% GC)
mean time: 4.212 μs (74.19% GC)
maximum time: 2.255 ms (99.94% GC)
--------------
samples: 9799
evals/sample: 124
```

Furthermore, as I understand it, `d1` has two components, so `d1*u0` is actually `d1[1]*u0 + d1[2]`. Extracting these as plain arrays:

```
A1, b1 = Array(Δ1*bc)
A2, b2 = Array(Δ2*bc)
f3(u) = A1*u+b1 + A2*u+b2
```

This is slightly faster:

```
@benchmark f3(u0)
BenchmarkTools.Trial:
memory estimate: 1008 bytes
allocs estimate: 3
--------------
minimum time: 534.043 ns (0.00% GC)
median time: 959.574 ns (0.00% GC)
mean time: 5.779 μs (82.36% GC)
maximum time: 3.336 ms (99.97% GC)
--------------
samples: 4767
evals/sample: 188
```
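To illustrate why a boundary-padded operator splits into a matrix part and a constant part, here is a minimal sketch that does not use DiffEqOperators at all. It assumes a simpler setup than the post: a second-derivative stencil with fixed (Dirichlet) ghost values `ua` and `ub`, which are hypothetical names chosen for this example, not the `RobinBC` above. Padding and then applying the stencil should give the same result as `A*u + b`.

```julia
using LinearAlgebra

# Hypothetical setup (not the RobinBC from the post): n interior points,
# fixed ghost values ua and ub on the left and right.
n, dx = 5, 0.1
ua, ub = 2.0, 3.0
u = collect(range(1.0, 2.0; length = n))

# Direct route: pad with the ghost values, then apply the stencil [1, -2, 1] / dx^2.
upad = [ua; u; ub]
direct = [(upad[i] - 2 * upad[i+1] + upad[i+2]) / dx^2 for i in 1:n]

# Affine route: the same operation written as A*u + b.
A = Tridiagonal(fill(1 / dx^2, n - 1), fill(-2 / dx^2, n), fill(1 / dx^2, n - 1))
b = zeros(n)
b[1] = ua / dx^2      # contribution of the left ghost point
b[end] = ub / dx^2    # contribution of the right ghost point

affine = A * u + b
```

Under these assumptions, `affine` and `direct` agree up to rounding, which is the same decomposition that `Array(Δ1*bc)` appears to expose as `(A1, b1)`.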

In fact, A1 and A2 are tridiagonal here, so the two operators can be merged into a single tridiagonal matrix:

```
A12 = Tridiagonal(A1+A2)
b12 = b1+b2
f4(u) = A12*u + b12
```

```
@benchmark f4(u0)
BenchmarkTools.Trial:
memory estimate: 672 bytes
allocs estimate: 2
--------------
minimum time: 175.385 ns (0.00% GC)
median time: 371.923 ns (0.00% GC)
mean time: 3.443 μs (89.20% GC)
maximum time: 820.765 μs (99.96% GC)
--------------
samples: 1944
evals/sample: 780
```

The last version is much faster than the first: about 2.6 times by median time (372 ns vs. 960 ns).
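The two allocations that remain in `f4` are presumably the result vector of the matrix-vector product and the vector produced by adding `b12`. A possible next step, sketched here with only the `LinearAlgebra` standard library, is an in-place variant that writes into a preallocated output. The name `f4!` and the random stand-ins for `A12`, `b12` and `u0` are assumptions made for this example, not part of the post or of DiffEqOperators.

```julia
using LinearAlgebra

n = 30
# Stand-ins for A12, b12 and u0 from the post (random values, illustration only).
A12 = Tridiagonal(rand(n - 1), rand(n), rand(n - 1))
b12 = rand(n)
u0 = rand(n)

# In-place variant of f4: writes A12*u + b12 into the preallocated `du`.
function f4!(du, u, A, b)
    mul!(du, A, u)   # du = A*u, written directly into du
    du .+= b         # in-place broadcast, no new vector
    return du
end

du = similar(u0)
f4!(du, u0, A12, b12)
```

This form also matches the `f(du, u, p, t)` shape that in-place ODE right-hand sides usually take, so it should slot into a solver with little change. Whether it actually removes all allocations would need to be confirmed with a benchmark.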

Since A1 and A2 may not be tridiagonal in general (for example, when the upwind scheme is second-order accurate), the sum can also be kept as a general matrix:

```
A11 = A1+A2
f5(u) = A11*u + b12
```

```
@benchmark f5(u0)
BenchmarkTools.Trial:
memory estimate: 672 bytes
allocs estimate: 2
--------------
minimum time: 281.955 ns (0.00% GC)
median time: 497.744 ns (0.00% GC)
mean time: 3.153 μs (84.15% GC)
maximum time: 1.925 ms (99.96% GC)
--------------
samples: 6334
evals/sample: 266
```

This is still faster than the first version: about 1.9 times by median time (498 ns vs. 960 ns).
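For wider stencils, one option between the dense `A11` and a `Tridiagonal` is a sparse matrix that stores only the nonzero band. The following is a sketch using the `SparseArrays` standard library. The pentadiagonal matrix is a made-up stand-in for `A1 + A2` with arbitrary values, not the output of any DiffEqOperators call.

```julia
using LinearAlgebra, SparseArrays

n = 30
# Stand-in for the dense A11 = A1 + A2: a pentadiagonal matrix, as a wider
# stencil might produce (values are arbitrary).
A11 = zeros(n, n)
for i in 1:n, j in max(1, i - 2):min(n, i + 2)
    A11[i, j] = 1.0 / (1 + abs(i - j))
end
b12 = rand(n)
u0 = rand(n)

S = sparse(A11)      # keeps only the nonzero entries of the band
du = similar(u0)
mul!(du, S, u0)      # du = S*u0, written in place
du .+= b12           # add the constant boundary contribution in place
```

At `n = 30` a dense product is already cheap, so whether the sparse form wins here would need to be measured. The advantage should grow with the grid size, since the number of stored entries scales with `n` rather than `n^2`.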