Red-Black Gauss-seidel with mir

Christoph alt.c4 at web.de
Sun Sep 13 14:48:30 UTC 2020


Hi all,

I am trying to implement a sweep method for a 2D Red-black 
Gauss-Seidel Solver with the help of mir and its slices.
The fastest Version I discovered so far looks like this:
```
void sweep(T, size_t Dim : 2, Color color)(in Slice!(T*, 2) F, 
Slice!(T*, 2) U, T h2)
{
     const auto m = F.shape[0];
     const auto n = F.shape[1];
     auto UF = U.field;
     auto FF = F.field;

     foreach (i; 1 .. m - 1)
     {
         for (size_t j = 1 + (i + 1 + color) % 2; j < n - 1; j += 
2)
         {
             auto flatindex = i * m + j;
             UF[flatindex] = (
                     UF[flatindex - m] +
                     UF[flatindex + m] +
                     UF[flatindex - 1] +
                     UF[flatindex + 1] - h2 * FF[flatindex]) / 4.0;
         }
     }
}
```
Accessing the field of the mirslice directly seems to be 
incredibly fast and I assume this might be the fastest version 
for this purpose?

I have already implemented this in Python with numpy and there it 
looks like this:

```
def sweep_2D(color, F, U, h2):

     m, n = F.shape

     if color == 1:
         # red
         U[1:m - 1:2, 2:n - 1:2] = (U[0:m - 2:2, 2:n - 1:2] +
                                    U[2::2, 2:n - 1:2] +
                                    U[1:m - 1:2, 1:n - 2:2] +
                                    U[1:m - 1:2, 3::2] -
                                    F[1:m - 1:2, 2:n - 1:2] * h2) 
/ (4.0)
         U[2:m - 1:2, 1:n - 1:2] = (U[1:m - 2:2, 1:n - 1:2] +
                                    U[3::2, 1:n - 1:2] +
                                    U[2:m - 1:2, 0:n - 2:2] +
                                    U[2:m - 1:2, 2::2] -
                                    F[2:m - 1:2, 1:n - 1:2] * h2) 
/ (4.0)
...

```
My Question now is: Is there a possibility too use the mir slices 
in a way that "looks" similar to the numpy version and is similar 
fast?
Below there is fastest version I found up to now, but it is still 
much slower then the version with the two for loops. It seems to 
me, that accessing and modifying only parts of a mirslice with 
this "indexed" syntax in combination with strided is really slow. 
Is this true or am I using the slices wrong?

```
static if (color == Color.red)
{
         U[1 .. m - 1, 1 .. n - 1].strided!(0, 1)(2, 2)[] = U[0 .. 
m - 2, 1 .. n - 1].strided!(0, 1)(2, 2) / 4.0;
         U[1 .. m - 1, 1 .. n - 1].strided!(0, 1)(2, 2)[] += U[2 
.. m, 1 .. n - 1].strided!(0, 1)(2, 2) / 4.0;
         U[1 .. m - 1, 1 .. n - 1].strided!(0, 1)(2, 2)[] += U[1 
.. m - 1, 0 .. n - 2].strided!(0, 1)(2, 2) / 4.0;
         U[1 .. m - 1, 1 .. n - 1].strided!(0, 1)(2, 2)[] += U[1 
.. m - 1, 2 .. n].strided!(0, 1)(2, 2) / 4.0;
         U[1 .. m - 1, 1 .. n - 1].strided!(0, 1)(2, 2)[] -= F[1 
.. m - 1, 1 .. n - 1].strided!(0, 1)(2, 2) * h2 / 4.0;

         U[2 .. m - 1, 2 .. n - 1].strided!(0, 1)(2, 2)[] = U[1 .. 
m - 2, 2 .. n - 1].strided!(0, 1)(2, 2) / 4.0;
         U[2 .. m - 1, 2 .. n - 1].strided!(0, 1)(2, 2)[] += U[3 
.. m, 2 .. n - 1].strided!(0, 1)(2, 2) / 4.0;
         U[2 .. m - 1, 2 .. n - 1].strided!(0, 1)(2, 2)[] += U[2 
.. m - 1, 1 .. n - 2].strided!(0, 1)(2, 2) / 4.0;
         U[2 .. m - 1, 2 .. n - 1].strided!(0, 1)(2, 2)[] += U[2 
.. m - 1, 3 .. n].strided!(0, 1)(2, 2) / 4.0;
         U[2 .. m - 1, 2 .. n - 1].strided!(0, 1)(2, 2)[] -= F[2 
.. m - 1, 2 .. n - 1].strided!(0, 1)(2, 2) * h2 / 4.0;

}
...

```

Thanks for your help in Advance!

Christoph



More information about the Digitalmars-d-learn mailing list