Red-Black Gauss-seidel with mir
Christoph
alt.c4 at web.de
Sun Sep 13 14:48:30 UTC 2020
Hi all,
I am trying to implement a sweep method for a 2D Red-black
Gauss-Seidel Solver with the help of mir and its slices.
The fastest Version I discovered so far looks like this:
```
void sweep(T, size_t Dim : 2, Color color)(in Slice!(T*, 2) F,
Slice!(T*, 2) U, T h2)
{
const auto m = F.shape[0];
const auto n = F.shape[1];
auto UF = U.field;
auto FF = F.field;
foreach (i; 1 .. m - 1)
{
for (size_t j = 1 + (i + 1 + color) % 2; j < n - 1; j +=
2)
{
auto flatindex = i * m + j;
UF[flatindex] = (
UF[flatindex - m] +
UF[flatindex + m] +
UF[flatindex - 1] +
UF[flatindex + 1] - h2 * FF[flatindex]) / 4.0;
}
}
}
```
Accessing the field of the mirslice directly seems to be
incredibly fast and I assume this might be the fastest version
for this purpose?
I have already implemented this in Python with numpy and there it
looks like this:
```
def sweep_2D(color, F, U, h2):
m, n = F.shape
if color == 1:
# red
U[1:m - 1:2, 2:n - 1:2] = (U[0:m - 2:2, 2:n - 1:2] +
U[2::2, 2:n - 1:2] +
U[1:m - 1:2, 1:n - 2:2] +
U[1:m - 1:2, 3::2] -
F[1:m - 1:2, 2:n - 1:2] * h2)
/ (4.0)
U[2:m - 1:2, 1:n - 1:2] = (U[1:m - 2:2, 1:n - 1:2] +
U[3::2, 1:n - 1:2] +
U[2:m - 1:2, 0:n - 2:2] +
U[2:m - 1:2, 2::2] -
F[2:m - 1:2, 1:n - 1:2] * h2)
/ (4.0)
...
```
My Question now is: Is there a possibility too use the mir slices
in a way that "looks" similar to the numpy version and is similar
fast?
Below there is fastest version I found up to now, but it is still
much slower then the version with the two for loops. It seems to
me, that accessing and modifying only parts of a mirslice with
this "indexed" syntax in combination with strided is really slow.
Is this true or am I using the slices wrong?
```
static if (color == Color.red)
{
U[1 .. m - 1, 1 .. n - 1].strided!(0, 1)(2, 2)[] = U[0 ..
m - 2, 1 .. n - 1].strided!(0, 1)(2, 2) / 4.0;
U[1 .. m - 1, 1 .. n - 1].strided!(0, 1)(2, 2)[] += U[2
.. m, 1 .. n - 1].strided!(0, 1)(2, 2) / 4.0;
U[1 .. m - 1, 1 .. n - 1].strided!(0, 1)(2, 2)[] += U[1
.. m - 1, 0 .. n - 2].strided!(0, 1)(2, 2) / 4.0;
U[1 .. m - 1, 1 .. n - 1].strided!(0, 1)(2, 2)[] += U[1
.. m - 1, 2 .. n].strided!(0, 1)(2, 2) / 4.0;
U[1 .. m - 1, 1 .. n - 1].strided!(0, 1)(2, 2)[] -= F[1
.. m - 1, 1 .. n - 1].strided!(0, 1)(2, 2) * h2 / 4.0;
U[2 .. m - 1, 2 .. n - 1].strided!(0, 1)(2, 2)[] = U[1 ..
m - 2, 2 .. n - 1].strided!(0, 1)(2, 2) / 4.0;
U[2 .. m - 1, 2 .. n - 1].strided!(0, 1)(2, 2)[] += U[3
.. m, 2 .. n - 1].strided!(0, 1)(2, 2) / 4.0;
U[2 .. m - 1, 2 .. n - 1].strided!(0, 1)(2, 2)[] += U[2
.. m - 1, 1 .. n - 2].strided!(0, 1)(2, 2) / 4.0;
U[2 .. m - 1, 2 .. n - 1].strided!(0, 1)(2, 2)[] += U[2
.. m - 1, 3 .. n].strided!(0, 1)(2, 2) / 4.0;
U[2 .. m - 1, 2 .. n - 1].strided!(0, 1)(2, 2)[] -= F[2
.. m - 1, 2 .. n - 1].strided!(0, 1)(2, 2) * h2 / 4.0;
}
...
```
Thanks for your help in Advance!
Christoph
More information about the Digitalmars-d-learn
mailing list