foreach (i; taskPool.parallel(0..2_000_000)

Steven Schveighoffer schveiguy at gmail.com
Mon Apr 3 23:50:48 UTC 2023


On 4/3/23 7:22 PM, Paul wrote:
> ```d
> // Timed main() 
> vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
> void main(string[] args) {
> auto progStartTime = MonoTime.currTime;
> //-----------------------------------------------------------------------------
>      string filename = args.length > 1 ? args[1] : "aoc2215a.data";
>      CommPair[] commPair;
>      ulong x,y;
> 
>      // read file that has data sets in the form of x,y coordinate pairs
>      // for each sensor-beacon pair.  Create on array of structs to hold
>      // this information.
>      loadFileDataIntoArrayOfStructs(commPair, filename);
> 
>      foreach(int lineOfInterest;parallel(iota(0,4_000_001))) {
>          Span[] span; // A section of line-of-interest coordinates where 
> no other beacons are present.
>          const spanReserve = span.reserve(50);
>          createSpansOfNoBeacons(lineOfInterest,commPair,span);
> 
>          // if spans overlap, combine them into a single span and mark
>          // the other spans !inUse.
>          combineOverlappingSpans(span);
> 
>          // look for a line that doesn't have 4,000,001 locations 
> accounted for
>          if(beaconFreeLocations(span) < 4_000_001) {
> 
>              // find the location that is not accounted for
>              foreach(ulong i;0..4_000_000) {
>                  bool found = false;
>                  foreach(sp;span) {
>                      if(i >= sp.xLow && i <= sp.xHigh) {
>                          found = true;
>                          break;
>                      }
>                  }
>                  if(!found) {
>                      x = i; y = lineOfInterest;
>                      break;
>                  }
>              }
>          }
>      }
> writeln(x," ",y);
> 
> ```

So I just quoted your main loop.

I am assuming that this O(n^2) algorithm doesn't actually run for all 
iterations, because that wouldn't be feasible (16 trillion iterations is 
a lot).

This means that I'm assuming a lot of cases do not run the second loop. 
Everything you do besides prune the second loop is mostly allocating an 
array of `Span` types. This means most of the parallel loops are 
allocating, and doing nothing else. As I said earlier, allocations need 
a global lock of the GC.

What you need to do probably, is to avoid these allocations per loop.

The easiest thing I can think of is to store the Span array as a static 
array of the largest array you need (i.e. the length of `commPair`), and 
then slice it instead of appending.

So what you need is inside `createSpansOfNoBeacons`, take as a reference 
a `ref Span[MAX_SPANS]`, and have it return a `Span[]` that is a slice 
of that which was "alocated".

See if this helps.

FWIW, I did the AoC 2022 as well, and I never needed parallel execution. 
Looking at my solution comment in reddit, this one I sort of punted by 
knowing I could exit as soon as the answer is found (my solution runs in 
2.5s on my input). But I recommend (once you are done), reading this 
post, it is a really cool way to look at it:

https://www.reddit.com/r/adventofcode/comments/zmcn64/2022_day_15_solutions/j0hl19a/?context=8&depth=9

-Steve


More information about the Digitalmars-d-learn mailing list