[OT] Is this more readable, or just way too verbose?

Lutger lutger.blijdestijn at gmail.com
Mon Aug 9 18:32:42 PDT 2010


simendsjo wrote:

> I took splitlines from std.string, which is a simple, short method.
> 
> S[] splitlines(S)(S s)
> {
>      size_t istart;
>      auto result = Appender!(S[])();
> 
>      foreach (i; 0 .. s.length)
>      {
>          immutable c = s[i];
>          if (c == '\r' || c == '\n')
>          {
>              result.put(s[istart .. i]);
>              istart = i + 1;
>              if (c == '\r' && i + 1 < s.length && s[i + 1] == '\n')
>              {
>                  i++;
>                  istart++;
>              }
>          }
>      }
>      if (istart != s.length)
>      {
>          result.put(s[istart .. $]);
>      }
> 
>      return result.data;
> }
> 
> 
> I guess it takes less than 30 seconds to fully understand this method.
> Then I rap.. I mean refactored it to this:
> 
> S[] mysplitlines(S)(S s)
> {
>      const CR = '\r';
>      const LF = '\n';
> 
>      size_t istart;
> 
>      auto result = Appender!(S[])();
> 
>      foreach (i; 0 .. s.length)
>      {
>          immutable c = s[i];
> 
>          immutable isCR = (c == CR);
>          immutable isLF = (c == LF);
> 
>          if (isCR || isLF)
>          {
>              auto line = s[istart .. i];
>              result.put(line);
> 
>              istart = i + 1;
> 
>              // Might be CRLF. In that case we need to consume LF too
>              if (isCR)
>              {
>                  immutable hasMoreCharacters = (i + 1 < s.length);
>                  immutable nextIsLF = hasMoreCharacters && (s[i + 1] == LF);
>                  immutable isCRLF = isCR && nextIsLF;
> 
>                  if (isCRLF)
>                  {
>                      i++;
>                      istart++;
>                  }
>              }
>          }
>      }
> 
>      immutable lineNotEmpty = (istart != s.length);
>      if (lineNotEmpty)
>      {
>          auto lastLine = s[istart .. $];
>          result.put(lastLine);
>      }
> 
>      return result.data;
> }
> 
> 
> Yes, I'm reading some books now and don't have much experience :)
> 
> It went from a small 26 line, very readable function to 48 lines (85%
> increase!) with many more temporary variables..
> 
> So... Do you think this kind of code is (more) readable, or just way too
> verbose and doing more harm than good?

The CR and LF constants are a bit too much, probably because they don't really 
abstract over the literals which I can actually parse faster. The isCR and isLF 
are nice however. Taking it a step further:

bool canSplit = inPattern(c,"\r\n");
if (canSplit)
{
  ...

You have increased the nesting of ifs to 3 inside a for-loop. Personally I don't 
read deep nesting very well. To go for readability I would use a small function 
for the entire expression: 

if( s[i..$].startsWithCRLF() ) // same as startsWithCRLF(s[i..$])
{
  i++;
  istart++;
}

or use std.algorithm: if ( s[i..$].startsWith("\r\n") )
 
> And will the compiler generate slower code, or should the optimizer be
> able to inline the temporaries?

I don't think it matters much, but you can only tell by testing. There is a 
benchmark function (called benchmark iirc) somewhere in phobos to start with and 
dmd has a builtin profiler too.


More information about the Digitalmars-d-learn mailing list