LLVM Coding Standards

Mon Apr 11 12:58:57 PDT 2011

[slightly OT]

Hello,

I'm reading (just for interest) the LLVM Coding Standards at 
http://llvm.org/docs/CodingStandards.html. I find them very interesting because 
the purpose of each rule is clearly explained. A sample is below.

Denis

=== sample ===================================
Use Early Exits and continue to Simplify Code

When reading code, keep in mind how much state and how many previous decisions 
have to be remembered by the reader to understand a block of code. Aim to 
reduce indentation where possible when it doesn't make it more difficult to 
understand the code. One great way to do this is by making use of early exits 
and the continue keyword in long loops. As an example of using an early exit 
from a function, consider this "bad" code:

Value *DoSomething(Instruction *I) {
   if (!isa<TerminatorInst>(I) &&
       I->hasOneUse() && SomeOtherThing(I)) {
     ... some long code ....
   }

   return 0;
}

This code has several problems if the body of the 'if' is large. First, when 
you're looking at the top of the function, it isn't immediately clear that it 
only does interesting things with non-terminator instructions, and only applies 
to things that satisfy the other predicates. Second, it is relatively difficult 
to describe (in comments) why these predicates are important because the if 
statement makes it difficult to lay out the comments. Third, when you're deep 
within the body of the code, it is indented an extra level. Finally, when 
reading the top of the function, it isn't clear what the result is if the 
predicate isn't true; you have to read to the end of the function to know that 
it returns null.

It is much preferred to format the code like this:

Value *DoSomething(Instruction *I) {
   // Terminators never need 'something' done to them because ...
   if (isa<TerminatorInst>(I))
     return 0;

   // We conservatively avoid transforming instructions with multiple uses
   // because goats like cheese.
   if (!I->hasOneUse())
     return 0;

   // This is really just here for example.
   if (!SomeOtherThing(I))
     return 0;

   ... some long code ....
}

This fixes these problems. A similar problem frequently happens in for loops. A 
silly example is something like this:

   for (BasicBlock::iterator II = BB->begin(), E = BB->end(); II != E; ++II) {
     if (BinaryOperator *BO = dyn_cast<BinaryOperator>(II)) {
       Value *LHS = BO->getOperand(0);
       Value *RHS = BO->getOperand(1);
       if (LHS != RHS) {
         ...
       }
     }
   }

When you have very, very small loops, this sort of structure is fine. But if it 
exceeds 10-15 lines, it becomes difficult for people to read and understand at 
a glance. The problem with this sort of code is that it gets very nested very 
quickly, meaning that the reader of the code has to keep a lot of context in 
their brain to remember what is immediately going on in the loop, because they 
don't know if/when the if conditions will have elses etc. It is strongly 
preferred to structure the loop like this:

   for (BasicBlock::iterator II = BB->begin(), E = BB->end(); II != E; ++II) {
     BinaryOperator *BO = dyn_cast<BinaryOperator>(II);
     if (!BO) continue;

     Value *LHS = BO->getOperand(0);
     Value *RHS = BO->getOperand(1);
     if (LHS == RHS) continue;

     ...
   }

This has all the benefits of using early exits for functions: it reduces 
nesting of the loop, it makes it easier to describe why the conditions are 
true, and it makes it obvious to the reader that there is no else coming up 
that they have to push context into their brain for. If a loop is large, this 
can be a big understandability win.
========================================
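
The snippets in the sample depend on LLVM types and elide the "long code", so 
they don't compile on their own. Below is a minimal, self-contained sketch of 
the same two ideas (early returns in a function, continue in a loop). It assumes 
C++17 for std::optional. Node, combineOperands and sumDistinctOperandNodes are 
hypothetical names made up for illustration; they are not part of LLVM.

```cpp
#include <optional>
#include <vector>

// Hypothetical stand-in for an instruction; NOT an LLVM type.
struct Node {
  bool isTerminator;
  int numUses;
  int lhs;
  int rhs;
};

// Early exits: each precondition gets its own check and its own comment,
// and the main work stays at the top indentation level.
std::optional<int> combineOperands(const Node *N) {
  // Nothing to do without a node.
  if (!N)
    return std::nullopt;

  // Terminators are never combined in this sketch.
  if (N->isTerminator)
    return std::nullopt;

  // Conservatively handle only nodes with exactly one use.
  if (N->numUses != 1)
    return std::nullopt;

  // The "long code" would live here, un-nested.
  return N->lhs + N->rhs;
}

// 'continue' in a loop: uninteresting elements are dismissed up front,
// so the reader knows no 'else' is coming.
int sumDistinctOperandNodes(const std::vector<Node> &Nodes) {
  int Total = 0;
  for (const Node &N : Nodes) {
    std::optional<int> Combined = combineOperands(&N);
    if (!Combined) continue;

    // Skip nodes whose two operands are identical.
    if (N.lhs == N.rhs) continue;

    Total += *Combined;
  }
  return Total;
}
```

Returning std::optional here plays the role that the null return plays in the 
sample: the "nothing done" result is visible right at each early exit, instead 
of at the end of the function.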
-- 
_________________
vita es estrany
spir.wikidot.com