Proposal new type "flags"

Rueschi rueschi at GIquadrat.de
Mon Feb 5 11:28:09 PST 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

At first: I found 2 errors in my proposal - where I wrote
>     .sizeof  8, 16, 32 or 64, depending on the number of flags
>     .all     flags value with all flags set (same as ~.init)
I actually meant
    .sizeof  1, 2, 4 or 8, depending on the number of flags
    .all     flags value with all flags set
             (with ~x.all==x.init and ~x.init==x.all)

Steve Horne schrieb:
> On Wed, 10 Jan 2007 00:39:03 -0500, "Jarrett Billingsley"
> <kb3ctd2 at yahoo.com> wrote:
> 
>> "Rueschi" <rueschi at GIquadrat.de> wrote in message 
>> news:eo1nft$1831$1 at digitaldaemon.com...
> 
>>> many functions may be controlled by some flags that are combined
>>> in an uint value. However, this is not type safe. You can unintentionally
>>> provide/combine flags that are not intended for that function.
>>> The solution would be a "flags" data type, similar to enums:
>> I have wanted something like this so, so many times.  Enums make you 
>> expliitly number the flags and then you have to use a plain uint param to 
>> accept flags.. it's just a pain.  Something like this would be fantastic. 
> 
> I'm pretty sceptical about this.
> 
> The thing is that I tend to want some things that aren't single-bit
> flags - some things that use several bits. Or at least the ability to
> test several flags in one go. To do that, you need masks to isolate
> each group of bits as well as the various values for each group.
> Having a flags feature that could cope with generalisations like that
> would probably be more complex than its worth, and if it can't handle
> these generalisations, to me it's worse than worthless. Yes, worse,
> since you'd end up wasting time converting other peoples flag sets to
> enumerates with explicit values when the requirements change and you
> need those generalisations.

Doesn't D allow you to write
  alias FEATURE1|FEARURE2 FEATURE_SET;
or similar? When testing, use
  if ( FEATURE_SET==given_flags ) { /* all and only FEATURE_SET given */ }
or
  if ( FEATURE_SET<>given_flags ) { /* at least all FEATURE_SET given */ }


> Another common requirement is to pack a couple of flags into the most
> significant bits of an integer that doesn't need all its bits. This
> means having flag values that start high and count downward, and
> having a mask to extract the integer itself as well. For example, in a
> multiway tree data structure (such as a B+ tree database index) I
> might pack leaf-node and root-node flags into the high two bits of a
> node-size field.

This is contrary to the statement of yours, on modern
multi-megabyte-machines the additional space of one byte would
not be worth the effort. Anyway you can write something like:

  alias uint NodeData;

  flags NodeFlags
  {
    ROOT        = 1 << (NodeData.sizeof*8 - 1),
    LEAF        = 1 << (NodeData.sizeof*8 - 2)
  };
  static_assert( NodeFlags.sizeof == NodeData.sizeof, "size conflict" );
  static_assert( !(cast(NodeData)NodeFlags.all<<NodeFlags.count),
                 "unused MSBs" );

  struct Node
  {
  private:
    union
    {
      NodeData     nodedata;   /* I really miss int(0..0x3FFFFFFF) */
      NodeFlags    nodeflags;
    } = { nodeintval : 0 };

  public:
    NodeFlags flags()
      { return cast(NodeFlags)(nodata & cast(NodeData) NodeFlags.all); }
    NodeData data()
      { return nodedata & ~(cast(NodeData) nodeflags.all); }

    NodeFlags flags( NodeFlags f )
    {
      nodedata = data | cast(NodeData) f;
      return f;
    }
    NodeData data( NodeData i )
    {
      assert( (i<<NodeFlags.count)>>NodeFlags.count == i, "overflow" );
      nodedata = i | cast(NodeData) flags;
      return i;
    }
  };

/* The casts of flags to integral types are quite ugly. The language
 * should provide such casts via property, e.g. ".integral". This leaves
 * the implementation free to gamble with unused bits and mask them out
 * only when this property is used.
 */

Alternatively, if you are the master of the data layout, you could keep the
flags in the LSBs and shift the integral value.

> A simple one-bit-per-flag counting upwards from bit zero is, for me, a
> very rare exceptional case. The only cases where I remember it
> happening are in artificial teaching examples back in college. The
> nearest real-world programming match I have is a C++
> set-of-unsigned-integers class that I use a lot which packs bitsets
> into an underlying vector of integers, but the flags are associated
> with integer values (normally some kind of UID), not identifier names.

My approach makes no assumption to which bits are used. A compiler writer
may even choose to use one byte per flag, as long as I get my type safety
and operations (although he should seriously think about changing his
field of activity *g*). In my business environment, flags are used very
heavily. So much that I even dare to suggest multiple inheritance for flags:

  flags Permissions
  {
     READ,
     WRITE,
     EXECUTE,
     ALL = READ|WRITE|EXECUTE,
     NONE = Permissions.init
  };
  flags FilePermissions : (Permission:Owner),
                          (Permission:Group),
                          (Permission:Others)
  { STICKY };


  with ( FilePermissions )
    file.setPermissions( Owner.ALL | Group.READ | Others.NONE | STICKY );


Even more valuable would be the construction from enums:

  enum Unknown { UNKNOWN };
  enum Genre : Unknown { CLASSIC, HEAVYMETAL, LULLABY };
  enum Format : Unknown { VORBIS, MP3, WAV };

  flags FGenre : Genre {};
  flags FFormat : Format {};
  flags FSearch : FGenre, FFormat {};

  void search( string name, FSearch f, ... );

  with( FSearch )
    search( "happy birthday", HEAVYMETAL | FFormat.UNKNOWN, ... );

(virtual inheritance is not needed, at least I can't imagine such a scenario)

> Also, packing bits into an integer tends to be slower than just using
> a boolean type, and since the saving in memory is usually small
> (saving a couple of bytes here and there is pointless on machines with
> megabytes, let alone gigabytes, of memory) there's no gain to pay for
> the extra complexity. You should only do it if you have a specific
> good reason.

That is simply not true. When you have to pass flags as parameters, it
is faster to combine them in one integral value instead of pushing and
popping multiple bytes on and from the stack. It is waste of memory and
cache pollution to represent a logical bit by a physical byte.

But the main speedup comes when you want to compare more than one
(logical related) conditions in one operation. It is easier and faster
to compare against multiple flags if they are combined in one integral
value than doing byte-wise comparisions. Even in some situations where
integer operations on separated flags produce optimized code, the
single-word variant doesn't drag behind:

  flags AnsiColor { RED, GREEN, BLUE, BLINK };

The comparing part of the function

  void draw( string text, bool red, bool green, bool blue, bool blink )
  {
    if ( !blink && (red+green+blue)==1 ) { /* use only one plane */ }
  }

may compile to:

    mov    eax, [esp+blink]   ; arguments are passed as 32bit
    sub    eax, 1
    jnc    label1
    mov    edx, [esp+blue]
    mov    eax, [esp+green]
    lea    edx, [eax+edx-1]
    add    edx, [esp+red]
    jnz    label1
    jmp    use_only_one_plane
  label1:

while the alternative using flags

  void draw( string text, AnsiColor color )
  if (  !blink && (color==AnsiColor.red
                   || color==AnsiColor.green
                   || color==AnsiColor.blue) ) { /* draw only one plane */ }

may, if bit flags are used, compile to:

    mov    eax, [esp+color]
    test   eax, 0x08
    jnz    label1
    and    eax, 0x07
    mov    edx, eax
    shr    edx, 2
    and    eax, 0x03
    dec    edx
    shr    eax, 1
    adc    eax, edx
    jnz    label1
    jmp    use_only_one_plane
  label1:

These instructions use the same number of registers, take approximately
2 non-parallel instruction more but save 3 dwords on the stack.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (MingW32)

iD8DBQFFx4VJxpVjSwvEWI4RApFQAJ0cu06Fxs9mlgtAGtA4nsnbPF2Z5wCePV3P
cPmbceN2TZSLohtKcn1HJ8M=
=Hzm5
-----END PGP SIGNATURE-----



More information about the Digitalmars-d mailing list