repack ubyte[] to use only 7 bits
Manolo via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Sat Dec 13 12:41:25 PST 2014
On Saturday, 13 December 2014 at 19:52:33 UTC, Charles Hixson via
Digitalmars-d-learn wrote:
>
> On 12/13/2014 03:20 AM, Manolo via Digitalmars-d-learn wrote:
>> On Saturday, 13 December 2014 at 10:09:27 UTC, Charles Hixson
>> via Digitalmars-d-learn wrote:
>>> Is there a standard way to do this? The code below is
>>> untested, as I haven't yet written the x7to8 routine, and
>>> came up with a better way to do what this was to accomplish,
>>> but it feels as if this should be somewhere in the standard
>>> library, if I could only find it.
>>>
>>> /** Repack the data from an array of ubytes into an array of
>>> ubytes of
>>> * which only the last 7 are significant. The high bit will
>>> be set only
>>> * if the byte would otherwise be zero. */
>>> byte[] x8to7 (ubyte[] bin)
>>> {
>>> ubyte[] bout;
>>> // bit masks: 0 => 0xfe = 11111110, 0x00 = 00000000
>>> // 1 => 0x7f = 01111111, 0x00 = 00000000
>>> // 2 => 0x3f = 00111111, 0x80 = 10000000
>>> // 3 => 0x1f = 00011111, 0xc0 = 11000000
>>> // 4 => 0x0f = 00001111, 0xe0 = 11100000
>>> // 5 => 0x07 = 00000111, 0xf0 = 11110000
>>> // 6 => 0x03 = 00000011, 0xf8 = 11111000
>>> // 7 => 0x01 = 00000001, 0xfc = 11111100
>>> if (bin.length < 1) return bout;
>>> int fByte, fBit;
>>> while (fByte < bin.length)
>>> { if (fByte + 1 == bin.length && fBit > 1) break;
>>> ubyte b;
>>> switch (fBit)
>>> { case 0:
>>> b = bin[fByte] / 2;
>>> break;
>>> case 1:
>>> b = bin[fByte] & 0x7f;
>>> break;
>>> case 2:
>>> ubyte b1 = (bin[fByte] & 0x3f) << 1;
>>> ubyte b2 = (bin[fByte + 1] & 0x80)
>>> >>> 7;
>>> b ~= (b1 | b2);
>>> break;
>>> case 3:
>>> ubyte b1 = (bin[fByte] & 0x1f) << 2;
>>> ubyte b2 = (bin[fByte + 1] & 0xc0)
>>> >>> 6;
>>> b ~= (b1 | b2);
>>> break;
>>> case 4:
>>> ubyte b1 = (bin[fByte] & 0x0f) << 3;
>>> ubyte b2 = (bin[fByte + 1] & 0xe0)
>>> >>> 5;
>>> b ~= (b1 | b2);
>>> break;
>>> case 5:
>>> ubyte b1 = (bin[fByte] & 0x07) << 4;
>>> ubyte b2 = (bin[fByte + 1] & 0xf0)
>>> >>> 4;
>>> b ~= (b1 | b2);
>>> break;
>>> case 6:
>>> ubyte b1 = (bin[fByte] & 0x03) << 5;
>>> ubyte b2 = (bin[fByte + 1] & 0xf8)
>>> >>> 3;
>>> b ~= (b1 | b2);
>>> break;
>>> case 7:
>>> ubyte b1 = (bin[fByte] & 0x01) << 6;
>>> ubyte b2 = (bin[fByte + 1] & 0xfc)
>>> >>> 2;
>>> b ~= (b1 | b2);
>>> break;
>>> default:
>>> assert (false, "This path should never be
>>> taken");
>>> } // switch (fBit)
>>> if (b == 0) bout ~= 0x80;
>>> else bout ~= b;
>>> fBit = fBit + 7;
>>> if (fBit > 7)
>>> { fByte++;
>>> fBit -= 7;
>>> }
>>> }
>>> }
>>
>> Are you trying to make a "kind-of" Variable-Length quantity
>> encoder ?
>>
>> eg:
>> 0b10101110: last bit not set, integrate 0b10101110 and stop
>> reading.
>> 0b10011001: last bit set, integrate 0b10011000 and continue to
>> next byte.
>>
>> http://en.wikipedia.org/wiki/Variable-length_quantity
>>
>> except that this algo limits the length to 24 bits. It was
>> used a lot with
>> MIDI, at a time when hardware memory was costly (eg inside
>> hardware synthesizer or workstations).
>>
> What I was trying to do was pack things into 7 bits so I could
> recode 0's as 128. I finally thought clearly about it and
> realized that I only needed to use one particular byte value (I
> chose 127) to duplicate so I could repack things with a string
> of 0's replaced by 127 followed by the length (up to 126) of
> zeros, and for 127 itself I'd just emit 127 twice. This was to
> pack binary data into a string that C routines wouldn't think
> had ended partway through. (If I get more than 127 zeros in a
> row, I just have more than one packing code.)
Sorry, I misunderstood the thing.
More information about the Digitalmars-d-learn
mailing list