Escaping control in formatting

kenji hara k.hara.pg at gmail.com
Mon Apr 23 21:55:17 PDT 2012


On April 24, 2012 at 2:49, Denis Shelomovskij <verylonglogin.reg at gmail.com> wrote:
> 23.04.2012 21:15, kenji hara wrote:
>>
>> On April 24, 2012 at 1:14, Denis Shelomovskij <verylonglogin.reg at gmail.com> wrote:
>>>
>>> 23.04.2012 18:54, kenji hara wrote:
>>>
>>>
>>>> Please give us use cases. I cannot imagine why you would want to
>>>> change or remove the quotes but keep the contents escaped.
>>>
>>>
>>>
>>> Sorry, I should have mentioned that !' and !" are optional and not
>>> commonly used, and all !?* are very optional and are here just for
>>> completeness (IMHO).
>>>
>>> An example is generating a complicated string for C/C++:
>>> ---
>>> myCppFile.writefln(`tmp = "%!?"s, and %!?"s, and even %!?"s";`,
>>>                   str1, str2, str3)
>>> ---
>>>
>>>
>>> --
>>> Денис В. Шеломовский
>>> Denis V. Shelomovskij
>>
>>
>> While improving the std.format module, I settled on a design principle:
>> if you format a value with some format specifier, you should be able to
>> unformat the output with the same format specifier.
>>
>> Example:
>>     import std.format, std.array;
>>
>>     auto aa = [1:"hello", 2:"world"];
>>     auto writer = appender!string();
>>     formattedWrite(writer, "%s", aa);
>>
>>     aa = null;
>>
>>     auto output = writer.data;
>>     formattedRead(output, "%s", &aa);  // same format specifier
>>
>>     assert(aa == [1:"hello", 2:"world"]);
>>
>> More details:
>>
>> https://github.com/D-Programming-Language/phobos/blob/master/std/format.d#L3264
>>
>> I call this "reflective formatting", and it supports simple text-based
>> serialization and deserialization.
>> Automatic quoting and escaping of nested elements is necessary for this
>> feature.
>>
>> But your proposal would break this design very easily, making it
>> impossible to unformat the output reflectively.
>>
>> For these reasons, your suggestion is hard to accept.
>>
>> Kenji Hara
>
>
> Is there some misunderstanding?
>
> Reflective formatting is good! But it isn't always what you want. It is
> needed mostly for debugging. But debugging is only one of the two uses of
> formatting; the other is simply writing something somewhere.
>
> There are already some non-reflective constructs (like "%(%(%c%), %)" for a
> range and "X%sY%sZ" for strings), and I am just proposing more convenient
> ones, because every second time I use formatting, I use it for writing (I
> mean not for debugging).
>
>
> --
> Денис В. Шеломовский
> Denis V. Shelomovskij
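
For reference, here is a small sketch of how the existing specifiers mentioned above behave, next to the default reflective "%s". The sample values are made up for illustration; the "%-(" form is included only for comparison and is not part of the quoted proposal.

```d
import std.format : format;
import std.stdio : writeln;

void main()
{
    auto strs = ["hello", "world"];

    // Default "%s" is reflective: nested strings are quoted and escaped.
    writeln(format("%s", strs));            // ["hello", "world"]

    // Compound specifier with %s: elements are still quoted.
    writeln(format("%(%s, %)", strs));      // "hello", "world"

    // Non-reflective: each string is written char by char with %c.
    writeln(format("%(%(%c%), %)", strs));  // hello, world

    // The '-' flag on "%(" turns off element escaping.
    writeln(format("%-(%s, %)", strs));     // hello, world

    // Top-level strings are never quoted.
    writeln(format("X%sY%sZ", "a", "b"));   // XaYbZ
}
```

Only the first two forms produce output that can be read back unambiguously; the others lose the element boundaries as soon as a string contains ", ".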

My concern is that the proposal is too complicated and not useful
enough for general use cases.
You can emulate such formatting as follows:

import std.array, std.format, std.stdio;
import std.range, std.uni;
void main()
{
    auto strs = ["It's", "\"world\""];
    {
        // emulation of !?"
        auto w = appender!string();
        foreach (s; strs)
            formatStrWithEscape(w, s, '"');
        writeln(w.data);
    }
    {
        // emulation of !?'
        auto w = appender!string();
        foreach (s; strs)
            formatStrWithEscape(w, s, '\'');
        writeln(w.data);
    }
}
void formatStrWithEscape(W)(W writer, string str, char quote)
{
    writer.put(quote);
    foreach (dchar c; str)
        formatChar(writer, c, quote);
    writer.put(quote);
}
// copied from std.format, where it is a private helper
void formatChar(Writer)(Writer w, in dchar c, in char quote)
{
    if (std.uni.isGraphical(c))
    {
        if (c == quote || c == '\\')
        {
            // escape the active quote character and the backslash itself
            put(w, '\\');
            put(w, c);
        }
        else
            put(w, c);
    }
    else if (c <= 0xFF)
    {
        put(w, '\\');
        switch (c)
        {
        case '\a':  put(w, 'a');  break;
        case '\b':  put(w, 'b');  break;
        case '\f':  put(w, 'f');  break;
        case '\n':  put(w, 'n');  break;
        case '\r':  put(w, 'r');  break;
        case '\t':  put(w, 't');  break;
        case '\v':  put(w, 'v');  break;
        default:
            formattedWrite(w, "x%02X", cast(uint)c);
        }
    }
    else if (c <= 0xFFFF)
        formattedWrite(w, "\\u%04X", cast(uint)c);
    else
        formattedWrite(w, "\\U%08X", cast(uint)c);
}
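
Applied to the C/C++ use case quoted earlier, the same idea could be wrapped in a helper that returns only the escaped contents, with the quotes supplied by the format string. This is a sketch under assumptions: escapeBody is a hypothetical name (not part of Phobos), it handles fewer control characters than formatChar above, and the D-style \x, \u and \U escapes it emits may not all be valid in a C or C++ string literal.

```d
import std.array : appender;
import std.format : format, formattedWrite;
import std.stdio : writeln;
import std.uni : isGraphical;

// Hypothetical helper: escapes `str` for use inside a literal delimited
// by `quote`, without writing the surrounding quotes.
string escapeBody(string str, char quote)
{
    auto w = appender!string();
    foreach (dchar c; str)
    {
        if (c == quote || c == '\\')
        {
            // the active quote and the backslash get a leading backslash
            w.put('\\');
            w.put(c);
        }
        else if (isGraphical(c))
            w.put(c);
        else if (c == '\n')
            w.put(`\n`);
        else if (c == '\t')
            w.put(`\t`);
        else if (c <= 0xFF)
            formattedWrite(w, "\\x%02X", cast(uint) c);
        else if (c <= 0xFFFF)
            formattedWrite(w, "\\u%04X", cast(uint) c);
        else
            formattedWrite(w, "\\U%08X", cast(uint) c);
    }
    return w.data;
}

void main()
{
    // The quotes live in the format string, so plain "%s" is enough.
    auto line = format(`tmp = "%s, and %s";`,
                       escapeBody("It's", '"'),
                       escapeBody(`"world"`, '"'));
    writeln(line);  // tmp = "It's, and \"world\"";
}
```

With a helper like this the format specifier stays a plain "%s", so no new rule is needed in std.format itself.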

I can agree to making private functions in std.format, e.g. formatChar,
public but undocumented, but I cannot agree to adding such a complicated
rule to the supported format specifiers.

Kenji Hara

