Crash my webserver!

Vladimir Panteleev thecybershadow.lists at gmail.com
Sun May 14 11:28:43 UTC 2023


On Sunday, 14 May 2023 at 10:56:29 UTC, Andrea Fontana wrote:
>> It returns mojibake. However, only for URL and form parameters.
>>
>> Normally these get percent-encoded by user-agents though.
>
> Hmm I don't think you can use utf-8 encoding in your request. I 
> think everything must be encoded as old US-ASCII.
>
> How can I understand in advance what encoding you're using, 
> otherwise? You could use utf-8 or big5 but I couldn't tell, or 
> am I missing something?

Well, bytes are bytes until you decide to look at them in a 
certain way. Yeah, the input may be invalid as per the spec; 
however, if the mojibake indicates that you're decoding the bytes 
twice, you're probably doing something that's at least 
unnecessarily inefficient.
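
To illustrate the "bytes are bytes" point, here is a minimal Python sketch (not the server's actual code, which is not shown in this thread). The parameter value "caf%C3%A9" is a made-up example of what a user-agent would typically send for "café". Percent-decoding only recovers bytes; choosing an encoding is a separate step that should happen once.

```python
from urllib.parse import unquote_to_bytes

# Hypothetical form/URL parameter value: "café" with the "é"
# percent-encoded as its two UTF-8 bytes.
encoded_param = "caf%C3%A9"

# Step 1: undo the percent-encoding. The result is raw bytes,
# with no character encoding implied yet.
raw = unquote_to_bytes(encoded_param)

# Step 2: interpret the bytes exactly once. UTF-8 is an assumption
# here; the request itself does not say which encoding was used.
text = raw.decode("utf-8")

print(raw)   # the five raw bytes
print(text)  # the decoded string
```

If the bytes are not valid UTF-8, the decode step raises UnicodeDecodeError, which is one place to reject input that is invalid as per the spec.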

Maybe you're passing the bytes as char arrays to std.algorithm, 
which produces dchars, which are then being cast into char before 
decoding again? I think that would produce this sort of mojibake.
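
For reference, the general shape of this kind of mojibake can be reproduced in a few lines of Python. This is only an analogue under stated assumptions: it does not use D or std.algorithm, and it is not confirmed to be what the server does. It models one common mistake, where each byte of already-encoded UTF-8 is treated as a code point of its own (equivalent to a Latin-1 decode) and the result is encoded as UTF-8 a second time.

```python
original = "café"

# Correct UTF-8 bytes as they would arrive on the wire.
wire_bytes = original.encode("utf-8")

# The mistake: widen every byte to its own code point. Decoding as
# Latin-1 does exactly that, since it maps byte N to U+00NN.
widened = wire_bytes.decode("latin-1")

# Encoding the widened string as UTF-8 again doubles up the
# non-ASCII bytes.
double_encoded = widened.encode("utf-8")

# What a client sees when it decodes the doubled bytes.
mojibake = double_encoded.decode("utf-8")

# The damage is reversible as long as no bytes were lost.
repaired = mojibake.encode("latin-1").decode("utf-8")

print(mojibake)
print(repaired)
```

The "Ã" followed by another symbol in place of a single accented letter is the usual signature of this double handling, as opposed to a plain wrong-charset guess on the client side.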



More information about the Digitalmars-d mailing list