Commandline arguments and UTF8 error

Daniel Keep daniel.keep.lists at gmail.com
Sun Feb 21 19:12:06 PST 2010



Nils Hensel wrote:
> Hello, group!
> 
> I have a problem writing a small console tool that needs to be given
> file names as commandline arguments. Not a difficult task, one might
> assume. But every time a filename contains an umlaut (ä, ö, ü etc.) I
> receive "Error: 4invalid UTF-8 sequence".
> 
> Here's the sample code:
> 
> import std.stdio;
> 
> int main(string[] argv)
> {
>    foreach (arg; argv)
>    {
>       writef(arg);
>    }
>    return 0;
> }
> 
> I use dmd v1.046 by the way.
> 
> How do I make the argument valid? I need to be able to use std.path and
>  std.file methods on the file names.
> 
> Any help would be greatly appreciated.
> 
> Regards,
> Nils Hensel

If you look at the real main function in src\phobos\internal\dmain2.d,
you'll see this somewhere around line 109 (I'm using 1.051, but it's
unlikely to be much different in an earlier version):

> for (size_t i = 0; i < argc; i++)
> {
>     auto len = strlen(argv[i]);
>     am[i] = argv[i][0 .. len];
> }
>
> args = am[0 .. argc];
>
> result = main(args);

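In other words, Phobos just slices the raw C strings it is handed and
never converts (or validates) them, so the arguments reach your main in
whatever encoding the system passed them in, not necessarily UTF-8.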

Tango does (tango\core\rt\compiler\dmd\rt\dmain2.d:238 for a recent-ish
trunk).
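
If switching runtimes is not an option, one possible workaround is to
repair the arguments yourself before handing them to std.path or
std.file. The following is only a minimal sketch, and it rests on an
assumption the thread does not confirm: that the offending bytes are
Latin-1 (ISO-8859-1), which covers ä, ö and ü. The helpers
latin1ToUtf8 and isValidUtf8 are hypothetical names made up for this
example; only std.utf.validate comes from Phobos.

```d
import std.stdio;
import std.utf;

// Hypothetical helper. ASSUMPTION: the input bytes are Latin-1, so each
// byte value is also its Unicode code point. Bytes below 0x80 are copied
// unchanged; bytes 0x80..0xFF become a two-byte UTF-8 sequence.
string latin1ToUtf8(string raw)
{
    string result;
    for (size_t i = 0; i < raw.length; i++)
    {
        ubyte b = cast(ubyte) raw[i];
        if (b < 0x80)
        {
            result ~= cast(char) b;
        }
        else
        {
            result ~= cast(char)(0xC0 | (b >> 6));   // lead byte
            result ~= cast(char)(0x80 | (b & 0x3F)); // continuation byte
        }
    }
    return result;
}

// Hypothetical helper: true if std.utf.validate accepts the string.
bool isValidUtf8(string s)
{
    try
    {
        validate(s);
        return true;
    }
    catch (Exception e)
    {
        return false;
    }
}

int main(string[] argv)
{
    foreach (arg; argv)
    {
        // Only touch arguments that are not already valid UTF-8.
        string fixed = isValidUtf8(arg) ? arg : latin1ToUtf8(arg);
        writefln("%s", fixed);
    }
    return 0;
}
```

The validity check is a heuristic: a Latin-1 string can, by coincidence,
also be a valid UTF-8 byte sequence, in which case it is left alone. If
the arguments arrive in some other code page, the conversion step would
have to be replaced with one that knows that code page.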

