DirEntry on Windows - wstring variant?

dcrepid via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Fri Oct 24 18:11:26 PDT 2014


On Friday, 24 October 2014 at 22:53:15 UTC, Jonathan M Davis via 
Digitalmars-d-learn wrote:
>
> Also, given how DirEntry works internally, I'd definitely be
> inclined to argue that it would be too much of a mess to support
> wstring unless it's by simply converting the name to a wstring
> when requested (which is kind of pointless, since you can just do
> to!wstring on the name if that's what you want). Making it
> support wstring directly would involve a lot of code duplication,
> and it would increase the memory footprint, because the structs
> involved would then have to hold the path and whatnot as both a
> string and wstring. So, I question that it's at all worth it to
> try and make dirEntries support wstring.

I would suggest that the name be kept as a wstring inside the
DirEntry structure, rather than being converted twice as you
suggest. Then a decision can be made as to whether .name() returns
a string or a wstring. If backwards compatibility is a concern,
.name() could convert to a string on each call. That would break
its nothrow promise, though. Adding something like .wname() would
work here for getting the native wstring, I suppose.
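
As a rough sketch of the .wname() idea (the type name WideEntry is
made up for illustration, and this is not how Phobos' DirEntry is
actually implemented):

```d
import std.conv : to;

// Hypothetical type for illustration only; not Phobos' DirEntry.
struct WideEntry
{
    private wstring _wname; // path kept in its native UTF-16 form

    this(wstring path) { _wname = path; }

    // Hands back the stored UTF-16 path; no conversion, so it can be nothrow.
    @property wstring wname() nothrow
    {
        return _wname;
    }

    // Converts on every call; the conversion can throw on invalid UTF-16,
    // which is why this accessor cannot be marked nothrow.
    @property string name()
    {
        return _wname.to!string;
    }
}
```

Callers that only talk to the Windows API would use .wname and never
pay for a conversion; existing code that uses .name keeps working.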

Another alternative is to have a union of string and wstring, and
a bool indicating which one is stored internally. Of course, the
.name and .wname properties would need to check it and convert
depending on how the name is stored. It's not pretty, but it's
just another possibility.
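
Something along these lines, as a minimal sketch (DualEntry and its
field names are invented here, and the union access makes this
@system code):

```d
import std.conv : to;

// Hypothetical type for illustration only.
struct DualEntry
{
    // Only one of these is valid at a time; _isWide says which.
    private union
    {
        string  _narrow;
        wstring _wide;
    }
    private bool _isWide;

    this(string s)  { _narrow = s; _isWide = false; }
    this(wstring w) { _wide = w;   _isWide = true; }

    // Converts only if the name is stored as UTF-16.
    @property string name()
    {
        return _isWide ? _wide.to!string : _narrow;
    }

    // Converts only if the name is stored as UTF-8.
    @property wstring wname()
    {
        return _isWide ? _wide : _narrow.to!wstring;
    }
}
```

The struct stays the size of one slice plus a flag, at the price of a
branch in every accessor.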

The whole point is that a lot of time is wasted on UTF-16 to UTF-8
conversions (and back) when using these library functions.
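
To make the round trip concrete, here is a small sketch of what
effectively happens to each name. roundTrip is a hypothetical helper;
the real conversions happen inside the library and at the call site,
not in one function:

```d
import std.utf : toUTF8, toUTF16;

// Hypothetical helper that mimics the double conversion of one file name.
wstring roundTrip(wstring nativeName)
{
    // Conversion 1: the UTF-16 name from the OS becomes a UTF-8 string.
    string utf8 = nativeName.toUTF8;

    // Conversion 2: the string is turned back into UTF-16 before it can
    // be passed to a wide-character Windows function.
    return utf8.toUTF16;
}
```

The result is the same name that went in, so for a program that only
passes paths back to Windows both conversions are pure overhead.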

> And we definitely don't want to encourage the use of wstring.
> It's there for when you need it (which is great), but programs
> really should be using string if they don't actually need to use
> wstring or dstring.

I get that wstring on the whole is ugly, but it's the native
Unicode string type on Windows. If someone is doing serious work
on Windows, wstring will eventually need to be used. It'd be nice
to keep the abstraction of string at every level of a program,
but on Windows it's impossible. The standard library, even if it
were comprehensive enough, will never cover every corner case
where strings are needed. Whether using the Windows API, COM, or
interfacing with other Windows libraries, wstring will still rear
its ugly head.

But, idealism aside, there are good reasons for keeping the
pathname in its native format on Windows:
- If a program is processing lots of files, a lot of cycles are
wasted on those wstring->string conversions.
- Doing anything more with the files than listing them will
probably require a string->wstring conversion on each call to
Windows to open the file or query information about it, which
wastes more cycles.
- Additionally, Windows has a peculiar way of handling long
pathnames: it requires a "\\?\" prefix, which only works with the
Unicode versions of its functions. This also makes the pathname
uniquely OS-specific.
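
For the long-path case, a sketch of the kind of helper I mean.
withLongPathPrefix is a made-up name, and it assumes an absolute
drive-letter path; UNC paths use a different prefix form and are not
handled here:

```d
import std.algorithm : startsWith;

// Hypothetical helper: prepend the extended-length prefix to an absolute
// path that is already in its native UTF-16 form.
wstring withLongPathPrefix(wstring absPath)
{
    enum prefix = `\\?\`w;

    // Leave the path alone if it already carries the prefix.
    return absPath.startsWith(prefix) ? absPath : prefix ~ absPath;
}
```

With the name already held as a wstring, the prefixed path can go
straight to the wide-character functions without another conversion.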

Anyway, some things to think about.

