[Issue 2964] New: Reading string into associative array key garbles string

d-bugmail at puremagic.com
Mon May 11 16:30:34 PDT 2009


http://d.puremagic.com/issues/show_bug.cgi?id=2964

           Summary: Reading string into associative array key garbles
                    string
           Product: D
           Version: 1.043
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: DMD
        AssignedTo: bugzilla at digitalmars.com
        ReportedBy: djd at mailinator.com


Created an attachment (id=363)
 --> (http://d.puremagic.com/issues/attachment.cgi?id=363)
.tar.gz file with D1 code illustrating bug and one-line sample input text file

Either I'm doing something dumb, or I've found a bug where a string gets
trashed between storing it as key in an associative array and then getting it
back out.

The weird thing is it only happens when the string is read in from a file. 
Adding the same string as a literal doesn't trigger it.  

The attached D1 code simply reads in each line from a BufferedFile, storing it
as a key in a uint[string] AA that counts how many times each line occurred.  It
verifies that the line is valid UTF-8 going in.  It then loops over the keys in
the AA, verifying that they're valid UTF-8 and printing them out.  However, the
key fails this second validation, and trying to print it out gives an error.  I
don't think there's anything special about the particular string that I'm using.
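
For readers without the attachment, the steps described above can be sketched
roughly as follows. This is a reconstruction in D1 with Phobos, not the code
from attachment 363: the function name countLines and the command-line
handling are made up for illustration, and the use of foreach to read lines is
a guess. The sketch also assumes, without confirmation from this report, that
the line handed to the loop body may live in a buffer that is reused between
iterations, so it copies the line with .dup before storing it as a key.

```d
// D1 / Phobos 1 sketch; NOT the code from attachment 363.
import std.stdio;
import std.stream;
import std.utf;

// Hypothetical helper: count how often each line occurs in a stream.
uint[char[]] countLines(InputStream input)
{
    uint[char[]] counts;
    foreach (char[] line; input)
    {
        std.utf.validate(line);      // throws if the line is not valid UTF-8
        if (auto p = line in counts)
            (*p)++;                  // key already present: bump its count
        else
            counts[line.dup] = 1;    // assumption: `line` may alias a reused
                                     // buffer, so store a private copy
    }
    return counts;
}

int main(char[][] args)
{
    if (args.length < 2)
    {
        writefln("usage: %s FILE", args[0]);
        return 1;
    }
    auto file = new BufferedFile(args[1]);
    scope (exit) file.close();

    auto counts = countLines(file);
    foreach (key; counts.keys)
    {
        std.utf.validate(key);       // second validation, as in the report
        writefln("%s: %s", key, counts[key]);
    }
    return 0;
}
```

If the original program stores the line without copying it, comparing its
behaviour with and without the .dup would show whether buffer reuse is
involved.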

I verified this with three compilers on two operating systems:
DMD 1.043 on Ubuntu 8.10 x86_64
gcc version 4.1.3 20070831 (prerelease gdc 0.25, using dmd 1.021) (Ubuntu
0.25-4.1.2-16ubuntu1)
gdcmac trunk r229 (based on gcc 4.0.1) on Mac OS X 10.5.5 x86_64 

Here is some sample output:

Reading data...
Matched bad input.
Read 1 lines, 1 unique (0 non-UTF).
Checking...
2nd validate: string
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\80\245\34\158\255\127\0\0\144\180\123\1\0\0\0\0\112\243\34\158\255\127
didn't validate as UTF
Error: 4invalid UTF-8 sequence

The garbled string printed out (as decimal chars) varies each time under Linux,
perhaps suggesting it's reading some memory it oughtn't?


More information about the Digitalmars-d-bugs mailing list