Random string samples & unicode - Reprise

Mon Sep 13 02:29:22 PDT 2010

Pelle:

>  >>> from random import sample
>  >>> "äö"
> '\xc3\xa4\xc3\xb6'
>  >>> "".join(sample("äö", 2))
> '\xb6\xc3'
> 
> Doesn't work with utf8. The D version is clearly superior. :-)

On the other hand D/Phobos/DMD have several thousand problems, small, big and HUGE, that Python lacks :-)

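You are using Python 2.6.5, where a plain string literal is a byte string, so sample() picks individual UTF-8 bytes instead of characters. You need to use unicode strings ("u" prefix). This works correctly on both Windows and Linux with Python 2.6.6, if your source code is saved as UTF-8: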

# coding: utf-8
from random import sample
print u"äö".encode("utf-8")
print "".join(sample(u"äö", 2)).encode("utf-8")

Strings have been changed in Python 3.x, where unicode strings are the default, so there is no need to use the "u" prefix.
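
To make the difference concrete, here is a small Python 3 sketch (not the code from the thread; the variable names text, raw, shuffled and picked are just illustrative). It assumes the source file is saved as UTF-8, and it samples once from a str and once from its UTF-8 bytes:

```python
# Python 3 sketch: a str is a sequence of code points,
# while bytes is a sequence of raw byte values.
from random import sample

text = "äö"                   # two code points
raw = text.encode("utf-8")    # the same text as four UTF-8 bytes

print(len(text), len(raw))    # 2 4

# Sampling from the str picks whole characters,
# so the result is always a permutation of the input.
shuffled = "".join(sample(text, 2))
print(sorted(shuffled) == sorted(text))   # True

# Sampling from the bytes picks single bytes and may split a
# character in two, which is what happened in the quoted session.
picked = bytes(sample(raw, 2))
print(picked)
```

The last line prints two arbitrary bytes that are often not valid UTF-8 on their own, while the str version always yields the two original characters in some order.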

Mine was not a comparison, and it was not meant to show that Python is better. It was a way to put a possible problem with Phobos in the limelight.

Bye,
bearophile