[PyKDE] QString to Python string conversion trouble

Andreas Gerstlauer gerstl at ics.uci.edu
Wed Oct 24 02:37:04 BST 2001


> It's not acceptable to get an exception when applying str() to a QString

Why not? Even in Python itself it is handled like that:

>>> s = u"Test\u0400Test"
>>> print str(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: ASCII encoding error: ordinal not in range(128)

> you should get a Python string object with any characters > 128 properly escaped.
> 
If you are dealing with unicode strings you shouldn't use str() on them.
That's how I understand it works in Python (see above, a Python unicode is *not*
escaped if passed through str()). A string object can't handle chars > 127
under the default ASCII encoding. If you want those encoded in some other way
you'll have to do that manually (i.e. via encode()).
Should that be different in PyQt?
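
To make the "do it manually via encode()" point concrete, here is a minimal sketch. It is written in present-day Python 3 syntax, not the Python 2 of the transcript above, so it only approximates the behaviour under discussion. The helper name to_ascii is made up for illustration; the error handler names ("strict", "replace", "backslashreplace") are the standard codec ones.

```python
# Sketch in Python 3 syntax; the transcript above is Python 2.
text = "Test\u0400Test"


def to_ascii(value, errors="strict"):
    """Encode text to ASCII bytes using the given codec error handler.

    to_ascii is a hypothetical helper, not part of PyQt or Python.
    """
    return value.encode("ascii", errors)


# The default ("strict") handler raises, like str() on a unicode object.
try:
    to_ascii(text)
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Explicit choices the caller can make instead of getting an exception.
escaped = to_ascii(text, "backslashreplace")  # keep the information as an escape
replaced = to_ascii(text, "replace")          # lossy: substitutes "?"
utf8 = text.encode("utf-8")                   # a real encoding, no escaping

print(strict_failed, escaped, replaced, utf8)
```

The point is only that the caller chooses the encoding and the error policy explicitly; nothing is escaped behind their back.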

> When I replied originally I was away from my system. When I actually repeat 
> your test I don't get the same result. My transcript is...
> 
> >>> from qt import *
> >>> l = "Test\nTest"
> >>> print l
> Test
> Test
> >>> s = QString(l)
> >>> print str(s)
> Test\nTest
>
OK, but still: the newline comes out escaped as the two-character string
"\\n" rather than as a real newline.
 
> >Not in the case of the default ASCII encoding used by unicode() at least...
> 
> unicode() does not use the default encoding.
>
Not sure I understand. According to what I read in the Python docs,
unicode() uses ASCII encoding by default unless given an encoding as second
parameter. And the ASCII encoding chokes on anything > 127.
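
A rough illustration of what I mean, again sketched in present-day Python 3 syntax, where bytes.decode() stands in for calling unicode() on a string object. This is an analogy under that assumption, not a transcript from the Python version discussed here.

```python
# Sketch in Python 3 syntax: bytes.decode() plays the role of unicode().
raw = b"Test\xd0\x80Test"  # the UTF-8 bytes of "Test\u0400Test"

# Decoding as ASCII fails on the first byte outside the 0..127 range.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Naming the encoding explicitly (the "second parameter") works.
decoded = raw.decode("utf-8")

print(ascii_ok, decoded == "Test\u0400Test")
```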
 
> You are basically asking me to restore the previous behaviour - which 
> Boudewijn and Marc-Andre (eventually) convinced me was incorrect.
>
No, not really. As far as I understood, the previous behavior was that
__str__ was returning a Python *string* object with no special handling of
unicode characters. In that case, 'unicode(QString(u"\u0411\u0412"))'
won't work.
What I was saying is that __str__ should return a Python *unicode* object,
which will then go through the unicode() function unmodified (while the 
str() function converts it automatically). See my "Test" class example 
in the previous mail. Returning a unicode in __str__ works fine.

> As I said - that is not Ok. Also, __str__ has to return a string
> object. In  Python 2.2 __unicode__ returns a Unicode object. 

The solution via separate __str__ and __unicode__ for returning a string
vs. a unicode in 2.2 is the perfect one, I agree.

However, in versions previous to 2.2, both unicode() and str() call
the __str__ method. And from what I read, __str__ is allowed to return
a unicode object. See the following threads:
  http://mail.python.org/pipermail/python-dev/2000-November/010469.html
  http://mail.python.org/pipermail/python-dev/2001-January/011799.html

On the other hand, in light of 2.2, where __str__ should probably only
return a string, and if you really want str() not to raise an exception,
how about the following: __str__ returns a string object in which only
chars > 127 are escaped (all others, <= 127, are passed through without escaping)?
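
Roughly what I have in mind, as a sketch in present-day Python 3 syntax. The function name escape_non_ascii is hypothetical and this is not how PyQt implements __str__; it just shows the proposed behaviour using the standard "backslashreplace" error handler.

```python
def escape_non_ascii(text):
    """Return text with only the non-ASCII characters backslash-escaped.

    Hypothetical helper: ASCII characters, including control characters
    such as a newline, pass through unchanged.
    """
    return text.encode("ascii", "backslashreplace").decode("ascii")


sample = "Test\n\u0411\u0412"
escaped = escape_non_ascii(sample)

print(repr(escaped))
```

One caveat with this approach: a literal backslash already present in the input is not escaped itself, so the result cannot always be converted back unambiguously.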

> The first thing to try is upgrading your version of Python and we can take it 
> from there.
> 
Once I find the time for that.... ;-)

Andreas






