[PyQt] PyQt cannot transform QString into str when reading emoji symbol from QClipboard

Phil Thompson phil at riverbankcomputing.com
Fri Jan 23 08:45:53 GMT 2015


On 22/01/2015 12:13 pm, Ilya Kulakov wrote:
> I'm testing the following symbol: 😷
> 
> I wrote a simple Objective-C application to check how the native
> frameworks would encode this into UTF-8. Here is the code:
> 
>     NSString *str = [[NSPasteboard generalPasteboard]
>         stringForType:@"public.utf8-plain-text"];
>     const char *cstr = str.UTF8String;
>     size_t i = 0;
>     while (cstr[i] != 0)
>     {
>         NSLog(@"0x%x", cstr[i]);
>         ++i;
>     }
> 
> Then I wrote a simple Qt app to ensure that the returned QString has
> the same bytes:
> 
>     QClipboard *clipboard = QApplication::clipboard();
>     QString originalText = clipboard->text();
>     QByteArray bytes = originalText.toUtf8();
>     for (size_t i = 0; i < bytes.count(); ++i)
>         qDebug("0x%x", bytes.at(i));
> 
> In both apps the output is:
> 
>     0xfffffff0
>     0xffffff9f
>     0xffffff98
>     0xffffffb7
> 
> However, when I extract the text using PyQt (Python 3):
> 
>     QApplication.clipboard().text()
> 
> the returned str cannot be encoded to UTF-8 due to the surrogate
> '\ud83d' at position 0.
> However, as you can see above, the UTF-8 bytes contain no such symbol.
> 
> That raises 2 questions:
> 1. How was this symbol introduced?
> 2. How should this be handled in an application?
> 
> The original bug report we received was from a Windows user, but we
> were not able to reproduce it there. However, it's pretty easy to
> reproduce on Mac OS X.
> 
> Best Regards,
> Ilya Kulakov

Should be fixed in tonight's PyQt4 and PyQt5 snapshots.
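
For anyone stuck on a build without the fix, a possible application-side
workaround is sketched below. It assumes the str that comes back holds the
two UTF-16 surrogate halves of the character (0xD83D and 0xDE37 for
U+1F637) instead of the single code point; that matches the '\ud83d'
reported at position 0, but it is an assumption about the broken output.
The helper name repair_surrogates is made up for illustration and is not
part of PyQt.

```python
def repair_surrogates(text):
    """Merge UTF-16 surrogate pairs left in a str into real code points."""
    # 'surrogatepass' lets the surrogate code points through the encoder.
    # Decoding the resulting UTF-16 bytes then joins each valid pair.
    return text.encode("utf-16-le", "surrogatepass").decode("utf-16-le")


# Assumed shape of the broken value: both halves of U+1F637 as separate
# code points, which is how UTF-16 (and so QString) stores the character.
broken = "\ud83d\ude37"
fixed = repair_surrogates(broken)

print(len(broken), len(fixed))
print(fixed.encode("utf-8").hex())
```

With a correctly paired input the repaired str encodes to the same four
UTF-8 bytes listed above (f0 9f 98 b7). An unpaired surrogate would still
raise UnicodeDecodeError in the decode step, so this only helps when both
halves are present.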

Thanks,
Phil

