[PyQt] PyQt cannot transform QString into str when reading emoji symbol from QClipboard

Ilya Kulakov kulakov.ilya at gmail.com
Fri Jan 23 07:40:52 GMT 2015


This workaround does not work on Python 3.4.2, PyQt 5.4:

UnicodeDecodeError: 'utf-16-le' codec can't decode bytes in position 0-1: unexpected end of data

> On 23 Jan 2015, at 2:57, Pavel Roskin <proski at gnu.org> wrote:
> 
> This would decode surrogates!
> 
> import array
> string = QApplication.clipboard().text()
> # string = '\U0001f637'
> # string = '\ufeff\ud83d\ude87'
> try:
>    # sane case - valid unicode
>    string.encode('utf-8')
> except UnicodeEncodeError:
>    # insane case - need to decode surrogates
>    string = array.array('H', map(ord, list(string))).tobytes().decode('utf-16')
> print(string)
> 
> The string is split into characters, converted to integers, packed as
> 16-bit unsigned int, converted to bytes and decoded as UTF-16. Real
> characters over 0xffff would raise OverflowError in that expression.
> That's why it's a fallback if UTF-8 encoding doesn't work.
> 
> Of course it's a workaround. QApplication.clipboard().text() should
> not return surrogates.
> 
> -- 
> Regards,
> Pavel Roskin
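
For reference, a minimal sketch of another way to merge surrogate pairs in a Python 3 str, which does not go through the array module. It assumes Python 3.4 or later, where the UTF-16 codecs accept the 'surrogatepass' error handler. The function name merge_surrogates is made up for this illustration, and replacing an unpaired surrogate with U+FFFD is one possible policy, not something PyQt prescribes. It has not been checked against the clipboard case reported above.

```python
def merge_surrogates(text):
    """Combine UTF-16 surrogate pairs in `text` into single code points.

    Hypothetical helper for illustration; not part of PyQt.
    """
    try:
        # Sane case: the string holds no surrogates, return it unchanged.
        text.encode('utf-8')
        return text
    except UnicodeEncodeError:
        pass
    # 'surrogatepass' lets the encoder emit surrogates as raw 16-bit units
    # (supported by the UTF-16 codecs since Python 3.4).
    raw = text.encode('utf-16-le', 'surrogatepass')
    # Decoding joins valid high/low pairs; 'replace' turns an unpaired
    # surrogate into U+FFFD instead of raising UnicodeDecodeError.
    return raw.decode('utf-16-le', 'replace')


if __name__ == '__main__':
    fixed = merge_surrogates('\ud83d\ude37')
    print(len(fixed), hex(ord(fixed)))
```

Using 'utf-16-le' explicitly on both sides keeps a leading '\ufeff' in the string rather than treating it as a byte order mark. Characters above 0xffff that are already valid pass through the first branch untouched.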
