[QScintilla] Using python strings with SendScintilla

Phil Thompson phil at riverbankcomputing.com
Sat Nov 8 15:47:17 GMT 2008


On Sat, 08 Nov 2008 15:07:09 +0000, Baz Walter <bazwal at ftml.net> wrote:
> i am using the following python method to get undecoded strings from 
> scintilla:
> 
>      def range(self, start, end, decode=True):
>          bytes = '\1' * (end - start)
>          self.SendScintilla(QSB.SCI_GETTEXTRANGE, start, end, bytes)
>          if decode:
>              return bytes.decode('utf8')
>          return bytes
> 
> it works fine, but i noticed some very odd behaviour when i used space 
> characters to create the string buffer. what would happen is that string 
> literals used elsewhere in my program would become overwritten by 
> different characters. so, for instance, my comment method would start 
> inserting '#b' at the beginning of a line instead of '# '. my guess 
> about the cause of this is that it has something to do with python's 
> interning of certain string literals (which includes all the single byte 
> strings from 0-255). there seems to be a risk that interned python 
> strings could become 'corrupted' when used to create a buffer that is 
> passed to SendScintilla.
> 
> unfortunately i can't provide example code that demonstrates this 
> problem, because it is very unpredictable. all i can say with certainty 
> is that if i use a control character to create the buffer, the problem 
> goes away. (unless it's '\0', which creates even less predictable 
> truncation problems).
> 
> obviously, i'm hoping that this is a bug that can be fixed as it took me 
> a *long* time to work out what was causing some very weird problems in 
> my program. but if not: what would be the safest way to create python 
> string buffers for use with SendScintilla - is the above method the best 
> i can do? (i think the use of '\1' for the buffer only works because 
> it's not being used as a string literal elsewhere in my program).
> 
> ps. the reason for bothering with all this is: speed. working with 
> undecoded strings is much faster for certain operations (like stripping 
> trailing spaces or replacing tabs).

So you are using SendScintilla() to bypass the immutability of Python
strings? In that case you deserve all you get.
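
The safer pattern is to hand the C side a buffer that is meant to be written to, instead of a str that Python assumes never changes. Below is a minimal sketch of that idea in Python 3 using ctypes. copy_range is a hypothetical helper, and ctypes.memmove only stands in for the write that SCI_GETTEXTRANGE would perform; whether SendScintilla() accepts such a buffer depends on the QScintilla bindings and is not confirmed here.

```python
import ctypes


def copy_range(document, start, end):
    """Copy document[start:end] (bytes) through a private mutable buffer."""
    chunk = document[start:end]
    size = len(chunk)
    # A freshly allocated, writable buffer that no other Python object shares.
    # One extra byte leaves room for a terminating NUL.
    buf = ctypes.create_string_buffer(size + 1)
    # Assumption: memmove stands in for the C code that fills the buffer;
    # it is not a QScintilla call.
    ctypes.memmove(buf, chunk, size)
    # .raw copies the buffer contents out as an ordinary immutable bytes object.
    return buf.raw[:size]
```

Because the buffer is allocated per call and never shared, writing into it cannot alter string literals used elsewhere in the program.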

If speed is a real problem (and not premature optimisation) then a better
solution would be to implement a new method (stripTrailingSpaces()) at the
C++ level.
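
Before moving anything to C++, it may be worth measuring a pure-Python version that works on the undecoded bytes. The following is a rough sketch, not an existing QScintilla method: strip_trailing_spaces is a hypothetical function, and it assumes the text uses "\n" line endings.

```python
def strip_trailing_spaces(data):
    """Remove trailing spaces and tabs from every line of a bytes object."""
    # Assumption: lines are separated by b"\n" only. With b"\r\n" endings the
    # b"\r" would shield the preceding whitespace from rstrip().
    return b"\n".join(line.rstrip(b" \t") for line in data.split(b"\n"))
```

Timing this on a representative document (for example with the standard timeit module) would show whether speed is a real problem.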

Phil

