[PyKDE] How do you get HTML source from konqueror/KHTMLPart?

yichun wei yichun.wei at gmail.com
Wed Dec 20 18:59:06 GMT 2006


Hi PyKDE list,

I am trying to fetch some HTML pages via KHTMLPart.openURL and scrape
their content. However, I am not able to read the HTML source of the
document loaded in the KHTMLPart.

Since 2005, kdelibs has had KHTMLPart::documentSource() in khtml, which
returns the source of the page, but in PyKDE I only found .document(). I
found the .toHTML() and .toString() methods, but neither solved my problem:

toHTML() seemed to return nothing (None or ""), while toString() threw
an exception and crashed my script:

terminate called after throwing an instance of 'DOM::DOMException'
KCrash: Application 'pywebscraper.py' crashing...
Could not find 'drkonqi' executable.
KCrash cannot reach kdeinit, launching directly.

How can I get the HTML source from the KHTMLPart object? Or is there
any way to get the HTML source of a web page from konqueror via dcop? I
found some discussions that pointed me to KIO.get, but it returns a
TransferJob, and I have no idea how to get a QString from a
TransferJob...
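
For reference, a transfer job of this kind is asynchronous: it delivers
the page in chunks through a data signal and reports completion through a
result signal, so the source has to be accumulated and joined at the end.
Below is a minimal, untested-against-PyKDE sketch of that accumulation
pattern. SourceCollector, on_data and on_result are hypothetical names, the
signal signatures in the comments are what kdelibs 3 is believed to use, the
UTF-8 default is an assumption about the page encoding, and the loop at the
bottom only simulates the job so that the snippet runs without KDE.

```python
class SourceCollector(object):
    """Hypothetical helper: gathers the chunks of one transfer into a string."""

    def __init__(self, encoding="utf-8"):
        self.encoding = encoding  # assumed page encoding, not detected
        self.chunks = []
        self.source = None        # stays None until the transfer has finished

    def on_data(self, job, chunk):
        # Would be connected to the job's data signal, presumably
        # "data(KIO::Job*, const QByteArray&)". Converting a real QByteArray
        # with bytes() is an assumption; here the chunk is already bytes.
        self.chunks.append(bytes(chunk))

    def on_result(self, job):
        # Would be connected to the job's result signal, presumably
        # "result(KIO::Job*)". Joins everything received so far and decodes it.
        self.source = b"".join(self.chunks).decode(self.encoding, "replace")


# Stand-in for the job: feed three chunks, then signal completion.
collector = SourceCollector()
for piece in (b"<html><body>", b"hello", b"</body></html>"):
    collector.on_data(None, piece)
collector.on_result(None)
print(collector.source)
```

In a real script the two methods would be connected to the job returned by
KIO.get before the event loop runs, and collector.source would only be
read after on_result has been called.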


Any help would be highly appreciated.

- yichun



