[PyKDE] How do you get HTML source from konqueror/KHTMLPart?

yichun wei yichun.wei at gmail.com
Thu Dec 21 01:40:03 GMT 2006


On 12/20/06, Marcos Dione <mdione at grulic.org.ar> wrote:
> On Wed, Dec 20, 2006 at 10:59:06AM -0800, yichun wei wrote:
> > I am trying to grab some html pages via KHTMLPart.openURL and scrape
> > the content I get. However I am not able to read out the HTML document
> > sources I have in KHTMLPart.
>
>     just call:
>
> domDocu= part.document ()
> html= domDocu.toString ().string ()
>
>     that's a QString.
>
> > toHTML() seemed to return nothing (None or ""), while toString() gave
> > me an exception and my script crashed:
>
>     yes, under certain circumstances that happens. I think it's because
> the KHTMLPart has no parentWidget or no parent or both. if you set up the
> whole apparatus for showing the part, everything works just fine.

Thanks a lot Marcos. I was using Jim Bublitz's
doc/examples/pyKHTMLPart.py and modified it from there. It appears to
me that parentWidget for the KHTMLPart is not 0:

class pyPartsMW(KParts.MainWindow):
    def __init__(self, *args):
        ...
        self.w = KHTMLPart(self, "HTMLart", self)
        self.w.openURL (KURL("http://www.kde.org"))
        domDocu = self.w.document ()
        html = domDocu.toString().string()
        ...

Then I got the error message:

terminate called after throwing an instance of 'DOM::DOMException'

at the point where .toString() is called. Could anything cause this
other than passing 0 for the parent arguments when the KHTMLPart object
is created?
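
One possibility I have not ruled out is that openURL() returns before the
page has loaded, so document() still hands back an empty (null) document
when toString() is called right afterwards. The following is only a
minimal sketch of a guard for that case. The helper name document_source
is made up for illustration, and it assumes the Python DOM wrapper exposes
isNull() the way DOM::Node does in C++. It uses duck typing, so it does
not import PyKDE itself.

```python
# Sketch only: `part` is assumed to be a KHTMLPart-like object.
# document_source is a hypothetical helper, not part of PyKDE.

def document_source(part):
    """Return the part's HTML source, or None if no document is loaded yet."""
    dom_document = part.document()
    # Assumption: the DOM wrapper exposes isNull(), as DOM::Node does in C++.
    if dom_document.isNull():
        return None
    # In PyKDE this value would be a QString, as noted earlier in the thread.
    return dom_document.toString().string()

# Hypothetical wiring inside pyPartsMW.__init__ (PyQt3 style, untested):
#     self.connect(self.w, SIGNAL("completed()"), self.slotLoaded)
# where slotLoaded would call document_source(self.w) once loading finishes.
```

If the guard returns None immediately after openURL(), that would suggest
the exception comes from reading the document too early rather than from
the parent arguments.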

- yichun



