Sourcecode: webcheck

def crawler::Link::_pagechildren(self) [private]

Determine the page children of this link, combining the children of
embedded items and following redirects.

Definition at line 559 of file crawler.py.

    def _pagechildren(self):
        """Determine the page children of this link, combining the children of
        embedded items and following redirects."""
        # if we already have pagechildren defined we're done
        if self.pagechildren is not None:
            return self.pagechildren
        self.pagechildren = set()
        # add my own children, following redirects
        for child in self.children:
            # follow redirects
            child = child.follow_link()
            # skip children for which follow_link() returned None
            if child is None:
                continue
            # set depth of child if it is not already set
            if child.depth is None:
                child.depth = self.depth+1
            # add child pages to our pagechildren
            if child.ispage:
                self.pagechildren.add(child)
        # add the children of my embedded elements
        for embed in self.embedded:
            # set depth of embed if it is not already set
            if embed.depth is None:
                embed.depth = self.depth
            # merge in children of embeds
            self.pagechildren.update(embed._pagechildren())
        # return the results
        return self.pagechildren
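
The behaviour of the method above can be tried out in isolation with a small stand-in class. This is only a sketch: the `Link` class below is a hypothetical, stripped-down substitute for `crawler.Link`, the `redirect` attribute is invented for illustration, and `follow_link()` is assumed to return the final redirect target, or None when the redirects form a loop. The real webcheck implementation may differ on all of these points.

```python
class Link:
    """Hypothetical minimal stand-in for crawler.Link (illustration only)."""

    def __init__(self, url, ispage=True, redirect=None):
        self.url = url
        self.ispage = ispage
        self.redirect = redirect   # assumed attribute: Link this one redirects to
        self.children = set()
        self.embedded = set()
        self.depth = None
        self.pagechildren = None

    def follow_link(self):
        """Assumed behaviour: return the final target, or None on a loop."""
        seen = set()
        link = self
        while link.redirect is not None:
            if link in seen:
                return None
            seen.add(link)
            link = link.redirect
        return link

    def _pagechildren(self):
        """Same logic as the listing above."""
        # the cached result also stops recursion on cyclic embeds
        if self.pagechildren is not None:
            return self.pagechildren
        self.pagechildren = set()
        for child in self.children:
            child = child.follow_link()
            if child is None:
                continue
            if child.depth is None:
                child.depth = self.depth + 1
            if child.ispage:
                self.pagechildren.add(child)
        for embed in self.embedded:
            if embed.depth is None:
                embed.depth = self.depth
            self.pagechildren.update(embed._pagechildren())
        return self.pagechildren


# a root page with a redirecting child, a stylesheet and an embedded frame
root = Link('/')
root.depth = 0
about = Link('/about')
old = Link('/old', redirect=about)       # redirects to /about
css = Link('/style.css', ispage=False)   # not a page, so it is left out
frame = Link('/frame', ispage=False)     # embedded item with its own child
news = Link('/news')
frame.children = {news}
root.children = {old, css}
root.embedded = {frame}

result = root._pagechildren()
print(sorted(link.url for link in result))   # ['/about', '/news']

# two links redirecting to each other: follow_link() gives up with None
loop_a = Link('/a')
loop_b = Link('/b', redirect=loop_a)
loop_a.redirect = loop_b
```

In this sketch the redirecting child `/old` is replaced by its target `/about`, the stylesheet is dropped because `ispage` is false, and `/news` is picked up through the embedded frame. The frame inherits the depth of the page that embeds it (0), while both page children end up at depth 1.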



Generated by Doxygen 1.6.0