Sourcecode: webcheck

def crawler::Link::__checkurl (   self,
  url 
) [private]

Check whether the URL is formatted properly, correct the formatting
where possible, and log any formatting problem against the current page.

Definition at line 365 of file crawler.py.

    def __checkurl(self, url):
        """Check whether the url is formatted properly, correct the formatting
        where possible, and log any formatting problem to the current page."""
        # search for spaces in the url
        if _spacepattern.search(url):
            self.add_pageproblem('link contains unescaped spaces: %s' % url)
            # replace spaces by %20
            url = _spacepattern.sub('%20', url)
        # find anchor part
        try:
            # get the anchor
            anchor = _anchorpattern.search(url).group(1)
            # get link for url we link to
            child = self.site.get_link(url)
            # store anchor
            child.add_reqanchor(self, anchor)
        except AttributeError:
            # ignore problems looking up the anchor
            pass
        return url
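
The two checks above (escaping spaces and extracting the anchor) can be tried outside the crawler with a minimal standalone sketch. The two regular expressions below are assumptions made for illustration: the real `_spacepattern` and `_anchorpattern` are defined elsewhere in crawler.py and may differ. The function name `check_url` and its tuple return value are likewise hypothetical, standing in for the calls to `add_pageproblem` and `add_reqanchor`.

```python
import re

# Hypothetical patterns: the real _spacepattern and _anchorpattern live
# elsewhere in crawler.py and may be defined differently.
_spacepattern = re.compile(r' ')
_anchorpattern = re.compile(r'#([^#]+)$')


def check_url(url):
    """Return (fixed_url, anchor, problems) for a raw link target."""
    problems = []
    # report unescaped spaces, then replace them by %20
    if _spacepattern.search(url):
        problems.append('link contains unescaped spaces: %s' % url)
        url = _spacepattern.sub('%20', url)
    # extract the anchor part, if the url has one
    match = _anchorpattern.search(url)
    anchor = match.group(1) if match else None
    return url, anchor, problems


if __name__ == '__main__':
    print(check_url('http://example.com/a b.html#top'))
```

This sketch tests the match object explicitly instead of catching `AttributeError`. That is a design choice of the sketch, not of the original method: it keeps an `AttributeError` raised by any later call from being silently ignored.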


