Source code: webcheck

def crawler::Site::__init__(self)

Creates an instance of the Site class and initializes the
state of the site.
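As a rough illustration of how this state might be used, here is a minimal sketch. It is not the webcheck implementation: the class name SiteSketch, the add_internal_re and is_internal helpers, and the body given to add_internal are all assumptions made for the example. Only the attribute names and the add_internal(self, url) signature come from the listing on this page.

```python
import re


class SiteSketch:
    """Illustrative stand-in for crawler.Site (not the webcheck code)."""

    def __init__(self):
        # same attributes as the documented constructor
        self._internal_urls = []   # urls explicitly marked internal
        self._internal_res = {}    # assumed layout: pattern string -> compiled regexp
        self._external_res = {}
        self._yanked_res = {}
        self._robotparsers = {}    # scheme+netloc -> robots.txt handler
        self.linkMap = {}          # url -> Link object
        self.bases = []            # internal urls to start crawling from

    def add_internal(self, url):
        # assumed behaviour: remember each internal url only once
        if url not in self._internal_urls:
            self._internal_urls.append(url)

    def add_internal_re(self, exp):
        # hypothetical helper: keying by the pattern string avoids duplicates
        self._internal_res[exp] = re.compile(exp)

    def is_internal(self, url):
        # hypothetical helper: exact match first, then any internal pattern
        if url in self._internal_urls:
            return True
        return any(r.search(url) for r in self._internal_res.values())


site = SiteSketch()
site.add_internal("http://example.org/")
site.add_internal_re(r"^http://example\.org/docs/")
```

Keeping the regexps in dicts keyed by their pattern string, as assumed here, would make adding the same pattern twice harmless, which may be why the constructor initializes those attributes with {} rather than [].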

Definition at line 101 of file crawler.py.

    def __init__(self):
        """Creates an instance of the Site class and initializes the
        state of the site."""
        # list of internal urls
        self._internal_urls = []
        # regexps considered internal (stored in a dict)
        self._internal_res = {}
        # regexps considered external (stored in a dict)
        self._external_res = {}
        # regexps matching links that should not be checked (stored in a dict)
        self._yanked_res = {}
        # map of scheme+netloc to robot handlers
        self._robotparsers = {}
        # a map of urls to Link objects
        self.linkMap = {}
        # list of base urls (these are the internal urls to start from)
        self.bases = []

    def add_internal(self, url):


Generated by Doxygen 1.6.0