Grab Framework Documentation

pytest status coveralls documentation


    $ pip install -U grab

See details about installing Grab on different platforms here



Russian telegram chat:

English telegram chat:

To report bug please use GitHub issue tracker:

What is Grab?

Grab is a python web scraping framework. Grab provides a number of helpful methods to perform network requests, scrape web sites and process the scraped content:

  • Automatic cookies (session) support
  • HTTP and SOCKS proxy with/without authorization
  • Keep-Alive support
  • IDN support
  • Tools to work with web forms
  • Easy multipart file uploading
  • Flexible customization of HTTP requests
  • Automatic charset detection
  • Powerful API to extract data from DOM tree of HTML documents with XPATH queries
  • Asynchronous API to make thousands of simultaneous queries. This part of library called Spider. See list of spider fetures below.
  • Python 3 ready

Spider is a framework for writing web-site scrapers. Features:

  • Rules and conventions to organize the request/parse logic in separate blocks of codes
  • Multiple parallel network requests
  • Automatic processing of network errors (failed tasks go back to task queue)
  • You can create network requests and parse responses with Grab API (see above)
  • HTTP proxy support
  • Caching network results in permanent storage
  • Different backends for task queue (in-memory, redis, mongodb)
  • Tools to debug and collect statistics

Grab Example

    import logging

    from grab import Grab


    g = Grab()

    g.doc.set_input('login', '****')
    g.doc.set_input('password', '****')

    g.doc('//ul[@id="user-links"]//button[contains(@class, "signout")]').assert_exists()

    home_url = g.doc('//a[contains(@class, "header-nav-link name")]/@href').text()
    repo_url = home_url + '?tab=repositories'


    for elem in'//h3[@class="repo-list-name"]/a'):
        print('%s: %s' % (elem.text(),

Grab::Spider Example

    import logging

    from grab.spider import Spider, Task


    class ExampleSpider(Spider):
        def task_generator(self):
            for lang in 'python', 'ruby', 'perl':
                url = '' % lang
                yield Task('search', url=url, lang=lang)

        def task_search(self, grab, task):
            print('%s: %s' % (task.lang,

    bot = ExampleSpider(thread_number=2)