
A Python library for automating interaction with websites

MechanicalSoup

A Python library for automating interaction with websites. MechanicalSoup automatically stores and sends cookies, follows redirects, and can follow links and submit forms. It does not execute JavaScript.

MechanicalSoup was created by M Hickford, who was a fond user of the Mechanize library. Unfortunately, Mechanize is incompatible with Python 3 and its development stalled for several years. MechanicalSoup provides a similar API, built on the Python giants Requests (for HTTP sessions) and BeautifulSoup (for document navigation). Since 2017, the project has been actively maintained by a small team including @hemberger and @moy.

Installation

|Latest Version| |Supported Versions|

PyPy and PyPy3 are also supported (and tested against).

Download and install the latest released version from `PyPI <https://pypi.python.org/pypi/MechanicalSoup/>`__::

    pip install MechanicalSoup

Download and install the development version from `GitHub <https://github.com/MechanicalSoup/MechanicalSoup>`__::

    pip install git+https://github.com/MechanicalSoup/MechanicalSoup

Install from source (installs the version in the current working directory)::

    python setup.py install

(In all cases, add ``--user`` to the ``install`` command to
install in the current user's home directory.)

Example

From `<examples/expl_duck_duck_go.py>`__, code to get the results from
a DuckDuckGo search:

.. code:: python

    """Example usage of MechanicalSoup to get the results from
    DuckDuckGo."""

    import mechanicalsoup

    # Connect to duckduckgo
    browser = mechanicalsoup.StatefulBrowser()
    browser.open("https://duckduckgo.com/")

    # Fill-in the search form
    browser.select_form('#search_form_homepage')
    browser["q"] = "MechanicalSoup"
    browser.submit_selected()

    # Display the results
    for link in browser.get_current_page().select('a.result__a'):
        print(link.text, '->', link.attrs['href'])

More examples are available in `<examples/>`__.

For an example with a more complex form (checkboxes, radio buttons and
textareas), read `<tests/test_browser.py>`__
and `<tests/test_form.py>`__.

Development

|Build Status| |Coverage Status|
|Requirements Status| |Documentation Status|
|CII Best Practices|

Instructions for building, testing and contributing to MechanicalSoup:
see `<CONTRIBUTING.rst>`__.
