codecov Language grade: Python Run tests view examples PyPI version

Overview

Robox is a simple library with a clean interface for exploring/scraping the web or testing a website you’re developing. Robox can fetch a page, click on links and buttons, and fill out and submit forms. Robox is built on top of two excelent libraries: httpx and beautifulsoup4.


Robox has all the standard features of httpx, including async, plus:

  • clean api
  • caching
  • downloading files
  • history
  • understands robots.txt

Examples

from robox import Robox


with Robox() as robox:
    page = robox.open("https://httpbin.org/forms/post")
    form = page.get_form()
    form.fill_in("custname", value="foo")
    form.check("topping", values=["Onion"])
    form.choose("size", option="Medium")
    form.fill_in("comments", value="all good in the hood")
    form.fill_in("delivery", value="13:37")
    page = page.submit_form(form)
    assert page.url == "https://httpbin.org/post"

or use async version:

import asyncio

from pprint import pprint
from robox import AsyncRobox


async def main():
    async with AsyncRobox(follow_redirects=True) as robox:
        page = await robox.open("https://www.google.com")
        form = page.get_form()
        form.fill_in("q", value="python")
        consent_page = await page.submit_form(form)
        form = consent_page.get_form()
        page = await consent_page.submit_form(form)
        pprint(list(page.get_links_by_text("python")))


asyncio.run(main())

Caching can be easily configured via httpx-cache

from robox import Robox, DictCache, FileCache


with Robox(cache=DictCache()) as robox:
    p1 = robox.open("https://httpbin.org/get")
    assert not p1.from_cache
    p2 = robox.open("https://httpbin.org/get")
    assert p2.from_cache

See examples folder for more detailed examples.

Installation

Using pip:

pip install robox

Robox requires Python 3.8+. See Changelog for changes.

GitHub

View Github