Haystack

Haystack is an end-to-end framework for building powerful, production-ready pipelines for different search use cases. Whether you want to perform Question Answering or semantic document search, you can use the state-of-the-art NLP models in Haystack to provide unique search experiences and let your users query in natural language. Haystack is built in a modular fashion so that you can combine the best technology from other open-source projects like Hugging Face's Transformers, Elasticsearch, or Milvus.

What to build with Haystack

  • Ask questions in natural language and find granular answers in your documents.
  • Perform semantic search and retrieve documents according to meaning, not keywords.
  • Use off-the-shelf models or fine-tune them to your domain.
  • Use user feedback to evaluate, benchmark, and continuously improve your live models.
  • Leverage existing knowledge bases and better handle the long tail of queries that chatbots receive.
  • Automate processes by applying a list of questions to new documents and using the extracted answers.

Core Features

  • Latest models: Use the latest transformer-based models (e.g., BERT, RoBERTa, MiniLM) for extractive QA, generative QA, and document retrieval.
  • Modular: Multiple choices to fit your tech stack and use case. Pick your favorite database, file converter, or modeling framework.
  • Pipelines: The Node and Pipeline design of Haystack allows for custom routing of queries to only the relevant components.
  • Open: 100% compatible with Hugging Face's model hub. Tight interfaces to other frameworks (e.g., Transformers, FARM, sentence-transformers).
  • Scalable: Scale to millions of docs via retrievers, production-ready backends like Elasticsearch / FAISS, and a FastAPI REST API.
  • End-to-End: All tooling in one place: file conversion, cleaning, splitting, training, eval, inference, labeling, etc.
  • Developer friendly: Easy to debug, extend and modify.
  • Customizable: Fine-tune models to your domain or implement your custom DocumentStore.
  • Continuous Learning: Collect new training data via user feedback in production and improve your models continuously.
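
The "Pipelines" point above, routing a query to only the relevant components, can be pictured with a small plain-Python sketch. It deliberately does not use Haystack's own classes: `ToyPipeline`, `classify_query`, and the two retriever functions are hypothetical names made up for this illustration, and the routing rule (questions end with "?") is an assumption chosen only to keep the example short.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A node receives the query and returns (payload, name_of_next_node).
# Returning None as the next node ends the run.
NodeFn = Callable[[str], Tuple[str, Optional[str]]]


class ToyPipeline:
    """Hypothetical illustration only; this is not Haystack's Pipeline class."""

    def __init__(self) -> None:
        self.nodes: Dict[str, NodeFn] = {}

    def add_node(self, name: str, fn: NodeFn) -> None:
        self.nodes[name] = fn

    def run(self, query: str, start: str) -> Tuple[List[str], str]:
        visited: List[str] = []
        payload = query
        current: Optional[str] = start
        while current is not None:
            visited.append(current)
            # Each node decides which node (if any) runs next.
            payload, current = self.nodes[current](query)
        return visited, payload


def classify_query(query: str) -> Tuple[str, Optional[str]]:
    # Naive rule: questions go to the dense branch, everything else to keywords.
    is_question = query.strip().endswith("?")
    return query, "dense_retriever" if is_question else "keyword_retriever"


def dense_retriever(query: str) -> Tuple[str, Optional[str]]:
    return f"dense results for: {query}", None


def keyword_retriever(query: str) -> Tuple[str, Optional[str]]:
    return f"keyword results for: {query}", None


pipeline = ToyPipeline()
pipeline.add_node("classifier", classify_query)
pipeline.add_node("dense_retriever", dense_retriever)
pipeline.add_node("keyword_retriever", keyword_retriever)

path, result = pipeline.run("Who is the father of Arya Stark?", start="classifier")
print(path)    # ['classifier', 'dense_retriever']
print(result)  # dense results for: Who is the father of Arya Stark?
```

Only the nodes on the chosen branch run, so a keyword-style query never touches the dense retriever. For real pipelines, use the classes described in the Haystack documentation.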

  • Docs: Overview, Components, Guides, API documentation
  • Installation: How to install Haystack
  • Tutorials: See what Haystack can do with our Notebooks & Scripts
  • Quick Demo: Deploy a Haystack application with Docker Compose and a REST API
  • Community: Slack, Twitter, Stack Overflow, GitHub Discussions
  • Contributing: We welcome all contributions!
  • Benchmarks: Speed & Accuracy of Retriever, Readers and DocumentStores
  • Roadmap: Public roadmap of Haystack
  • Blog: Read our articles on Medium
  • Jobs: We're hiring! Have a look at our open positions

Installation

If you're interested in learning more about Haystack and using it as part of your application, we offer several options.

1. Installing from a package

You can install Haystack by using pip.

    pip3 install farm-haystack

Please check our page on PyPI for more information.

2. Installing from GitHub

You can also clone the repository from GitHub if you'd like to work with the master branch and try the latest features:

    git clone https://github.com/deepset-ai/haystack.git
    cd haystack
    pip install --editable .

To update your installation, run git pull. Because of the --editable flag, changes to the source take effect immediately without reinstalling.

3. Installing on Windows

On Windows, you might need to tell pip where to find the PyTorch wheels:

    pip install farm-haystack -f https://download.pytorch.org/whl/torch_stable.html
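
After any of the install routes above, a quick check from Python can confirm that the package is visible in the active environment. This is a minimal sketch that assumes Python 3.8 or newer (for `importlib.metadata`) and that the distribution is registered under the name used in the pip command, `farm-haystack`. The helper name `installed_version` is made up for this example.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str = "farm-haystack") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("farm-haystack is not installed in this environment")
else:
    print(f"farm-haystack {version} is installed")
```

If the check reports that the package is missing even though pip succeeded, pip and the Python interpreter are probably pointing at different environments.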

Tutorials

Follow our introductory tutorial to set up a question answering system using Python and start performing queries! Explore the rest of our tutorials to learn how to tweak pipelines, train models, and perform evaluation.

Quick Demo

Following these steps starts our demo, which runs a Haystack service via Docker Compose. You can then call it directly via the REST API or interact with it through the included Streamlit UI.

1. Update/install Docker and Docker Compose, then launch Docker

    apt-get update && apt-get install docker && apt-get install docker-compose
    service docker start

2. Clone the Haystack repository

    git clone https://github.com/deepset-ai/haystack.git

3. Pull images & launch demo app

    cd haystack
    docker-compose pull
    docker-compose up
    
    # Or on a GPU machine: docker-compose -f docker-compose-gpu.yml up

You should be able to see the following in your terminal window as part of the log output:

    ..
    ui_1             |   You can now view your Streamlit app in your browser.
    ..
    ui_1             |   External URL: http://192.168.108.218:8501
    ..
    haystack-api_1   | [2021-01-01 10:21:58 +0000] [17] [INFO] Application startup complete.

4. Open the Streamlit UI for Haystack by pointing your browser to the "External URL" from above.

You can then try different queries against a pre-defined set of indexed articles related to Game of Thrones.
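
Besides the UI, the same questions can be sent to the REST API from Python. The sketch below uses only the standard library. Port 8000 comes from the container list in this section, while the host `localhost`, the `/query` route, and the `{"query": ...}` JSON body are assumptions that may not match your Haystack version. FastAPI applications usually serve interactive documentation under `/docs`, which is a good place to confirm the actual route and payload. The helper names `build_query_request` and `ask` are made up for this example.

```python
import json
from urllib import request

# Assumed endpoint; verify the route against the API docs of your running demo.
API_URL = "http://localhost:8000/query"


def build_query_request(question: str, url: str = API_URL) -> request.Request:
    """Build (but do not send) a JSON POST request carrying the question."""
    body = json.dumps({"query": question}).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, url: str = API_URL, timeout: float = 30.0) -> dict:
    """Send the question to a running demo and return the decoded JSON reply."""
    req = build_query_request(question, url)
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the demo containers running, `ask("Who is the father of Arya Stark?")` would send the request. The shape of the reply depends on the Haystack version, so inspect it before relying on specific fields.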

Note: The following containers are started as a part of this demo:

  • Haystack API: listens on port 8000
  • DocumentStore (Elasticsearch): listens on port 9200
  • Streamlit UI: listens on port 8501

Please note that the demo publishes the container ports to the outside world. We suggest reviewing your firewall settings in light of your system setup and security guidelines.
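
A quick way to see which of the three ports accept connections is a short TCP check. This is a sketch that assumes the demo runs on the same machine (`127.0.0.1`). The port numbers are the ones listed above, and the helper names `is_port_open` and `check_demo_ports` are made up for this example.

```python
import socket
from typing import Dict

# Ports as listed in the note above; the labels are only used for display.
DEMO_PORTS = {"Haystack API": 8000, "Elasticsearch": 9200, "Streamlit UI": 8501}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_demo_ports(host: str = "127.0.0.1") -> Dict[str, bool]:
    """Check every demo port on the given host."""
    return {name: is_port_open(host, port) for name, port in DEMO_PORTS.items()}


for name, reachable in check_demo_ports().items():
    print(f"{name}: {'reachable' if reachable else 'not reachable'}")
```

Running the same check from a different machine, with the server's external address passed as `host`, shows what is actually exposed through the firewall.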
