RFDesign: Protein hallucination and inpainting with RoseTTAFold

Jue Wang
Doug Tischer
Sidney Lisanza
David Juergens
Joe Watson

This repository contains code for protein hallucination and inpainting, as
described in our preprint. Postprocessing and analysis scripts are included in
scripts/.

License

All code is released under the MIT license.

All weights for neural networks are released for non-commercial use only under the Rosetta-DL license.

Installation

  1. Clone the repository:

    git clone https://git.ipd.uw.edu/jue/rfdesign.git
    cd rfdesign
  2. Create the environment and install dependencies:

    cd envs
    conda env create -f SE3.yml
  3. Download the model weights (see license info above):

    wget https://files.ipd.uw.edu/pub/rfdesign/weights.tar.gz
    tar xzf weights.tar.gz
  4. Configure the path to the weights. Put a file called config.json in both
    hallucination/ and inpainting/ containing the path to the weights directory.
    Each folder includes an example file to copy from.
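
As a quick sanity check after the last installation step, the sketch below reports whether a config.json is present in both subfolders. The function name `check_rfdesign_configs` is hypothetical and not part of the repository. It only tests that the files exist and does not validate their contents.

```shell
# Hypothetical helper (not part of rfdesign): check that config.json
# exists in both subfolders that need it.
# Usage: check_rfdesign_configs <path to rfdesign checkout>
check_rfdesign_configs() {
    repo_dir="$1"
    missing=0
    for sub in hallucination inpainting; do
        if [ -f "$repo_dir/$sub/config.json" ]; then
            echo "found: $sub/config.json"
        else
            echo "missing: $sub/config.json" >&2
            missing=1
        fi
    done
    # Returns 0 when both files are present, 1 otherwise.
    return "$missing"
}
```

From the repository root, this could be called as `check_rfdesign_configs .` before running any design jobs.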

Dependencies

If you need to configure your environment manually, the packages in our environment are listed in envs/SE3.yml.

Notes

  • If you are running this on digs at the IPD, you don’t need to do steps 3-4.
  • If your output PDB files look like a ball of disconnected segments (as viewed in PyMOL), this may be due to a problem with the spherical harmonics cached by SE3-transformer. A workaround is to copy the hallucination/cache/ folder (a correct, clean copy of the cache) to your working directory before running hallucinate.py or inpaint.py.
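
The cache workaround above can be sketched as a small shell function. The function name `copy_se3_cache` and its two arguments are hypothetical and chosen for illustration. The only path taken from this README is hallucination/cache/.

```shell
# Hypothetical helper (not part of rfdesign): copy the clean
# spherical-harmonics cache into a working directory.
# Usage: copy_se3_cache <path to rfdesign checkout> <working directory>
copy_se3_cache() {
    repo_dir="$1"
    work_dir="$2"
    src="$repo_dir/hallucination/cache"
    if [ ! -d "$src" ]; then
        echo "cache folder not found: $src" >&2
        return 1
    fi
    mkdir -p "$work_dir"
    # Copies the folder itself, so the result is <work_dir>/cache/
    cp -r "$src" "$work_dir/"
}
```

For example, `copy_se3_cache /path/to/rfdesign "$PWD"` would place a cache/ folder in the current directory, after which hallucinate.py or inpaint.py can be run from there.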

Usage

See READMEs in hallucination/ and inpainting/ subfolders.

References

J. Wang, S. Lisanza, D. Juergens, D. Tischer, et al. Deep learning methods for designing proteins scaffolding functional sites. bioRxiv (2021).

M. Baek, et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science (2021).

An earlier version of our hallucination method can be found at the trdesign-motif repo and published at:

D. Tischer, S. Lisanza, J. Wang, R. Dong, I. Anishchenko, L. F. Milles, S. Ovchinnikov, D. Baker. Design of proteins presenting discontinuous functional sites using deep learning. bioRxiv (2020).

Our work is based on previous hallucination methods for unconstrained protein generation and fixed-backbone sequence design (trDesign repo):

I Anishchenko, SJ Pellock, TM Chidyausiku, …, S Ovchinnikov, D Baker. De novo protein design by deep network hallucination. Nature (2021).

C Norn, B Wicky, D Juergens, S Liu, D Kim, B Koepnick, I Anishchenko, Foldit Players, D Baker, S Ovchinnikov. Protein sequence design by conformational landscape optimization. PNAS (2021).
