Scenic

Scenic is a codebase with a focus on research around attention-based models for computer vision. Scenic has been successfully used to develop classification, segmentation, and detection models for multiple modalities including images, video, audio, and multimodal combinations of them.

More precisely, Scenic is (i) a set of shared light-weight libraries solving commonly encountered tasks when training large-scale (i.e. multi-device, multi-host) vision models; and (ii) a number of projects containing fully fleshed out problem-specific training and evaluation loops using these libraries.

Scenic is developed in JAX and uses Flax.
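For readers new to this stack, here is a minimal, self-contained sketch (not taken from the Scenic codebase) of the JAX + Flax style that Scenic models are written in; the module and variable names are illustrative only.

import jax
import jax.numpy as jnp
import flax.linen as nn

class MLPClassifier(nn.Module):
  """Toy Flax module; stands in for the attention-based models Scenic targets."""
  num_classes: int

  @nn.compact
  def __call__(self, x):
    x = nn.Dense(features=128)(x)   # Hidden layer.
    x = nn.relu(x)
    return nn.Dense(features=self.num_classes)(x)  # Logits.

model = MLPClassifier(num_classes=10)
params = model.init(jax.random.PRNGKey(0), jnp.ones((1, 32)))  # Initialize parameters.
logits = model.apply(params, jnp.ones((4, 32)))                # Forward pass.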

Philosophy

Scenic aims to facilitate rapid prototyping of large-scale vision models. To
keep the code simple to understand and extend, we prefer forking and
copy-pasting over adding complexity or increasing abstraction. Only when
functionality proves to be widely useful across many models and tasks may it
be upstreamed to Scenic's shared libraries.

Code structure

Shared libraries provided by Scenic are split into:

  • dataset_lib: Implements IO pipelines for loading and pre-processing data
    for common Computer Vision tasks and benchmarks. All pipelines are designed to
    be scalable and support multi-host and multi-device setups, taking care of
    dividing data among multiple hosts, incomplete batches, caching, pre-fetching,
    etc.
  • model_lib: Provides (i) several abstract model interfaces (e.g.
    ClassificationModel or SegmentationModel in model_lib.base_models) with
    task-specific losses and metrics; (ii) neural network layers in
    model_lib.layers, focusing on efficient implementation of attention and
    transformer layers; and (iii) accelerator-friendly implementations of
    bipartite matching algorithms in model_lib.matchers. (A sketch of how a
    project builds on these interfaces follows this list.)
  • train_lib: Provides tools for constructing training loops and implements
    several example trainers (classification trainer and segmentation trainer).
  • common_lib: Utilities that do not belong anywhere else.
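As a rough illustration of how a project plugs into these libraries, the sketch below subclasses the ClassificationModel interface from model_lib.base_models. The import path, the build_flax_model override, and the config field used here follow the common pattern but should be treated as assumptions; consult model_lib.base_models for the exact interface.

import flax.linen as nn
from scenic.model_lib.base_models import classification_model


class SimpleClassifier(nn.Module):
  """Toy Flax module standing in for a real architecture."""
  num_classes: int

  @nn.compact
  def __call__(self, x, train=False, debug=False):
    x = x.reshape((x.shape[0], -1))      # Flatten image inputs.
    x = nn.Dense(features=256)(x)
    x = nn.relu(x)
    return nn.Dense(features=self.num_classes)(x)


class SimpleClassificationModel(classification_model.ClassificationModel):
  """Inherits classification losses and metrics from the base model class."""

  def build_flax_model(self):
    # `num_classes` is a hypothetical config field used for illustration.
    return SimpleClassifier(num_classes=self.config.num_classes)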

Projects

Models built on top of Scenic exist as separate projects. Model-specific code
such as configs, layers, losses, network architectures, or training and
evaluation loops lives within the corresponding project.

Common baselines such as a ResNet or a Vision Transformer (ViT) are implemented
in the projects/baselines project. Forking this directory is a good starting
point for new projects.

There is no one-size-fits-all recipe for how much code a project should
re-use. Projects can fall anywhere on the wide spectrum of code re-use: from
defining new configs for an existing model to redefining models, the training
loop, logging, etc.
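
At the lightest-weight end of that spectrum, a project config is typically a Python file exposing a get_config() function that returns an ml_collections.ConfigDict. The convention and field names in this sketch are assumptions for illustration, not the schema of any existing project.

import ml_collections


def get_config():
  """Hypothetical config for a ViT-style experiment (field names illustrative)."""
  config = ml_collections.ConfigDict()
  config.experiment_name = 'imagenet_vit_example'
  config.dataset_name = 'imagenet2012'
  config.batch_size = 4096
  config.num_training_epochs = 90

  config.model = ml_collections.ConfigDict()
  config.model.num_layers = 12
  config.model.hidden_size = 768
  config.model.num_heads = 12

  config.lr = 1e-3
  return config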

Getting started

  • See projects/baselines/README.md for a walk-through of baseline models and
    instructions on how to run the code.
  • If you would like to contribute to Scenic, please check out the
    Philosophy, Code structure, and Contributing sections.
    Should your contribution be a part of the shared libraries, please send us a
    pull request!

Quick start

Download the code from GitHub

git clone https://github.com/google-research/scenic.git
cd scenic
pip install .

and run training for ViT on ImageNet:

python main.py -- \
  --config=projects/baselines/configs/imagenet/imagenet_vit_config.py \
  --workdir=./

Disclaimer: This is not an official Google product.

GitHub

https://github.com/google-research/scenic