PyTorch Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic

Dec 01, 2021

[Paper] [Colab is coming soon]

Approach

Example

Usage

To run captioning on a single image:

$ python run.py \
    --reset_context_delta \
    --caption_img_path "example_images/captions/COCO_val2014_000000097017.jpg"

To run the model on visual arithmetic:

$ python run.py \
    --reset_context_delta \
    --end_factor 1.06 \
    --fusion_factor 0.95 \
    --grad_norm_factor 0.95 \
    --run_type arithmetics \
    --arithmetics_imgs "example_images/arithmetics/woman2.jpg" "example_images/arithmetics/king2.jpg" "example_images/arithmetics/man2.jpg" \
    --arithmetics_weights 1 1 -1
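
The weights `1 1 -1` line up with the three images in order, so the command reads as woman + king - man. Below is a minimal sketch of that kind of weighted combination. It assumes each image has already been mapped to an embedding vector and that the weights are applied as a plain linear sum followed by L2 normalisation. The `combine_embeddings` helper and the toy 3-dimensional vectors are hypothetical and are not taken from `run.py`.

```python
import math


def combine_embeddings(embeddings, weights):
    """Weighted sum of equal-length vectors, rescaled to unit L2 norm."""
    if len(embeddings) != len(weights):
        raise ValueError("need exactly one weight per embedding")
    dim = len(embeddings[0])
    total = [0.0] * dim
    for vec, w in zip(embeddings, weights):
        if len(vec) != dim:
            raise ValueError("all embeddings must have the same length")
        for i, x in enumerate(vec):
            total[i] += w * x
    norm = math.sqrt(sum(x * x for x in total))
    if norm == 0.0:
        raise ValueError("weighted sum is the zero vector")
    return [x / norm for x in total]


# Toy stand-ins for image embeddings (hypothetical values, not real model output).
woman = [1.0, 0.0, 1.0]
king = [0.0, 1.0, 0.0]
man = [1.0, 0.0, 0.0]

# Same order and weights as --arithmetics_imgs / --arithmetics_weights above.
result = combine_embeddings([woman, king, man], [1, 1, -1])
```

In this toy setup the component shared by `woman` and `man` cancels, leaving a direction made up of what `king` contributes and what `woman` has beyond `man`.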

To run the model on real-world knowledge:

$ python run.py \
    --reset_context_delta \
    --cond_text "Image of" \
    --end_factor 1.04 \
    --caption_img_path "example_images/real_world/simpsons.jpg"

To run the model on OCR:

$ python run.py \
    --reset_context_delta \
    --cond_text "Image of text that says" \
    --end_factor 1.04 \
    --caption_img_path "example_images/OCR/welcome_sign.jpg"
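
The captioning-style commands differ only in the conditioning text, the end factor, and the image path. Below is a small sketch of a helper that assembles them from Python, using only flags that appear in the commands in this section. The `build_command` function and its parameter names are hypothetical conveniences, and what each flag does is determined by `run.py` itself.

```python
import shlex


def build_command(img_path, cond_text=None, end_factor=None):
    """Return the argv list for a captioning-style call to run.py."""
    argv = ["python", "run.py", "--reset_context_delta"]
    if cond_text is not None:
        argv += ["--cond_text", cond_text]
    if end_factor is not None:
        argv += ["--end_factor", str(end_factor)]
    argv += ["--caption_img_path", img_path]
    return argv


# Rebuild the OCR command from its three varying parts.
ocr_argv = build_command(
    "example_images/OCR/welcome_sign.jpg",
    cond_text="Image of text that says",
    end_factor=1.04,
)

# Shell-quoted single-line form, handy for logging or copy-pasting.
ocr_line = shlex.join(ocr_argv)
```

The list form can be handed to `subprocess.run(ocr_argv, check=True)` from the repository root, which avoids shell quoting issues with conditioning text that contains spaces.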

