Two-stage GANs that generate fingerstyle guitarist images from audio

Audio2Guitarist-GAN

A two-stage generative adversarial network that generates images of guitarists playing guitar from audio.

Architecture

Stage 1: Audio to binary mask

[Figure: Stage 1 architecture]

Stage 2: Binary mask to color image

[Figure: Stage 2 architecture]

More information in this blog post.
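
The data flow between the two stages can be sketched as follows. This is only an illustrative sketch of the interfaces, not the project's actual models: the function names, the 32x32 resolution, the audio feature shape, and the RGB condition vector are all assumptions, and the trained generators are replaced by trivial stand-ins (a random linear map with a threshold, and a flat color fill).

```python
import numpy as np

MASK_SIZE = 32  # hypothetical output resolution, chosen only for this sketch


def stage1_audio_to_mask(audio_feat, rng):
    """Stand-in for the stage-1 generator: audio features -> binary mask."""
    # Assumption: a random linear map replaces the trained generator.
    weights = rng.standard_normal((audio_feat.size, MASK_SIZE * MASK_SIZE))
    logits = audio_feat.reshape(-1) @ weights
    # Thresholding the logits yields a {0, 1} mask of shape (H, W).
    return (logits > 0).astype(np.uint8).reshape(MASK_SIZE, MASK_SIZE)


def stage2_mask_to_image(mask, condition_rgb):
    """Stand-in for the stage-2 generator: binary mask + condition -> RGB image."""
    # Assumption: the condition is an RGB triple; a real generator would
    # synthesize texture rather than fill the mask with one flat color.
    color = np.asarray(condition_rgb, dtype=np.uint8)
    return mask[..., None] * color  # shape (H, W, 3)


rng = np.random.default_rng(0)
audio_feat = rng.standard_normal((40, 8))  # hypothetical spectrogram window
mask = stage1_audio_to_mask(audio_feat, rng)
image = stage2_mask_to_image(mask, (0, 0, 255))
print(mask.shape, image.shape)
```

Keeping the binary mask as the only interface between the stages means the second stage can also be driven by a mask that did not come from audio at all.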

Results

1. Video output

Here are the official websites of 南澤大介 and 伍々慧.

2. Conditional output

The following GIFs were generated from audio that the model had never seen.

[GIFs: blue, olive, wine]

3. Pose-guided generation

The following GIFs show outputs of the second-stage model conditioned on given poses.

[GIFs: JP_guided_pose1, JP_guided_pose2]