NeuralHash

An Adversarial Steganographic Method For Robust, Imperceptible Watermarking.

Building the next-gen watermark with deep learning: imperceptibly encoding images with un-erasable patterns to verify content ownership.

What it does:

Given an image (like Scream), NeuralHash makes small perturbations to visually encode a unique signature of the author:

[Figure: original image and its watermarked version]

The signature can still be decoded even after extreme transformations (like a cellphone photo of the encoded image):

Our secure watermarking scheme represents a significant advance in protecting content ownership and preventing piracy on the Internet.

Harnessing Adversarial Examples

Our key insight is that we can use adversarial example techniques on a Decoder Network (that maps input images to 32-bit signatures) to generate perturbations that decode to the desired signature. We perform projected gradient descent under the Expectation over Transformation framework to do this as follows:

We simulate an attack distribution using a set of differentiable transformations, and we optimize over samples drawn from it. Here are some sample transforms:
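To make the encoding idea concrete, here is a minimal pure-Python sketch of projected gradient descent averaged over randomly sampled transformations. It is not the project's code: the linear `make_decoder` stand-in, the toy contrast-plus-noise attack in `sample_transform`, the 8-bit signature (shortened from 32 bits to keep the example fast), and every hyperparameter (`eps`, `step`, `n_steps`, `n_samples`) are assumptions chosen for illustration. A real decoder would be a deep network differentiated with an autograd library.

```python
import math
import random


def make_decoder(n_bits, n_pixels, rng):
    """Hypothetical stand-in for the Decoder Network: one random weight row per bit."""
    scale = 1.0 / math.sqrt(n_pixels)
    return [[rng.gauss(0.0, scale) for _ in range(n_pixels)] for _ in range(n_bits)]


def logits(weights, image):
    """Linear decoder applied to pixels centered on mid-gray (0.5)."""
    return [sum(w * (p - 0.5) for w, p in zip(row, image)) for row in weights]


def decode(weights, image):
    """Threshold each logit at zero to read out the signature bits."""
    return [1 if z > 0 else 0 for z in logits(weights, image)]


def sample_transform(n_pixels, rng):
    """Toy attack (an assumption): random contrast factor plus Gaussian pixel noise."""
    contrast = rng.uniform(0.8, 1.2)
    noise = [rng.gauss(0.0, 0.02) for _ in range(n_pixels)]
    return contrast, noise


def apply_transform(image, transform):
    contrast, noise = transform
    return [0.5 + contrast * (p - 0.5) + n for p, n in zip(image, noise)]


def loss_gradient(weights, image, signature, transform):
    """Gradient of the summed binary cross-entropy w.r.t. the untransformed pixels."""
    attacked = apply_transform(image, transform)
    # For a sigmoid output, d(loss)/d(logit) = sigmoid(logit) - target bit.
    errors = [1.0 / (1.0 + math.exp(-z)) - bit
              for z, bit in zip(logits(weights, attacked), signature)]
    contrast = transform[0]
    # Chain rule: d(logit_k)/d(pixel_i) = contrast * weights[k][i].
    return [contrast * sum(e * row[i] for e, row in zip(errors, weights))
            for i in range(len(image))]


def encode(weights, image, signature, rng, eps=0.1, step=0.0125, n_steps=30, n_samples=4):
    """PGD on the pixels, with the gradient summed over sampled transformations."""
    encoded = list(image)
    for _ in range(n_steps):
        grad = [0.0] * len(image)
        for _ in range(n_samples):
            transform = sample_transform(len(image), rng)
            sample_grad = loss_gradient(weights, encoded, signature, transform)
            grad = [a + b for a, b in zip(grad, sample_grad)]
        for i, g in enumerate(grad):
            moved = encoded[i] - step * math.copysign(1.0, g)        # signed step
            moved = min(max(moved, image[i] - eps), image[i] + eps)  # project to eps-ball
            encoded[i] = min(max(moved, 0.0), 1.0)                   # keep a valid pixel
    return encoded


rng = random.Random(0)
weights = make_decoder(n_bits=8, n_pixels=1024, rng=rng)
image = [rng.uniform(0.25, 0.75) for _ in range(1024)]
signature = [1 - bit for bit in decode(weights, image)]  # hardest case: flip every bit
encoded = encode(weights, image, signature, rng)
attacked = apply_transform(encoded, sample_transform(1024, rng))
print(decode(weights, encoded) == signature, decode(weights, attacked) == signature)
```

The projection step is what keeps the change small: no pixel moves more than `eps` from its original value, so `eps` plays the role of the imperceptibility budget in this sketch.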

Training the Network

We also propose a method to train our decoder network under the Expectation-Maximization (EM) framework to learn feature transformations that are more resilient to the threat space of attacks. As shown below, we alternate between encoding images using the network and then updating the network's weights to be more robust to attacks.

The plots below show the robustness of our encoded images during the training process. Over many iterations, the line becomes flatter, indicating robustness to rotation and scaling. As shown later, our approach generalizes to more extreme transformations.
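The alternating procedure described in this section can be sketched as a loop with two phases: encode images with the decoder held fixed, then update the decoder on attacked versions of those encodings. The sketch below is a toy illustration under heavy assumptions, not the project's training code: the decoder is a small linear map, the attack is a hypothetical contrast-plus-noise transform, gradients are written out by hand, and the sizes, learning rate, and round count are arbitrary.

```python
import math
import random


def logits(weights, image):
    """Toy linear decoder on pixels centered at mid-gray; stands in for the network."""
    return [sum(w * (p - 0.5) for w, p in zip(row, image)) for row in weights]


def attack(image, rng):
    """Toy differentiable attack (an assumption): contrast change plus Gaussian noise."""
    contrast = rng.uniform(0.8, 1.2)
    attacked = [0.5 + contrast * (p - 0.5) + rng.gauss(0.0, 0.05) for p in image]
    return attacked, contrast


def bit_errors(weights, image, signature):
    """sigmoid(logit) - target bit: the cross-entropy gradient w.r.t. each logit."""
    return [1.0 / (1.0 + math.exp(-z)) - bit
            for z, bit in zip(logits(weights, image), signature)]


def encode(weights, image, signature, rng, eps=0.05, step=0.01, n_steps=10):
    """Encoding phase: PGD on the pixels with the decoder weights held fixed."""
    encoded = list(image)
    for _ in range(n_steps):
        attacked, contrast = attack(encoded, rng)
        errors = bit_errors(weights, attacked, signature)
        for i in range(len(encoded)):
            grad = contrast * sum(e * row[i] for e, row in zip(errors, weights))
            moved = encoded[i] - step * math.copysign(1.0, grad)
            moved = min(max(moved, image[i] - eps), image[i] + eps)
            encoded[i] = min(max(moved, 0.0), 1.0)
    return encoded


def update_decoder(weights, encoded_images, signatures, rng, lr=0.2):
    """Training phase: one gradient step on the weights per attacked encoding."""
    for encoded, signature in zip(encoded_images, signatures):
        attacked, _ = attack(encoded, rng)
        errors = bit_errors(weights, attacked, signature)
        for row, e in zip(weights, errors):
            for i, p in enumerate(attacked):
                row[i] -= lr * e * (p - 0.5)


def robust_accuracy(weights, images, signatures, rng, n_trials=20):
    """Fraction of signature bits recovered from attacked encodings."""
    correct = total = 0
    for image, signature in zip(images, signatures):
        encoded = encode(weights, image, signature, rng)
        for _ in range(n_trials):
            attacked, _ = attack(encoded, rng)
            decoded = [1 if z > 0 else 0 for z in logits(weights, attacked)]
            correct += sum(d == b for d, b in zip(decoded, signature))
            total += len(signature)
    return correct / total


def train(weights, images, signatures, rng, n_rounds=20):
    """Alternate between encoding with the current decoder and updating it."""
    for _ in range(n_rounds):
        encoded_images = [encode(weights, img, sig, rng)
                          for img, sig in zip(images, signatures)]
        update_decoder(weights, encoded_images, signatures, rng)
    return weights


rng = random.Random(0)
n_bits, n_pixels, n_images = 4, 64, 6
weights = [[rng.gauss(0.0, 1.0 / math.sqrt(n_pixels)) for _ in range(n_pixels)]
           for _ in range(n_bits)]
images = [[rng.random() for _ in range(n_pixels)] for _ in range(n_images)]
signatures = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_images)]
before = robust_accuracy(weights, images, signatures, rng)
train(weights, images, signatures, rng)
after = robust_accuracy(weights, images, signatures, rng)
print(f"bit accuracy under attack: {before:.2f} -> {after:.2f}")
```

Because this toy decoder is trained and evaluated on the same handful of images, the improvement it prints says nothing about generalization; it only shows the shape of the alternating loop.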

Sample Encodings

Here are some sample original images (top row) and the corresponding watermarked images (bottom row):

Example Attacks

Some examples where our approach successfully decodes the correct signature, and some where it fails:

Final Thoughts:

The development of a secure watermarking scheme is an important problem that has applications in content ownership and piracy prevention. Current state-of-the-art techniques are unable to document robustness across a variety of affine transformations. We propose a method that harnesses the expressiveness of deep neural networks to covertly embed imperceptible, transformation-resilient binary signatures into images. Given a decoder network, our key insight is that adversarial example generation techniques can be used to encode images by performing projected gradient descent on the image to embed a chosen signature.

By performing projected gradient descent on the decoder model with respect to a given image, we can use it to “sign” images robustly (think of a more advanced watermark). We start with the original image, then repeatedly tweak the pixel values so that the image and its transformed versions (under scaling, rotation, added noise, blurring, random cropping, and more) decode to a specified 32-bit code. The resulting image is almost indistinguishable from the original, yet contains an easily decodable signature that is difficult to remove, even for a dedicated adversary.
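On the decoding side, checking ownership comes down to comparing the decoded bits with the expected 32-bit code. The helpers below are a hypothetical sketch of that comparison. The function names, the most-significant-bit-first ordering, and especially the `max_bit_errors` tolerance are assumptions for illustration; the project may well require an exact match or use a different acceptance rule.

```python
def signature_to_bits(signature, n_bits=32):
    """Unpack an integer signature into a list of bits, most significant first."""
    if not 0 <= signature < 2 ** n_bits:
        raise ValueError("signature does not fit in n_bits")
    return [(signature >> shift) & 1 for shift in range(n_bits - 1, -1, -1)]


def bits_to_signature(bits):
    """Pack a list of bits (most significant first) back into an integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def hamming_distance(bits_a, bits_b):
    """Number of positions at which two equal-length bit lists differ."""
    if len(bits_a) != len(bits_b):
        raise ValueError("bit lists must have the same length")
    return sum(a != b for a, b in zip(bits_a, bits_b))


def verify(decoded_bits, expected_signature, max_bit_errors=2):
    """Hypothetical ownership check: accept if few enough bits disagree."""
    expected = signature_to_bits(expected_signature, len(decoded_bits))
    return hamming_distance(decoded_bits, expected) <= max_bit_errors


author_id = 0xC0FFEE42                   # made-up 32-bit signature
decoded = signature_to_bits(author_id)   # pretend this came from the decoder
decoded[5] ^= 1                          # simulate one bit corrupted by an attack
print(verify(decoded, author_id))
```

Allowing a few bit errors trades a small chance of false matches for more tolerance to heavy attacks; with 32 bits, the stricter the threshold, the less likely an unrelated image is accepted by accident.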

We also propose a method to train our decoder network under the Expectation-Maximization (EM) framework to learn feature transformations that are more resilient to the threat space of attacks. Experimental results indicate that our model achieves robustness across different transformations such as scaling and rotating, with improved results over the length of EM training. Furthermore, we show an inherent trade-off between robustness and imperceptibility, which allows the user of the model flexibility in adjusting parameters to fit a particular task.
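One way to read the trade-off between robustness and imperceptibility is as a constrained optimization problem. The notation below is an assumed formalization rather than something taken from the project: $x$ is the original image, $\delta$ the perturbation, $T$ the simulated attack distribution, $D_\theta$ the decoder network, $s$ the target signature, and $\mathcal{L}$ a loss between the decoded output and $s$. The choice of the $\ell_\infty$ norm for the budget is also an assumption.

```latex
\min_{\delta} \;
\mathbb{E}_{t \sim T}
\left[ \mathcal{L}\big( D_\theta(t(x + \delta)),\, s \big) \right]
\quad \text{subject to} \quad
\lVert \delta \rVert_\infty \le \epsilon
```

Under this reading, a larger budget $\epsilon$ leaves more room to satisfy the decoder under attack (more robustness) at the cost of a more visible perturbation, while a smaller $\epsilon$ does the opposite. This is the kind of parameter a user could adjust to fit a particular task.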

GitHub