WAKE: Watermarking Audio with Key Enrichment

TL;DR

WAKE is the first key-controllable audio watermarking framework: it embeds watermarks into audio and decodes them with a specific key, as shown in the figure below. Decoding with an incorrect key fails to recover the correct watermark, which substantially improves the security and scalability of the watermarking system and enables personalized watermarks. Notably, WAKE supports embedding multiple watermarks in the same audio, and each watermark is decoded with the key that was used when it was embedded. WAKE outperforms current state-of-the-art audio watermarking models in both watermarked audio quality and decoding performance.

WAKE's overall process
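
To make the key-controlled behaviour described in the TL;DR concrete, here is a minimal toy sketch in Python. It is not WAKE's method: WAKE is a learned model, while this sketch swaps in a classical spread-spectrum scheme where the key seeds a pseudo-random carrier. The function names (`key_carrier`, `embed`, `decode`), the `strength` parameter, and the keys are all hypothetical, and the strength is chosen for a clear demonstration rather than for inaudibility.

```python
import random


def key_carrier(key: int, n: int) -> list[float]:
    """Pseudo-random +/-1 carrier derived deterministically from the key."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def embed(audio: list[float], bits: list[int], key: int,
          strength: float = 0.05) -> list[float]:
    """Add a key-dependent carrier to the audio; its sign per chunk encodes a bit."""
    chunk = len(audio) // len(bits)
    if chunk == 0:
        raise ValueError("audio is too short for this many bits")
    carrier = key_carrier(key, chunk * len(bits))
    out = list(audio)  # copy, so the input audio is left untouched
    for i, bit in enumerate(bits):
        sign = 1.0 if bit else -1.0
        for j in range(i * chunk, (i + 1) * chunk):
            out[j] += strength * sign * carrier[j]
    return out


def decode(audio: list[float], n_bits: int, key: int) -> list[int]:
    """Correlate each chunk with the key's carrier; the sign gives the bit."""
    chunk = len(audio) // n_bits
    if chunk == 0:
        raise ValueError("audio is too short for this many bits")
    carrier = key_carrier(key, chunk * n_bits)
    bits = []
    for i in range(n_bits):
        corr = sum(audio[j] * carrier[j]
                   for j in range(i * chunk, (i + 1) * chunk))
        bits.append(1 if corr > 0 else 0)
    return bits


if __name__ == "__main__":
    rng = random.Random(0)
    host = [rng.gauss(0.0, 0.1) for _ in range(32000)]  # stand-in for audio
    msg_a = [rng.randint(0, 1) for _ in range(32)]
    msg_b = [rng.randint(0, 1) for _ in range(32)]

    # Embed two watermarks in the same audio, each under its own key.
    marked = embed(embed(host, msg_a, key=1234), msg_b, key=5678)

    print("key 1234 recovers A:", decode(marked, 32, key=1234) == msg_a)
    print("key 5678 recovers B:", decode(marked, 32, key=5678) == msg_b)
    print("wrong key recovers A:", decode(marked, 32, key=9999) == msg_a)
```

In this toy, carriers generated from different keys are nearly uncorrelated, so each key reads back only its own message and an unrelated key yields essentially random bits. This mirrors the single-watermark and double-watermark settings compared in the samples below, but it says nothing about how WAKE itself achieves this.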


Audio Event

Sampled from AudioSet.

Sample 1 Sample 2 Sample 3 Sample 4 Sample 5 Sample 6 Sample 7 Sample 8
Original Audio
AudioSeal (single watermark)
WavMark (single watermark)
WAKE (single watermark)
AudioSeal (double watermark)
WavMark (double watermark)
WAKE (double watermark)

English Speech

Sampled from LibriSpeech.

Sample 1 Sample 2 Sample 3 Sample 4 Sample 5 Sample 6 Sample 7 Sample 8
Original Audio
AudioSeal (single watermark)
WavMark (single watermark)
WAKE (single watermark)
AudioSeal (double watermark)
WavMark (double watermark)
WAKE (double watermark)

Other Language Speech

Sampled from CommonVoice.

Sample 1 Sample 2 Sample 3 Sample 4 Sample 5 Sample 6 Sample 7 Sample 8
Original Audio
AudioSeal (single watermark)
WavMark (single watermark)
WAKE (single watermark)
AudioSeal (double watermark)
WavMark (double watermark)
WAKE (double watermark)

Music Data

Sampled from FMA.

Sample 1 Sample 2 Sample 3 Sample 4 Sample 5 Sample 6 Sample 7 Sample 8
Original Audio
AudioSeal (single watermark)
WavMark (single watermark)
WAKE (single watermark)
AudioSeal (double watermark)
WavMark (double watermark)
WAKE (double watermark)

Other Data

Sampled from outside the training and test datasets.

Sample 1 Sample 2 Sample 3 Sample 4 Sample 5 Sample 6 Sample 7 Sample 8
Original Audio
AudioSeal (single watermark)
WavMark (single watermark)
WAKE (single watermark)
AudioSeal (double watermark)
WavMark (double watermark)
WAKE (double watermark)

Spectrogram

Sampled randomly.

Sample 1 Sample 2 Sample 3
Original Audio Figure Figure Figure
AudioSeal (single watermark) Figure Figure Figure
WavMark (single watermark) Figure Figure Figure
WAKE (single watermark) Figure Figure Figure
AudioSeal (double watermark) Figure Figure Figure
WavMark (double watermark) Figure Figure Figure
WAKE (double watermark) Figure Figure Figure

Watermarked Audio with Different Watermark Times (Audio Event)

Sampled from AudioSet.

Original Audio 1 2 3 4 5 6 7 8 9 10
AudioSeal
WavMark
WAKE

Watermarked Audio with Different Watermark Times (English Speech)

Sampled from LibriSpeech.

Original Audio 1 2 3 4 5 6 7 8 9 10
AudioSeal
WavMark
WAKE

Watermarked Audio with Different Watermark Times (Music Data)

Sampled from FMA.

Original Audio 1 2 3 4 5 6 7 8 9 10
AudioSeal
WavMark
WAKE

Watermarked Audio with Different Watermark Times (Other Language Speech)

Sampled from CommonVoice.

Original Audio 1 2 3 4 5 6 7 8 9 10
AudioSeal
WavMark
WAKE

Spectrograms with Different Watermark Times

Sampled randomly.

AudioSeal WavMark WAKE
Original Audio Figure Figure Figure
Watermark 1 time Figure Figure Figure
Watermark 2 times Figure Figure Figure
Watermark 3 times Figure Figure Figure
Watermark 4 times Figure Figure Figure
Watermark 5 times Figure Figure Figure
Watermark 6 times Figure Figure Figure
Watermark 7 times Figure Figure Figure
Watermark 8 times Figure Figure Figure
Watermark 9 times Figure Figure Figure
Watermark 10 times Figure Figure Figure