Create README.md
README.md
ADDED
@@ -0,0 +1,89 @@
---
tags:
- audioseal
inference: false
---
# AudioSeal

We introduce AudioSeal, a method for localized speech watermarking with state-of-the-art robustness and detector speed. It jointly trains a generator that embeds a watermark in the audio and a detector that finds the watermarked fragments in longer audio, even after editing.
AudioSeal achieves state-of-the-art detection performance on both natural and synthetic speech at the sample level (1/16k second resolution). It alters signal quality only slightly and is robust to many types of audio editing.
AudioSeal is designed with a fast, single-pass detector that significantly surpasses existing models in speed, achieving detection up to two orders of magnitude faster, which makes it well suited to large-scale and real-time applications.

# :mate: Installation

AudioSeal requires Python >= 3.8, PyTorch >= 1.13.0, [omegaconf](https://omegaconf.readthedocs.io/), [julius](https://pypi.org/project/julius/), and numpy. To install from PyPI:

```
pip install audioseal
```

To install from source, clone this repo and install in editable mode:

```
git clone https://github.com/facebookresearch/audioseal
cd audioseal
pip install -e .
```

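After installing, it can be useful to confirm that the requirements listed above are actually present. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `check_environment` are hypothetical and not part of AudioSeal, and the distribution names (for example `torch` for PyTorch) are assumptions about how the packages are published on PyPI.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(packages=("audioseal", "torch", "omegaconf", "julius", "numpy")):
    """Map each required package name to its installed version (None if absent)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    # The README asks for Python >= 3.8
    print("Python >= 3.8:", sys.version_info >= (3, 8))
    for name, version in check_environment().items():
        print(f"{name}: {version or 'not installed'}")
```

This only reports versions; it does not compare them against the minimums, so the PyTorch >= 1.13.0 requirement still needs to be checked by eye.
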
# :gear: Models

We provide the checkpoints for the following models:

- AudioSeal Generator.
  It takes an audio signal (as a waveform) as input and outputs a watermark of the same size as the input, which can be added to the input to watermark it.
  Optionally, it can also take a 16-bit secret message that will be encoded in the watermark.
- AudioSeal Detector.
  It takes an audio signal (as a waveform) as input and outputs, for each sample of the audio (every 1/16k s), the probability that the input contains a watermark.
  Optionally, it may also output the secret message encoded in the watermark.

Note that the message is optional and has no influence on the detection output. It may be used, for instance, to identify a model version (up to $2^{16}=65536$ possible choices).

**Note**: We are working to release the training code for anyone who wants to build their own watermarker. Stay tuned!

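To illustrate how a 16-bit message could carry something like a model version, here is a small sketch that converts an integer identifier to a list of 16 bits and back. The helpers `int_to_bits` and `bits_to_int` are hypothetical and not part of AudioSeal, and the most-significant-bit-first ordering is an assumption made for this example; any ordering works as long as encoding and decoding agree.

```python
NBITS = 16  # message length stated in this README


def int_to_bits(value, nbits=NBITS):
    """Encode a non-negative integer as a list of bits, most significant bit first."""
    if not 0 <= value < 2 ** nbits:
        raise ValueError(f"value must be in [0, {2 ** nbits - 1}]")
    return [(value >> shift) & 1 for shift in range(nbits - 1, -1, -1)]


def bits_to_int(bits):
    """Decode a list of bits (most significant bit first) back to an integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | int(bit)
    return value


if __name__ == "__main__":
    version_id = 5  # hypothetical model version identifier
    bits = int_to_bits(version_id)
    print(bits)
    print(bits_to_int(bits))
```

Assuming the generator expects a tensor of shape batch x 16 holding 0/1 values, something like `torch.tensor([int_to_bits(version_id)])` would produce a message for a batch of one.
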
# :abacus: Usage

AudioSeal provides a simple API to watermark an audio sample and to detect watermarks in it. Example usage:

```python
import torch

from audioseal import AudioSeal

# The model name corresponds to the YAML card file name found in audioseal/cards
model = AudioSeal.load_generator("audioseal_wm_16bits")

# Alternatively, load directly from a checkpoint
# model = Watermarker.from_pretrained(checkpoint_path, device = wav.device)

# wav is a torch tensor of shape (batch, channels, samples); sr is its sample rate.
# It is important to resample the audio to the sample rate the model
# expects. In our case, we support 16 kHz audio.
wav, sr = ..., 16000

watermark = model.get_watermark(wav, sr)

# Optional: you can add a 16-bit message to embed in the watermark
# msg = torch.randint(0, 2, (wav.shape[0], model.msg_processor.nbits), device=wav.device)
# watermark = model.get_watermark(wav, message = msg)

watermarked_audio = wav + watermark

detector = AudioSeal.load_detector("audioseal_detector_16bits")

# High-level detection
result, message = detector.detect_watermark(watermarked_audio, sr)

print(result)  # result is a float indicating the probability that the audio is watermarked
print(message)  # message is a binary vector of 16 bits


# Low-level detection
result, message = detector(watermarked_audio, sr)

# result is a tensor of size batch x 2 x frames, giving the probability (positive and negative) of watermarking for each frame
# A watermarked audio should have result[:, 1, :] > 0.5
print(result[:, 1, :])

# message is a tensor of size batch x 16, giving the probability of each bit being 1.
# message will be a random tensor if the detector detects no watermark in the audio
print(message)
```
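Because the low-level output gives one probability per sample, it can be turned into time spans that locate the watermarked fragments. The sketch below is one possible post-processing step, not part of the AudioSeal API: `watermarked_segments` is a hypothetical helper that takes a plain list of per-sample probabilities, applies the 0.5 threshold mentioned above, and converts sample indices to seconds assuming 16 kHz audio.

```python
SAMPLE_RATE = 16000  # the README states that 16 kHz audio is supported


def watermarked_segments(probs, threshold=0.5, sample_rate=SAMPLE_RATE):
    """Return (start_s, end_s) spans where consecutive probabilities exceed threshold."""
    segments = []
    start = None  # index where the current above-threshold run began
    for i, p in enumerate(probs):
        if p > threshold and start is None:
            start = i
        elif p <= threshold and start is not None:
            segments.append((start / sample_rate, i / sample_rate))
            start = None
    if start is not None:
        # The run extends to the end of the signal
        segments.append((start / sample_rate, len(probs) / sample_rate))
    return segments


if __name__ == "__main__":
    # Synthetic example: 1 s clean, 0.5 s watermarked, 0.5 s clean
    probs = [0.1] * 16000 + [0.9] * 8000 + [0.2] * 8000
    print(watermarked_segments(probs))
```

With the detector output from the example above, the input could be obtained with something like `result[0, 1, :].tolist()` for the first item in the batch. In practice, isolated samples may cross the threshold, so smoothing the probabilities or discarding very short spans may be worthwhile.
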