# Unofficial port of the VQ model in the Latent Diffusion Model (LDM)
This is an unofficial port of the VQ model in the Latent Diffusion Model (LDM).
## Model Details

See `models/first_stage_models/vq-f16` in the [original repository](https://github.com/CompVis/latent-diffusion).
## How to use

Set up an environment and install the dependencies pinned in the model repository:

```bash
python -m venv .venv && source .venv/bin/activate
pip install huggingface-hub
# Fetch the pinned requirements.txt from the model repo, then install from it
huggingface-cli download ktrk115/ldm-vq-f16 requirements.txt | xargs cat > requirements.txt
pip install -r requirements.txt
```
Then load the model with `trust_remote_code=True` and run a round-trip reconstruction:

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

image_processor = AutoImageProcessor.from_pretrained("ktrk115/ldm-vq-f16", trust_remote_code=True)
model = AutoModel.from_pretrained("ktrk115/ldm-vq-f16", trust_remote_code=True)

# Image reconstruction
img = Image.open("path/to/image.png")
example = image_processor(img)
with torch.inference_mode():
    recon, _ = model.model(example["image"].unsqueeze(0))
recon_img = image_processor.postprocess(recon[0])
recon_img.save("recon.png")
```
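Beyond full reconstruction, you may want the discrete latent representation itself. The sketch below assumes that `model.model` is the LDM `VQModel` and that its `encode` method returns the quantized latents together with the embedding loss and auxiliary quantizer info, as in the original LDM code; these internals are assumptions about this port, not a documented API.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

image_processor = AutoImageProcessor.from_pretrained("ktrk115/ldm-vq-f16", trust_remote_code=True)
model = AutoModel.from_pretrained("ktrk115/ldm-vq-f16", trust_remote_code=True)

img = Image.open("path/to/image.png")
x = image_processor(img)["image"].unsqueeze(0)

with torch.inference_mode():
    # Assumed interface: LDM's VQModel.encode returns
    # (quantized latents, embedding loss, quantizer info).
    quant, emb_loss, info = model.model.encode(x)
    recon = model.model.decode(quant)

# The f16 variant downsamples spatially by a factor of 16, so a 256x256 input
# should yield a latent grid of roughly 16x16 codebook vectors.
print(quant.shape)
```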