	---
	license: mit
	language:
	- en
	library_name: diffusers
	---

	# Arc2Face Model Card

	<div align="center">

	[Project Page](https://arc2face.github.io/) \| [Paper (ArXiv)](https://arxiv.org/abs/2403.11641) \| [Code](https://github.com/foivospar/Arc2Face) \| [🤗 Gradio demo](https://huggingface.co/spaces/FoivosPar/Arc2Face)



	</div>

	## Introduction

	Arc2Face is an ID-conditioned face model that can generate diverse, ID-consistent photos of a person given only their ArcFace ID-embedding.
	It is trained on a restored version of the WebFace42M face recognition database, and is further fine-tuned on FFHQ and CelebA-HQ.

	<div align="center">
	<img src='assets/samples_short.jpg'>
	</div>

	## Model Details

	Arc2Face consists of two components, both fine-tuned from [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5):
	- encoder, a fine-tuned CLIP ViT-L/14 model
	- arc2face, a fine-tuned UNet model

	The encoder is tailored for projecting ID-embeddings to the CLIP latent space.
	Arc2Face adapts the pre-trained backbone to the task of ID-to-face generation, conditioned solely on ID vectors.
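
This card does not spell out how an ID vector is fed to the encoder. The following is a minimal sketch of one plausible input-preparation step, under assumptions that are not stated here: the ArcFace ID-embedding is 512-dimensional, it is zero-padded to the 768-dimensional token width of the CLIP ViT-L/14 text encoder, and it replaces the embedding of a single placeholder token in a short prompt. The names `pad_id_embedding` and `insert_id_token` are hypothetical and do not come from the Arc2Face code.

```python
ARCFACE_DIM = 512  # assumption: size of an ArcFace ID-embedding
CLIP_DIM = 768     # token-embedding width of the CLIP ViT-L/14 text encoder


def pad_id_embedding(id_emb, target_dim=CLIP_DIM):
    """Zero-pad an ID vector to the width of a CLIP token embedding."""
    if len(id_emb) > target_dim:
        raise ValueError("ID embedding is wider than the target dimension")
    return list(id_emb) + [0.0] * (target_dim - len(id_emb))


def insert_id_token(token_embs, placeholder_index, id_emb):
    """Return a copy of token_embs with one row replaced by the padded ID vector.

    token_embs is a list of per-token embedding rows; the input is not modified.
    """
    out = [list(row) for row in token_embs]
    width = len(out[placeholder_index])
    out[placeholder_index] = pad_id_embedding(id_emb, width)
    return out


# Toy usage: a 4-token prompt whose third token is the (hypothetical) placeholder.
prompt_embs = [[0.5] * CLIP_DIM for _ in range(4)]
id_vector = [1.0] * ARCFACE_DIM
conditioned = insert_id_token(prompt_embs, 2, id_vector)
```

In this picture, the modified token sequence would then pass through the fine-tuned encoder, and its output would condition the UNet in place of a regular text prompt. The GitHub repository remains the reference for the actual procedure.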

	## Usage

	The models can be downloaded directly from this repository or with Python:
	```python
	from huggingface_hub import hf_hub_download

	# Each call fetches one file into ./models, keeping the repository's subfolder layout
	# (./models/arc2face/... and ./models/encoder/...).
	hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="arc2face/config.json", local_dir="./models")
	hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="arc2face/diffusion_pytorch_model.safetensors", local_dir="./models")
	hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="encoder/config.json", local_dir="./models")
	hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="encoder/pytorch_model.bin", local_dir="./models")
	```
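
After the download, it can be useful to confirm that all four files ended up where later loading code will look for them. The snippet below is a small sketch that relies only on the standard library; the helper name `missing_files` is hypothetical, and the expected paths are simply the filenames from the download snippet above, resolved against `./models`.

```python
from pathlib import Path

# File paths exactly as passed to hf_hub_download in the snippet above.
EXPECTED_FILES = [
    "arc2face/config.json",
    "arc2face/diffusion_pytorch_model.safetensors",
    "encoder/config.json",
    "encoder/pytorch_model.bin",
]


def missing_files(local_dir="./models"):
    """Return the expected files that are not present under local_dir."""
    root = Path(local_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files("./models")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All Arc2Face files are in place.")
```

An empty result means the `arc2face` and `encoder` subfolders are complete; any listed file can be fetched again with the corresponding `hf_hub_download` call.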

	Please check our [GitHub repository](https://github.com/foivospar/Arc2Face) for complete inference instructions.

	## Limitations and Bias

	- Only one person per image can be generated.
	- Poses are constrained to the frontal hemisphere, similar to FFHQ images.
	- The model may reflect the biases of the training data or the ID encoder.

	## Citation


	BibTeX:

	```bibtex
	@misc{paraperas2024arc2face,
	title={Arc2Face: A Foundation Model of Human Faces},
	author={Foivos Paraperas Papantoniou and Alexandros Lattas and Stylianos Moschoglou and Jiankang Deng and Bernhard Kainz and Stefanos Zafeiriou},
	year={2024},
	eprint={2403.11641},
	archivePrefix={arXiv},
	primaryClass={cs.CV}
	}
	```