CelebA-Spoof: Large-Scale Face Anti-Spoofing Dataset with Rich Annotations

Yuanhan Zhang, ZhenFei Yin, Yidong Li, Guojun Yin, Junjie Yan, Jing Shao, Ziwei Liu


As facial interaction systems are widely deployed, their security and reliability have become a critical issue that has attracted substantial research effort. Within this area, face anti-spoofing has emerged as an important topic; its objective is to identify whether a presented face is live or spoof. Though promising progress has been achieved, existing works still have difficulty handling complex spoof attacks and generalizing to real-world scenarios. The main reason is that current face anti-spoofing datasets are limited in both quantity and diversity. To overcome these obstacles, we contribute a large-scale face anti-spoofing dataset, CelebA-Spoof, with the following appealing properties: 1) Quantity: CelebA-Spoof comprises 625,537 pictures of 10,177 subjects, significantly larger than existing datasets. 2) Diversity: The spoof images are captured from 8 scenes (2 environments × 4 illumination conditions) with more than 10 sensors. 3) Annotation Richness: CelebA-Spoof contains 10 spoof type annotations, as well as the 40 attribute annotations inherited from the original CelebA dataset. Equipped with CelebA-Spoof, we carefully benchmark existing methods in a unified multi-task framework, Auxiliary Information Embedding Network (AENet), and reveal several valuable observations. Our key insight is that, compared with the commonly used binary supervision or mid-level geometric representations, rich semantic annotations used as auxiliary tasks can greatly boost the performance and generalizability of face anti-spoofing across a wide range of spoof attacks.
Through comprehensive studies, we show that CelebA-Spoof serves as an effective training data source. Models trained on CelebA-Spoof (without fine-tuning) exhibit state-of-the-art performance on standard benchmarks such as CASIA-MFSD. The datasets are available at https://github.com/Davidzhangyuanhan/CelebA-Spoof.
