CurriculumNet: Weakly Supervised Learning from Large-Scale Web Images

Sheng Guo, Weilin Huang, Haozhi Zhang, Chenfan Zhuang, Dengke Dong, Matthew R. Scott, Dinglong Huang; The European Conference on Computer Vision (ECCV), 2018, pp. 135-150


We present a simple yet efficient approach for training deep neural networks on large-scale, weakly supervised web images, which are crawled directly from the Internet using text queries, without any human annotation. We develop a principled learning strategy based on curriculum learning, with the goal of effectively handling a massive amount of noisy labels and data imbalance. We design a new learning curriculum by measuring the complexity of the data using its distribution density in a feature space, and we rank the complexity in an unsupervised manner. This allows for an efficient implementation of curriculum learning on large-scale web images, resulting in a high-performance CNN model in which the negative impact of noisy labels is reduced substantially. Importantly, our experiments show that images with highly noisy labels can, surprisingly, improve the generalization capability of the model by serving as a form of regularization. Our approach obtains state-of-the-art performance on four benchmarks: WebVision, ImageNet, Clothing-1M, and Food-101. With an ensemble of multiple models, we achieve a top-5 error rate of 5.2% on the WebVision challenge \cite{li2017webvision} for 1000-category classification, the top result, surpassing other results by a large margin of about 50% in relative error rate. Code and models are available at:
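
The density-based curriculum described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only states that complexity is measured by distribution density in a feature space and ranked without supervision. The functions `local_density` and `build_curriculum`, the neighbor-count density estimate, the `cutoff` radius, and the equal-size split into three stages are all assumptions made for illustration, and the toy 2-D points stand in for CNN features of one category.

```python
import math


def local_density(features, cutoff):
    """For each vector, count how many other vectors lie within `cutoff`.

    A simple neighbor-count density estimate (an assumption, not taken
    from the abstract).
    """
    density = []
    for i, a in enumerate(features):
        count = 0
        for j, b in enumerate(features):
            if i != j and math.dist(a, b) < cutoff:
                count += 1
        density.append(count)
    return density


def build_curriculum(features, cutoff, n_stages=3):
    """Rank samples from dense to sparse and split them into up to `n_stages` stages.

    Dense samples are assumed to carry cleaner labels (easy); sparse
    samples are assumed to be noisier (hard).
    """
    if not features:
        return []
    density = local_density(features, cutoff)
    # Highest density first; sorted() is stable, so ties keep input order.
    order = sorted(range(len(features)), key=lambda i: -density[i])
    stage_size = math.ceil(len(order) / n_stages)
    return [order[k:k + stage_size] for k in range(0, len(order), stage_size)]


# Toy features for one category: six tightly clustered points and three outliers.
features = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05), (0.02, 0.08),
    (3.0, 3.0), (-4.0, 2.0), (5.0, -5.0),
]
stages = build_curriculum(features, cutoff=0.5)
for number, stage in enumerate(stages, start=1):
    print(f"stage {number}: sample indices {stage}")
```

In this toy run the three outliers land in the last stage, so a training schedule could start with the early stages and add the later, presumably noisier ones afterwards. Whether stages are merged cumulatively, and how the density cutoff is chosen, are design choices the abstract does not specify.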

Related Material

author = {Guo, Sheng and Huang, Weilin and Zhang, Haozhi and Zhuang, Chenfan and Dong, Dengke and Scott, Matthew R. and Huang, Dinglong},
title = {CurriculumNet: Weakly Supervised Learning from Large-Scale Web Images},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}