Feature Representation Matters: End-to-End Learning for Reference-based Image Super-resolution
In this paper, we target a general reference-based super-resolution setting that does not require the low-resolution image and the high-resolution reference image to be well aligned or to share similar textures. Instead, we only aim to transfer the relevant textures from the reference image to the super-resolved output. To this end, we employ neural texture transfer to swap texture features between the low-resolution image and the high-resolution reference image. We identify the importance of designing features specific to the super-resolution task, rather than classification-oriented features, for neural texture transfer, which makes the feature extractor more compatible with the image synthesis task. We develop an end-to-end training framework for reference-based super-resolution in which the feature encoding network that precedes matching and swapping is trained jointly with the image synthesis network. We also find that learning the high-frequency residual is an effective strategy for reference-based super-resolution. Without bells and whistles, the proposed method, E2ENT2, outperforms the state-of-the-art method (i.e., SRNTT, which uses five loss functions) with only two basic loss functions. Extensive experiments on several datasets demonstrate that E2ENT2 achieves superior performance to the best existing models both quantitatively and qualitatively.
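To make the matching-and-swapping step concrete, the following is a minimal sketch of feature swapping under simplifying assumptions; it is not the E2ENT2 implementation. It assumes that features are matched per spatial location (1x1 patches) by cosine similarity, and that the feature maps have already been produced by some encoder. The function name `swap_features` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def swap_features(lr_feat, ref_feat, eps=1e-8):
    """Illustrative feature swap (hypothetical helper, not the paper's code).

    lr_feat:  (C, H, W) features of the low-resolution input.
    ref_feat: (C, Hr, Wr) features of the high-resolution reference.
    Returns the swapped feature map (C, H, W) and, for every input
    location, the flat index of the matched reference location (H, W).
    """
    c, h, w = lr_feat.shape
    # Flatten spatial dimensions: one C-dimensional vector per location.
    q = lr_feat.reshape(c, -1).T    # (H*W, C)
    k = ref_feat.reshape(c, -1).T   # (Hr*Wr, C)

    # Assumption: cosine similarity is used as the matching score.
    qn = q / (np.linalg.norm(q, axis=1, keepdims=True) + eps)
    kn = k / (np.linalg.norm(k, axis=1, keepdims=True) + eps)
    sim = qn @ kn.T                 # (H*W, Hr*Wr)

    # For each input location, pick the most similar reference location
    # and copy ("swap in") its unnormalized feature vector.
    idx = sim.argmax(axis=1)
    swapped = k[idx].T.reshape(c, h, w)
    return swapped, idx.reshape(h, w)
```

In this sketch the swapped map would then be passed to a synthesis network together with the low-resolution features. Because the matching depends on the feature space, the choice of encoder directly affects which reference textures are transferred, which is the motivation for training the encoder jointly with the synthesis network rather than reusing classification features.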