Conditional Image Repainting via Semantic Bridge and Piecewise Value Function
We study conditional image repainting, in which a model is trained to generate visual content conditioned on user inputs and to composite the generated content seamlessly onto a user-provided image while preserving the semantics of the users' inputs. The content generation community has long sought to lower the skill barrier for users. Human language is a rose among thorns for this purpose: it is friendly to users, but it makes it very difficult for the model to associate the relevant words with semantically ambiguous regions. To resolve this issue, we propose a delicate mechanism that bridges the semantic chasm between the language input and the generated visual content. State-of-the-art image compositing techniques impose a latent ceiling on the fidelity of the composited content during adversarial training. In this work, we improve compositing by breaking through this latent ceiling with a novel piecewise value function. We demonstrate on two datasets that the proposed techniques handle conditional image repainting better than existing ones.