A Rotation-invariant Texture ViT for Fine-Grained Recognition of Esophageal Cancer Endoscopic Ultrasound Images
Tianyi Liu, Shuaishuai S Zhuang, Jiacheng Nie, Geng Chen , Yusheng Guo, Guangquan Zhou*, Jean-Louis Coatrieux, Yang Chen*
;
Abstract
"Endoscopic Ultrasound (EUS) is advantageous in perceiving hierarchical changes in the esophageal tract wall for diagnosing submucosal tumors. However, the lesions often disrupt the structural integrity and fine-grained texture information of the esophageal layer, impeding the accurate diagnosis. Moreover, the lesions can appear in any radial position due to the characteristics of EUS imaging, further increasing the difficulty of diagnosis. In this study, we advance an automatic classification model by equipping the Vision Transformer (ViT), a state-of-the-art(SOTA) model, with a novel statistical rotation-invariant reinforcement mechanism dubbed SRRM. Mainly, we adaptively select crucial regions to avoid interference from irrelevant information in the image. Also, this model integrates histogram statistical features with rotation invariance into the self-attention mechanism, achieving bias-free capture of fine-grained information of lesions at arbitrary radial positions. Validated by in-house clinical and public data, SRRM-ViT has demonstrated remarkable performance improvements, suggesting our approach’s efficacy and potential in EUS image classification.The source code is publicly available at: https://github.com/tianyiliu-lab/SRRM-ViT/."
Related Material
[pdf]
[supplementary material]
[DOI]