Few-Shot Semantic Segmentation with Frequency Prototype Learning
Keywords:
Few-shot segmentation, few-shot learning, prototype learning, frequency domain learning
Abstract
Few-shot semantic segmentation is a challenging task that aims to segment novel objects in a query image given only a few annotated support images. Most advanced methods for this task focus on either global or local prototype learning through global average pooling (GAP) or clustering. However, owing to the limitations of the averaging and clustering operations, these methods still fail to fully exploit the object information in the support images. To address these limitations, we propose a generalization of prototype learning in the frequency domain through multi-frequency pooling (MFP), which mines both local and global object information. Based on MFP, we further build a Frequency Prototype Network (FPNet) with three novel designs. First, the Frequency Prototype Generation Module (FPGM) extracts frequency prototypes by MFP in the discrete cosine transform (DCT) domain to provide complete object guidance information. Second, the Prior Attention Mask Module (PAMM) produces a prior attention mask that identifies the query target more precisely while retaining high generalization. Finally, the Frequency Prototype Selection Module (FPSM) selects the most effective support prototypes to reduce redundancy. Extensive experiments on PASCAL-5i and COCO-20i demonstrate that our model achieves state-of-the-art performance in both the 1-shot and 5-shot settings.
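To make the idea of pooling in the DCT domain concrete, the following is a minimal NumPy sketch, not the FPNet implementation. It assumes that a frequency prototype is obtained by weighting the masked support feature map with a 2D DCT-II basis function and summing over spatial positions. The function names (`dct_basis`, `multi_frequency_pooling`), the normalization by mask area, and the choice of frequencies are illustrative assumptions. Under these assumptions, the (0, 0) frequency has a constant basis, so its prototype coincides with masked global average pooling, which is the sense in which multi-frequency pooling can be read as a generalization of GAP.

```python
import numpy as np

def dct_basis(h, w, u, v):
    """2D DCT-II basis function for frequency (u, v) on an h x w grid (unnormalized)."""
    ys = np.cos(np.pi * u * (np.arange(h) + 0.5) / h)
    xs = np.cos(np.pi * v * (np.arange(w) + 0.5) / w)
    return np.outer(ys, xs)  # shape (h, w)

def multi_frequency_pooling(feat, mask, freqs):
    """Hypothetical multi-frequency pooling.

    feat:  (C, H, W) support feature map
    mask:  (H, W) binary foreground mask
    freqs: list of (u, v) DCT frequency indices
    Returns an array of shape (len(freqs), C), one prototype per frequency.
    """
    _, h, w = feat.shape
    area = mask.sum() + 1e-6  # assumed normalization: foreground area
    protos = []
    for u, v in freqs:
        weight = dct_basis(h, w, u, v) * mask  # restrict the basis to the object
        protos.append((feat * weight).sum(axis=(1, 2)) / area)
    return np.stack(protos)

# Toy example with random features and a small square object mask.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 6, 6))
mask = np.zeros((6, 6))
mask[1:4, 2:5] = 1.0

protos = multi_frequency_pooling(feat, mask, freqs=[(0, 0), (0, 1), (1, 0)])

# The lowest frequency reproduces masked global average pooling.
masked_gap = (feat * mask).sum(axis=(1, 2)) / mask.sum()
print(np.allclose(protos[0], masked_gap))
```

The higher-frequency prototypes weight different spatial regions of the object with different signs, so they can retain spatial variation that a single average discards. Which frequencies to keep is a design choice that this sketch leaves open.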