Few-Shot Pattern Detection via Template Matching and Regression

Eunchan Jo Dahyun Kang Sanghyun Kim Yunseon Choi Minsu Cho
Pohang University of Science and Technology (POSTECH)
✨ ICCV 2025 (Highlight) ✨

TL;DR

We propose TMR, a simple template-matching detector for few-shot pattern detection, achieving strong results on diverse datasets including our new dataset RPINE.

Intro teaser image

Few-shot pattern detection. Given a few exemplars for each target pattern, the task is to detect all matching instances of each pattern. This example includes non-object patterns as well as object patterns.

Abstract

We address the problem of few-shot pattern detection, which aims to detect all instances of a given pattern, typically represented by a few exemplars, from an input image. Although similar problems have been studied in few-shot object counting and detection (FSCD), previous methods and their benchmarks have narrowed patterns of interest to object categories and often fail to localize non-object patterns. In this work, we propose a simple yet effective detector based on template matching and regression, dubbed TMR. While previous FSCD methods typically represent target exemplars as spatially collapsed prototypes and lose structural information, we revisit classic template matching and regression. It effectively preserves and leverages the spatial layout of exemplars through a minimalistic structure with a small number of learnable convolutional or projection layers on top of a frozen backbone. We also introduce a new dataset, dubbed RPINE, which covers a wider range of patterns than existing object-centric datasets. Our method outperforms the state-of-the-art methods on the three benchmarks, RPINE, FSCD-147, and FSCD-LVIS, and demonstrates strong generalization in cross-dataset evaluation.

Method Overview

Overall network architecture

The proposed method, dubbed template matching and regression (TMR), is designed to be aware of the structure and shape of the given exemplars. Given an input image, TMR first extracts a feature map with a backbone network. It then crops a template feature from the support exemplar's bounding box using an RoIAlign-based template extraction step. This template is correlated with the image feature map to produce a template-matching feature map. From this correlation map, the model learns bounding-box regression parameters that adaptively rectify the template box size. This process, termed template-conditioned regression, lets the model handle support exemplars of varying sizes effectively. Notably, TMR consists of only a few 3×3 convolution and linear projection layers, without any complicated modules such as cross-attention.
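As a concrete illustration, here is a minimal PyTorch sketch of this pipeline. It is an assumption-laden toy version, not the released implementation: the module names, channel sizes, template size, and single-exemplar layout are all illustrative choices.

import torch
import torch.nn.functional as F
from torchvision.ops import roi_align

class TMRSketch(torch.nn.Module):
    def __init__(self, feat_dim=256, template_size=7):
        super().__init__()
        self.template_size = template_size
        # Small learnable projection on top of frozen backbone features.
        self.proj = torch.nn.Conv2d(feat_dim, feat_dim, kernel_size=3, padding=1)
        # Regression head over the matching map: a score plus
        # (dx, dy, dw, dh) offsets that rectify the template box.
        self.reg_head = torch.nn.Conv2d(1, 5, kernel_size=3, padding=1)

    def forward(self, feat, exemplar_box, stride=16):
        # feat: frozen backbone feature map, shape (1, C, H, W)
        # exemplar_box: (x1, y1, x2, y2) in image pixels, float tensor
        feat = self.proj(feat)
        # Template extraction: RoIAlign crops the exemplar region into a
        # fixed-size template, preserving its spatial layout.
        rois = torch.cat([torch.zeros(1, 1, device=feat.device),
                          exemplar_box.view(1, 4).to(feat.device)], dim=1)
        template = roi_align(feat, rois, output_size=self.template_size,
                             spatial_scale=1.0 / stride)  # (1, C, k, k)
        # Template matching: cosine-style correlation, with the template
        # acting as a convolution kernel over the image features.
        t = F.normalize(template.flatten(1), dim=1).view_as(template)
        corr = F.conv2d(F.normalize(feat, dim=1), t,
                        padding=self.template_size // 2)  # (1, 1, H, W)
        # Template-conditioned regression on the correlation map.
        out = self.reg_head(corr)
        return out[:, :1], out[:, 1:]  # matching score, box offsets

In this sketch, peaks in the matching score serve as candidate pattern locations, and the regressed offsets rescale the exemplar-sized box at each location before non-maximum suppression.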

Proposed Dataset: RPINE

RPINE dataset overview image

Existing benchmarks (e.g., FSCD-147, FSCD-LVIS) mainly target object-level patterns, limiting the evaluation of general pattern detection. To address this, we introduce a new dataset, Repeated Patterns IN Everywhere (RPINE), which covers diverse repeated patterns in the real world. RPINE contains images with varying degrees of objectness, from well-defined object-level patterns to non-object patterns, all annotated with bounding boxes via crowd-sourcing. Compared to FSCD datasets, RPINE provides broader coverage, including both non-object patterns and nameless parts of objects.

Results

Quantitative comparison

Method       SD   MAE (↓)   RMSE (↓)   AP (↑)   AP50 (↑)   AP75 (↑)
C-DETR              9.58     21.24     13.88     32.20     10.22
SAM-C              18.77     37.14     18.80     34.04     18.74
PseCo              48.20     88.16     23.18     44.54     21.24
GeCo                9.57     17.07     23.33     45.93     21.19
TMR (ours)          8.45     19.87     33.59     64.05     30.52
TMR (ours)          8.30     19.40     29.66     58.94     25.41

One-shot pattern counting and detection results on the RPINE dataset.
SD denotes box refinement with the SAM decoder. All models are trained with their official code.
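For reference, the MAE and RMSE columns follow the standard counting-error definitions over per-image counts. A short sketch (independent of the paper's evaluation code), where a predicted count would simply be the number of detected boxes in an image:

import numpy as np

def counting_errors(pred_counts, gt_counts):
    # Standard counting metrics over per-image instance counts.
    pred = np.asarray(pred_counts, dtype=float)
    gt = np.asarray(gt_counts, dtype=float)
    mae = np.abs(pred - gt).mean()             # Mean Absolute Error
    rmse = np.sqrt(((pred - gt) ** 2).mean())  # Root Mean Squared Error
    return mae, rmse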
                  Seen                Unseen
Method         AP (↑)   AP50 (↑)   AP (↑)   AP50 (↑)
FSDetView-PB    2.72      7.57      1.03      2.89
AttRPN-PB       4.08     11.15      3.15      7.87
C-DETR          4.92     14.49      3.85     11.28
DAVE            6.75     22.51      4.12     14.16
PseCo          22.37     42.56       -         -
GeCo             -         -       11.47     24.49
TMR (ours)     27.49     48.48     22.71     39.68

Three-shot results of detection-based counting methods on the FSCD-LVIS seen and unseen splits.

Cross-dataset comparison

                               AP              AP50
Train      Test             GeCo    TMR     GeCo    TMR
FSCD-147   FSCD-147         43.42   44.43   75.06   73.83
FSCD-147   FSCD-LVIS seen   13.96   21.25   25.87   37.18
FSCD-147   RPINE            19.47   26.21   38.69   52.01
RPINE      FSCD-147         36.99   41.39   60.38   69.19
RPINE      FSCD-LVIS seen   10.01   20.92   17.44   37.87
RPINE      RPINE            23.33   29.66   45.93   58.94

Cross-dataset comparison of GeCo and TMR (trained on one dataset, tested on another).
TMR outperforms GeCo across nearly all train/test combinations, demonstrating strong generalization.

Qualitative Results

qualitative image
Qualitative comparison with the state-of-the-art models on RPINE (the first two images) and FSCD-147 (the last two images).
qualitative image
Additional qualitative results on the FSCD-147 dataset.

References

PseCo: Huang, Zhizhong, et al. "Point, segment and count: A generalized framework for object counting." In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.

GeCo: Pelhan, Jer, et al. "GeCo: A novel unified architecture for low-shot counting by detection and segmentation." In Advances in Neural Information Processing Systems (NeurIPS), 2024, pp. 66260-66282.

BibTeX

@inproceedings{jo2025tmr,
  title     = {Few-Shot Pattern Detection via Template Matching and Regression},
  author    = {Eunchan Jo and Dahyun Kang and Sanghyun Kim and Yunseon Choi and Minsu Cho},
  booktitle = {International Conference on Computer Vision (ICCV)},
  year      = {2025},
}