Research

My research lies at the intersection of machine learning and computer vision, with a focus on solving practical problems involving imbalanced data, scene understanding, and the digitization of cultural heritage.

1. Imbalance & Long-tailed Learning

Real-world data is rarely balanced. We develop algorithms for class-imbalance learning, multi-class imbalance, and long-tailed recognition, combining resampling, ensemble learning, fuzzy transitions, and prototypical learning. Our open-source toolkit multi-imbalance and recent surveys on long-tailed learning have become widely cited references in the field.

Selected works: A systematic review on long-tailed learning (TNNLS 2025); Multi-imbalance: an open-source software (KBS 2019); Open-set long-tailed recognition (Neural Networks 2025).
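As a minimal illustration of the resampling idea mentioned above, the sketch below shows random oversampling: minority-class samples are duplicated until every class matches the majority count. This is a generic, self-contained example, not the multi-imbalance API; the function name and signature are made up for illustration.

```python
import random
from collections import Counter

def random_oversample(X, y, seed=0):
    """Duplicate minority-class samples until every class reaches
    the majority class count (illustrative helper, not a library API)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for cls, n in counts.items():
        # indices of samples belonging to this class
        idx = [i for i, label in enumerate(y) if label == cls]
        for _ in range(target - n):
            i = rng.choice(idx)          # resample with replacement
            X_out.append(X[i])
            y_out.append(cls)
    return X_out, y_out

X = [[0], [1], [2], [3], [4]]
y = ["a", "a", "a", "a", "b"]
Xb, yb = random_oversample(X, y)
# classes are now balanced: each appears 4 times
```

In practice this naive scheme risks overfitting to duplicated minority samples, which is why methods such as synthetic oversampling and ensemble-based rebalancing are preferred.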

2. Scene Text Detection & Recognition (OCR)

Reading text "in the wild" — on shop signs, road signs, and street views — is essential for autonomous driving and intelligent transportation systems. We have built large-scale Chinese street-view datasets (ShopSign), proposed character-level multi-segmentation networks, and explored fusion strategies for robust detection on low-quality images.

Selected works: Street view text recognition for ITS (IEEE TITS 2020); Character-level street view text spotting (IEEE TAI 2021); UCR: unified character-radical dual-supervision (Pattern Recognition 2025).

3. Ancient Language Understanding

AI is transforming how we study ancient scripts. We develop self-supervised and generative methods for oracle bone inscriptions (fragment rejoining, character recognition, font classification) and ancient bamboo manuscripts (binarization, ink enhancement, scribe verification). This work bridges machine learning with paleography and digital humanities.

Selected works: Data-driven oracle bone rejoining (KDD 2022); AI-powered oracle bone inscriptions recognition (IJCAI 2021); Binarizing severely degraded ancient bamboo slips (BMVC 2025).

4. Deep Contrastive & Self-supervised Learning

Contrastive and self-supervised learning provide powerful representations without requiring expensive labels. We have published a comprehensive survey of deep contrastive learning, and apply these ideas to medical imaging, video action recognition, and recommendation systems.

Selected works: Deep contrastive learning: a survey (Acta Automatica Sinica 2023); Quality-aware self-training (AAAI 2023); Unified deep semi-supervised graph learning (Neural Networks 2023).
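The core mechanism behind contrastive learning can be sketched with the widely used InfoNCE objective: an anchor is pulled toward an augmented "positive" view and pushed away from "negative" samples. The pure-Python function below is an illustrative sketch of that loss for a single anchor, not code from any of the works cited above.

```python
import math

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: -log softmax of the positive's
    similarity against positive + negative similarities."""
    def cos(u, v):
        # cosine similarity between two vectors
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv)

    # scaled similarities; index 0 is the positive pair
    logits = [cos(anchor, positive) / temperature]
    logits += [cos(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract max for numerical stability
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[0] - log_denom)

# loss is near zero when the positive matches the anchor...
low = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
# ...and large when a negative matches instead
high = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

The temperature controls how sharply the softmax concentrates on the hardest negatives; small values make training focus on near-duplicates, which is one of the practical tuning knobs surveyed in the works above.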