
Prof. Antoni B. Chan
Professor, Dept. of Computer Science
Associate Dean (Research & Postgraduate), College of Computing
Deputy Director, Multimedia Engineering Research Centre (MERC)
BSc MEng Cornell, PhD UC San Diego
SrMIEEE
Video, Image, and Sound Analysis Lab (VISAL)
Department of Computer Science
College of Computing
City University of Hong Kong
Office: Room AC1-G7311, Yeung Kin Man Academic Building (lift 7)
Phone: +852 3442 6509
Fax: +852 3442 0503
Email: abchan at cityu dot edu dot hk
News
- Various PhD opportunities are available:
- The Hong Kong PhD Fellowship Scheme aims to attract the best and brightest students in the world to pursue their research degree programmes in Hong Kong’s institutions. This scheme was established by the Research Grants Council (RGC) of the HKSAR government.
- CityUHK also offers several schemes for outstanding students from world-class universities.
- PhD positions funded by my own projects.
Bio
Dr. Antoni Chan is a Professor at the City University of Hong Kong in the Department of Computer Science. Before joining CityU, he was a postdoctoral researcher in the Department of Electrical and Computer Engineering at the University of California, San Diego (UC San Diego). He received the Ph.D. degree from UC San Diego in 2008 studying in the Statistical and Visual Computing Lab (SVCL). He received the B.Sc. and M.Eng. in Electrical Engineering from Cornell University in 2000 and 2001. From 2001 to 2003, he was a Visiting Scientist in the Computer Vision and Image Analysis lab at Cornell. In 2005, he was a summer intern at Google in New York City. In 2012, he was the recipient of an Early Career Award from the Research Grants Council of the Hong Kong SAR, China.
Research Interests [more]
Computer Vision, Surveillance, Machine Learning, Explainable AI (XAI), Pattern Recognition, Computer Audition, Music Information Retrieval, Eye Gaze Analysis
image captioning, object tracking, dynamic textures, motion segmentation, motion analysis, semantic image annotation, image retrieval, crowd counting, probabilistic graphical models, support vector machines, Bayesian regression, Gaussian processes, semantic music annotation and retrieval, music segmentation, feature extraction.
- For more information about my current research projects, please visit my lab website.
- Opportunities for graduate students and research assistants! If you are interested in joining the lab, please check this information. Outstanding non-HK students may also consider applying for the HK PhD fellowship.
Recent Publications [more]
Semantic bias in image-text matching in humans versus vision-language pretraining AI models.Jinhan Zhang, Qichun Duan, Chenyang Zhao, Antoni B. Chan, and Janet H. Hsiao,
In:
Annual Conference of the Cognitive Science Society (CogSci),
Rio de Janeiro,
Jul 2026.
Identifying Mind Wandering Episodes during Virtual Cognitive Stimulation Therapy through Gaze Estimation from Videos.
Xiaoru Teng, Yong Li, Gloria H.Y. Wong, Antoni B. Chan, and Janet H. Hsiao,
In: Annual Conference of the Cognitive Science Society (CogSci), Rio de Janeiro, Jul 2026.
Understanding Aging-Related Changes in Face Scanning Behavior through Integrating Deep Neural Networks and Hidden Markov Models.
Dongcheng He, Antoni B. Chan, and Janet H. Hsiao,
In: Annual Conference of the Cognitive Science Society (CogSci), Rio de Janeiro, Jul 2026 (oral).
AI counsellors exaggerate linguistic qualities of human counsellors, and human clients align more with AI counsellors.
David A. Haslett, June Shi, Antoni B. Chan, and Janet H. Hsiao,
In: Annual Conference of the Cognitive Science Society (CogSci), Rio de Janeiro, Jul 2026.
Foresee-to-Ground: From Predictive Temporal Perception to Evidence-Driven Reasoning for Video Temporal Grounding.
Zelin Zheng, Xinyan Liu, Ruixin Li, Antoni B. Chan, Guorong Li, Qingming Huang, and Laiyun Qing,
In: International Conference on Machine Learning (ICML), Jul 2026.
Adaptive momentum weight averaging reduces initialization noise.Jia Wan, Ziquan Liu, Junyu Gao, Xia Wu, and Antoni B. Chan,
Pattern Recognition,
171:112297,
Part B, to appear Mar 2026.
GeneQuery: Generalized Gene Expression Prediction from Histology Images via Image-Gene QA.
Ying Xiong, Linjing Liu, Yufei Cui, Shangyu Wu, Xue Liu, Antoni B. Chan, and Chun Jason Xue,
In: Workshop on Machine Learning for Biological and Medical Image Big Data at IEEE International Conference on Bioinformatics and Biomedicine (BIBM):5738-5745, Dec 2025.
Embodied Crowd Counting.Runling Long, Yunlong Wang, Jia Wan, Xiang Deng, Xinting Zhu, Weili Guan, Antoni B. Chan, and Liqiang Nie,
In:
Neural Information Processing Systems (NeurIPS),
San Diego,
Dec 2025. [
github]
PUO-Bench: A Panel Understanding and Operation Benchmark with A Privacy-Preserving Framework.Wei Lin, Yiwei Zhou, Junkai Zhang, Rui Shao, Zhiyuan Zhao, Junyu Gao, Antoni B. Chan, and Xuelong Li,
In:
Neural Information Processing Systems (NeurIPS) Track on Datasets and Benchmarks,
San Diego,
Dec 2025. [
benchmark]
Regional crowd flow estimation from aerial view.Huibin Wei, Qi Li, Xindai Lin, Yuhao Lin, Shu Wang, Shengfeng He, Antoni B. Chan, and Wenxi Liu,
Neural Networks,
192:107907,
to appear Dec 2025.
Selected Publications [more]
Group-based Distinctive Image Captioning with Memory Difference Encoding and Attention.Jiuniu Wang, Wenjia Xu, Qingzhong Wang, and Antoni B. Chan,
International Journal of Computer Vision (IJCV),
133:1435-1455,
April 2025 (online Oct 2024).
Gradient-based Instance-Specific Visual Explanations for Object Specification and Object Discrimination.Chenyang Zhao, Janet H. Hsiao, and Antoni B. Chan,
IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI),
46(9):5967-5985,
Sep 2024 (online Mar 2024). [
github]
Generalized Characteristic Function Loss for Crowd Analysis in the Frequency Domain.Weibo Shu, Jia Wan, and Antoni B. Chan,
IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI),
46(5):2882-2899,
May 2024 (online Nov 2023). [
supplemental |
github]
Modeling Noisy Annotations for Point-Wise Supervision.Jia Wan, Qiangqiang Wu, and Antoni B. Chan,
IEEE Trans. Pattern Analysis and Machine Intelligence (TPAMI),
45(12):15065-15080,
Dec 2023 (online Jul 2023). [
github]
On Distinctive Image Captioning via Comparing and Reweighting.Jiuniu Wang, Wenjia Xu, Qingzhong Wang, and Antoni B. Chan,
IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI),
45(2):2088-2103,
Feb 2023 (online 2022). [
github]
PRIMAL-GMM: PaRametrIc MAnifold Learning of Gaussian Mixture Models.Ziquan Liu, Lei Yu, Janet H. Hsiao, and Antoni B. Chan,
IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI),
44(6):3197-3211,
June 2022 (online 2021). [
github]
Wide-Area Crowd Counting: Multi-View Fusion Networks for Counting in Large Scenes.Qi Zhang and Antoni B. Chan,
International Journal of Computer Vision (IJCV),
130(8):1938-1960,
Aug 2022. [
github]
Kernel-based Density Map Generation for Dense Object Counting.Jia Wan, Qingzhong Wang, and Antoni B. Chan,
IEEE Trans. Pattern Analysis and Machine Intelligence (TPAMI),
44(3):1357-1370,
Mar 2022. [
github]
On Diversity in Image Captioning: Metrics and Methods.Qingzhong Wang, Jia Wan, and Antoni B. Chan,
IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI),
44(2):1035-1049,
Feb 2022. [
github]
Eye Movement analysis with Hidden Markov Models (EMHMM) with co-clustering.Janet H. Hsiao, Hui Lan, Yueyuan Zheng, and Antoni B. Chan,
Behavior Research Methods,
53:2473-2486,
April 2021.
Visual Tracking via Dynamic Memory Networks.Tianyu Yang and Antoni B. Chan,
IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI),
43(1):360-374,
Jan 2021. [
github]
Incorporating Side Information by Adaptive Convolution.Di Kang, Debarun Dhar, and Antoni B. Chan,
International Journal of Computer Vision (IJCV),
128:2897-2918,
July 2020.
Density-Preserving Hierarchical EM Algorithm: Simplifying Gaussian Mixture Models for Approximate Inference.Lei Yu, Tianyu Yang, and Antoni B. Chan,
IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI),
41(6):1323-1337,
June 2019. [
code]
Beyond Counting: Comparisons of Density Maps for Crowd Analysis Tasks - Counting, Detection, and Tracking.Di Kang, Zheng Ma, and Antoni B. Chan,
IEEE Trans. on Circuits and Systems for Video Technology (TCSVT),
29(5):1408-1422,
May 2019.
Eye Movement Patterns in Face Recognition are Associated with Cognitive Decline in Older Adults.Cynthia Y.H. Chan, Antoni B. Chan, Tatia M.C. Lee, and Janet H. Hsiao,
Psychonomic Bulletin & Review,
25(6):2200-2207,
Dec 2018.
Maximum-Margin Structured Learning with Deep Networks for 3D Human Pose Estimation.Sijin Li, Weichen Zhang, and Antoni B. Chan,
International Journal of Computer Vision (IJCV),
122(1):149-168,
March 2017.
Counting People Crossing a Line using Integer Programming and Local Features.Zheng Ma and Antoni B. Chan,
IEEE Trans. on Circuits and Systems for Video Technology (TCSVT),
26(10):1955-1969,
Oct 2016. [
appendix]
Understanding eye movements in face recognition using hidden Markov models.Tim Chuk, Antoni B. Chan, and Janet H. Hsiao,
Journal of Vision,
14(11):8,
Sep 2014.
Clustering hidden Markov models with variational HEM.Emanuele Coviello, Antoni B. Chan, and Gert R.G. Lanckriet,
Journal of Machine Learning Research (JMLR),
15(2):697-747,
Feb 2014. [
code]
Clustering Dynamic Textures with the Hierarchical EM Algorithm for Modeling Video.Adeel Mumtaz, Emanuele Coviello, Gert R.G. Lanckriet, and Antoni B. Chan,
IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI),
35(7):1606-1621,
Jul 2013. [
appendix]
Counting People with Low-Level Features and Bayesian Regression.Antoni B. Chan and Nuno Vasconcelos,
IEEE Trans. on Image Processing (TIP),
21(4):2170-2177,
May 2012.
Layered dynamic textures.Antoni B. Chan and Nuno Vasconcelos,
IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI),
31(10):1862-1879,
Oct 2009.
Modeling, clustering, and segmenting video with mixtures of dynamic textures.Antoni B. Chan and Nuno Vasconcelos,
IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI),
30(5):909-926,
May 2008.
Modeling music as a dynamic texture.Luke Barrington, Antoni B. Chan, and Gert R.G. Lanckriet,
IEEE Trans. on Audio, Speech and Language Processing (TASLP),
18(3):602-612,
Mar 2010.
Supervised learning of semantic classes for image annotation and retrieval.Gustavo Carneiro, Antoni B. Chan, Pedro J. Moreno, and Nuno Vasconcelos,
IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI),
29(3):394-410,
Mar 2007.
Google Scholar
Microsoft Academic
orcid.org/0000-0002-2886-2513
Scopus ID: 14015159100
Recent Project Pages [more]
Continual Learning MIL We pinpoint catastrophic forgetting to the attention layers of attention-MIL models for whole-slide images and introduce two remedies: Attention Knowledge Distillation (AKD) to retain attention weights across tasks and a Pseudo-Bag Memory Pool (PMP) that keeps only the most informative patches. Combined, AKD and PMP achieve state-of-the-art continual-learning accuracy while sharply cutting memory usage on diverse WSI datasets.

P2R Loss for Semi-Supervised Counting We introduce a Point-to-Region (P2R) loss to address the over-activation and pseudo-label propagation issues inherent in semi-supervised crowd counting. By replacing pixel-level matching with region-level supervision, P2R suppresses background noise and achieves state-of-the-art results with significantly higher training stability.
Proximal Mapping Loss for Crowd Counting We propose the Proximal Mapping Loss (PML), a theoretically grounded framework that discards the unrealistic “non-overlap” assumption common in crowd counting. By leveraging proximal operators from convex optimization, PML accurately recovers density in highly congested scenes where severe occlusions and overlapping objects are prevalent.
Recent Datasets and Code [more]
This is the toolbox for modeling eye movements and feature learning with deep neural networks and hidden Markov models (DNN+HMM).

Dolphin-14k: Chinese White Dolphin detection dataset
A dataset consisting of Chinese White Dolphin (CWD) and distractors for detection tasks.
Crowd counting: Zero-shot cross-domain counting
Generalized loss function for crowd counting.

CVCS: Cross-View Cross-Scene Multi-View Crowd Counting Dataset
Synthetic dataset for cross-view cross-scene multi-view counting. The dataset contains 31 scenes, each with about ~100 camera views. For each scene, we capture 100 multi-view images of crowds.
Crowd counting: Generalized loss function
Generalized loss function for crowd counting.

Teaching
- CS 5495 – Explainable AI — 2025A.
- CS 5487 – Machine Learning: Principles & Practice (postgraduate) — 2012A-2026B.
- CS 5489 – Machine Learning: Algorithms & Applications (postgraduate) — 2020B-2025B.
- GE1361 – Digital Literacy: New Technologies, Society, and You — 2025B-2026B.
- CS 6487 – Topics in Machine Learning (postgraduate) — 2019B.
- CS 4487 – Machine Learning (undergraduate) — 2015A-2018A.
- GE 2326 – Probability in Action: From the Unfinished Game to the Modern World — 2015B-2017B.
- GE 1319 – Interdisciplinary Research for Smart Professionals — 2013B-2017B.
- CS 5301 – Computer Programming — 2012A-2014A.
- CS 2363 – Computer Programming — 2009A-2011A.
- CS 3306 (B) – Contemporary Programming Methods in Java — 2010B.
- CS 4380 (B) – Web 2.0 Technologies — 2011B, 2012B.
- Multimedia Subject Group leader
- Research Mentoring Scheme Coordinator
- Final Year Project Coordinator (2016-2022)
- MSCS Project and Guided Study Coordinator (2016-2022)
- BScCM Deputy Programme Leader (2020-2022)
Service
- Associate Editor, IEEE Transactions on Pattern Analysis and Machine Intelligence (2023-now)
- Action Editor, Transactions on Machine Learning Research (2022-now)
- Guest Editor, Special Issue on “Applications of artificial intelligence, computer vision, physics and econometrics modelling methods in pedestrian traffic modelling and crowd safety”, Transportation Research Part C: Emerging Technologies (2022-23)
- Senior Area Editor, IEEE Signal Processing Letters (2016-2020)
- Associate Editor, IEEE Signal Processing Letters (2014-2016)
- Conference Area Chair
- CVPR – 2020, 2023, 2026
- ICCV – 2015, 2017, 2019, 2021, 2025 (Lead AC)
- ECCV – 2022, 2024, 2026 (Lead AC)
- NeurIPS – 2020, 2021, 2022, 2023, 2024, 2025 (top 10%), 2026
- ICML – 2021, 2022, 2023, 2024, 2025, 2026
- ICLR – 2021, 2023, 2024, 2025, 2026
- ICPR – 2020
- Pacific Graphics – 2018
- Conference Senior PC
- AAAI – 2021, 2022
- IJCAI – 2019-20
- Conference Program Committees
- CVPR – 2012-2019, 2021, 2022, 2024, 2025
- ICCV – 2011, 2013, 2023
- ECCV – 2012, 2014, 2016, 2018
- ACCV – 2011, 2014, 2016
- ICML – 2012, 2013, 2014, 2015, 2018, 2019, 2020
- NIPS – 2015, 2017, 2018, 2019
- ICLR – 2022
- Siggraph (tertiary)- 2018
- Journal Reviewing
- IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI)
- IEEE Trans. on Image Processing (TIP)
- Intl. Journal Computer Vision (IJCV)
- IEEE Trans. on Circuits and Systems for Video Technology (TCSVT)
- IEEE Trans. on Neural Networks (TNN)
- IEEE Trans. on Multimedia
- IEEE Trans. Intelligent Transportation Systems
- Grant Reviewing
- Organized Events
- CogSci 2022 Hong Kong Meetup & Symposium: Computational Approaches to Psychological Research, Aug 2022.
Awards and Honors
- Top 2% Most Highly Cited Researchers (Ioannidis et al. 2019. Plos Biology)
- The President’s Award, City University of Hong Kong, 2016.
- Early Career Award, Research Grants Council of Hong Kong, 2012.
- NSF IGERT Fellowship: Vision and Learning in Humans and Machines, UCSD, 2006-07.
- Outstanding Teaching Assistant Award, ECE Department, UCSD, 2005-06.
- Office of the President Award, UCSD, 2003.
- Henry G. White Scholorship, Cornell University, 2001.
- Knauss M. Engineering Scholorship, Cornell University, 2001.
- GTE Fellowship, Cornell University, 2001.
Mailing Address:
Prof. Antoni Chan,
Department of Computer Science,
City University of Hong Kong,
Tat Chee Avenue,
Kowloon Tong, Hong Kong.
IEEE Copyright Notice
©IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.