Spring 2026, Friday 1:10pm to 4:00pm, Classroom: Second Medical Science Building, Room 104
Instructor: Chih-Yuan Yang
This course is titled Advanced Computer Vision, but the field is vast: in 2025 alone, CVPR published over 2,800 papers, not to mention other computer vision conferences like ICCV and WACV, or major AI venues like NeurIPS, ICLR, ICML, and AAAI. At an advanced level, we must focus on specific areas of expertise rather than attempt to cover fundamental knowledge. In this class, I want to guide students through the latest research papers to explore their new ideas, the problems they aim to solve, their current limitations, and their relevance to students’ own research.
I expect students to answer these core questions about each paper: What is the paper proposing? Is the source code available? Are the results reproducible? How does the approach benefit your research? And finally, are there ways to improve upon the solution?
As this is a literature-heavy course, I assume students already possess foundational knowledge of computer vision. I do not plan to lecture on basic concepts like pixels, color spaces, filters, or neural networks, but I will step in to clarify concepts or provide necessary background if discussions become confusing or technical gaps arise.
In this course, students read the latest papers from top computer vision conferences and journals, which report the state of the art. Students present their findings, understanding, reproduced experimental results, and ideas for improvement in class. By understanding these cutting-edge methods, students should gain knowledge and ideas for their own research. This course requires programming experience and fundamental knowledge of computer vision.
https://teams.microsoft.com/meet/48242579078635?p=UaJzY7gYU1iQivpZ10 This meeting link will be activated only on request.
| Week | Date | Topic | Slides | Recording | Action |
|---|---|---|---|---|---|
| 1 | 2/27 | Holiday: Peace Memorial Day (compensatory day off) | |||
| 2 | 3/6 | Introduction to this course and the top computer vision conferences. Presented paper: 2025 ICCV Towards Proactive Social Robots: Distilling Visual Knowledge from Large Vision-Language Models | pptx | YouTube | |
| 3 | 3/13 | 2025 ICCV GeoFormer: Geometry Point Encoder for 3D Object Detection with Graph-based Transformer; 2026 arXiv LoopViT: Scaling Visual ARC with Looped Transformers | |||
| 4 | 3/20 | 2025 ICCV From Gaze to Movement: Predicting Visual Attention for Autonomous Driving Human-Machine Interaction based on Programmatic Imitation Learning; 2022 WACV SeaDronesSee: A Maritime Benchmark for Detecting Humans in Open Water | |||
| 5 | 3/27 | Paper presentation and discussion 3 | |||
| 6 | 4/3 | Holiday: Children’s Day | |||
| 7 | 4/10 | Term project proposal / Paper presentation and discussion 4 | |||
| 8 | 4/17 | Paper presentation and discussion 5 | |||
| 9 | 4/24 | Paper presentation and discussion 6 | |||
| 10 | 5/1 | Holiday: Labor Day | |||
| 11 | 5/8 | Midterm presentation / Paper presentation and discussion 7 | |||
| 12 | 5/15 | Paper presentation and discussion 8 | |||
| 13 | 5/22 | Paper presentation and discussion 9 | |||
| 14 | 5/29 | Paper presentation and discussion 10 | |||
| 15 | 6/5 | Paper presentation and discussion 11 | |||
| 16 | 6/12 | Term project presentation | |||
| 17 | 6/19 | Final report due | | | |
| Topic | Slides | Report | Code |
|---|---|---|---|
We do not have a textbook because the knowledge reported in the latest research papers is too new to be covered by one. An evolving large language model is more useful than a textbook for retrieving new knowledge.
Your final grade will be made up from
Chih-Yuan Yang: cyyang@cgu.edu.tw
Office hours: Tue 10:30~11:30 Management Building Room 1416