Fall 2023, Thur. 1:10pm to 4:00pm, Location: Engineering Building Room 302
Instructor: Chih-Yuan Yang
In this course, I will introduce the fundamental concepts of computer vision. I expect students to understand several important topics in the field, including image processing, color spaces, recognition, convolutional neural networks, generative models, and vision-language models. Through class discussion and project implementation, students should learn the concepts and gain first-hand experience applying computer vision knowledge to real-world applications.
The course consists of four programming projects and one final group project (at most 5 members per team). Information about the final project is in the syllabus.
This course requires programming experience as well as linear algebra, basic calculus, and basic probability. Previous knowledge of visual computing will be helpful.
Week | Date | Topic | Slides | Recording | Action |
---|---|---|---|---|---|
1 | 9/7 | Introduction to computer vision, camera, human vision | pptx | video | |
2 | 9/14 | Human vision, color, digital camera, image filtering | pptx | video | |
3 | 9/21 | Fourier transform, image pyramid, aliasing, JPEG compression | pptx | video | Team member list |
4 | 9/28 | Applications of Fourier transform, computing resources of CGU AI Center | pptx | video | Homework1 presentation |
5 | 10/5 | Homework 1 feedback, CGU AI Center Kubeflow demo, deep learning for computer vision | pptx | video | Project proposal due |
6 | 10/12 | Project pitch feedback, k-nearest neighbor, k-fold cross validation | pptx | video | Project pitch |
7 | 10/19 | CGU AI Center computing resource tutorial for DNN training and image generation | pptx | video | Homework2 presentation |
8 | 10/26 | HOG-based pedestrian detection, CNN-part 1: linear classifier, regularization and optimization | pptx | video | |
9 | 11/2 | CNN-part 2: Neural networks, backpropagation, convolutional neural networks | pptx | video | |
10 | 11/9 | CNN-part 3: 1x1 convolution | pptx | video | Homework3 presentation |
11 | 11/16 | Homework 3 feedback, 3D convolution, generative model part1: GANs | pptx | video | Project midterm report |
12 | 11/23 | Generative model part2: GANs, Diffusion model, CLIP | pptx | video | |
13 | 11/30 | Students’ Homework 4 presentations | | | Homework4 presentation |
14 | 12/7 | Transformer, ViT, CLIP, LAION, DDPM, DDIM, AE, VAE, VQ-VAE | pptx | video | |
15 | 12/14 | SAGAN, BigGAN, BERT, Network-to-Network, VQGAN, DALL-E, Latent Diffusion, OpenCLIP, PyramidCLIP | pptx | video | |
16 | 12/21 | Final presentation, BLIP | pptx | video | Term project presentation |
Topic | Slides | Report |
---|---|---|
Deepfake Detection: An In-depth Comparative Analysis of the Generalizability of Various Deepfake Detection Techniques | pptx | docx |
Visual Inspection on Mango Maturity | pptx | |
Impact of Lighting Conditions on Face Recognition Accuracy | pptx | |
Sign Language Translation System | pptx | docx |
Social Distance Detection | pptx | |
Instructor’s comments on the final reports (pdf)
Your final grade will be made up from
Chih-Yuan Yang: cyyang@cgu.edu.tw
Office hours: Tue 10:30~11:30 Management Building Room 1416