City University of Hong Kong
Department of Computer Science
Artificial Intelligence
Semester B, 2024/25
This is a 3-credit course.
CS4486 is an undergraduate-level course in the field of Artificial
Intelligence (AI). This course is designed to equip students with the knowledge
and skills needed to solve problems using AI techniques. It is not about computer
vision or natural language processing; instead, it is an entry-level course
covering problem-solving methods such as search and optimization,
logical systems with reasoning, and machine learning techniques.
Prerequisites
- CS2310 Computer Programming or
- CS2315 Computer Programming or
- CS2334 Data Structures for Data Science or
- CS2360 Java Programming
Textbook
There is no textbook for the course. All teaching materials will come from
online sources.
The optional readings, unless otherwise specified, come from the book
Artificial Intelligence: A Modern Approach, 3rd ed., by Stuart
Russell and Peter Norvig.
Instructor:
Dr. Dapeng Wu
Office: Y6321, AC-1 Building
Email: dapengwu@cityu.edu.hk
TAs:
1) Hong Huang
Email: hohuang-c@my.cityu.edu.hk
2) Yongcan Luo
Email: yongcaluo2-c@my.cityu.edu.hk
3) Hongming Piao
Email: hpiao6-c@my.cityu.edu.hk
4) Tianli Shi
Email: tianlishi2-c@my.cityu.edu.hk
5) Hao Wang
Email: hwang728-c@my.cityu.edu.hk
6) Shuguang Wang
Email: sgwang6-c@my.cityu.edu.hk
7) Yun Wang
Email: ywang3875-c@my.cityu.edu.hk
8) Renwei Yang
Email: renweyang2-c@my.cityu.edu.hk
9) Guanyi Zhao
Email: guanyzhao3-c@my.cityu.edu.hk
10) Jiahao Zheng
Email: jhzheng4-c@my.cityu.edu.hk
Course website: https://www.cs.cityu.edu.hk/~dapengwu/courses/CS4486s25
Meeting Time for Lectures
Friday, 9 am - 11:50 am
Meeting Room for Lectures
LT 18 (on Floor 4), AC-1 Building
Meeting Weeks for Tutorials
Tutorials will be held in Room B4702, AC-1 Building, from the first week through
the 10th week (i.e., from Jan. 17 to March 28), for a total of 10 tutorials. Note
that there is no class or tutorial on Jan. 31. There
are two tutorial sessions, which meet at the following times:
- 13:00-13:50, Friday, instructors: Hong Huang, Hongming Piao
- 14:00-14:50, Friday, instructors: Tianli Shi, Renwei Yang
You only need to attend one session, since both sessions cover the same
teaching materials.
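
As a quick check of the schedule above, the following sketch enumerates the weekly Friday meetings from Jan. 17 to March 28 and skips Jan. 31. It assumes the calendar year is 2025 (inferred from "Semester B, 2024/25"); the function name `tutorial_dates` is illustrative and not part of any course material.

```python
from datetime import date, timedelta

def tutorial_dates(first, last, skipped):
    """Return the weekly meeting dates from first to last, excluding skipped dates."""
    dates = []
    current = first
    while current <= last:
        if current not in skipped:
            dates.append(current)
        current += timedelta(weeks=1)
    return dates

# Assumption: the year is 2025, inferred from "Semester B, 2024/25".
dates = tutorial_dates(date(2025, 1, 17), date(2025, 3, 28), {date(2025, 1, 31)})
for d in dates:
    print(d.isoformat())
print(len(dates))
```

Under that assumption the count comes out to 10, which matches the stated total of 10 tutorials.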

Course Policies
- During lectures, cell phones should be in silent mode.
- Late submissions of homework solutions and project reports are not accepted
unless the instructor grants permission in advance.

Grading:
| Component | Percentage | Due Dates |
| Weekly quiz | 10% | In-class quiz |
| Homework | 20% | To be announced |
| Project | 20% | To be announced |
| Final exam | 50% | April 28 - May 13 |
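
The percentages above sum to 100%. The syllabus gives only the weights, so the following is a minimal sketch under two assumptions: each component is scored on a 0-100 scale, and the overall course score is the weighted sum. The names `WEIGHTS` and `course_score` and the example scores are hypothetical.

```python
# Weights taken from the grading table; the 0-100 scale and the
# weighted-sum rule are assumptions made for illustration.
WEIGHTS = {
    "weekly_quiz": 0.10,
    "homework": 0.20,
    "project": 0.20,
    "final_exam": 0.50,
}

def course_score(scores):
    """Combine component scores (each 0-100) into a weighted total."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)

# Hypothetical student scores, not real data.
example = {"weekly_quiz": 90, "homework": 80, "project": 85, "final_exam": 70}
print(round(course_score(example), 2))
```

Because the final exam carries half of the weight, a 10-point change in the exam score moves the total by 5 points under this rule.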
Class Project:
The class project will be done individually. Each student is expected
to implement an AI technique to solve a real-world problem such as sales
prediction, bird classification, spam detection, music genre classification,
skin cancer classification, or game playing. Each student is expected to write
a report documenting his/her work.

The course calendar can be found here.


Useful links
- Anaconda: Anaconda is the
leading open data science platform powered by Python.
- Theano:
Theano is a Python library that lets you define, optimize, and evaluate
mathematical expressions, especially ones with multi-dimensional arrays (numpy.ndarray).
- TensorFlow:
TensorFlow is an open source software library for numerical computation using
data flow graphs. Nodes in the graph represent mathematical operations, while
the graph edges represent the multidimensional data arrays (tensors)
communicated between them. The flexible architecture allows you to deploy
computation to one or more CPUs or GPUs in a desktop, server, or mobile device
with a single API.
- Keras: Keras is a minimalist,
highly modular neural networks library, written in Python and capable of
running on top of either TensorFlow or Theano. It was developed with a focus
on enabling fast experimentation. Being able to go from idea to result with
the least possible delay is key to doing good research.
- PyTorch: PyTorch is a deep learning
framework for fast, flexible experimentation.
- A curated list of resources dedicated to
recurrent neural networks
- Use the Keras platform to implement handwritten digit recognition, with a
multi-layer perceptron:
[link]
- Source code in
PyTorch for handwritten digit recognition, using 2D convolutional neural networks
- Source code in Python for TF-mRNN: a TensorFlow library for image captioning
- Source code in Python for the following work on image captioning:
- Image captioning:
- Microsoft COCO datasets
- Visual Question Answering:
- Semantic Propositional Image Caption Evaluation (SPICE)
- Region-based Convolutional Neural Networks (R-CNN)
- References:
- Ren, Shaoqing, Kaiming He, Ross Girshick, and Jian Sun. "Faster R-CNN:
Towards real-time object detection with region proposal networks." In Advances
in neural information processing systems, pp. 91-99. 2015. [pdf]
- Dai, Jifeng, Yi Li, Kaiming He, and Jian Sun. "R-FCN: Object detection via
region-based fully convolutional networks." In Advances in neural information
processing systems, pp. 379-387. 2016. [pdf]
[source code]
- Huang, Jonathan, Vivek Rathod, Chen Sun, Menglong Zhu, Anoop Korattikara,
Alireza Fathi, Ian Fischer et al. "Speed/accuracy trade-offs for modern
convolutional object detectors." arXiv preprint arXiv:1611.10012 (2016). [pdf]
(E.g., for Inception V3, extract features from the “Mixed 6e” layer whose
stride size is 16 pixels. Feature maps are cropped and resized to 17x17.)
- Source codes:
- Source code in Python for end-to-end training of LSTM
- Bidirectional Encoder Representations from Transformers (BERT)
- Source code in Python for sequence-to-sequence learning (language translation,
chatbot)
- Visual Storytelling Dataset (VIST)
- Visual storytelling algorithms:
- No Metrics Are Perfect: Adversarial REward Learning for Visual
Storytelling: source code (TensorFlow)
- Visual Genome is a dataset, a knowledge
base, an ongoing effort to connect structured image concepts to language.
- MPII Movie & Description dataset for automatic video description, video
summary, video storytelling
- Bidirectional recurrent neural networks (B-RNN):
- Graves, Alex, Navdeep Jaitly, and Abdel-rahman Mohamed. "Hybrid speech
recognition with deep bidirectional LSTM." IEEE Workshop on Automatic Speech
Recognition and Understanding (ASRU), 2013. [pdf]
- Deep reinforcement learning
- UCL Course on reinforcement learning: [ppt]
[video]
- References:
- Mnih, Volodymyr, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis
Antonoglou, Daan Wierstra, and Martin Riedmiller. "Playing
atari with deep reinforcement learning." arXiv preprint arXiv:1312.5602
(2013).
- Mnih, Volodymyr, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel
Veness, Marc G. Bellemare, Alex Graves et al. "Human-level
control through deep reinforcement learning." Nature 518, no. 7540
(2015): 529-533. [source code]
- How to Study Reinforcement Learning
- Source codes:
- Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, TensorFlow.
Exercises and Solutions to accompany Sutton's Book and David Silver's course.
[link]
- Generative Adversarial Network (GAN)
- References:
- Goodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley,
Sherjil Ozair, Aaron Courville, and Yoshua Bengio. "Generative
adversarial nets." In Advances in neural information processing systems,
pp. 2672-2680. 2014.
- Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised
representation learning with deep convolutional generative adversarial
networks." arXiv preprint arXiv:1511.06434 (2015).
- Arjovsky, Martin, Soumith Chintala, and Léon Bottou. "Wasserstein
GAN." arXiv preprint arXiv:1701.07875 (2017).
- Types of GAN
- Vanilla GAN
- Conditional GAN
- InfoGAN
- Wasserstein GAN
- Mode Regularized GAN
- Coupled GAN
- Auxiliary Classifier GAN
- Least Squares GAN
- Boundary Seeking GAN
- Energy Based GAN
- f-GAN
- Generative Adversarial Parallelization
- DiscoGAN
- Adversarial Feature Learning & Adversarially Learned Inference
- Boundary Equilibrium GAN
- Improved Training for Wasserstein GAN
- DualGAN
- MAGAN: Margin Adaptation for GAN
- Softmax GAN
- Source codes:
- A TensorFlow implementation of "Deep Convolutional Generative Adversarial Networks":
python code
- Collection of generative models, e.g. GAN, VAE in PyTorch and TensorFlow:
python code
- Sequential Generative Adversarial Network (GAN)
- References:
- Yu, Lantao, Weinan Zhang, Jun Wang, and Yong Yu. "SeqGAN:
Sequence Generative Adversarial Nets with Policy Gradient." In AAAI,
pp. 2852-2858. 2017.
- Mogren, Olof. "C-RNN-GAN:
Continuous recurrent neural networks with adversarial training." arXiv
preprint arXiv:1611.09904 (2016).
- Im, Daniel Jiwoong, Chris Dongjoo Kim, Hui Jiang, and Roland Memisevic. "Generating
images with recurrent adversarial networks." arXiv preprint
arXiv:1602.05110 (2016).
- Press, Ofir, Amir Bar, Ben Bogin, Jonathan Berant, and Lior Wolf. "Language
Generation with Recurrent Generative Adversarial Networks without Pre-training."
arXiv preprint arXiv:1706.01399 (2017).
- Source codes:
- Stanford NLP Parser: A natural language parser is a program that works out the
grammatical structure of sentences.
- Performance metrics for a natural language parser
- Precision and recall
- mAP (mean Average Precision) for Object Detection
- Question answering
- References:
- Source codes:
- Question answering datasets:
- The General Language Understanding Evaluation (GLUE)
benchmark is a collection of resources for training, evaluating, and analyzing
natural language understanding systems.
- Semantic Textual Similarity (STS) benchmark evaluation dataset
- Automatic text understanding and reasoning:
- NLTK sentiment analysis tool
- Opinion Lexicon (dictionary of sentiment words): Positive and Negative
- Human activity recognition
- HMDB: a large human motion database
- UCF101: Action Recognition Data Set
- Coronavirus dataset
- AI City Challenge
- Batch Normalization and Weight Decay Notes
- A powerful and flexible machine learning platform for drug discovery
- MATLAB Tutorial
- MATLAB Central
- Matlab Primer, Matlab Manuals, Image Processing Toolbox
- Matlab implementation of image/video compression algorithms
- Matrix Reference Manual
- HIPR2: a WWW-based Image Processing Teaching Materials with J
- Learning by simulations
- OpenCV
- OpenGL
- A Recipe for Training Neural Networks (by Andrej Karpathy)
- Download the following free (open source) program to record video with screen capture:
http://www.nchsoftware.com/capture/index.html?gclid=CNadwsW6-6wCFSVjTAodbjzTSg
Free books
Software:
- Virtual Dub: VirtualDub
is a video capture/processing utility for 32-bit Windows platforms
(95/98/ME/NT4/2000/XP), licensed under the GNU General Public License (GPL).
- XnView: XnView
is an efficient multimedia viewer, browser, and converter.
- ImageJ: Read and write GIF,
JPEG, and ASCII. Read BMP, DICOM, and FITS. [Open Source, Public Domain]
- Open source for image processing tasks:
http://octave.sourceforge.net/doc/image.html
Related courses in other institutions:
JOURNALS
Elsevier
- Computer Vision and Image Understanding
- Journal of Visual Communication and Image Representation
- Data & Knowledge Engineering
- Image and Vision Computing
- Pattern Recognition
- Pattern Recognition Letters
IEEE
- IEEE Transactions on Circuits and Systems for Video Technology
- IEEE Transactions on Multimedia
- IEEE Transactions on Image Processing
- IEEE Transactions on Medical Imaging
- IEEE Transactions on PAMI
Computer Vision
Public Domain Image Databases
CMU Database
