Generative AI
Module aims
This module provides students with a deep understanding of generative AI by exploring its theoretical foundations and practical applications. Covering key concepts in generative modelling, the module introduces techniques such as Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), Diffusion Models, Transformers, and Autoregressive (AR) models, equipping learners with the skills to design and implement these models effectively.
The module places a strong focus on real-world applications: students will critically assess the strengths and limitations of different generative AI approaches. It also emphasizes the ethical considerations surrounding AI-generated content, ensuring learners are prepared to address potential risks and societal impacts. Throughout the module, students engage in hands-on coding exercises and case studies, progressively building a strategic understanding of generative AI deployment.
Learning outcomes
1. Demonstrate a comprehensive understanding of the theoretical foundations of generative AI, including core principles, mathematical frameworks, and distinctions from discriminative models.
2. Design, implement, and rigorously evaluate leading generative AI techniques, such as VAEs, GANs, diffusion models, Transformers, and autoregressive models, across a variety of data modalities (text, audio, image, 3D, and video).
3. Apply deep learning concepts and hands-on coding skills to real-world problems, leveraging generative models for innovative applications and critically comparing the strengths and limitations of each approach.
4. Identify, analyze, and address ethical and societal challenges associated with generative AI, including responsible deployment, risk mitigation, and the impact of synthetic media on society.
Module syllabus
1: Preliminaries: Basics of Deep Learning
• Neural networks
• Loss functions
• Optimization algorithms
• Training techniques (see the code sketch after this list)
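A minimal sketch of how these four ingredients fit together, assuming PyTorch and illustrative synthetic data and hyperparameters (not prescriptive choices for the labs):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic regression data: y = 3x + noise (illustrative only)
x = torch.randn(256, 1)
y = 3 * x + 0.1 * torch.randn(256, 1)

# Neural network: two linear layers with a ReLU non-linearity
model = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))

loss_fn = nn.MSELoss()                                      # loss function
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)   # optimization algorithm

# Training loop: forward pass, loss, backpropagation, parameter update
for epoch in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

print(f"final training loss: {loss.item():.4f}")
```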
2: Introduction to Generative AI
• Overview of discriminative vs. generative models (contrasted in the sketch after this list)
• Introduction to unsupervised learning and probability in generative models
• Types of generative models and key applications
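A minimal sketch contrasting the two model families, assuming NumPy and 1-D synthetic data: a generative classifier estimates the class-conditional densities p(x | y) and the priors p(y), applies Bayes' rule to classify, and, because it models p(x | y), can also sample new data. A discriminative model (e.g. logistic regression) would instead fit p(y | x) directly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two classes of 1-D data with different means (illustrative synthetic data)
x0 = rng.normal(loc=-1.0, scale=1.0, size=500)   # class 0
x1 = rng.normal(loc=1.5, scale=1.0, size=500)    # class 1

def gaussian_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Generative step: estimate p(x | y) per class and the class priors p(y)
mu0, s0 = x0.mean(), x0.std()
mu1, s1 = x1.mean(), x1.std()
prior0 = prior1 = 0.5

# Bayes' rule gives the posterior p(y = 1 | x) used for classification
x_test = np.array([-2.0, 0.0, 2.0])
p0 = gaussian_pdf(x_test, mu0, s0) * prior0
p1 = gaussian_pdf(x_test, mu1, s1) * prior1
print(p1 / (p0 + p1))                            # posterior probability of class 1

# Because p(x | y) is modelled explicitly, the model can also generate samples
print(rng.normal(loc=mu1, scale=s1, size=5))     # new draws "from class 1"
```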
3: Text Generation
• Generative models for text: text tokenization, attention mechanisms, pretraining on unlabelled data, fine-tuning to follow instructions (attention is sketched after this list)
• Applications: text completion, chatbots, translation, summarization
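A minimal sketch of one building block named above, scaled dot-product attention, assuming PyTorch and toy random embeddings; a full Transformer-based text generator would add learned query/key/value projections, multiple heads, causal masking, and a tokenizer.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5   # similarity of queries and keys
    weights = F.softmax(scores, dim=-1)             # attention weights sum to 1
    return weights @ v                              # weighted sum of the values

# Toy example: a batch of 4 token embeddings of dimension 8
torch.manual_seed(0)
x = torch.randn(1, 4, 8)
out = scaled_dot_product_attention(x, x, x)         # self-attention: q = k = v
print(out.shape)                                    # torch.Size([1, 4, 8])
```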
4: Audio Generation
• Generative models for audio: audio tokenization, semantic and acoustic modelling, audio LLMs, audio detokenization (the acoustic front end is sketched after this list)
• Applications: speech synthesis, music generation, audio restoration
• Ethics: voice cloning
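A minimal sketch of a typical acoustic front end, assuming torchaudio and a synthetic sine wave: the waveform is converted to a mel-spectrogram, the kind of acoustic representation that a neural codec or audio tokenizer would then discretise. The parameter values are illustrative, not prescriptive.

```python
import torch
import torchaudio

sample_rate = 16_000
t = torch.arange(0, 1.0, 1.0 / sample_rate)                    # 1 second of samples
waveform = torch.sin(2 * torch.pi * 440.0 * t).unsqueeze(0)    # 440 Hz tone, shape (1, T)

mel = torchaudio.transforms.MelSpectrogram(
    sample_rate=sample_rate, n_fft=1024, hop_length=256, n_mels=80
)
spectrogram = mel(waveform)                                    # shape (1, 80, frames)
print(spectrogram.shape)
```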
5: Image Generation
• Generative models for images: VAEs (variational inference, KL-divergence), GANs (adversarial learning), and Diffusion Models (forward and reverse diffusion processes, denoising score matching); the forward process is sketched after this list
• Applications: image synthesis, style transfer, inpainting
• Ethics: deepfakes
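A minimal sketch of the forward diffusion process, assuming PyTorch and an illustrative linear noise schedule: data is progressively corrupted with Gaussian noise, and the marginal q(x_t | x_0) can be sampled in closed form. The reverse, denoising process is what a diffusion model learns (e.g. via denoising score matching).

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)        # noise schedule (assumed linear)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)    # cumulative products of (1 - beta)

def q_sample(x0, t, noise=None):
    """Sample x_t ~ q(x_t | x_0) in closed form."""
    if noise is None:
        noise = torch.randn_like(x0)
    a_bar = alpha_bars[t]
    return a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise

x0 = torch.randn(1, 3, 32, 32)               # a toy "image"
x_t = q_sample(x0, t=500)                    # heavily noised version of x0
print(x_t.shape)
```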
6: 3D Generation
• 3D representations: voxels, meshes, point clouds, Neural Radiance Fields (NeRF), and Gaussian Splatting (point clouds and voxels are sketched after this list)
• Generative models for 3D: GAN-based (3D-GAN), Neural fields (NeRF), and Diffusion-based (Point-E, Magic3D)
• Applications: 3D object synthesis, virtual reality, augmented reality
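A minimal sketch of two of the 3D representations listed above, assuming NumPy and a random synthetic point cloud: the points are quantised into a coarse voxel occupancy grid. Real pipelines would start from scanned or generated geometry rather than random points.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(2048, 3))    # point cloud in [-1, 1]^3

resolution = 32
# Map each point to a voxel index and mark that voxel as occupied
indices = ((points + 1.0) / 2.0 * (resolution - 1)).astype(int)
voxels = np.zeros((resolution, resolution, resolution), dtype=bool)
voxels[indices[:, 0], indices[:, 1], indices[:, 2]] = True
print(voxels.sum(), "occupied voxels out of", resolution ** 3)
```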
7: Video Generation
• Video basics: representations (frame-based, optical flow, motion fields, spatio-temporal volumes) and challenges (temporal consistency, long-range dependency, high dimensionality); the tensor representation is sketched after this list
• Generative models for video: Autoregressive models (VideoGPT), Diffusion-based models (Google Imagen Video, Runway Gen-4), temporal consistency, motion regularization, conditioning techniques
• Applications: entertainment, simulation, robotics
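A minimal sketch of the frame-based and spatio-temporal-volume representations listed above, assuming PyTorch and random frames: a clip is held as a (time, channels, height, width) tensor, with simple frame differences standing in for the motion information that optical-flow methods estimate.

```python
import torch

T_frames, C, H, W = 16, 3, 64, 64
video = torch.rand(T_frames, C, H, W)     # spatio-temporal volume of random frames
frame_diffs = video[1:] - video[:-1]      # naive temporal change between frames
print(video.shape, frame_diffs.shape)
```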
Teaching methods
• Lectures: Weekly sessions covering foundational concepts and theoretical principles.
• Tutorials: Problem-solving sessions to reinforce concepts and discuss practical challenges.
• Lab Sessions: Hands-on coding exercises in Jupyter notebooks, focusing on implementing generative models.
An online service will be used as a discussion forum for the module.
Assessments
The assessment consists of two components: hands-on coursework and a final written examination.
1. Coursework (30%): Hands-on project involving the implementation of a generative model to solve a chosen problem, developed in a Jupyter notebook.
2. Final Exam (70%): Written exam assessing theoretical understanding of key generative AI models, mathematical foundations, and practical challenges.
Detailed written feedback will be given for each assessed assignment, along with class-wide feedback that highlights common pitfalls and provides suggestions for improvement. This approach ensures that students receive both individualized and general guidance to enhance their learning and performance.