Course description

Teaching and group sessions will be run remotely using remote meeting software. Details about the timetable and how to connect to the sessions will be sent to registrants closer to the course start date.

The course provides an intensive two-week study of contemporary machine learning and deep learning methods through a series of lectures and seminars. Lectures provide formal definitions of the methods and derivations of their key properties. Seminars give hands-on experience of data analysis and of training, evaluating and applying various models. Most attention is given to building predictive models (classification, regression) with classical machine learning methods and deep neural networks. Additional attention is given to data preparation (feature selection, dimensionality reduction, feature scaling) and to grouping data into logical categories (clustering). Special deep learning topics such as convolutional neural networks, long short-term memory networks, generative adversarial learning and style transfer are also covered.

Students are asked to apply the algorithms they have studied to practical tasks on real data. Application projects cover machine learning use cases from various fields, with special emphasis on applications in high energy physics. The main tool throughout the course is the Python programming language with its scientific libraries (numpy, scipy, matplotlib, pandas, scikit-learn, tensorflow).

The course is organized by the Yandex School of Data Analysis.




The course is aimed at students with no experience, or only introductory experience, in machine learning. However, a firm mathematical background is expected, because each method is stated formally with its main properties derived analytically, and students are expected to solve theoretical problems that highlight important concepts of the course.

Participants are expected to be familiar with mathematical analysis, linear algebra, probability theory and statistics, and to have general programming skills.

Prior acquaintance with the Python programming language and its major data analysis libraries (numpy, scipy, matplotlib, pandas) is desirable. You can gain it from, for example, A Crash Course in Python for Scientists and 10 minutes introduction to pandas. More detailed information can be found in Python for Data Analysis.


25 January – 8 February 2021 (inclusive)


The course fee is £245 per person.


Registration is now closed. If you have any queries, please contact the organiser: Nick Wardle.