Efficient AI Computing, Transforming the Future.

TinyML and Efficient Deep Learning Computing

6.S965

Fall 2022

https://efficientml.ai

Have you found it difficult to deploy neural networks on mobile and IoT devices? Have you ever found it too slow to train neural networks? This course is a deep dive into efficient machine learning techniques that enable powerful deep learning applications on resource-constrained devices. Topics cover efficient inference techniques, including model compression, pruning, quantization, neural architecture search, and distillation; efficient training techniques, including distributed training, gradient compression, and on-device transfer learning; and application-specific model optimization techniques for video, point clouds, GANs, and NLP. The course also covers forward-looking research on quantum machine learning. Students will get hands-on experience implementing deep learning applications on microcontrollers and mobile phones through an open-ended design project related to mobile AI.
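As a small taste of two of the inference-side techniques above, the sketch below shows magnitude-based weight pruning and symmetric 8-bit quantization of a weight tensor. This is an illustrative example only: it assumes PyTorch, and the function names, 90% sparsity target, and int8 format are arbitrary choices for demonstration, not taken from the course labs.

```python
import torch

def magnitude_prune(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Zero out the smallest-magnitude weights so ~`sparsity` of them become zero."""
    k = int(weight.numel() * sparsity)
    if k == 0:
        return weight.clone()
    threshold = weight.abs().flatten().kthvalue(k).values  # k-th smallest |w|
    return weight * (weight.abs() > threshold)

def quantize_int8(weight: torch.Tensor):
    """Symmetric uniform quantization to int8; returns (int8 values, fp32 scale)."""
    scale = weight.abs().max() / 127.0
    q = torch.clamp((weight / scale).round(), min=-128, max=127).to(torch.int8)
    return q, scale

w = torch.randn(64, 64)
w_pruned = magnitude_prune(w, sparsity=0.9)   # keep only the largest 10% of weights
q, scale = quantize_int8(w_pruned)            # 8-bit integers plus a single fp32 scale
w_restored = q.float() * scale                # dequantize to inspect the error
print(f"sparsity: {(w_pruned == 0).float().mean().item():.2f}")                  # ~0.90
print(f"max quantization error: {(w_pruned - w_restored).abs().max().item():.4f}")
```

The lectures on pruning, sparsity, and quantization develop these basic ideas in much more depth.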

Instructor

Associate Professor

Teaching Assistants

Announcements

Currently no active announcements.

Schedule

  • Sep 8 - Lecture 1: Introduction [Slides] [Video] [Video (Live)]
  • Sep 13 - Lecture 2: Basics of Deep Learning [Slides] [Video] [Video (Live)]
  • Sep 15 - Lecture 3: Pruning and Sparsity (Part I) [Slides] [Video] [Video (Live)]
  • Sep 20 - Lecture 4: Pruning and Sparsity (Part II) [Slides] [Video] [Video (Live)]
  • Sep 22 - Lecture 5: Quantization (Part I) [Slides] [Video] [Video (Live)]
  • Sep 27 - Lecture 6: Quantization (Part II) [Slides] [Video] [Video (Live)]
  • Sep 29 - Lecture 7: Neural Architecture Search (Part I) [Slides] [Video] [Video (Live)]
  • Oct 4 - Lecture 8: Neural Architecture Search (Part II) [Slides] [Video] [Video (Live)]
  • Oct 6 - Lecture 9: Neural Architecture Search (Part III) [Slides] [Video] [Video (Live)]
  • Oct 11 - Student Holiday (no class)
  • Oct 13 - Lecture 10: Knowledge Distillation [Slides] [Video] [Video (Live)]
  • Oct 18 - Lecture 11: MCUNet - Tiny Neural Network Design for Microcontrollers [Slides] [Video] [Video (Live)]
  • Oct 20 - Lecture 12: Paper Reading Presentation [Slides] [Video] [Video (Live)]
  • Oct 25 - Lecture 13: Distributed Training and Gradient Compression (Part I) [Slides] [Video] [Video (Live)]
  • Oct 27 - Lecture 14: Distributed Training and Gradient Compression (Part II) [Slides] [Video] [Video (Live)]
  • Nov 1 - Lecture 15: On-Device Training and Transfer Learning (Part I) [Slides] [Video] [Video (Live)]
  • Nov 3 - Lecture 16: On-Device Training and Transfer Learning (Part II) [Slides] [Video] [Video (Live)]
  • Nov 8 - Lecture 17: TinyEngine - Efficient Training and Inference on Microcontrollers [Slides] [Video] [Video (Live)]
  • Nov 10 - Lecture 18: Efficient Point Cloud Recognition [Slides] [Video] [Video (Live)]
  • Nov 15 - Lecture 19: Efficient Video Understanding and GANs [Slides] [Video] [Video (Live)]
  • Nov 17 - Lecture 20: Efficient Transformers [Slides] [Video] [Video (Live)]
  • Nov 22 - Lecture 21: Basics of Quantum Computing [Slides] [Video] [Video (Live)]
  • Nov 24 - Thanksgiving (no class)
  • Nov 29 - Lecture 22: Quantum Machine Learning [Slides] [Video] [Video (Live)]
  • Dec 1 - Lecture 23: Noise Robust Quantum ML [Slides] [Video] [Video (Live)]
  • Dec 6 - Lecture 24: Final Project Presentation [Slides] [Video] [Video (Live)]
  • Dec 8 - Lecture 25: Final Project Presentation [Slides] [Video] [Video (Live)]
  • Dec 13 - Lecture 26: Course Summary & Guest Lecture [Slides] [Video] [Video (Live)]

Logistics

Grading

The class requirements include brief reading summaries, scribe notes for one lecture, four labs, and a final project. This is a PhD-level course; by the end of the class you should have a good understanding of efficient deep learning techniques and be able to deploy AI applications on resource-constrained devices.

The grading breakdown is as follows:

  • Scribe Duties (5%)
  • 4 Labs (60%)
  • Paper Review Presentation (10%)
  • Final Project (25%)
  • Participation Bonus (4%)

Note that this class does not have any tests or exams.

Scribe Duties

Each student is required to scribe for one lecture. During your assigned lecture, take detailed notes independently. After the lecture, convert your notes into a markdown document (see the guidelines). The notes must be detailed and thorough, and must be submitted through a pull request on GitHub within one week after the lecture. TAs will review the submitted notes, request changes if necessary, and eventually approve them.

As long as your scribe notes are complete and accurate, you will be awarded full credit for scribe duties. If your notes have errors or are otherwise not up to standard, we will inform you and give you a chance to correct them.

Labs

There will be 4 labs over the course of the semester. These assignments may cover material that has appeared in published papers and webpages. This is a graduate class, and we expect students to solve the problems themselves rather than search for the answers.

Collaboration Policy

Labs must be done individually: each student must hand in their own answers. However, it is acceptable to collaborate when figuring out answers and to help each other solve the problems. We will assume that, as participants in a graduate course, you will take responsibility for making sure you personally understand the solution arising from any collaboration. You must also indicate on each homework with whom you collaborated.

Late Policy

You are allowed 6 homework late days without penalty for the entire semester, and you may use up to 6 of them on a single assignment. Once those days are used up, you will be penalized according to the following policy:

  • Homework is worth full credit if submitted by the due time on the due date.
  • Late days are counted by calendar day (i.e., each new late day starts at 12:00 am ET).
  • Once the allowed late days are exceeded, the penalty is 50% of the credit per additional late day.
  • Homework is worth zero credit 2 days after the late day limit is exceeded; for example, an assignment turned in 1 day past the limit earns at most 50%, and one turned in 2 days past earns nothing.

You must turn in at least 3 of the 4 assignments, even if for zero credit, in order to pass the course.

Regrade Policy

If you feel that we have made a mistake in grading your work, please submit a regrade request to the TAs during office hours, and we will consider your request. Please note that regrading may cause your grade to go either up or down.

Paper Review Presentation

The goal of the paper review presentation is to learn how to read papers: how to critique them and extract the key information.

Every paper will have 2-3 students acting as reviewers, and they should work as a team. Each team is required to present the paper in class:

  • The team will give an overview of the paper, including background, contributions, methods, and key evaluation results.
  • Each student will discuss the strengths and weaknesses of the paper.
  • The team will answer questions from other students in the class.

Final Project

The class project will be carried out in groups of 2 or 3 people and has three main parts: a proposal, a final report, and an oral presentation. The project is an integral part of this class and is designed to be as similar as possible to researching and writing a conference-style paper.

Participation Bonus

We appreciate everyone being actively involved in the class! There are several ways of earning participation bonus credit, which will be capped at 4%:

  • Piazza participation: the top ~10 contributors on Piazza will receive 3%. (To prevent abuse of the system, not all contributions are counted, and the instructors reserve the right to judge whether a contribution counts as positive or negative.)
  • Completing the mid-semester evaluation: around the middle of the semester, we will send out a survey to help us understand how the course is going and how we can improve. Completing it is worth 1%.
  • Karma points: any other act that improves the class, which a TA or instructor notices and deems worthy, is worth 1%.