TinyML and Efficient Deep Learning Computing
Have you found it difficult to deploy neural networks on mobile and IoT devices? Have you ever found it too slow to train neural networks? This course is a deep dive into efficient machine learning techniques that enable powerful deep learning applications on resource-constrained devices. Topics cover efficient inference techniques, including model compression, pruning, quantization, neural architecture search, and distillation; efficient training techniques, including distributed training, gradient compression, and on-device transfer learning; application-specific model optimization techniques for video, point clouds, GANs, and NLP; and forward-looking research on quantum machine learning. Students will get hands-on experience implementing deep learning applications on microcontrollers and mobile phones through an open-ended design project related to mobile AI.
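To give a flavor of the techniques above, below is a minimal NumPy sketch (illustrative only, not part of the course materials) of two of them: magnitude-based pruning, which zeroes out the smallest weights, and symmetric int8 quantization, which maps float weights to 8-bit integers with a single per-tensor scale. The function names and the toy tensor are our own illustrations.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor quantization: map float weights to int8 plus a scale."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -128, 127).astype(np.int8)
    return q, scale

w = np.random.randn(4, 4).astype(np.float32)
w_sparse = magnitude_prune(w, sparsity=0.5)    # ~half the weights become zero
w_int8, scale = quantize_int8(w)               # 8-bit weights + dequantization scale
w_approx = w_int8.astype(np.float32) * scale   # approximate float reconstruction
```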
- Live Streaming: https://live.efficientml.ai/
- Time: Tuesday/Thursday 3:30-5:00 pm Eastern Time
- Location: 36-156
- Office Hour: Thursday 5:00-6:00 pm Eastern Time, 38-344 Meeting Room
- Discussion: Piazza
- Homework Submission: Canvas
- Contact:
  - Students should ask all course-related questions on Piazza.
  - For external inquiries, personal matters, or emergencies, you can email us at 6s965-fall2022-staff@mit.edu.
Announcements
Schedule
| Date | Lecture | Logistics |
| --- | --- | --- |
| Sep 8 | Introduction | |
| Sep 13 | Basics of Deep Learning | |
| Sep 15 | Pruning and Sparsity (Part I) | |
| Sep 20 | Pruning and Sparsity (Part II) | |
| Sep 22 | Quantization (Part I) | |
| Sep 27 | Quantization (Part II) | |
| Sep 29 | Neural Architecture Search (Part I) | |
| Oct 4 | Neural Architecture Search (Part II) | |
| Oct 6 | Neural Architecture Search (Part III) | |
| Oct 11 | Student Holiday - No Class | |
| Oct 13 | Knowledge Distillation | |
| Oct 18 | MCUNet - Tiny Neural Network Design for Microcontrollers | |
| Oct 20 | Paper Reading Presentation | |
| Oct 25 | Distributed Training and Gradient Compression (Part I) | |
| Oct 27 | Distributed Training and Gradient Compression (Part II) | |
| Nov 1 | On-Device Training and Transfer Learning (Part I) | |
| Nov 3 | On-Device Training and Transfer Learning (Part II) | |
| Nov 8 | TinyEngine - Efficient Training and Inference on Microcontrollers | |
| Nov 10 | Efficient Point Cloud Recognition | |
| Nov 15 | Efficient Video Understanding and GANs | |
| Nov 17 | Efficient Transformers | |
| Nov 22 | Basics of Quantum Computing | |
| Nov 24 | Thanksgiving - No Class | |
| Nov 29 | Quantum Machine Learning | |
| Dec 1 | Noise Robust Quantum ML | |
| Dec 6 | Final Project Presentation | |
| Dec 8 | Final Project Presentation | |
| Dec 13 | Course Summary & Guest Lecture | |
Logistics
Grading
The class requirements include brief reading summaries, scribe notes for 1 lecture, 4 labs, and a project. This is a PhD-level course; by the end of the class, you should have a good understanding of efficient deep learning techniques and be able to deploy AI applications on resource-constrained devices.
The grading breakdown is as follows:
- Scribe Duties (5%)
- 4 Labs (60%)
- Paper Review Presentation (10%)
- Final Project (25%)
- Participation Bonus (4%)
Note that this class does not have any tests or exams.
Scribe Duties
Each student is required to scribe for a few lectures. During your assigned lectures, you are to take detailed notes independently. After the lecture, you must convert your notes into a written markdown document (see the guidelines). The notes must be detailed and thorough, and must be submitted through a pull request on GitHub within 1 week after the lecture. TAs will audit and review the submitted notes, request changes if necessary, and eventually approve them.
As long as your scribe notes are complete and accurate, you will be awarded full credit for scribe duties. If your notes have errors or are otherwise not up to standard, we will inform you and give you a chance to correct them.
Labs
There will be 4 labs over the course of the semester. These assignments may contain material that has been covered in published papers and webpages. This is a graduate class, and we expect students to solve the problems themselves rather than search for answers.
Collaboration Policy
Labs must be done individually: each student must hand in their own answers. However, it is acceptable to collaborate when figuring out answers and to help each other solve the problems. We expect that, as participants in a graduate course, you will take responsibility for making sure you personally understand the solution arising from any such collaboration. You must also indicate on each homework with whom you collaborated.
Late Policy
You are allowed 6 homework late days in total for the entire semester, and you may be late by up to 6 days on any single assignment. Once those days are used up, you will be penalized according to the following policy:
- Homework is worth full credit if submitted by the due time on the due date.
- The allowed late days are counted in whole days (i.e., each new late day starts at 12:00 am ET).
- Once the allowed late days are exceeded, the penalty is 50% per additional late day, counted by day.
- Homework is worth zero credit 2 days after the late day limit is exceeded.
For example, once your 6 late days are spent, an assignment submitted 1 day late earns at most 50% credit, and one submitted 2 or more days late earns zero.
You must turn in at least 3 of the 4 assignments, even if for zero credit, in order to pass the course.
Regrade Policy
If you feel that we have made a mistake in grading your work, please submit a regrade request to the TAs during office hours, and we will consider it. Please note that regrading a homework may cause your grade to go either up or down.
Paper Review Presentation
The goal of the paper review presentation is to learn how to read papers: critique them and extract the key information.
Every paper will have 2-3 students acting as reviewers, and they should work as a team. Each team is required to present the paper in class:
- The team will give an overview of the paper, including background, contributions, methods, and key evaluation results.
- Each student will discuss the paper's strengths and weaknesses.
- The team will answer questions from other students in the class.
Final Project
The class project will be carried out in groups of 2 or 3 people, and has three main parts: a proposal, a final report, and an oral presentation. The project is an integral part of this class, and is designed to be as similar as possible to researching and writing a conference-style paper.
Participation Bonus
We appreciate everyone being actively involved in the class! There are several ways of earning participation bonus credit, which will be capped at 4%:
- Piazza participation: The top ~10 contributors to Piazza will get 3%. (To prevent abuse of the system, not all contributions are counted, and the instructors reserve the right to determine whether a contribution counts as positive or negative.)
- Completing mid-semester evaluation: Around the middle of the semester, we will send out a survey to help us understand how the course is going, and how we can improve. Completing it is worth 1%.
- Karma point: Any other act that improves the class, which a TA or instructor notices and deems worthy: 1%.