CS6886 - Systems Engineering for Deep Learning
Course Data :
Description: The course provides a comprehensive and current view of the systems challenges in executing Deep Learning (DL) workloads. It covers three major areas: hardware design, security, and safety-critical execution. All three areas are studied for both training and inference, on devices spanning edge to cloud. By the end of the course, students should appreciate the significance of systems thinking in enabling efficient DL and be equipped to implement and optimize large systems for real-world applications of DL.
CourseContent:
- Introduction to the course; overview of computer architecture and deep learning (2.5 weeks)
- Evolution of platforms for Deep Learning: CPUs, GPUs, FPGAs, DSPs, accelerators; hardware considerations in inference and training; case study of an accelerator: Google TPU v1-3 (2.5 weeks)
- Accelerating the convolution operation: algorithms, data flow patterns, memory reuse (1.5 weeks)
- Case-study on writing a custom GPU kernel for accelerating convolution (2 weeks)
- Optimizing networks: Weight quantization, network compression, sparse operations, zero forwarding, learning with hardware in the loop, learning and inference on low-memory devices (2 weeks)
- Case-study on hardware design of small modules with Bluespec BSV. Alternatively, implementing a state-of-the-art DL model on an embedded device such as Raspberry Pi / Arduino (1.5 weeks)
- Security for DL: side channel attacks on devices, secure enclaves, privacy during learning and inference (1.5 weeks)
- Safety-critical deployment of DL: Inference under timing deadlines, Minimizing latency (0.5 week)
The following topics will be covered based on available time and student interest:
- Neuromorphic computing, spiking neurons, signaling in time, in-memory compute, memristors, current industry pursuits
- Security for DL: adversarial learning, shared learning on confidential data, DL on encrypted data, homomorphic models
- Case-study on designing and optimizing a DL network for a real-world problem and deploying to an Android phone
TextBooks: There is no prescribed textbook for this course
ReferenceBooks: Recent papers from top venues, especially ML systems workshops at NIPS, ICML, SOSP
- Stoica, Ion, et al. "A Berkeley view of systems challenges for AI." arXiv preprint arXiv:1712.05855 (2017).
- Sze, Vivienne, et al. "Efficient processing of deep neural networks: A tutorial and survey." Proceedings of the IEEE 105.12 (2017): 2295-2329.
Prerequisite: CS5691 / EE5177 and CS2600.
Parameters:
- Credits: 3-0-0-0-0-9-12
- Type: Elective
- Date of Introduction: Sep 2018