Changelog: Syllabus updated from the Jul 2018 offering.
Syllabus
- Introduction: Defining Computer Architecture, Flynn’s Classification of Computers, Metrics for
Performance Measurement.
- Memory Hierarchy: Introduction, Advanced Optimizations of Cache Performance, Memory
Technology and Optimizations, Virtual Memory and Virtual Machines, The Design of Memory
Hierarchy, Introduction to Pin Instrumentation and Cachegrind, Case Study: Memory
Hierarchies in Intel Core i7 and ARM Cortex-A8.
- Instruction-Level Parallelism: Concepts and Challenges, Basic
Compiler Techniques for Exposing ILP, Reducing Branch Costs with Advanced Branch
Prediction, Dynamic Scheduling, Advanced Techniques for Instruction Delivery and Speculation,
Limitations of ILP, Multithreading: Exploiting Thread-Level Parallelism to Improve Uniprocessor
Throughput, Modeling Branch Predictors using Pin Tool, Case Study: Dynamic Scheduling in
Intel Core i7 and ARM Cortex-A8.
- Thread-Level Parallelism: Introduction, Shared-Memory Multicore Systems, Performance
Metrics for Shared-Memory Multicore Systems, Cache Coherence Protocols, Synchronization,
Memory Consistency, Multithreaded Programming using OpenMP, Case Study: Intel Skylake
and IBM Power8.
- Data-Level Parallelism: Introduction, Vector Architecture, SIMD Instruction Set Extensions for
Multimedia, Graphics Processing Units, GPU Memory Hierarchy, Detecting and Enhancing
Loop-Level Parallelism, CUDA Programming, Case Study: Nvidia Maxwell.
Textbook
- J.L. Hennessy and D.A. Patterson. Computer Architecture: A Quantitative Approach. 5th Edition, Morgan Kaufmann Publishers, 2012.
References
- J.P. Shen and M.H. Lipasti. Modern Processor Design: Fundamentals of Superscalar Processors. McGraw-Hill Publishers, 2005.
- D.B. Kirk and W.W. Hwu. Programming Massively Parallel Processors. 2nd Edition, Morgan Kaufmann Publishers, 2012.
- Pin – A Dynamic Binary Instrumentation Tool. 
- Cachegrind: A Cache and Branch-Prediction Profiler. 
- OpenMP. 
- CUDA.