Objective: The course provides an understanding of the architectural fundamentals and design of shared memory multicore systems and explores state-of-the-art research issues related to such systems.
Syllabus:
Part - I: Fundamentals (11 Weeks)
Introduction (1 Week) -- Motivation for multicore systems, fundamental design issues.
Parallel Programs (1 Week) -- Parallel application case studies, the parallelization process.
Programming for Performance (1 Week) -- Partitioning for performance, data access and communication in a multi-memory system, orchestration for performance, performance factors from the processor's perspective.
Workload-Driven Evaluation (1 Week) -- Scaling workloads and machines, evaluating a real machine, evaluating an architectural idea or trade-off, illustrating workload characterization.
Shared Memory Multicore Systems (2 Weeks) -- Multicore systems, shared cache management in multicore systems, snoop-based multicore systems, scalable multicore systems.
Cache Coherence Protocols (2 Weeks) -- Snooping coherence protocols, assessing snoopy protocol design trade-offs, directory coherence protocols, assessing directory protocol design trade-offs, case studies.
Synchronization and Memory Consistency (2 Weeks) -- Mutual exclusion, point-to-point event synchronization, global synchronization, sequential consistency, total store order memory model, relaxed memory consistency.
Network on Chip (1 Week) -- Network topologies, routing techniques, flow control mechanisms, router architecture.
Part - II: Research Topics (3 Weeks)
Discussion of recent research papers related to the above topics.
Reference books:
- D.E. Culler, J.P. Singh, and A. Gupta. Parallel Computer Architecture: A Hardware/Software Approach. Morgan Kaufmann Publishers, 2010.
- N.E. Jerger and L.-S. Peh. On-Chip Networks. Morgan and Claypool, 2009.
- D.J. Sorin, M.D. Hill, and D.A. Wood. A Primer on Memory Consistency and Cache Coherence. Morgan and Claypool, 2011.