Fall Semester, 2004
Iowa State University
MW 2:10-3:25PM,
Instructor: Zhao Zhang
Office Hours: Monday 1:00-2:00PM; Wednesday 3:30-4:30PM; or by appointment
Office Location: 368 Durham Center
Contact: phone 294-7940, e-mail: zzhang@iastate.edu
Number of Credits: 3
Prerequisite: CprE 305 Computer Systems Organization and Architecture or an equivalent course.
COURSE OBJECTIVES
This course introduces high-performance computer architecture in a systematic manner. First, it discusses two fundamental issues in designing a computer: instruction set architecture (ISA) design and performance evaluation methodology. Then, it introduces dynamically scheduled superscalar techniques, including multi-issue, dynamic instruction scheduling, speculative execution, branch prediction, and memory dependence speculation. Together, these techniques allow a processor to exploit instruction-level parallelism (ILP) in a sequential program. Next, the course studies advanced cache designs, including multi-level caches, high-performance instruction caches, multi-port data caches, cache prefetching, and other techniques. These techniques address the performance loss caused by the growing CPU-memory speed gap. The course then studies high-performance storage systems, which are important to applications with heavy I/O activity. After that, it introduces multiprocessor systems, which are ideal platforms for parallel and multiprogramming workloads. Finally, as time permits, it covers a set of relatively new techniques, including SMT (simultaneous multithreading), CMP (chip-level multiprocessing), recent VLIW (very long instruction word) developments, network processors, and low-power processor designs.
Textbook
J. L. Hennessy and D. A. Patterson, Computer Architecture: A Quantitative Approach, 3rd edition, Morgan Kaufmann Publishers, Inc., 2002.
REFERENCES
J. P. Shen and M. H. Lipasti, Modern Processor Design: Fundamentals of Superscalar Processors, McGraw-Hill, 2002.
D. A. Patterson and J. L. Hennessy, Computer Organization and Design: The Hardware/Software Interface, 2nd Edition, Morgan Kaufmann Publishers, 1997.
Additional Readings
Selected papers from journals and major conferences in computer architecture.
COURSE SCHEDULE
(Subject to change)
Week 1
8/23 Lecture 1: Introduction
8/25 Lecture 2: Performance evaluation methodology
Week 2
8/30 Lecture 3: ISA principles
9/1 Lecture 4: Instruction dependence analysis and superscalar techniques overview
Week 3
9/6 (Labor Day)
9/8 Lecture 5: Scoreboarding: Enforcing register data dependence
Week 4
9/13 Lecture 6: Tomasulo algorithm: register renaming and tag-based dependence check
9/15 Lecture 7: Speculative execution and recovery using Reorder Buffer
Week 5
9/20 Lecture 8: Branch prediction and fetch prediction
9/22 Lecture 9: Memory dependence speculation
Week 6
9/27 Lecture 10: Modern superscalar processor models
9/29 Lecture 11: Alpha 21264 and Intel Pentium 4
Week 7
10/4 Lecture 12: VLIW and EPIC
10/6 Term Exam 1
Week 8
10/11 Lecture 13: Cache and virtual memory review
10/13 Lecture 14: Hardware approaches for cache optimizations
Week 9
10/18 Lecture 15: Application approaches for improving cache locality
10/20 Lecture 16: Prefetching techniques
Week 10
10/25 Lecture 17: Shared-memory SMP -- Overview and Cache Coherence
10/27 Lecture 18: Shared-memory SMP -- Cache Coherence
Week 11
11/1 Lecture 19: Shared-memory SMP -- Memory consistency
11/3 Lecture 20: Shared-memory SMP -- Examples and Performance
Week 12 (TBD)
Lecture 21: RAID -- High Performance Storage Systems
Lecture 22: RAID (continued)
Week 13
11/15 Lecture 23: Simultaneous Multithreading and Chip-level Multiprocessing
11/17 Lecture 24: Power Efficient Systems
Thanksgiving
Week 14
11/29 Lecture 25: Network Processor
12/1 Term Exam 2
Week 15
12/6 Student Presentations
12/8 Student Presentations
Final Week
12/14 Student Presentations (12:00-2:00)
HOMEWORK ASSIGNMENTS
There will be seven to nine homework assignments. In general, the problems will fall into one of the following categories:
Conceptual: Test your understanding of certain topics.
Analytical & Evaluative: Analyze a given issue (e.g., a pipeline performance bottleneck), or evaluate the performance, strengths, and weaknesses of a given design.
Problem Solving: Give a solution to a given problem or optimize a hardware design.
Reading-based: Answer questions on an assigned paper reading.
Simulation-based: Develop small simulation modules for a given design and evaluate the performance.
Verilog-based: Use Verilog to describe and verify an architectural design idea.
SURVEY PROJECTS
You will write a survey with a partner on a topic that you select from a list of suggestions. You may also choose your own topic with the permission of Dr. Zhang. At the end of the course, you and your partner will submit the survey and give a short presentation.
EXAM
There will be two term exams, each 75 minutes long.
Grading
Exams: 30%
Homework Assignments: 50%
Survey Project: 15%
Class participation: 5%
Students with Disabilities: If you have a disability requiring accommodation in this class, please notify the instructor at the beginning of the semester.