CMSC 32201 Topics in Computer Architecture

Winter 2019

 

Course Hours and Location:

Tuesdays and Thursdays, 11:00am-12:20pm, JCL 350

  

Instructor

Yanjing Li (yanjingl@uchicago.edu)

 

Course Description

Computing systems have advanced rapidly and transformed every aspect of our lives over the last few decades, and innovations in computer architecture are a key enabler. Residing in the middle of the system design layers, computer architecture interacts with both the software stack (e.g., operating systems and applications) and hardware technologies (e.g., logic gates, interconnects, and memories) to enable efficient computing with unprecedented capabilities. In this advanced course, we will discuss the most up-to-date research, trends, and future directions in computer architecture. Students are required to present and participate in class and to complete a research project.

  

Grading

Presentations: 50%

Attendance and Participation: 25%

Technical Report: 25%

  

Course Topics and Schedule

 

Date

Topics

Assignments

Tuesday, 01/08

Course intro and logistics

 

Thursday, 01/10

Architectures for Machine Learning / Deep Learning

Day 1:

1.     V. Sze, T.-J. Yang, Y.-H. Chen, J. Emer, "Efficient Processing of Deep Neural Networks: A Tutorial and Survey," Proceedings of the IEEE, vol. 105, no. 12, pp. 2295-2329, December 2017.

 

Day 2&3:

1.    Y.-H. Chen, J. Emer, V. Sze, "Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks," International Symposium on Computer Architecture (ISCA), pp. 367-379, June 2016.

2.    Y.-H. Chen, J. Emer, V. Sze, "Eyeriss v2: A Flexible and High-Performance Accelerator for Emerging Deep Neural Networks."

3.    MAESTRO: A performance and cost model for DNN dataflows.

https://arxiv.org/abs/1805.02566.

4.    Hyoukjun Kwon, Ananda Samajdar, and Tushar Krishna. 2018. MAERI: Enabling Flexible Dataflow Mapping over DNN Accelerators via Reconfigurable Interconnects. SIGPLAN Not. 53, 2 (March 2018), 461-475. 

5.    DNN Dataflow Choice Is Overrated.

https://arxiv.org/abs/1809.04070.

6.    Mingyu Gao, Jing Pu, Xuan Yang, Mark Horowitz, and Christos Kozyrakis. 2017. TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory. In Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '17).

7.    Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark A. Horowitz, and William J. Dally. 2016. EIE: efficient inference engine on compressed deep neural network. In Proceedings of the 43rd International Symposium on Computer Architecture (ISCA '16). IEEE Press, Piscataway, NJ, USA, 243-254.

8.    A. Parashar et al., "SCNN: An accelerator for compressed-sparse convolutional neural networks," 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA), Toronto, ON, 2017, pp. 27-40.

9.    A. Shafiee et al., "ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars," 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), Seoul, 2016, pp. 14-26.

10. RAPIDNN: In-Memory Deep Neural Network Acceleration Framework.

https://arxiv.org/abs/1806.05794.

11. On-Chip Optical Convolutional Neural Networks. https://arxiv.org/abs/1808.03303.

 

Tuesday, 01/15

Thursday, 01/17

Tuesday, 01/22

Hardware Security

Day 1:

1.    Meltdown: https://arxiv.org/abs/1801.01207

2.    Spectre: https://arxiv.org/abs/1801.01203

3.    https://googleprojectzero.blogspot.com/

4.     RowHammer: Yoongu Kim, Ross Daly, Jeremie Kim, Chris Fallin, Ji Hye Lee, Donghyuk Lee, Chris Wilkerson, Konrad Lai, and Onur Mutlu. 2014. Flipping bits in memory without accessing them: an experimental study of DRAM disturbance errors. In Proceedings of the 41st Annual International Symposium on Computer Architecture (ISCA '14).

5.     Security Basics for Computer Architects, Synthesis Lectures on Computer Architecture (not required)

 

Day 2&3:

1.     O. Acıiçmez, B. B. Brumley, and P. Grabher, “New results on instruction cache attacks,” in CHES, Santa Barbara, CA, US, Apr 2010, pp. 110–124.

2.     N. Benger, J. van de Pol, N. P. Smart, and Y. Yarom, “‘Ooh aah..., just a little bit’: A small amount of side channel can go a long way,” in CHES, Busan, KR, Sep 2014, pp. 75–92.

3.     R. Hund, C. Willems, and T. Holz, “Practical timing side channel attacks against kernel space ASLR,” in Symp. Security & Privacy, San Francisco, CA, US, May 2013, pp. 191–205.

4.     Y. Yarom and K. Falkner, “FLUSH+RELOAD: a high resolution, low noise, L3 cache side-channel attack,” in USENIX Security, San Diego, CA, US, Aug 2014, pp. 719–732.

5.     Fangfei Liu, Yuval Yarom, Qian Ge, Gernot Heiser, and Ruby B. Lee. 2015. Last-Level Cache Side-Channel Attacks are Practical. In Proceedings of the 2015 IEEE Symposium on Security and Privacy (SP '15). IEEE Computer Society, Washington, DC, USA, 605-622.

6.     W.-M. Hu, “Reducing timing channels with fuzzy time,” in Symp. Security & Privacy, Oakland, CA, US, May 1991, pp. 8–20.

7.     T. Kim, M. Peinado, and G. Mainar-Ruiz, “STEALTHMEM: System-level protection against cache-based side channel attacks in the Cloud,” in USENIX Security, Bellevue, WA, US, Aug 2012.

8.     Z. Wang and R. B. Lee, “New Cache Designs for Thwarting Software Cache-based Side Channel Attacks,” in ISCA, San Diego, CA, US, Jun 2007, pp. 494–505.

9.     Z. Wang and R. B. Lee, “A Novel Cache Architecture with Enhanced Performance and Security,” in MICRO, Como, IT, Nov 2008, pp. 83–93.

10.  Wang, Z., and Lee, R. B. A Novel Cache Architecture with Enhanced Performance and Security. In IEEE/ACM International Symposium on Microarchitecture — MICRO (2008), pp. 83 — 93.

11.  Liu, F., and Lee, R. B. Random Fill Cache Architecture. In International Symposium on Microarchitecture — MICRO (2014), IEEE, pp. 203 — 215.

12.  Robert Martin, John Demme, and Simha Sethumadhavan. 2012. TimeWarp: rethinking timekeeping and performance monitoring mechanisms to mitigate side-channel attacks. In Proceedings of the 39th Annual International Symposium on Computer Architecture (ISCA '12). IEEE Computer Society, Washington, DC, USA, 118-129.

13.  J. Demme, R. Martin, A. Waksman and S. Sethumadhavan, "Side-channel vulnerability factor: A metric for measuring information leakage," 2012 39th Annual International Symposium on Computer Architecture (ISCA), Portland, OR, 2012, pp. 106-117.

14.  Sarani Bhattacharya and Debdeep Mukhopadhyay. 2016. Curious Case of Rowhammer: Flipping Secret Exponent Bits Using Timing Analysis. In CHES (Lecture Notes in Computer Science), Vol. 9813. Springer, 602–624.

15.  Daniel Gruss, Moritz Lipp, Michael Schwarz, Daniel Genkin, Jonas Juffinger, Sioli O’Connell, Wolfgang Schoechl, and Yuval Yarom. 2017. Another Flip in the Wall of Rowhammer Defenses. CoRR abs/1710.00551 (2017).

16.  Peter Pessl, Daniel Gruss, Clémentine Maurice, Michael Schwarz, and Stefan Mangard. 2016. DRAMA: Exploiting DRAM Addressing for Cross-CPU Attacks. In USENIX Security Symposium. USENIX Association, 565–581.

17.  Rui Qiao and Mark Seaborn. 2016. A new approach for rowhammer attacks. In HOST. IEEE Computer Society, 161–166.

18.  Kaveh Razavi, Ben Gras, Erik Bosman, Bart Preneel, Cristiano Giuffrida, and Herbert Bos. 2016. Flip Feng Shui: Hammering a Needle in the Software Stack. In USENIX Security Symposium. USENIX Association, 1–18.

19.  Mark Seaborn and Thomas Dullien. 2015. Exploiting the DRAM rowhammer bug to gain kernel privileges. Black Hat (2015), 7–9.

20.  Victor van der Veen, Yanick Fratantonio, Martina Lindorfer, Daniel Gruss, Clémentine Maurice, Giovanni Vigna, Herbert Bos, Kaveh Razavi, and Cristiano Giuffrida. 2016. Drammer: Deterministic Rowhammer Attacks on Mobile Platforms. In ACM Conference on Computer and Communications Security. ACM, 1675–1689.

21.  Yuan Xiao, Xiaokuan Zhang, Yinqian Zhang, and Radu Teodorescu. 2016. One Bit Flips, One Cloud Flops: Cross-VM Row Hammer Attacks and Privilege Escalation. In USENIX Security Symposium. USENIX Association, 19–35.

22.  Zelalem Birhanu Aweke, Salessawi Ferede Yitbarek, Rui Qiao, Reetuparna Das, Matthew Hicks, Yossi Oren, and Todd Austin. 2016. ANVIL: Software-Based Protection Against Next-Generation Rowhammer Attacks. In Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS ’16). ACM, New York, NY, USA, 743–755.

23.  Ferdinand Brasser, Lucas Davi, David Gens, Christopher Liebchen, and Ahmad-Reza Sadeghi. 2017. Can’t touch this: Software-only mitigation against rowhammer attacks targeting kernel memory. In Proceedings of the 26th USENIX Security Symposium (Security). Vancouver, BC, Canada.

24.  Radhesh Krishnan Konoth, Marco Oliverio, Andrei Tatar, Dennis Andriesse, Herbert Bos, Cristiano Giuffrida, and Kaveh Razavi. 2018. ZebRAM: Comprehensive and Compatible Software Protection Against Rowhammer Attacks. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18). USENIX Association, Carlsbad, CA, 697–710.

 

Thursday, 01/24

(no class)

Tuesday, 01/29

Thursday, 01/31

Tuesday, 02/05

Project presentations

 

Thursday, 02/07

Tuesday, 02/12

(no class)

Emerging Technologies and Paradigms

1. X. Zhang, R. Bashizade, C. LaBoda, C. Dwyer and A. R. Lebeck, "Architecting a Stochastic Computing Unit with Molecular Optical Devices," 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA), Los Angeles, CA, 2018, pp. 301-314.

 

2. K. Korgaonkar et al., "Density Tradeoffs of Non-Volatile Memory as a Replacement for SRAM Based Last Level Cache," 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA), Los Angeles, CA, 2018, pp. 315-327.

 

3. A. Fuchs and D. Wentzlaff, "Scaling Datacenter Accelerators with Compute-Reuse Architectures," 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA), Los Angeles, CA, 2018, pp. 353-366.

 

4. B. Feinberg, U. K. R. Vengalam, N. Whitehair, S. Wang and E. Ipek, "Enabling Scientific Computing on Memristive Accelerators," 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA), Los Angeles, CA, 2018, pp. 367-382.

 

5. C. Eckert et al., "Neural Cache: Bit-Serial In-Cache Acceleration of Deep Neural Networks," 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA), Los Angeles, CA, 2018, pp. 383-396.

 

6. M. Alshboul, J. Tuck and Y. Solihin, "Lazy Persistency: A High-Performing and Write-Efficient Software Persistency Technique," 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA), Los Angeles, CA, 2018, pp. 439-451.

 

7. A. Joshi, V. Nagarajan, M. Cintra and S. Viglas, "DHTM: Durable Hardware Transactional Memory," 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA), Los Angeles, CA, 2018, pp. 452-465.

 

8. T. Nguyen and D. Wentzlaff, "PiCL: A Software-Transparent, Persistent Cache Log for Nonvolatile Main Memory," 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Fukuoka, 2018, pp. 507-519.

 

9. A. Shahab, M. Zhu, A. Margaritov and B. Grot, "Farewell My Shared LLC! A Case for Private Die-Stacked DRAM Caches for Servers," 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Fukuoka, 2018, pp. 559-572.

 

10. P. Tsai, C. Chen and D. Sanchez, "Adaptive Scheduling for Systems with Asymmetric Memory Hierarchies," 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Fukuoka, 2018, pp. 641-654.

 

 

11. J. Liu, H. Zhao, M. A. Ogleari, D. Li and J. Zhao, "Processing-in-Memory for Energy-Efficient Neural Network Training: A Heterogeneous Approach," 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Fukuoka, 2018, pp. 655-668.

 

12. H. Mao, M. Song, T. Li, Y. Dai and J. Shu, "LerGAN: A Zero-Free, Low Data Movement and PIM-Based GAN Architecture," 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Fukuoka, 2018, pp. 669-681.

 

13. B. Hong, Y. Ro and J. Kim, "Multi-dimensional Parallel Training of Winograd Layer on Memory-Centric Architecture," 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Fukuoka, 2018, pp. 682-695.

 

14. S. Li et al., "SCOPE: A Stochastic Computing Engine for DRAM-Based In-Situ Accelerator," 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Fukuoka, 2018, pp. 696-709.

 

15. D. Zhang, V. Sridharan and X. Jian, "Exploring and Optimizing Chipkill-Correct for Persistent Memory Based on High-Density NVRAMs," 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Fukuoka, 2018, pp. 710-723.

 

Thursday, 02/14

Tuesday, 02/19

Thursday, 02/21

Tuesday, 02/26

Interconnection Networks

1. Y. Yao and Z. Lu, "iNPG: Accelerating Critical Section Access with In-network Packet Generation for NoC Based Many-Cores," 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA), Vienna, 2018, pp. 15-26.

 

2. Amirhossein Mirhosseini, Mohammad Sadrosadati, Behnaz Soltani, Hamid Sarbazi-Azad, and Thomas F. Wenisch. 2017. BiNoCHS: Bimodal Network-on-Chip for CPU-GPU Heterogeneous Systems. In Proceedings of the Eleventh IEEE/ACM International Symposium on Networks-on-Chip (NOCS '17). ACM, New York, NY, USA, Article 7, 8 pages.

 

3. F. Alazemi, A. AziziMazreah, B. Bose and L. Chen, "Routerless Network-on-Chip," 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA), Vienna, 2018, pp. 492-503.

 

4. Thomas Moscibroda and Onur Mutlu. 2009. A case for bufferless routing in on-chip networks. In Proceedings of the 36th annual international symposium on Computer architecture (ISCA '09). ACM, New York, NY, USA, 196-207. 

 

5. V. Y. Raparti and S. Pasricha, "DAPPER: Data Aware Approximate NoC for GPGPU Architectures," 2018 Twelfth IEEE/ACM International Symposium on Networks-on-Chip (NOCS), Turin, 2018, pp. 1-8.

 

6. R. Boyapati, J. Huang, P. Majumder, K. H. Yum and E. J. Kim, "APPROX-NoC: A data approximation framework for Network-on-Chip architectures," 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA), Toronto, ON, 2017, pp. 666-677.

 

7. S. Das, K. Basu, J. R. Doppa, P. P. Pande, R. Karri and K. Chakrabarty, "Abetting Planned Obsolescence by Aging 3D Networks-on-Chip," 2018 Twelfth IEEE/ACM International Symposium on Networks-on-Chip (NOCS), Turin, 2018, pp. 1-8.

 

8. H. M. G. Wassel et al., "Networks on Chip with Provable Security Properties," in IEEE Micro, vol. 34, no. 3, pp. 57-68, May-June 2014.

 

9. S. Werner, J. Navaridas and M. Luján, "Designing Low-Power, Low-Latency Networks-on-Chip by Optimally Combining Electrical and Optical Links," 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA), Austin, TX, 2017, pp. 265-276.

 

10. P. Grani, R. Proietti, V. Akella and S. J. B. Yoo, "Design and Evaluation of AWGR-Based Photonic NoC Architectures for 2.5D Integrated High Performance Computing Systems," 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA), Austin, TX, 2017, pp. 289-300.

 

11. S. Van Winkle, A. K. Kodi, R. Bunescu and A. Louri, "Extending the Power-Efficiency and Performance of Photonic Interconnects for Heterogeneous Multicores with Machine Learning," 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA), Vienna, 2018, pp. 480-491.

 

12. Natalie Enright Jerger, Ajaykumar Kannan, Zimo Li, and Gabriel H. Loh. 2014. NoC Architectures for Silicon Interposer Systems: Why Pay for more Wires when you Can Get them (from your interposer) for Free? In Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-47). IEEE Computer Society, Washington, DC, USA, 458-470.

 

13. J. Yin et al., "Modular Routing Design for Chiplet-Based Systems," 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA), Los Angeles, CA, 2018, pp. 726-738.

Thursday, 02/28

Tuesday, 03/05

Thursday, 03/07

Project presentations

 

Tuesday, 03/12

Friday, 03/15

 

Final technical report due