

All tutorials qualify for credit towards IEEE TTTC certification under the TTEP program. Each tutorial requires a separate registration fee. Attendees of tutorials receive study material, handouts, breakfast and coffee breaks. The study material includes copies of the presentation and bibliographical material, and, when applicable, a relevant textbook (textbooks are provided to attendees who register at IEEE/CS member or non-member rates).

2023 Tutorial Summaries

Click a tutorial name to read its abstract and the speakers' bios:

Sunday, October 8th, 8:30am - 12:00pm PDT
1 DEPENDABILITY AND TESTABILITY OF AI HARDWARE Fei SU (Intel), Haralampos STRATIGOPOULOS (Sorbonne Univ., CNRS, LIP6), Yiorgos MAKRIS (UT Dallas)
2 EARLY SYSTEM RELIABILITY ANALYSIS FOR CROSS-LAYER SOFT ERRORS Alberto BOSIO (EC Lyon), Stefano DI CARLO, Alessandro SAVINO (Politecnico di Torino)
3 DEVICE-AWARE TEST FOR EMERGING MEMORIES: THE MEAN TO WIN THE WAR AGAINST UNMODELED FAULTS Said HAMDIOUI (Delft University of Technology)

Sunday, October 8th, 1:00 - 4:30pm PDT
4 RANDOM PROCESS VARIATIONS, CIRCUIT TIMING MARGINALITIES AND SILENT DATA ERRORS Adit SINGH (Auburn University)
5 MIXED-SIGNAL DFT CHALLENGES AND SOLUTIONS Stephen SUNTER (Siemens EDA)
6 HIERARCHICAL AND TILE BASED DFT TECHNIQUES FOR AI AND LARGE SOC Lee HARRISON (Siemens EDA)

Monday, October 9th, 8:30am - 12:00pm PDT
7 SILICON LIFE CYCLE MANAGEMENT FOR EMERGING SOCS Jyotika ATHAVALE, Yervant ZORIAN (Synopsys)
8 FUNCTIONAL SAFETY READINESS: REQUIREMENTS IN DESIGN, TEST AND APPLICATION Rubin PAREKHJI, Prasanth VISWANATHAN PILLAI (Texas Instruments)
9 DOMAIN-SPECIFIC MACHINE LEARNING IN SEMICONDUCTOR TEST Li-C. WANG (UC Santa Barbara)

Monday, October 9th, 1:00 - 4:30pm PDT
10 AUTOMOTIVE SAFETY, RELIABILITY, AND TEST SOLUTIONS Jyotika ATHAVALE, Yervant ZORIAN (Synopsys)
11 TESTING AND MONITORING OF DIE-TO-DIE INTERCONNECTS IN A 2.5D/3D IC Shi-Yu HUANG (National Tsing-Hua University)
12 FUNCTIONAL TESTING TECHNIQUES Paolo BERNARDI (Politecnico di Torino)

Tutorial 1:

DEPENDABILITY AND TESTABILITY OF AI HARDWARE

Fei SU (Intel), Haralampos STRATIGOPOULOS (Sorbonne Univ., CNRS, LIP6), Yiorgos MAKRIS (UT Dallas) 

To sustain performance improvement despite slowing physical device scaling, the adoption of bold and radical innovations in computer architecture has recently accelerated. One such trend focuses on computing architectures for AI hardware. While the functionality of AI hardware remains the main focus, the testability and dependability of these new architectures need to be addressed before mainstream adoption. This tutorial covers the state of the art in research and development of dependability and testability solutions for AI hardware (including digital or analog implementations of artificial neural networks (ANNs) and spiking neural networks (SNNs), used in accelerators and neuromorphic designs) and discusses challenges and future trends.

Tutorial 2:

EARLY SYSTEM RELIABILITY ANALYSIS FOR CROSS-LAYER SOFT ERRORS

Alberto BOSIO (EC Lyon), Stefano DI CARLO, Alessandro SAVINO (Politecnico di Torino) 

In a world with computation at the epicenter of every activity, computing systems must be highly reliable even if miniaturization makes the underlying hardware unreliable. Techniques that guarantee high reliability are associated with high costs (the reliability tax). Early reliability analysis enables informed design decisions that maximize reliability while minimizing the reliability tax. This tutorial focuses on early cross-layer reliability analysis considering the full computing continuum (from IoT/CPS to HPC applications), emphasizing soft errors. The tutorial will guide attendees from the definition of the problem down to the proper modeling and design exploration strategies considering the entire system stack.

Tutorial 3:

DEVICE-AWARE TEST FOR EMERGING MEMORIES: THE MEAN TO WIN THE WAR AGAINST UNMODELED FAULTS

Said HAMDIOUI (Delft University of Technology) 

This tutorial discusses a new test approach called Device-Aware Test (DAT) and applies it to two industrial memory designs: STT-MRAMs and RRAMs. DAT goes beyond Cell-Aware Test; it does not assume that a defect in a device can be modeled electrically as a linear resistor (as the state-of-the-art approach suggests), but rather incorporates the impact of the physical defect into the technology parameters of the device and thereafter into its electrical parameters. Once the defective electrical model is defined, a systematic fault analysis is performed to derive appropriate fault models and subsequently test solutions. The tutorial discusses the testing of interconnect and contact defects as well as unique device defects in STT-MRAMs/RRAMs. Unique defects are BEOL (back-end-of-line) manufacturing defects that emerge due to the additional processing steps needed for the integration of STT-MRAMs/RRAMs. Examples of STT-MRAM unique defects are pinhole and synthetic anti-ferrimagnet flip, and examples of RRAM unique defects are forming and ion depletion defects. Industrial case studies for STT-MRAM and RRAM show that DAT sensitizes realistic faults as well as new unique defects and faults that can never be caught with the traditional approaches.

Tutorial 4:

RANDOM PROCESS VARIATIONS, CIRCUIT TIMING MARGINALITIES AND SILENT DATA ERRORS

Adit SINGH (Auburn University) 

New types of failures that escape traditional scan DFT tests are being increasingly observed in SOCs. For example, recent presentations (since 2021) from Google and Facebook (Meta) have reported significant levels of silent data corruption in their large data centers. These occasional transient failures have been associated with specific processor cores in these large processor networks, suggesting faulty or unstable hardware from test escapes rather than failures from random environmental noise. The inability of scan structural tests to detect these failures has resulted in the introduction of an entirely new functional system-level test (SLT) over the past few years, to serve as an additional final defect screen in manufacturing test flows. However, even these expensive functional tests allow significant test escapes that cause malfunction in operation. We explain why timing marginalities resulting from manufacturing process variations, greatly accentuated in low-voltage operation, are the likely cause of much of the SLT fallout. Furthermore, these failures can even escape detection by SLT and can cause many of the silent data errors reported by data centers. To explain this, we review scan DFT tests in depth, including recent advances such as cell-aware tests, path delay tests, and timing-aware tests. This helps us understand why scan tests are unable to reliably detect timing errors from process variations. Finally, we present and explain research, validated on published volume production test data from Intel’s advanced 14nm FinFET technology, which suggests ways of leveraging the voltage and timing of the applied timing tests to enhance the detection of marginal timing parts during scan and system-level testing. The goal is to reliably screen out these marginal parts during post-production testing and thereby prevent them from causing errors in operation.

Tutorial 5:

MIXED-SIGNAL DFT CHALLENGES AND SOLUTIONS

Stephen SUNTER (Siemens EDA) 

This tutorial explores systematic analog and mixed-signal design-for-test, including analog fault/defect simulation. We review widely used basic DfT techniques, fault simulation, IEEE 1149.1/4/6/7, 1687, and ISO 26262 metrics, then BIST for ADC/DAC, PLL, SerDes/DDR, and random analog. Essential principles of practical analog BIST are presented, followed by practical DfT techniques, from quicker analog defect simulation to DfT that focuses on simplicity, diagnosis, reuse, and automation. We conclude with a detailed summary of the Analog Defect Coverage and Analog Test Access standards (IEEE P1687.2, P2427), as they approach completion thanks to the effort of dozens of people over many years.

Tutorial 6:

HIERARCHICAL AND TILE BASED DFT TECHNIQUES FOR AI AND LARGE SOC

Lee HARRISON (Siemens EDA) 

In this tutorial, we will give an overview of the exciting field of AI and HPC, covering the critical and special characteristics and the architecture of popular AI chips. Next, we will summarize the features of AI chips from a design-for-test (DFT) perspective and introduce the DFT technologies that can help in testing AI chips. We will also look at how the shift to 2.5D and 3D, including chiplet development, is changing the industry and adding new challenges for the DFT community. We will also present some of the functional monitoring techniques that are available today, along with an overall architecture showing how functional monitoring can be implemented and how the monitor data can be used to manage in-life capabilities. Finally, we will present a few case studies on how DFT is implemented in real AI chips.

Tutorial 7:

SILICON LIFE CYCLE MANAGEMENT FOR EMERGING SOCS

Jyotika ATHAVALE, Yervant ZORIAN (Synopsys) 

Recent advances in automotive SOCs, artificial intelligence accelerators, and high-performance computing engines in data centers have led to an explosion in the adoption of emerging technology nodes and 3DIC/chiplet packages. This tutorial will present today’s trends and discuss the resiliency challenges for such emerging SOCs. It will focus on optimizing SOC health using advanced test, measurement, and analytics solutions, such as on-chip structural sensors, functional monitors, environmental sensors, and embedded test and repair engines. These are typically used to manage the different silicon lifecycle stages: silicon debug in the early bring-up stage, to shorten time-to-volume; self-test and repair during the volume production stage, to improve quality and yield; power-on self-test in the field, to address aging challenges; periodic in-system checking, to improve functional safety; and finally fault tolerance and error correction during mission mode, to address a range of transient errors. All of the above optimizations are enabled by on-chip and/or off-chip data analytics.

Tutorial 8:

FUNCTIONAL SAFETY READINESS: REQUIREMENTS IN DESIGN, TEST AND APPLICATION

Rubin PAREKHJI, Prasanth VISWANATHAN PILLAI (Texas Instruments) 

This tutorial covers (and in the process demystifies) four aspects of semiconductor functional safety: (a) how the well-known metrics for ASIL (automotive) and SIL (industrial) classification are set; (b) how these requirements drive the selection of the right set of detection and diagnostic mechanisms; (c) how conformance to these classification levels is assessed; and (d) how fault spaces and coverage numbers are apportioned to different IC building blocks and higher-level compositions at the system level. Industry examples highlighting how these requirements in design, test, and application can drive readiness for functional safety will be discussed.

Tutorial 9:

DOMAIN-SPECIFIC MACHINE LEARNING IN SEMICONDUCTOR TEST

Li-C. WANG (UC Santa Barbara) 

The emergence of large language models (LMs) has significantly changed our view of applying Machine Learning (ML) in semiconductor test. Recent LMs include Codex, which focuses on code generation, and InstructGPT, which captures user intent. Their successor, ChatGPT, has demonstrated remarkable performance in engaging in dialog on a wide variety of topics, answering questions, and generating code. In light of these recent LM developments, this tutorial provides an integrated view of Domain-Specific Machine Learning (DSML) in semiconductor test. This view calls for an end-to-end AI solution to realize DSML. In our domain, DSML is applied in an iterative exploration process in which an engineer learns knowledge from data. In this iterative process, ML is applied to help the engineer move from one iteration to the next. To illustrate this DSML view, we will discuss common test data analytics practices, including outlier analysis, wafer map pattern recognition, yield optimization, and cross-insertion predictive analysis, and explain their challenges and current solutions. We will then discuss the latest LM technologies and how they fit into our DSML view to build an end-to-end AI solution. Industrial case studies will be provided to illustrate the concepts taught in this tutorial.

Tutorial 10:

AUTOMOTIVE SAFETY, RELIABILITY, AND TEST SOLUTIONS

Jyotika ATHAVALE, Yervant ZORIAN (Synopsys) 

With the fast-growing adoption of advanced technology nodes for automotive chips, this tutorial will discuss the implications of automotive quality, functional safety, and reliability for all aspects of the automotive SOC lifecycle, while accelerating time to market for these semiconductor ICs. The automotive SOC lifecycle stages will include design, silicon bring-up, volume production, and particularly in-system operation. Today’s automotive safety-critical chips need multiple in-system modes, such as power-on and power-off self-test and repair (key-on/key-off), periodic in-field self-test during mission mode, advanced error correction solutions, etc. This tutorial will analyze these specific in-system test modes and discuss the benefits of using ISO 26262, including its second edition, and several newer standardization efforts, in order to ensure that standardized functional safety requirements are met.

Tutorial 11:

TESTING AND MONITORING OF DIE-TO-DIE INTERCONNECTS IN A 2.5D/3D IC

Shi-Yu HUANG (National Tsing-Hua University) 

With the evolution of multi-die integration into the era of interposer- or InFO-based 2.5D ICs and/or TSV-based 3D stacked ICs, die-to-die interconnects can operate at very high speed, with an end-to-end delay of only a few hundred picoseconds. Parametric defects (such as small delay faults, resistive open/bridging faults, leakage faults, etc.) have been identified as potential threats to the yield and reliability of a 2.5D/3D IC product. Fortunately, various test and online monitoring methods have been developed to deal with this challenge and to guarantee the overall quality of the die-to-die interconnects in a 2.5D/3D IC product.

Tutorial 12:

FUNCTIONAL TESTING TECHNIQUES

Paolo BERNARDI (Politecnico di Torino) 

Since the inception of IC design in the mid-1960s, IC test has been an integral part of the manufacturing process. Initially, tests were Functional in nature, either randomly generated or created from verification suites. But as chips got larger, testing required a more targeted approach, one that could be easily replicated from one design to another. This led to the invention of Structural methods like scan, which made designs combinational and simplified the test generation process. After almost 50 years, the testing scenario has evolved only slightly, following technology trends currently led by the complexity of the circuits under test and the field of use (e.g., Automotive). Structural methods are still dominant, at least during the manufacturing test process, but Functional techniques are now recognized to be: (i) Useful to complement structural techniques during the manufacturing test process, as in System Level Test. (ii) Able to mitigate thermal issues that may originate during stress phases such as Burn-In, thus enabling test data collection during this phase. (iii) Very helpful throughout the useful life of the components in the mission field, to run a non-destructive self-test and to capture and store information, opening possibilities for Silicon Lifetime Management (SLM). The talk will provide basic and practical information about some currently relevant functional techniques in the field of Software-Based Self-Test (SBST), Burn-In Functional Stress/Test (TDBI), and System-Level Test (SLT). Automotive chip case studies from STMicroelectronics will be illustrated.