Semiconductor Test Newsletter
Test Times
Approaching "Device-Limited" Test Time
by Gregory Dionne, Teradyne, Inc.
Test engineers have continually looked for ways to eliminate the overhead associated with automatic test equipment (ATE). Reducing set-up time, optimizing test sequences, and streamlining data processing have always been critical factors in reaching maximum throughput. The ideal would be “device-limited” testing – taking no more time to verify the functionality of a device than it takes for the device to perform its operation. Now, the modern distributed computing mixed-signal test architecture is getting closer to the ideal.
The fundamental goal of an ATE test program is to identify as many functional parts, and provide as much characterization data, as possible. In production testing, throughput becomes a critical element. Although a device cannot be declared “functional” unless its full set of specifications is verified, the speed at which verification is performed will impact test cost and, ultimately, the productivity of the manufacturing process.
The computation hardware of ATE systems has become increasingly complex in response to the greater integration of device components. Today’s system-on-a-chip (SoC) devices combine multiple functional blocks, or cores, on a single chip. A core could be a converter core, a controller, a memory core, etc.

Figure 1. A typical system-on-a-chip (SoC) device with multiple functional cores.
Comprehensive functional testing of these SoC devices requires test programs that cover complex test parameters, including multiple test frequencies and precise signal timing, to verify the operation of the individual cores and of the overall device. These requirements have driven the creation of a distributed, multi-tiered system that optimizes data transfer and computation. To put today’s architecture into perspective, it’s helpful to understand the evolution of ATE processing architecture.
First Generation Traditional Test Architectures
First-generation test architecture involves a single computer that controls all instrumentation over a single bus. The computer performs tests in a sequential fashion where each test involves instrument setup, device output capture, data processing, and comparing the computed specifications against a set of defined test requirements. The computer then continues with the second, third, and subsequent tests until the program is complete and the device is binned.
With the system performing only one test at a time, all other instrumentation not involved with a test is left idle. Processing and data recording times greatly impact performance, since they cannot take place in parallel with device set-up.
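The cost of running every phase back to back can be illustrated with a small timing model. This is only a sketch under stated assumptions: the `Test` record, its field names, and the millisecond figures are hypothetical, and the capture time is taken as a stand-in for the time the device itself needs to perform its operation.

```python
from dataclasses import dataclass


@dataclass
class Test:
    """Hypothetical per-test timing breakdown, in milliseconds."""
    name: str
    setup_ms: float     # instrument setup
    capture_ms: float   # device output capture (assumed device-limited portion)
    process_ms: float   # data processing and limit comparison


def sequential_test_time(tests):
    """Total time when every phase of every test runs one after another."""
    return sum(t.setup_ms + t.capture_ms + t.process_ms for t in tests)


def device_limited_time(tests):
    """Time the device alone would need (assumed equal to capture time)."""
    return sum(t.capture_ms for t in tests)


# Illustrative numbers only, not measurements from any tester.
tests = [
    Test("dc_levels", setup_ms=2.0, capture_ms=1.0, process_ms=0.5),
    Test("fft_snr", setup_ms=3.0, capture_ms=8.0, process_ms=12.0),
]

total_ms = sequential_test_time(tests)
ideal_ms = device_limited_time(tests)
overhead_ms = total_ms - ideal_ms
print(f"total={total_ms} ms, device-limited={ideal_ms} ms, overhead={overhead_ms} ms")
```

In this made-up example, roughly two thirds of the program time is tester overhead rather than device operation, which is the gap the later architectures try to close.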
In production operations on first-generation architectures, test programs are structured to obtain the greatest throughput. Device specifications with high failure rates are typically placed at the start of the program. If a device fails, it can quickly be binned out without expending time to perform the remaining tests.
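The benefit of placing high-failure tests first can be estimated with an expected-time calculation. The sketch below assumes that test failures are independent and that the program stops at the first failing test; the test names, times, and failure probabilities are invented for illustration. Sorting by failure probability per unit of test time is a common heuristic under those assumptions, not a rule stated in this article.

```python
def expected_time(tests):
    """Expected program time when testing stops at the first failure.

    tests: list of (name, time_ms, fail_probability), in execution order.
    Assumes failures are independent of one another.
    """
    expected = 0.0
    p_reach = 1.0  # probability that the device gets this far
    for _name, time_ms, p_fail in tests:
        expected += p_reach * time_ms
        p_reach *= 1.0 - p_fail
    return expected


def order_for_early_binout(tests):
    """Put tests with the highest failure probability per millisecond first."""
    return sorted(tests, key=lambda t: t[2] / t[1], reverse=True)


# Illustrative numbers only.
program = [
    ("gain", 10.0, 0.01),
    ("leakage", 10.0, 0.20),
]

reordered = order_for_early_binout(program)
print(expected_time(program))    # original order
print(expected_time(reordered))  # high-failure test first
```

With these numbers the expected time drops from about 19.9 ms to 18 ms per device, simply because one device in five is binned out after the first test.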
With the traditional test system architecture, the key non-instrument metrics used to evaluate throughput are the speed of the computer and the speed of the communication bus. Operational improvements in these first-generation systems centered on increasing processing speed and data movement.
Moving to Multi-Site Test
The primary difference between first-generation and second-generation test platforms is the ability to perform multi-site test. Unlike first-generation architectures (which sometimes alternate between sites to eliminate handler index times), second-generation test hardware makes more efficient use of the central bus, and can broadcast instructions to the instruments corresponding to each test site in a mostly parallel fashion. Thus, less instrumentation is left idle at any given time. Users can develop and verify test programs with one site, then quickly enable other sites.
The performance of data-intensive single-bus, single-test computer architectures is still largely determined by the speed of data movement and processing. Some single-bus systems attempt to mitigate the data-intensive activity by detecting and retrieving results during known periods where the test computer is idle (for example, large data capture sets or long functional pattern bursts).
Figure 2. The test flow with a single-bus processing architecture.
The key metrics for second-generation test systems are split equally between the speed of the bus and the speed of the processor. Applications tend to be rated on parallel efficiency metrics, which are generally good when data movement and data processing are limited.
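The article does not define its parallel efficiency metric. One commonly used definition compares the test time added per extra site with the single-site test time, and the sketch below assumes that definition. The function name and the sample times are hypothetical.

```python
def parallel_efficiency(t_single_ms, t_multi_ms, n_sites):
    """Multi-site parallel efficiency under one commonly used definition.

    1.0 means extra sites add no test time; 0.0 means each extra site
    costs as much as a full single-site test (fully serial behavior).
    """
    if n_sites < 2:
        raise ValueError("parallel efficiency needs at least two sites")
    added_per_site_ms = (t_multi_ms - t_single_ms) / (n_sites - 1)
    return 1.0 - added_per_site_ms / t_single_ms


# Illustrative numbers: 100 ms single-site, 130 ms for four sites.
efficiency = parallel_efficiency(100.0, 130.0, 4)
print(f"{efficiency:.0%}")
```

Here each of the three additional sites adds 10 ms, a tenth of the single-site time, so the efficiency is 90 percent. Serial data movement and processing are what push that per-site increment up.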
Moving Data Processing Into the Background
The next major improvement in multi-site test was a Teradyne-patented approach (see Footnote) that added processors alongside the test computer. Data movement from instrumentation was still initiated and monitored by the test computer, but for the first time, data processing could be run in the background.
A multi-site background processing test architecture makes it less critical to perform high failure tests at the start of the test program, since quick identification of a failing part rarely reduces the time required to verify the remaining sites. The test system still writes to all sites in parallel, so there is significant complexity in interrupting the test flow to replace a failing part at a site. Instead, with DSP now pipelined, test-programming practices evolved to facilitate the new functionality. For maximum throughput, DSP-intensive tests are started before other tests that do not use DSP or are not DSP-intensive. This helps to ensure that the DSP hardware is not idle during the non-DSP testing periods. Binning is delayed until later in the test program.
Figure 3. The single-bus architecture with Background DSP.
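The effect of starting DSP-intensive tests first can be shown with a simplified two-resource timeline. The model is an assumption made for illustration: the test computer handles setup, capture, and data movement in the foreground, one background processor handles DSP jobs in first-in, first-out order, and all times are invented.

```python
def program_time(tests):
    """Finish time for a program with foreground work and background DSP.

    tests: list of (foreground_ms, dsp_ms) in execution order.
    A test's DSP job is queued when its foreground work ends and runs on a
    single background processor (a simplifying assumption).
    """
    fg_clock = 0.0      # when the test computer is next free
    dsp_free_at = 0.0   # when the background processor is next free
    for foreground_ms, dsp_ms in tests:
        fg_clock += foreground_ms
        if dsp_ms > 0:
            dsp_start = max(fg_clock, dsp_free_at)
            dsp_free_at = dsp_start + dsp_ms
    # Binning has to wait for both the foreground flow and the last DSP job.
    return max(fg_clock, dsp_free_at)


# Illustrative numbers: one DSP-heavy test and two tests without DSP.
fft_test = (5.0, 20.0)
dc_test = (10.0, 0.0)

dsp_first = program_time([fft_test, dc_test, dc_test])
dsp_last = program_time([dc_test, dc_test, fft_test])
print(dsp_first, dsp_last)
```

With the DSP-heavy test first, its 20 ms of processing is hidden behind the two non-DSP tests and the program ends at 25 ms. Run last, the same processing is fully exposed and the program ends at 45 ms.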
Key metrics for this third generation of systems are primarily the speed of the bus, followed by the speed of the processors. All but the most data-driven applications tend to be rated well on parallel efficiency metrics.
Creating True Multi-site Parallel Test
The next improvements to test architecture center on eliminating the serial move of captured device data to enable true parallel multi-site test. Two approaches are possible: placing DSP on board the instruments eliminates the data movement entirely, while adding data-buses streamlines the data flow.
Figure 4. The multiple data-bus architecture.
However, dedicated on-board processors are practical only when performing fixed-time filtering, data conversion or scaling operations, and “state-of-the-art” processors quickly lose their edge in the ever-advancing marketplace (requiring upgrades to stay current). Newer processors tend to be incompatible with earlier designs, greatly increasing engineering risk when upgrades are performed.
Predictably, move times became the bottleneck for these fourth-generation systems. Thus, instruments were upgraded to pipeline both the movement of samples and the data computation instructions.
Figure 5. Pipelined move and process.
These instruments used either a partitioning scheme or a circular memory structure to allow the instrument to capture test output while data movement is taking place. This marked the beginnings of true “device-limited” testing. The key metrics for fourth-generation systems center on the speed and number of independent data-buses. Processing-intensive applications benefit more from improvements in processing speed.
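How a partitioned capture memory overlaps capture with data movement can be sketched with a simple timeline. The model below is an assumption for illustration, not a description of any particular instrument: captures run one at a time, moves run one at a time on a single bus, and a capture can start only when one of the memory partitions has already been emptied. All times are invented.

```python
def serial_time(n, capture_ms, move_ms):
    """Every capture is followed by its move before the next capture starts."""
    return n * (capture_ms + move_ms)


def pipelined_time(n, capture_ms, move_ms, buffers=2):
    """Finish time when captures and moves overlap using `buffers` partitions."""
    capture_done = []
    move_done = []
    for k in range(n):
        # Wait for the previous capture to finish.
        start = capture_done[k - 1] if k >= 1 else 0.0
        # Wait until the partition being reused has been moved out.
        if k >= buffers:
            start = max(start, move_done[k - buffers])
        done = start + capture_ms
        # The move waits for its own capture and for the previous move.
        move_start = max(done, move_done[k - 1] if k >= 1 else 0.0)
        capture_done.append(done)
        move_done.append(move_start + move_ms)
    return move_done[-1] if n else 0.0


# Illustrative numbers: four captures of 10 ms, each needing a 4 ms move.
print(serial_time(4, 10.0, 4.0))     # no overlap
print(pipelined_time(4, 10.0, 4.0))  # two partitions
```

In this example the pipelined flow finishes in 44 ms against 56 ms for the serial flow, leaving only the final move exposed beyond the 40 ms the device needs for its captures. If moves take longer than captures, the same model shows the bus becoming the limit instead.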
Moving to “Device-Limited” Test Time
Now, we move into the fifth generation of DSP test architecture, implemented in Teradyne’s FLEX semiconductor test system designed for high-speed, multi-site test of consumer, automotive, wireless and computer device applications. The system architecture (without tester computer intervention) automatically moves data from tester instrumentation, allowing for full-background move and processing. With double-buffered capture memory, instruments can start capture on the next test without involving the system computer, substantially reducing test time overhead. Instrument-initiated moves also provide a way for digital pattern control of instrumentation, independent of the test computer. This makes it possible for separate functional components integrated on the same device chip to be tested independently of one another. The resulting test time then becomes theoretically equivalent to that of testing the individual core with the longest test time.
Figure 6. Digital pattern control.
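The claim that test time approaches that of the slowest core can be written as a one-line comparison. The core names and times below are hypothetical, and the sketch assumes the cores share no instruments and can be tested fully independently.

```python
def serial_core_time(core_times_ms):
    """Cores tested one after another."""
    return sum(core_times_ms.values())


def concurrent_core_time(core_times_ms):
    """Cores tested independently and at the same time (theoretical limit)."""
    return max(core_times_ms.values())


# Illustrative numbers only.
cores = {"converter": 40.0, "memory": 25.0, "controller": 15.0}

print(serial_core_time(cores))      # 80.0
print(concurrent_core_time(cores))  # 40.0
```

Any instrument or bus shared between cores would move the real figure somewhere between these two values.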
Key metrics used to evaluate this new generation of systems involve the speed and number of data-buses and, for DSP-intensive applications, the speed and number of processors. The objective is to get all of the DSP into the background and make it transparent so we can approach “device-limited” testing speed.
Combined with architectural features for synchronizing multiple time domains, independent instrument clocking, and high-density fully integrated instrumentation, this fifth-generation approach to distributed DSP provides the vehicle to approach the ideal of “device-limited” test time.
Footnote:
Proskauer, et al. “Apparatus and method for performing digital signal processing in an electronic circuit”, United States Patent #5,673,272. September 30, 1997. Gamache, Richard E.
© 2004 Teradyne, Inc. All rights reserved. Republication or redistribution of Teradyne content is expressly prohibited without the prior written consent of Teradyne.