## Efficient Gbit/s Data Transceivers Designed for Verification and SoC Integration

Dr. William Ellersick Analog Circuit Works, Inc. Sudbury, MA Bill.ellersick@AnalogCircuitWorks.com

## I. INTRODUCTION

High performance computing systems generate extreme amounts of data that must be reliably transported between chips and modules. While serializer/deserializer (SERDES) research, including that of the author, continues to demonstrate extraordinary data rates per channel, practical interfaces do not approach these limits, but rather find the sweet spot where circuit complexity, power and area start to increase faster than the data rate. System designers then take advantage of advances in packaging that allow highly parallel optical or electrical links, where power and silicon area per Gbit/s can be more important than the data rate per link.

## II. DISCUSSION

This paper describes a transceiver design that has evolved through the implementation and testing of a number of massively parallel communication and imaging systems, with a focus on optimizing system cost, robustness, and time-to-market. Based on circuits developed in 2000 for a 16 Gbit/s link that required 69 mW/Gbit/s and 3 mm^2 due to the complex equalization needed [1], performance has scaled with process technology to links with simpler equalization that only require 2 mW/Gbit/s and fit under the SERDES pads. As shown in Fig. 1, the links are constructed from mixed-signal building blocks including clock generation based on a phase-locked loop (PLL), transmitters, receivers, and timing recovery, as well as synthesized logic blocks for coding and upper protocol layers.

Embedded-clock SERDES allow high data rates with minimal signal wires. Low-overhead techniques are used to guarantee enough data transitions for clock recovery, preventing a long series of zeros from causing bit boundaries to slip. Self-synchronizing scramblers are a preferred technique, providing DC balance and reducing the probability of a long sequence without data transitions to 1e-12 or lower, without adding overhead. Coding schemes such as 8B10B or 32B34B are also applicable, ensuring adequate data transitions and DC balance at the cost of adding overhead and requiring more complex clock dividers and synchronization memories.
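As an illustration of the self-synchronizing scrambler idea, the following Python sketch assumes the 1 + x^39 + x^58 polynomial used by 10GbE 64b/66b coding. The paper does not state which polynomial its links use, and the function names are hypothetical.

```python
from collections import deque

# Assumption: taps of the 10GbE 64b/66b polynomial 1 + x^39 + x^58.
# The paper does not specify the polynomial used in its links.
ORDER = 58
TAP_A, TAP_B = 39, 58


def scramble(bits, state=None):
    """Multiplicative scrambler: each output bit is fed back into the history."""
    # hist[0] is the most recent scrambled bit, hist[57] the oldest.
    hist = deque(state if state is not None else [0] * ORDER, maxlen=ORDER)
    out = []
    for b in bits:
        s = b ^ hist[TAP_A - 1] ^ hist[TAP_B - 1]
        out.append(s)
        hist.appendleft(s)  # oldest bit falls off the right end
    return out


def descramble(bits, state=None):
    """Descrambler: the history holds received bits, so no seed exchange is needed."""
    hist = deque(state if state is not None else [0] * ORDER, maxlen=ORDER)
    out = []
    for s in bits:
        out.append(s ^ hist[TAP_A - 1] ^ hist[TAP_B - 1])
        hist.appendleft(s)
    return out


if __name__ == "__main__":
    payload = [0] * 32 + [1] * 32
    line_bits = scramble(payload, state=[1] + [0] * (ORDER - 1))
    # The receiver starts from an arbitrary state and resynchronizes by itself
    # once 58 line bits have shifted through its history.
    recovered = descramble(line_bits, state=[0] * ORDER)
    print(recovered[ORDER:] == payload[ORDER:])
```

Because the descrambler history is filled from the received line bits, a receiver that starts in the wrong state recovers after 58 bits in this sketch. Note that the protection against long runs is statistical: an all-zero payload with an all-zero scrambler state still produces all zeros.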

Clock distribution is a key concern, with the critical issues encapsulated in and simplified by the SERDES architecture. Digital clock recovery is preferred for predictable, scalable performance, using robust phase synthesizers to generate adjustable phase clocks. Digital state machines align the clocks to received serial data, compensating for phase and small frequency differences. This approach allows a single PLL to generate the clocks needed for multiple SERDES, while duty cycle correcting block buffers allow high speed clocks to be driven several mm without degradation. The PLL also filters out high frequency jitter, reducing requirements for shielding of the reference clock. An extra stage of retiming flip-flops in the SERDES provides setup and hold times comparable to standard flip-flops while supporting data transfers to on-chip logic at system clock rates, often without requiring synchronization memories (FIFOs).

Figure 1
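The digital clock recovery loop described above can be sketched behaviorally as a bang-bang (early/late) phase tracker that steps a phase synthesizer code. This is an illustration under stated assumptions, not the author's circuit: the 64-step phase resolution, the vote threshold, and all names are hypothetical.

```python
import random

PHASE_STEPS = 64  # assumed phase synthesizer resolution, in steps per unit interval (UI)


def wrap(phase_ui):
    """Wrap a phase expressed in UI into the range [-0.5, 0.5)."""
    return (phase_ui + 0.5) % 1.0 - 0.5


def track_phase(n_bits, start_offset_ui, freq_offset_ppm, vote_threshold=4, seed=1):
    """Return the per-bit phase error (in UI) between data edges and the recovered clock."""
    rng = random.Random(seed)
    data_phase = start_offset_ui  # phase of the incoming data edges, in UI
    code = 0                      # phase synthesizer code, 0..PHASE_STEPS-1
    votes = 0                     # simple digital loop filter (up/down counter)
    prev_bit = 0
    errors = []
    for _ in range(n_bits):
        # A small frequency difference shows up as a slow phase drift.
        data_phase += freq_offset_ppm * 1e-6
        bit = rng.getrandbits(1)
        if bit != prev_bit:
            # Phase information is only available on data transitions.
            err = wrap(data_phase - code / PHASE_STEPS)
            votes += 1 if err > 0 else -1
            if abs(votes) >= vote_threshold:
                code = (code + (1 if votes > 0 else -1)) % PHASE_STEPS
                votes = 0
        prev_bit = bit
        errors.append(wrap(data_phase - code / PHASE_STEPS))
    return errors


if __name__ == "__main__":
    errs = track_phase(n_bits=20000, start_offset_ui=0.3, freq_offset_ppm=100.0)
    print("initial error (UI):", errs[0])
    print("worst error over last 1000 bits (UI):", max(abs(e) for e in errs[-1000:]))
```

Because the phase code wraps modulo one UI, the same loop absorbs a constant phase offset and a small frequency offset, which is why a single shared PLL can serve several receivers in this kind of architecture.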

Conformance to SERDES standards generally requires changes only to upper protocol layers. One configurable PHY design can support a range of standards, such as JESD, PCIe, XAUI, Thunderbolt, 10GbE and USB3.0. Upper protocol layers are available as synthesizable (soft) macros to communicate over standard interfaces. Alternatively, for internal system links, a lean protocol stack is all that is required, providing simple coding or scrambling to provide framing and edges for clock recovery.

Linear equalization is used as shown in Fig. 2, providing eye openings that can be verified with straightforward simulations and measurements, at the expense of slightly larger signal swings to compensate for reduced signal-to-noise ratio (SNR) compared to complex and power-hungry decision feedback equalization (DFE). Equalization coefficients are fixed or optimized during system characterization, and may be adapted during use at the expense of additional synthesized logic complexity including support for link performance messages to guide the adaptation.

Figure 2:
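The trade-off between eye opening and signal swing under linear equalization can be illustrated with a small worked example. The channel pulse response and the two-tap equalizer below are made-up values, and placing the equalizer at the transmitter with a fixed peak swing is an assumption; none of these come from the paper.

```python
def convolve(a, b):
    """Discrete convolution of two tap lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def worst_case_eye(pulse):
    """Worst-case eye opening for +/-1 signaling: main cursor minus the sum of |ISI| taps."""
    magnitudes = [abs(p) for p in pulse]
    main = max(magnitudes)
    return main - (sum(magnitudes) - main)


def normalize_peak(taps):
    """Scale taps so the peak output swing stays at 1 (sum of |taps| equals 1)."""
    total = sum(abs(t) for t in taps)
    return [t / total for t in taps]


# Hypothetical values for illustration only.
CHANNEL = [1.0, 0.6, 0.3]          # sampled pulse response with two post-cursor taps
EQUALIZER = normalize_peak([1.0, -0.6])  # two-tap linear (FIR) equalizer

if __name__ == "__main__":
    raw_eye = worst_case_eye(CHANNEL)
    equalized = convolve(EQUALIZER, CHANNEL)
    print("eye without equalization:", raw_eye)
    print("equalized pulse response:", equalized)
    print("eye with equalization:", worst_case_eye(equalized))
```

In this sketch the equalizer cancels the first post-cursor tap, so the worst-case eye opens up even though the main cursor shrinks under the fixed peak swing. That shrinkage is the amplitude cost that a slightly larger signal swing would compensate.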



Real value behavioral models of the data transceivers enable event-driven simulation, which allows digital simulations with large logic blocks to ensure proper control configuration and signal port connection. Portable syntax used in the behavioral model allows the models to be verified in analog (SPICE-based) simulators as shown in Fig. 3, ensuring accurate correspondence between the models and the transistor-level serial interface designs [2]. Conforming to the Universal Verification Methodology (UVM), the real value behavioral models support high levels of verification productivity and coverage, and are proven against the transistor-level designs.
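The idea of checking an event-driven real-value model against a finer-grained reference can be sketched as follows. This is only an analogy in Python: a first-order RC low-pass stands in for a transceiver block, a fixed-step Euler integration stands in for the analog simulator, and all names and values are hypothetical.

```python
import math


def event_driven_rc(events, tau_ps, sample_times_ps):
    """Real-value model: update the output only at input events, using the exact
    exponential solution of a first-order low-pass between events."""
    out = []
    v, vin, t_last = 0.0, 0.0, 0
    i = 0
    for ts in sample_times_ps:
        while i < len(events) and events[i][0] <= ts:
            t, new_in = events[i]
            v = vin + (v - vin) * math.exp(-(t - t_last) / tau_ps)
            vin, t_last = new_in, t
            i += 1
        out.append(vin + (v - vin) * math.exp(-(ts - t_last) / tau_ps))
    return out


def fine_step_rc(events, tau_ps, sample_times_ps, steps_per_ps=100):
    """Reference model: forward-Euler integration with a small fixed time step."""
    changes = {t * steps_per_ps: val for t, val in events}
    samples = {t * steps_per_ps for t in sample_times_ps}
    h = 1.0 / steps_per_ps
    v, vin = 0.0, 0.0
    out = []
    for k in range(max(samples) + 1):
        if k in samples:
            out.append(v)           # record the output before taking the next step
        vin = changes.get(k, vin)   # apply an input change scheduled at this tick
        v += h * (vin - v) / tau_ps
    return out


if __name__ == "__main__":
    # Input edges as (time in ps, new input level), sorted by time.
    events = [(0, 1.0), (100, 0.0), (300, 1.0), (400, 0.0)]
    samples = [50, 150, 250, 350, 450]
    fast = event_driven_rc(events, 40.0, samples)
    ref = fine_step_rc(events, 40.0, samples)
    print("largest mismatch:", max(abs(a - b) for a, b in zip(fast, ref)))
```

The event-driven model evaluates a handful of expressions where the reference takes tens of thousands of steps, which mirrors why real-value models make simulation with large logic blocks practical once their correspondence to the detailed design has been checked.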

Guidelines for chip and board layout ensure successful integration in large integrated circuits. Identification of victim and aggressor signals informs floorplanning and routing. Placement of SERDES macros near power and data bond pads, with low inductance paths to board supply decoupling capacitors, maintains good supply noise rejection. Shielding or twisted differential signal wires minimizes crosstalk to maintain healthy design margins. The review of these measures by experienced SERDES designers is a critical step.

## III. CONCLUSION

First-silicon success is paramount in large systems-on-chips where digital design effort, power and area often dwarf that of the data transceivers. By focusing on simplicity, optimizing the serial interface design for the desired process and application, and reusing and porting the building blocks, cost and risk are minimized, and time-to-market is decreased. This approach is particularly effective for advanced or "boutique" processes where proven SERDES blocks are not available. The SERDES architecture described allows high performance, complex systems to be designed with robust and efficient data transceivers. The verification and integration methodology minimizes the risk of SERDES performance issues, and allows the focus to remain on digital processing and system design.

Figure 3:



[1] Ellersick, W., et al., "A Serial-Link Transceiver Based on 8-GSample/s A/D and D/A Converters in 0.25-µm CMOS", International Solid-State Circuits Conference, Feb. 2001, pp. 58-59.

[2] Ellersick, W., "Real Portable Models for System/Verilog/A/AMS", Cadence CDN Live Conference, October 2010, Santa Clara, CA.