Why 8 Transceivers Per Port?

Wednesday October 7, 2020

By Mark Baumann

Director, Product Definition & Applications

MoSys, Inc.

The question can be asked as to why does the MoSys Accelerator Engines have 8 SerDes Transceivers per GCI port?

The fundamental question is what is the bandwidth of the device that is being accessed. In this case, the MoSys Accelerator Engines. The goal of the device is to utilize SerDes lanes to interface to the device, which has the result of minimizing the number of pins. For example, with a standard QDR type memory with LVCMOS interface signaling, it requires more than 100 pins to service the interface. That is a significant number of pins to dedicate to one device and presents a significant board routing challenge. Running large buses at hundreds of Mega-hertz and tight trace to trace skew requirements is time consuming and eats up valuable board space routing.

The strategy that MoSys has followed is to utilize SerDes lanes (that are becoming more abundant on FPGAs and ASICs) which provide high density and high bandwidth while utilizing a minimum number of pins and the routing restrictions on SerDes pins is much less restrictive than that of high speed single ended signaling. In addition, there is a clear roadmap for continued speed advances with SerDes where, in this writers opinion, we are quickly approaching the practical lifespan of single-ended signaling of large buses on PCBs.

Up to this point, we have not answered the question as to why 8 transceivers. To answer that we need to look at the frame or packet of data to be transferred across the link. In the case of the MoSys Accelerator Engines we have defined an 80 bit frame. Within the frame 72 bits are:

Defined to hold data (either pure data or Address and command), as seen in the above figure and the remaining 8 bits (or 10% of the frame) hold the overhead needed to ensure that the frame is transported reliably across the link.

In the realm of SerDes protocols, this is a small overhead and has been defined and designed by MoSys to minimize the burden when dealing with a connection that only needs to be concerned with a point-to-point transfer. When this interface was designed, it was understood that the access was to a memory and that would require high speed and low latency. Since most SerDes protocols are designed for large packet transfers, it can support the larger overhead of many bytes of data per transfer. However, if the goal is to transfer just 1 to 8 words of data at a time, to carry the extra bits or bytes that a packet protocol carries would be impactful on the overall interface performance. This is the reason that a one-word transfer has as few as 8 bits of overhead in the MoSys GCI protocol.

Looking back at the transfer we have now defined that a Frame is 80 bits and high bandwidth is important, because we are dealing with the speed of memory accesses. This can impact the throughput of the whole system if memory is too slow. This has helped define the lanes of SerDes as well. If we look at the slowest offering which is 10Gbps SerDes, when using 8 lanes we can transfer a frame in one nanosecond 80 bits across 8 lanes is 10 bits per lane and at 10Gbps that takes 1 ns. So, if the memory is capable of receiving a new frame every ns, which the MoSys Accelerator Engines are capable of, the throughput will result in 1 Billion access a second at 10Gbps SerDes. This can now scale with the SerDes rate, at 12.5Gbps SerDes the throughput will be 1.25 Billion Accesses a second and so on. MoSys presently offers a device with 25Gbps SerDes that results in 2.5Billion accesses a second per 8 lane port.

A secondary benefit of using SerDes, is that if desired, it is possible to use less SerDes lanes and tune the required throughput. So, if your FPGA only has 4 SerDes lanes available for use it is possible to utilize this and the result will be half the throughput of an 8-lane implementation. With the MoSys Accelerator Engines, each device has two ports (or 16 SerDes lanes) which also allow for double the throughput of an 8 lane implementation.

To summarize, the reason for the 8 lanes is to allow for an efficient method of interfacing with a minimal number of SerDes lanes. It supports very efficient transfer of the 80-bit frames and allows for a performance tuning by utilizing the available SerDes links.

A secondary benefit is the fact that a SerDes roadmap that is active and aggressively being developed will also help drive future improvements in bandwidth when utilizing SerDes as an interface standard.

Additional Resources:

GCI Interface Specification

MoSys Board Design Guidelines

If you are looking for more technical information or need to discuss your technical challenges with an expert, we are happy to help. Email us and we will arrange to have one of our technical specialists speak with you. You can also sign up for updates. Finally, please follow us on social media so we can keep in touch.

Share on social media

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Why 8 Transceivers Per Port?

Other Blog Posts

Utilizing Synchronous Ethernet Timing in 10GbE Breakout from a 100G Multi-Link Gearbox (MLG)

Giving Your FPGA a Big Boost

Not All Memory Chips Are Created Equal

Dense 10GbE Breakout from a 100G (4x25G) Port Using Multi-Link Gearbox

The Benefits of a Dual Port in Data Acquisition

Expand Your Access to High-Speed Memory on an FPGA

Tackling the Test & Measurement Market

How Board Design Can Expedite Your Next Design Project

Why 8 Transceivers Per Port?

Other Blog Posts

CCPA & GDPR website cookie consent