Why Bigger-Wider-Faster (B-W-F) Isn't a Roadmap for Memory
Wednesday, October 28, 2020
By Mark Baumann
Director, Product Definition & Applications
In the years that I have been associated with memory products, the standard response when asked, "What does your memory roadmap look like?" has been Bigger (or Denser), Wider (x8, x16, x32, x64, etc.), and Faster (from tens of MHz I/Os to hundreds of MHz, and now GHz, I/Os). In many ways this seems the logical progression for a system resource you never seem to have enough of.
If a greater amount of memory were the only issue to be addressed, then a roadmap following this B-W-F premise could work. However, other real-world constraints, such as required I/O pins, power, board routing, and board space, are rational considerations, and they have a real impact on addressing system-level concerns.
Let’s look at each one individually:
Bigger – When looking at the density requirements of memory, it seems logical that the more that is available, the better the system can perform. More memory allows for more data storage, bigger tables, and greater tolerance of variations in response across the platform. These are positives, but in some cases more density is simply compensation for the lack of a better-fit product. Take the example of buffer memory: the size of the buffer is driven by how fast the memory can be accessed, so in many cases the density can be reduced if the speed of access is improved.
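The buffer example can be sketched as a simple rate-times-latency estimate. The numbers below are hypothetical, chosen only to illustrate the relationship between access latency and required buffer density:

```python
# Illustrative sketch (assumed numbers): buffer depth scales as
# arrival rate x drain latency, so faster access means smaller buffers.

def required_buffer_bits(arrival_gbps: float, drain_latency_ns: float) -> float:
    """Little's-law style estimate: bits in flight = rate x latency."""
    return arrival_gbps * drain_latency_ns  # Gbps x ns = bits

slow = required_buffer_bits(arrival_gbps=400, drain_latency_ns=200)
fast = required_buffer_bits(arrival_gbps=400, drain_latency_ns=50)
print(slow, fast)  # → 80000 20000: 4x faster drain, 4x smaller buffer
```

At the same 400 Gbps arrival rate, quartering the drain latency quarters the buffer depth, which is the sense in which faster access can substitute for bigger memory.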
It is also very common practice, especially with slower DRAM-type memory, to keep multiple copies of the same table so that a subsequent request can be served even when it would otherwise be blocked by a previous access to the DRAM. This leads to a need for "Bigger" amounts of memory, but only because of the inefficiency of DRAM access. If it were possible to access memory at SRAM rates, with a storage density larger than SRAM but not as large as DRAM, many applications could be satisfied without vast quantities of memory, because efficient access removes the need for multiple copies.
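The replication cost described above can be put into rough numbers. This is a hypothetical sketch, not a description of any particular device; the access rates are assumptions chosen only to show how slow random access inflates the memory footprint:

```python
import math

# Hypothetical sketch: a DRAM bank that has just been accessed is busy for
# its cycle time, so sustaining a high random-lookup rate forces the same
# table to be replicated across banks/devices. Rates below are assumed.

def copies_needed(target_lookups_per_s: float, per_copy_random_rate: float) -> int:
    """Each copy can serve at most per_copy_random_rate lookups per second."""
    return math.ceil(target_lookups_per_s / per_copy_random_rate)

# Assumed: 1 billion lookups/s target, ~20 M truly random accesses/s per copy.
print(copies_needed(1e9, 20e6))  # → 50 copies of the same table
```

If the per-copy random-access rate rises toward SRAM-class speeds, the copy count collapses toward one, which is exactly the density saving the paragraph above describes.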
The MoSys Accelerator engines fit this access and bandwidth criteria, providing more than 5 billion accesses per second and 400 Gbps of full-duplex bandwidth.
Wider – When looking at memories, a common practice has been to increase bandwidth by increasing the number of bits transmitted in each access. This is a workable approach when transferring packets (which can be hundreds of bytes long). However, it does not benefit an application that accesses only a small number of words at a time (perhaps one to eight) and does so in a truly random fashion.
This single-word access case is very common for applications such as search algorithms and table lookups. In these cases, accessing large data arrays is a detriment to performance. Supporting it in an array such as DDR or HBM would require multiple copies of the same data, so that a piece of data can still be accessed randomly when the location holding it was opened by a previous access, is now being closed, or is being accessed by another entity.
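The inefficiency of wide interfaces for small accesses is easy to quantify. A DDR4 x64 interface delivers a fixed 64-byte burst (burst length 8 at 8 bytes per beat); the lookup sizes below are illustrative:

```python
# Sketch: fraction of each fixed-size burst that carries requested data.
# A wide interface pays for the full burst even when only a word is needed.

def burst_efficiency(useful_bytes: int, burst_bytes: int) -> float:
    """Fraction of a fixed-size burst that carries requested data."""
    return useful_bytes / burst_bytes

# A DDR4 x64 burst delivers 64 bytes; a 4-byte table lookup uses 1/16 of it.
print(burst_efficiency(4, 64))   # → 0.0625: ~6% of the bandwidth is useful
print(burst_efficiency(64, 64))  # → 1.0: packet-sized transfers fit well
```

For packet transfers the wide burst is fully used, but for single-word random lookups most of the "Wider" bandwidth is wasted, which is the point of the paragraph above.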
Single-word random access is, again, a use case that calls for special consideration in any memory roadmap that deviates from the B-W-F paradigm.
Faster – It is hard to say anything negative about faster access. As systems get faster and data collection and manipulation become larger and more complex, faster access to data will be needed. Functional checks for correctness and security continue to increase, and data inspection, even deep packet inspection, is becoming a common requirement.
All of this does lead to the need for "faster" data access. However, like any issue, this one has multiple dimensions. In the traditional method, a host's request and the returned data must both traverse traces on a PCB, and as bus widths have grown, this can require a large number of pins on both the host silicon and the memory silicon. Pure memory-access bandwidth over traditional single-ended LVCMOS bus lines has become a measurable system bottleneck: running hundreds of bus lines at close to a GHz and terminating them cleanly remains a PCB challenge. The ideal solution minimizes PCB bus transitions and/or changes the PCB bus structure.
Here again, MoSys has taken a clear-eyed look at the issue and concluded that if a PCB transition is required, an interface protocol built on SerDes is a cleaner, more readily available, and more reliable transport, while maintaining a roadmap for future growth. SerDes is readily available today at rates from 6 Gbps to 56 Gbps per lane, allowing for stable, reliable signal transport.
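The pin-count argument can be made concrete with a back-of-the-envelope comparison. The per-pin and per-lane rates below are assumptions drawn from the ranges mentioned above (roughly 1 Gbps single-ended LVCMOS, 56 Gbps SerDes), and the sketch ignores clocks, strobes, and control pins:

```python
import math

# Hypothetical comparison of pin cost for a 400 Gbps memory interface:
# single-ended LVCMOS data pins vs. differential SerDes lanes (2 pins/lane).

def lvcmos_pins(target_gbps: float, pin_rate_gbps: float = 1.0) -> int:
    """Single-ended bus: one pin per pin_rate_gbps of bandwidth."""
    return math.ceil(target_gbps / pin_rate_gbps)

def serdes_pins(target_gbps: float, lane_rate_gbps: float = 56.0) -> int:
    """Differential SerDes: each lane is a 2-pin pair."""
    return 2 * math.ceil(target_gbps / lane_rate_gbps)

print(lvcmos_pins(400))  # → 400 single-ended pins at 1 Gbps each
print(serdes_pins(400))  # → 16 pins (8 lanes at 56 Gbps)
```

An order-of-magnitude-plus reduction in pins, with far fewer traces to route and terminate, is the practical appeal of the SerDes transport.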
A second approach that MoSys and others have taken is to move some intelligence closer to the memory. In the case of MoSys, we have designed the PHE (Programmable Hyper-speed Engine), a device with a 25 Gbps SerDes interface, a gigabit of onboard memory, and 32 RISC cores, each 8-way threaded. With this amount of local compute power, it is feasible to send data along with a command to execute on the RISC cores, interrogate the data in place, and return only the result of the interrogation. This is the ultimate answer for obtaining FASTER results: chip-to-chip transitions are minimized, compute power sits next to the memory resource, and things can run at on-die speeds. The end result is a system with increased throughput.
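The traffic saving from near-memory compute can be sketched in the same back-of-the-envelope style. The table and message sizes below are hypothetical, and the functions model only bytes crossing the chip-to-chip link, not the actual PHE command set:

```python
# Hypothetical sketch of near-memory compute: instead of pulling a table
# across the chip-to-chip link to search it on the host, send a small
# command and receive only the answer computed by cores next to the memory.

def host_side_search_bytes(table_entries: int, entry_bytes: int) -> int:
    """Host fetches the whole table over the link, then searches locally."""
    return table_entries * entry_bytes

def near_memory_search_bytes(cmd_bytes: int = 16, result_bytes: int = 8) -> int:
    """Only the command and the result cross the link."""
    return cmd_bytes + result_bytes

# Assumed: a 1M-entry table of 16-byte records, one lookup.
print(host_side_search_bytes(1_000_000, 16))  # → 16000000 bytes moved
print(near_memory_search_bytes())             # → 24 bytes moved
```

The link traffic per query drops from megabytes to tens of bytes, which is why minimizing chip-to-chip transitions, rather than simply widening them, raises effective throughput.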
What I hoped to relate through this blog is that Bigger-Wider-Faster is not what a future memory "roadmap" should look like. The concepts behind B-W-F are still valid, but the way to interpret and execute on them will continue to evolve to take advantage of present and future process technology options.
What MoSys wishes to provide is a clean, fresh-eyed look at problems, so that we may provide new, innovative ways to address our customers' issues.
If you are looking for more technical information or need to discuss your technical challenges with an expert, we are happy to help. Email us and we will arrange to have one of our technical specialists speak with you. You can also sign up for updates. Finally, please follow us on social media so we can keep in touch.