There is a New Type of Memory (EFAM) in Town
Part 2 of 2
Monday, April 13, 2020
By Mark Baumann, Director, Product Definition & Applications
MoSys, Inc.
In Part 1 of this blog, we discussed the legacy of memory and how its growth and development tracked closely with Moore’s Law. We also reviewed the differences between DRAM and DRAM-based devices versus MoSys BE devices. In Part 2, we will get into the details of EFAM, Embedded Function Accelerator Memory.
EFAM (Embedded Function Accelerator Memory)
When reviewing the memories commonly used in networking systems, and indeed in many other systems, some portion of the memory needs to provide a high random-access rate and low latency, whether for table lookups, high-frequency trading, or simply to store and forward data at the prevailing line rate. It is also not uncommon for the data to be accessed multiple times so that some form of manipulation can be performed while it is being transported through the system. In these cases the additional functionality supported by the MoSys devices becomes an even greater advantage, as the devices have progressed by adding functionality, or acceleration, on top of the memory. The goal behind these functions is to accelerate operations that are both very common and time intensive. The following functions are presently available in MoSys Accelerator Engine devices:
BURST
The first is the burst operation. Its benefit is straightforward: it allows the user to burst data both into and out of the Accelerator Engines. The function transfers 2, 4, or 8 words (72 bits each) with only one command being issued. In these applications no decisions need to be made on the data; it just needs to be stored and eventually forwarded.
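To make the idea concrete, the hypothetical C sketch below contrasts issuing eight individual word writes with a single burst command. The function names (ae_write_word, ae_burst_write), the bemem_word_t layout, and the addressing are illustrative assumptions for this example, not the actual MoSys API.

```c
#include <stdint.h>

/* One 72-bit Accelerator Engine word, modeled here as 64 data bits
 * plus 8 auxiliary bits (an illustrative layout, not the real format). */
typedef struct {
    uint64_t data;
    uint8_t  aux;
} bemem_word_t;

/* Hypothetical single-word and burst write primitives. */
void ae_write_word(uint32_t addr, const bemem_word_t *w);
void ae_burst_write(uint32_t addr, const bemem_word_t *w, unsigned count); /* count = 2, 4 or 8 */

void store_packet_chunk(uint32_t base, const bemem_word_t chunk[8])
{
    /* Without burst support: eight separate commands on the interface. */
    for (unsigned i = 0; i < 8; i++)
        ae_write_word(base + i, &chunk[i]);

    /* With the burst function: one command moves all eight words,
     * freeing command bandwidth for other accesses. */
    ae_burst_write(base, chunk, 8);
}
```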
R-M-W
Read-Modify-Write. This, too, is a common operation that is very time intensive when executed as its individual sub-operations: a read, a modify, and a write. The Accelerator Engines collapse it into a single command with a “fire-and-forget” construct: the operand is sent along with the data modifier, and the Accelerator Engine performs the operation internally and atomically, ensuring that data will not be corrupted or lost even if multiple operations target the same data location. The available operations are as follows (a conceptual sketch appears after the list):
- Boolean
  - AND, OR, XOR memory location with a 16b or 32b mask
- Arithmetic for statistics functions
  - Add/Subtract 16b or 32b immediate
  - Read and Set 16b, 32b, 64b & double
  - Add 1 to packet count and a 16b or 32b immediate to byte count
  - Decrement Age Count of packet/byte counter pair
- Semaphores and Mutex for real-time multi-thread table updates
  - Test for Zero and Set on 16b, 32b, & 64b
  - Compare and Exchange on 32b
- Specialized counting functions
  - Compute running average of 16b values
  - Compute metering (leaky bucket, 3 color, two rates) for policing and shaping
The above list of operations is designed to support data statistics, metering, and basic decision functions such as aging, all of which take place with a single command being issued.
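As an illustration of why the single-command construct matters, the hedged C sketch below compares a conventional read-modify-write counter update with a fire-and-forget atomic add issued as one command. The ae_read32, ae_write32, and ae_atomic_add32 names are assumptions made for this example and are not the actual device interface.

```c
#include <stdint.h>

/* Hypothetical access primitives (illustrative only). */
uint32_t ae_read32(uint32_t addr);
void     ae_write32(uint32_t addr, uint32_t value);
void     ae_atomic_add32(uint32_t addr, uint32_t operand); /* one fire-and-forget command */

/* Conventional update: two transactions plus host-side arithmetic.
 * If two requesters hit the same counter, an update can be lost. */
void bump_byte_count_conventional(uint32_t counter_addr, uint32_t pkt_bytes)
{
    uint32_t v = ae_read32(counter_addr);   /* read   */
    v += pkt_bytes;                         /* modify */
    ae_write32(counter_addr, v);            /* write  */
}

/* Accelerator Engine style: the operand travels with the command and the
 * add is performed atomically inside the device, so concurrent updates
 * to the same location are not corrupted. */
void bump_byte_count_fire_and_forget(uint32_t counter_addr, uint32_t pkt_bytes)
{
    ae_atomic_add32(counter_addr, pkt_bytes);
}
```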
Programmable Functions
The latest in the family of Accelerator Engines is the Programmable HyperSpeed Engine (PHE). This device makes available the Burst and R-M-W functions of the previous devices and adds 32 processor engines (RISC cores). Each engine can run up to 8 threads, and each thread has access to all of the memory and register resources. This provides a large step up in both performance and functionality, giving the user the capability to design and build a custom algorithm or feature set that runs at an internal core rate of up to 1.5 GHz. The result is the ultimate in flexibility and performance, offering you as a designer a clean slate to differentiate the functionality of your system. The key features are listed below, followed by a conceptual sketch of a per-thread routine.
- Large Capacity Data Structure
  - 128MB + ECC
  - Parallel expansion
- Tightly Coupled Cores & Memory
  - Direct connect via crosspoint switch
  - No cache → no miss variability
- Optimized Instruction Set
  - Hash, Compressed Trie, etc.
  - Packed bit fields
  - 24b x 24b Multiplier
- High Parallelism
  - Up to 8-way threaded cores
  - 8-way threaded SRAM
  - 16-way threaded 1T-SRAM
- Low Latency Memory Access
  - 6ns to 25ns
  - 4x faster than DRAM
  - 2-level hierarchy possible
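As a rough illustration of the kind of per-thread routine that could be mapped onto a PHE RISC core, the C sketch below shows a hash-based table lookup followed by a statistics update issued as a single atomic command. All names (phe_hash32, phe_mem_read64, phe_atomic_add32), the addresses, and the table layout are assumptions made for this example; the real instruction set and memory map are documented by MoSys.

```c
#include <stdint.h>

/* Hypothetical core-local primitives (illustrative only). */
uint32_t phe_hash32(uint64_t key);                     /* hash instruction  */
uint64_t phe_mem_read64(uint32_t addr);                /* low-latency read  */
void     phe_atomic_add32(uint32_t addr, uint32_t v);  /* R-M-W add         */

#define TABLE_BASE   0x000000u          /* illustrative key table region     */
#define TABLE_SLOTS  (1u << 20)         /* number of table slots (power of 2) */
#define STATS_BASE   0x400000u          /* illustrative counter region       */

/* One thread's work item: look up a flow key and, on a hit,
 * bump that flow's byte counter with a single atomic command. */
int lookup_and_count(uint64_t flow_key, uint32_t pkt_bytes)
{
    uint32_t slot  = phe_hash32(flow_key) & (TABLE_SLOTS - 1);
    uint64_t entry = phe_mem_read64(TABLE_BASE + slot);

    if (entry != flow_key)              /* miss: no matching entry stored */
        return 0;

    phe_atomic_add32(STATS_BASE + slot, pkt_bytes);
    return 1;                           /* hit */
}
```

Because each core supports multiple threads and there is no cache to introduce miss variability, many such lookups could be kept in flight concurrently against the tightly coupled memory.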
Additional Resources:
If you are looking for more technical information or need to discuss your technical challenges with an expert, we are happy to help. Email us and we will arrange to have one of our technical specialists speak with you. You can also sign up for updates. Finally, please follow us on social media so we can keep in touch.