The Blazar family of Accelerator Engines provides system architects and designers with new software and hardware acceleration options that no competing product can offer.
Table of Contents
- Acceleration Strategy – Two Paths
- MoSys Point Product Accelerator
- Blazar Accelerator Engines Overview
- MoSys Scalable Performance Accelerator
- Stellar Virtual Accelerator Engines
- Stellar Packet Classification Platform
- Technical Overview
- Single Common API
- Single Common RTL Interface
- Scalable Hardware Environment Example
- Adaptation Layer Software
MoSys Inc (NASDAQ:MOSY) has been developing memory-based products for close to 20 years. It started with the development of the 1T (one-transistor) memory IP. The 1T-SRAM has access speeds close to those of SRAM but supports a density that approaches that of an RLDRAM, and it has revolutionized the memory industry. It is still being used by many companies in their products today.
How To Accelerate The Future
MoSys decided that it would use the 1-T memory and define an Acceleration Product strategy that would increase performance of ASIC and FPGA applications in a way no other product could.
Most high-speed systems today use an FPGA (Field Programmable Gate Array). MoSys acceleration technology accelerates these applications in areas such as networking, video, test & measurement, AI, IoT, or any other high-speed application. It can accelerate current systems as well as provide a path to higher-performance, more flexible applications. The technology’s acceleration options eliminate system data bottlenecks and, when the In-Memory intelligence capability is used, provide added acceleration by offloading the CPU or FPGA.
The strategy became a two-path decision tree based on the needs of today’s market: achieving demanding customer product performance levels and, just as important, deciding whether the product must be able to respond to changing market requirements.
Accelerator Engine Integrated Circuit Strategy Achievements
- Acceleration Engine Integrated Circuits
- Largest High-Speed Memory on a single device
- Single circuit with 576Mb to 1Gb
- tRC of 3.2ns
- Usable through a QDR-type read/write memory interface, so it connects to existing application RTL almost seamlessly
- Added intelligence of In-Memory functions
- Optional accelerator features that do not have to be used in order to use the device as a standard high-speed memory
- Use only if the application could benefit from their use
- Frequently used system operations, called In-Memory functions, are included and fixed on each device
- BURST: used for high-speed data movement with sequential reads or writes
- RMW: multiple ALUs in the memory perform Read/Modify/Write operations for compute and decision
- A programmable device for HyperSpeed acceleration that includes the fixed functions and adds 32 on-device RISC cores for the highest-performance applications, allowing embedded user-defined functions
- Fast but simple to design with few pins compared to other memories
- Reduce the pin count as far as possible by using a serial interface between the device and the FPGA
- Add signal Auto-Adaption on each pin.
- Achieved 32 pins (lower-performance designs can use as few as 16 pins)
- Use a high-speed serial connection to achieve the highest bandwidth possible, then provide FPGA RTL logic that presents a QDR-like parallel interface
- Design the highest efficiency serial protocol on the market
- Most designers are not familiar with high speed serial protocol
- Provide FPGA RTL that handles the memory and serial interface and presents a QDR-like parallel interface to the user
- Make the serial interface transparent with RTL to be easily adapted to an AXI or Avalon bus
- The MoSys-supplied RTL is easily converted to an AXI or Avalon bus
Application Acceleration Platform Strategy Achievements
- Stellar Virtual Acceleration Engine Software/Firmware
- Use a cloud-computing-like strategy of flexible provisioning with a virtual machine, which we call Stellar Virtual Accelerator Engines
- Protect software investment across a range of applications for a family of products or to easily meet future unknown performance requirements
- Provide Software/firmware Application Acceleration Platform products
- Common API
- Common RTL if using FPGA
- Design common software and hardware interfaces that are easily understood by software and hardware architects
- Runs on different scalable performance hardware environments
- User able to defined multiple
- Ability to take advantage of the Accelerator Engine “point performance” products to provide a range of scalable hardware performance levels
- Provide a transportable software/firmware path across different hardware environments
- Execute as software in CPU-only and FPGA-only environments
- Provide FPGA RTL code for higher performance system using MoSys Accelerator Engines.
- Example: Different hardware environments with speeds from 30M to 3B operations per second (for Stellar Packet Acceleration Platform)
- Entry performance point is CPU only at 30M operations per second
- Highest performance point is using the MoSys Programmable HyperSpeed Acceleration Engine
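The scaling range quoted in the example above can be checked with simple arithmetic. The following is a minimal sketch in Python that uses only the two figures from the text (30M and 3B operations per second); the function names and the sample workload size are illustrative assumptions, not part of any MoSys product.

```python
# Performance points quoted in the text, in operations per second.
CPU_ONLY_OPS = 30e6   # entry point: CPU only
PHE_OPS = 3e9         # highest point: FPGA with a Programmable HyperSpeed Engine


def scaling_factor(baseline_ops: float, accelerated_ops: float) -> float:
    """Return how many times faster the accelerated environment is."""
    return accelerated_ops / baseline_ops


def seconds_for_workload(num_ops: float, ops_per_second: float) -> float:
    """Return the time needed to complete num_ops at a given rate."""
    return num_ops / ops_per_second


speedup = scaling_factor(CPU_ONLY_OPS, PHE_OPS)

# Hypothetical workload of 3 billion operations, chosen only for illustration.
workload = 3e9
cpu_seconds = seconds_for_workload(workload, CPU_ONLY_OPS)
phe_seconds = seconds_for_workload(workload, PHE_OPS)

print(f"speedup: {speedup:.0f}x")
print(f"CPU only: {cpu_seconds:.0f} s, PHE: {phe_seconds:.0f} s")
```

The ratio of the two quoted endpoints is the 100x figure used throughout the rest of the document.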
MoSys Point Product Accelerator
- Most memory manufacturers have focused on the memory features of speed or density, in the belief that one size fits all. They have not considered that memories could speed up applications by adding a level of intelligence and improving bandwidth, or that scalable hardware/software virtual solutions can accelerate systems even further.
- As part of its evolution, MoSys looked at the common tasks performed in many systems and added intelligence to its memory to execute those tasks without CPU involvement. In effect, this offloads the CPU, resulting in application acceleration.
- The result is a family of devices that forms a new class of memory called Accelerator Engines.
- These common tasks are called In-Memory Functions. These intelligent functions do not have to be used to use the Accelerator Engines as a memory, but depending on the application, can add another level of acceleration.
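To illustrate why executing a common task inside the memory offloads the host, the sketch below models a counter update done two ways: as a host-side read followed by a write, and as a single in-memory read/modify/write command. This is a toy Python model under stated assumptions; the class `ModelMemory` and its methods are hypothetical and do not represent the MoSys command set or interface.

```python
class ModelMemory:
    """Toy memory model that counts commands crossing the host/memory link."""

    def __init__(self, size: int) -> None:
        self.cells = [0] * size
        self.transactions = 0

    def read(self, addr: int) -> int:
        self.transactions += 1
        return self.cells[addr]

    def write(self, addr: int, value: int) -> None:
        self.transactions += 1
        self.cells[addr] = value

    def rmw_add(self, addr: int, delta: int) -> None:
        """Hypothetical in-memory add: one command, update done in the device."""
        self.transactions += 1
        self.cells[addr] += delta


def host_side_increment(mem: ModelMemory, addr: int, delta: int) -> None:
    # The host fetches the value, modifies it, and writes it back.
    value = mem.read(addr)
    mem.write(addr, value + delta)


def in_memory_increment(mem: ModelMemory, addr: int, delta: int) -> None:
    # The host sends one command; the modify step happens inside the memory.
    mem.rmw_add(addr, delta)


host_mem = ModelMemory(4)
rmw_mem = ModelMemory(4)
for _ in range(1000):
    host_side_increment(host_mem, 1, 1)
    in_memory_increment(rmw_mem, 1, 1)

print(host_mem.cells[1], host_mem.transactions)
print(rmw_mem.cells[1], rmw_mem.transactions)
```

In this model both approaches reach the same counter value, but the in-memory version issues half as many commands and never moves the data back to the host.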
Blazar Family Silicon Integrated Circuits
Bandwidth Engine (BE)
There are two high-performance Bandwidth Engine memories. They attach serially, directly to an FPGA, and are the highest-density memories available in the high-speed memory market, with 576Mb or 1Gb. In addition, to further accelerate applications, intelligence called “In-Memory” functions has been built into the device for high-speed data movement or data computation.
Programmable HyperSpeed Engine (PHE)
The base of this product is a Bandwidth Engine with 1Gb of memory, to which it adds 32 compute cores along with many other performance options. The programmability allows user software to reside in the PHE, providing HyperSpeed acceleration that significantly increases performance both through its speed and by freeing up execution time on the CPU or FPGA.
Accelerator Key Features
- Each Accelerator Engine is a combination of the following capabilities:
- High Capacity Memory of 576Mb or 1.152Gb, with a tRC of 3.2ns
- High Speed Embedded In-Memory Functions (Intelligence) which include BURST, RMW and USER PROGRAMMABLE.
- High Speed Serial Interface for High Bandwidth and Simplified Board design requiring only 32 FPGA pins.
- All Accelerator Engines have the embedded In-Memory functions BURST and RMW, which are designed to execute much faster in memory than the same operations could with a traditional memory.
- Programmable user functions can be embedded in the Programmable HyperSpeed Engine, which has 32 RISC cores.
- Interfacing made simple. MoSys’ high-speed memory interface, GCI, is available for customers who want to write their own FPGA interface. Its transfer efficiency of 90% is unmatched in the industry today.
- However, MoSys simplifies the design by providing FPGA RTL memory controllers that handle the memory and the high-speed serial memory interface. This RTL then presents a QDR-like parallel interface to the user, so that the serial memory interface is completely transparent.
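Two of the figures above lend themselves to a quick calculation: the 3.2ns tRC and the 90% transfer efficiency. The sketch below is a rough Python estimate that reads tRC as the minimum time between successive random accesses to the same bank; the raw link rate passed to `payload_bandwidth_gbps` is a hypothetical placeholder, since the text does not state a lane count or lane speed.

```python
T_RC_NS = 3.2           # row cycle time quoted in the text, in nanoseconds
GCI_EFFICIENCY = 0.90   # transfer efficiency quoted in the text


def accesses_per_second(t_rc_ns: float) -> float:
    """Upper bound on back-to-back random accesses to one bank per second."""
    return 1e9 / t_rc_ns


def payload_bandwidth_gbps(raw_gbps: float,
                           efficiency: float = GCI_EFFICIENCY) -> float:
    """Usable data bandwidth once protocol overhead is subtracted."""
    return raw_gbps * efficiency


per_bank_rate = accesses_per_second(T_RC_NS)

# Hypothetical raw serial bandwidth, used only to show the calculation.
ASSUMED_RAW_GBPS = 100.0
usable_gbps = payload_bandwidth_gbps(ASSUMED_RAW_GBPS)

print(f"{per_bank_rate / 1e6:.1f} M accesses/s per bank")
print(f"{usable_gbps:.1f} Gb/s usable of {ASSUMED_RAW_GBPS:.1f} Gb/s raw")
```

A 3.2ns cycle works out to roughly 312.5 million accesses per second for a single bank. The aggregate rate of a device depends on how many banks can be accessed in parallel, which is not stated here.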
Block Diagrams of Memory Architecture and Capacity
BE2-BURST with 576Mb
BE2-RMW with 576Mb
BE3-BURST with 1.152Gb
BE3-RMW with 1.152Gb
MoSys Scalable Performance Accelerator
MoSys continues to impact the industry with its Software Accelerator Platform products, which provide full accelerator decision functions that are portable across a wide range of hardware configurations and do not require the use of MoSys hardware.
In today’s fast-paced markets, customers need to keep up with the demand for increasing performance while ensuring that their software investment can be ported across a range of scalable hardware with increasing performance.
The MoSys “Stellar Virtual Accelerator Engine” products provide a software platform to accelerate key functions in high-speed FPGA applications. The first platform, “Stellar Packet Classification”, addresses markets such as networking, datacenter, security, and AI by inspecting data and determining the proper handling.
The Stellar VAE uses the cloud concept of a portable common software API that executes on several hardware designs, with the ability to scale performance up to 100x from a software-only solution to an accelerated FPGA design using Graph Memory Technology.
Stellar Virtual Accelerator Engine (Software/Firmware/Hardware)
- Stellar Virtual Accelerator Engines (VAE) are designed to support a functional platform, for example a “Stellar Packet Filtering Platform”.
- It is “Virtual” because it can be standalone software, FPGA RTL, or embedded Firmware based. Using MoSys’ common software interface (API) across multiple hardware environments, enables system designers to reuse internally developed software code to tune the performance required.
- In addition, all FPGA-based VAEs use a common RTL IF that allows hardware transportability.
- A VAE with a common API can run on:
- Common FPGA RTL
- FPGA that is not attached to a MoSys IC
- FPGA that is attached to a member of the MoSys Accelerator Engine IC.
- Performance scaling from a CPU-only environment to an FPGA attached to a MoSys IC can be as high as 100 times.
- A key benefit of a Stellar Virtual Accelerator Engine is its focus on applications needing:
- Protection of software investment with common API for transportability
- Performance scaling over many different hardware environments
- Designing to a common API allows a system designer to port software seamlessly across a range of performance platforms
MoSys’ new Software Accelerator Product Line is comprised of Function Accelerator Platforms, targeted at specific application functions where the platforms’ common software interface allows performance scalability over multiple hardware environments from CPU only to a range of FPGA performance solutions that use a common RTL I/F. This software defined, hardware accelerated platform architecture allows users to preserve and re-use software assets via a common software interface and then depending on the performance needed, select the appropriate hardware environment.
Stellar Packet Classification Platform
As markets continue to migrate to software-defined environments, most notably software-defined networks (SDN), performance scaling has become key to remaining competitive while addressing the growing demands being placed on the network. Software investments now must be transferrable across multiple hardware environments in order to be both cost-effective and to provide the required flexibility to meet changing performance demands.
A Function Acceleration Platform is hardware agnostic and operates with or without a MoSys IC. It can run, for example, on a CPU or an FPGA not attached to a MoSys IC, or on an FPGA attached to a member of the MoSys Accelerator Engine IC family, such as the Bandwidth Engine or the Programmable HyperSpeed Engine with in-memory compute capability. MoSys expects acceleration performance scaling across these platforms to be as high as 100x when comparing an FPGA connected to a MoSys PHE with a standalone CPU implementation. MoSys scalable solutions can meet today’s application needs and flexibly provide a path to new products that address new, more performance-driven market demands.
Stellar Virtual Accelerator Engine Overview
At the core of providing a highly scalable platform is virtualizing the accelerator function by creating a functional abstraction with a high-level software API for a specific application area that can be implemented across different hardware and software environments. MoSys calls this a Stellar Virtual Accelerator Engine (VAE).
The VAE leverages a common application program interface (API) to enable a platform to achieve performance scaling of up to 100x. The same functions can scale from software running on a CPU, to RTL running on an FPGA, to very high-performance implementations that combine an FPGA with MoSys Accelerator Engine ICs with in-memory compute.
Stellar Virtual Accelerator Engine – Key Features
- Single Common API
- Single Common RTL Interface
- Performance Scalability Across Multiple Hardware Environments
- Adaptation Software
Single Common API:
A Stellar Virtual Accelerator Engine (VAE) employs a common application program interface (API) so that a platform solution for a given accelerator function is portable. Implementations can range from software on a CPU, to modules in FPGAs, to very high-end, highly accelerated solutions using FPGAs with MoSys Accelerator Engine ICs like the MoSys PHE with its 32 RISC cores and 1Gb of embedded memory. A PHE implementation can deliver a performance increase of up to 100x over a CPU-only solution.
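The idea of one API with interchangeable implementations can be sketched as an abstract interface with several backends. The Python example below is illustrative only: `PacketClassifier`, its methods, and both backends are hypothetical names invented for this sketch, not the Stellar API, and the "accelerated" backend is a software stand-in for where hardware commands would be issued.

```python
from abc import ABC, abstractmethod


class PacketClassifier(ABC):
    """Hypothetical common API: the application only ever sees this interface."""

    @abstractmethod
    def add_rule(self, dst_port: int, action: str) -> None: ...

    @abstractmethod
    def classify(self, dst_port: int) -> str: ...


class CpuOnlyClassifier(PacketClassifier):
    """Software-only backend: rules live in a dictionary on the host."""

    def __init__(self) -> None:
        self._rules: dict[int, str] = {}

    def add_rule(self, dst_port: int, action: str) -> None:
        self._rules[dst_port] = action

    def classify(self, dst_port: int) -> str:
        return self._rules.get(dst_port, "drop")


class AcceleratedClassifierStub(PacketClassifier):
    """Stand-in for an FPGA-attached backend.

    A real backend would send commands to hardware; this stub keeps a
    direct-indexed table in host memory so that the sketch stays runnable.
    """

    def __init__(self) -> None:
        self._table: list[str | None] = [None] * 65536

    def add_rule(self, dst_port: int, action: str) -> None:
        self._table[dst_port] = action

    def classify(self, dst_port: int) -> str:
        return self._table[dst_port] or "drop"


def run_application(engine: PacketClassifier) -> list[str]:
    """Application code written once against the common interface."""
    engine.add_rule(443, "forward")
    engine.add_rule(22, "inspect")
    return [engine.classify(port) for port in (443, 22, 8080)]


cpu_result = run_application(CpuOnlyClassifier())
accel_result = run_application(AcceleratedClassifierStub())
print(cpu_result == accel_result)
```

Because `run_application` depends only on the interface, moving to a faster hardware environment means swapping the backend object rather than rewriting the application.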
Single Common RTL Interface
Most VAE platforms will have FPGA-based products with different performance capabilities. The RTL interface across these FPGA products will have a Common RTL logic specification (VAE IF).
Designing to a common hardware interface allows for easy migration between different performance and capacity requirements. Additionally, the designer can create a wider range of product offerings and can more easily implement add-on acceleration modules. This provides future proofing because as new silicon becomes available with a VAE port, the designer can drop it into the same logical “socket”.
Performance Scalability Across Multiple Hardware Environments
Scalability comes from the ability of each implementation to take advantage of available compute and memory resources. Memory can range from DRAM adjacent to the CPU, to block memory or HBM on the FPGA or 1T-SRAM tightly coupled with 32 RISC engines in MoSys Accelerator Engine IC products.
The common API is presented to software through the common C interface (C VAE IF), or across different FPGA configurations through the common RTL interface (VAE IF).
Adaptation Layer Software
Implementation of a VAE will require an Adaptation Layer.
The adaptation layer provides a way for upper levels of software with an existing API to bridge to the VAE common API. Designers may develop their own adaptation software libraries or choose to use software provided by MoSys as part of an application-specific platform.
If the user’s Adaptation Layer is included in every system, the Application API is common across different hardware environments and can therefore be considered part of the VAE. MoSys then provides the performance scaling and, in an FPGA implementation, the hardware scaling through the Common RTL interface (VAE IF shown in the diagram).
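An adaptation layer of this kind is essentially an adapter between the application's existing calls and the common API. The Python sketch below assumes a common API that maps an opaque byte key to an integer result; `SoftwareVae`, `FlowTableAdapter`, and every method name are hypothetical and chosen only to show the bridging pattern.

```python
import struct


class SoftwareVae:
    """Hypothetical common-API backend: opaque byte key -> integer result."""

    def __init__(self) -> None:
        self._table: dict[bytes, int] = {}

    def insert(self, key: bytes, result: int) -> None:
        self._table[key] = result

    def search(self, key: bytes) -> int:
        return self._table.get(key, -1)   # -1 signals "no match"


class FlowTableAdapter:
    """Adaptation layer: keeps the application's existing call style and
    translates it into key/result operations on the common API."""

    ACTIONS = ["drop", "forward", "mirror"]

    def __init__(self, vae: SoftwareVae) -> None:
        self._vae = vae

    @staticmethod
    def _key(src_ip: str, dst_port: int) -> bytes:
        # Pack the IPv4 address (4 bytes) and the port (2 bytes, big-endian).
        octets = bytes(int(part) for part in src_ip.split("."))
        return octets + struct.pack("!H", dst_port)

    def set_policy(self, src_ip: str, dst_port: int, action: str) -> None:
        self._vae.insert(self._key(src_ip, dst_port),
                         self.ACTIONS.index(action))

    def policy_for(self, src_ip: str, dst_port: int) -> str:
        result = self._vae.search(self._key(src_ip, dst_port))
        return self.ACTIONS[result] if result >= 0 else "drop"


adapter = FlowTableAdapter(SoftwareVae())
adapter.set_policy("10.0.0.1", 443, "forward")
print(adapter.policy_for("10.0.0.1", 443))
print(adapter.policy_for("10.0.0.2", 80))
```

The application keeps calling `set_policy` and `policy_for` regardless of which backend sits behind the adapter, which is the sense in which the Application API stays common across hardware environments.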
Embedded functions that can benefit from this technology:
- Key value pair databases
- Networking search functions
- Machine Learning
- Algorithm acceleration
- Security analysis
- And more
Target markets where these functions are used include:
- Smart NICs
- Security appliances
- Network hardware
- Datacenter acceleration cards
- 5G edge compute
- Defense and aerospace
- High-performance computing
- High speed test equipment
- And more
MoSys’ initial focus is on embedded applications that require:
- High random-access rate memory
- High data rate throughput
- Low latency
MoSys is working with early access customers.
If you would like a demonstration of this technology, please contact us.