FPGA Coprocessing Using Serial RapidIO

2021-10-28

Digah Company

To support "triple play" applications, demand for high-speed communication and ultra-fast computing keeps increasing. This poses new challenges for system developers, algorithm developers and hardware engineers, who must integrate a variety of standards, components and networking devices into a single system.

At the same time, developers must keep up with rising performance requirements while keeping costs low. Both goals can be achieved by effectively using an FPGA with serial RapidIO as a DSP coprocessor.

Because triple play applications integrate voice, video and data, new algorithms must be used to set the parameters of the development and system optimization strategy. Along the way, developers need to solve the following problems: building a scalable and extensible architecture, supporting distributed processing, adopting standards-based design, and optimizing for performance and cost.

On closer study, the challenges of meeting these application requirements fall mainly under two themes. The first is connectivity, which essentially means "fast" data transfer between devices, boards and systems. The second is computing power, which refers to the processing resources available in each device, board and system.

Connection between computing platforms

Standards-based design is usually much simpler than "free play" (fully custom) design, and it is the typical design approach today. Although parallel connection standards (PCI, PCI-X, EMIF, etc.) can meet current requirements, they fall short where scalability and extensibility are concerned. With continuing progress in packet processing technology, connection standards are clearly trending toward high-speed serial links, as Figure 1 shows.

High-speed serial standards such as PCIe and GbE / XAUI are already used in the desktop and networking industries, but data processing systems in wireless communication infrastructure have slightly different interconnect requirements:

1. Low pin count;

2. Backplane and chip-to-chip connectivity;

3. Scalable bandwidth and speed;

4. DMA and message passing functions;

5. Support for complex, scalable topologies;

6. Support for multipoint transmission;

7. High reliability;

8. Support for time-of-day synchronization;

9. Quality of service (QoS).

Figure 1: trends towards serial connections

The serial RapidIO (sRIO) protocol standard easily meets, and in some cases exceeds, most of the above requirements. As a result, serial RapidIO has become the mainstream technology for data plane interconnect in wireless communication infrastructure. An sRIO network is built from two basic building blocks: endpoints and switches. Endpoint devices send and receive packets, while switching devices pass packets between ports without interpreting them. Figure 2 shows the building blocks of an sRIO network.

Figure 2: sRIO network building blocks
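The split between endpoints and switches can be illustrated with a minimal Python sketch. It assumes a switch forwards each packet by looking up the destination device ID in a routing table and never inspects the payload. The `Packet` and `Switch` names, the port numbers and the table contents are hypothetical and do not correspond to any real sRIO API.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Packet:
    dest_id: int    # destination endpoint device ID
    src_id: int     # source endpoint device ID
    payload: bytes  # opaque to the switch


class Switch:
    """Toy switch: forwards by destination ID and never reads the payload."""

    def __init__(self, routes: Dict[int, int], default_port: Optional[int] = None):
        self.routes = routes              # hypothetical table: dest_id -> output port
        self.default_port = default_port  # used when no table entry matches

    def output_port(self, packet: Packet) -> int:
        port = self.routes.get(packet.dest_id, self.default_port)
        if port is None:
            raise LookupError(f"no route for device ID {packet.dest_id:#x}")
        return port


switch = Switch(routes={0x01: 0, 0x02: 1, 0x10: 3})
pkt = Packet(dest_id=0x10, src_id=0x01, payload=b"\x00" * 8)
print(switch.output_port(pkt))  # 3
```

Because the switch only consults the routing table, adding an endpoint to the network amounts to adding a table entry; the endpoints themselves remain responsible for interpreting packet contents.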

According to the specification, serial RapidIO has a three-layer architecture, as shown in Figure 3.

Figure 3: sRIO architecture

The three layers are:

Physical layer - describes the device-level interface, including packet transmission mechanisms, flow control, electrical characteristics and low-level error management.

Transport layer - provides the routing information needed to move packets between endpoint devices. Switching devices operate at the transport layer, routing packets by device ID.

Logical layer - defines the overall protocol and packet formats. Each packet carries a payload of up to 256 bytes. Transactions access a 34-, 50- or 66-bit address space through load, store or DMA operations.
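The 256-byte payload limit means that a larger transfer has to be split across several packets. The following Python sketch shows one way to do that, under the simplifying assumption that any chunk size up to 256 bytes is acceptable; it ignores the alignment and size-encoding rules of the actual specification, and the `segment` helper and the addresses are hypothetical.

```python
MAX_PAYLOAD = 256  # maximum data payload per sRIO packet, in bytes


def segment(data: bytes, base_addr: int, max_payload: int = MAX_PAYLOAD):
    """Yield (target_address, chunk) pairs, each chunk at most max_payload bytes."""
    for offset in range(0, len(data), max_payload):
        yield base_addr + offset, data[offset:offset + max_payload]


chunks = list(segment(bytes(1000), base_addr=0x1000))
print(len(chunks))                     # 4
print([len(c) for _, c in chunks])     # [256, 256, 256, 232]
print([hex(a) for a, _ in chunks])     # ['0x1000', '0x1100', '0x1200', '0x1300']
```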

sRIO has many advantages. A 4-lane sRIO link running at 3.125 Gbps per lane can deliver 10 Gbps of traffic while fully preserving data integrity. sRIO resembles a microprocessor bus: memory and device addressing and packet processing are done in hardware, which greatly reduces I/O processing overhead and latency and increases system bandwidth relative to other bus interfaces. Unlike most other bus interfaces, however, the sRIO interface has a low pin count, and its bandwidth scales with high-speed serial links that run at 1.25 to 3.125 Gbps. Figure 4 illustrates the sRIO specification.

Figure 4: sRIO specification
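The 10 Gbps figure for a 4-lane link at 3.125 Gbps follows from the 8b/10b line coding used by the serial physical layer, in which every 8 data bits are sent as 10 line bits. The short Python sketch below works through that arithmetic for the line rates mentioned in the text. The function name is hypothetical, and the result is the data rate before packet header overhead.

```python
LINE_RATES_GBAUD = (1.25, 2.5, 3.125)  # per-lane line rates from the text


def data_rate_gbps(line_rate_gbaud: float, lanes: int) -> float:
    """Data rate after 8b/10b coding (8 data bits per 10 line bits)."""
    return line_rate_gbaud * lanes * 8 / 10


for rate in LINE_RATES_GBAUD:
    for lanes in (1, 4):
        print(f"{lanes}x @ {rate} Gbaud -> {data_rate_gbps(rate, lanes)} Gbps")
```

For the 4-lane case at 3.125 Gbaud this gives 4 x 3.125 x 0.8 = 10 Gbps, matching the figure quoted above.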

Computing resources in the platform

With configurable processing resources, developers can implement in hardware applications that in the past were implemented only in software, such as data compression and encryption algorithms, or even a complete firewall and security suite. Doing so, however, requires a massively parallel ecosystem with shared bandwidth and strong processing capacity, that is, shared or distributed processing across CPUs, NPUs, FPGAs and/or ASICs. When building such a system, the requirements for computing resources include:

1. Distributed processing that supports complex topologies;

2. Highly reliable direct peer-to-peer communication;

3. Multiple heterogeneous operating systems;

4. A communication data plane that spans multiple heterogeneous operating systems;

5. A modular, scalable platform with broad ecosystem support.

The sRIO protocol specification and architecture support the varied requirements of computing devices in embedded and wireless infrastructure. With sRIO, designs can be architecture-independent, and scalable systems with carrier-grade reliability, advanced traffic management, high performance and high throughput can be deployed. In addition, an extensive supplier ecosystem makes it easier for designers to build sRIO systems from off-the-shelf components. sRIO is a packet-based protocol that supports:

1. Data movement using packet operations (read, write and message);

2. I/O non-coherent functions and cache-coherent functions;

3. Efficient interworking and protocol encapsulation through data streaming and segmentation and reassembly (SAR);

4. A traffic management framework supporting millions of data streams, 256 traffic classes and lossy operation;

5. Flow control for multiple transaction request flows, including provisioning for QoS;

6. Prioritization to address issues such as bandwidth allocation, transaction ordering and deadlock avoidance;

7. Standard hardware topologies (trees and meshes) and arbitrary ones (daisy chains) through system discovery, configuration and learning, including multi-host support;

8. Error management and classification (recoverable, notification and critical).

IP scheme of serial RapidIO

Xilinx and other manufacturers have designed their endpoint IP solutions to the RapidIO v1.3 specification. These solutions support fully compliant maximum-payload operation when sending and receiving user data through the target and source interfaces of the logical (I/O) and transport layer IP.

Figure 5 shows Xilinx's complete sRIO endpoint IP solution, which includes the following components:

1. LogiCORE RapidIO logical (I/O) and transport layer IP;

2. Buffer layer reference design;

3. LogiCORE serial RapidIO physical layer IP;

4. Register manager reference design.

Figure 5: sRIO endpoint IP architecture of Xilinx

IP architecture

Xilinx provides source code for the buffer layer reference design, which handles automatic packet queuing and prioritization. The sRIO physical layer IP implements link training and initialization, discovery and management, and error and retry recovery mechanisms. In addition, high-speed transceivers are instantiated in the physical layer IP to support 1-lane and 4-lane sRIO bus links at line rates of 1.25 Gbps, 2.5 Gbps and 3.125 Gbps.
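The queuing and prioritization role of a buffer layer can be modeled in software with a priority queue. The Python sketch below is only an illustration of the idea, not the behavior of the Xilinx reference design: it assumes that higher-priority packets are sent first and that packets of equal priority keep their arrival order. The `PriorityBuffer` class and the priority values are hypothetical.

```python
import heapq
from itertools import count


class PriorityBuffer:
    """Toy transmit buffer: highest priority first, FIFO within a priority."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # arrival counter, breaks ties in FIFO order

    def enqueue(self, priority: int, packet: bytes) -> None:
        # heapq is a min-heap, so the priority is negated to pop the highest first
        heapq.heappush(self._heap, (-priority, next(self._seq), packet))

    def dequeue(self) -> bytes:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


buf = PriorityBuffer()
buf.enqueue(0, b"bulk-1")
buf.enqueue(2, b"urgent")
buf.enqueue(0, b"bulk-2")
print([buf.dequeue() for _ in range(len(buf))])  # [b'urgent', b'bulk-1', b'bulk-2']
```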

The register manager reference design allows the sRIO host device to configure and maintain endpoint device configuration, link status, control and timeout mechanisms. The register manager also provides a port through which the user design can probe the endpoint device state.

The LogiCORE offering is a complete endpoint IP that has been tested with industry-leading sRIO device vendors. Users can obtain it through the Xilinx CORE Generator (CoreGen) GUI tool, which helps them configure the baud rate and the endpoint. The LogiCORE IP supports extended features such as flow control, retransmit suppression, doorbells and message passing. Users can therefore create a flexible, scalable and customized sRIO endpoint IP optimized for their application requirements.

Using the varied resources in most high-performance FPGAs from Xilinx and other manufacturers, system designers can readily create and deploy intelligent solutions that improve time to market, scalability, extensibility and readiness for future development. Some system design examples using sRIO and DSP technology follow.

sRIO system application examples

1. Embedded systems: CPU architectures such as x86 are optimized for general-purpose applications that do not require large numbers of multiplications. DSP architectures, by contrast, are optimized for signal processing operations such as filtering, FFT, vector multiplication and search, and image or video analysis.

An embedded system that uses both a CPU and a DSP can therefore exploit the strengths of both general-purpose and signal processors. Figure 6 shows an example of such a system, which combines FPGA, CPU and DSP architectures.

Figure 6: high performance DSP subsystem based on CPU

In high-end DSPs, serial RapidIO has become the mainstream data interconnect, while the main data interconnect for x86 CPUs is PCI Express. As Figure 6 shows, a simply configured FPGA can be used to scale the DSP application and/or to bridge completely different interconnect standards (such as PCI Express and serial RapidIO).

In this system, the root complex chipset manages the PCI Express system, and a DSP manages the sRIO system. The 32 / 64-bit PCIe address space (base address) can be mapped automatically to the 34 / 66-bit sRIO address space (base address). PCIe applications communicate with the root complex chipset through memory or I/O reads and writes. These transactions can easily be mapped to sRIO space through I/O operations such as streaming writes (SWRITE), atomic operations, NREAD and NWRITE / NWRITE_R.
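Mapping one address space onto the other can be pictured as a set of translation windows: a PCIe address that falls inside a window is shifted by a fixed offset into sRIO space. The Python sketch below assumes this simple base-plus-offset scheme and a 34-bit sRIO address width. The `Window` type, the `pcie_to_srio` function and all addresses are hypothetical and are not taken from any Xilinx design.

```python
from dataclasses import dataclass
from typing import List

SRIO_ADDR_BITS = 34  # smallest sRIO address width mentioned in the text


@dataclass(frozen=True)
class Window:
    pcie_base: int  # start of the window in PCIe address space
    size: int       # window length in bytes
    srio_base: int  # corresponding start in sRIO address space


def pcie_to_srio(addr: int, windows: List[Window]) -> int:
    """Translate a PCIe address to an sRIO address using the first matching window."""
    for w in windows:
        if w.pcie_base <= addr < w.pcie_base + w.size:
            srio_addr = w.srio_base + (addr - w.pcie_base)
            if srio_addr >= 1 << SRIO_ADDR_BITS:
                raise ValueError("translated address exceeds sRIO address width")
            return srio_addr
    raise LookupError(f"PCIe address {addr:#x} is not mapped")


windows = [Window(pcie_base=0x8000_0000, size=0x10_0000, srio_base=0x2_0000_0000)]
print(hex(pcie_to_srio(0x8000_1234, windows)))  # 0x200001234
```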

Implementing this bridging function in a Xilinx FPGA is straightforward, because the back-end interface of the PCI Express block and that of the serial RapidIO endpoint block are similar packet queue interfaces. The bridge can then convert packets from PCIe to sRIO or from sRIO to PCIe, establishing data flow between the two protocol domains.

2. DSP processing applications: in applications where DSP processing is the primary architectural requirement, the system can be structured as shown in Figure 7.

Figure 7: devices requiring powerful DSP processing power

A Xilinx Virtex-5 FPGA can act as a coprocessor to the other DSP devices in the system. When sRIO is used as the data interconnect, the entire DSP system can be scaled easily. The solution is extensible, accommodates future development, and can be implemented in a variety of form factors.

When applications that require powerful DSP functions also need to perform large numbers of fast, complex operations or data processing, these tasks can be offloaded to an x86 CPU. A Xilinx Virtex-5 FPGA allows bridging between the PCIe subsystem and the sRIO fabric, enabling efficient function offloading.

3. Baseband processing system

As 3G networks rapidly mature, OEMs will adopt new form factors for devices and equipment to address capacity and coverage problems. A DSP architecture based on sRIO and FPGAs is an excellent way to meet such challenges. Traditional DSP systems can also be migrated to this fast, low-power FPGA-based architecture to take full advantage of the FPGA's scalability.

In such a system, as shown in Figure 8, the FPGA can handle antenna traffic at line rate.
