Reprinted: Loopback testing of a 10G Ethernet optical port and an Aurora interface

High-speed serial interfaces such as 10G Ethernet optical ports are increasingly common. This article uses a simple loopback experiment to explain the points that need attention when debugging such interfaces. A general tip for learning any Xilinx FPGA interface: study the Example Design. Discussion is welcome.

First, the purpose of the experiment

To transfer data efficiently between a switch and large-capacity high-speed communication equipment, understanding and using high-speed interfaces is increasingly important. This experiment uses four GTH high-speed serial transceivers, which implement the Aurora 64B66B protocol and the 10G Ethernet protocol respectively. The switch board is connected to the test equipment and looped back externally over optical fiber to carry high-speed data. The goal is to understand both interface protocols quickly and to become proficient with both kinds of high-speed transmit and receive interfaces.

Second, interface introduction

1, GT Interface Overview

GT transceivers are used for high-speed serial data transmission and reception. They are called GTP in the A7 series, GTX in the K7 series, and GTH in the V7 series. They are essentially the same kind of physical interface and work on the same high-speed communication principle; they differ in the line rates they support.

1.1, the structure of the transceiver

Each high-speed serial transceiver is divided into two sublayers: the PCS (Physical Coding Sublayer) and the PMA (Physical Medium Attachment sublayer). The PCS mainly handles data encoding and decoding and multi-channel processing. The PMA mainly performs serial-to-parallel and parallel-to-serial conversion, pre-emphasis, de-emphasis, transmission of the serial data, and extraction of the clock from the data. The IBERT IP can be used to run a loopback test and determine whether an interface works normally.

Figure 1 GTX/GTH transceiver block diagram

On the transmit side, the GT processes data as follows. User logic data is 8b/10b encoded and written into a transmit buffer. This buffer mainly isolates the clock domains of the PMA and PCS sublayers, resolving the rate matching and phase difference between the two clocks. Finally, the high-speed SerDes performs the parallel-to-serial conversion. The receive side reverses the transmit process; see ug476_7Series_Transceivers for the implementation details.

1.2, GT clocking

7 Series FPGAs are generally organized into banks, and a GTX/GTH bank is usually called a Quad. The reason is that, as integration increased, 7 Series FPGAs no longer give each high-speed serial transceiver its own reference clock. Instead, the transceivers are grouped into Quads: four high-speed serial transceivers and one COMMON block (QPLL) make up a Quad, and each transceiver is called a Channel. The internal structure is shown in Figure 2.

Figure 2 GTX transceiver hardware structure

At the primitive level, every Channel has its own CPLL, so each CPLL is part of the Channel primitive. The QPLL lives in a separate primitive called COMMON. In the GTX, the CPLL and QPLL differ in number (each Quad has four CPLLs and one QPLL) and in ownership (the QPLL belongs to COMMON, the CPLL to the Channel), but the biggest difference is the maximum line rate they support. The CPLL only reaches 6.x Gb/s, while the QPLL can exceed 10 Gb/s (look up the exact values in the data sheet for the speed grade of the device).

For the 7 Series GTX, each Quad has two external differential reference clock inputs, and each external reference clock must pass through an IBUFDS_GTE2 primitive before it can be used. 7 Series FPGAs allow a Quad to use the reference clocks of the adjacent Quads to the north and south, but one Quad's reference clock source cannot drive transceivers in more than three Quads (its own Quad plus the two adjacent Quads to the north and south).

For a GTX Channel, the transmitter and receiver can select their reference clocks independently, and each can use either the QPLL or the CPLL. Note that each Quad has only one QPLL resource; instantiating it more than once causes an error during implementation.

1.3, the GT master and slave core concept

When using GT-based IP cores (this also applies to Aurora and 10G Ethernet), people often speak of a master core and a slave core. Strictly speaking this is not accurate: it is really just the Shared Logic option chosen when configuring the IP core, as shown in Figure 3:

Figure 3 GT interface IP core configuration options

The option descriptions make it clear that the two choices determine whether the transceiver QPLL, clocking, and reset logic are included in the core itself or in the example design. For simplicity, an IP core that includes the shared logic is called a master core, and one that does not is called a slave core; their structures are shown in Figure 4 and Figure 5. The main difference between the two is whether the shared logic can be modified, which is only possible when it sits in the Example Design. A real design can use either kind. Note, however, that once a master core is used, it occupies the QPLL resource of its Quad. The other GTX interfaces in that Quad can then no longer be master cores, and there is no need to add shared logic for those slave cores.

Figure 4 Shared logic in the core

Figure 5 Shared logic in the Example Design
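The one-QPLL-per-Quad rule can be checked on paper before generating any cores. The following is a minimal sketch in Python, not part of the original design: the function name `check_quad_plan` and the tuple layout are hypothetical, and the only rule it encodes is the one stated above, that at most one core per Quad may include the shared logic.

```python
from collections import Counter

def check_quad_plan(cores):
    """Return a list of problems for a planned set of GT-based IP cores.

    cores is a list of (name, quad, includes_shared_logic) tuples.
    This layout is hypothetical and only serves the illustration.
    """
    # Count, per Quad, how many cores were configured as "master"
    # (shared logic, and therefore the QPLL, inside the core).
    masters = Counter(quad for _, quad, shared in cores if shared)
    return [
        f"Quad {quad}: {count} cores include shared logic, "
        "but a Quad has only one QPLL"
        for quad, count in sorted(masters.items())
        if count > 1
    ]

# One master plus one slave in the same Quad is fine.
good_plan = [("eth0", 0, True), ("aurora0", 0, False)]
# Two masters in the same Quad would both try to instantiate the QPLL.
bad_plan = [("eth0", 0, True), ("eth1", 0, True)]

print(check_quad_plan(good_plan))
print(check_quad_plan(bad_plan))
```

A Quad with no master core is not flagged, because the shared logic may then live in the example design and feed every slave core.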

2, Aurora Interface Overview

2.1 Overview

Aurora is an open, free link-layer protocol provided by Xilinx. It is used for point-to-point serial data transmission and is efficient and easy to use, which makes it suitable for high-performance data transmission systems. The Aurora 64B66B protocol used in this design is a scalable, lightweight link-layer protocol for single-lane or multi-lane serial data communication. A single lane converts between a 64-bit parallel data bus and a serial differential signal.

2.2, signal connections

As mentioned in the previous section, the high-speed serial transceivers in a Quad share a single QPLL (GTE2_COMMON). If we generate an Aurora slave core and open its example design, the shared logic there includes the gt_common_support module, which generates gt_qpllclk_quad2_out, gt_qpllrefclk_quad2_out, and other signals for the IP core to use. If we generate an Aurora master core, this logic is contained inside the IP core, and the QPLL signals are exposed as outputs of the IP core. When a design needs two or more GTX interfaces, the signals produced by the shared logic must be supplied to every IP core. Taking one master core plus one slave core as an example, the figure below shows part of the signal connections:

Figure 6 Aurora64b66b connections between the master core and the slave core

When two slave cores are used, the shared-logic signals in the figure are generated in the example design and must be connected by hand to the inputs of each IP core.

2.3, timing

2.3.1, Link Establishment

After link initialization completes, the Aurora core asserts lane_up, which indicates that the interface can receive data; when channel_up is asserted, the interface can transmit data. The interface is generally considered initialized, and data transfer can start, once both signals are asserted.

2.3.2 Data Transfer

The data transfer format of the Aurora interface is shown below:

Figure 7 Transmit-side data stream format

In Figure 7, s_axi_tx_tready goes high when the core is ready to accept data. This signal is driven by the clock compensation mechanism inside the link and is not under user control. The data on the bus is transferred successfully in a given clock cycle only when s_axi_tx_tvalid and s_axi_tx_tready are both 1.

Figure 8 Receive-side data stream format

In Figure 8, m_axi_rx_tvalid indicates that the data currently on the bus is valid.
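The transmit-side handshake rule can be illustrated with a small cycle-by-cycle model. This is only a sketch in Python, not HDL from the design: the `Cycle` record and the sample trace are made up for illustration, and the model captures nothing beyond the rule that a word is accepted when tvalid and tready are both 1 in the same cycle.

```python
from dataclasses import dataclass

@dataclass
class Cycle:
    """Signal values sampled on one rising clock edge (illustrative only)."""
    tvalid: int
    tready: int
    tdata: int

def transferred_words(cycles):
    """Return the data words that were actually accepted.

    A word is transferred only in a cycle where tvalid and tready
    are both 1; in every other cycle the bus contents are ignored.
    """
    return [c.tdata for c in cycles if c.tvalid and c.tready]

trace = [
    Cycle(tvalid=1, tready=0, tdata=0xA0),  # core not ready: source must hold 0xA0
    Cycle(tvalid=1, tready=1, tdata=0xA0),  # both high: 0xA0 is accepted
    Cycle(tvalid=0, tready=1, tdata=0x00),  # nothing valid to send
    Cycle(tvalid=1, tready=1, tdata=0xA1),  # both high: 0xA1 is accepted
]

print([hex(word) for word in transferred_words(trace)])
```

The first cycle shows why user logic must hold its data stable while tready is low: the word only counts once both signals are high together.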

2.4, the interface hardware

SERDES stands for SERializer/DESerializer. It is a mainstream time-division multiplexing (TDM), point-to-point (P2P) serial communication technology. At the transmitter, multiple low-speed parallel signals are converted into one high-speed serial signal, which travels through the transmission medium (connectors, copper, or fiber); at the receiver, the high-speed serial signal is converted back into low-speed parallel signals. This point-to-point serial technique makes full use of the channel capacity of the medium, reduces the number of transmission channels and device pins required, and raises the signal rate, which greatly reduces communication cost. Besides minimizing the number of transmission lines, SERDES works for both board-to-board and fiber connections. Whichever connection is used, a Xilinx GTP/GTX high-speed serial transceiver is required. The physical implementation of this interface is the SERDES; the physical-layer coding can be Aurora 8B10B or Aurora 64B66B; the application-layer protocol can be chosen freely or omitted.
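The parallel-to-serial and serial-to-parallel conversion can be sketched in a few lines of Python. This is a toy model under stated assumptions: the bit order (least significant bit first) and the 8-bit word width are arbitrary choices for the illustration, and a real SERDES additionally handles clock recovery, word alignment, and line coding, none of which is modeled here.

```python
def serialize(words, width=8):
    """Flatten parallel words into one bit stream (assumed LSB first)."""
    return [(word >> bit) & 1 for word in words for bit in range(width)]

def deserialize(bits, width=8):
    """Regroup a bit stream into parallel words of the given width."""
    if len(bits) % width:
        raise ValueError("bit stream is not a whole number of words")
    words = []
    for start in range(0, len(bits), width):
        chunk = bits[start:start + width]
        # Rebuild the word using the same LSB-first order as serialize().
        words.append(sum(bit << pos for pos, bit in enumerate(chunk)))
    return words

parallel_in = [0x3C, 0xA5, 0xFF]
line = serialize(parallel_in)      # one wire, 24 bit times
parallel_out = deserialize(line)   # back to three 8-bit words

print(len(line), [hex(w) for w in parallel_out])
```

The point of the sketch is the pin count trade: three 8-bit words need eight wires in parallel form but only one wire (run eight times faster) in serial form.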

3, 10G Ethernet interfaces

For background, see the earlier article from this public account, "10G Ethernet FPGA implementation of the interface".

3.1 Overview

10G Ethernet includes 10GBASE-X, 10GBASE-R, and 10GBASE-W. 10GBASE-X uses a compact package in which each transmitter/receiver pair operates at 3.125 Gbit/s (with a data rate of 2.5 Gbit/s). 10GBASE-R is a serial interface that uses 64B/66B encoding (rather than the 8B/10B used in Gigabit Ethernet), with a data rate of 10.000 Gbit/s. 10GBASE-W is a WAN interface compatible with SONET OC-192, with a data rate of 9.585 Gbit/s. This design uses the official Xilinx 10G Ethernet Subsystem IP core in its 10GBASE-R optical interface mode.
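The relationship between data rate and line rate follows directly from the coding overhead. The short Python sketch below works it out, assuming 8B/10B coding on each 10GBASE-X lane (which matches the 3.125 and 2.5 Gbit/s figures quoted above) and 64B/66B coding for 10GBASE-R; the function name is illustrative.

```python
from fractions import Fraction

def line_rate_gbps(data_rate_gbps, payload_bits, coded_bits):
    """Line rate needed to carry data_rate_gbps through a block code
    that turns payload_bits of data into coded_bits on the wire."""
    return Fraction(data_rate_gbps) * coded_bits / payload_bits

# 10GBASE-R: 10 Gbit/s of data, 64B/66B coding.
base_r = line_rate_gbps(10, 64, 66)

# 10GBASE-X lane: 2.5 Gbit/s of data, assumed 8B/10B coding.
base_x_lane = line_rate_gbps(Fraction(5, 2), 8, 10)

print(float(base_r))        # 10.3125
print(float(base_x_lane))   # 3.125
```

64B/66B costs about 3 percent of the line rate, against 20 percent for 8B/10B, which is why a single 10GBASE-R lane can stay near 10.3 Gbit/s instead of 12.5 Gbit/s.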

3.2, clock relationships

The clocking inside the FPGA is divided into the following four parts:

(a) The differential reference clock enters through a dedicated buffer (IBUFDS_GTE2) and becomes the single-ended clock refclk. refclk then splits two ways: one path goes to the QPLL (Quad Phase-Locked Loop); the other passes through a BUFG and becomes the global clock coreclk. coreclk again splits two ways: one path clocks the XGMII interface of the 10G MAC core (xgmii_rx_clk and xgmii_tx_clk); the other drives the user-side logic inside the 10G Ethernet PCS/PMA IP core.

(b) The QPLL outputs two clocks, qplloutclk and qplloutrefclk, which mainly serve as high-performance clocks for the GTH transceiver inside the IP core. qplloutclk directly drives the serial transmit side inside the GTH and runs at 5.15625 GHz. qplloutrefclk drives part of the logic inside the GTH and runs at 156.25 MHz.

(c) txoutclk is a 322.26 MHz clock generated by the 10G Ethernet PCS/PMA IP. After a BUFG it splits into two branches: txusrclk drives the 32-bit data bus inside the GTH, and txusrclk2 drives part of the PCS logic inside the IP core.

(d) On the lab's self-developed switch board (device xc7vx690tffg1761-2), a 25 MHz crystal oscillator supplies the system clock to a PLL (Phase-Locked Loop) inside the FPGA. The PLL generates a 156.25 MHz clock from the 25 MHz input, and this clock drives the user-side logic that sends data to the 10G MAC core.
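The frequencies quoted above are all simple ratios of the 10.3125 Gbit/s line rate or the 10 Gbit/s data rate. The Python sketch below reproduces them. The ratios themselves are an observation from the quoted numbers rather than statements from the original article: in particular, treating qplloutclk as half the line rate and txoutclk as the line rate divided by the 32-bit internal width are assumptions made for the illustration.

```python
from fractions import Fraction

DATA_RATE_MBPS = Fraction(10_000)            # 10 Gbit/s of MAC data
LINE_RATE_MBPS = DATA_RATE_MBPS * 66 / 64    # after 64B/66B coding: 10312.5

clocks_mhz = {
    # Assumed: QPLL output runs at half the serial line rate.
    "qplloutclk": LINE_RATE_MBPS / 2,
    # 64 data bits per cycle carry 10 Gbit/s.
    "qplloutrefclk": DATA_RATE_MBPS / 64,
    "coreclk": DATA_RATE_MBPS / 64,
    # Assumed: 32 line bits per txoutclk cycle inside the GTH.
    "txoutclk": LINE_RATE_MBPS / 32,
}

# Ratio the user PLL must realize to turn 25 MHz into 156.25 MHz.
pll_ratio = clocks_mhz["coreclk"] / 25

for name, freq in clocks_mhz.items():
    print(f"{name}: {float(freq)} MHz")
print(f"PLL ratio: {pll_ratio}")
```

Run as written, this gives 5156.25 MHz for qplloutclk, 156.25 MHz for qplloutrefclk and coreclk, 322.265625 MHz for txoutclk (quoted above as 322.26 MHz), and a PLL ratio of 25/4.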

3.3, IP core configuration

The Vivado configuration interface of the 10G Ethernet IP core is shown below. The core complies with the IEEE 802.3-2008 standard and includes MDIO (the PHY management interface), configurable FCS handling, flow control, and other functions. The MAC and PHY are connected through the standard XGMII interface, with 64-bit transmit and receive data at 156.25 MHz. The user interface of the MAC core is AXI4-Stream, with 64-bit data at 156.25 MHz. On the Shared Logic tab, select shared logic in the example design, that is, slave-core mode.

Figure 9 IP core configuration interface

The shared logic contains the differential input clock buffer, which connects to the GT_COMMON block; up to four 10G Ethernet subsystem cores in the same Quad can share this logic. A clock buffer (BUFG_GT) creates coreclk/coreclk_out from the transceiver's differential reference clock, and coreclk/coreclk_out has the same frequency as the differential clock source. The shared logic also takes TXOUTCLK from GT_CHANNEL, passes it through a BUFG_GT, and feeds it back to GT_CHANNEL to provide the transceiver TX user clocks (TXUSRCLK and TXUSRCLK2). With a 64-bit data path the clock frequency is 156.25 MHz; with a 32-bit data path it is 312.5 MHz. Note that user data connected directly to the IP core must be synchronous to coreclk. Even if the local user clock is also 156.25 MHz, the same as coreclk, it may have a phase offset because it comes from a different source, so an asynchronous FIFO should still be used to cross the clock domains.

3.4, signal connection

Taking one master core plus two slave cores as an example, the figure below shows part of the signal connections:

Figure 10 10G Ethernet signal connections between the master core and the slave cores

When two slave cores are used, the shared-logic signals in the figure are generated in the example design and must be connected by hand to the inputs of each IP core.

3.5 Data Transfer

3.5.1 Link Establishment

After link initialization completes, the 10G Ethernet core asserts core_ready, which indicates that the interface has finished initializing and data transfer can begin.

3.5.2 Data format

The data format of the AXI-Stream bus on the user side of the 10G Ethernet interface is shown below:

Figure 11 AXI-Stream bus signal timing

3.6, the interface hardware

For long-distance connections, copper cannot support communication over such distances at high data volumes, so optical fiber is needed. This requires optical modules. An optical module is a device that performs electrical-to-optical and optical-to-electrical conversion: the transmitting side of the module converts the electrical signal into an optical signal, and the receiving side converts the optical signal back into an electrical signal. Classified by package, common optical modules include SFP, SFP+, XFP, and others. The optical module interface is fully compatible with the Xilinx GTP/GTX I/O; the interface circuit is shown below:

Figure 12 Circuit diagram of the Xilinx FPGA connected to an optical module

There are many types of optical module; only three common ones are introduced here.

1) SFP optical module: SFP is a small form-factor pluggable optical module with a rate of up to 10.3G (commercial parts are mostly 1.25G), usually used with LC patch cords. An SFP module is built mainly around a laser. SFP modules can be classified by rate, by wavelength, and by mode. The family also includes Fast SFP, Gigabit SFP, BIDI SFP, CWDM SFP, and DWDM SFP.

2) SFP+ optical module: SFP+ has the same form factor as SFP and reaches 10G; it is commonly used for short-distance transmission. SFP+ is a hot-swappable optical transceiver that is independent of the communication protocol.

3) XFP optical module: XFP is a hot-swappable optical transceiver that is independent of the communication protocol. It also reaches 10G but is larger than SFP/SFP+ modules.

By comparison, SFP+ is more compact than XFP and faster than SFP, which makes it the better choice for long-distance optical fiber transmission. In this design the 10G Ethernet interface uses an SFP+ optical module for the photoelectric conversion in hardware.

Third, the frame structure analysis

1, Ethernet frame structure

This part is also covered in an earlier article from this public account: "Have you seen what an Ethernet frame looks like at the physical layer?" There are two main Ethernet frame formats: Ethernet II (DIX 2.0) and IEEE 802.3. This design uses the Ethernet II frame structure; the frame format is shown in Figure 13:

Figure 13 Ethernet II frame format

The fields are described as follows:

⑴ Preamble: alternating 0s and 1s that notify the destination station to prepare to receive.

⑵ Destination Address and Source Address: the addresses of the receiving and sending stations, 6 bytes each. The destination address can be a single station, a multicast address, or the broadcast address.

⑶ Type or Length: in Ethernet II frames these two bytes are the Type, which specifies the higher-layer protocol that receives the data.

⑷ Data: after processing by the physical layer and the logical link layer, the data in the frame is passed to the higher-layer protocol specified by the Type field. The data field must be at least 46 bytes and at most 1500 bytes. If the data is too short, padding (trailer) characters are added automatically. If the data is too long, it is split into segments that are transmitted separately.

⑸ Frame Check Sequence (FCS): a 4-byte cyclic redundancy check (CRC) value, computed by the sending device and recomputed by the receiver to determine whether the frame was corrupted in transit.
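The field layout above can be made concrete by assembling a frame in software. The following Python sketch is illustrative only and is not taken from the design: the helper names are made up, the preamble is left out because the MAC/PHY adds it, zero bytes are assumed as padding, and the FCS is computed with `zlib.crc32` and appended least significant byte first, which is the usual software convention for the Ethernet CRC-32.

```python
import struct
import zlib

MIN_PAYLOAD = 46     # shorter data fields are padded up to this length
MAX_PAYLOAD = 1500   # longer data must be split by a higher layer

def build_ethernet_ii(dst, src, ethertype, payload):
    """Assemble destination, source, Type, data, and FCS (no preamble)."""
    if len(dst) != 6 or len(src) != 6:
        raise ValueError("MAC addresses are 6 bytes each")
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload exceeds 1500 bytes")
    padded = payload.ljust(MIN_PAYLOAD, b"\x00")   # assumed zero padding
    body = dst + src + struct.pack("!H", ethertype) + padded
    fcs = struct.pack("<I", zlib.crc32(body))      # CRC-32 over all fields
    return body + fcs

def fcs_ok(frame):
    """Recompute the CRC on the receive side and compare with the FCS."""
    body, fcs = frame[:-4], frame[-4:]
    return struct.pack("<I", zlib.crc32(body)) == fcs

frame = build_ethernet_ii(
    dst=bytes.fromhex("ffffffffffff"),   # broadcast
    src=bytes.fromhex("020000000001"),   # made-up local address
    ethertype=0x0800,                    # IPv4
    payload=b"hello",
)
print(len(frame), fcs_ok(frame))
```

With a 5-byte payload the data field is padded to 46 bytes, so the frame comes out at 6 + 6 + 2 + 46 + 4 = 64 bytes, the minimum Ethernet frame size.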

2, Spirent Testcenter traffic stream format

Figure 14 Spirent Testcenter traffic stream format

When Ethernet frames are configured in Testcenter, Testcenter automatically adds 20 bytes of overhead to the data field of the Ethernet frame, namely the Signature field in the figure above. The function of each part of this field is as follows:

Figure 15 Spirent Testcenter traffic stream

The Signature field contains a 32-bit (4-byte) stream ID, which supports about 4 billion test streams. Its timestamp has a resolution of 10 nanoseconds. The PRBS bit is set to 1 when Spirent Testcenter inserts a PRBS23 pattern in the payload, and the Last bit tells the receiver where the timestamp bytes are. The field also has a built-in UDP/TCP Checksum Cheater field (for use when modifiers are placed in the payload).

Because the Signature field uniquely identifies a Spirent Testcenter traffic stream, Testcenter uses it on received streams to compute link latency and to determine whether frames were dropped. The field is not visible to the user in the Testcenter software, so it cannot be configured by hand, and it should not be altered when processing data frames. Testcenter can also be told not to add the field, but it then cannot compare received Ethernet frames against the frames it sent. This design uses the default behavior, in which Testcenter automatically appends the Signature field to the traffic stream.

3, custom frame format

On this basis, the in-system frame format is redefined on top of the standard Ethernet II frame format, as shown below:

Figure 16 Custom frame format

In the figure above, the destination address, source address, frame type, and FCS fields are retained from the Ethernet II frame structure. To keep the logic simple, the data field is further divided into four fields. Of these, the reserved field serves only as a placeholder, and the Signature field is the overhead that Testcenter fills in automatically. The payload section actually used in the experiment is therefore 84 bytes.

Fourth, the flow of data processing

1, implementation

1.1 overall architecture

The 10G Ethernet interface receives the Ethernet frames sent by the Testcenter test equipment, extracts the key data, and splits it into 12 parallel channels synchronized to the clock clk. The data is then packed, with the data of N clk cycles combined into one frame, and sent out through the Aurora64B66B interface. The receiver parses the received frame, restores the data to 12 channels synchronized to the internal clk, merges the 12 channels back into Ethernet frame format, and sends the frame back to Testcenter through the 10G Ethernet interface. The block diagram is as follows:

Figure 17 Overall block diagram
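The split into 12 channels and the later merge can be modeled in software to check that the round trip is lossless. The Python sketch below is only one possible reading of the architecture: the article does not say how bytes are mapped to channels, so the round-robin mapping (byte i goes to channel i mod 12) and one byte per channel per clk are assumptions, and the function names are hypothetical.

```python
CHANNELS = 12

def split_channels(payload):
    """Distribute payload bytes over 12 channels, one byte per channel
    per clk (assumed round-robin mapping, not taken from the design)."""
    if len(payload) % CHANNELS:
        raise ValueError("payload must be a multiple of 12 bytes")
    return [payload[ch::CHANNELS] for ch in range(CHANNELS)]

def merge_channels(channels):
    """Inverse of split_channels: read one byte from every channel on
    each clk and concatenate them back into the original byte order."""
    return bytes(byte for beat in zip(*channels) for byte in beat)

payload = bytes(range(84))            # the 84-byte payload size used above
channels = split_channels(payload)    # 12 channels, 7 clk cycles each
restored = merge_channels(channels)

print(len(channels), len(channels[0]), restored == payload)
```

With an 84-byte payload each channel carries 7 bytes, so N = 7 clk cycles of channel data would make up one packed frame under these assumptions.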

1.2, data flow

Based on this architecture, the data flow of the design is shown below:

Figure 18 Data processing flow

Fifth, RTL simulation of the main modules

1, 10G Ethernet interface functional verification

A fixed 64-bit frame is written at the transmit side of 10G Ethernet interface 1, and the interface converts it into a differential output signal. The differential output of interface 1 is looped into the receive side of interface 2, and the parallel data recovered at the receiver is compared with the source data. The simulation results are shown below:

Figure 27 10G Ethernet interface simulation results

After core_ready goes high, data is written at the transmit side through the pkt_tx_* interface. The differential pins of interface 2 are connected to those of interface 1, and the Ethernet frames recovered by interface 2 are monitored on pkt_rx_*.

2, Aurora64B66B interface functional verification

A fixed 64-bit frame is written at the transmit side of the Aurora64B66B interface, and the interface converts it into a differential output signal. The differential output of interface 1 is looped into the receive side of interface 2, and the parallel data recovered at the receiver is compared with the source data. The simulation results are shown below:

Figure 28 Aurora64B66B interface simulation results

Sixth, board-level verification

1, the verification environment

Figure 31 Board-level verification environment

The experiment connects the lab's Testcenter to the self-developed switch board (device xc7vx690tffg1761-2), as shown in Figure 32. The switch board has six GTH optical ports, four of which are used for testing in this design. Counting from the left, ports 1 and 4 are 10G Ethernet interfaces connected to Testcenter by optical fiber; ports 2 and 3 are Aurora64B66B interfaces connected in an external loop by optical fiber.

2, test traffic configuration

When the traffic stream is configured in the Testcenter software, the Ethernet frame payload is set up with the custom header added, so that the function of this design can be verified at a glance, as shown in Figure 33:

Figure 33 Testcenter Ethernet frame payload configuration

3, verification results

Figure 34 12-channel parallel data

Figure 35 Testcenter transmit and receive traffic statistics comparison

Figure 34 shows part of the signals captured with the Xilinx ILA. It shows that the design successfully extracts the payload field of the Ethernet frame and parses it into 12 parallel channels. The first 60 bytes match the fields configured in section 6.2, so the design functions correctly. In Figure 35, Testcenter compares the statistics of received and transmitted frames, which shows that the design has no dropped or errored frames.

Seventh, appendix

The following offers another way to implement this design. As mentioned in the GT section, a Quad has only one QPLL, so this design has four GTH interfaces sharing one common logic block, whose QPLL clock signals must drive two 10G Ethernet interfaces and two Aurora64B66B interfaces. For beginners, sorting out GT clocking and QPLL usage is somewhat difficult. The easiest approach is to place the four interfaces on two Quads, so that every two GT interfaces share one QPLL. This allows direct use of the one-master, one-slave mode from the official Xilinx documentation, which keeps the code as simple as possible and greatly reduces the debugging effort. The chosen switch board has a standard FMC expansion port with plentiful GT resources. The figures below show the FMC expansion board connected to the switch board, with coaxial cables connecting the differential ports of the expansion board to form the external loop.

Figure 36 Front and back of the FMC expansion board

Figure 37 Coaxial cable

Figure 38 FMC expansion port of the switch board

Figure 39 External loop made with coaxial cable

Figure 40 Expansion board connected to the switch board

 

 

 

Reprinted Source: http://www.360doc.com/content/20/0325/21/69231615_901671847.shtml

Origin www.cnblogs.com/jason20/p/12570781.html