Obtaining the length of a PCIe BAR space


 

The focus of this article is "how to obtain the length of the BAR space". Before that, we need to lay some foundation.

 

Fundamentals:

The location of the base address registers (BARs) in the configuration space is shown in the figure below:

A Type 0 header has up to six BARs and a Type 1 header has up to two. An Endpoint can therefore expose up to six separate address spaces. In practice all six are rarely used; one to three BARs is typical.

The main thing to note is that if a device does not use all of its BARs, the hardware must hardwire each unused BAR to 0, which tells software that the BAR is not implemented. For a BAR that is in use, software can write only the high bits; some of the low bits are read-only. These read-only low bits determine the type of access the BAR supports and the size of the address space it requests.

 

"PCIe Architecture Guide" 2.3.2 (10) mentions that after a PCI device is reset, its BAR registers hold initialization information: I/O or memory space, 32-bit or 64-bit address decoding, prefetchable or non-prefetchable, the length of the BAR space, and so on. How is this information obtained?

This information is recorded in the attribute field, bits [3:0], of the BAR register. Software reads the BAR, and the lowest four bits give the type of access, as shown in the following figure. For a memory BAR, bit 0 is 0, bits [2:1] select 32-bit or 64-bit decoding, and bit 3 marks the space as prefetchable; for an I/O BAR, bit 0 is 1.
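The attribute field can be decoded with a few bit tests. The following is a minimal sketch; the struct bar_attr type and the bar_decode_attr name are made up for illustration and are not kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type for this sketch, not a kernel structure. */
struct bar_attr {
    bool is_io;         /* bit 0: 1 = I/O space, 0 = memory space */
    bool is_64bit;      /* memory BAR only: bits [2:1] == 10b */
    bool prefetchable;  /* memory BAR only: bit 3 */
};

/* Decode the low attribute bits of a raw 32-bit BAR value. */
static struct bar_attr bar_decode_attr(uint32_t bar)
{
    struct bar_attr a = { false, false, false };

    if (bar & 0x1u) {
        a.is_io = true;  /* I/O BARs have no type or prefetch bits */
        return a;
    }
    a.is_64bit = ((bar >> 1) & 0x3u) == 0x2u;
    a.prefetchable = (bar & 0x8u) != 0;
    return a;
}
```

For a BAR that reads 0xc, this reports a 64-bit prefetchable memory BAR.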

 

Through the BAR register [3:0] attribute field, you can obtain information such as the operation type of the BAR, then how to obtain the length of the BAR space?

As shown in figure (1), in an uninitialized BAR the low bits (11~4) are all 0 and the high bits (31~12) are undetermined. To probe the BAR, system software writes all 1s to it and reads it back; the lowest bit that reads back as 1 is the lowest writable bit of the BAR. Here the lowest writable bit is 12, so the address space the BAR requests is 4KB (2^12). If the lowest writable bit were 20, the BAR would request 1MB (2^20).
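The rule "lowest writable bit gives the size" can be written as a small function. This is only a sketch for a 32-bit memory BAR; bar32_mem_size is a hypothetical name, and its argument is assumed to be the value read back after writing all 1s.

```c
#include <stdint.h>

/*
 * readback: value read from a 32-bit memory BAR after writing 0xffffffff.
 * Returns the requested size in bytes, or 0 if the BAR is not implemented.
 */
static uint32_t bar32_mem_size(uint32_t readback)
{
    uint32_t addr_bits = readback & ~0xfu;  /* drop attribute bits [3:0] */

    if (addr_bits == 0)
        return 0;                           /* hardwired to 0: unused BAR */

    /* x & (~x + 1) keeps only the lowest set bit, which equals the size. */
    return addr_bits & (~addr_bits + 1u);
}
```

A readback of 0xfffff000 gives 0x1000 (4KB), and 0xfff00000 gives 0x100000 (1MB), matching the two cases described above.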

The following is an example of applying for a 64MB P-MMIO address space. Since a 64-bit address is used, two BARs are required. The details are shown in the figure below:
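For a 64-bit BAR the two readback dwords are combined before the lowest writable bit is located. The sketch below assumes both halves have already been probed with all 1s; bar64_mem_size is a hypothetical name.

```c
#include <stdint.h>

/*
 * lo_readback: readback of the lower BAR (contains attribute bits [3:0]).
 * hi_readback: readback of the next BAR, which holds address bits [63:32].
 * Returns the requested size in bytes, or 0 if no address bit is writable.
 */
static uint64_t bar64_mem_size(uint32_t lo_readback, uint32_t hi_readback)
{
    uint64_t bits = ((uint64_t)hi_readback << 32) | (lo_readback & ~0xfu);

    if (bits == 0)
        return 0;

    return bits & (~bits + 1);  /* lowest set bit = requested size */
}
```

With a lower readback of 0xfc00000c and an upper readback of 0xffffffff, the result is 0x04000000, that is, 64MB.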

Note: Software must probe and evaluate the BARs in order: BAR0 first, then BAR1, and so on up to BAR5. When software finds a BAR that the hardware has set to all 0s, it treats that BAR as unused.

Note: Neither PCI nor PCIe requires that the first BAR in use be BAR0. If the designer wishes, BAR4 can be the first BAR in use, with BAR0~BAR3 left unused.
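The two notes above can be sketched as a loop over the six BAR slots. The function name bar_walk and the idea of passing in an array of readback values are assumptions of this sketch; real code would probe each BAR through configuration accesses instead.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_BARS 6

/*
 * probe[i]: value read back from BAR i after writing all 1s.
 * used_index[]: filled with the index of each BAR that is in use.
 * Returns the number of BARs in use.
 */
static int bar_walk(const uint32_t probe[NUM_BARS], int used_index[NUM_BARS])
{
    int n = 0;

    for (int i = 0; i < NUM_BARS; i++) {
        uint32_t v = probe[i];
        bool is_mem64;

        if (v == 0)
            continue;  /* hardwired to 0: this BAR is not used */

        used_index[n++] = i;

        is_mem64 = !(v & 0x1u) && ((v >> 1) & 0x3u) == 0x2u;
        if (is_mem64)
            i++;       /* the next slot holds address bits [63:32] */
    }
    return n;
}
```

The loop visits the slots in order, skips slots that read back as 0, and consumes two slots for each 64-bit memory BAR, so a device whose first used BAR is BAR4 is handled the same way as one that starts at BAR0.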

 

Code analysis:

The focus of this article is "how to obtain the length of the BAR space", and the code that does this is the "__pci_read_base" function. First, here is how this function is reached during PCI bus enumeration.

Before a PCI Agent device can transfer data, system software must initialize its BAR0~5 registers. System software does this while traversing the PCI bus with a depth-first search (DFS), that is, it allocates address space for these devices in the PCI bus domain. Once these registers are initialized, the PCI devices can use PCI bus addresses for data transfer.

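pci_scan_child_bus  //enumerate the PCI bus tree and assign PCI bus numbers
    |——pci_scan_slot  //scan all devices on the current PCI bus and add them to the device list
        |——pci_scan_device  //read and write the configuration registers of the PCI device
            |——pci_setup_device  //determine the PCI device type
                |——pci_read_irq  //read Interrupt Pin and Interrupt Line and store them in the irq field
                |——pci_read_bases  //access the BAR space and ROM space of the PCI device
                    |——__pci_read_base  //initialize the resource parameter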

Next, we look at the __pci_read_base function on its own, covering only the part that obtains the length of the BAR space.

int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
                    struct resource *res, unsigned int pos)
{
    mask = type ? PCI_ROM_ADDRESS_MASK : ~0;
    ...
    pci_read_config_dword(dev, pos, &l);         // save the original BAR value
    pci_write_config_dword(dev, pos, l | mask);  // write all 1s
    pci_read_config_dword(dev, pos, &sz);        // read back to see which bits are writable
    pci_write_config_dword(dev, pos, l);         // restore the original value

    if (sz == 0xffffffff)  // if every bit of the BAR can be set, the BAR is not working properly
        sz = 0;
    ...
    if (res->flags & IORESOURCE_MEM_64) {
        pci_read_config_dword(dev, pos + 4, &l);
        pci_write_config_dword(dev, pos + 4, ~0);
        pci_read_config_dword(dev, pos + 4, &sz);
        pci_write_config_dword(dev, pos + 4, l);

        l64 |= ((u64)l << 32);
        sz64 |= ((u64)sz << 32);
        mask64 |= ((u64)~0 << 32);
    }
    sz64 = pci_size(l64, sz64, mask64);  // compute the BAR space length
    ...
    region.start = l64;
    region.end = l64 + sz64;
    ...
}

The pci_size function called above does the following:

static u64 pci_size(u64 base, u64 maxbase, u64 mask)
{
	u64 size = mask & maxbase;	/* Find the significant bits */
	if (!size)
		return 0;

	/* Get the lowest of them to find the decode size, and
	   from that the extent.  */
	size = (size & ~(size-1)) - 1;

	/* base == maxbase can be valid only if the BAR has
	   already been programmed with all 1s.  */
	if (base == maxbase && ((base | size) & mask) != mask)
		return 0;

	return size;
}

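The key step in the pci_size function is this expression: size = (size & ~(size-1)) - 1;

Here size starts out as sz64 & mask64, the value read back earlier. The term size & ~(size-1) isolates the lowest set bit, which is the decode size of the BAR; subtracting 1 turns it into the extent, that is, the length minus one.

The return value of pci_size is therefore the BAR space length minus one, which is why region.end can be computed as l64 + sz64.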
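To make the difference between the extent and the length concrete, the expression can be tried in user space. This is only a sketch: extent_from_probe and len_from_extent are hypothetical names, and the input is assumed to be nonzero, as pci_size guarantees before it evaluates the expression.

```c
#include <stdint.h>

/* Same expression as in pci_size(): returns the decode size minus one. */
static uint64_t extent_from_probe(uint64_t probe_bits)
{
    return (probe_bits & ~(probe_bits - 1)) - 1;
}

/* The BAR length in bytes is the extent plus one. */
static uint64_t len_from_extent(uint64_t extent)
{
    return extent + 1;
}
```

For probe bits of 0xffffffff_fc000000, the extent is 0x03ffffff and the length is 0x04000000 (64MB).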

 

What is the BAR space length used for?

After the size value is obtained, it is used to initialize the start and end fields of pci_dev->resource.

The pci_resource_len function returns the length of the BAR space.

Taking the BAR start address from pci_resource_start and adding the length from pci_resource_len gives the valid address range of the current BAR.
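The relationship between start, end and length can be sketched in user space as follows. The struct bar_range type is a hypothetical stand-in for the start/end pair of struct resource, and treating an all-zero range as empty is an assumption of this sketch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the start/end fields of struct resource. */
struct bar_range {
    uint64_t start;
    uint64_t end;  /* inclusive: the last valid address */
};

/* base: assigned BAR address; extent: value returned by pci_size(). */
static struct bar_range bar_range_make(uint64_t base, uint64_t extent)
{
    struct bar_range r = { base, base + extent };
    return r;
}

/* Length in bytes; an all-zero range is treated as unused (assumption). */
static uint64_t bar_range_len(const struct bar_range *r)
{
    if (r->start == 0 && r->end == 0)
        return 0;
    return r->end - r->start + 1;
}
```

For a base of 0x80000000 and an extent of 0x03ffffff, the range ends at 0x83ffffff and the length is 0x04000000.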

 

Case Analysis:

Take Hi3536 as an example to see how the BAR space size can be configured.

Generally speaking, the size a BAR requests is fixed by the EP device and cannot be changed from the RC side (unless the EP supports the Resizable BAR capability and software supports it as well).

On the Hi3536, each PCIe BAR has a flexible BAR_MASK register (at PCIE CFG base address + 0x1000 + 0x10 + N*4). Together with the attribute field in bits [3:0] of the BAR register (prefetchable or not, 32-bit or 64-bit address, I/O or memory), it lets the EP extend a BAR to 64 bits and adjust the size the BAR requests.

bar mask[0] enables the current bar;

bar mask[31:1] is the mask size.

Note: When bar1/3/5 serves as the upper half of a 64-bit bar paired with bar0/2/4, it must not be enabled.
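Going by the bit layout described above, the mask for a power-of-two size is simply the size minus one, which also sets bit 0 (enable). The helper below is a sketch under that assumption only; bar_mask_for_size is a hypothetical name, not a HiSilicon SDK function, and the 16-byte lower bound is an assumption of this sketch.

```c
#include <stdint.h>

/*
 * Returns the BAR_MASK value for a BAR of size_bytes, or 0 (BAR disabled)
 * if the size is not a power of two or is smaller than 16 bytes.
 */
static uint32_t bar_mask_for_size(uint32_t size_bytes)
{
    if (size_bytes < 16u || (size_bytes & (size_bytes - 1u)) != 0u)
        return 0u;

    return size_bytes - 1u;  /* bit 0 = enable, upper bits = size mask */
}
```

For 64MB this yields 0x03ffffff, and for 1MB it yields 0x000fffff.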

 

For example, to turn bar0 into a 64-bit prefetchable memory BAR with a 64MB address space, bar1 is needed as well. The procedure is as follows:

Step 1: EP local software sets the bar0 mask to 0x3FF_FFFF. Mask bit [0] is 1, so bar0 is enabled, and mask bits [25:1] are all 1s, so bar0 bits [25:4] cannot be written and bar0 bits [31:26] can be modified by the HOST. (That is, the lowest writable bit is 26, so the address space the BAR requests is 64MB (2^26).)

Step 2: EP local software sets the bar1 mask to 0x0. bar1 is then disabled and its mask bits [31:1] are all 0s, so every address bit of bar1 can be modified by the HOST.

Step 3: EP local software sets bar0 to 0xC, which requests a 64-bit prefetchable memory address.

Step 4: The HOST scans the EP.

----End
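The effect of steps 1 and 2 on what the HOST sees can be modeled in a few lines. This is a simplified model based only on the description above, not on HiSilicon documentation; struct bar_model and both function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of one BAR dword as configured by EP software. */
struct bar_model {
    uint32_t mask;   /* BAR_MASK value: bits set here are not writable */
    int upper_half;  /* nonzero if this dword holds address bits [63:32] */
};

/* Address bits the host reads back after writing all 1s to the BAR. */
static uint32_t bar_model_probe(const struct bar_model *b)
{
    uint32_t writable = ~b->mask;

    if (!b->upper_half)
        writable &= ~0xfu;  /* bits [3:0] are attributes, not address */
    return writable;
}

/* Size requested by a 64-bit BAR built from a lower and an upper dword. */
static uint64_t bar_pair_size(const struct bar_model *lo,
                              const struct bar_model *hi)
{
    uint64_t bits = ((uint64_t)bar_model_probe(hi) << 32) |
                    bar_model_probe(lo);

    if (bits == 0)
        return 0;
    return bits & (~bits + 1);  /* lowest writable bit = size */
}
```

With a bar0 mask of 0x03ffffff and a bar1 mask of 0x0, the model gives writable address bits 0xfc000000 and 0xffffffff, and a requested size of 64MB.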

The PCIe config code under Hi3536 uboot is as follows:

int pcie_conf(void)
{
    ...
    /* memory space enable */
    __raw_writel(0x2, HISI3536_PCIE_CONFIG_BASE + CFG_COMMAND_REG);

    /* BAR attributes: 0xc = 64-bit prefetchable memory; BAR1/BAR3 are the upper halves */
    __raw_writel(0xc, HISI3536_PCIE_CONFIG_BASE + CFG_BAR0_REG);
    __raw_writel(0x0, HISI3536_PCIE_CONFIG_BASE + CFG_BAR1_REG);
    __raw_writel(0xc, HISI3536_PCIE_CONFIG_BASE + CFG_BAR2_REG);
    __raw_writel(0x0, HISI3536_PCIE_CONFIG_BASE + CFG_BAR3_REG);
    __raw_writel(0x0, HISI3536_PCIE_CONFIG_BASE + CFG_BAR4_REG);
    __raw_writel(0x0, HISI3536_PCIE_CONFIG_BASE + CFG_BAR5_REG);

    /* BAR_MASK registers: 0x03ffffff = enabled, 64MB; 0x0 = disabled */
    __raw_writel(0x03ffffff, HISI3536_PCIE_CONFIG_BASE + 0x1000 + 0x10 + 4 * 0);
    __raw_writel(0x0       , HISI3536_PCIE_CONFIG_BASE + 0x1000 + 0x10 + 4 * 1);
    __raw_writel(0x03ffffff, HISI3536_PCIE_CONFIG_BASE + 0x1000 + 0x10 + 4 * 2);
    __raw_writel(0x0       , HISI3536_PCIE_CONFIG_BASE + 0x1000 + 0x10 + 4 * 3);
    __raw_writel(0x0       , HISI3536_PCIE_CONFIG_BASE + 0x1000 + 0x10 + 4 * 4);
    __raw_writel(0x0       , HISI3536_PCIE_CONFIG_BASE + 0x1000 + 0x10 + 4 * 5);
}

Code analysis:

Hi3536 uses two 64-bit BAR addresses, so it is necessary to combine BAR0 and BAR1, BAR2 and BAR3.

The configuration follows the example above.

BAR4 and BAR5 are not used, so the BAR register and BAR_MASK register are set to zero.

Ps: When the software detects those BARs that are set to all 0s by the hardware, it considers that this BAR is not used.

 

Can the example be traced through the "__pci_read_base" function?

Let's take BAR0 and BAR1 of the Hi3536 as the example and trace them through the function.

BAR0 register value: 0x0000000c, BAR0_MASK: 0x03ffffff

BAR1 register value: 0x00000000, BAR1_MASK: 0x00000000

int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
                    struct resource *res, unsigned int pos)
{
    // enum pci_bar_type {
    //     pci_bar_unknown,  /* Standard PCI BAR probe */
    //     pci_bar_io,       /* An io port BAR */
    //     pci_bar_mem32,    /* A 32-bit memory BAR */
    //     pci_bar_mem64,    /* A 64-bit memory BAR */
    // };
    mask = type ? PCI_ROM_ADDRESS_MASK : ~0;  // type is pci_bar_unknown (0) for a standard BAR probe, so mask = 0xffffffff
    ...
    pci_read_config_dword(dev, pos, &l);         // read l = 0xc
    pci_write_config_dword(dev, pos, l | mask);  // write 0xffffffff
    pci_read_config_dword(dev, pos, &sz);        // read sz = 0xfc00000f
    pci_write_config_dword(dev, pos, l);         // write 0xc, restoring the original value

    if (sz == 0xffffffff)  // if every bit of the BAR can be set, the BAR is not working properly
        sz = 0;

    if (type == pci_bar_unknown) {
        ...  // the flags are decoded from l; here it is a 64-bit memory BAR
        l64 = l & PCI_BASE_ADDRESS_MEM_MASK;      // PCI_BASE_ADDRESS_MEM_MASK = (~0x0fUL), l64 = 0x0
        sz64 = sz & PCI_BASE_ADDRESS_MEM_MASK;    // sz64 = 0xfc000000
        mask64 = (u32)PCI_BASE_ADDRESS_MEM_MASK;  // mask64 = 0xfffffff0
    } else {
        ...  // not taken for a standard BAR probe
    }

    ...
    if (res->flags & IORESOURCE_MEM_64) {
        pci_read_config_dword(dev, pos + 4, &l);   // read l = 0x0
        pci_write_config_dword(dev, pos + 4, ~0);  // write 0xffffffff
        pci_read_config_dword(dev, pos + 4, &sz);  // read sz = 0xffffffff
        pci_write_config_dword(dev, pos + 4, l);   // write 0x0 back

        l64 |= ((u64)l << 32);      // l64 = 0x0
        sz64 |= ((u64)sz << 32);    // sz64 = 0xffffffff_fc000000
        mask64 |= ((u64)~0 << 32);  // mask64 = 0xffffffff_fffffff0
    }
    // l64 = 0x0
    // sz64 = 0xffffffff_fc000000
    // mask64 = 0xffffffff_fffffff0
    sz64 = pci_size(l64, sz64, mask64);  // compute the BAR space length
    ...
}

Traced through the pci_size function:

static u64 pci_size(u64 base, u64 maxbase, u64 mask)
{
	u64 size = mask & maxbase;  //size = 0xffffffff_fc000000
	if (!size)
		return 0;

	/* Get the lowest of them to find the decode size, and
	   from that the extent.  */
	size = (size & ~(size-1)) - 1;
     //(size-1) = 0xffffffff_fbffffff
     //~(size-1) = 0x00000000_04000000
     //(size & ~(size-1)) = 0xffffffff_fc000000 & 0x00000000_04000000 = 0x00000000_04000000
     //size = (size & ~(size-1)) - 1 = 0x00000000_03ffffff
     //result: size = 0x00000000_03ffffff
     //BAR space length: 64MB
    
	/* base == maxbase can be valid only if the BAR has
	   already been programmed with all 1s.  */
	if (base == maxbase && ((base | size) & mask) != mask)
		return 0;

	return size;
}

The returned size is 0x03ffffff (the length minus one), so the BAR space length is 64MB.

 

After the code analysis, try it directly on the device:

Hi3536 corresponds to the base address of PCIe configuration space: 0x1f000000

hisilicon # md 0x1f000000
1f000000: 353619e5 00100146 04800001 00000000    ..65F...........
1f000010: 0000000c 00000000 0000000c 00000000    ................
1f000020: 00000000 00000000 00000000 00020000    ................
1f000030: 00000000 00000040 00000000 000001ff    ....@...........
1f000040: 5fc35001 00000008 00000000 00000000    .P._............

First write all 1s to the BAR0 and BAR1 registers:

hisilicon # mw 0x1f000010 0xffffffff
hisilicon # mw 0x1f000014 0xffffffff
hisilicon # md 0x1f000000
1f000000: 353619e5 00100146 04800001 00000000    ..65F...........
1f000010: fc00000f ffffff0f 0000000c 00000000    ................
1f000020: 00000000 00000000 00000000 00020000    ................
1f000030: 00000000 00000040 00000000 000001ff    ....@...........
1f000040: 5fc35001 00000008 00000000 00000000    .P._............

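That is, the raw readback is 0xffffff0f_fc00000f. After PCI_BASE_ADDRESS_MEM_MASK clears the low four attribute bits, sz64 = 0xffffff0f_fc000000 and mask64 = 0xffffffff_fffffff0.

Traced through the pci_size function: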

	size = (size & ~(size-1)) - 1;
     //(size-1) = 0xffffff0f_fbffffff
     //~(size-1) = 0x000000f0_04000000
     //(size & ~(size-1)) = 0xffffff0f_fc000000 & 0x000000f0_04000000 = 0x00000000_04000000
     //size = (size & ~(size-1)) - 1 = 0x00000000_03ffffff
     //result: size = 0x00000000_03ffffff
     //BAR space length: 64MB

The returned sz64 value is 0x00000000_03ffffff, so the BAR space size is 64MB.

Ps: It is not clear why BAR1 reads back 0xffffff0f, slightly off from the expected all 1s, after all 1s are written, but the result is as expected.

For the specific reason, you may need to ask a HiSilicon FAE.

 

 

 

 

 

 

 

 


Origin blog.csdn.net/Ivan804638781/article/details/104882479