pci resource allocation failed to assign [io size 0x1000]

First I would like to introduce ACPI PCI briefly, then dig into the details of resource
allocation in the initial PCI phases, although the two topics are unrelated.

ACPI PCI device enumeration:

As we know, ACPI namespace enumeration and PCI enumeration are performed separately.
Because a PCI device usually has more information, probed from the low-level hardware,
the ACPI device (aka firmware device) may want to acquire information from the PCI device
(aka physical device). There has to be a link between the ACPI device and the PCI device;
this link is called glue.

PCI device path:

    chenyu@chenyu-Surface-Pro-3:/$ ls -l /sys/devices/pci0000\:00/0000\:00\:02.0/firmware_node
    lrwxrwxrwx 1 root root 0 Mar 28 12:44 /sys/devices/pci0000:00/0000:00:02.0/firmware_node -> ../../LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00

ACPI device path:

    chenyu@chenyu-Surface-Pro-3:/$ ls -l /sys/devices/LNXSYSTM\:00/LNXSYBUS\:00/PNP0A08\:00/LNXVIDEO\:00/physical_node
    lrwxrwxrwx 1 root root 0 Mar 28 15:32 /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/physical_node -> ../../../../pci0000:00/0000:00:02.0

OK, let's come to today's topic: how the resources are allocated.

Why am I concerned with resource allocation? Because there is a poweroff bug on
Mac Pro 11 which hangs the system immediately after the user types 'poweroff'.

https://bugzilla.kernel.org/attachment.cgi?id=208961

https://patchwork.kernel.org/patch/9143637/

https://bugzilla.kernel.org/show_bug.cgi?id=103211

Here's the answer from Yinghai:

 
    On Thu, Apr 7, 2016 at 8:55 PM, Chen, Yu
    > Currently someone on Bugzilla reported that
    > he can not poweroff(S5) nor suspend to S3 after boot up, so I did some
    > test on his machine that, it looks like after
    > pcibios_assign_resources, we can not access ACPI PM sleep
    > register(PM_SLP) caused the problem, that is, once outw(0x1804), the
    > system hangs, however before pcibios_assign_resources, everything is
    > OK. So I checked the boot log, it seems that there are many resource allocation failure during this phase, such as:
    >
    > [ 0.865437] pci 0000:06:06.0: BAR 13: no space for [io size 0x1000]
    > [ 0.865439] pci 0000:06:06.0: BAR 13: failed to assign [io size 0x1000]
    >
    > I don't know if this is related to above issue, and would the pci
    > device iospace be reset/invalid if above failure comes out. Could you
    > give me some advices on why this would probably happened? Thank you.
    >
    > Related Bugzilla thread:
    > https://bugzilla.kernel.org/show_bug.cgi?id=103211

    That allocation failed, as parent bridge only have [0x4000,0x5fff].

    and /proc/ioports does not report any overlapping...

    0d00-ffff : PCI Bus 0000:00
      1800-187f : pnp 00:01
        1800-1803 : ACPI PM1a_EVT_BLK
        1804-1805 : ACPI PM1a_CNT_BLK
        1808-180b : ACPI PM_TMR
        1820-182f : ACPI GPE0_BLK
        1830-1833 : iTCO_wdt.0.auto
        1850-1850 : ACPI PM2_CNT_BLK
        1860-187f : iTCO_wdt.0.auto
      2000-2fff : PCI Bus 0000:02
      3000-303f : 0000:00:02.0
      4000-6fff : PCI Bus 0000:05
        4000-5fff : PCI Bus 0000:06
          4000-4fff : PCI Bus 0000:08
            4000-4fff : PCI Bus 0000:09
              4000-4fff : PCI Bus 0000:0a
          5000-5fff : PCI Bus 0000:3a
      efa0-efbf : 0000:00:1f.3
      ffff-ffff : pnp 00:01

    allocation does assign io port to 00:1c.0

    [ 0.273741] pci 0000:00:1c.0: BAR 8: assigned [mem 0x7fa00000-0x7fbfffff]
    [ 0.273751] pci 0000:00:1c.0: BAR 9: assigned [mem 0x7fc00000-0x7fdfffff 64bit pref]
    [ 0.273756] pci 0000:00:1c.0: BAR 7: assigned [io 0x2000-0x2fff]

    so can you use setpci to clear that before access 0x1804 ?

Before starting, I'd like to summarise the overall process of PCI resource allocation.
First of all, PCI devices must declare the resources they need in order to work; these
include I/O resources and memory resources. Both are filled in by the BIOS when it probes
the PCI devices and stores the requirements in the devices' registers. Later, Linux checks
each of these requirements: if there is no conflict, it allocates the region and writes it
to the device's registers; if there is a conflict, it adjusts the region and then writes it.
In the end all these resources are maintained in ioport_resource and iomem_resource. These
resources are accessed not only by PCI devices but possibly also by other components such
as ACPI.

There are 3 phases here:

1. Scan the PCI tree and probe each PCI device's configuration into its pci_dev.resource
array (acpi_init->acpi_scan_init->pci_scan_child_bus->pci_scan_slot). The start address of
each resource region is read from the device's config-space base address registers [0~5],
and the size from the BAR's initial register value (which may be incorrect). Each of these
6 regions may be either mem or io (prefetchable or not).

2. Sort each pci_dev's resources into a resource tree with request_resource, stored in
each pci_bus.resource (pci_subsys_init->pcibios_init->pcibios_resource_survey).

3. Finally, assign resources according to step 2 and rewrite the pci_dev's BAR registers
to claim its resources (pcibios_assign_resources).
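
Phase 1 has to classify each base address register as I/O or memory, prefetchable or not. The sketch below decodes a raw BAR value following the standard PCI BAR bit layout. The names `bar_info` and `decode_bar` are hypothetical and are not the kernel's; the raw values in the note after the block are assumed for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type, for illustration only (not a kernel structure). */
struct bar_info {
    bool     is_io;        /* bit 0 set: I/O space, clear: memory space */
    bool     is_64bit;     /* memory BAR whose type field (bits 2:1) is 2 */
    bool     prefetchable; /* memory BAR with bit 3 set */
    uint32_t base;         /* address bits with the flag bits masked off */
};

/* Decode the low 32 bits of a BAR as laid out by the PCI specification. */
struct bar_info decode_bar(uint32_t raw)
{
    struct bar_info info = { false, false, false, 0 };

    if (raw & 0x1) {                 /* I/O BAR: bits 1:0 are flags */
        info.is_io = true;
        info.base  = raw & ~0x3u;
    } else {                         /* memory BAR: bits 3:0 are flags */
        info.is_64bit     = ((raw >> 1) & 0x3) == 0x2;
        info.prefetchable = (raw & 0x8) != 0;
        info.base         = raw & ~0xfu;
    }
    return info;
}
```

With an assumed raw value of 0x3001, `decode_bar` reports an I/O region based at 0x3000; with 0x8000000c it reports a 64-bit prefetchable memory region based at 0x80000000.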

Here is a picture of the whole PCI structure:

(Copied from http://www.tldp.org/LDP/tlk/dd/pci.html)

Next, let's look at the PCI allocation logs printed during boot-up, most of which come
from pcibios_assign_resources:

    /* pcibios_assign_resources : pci_assign_unassigned_root_bus_resources */
    void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
    {
        __pci_bus_size_bridges(bus, add_list);

        /* Depth last, allocate resources and update the hardware. */
        __pci_bus_assign_resources(bus, add_list, &fail_head);
    }

__pci_bus_size_bridges adjusts the size of the conflicting resources,
while __pci_bus_assign_resources modifies the PCI config for the conflicting resources.

    /* pci_assign_unassigned_root_bus_resources : __pci_bus_size_bridges : pbus_size_io */
    static void pbus_size_io(struct pci_bus *bus, resource_size_t min_size,
                             resource_size_t add_size, struct list_head *realloc_head)
    {
        struct resource *b_res = find_free_bus_resource(bus, IORESOURCE_IO,
                                                        IORESOURCE_IO);
        if (!b_res)
            return;
        /* so this function only concerns the I/O ports which have a conflict */
        size0 = calculate_iosize(size, min_size, size1,
                                 resource_size(b_res), min_align);
        size1 = calculate_iosize(size, min_size, 1000 + size1,
                                 resource_size(b_res), min_align);

        b_res->start = min_align;
        b_res->end = b_res->start + size0 - 1;
        b_res->flags |= IORESOURCE_STARTALIGN;

        if (size1 > size0 && realloc_head) {
            dev_printk(KERN_DEBUG, &bus->self->dev, "bridge window %pR to %pR add_size %llx\n",
                       b_res, &bus->busn_res,
                       (unsigned long long)size1-size0);
        }
    }

We can infer from the message that "bridge window" means there is a conflict for
this resource, and the kernel tries to find a hole in which to insert this region (the
start address does not matter here, because at this stage only the size of the region counts):

 
    [ 0.865300] pci 0000:06:00.0: bridge window [io 0x1000-0x0fff] to [bus 07] add_size 1000
    [ 0.865302] pci 0000:06:00.0: bridge window [mem 0x00100000-0x000fffff 64bit pref] to [bus 07] add_size 200000 add_align 100000
    [ 0.865312] pci 0000:06:04.0: bridge window [io 0x1000-0x0fff] to [bus 39] add_size 1000
    [ 0.865313] pci 0000:06:04.0: bridge window [mem 0x00100000-0x000fffff 64bit pref] to [bus 39] add_size 200000 add_align 100000
    [ 0.865314] pci 0000:06:04.0: bridge window [mem 0x00100000-0x000fffff] to [bus 39] add_size 200000 add_align 100000
    [ 0.865324] pci 0000:06:06.0: bridge window [io 0x1000-0x0fff] to [bus 6b] add_size 1000
    [ 0.865325] pci 0000:06:06.0: bridge window [mem 0x00100000-0x000fffff 64bit pref] to [bus 6b] add_size 200000 add_align 100000
    [ 0.865326] pci 0000:06:06.0: bridge window [mem 0x00100000-0x000fffff] to [bus 6b] add_size 200000 add_align 100000
    [ 0.865342] pci 0000:00:1c.0: bridge window [io 0x1000-0x0fff] to [bus 02] add_size 1000
    [ 0.865343] pci 0000:00:1c.0: bridge window [mem 0x00100000-0x000fffff 64bit pref] to [bus 02] add_size 200000 add_align 100000
    [ 0.865344] pci 0000:00:1c.0: bridge window [mem 0x00100000-0x000fffff] to [bus 02] add_size 200000 add_align 100000


    [ 0.865356] pci 0000:00:1c.0: res[14]=[mem 0x00100000-0x000fffff] res_to_dev_res add_size 200000 min_align 100000
    [ 0.865357] pci 0000:00:1c.0: res[14]=[mem 0x00100000-0x002fffff] res_to_dev_res add_size 200000 min_align 100000
    [ 0.865358] pci 0000:00:1c.0: res[15]=[mem 0x00100000-0x000fffff 64bit pref] res_to_dev_res add_size 200000 min_align 100000
    [ 0.865359] pci 0000:00:1c.0: res[15]=[mem 0x00100000-0x002fffff 64bit pref] res_to_dev_res add_size 200000 min_align 100000
    [ 0.865360] pci 0000:00:1c.0: res[13]=[io 0x1000-0x0fff] res_to_dev_res add_size 1000 min_align 1000
    [ 0.865361] pci 0000:00:1c.0: res[13]=[io 0x1000-0x1fff] res_to_dev_res add_size 1000 min_align 1000
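
The odd-looking ranges such as [io 0x1000-0x0fff] are empty windows: the end is one less than the start, so the size is zero. The sketch below uses a simplified stand-in for struct resource (`toy_resource` and `toy_grow` are hypothetical names) to show how adding an add_size of 0x1000 turns such a window into [io 0x1000-0x1fff], as in the res_to_dev_res lines.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for struct resource: inclusive start and end. */
struct toy_resource { uint64_t start; uint64_t end; };

/* Same arithmetic as the kernel's resource_size(): end - start + 1. */
uint64_t toy_resource_size(const struct toy_resource *r)
{
    return r->end - r->start + 1;
}

/* Grow the resource by add_size bytes, keeping the start address. */
void toy_grow(struct toy_resource *r, uint64_t add_size)
{
    r->end += add_size;
}
```

Starting from `{ 0x1000, 0x0fff }`, the size is 0; after `toy_grow(&r, 0x1000)` the range is 0x1000-0x1fff and the size is 0x1000.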


Here is what the actual allocation does:

    /* pci_assign_unassigned_root_bus_resources : __pci_bus_assign_resources */
    void __pci_bus_assign_resources(const struct pci_bus *bus,
                                    struct list_head *realloc_head,
                                    struct list_head *fail_head)
    {
        pbus_assign_resources_sorted(bus, realloc_head, fail_head);
        list_for_each_entry(dev, &bus->devices, bus_list) {
            b = dev->subordinate;
            if (!b)
                continue;

            __pci_bus_assign_resources(b, realloc_head, fail_head); /* recursive */
            switch (dev->class >> 8) {
            case PCI_CLASS_BRIDGE_PCI:
                pci_setup_bridge(b);
            }
        }
    }

From the logic above we know that __pci_bus_assign_resources works recursively:
whenever a bus is encountered, it first allocates resources for that bus, then goes down
to the next level to deal with its children. The actual allocation is done by
pbus_assign_resources_sorted:

 
    /*
     * pci_assign_unassigned_root_bus_resources : __pci_bus_assign_resources :
     * pbus_assign_resources_sorted
     */
    static void pbus_assign_resources_sorted(const struct pci_bus *bus,
                                             struct list_head *realloc_head,
                                             struct list_head *fail_head)
    {
        struct pci_dev *dev;
        LIST_HEAD(head);

        list_for_each_entry(dev, &bus->devices, bus_list)
            __dev_sort_resources(dev, &head);

        __assign_resources_sorted(&head, realloc_head, fail_head);
    }


Before we proceed, I want to emphasize that __dev_sort_resources only considers the
conflicting resources (those not yet inserted into the resource tree, i.e. without a
parent), and from them it prepares the head list for __assign_resources_sorted:

 
    /*
     * pci_assign_unassigned_root_bus_resources : __pci_bus_assign_resources :
     * pbus_assign_resources_sorted : __dev_sort_resources : pdev_sort_resources
     */
    static void pdev_sort_resources(struct pci_dev *dev, struct list_head *head)
    {
        for (i = 0; i < PCI_NUM_RESOURCES; i++) {
            r = &dev->resource[i];
            /* only conflicting resources, i.e. those without a parent, are of interest */
            if (!(r->flags) || r->parent)
                continue;

            tmp = kzalloc(sizeof(*tmp), GFP_KERNEL);
            tmp->res = r;
            tmp->dev = dev;
            /* insert tmp at the proper position; head is the list to insert into */
            n = head;

            r_align = pci_resource_alignment(dev, r);
            /* walk the head list and find the first entry with a smaller alignment */
            list_for_each_entry(dev_res, head, list) {
                resource_size_t align;

                align = pci_resource_alignment(dev_res->dev,
                                               dev_res->res);

                if (r_align > align) {
                    n = &dev_res->list;
                    break;
                }
            }
            /* Insert it just before n */
            list_add_tail(&tmp->list, n);
        }
    }
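
The insertion loop keeps the list ordered by decreasing alignment: a new entry goes in front of the first existing entry whose alignment is smaller, and entries with equal alignment keep their arrival order. Below is a minimal array-based sketch of that ordering rule; `toy_req` and `toy_insert_sorted` are hypothetical names, and a plain array stands in for the kernel's list_head.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request record: resource index and its alignment. */
struct toy_req { int id; uint64_t align; };

/*
 * Insert req into reqs[0..n-1], which is sorted by decreasing alignment.
 * The caller must provide room for n + 1 entries. Returns the new count.
 */
size_t toy_insert_sorted(struct toy_req *reqs, size_t n, struct toy_req req)
{
    size_t pos = n;  /* default: append at the tail */

    for (size_t i = 0; i < n; i++) {
        if (req.align > reqs[i].align) {  /* same test as r_align > align */
            pos = i;
            break;
        }
    }
    for (size_t i = n; i > pos; i--)      /* shift the tail right by one */
        reqs[i] = reqs[i - 1];
    reqs[pos] = req;
    return n + 1;
}
```

Feeding it resource 13 (io, alignment 0x1000) followed by 14 and 15 (mem, alignment 0x100000) yields the order 14, 15, 13, which is the order in which the "BAR ...: assigned" messages for 0000:00:1c.0 appear in the boot log.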


Then we can deal with this resource list:

    /*
     * pci_assign_unassigned_root_bus_resources : __pci_bus_assign_resources :
     * pbus_assign_resources_sorted : __assign_resources_sorted :
     * assign_requested_resources_sorted : pci_assign_resource
     */
    int pci_assign_resource(struct pci_dev *dev, int resno)
    {
        find_resource(root, new, size, &constraint);
        ret = __request_resource(root, new);

        /* failed? */
        if (ret < 0) {
            dev_info(&dev->dev, "BAR %d: no space for %pR\n", resno, res);
            ret = pci_revert_fw_address(res, dev, resno, size);
        }

        if (ret < 0) {
            dev_info(&dev->dev, "BAR %d: failed to assign %pR\n", resno,
                     res);
            return ret;
        }
        /* succeeded */
        dev_info(&dev->dev, "BAR %d: assigned %pR\n", resno, res);
        pci_update_resource(dev, resno);
        return 0;
    }

find_resource looks for a 'hole' of the requested size in the resource tree under the root
bus, then __request_resource tries to insert this region into the resource tree by:

1. finding a parent resource that contains the new resource

2. inserting the resource into the parent's list of child resources (children do not overlap each other)

So if the resource is successfully allocated, the kernel prints something like
"BAR 7: assigned" and the pci_dev that owns the resource updates its
PCI config registers. Otherwise it warns "BAR 7: no space for" and returns
without doing anything:

 
    [ 0.865365] pci 0000:00:1c.0: BAR 14: assigned [mem 0x7fa00000-0x7fbfffff]
    [ 0.865372] pci 0000:00:1c.0: BAR 15: assigned [mem 0x7fc00000-0x7fdfffff 64bit pref]
    [ 0.865375] pci 0000:00:1c.0: BAR 13: assigned [io 0x2000-0x2fff]

or

    [ 0.865429] pci 0000:06:00.0: BAR 13: no space for [io size 0x1000]
    [ 0.865431] pci 0000:06:00.0: BAR 13: failed to assign [io size 0x1000]


BTW, here is how a PCI device updates its PCI config registers:

    /*
     * pci_assign_unassigned_root_bus_resources : __pci_bus_assign_resources :
     * pbus_assign_resources_sorted : __assign_resources_sorted :
     * assign_requested_resources_sorted : pci_assign_resource : pci_update_resource
     */
    void pci_update_resource(struct pci_dev *dev, int resno)
    {
        new = region.start | (res->flags & PCI_REGION_FLAG_MASK);
        reg = pci_resource_bar(dev, resno, &type);
        pci_write_config_dword(dev, reg, new);
    }

This looks odd: a bridge must reach this point as well, as the log above already shows:

    [    0.865375] pci 0000:00:1c.0: BAR 13: assigned [io  0x2000-0x2fff]

So the resource number (resno) is 13, but judging from the resno checks in
pci_resource_bar, it only handles numbers smaller than PCI_BRIDGE_RESOURCES:

 
    int pci_resource_bar(struct pci_dev *dev, int resno, enum pci_bar_type *type)
    {
        int reg = 0; /* stays 0 for bridge windows: they have no BAR register */

        if (resno < PCI_ROM_RESOURCE) {
            return PCI_BASE_ADDRESS_0 + 4 * resno;
        } else if (resno == PCI_ROM_RESOURCE) {
            return dev->rom_base_reg;
        } else if (resno < PCI_BRIDGE_RESOURCES) {
            reg = pci_iov_resource_bar(dev, resno);
        }
        return reg;
    }

 
    enum {
        /* #0-5: standard PCI resources */
        PCI_STD_RESOURCES,
        PCI_STD_RESOURCE_END = 5,

        /* #6: expansion ROM resource */
        PCI_ROM_RESOURCE,

        /* device specific resources */
    #ifdef CONFIG_PCI_IOV
        PCI_IOV_RESOURCES,
        PCI_IOV_RESOURCE_END = PCI_IOV_RESOURCES + PCI_SRIOV_NUM_BARS - 1,
    #endif

        /* resources assigned to buses behind the bridge */
    #define PCI_BRIDGE_RESOURCE_NUM 4

        PCI_BRIDGE_RESOURCES,
        PCI_BRIDGE_RESOURCE_END = PCI_BRIDGE_RESOURCES +
                                  PCI_BRIDGE_RESOURCE_NUM - 1,

        /* total resources associated with a PCI device */
        PCI_NUM_RESOURCES,

        /* preserve this for compatibility */
        DEVICE_COUNT_RESOURCE = PCI_NUM_RESOURCES,
    };
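
The enum explains the resource numbers printed in the logs. As a rough check, the sketch below recomputes the index of the first bridge window with and without CONFIG_PCI_IOV, assuming PCI_SRIOV_NUM_BARS is 6. All `TOY_` names are hypothetical copies, introduced only so that both layouts can live in one file.

```c
#include <assert.h>

#define TOY_SRIOV_NUM_BARS 6   /* assumed value of PCI_SRIOV_NUM_BARS */

/* Layout with CONFIG_PCI_IOV enabled. */
enum toy_iov {
    TOY_IOV_STD_RESOURCES,
    TOY_IOV_STD_RESOURCE_END = 5,
    TOY_IOV_ROM_RESOURCE,                  /* 6 */
    TOY_IOV_IOV_RESOURCES,                 /* 7 */
    TOY_IOV_IOV_RESOURCE_END = TOY_IOV_IOV_RESOURCES + TOY_SRIOV_NUM_BARS - 1,
    TOY_IOV_BRIDGE_RESOURCES,              /* 13 */
};

/* Layout with CONFIG_PCI_IOV disabled. */
enum toy_noiov {
    TOY_NOIOV_STD_RESOURCES,
    TOY_NOIOV_STD_RESOURCE_END = 5,
    TOY_NOIOV_ROM_RESOURCE,                /* 6 */
    TOY_NOIOV_BRIDGE_RESOURCES,            /* 7 */
};

/* Index of bridge window n (0 = I/O, 1 = memory, 2 = prefetchable memory). */
int toy_bridge_resno(int iov_enabled, int n)
{
    return (iov_enabled ? (int)TOY_IOV_BRIDGE_RESOURCES
                        : (int)TOY_NOIOV_BRIDGE_RESOURCES) + n;
}
```

This would account for the boot log here showing BAR 13/14/15 for the io, mem and prefetchable mem windows, while the log quoted in the email shows BAR 7/8/9, presumably from a kernel built without CONFIG_PCI_IOV.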

So we come to a conclusion: the assign phase has no effect on the low-level PCI config
registers of PCI bridges; it only allocates regions for them. Indeed, in
pci_assign_resource we have:

    if (resno < PCI_BRIDGE_RESOURCES)
        pci_update_resource(dev, resno);

so only conflicting resources that are not bridge windows lead to a write to
PCI_BASE_ADDRESS_0 and the BAR registers that follow it. For ordinary PCI devices this
assign phase makes sense, since PCI_BASE_ADDRESS_0 onwards is where the standard PCI
resources live.

Here is the log again:

    [ 0.865365] pci 0000:00:1c.0: BAR 14: assigned [mem 0x7fa00000-0x7fbfffff]
    [ 0.865372] pci 0000:00:1c.0: BAR 15: assigned [mem 0x7fc00000-0x7fdfffff 64bit pref]
    [ 0.865375] pci 0000:00:1c.0: BAR 13: assigned [io 0x2000-0x2fff]


Then let's see how PCI bridges are modified:

    /*
     * pci_assign_unassigned_root_bus_resources : __pci_bus_assign_resources :
     * pci_setup_bridge : __pci_setup_bridge
     */
    /* __pci_setup_bridge : pci_setup_bridge_io */
    dev_info(&bridge->dev, "PCI bridge to %pR\n",
             &bus->busn_res);
    /* log: [ 0.865378] pci 0000:00:01.0: PCI bridge to [bus 01] */

    static void pci_setup_bridge_io(struct pci_dev *bridge)
    {
        dev_info(&bridge->dev, " bridge window %pR\n", res);
        pci_write_config_word(bridge, PCI_IO_BASE, l);
    }

So it seems that for a PCI bridge it is PCI_IO_BASE that finally gets changed.

So we finally come to a conclusion: a PCI bridge uses PCI_IO_BASE, which controls the
resources under this bridge, while an ordinary PCI device uses PCI_BASE_ADDRESS_0 to
declare which resource region it wants to occupy.

Besides, I'd also like to mention that pci_setup_bridge_io uses a trick when updating
PCI_IO_BASE. Since the base and limit registers are adjacent in config space:

    #define PCI_IO_BASE 0x1c /* I/O range behind the bridge */
    #define PCI_IO_LIMIT 0x1d

a single 16-bit pci_write_config_word to PCI_IO_BASE updates both the base and the
limit register.

As for what PCI_IO_BASE is: as mentioned in this function's comment, it is described in
the PCI-to-PCI Bridge Architecture Specification rev. 1.1 (1998), and the document from
TI is more precise:

http://www.ti.com.cn/general/cn/docs/lit/getliterature.tsp?genericPartNumber=pci2250&fileType=pdf

Let's take pci_setup_bridge_io as an example:

 
    static void pci_setup_bridge_io(struct pci_dev *bridge)
    {
        struct resource *res;
        struct pci_bus_region region;
        unsigned long io_mask;
        u8 io_base_lo, io_limit_lo;
        u16 l;
        u32 io_upper16;

        io_mask = PCI_IO_RANGE_MASK;
        if (bridge->io_window_1k)
            io_mask = PCI_IO_1K_RANGE_MASK;

        /* Set up the top and bottom of the PCI I/O segment for this bus. */
        res = &bridge->resource[PCI_BRIDGE_RESOURCES + 0];
        pcibios_resource_to_bus(bridge->bus, &region, res);
        if (res->flags & IORESOURCE_IO) {
            pci_read_config_word(bridge, PCI_IO_BASE, &l);
            io_base_lo = (region.start >> 8) & io_mask;
            io_limit_lo = (region.end >> 8) & io_mask;
            l = ((u16) io_limit_lo << 8) | io_base_lo;
            /* Set up upper 16 bits of I/O base/limit. */
            io_upper16 = (region.end & 0xffff0000) | (region.start >> 16);
            dev_info(&bridge->dev, " bridge window %pR\n", res);
        } else {
            /* Clear upper 16 bits of I/O base/limit. */
            io_upper16 = 0;
            l = 0x00f0;
        }
        /* Temporarily disable the I/O range before updating PCI_IO_BASE. */
        pci_write_config_dword(bridge, PCI_IO_BASE_UPPER16, 0x0000ffff);
        /* Update lower 16 bits of I/O base/limit. */
        pci_write_config_word(bridge, PCI_IO_BASE, l);
        /* Update upper 16 bits of I/O base/limit. */
        pci_write_config_dword(bridge, PCI_IO_BASE_UPPER16, io_upper16);
    }

Why do we shift right by 8 bits? Because the upper 4 bits of the 8-bit IO_BASE register
correspond to address bits 12-15. So we want ((addr >> 12) & 0xf) << 4, which is the same
as (addr >> 8) & 0xf0.
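
To make the shift concrete, here is a small sketch of the 16-bit value that ends up in PCI_IO_BASE/PCI_IO_LIMIT for a given window. It assumes the default 4K granularity (the io_window_1k case is ignored), and the function name `toy_io_base_limit` is hypothetical; the arithmetic mirrors the io_base_lo/io_limit_lo lines of pci_setup_bridge_io.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack the I/O window [start, end] into the 16-bit value written at
 * PCI_IO_BASE: low byte = base register, high byte = limit register.
 * Only address bits 15:12 are kept, in the upper nibble of each byte.
 */
uint16_t toy_io_base_limit(uint32_t start, uint32_t end)
{
    uint8_t io_base_lo  = (start >> 8) & 0xf0;
    uint8_t io_limit_lo = (end   >> 8) & 0xf0;

    return (uint16_t)(((uint16_t)io_limit_lo << 8) | io_base_lo);
}
```

For the window [io 0x4000-0x4fff] this gives 0x4040, and for [io 0x4000-0x5fff] it gives 0x5040: the low 12 address bits are not stored at all.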

Here are the logs in detail:

 
    [ 0.865381] pci 0000:00:01.0: bridge window [mem 0xa0b00000-0xa0bfffff]
    [ 0.865387] pci 0000:06:00.0: res[15]=[mem 0x00100000-0x000fffff 64bit pref] res_to_dev_res add_size 200000 min_align 100000
    [ 0.865388] pci 0000:06:00.0: res[15]=[mem 0x00100000-0x002fffff 64bit pref] res_to_dev_res add_size 200000 min_align 100000
    [ 0.865389] pci 0000:06:04.0: res[14]=[mem 0x00100000-0x000fffff] res_to_dev_res add_size 200000 min_align 100000
    [ 0.865389] pci 0000:06:04.0: res[14]=[mem 0x00100000-0x002fffff] res_to_dev_res add_size 200000 min_align 100000
    [ 0.865390] pci 0000:06:04.0: res[15]=[mem 0x00100000-0x000fffff 64bit pref] res_to_dev_res add_size 200000 min_align 100000
    [ 0.865391] pci 0000:06:04.0: res[15]=[mem 0x00100000-0x002fffff 64bit pref] res_to_dev_res add_size 200000 min_align 100000
    [ 0.865392] pci 0000:06:06.0: res[14]=[mem 0x00100000-0x000fffff] res_to_dev_res add_size 200000 min_align 100000
    [ 0.865393] pci 0000:06:06.0: res[14]=[mem 0x00100000-0x002fffff] res_to_dev_res add_size 200000 min_align 100000
    [ 0.865394] pci 0000:06:06.0: res[15]=[mem 0x00100000-0x000fffff 64bit pref] res_to_dev_res add_size 200000 min_align 100000
    [ 0.865395] pci 0000:06:06.0: res[15]=[mem 0x00100000-0x002fffff 64bit pref] res_to_dev_res add_size 200000 min_align 100000
    [ 0.865396] pci 0000:06:00.0: res[13]=[io 0x1000-0x0fff] res_to_dev_res add_size 1000 min_align 1000
    [ 0.865397] pci 0000:06:00.0: res[13]=[io 0x1000-0x1fff] res_to_dev_res add_size 1000 min_align 1000
    [ 0.865397] pci 0000:06:04.0: res[13]=[io 0x1000-0x0fff] res_to_dev_res add_size 1000 min_align 1000
    [ 0.865398] pci 0000:06:04.0: res[13]=[io 0x1000-0x1fff] res_to_dev_res add_size 1000 min_align 1000
    [ 0.865399] pci 0000:06:06.0: res[13]=[io 0x1000-0x0fff] res_to_dev_res add_size 1000 min_align 1000
    [ 0.865400] pci 0000:06:06.0: res[13]=[io 0x1000-0x1fff] res_to_dev_res add_size 1000 min_align 1000
    [ 0.865402] pci 0000:06:00.0: BAR 15: no space for [mem size 0x00200000 64bit pref]
    [ 0.865405] pci 0000:06:00.0: BAR 15: failed to assign [mem size 0x00200000 64bit pref]
    [ 0.865408] pci 0000:06:04.0: BAR 14: no space for [mem size 0x00200000]
    [ 0.865410] pci 0000:06:04.0: BAR 14: failed to assign [mem size 0x00200000]
    [ 0.865413] pci 0000:06:04.0: BAR 15: no space for [mem size 0x00200000 64bit pref]
    [ 0.865416] pci 0000:06:04.0: BAR 15: failed to assign [mem size 0x00200000 64bit pref]
    [ 0.865419] pci 0000:06:06.0: BAR 14: no space for [mem size 0x00200000]
    [ 0.865421] pci 0000:06:06.0: BAR 14: failed to assign [mem size 0x00200000]
    [ 0.865423] pci 0000:06:06.0: BAR 15: no space for [mem size 0x00200000 64bit pref]
    [ 0.865426] pci 0000:06:06.0: BAR 15: failed to assign [mem size 0x00200000 64bit pref]
    [ 0.865429] pci 0000:06:00.0: BAR 13: no space for [io size 0x1000]
    [ 0.865431] pci 0000:06:00.0: BAR 13: failed to assign [io size 0x1000]
    [ 0.865433] pci 0000:06:04.0: BAR 13: no space for [io size 0x1000]
    [ 0.865435] pci 0000:06:04.0: BAR 13: failed to assign [io size 0x1000]
    [ 0.865437] pci 0000:06:06.0: BAR 13: no space for [io size 0x1000]
    [ 0.865439] pci 0000:06:06.0: BAR 13: failed to assign [io size 0x1000]
    [ 0.865442] pci 0000:06:06.0: BAR 14: no space for [mem size 0x00200000]
    [ 0.865443] pci 0000:06:06.0: BAR 14: failed to assign [mem size 0x00200000]
    [ 0.865446] pci 0000:06:06.0: BAR 15: no space for [mem size 0x00200000 64bit pref]
    [ 0.865449] pci 0000:06:06.0: BAR 15: failed to assign [mem size 0x00200000 64bit pref]
    [ 0.865452] pci 0000:06:06.0: BAR 13: no space for [io size 0x1000]
    [ 0.865454] pci 0000:06:06.0: BAR 13: failed to assign [io size 0x1000]
    [ 0.865456] pci 0000:06:04.0: BAR 14: no space for [mem size 0x00200000]
    [ 0.865458] pci 0000:06:04.0: BAR 14: failed to assign [mem size 0x00200000]
    [ 0.865460] pci 0000:06:04.0: BAR 15: no space for [mem size 0x00200000 64bit pref]
    [ 0.865463] pci 0000:06:04.0: BAR 15: failed to assign [mem size 0x00200000 64bit pref]
    [ 0.865466] pci 0000:06:04.0: BAR 13: no space for [io size 0x1000]
    [ 0.865468] pci 0000:06:04.0: BAR 13: failed to assign [io size 0x1000]
    [ 0.865470] pci 0000:06:00.0: BAR 15: no space for [mem size 0x00200000 64bit pref]
    [ 0.865473] pci 0000:06:00.0: BAR 15: failed to assign [mem size 0x00200000 64bit pref]
    [ 0.865476] pci 0000:06:00.0: BAR 13: no space for [io size 0x1000]
    [ 0.865477] pci 0000:06:00.0: BAR 13: failed to assign [io size 0x1000]
    [ 0.865480] pci 0000:06:00.0: PCI bridge to [bus 07]
    [ 0.865484] pci 0000:06:00.0: bridge window [mem 0xa0d00000-0xa0dfffff]

    [ 0.865490] pci 0000:06:03.0: PCI bridge to [bus 08-38]
    [ 0.865493] pci 0000:06:03.0: bridge window [io 0x4000-0x4fff]
    [ 0.865497] pci 0000:06:03.0: bridge window [mem 0xa0e00000-0xa4dfffff]
    [ 0.865501] pci 0000:06:03.0: bridge window [mem 0xace00000-0xb0dfffff 64bit pref]

    [ 0.865506] pci 0000:06:04.0: PCI bridge to [bus 39]
    [ 0.865515] pci 0000:06:05.0: PCI bridge to [bus 3a-6a]
    [ 0.865518] pci 0000:06:05.0: bridge window [io 0x5000-0x5fff]
    [ 0.865522] pci 0000:06:05.0: bridge window [mem 0xa4e00000-0xa8dfffff]
    [ 0.865525] pci 0000:06:05.0: bridge window [mem 0xb0e00000-0xb4dfffff 64bit pref]

    [ 0.865531] pci 0000:06:06.0: PCI bridge to [bus 6b]
    [ 0.865539] pci 0000:05:00.0: PCI bridge to [bus 06-6b]
    [ 0.865542] pci 0000:05:00.0: bridge window [io 0x4000-0x5fff]
    [ 0.865546] pci 0000:05:00.0: bridge window [mem 0xa0d00000-0xa8dfffff]
    [ 0.865549] pci 0000:05:00.0: bridge window [mem 0xace00000-0xb4dfffff 64bit pref]

    [ 0.865555] pci 0000:00:01.1: PCI bridge to [bus 05-9b]
    [ 0.865557] pci 0000:00:01.1: bridge window [io 0x4000-0x6fff]
    [ 0.865560] pci 0000:00:01.1: bridge window [mem 0xa0d00000-0xacdfffff]
    [ 0.865562] pci 0000:00:01.1: bridge window [mem 0xace00000-0xb8dfffff 64bit pref]

    [ 0.865566] pci 0000:00:1c.0: PCI bridge to [bus 02]
    [ 0.865575] pci 0000:00:1c.0: bridge window [io 0x2000-0x2fff]
    [ 0.865580] pci 0000:00:1c.0: bridge window [mem 0x7fa00000-0x7fbfffff]
    [ 0.865584] pci 0000:00:1c.0: bridge window [mem 0x7fc00000-0x7fdfffff 64bit pref]

    [ 0.865591] pci 0000:00:1c.2: PCI bridge to [bus 03]
    [ 0.865596] pci 0000:00:1c.2: bridge window [mem 0xa0400000-0xa08fffff]

    [ 0.865603] pci 0000:00:1c.3: PCI bridge to [bus 04]
    [ 0.865607] pci 0000:00:1c.3: bridge window [mem 0xa0900000-0xa0afffff]
    [ 0.865612] pci 0000:00:1c.3: bridge window [mem 0x80000000-0x8fffffff 64bit pref]


As we can see above, there are quite a lot of allocation failures. Take the following for example:

    [ 0.865433] pci 0000:06:04.0: BAR 13: no space for [io size 0x1000]
    [ 0.865435] pci 0000:06:04.0: BAR 13: failed to assign [io size 0x1000]

So why does it fail? Because there is not enough space to insert a region of 0x1000 into
the parent resource node of PCI device 0000:06:04.0. So what is 0000:06:04.0, and what is
its parent resource? Let's infer it from /proc/ioports:

 
    0000-0cf7 : PCI Bus 0000:00
      0000-001f : dma1
      0020-0021 : pic1
      0040-0043 : timer0
      0050-0053 : timer1
      0060-0060 : keyboard
      0062-0062 : PNP0C09:00
        0062-0062 : EC data
      0064-0064 : keyboard
      0066-0066 : PNP0C09:00
        0066-0066 : EC cmd
      0070-0077 : rtc0
      0080-008f : dma page reg
      00a0-00a1 : pic2
      00c0-00df : dma2
      00f0-00ff : fpu
        00f0-00f0 : PNP0C04:00
      0300-031f : APP0001:00
        0300-031f : applesmc
      0410-0415 : ACPI CPU throttle
      0800-087f : pnp 00:01
    0cf8-0cff : PCI conf1
    0d00-ffff : PCI Bus 0000:00
      1800-187f : pnp 00:01
        1800-1803 : ACPI PM1a_EVT_BLK
        1804-1805 : ACPI PM1a_CNT_BLK
        1808-180b : ACPI PM_TMR
        1820-182f : ACPI GPE0_BLK
        1830-1833 : iTCO_wdt.0.auto
        1850-1850 : ACPI PM2_CNT_BLK
        1860-187f : iTCO_wdt.0.auto
      2000-2fff : PCI Bus 0000:02
      3000-303f : 0000:00:02.0
      4000-6fff : PCI Bus 0000:05
        4000-5fff : PCI Bus 0000:06
          4000-4fff : PCI Bus 0000:08
          5000-5fff : PCI Bus 0000:3a
      efa0-efbf : 0000:00:1f.3
      ffff-ffff : pnp 00:01

We can see that PCI Bus 0000:06 holds the resource 4000-5fff, of which
PCI Bus 0000:08 occupies 4000-4fff and PCI Bus 0000:3a occupies 5000-5fff.
Since bridge 0000:06:04.0 connects to neither bus 0000:08 nor bus 0000:3a,
it is impossible for it to claim a window under PCI Bus 0000:06, because all
of that bus's I/O resources are exhausted. The same goes for 0000:06:06.0 and
0000:06:00.0.
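
The exhaustion can be reproduced with a toy first-fit search over the children of a parent window. This is only a sketch of the idea behind find_resource, not the kernel's implementation; `toy_range` and `toy_find_hole` are hypothetical names, the children are assumed to be sorted and non-overlapping, and the alignment is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_range { uint64_t start; uint64_t end; };  /* inclusive bounds */

/*
 * Look for a free, align-aligned hole of `size` bytes inside `parent`
 * that does not overlap any of the n sorted children.
 * Returns 0 and stores the start in *out on success, -1 if no space.
 */
int toy_find_hole(struct toy_range parent, const struct toy_range *kids,
                  size_t n, uint64_t size, uint64_t align, uint64_t *out)
{
    uint64_t cur = parent.start;

    for (size_t i = 0; i <= n; i++) {
        /* The gap runs from cur up to the next child, or the parent's end. */
        uint64_t gap_end = (i < n) ? kids[i].start : parent.end + 1;
        uint64_t start = (cur + align - 1) & ~(align - 1);

        if (start < gap_end && gap_end - start >= size) {
            *out = start;
            return 0;
        }
        if (i < n)
            cur = kids[i].end + 1;
    }
    return -1;
}
```

With PCI Bus 0000:06 (0x4000-0x5fff) as the parent and the windows of buses 08 and 3a as its children, a request for 0x1000 bytes returns -1, which corresponds to the "no space for [io size 0x1000]" messages.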
 

 
    06:04.0 PCI bridge: Intel Corporation DSL5520 Thunderbolt [Falcon Ridge] (prog-if 00 [Normal decode])
    06:06.0 PCI bridge: Intel Corporation DSL5520 Thunderbolt [Falcon Ridge] (prog-if 00 [Normal decode])
    06:00.0 PCI bridge: Intel Corporation DSL5520 Thunderbolt [Falcon Ridge] (prog-if 00 [Normal decode])
        Memory behind bridge: a0d00000-a0dfffff


So for these bridges there is no "bridge window [io ...]" line, which is what would
indicate a successful I/O region allocation. BTW, the backtrace of the bridge setup is
as follows:

    [ 0.310266] pci 0000:06:03.0: PCI bridge to [bus 08-38]
    [ 0.310270] pci 0000:06:03.0: bridge window [io 0x4000-0x4fff]
    [ 0.310303] [<ffffffff814036fc>] pci_setup_bridge_io+0x9c/0x120
    [ 0.310306] [<ffffffff8140380a>] __pci_setup_bridge+0x8a/0x90
    [ 0.310309] [<ffffffff8140572a>] __pci_bus_assign_resources+0x1da/0x1e0
    [ 0.310313] [<ffffffff814056b2>] __pci_bus_assign_resources+0x162/0x1e0
    [ 0.310316] [<ffffffff814056b2>] __pci_bus_assign_resources+0x162/0x1e0
    [ 0.310320] [<ffffffff81405c3c>] pci_assign_unassigned_root_bus_resources+0x25c/0x270
    [ 0.310324] [<ffffffff81dc714e>] ? ras_debugfs_init+0x1b/0x1b
    [ 0.310327] [<ffffffff81dae5d1>] pci_assign_unassigned_resources+0x1d/0x25
    [ 0.310330] [<ffffffff81dc7165>] pcibios_assign_resources+0x17/0xb4
    [ 0.310333] [<ffffffff810021b9>] do_one_initcall+0x149/0x1e0
    [ 0.310336] [<ffffffff81099a00>] ? parse_args+0x60/0x480
    [ 0.310339] [<ffffffff81d67282>] kernel_init_freeable+0x174/0x1ff
    [ 0.310342] [<ffffffff81d66a03>] ? set_debug_rodata+0x12/0x12
    [ 0.310345] [<ffffffff817c945e>] kernel_init+0xe/0x110

BTW, we also saw the initial io resource allocation(during pci device probe in acpi_init),

which is before actual resource allocation in pcibios_assign_resources, that is :

 
  1. [ 0.281883] pci 0000:06:05.0: supports D1 D2

  2. [ 0.281885] pci 0000:06:05.0: PME# supported from D0 D1 D2 D3hot D3cold

  3. [ 0.281961] pci 0000:06:06.0: [8086:156d] type 01 class 0x060400

  4. [ 0.281991] [<ffffffff813f711e>] pci_setup_device+0x15e/0x510

  5. [ 0.281994] [<ffffffff813f542e>] ? pci_bus_get+0x1e/0x30

  6. [ 0.281997] [<ffffffff813f77fd>] pci_scan_single_device+0x7d/0xb0

  7. [ 0.282000] [<ffffffff813f7889>] pci_scan_slot+0x59/0x130

  8. [ 0.282003] [<ffffffff813f8a68>] pci_scan_child_bus+0x38/0x140

  9. [ 0.282006] [<ffffffff813f870d>] pci_scan_bridge+0x31d/0x640

  10. [ 0.282009] [<ffffffff813f8adf>] pci_scan_child_bus+0xaf/0x140

  11. [ 0.282012] [<ffffffff813f870d>] pci_scan_bridge+0x31d/0x640

  12. [ 0.282015] [<ffffffff813f77cf>] ? pci_scan_single_device+0x4f/0xb0

  13. [ 0.282018] [<ffffffff813f8adf>] pci_scan_child_bus+0xaf/0x140

  14. [ 0.282021] [<ffffffff8144dd1c>] acpi_pci_root_create+0x184/0x1df

  15. [ 0.282024] [<ffffffff816b2d5e>] pci_acpi_scan_root+0x15e/0x1b0

  16. [ 0.282027] [<ffffffff8144d991>] acpi_pci_root_add+0x3b7/0x4a3

  17. [ 0.282031] [<ffffffff81448e7e>] acpi_bus_attach+0x109/0x1a9

  18. [ 0.282034] [<ffffffff81448eeb>] acpi_bus_attach+0x176/0x1a9

  19. [ 0.282037] [<ffffffff81448eeb>] acpi_bus_attach+0x176/0x1a9

  20. [ 0.282040] [<ffffffff81db05e0>] ? acpi_sleep_proc_init+0x28/0x28

  21. [ 0.282044] [<ffffffff8144900b>] acpi_bus_scan+0x5b/0x6b

  22. [ 0.282047] [<ffffffff81db0a88>] acpi_scan_init+0x5d/0x1a0

  23. [ 0.282050] [<ffffffff81db0852>] acpi_init+0x272/0x28a

  24. [ 0.282053] [<ffffffff810021b9>] do_one_initcall+0x149/0x1e0

  25. [ 0.282056] [<ffffffff81099a00>] ? parse_args+0x60/0x480

  26. [ 0.282059] [<ffffffff81d67282>] kernel_init_freeable+0x174/0x1ff


And here is the backtrace of the PCI bridge resource claim:

 
  1. [ 0.280885] pci 0000:00:1c.4: bridge window [io 0x4000-0x6fff]

  2. [ 0.280915] [<ffffffff813f6e22>] pci_read_bridge_bases+0x412/0x420

  3. [ 0.280918] [<ffffffff816b4322>] pcibios_fixup_bus+0x12/0xc0

  4. [ 0.280921] [<ffffffff813f8a9c>] pci_scan_child_bus+0x6c/0x140

  5. [ 0.280924] [<ffffffff813f870d>] pci_scan_bridge+0x31d/0x640

  6. [ 0.280927] [<ffffffff813f77cf>] ? pci_scan_single_device+0x4f/0xb0

  7. [ 0.280930] [<ffffffff813f8adf>] pci_scan_child_bus+0xaf/0x140

  8. [ 0.280933] [<ffffffff8144dd1c>] acpi_pci_root_create+0x184/0x1df

  9. [ 0.280937] [<ffffffff816b2d5e>] pci_acpi_scan_root+0x15e/0x1b0

  10. [ 0.280940] [<ffffffff8144d991>] acpi_pci_root_add+0x3b7/0x4a3

  11. [ 0.280943] [<ffffffff81448e7e>] acpi_bus_attach+0x109/0x1a9

  12. [ 0.280946] [<ffffffff81448eeb>] acpi_bus_attach+0x176/0x1a9

  13. [ 0.280950] [<ffffffff81448eeb>] acpi_bus_attach+0x176/0x1a9

  14. [ 0.280953] [<ffffffff81db05e0>] ? acpi_sleep_proc_init+0x28/0x28

  15. [ 0.280956] [<ffffffff8144900b>] acpi_bus_scan+0x5b/0x6b

  16. [ 0.280959] [<ffffffff81db0a88>] acpi_scan_init+0x5d/0x1a0

  17. [ 0.280962] [<ffffffff81db0852>] acpi_init+0x272/0x28a

  18. [ 0.280966] [<ffffffff810021b9>] do_one_initcall+0x149/0x1e0

So it looks like pci_scan_child_bus probes PCI devices and PCI bridges separately.




 

 
  1. [ 0.865506] pci 0000:06:04.0: PCI bridge to [bus 39]

  2. [ 0.865531] pci 0000:06:06.0: PCI bridge to [bus 6b]

  3. [ 0.865480] pci 0000:06:00.0: PCI bridge to [bus 07]

  4. [ 0.865484] pci 0000:06:00.0: bridge window [mem 0xa0d00000-0xa0dfffff]

Then let's look at a successful allocation:

[    0.865375] pci 0000:00:1c.0: BAR 13: assigned [io  0x2000-0x2fff]

According to /proc/ioports, the PCI bridge 0000:00:1c.0 must be the one leading to PCI Bus 0000:02, and it is the only connection to that bus.

From lspci and the I/O port logs (such as the bridge window lines), we can deduce the PCI I/O resource tree as follows. Green text stands for a successful allocation of the bridge window, red stands for failure.

 
    0000-0cf7 : PCI Bus 0000:00
      0000-001f : dma1
      0020-0021 : pic1
      0040-0043 : timer0
      0050-0053 : timer1
      0060-0060 : keyboard
      0062-0062 : PNP0C09:00
        0062-0062 : EC data
      0064-0064 : keyboard
      0066-0066 : PNP0C09:00
        0066-0066 : EC cmd
      0070-0077 : rtc0
      0080-008f : dma page reg
      00a0-00a1 : pic2
      00c0-00df : dma2
      00f0-00ff : fpu
        00f0-00f0 : PNP0C04:00
      0300-031f : APP0001:00
        0300-031f : applesmc
      0410-0415 : ACPI CPU throttle
      0800-087f : pnp 00:01
    0cf8-0cff : PCI conf1
    0d00-ffff : PCI Bus 0000:00
      1800-187f : pnp 00:01
        1800-1803 : ACPI PM1a_EVT_BLK
        1804-1805 : ACPI PM1a_CNT_BLK
        1808-180b : ACPI PM_TMR
        1820-182f : ACPI GPE0_BLK
        1830-1833 : iTCO_wdt.0.auto
        1850-1850 : ACPI PM2_CNT_BLK
        1860-187f : iTCO_wdt.0.auto
      2000-2fff : PCI Bus 0000:02
      3000-303f : 0000:00:02.0
      4000-6fff : PCI Bus 0000:05
        4000-5fff : PCI Bus 0000:06
          4000-4fff : PCI Bus 0000:08
          5000-5fff : PCI Bus 0000:3a
      efa0-efbf : 0000:00:1f.3
      ffff-ffff : pnp 00:01


 

During boot we found that some PCI bridges have an invalid I/O base/limit pair (say, base:fffff000, limit:0) on a Mac Pro 12. Because the base is above the limit, the I/O window is treated as disabled during bridge resource probing in pci_read_bridge_bases, and no 'bridge window [io xxx]' line is displayed for such a bridge, for example 0000:06:00.0:

 
  1. [ 0.385573] pci 0000:06:00.0: PCI bridge to [bus 07]

  2. [ 0.385586] pci 0000:06:00.0: bridge window [mem 0xc1900000-0xc19fffff]

  3. [ 0.385657] pci 0000:06:03.0: PCI bridge to [bus 08-38]
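To make the "base above limit means no window" rule concrete, here is a minimal sketch of how a bridge I/O window can be decoded from its raw registers. It follows the standard PCI-to-PCI bridge register layout; the function and struct names are made up for illustration, and the register values in the demo are hypothetical (the logs in this article do not show the raw register contents).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct io_window {
    bool     enabled;
    uint32_t start;
    uint32_t end;      /* inclusive */
};

/*
 * Decode a bridge I/O window from its raw registers:
 *   base_lo / limit_lo: I/O Base and I/O Limit (8 bits each)
 *   base_hi / limit_hi: I/O Base/Limit Upper 16 Bits
 */
static struct io_window decode_io_window(uint8_t base_lo, uint8_t limit_lo,
                                         uint16_t base_hi, uint16_t limit_hi)
{
    struct io_window w = { false, 0, 0 };
    /* Bits 7:4 of each register are address bits 15:12. */
    uint32_t base  = (uint32_t)(base_lo  & 0xF0) << 8;
    uint32_t limit = (uint32_t)(limit_lo & 0xF0) << 8;

    /* Low nibble 0x1 means 32-bit I/O addressing: add the upper halves. */
    if ((base_lo & 0x0F) == 0x01) {
        base  |= (uint32_t)base_hi  << 16;
        limit |= (uint32_t)limit_hi << 16;
    }

    /* base > limit is how a bridge says "no I/O window". */
    if (base <= limit) {
        w.enabled = true;
        w.start   = base;
        w.end     = limit + 0x1000 - 1;   /* 4K granularity */
    }
    return w;
}

/* Usage with hypothetical register values. */
static void decode_io_window_demo(void)
{
    struct io_window ok  = decode_io_window(0x40, 0x60, 0, 0);
    struct io_window off = decode_io_window(0xF1, 0x01, 0xFFFF, 0x0000);

    assert(ok.enabled && ok.start == 0x4000 && ok.end == 0x6fff);
    assert(!off.enabled);   /* base 0xfffff000 > limit 0 */
}
```

The first demo call yields the same range as the `[io 0x4000-0x6fff]` window seen for 0000:00:1c.4; the second reproduces a base of 0xfffff000 with limit 0, which decodes to a disabled window.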

Later, however, in pcibios_assign_resources, we find an I/O request from this very bridge, which wants to occupy some I/O space:

[    0.451679] pci 0000:06:00.0: bridge window [io  0x1000-0x0fff] to [bus 07] add_size 1000

This is not a real address range: the start field (0x1000) holds the required alignment, the range itself is still empty (end = start - 1), and add_size 1000 asks for an additional 0x1000 (4K) bytes of I/O space. In other words, the bridge is trying to claim a 4K I/O region.

So where does this 0x1000 come from? The short answer is pbus_size_io.

For a given PCI bus, i.e. the child bus of a PCI bridge (pointed to by pci_bridge->subordinate), this function walks every pci_dev on the bus and sums up their I/O resources to determine how large the bridge's I/O window must be. In the Mac Pro 12 case above, bridge 0000:06:00.0 has one device below it, 0000:07:00.0, but this device has no I/O resource, only mem resources:

 
  1. [ 0.378985] pci 0000:07:00.0: [8086:156c] type 00 class 0x088000

  2. [ 0.379008] pci 0000:07:00.0: reg 0x10: [mem 0xc1900000-0xc193ffff]

  3. [ 0.379022] pci 0000:07:00.0: reg 0x14: [mem 0xc1940000-0xc1940fff]


So in theory bridge 0000:06:00.0 should not need any I/O resource. According to the log, however, it is trying to claim a 4K I/O window. The reason is that 0000:06:00.0 is a hotplug bridge, and the minimal I/O size reserved for this kind of device is 0x100:

 
    case PCI_CLASS_BRIDGE_PCI:
        pci_bridge_check_ranges(bus);
        if (bus->self->is_hotplug_bridge) {
            additional_io_size  = pci_hotplug_io_size;
            additional_mem_size = pci_hotplug_mem_size;
        }
        /* Fall through */
    default:
        pbus_size_io(bus, realloc_head ? 0 : additional_io_size,
                     additional_io_size, realloc_head);

Besides, pbus_size_io applies the window alignment, and per spec I/O windows are 4K-aligned:

 
    static resource_size_t window_alignment(struct pci_bus *bus,
                                            unsigned long type)
    {
        resource_size_t align = 1, arch_align;

        if (type & IORESOURCE_MEM)
            align = PCI_P2P_DEFAULT_MEM_ALIGN;
        else if (type & IORESOURCE_IO) {
            /*
             * Per spec, I/O windows are 4K-aligned, but some
             * bridges have an extension to support 1K alignment.
             */
            if (bus->self->io_window_1k)
                align = PCI_P2P_DEFAULT_IO_ALIGN_1K;
            else
                align = PCI_P2P_DEFAULT_IO_ALIGN;
        }

So the final requested I/O size is 0x1000 (the hotplug 0x100 rounded up to the 4K window alignment), which is why we see:

[    0.451679] pci 0000:06:00.0: bridge window [io  0x1000-0x0fff] to [bus 07] add_size 1000

The resource of this PCI bridge has now been updated to start = 0x1000, end = 0xfff, with a valid flag.
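The arithmetic behind that odd-looking `[io 0x1000-0x0fff] ... add_size 1000` line can be sketched as follows. This is a simplified model, not the kernel's pbus_size_io: it ignores the minimum size, the previous window size and the ISA adjustment, and the names `size_io_window` and `struct window_request` are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to a multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t x, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (x + align - 1) & ~(align - 1);
}

struct window_request {
    uint64_t start;     /* holds the alignment, not an address yet */
    uint64_t end;       /* start + mandatory size - 1 */
    uint64_t add_size;  /* optional extra size (hotplug reserve) */
};

/*
 * children_size: sum of the I/O resources of the devices below the bridge
 * hotplug_size:  extra reserve for a hotplug bridge (0x100 in the log)
 * align:         window alignment (0x1000 for a standard I/O window)
 */
static struct window_request size_io_window(uint64_t children_size,
                                            uint64_t hotplug_size,
                                            uint64_t align)
{
    uint64_t size0 = align_up(children_size, align);                /* mandatory */
    uint64_t size1 = align_up(children_size + hotplug_size, align); /* with reserve */
    struct window_request r = { align, align + size0 - 1, size1 - size0 };
    return r;
}
```

With no I/O below the bridge and a 0x100 hotplug reserve, `size_io_window(0, 0x100, 0x1000)` gives start 0x1000, end 0xfff and add_size 0x1000, which matches the `size0:0, size1:1000` values in the debug log below.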

That is to say, on entering pbus_size_io,

    struct resource *b_res = find_free_bus_resource(bus, IORESOURCE_IO,
                                                    IORESOURCE_IO);

finds a not-yet-assigned resource with the IO flag set and fills in the computed region. For a PCI agent device it is resource[3], and for a bridge it is resource[13]. After traversing every device under this bus, the accumulated size is finally written to resource[13]. Here is a boot log that demonstrates this behavior:

 
  1. [ 0.451922] pci 0000:06:00.0: Start scaning pci bridge

  2. [ 0.451924] Before checking each sub device

  3. [ 0.451927] pci 0000:06:00.0: dump pci device resource for BAR 0, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  4. [ 0.451931] pci 0000:06:00.0: dump pci device resource for BAR 1, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  5. [ 0.451935] pci 0000:06:00.0: dump pci device resource for BAR 2, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  6. [ 0.451940] pci 0000:06:00.0: dump pci device resource for BAR 3, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  7. [ 0.451944] pci 0000:06:00.0: dump pci device resource for BAR 4, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  8. [ 0.451949] pci 0000:06:00.0: dump pci device resource for BAR 5, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  9. [ 0.451953] pci 0000:06:00.0: dump pci device resource for BAR 6, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  10. [ 0.451957] pci 0000:06:00.0: dump pci device resource for BAR 7, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  11. [ 0.451962] pci 0000:06:00.0: dump pci device resource for BAR 8, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  12. [ 0.451966] pci 0000:06:00.0: dump pci device resource for BAR 9, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  13. [ 0.451971] pci 0000:06:00.0: dump pci device resource for BAR 10, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  14. [ 0.451975] pci 0000:06:00.0: dump pci device resource for BAR 11, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  15. [ 0.451979] pci 0000:06:00.0: dump pci device resource for BAR 12, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  16. [ 0.451984] pci 0000:06:00.0: dump pci device resource for BAR 13, flags:100,start:0,end:0,[io 0x0000], parent: (null)

  17. [ 0.451988] pci 0000:06:00.0: dump pci device resource for BAR 14, flags:200,start:c1900000,end:c19fffff,[mem 0xc1900000-0xc19fffff], parent:ffff880260a9e6f8

  18. [ 0.451993] pci 0000:06:00.0: dump pci device resource for BAR 15, flags:102201,start:0,end:0,[mem 0x00000000 64bit pref], parent: (null)

  19. [ 0.451997] pci 0000:06:00.0: dump pci device resource for BAR 16, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  20. [ 0.452001] pci 0000:06:00.0: End scaning pci bridge, size:0,size1:0, min_size:0,min_align:1000, add_size:100, children_add_size:0

  21. [ 0.452006] pci 0000:06:00.0: size0:0, size1:1000

  22. [ 0.452008] Before bridge window request io resource

  23. [ 0.452010] pci 0000:06:00.0: dump pci device resource for BAR 0, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  24. [ 0.452015] pci 0000:06:00.0: dump pci device resource for BAR 1, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  25. [ 0.452019] pci 0000:06:00.0: dump pci device resource for BAR 2, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  26. [ 0.452024] pci 0000:06:00.0: dump pci device resource for BAR 3, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  27. [ 0.452028] pci 0000:06:00.0: dump pci device resource for BAR 4, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  28. [ 0.452032] pci 0000:06:00.0: dump pci device resource for BAR 5, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  29. [ 0.452037] pci 0000:06:00.0: dump pci device resource for BAR 6, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  30. [ 0.452041] pci 0000:06:00.0: dump pci device resource for BAR 7, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  31. [ 0.452045] pci 0000:06:00.0: dump pci device resource for BAR 8, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  32. [ 0.452050] pci 0000:06:00.0: dump pci device resource for BAR 9, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  33. [ 0.452054] pci 0000:06:00.0: dump pci device resource for BAR 10, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  34. [ 0.452058] pci 0000:06:00.0: dump pci device resource for BAR 11, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  35. [ 0.452063] pci 0000:06:00.0: dump pci device resource for BAR 12, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  36. [ 0.452067] pci 0000:06:00.0: dump pci device resource for BAR 13, flags:100,start:0,end:0,[io 0x0000], parent: (null)

  37. [ 0.452072] pci 0000:06:00.0: dump pci device resource for BAR 14, flags:200,start:c1900000,end:c19fffff,[mem 0xc1900000-0xc19fffff], parent:ffff880260a9e6f8

  38. [ 0.452076] pci 0000:06:00.0: dump pci device resource for BAR 15, flags:102201,start:0,end:0,[mem 0x00000000 64bit pref], parent: (null)

  39. [ 0.452081] pci 0000:06:00.0: dump pci device resource for BAR 16, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  40. [ 0.452086] pci 0000:06:00.0: bridge window [io 0x1000-0x0fff] to [bus 07] add_size 1000

  41. [ 0.452089] After bridge window request io resource

  42. [ 0.452092] pci 0000:06:00.0: dump pci device resource for BAR 0, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  43. [ 0.452096] pci 0000:06:00.0: dump pci device resource for BAR 1, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  44. [ 0.452101] pci 0000:06:00.0: dump pci device resource for BAR 2, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  45. [ 0.452105] pci 0000:06:00.0: dump pci device resource for BAR 3, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  46. [ 0.452110] pci 0000:06:00.0: dump pci device resource for BAR 4, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  47. [ 0.452114] pci 0000:06:00.0: dump pci device resource for BAR 5, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  48. [ 0.452118] pci 0000:06:00.0: dump pci device resource for BAR 6, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  49. [ 0.452123] pci 0000:06:00.0: dump pci device resource for BAR 7, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  50. [ 0.452127] pci 0000:06:00.0: dump pci device resource for BAR 8, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  51. [ 0.452131] pci 0000:06:00.0: dump pci device resource for BAR 9, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  52. [ 0.452136] pci 0000:06:00.0: dump pci device resource for BAR 10, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  53. [ 0.452140] pci 0000:06:00.0: dump pci device resource for BAR 11, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  54. [ 0.452145] pci 0000:06:00.0: dump pci device resource for BAR 12, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  55. [ 0.452149] pci 0000:06:00.0: dump pci device resource for BAR 13, flags:80100,start:1000,end:fff,[io 0x1000-0x0fff], parent: (null)

  56. [ 0.452154] pci 0000:06:00.0: dump pci device resource for BAR 14, flags:200,start:c1900000,end:c19fffff,[mem 0xc1900000-0xc19fffff], parent:ffff880260a9e6f8

  57. [ 0.452158] pci 0000:06:00.0: dump pci device resource for BAR 15, flags:102201,start:0,end:0,[mem 0x00000000 64bit pref], parent: (null)

  58. [ 0.452163] pci 0000:06:00.0: dump pci device resource for BAR 16, flags:0,start:0,end:0,[??? 0x00000000 flags 0x0], parent: (null)

  59. [ 0.455222] pci 0000:06:00.0: res[13]=[io 0x1000-0x0fff] res_to_dev_res add_size 1000 min_align 1000

  60. [ 0.455286] pci 0000:06:00.0: BAR 13: no space for [io size 0x1000]

  61. [ 0.455288] pci 0000:06:00.0: BAR 13: failed to assign [io size 0x1000]
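The `flags:` values in the dump above are easier to read once decoded. The sketch below uses the flag values from the kernel's include/linux/ioport.h; the helper `describe_flags` itself is hypothetical and only covers the bits that appear in this log.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Values as defined in the kernel's include/linux/ioport.h. */
#define IORESOURCE_IO          0x00000100UL
#define IORESOURCE_MEM         0x00000200UL
#define IORESOURCE_PREFETCH    0x00002000UL
#define IORESOURCE_STARTALIGN  0x00080000UL  /* start field is alignment */
#define IORESOURCE_MEM_64      0x00100000UL

/* Write a short textual description of the known flag bits into buf. */
static void describe_flags(unsigned long flags, char *buf, size_t len)
{
    snprintf(buf, len, "%s%s%s%s%s",
             (flags & IORESOURCE_IO)         ? "io "         : "",
             (flags & IORESOURCE_MEM)        ? "mem "        : "",
             (flags & IORESOURCE_PREFETCH)   ? "pref "       : "",
             (flags & IORESOURCE_MEM_64)     ? "64bit "      : "",
             (flags & IORESOURCE_STARTALIGN) ? "startalign " : "");
}

/* Decode the three distinct non-zero flag values from the dump. */
static void describe_flags_demo(void)
{
    char buf[64];

    describe_flags(0x100, buf, sizeof buf);      /* BAR 13 before sizing */
    assert(strcmp(buf, "io ") == 0);

    describe_flags(0x80100, buf, sizeof buf);    /* BAR 13 after sizing */
    assert(strcmp(buf, "io startalign ") == 0);

    describe_flags(0x102201, buf, sizeof buf);   /* BAR 15 */
    assert(strcmp(buf, "mem pref 64bit ") == 0);
}
```

The interesting transition is BAR 13 going from 0x100 to 0x80100: the added IORESOURCE_STARTALIGN bit says that `start` currently holds an alignment, which is why the resource prints as `[io 0x1000-0x0fff]` before any address has been assigned.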

Actually, the flag of the bridge's I/O resource[13] is initially zero because of the invalid base/limit pair. Later, during the real assignment, __pci_bus_size_bridges -> pci_bridge_check_ranges adds IORESOURCE_IO to resource[13] if the bridge can access the PCI_IO_BASE register. So when we later check the I/O resource status of this bridge, we find a free (not yet assigned) resource[13] and set its start to 0x1000 and its end to 0xfff. Finally we need to find an empty hole in the resource tree and allocate it to this bridge. Because Linux fails to find a free region of size 0x1000 under the parent bus of 0000:06:00.0, it refuses to assign an I/O resource to this device:

 
  1. [ 0.453733] pci 0000:06:00.0: BAR 13: no space for [io size 0x1000]

  2. [ 0.453736] pci 0000:06:00.0: BAR 13: failed to assign [io size 0x1000]


These messages come from the call chain __pci_bus_assign_resources -> assign_requested_resources_sorted -> pci_assign_resource -> _pci_assign_resource -> pci_bus_alloc_resource, which ends in the following hole search:

    err = find_resource(root, new, size, &constraint);
    if (err >= 0 && __request_resource(root, new))
        err = -EBUSY;

When that fails, pci_assign_resource reports it:

    if (ret < 0) {
        dev_info(&dev->dev, "BAR %d: no space for %pR\n", resno, res);
        ret = pci_revert_fw_address(res, dev, resno, size);
    }

    if (ret < 0) {
        dev_info(&dev->dev, "BAR %d: failed to assign %pR\n", resno,
                 res);
        return ret;
    }
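Why the search fails here but succeeds for 0000:00:1c.0 can be illustrated with a small first-fit sketch. It is a deliberately simplified stand-in for find_resource (no min/max constraints, no architecture alignment callbacks), the names `struct range` and `find_hole` are invented, and it assumes that the child windows listed in the /proc/ioports tree earlier were already in place when the request was made.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct range { uint64_t start, end; };   /* inclusive bounds */

/* Round x up to a multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t x, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (x + align - 1) & ~(align - 1);
}

/*
 * First-fit search for an aligned hole of `size` bytes inside `parent`.
 * `busy` must be sorted by start, non-overlapping and inside `parent`.
 * Assumes parent.end < UINT64_MAX. Returns 0 and fills *out, or -1.
 */
static int find_hole(struct range parent, const struct range *busy, size_t n,
                     uint64_t size, uint64_t align, struct range *out)
{
    uint64_t cursor = parent.start;

    assert(size > 0 && out != NULL);
    for (size_t i = 0; i <= n; i++) {
        /* One past the end of the gap that begins at cursor. */
        uint64_t gap_limit = (i < n) ? busy[i].start : parent.end + 1;
        uint64_t start = align_up(cursor, align);

        if (start >= cursor && start < gap_limit &&
            gap_limit - start >= size) {
            out->start = start;
            out->end   = start + size - 1;
            return 0;
        }
        if (i < n)
            cursor = busy[i].end + 1;   /* skip past the busy child */
    }
    return -1;
}
```

For the root window 0d00-ffff with the children 1800-187f, 3000-303f, 4000-6fff, efa0-efbf and ffff-ffff, a 0x1000 request aligned to 0x1000 lands at 0x2000-0x2fff, the range assigned to 0000:00:1c.0. For the bus 0000:06 window 4000-5fff, which is completely covered by the bus 08 and bus 3a windows, the same request finds no hole.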


So, if we want to bypass the resource re-allocation for one particular PCI bridge, maybe we can simply add a quirk in pci_bridge_check_ranges.

The following is the resource dump on this platform:

 
  1. [ 0.865618] pci_bus 0000:00: resource 4 [io 0x0000-0x0cf7 window]

  2. [ 0.865619] pci_bus 0000:00: resource 5 [io 0x0d00-0xffff window]

  3. [ 0.865620] pci_bus 0000:00: resource 6 [mem 0x000a0000-0x000bffff window]

  4. [ 0.865621] pci_bus 0000:00: resource 7 [mem 0x000c0000-0x000c3fff window]

  5. [ 0.865622] pci_bus 0000:00: resource 8 [mem 0x000c4000-0x000c7fff window]

  6. [ 0.865622] pci_bus 0000:00: resource 9 [mem 0x000c8000-0x000cbfff window]

  7. [ 0.865623] pci_bus 0000:00: resource 10 [mem 0x000cc000-0x000cffff window]

  8. [ 0.865624] pci_bus 0000:00: resource 11 [mem 0x000d0000-0x000d3fff window]

  9. [ 0.865625] pci_bus 0000:00: resource 12 [mem 0x000d4000-0x000d7fff window]

  10. [ 0.865625] pci_bus 0000:00: resource 13 [mem 0x000d8000-0x000dbfff window]

  11. [ 0.865626] pci_bus 0000:00: resource 14 [mem 0x000dc000-0x000dffff window]

  12. [ 0.865627] pci_bus 0000:00: resource 15 [mem 0x000e0000-0x000e3fff window]

  13. [ 0.865627] pci_bus 0000:00: resource 16 [mem 0x000e4000-0x000e7fff window]

  14. [ 0.865628] pci_bus 0000:00: resource 17 [mem 0x000e8000-0x000ebfff window]

  15. [ 0.865629] pci_bus 0000:00: resource 18 [mem 0x000ec000-0x000effff window]

  16. [ 0.865630] pci_bus 0000:00: resource 19 [mem 0x000f0000-0x000fffff window]

  17. [ 0.865630] pci_bus 0000:00: resource 20 [mem 0x7fa00000-0xfeafffff window]

  18. [ 0.865631] pci_bus 0000:00: resource 21 [mem 0xfed40000-0xfed44fff window]

  19. [ 0.865632] pci_bus 0000:01: resource 1 [mem 0xa0b00000-0xa0bfffff]

  20. [ 0.865633] pci_bus 0000:05: resource 0 [io 0x4000-0x6fff]

  21. [ 0.865634] pci_bus 0000:05: resource 1 [mem 0xa0d00000-0xacdfffff]

  22. [ 0.865635] pci_bus 0000:05: resource 2 [mem 0xace00000-0xb8dfffff 64bit pref]

  23. [ 0.865636] pci_bus 0000:06: resource 0 [io 0x4000-0x5fff]

  24. [ 0.865636] pci_bus 0000:06: resource 1 [mem 0xa0d00000-0xa8dfffff]

  25. [ 0.865637] pci_bus 0000:06: resource 2 [mem 0xace00000-0xb4dfffff 64bit pref]

  26. [ 0.865638] pci_bus 0000:07: resource 1 [mem 0xa0d00000-0xa0dfffff]

  27. [ 0.865639] pci_bus 0000:08: resource 0 [io 0x4000-0x4fff]

  28. [ 0.865639] pci_bus 0000:08: resource 1 [mem 0xa0e00000-0xa4dfffff]

  29. [ 0.865640] pci_bus 0000:08: resource 2 [mem 0xace00000-0xb0dfffff 64bit pref]

  30. [ 0.865641] pci_bus 0000:3a: resource 0 [io 0x5000-0x5fff]

  31. [ 0.865642] pci_bus 0000:3a: resource 1 [mem 0xa4e00000-0xa8dfffff]

  32. [ 0.865643] pci_bus 0000:3a: resource 2 [mem 0xb0e00000-0xb4dfffff 64bit pref]

  33. [ 0.865644] pci_bus 0000:02: resource 0 [io 0x2000-0x2fff]

  34. [ 0.865644] pci_bus 0000:02: resource 1 [mem 0x7fa00000-0x7fbfffff]

  35. [ 0.865645] pci_bus 0000:02: resource 2 [mem 0x7fc00000-0x7fdfffff 64bit pref]

  36. [ 0.865646] pci_bus 0000:03: resource 1 [mem 0xa0400000-0xa08fffff]

  37. [ 0.865647] pci_bus 0000:04: resource 1 [mem 0xa0900000-0xa0afffff]

  38. [ 0.865648] pci_bus 0000:04: resource 2 [mem 0x80000000-0x8fffffff 64bit pref]


The information above is dumped by pci_bus_dump_resources, called from pci_assign_unassigned_root_bus_resources.

OK, so there is a bug report that poweroff and suspend do not work on the Mac Pro 11. The problem is described here:

https://patchwork.kernel.org/patch/9140867/

https://patchwork.kernel.org/patch/9143637/

So I posted a patch to work around it:

 
  1. Currently there are many people reported that they can not

  2. do a poweroff nor a suspend to memory on their Mac Pro 11.

  3. After some investigations it was found that, once the PCI bridge

  4. 0000:00:1c.0 reassigns its mm windows([mem 0x7fa00000-0x7fbfffff]

  5. and [mem 0x7fc00000-0x7fdfffff 64bit pref]), the region of ACPI

  6. io resource 0x1804 becomes unaccessible immediately, where the

  7. ACPI Sleep register is located, as a result neither poweroff(S5)

  8. nor suspend to memory(S3) works.

  9.  
  10. I don't know why setting the base/limit of PCI bridge mem resource

  11. would affect another io resource region, so this quirk just simply

  12. bypass the assignment of these mm resources on 0000:00:1c.0, by

  13. resetting the resource flag to 0 before updating the base/limit registers.

  14. This patch also introduces a new pci fixup phase before the actual bridge

  15. resource assignment.

  16.  


 

And the PCI maintainer has some concerns:

 
  1. Is this device *only* used on the Mac Pro 11? http://pci-ids.ucs.cz

  2. says "8 Series/C220 Series Chipset Family PCI Express Root Port #1",

  3. which sounds pretty generic.

This is a good point; we might need a quirk. I also explained the situation in more detail:

 
  1. according to the boot logs, the pci bridge

  2. in question has not declaimed any valid device/bridge resource

  3. (both io and mem) during probe(because base=0xfff>limit=0x0),

  4. so I think it has not hardware resource setting at that time(at least

  5. in BIOS)

  6. until it reaches pcibios_assign_resources and it has to allocate a

  7. minimal io/mem resource, then it tries to assign them to

  8. [mem 0x7fa00000 - 0x7fbfffff]

  9. [mem 0x7fc00000-0x7fdfffff 64bit pref]

  10. [io 0x2000-0x2fff],

  11. so if we reset the flag to zero for these mem resource, the pci bridge

  12. will not assign any pci mem windows for it

  13. (in this way

  14. find_free_bus_resource(bus, mask | IORESOURCE_PREFETCH, type)

  15. will not return any free resource, thus bypass the assignment)

  16.  
  17. According to the boot log at

  18. https://bugzilla.kernel.org/attachment.cgi?id=210141

  19. , we can see there is no bridge windows assign for 0000:00:1c.0 during

  20. early probe:

  21.  
  22. [ 0.807893] pci 0000:00:1c.0: [8086:8c10] type 01 class 0x060400

  23. [ 0.807949] pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold

  24. [ 0.831281] pci 0000:00:1c.0: PCI bridge to [bus 02]

  25.  
  26. and then in pcibios_assign_resources, 0000:00:1c.0 tries to allocate minimal

  27. resource window and then update related base/limit registers:

  28.  
  29. [ 0.865342] pci 0000:00:1c.0: bridge window [io 0x1000-0x0fff] to

  30. [bus 02] add_size 1000

  31. [ 0.865343] pci 0000:00:1c.0: bridge window [mem

  32. 0x00100000-0x000fffff 64bit pref] to [bus 02] add_size 200000 add_align

  33. 100000

  34. [ 0.865344] pci 0000:00:1c.0: bridge window [mem

  35. 0x00100000-0x000fffff] to [bus 02] add_size 200000 add_align 100000

  36.  


 

Bjorn helped to give some clues on how to further debug:

 
  1. Here are some ideas for debugging this:

  2.  
  3. 1) Apparently the problem is sensitive to programming the prefetchable

  4. memory aperture of 00:1c.0 to [mem 0x7fc00000-0x7fdfffff 64bit

  5. pref]. The only effect of that *should* be that the bridge will

  6. now claim accesses in the aperture, when it didn't before.

  7.  
  8. We *think* there's nothing else at the address of that aperture,

  9. but if there is an unreported device there, it may stop working. I

  10. would pore over the E820 memory map, EFI memory map, all ACPI _CRS

  11. methods, etc., looking for anything in that area that we haven't

  12. accounted for.

  13.  
  14. There are lots of anomalies in this system, e.g., (these are from

  15. Bastien's dmesg log at

  16. https://bugzilla.kernel.org/attachment.cgi?id=208961)

  17.  
  18. BIOS-e820: [mem 0x00000000e00f8000-0x00000000e00f8fff] reserved

  19. PCI: MMCONFIG for domain 0000 [bus 00-9b] at [mem 0xe0000000-0xe9bfffff] (base 0xe0000000)

  20. acpi PNP0A08:00: [Firmware Info]: MMCONFIG for domain 0000 [bus 00-9b] only partially covers this bridge

  21. pci_bus 0000:00: root bus resource [mem 0x7fa00000-0xfeafffff window]

  22. system 00:04: [mem 0xe0000000-0xefffffff] could not be reserved

  23. ACPI Warning: SystemIO range 0x000000000000EFA0-0x000000000000EFBF conflicts with OpRegion 0x000000000000EFA0-0x000000000000EFAF (\_SB.PCI0.SBUS.SMBI)

  24.  
  25. The MMCONFIG area appears correctly described in the 00:04 _CRS,

  26. but incorrectly in MCFG. The E820 region appears to be a chunk in

  27. the middle of the MMCONFIG area. The host bridge window

  28. [mem 0x7fa00000-0xfeafffff window] is clearly bogus -- it includes

  29. the MMCONFIG area, which is definitely not a window. I doubt

  30. anything above 0xdfffffff should be included.

  31.  
  32. I don't know what the ACPI conflict warning is about, but I'd try

  33. to figure it out.

  34.  
  35. Either the firmware is badly broken, or we're not interpreting

  36. something correctly.

  37.  
  38. I'd try assigning the 00:1c.0 aperture at the end of the actual

  39. aperture instead of the beginning. I think Windows, and likely

  40. MacOS uses a top-down allocation strategy instead of bottom-up like

  41. Linux does. If that makes poweroff work, there is likely an

  42. unreported device in the [mem 0x7fc00000-0x7fdfffff 64bit pref]

  43. area.

  44.  
  45. This experimentation could all be done with setpci, without

  46. requiring kernel patches.

  47.  
  48. 2) I would explore exactly what the grub "halt" command does and how

  49. it compares to what Linux is doing. I see the assertion that this

  50. is related to [io 0x1804], but I don't know what that's based on.

  51. Programming the 00:1c.0 prefetchable aperture shouldn't have

  52. anything to do with I/O ports, so if it really is related, there

  53. might be some SMM magic or something where SMM code is doing

  54. something that relates to the memory aperture.

  55.  
  56. 3) The lspci output at https://bugzilla.kernel.org/attachment.cgi?id=219321

  57. (I think this is from MacOS) shows invalid data for devices

  58. starting at 04:00.0. Why? Maybe this is an unrelated artifact,

  59. but it doesn't smell right.

  60.  
  61. 4) If you can hot-add devices under MacOS, look to see what address

  62. space they get assigned. That may tell you what allocation

  63. strategy MacOS uses.

  64.  
  65. 5) 00:1c.0 claims to have a slot that supports hotplug. Is that

  66. actually true? Could you add a device below it? If not, maybe the

  67. problem is that the BIOS should have configured 00:1c.0 so it

  68. doesn't report a slot. If it didn't report a slot, we shouldn't

  69. assign resources to it, since there is no possibility of a device

  70. below it.

  71.  
  72. Bjorn


 

 
  1. > 5) 00:1c.0 claims to have a slot that supports hotplug. Is that

  2. > actually true? Could you add a device below it? If not, maybe the

  3. > problem is that the BIOS should have configured 00:1c.0 so it

  4. > doesn't report a slot. If it didn't report a slot, we shouldn't

  5. > assign resources to it, since there is no possibility of a device

  6. > below it.

  7.  
  8. Of course, this would only be *part* of the problem, because a hot-added

  9. device somewhere else could still be assigned the space at

  10. [mem 0x7fc00000-0x7fdfffff].

  11.  
  12. This just smells like an unreported device in there somewhere.

OK, so Bjorn does not want this workaround merged upstream, and he gave a lot of debugging suggestions. Let's look at them one by one.

First question:

 
  1. The E820 region appears to be a chunk in

  2. the middle of the MMCONFIG area

 
  1. BIOS-e820: [mem 0x00000000e00f8000-0x00000000e00f8fff] reserved

  2. PCI: MMCONFIG for domain 0000 [bus 00-9b] at [mem 0xe0000000-0xe9bfffff] (base 0xe0000000)

So what is MMCONFIG?

I have to confess that I searched through many articles without finding a detailed description; my understanding is that it is a newer mechanism for reading a PCI device's configuration space directly through memory. As we all know, the legacy method reads PCI configuration space through the I/O ports 0xcf8/0xcfc. With MMCONFIG enabled, the configuration space lives at a memory address, so we can ioremap it and access it directly. The condition is that the MMCONFIG area must lie inside an E820 reserved region; otherwise we still use the legacy method to read the configuration space.
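The difference between the two access methods comes down to how bus/device/function/register are turned into an address. A minimal sketch, assuming the standard configuration mechanism #1 encoding and the standard memory-mapped (ECAM) layout; the function names are invented, and only the address computation is shown, not the actual port or memory access.

```c
#include <assert.h>
#include <stdint.h>

/* Legacy: the value written to port 0xcf8 before reading/writing 0xcfc. */
static uint32_t conf1_address(uint8_t bus, uint8_t dev, uint8_t fn, uint8_t reg)
{
    return 0x80000000u                       /* enable bit */
         | ((uint32_t)bus << 16)
         | ((uint32_t)(dev & 0x1f) << 11)
         | ((uint32_t)(fn & 0x07) << 8)
         | (reg & 0xfcu);                    /* dword-aligned register */
}

/* MMCONFIG: physical address of a register, 4K of config space per function. */
static uint64_t ecam_address(uint64_t base, uint8_t bus, uint8_t dev,
                             uint8_t fn, uint16_t reg)
{
    return base + (((uint64_t)bus << 20)
                 | ((uint64_t)(dev & 0x1f) << 15)
                 | ((uint64_t)(fn & 0x07) << 12)
                 | (reg & 0xfffu));
}

/* Example: the I/O Base register (offset 0x1c) of bridge 00:1c.0. */
static void config_address_demo(void)
{
    assert(conf1_address(0, 0x1c, 0, 0x1c) == 0x8000e01cu);
    assert(ecam_address(0xe0000000ull, 0, 0x1c, 0, 0x1c) == 0xe00e001cull);
}
```

One observation that follows from this layout: with the MMCONFIG base 0xe0000000 from the log, `ecam_address(0xe0000000, 0, 0x1f, 0, 0)` is 0xe00f8000, so the one-page E820 reserved region [mem 0xe00f8000-0xe00f8fff] would be exactly the configuration space of function 00:1f.0. Whether that is what the firmware intended is a guess.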

    arch_initcall(pci_arch_init);

    static __init int pci_arch_init(void)
    {
        type = pci_direct_probe();
        pci_mmcfg_early_init();

In pci_direct_probe, we try to set the ops used to access the configuration space to pci_direct_conf1 (legacy access via 0xcf8); then pci_mmcfg_early_init tries to switch the ops to the mmcfg ops.

 
    void __init pci_mmcfg_early_init(void)
    {
        if (pci_probe & PCI_PROBE_MMCONF) {
            if (pci_mmcfg_check_hostbridge())
                known_bridge = 1;
            else
                acpi_sfi_table_parse(ACPI_SIG_MCFG, pci_parse_mcfg);
            __pci_mmcfg_init(1);

            set_apei_filter();
        }
    }

In this function, pci_mmcfg_check_hostbridge first checks whether there is any particular PCI device that has a customized mmcfg probe callback, by comparing against the predefined array pci_mmcfg_probes:

 
    static const struct pci_mmcfg_hostbridge_probe pci_mmcfg_probes[] __initconst = {
        { 0, PCI_DEVFN(0, 0), PCI_VENDOR_ID_INTEL,
          PCI_DEVICE_ID_INTEL_E7520_MCH, pci_mmcfg_e7520 },
        { 0, PCI_DEVFN(0, 0), PCI_VENDOR_ID_INTEL,
          PCI_DEVICE_ID_INTEL_82945G_HB, pci_mmcfg_intel_945 },
        { 0, PCI_DEVFN(0x18, 0), PCI_VENDOR_ID_AMD,
          0x1200, pci_mmcfg_amd_fam10h },
        { 0xff, PCI_DEVFN(0, 0), PCI_VENDOR_ID_AMD,
          0x1200, pci_mmcfg_amd_fam10h },
        { 0, PCI_DEVFN(0, 0), PCI_VENDOR_ID_NVIDIA,
          0x0369, pci_mmcfg_nvidia_mcp55 },
    };


Since we don't have any of these special PCI devices, we then check whether anything has already been registered in pci_mmcfg_list. The answer is no, because no mmcfg region has been probed yet.

Back in pci_mmcfg_early_init, we therefore have to enumerate the ACPI tables to find the MCFG table, which contains the config info. That is what

    acpi_sfi_table_parse(ACPI_SIG_MCFG, pci_parse_mcfg);

does: it first tries the ACPI tables and, if that fails, falls back to SFI to find the "MCFG" table. If the table is found, pci_parse_mcfg is invoked.

 
    static int __init pci_parse_mcfg(struct acpi_table_header *header)
    {
        struct acpi_table_mcfg *mcfg;
        struct acpi_mcfg_allocation *cfg_table;
        mcfg = (struct acpi_table_mcfg *)header;

        cfg_table = (struct acpi_mcfg_allocation *) &mcfg[1];
        for (i = 0; i < entries; i++) {
            cfg = &cfg_table[i];
            if (acpi_mcfg_check_entry(mcfg, cfg)) {
                free_all_mmcfg();
                return -ENODEV;
            }

            if (pci_mmconfig_add(cfg->pci_segment, cfg->start_bus_number,
                                 cfg->end_bus_number, cfg->address) == NULL) {
                pr_warn(PREFIX "no memory for MCFG entries\n");
                free_all_mmcfg();
                return -ENOMEM;
            }
        }

    }
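The excerpt above leaves out how `entries` is obtained. A minimal user-space sketch of the same idea, assuming the standard MCFG layout (a 36-byte ACPI table header plus 8 reserved bytes, followed by 16-byte allocation entries, all little-endian). The names `mcfg_parse`, `struct mcfg_entry` and `rd_le` are invented for this example and are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MCFG_HEADER_LEN 44u   /* 36-byte ACPI header + 8 reserved bytes */
#define MCFG_ENTRY_LEN  16u   /* one allocation entry */

struct mcfg_entry {
    uint64_t address;     /* base address of the MMCONFIG area */
    uint16_t segment;     /* PCI segment (domain) */
    uint8_t  start_bus;
    uint8_t  end_bus;
};

/* Read an nbytes-wide little-endian integer. */
static uint64_t rd_le(const uint8_t *p, unsigned nbytes)
{
    uint64_t v = 0;
    for (unsigned i = 0; i < nbytes; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* Parse up to max entries; returns the number parsed, or -1 on a bad table. */
static int mcfg_parse(const uint8_t *table, size_t avail,
                      struct mcfg_entry *out, size_t max)
{
    assert(table != NULL && out != NULL);
    if (avail < MCFG_HEADER_LEN || memcmp(table, "MCFG", 4) != 0)
        return -1;

    uint32_t length = (uint32_t)rd_le(table + 4, 4);  /* whole-table length */
    if (length < MCFG_HEADER_LEN || length > avail)
        return -1;

    /* Everything after the header is an array of fixed-size entries. */
    size_t entries = (length - MCFG_HEADER_LEN) / MCFG_ENTRY_LEN;
    if (entries > max)
        entries = max;

    for (size_t i = 0; i < entries; i++) {
        const uint8_t *e = table + MCFG_HEADER_LEN + i * MCFG_ENTRY_LEN;
        out[i].address   = rd_le(e, 8);
        out[i].segment   = (uint16_t)rd_le(e + 8, 2);
        out[i].start_bus = e[10];
        out[i].end_bus   = e[11];
    }
    return (int)entries;
}
```

A 60-byte table therefore carries exactly one entry; for the machine discussed here that entry would describe segment 0, buses 00-9b at 0xe0000000, going by the MMCONFIG log line.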

So this function deals with our lovely "MCFG" table. It skips the table header and parses the payload starting at mcfg[1]. Each entry in the payload is a struct acpi_mcfg_allocation. If the entry is valid, pci_mmconfig_add is invoked to allocate a new internal struct pci_mmcfg_region and add it to the list pci_mmcfg_list. Note that the entries in pci_mmcfg_list are sorted by start_bus in ascending order (list_add_tail(new, head) inserts new before head, which is useful for implementing a queue). And how do we determine the address range of each pci_mmcfg_region? That happens in pci_mmconfig_alloc:

 
static struct pci_mmcfg_region *pci_mmconfig_alloc(int segment, int start,
                                                   int end, u64 addr)
{
    struct pci_mmcfg_region *new;
    struct resource *res;

    if (addr == 0)
        return NULL;

    new = kzalloc(sizeof(*new), GFP_KERNEL);
    if (!new)
        return NULL;

    new->address = addr;
    new->segment = segment;
    new->start_bus = start;
    new->end_bus = end;

    res = &new->res;
    /* the two lines below define the extent of the region */
    res->start = addr + PCI_MMCFG_BUS_OFFSET(start);
    res->end = addr + PCI_MMCFG_BUS_OFFSET(end + 1) - 1;
    res->flags = IORESOURCE_MEM | IORESOURCE_BUSY;
    snprintf(new->name, PCI_MMCFG_RESOURCE_NAME_LEN,
             "PCI MMCONFIG %04x [bus %02x-%02x]", segment, start, end);
    res->name = new->name;

    return new;
}

#define PCI_MMCFG_BUS_OFFSET(bus)      ((bus) << 20)


OK, as PCI bus numbers start from zero, we know the firmware reserves 1<<20 bytes, that is 1 MB, for each PCI bus number.

 
pr_info(PREFIX
        "MMCONFIG for domain %04x [bus %02x-%02x] at %pR "
        "(base %#lx)\n",
        segment, start, end, &new->res, (unsigned long)addr);

PCI: MMCONFIG for domain 0000 [bus 00-9b] at [mem 0xe0000000-0xe9bfffff] (base 0xe0000000)
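
The arithmetic behind this log line can be checked with a small user-space sketch. The macro is the kernel one quoted above; the two helper names are hypothetical and only mirror the res->start / res->end assignments in pci_mmconfig_alloc.

```c
#include <stdint.h>

/* Same shift as the kernel macro quoted above: 1 MB of config space per bus. */
#define PCI_MMCFG_BUS_OFFSET(bus) ((uint64_t)(bus) << 20)

/* Hypothetical helper: first byte of the window, like res->start. */
static uint64_t mmcfg_res_start(uint64_t base, unsigned start_bus)
{
    return base + PCI_MMCFG_BUS_OFFSET(start_bus);
}

/* Hypothetical helper: last byte of the window (inclusive), like res->end. */
static uint64_t mmcfg_res_end(uint64_t base, unsigned end_bus)
{
    return base + PCI_MMCFG_BUS_OFFSET(end_bus + 1) - 1;
}
```

With base 0xe0000000 and buses 00-9b, the end is 0xe0000000 + (0x9c << 20) - 1 = 0xe9bfffff, which matches the range printed by the kernel.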


 

OK, we have wandered a little far, so let's go back to pci_mmcfg_early_init. After the payload entries (start_bus, end_bus, addr) have been added

to pci_mmcfg_list, we need to process these list entries. That is to say, if we want to access the config space in these regions,

we need to ioremap these addresses and provide a virtual address range for them. This is what

__pci_mmcfg_init(1) does for us.

 
static void __init __pci_mmcfg_init(int early)
{
    pci_mmcfg_reject_broken(early);
    if (list_empty(&pci_mmcfg_list))
        return;

    if (pcibios_last_bus < 0) {
        const struct pci_mmcfg_region *cfg;

        list_for_each_entry(cfg, &pci_mmcfg_list, list) {
            if (cfg->segment)
                break;
            pcibios_last_bus = cfg->end_bus;
        }
    }

    if (pci_mmcfg_arch_init())
        pci_probe = (pci_probe & ~PCI_PROBE_MASK) | PCI_PROBE_MMCONF;
    else {
        free_all_mmcfg();
        pci_mmcfg_arch_init_failed = true;
    }
}

First comes the sanity check:

 
static void __init pci_mmcfg_reject_broken(int early)
{
    struct pci_mmcfg_region *cfg;

    list_for_each_entry(cfg, &pci_mmcfg_list, list) {
        if (pci_mmcfg_check_reserved(NULL, cfg, early) == 0) {
            pr_info(PREFIX "not using MMCONFIG\n");
            free_all_mmcfg();
            return;
        }
    }
}

For the early check, the e820 map is used to check whether this new mmcfg region lies inside an

E820_RESERVED region. If it does not, the candidate size is halved and the check is repeated

on (start, start + size/2), and so on, until the size drops below the minimum region size, that is,

16<<20 bytes (16 MB). Since an mmcfg region should not be smaller than 16 MB, we then return a failure.

In our example, the mmcfg region is much bigger than the only e820 reserved region that falls inside it:

BIOS-e820: [mem 0x00000000e00f8000-0x00000000e00f8fff] reserved
PCI: MMCONFIG for domain 0000 [bus 00-9b] at [mem 0xe0000000-0xe9bfffff] (base 0xe0000000)

So the check returns an error here. The start address is the base, 0xe0000000, but where does 0xe9bfffff come from?

It follows from pci_mmconfig_alloc: end = addr + PCI_MMCFG_BUS_OFFSET(end_bus + 1) - 1 = 0xe0000000 + (0x9c << 20) - 1 = 0xe9bfffff.

The check thus fails in pci_mmcfg_reject_broken, which frees all the mmcfg entries previously added to

pci_mmcfg_list and prints:

[    0.218319] PCI: not using MMCONFIG

Thus __pci_mmcfg_init terminates, and so do pci_mmcfg_early_init and pci_arch_init.

We have to use the legacy config access callbacks, that is, via port 0xcf8:

[    0.218320] PCI: Using configuration type 1 for base access
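
The shrinking check described above can be sketched in user space as follows. This is a simplified model, not the kernel code: struct region, all_reserved and mmcfg_usable_size are hypothetical names, and all_reserved only accepts a range that lies inside one single reserved region.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one e820 reserved entry; end is inclusive. */
struct region {
    uint64_t start;
    uint64_t end;
};

#define MMCFG_MIN_SIZE (16ULL << 20)    /* 16 MB */

/* Simplification: [start, end) must lie inside ONE reserved region. */
static int all_reserved(const struct region *r, size_t n,
                        uint64_t start, uint64_t end)
{
    for (size_t i = 0; i < n; i++)
        if (start >= r[i].start && end - 1 <= r[i].end)
            return 1;
    return 0;
}

/* Returns the usable size of the window, or 0 if it must be rejected. */
static uint64_t mmcfg_usable_size(const struct region *r, size_t n,
                                  uint64_t addr, uint64_t size)
{
    uint64_t old_size = size;

    /* Halve the window until it is fully reserved or becomes too small. */
    while (!all_reserved(r, n, addr, addr + size)) {
        size >>= 1;
        if (size < MMCFG_MIN_SIZE)
            break;
    }

    if (size < MMCFG_MIN_SIZE && size != old_size)
        return 0;
    return size;
}
```

For the machine discussed here, the only reserved entry in the window is 0xe00f8000-0xe00f8fff (4 KB), so the 156 MB window is halved four times, drops below 16 MB, and is rejected.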

Later we get another chance to probe mmcfg, in:

subsys_initcall(acpi_init);

static int __init acpi_init(void)
{
    pci_mmcfg_late_init();
    acpi_scan_init();
    ...
}

void __init pci_mmcfg_late_init(void)
{
    /* MMCONFIG disabled */
    if ((pci_probe & PCI_PROBE_MMCONF) == 0)
        return;

    if (known_bridge)
        return;

    /* MMCONFIG hasn't been enabled yet, try again */
    if (pci_probe & PCI_PROBE_MASK & ~PCI_PROBE_MMCONF) {
        acpi_sfi_table_parse(ACPI_SIG_MCFG, pci_parse_mcfg);
        __pci_mmcfg_init(0);
    }
}


 

As the comment above says, the table is parsed again only if MMCONFIG hasn't been enabled yet.

If mmcfg had been initialized successfully, pci_probe would have PCI_PROBE_MMCONF set and the other probe-method bits cleared,

because once pci_mmcfg_reject_broken passes and pci_mmcfg_arch_init succeeds, __pci_mmcfg_init executes the following

(in our case, however, pci_mmcfg_reject_broken failed earlier):

pci_probe = (pci_probe & ~PCI_PROBE_MASK) | PCI_PROBE_MMCONF;
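
The bit logic can be tried out in isolation. In the sketch below the PCI_PROBE_* values are written as I recall them from arch/x86/include/asm/pci_x86.h, so treat them as assumptions; the two helper names are hypothetical, and the known_bridge test from pci_mmcfg_late_init is left out.

```c
/* Assumed values, recalled from arch/x86/include/asm/pci_x86.h. */
#define PCI_PROBE_BIOS   0x0001
#define PCI_PROBE_CONF1  0x0002
#define PCI_PROBE_CONF2  0x0004
#define PCI_PROBE_MMCONF 0x0008
#define PCI_PROBE_MASK   0x000f

/* What __pci_mmcfg_init does to pci_probe on success. */
static unsigned mmconf_enabled(unsigned pci_probe)
{
    return (pci_probe & ~PCI_PROBE_MASK) | PCI_PROBE_MMCONF;
}

/* Whether pci_mmcfg_late_init would parse MCFG again. */
static int late_init_retries(unsigned pci_probe)
{
    if ((pci_probe & PCI_PROBE_MMCONF) == 0)
        return 0;   /* MMCONFIG disabled */
    /* any other probe method still set means MMCONFIG is not enabled yet */
    return (pci_probe & PCI_PROBE_MASK & ~PCI_PROBE_MMCONF) != 0;
}
```

With all four probe bits still set (0xf), late_init_retries() is true, which is our situation after the early failure. After a successful init only PCI_PROBE_MMCONF remains in the masked bits, so the late path is skipped.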

Back in pci_mmcfg_late_init, acpi_sfi_table_parse(ACPI_SIG_MCFG, pci_parse_mcfg) will

enumerate the MCFG table again and then invoke __pci_mmcfg_init (the same function listed earlier) with the parameter 0,

indicating that this is not the early init anymore.

This time, while traversing pci_mmcfg_list, pci_mmcfg_reject_broken checks for conflicts against

"ACPI motherboard resources" rather than "E820". The checking callback is

is_acpi_reserved:

 
static int is_acpi_reserved(u64 start, u64 end, unsigned not_used)
{
    struct resource mcfg_res;

    mcfg_res.start = start;
    mcfg_res.end = end - 1;
    mcfg_res.flags = 0;

    acpi_get_devices("PNP0C01", find_mboard_resource, &mcfg_res, NULL);

    if (!mcfg_res.flags)
        acpi_get_devices("PNP0C02", find_mboard_resource, &mcfg_res,
                         NULL);

    return mcfg_res.flags;
}


So this conflict detection compares the mmcfg region against the motherboard resources owned by the

ACPI devices "PNP0C01" and "PNP0C02": it walks the resources under _CRS of each such device

and checks whether mcfg_res lies entirely inside one of the _CRS resource regions.

After a PNP0C01 device is found, its _CRS is checked:

 
static acpi_status find_mboard_resource(acpi_handle handle, u32 lvl,
                                        void *context, void **rv)
{
    struct resource *mcfg_res = context;

    acpi_walk_resources(handle, METHOD_NAME__CRS,
                        check_mcfg_resource, context);

    if (mcfg_res->flags)
        return AE_CTRL_TERMINATE;

    return AE_OK;
}


Each resource in the _CRS is compared with mcfg_res. If one of the fixed memory _CRS regions

contains mcfg_res, mcfg_res->flags is set as the success indicator and the resource walk terminates.
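
The containment test itself is a simple interval comparison. A minimal sketch, assuming a hypothetical struct fixed_mem32 that stands in for a Memory32Fixed _CRS entry; the 256 MB region used in the note below is an assumed example, not a value read from this machine's tables.

```c
#include <stdint.h>

/* Hypothetical stand-in for a Memory32Fixed _CRS entry. */
struct fixed_mem32 {
    uint32_t address;
    uint32_t address_length;
};

/* True if [start, end_inclusive] lies entirely inside the _CRS region. */
static int covers_mcfg(const struct fixed_mem32 *crs,
                       uint64_t start, uint64_t end_inclusive)
{
    /* first byte past the region, computed in 64 bits to avoid overflow */
    uint64_t crs_end = (uint64_t)crs->address + crs->address_length;

    return start >= crs->address && end_inclusive < crs_end;
}
```

An assumed 256 MB motherboard resource at 0xe0000000 would cover [0xe0000000, 0xe9bfffff], while the 4 KB region at 0xe00f8000 seen in e820 would not.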

Back in __pci_mmcfg_init, the check in pci_mmcfg_reject_broken has now passed, so

we need to ioremap these addresses in pci_mmcfg_arch_init:

cfg->virt = mcfg_ioremap(cfg);

How is the remapping done? We have the base addr, start_bus and end_bus, so we remap

(end_bus - start_bus + 1) * 1M bytes starting at addr + start_bus * 1M, with the nocache attribute:

 
static void __iomem *mcfg_ioremap(struct pci_mmcfg_region *cfg)
{
    void __iomem *addr;
    u64 start, size;
    int num_buses;

    start = cfg->address + PCI_MMCFG_BUS_OFFSET(cfg->start_bus);
    num_buses = cfg->end_bus - cfg->start_bus + 1;
    size = PCI_MMCFG_BUS_OFFSET(num_buses);
    addr = ioremap_nocache(start, size);
    if (addr)
        addr -= PCI_MMCFG_BUS_OFFSET(cfg->start_bus);
    return addr;
}

Note that an address mapped by ioremap_nocache must be accessed through readw/writew and friends, because
the address being mapped is likely a PCI bus address. Although a PCI bus address is the same

as the CPU address on x86, this might not be the case on other platforms.
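
Once the window is mapped, a config register is located by a fixed offset inside it: 1 MB per bus, 4 KB per device/function. A minimal sketch of that address arithmetic, where mmcfg_phys_addr is a hypothetical helper and PCI_DEVFN follows the usual kernel definition:

```c
#include <stdint.h>

#define PCI_DEVFN(slot, func)     ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_MMCFG_BUS_OFFSET(bus) ((uint64_t)(bus) << 20)

/* Hypothetical helper: physical address of one config register. */
static uint64_t mmcfg_phys_addr(uint64_t base, unsigned bus,
                                unsigned devfn, unsigned reg)
{
    /* 1 MB per bus, 4 KB per device/function, then the register offset */
    return base + (PCI_MMCFG_BUS_OFFSET(bus) | ((uint64_t)devfn << 12) | reg);
}
```

For example, register 0x1c of device 00:1c.0 with base 0xe0000000 lands at 0xe00e001c, and the last byte of bus 0x9b (devfn 0xff, offset 0xfff) is 0xe9bfffff, the end of the window.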

After the remapping, the ops are switched to the mmcfg ops:

raw_pci_ext_ops = &pci_mmcfg;
 
[ 0.238097] PCI: MMCONFIG for domain 0000 [bus 00-9b] at [mem 0xe0000000-0xe9bfffff] (base 0xe0000000)
[ 0.238420] PCI: MMCONFIG at [mem 0xe0000000-0xe9bfffff] reserved in ACPI motherboard resources


Then later we have:

[    0.238426] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug

So for some special platforms, we don't use _CRS, but in our case, we use _CRS by default.

Later we have:

[    0.243396] acpi PNP0A08:00: [Firmware Info]: MMCONFIG for domain 0000 [bus 00-9b] only partially covers this bridge

OK, this is printed while the ACPI PCI root device is being enumerated.

Let's go back to subsys_initcall(acpi_init);

static int __init acpi_init(void)
{
    pci_mmcfg_late_init();
    acpi_scan_init();
}

The message above is printed in acpi_scan_init, and the code path is a little long:

int __init acpi_scan_init(void)
{
    acpi_pci_root_init();
    acpi_bus_scan(ACPI_ROOT_OBJECT);
}

acpi_pci_root_init:

acpi_scan_add_handler_with_hotplug(&pci_root_handler, "pci_root");

static const struct acpi_device_id root_device_ids[] = {
    {"PNP0A03", 0},
    {"", 0},
};

static struct acpi_scan_handler pci_root_handler = {
    .ids = root_device_ids,
    .attach = acpi_pci_root_add,
    .detach = acpi_pci_root_remove,
    .hotplug = {
        .enabled = true,
        .scan_dependent = acpi_pci_root_scan_dependent,
    },
};


 

Thus we register pci_root_handler as the scan handler for PCI root devices (those matching root_device_ids, i.e. "PNP0A03"),

which means that later, in acpi_bus_scan, pci_root_handler is invoked once we encounter such a device:

acpi_bus_attach(device);

then:

acpi_scan_attach_handler(device);

then:

acpi_pci_root_add

In acpi_pci_root_add, we get the root bridge's downstream bus range

by checking _CRS:

/* Check _CRS first, then _BBN. If no _BBN, default to zero. */
root->secondary.flags = IORESOURCE_BUS;
status = try_get_root_bridge_busnr(handle, &root->secondary);

So once we find the PCI root bridge through PNP0A03, we evaluate the _CRS of this

device, and the bus-number resource in that _CRS is the bus range below the root bridge

(the device shows up as PNP0A08:00 here and is matched through its compatible ID PNP0A03).

The bus range is stored in root->secondary, and later the mmcfg region is compared with it:
 

 
then:

pci_acpi_scan_root

then:

acpi_pci_root_create

then:

acpi_pci_root_ops.init_info

then:

static int pci_acpi_root_init_info(struct acpi_pci_root_info *ci)
{
    return setup_mcfg_map(ci);
}



 

Thus in setup_mcfg_map we try to insert the bridge's bus range into the mmcfg list:

static int setup_mcfg_map(struct acpi_pci_root_info *ci)
{
    int result, seg;
    struct pci_root_info *info;
    struct acpi_pci_root *root = ci->root;
    struct device *dev = &ci->bridge->dev;

    info = container_of(ci, struct pci_root_info, common);
    info->start_bus = (u8)root->secondary.start;
    info->end_bus = (u8)root->secondary.end;
    info->mcfg_added = false;
    seg = info->sd.domain;

    /* return success if MMCFG is not in use */
    if (raw_pci_ext_ops && raw_pci_ext_ops != &pci_mmcfg)
        return 0;

    if (!(pci_probe & PCI_PROBE_MMCONF))
        return check_segment(seg, dev, "MMCONFIG is disabled,");

    result = pci_mmconfig_insert(dev, seg, info->start_bus, info->end_bus,
                                 root->mcfg_addr);
    if (result == 0) {
        /* enable MMCFG if it hasn't been enabled yet */
        if (raw_pci_ext_ops == NULL)
            raw_pci_ext_ops = &pci_mmcfg;
        info->mcfg_added = true;
    } else if (result != -EEXIST)
        return check_segment(seg, dev,
                             "fail to add MMCONFIG information,");

    return 0;
}

/* Add MMCFG information for host bridges */
int pci_mmconfig_insert(struct device *dev, u16 seg, u8 start, u8 end,
                        phys_addr_t addr)
{
    struct pci_mmcfg_region *cfg;

    /* ... argument checks elided in this excerpt ... */

    mutex_lock(&pci_mmcfg_lock);
    cfg = pci_mmconfig_lookup(seg, start);
    if (cfg) {
        if (cfg->end_bus < end)
            dev_info(dev, FW_INFO
                     "MMCONFIG for "
                     "domain %04x [bus %02x-%02x] "
                     "only partially covers this bridge\n",
                     cfg->segment, cfg->start_bus, cfg->end_bus);
        mutex_unlock(&pci_mmcfg_lock);
        return -EEXIST;
    }

    /* ... allocation and insertion of a new region elided in this excerpt ... */
}


OK, so this checks whether the bus range (start, end) described by _CRS of the PCI root bridge is already covered by

an entry in the mmcfg list. Only an entry that satisfies the following condition fully covers the bridge:

cfg->start_bus <= start && cfg->end_bus >= end. That is not the case here, hence the message:

[    0.243396] acpi PNP0A08:00: [Firmware Info]: MMCONFIG for domain 0000 [bus 00-9b] only partially covers this bridge


PNP0A08 is the PCI root bridge, whose bus range comes from its _CRS (the device is matched through PNP0A03).

OK, so this smells like a firmware issue.
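
The two conditions can be written down as a small sketch. struct bus_range and both helpers are hypothetical, and the root bridge range 00-ff used in the note below is an assumption, since the bridge's _CRS bus range is not shown here.

```c
/* Hypothetical stand-in for the bus range of one mmcfg list entry. */
struct bus_range {
    unsigned start_bus;
    unsigned end_bus;
};

/* The entry covers every bus in [start, end]. */
static int fully_covers(const struct bus_range *cfg,
                        unsigned start, unsigned end)
{
    return cfg->start_bus <= start && cfg->end_bus >= end;
}

/* The entry contains the bridge's first bus but ends before its last bus. */
static int partially_covers(const struct bus_range *cfg,
                            unsigned start, unsigned end)
{
    return cfg->start_bus <= start && start <= cfg->end_bus &&
           cfg->end_bus < end;
}
```

With the MCFG entry [00-9b] and an assumed bridge range of [00-ff], partially_covers() is true and fully_covers() is false, which corresponds to the "only partially covers this bridge" message.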

OK, back to the original bug report.

In order to figure it out, we can skip setting the memory aperture during boot and then, after boot, set the aperture to another window.

But before that, I also tried clearing the I/O aperture after boot, and it seemed to have no effect:

 
# setpci -s 0000:00:1c.0 IO_BASE
20
# setpci -s 0000:00:1c.0 IO_LIMIT
20
# setpci -s 0000:00:1c.0 IO_BASE_UPPER16
0000
# setpci -s 0000:00:1c.0 IO_LIMIT_UPPER16
0000

3.
# setpci -s 0000:00:1c.0 IO_BASE.B=f0
# setpci -s 0000:00:1c.0 IO_LIMIT.B=0
# setpci -s 0000:00:1c.0 IO_BASE_UPPER16.W=0
# setpci -s 0000:00:1c.0 IO_LIMIT_UPPER16.W=0

4.
# setpci -s 0000:00:1c.0 IO_BASE
f0
# setpci -s 0000:00:1c.0 IO_LIMIT
00
# setpci -s 0000:00:1c.0 IO_BASE_UPPER16
0000
# setpci -s 0000:00:1c.0 IO_LIMIT_UPPER16
0000
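
To read these register values: in a PCI-to-PCI bridge, the upper nibble of IO_BASE and IO_LIMIT holds address bits 15:12, and the UPPER16 registers hold bits 31:16. A minimal decoding sketch, with struct io_window and decode_io_window as hypothetical names:

```c
#include <stdint.h>

/* Hypothetical decoded view of a bridge I/O window. */
struct io_window {
    uint32_t base;
    uint32_t limit;     /* inclusive */
    int enabled;        /* the window forwards I/O only if base <= limit */
};

static struct io_window decode_io_window(uint8_t io_base, uint8_t io_limit,
                                         uint16_t base_upper16,
                                         uint16_t limit_upper16)
{
    struct io_window w;

    /* upper nibble of each register = address bits 15:12 */
    w.base = ((uint32_t)base_upper16 << 16) |
             ((uint32_t)(io_base & 0xf0) << 8);
    /* the low 12 bits of the limit are implicitly all ones */
    w.limit = ((uint32_t)limit_upper16 << 16) |
              ((uint32_t)(io_limit & 0xf0) << 8) | 0xfff;
    w.enabled = w.base <= w.limit;
    return w;
}
```

The original values 20/20 decode to the window 0x2000-0x2fff, that is, size 0x1000. After writing f0/00 the base (0xf000) is above the limit (0x0fff), so the bridge no longer forwards any I/O range.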


 

OK, then Bjorn suggested leaving the bridge alone without declaring any apertures, so we did:

 
1.
setpci -s 0000:00:1c.0 MEMORY_BASE
setpci -s 0000:00:1c.0 MEMORY_LIMIT
setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT

2.
setpci -s 0000:00:1c.0 MEMORY_BASE.W=f000
setpci -s 0000:00:1c.0 MEMORY_LIMIT.W=f020
setpci -s 0000:00:1c.0 PREF_MEMORY_BASE.W=f020
setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT.w=f040

3.
setpci -s 0000:00:1c.0 MEMORY_BASE
setpci -s 0000:00:1c.0 MEMORY_LIMIT
setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT


So no one should be using

[f0000000-f01fffff] and [f0200000 - f03fffff pref]

because neither e820 nor iomem declares these regions:

 
  1. [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x0000000000057fff] usable

  2. [ 0.000000] BIOS-e820: [mem 0x0000000000058000-0x0000000000058fff] reserved

  3. [ 0.000000] BIOS-e820: [mem 0x0000000000059000-0x000000000009ffff] usable

  4. [ 0.000000] BIOS-e820: [mem 0x00000000000a0000-0x00000000000bffff] reserved

  5. [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000078d00fff] usable

  6. [ 0.000000] BIOS-e820: [mem 0x0000000078d01000-0x0000000078d48fff] ACPI NVS

  7. [ 0.000000] BIOS-e820: [mem 0x0000000078d49000-0x0000000078d5cfff] usable

  8. [ 0.000000] BIOS-e820: [mem 0x0000000078d5d000-0x0000000078d8efff] ACPI data

  9. [ 0.000000] BIOS-e820: [mem 0x0000000078d8f000-0x0000000078e39fff] usable

  10. [ 0.000000] BIOS-e820: [mem 0x0000000078e3a000-0x0000000078e8efff] reserved

  11. [ 0.000000] BIOS-e820: [mem 0x0000000078e8f000-0x0000000078ecbfff] usable

  12. [ 0.000000] BIOS-e820: [mem 0x0000000078ecc000-0x0000000078efefff] type 20

  13. [ 0.000000] BIOS-e820: [mem 0x0000000078eff000-0x0000000078f87fff] usable

  14. [ 0.000000] BIOS-e820: [mem 0x0000000078f88000-0x0000000078fdefff] reserved

  15. [ 0.000000] BIOS-e820: [mem 0x0000000078fdf000-0x0000000078ffffff] usable

  16. [ 0.000000] BIOS-e820: [mem 0x0000000079000000-0x000000007f9fffff] reserved

  17. [ 0.000000] BIOS-e820: [mem 0x00000000e00f8000-0x00000000e00f8fff] reserved

  18. [ 0.000000] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved

  19. [ 0.000000] BIOS-e820: [mem 0x00000000ffd70000-0x00000000ffd9ffff] reserved

  20. [ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000047f5fffff] usable

 
  1. 00000000-00000fff : reserved

  2. 00001000-00057fff : System RAM

  3. 00058000-00058fff : reserved

  4. 00059000-0009ffff : System RAM

  5. 000a0000-000bffff : PCI Bus 0000:00

  6. 000a0000-000bffff : reserved

  7. 000c0000-000c3fff : PCI Bus 0000:00

  8. 000c4000-000c7fff : PCI Bus 0000:00

  9. 000c8000-000cbfff : PCI Bus 0000:00

  10. 000cc000-000cffff : PCI Bus 0000:00

  11. 000d0000-000d3fff : PCI Bus 0000:00

  12. 000d4000-000d7fff : PCI Bus 0000:00

  13. 000d8000-000dbfff : PCI Bus 0000:00

  14. 000dc000-000dffff : PCI Bus 0000:00

  15. 000e0000-000e3fff : PCI Bus 0000:00

  16. 000e4000-000e7fff : PCI Bus 0000:00

  17. 000e8000-000ebfff : PCI Bus 0000:00

  18. 000ec000-000effff : PCI Bus 0000:00

  19. 000f0000-000fffff : PCI Bus 0000:00

  20. 000f0000-000fffff : System ROM

  21. 00100000-78d00fff : System RAM

  22. 01a00000-0226647c : Kernel code

  23. 0226647d-02bb843f : Kernel data

  24. 02da4000-02fe9fff : Kernel bss

  25. 78d01000-78d48fff : ACPI Non-volatile Storage

  26. 78d49000-78d5cfff : System RAM

  27. 78d5d000-78d8efff : ACPI Tables

  28. 78d8f000-78e39fff : System RAM

  29. 78e3a000-78e8efff : reserved

  30. 78e8f000-78ecbfff : System RAM

  31. 78ecc000-78efefff : reserved

  32. 78eff000-78f87fff : System RAM

  33. 78f88000-78fdefff : reserved

  34. 78fdf000-78ffffff : System RAM

  35. 79000000-7f9fffff : reserved

  36. 7fa00000-feafffff : PCI Bus 0000:00

  37. 7fa00000-7fbfffff : PCI Bus 0000:02

  38. 7fc00000-7fdfffff : PCI Bus 0000:02

  39. 80000000-8fffffff : PCI Bus 0000:04

  40. 80000000-8fffffff : 0000:04:00.0

  41. 80000000-8fffffff : S2 MEM

  42. 90000000-9fffffff : 0000:00:02.0

  43. 90000000-91436fff : BOOTFB

  44. a0000000-a03fffff : 0000:00:02.0

  45. a0400000-a08fffff : PCI Bus 0000:03

  46. a0400000-a07fffff : 0000:03:00.0

  47. a0800000-a0807fff : 0000:03:00.0

  48. a0900000-a0afffff : PCI Bus 0000:04

  49. a0900000-a09fffff : 0000:04:00.0

  50. a0900000-a09fffff : ISP IO

  51. a0a00000-a0a0ffff : 0000:04:00.0

  52. a0a00000-a0a0ffff : S2 IO

  53. a0b00000-a0bfffff : PCI Bus 0000:01

  54. a0b00000-a0b01fff : 0000:01:00.0

  55. a0b00000-a0b01fff : ahci

  56. a0c00000-a0c0ffff : 0000:00:14.0

  57. a0c00000-a0c0ffff : xhci-hcd

  58. a0c10000-a0c13fff : 0000:00:03.0

  59. a0c10000-a0c13fff : ICH HD audio

  60. a0c14000-a0c17fff : 0000:00:1b.0

  61. a0c14000-a0c17fff : ICH HD audio

  62. a0c18000-a0c18fff : 0000:00:1f.6

  63. a0c19000-a0c190ff : 0000:00:1f.3

  64. a0c19100-a0c1910f : 0000:00:16.0

  65. a0d00000-acdfffff : PCI Bus 0000:05

  66. a0d00000-a8dfffff : PCI Bus 0000:06

  67. a0d00000-a0dfffff : PCI Bus 0000:07

  68. a0d00000-a0d3ffff : 0000:07:00.0

  69. a0d00000-a0d3ffff : thunderbolt

  70. a0d40000-a0d40fff : 0000:07:00.0

  71. a0e00000-a4dfffff : PCI Bus 0000:08

  72. a0e00000-a12fffff : PCI Bus 0000:09

  73. a0e00000-a12fffff : PCI Bus 0000:0a

  74. a0e00000-a0e0ffff : 0000:0a:00.0

  75. a4e00000-a8dfffff : PCI Bus 0000:3a

  76. ace00000-b8efffff : PCI Bus 0000:05

  77. ace00000-b4efffff : PCI Bus 0000:06

  78. ace00000-b0efffff : PCI Bus 0000:08

  79. ace00000-acefffff : PCI Bus 0000:09

  80. ace00000-acefffff : PCI Bus 0000:0a

  81. ace00000-ace0ffff : 0000:0a:00.0

  82. ace00000-ace0ffff : tg3

  83. ace10000-ace1ffff : 0000:0a:00.0

  84. ace10000-ace1ffff : tg3

  85. b0f00000-b4efffff : PCI Bus 0000:3a

  86. e00f8000-e00f8fff : reserved

  87. fec00000-fec003ff : IOAPIC 0

  88. fed00000-fed03fff : pnp 00:00

  89. fed00000-fed003ff : HPET 0

  90. fed10000-fed17fff : pnp 00:04

  91. fed18000-fed18fff : pnp 00:04

  92. fed19000-fed19fff : pnp 00:04

  93. fed1c000-fed1ffff : reserved

  94. fed1c000-fed1ffff : pnp 00:04

  95. fed1f410-fed1f414 : iTCO_wdt.0.auto

  96. fed20000-fed3ffff : pnp 00:04

  97. fed40000-fed44fff : PCI Bus 0000:00

  98. fed45000-fed8ffff : pnp 00:04

  99. fed90000-fed93fff : pnp 00:04

  100. fed90000-fed90fff : dmar0

  101. fed91000-fed91fff : dmar1

  102. fee00000-feefffff : pnp 00:04

  103. fee00000-fee00fff : Local APIC

  104. fef00000-fef0ffff : APP0001:00

  105. ff000000-ffffffff : INT0800:00

  106. ffd70000-ffd9ffff : reserved

  107. 100000000-47f5fffff : System RAM

  108. 47f600000-47fffffff : RAM buffer


And the following is my response to Bjorn:

 
I've grepped for CRS in the table; there are mainly two kinds of CRS provided:
the 1st is dynamically allocated; since the problematic mem aperture is not
inside any of the e820 regions, this is not the cause.
The 2nd is the Memory32Fixed resource template, and I do not see any conflict with
this mem aperture.

I think the firmware has really given some incompatible info to Linux, and
as you suggested, I've replied in the Bugzilla to try setting the aperture to
windows other than the problematic address, with the setpci command, and let's see
what happens then.

The SystemIO error message above is printed if the ACPI Operation Region conflicts with
any native driver, but since 1804 is very far from efa0, I don't know if this has
an impact on that.

Another important piece of info is that on a similar MacBook Pro 12, on which
poweroff works (the problematic one is MacBook Pro 11), we still see the above MMCFG conflict with
e820 and the host bridge mem window, as well as the SystemIO warnings. But the difference is
that MacBook Pro 12 does not have this pci bridge (8c10). So I suspect this is
highly related to this pci device.
Related bugzilla on MacBook Pro 12 at:
https://bugzilla.kernel.org/show_bug.cgi?id=101681
dmesg at:
https://bugzilla.kernel.org/attachment.cgi?id=185781

Previously I enabled the grub debug feature; it shows that when halt is invoked, grub tries
to halt by filling the ACPI Sleep Register at ioport 0x1804 with S5. In Linux,
we implement poweroff by the same method, i.e., outb _S5 to 0x1804.
Yes, maybe this is related to something like SMM.

Mac OS is not using this device. Hmm, I did not quite understand how to add
a device below it; do we need to hack the hardware?

Since this issue is likely caused by a suspicious device using that address, or
maybe that pci bridge device is actually broken, and since it only happens on Mac Pro 11,
I'm still wondering whether dmi+quirk would be the simplest workaround to make it work
for now.


Reposted from blog.csdn.net/bjzhaoxiao/article/details/81268798