Scrapy's XPath cannot match <tbody> tags

I just ran into a small problem while using Scrapy's XPath selectors. I stumbled on it by accident, and it is both a pitfall and an interesting quirk, so I am writing it up here as a note.

Problem: an XPath expression copied from the browser returns nothing in Scrapy when the path goes through a <tbody> element.

In the Chrome browser I use, inspecting an element lets you copy its XPath expression directly. But when I ran the spider with that expression, the target data never came back.



With the copied expression, the result was empty. So I started tracking the problem down, removing tags from the end of the expression one at a time until something matched.


The expression only returned a result once it was cut back to <table>, which means the step right after <table>, namely <tbody>, is where matching fails. I also tried a few other ways of extracting the data, but none of them worked.
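The shorten-and-retry process above can be sketched in a few lines. This is only an illustration, not my actual spider: the sample markup, the step list, and the helper name longest_matching_prefix are all made up, and the standard library's xml.etree.ElementTree stands in for Scrapy's selector so the snippet runs without Scrapy installed.

```python
import xml.etree.ElementTree as ET

# Hypothetical page source as the server sends it: note there is no <tbody>.
RAW_HTML = """<html><body><div><table>
<tr><td>Alice</td><td>30</td></tr>
<tr><td>Bob</td><td>25</td></tr>
</table></div></body></html>"""


def longest_matching_prefix(root, steps):
    """Drop steps from the end until the remaining path matches something."""
    for end in range(len(steps), 0, -1):
        path = "./" + "/".join(steps[:end])
        if root.findall(path):
            return steps[:end]
    return []


root = ET.fromstring(RAW_HTML)  # root is the <html> element

# Steps of a path as a browser would report it, relative to <html>.
copied = ["body", "div", "table", "tbody", "tr", "td"]

matched = longest_matching_prefix(root, copied)
print("/".join(matched))  # prints: body/div/table
```

The first step that gets dropped for good is the one worth inspecting. Here the path only matches once tbody and everything after it are gone, which is the same symptom I saw.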

It was a headache. After repeated experiments, I went back over the page structure and tried skipping that tag in the expression. I did not expect that to work, but it did.


With the <tbody> step removed, the rest of the path matches. I found that very interesting, so I googled the reason. The explanations I found were not very detailed, but the gist is that the browser adds a <tbody> element when it parses the page, so the copied path describes the browser's DOM and not the raw HTML that Scrapy downloads. As far as I can tell, the browser inserts <tbody> when rows sit directly under <table> without one, though I have not checked whether other rules are involved.
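One way to apply this fix without editing every copied expression by hand is a small helper that drops the tbody steps. This is a minimal sketch: strip_tbody is a hypothetical name, the sample expression is invented, and it assumes the raw page really has no <tbody> (if the source does contain one, stripping it would break the match instead).

```python
import re


def strip_tbody(xpath: str) -> str:
    """Remove every 'tbody' step, with an optional [n] index, from an XPath."""
    return re.sub(r"/tbody(\[\d+\])?(?=/|$)", "", xpath)


# Hypothetical expression as copied from the browser's inspector.
copied = "/html/body/div[2]/table/tbody/tr[1]/td[2]/text()"

print(strip_tbody(copied))
# prints: /html/body/div[2]/table/tr[1]/td[2]/text()

# In a Scrapy callback this could be used as, for example:
#     response.xpath(strip_tbody(copied)).getall()
```

A more forgiving alternative is to use the descendant axis, for example //table//tr, which matches rows whether or not a <tbody> sits in between.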

So for now, watch out for <tbody> in XPath expressions copied from the browser. If you run into the same symptom, this may well be the cause.


