Reading the src value of every img tag in HTML with Java

Original article: https://blog.csdn.net/zgs_shmily/article/details/49799997

How it works: use regular-expression matching to find the img tags, then pull the src value out of each one.

Straight to the code:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public List<String> getImgSrc(String htmlStr) {
    List<String> pics = new ArrayList<String>();
    // Image tag pattern: [^>]* keeps each match inside a single <img ...> tag
    String regEx_img = "<img\\b[^>]*?\\ssrc\\s*=[^>]*>";
    Pattern p_image = Pattern.compile(regEx_img, Pattern.CASE_INSENSITIVE);
    // Match src: captures the value, with or without double quotes
    Pattern p_src = Pattern.compile("\\ssrc\\s*=\\s*\"?(.*?)(\"|>|\\s+)", Pattern.CASE_INSENSITIVE);
    Matcher m_image = p_image.matcher(htmlStr);
    while (m_image.find()) {
        // Search only the current tag so that each src is added exactly once
        String img = m_image.group();
        Matcher m = p_src.matcher(img);
        if (m.find()) {
            pics.add(m.group(1));
        }
    }
    return pics;
}
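For experimenting with the idea, here is a minimal runnable sketch. It is a variation rather than the method from the original post: it uses a single pattern with three capture groups instead of two passes, so that double-quoted, single-quoted, and unquoted values are all handled. The class name ImgSrcDemo, the method name extractImgSrc, and the sample HTML are assumptions made for illustration. Like any regex-based approach it is only approximate; for example, an attribute value containing ">" before src would break the match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative, not from the original post.
public class ImgSrcDemo {

    // Finds an <img> tag and captures its src value.
    // Group 1: double-quoted value, group 2: single-quoted value, group 3: unquoted value.
    // "\\ssrc" requires whitespace before src, so attributes such as data-src are skipped.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\ssrc\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))",
            Pattern.CASE_INSENSITIVE);

    public static List<String> extractImgSrc(String html) {
        List<String> result = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            // Exactly one of the three groups is non-null for each match.
            String src = m.group(1) != null ? m.group(1)
                       : m.group(2) != null ? m.group(2)
                       : m.group(3);
            result.add(src);
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<p>text</p><img src=\"a.png\" alt=\"A\">"
                + "<script src=\"app.js\"></script>"
                + "<IMG class='pic' SRC='b.jpg'/><img src=c.gif>";
        System.out.println(extractImgSrc(html)); // prints [a.png, b.jpg, c.gif]
    }
}
```

The script tag's src is ignored because the pattern only starts matching at "<img". When the HTML is messy or comes from untrusted sources, a real HTML parser is usually a safer choice than a regular expression.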

Reposted from blog.csdn.net/tanga842428/article/details/82219510