Running into incomplete tags in HTML

On front-end pages such as HTML or JSP, pay close attention to where HTML tags appear.

Once an incomplete HTML tag ends up on the page, the layout breaks, and it is often far from obvious why.

I once ran into exactly this: a system page had always rendered correctly, and the data itself looked fine.

But after a new batch of data was imported, the HTML page no longer displayed properly, and the layout was badly broken.

Then I remembered that some modules on the JSP page take a piece of text, which may contain HTML tags, and display an excerpt of it.

The excerpt is simply the text cut to a fixed number of characters, wherever that cut happens to fall.

So the question is: what happens if the cut leaves an incomplete HTML tag behind?

The answer is simple: the page breaks badly, and you are left baffled, because it was working fine before! O(∩_∩)O haha~
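To make the failure concrete, here is a minimal sketch of how a fixed-length cut can slice a tag in half. The class name, the method names, and the sample string are all made up for illustration; they are not taken from the original system.

```java
public class TruncationDemo {
    // Cut the text to at most maxLen characters, the way a naive excerpt module might.
    static String naiveTruncate(String text, int maxLen) {
        return text.length() <= maxLen ? text : text.substring(0, maxLen);
    }

    // True when the last '<' in the fragment has no '>' after it,
    // i.e. the cut landed in the middle of a tag.
    static boolean endsInsideTag(String fragment) {
        int lastOpen = fragment.lastIndexOf('<');
        return lastOpen >= 0 && fragment.indexOf('>', lastOpen) < 0;
    }

    public static void main(String[] args) {
        String imported = "Hello <a href=\"/news/1\">world</a>";
        String excerpt = naiveTruncate(imported, 12);
        System.out.println(excerpt);                // Hello <a hre
        System.out.println(endsInsideTag(excerpt)); // true
    }
}
```

Note that this check only catches a tag that was cut in half. An excerpt such as `<b>Hello` has a complete opening tag but no closing `</b>`, which can disturb the rest of the page just as much and would not be flagged here.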

Here are some ways to remove HTML tags:

Extracting only the Chinese content on the Java (JSP) side:

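import java.util.regex.Matcher;
import java.util.regex.Pattern;

String regex = "([\u4e00-\u9fa5]+)"; // one or more consecutive Chinese characters
String aimStr = ""; // the text to search
Matcher matcher = Pattern.compile(regex).matcher(aimStr);
if (matcher.find()) {
	aimStr = matcher.group(0); // keep only the first run of Chinese characters
}
System.out.println(aimStr);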
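The snippet above keeps only the first run of Chinese characters, because `find()` is called once. If every run is wanted, one option is to loop over the matches. The following is a sketch under that assumption; `ChineseExtractor` and `chineseRuns` are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ChineseExtractor {
    // Same character range as the snippet above.
    private static final Pattern CHINESE = Pattern.compile("[\\u4e00-\\u9fa5]+");

    // Collect every run of consecutive Chinese characters, in order of appearance.
    static List<String> chineseRuns(String text) {
        List<String> runs = new ArrayList<>();
        Matcher matcher = CHINESE.matcher(text);
        while (matcher.find()) {
            runs.add(matcher.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        String text = "<p>你好, <b>世界</b>!</p>";
        System.out.println(chineseRuns(text));                  // [你好, 世界]
        System.out.println(String.join("", chineseRuns(text))); // 你好世界
    }
}
```

Joining the runs drops all punctuation, Latin letters, and digits along with the tags, so this only suits excerpts where the Chinese text alone is enough.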

 

Removing HTML tags on the Java (JSP) side:

// Remember to import java.util.regex.Matcher and java.util.regex.Pattern
public String removeHtmlTag(String htmlStr) { // htmlStr: text content with HTML tags
	String regEx_script = "<script[^>]*?>[\\s\\S]*?<\\/script>"; // remove script blocks
	String regEx_style = "<style[^>]*?>[\\s\\S]*?<\\/style>"; // remove style blocks
	String regEx_html = "<[^>]+>"; // remove HTML tags
	String regEx_space = "\\s+"; // collapse whitespace (spaces, tabs, line breaks)
	Pattern p_script = Pattern.compile(regEx_script,
		Pattern.CASE_INSENSITIVE);
	Matcher m_script = p_script.matcher(htmlStr);
	htmlStr = m_script.replaceAll("");
	Pattern p_style = Pattern
		.compile(regEx_style, Pattern.CASE_INSENSITIVE);
	Matcher m_style = p_style.matcher(htmlStr);
	htmlStr = m_style.replaceAll("");
	Pattern p_html = Pattern.compile(regEx_html, Pattern.CASE_INSENSITIVE);
	Matcher m_html = p_html.matcher(htmlStr);
	htmlStr = m_html.replaceAll("");
	Pattern p_space = Pattern
		.compile(regEx_space, Pattern.CASE_INSENSITIVE);
	Matcher m_space = p_space.matcher(htmlStr);
	htmlStr = m_space.replaceAll(" "); // each whitespace run becomes a single space
	return htmlStr;
}
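One thing to watch: the pattern `<[^>]+>` only matches tags that have a closing `>`, so a tag that has already been cut in half (for example `<a hre` at the end of stored data) survives it. A possible way to tie this back to the excerpt problem is to strip the tags first, drop any dangling fragment, and only then cut to length. The sketch below assumes that literal `<` characters in the text are written as `&lt;`; `SafePreview` and `preview` are hypothetical names for this example.

```java
import java.util.regex.Pattern;

public class SafePreview {
    // A complete tag: '<', at least one non-'>' character, then '>'.
    private static final Pattern COMPLETE_TAG = Pattern.compile("<[^>]+>");
    // A dangling fragment: '<' followed by no '>' up to the end of the text.
    private static final Pattern DANGLING_TAG = Pattern.compile("<[^>]*$");

    // Strip tags first, then cut to maxLen, so the cut can never land inside a tag.
    static String preview(String html, int maxLen) {
        String text = COMPLETE_TAG.matcher(html).replaceAll("");
        text = DANGLING_TAG.matcher(text).replaceAll("");
        text = text.trim();
        return text.length() <= maxLen ? text : text.substring(0, maxLen);
    }

    public static void main(String[] args) {
        System.out.println(preview("<p>Hello <b>world</b></p>", 8)); // Hello wo
        System.out.println(preview("Hello <a hre", 20));             // Hello
    }
}
```

Because everything after an unmatched `<` is discarded, plain text that contains a raw `<` (such as `1 < 2`) would lose its tail, which is why the `&lt;` assumption matters here.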

 

Extracting only the Chinese content on the JS side:

aimStr = aimStr.replace(/[^\u4e00-\u9fa5]/g, ""); // replace() returns a new string, so assign the result

 

Removing HTML tags on the JS side:

aimStr = aimStr.replace(/<[^>]+>/g, ""); // replace() returns a new string, so assign the result

 
