As a premise, I have an HTML text with some <ol> elements. These have a start attribute, but the framework I'm using cannot interpret it during PDF conversion. So, the trick I am trying to apply is to add a number of invisible <li> elements at the beginning of each list.
As an example, suppose this input text:
<ol start="3">
<li>Element 1</li>
<li>Element 2</li>
<li>Element 3</li>
</ol>
I want to produce this result:
<ol>
<li style="visibility:hidden"></li>
<li style="visibility:hidden"></li>
<li>Element 1</li>
<li>Element 2</li>
<li>Element 3</li>
</ol>
So, the idea is to add n-1 invisible elements to the ordered list, where n is the value of start. But I'm not able to do that from Java in a generalized way.
For the exact case in the example, I could do this (using replace, so, to be honest, without regex):
htmlString = htmlString.replace("<ol start=\"3\">",
"<ol><li style=\"visibility:hidden\"></li><li style=\"visibility:hidden\"></li>");
But, obviously, this only applies to the case start="3". I know that I can use groups to extract the "3", but how can I use it as a "variable" to repeat the string <li style="visibility:hidden"></li> n-1 times? Thanks for any insight.
Since Java 9, there's a Matcher.replaceAll method that takes a callback function as a parameter:
String text = "<ol start=\"3\">\n\t<li>Element 1</li>\n\t<li>Element 2</li>\n\t<li>Element 3</li>\n</ol>";
String result = Pattern
.compile("<ol start=\"(\\d+)\">")
.matcher(text)
.replaceAll(m -> "<ol>" + repeat("\n\t<li style=\"visibility:hidden\" />",
Integer.parseInt(m.group(1))-1));
To repeat the string, you can use the trick below, or a loop (since Java 11, String.repeat does this directly).
public static String repeat(String s, int n) {
return new String(new char[n]).replace("\0", s);
}
Afterwards, result is:
<ol>
<li style="visibility:hidden" />
<li style="visibility:hidden" />
<li>Element 1</li>
<li>Element 2</li>
<li>Element 3</li>
</ol>
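The repeat helper can also be written as a plain loop, as mentioned above. This is only a minimal sketch; the class name RepeatDemo is made up for illustration. One difference worth noting: for a negative count (which happens with start="0", since n-1 is then -1) the char-array trick throws a NegativeArraySizeException, while this loop simply returns an empty string.

```java
class RepeatDemo {
    // Loop-based alternative to the char-array trick.
    // Returns "" when n is zero or negative instead of throwing.
    static String repeat(String s, int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(repeat("ab", 3)); // prints "ababab"
    }
}
```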
If you are stuck with an older version of Java, you can still match and replace in two steps.
Matcher m = Pattern.compile("<ol start=\"(\\d+)\">").matcher(text);
while (m.find()) {
    int n = Integer.parseInt(m.group(1));
    // m.group() is the whole matched tag; replace() treats it as a literal string
    text = text.replace(m.group(),
        "<ol>" + repeat("\n\t<li style=\"visibility:hidden\" />", n-1));
}
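For older Java versions, a single-pass variant is also possible with Matcher.appendReplacement and appendTail, which avoids rescanning the whole string for every match. The following is a sketch under the same assumptions as the code above (the start attribute is the only attribute on the tag); the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OlStartJava8 {
    private static final Pattern OL_START = Pattern.compile("<ol start=\"(\\d+)\">");
    private static final String HIDDEN_LI = "\n\t<li style=\"visibility:hidden\" />";

    // Replaces every <ol start="n"> with <ol> followed by n-1 hidden items.
    static String expand(String html) {
        Matcher m = OL_START.matcher(html);
        StringBuffer out = new StringBuffer(); // appendReplacement needs StringBuffer before Java 9
        while (m.find()) {
            int n = Integer.parseInt(m.group(1));
            StringBuilder replacement = new StringBuilder("<ol>");
            for (int i = 1; i < n; i++) {
                replacement.append(HIDDEN_LI);
            }
            // quoteReplacement makes sure the text is inserted literally
            m.appendReplacement(out, Matcher.quoteReplacement(replacement.toString()));
        }
        m.appendTail(out); // copy whatever follows the last match
        return out.toString();
    }
}
```

Calling OlStartJava8.expand(text) on the example input should give the same result as the Java 9 version.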
Update by Andrea ジーティーオー:
I modified the (great) solution above to also handle <ol> elements that have multiple attributes, so that the tag does not end right after start (for example, an <ol> with letters, such as <ol start="4" style="list-style-type: upper-alpha;">). This uses replaceAll so that the whole tag is handled as a regex.
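// Match an <ol> tag whose first attribute is start="<number>", capturing the number and any remaining attributes
// StringUtils is assumed to be org.apache.commons.lang3.StringUtils (Apache Commons Lang)
Matcher m = Pattern.compile("<ol start=\"(\\d+)\"(.*?)>").matcher(htmlString);
while (m.find()) {
    int n = Integer.parseInt(m.group(1));
    // $2 already begins with the space before the remaining attributes, so none is added after "<ol"
    htmlString = htmlString.replaceAll("(<ol start=\"" + n + "\")(.*?)(>)",
        "<ol$2>" + StringUtils.repeat("\n\t<li style=\"visibility:hidden\" />", n - 1));
}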
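The two ideas (the callback form of Matcher.replaceAll and support for extra attributes) can probably be combined into one pass. Below is a rough sketch, assuming Java 11 or later for String.repeat; the class name, method name and the exact pattern are my own and not part of the answers above. The pattern also accepts a start attribute that is not the first attribute on the tag.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OlStartExpander {
    // group 1: attributes before start, group 2: the number, group 3: attributes after start
    private static final Pattern OL_START =
            Pattern.compile("<ol\\b([^>]*?)\\s+start=\"(\\d+)\"([^>]*)>");
    private static final String HIDDEN_LI = "\n\t<li style=\"visibility:hidden\" />";

    static String expand(String html) {
        return OL_START.matcher(html).replaceAll(m -> {
            int n = Integer.parseInt(m.group(2));
            // Rebuild the tag without start, keeping the other attributes as they were
            String tag = "<ol" + m.group(1) + m.group(3) + ">";
            // Math.max guards against start="0", where n - 1 would be negative
            String result = tag + HIDDEN_LI.repeat(Math.max(0, n - 1));
            // The callback result is interpreted as a replacement string, so quote it
            return Matcher.quoteReplacement(result);
        });
    }
}
```

Keep in mind that this is still regex matching on HTML, so it only covers tags written in this exact shape (lowercase ol, double-quoted start value).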