I read in an XML file (provided by another system, so I cannot control it) and convert it to JSON using Jackson. I am seeing some undesirable behavior: any "empty" nodes in the source XML file are converted to JSON with "\n <many spaces if source is indented>" as their content. For example:
Generated output:
{"a":"Dummy Content","b":"\n "}
Desired output:
{"a":"Dummy Content","b":""}
What is the most acceptable way to correct this in a generic enough way that it will work on any XML file with any empty XML nodes?
When loading the file, I tried iterating over each line to clean it up like this:
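// Accumulate in a StringBuilder; String += in a loop re-copies the whole text on every line.
StringBuilder content = new StringBuilder();
try (BufferedReader br = new BufferedReader(new FileReader("MyFile.xml"))) {
    String line;
    while ((line = br.readLine()) != null) {
        // readLine() already drops the line terminator, so only trim() is needed.
        content.append(line.trim());
    }
}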
It appears to work; however, I was wondering whether there is a better solution. The source XML files could be quite large (hundreds of thousands of lines).
Sample code that illustrates the issue:
private static String testXML
        = "<Root>\n"
        + " <a>Dummy Content</a>\n"
        + " <b>\n"
        + " </b>\n"
        + "</Root>";

public static void main(String[] args) {
    XmlMapper xmlMapper = new XmlMapper();
    JsonNode jsonNode = null;
    try {
        jsonNode = xmlMapper.readTree(testXML);
    } catch (IOException ex) {
        System.out.println(ex);
    }
    System.out.println(jsonNode);
}
Generated Output:
{"a":"Dummy Content","b":"\n "}
Desired output:
{"a":"Dummy Content","b":""}
If you deserialise XML to JsonNode, you can override JsonNodeFactory, which creates the nodes that hold the data. For String values we need to override the textNode method and, if the value is blank, trim it to an empty String.
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.node.JsonNodeFactory;
import com.fasterxml.jackson.databind.node.TextNode;
import com.fasterxml.jackson.dataformat.xml.XmlMapper;
import org.apache.commons.lang3.StringUtils;

public class XmlApp {
    public static void main(String[] args) throws Exception {
        String testXML = "<Root>\n <a>Dummy Content</a>\n <b>\n </b>\n</Root>";
        XmlMapper xmlMapper = new XmlMapper();
        // Use the custom factory so every text node passes through textNode below
        xmlMapper.setNodeFactory(new TrimStringTextJsonNodeFactory());
        JsonNode jsonNode = xmlMapper.readTree(testXML);
        System.out.println(jsonNode);
    }
}

class TrimStringTextJsonNodeFactory extends JsonNodeFactory {
    @Override
    public TextNode textNode(String text) {
        // Only whitespace-only (or null) values are replaced; other text is left untouched
        if (StringUtils.isBlank(text)) {
            text = StringUtils.trimToEmpty(text);
        }
        return super.textNode(text);
    }
}
The above code prints:
{"a":"Dummy Content","b":""}
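If pulling in Apache Commons Lang only for this check is undesirable, the same blank test can probably be written with the JDK alone. The following is a minimal sketch rather than part of the code above: the class name BlankText and the helper blankToEmpty are hypothetical, and it assumes "blank" means null, empty, or made up entirely of characters for which Character.isWhitespace returns true.

```java
public class BlankText {

    // Hypothetical helper: returns "" when text is null, empty, or whitespace-only;
    // otherwise returns text unchanged (non-blank values are not trimmed).
    public static String blankToEmpty(String text) {
        if (text == null || text.chars().allMatch(Character::isWhitespace)) {
            return "";
        }
        return text;
    }

    public static void main(String[] args) {
        // Whitespace-only content, as seen for an indented empty element
        System.out.println("[" + blankToEmpty("\n    ") + "]");
        // Non-blank content passes through untouched
        System.out.println("[" + blankToEmpty("Dummy Content") + "]");
    }
}
```

Inside textNode, the StringUtils calls could then be replaced with something like `return super.textNode(blankToEmpty(text));`.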