Java SAX XML Parser

  SAX parser is working differently with a DOM parser, it neither load any XML document into memory nor create any object representation of the XML document. Instead, the SAX parser use callback function (org.xml.sax.helpers.DefaultHandler) to informs clients of the XML document structure.

  SAX Parser is faster and uses less memory than DOM parser.

1.0 How to read XML file in Java – (SAX Parser)

file.xml

<?xml version="1.0"?>
<company>
    <staff>
        <firstname>yong</firstname>
        <lastname>mook kim</lastname>
        <nickname>mkyong</nickname>
        <salary>100000</salary>
    </staff>
    <staff>
        <firstname>low</firstname>
        <lastname>yin fong</lastname>
        <nickname>fong fong</nickname>
        <salary>200000</salary>
    </staff>
</company>

ReadXMLFile.java

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ReadXMLFile {

   public static void main(String argv[]) {
    try {

    SAXParserFactory factory = SAXParserFactory.newInstance();
    SAXParser saxParser = factory.newSAXParser();

    DefaultHandler handler = new DefaultHandler() {

    boolean bfname = false;
    boolean blname = false;
    boolean bnname = false;
    boolean bsalary = false;

    public void startElement(String uri, String localName,String qName,
                Attributes attributes) throws SAXException {

        System.out.println("Start Element :" + qName);

        if (qName.equalsIgnoreCase("FIRSTNAME")) {
            bfname = true;
        }

        if (qName.equalsIgnoreCase("LASTNAME")) {
            blname = true;
        }

        if (qName.equalsIgnoreCase("NICKNAME")) {
            bnname = true;
        }

        if (qName.equalsIgnoreCase("SALARY")) {
            bsalary = true;
        }

    }

    public void endElement(String uri, String localName,
        String qName) throws SAXException {
        System.out.println("End Element :" + qName);

    }

    public void characters(char ch[], int start, int length) throws SAXException {

        if (bfname) {
            System.out.println("First Name : " + new String(ch, start, length));
            bfname = false;
        }

        if (blname) {
            System.out.println("Last Name : " + new String(ch, start, length));
            blname = false;
        }

        if (bnname) {
            System.out.println("Nick Name : " + new String(ch, start, length));
            bnname = false;
        }

        if (bsalary) {
            System.out.println("Salary : " + new String(ch, start, length));
            bsalary = false;
        }
    }
     };

/**  指定编码格式的写法,否则会出现如下异常
 if you parse a XML file which contains some special UTF-8 characters, it will prompts “Invalid byte 1 of 1-byte UTF-8 sequence” exception.

    File file = new File("c:\\file.xml");
    InputStream inputStream= new FileInputStream(file);
    Reader reader = new InputStreamReader(inputStream,"UTF-8");

   InputSource is = new InputSource(reader);
   is.setEncoding("UTF-8");
  saxParser.parse(is, handler);
**/
       saxParser.parse("c:\\file.xml", handler);
     } catch (Exception e) {
       e.printStackTrace();
     }
   }
}

Result:

Start Element :company
Start Element :staff
Start Element :firstname
First Name : yong
End Element :firstname
Start Element :lastname
Last Name : mook kim
End Element :lastname
Start Element :nickname
Nick Name : mkyong
End Element :nickname
Start Element :salary
Salary : 100000
End Element :salary
End Element :staff
Start Element :staff
Start Element :firstname
First Name : low
End Element :firstname
Start Element :lastname
Last Name : yin fong
End Element :lastname
Start Element :nickname
Nick Name : fong fong
End Element :nickname
Start Element :salary
Salary : 200000
End Element :salary
End Element :staff
End Element :company

2.0 SAX Error – Content is not allowed in prolog

Problem

Working XML via SAX parser, but when it parse some XML file, it prompts following error message :

org.xml.sax.SAXParseException: Content is not allowed in prolog.
    at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
    at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    //...

Solution

This error message is always caused by the invalid XML content in the beginning element. For example, extra small dot “.” in the beginning of XML element.

Any characters before the “

.<?xml version="1.0"?>
<company>
    <staff>
        <firstname>yong</firstname>
        <lastname>mook kim</lastname>
        <nickname>mkyong</nickname>
        <salary>100000</salary>
    </staff>
    <staff>
        <firstname>low</firstname>
        <lastname>yin fong</lastname>
        <nickname>fong fong</nickname>
        <salary>200000</salary>
    </staff>
</company>

To fix it, just delete all those weird characters before the “

猜你喜欢

转载自blog.csdn.net/tb9125256/article/details/81230260