Unable to parse Latin and Unicode characters using XMLReader.parse() [closed]

I'm having trouble parsing XML data that contains Unicode and Latin characters. The parser fails with an error saying it is unable to parse the input XML. The input and the code are below; any help fixing this would be appreciated.

This is the input that we pass in our application:

String url = "<?xml version='1.0' encoding='UTF-8'?>"
    + "<candidate-registrations customer-id='197'>"
    + "<registration-details method=' '>"
    + "<candidate-demographics>"
    + "<candidate-details>"
    + "<candidate-id-type value='SSN'/>"
    + "<candidate-id value='567876456'/>"
    + "<first-name value='ยง'/>"
    + "<last-name value='mohan'/>"
    + "<date-of-birth value='03/03/1980'/>"
    + "<email-address value='[email protected]'/>"
    + "<school-code>0129</school-code>"
    + "</candidate-details>"
    + "</candidate-demographics>"
    + "</registration-details>"
    + "</candidate-registrations>";

This is the code that we are using

private XMLReader xr;

public SaxMapper( )
{
    try
    {
        // Create the XML reader...
        xr = XMLReaderFactory.createXMLReader();            
    }
    catch(Exception e)
    {
        LoggerManager.Log(LogLevelConstants.INFO, className, "SaxMapper", e.getMessage(),e);
    }

}

public Object fromXML( String url )
{
    try
    {
        return fromXML( new InputSource( url ));
    }
    catch ( Exception e )
    {
        LoggerManager.Log(LogLevelConstants.INFO, className, "fromXML", e.getMessage(),e);
        return null;
    }
}

private synchronized Object fromXML( InputSource in ) throws Exception
{
    // Set the ContentHandler...
    xr.setContentHandler( this );

    // Parse the file...
    xr.parse( in );
    return getMappedObject();
}

This is the error I'm getting:

Error : <?xml version="1.0" encoding="UTF-8"?><import-results><result>BAD</result><reason-code>100</reason-code><reason-desc>Unable to parse the input XML</reason-desc><error>Unable to parse the input XML</error></import-results>

Answer

If you read the documentation on InputSource, you will notice that

new InputSource(String)

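does something different than you expect: it treats the string as a system identifier (a URI naming the document to load), not as the XML content itself.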

To parse a string with SAX, see: https://docs.oracle.com/javase/tutorial/jaxp/sax/parsing.html

Please note that the tutorial focuses on parsing a file rather than a string. Once you understand how it works, adapting it to a string is straightforward.
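
As a rough illustration of that adaptation, here is a minimal sketch that wraps the XML text in a StringReader before handing it to InputSource, so the parser reads the characters directly instead of trying to resolve the string as a URI. The class name SaxStringDemo and the helper firstName are made up for this example and are not part of the question's SaxMapper; the reader is obtained through SAXParserFactory rather than XMLReaderFactory, which is an assumption about what is acceptable in your setup.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SaxStringDemo {

    // Hypothetical helper: returns the "value" attribute of the first
    // <first-name> element found in the given XML text, or null if none.
    public static String firstName(String xml) throws Exception {
        final String[] found = new String[1];

        XMLReader xr = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        xr.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (found[0] == null && "first-name".equals(qName)) {
                    found[0] = atts.getValue("value");
                }
            }
        });

        // The key change: pass the XML content through a Reader.
        // new InputSource(xml) would interpret xml as a system identifier.
        xr.parse(new InputSource(new StringReader(xml)));
        return found[0];
    }

    public static void main(String[] args) throws Exception {
        // \u0E22\u0E07 are the two Thai characters used for first-name
        // in the question, written as escapes to keep this file ASCII-only.
        String xml = "<?xml version='1.0' encoding='UTF-8'?>"
                + "<candidate-details>"
                + "<first-name value='\u0E22\u0E07'/>"
                + "<last-name value='mohan'/>"
                + "</candidate-details>";
        System.out.println(firstName(xml));
    }
}
```

Because the input is already a Java String (a sequence of characters), the parser does not need to decode bytes, so non-ASCII characters in the string should pass through unchanged. In the question's code, the equivalent change would be in the public fromXML method, where the InputSource is constructed.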
