
XML not well formed

Posted: 2010-10-16 08:43:37
by elman
Hi,

I'm currently working on an Android version of Pocket AMC Reader, but I ran into problems when importing the XML catalog. I can't use a DOM parser as I did in the Windows Mobile version, since it throws out-of-memory exceptions with an 18 MB catalog (the max heap is set to 25 MB on Android 2.2). I then tried a SAX parser, but that one throws a 'document not well-formed' exception. Looking into the XML, I found that characters like ’, “ and ” are causing the problem.

Is there any way to fix it? Can AMC produce well formed XML? Would using Windows-1252 encoding be any help?

Elman

Posted: 2010-10-17 10:24:26
by antp
Hi,
Catalogs are stored in Windows' current encoding, so specifying Windows-1252 might help if your Windows uses the Western European character set (these characters exist in that charset but not in ISO-8859-1, for example).
Otherwise you can write a small script to replace such "invalid" characters in the catalog (though it is annoying to have to run it each time you want to export).
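
For illustration, the replacement idea could look roughly like the Java sketch below. It assumes the catalog text has already been decoded into a String; the class and method names (QuoteCleaner, toAsciiPunctuation) are made up for this example, and the list of characters is only the handful mentioned in this thread, not a complete set.

```java
class QuoteCleaner {
    // Hypothetical helper: swaps typographic quotes for plain ASCII ones.
    // Only covers the characters mentioned in this thread.
    static String toAsciiPunctuation(String text) {
        return text
            .replace('\u2018', '\'')  // left single quote  -> apostrophe
            .replace('\u2019', '\'')  // right single quote -> apostrophe
            .replace('\u201C', '"')   // left double quote  -> straight quote
            .replace('\u201D', '"');  // right double quote -> straight quote
    }
}
```

Note that this loses the original punctuation, so decoding with the right charset is probably the better fix when it is available.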

Posted: 2010-10-21 09:16:31
by elman
Thanks for the help. It seems to work now. What I did was open the catalog with a user-selectable Windows encoding (1250 in my case) and then convert it to UTF-8 for the database import. This way any user will be able to import their characters correctly.
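
A minimal sketch of that approach in Java, for anyone hitting the same error. This is not the actual Pocket AMC Reader code: the class name CatalogReader, the method readAttribute and the attribute name passed to it are assumptions for the example. The point is that the bytes are decoded by a Reader using the charset the user picked, and the SAX parser is handed that character stream, so it no longer has to guess the encoding from the raw bytes.

```java
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class CatalogReader {
    // Hypothetical helper: streams the catalog through SAX and collects
    // the values of one attribute (attrName) from every element that has it.
    static List<String> readAttribute(InputStream in, String charsetName,
                                      final String attrName) throws Exception {
        // Decode the raw bytes with the user-selected charset,
        // e.g. "windows-1250" or "windows-1252".
        Reader reader = new InputStreamReader(in, Charset.forName(charsetName));

        final List<String> values = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                String value = atts.getValue(attrName);
                if (value != null) {
                    values.add(value);
                }
            }
        };

        // An InputSource built from a Reader is read as characters, so any
        // encoding named in the XML declaration is not used for decoding.
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(reader), handler);
        return values;
    }
}
```

The strings that come out are ordinary Unicode Java strings, so they can be written to the database as UTF-8 from there. Since SAX streams the document, this should also avoid the out-of-memory problem seen with DOM, as long as the handler does not keep the whole catalog in memory.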