utf-8 encoding in xml export

Posted: 2006-01-16 10:59:33
by .rm
there seems to be an issue with utf-8 encoding in the xml export (at least my parser says so). for me, it occurred in the translated title as set by the imdb import for the movie "Prince du Pacifique, Le".

my guess is that the character in question was this one: "í" (if it comes through correctly here). that's a spanish character, an "i" with an acute accent.

regards

.rm

Posted: 2006-01-16 12:40:07
by antp
As far as I remember, the character set used is not UTF-8 but Windows' local character set. So I guess that in your case it will be windows-1252.
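
If that is right, a strict UTF-8 parser would reject the export as soon as it meets an accented letter, and re-encoding the file before parsing should work around it. A minimal Python sketch, assuming the export really is windows-1252; the element name and the sample title are made up for illustration and are not taken from the actual export format:

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes can be decoded as strict UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def cp1252_to_utf8(raw: bytes) -> bytes:
    """Decode windows-1252 bytes and re-encode them as UTF-8."""
    return raw.decode("cp1252").encode("utf-8")


# Hypothetical fragment: "í" is the single byte 0xED in windows-1252.
sample = "<Title>Príncipe</Title>".encode("cp1252")

# 0xED followed by an ASCII letter is not a valid UTF-8 sequence,
# which is the kind of error a UTF-8 parser would report.
rejected_as_utf8 = not is_valid_utf8(sample)

# After conversion the same text is valid UTF-8.
converted = cp1252_to_utf8(sample)
```

If the file carries an XML declaration naming an encoding, that declaration would need to be changed to match after converting.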

Posted: 2006-01-16 13:02:19
by .rm
true! i looked into the wrong document. ok, so for me here it's 8859-1

but: why is it not utf-8? i mean, the catalog itself supports unicode perfectly, and utf-8 support should probably be readily available in your programming language, too ... or is it not?

the original title field is fairly likely to contain characters from various different character sets ...

:)

.rm

Posted: 2006-01-16 13:57:56
by antp
The catalog (the program and the binary .amc file) does not support Unicode at all; it works in ANSI.
It is not strictly iso-8859-1, it is windows-1252. So some special symbols like €, œ, “ ” etc. are not the same: windows-1252 places them in the 0x80-0x9F range, where iso-8859-1 has only control codes.
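
The difference between the two character sets can be checked with a short Python sketch. It relies only on Python's built-in "cp1252" and "latin-1" codecs; the helper name is made up for illustration. Python's cp1252 codec leaves a few bytes undefined, and the sketch counts those as differing too:

```python
def differing_bytes() -> list:
    """Return the byte values that decode differently in the two encodings."""
    diff = []
    for value in range(256):
        raw = bytes([value])
        latin1 = raw.decode("latin-1")
        # Undefined cp1252 bytes become U+FFFD here instead of raising.
        cp1252 = raw.decode("cp1252", errors="replace")
        if latin1 != cp1252:
            diff.append(value)
    return diff


diff = differing_bytes()

# The symbols mentioned above all sit in the differing range.
euro = bytes([0x80]).decode("cp1252")        # U+20AC, euro sign
oe = bytes([0x9C]).decode("cp1252")          # U+0153, oe ligature
open_quote = bytes([0x93]).decode("cp1252")  # U+201C, left double quote

# Accented letters such as "í" (0xED) are identical in both encodings.
same_i = bytes([0xED]).decode("cp1252") == bytes([0xED]).decode("latin-1")
```

Only the 32 bytes from 0x80 to 0x9F differ, so text made of plain accented Western European letters reads the same under either label, while a title containing € or curly quotes does not.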