Maybe I'm a bit utopian, but billboards aside, shouldn't there be a better way to support Unicode?
A summary of what we know:
Text import process:
- At game start, the executable first lists all the text associated with the tags (CvXMLLoadUtility::read()).
- It imports all the text for the selected language, or English by default, into a cache (a file, or memory?).
- By hijacking the import, we can write the files in UTF-8 and convert the Unicode strings to the local codepage while reading, so the game stores local-codepage strings.
- All the standard functions render these local-codepage strings without needing any other change in the code.
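To make the hijacked import step concrete, here is a minimal sketch in Python 3 (not the actual DLL code, which is C++). The function name utf8_to_local and the choice of cp1251 as the local codepage are only assumptions for illustration.

```python
def utf8_to_local(raw: bytes, codepage: str = "cp1251") -> bytes:
    """Decode UTF-8 bytes read from the XML and re-encode them
    in a single-byte local codepage (cp1251 is just an example)."""
    text = raw.decode("utf-8")
    # Characters that do not exist in the codepage become "?"
    # instead of raising an error.
    return text.encode(codepage, errors="replace")


utf8_bytes = "Йорк".encode("utf-8")      # what the UTF-8 file contains
local_bytes = utf8_to_local(utf8_bytes)  # what the game would store
```

With this, the four Cyrillic letters go from 8 bytes of UTF-8 to 4 bytes of cp1251, and anything outside the codepage degrades to "?".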
Other info:
- The game renders the CityBillboard (city name and current production) graphically, so no hope or magic there.
- Standard text is NOT rendered by the graphical engine, since we can write any character from the local codepage.
- Base files are written in ISO-8859-1, which is pretty much limited to US and Western European languages.
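A quick way to see the ISO-8859-1 limit, as a Python 3 sketch (fits_latin1 is a made-up helper name, not something from the game):

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character of text exists in ISO-8859-1."""
    try:
        text.encode("iso8859-1")
        return True
    except UnicodeEncodeError:
        return False


western_ok = fits_latin1("café")   # accented Western European text
cyrillic_ok = fits_latin1("Йорк")  # Cyrillic text
```

Western European accents pass, while Cyrillic does not, which is why the base files cannot hold such text directly.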
Now, let's push this a little further.
WHAT IF... *drum roll* we rewrote the way it works, within the known limits of the game?
- Store EVERY character as a numeric code (like the old ISO-8859-1 method), for example &#1049;, in the XML (we just need to modify the XML parser a bit to do this easily).
- The game starts and imports these number sequences (we remove the conversion for now).
- Then we create new functions to convert &#1049; into a Unicode char, rewrite getText as getUnicodeText, and replace all the calls in the DLL code (and in Python, as 2.4 supports Unicode). If the game outputs ?, that does not necessarily mean it doesn't know which character it is; maybe it just doesn't know which one to render. "& # 1 0 4 9" is just a sequence of ASCII chars.
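The conversion functions from the last step could look roughly like this. It is a Python 3 sketch of the idea only; encode_ncr and decode_ncr are hypothetical names, and the real thing would have to live in the C++ DLL behind something like the proposed getUnicodeText.

```python
import re

# Matches decimal numeric character references such as &#1049;
_NCR = re.compile(r"&#(\d+);")


def encode_ncr(text: str) -> str:
    """Replace every non-ASCII character with its decimal code, e.g. &#1049;."""
    return "".join(c if ord(c) < 128 else "&#%d;" % ord(c) for c in text)


def decode_ncr(text: str) -> str:
    """Turn each &#NNNN; sequence back into the Unicode character it names."""
    return _NCR.sub(lambda m: chr(int(m.group(1))), text)


stored = encode_ncr("Йорк")    # pure ASCII, safe to import as-is
restored = decode_ncr(stored)  # back to Unicode just before display
```

The stored form is plain ASCII, so it survives the import untouched; the open question is still whether the game can render the decoded character afterwards.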
It shouldn't be impossible (if it works at all, lol).
Now look at this (Chinese) or this (Japanese):
How can Chinese, Japanese, and Korean patches exist using 2 bytes? We are missing something here.
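One possible lead, which is only a guess and not confirmed by anything above: the legacy CJK codepages (GBK, Shift-JIS, cp949) are themselves "local codepages", they just use two bytes for most characters while ASCII stays one byte. A Python 3 sketch to check the byte counts (byte_len is a made-up helper):

```python
def byte_len(ch: str, codepage: str) -> int:
    """Number of bytes one character takes in the given legacy codepage."""
    return len(ch.encode(codepage))


sizes = {
    "gbk": byte_len("中", "gbk"),              # Simplified Chinese
    "shift_jis": byte_len("日", "shift_jis"),  # Japanese
    "cp949": byte_len("한", "cp949"),          # Korean
}
ascii_size = byte_len("A", "gbk")  # ASCII is still a single byte
```

If the patches rely on these codepages, the same "convert to local codepage" trick would apply, but whether the game handles the lead/trail bytes correctly is exactly the part we don't know.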