
How to add a translation to C2C

AIAndy

Starting with the current SVN version (so it will be in V30), adding a new translation has become considerably easier.
You can now translate as few texts as you want, and the game will fall back to English for all texts that you have not provided a translation for.
For the languages that are selectable in the Civ4 options dialog, use these tags:
English, French, German, Italian, Spanish, Finnish, Hungarian, Polish, Russian, Chinese, Japanese.
You can also add a new language. Choose a fitting tag name (for instance if you want to add Portuguese translations, best use Portuguese as the tag name) and then add your translations using that tag name. To select your new language, open "Caveman2Cosmos\Assets\XML\GlobalDefines.xml". Near the top you will find the new entry LANGUAGE. By setting it to the tag name of your language, you will override the normal language selection and use your new translation instead.

Best start with your translation in the folder "Caveman2Cosmos\Assets\XML\Text". It contains a lot of text files with entries similar to this:
Code:
	<TEXT>
		<Tag>TXT_KEY_CORPORATION_BURGERWORLD</Tag>
		<English>Burgerworld</English>
		<French>MondoBurger</French>
		<German>Burgerworld</German>
		<Italian>Burgerworld</Italian>
		<Spanish>Burgerworld</Spanish>
	</TEXT>

If your language tag is already in there, just replace the text with your translation. If not, add it after the others. So if you add Portuguese, it might look like this afterwards:
Code:
	<TEXT>
		<Tag>TXT_KEY_CORPORATION_BURGERWORLD</Tag>
		<English>Burgerworld</English>
		<French>MondoBurger</French>
		<German>Burgerworld</German>
		<Italian>Burgerworld</Italian>
		<Spanish>Burgerworld</Spanish>
		<Portuguese>Your translation goes here</Portuguese>
	</TEXT>

Non-ASCII characters need to be represented by their number in the character set, so &#246; results in ö. You can look that up for instance here:
http://ascii.cl/htmlcodes.htm
EDIT: From a quick test, it seems like ä, ö and similar also work, so I am not sure why the current translations use the number codes instead.
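
If you would rather stay with the number codes, the conversion can be scripted instead of looked up by hand. This is only a minimal Python sketch, assuming the translation is already available as a Python string; the function name `to_numeric_refs` is made up for illustration.

```python
def to_numeric_refs(text: str) -> str:
    """Replace every non-ASCII character with its &#NNN; number code."""
    # "xmlcharrefreplace" is a standard Python error handler that writes
    # a decimal character reference for each character ASCII cannot hold.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_numeric_refs("Köln"))  # K&#246;ln
```

Plain ASCII text passes through unchanged, so it should be safe to run on text that is already converted.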

There are some special codes that refer to arguments that are passed to the text generation. You will usually use them in a similar way as the text that is already in there so I will not go into detail here (best check out the different translation files if you want to learn more). In general %s1 means that a text argument is added while %d1 means a number argument. What kind of info is in that text or number depends on the specific text.
[COLOR_HIGHLIGHT_TEXT]Some Text[COLOR_REVERT] and similar changes the color of the text in between there.
[LINK=literal]A unit or building name or similar[\LINK] will turn that text into a link to the pedia.
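
Since a translation normally has to carry the same arguments as the English text, a small check can catch dropped ones. The following Python sketch only knows the two forms mentioned above (%s1, %d1 and so on); the helper name `missing_placeholders` is hypothetical and other argument codes that may exist are not covered.

```python
import re

# Only the two forms mentioned above (%s1, %d1, ...) are recognised here.
PLACEHOLDER = re.compile(r"%[sd]\d+")


def missing_placeholders(english: str, translation: str) -> list[str]:
    """Return placeholders used in the English text but absent from the translation."""
    wanted = set(PLACEHOLDER.findall(english))
    found = set(PLACEHOLDER.findall(translation))
    return sorted(wanted - found)


print(missing_placeholders("%s1 has %d2 gold", "%s1 hat Gold"))  # ['%d2']
```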

In addition to the text files in that folder there are also text files in the different folders below "Caveman2Cosmos\Assets\Modules". They all have CIV4GameText in their name.
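
To get a list of all files that may need translating, something like the following Python sketch could be used. It assumes the folder layout described above and that the module text files end in .xml; the function name `find_text_files` is made up, and matching is case-sensitive on some systems.

```python
from pathlib import Path


def find_text_files(mod_root: str) -> list[Path]:
    """Collect the XML text files a translator would need to look at."""
    assets = Path(mod_root) / "Assets"
    # Main text folder: every XML file in it is treated as a text file.
    main = (assets / "XML" / "Text").glob("*.xml")
    # Module folders: only files with CIV4GameText in their name.
    modules = (assets / "Modules").rglob("*CIV4GameText*.xml")
    return sorted(set(main) | set(modules))
```

Calling `find_text_files("Caveman2Cosmos")` from the Mods folder would then return the paths sorted alphabetically.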
 
Thanks so much for taking care of this :worship:

There are some special codes that refer to arguments that are passed to the text generation.

These special codes are used for grammar handling. For German I found a neat explanation with syntax examples here:
http://www.civforum.de/archive/index.php/t-47015.html - but I am still not sure about the mechanics behind them, for example how the engine decides which declension to use (aside from explicitly defined ones) and whether there are fallback solutions. This matters because every language has its own grammar. It needs more detailed testing and is on my to-do list too (at least for German).

As for the charsets: all sources mention that only ISO-8859-1 (Western Latin) is usable (never trust what all sources mention, I know...). Polish, for example (as this was requested lately), would need ISO-8859-2, Russian yet another one, and so on. So in my inexperienced opinion it is not possible at the moment, as all the files are declared and encoded as Western Latin. Do you know more about this topic?

And would it be a good idea (if everything works well with this new approach) to remove all empty non-English language tags? This would reduce the size of the XML files considerably and would enable the proper fallback to English.
<German/>
<German>.</German>
<German></German>
These are the most common forms I have found in the files so far.

Edit: the link just shows examples for the tag definitions, not for the special codes.
 
I added some code to ignore the empty tags, meaning the first and third in your examples, but not the one that contains just a '.' so it is better if those are removed.
In general, tags that are empty or contain a copy of the English text can now be removed. That might make it easier to see what is missing a translation, but it is also quite some work. The size reduction will not matter much in the compressed file but will result in some MB less disk storage used (the in-game memory footprint won't change, though).
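
The cleanup itself could be scripted roughly like this. It is a Python sketch only, not the code used in the mod: `prune_entry` and `local_name` are made-up names, and it works on one parsed TEXT entry at a time. Tags that contain just a '.' are deliberately left alone.

```python
import xml.etree.ElementTree as ET


def local_name(element: ET.Element) -> str:
    """Tag name without a leading {namespace} prefix, if any."""
    return element.tag.rsplit("}", 1)[-1]


def prune_entry(entry: ET.Element) -> int:
    """Drop language tags that are empty or copy the English text.

    Returns the number of removed tags. A tag holding just "." is kept,
    because that is occasionally the intended text.
    """
    english = ""
    for child in entry:
        if local_name(child) == "English":
            english = (child.text or "").strip()
    removed = 0
    for child in list(entry):
        if local_name(child) in ("Tag", "English"):
            continue
        text = (child.text or "").strip()
        # Tags with nested elements are left alone.
        if len(child) == 0 and (text == "" or text == english):
            entry.remove(child)
            removed += 1
    return removed


entry = ET.fromstring(
    "<TEXT><Tag>TXT_KEY_X</Tag><English>Burgerworld</English>"
    "<French>MondoBurger</French><German>Burgerworld</German>"
    "<Italian/><Spanish></Spanish></TEXT>"
)
print(prune_entry(entry))  # 3
```

Note that ElementTree drops XML comments when parsing with its default settings, so it is best to try this on a copy of the files first.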
 
So it's better to have 3 than 2 then?? Even if there is nothing in the English tag?
Which C2C text has nothing in the English tag?
Number 2 will actually display a '.' and there are rare occasions that that is actually the text you wanted to display. Number 1 and 3 are equivalent and both will be ignored now and the English text used instead.
 
Great documentation and great work in general AIAndy!

Btw... what does happen if there IS no English tag? Obviously this is wrong but will it create problems?
 
The new code falls back to using the second tag (which is usually the one after Tag) and if that one is empty or there is only one tag, then it discards the text.
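
Put together, the lookup order described in this thread can be sketched as follows. This is a Python illustration of the described behaviour and not the actual game code; the function name `resolve_text` is made up, and treating whitespace-only text as empty is an assumption.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def resolve_text(entry: ET.Element, language: str) -> Optional[str]:
    """Pick the text for `language`, following the fallback order described above."""
    children = list(entry)

    def text_of(name: str) -> Optional[str]:
        for child in children:
            if child.tag == name and (child.text or "").strip():
                return child.text
        return None

    # 1. the requested language, 2. English
    for name in (language, "English"):
        text = text_of(name)
        if text is not None:
            return text
    # 3. no usable English: use the second tag (usually the one after <Tag>)
    if len(children) >= 2 and (children[1].text or "").strip():
        return children[1].text
    # 4. nothing usable: the text is discarded
    return None


entry = ET.fromstring(
    "<TEXT><Tag>TXT_KEY_X</Tag><English>Hello</English><German>Hallo</German></TEXT>"
)
print(resolve_text(entry, "German"))      # Hallo
print(resolve_text(entry, "Portuguese"))  # Hello
```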
 
Thanks a lot AIAndy, this is really great news!

I did some German translations in the past, e.g. for ROM and AND, and found that in many cases there was no problem using ä, ö, ü and ß. But sometimes they just didn't display right. I don't know why, and I am sorry that I can't remember any more details; it was a long time ago. Therefore I replaced all the ä, ö, ü, Ä, Ö, Ü and ß with their number codes and it worked fine. This is not really complicated to do with the find/replace function in Notepad++, which can also search folders.
 
To make real use of the new language system I have checked all the XML for any dots (e.g. <German>.</German>) and for any "Blah" and replaced them with empty language slots. This was done for French, German, Italian and Spanish. The basis was the current SVN 5277.

Maybe someone with SVN writing access can merge these files in.
 

I'll do it right now for my XML. DH, you don't need to merge my GameText in that ZIP.

Edit: AIAndy, is it OK to have <language></language> for gametext tags, or should it be <language/>?
 
<X></X> and <X/> are equivalent to the XML reader in all cases.
 
Huh, didn't know that. I thought it was always "better" to have at least the minimum when doing stuff like this, i.e. <X/> ;)
 
Here is a short overview of the tags used in the Pedia; they might come in handy. I don't know if there are even more:

Text-Structuring:
[SPACE] - forces a space
[ TAB ] - forces a tab (without the spaces of course)
[NEWLINE] - is a linebreak, the same as...
[PARAGRAPH:1] - paragraph break, where the number indicates how many lines of spacing to leave.

[H1]Text[\H1] - makes it a big headline

[H2]Text[\H2] - a small headline, same size and appearance as...
[BOLD]Text[\BOLD] - marking a text in bold (in the textflow). Btw, no italics here.
[ICON_BULLET] - Bullet sign



-) Icons to embed:
[ICON_HAPPY]
[ICON_UNHAPPY]
[ICON_UNHEALTHY]
[ICON_HEALTHY]

[ICON_ESPIONAGE]
[ICON_RELIGION]
[ICON_CULTURE]
[ICON_RESEARCH]
[ICON_FOOD]
[ICON_PRODUCTION]
[ICON_COMMERCE] -the single gold-coin-like symbol
[ICON_GOLD] - the stack-of-coins-symbol
[ICON_TRADE]

[ICON_STRENGTH]
[ICON_MOVES]

-) Colors to use:
[COLOR_HIGHLIGHT_TEXT]Text[COLOR_REVERT]
[COLOR_BUILDING_TEXT]Text[COLOR_REVERT]
[COLOR_UNIT_TEXT]Text[COLOR_REVERT]
[COLOR_WARNING_TEXT]Text[COLOR_REVERT]
[COLOR_TECH_TEXT]Text[COLOR_REVERT]
[COLOR_YIELD_COMMERCE]Text[COLOR_REVERT]
[COLOR_YIELD_FOOD]Text[COLOR_REVERT]

[COLOR_YELLOW]Text[COLOR_REVERT]
[COLOR_RED]Text[COLOR_REVERT]
[COLOR_BLUE]Text[COLOR_REVERT]
[COLOR_GREEN]Text[COLOR_REVERT]
[COLOR_BLACK]Text[COLOR_REVERT]
[COLOR_CYAN]Text[COLOR_REVERT]

-) Examples of Links to other entries:
[LINK=CONCEPT_WORLD_VIEW]Text of the Link[\LINK] - seems to point to (TXT_KEY_)CONCEPT...
[LINK=BUILDING_HEALERHUT]Text of the Link[\LINK]

 
Thank you, Pin.

In addition, there is [ICON_PROPERTY_CRIME] and similar for all properties.

EDIT: And here are all other icons from the code:

Code:
[ICON_BULLET]
[ICON_HAPPY]
[ICON_UNHAPPY]
[ICON_HEALTHY]
[ICON_UNHEALTHY]
[ICON_STRENGTH] 
[ICON_MOVES]
[ICON_RELIGION] 
[ICON_STAR]
[ICON_SILVER_STAR]
[ICON_TRADE]
[ICON_DEFENSE]
[ICON_GREATPEOPLE]
[ICON_BAD_GOLD] 
[ICON_BAD_FOOD] 
[ICON_EATENFOOD]
[ICON_GOLDENAGE]
[ICON_ANGRYPOP] 
[ICON_OPENBORDERS]
[ICON_DEFENSIVEPACT]
[ICON_MAP]
[ICON_OCCUPATION]
[ICON_POWER]

[ICON_GOLD]
[ICON_RESEARCH]
[ICON_CULTURE]
[ICON_ESPIONAGE]

[ICON_FOOD]
[ICON_PRODUCTION]
[ICON_COMMERCE]
 
Can we get a link to those two posts in the Modder's Documentation thread? THAT is a very nice compilation there! Very useful if it can be quickly referenced.
 
Very much AGREED!!!!;)
 
There are also a number of other tags used in diplomacy text:
[OUR_NAME]
[OUR_EMPIRE]
[OUR_CIV_SHORT]
[OUR_CIV_ADJ]
[OUR_STATE_RELIGION]
[OUR_BEST_UNIT]
[OUR_WORST_ENEMY]

[CT_NAME]
[CT_EMPIRE]
[CT_CIV_SHORT]
[CT_CIV_ADJ]
[CT_STATE_RELIGION]
[CT_BEST_UNIT]
[CT_WORST_ENEMY]

The "OUR" tags are for the player the message is from; the "CT" tags are for the player they are in contact with. They can also take an argument, which is used by some of the non-English text where the translations for words can have different forms for different parts of speech. For example, "[OUR_NAME:3]" gets the text for the 3rd form (whichever one that may be).
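
When checking a translated diplomacy text, it can help to list which of these tags and forms it uses. A small Python sketch, assuming the tags always look like the ones listed above (OUR_ or CT_ prefix, optional number after a colon); the name `find_diplomacy_tags` is made up and this is not how the game itself parses them.

```python
import re
from typing import Optional

# Matches e.g. [OUR_NAME] or [CT_CIV_ADJ:3]; the form number is optional.
TAG = re.compile(r"\[((?:OUR|CT)_[A-Z_]+)(?::(\d+))?\]")


def find_diplomacy_tags(text: str) -> list[tuple[str, Optional[int]]]:
    """Return (tag name, form number or None) for each diplomacy tag in the text."""
    return [(name, int(form) if form else None) for name, form in TAG.findall(text)]


print(find_diplomacy_tags("Greetings, [CT_NAME:3] of [CT_EMPIRE]!"))
# [('CT_NAME', 3), ('CT_EMPIRE', None)]
```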
 