How to add a translation to C2C

AIAndy · Apr 15, 2013

Starting with the current SVN version (so it will be in V30), adding a new translation has become considerably easier.
You can now translate as few texts as you want; the game falls back to English for every text you have not translated.
For the languages that can be chosen in the Civ4 options dialog, use these tags:
English, French, German, Italian, Spanish, Finnish, Hungarian, Polish, Russian, Chinese, Japanese.
You can also add a new language. Choose a fitting tag name (for instance, if you want to add Portuguese translations, use Portuguese as the tag name) and then add your translations using that tag name. To select your new language, open "Caveman2Cosmos\Assets\XML\GlobalDefines.xml". Near the top you will find the new entry LANGUAGE. Setting it to the tag name of your language overrides the normal language selection and uses your new translation instead.

It is best to start your translation in the folder "Caveman2Cosmos\Assets\XML\Text". It contains a lot of text files with entries similar to this:

Code:

	<TEXT>
		<Tag>TXT_KEY_CORPORATION_BURGERWORLD</Tag>
		<English>Burgerworld</English>
		<French>MondoBurger</French>
		<German>Burgerworld</German>
		<Italian>Burgerworld</Italian>
		<Spanish>Burgerworld</Spanish>
	</TEXT>

If your language tag is already in there, just replace the text with your translation. If not, add it after the other tags. So if you add Portuguese, it might look like this afterwards:

Code:

	<TEXT>
		<Tag>TXT_KEY_CORPORATION_BURGERWORLD</Tag>
		<English>Burgerworld</English>
		<French>MondoBurger</French>
		<German>Burgerworld</German>
		<Italian>Burgerworld</Italian>
		<Spanish>Burgerworld</Spanish>
		<Portuguese>Your translation goes here</Portuguese>
	</TEXT>

Non-ASCII characters need to be represented by their number in the character set, so &#246; results in ö. You can look the numbers up here, for instance:
http://ascii.cl/htmlcodes.htm
EDIT: From a quick test, it seems that ä, ö and similar characters also work directly, so I am not sure why the current translations use the number codes instead.
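
If you prefer to stay on the safe side and use the number codes, the conversion can be scripted. Below is a minimal Python sketch. The function name to_numeric_refs is made up for illustration. It relies on Python's standard "xmlcharrefreplace" error handler, which writes each non-ASCII character as its decimal code point; for characters such as ä, ö, ü and ß that is the same number as in the Western Latin character set.

```python
def to_numeric_refs(text: str) -> str:
    """Replace every non-ASCII character with a numeric reference like &#246;."""
    # "xmlcharrefreplace" is a built-in codec error handler: characters that
    # cannot be encoded as ASCII are written as &#<decimal code point>;
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_numeric_refs("Köln"))  # prints K&#246;ln
```

Plain ASCII text passes through unchanged, so running it over an already converted file should do no harm.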

There are some special codes that refer to arguments passed to the text generation. You will usually use them the same way as the text that is already there, so I will not go into detail here (check the existing translation files if you want to learn more). In general, %s1 means that a text argument is inserted, while %d1 means a number argument. What information that text or number carries depends on the specific text.
[COLOR_HIGHLIGHT_TEXT]Some Text[COLOR_REVERT] and similar codes change the color of the text in between.
[LINK=literal]A unit or building name or similar[\LINK] turns that text into a link to the pedia.
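
Since a translation should normally carry the same argument codes as the English text, a quick consistency check can catch forgotten ones. This is only a sketch: the helper names are invented, and it looks only at the %s<number> and %d<number> forms mentioned above, so any other argument codes the game may support are ignored.

```python
import re
from typing import List

# Assumption: only the %s<N> and %d<N> forms named above are checked.
PLACEHOLDER = re.compile(r"%[sd]\d+")


def placeholders(text: str) -> List[str]:
    """Return the distinct argument codes in a text, sorted."""
    return sorted(set(PLACEHOLDER.findall(text)))


def same_placeholders(english: str, translation: str) -> bool:
    """True if both texts use the same set of argument codes.

    Order is ignored on purpose, because word order differs between languages.
    """
    return placeholders(english) == placeholders(translation)


print(same_placeholders("%s1 has %d2 gold", "%s1 hat %d2 Gold"))
```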

In addition to the text files in that folder there are also text files in the different folders below "Caveman2Cosmos\Assets\Modules". They all have CIV4GameText in their name.
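
To get a list of every file that may need translating, something like the following Python sketch could be used. The function name find_text_files is hypothetical, and it assumes the files carry a lowercase .xml extension; adjust the pattern if your copy differs.

```python
from pathlib import Path
from typing import List


def find_text_files(mod_root: str) -> List[Path]:
    """Collect the text files under the mod folder (e.g. "Caveman2Cosmos")."""
    root = Path(mod_root)
    # All files directly in Assets\XML\Text
    main = sorted((root / "Assets" / "XML" / "Text").glob("*.xml"))
    # Files anywhere below Assets\Modules with CIV4GameText in their name
    modules = sorted(
        p for p in (root / "Assets" / "Modules").rglob("*.xml")
        if "CIV4GameText" in p.name
    )
    return main + modules
```

Calling find_text_files("Caveman2Cosmos") from the Mods folder would then return the paths in a stable, sorted order.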

Pin · Apr 15, 2013

Thanks so much for taking care of this!

AIAndy said:
There are some special codes that refer to arguments that are passed to the text generation.

These special codes are used for grammar. For German I found some syntax examples here:
http://www.civforum.de/archive/index.php/t-47015.html - but I am still not sure about the mechanics behind them, for example how the engine decides which declension to use (aside from explicitly defined ones) and whether there are fallback solutions. This matters because every language has its own grammar. It needs more detailed testing and is on my to-do list too (at least for German).

As for the charsets: all sources say that only ISO-8859-1 (Western Latin) is usable (never trust what all sources say, I know). Polish, for example (as this was requested lately), would need ISO-8859-2, Russian yet another one, and so on. So in my inexperienced opinion it is not possible at the moment, as all the files are declared and encoded as Western Latin. Do you know more about this topic?

And would it be a good idea (if everything works well with this new approach) to remove all empty non-English language tags? This would reduce the size of the XML files considerably and would enable the proper fallback to English.
<German/>
<German>.</German>
<German></German>
These are the most common forms I have found in the files so far.

Edit: the link just shows examples for the tag definitions, not for the special codes.

AIAndy · Apr 15, 2013

Pin said:
And would it be a good idea (if everything works well with this new approach) to remove all empty non-English language tags? This would reduce the size of the XML files considerably and would enable the proper fallback to English.
<German/>
<German>.</German>
<German></German>
These are the most common forms I have found in the files so far.

I added some code to ignore the empty tags, meaning the first and third of your examples, but not the one that contains just a '.', so it is better if those are removed.
In general, tags that are empty or contain a copy of the English text can now be removed. That might make it easier to see which translations are missing, but it is also quite some work. The size reduction will not matter much in the compressed file but will save some MB of disk space (the in-game memory footprint won't change, though).
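
For anyone who wants to script that removal, here is a rough Python sketch for a single TEXT entry. It is not what the game does internally; prune_entry is an invented name, and the entry is parsed without any XML namespace, which the real files may declare. Language tags that contain child elements are left alone, because only plain text tags are discussed here.

```python
import xml.etree.ElementTree as ET
from typing import List


def prune_entry(entry: ET.Element) -> List[str]:
    """Remove language tags that are empty or just repeat the English text.

    Returns the names of the removed tags.
    """
    english = (entry.findtext("English") or "").strip()
    removed = []
    for child in list(entry):  # copy the list, since we remove while iterating
        if child.tag in ("Tag", "English"):
            continue
        if len(child) > 0:
            continue  # has child elements: not a plain text tag, keep it
        text = (child.text or "").strip()
        if text == "" or text == english:
            entry.remove(child)
            removed.append(child.tag)
    return removed


entry = ET.fromstring(
    "<TEXT><Tag>TXT_KEY_CORPORATION_BURGERWORLD</Tag>"
    "<English>Burgerworld</English><French>MondoBurger</French>"
    "<German>Burgerworld</German><Italian/></TEXT>"
)
print(prune_entry(entry))  # prints ['German', 'Italian']
```

A tag that contains only '.' is deliberately kept by this sketch, since that can occasionally be the intended text.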

strategyonly · Apr 15, 2013

So it's better to have 3 than 2 then? Even if there is nothing in the English tag?

AIAndy · Apr 15, 2013

strategyonly said:
So it's better to have 3 than 2 then? Even if there is nothing in the English tag?

Which C2C text has nothing in the English tag?
Number 2 will actually display a '.', and on rare occasions that is actually the text you want to display. Numbers 1 and 3 are equivalent; both are now ignored and the English text is used instead.

Thunderbrd · Apr 16, 2013

Great documentation and great work in general AIAndy!

Btw... what does happen if there IS no English tag? Obviously this is wrong but will it create problems?

AIAndy · Apr 16, 2013

Thunderbrd said:
Great documentation and great work in general AIAndy!

Btw... what does happen if there IS no English tag? Obviously this is wrong but will it create problems?

The new code falls back to the second tag (usually the one right after Tag). If that one is empty, or if there is only one tag, it discards the text.
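
Put together with the English fallback and the ignored empty tags described earlier in the thread, the lookup order can be modelled roughly like this in Python. This is only an illustration of the described behaviour, not the actual game code; resolve_text is a made-up name and the entry is parsed without an XML namespace.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def resolve_text(entry: ET.Element, language: str) -> Optional[str]:
    """Pick the text to display for one TEXT entry, or None if it is discarded."""
    # 1. The selected language, then English. Missing and empty tags are skipped.
    for tag in (language, "English"):
        text = entry.findtext(tag)
        if text:
            return text
    # 2. No usable English tag: fall back to the second tag of the entry
    #    (usually the one right after Tag).
    children = list(entry)
    if len(children) >= 2 and children[1].text:
        return children[1].text
    # 3. Second tag empty, or only one tag present: discard the text.
    return None


entry = ET.fromstring(
    "<TEXT><Tag>TXT_KEY_CORPORATION_BURGERWORLD</Tag>"
    "<English>Burgerworld</English><French>MondoBurger</French><German/></TEXT>"
)
print(resolve_text(entry, "French"))      # prints MondoBurger
print(resolve_text(entry, "German"))      # prints Burgerworld
```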

Snofru1 · Apr 16, 2013

Thanks a lot AIAndy, this is really great news!

AIAndy said:
Non-ASCII characters need to be represented by their number in the character set, so &#246; results in ö. You can look the numbers up here, for instance:
http://ascii.cl/htmlcodes.htm
EDIT: From a quick test, it seems that ä, ö and similar characters also work directly, so I am not sure why the current translations use the number codes instead.

I did some German translations in the past, e.g. for ROM and AND, and I found that in many cases there was no problem using ä, ö, ü and ß directly. But sometimes they just didn't display right. I don't know why, and I am sorry that I can't remember any more details; it was a long time ago. Therefore I replaced all the ä, ö, ü, Ä, Ö, Ü and ß with their numeric codes, and that worked fine. This is not really complicated to do with the find/replace function in Notepad++, which can also search folders.

Hale_9204 · Apr 16, 2013

From what I found, entries like <Italian>.</Italian> and <Italian>Blah.</Italian> occur often, especially in the pedia and strategy entries.

About the special characters: I have always used à, ò, è and so on, and they work fine for me (using Notepad++).

Snofru1 · Apr 16, 2013

To make real use of the new language system, I checked all the XML for any dots (e.g. <German>.</German>) and for any "Blah" and replaced them with empty language slots. This was done for French, German, Italian and Spanish, based on the current SVN 5277.

Maybe someone with SVN writing access can merge these files in.
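
For reference, a cleanup like this can be approximated with one regular expression. The Python sketch below is not the procedure actually used for the attached files; blank_placeholders is an invented name, and the pattern only covers a lone dot, "Blah" and "Blah." in the four languages named above. As AIAndy noted, a lone '.' is occasionally the intended text, so the result should be reviewed before committing.

```python
import re

# Filler entries in the four languages that were cleaned up.
FILLER = re.compile(r"<(French|German|Italian|Spanish)>\s*(?:\.|Blah\.?)\s*</\1>")


def blank_placeholders(xml_text: str) -> str:
    """Turn filler entries such as <German>.</German> into empty tags."""
    return FILLER.sub(r"<\1></\1>", xml_text)


print(blank_placeholders("<German>.</German>"))  # prints <German></German>
```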

Dancing Hoskuld · Apr 16, 2013

Since I usually merge the language stuff in, I will do this too.

Snofru1 · Apr 16, 2013

Dancing Hoskuld said:
Since I usually merge the language stuff in, I will do this too.

Thank you, Mr. Hoskuld!

ls612 · Apr 16, 2013

Snofru1 said:
To make real use of the new language system, I checked all the XML for any dots (e.g. <German>.</German>) and for any "Blah" and replaced them with empty language slots. This was done for French, German, Italian and Spanish, based on the current SVN 5277.

Maybe someone with SVN writing access can merge these files in.

I'll do it right now for my XML. DH, you don't need to merge my GameText in that ZIP.

Edit: AIAndy, is it OK to have <language></language> for gametext tags, or should it be <language/>?

AIAndy · Apr 16, 2013

ls612 said:
I'll do it right now for my XML. DH, you don't need to merge my GameText in that ZIP.

Edit: AIAndy, is it OK to have <language></language> for gametext tags, or should it be <language/>?

<X></X> and <X/> are equivalent to the XML reader in all cases.

strategyonly · Apr 16, 2013

AIAndy said:
<X></X> and <X/> are equivalent to the XML reader in all cases.

Huh, I didn't know that. I thought it was always "better" to have at least the minimum when doing stuff like this, i.e. <X/>.

Pin · Apr 19, 2013

Here is a short overview of the tags used in the pedia; they might come in handy. I don't know whether there are even more:

-) Text structuring:

[SPACE] - forces a space
[ TAB ] - forces a tab (written without the spaces, of course)
[NEWLINE] - a line break, the same as...
[PARAGRAPH:1] - a paragraph break, where the number indicates how many lines of spacing to insert

[H1]Text[\H1] - makes it a big headline

[H2]Text[\H2] - a small headline, same size and appearance as...
[BOLD]Text[\BOLD] - marks text in bold (in the text flow). Btw, no italics here.
[ICON_BULLET] - bullet sign

-) Icons to embed:

[ICON_HAPPY]
[ICON_UNHAPPY]
[ICON_UNHEALTHY]
[ICON_HEALTHY]

[ICON_ESPIONAGE]
[ICON_RELIGION]
[ICON_CULTURE]
[ICON_RESEARCH]
[ICON_FOOD]
[ICON_PRODUCTION]
[ICON_COMMERCE] - the single gold-coin-like symbol
[ICON_GOLD] - the stack-of-coins-symbol
[ICON_TRADE]

[ICON_STRENGTH]
[ICON_MOVES]

-) Colors to use:

[COLOR_HIGHLIGHT_TEXT]Text[COLOR_REVERT]
[COLOR_BUILDING_TEXT]Text[COLOR_REVERT]
[COLOR_UNIT_TEXT]Text[COLOR_REVERT]
[COLOR_WARNING_TEXT]Text[COLOR_REVERT]
[COLOR_TECH_TEXT]Text[COLOR_REVERT]
[COLOR_YIELD_COMMERCE]Text[COLOR_REVERT]
[COLOR_YIELD_FOOD]Text[COLOR_REVERT]

[COLOR_YELLOW]Text[COLOR_REVERT]
[COLOR_RED]Text[COLOR_REVERT]
[COLOR_BLUE]Text[COLOR_REVERT]
[COLOR_GREEN]Text[COLOR_REVERT]
[COLOR_BLACK]Text[COLOR_REVERT]
[COLOR_CYAN]Text[COLOR_REVERT]

-) Examples of Links to other entries:

[LINK=CONCEPT_WORLD_VIEW]Text of the Link[\LINK] - seems to point to (TXT_KEY_)CONCEPT...
[LINK=BUILDING_HEALERHUT]Text of the Link[\LINK]
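
A forgotten closing tag is easy to miss in a long pedia text, so a small check can help. The following Python sketch is only an illustration: unbalanced_tags is a made-up name, and it covers just the BOLD and LINK pairs from the list above by comparing how often each is opened and closed.

```python
import re
from typing import List

# Only these paired tags from the list above are checked.
PAIRED_TAGS = ("BOLD", "LINK")


def unbalanced_tags(text: str) -> List[str]:
    """Return the names of paired tags whose open and close counts differ."""
    problems = []
    for name in PAIRED_TAGS:
        # Opening form: [BOLD] or [LINK=SOME_KEY]
        opens = len(re.findall(r"\[" + name + r"(?:=[^\]]*)?\]", text))
        # Closing form uses a backslash: [\BOLD], [\LINK]
        closes = text.count("[\\" + name + "]")
        if opens != closes:
            problems.append(name)
    return problems


print(unbalanced_tags(r"[LINK=BUILDING_HEALERHUT]Healer's Hut"))  # prints ['LINK']
```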

AIAndy · Apr 19, 2013

Thank you, Pin.

In addition, there is [ICON_PROPERTY_CRIME] and similar for all properties.

EDIT: And here are all other icons from the code:

Spoiler :

Thunderbrd · Apr 19, 2013

Can we get a link to those two posts in the Modder's Documentation thread? THAT is a very nice compilation there! Very useful if it can be quickly referenced.

strategyonly · Apr 19, 2013

Thunderbrd said:
Can we get a link to those two posts in the Modder's Documentation thread? THAT is a very nice compilation there! Very useful if it can be quickly referenced.

Very much AGREED!!!!

God-Emperor · Apr 20, 2013

There are also a number of other tags used in diplomacy text:
[OUR_NAME]
[OUR_EMPIRE]
[OUR_CIV_SHORT]
[OUR_CIV_ADJ]
[OUR_STATE_RELIGION]
[OUR_BEST_UNIT]
[OUR_WORST_ENEMY]

[CT_NAME]
[CT_EMPIRE]
[CT_CIV_SHORT]
[CT_CIV_ADJ]
[CT_STATE_RELIGION]
[CT_BEST_UNIT]
[CT_WORST_ENEMY]

The "OUR" tags are for the player the message is from; the "CT" tags are for the player they are in contact with. The tags can also take an argument, which is used by some of the non-English texts where the translation of a word can have different forms for different parts of speech. For example, "[OUR_NAME:3]" gets the text for the 3rd form (whichever one that may be).
