XML requests

n47

As a trial run for eventually generating some data from the XMLs for TD, I've parsed our files, and this is what came out.

Code:
Caveman2Cosmos\Assets\Modules\Custom Leaderheads\Sa\Sa_CIV4LeaderHeadInfos.xml
Input is not proper UTF-8, indicate encoding !
Bytes: 0xE1 0x3C 0x2F 0x44, line 13, column 25

Caveman2Cosmos\Assets\Modules\Custom_Civilizations\Jivaro\Leaders\Head Hunter_CIV4ArtDefines_Leaderhead.xml
xmlns: 'x-schema:Head Hunter_CIV4ArtDefinesSchema.xml' is not a valid URI, line 6, column 22

Caveman2Cosmos\Assets\Modules\Custom_Civilizations\Jivaro\Leaders\Head Hunter_CIV4LeaderHeadInfos.xml
xmlns: 'x-schema:Head Hunter_CIV4CivilizationsSchema.xml' is not a valid URI, line 8, column 27

Caveman2Cosmos\Assets\Modules\Custom_Civilizations\Pirates\SO_CIV4LeaderHeadInfos.xml
Input is not proper UTF-8, indicate encoding !
Bytes: 0x96 0x20 0x4E 0x6F, line 10, column 39

Caveman2Cosmos\Assets\Modules\Custom_Civilizations\Slovakia\Slovakia_CIV4CivilizationInfos.xml
Input is not proper UTF-8, indicate encoding !
Bytes: 0xE9 0x3C 0x2F 0x43, line 52, column 17

Caveman2Cosmos\Assets\XML\Units\CIV4UnitInfos.xml
Input is not proper UTF-8, indicate encoding !
Bytes: 0x92 0x43 0x6F 0x6E, line 161306, column 29
There should be some severe penalties for not caring about proper XML writing. :nono: -- Like no cookies for a week or something.

Will someone take care of this?
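For anyone who wants to reproduce the "Input is not proper UTF-8" check without the full parser, here is a minimal Python sketch. The function name find_invalid_utf8 is made up for illustration, and the column it reports is a plain byte offset within the line, so it may not match the parser's column exactly.

```python
def find_invalid_utf8(data: bytes):
    """Return (line, column, next 4 bytes) for the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
        return None  # the whole input is valid UTF-8
    except UnicodeDecodeError as err:
        start = err.start  # byte offset of the first offending byte
        line = data.count(b"\n", 0, start) + 1
        line_start = data.rfind(b"\n", 0, start) + 1  # 0 when on the first line
        column = start - line_start + 1  # 1-based, counted in bytes
        return line, column, data[start:start + 4]
```

To check a real file, read it in binary mode (open(path, "rb").read()) and pass the bytes in. Only the first problem is reported, so a file may need several passes.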
 
There is absolutely nothing wrong that I can see in Caveman2Cosmos\Assets\XML\Units\CIV4UnitInfos.xml line 161306, column 29! Which is the third underscore in
Code:
			<Description>TXT_KEY_UNIT_JUDGE</Description>
or should I be looking at the text that replaces this and if so which language?
 
Yeah, it is a little shifted. It is probably about "O&#1219;onnor". ;)
Code:
		<UnitInfo>
			<Class>UNITCLASS_JUDGE</Class>
			<Type>UNIT_JUDGE</Type>
			<UniqueNames>
				<UniqueName>Judge Roy Bean</UniqueName>
				<UniqueName>Lord Denning</UniqueName>
				<UniqueName>Sandra Day O&#1219;onnor</UniqueName>
 
Yes, the quote in O'Connor is not a proper UTF-8 character.
 
Code:
Caveman2Cosmos\Assets\Modules\Custom_Civilizations\Pirates\SO_CIV4LeaderHeadInfos.xml
Input is not proper UTF-8, indicate encoding !
Bytes: 0x96 0x20 0x4A 0x75, line 656, column 53

Caveman2Cosmos\Assets\Modules\Custom_Civilizations\Pirates\SO_CIV4LeaderHeadInfos.xml
Input is not proper UTF-8, indicate encoding !
Bytes: 0xE7 0x20 0x28 0x46, line 657, column 104

Caveman2Cosmos\Assets\Modules\Custom_Civilizations\Pirates\SO_CIV4LeaderHeadInfos.xml
Input is not proper UTF-8, indicate encoding !
Bytes: 0xE7 0x20 0x77 0x61, line 657, column 129
:nope:
 
Can someone correct those head hunters?

Code:
xmlns="x-schema:Head Hunter_CIV4CivilizationsSchema.xml"
This is illegal: a namespace URI cannot contain a literal space. Can you just rename the files to something like HeadHunter_CIV4CivilizationsSchema.xml?
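A quick way to hunt for other files with the same problem could look like the Python sketch below. It is only a text-level check with a regular expression, not a real URI validator, and the function name is made up for illustration.

```python
import re

# Matches xmlns="..." or xmlns:prefix="..." and captures the quoted value.
XMLNS_PATTERN = re.compile(r'xmlns(?::[\w.-]+)?\s*=\s*(["\'])(.*?)\1')

def namespaces_with_whitespace(xml_text: str):
    """Return every xmlns value in the text that contains whitespace."""
    return [
        match.group(2)
        for match in XMLNS_PATTERN.finditer(xml_text)
        if re.search(r"\s", match.group(2))
    ]
```

An empty list means no namespace declaration in that text contains a space.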
 
This is very, very important. The encoding of all XML files must be fixed before n47 and I (as a team :P) can do anything with them via scripting (mass changes, extraction, etc.).
 
No.
Open each file in Notepad++, open the Encoding menu (fifth from the left) and choose "Encode in UTF-8".

Then go to the lines pointed out by n47 and change the weird signs to the proper ones from the keyboard.
Save after the fix. That's all.
 
@SO, thanks for trying to fix this encoding, but there was more of it. Btw, have you tried to use Notepad++ as Nimek said? It is a good tool for such things. It can even convert characters automatically. -- You can select text with bad encoding, copy it (Ctrl+C), change the encoding, paste it (Ctrl+V), and the text is automatically converted to the target encoding.

I've corrected all the problems mentioned. I'm just hoping that removing the spaces from the Head Hunter file names won't break any dependency.
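The same conversion can be scripted for many files at once. The Python sketch below assumes the broken files were saved as Windows-1252, which is only a guess: all the reported bytes (0x92, 0x96, 0xE1, 0xE9, 0xE7) are printable characters in that code page (’, –, á, é, ç). The function name and the default encoding are assumptions, not something from our tools.

```python
def reencode_to_utf8(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Return the input as UTF-8 bytes, converting from source_encoding if needed."""
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8, leave untouched
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 was written in source_encoding.
        return raw.decode(source_encoding).encode("utf-8")
```

If a file's XML declaration names a different encoding, that declaration would probably need updating as well, so it is worth re-running the parser after converting.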
 
I have some first results from Xerces. PromotionInfo/UnitCombats can take only boolean values, but in \Assets\XML\Units\CIV4PromotionInfos.xml there is something like this. Can someone correct it?
Code:
<PromotionInfo>
			<Type>PROMOTION_ANTIBIOLOGICAL</Type>
			...
			<UnitCombats>
				<UnitCombat>
					<UnitCombatType>UNITCOMBAT_SIEGE</UnitCombatType>
					<bUnitCombat>1</bUnitCombat>
				</UnitCombat>
				<UnitCombat>
					<UnitCombatType>UNITCOMBAT_GUN</UnitCombatType>
					<bUnitCombat>1</bUnitCombat>
				</UnitCombat>
				<UnitCombatMod>
					<UnitCombatType>UNITCOMBAT_WHEELED</UnitCombatType>
					<iUnitCombatMod>35</iUnitCombatMod>
				</UnitCombatMod>
				<UnitCombatMod>
					<UnitCombatType>UNITCOMBAT_TRACKED</UnitCombatType>
					<iUnitCombatMod>20</iUnitCombatMod>
				</UnitCombatMod>
				<UnitCombat>
					<UnitCombatType>UNITCOMBAT_HELICOPTER</UnitCombatType>
					<bUnitCombat>1</bUnitCombat>
				</UnitCombat>
				<UnitCombat>
					<UnitCombatType>UNITCOMBAT_DREADNOUGHT</UnitCombatType>
					<bUnitCombat>1</bUnitCombat>
				</UnitCombat>
				<UnitCombat>
					<UnitCombatType>UNITCOMBAT_ASSAULT_MECH</UnitCombatType>
					<bUnitCombat>1</bUnitCombat>
				</UnitCombat>
				<UnitCombat>
					<UnitCombatType>UNITCOMBAT_CLONES</UnitCombatType>
					<bUnitCombat>1</bUnitCombat>
				</UnitCombat>
			</UnitCombats>
 
It's not to be corrected. I don't know how Xerces came to this conclusion, but the original CivIV programming had this sort of thing taking place quite often. Yes, it ends up being an array of booleans. I'm sure you'll probably feel the way AIAndy did about this when he looked at the method in horror, but:
Code:
pXML->SetVariableListTagPair(&m_pbUnitCombat, "UnitCombats", sizeof(GC.getUnitCombatInfo((UnitCombatTypes)0)), GC.getNumUnitCombatInfos());
is intended to read that exactly as it is shown. It goes through ALL unit combats and establishes an array of size numUnitCombats, assigns a default boolean of 0 to each one that isn't called in the XML under that tag, and, if the tag does call for any of the unit combat definitions, sets the boolean for that entry to the bUnitCombat value.

Point being, this is exactly how the XML is supposed to be programmed for that tag. I know it sucks. We haven't taken the time to improve those, as this heavy method (arrays that really should be vectors) is used so extensively throughout many XML files.
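To make the description above concrete, here is a loose Python model of it. This is not the DLL code: UNIT_COMBATS is a hypothetical three-entry stand-in for the game's full list, the function name is invented, and the sketch only looks at <UnitCombat> children, so how the real loader treats anything else is not modelled.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the game's full list of unit combat types.
UNIT_COMBATS = ["UNITCOMBAT_SIEGE", "UNITCOMBAT_GUN", "UNITCOMBAT_WHEELED"]

def read_unit_combat_flags(promotion, all_types):
    """One boolean per known unit combat type, False unless the XML sets it to 1."""
    flags = [False] * len(all_types)  # default of 0 for every type
    container = promotion.find("UnitCombats")
    if container is None:
        return flags
    for pair in container.findall("UnitCombat"):
        type_name = pair.findtext("UnitCombatType", default="").strip()
        value = pair.findtext("bUnitCombat", default="0").strip()
        if type_name in all_types:
            flags[all_types.index(type_name)] = (value == "1")
    return flags
```

The result always has one slot per unit combat type, no matter how few the XML names, which is the "heavy" part compared with storing only the listed entries in a vector.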
 
If you want this tag to look like this, we can make some workaround, but generally I want such situations to generate error messages, because they are likely to be errors.

Besides, putting integers there is currently illegal according to the schema.
Code:
	<ElementType name="UnitCombat" content="eltOnly">
		<element type="UnitCombatType"/>
		<element type="bUnitCombat"/>
	</ElementType>
 
ah... crap.. I get what you're saying now:
Code:
				<UnitCombatMod>
					<UnitCombatType>UNITCOMBAT_WHEELED</UnitCombatType>
					<iUnitCombatMod>35</iUnitCombatMod>
				</UnitCombatMod>
				<UnitCombatMod>
					<UnitCombatType>UNITCOMBAT_TRACKED</UnitCombatType>
					<iUnitCombatMod>20</iUnitCombatMod>
				</UnitCombatMod>
very bad yes.
 
No... you've gotta notice that those nested integer calls are entirely wrong - not a problem in the schema at all.

They are using <UnitCombatMod> as opposed to <UnitCombat>, which means they were copied over from, or originally intended to be part of, the <UnitCombatMods> tag. That tag isn't a prerequisite access tag as <UnitCombats> is, but rather a combat modifier against that particular type of unit combat. It calls for an integer, <iUnitCombatMod>, while this one calls for a boolean, <bUnitCombat>.
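Misplaced entries like these can also be searched for with a script. The Python sketch below lists every child of <UnitCombats> that is not a <UnitCombat>. The function names are invented, and tags are compared by local name so that a default namespace on the root element does not get in the way.

```python
import xml.etree.ElementTree as ET

def local(tag: str) -> str:
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]

def stray_unit_combat_children(xml_text: str):
    """Return (promotion type, stray tag, unit combat type) for each misplaced child."""
    root = ET.fromstring(xml_text)
    problems = []
    for promo in root.iter():
        if local(promo.tag) != "PromotionInfo":
            continue
        promo_type = next((c.text for c in promo if local(c.tag) == "Type"), "?")
        for container in promo:
            if local(container.tag) != "UnitCombats":
                continue
            for child in container:
                if local(child.tag) != "UnitCombat":
                    combat = next(
                        (g.text for g in child if local(g.tag) == "UnitCombatType"), "?"
                    )
                    problems.append((promo_type, local(child.tag), combat))
    return problems
```

Run against the promotion quoted earlier, this should report the UNITCOMBAT_WHEELED and UNITCOMBAT_TRACKED entries and nothing else.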
 
Next error:
Assets\Modules\ls612\Traits\ls612_CIV4TraitInfos.xml
Code:
			<Type>TRAIT_PHILOSOPHICAL</Type>
			...
					<Not>
						<GOMType>GOM_OPTION</GOMType>
						<ID>GAMEOPTION_LEADERHEAD_LEVELUPS</ID>
					</Not>
Probably should be
Code:
			<Type>TRAIT_PHILOSOPHICAL</Type>
			...
					<Not>
						<Has>
							<GOMType>GOM_OPTION</GOMType>
							<ID>GAMEOPTION_LEADERHEAD_LEVELUPS</ID>
						</Has>
					</Not>
 