What sort of features would you want in an XML error checker?

j_mie6

Hello,
I'm making a new utility, born of frustration with having to load up civ every time I want to check code, only for silent errors to appear. It will independently check mod source code for reference errors and schema problems.

My question to you is: what sort of features do you want?

At the moment it will be able to load both the game's assets and the mod's assets and check for references as well as schema problems. I am also including a new scripting language for use with the program, which can exclude certain files and tags from the checker, for example to ignore undefined text references.

If you want any more detail on the features please ask!

Download Version 1.0.1 here!

Thanks,
Jamie
 
The most frustrating thing for me has been those horrible and obnoxious silent CTDs that occur without fail whenever I reference a Buildingclass that doesn't exist (from making some typo, or forgetting to actually create the buildingclass), so a program like this would be much appreciated.
 
Yeah it will pick up on that! That is frustrating for me as well. Something I will have to try is detecting things that need to be included. For example, civs will not work properly without 2 uniques and spy names, and the program won't pick up on that, so I will have a think about those...
 
So, here are the few examples I promised!

Assume that the base game doesn't exist here and only what is declared here is real.

we have the following things defined by our mod:

Civilizations:
CIVILIZATION_ENGLAND
CIVILIZATION_FRANCE
CIVILIZATION_SPAIN
CIVILIZATION_GERMANY

UnitClasses:
UNITCLASS_CROSSBOWMAN
UNITCLASS_TANK
UNITCLASS_MUSKETMAN

Units (With UnitClassType in brackets):
UNIT_LONGBOWMAN (UNITCLASS_CROSSBOWMAN)
UNIT_PANZER (UNITCLASS_TANK)
UNIT_TERCIO (UNITCLASS_PIKEMAN)
UNIT_MINUTEMAN (UNITCLASS_MUSKETMAN)

Civilization_UnitClassOverrides (With UnitClassType, Unit in brackets):
CIVILIZATION_ENGLAND (UNITCLASS_CROSSBOWMAN, UNIT_LONGBOWMAN)
CIVILIZATION_FRANCE (UNITCLASS_MUSKETMAN, UNIT_MUSKETEER)
CIVILIZATION_SPAIN (UNITCLASS_PIKEMAN, UNIT_TERCIO)
CIVILIZATION_GERMANY (UNITCLASS_TANK, UNIT_PANZER)
CIVILIZATION_AMERICA (UNITCLASS_MUSKETMAN, UNIT_MUSKETEER)

So, if we load the mod into the program and don't create any excludes (yet), the program will tell us some interesting information:

Code:
Missing Reference: Row with type UNIT_TERCIO in Units references UNITCLASS_PIKEMAN in tag UnitClassType, this object doesn't exist
Missing Reference: Entry with CivilizationType = CIVILIZATION_FRANCE in Civilization_UnitClassOverrides references UNIT_MUSKETEER in tag UnitType, this object doesn't exist
Missing Reference: Entry with CivilizationType = CIVILIZATION_SPAIN in Civilization_UnitClassOverrides references UNITCLASS_PIKEMAN in tag UnitClassType, this object doesn't exist
Missing Reference: Entry with CivilizationType = CIVILIZATION_AMERICA in Civilization_UnitClassOverrides references CIVILIZATION_AMERICA in tag CivilizationType, this object doesn't exist

So, it appears we might have made some mistakes!!! And checking back it is right, we didn't define UNITCLASS_PIKEMAN, UNIT_MUSKETEER or CIVILIZATION_AMERICA!!!
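
For anyone curious how a check like this can work, here is a minimal sketch in Python. It is not the tool's actual code (the tool itself is a separate program); the dictionaries and the function name find_missing_references are made up for illustration and simply mirror the example data above.

```python
# Hypothetical in-memory copy of the example mod above (all names are illustrative).
CIVILIZATIONS = {"CIVILIZATION_ENGLAND", "CIVILIZATION_FRANCE",
                 "CIVILIZATION_SPAIN", "CIVILIZATION_GERMANY"}
UNIT_CLASSES = {"UNITCLASS_CROSSBOWMAN", "UNITCLASS_TANK", "UNITCLASS_MUSKETMAN"}
# Unit type -> the unit class it claims to belong to.
UNITS = {
    "UNIT_LONGBOWMAN": "UNITCLASS_CROSSBOWMAN",
    "UNIT_PANZER": "UNITCLASS_TANK",
    "UNIT_TERCIO": "UNITCLASS_PIKEMAN",
    "UNIT_MINUTEMAN": "UNITCLASS_MUSKETMAN",
}
# (CivilizationType, UnitClassType, UnitType) rows.
OVERRIDES = [
    ("CIVILIZATION_ENGLAND", "UNITCLASS_CROSSBOWMAN", "UNIT_LONGBOWMAN"),
    ("CIVILIZATION_FRANCE", "UNITCLASS_MUSKETMAN", "UNIT_MUSKETEER"),
    ("CIVILIZATION_SPAIN", "UNITCLASS_PIKEMAN", "UNIT_TERCIO"),
    ("CIVILIZATION_GERMANY", "UNITCLASS_TANK", "UNIT_PANZER"),
    ("CIVILIZATION_AMERICA", "UNITCLASS_MUSKETMAN", "UNIT_MUSKETEER"),
]


def find_missing_references():
    """Return one message per reference that points at an undefined object."""
    messages = []
    for unit, unit_class in UNITS.items():
        if unit_class not in UNIT_CLASSES:
            messages.append(
                f"Missing Reference: Row with type {unit} in Units "
                f"references {unit_class} in tag UnitClassType")
    for civ, unit_class, unit in OVERRIDES:
        checks = (("CivilizationType", civ, CIVILIZATIONS),
                  ("UnitClassType", unit_class, UNIT_CLASSES),
                  ("UnitType", unit, UNITS))
        for tag, value, known in checks:
            if value not in known:
                messages.append(
                    f"Missing Reference: Entry with CivilizationType = {civ} in "
                    f"Civilization_UnitClassOverrides references {value} in tag {tag}")
    return messages


for message in find_missing_references():
    print(message)
```

Note that this strict version checks every column of every row, so it also flags the UNIT_MUSKETEER reference in the CIVILIZATION_AMERICA row, which the listing above does not show.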

But we already knew we hadn't made UNIT_MUSKETEER yet; we just wanted to test everything else first. (I know it isn't hard to ignore that one error, but with text tags, for example, it could become very annoying to look at.) So let's write an exclude to ignore that Row from parsing:

Code:
Enter Commands
exclude <UNIT_MUSKETEER>;
end;

if we run the checks now it gives us this:

Code:
Missing Reference: Row with type UNIT_TERCIO in Units references UNITCLASS_PIKEMAN in tag UnitClassType, this object doesn't exist
Missing Reference: Entry with CivilizationType = CIVILIZATION_SPAIN in Civilization_UnitClassOverrides references UNITCLASS_PIKEMAN in tag UnitClassType, this object doesn't exist
Missing Reference: Entry with CivilizationType = CIVILIZATION_AMERICA in Civilization_UnitClassOverrides references CIVILIZATION_AMERICA in tag CivilizationType, this object doesn't exist

Perfect! But what if we wanted to ignore more general terms? Like we want to exclude any checks involving units from the program. This is where the script uses a specification called regex (Regular Expressions):

Code:
Enter Commands
exclude <UNIT.*>;
end;

Here we used the regex .* to mean "exclude anything which matches UNIT with zero or more characters added on the end". If we run the program now:

Code:
Missing Reference: Entry with CivilizationType = CIVILIZATION_AMERICA in Civilization_UnitClassOverrides references CIVILIZATION_AMERICA in tag CivilizationType, this object doesn't exist

Everything to do with units is gone! UNITCLASS_... matches because it is UNIT followed by more characters, and so does UNIT_MUSKETEER.
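
Here is a rough Python sketch of how an exclude like that could be applied. It assumes the pattern has to match the whole name (which would explain why CIVILIZATION_AMERICA survives the UNIT.* exclude); the helper name is_excluded is invented for this example and is not part of the tool.

```python
import re


def is_excluded(name, patterns):
    """True if any exclude pattern matches the entire name.

    Assumption: excludes are whole-name matches, so UNIT.* does not hit
    names that merely contain UNIT somewhere in the middle.
    """
    return any(re.fullmatch(pattern, name) is not None for pattern in patterns)


excludes = ["UNIT.*"]
names = ["UNIT_MUSKETEER", "UNITCLASS_PIKEMAN", "UNIT", "CIVILIZATION_AMERICA"]
for name in names:
    print(name, "excluded" if is_excluded(name, excludes) else "checked")
```

Because .* also matches an empty string, a bare UNIT would be excluded as well.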

Using regex it is possible to create complex excludes for files and tags, but I won't go into that here! :D

So that is what the program does so far, when I get past the prototype I will be able to say more about it. But at least the errors are more meaningful, the scripting is optional too! The only thing you might notice it doesn't show you is the File the error is in, but this is because all the tags are loaded into memory so the File is irrelevant, just like in-game.
 
At the moment yes (in my excludes there is a built-in exclude (.*\\.sql);), but after I have the XML file reading up and running I will either work on the GUI or develop some SQL parsing. Either way I hope to get it done. The main idea will be to parse SQL statements into my tables and then treat them as normal. It's just a case of developing a method for parsing the code. SQL will also need to be adopted into the schema checking (where new rows can be defined). So for the moment it is not included but I hope it will be :D
 
This utility sounds incredible! ...Except there's a lot more errors (unfortunately) that you can make with xml than just civilization and unitclass references... :p

And yes, it would be helpful to know when, probably due to a typo, that you're referencing a nonexistent object, but what about references to existent objects that are also errors? Like in unit art defining, referencing the wrong unit's art info, but that art define tag does exist - if that makes sense? (Yes, I have had a unit show up wrong because of that)

But that aside... would you happen to have an alpha version that I could try out? ;)
 
Hmm, I hadn't thought about that, but then I have been deliberately avoiding the art defines in my mod :lol: well if they are perfectly valid the program will not be able to read your mind, though I can try and set up a system when I know more about the art defines to check if the name of your define matches the unit you are applying it to, ie if the UNIT_SWORDSMAN is using ARTDEFINES_ARCHER suggest to the user they might have done something wrong, for example. Then you could write excludes to ignore those recommendations.
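
A very rough Python sketch of that name-matching idea, purely as an illustration. The function name art_define_warning and the "UNIT_" prefix handling are assumptions; real art define names may follow other conventions, so this could only ever be a suggestion, never an error.

```python
def art_define_warning(unit_type, art_define):
    """Return a suggestion string if the art define name does not mention the unit.

    Heuristic only: strips the UNIT_ prefix and looks for the rest of the
    name inside the art define name.
    """
    prefix = "UNIT_"
    unit_name = unit_type[len(prefix):] if unit_type.startswith(prefix) else unit_type
    if unit_name in art_define:
        return None
    return (f"Suggestion: {unit_type} uses {art_define}, "
            f"whose name does not contain {unit_name}")


print(art_define_warning("UNIT_SWORDSMAN", "ARTDEFINES_ARCHER"))
print(art_define_warning("UNIT_SWORDSMAN", "ART_DEF_UNIT_SWORDSMAN"))
```

The first call produces a suggestion and the second returns None, since the names line up.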

And unfortunately no :p I'm actually in pre-alpha! At the moment the tool doesn't actually load or parse XML files (though I have an idea about how I want to do it) but it will check for references in data I manually shove in its database. Sorry :lol: After implementing a requires statement (so that it checks that you have defined 2 uniques and have spy names etc, ie things that crash the game) I will be working on the XML parsing, then as I say either the GUI or the SQL parsing (or the schema checking, so much to do :lol:)
 
"...if they are perfectly valid the program will not be able to read your mind..."
My worst favorite thing about computers. :lol:
Then I console myself, saying "well if they could read your mind, they would probably be able to take over the world."

(Actually, the problem with the art defines was with <ArtDefine_StrategicView> (this was with my Minoan mod, BTW), where instead of <StrategicViewType>ART_DEF_UNIT_XIPHOS</StrategicViewType>, I had
<StrategicViewType>ART_DEF_UNIT_SWORDSMAN</StrategicViewType>, which is also a valid unit art info but not the one I needed. And somehow that made the unit show up as spearmen, for whatever reason... :confused:)
Well, I'm looking forward to being able to solve this pesky CTD - which actually is the only reason I haven't released one of my mods yet (okay, and the UA doesn't work). Keep up the good work!
 
Thanks, but I doubt it will help that much :p I've already done the logic for the xml testing, just need to load the xml, and as I am using Java for this the xml parsers will be different :D Thanks for the pointer though :)
 
It's a wonderful idea..

But here are some tips for reducing data entry mistakes, from the point of view of a total conversion modder (after adding >50 new civs, >100 units, >80 new techs, etc).

It's much easier to see and edit in SQL. You won't make as many errors if you can "see" all the new data in one screen in nice columns.

The best way to avoid errors is to not have to enter data at all. For example, here's a way to solve all UnitClass errors ... forever ...
Code:
DELETE FROM UnitClasses;
INSERT INTO UnitClasses (Type, UnitType, Description) SELECT UnitClass, Type, Description FROM Units;

Run that after your changes to Units. This only works if you want one-to-one relationship in all cases, but that's not a bad idea anyway. (Code from memory so I'm not 100% certain it's right, but I'll check and update if it's wrong...)
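
To see what those two statements do, here is a small Python sketch that runs them against a throwaway in-memory SQLite database. The table and column names are copied from the post as written and may not match the real game schema; the function name rebuild_unit_classes and the toy tables are invented for the demo.

```python
import sqlite3


def rebuild_unit_classes(conn):
    """Throw away UnitClasses and regenerate one class per unit (one-to-one)."""
    conn.execute("DELETE FROM UnitClasses")
    conn.execute(
        "INSERT INTO UnitClasses (Type, UnitType, Description) "
        "SELECT UnitClass, Type, Description FROM Units")
    conn.commit()


# Toy schema using the column names from the post (an assumption, not the game's).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Units (Type TEXT, UnitClass TEXT, Description TEXT)")
conn.execute("CREATE TABLE UnitClasses (Type TEXT, UnitType TEXT, Description TEXT)")
conn.execute("INSERT INTO Units VALUES ('UNIT_TERCIO', 'UNITCLASS_PIKEMAN', 'TXT_KEY_UNIT_TERCIO')")
conn.execute("INSERT INTO UnitClasses VALUES ('UNITCLASS_TYPO', 'UNIT_GONE', 'stale row')")

rebuild_unit_classes(conn)
print(conn.execute("SELECT Type, UnitType FROM UnitClasses").fetchall())
```

After the rebuild the stale row is gone and every class referenced by a unit exists, which is why the missing-class error cannot occur with this approach.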


Also, you can skip a lot of columns like Description, Civilopedia, etc. when entering each item by SQL. Just knock them all off with one-liners like:
Code:
UPDATE Units SET Description = 'TXT_KEY_' || Type, Civilopedia = 'TXT_KEY_' || Type || '_PEDIA';

Of course, the line above assumes that you are open to the idea of strict consistency in text key naming, which would be violating Firaxis examples (but hey I'm a rebel).

There is also no reason to spend more than one second on the silent not-loaded-to-end-of-file errors when they occur.
 
I've always had a small preference with xml (after working with sqlite for another project, which involved a lot of statements :rolleyes:), but yeah I do see what you mean ;)

As for your method, before I made this program I was hoping to try it out, but then I quickly realised it still involved loading the game AND then the mod. I just don't like spending 40 seconds loading my mod to check for errors, which I end up doing several times a day when working (and that is with an SSD; I have a friend who takes 2 minutes to load up civ!) :p That is why this program is perfect for me. The actual logic here should work very fast.
 
I've got my syntax for the requires statement btw... should be ok to use but most of the requires will be shoved in a saved file that the program can load if the user wants with run [requires_script]; or something similar:

requires 2 <CIVILIZATION.*> where {(Civilization.*ClassOverrides)|Improvements};
requires <CIVILIZATION.*> where {Civilization_SpyNames};

These say that two occurrences of CIVILIZATION_? are required in the overrides or improvements (ie, do they have some uniques) and any number of occurrences are required in spynames. Easy enough to understand?
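
One way to read those two statements, sketched in Python. This is a guess at the semantics, not the tool's implementation: it assumes each defined object matching the first pattern must be referenced at least N times across all tables whose names match the where pattern, and that a statement without a number means "at least once". The function name check_requires and the data layout are made up.

```python
import re
from collections import Counter


def check_requires(objects, tables, name_pattern, table_pattern, minimum=1):
    """Return the objects that are referenced fewer than `minimum` times.

    objects: defined type names, e.g. ["CIVILIZATION_ENGLAND", ...]
    tables:  table name -> list of type names referenced by its rows
    """
    counts = Counter()
    for table, references in tables.items():
        if re.fullmatch(table_pattern, table):
            counts.update(r for r in references if re.fullmatch(name_pattern, r))
    return [o for o in objects
            if re.fullmatch(name_pattern, o) and counts[o] < minimum]


objects = ["CIVILIZATION_ENGLAND", "CIVILIZATION_FRANCE"]
tables = {
    "Civilization_UnitClassOverrides": ["CIVILIZATION_ENGLAND", "CIVILIZATION_FRANCE"],
    "Improvements": ["CIVILIZATION_ENGLAND"],
    "Civilization_SpyNames": ["CIVILIZATION_FRANCE"],
}
# requires 2 <CIVILIZATION.*> where {(Civilization.*ClassOverrides)|Improvements};
print(check_requires(objects, tables, "CIVILIZATION.*",
                     "(Civilization.*ClassOverrides)|Improvements", 2))
# requires <CIVILIZATION.*> where {Civilization_SpyNames};
print(check_requires(objects, tables, "CIVILIZATION.*", "Civilization_SpyNames"))
```

In this toy data France has only one unique and England has no spy names, so each statement reports one civilization.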
 
Basic xml parser should check for
  • well formed (valid is harder to do as you can't write a DTD due to the dynamic nature of the tables)
  • correct structure (ie <Row> within <tablename>, single <Where> and <Set> within <Update>, etc)
  • spurious - characters (from using copy in MS-IE and paste into Notepad!)
  • table names are correct
  • column names are correct
  • other stuff I've forgotten ;)
The last two you can do by connecting to the SQLite .db file and reading the meta-data from the database schema - PM me if you need to know how as that's how I dumped all the unit art and civilization data
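
The first two bullets can be sketched in a few lines of Python with the standard library parser. The layout assumed here (a single wrapper element whose children are table elements containing <Row> and <Update>) and the function name check_structure are assumptions for the example; only the rules named in the list above are checked.

```python
import xml.etree.ElementTree as ET


def check_structure(xml_text):
    """Return a list of problems: well-formedness first, then basic structure."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # The parser's message includes the line and column of the failure.
        return [f"Not well formed: {exc}"]

    problems = []
    for table in root:
        # A <Row> or <Update> directly under the wrapper has no table around it.
        if table.tag in ("Row", "Update"):
            problems.append(f"<{table.tag}> found outside a table element")
            continue
        for entry in table:
            if entry.tag == "Update":
                for required in ("Where", "Set"):
                    count = len(entry.findall(required))
                    if count != 1:
                        problems.append(
                            f"<Update> in <{table.tag}> has {count} "
                            f"<{required}> elements, expected 1")
    return problems


sample = ("<GameData><Units><Update><Where Type='UNIT_X'/></Update>"
          "</Units></GameData>")
print(check_structure(sample))
```

Table and column names are not checked here, since those need the schema information discussed in the post above.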
 
  • table names are correct
  • column names are correct
Well, that you can just get from the database.log, with a little bit of inferring.
 
Well, the physical syntax errors of the xml are picked up by modbuddy, so I didn't feel I needed to worry too much about that (though presumably the Java xml parsers will pick up on bad xml themselves, with various degrees of useful exceptions :rolleyes:). I should probably be able to write up something for checking the order of tags while I load the xml data into memory, as long as the method is dynamic enough (I want to be using minimal hardcoding, hence the scripting language :lol:)

I will be attempting to have the program run around grabbing schema as it scans xml files, so that it can try the xml against the schema to check validity, but I should think the meta-data from the database would be more accurate. What sort of things would I expect to find in there that I could use?
 
what sort of things would I expect to find in there that I could use?
Every table name, every column name for every table, and the datatype (text/int) for every column. Also, most of the "this col must reference that col in that table" info is in there.

And if you use the database directly, every time you hit an ALTER TABLE statement in an SQL file all you have to do is execute it against the database and then continue normally, as your checking will dynamically adapt for any subsequent XML.
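
As a rough illustration of reading that metadata, here is a Python sketch using the standard sqlite3 module against a throwaway in-memory database. The toy tables are invented; with the real game you would connect to its .db file instead, and whether its references are declared in a way SQLite reports through foreign_key_list is an assumption here.

```python
import sqlite3


def read_schema(conn):
    """Return {table: {column: declared type}} from SQLite's own metadata."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
    for (table,) in tables:
        # table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = {column[1]: column[2] for column in columns}
    return schema


def read_references(conn, table):
    """Return (column, referenced table, referenced column) for each declared reference."""
    # foreign_key_list rows are (id, seq, table, from, to, on_update, on_delete, match).
    rows = conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
    return [(row[3], row[2], row[4]) for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE UnitClasses (Type TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE Units (Type TEXT PRIMARY KEY, "
             "UnitClassType TEXT REFERENCES UnitClasses(Type))")
print(read_schema(conn))
print(read_references(conn, "Units"))

# Executing an ALTER TABLE from a mod's SQL file updates the metadata in place,
# so re-reading the schema picks up the new column automatically.
conn.execute("ALTER TABLE Units ADD COLUMN Cost INTEGER")
print(read_schema(conn)["Units"])
```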
 