Fun with xml extraction

For Buildings we could have something like a file for ordinary buildings, another for wonders and one for special + one for modular parts, etc.
Amount of work aside (I'd be ready to help for that), are there downsides I don't see to that approach?

I would definitely support this. Most buildings are in Hydro's or my module, and they are so deep in the building structure that you can't really toggle off a module without causing a lot of XML errors. So they can be merged into core. I think the main reason why this hasn't happened is that neither Hydro nor I have time for this at the moment :(
 
If it's just a matter of merging files, I can do that quite quickly (auto-merge with a notepad++ plugin and delete the parts from the closing tag from each former file up to the opening tag of the next file), is there anything else I've missed?

The main issue I see is to sort out all files (which current file should go in each final file) and order them based on the loading order... I'll look into that and make a suggestion.
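
For what it's worth, the "append one file to the next" merge described above could be sketched roughly like this in Python. This is only an illustration, not the Notepad++ procedure: the function name `merge_info_files` and the container tag `BuildingInfos` are assumptions, and it only makes sense for files whose entries are all new (no overrides of existing entries).

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Return the tag name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def find_container(root, container):
    """Find the first element named `container` (assumed layout, e.g. <BuildingInfos>)."""
    for el in root.iter():
        if local(el.tag) == container:
            return el
    raise ValueError(f"no <{container}> element found")


def merge_info_files(xml_texts, container="BuildingInfos"):
    """Append the entries of every later file to the first file's container.

    The order of `xml_texts` is kept, so pass the files in loading order.
    Nothing is deduplicated or overridden here.
    """
    roots = [ET.fromstring(text) for text in xml_texts]
    target = find_container(roots[0], container)
    for root in roots[1:]:
        target.extend(list(find_container(root, container)))
    return roots[0]
```

The result can be written back out with `ET.ElementTree(merged).write(path)`. Sorting the input list by loading order is still a manual step.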
 
I am not sure if merging is so simple. We have a lot of stacked layers. For example, I have tweaked versions of core buildings. Some are minor tweaks, while others are a complete overwrite of the original and should not be merged but replaced instead. I also think there are some in the Project Hades mod that would get messed up if my stuff were in the core. Mine is on top of that mod and thus can overwrite the changes made in that mod.

In short my mod is not there to be modular but to stay on top of the stack of changes by other mods below it.

I recommend you just don't touch it. It's a HUGE can of worms that can break the game if merged wrong. Note that all this came about because in the beginning we did not have an SVN, so we each had to work on separate parts of the mod. I also started as a Modmod for RoM/AND, so it was easier for my stuff to just stay as a mod rather than merging.

EDIT: There is much less of Faustmouse's stuff, and it was made after the Hades Mod, so you may want to try merging his stuff into the core first as a test run.
 
Ah ok, yeah, this would be a good start. It would at least clean up some stuff.
I think Vokarya and Sargon both have a module with wonders that can be merged too.
 
1) Fortunately, at one stage we went through and removed from core any buildings that are in your mod, so it is not quite as bad as you may think. I have noticed a few have crept in since then, so it should not be too hard to just remove those buildings from core.

Your mod does make it difficult for others to mod the buildings you have in your files because of the number of force overwrite tags you have used. I have had to throw out a couple of ideas because of it.

Eventually they need to be moved. The Modules area needs to be for optional stuff. That is a BtS standard, and it can confuse people who come to C2C expecting the modules there to be optional. Perhaps we need a new folder for such non-optional components. It will be up to the dll people to come up with a solution, as it will need to load from there just like from Modules but without the MLF stuff.

2) Deciding what is part of optional modules or even what should be optional is more important than merging everything down into core. For example all Vokarya's wonders that are not complete should be left in the module whereas the ones that are complete could be merged into core. That way it would be easy to turn them off until someone has time to complete them.

It has been requested that the properties be optional - that is all and each property. It is a pity they are not WoC compatible as that will make such a thing difficult.
 
They'd still be valid for the replacement mechanism though, so it may still be possible.

IMO, a merge program would be nearly useless for merging modular stuff into the core... it would take a far more hands-on approach, with an understanding of how the WoC system works intrinsically, to do it safely.
 
If the module only has new buildings then a merge is trivial. You just add the one file on the end of the other. Most of the stuff Faustmouse, Vokarya and Sargon did are new buildings since they did them as "beginner XML modders".
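
A quick way to check the "only has new buildings" condition before appending anything could look like the sketch below. It assumes each entry is a `<BuildingInfo>` element identified by a `<Type>` child; the function names are made up for the example. Any type returned by `overriding_types` exists in both files, so that module entry is a tweak or overwrite rather than a new building and needs a manual look.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Return the tag name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def info_types(xml_text, entry="BuildingInfo"):
    """Collect the <Type> value of every <entry> element in one file."""
    types = set()
    for el in ET.fromstring(xml_text).iter():
        if local(el.tag) != entry:
            continue
        for child in el:
            if local(child.tag) == "Type" and child.text:
                types.add(child.text.strip())
    return types


def overriding_types(core_xml, module_xml, entry="BuildingInfo"):
    """Types defined in both files; an empty set means the module only adds entries."""
    return info_types(core_xml, entry) & info_types(module_xml, entry)
```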
 
Yeah, we may need to go in order of the modular loading, starting with the mods that come first and are overwritten the most. Mine would come near the end.

Yeah, I realize you did some already. If we merge my stuff, we should probably go one sub-mod at a time, starting with easy submods like Cranes, Pests and Disease and then working our way up to the biggest, most connected mods like Science, Health, Tweaks and the granddaddy of them all, the Craft mod.

And really, the Crafts mod is the biggest of my mods. It has so much stuff and links not only to the core but to many other mods. It should be done last and with the most care. It's like its own 2nd Core.
 
Rwn, can you tell when a text tag is too long? There are a number of STRATEGY texts that got the pedia info in them, which makes play difficult. Any that are over, say, 60-100 characters are probably too long and need work.
 
Sure. Attached is the list of pedia and strategy text entries that are more than 60 characters long, sorted by total length of the text entry (copy and paste the unzipped txt file into your favorite spreadsheet).

I might have missed some, but I think that's only the ones from vanilla BTS, so it shouldn't matter.
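
In case someone wants to redo this later, here is a rough Python sketch of that kind of check. It is not the method used for the attached list. It assumes text files made of `<TEXT>` entries with a `<Tag>` and an `<English>` child, and it picks pedia/strategy entries by the key suffixes `_PEDIA` and `_STRATEGY`; both the layout and the suffixes are assumptions to adjust as needed.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Return the tag name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def long_entries(xml_text, limit=60, lang="English",
                 suffixes=("_STRATEGY", "_PEDIA")):
    """Return (key, length) pairs for matching entries longer than `limit`.

    Sorted longest first; that ordering is a choice made for this sketch.
    """
    found = []
    for text_el in ET.fromstring(xml_text).iter():
        if local(text_el.tag) != "TEXT":
            continue
        # Map each direct child name to its text, e.g. {"Tag": ..., "English": ...}
        fields = {local(c.tag): (c.text or "").strip() for c in text_el}
        key = fields.get("Tag", "")
        body = fields.get(lang, "")
        if key.endswith(suffixes) and len(body) > limit:
            found.append((key, len(body)))
    return sorted(found, key=lambda item: item[1], reverse=True)
```

Calling it with `suffixes=("_STRATEGY",)` restricts the list to strategy texts only.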
 

Attachments

  • Too long entries.rar
    399.9 KB · Views: 31
Thanks. Pedia can be any length, the longer the better.

Edit: Is it possible to get a list of any duplicate TXT_KEY_ entries? Once I know which ones have duplicates, I can use WinGrep to find the files. Having duplicates means we are never sure which one is the one we should change. :D

Edit 2: A list of orphan ones would be useful also, i.e. TXT_KEY_ entries that only occur in text files and not in any of the other XML files.
 
Yes, here is the list of all duplicate text entries, based on all the C2C xml files containing "text" in the file name or located in the xml/text/ folder.

When an entry is listed more than once, it means that there's more than one duplicate. Also, duplicates are not necessarily in different files (for example I found TXT_KEY_TECH_FIBER_OPTICS_STRATEGY twice in RoM_GameText_Strategy.xml). Also, I will soon be unavailable for a while, so if you need anything else I'll look at it in a couple of weeks.
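
A rough sketch of how such a duplicate list could be rebuilt in Python, with the file of each occurrence recorded. The attached list was not necessarily produced this way. It assumes keys live in `<Tag>` elements, and `duplicate_tags` is a name made up for the example.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET


def local(tag):
    """Return the tag name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def duplicate_tags(files):
    """Map each key defined more than once to the files defining it.

    `files` maps a file path to that file's XML content. A path appears
    once per occurrence, so a key defined twice in one file lists that
    file twice.
    """
    where = defaultdict(list)
    for path, xml_text in files.items():
        for el in ET.fromstring(xml_text).iter():
            if local(el.tag) == "Tag" and el.text:
                where[el.text.strip()].append(path)
    return {key: paths for key, paths in where.items() if len(paths) > 1}
```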


Edit : Sorry, missed this one :
edit 2 a list of orphan ones would be useful also, ie TXT_KEY_ entries that only occur in text files and not in any of the other XML files.
That sounds a bit more complicated. I'll try to find a way when I get back.
 

Attachments

  • Duplicatetextentries.txt
    31.2 KB · Views: 47
- Honestly, I'm not sure I'll want to do again the work to build those files anytime soon (though it was an interesting exercise to find a way to do it initially)...

I've worked on that again, but instead of going through the whole process, I scripted/automated as much as I could to generate the tables from XML; it still takes a while (especially for buildings and units entries), but I'd say it now takes me about 1 or 2h each instead of 6, so that's still quite an improvement ;)

Also, Thunderbrd, you wanted to have the file path for each entry (useful for sorting out duplicates, for example), so I took this opportunity to find a way to get them.

So, here are the complete tables with path included, based on SVN 7421 (should be the latest). It's in .csv format which should be readable with any software, but let me know if somebody needs another format (such as txt).
- Buildings (all files matching *CIV4BuildingInfo*)
- Units (all files matching *CIV4UnitInfo*)
- Tech (all files matching *CIV4TechInfo*)

You'll notice that the format of the first lines slightly changed (each parent tag now has its own line); if you find it less readable, that can be easily changed, again, let me know what you'd like.
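
For anyone curious what an XML-to-table step with the path included might look like, here is a minimal Python sketch. It is not the script used for the attached files: the column layout (nested tags joined with "/", repeated values joined with ";") and the names `flatten` and `build_table` are choices made for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET


def local(tag):
    """Return the tag name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def flatten(el, row, prefix=""):
    """Copy every leaf under `el` into `row`, keyed by its tag path."""
    for child in el:
        key = prefix + local(child.tag)
        if len(child):  # has children: descend, extending the key
            flatten(child, row, key + "/")
        else:
            value = (child.text or "").strip()
            row[key] = f"{row[key]};{value}" if key in row else value


def build_table(files, entry="BuildingInfo"):
    """Return CSV text with one row per <entry>, starting with its file path.

    `files` maps a file path to that file's XML content.
    """
    rows, columns = [], ["Path"]
    for path, xml_text in files.items():
        for el in ET.fromstring(xml_text).iter():
            if local(el.tag) != entry:
                continue
            row = {"Path": path}
            flatten(el, row)
            rows.append(row)
            for key in row:  # keep columns in first-seen order
                if key not in columns:
                    columns.append(key)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Passing `entry="UnitInfo"` or `entry="TechInfo"` would cover the other two tables, assuming those are the entry tag names.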
 

Attachments

  • Civ4BuildingInfos.rar
    201.2 KB · Views: 36
  • Civ4TechInfos.rar
    42 KB · Views: 31
  • Civ4UnitInfos.rar
    129 KB · Views: 21
edit 2 a list of orphan ones would be useful also, ie TXT_KEY_ entries that only occur in text files and not in any of the other XML files.

I couldn't find a good way to get this automatically BUT, based on the other files I've worked on (Units, Buildings and Tech), I have a first list that should be useful for text entries starting with either:
TXT_KEY_BUILDING_
TXT_KEY_TECH_
TXT_KEY_UNIT_

What I've done is simply check whether any existing text entry matching one of these patterns appears in any file matching *CIV4BuildingInfo*, *CIV4UnitInfo* or *CIV4TechInfo*. This means that I don't guarantee that all entries mentioned in the file are truly orphans, just that they don't appear in the files where they most obviously should. Also, there might be other orphan entries that don't match the above patterns, but with ~2000 supposedly orphan entries already identified, that's a good start...
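
The check described above could be sketched in Python roughly as follows. This is a plain-text regex approximation, not the process actually used for the attached list: it assumes keys are defined as `<Tag>TXT_KEY_...</Tag>` in the text files and counts a key as referenced if its name appears anywhere in one of the info files. The function name is made up for the example.

```python
import re

PREFIXES = ("TXT_KEY_BUILDING_", "TXT_KEY_TECH_", "TXT_KEY_UNIT_")
TAG_RE = re.compile(r"<Tag>\s*(TXT_KEY_\w+)\s*</Tag>")  # key definitions
REF_RE = re.compile(r"TXT_KEY_\w+")                      # any mention of a key


def candidate_orphans(text_xmls, info_xmls, prefixes=PREFIXES):
    """Keys defined in the text files but never mentioned in the info files.

    Both arguments are lists of file contents. Only keys starting with
    one of `prefixes` are considered, so the result is a list of
    candidates, not a guaranteed list of orphans.
    """
    defined = {
        key
        for xml_text in text_xmls
        for key in TAG_RE.findall(xml_text)
        if key.startswith(prefixes)
    }
    referenced = {ref for xml_text in info_xmls for ref in REF_RE.findall(xml_text)}
    return sorted(defined - referenced)
```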
 

Attachments

  • Orphan txt entries.txt
    65.6 KB · Views: 31
- Buildings (all files matching *CIV4BuildingInfo*)
- Units (all files matching *CIV4UnitInfo*)
- Tech (all files matching *CIV4TechInfo*)

Really good, ty a lot! :goodjob: However, it would be nice to see the actual name of a building as well, since the tags do not always match. Or are they in these files and I was just too blind to see them? :sad:
 
No, I haven't added them yet, but it should be easy to do. I'll do that shortly.
 