
XML Autocorrect Tool

Discussion in 'Civ4 - Caveman 2 Cosmos' started by billw2015, Sep 21, 2019.

  1. billw2015

    billw2015 Chieftain

    Joined:
    Jun 22, 2015
    Messages:
    664
    This tool will do semi-automated grammar and spelling correction on Text XML files.
    Usage:
    1. Have Java installed (most people probably do?)
    2. Run Tools\Autocorrect\StartServer.bat
      Spoiler :

      This starts the grammar and spelling server that the tool uses, a local version of the LanguageTool service. However, I only use this for grammar and for detecting initial spelling errors, because its dictionary is way too limited for our eclectic needs. I added a couple of other spell checkers on top of this to give better accuracy, but the default is set to just use Google search for spell checking.
    3. Drag and drop Text XML files on to Tools\Autocorrect\DropXMLHere.bat
      You can also run Autocorrect.bat from the command line and see the available options.
    The server is quite slow (it does about one piece of text per second or so), but it picks up a lot of things beyond spelling, including incorrect unit conversions and some bad writing style (!).
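
For anyone curious how a tool can talk to a local LanguageTool server, here is a minimal sketch. It is not the code from Tools\Autocorrect. The URL and port are assumptions (the thread does not say what StartServer.bat uses), and `check_text` and `apply_first_suggestions` are hypothetical helper names. It relies on LanguageTool's `/v2/check` endpoint, which returns a list of matches with `offset`, `length` and `replacements`.

```python
import json
import urllib.parse
import urllib.request

# Assumption: address of the local LanguageTool server. The port that
# StartServer.bat actually uses is not stated in the thread.
LT_URL = "http://localhost:8081/v2/check"


def check_text(text, language="en-US", url=LT_URL):
    """POST text to a LanguageTool server and return its list of matches."""
    data = urllib.parse.urlencode({"text": text, "language": language}).encode()
    with urllib.request.urlopen(url, data=data, timeout=30) as resp:
        return json.load(resp)["matches"]


def apply_first_suggestions(text, matches):
    """Apply the first suggested replacement of each match.

    Matches are applied from the end of the string backwards so that the
    offsets of earlier matches stay valid after each substitution.
    """
    for m in sorted(matches, key=lambda m: m["offset"], reverse=True):
        if not m.get("replacements"):
            continue  # nothing suggested, leave this span alone
        start, end = m["offset"], m["offset"] + m["length"]
        text = text[:start] + m["replacements"][0]["value"] + text[end:]
    return text
```

With the server running, `apply_first_suggestions(entry, check_text(entry))` would correct one TEXT entry without any prompting. The real tool asks you before applying anything.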

    Using Google as the spell checker means it shouldn't hit many false positives for spelling, but it will miss some things, such as incorrect capitalization. Basically, if you can type it into Google and Google doesn't say "did you mean xyz?", it won't be picked up as an error in this system by default.

    The interface is just simple keyboard + console, with some color added to make it easier to understand. When the tool finds errors in a TEXT entry it will present all the corrections applied and allow you to choose to either apply them all, or interactively select which ones you want to use. Of course you can skip, ignore and exit at the appropriate places as well.

    Spoiler Screenshot :

    upload_2019-9-21_22-45-4.png
     
    KaTiON_PT likes this.
  2. raxo2222

    raxo2222 Time Traveller

    Joined:
    Jun 10, 2011
    Messages:
    6,166
    Location:
    Poland
    This didn't work:
    I ran StartServer.bat, let it do its stuff, then dragged and dropped a text XML onto DropXMLHere.bat, and got an error.
    StartServer.bat failed to install stuff properly - it says path not found.
    Spoiler :

    Dwm 2019-09-22 09-57-39-32.png Dwm 2019-09-22 09-57-36-33.png

    The Python folder is named pythonx86 and not python.
    It seems like the python folder wasn't renamed.
    So I force-renamed pythonx86 to python in the Tools folder.
    Apparently that thing recreated that folder lol.

    Now that thing works.

    Best way to do it is: open two folders.
    The first one displays the Autocorrect folder content so you can drop things on the bat file.
    The second one can be opened in Assets. Type *text*.xml in the search bar to find all text files.
    There are a lot of text files scattered across the modules.
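
The same `*text*.xml` search can be done from a script. This is a rough sketch only. `find_text_xml` is a hypothetical helper and not part of the tool, and the folder you pass in is up to you.

```python
from pathlib import Path


def find_text_xml(root):
    """Return every .xml file under root whose file name contains 'text'.

    Both the name and the extension are compared case-insensitively,
    mirroring a *text*.xml search in Windows Explorer.
    """
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file()
        and p.suffix.lower() == ".xml"
        and "text" in p.name.lower()
    )
```

For example, `for p in find_text_xml("Assets"): print(p)` lists the files you could then drop on the bat file.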
     
    Last edited: Sep 22, 2019
  3. billw2015

    billw2015 Chieftain

    Joined:
    Jun 22, 2015
    Messages:
    664
    Ah okay, I guess you have a 32-bit OS? I will modify the batch file to do the rename then.
     
  4. raxo2222

    raxo2222 Time Traveller

    Joined:
    Jun 10, 2011
    Messages:
    6,166
    Location:
    Poland
    @billw2015 it crashes on larger files.

    Dwm 2019-09-22 10-47-44-07.png

    For example, it can't fully process P2K_CIV4GameText, which is 96 KB. The biggest text files are closer to 1 MB.
     
  5. raxo2222

    raxo2222 Time Traveller

    Joined:
    Jun 10, 2011
    Messages:
    6,166
    Location:
    Poland
    I have 64-bit Windows 7.
     
  6. billw2015

    billw2015 Chieftain

    Joined:
    Jun 22, 2015
    Messages:
    664
    Hmm, PowerShell thinks you have a 32-bit OS, so it installed the x86 version of Python, hence the directory name. Weird.
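
One possible explanation, not confirmed anywhere in this thread, is that the check ran inside a 32-bit process. On 64-bit Windows a 32-bit process sees `PROCESSOR_ARCHITECTURE=x86`, and the real architecture is only visible in `PROCESSOR_ARCHITEW6432`. The sketch below shows a check that takes both variables into account. `os_is_64bit` is a hypothetical helper, not what the batch file does.

```python
import os


def os_is_64bit(env=None):
    """Guess whether Windows itself is 64-bit from environment variables.

    PROCESSOR_ARCHITEW6432 is only set for a 32-bit process running on a
    64-bit OS, so its presence means the OS is 64-bit even when
    PROCESSOR_ARCHITECTURE reports x86.
    """
    env = os.environ if env is None else env
    arch = env.get("PROCESSOR_ARCHITECTURE", "").upper()
    wow64 = env.get("PROCESSOR_ARCHITEW6432", "")
    return arch in ("AMD64", "ARM64", "IA64") or wow64 != ""
```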
     
  7. raxo2222

    raxo2222 Time Traveller

    Joined:
    Jun 10, 2011
    Messages:
    6,166
    Location:
    Poland
    It shouldn't crash on a "too many requests" error; it should retry the request instead.
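
A minimal sketch of what retrying could look like, assuming the spell-check request is made with `urllib` and the server answers with HTTP status 429 ("Too Many Requests"). `with_retries` and its parameters are hypothetical, and the tool's real request code is not shown in this thread.

```python
import time
import urllib.error


def with_retries(request_fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call request_fn(), retrying on HTTP 429 with exponential backoff.

    Waits base_delay, 2*base_delay, 4*base_delay, ... seconds between
    attempts. Any other error, or a 429 on the last attempt, is re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except urllib.error.HTTPError as err:
            if err.code != 429 or attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is only there so the waiting can be swapped out when testing.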
     
  8. billw2015

    billw2015 Chieftain

    Joined:
    Jun 22, 2015
    Messages:
    664
    Software shouldn't crash you say? I think I'm going to need to run this one by the guys in the lab :mischief:
    I have not seen this error. I guess we spammed Google spell check too hard? Which file did you use?

    edit: nm, didn't see your earlier post. Are you just reiterating for emphasis?
     
  9. raxo2222

    raxo2222 Time Traveller

    Joined:
    Jun 10, 2011
    Messages:
    6,166
    Location:
    Poland
    Yeah, because now I don't know whether it finished checking the file or was abruptly shut down.

    Also, when I tried to recheck the file, it didn't check anything and just immediately said "too many requests".
    Any moderately large (>100 kB) text file would eventually result in this error.

    Also we have plenty of text files.
    Spoiler :

    Dwm 2019-09-22 12-09-16-54.png

    Over 48,000 English text entries
     
    Last edited: Sep 22, 2019
  10. billw2015

    billw2015 Chieftain

    Joined:
    Jun 22, 2015
    Messages:
    664
    Yeah, but it only uses Google when the other spell checker detects a spelling error, so it doesn't make 48,000 requests to Google. Regardless, I will try it out today and see if there is a fix.
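
Roughly, the two-stage idea looks like the sketch below. The function names are hypothetical, and the cache is an extra assumption on top of what is described here, added so that a repeated word costs only one remote request.

```python
def confirm_misspellings(words, local_is_suspect, remote_is_error):
    """Return the words that both checkers consider misspelled.

    local_is_suspect: cheap check (e.g. a local dictionary lookup).
    remote_is_error:  expensive check (e.g. a web query), called only for
                      words the cheap check flagged, once per distinct word.
    """
    cache = {}
    confirmed = []
    for word in words:
        if not local_is_suspect(word):
            continue  # the cheap checker accepts it, no remote call needed
        if word not in cache:
            cache[word] = remote_is_error(word)
        if cache[word]:
            confirmed.append(word)
    return confirmed
```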
     
    raxo2222 likes this.
  11. billw2015

    billw2015 Chieftain

    Joined:
    Jun 22, 2015
    Messages:
    664
    Okay, new version is in. It is a LOT faster now, like 50x faster (I found a bug in the library I was using, and rewrote it). It still catches errors with Google, but it also calls Google a lot less now anyway.
    And it has nice progress dots that show some indication of what it is doing.
    I fixed up that file you listed above in about 10 seconds using it.

    edit: actually it looks like it misses some spelling errors in its current form, I will need to tweak it again a bit.
     
    Last edited: Sep 22, 2019
    raxo2222 likes this.
  12. billw2015

    billw2015 Chieftain

    Joined:
    Jun 22, 2015
    Messages:
    664
    Okay, the last change for now is in. It improves the spell check performance and stops it from missing obvious spelling mistakes.
     
    raxo2222 likes this.
  13. raxo2222

    raxo2222 Time Traveller

    Joined:
    Jun 10, 2011
    Messages:
    6,166
    Location:
    Poland
    @billw2015 I can't install it anymore - when I launch StartServer.bat, a red wall of text appears and then it exits instantly.
    As if something Python-related doesn't install fully.
    Spoiler :

    Dwm 2019-10-19 09-25-34-47.png Dwm 2019-10-19 09-23-40-04.png Dwm 2019-10-19 09-23-40-35.png


    I tried deleting the python folders in Tools, but it didn't help.
     
    billw2015 likes this.
