My humble suggestion and offer to the AI team. I have always thought it would be to Firaxis' advantage, or the Better AI team's, to have more automated testing of the code across a wide range of variables. By this, I mean the ability to generate a starting game, run it all the way through, then start the same game again with one small parameter changed, run it all the way through again, and see how that change affected the outcome.
Of course, this would preclude a human player from affecting the game. The changes would be things like the following (a sketch of how such a sweep might be generated appears after the list):

1. What if you left five of the civs the same but swapped out the sixth?
2. What if the capital had slightly fewer food resources?
3. What if the Better AI team changed a single parameter?
4. How would the various civs fare against each other in every possible duel?
5. What if praetorians were nerfed to strength five?
6. How does the same strategy work out on a different map type?
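To make the idea concrete, here is a minimal sketch in Python, purely for illustration. Every name in it is a hypothetical placeholder, not an actual Civ4 or Better AI setting; the point is just that each test game differs from the baseline in exactly one variable, so any difference in outcome can be traced back to that variable.

[code]
# Hypothetical sketch: generate a single-variable sweep of test games.
# None of these keys or values are real Civ4/BetterAI settings; they
# stand in for whatever the AI team actually wants to vary.
import itertools
import json

baseline = {
    "map": "Continents",
    "civs": ["Rome", "Egypt", "China", "Greece", "Persia", "Inca"],
    "capital_food_bonus": 0,      # hypothetical knob (item 2 above)
    "praetorian_strength": 8,     # stock value; the nerf test uses 5
}

# Each variation changes exactly one thing against the same baseline.
variations = [
    {"civs": ["Rome", "Egypt", "China", "Greece", "Persia", "Mali"]},
    {"capital_food_bonus": -1},
    {"praetorian_strength": 5},
    {"map": "Pangaea"},
]
jobs = [baseline] + [dict(baseline, **change) for change in variations]

# Item 4 above: every head-to-head duel among a pool of civs.
duel_pool = ["Rome", "Egypt", "China", "Greece"]
for a, b in itertools.combinations(duel_pool, 2):
    jobs.append(dict(baseline, civs=[a, b], map="Duel"))

# Write one job file per game for volunteers to pick up.
for i, job in enumerate(jobs):
    with open(f"job_{i:03d}.json", "w") as f:
        json.dump(job, f, indent=2)
[/code]

Changing one variable at a time keeps each comparison clean; trying all combinations at once would explode the number of games and muddy the conclusions.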
To this end, I would volunteer my computer, and I'm sure other people would as well. I envision a system where the Better AI team generates the games they want to see run. I sign up, download one such game, and start it, and it runs to completion (ending on a time limit or a victory). Then I send the final game save back to the AI team for analysis. I could easily run a couple of games overnight and another couple through the day, and if enough people volunteered, you could have a hundred different games running every day. A sketch of what the volunteer's side might look like follows.
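Here is a minimal sketch of that volunteer loop, again just to show the shape of it. The coordinator URL, the endpoints, and the command-line flags are all invented placeholders, and as far as I know Civ4 has no official headless mode, so "run the game" would really mean an autoplay-style mod driving a normal game instance to completion.

[code]
# Hypothetical sketch of the volunteer client: fetch a job, run it,
# upload the result. Server URL, endpoints, headers, filenames, and
# game flags are all placeholders, not a real protocol.
import subprocess
import urllib.request

SERVER = "http://example.org/betterai-tests"   # hypothetical coordinator

def run_one_job():
    # 1. Fetch the next queued game description from the coordinator.
    with urllib.request.urlopen(f"{SERVER}/next-job") as resp:
        job_id = resp.headers["X-Job-Id"]        # hypothetical header
        with open("job.CivBeyondSwordWBSave", "wb") as f:
            f.write(resp.read())

    # 2. Launch the game and block until it exits. The flags below are
    #    placeholders for however an autoplay mod is actually invoked.
    subprocess.run(
        ["Civ4BeyondSword.exe", "mod=BetterAI", "/autoplay",
         "job.CivBeyondSwordWBSave"],
        check=True,
    )

    # 3. Upload the finished save (assumed to be written by the autoplay
    #    mod) for the AI team to replay and analyze.
    with open("result.CivBeyondSwordSave", "rb") as f:
        req = urllib.request.Request(
            f"{SERVER}/result/{job_id}", data=f.read(), method="POST")
        urllib.request.urlopen(req)

if __name__ == "__main__":
    run_one_job()
[/code]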
I know the AI team is small. Maybe they cannot analyze every game in detail and would instead concentrate on the games where something unusual jumps out at them. But if all they had to do was load up the save and run the playback, it might still give them some useful information.