Combat Odds and RNG
A discussion about what you need to understand about the two and how they are different.
Feel free to add your comments below or ask any questions.
First, I'd like the reader to understand that a combat odds "calculator" and an "RNG" (random number generator) are completely separate things that perform very different functions, yet in a lot of conversations the two seem to be confused. The RNG's function is simply to provide a random number in a requested range. If you ask it for a random number between 0 and 99 it will give you one, and you have no way of predicting what it will be.
Contrary to what many people seem to feel, streaks occur all the time from RNGs and statistically they are expected. Many people wrongly believe that a "fair" RNG (or a coin or die for that matter) must balance out toward the mean on a fairly short timescale. This is a fallacy because it implies the RNG or coin must possess some form of memory. In mathematical language, the absence of such memory is called "independence": the result of each trial does not depend on the previous trial nor on any part of its history.
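To make the point about streaks concrete, here is a minimal sketch that flips a simulated fair coin and measures the longest streak in each sequence. It uses Python's standard `random` module rather than the game's RNG, and the function names, sequence length and trial count are my own choices for illustration.

```python
import random


def longest_run(flips):
    """Return the length of the longest run of identical consecutive values."""
    best = 0
    current = 0
    previous = object()  # sentinel that never equals a real flip
    for flip in flips:
        current = current + 1 if flip == previous else 1
        best = max(best, current)
        previous = flip
    return best


def simulate_longest_runs(num_flips=100, trials=2000, seed=1):
    """Flip a fair coin num_flips times, repeat for several trials,
    and return the longest streak found in each trial."""
    rng = random.Random(seed)
    return [
        longest_run([rng.randrange(2) for _ in range(num_flips)])
        for _ in range(trials)
    ]


if __name__ == "__main__":
    runs = simulate_longest_runs()
    share = sum(r >= 5 for r in runs) / len(runs)
    print(f"Share of 100-flip sequences with a streak of 5 or more: {share:.0%}")
```

Each flip here is independent of the ones before it, yet streaks of five or more identical results turn up in the large majority of 100-flip sequences. A sequence with no streaks at all would be the suspicious one.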
Combat Odds Calculator:
The purpose of an odds calculator is to compute the probabilities of various outcomes occurring. Computing odds is something that people can do very well, as can machines which are programmed to do so (as is the case with Civ4). Compare this with how notoriously bad people are at generating apparently random sequences of numbers, i.e. performing the role of an RNG. When done properly, odds are absolutely correct - they are not "fuzzy" or "rough" guesses, just as I can tell you that when you toss a coin you have a 50% chance of getting tails. You don't have "roughly" 50% odds, you have exactly 50% odds (assuming a perfect unbiased coin of course!).
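As an illustration of what "exactly" means here, the sketch below computes the odds for a deliberately simplified model of a battle: each round one side or the other lands a hit, with a fixed chance per round, until one side has taken enough hits to die. This is not the game's actual code; it ignores first strikes, withdrawal and everything else, and the function and parameter names are hypothetical. It uses exact fractions so that no rounding is involved.

```python
from fractions import Fraction
from math import comb


def win_probability(p_hit, hits_to_win, hits_to_lose):
    """Exact chance of landing hits_to_win hits before taking hits_to_lose.

    p_hit is the chance of winning any single round (simplified model:
    rounds are independent and exactly one side is hit each round).
    """
    p = Fraction(p_hit)
    q = 1 - p
    # Sum over k = number of hits taken before landing the final hit.
    return sum(
        comb(hits_to_win - 1 + k, k) * p**hits_to_win * q**k
        for k in range(hits_to_lose)
    )


if __name__ == "__main__":
    odds = win_probability(Fraction(3, 5), 4, 7)
    print(f"Exact odds: {odds} (about {float(odds):.1%})")
```

The result is a single exact fraction, not an estimate, and the two sides' odds always add up to exactly 1. Nothing random happens anywhere in this calculation, which is the whole difference between the calculator and the RNG.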
Is the odds calculator accurate?
In the very earliest versions of Civilization IV there were one or more errors in the combat odds calculation that were sometimes very obvious, because you could supposedly have greater than 100% (e.g. 101%) odds of victory. That was fixed way back in vanilla, and the standard odds calculator has been accurate ever since. Things like first strikes etc. are now factored into the calculation absolutely correctly. There are however two very specific situations in which the displayed combat odds are (or can be) wrong.
1. The first of these errors is due to what should probably be called a bug in the combat resolution code. Some units in Civ4 are not allowed to kill units when they attack (e.g. catapults, cannons etc.). These units have what is called a combat limit: they cannot damage any unit beyond that amount and are instead forced to withdraw. If one of these units damages a defender to exactly the combat limit (e.g. a catapult damaging a unit down to exactly 25HP), then rather than the attacker retreating right away, combat continues. If the unit with the combat limit then scores another hit, it does 0HP of damage and the combat ends with the unit withdrawing. This is not very intuitive, and the combat odds calculator assumes the opposite - that the unit would have withdrawn before it got to a situation where it could deal 0HP of damage. The effect is that the calculator usually over-estimates the unit's odds of withdrawal. The largest discrepancy I've seen was a unit that supposedly had about a 67% chance of surviving (withdrawing) but actually had only a 50% chance. Usually the difference is smaller than that.
2. The second way the odds can be wrong relates solely to barbarian combat when a player has a number of barbarian free wins remaining. On Prince difficulty and below, a certain number of "barbarian free wins" are given to the player and whenever he wins a combat with a barbarian (including animals!) the counter is decreased by 1 each time, until it reaches 0. Before the counter reaches 0, there is a specific intervention in every barbarian combat that gives the player at least 90% odds in each combat round (and consequently 10% at most for the barbarian). Most often this results in overall combat odds of around 99%-ish so almost always these combats will be skewed enough so that the player wins (and his free win counter is then decreased). The reason the combat odds calculator is wrong here is that it ignores this intervention to the combat resolution and assumes the combat is done as normal. This means the odds calculator tends to under-estimate your odds of survival in such barbarian combats. It can be argued that this is not really an "error" per se but is intended to be hidden from the player, but in any case it is misleading IMO.
The two above potential errors have been rectified in Advanced Combat Odds. However the barb free wins info can be ignored with a particular setting if the player or modder wants the barb free wins to remain hidden to the player.
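The direction of both discrepancies can be sketched with a simplified model in which each round is won with a fixed chance and a side is finished after taking a set number of hits. This is only my reading of the two descriptions above, not the game's or the mod's code: the combat limit case is modelled as "one extra hit is needed when the damage lands exactly on the limit", and the barbarian case as "the per-round chance is raised to at least 90% while free wins remain". All names and inputs are hypothetical.

```python
from math import comb


def win_probability(p_hit, hits_to_win, hits_to_lose):
    """Chance of landing hits_to_win hits before taking hits_to_lose."""
    q = 1 - p_hit
    return sum(
        comb(hits_to_win - 1 + k, k) * p_hit**hits_to_win * q**k
        for k in range(hits_to_lose)
    )


def withdrawal_odds(p_hit, hits_to_limit, hits_to_die, lands_exactly_on_limit):
    """Return (displayed, actual) odds for a unit with a combat limit.

    Assumption: when damage lands exactly on the limit, the unit needs
    one more (0HP) hit before it actually withdraws.
    """
    displayed = win_probability(p_hit, hits_to_limit, hits_to_die)
    extra_hit = 1 if lands_exactly_on_limit else 0
    actual = win_probability(p_hit, hits_to_limit + extra_hit, hits_to_die)
    return displayed, actual


def barbarian_odds(p_hit, hits_to_win, hits_to_lose, free_wins_left, floor=0.9):
    """Return (displayed, actual) odds against a barbarian.

    Assumption: while free wins remain, the per-round chance is at least
    `floor`, and the displayed odds ignore this.
    """
    displayed = win_probability(p_hit, hits_to_win, hits_to_lose)
    p_actual = max(p_hit, floor) if free_wins_left > 0 else p_hit
    actual = win_probability(p_actual, hits_to_win, hits_to_lose)
    return displayed, actual


if __name__ == "__main__":
    print("combat limit:", withdrawal_odds(0.5, 3, 4, True))
    print("barbarian free win:", barbarian_odds(0.5, 5, 5, free_wins_left=2))
```

In this sketch the combat limit quirk pushes the actual odds below the displayed ones, while the barbarian free wins push them above, matching the over-estimate and under-estimate described above.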
But something is not right with the odds. I lose way too many battles at X%.
I personally have verified the correctness of the game's odds calculator independently (i.e. calculated the combat odds separately and then compared the algorithms), and this was further verified by another mathematician from the forums (Roland Johansen). The RNG has also been confirmed (by DanF5771, ParadigmShifter and others) to be what is usually called a Linear Congruential Generator (LCG). Without going too far into the specifics of how it works, it has a period of more than four billion. This means it would need to be called about that many times before it started repeating itself. The LCG is the type of RNG used in many applications - especially games - because it is very fast (only a couple of integer operations for each output) and very easy to implement. Except for things like online casinos and so forth, the LCG is more than adequate for games. Most people without any statistical software would have no chance of telling apart an LCG's output and the output of a real random number generator (e.g. physically tossing a coin over and over).
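For readers who have never seen one, this is roughly what an LCG looks like. It is a minimal sketch, not the game's code: the multiplier and increment are well-known textbook constants that I have assumed purely for illustration, and the class and method names are my own. With a modulus of 2^32 the state can take 4,294,967,296 different values, which is the largest period such a generator can have and is in line with the "more than four billion" figure above.

```python
class LCG:
    """Minimal linear congruential generator: state = (A * state + C) mod M."""

    A = 1103515245  # multiplier (textbook constant, assumed for illustration)
    C = 12345       # increment (textbook constant, assumed for illustration)
    M = 2 ** 32     # modulus: the state is a 32-bit unsigned integer

    def __init__(self, seed):
        self.state = seed % self.M

    def next_state(self):
        """Advance the generator: one multiplication and one addition."""
        self.state = (self.A * self.state + self.C) % self.M
        return self.state

    def rand_below(self, n):
        """Return an integer in the range 0 .. n-1.

        Uses the upper 16 bits of the state, since the low bits of an
        LCG with a power-of-two modulus are the least random.
        """
        high_bits = self.next_state() >> 16  # value in 0 .. 65535
        return (high_bits * n) >> 16


if __name__ == "__main__":
    rng = LCG(seed=2024)
    print([rng.rand_below(100) for _ in range(10)])
```

The same seed always produces the same sequence, which is why the output is called pseudo-random. That determinism is also what lets people confirm exactly which generator a game uses.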
The point is, the odds are correct. If you still feel something is wrong, you are either noticing a flaw of the LCG RNG or (more likely) making some wrong assumptions about how a good RNG is supposed to behave. Being suspicious of the RNG in Civ is about as logical as being suspicious of the dice when playing Monopoly. It sounds more like an excuse for poor planning.
Even when I fight battles at very high odds, like 99%, my units always seem to barely win, and they take more hits than the enemy does even though my units are much stronger!
Another complaint that is frequently made in relation to the odds calculator is how much units get injured even in the highest odds battles. This is a bit unfair, because the default odds calculator does not even try to give you any information about how much damage your unit will take (Advanced Combat Odds, on the other hand, does!). It's not unusual at all to fight battles at high odds (e.g. 95%) where the average hitpoints after battle will be something like 50HP or less. This is a direct result of the fact that combat mechanics force battle outcomes into a "binomial" distribution. Only when units have a lot of first strikes do they have reasonable odds of surviving unscratched. In most other cases you can expect the victor of any battle to take a lot of damage.
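The damage a winner should expect can be worked out in the same way as the odds themselves. The sketch below again assumes a simplified model (a fixed chance per round, no first strikes, a fixed damage per hit) and made-up inputs, and the function names are hypothetical. It gives the chance of taking each possible number of hits given that you won, and from that the average HP left after a win.

```python
from math import comb


def hits_taken_distribution(p_hit, hits_to_win, hits_to_lose):
    """Return (win_chance, dist) where dist[k] is the chance the winner
    took exactly k hits, given that it won."""
    q = 1 - p_hit
    weights = [
        comb(hits_to_win - 1 + k, k) * p_hit**hits_to_win * q**k
        for k in range(hits_to_lose)
    ]
    win_chance = sum(weights)
    return win_chance, [w / win_chance for w in weights]


def expected_hp_after_win(p_hit, hits_to_win, hits_to_lose,
                          damage_taken_per_hit, max_hp=100):
    """Average HP remaining for the winner, counting only battles it wins."""
    _, dist = hits_taken_distribution(p_hit, hits_to_win, hits_to_lose)
    return sum(
        chance * (max_hp - k * damage_taken_per_hit)
        for k, chance in enumerate(dist)
    )


if __name__ == "__main__":
    win_chance, dist = hits_taken_distribution(0.7, 5, 5)
    print(f"Win chance: {win_chance:.1%}")
    print(f"Chance of winning unscratched, given a win: {dist[0]:.1%}")
    print(f"Average HP after a win: {expected_hp_after_win(0.7, 5, 5, 20):.0f}")
```

With these made-up inputs the unit wins about 90% of the time, yet comes through unscratched in fewer than one in five of those wins.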
As for the other complaint - that the enemy always seems to get more hits in - I would mainly put this down to selective memory, or whatever other name people give to that these days. When you win battles with hardly any damage, do you go and check the combat log in disbelief at how well you won? I doubt it. Yet when you lose battles or apparently "barely" win, you are more likely to check how on Earth it happened. In these situations you would have to expect to find that the enemy got in a larger number of hits. An example would be where two full HP units are fighting - yours does 25HP damage per hit, and the other does 15HP per hit. Your unit only needs to land 4 hits to kill the enemy. The enemy needs to land 7. If your unit wins with heavy damage (having taken 5 or 6 hits), then the enemy must have landed more hits than your 4. Even though the odds of the enemy hitting you in each round were low, because you are only checking the combat log after the "bad luck" happened, you aren't really conducting an objective examination of the combat odds.
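The arithmetic in that example, and the selection effect behind it, can be sketched as follows. The per-round hit chance is not given above, so the value used here is an assumption for illustration only, as are the function names. The model is the same simplified one (fixed chance per round, no first strikes).

```python
from math import comb


def hits_needed(hp, damage_per_hit):
    """Number of hits needed to bring hp down to zero or below."""
    return -(-hp // damage_per_hit)  # integer ceiling division


def chance_outhit_given_win(p_hit, your_hits_needed, enemy_hits_needed):
    """Among battles you win, the share in which the enemy landed more
    hits than you did."""
    q = 1 - p_hit
    weights = [
        comb(your_hits_needed - 1 + k, k) * p_hit**your_hits_needed * q**k
        for k in range(enemy_hits_needed)
    ]
    # You always land exactly your_hits_needed hits in a win, so the
    # enemy out-hits you when it lands more than that and still loses.
    outhit = sum(w for k, w in enumerate(weights) if k > your_hits_needed)
    return outhit / sum(weights)


if __name__ == "__main__":
    yours = hits_needed(100, 25)    # your unit deals 25HP per hit
    theirs = hits_needed(100, 15)   # the enemy deals 15HP per hit
    print(f"You need {yours} hits, the enemy needs {theirs}")
    share = chance_outhit_given_win(0.6, yours, theirs)  # 0.6 is assumed
    print(f"Wins in which the enemy landed more hits: {share:.1%}")
```

Those are exactly the wins that look alarming in the combat log, and they are the ones people go and check. Across all of your wins they are only a fraction of the total.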