Solved (more or less):
Combat Damage Dealt = exp[ 3.369311 + (Attacker Strength - Defender Strength) x 0.037637 + U(-.27, .23) ]
where U(-.27, .23) is a random draw from the uniform distribution with bounds -.27 and +.23.
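For anyone who wants to play with the formula, here is a minimal Python sketch of it. The constants are the fitted values above; the function names are just mine for illustration, and I have not applied any rounding or 1-100 clamping since the data doesn't tell us how the game handles that.

```python
import math
import random

BASE = 3.369311                       # fitted base-damage term (log scale)
SLOPE = 0.037637                      # fitted effect per point of strength difference
NOISE_LOW, NOISE_HIGH = -0.27, 0.23   # bounds of the uniform noise term

def damage_range(strength_diff):
    """Return (lowest, noise-free, highest) damage the fitted model allows."""
    centre = BASE + SLOPE * strength_diff
    return (
        math.exp(centre + NOISE_LOW),
        math.exp(centre),
        math.exp(centre + NOISE_HIGH),
    )

def simulate_damage(strength_diff, rng=random):
    """Draw one damage value from the fitted model (no rounding or clamping)."""
    noise = rng.uniform(NOISE_LOW, NOISE_HIGH)
    return math.exp(BASE + SLOPE * strength_diff + noise)

if __name__ == "__main__":
    low, mid, high = damage_range(0)
    print(f"even fight: {low:.1f} to {high:.1f}, noise-free {mid:.1f}")
    print(f"one simulated hit at +10: {simulate_damage(10):.1f}")
```

With evenly matched units the noise-free value is exp(3.369311), a little over 29 damage, and the noise term stretches that to somewhere between roughly 22 and 37.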
Yes, I know it's asymmetrical; one of two things is happening. One, the devs are putting 'boxing gloves' on combat, with a modest negative bias to damage dealt to make units slightly harder to kill without HP-inflation. Two, our modest sample just happens to fall in that part of the sample space, while the underlying distribution is in fact symmetric.
This is based on analysis of the data set provided by OP. The parameter estimates for base damage (3.36...) and relative strength effect (.037...) might tighten up to more reasonable numbers with increased data collection. It's quite possible (based on the standard errors of those estimates) that the base damage factor is off by up to +/- .04 and the strength effect by up to +/- .001. Overall, the model accounts for roughly 90% of the variation within the source data, with the base damage and relative strength effects 'highly significant', or in layman's terms, 'mathematically legit'.
Based on this we'll 'reliably' (in 50%+ of cases) see 1-damage for a Strength Difference of -89.52124 (that is, Attacker Strength 89.5 points below Defender Strength, possibly something like -90) and 100-damage for a Strength Difference of 32.83628 (probably 33). In the source data we only saw four 100-damage cases:
Crossbowman wrecks Scout with a 39-point advantage,
Crossbowman wrecks Scout with a 41-point advantage,
Horseman wrecks Barb Horse Archer with a 40-point advantage (taking 6 damage)
Roman Legion wrecks Slinger with a 40-point advantage (taking 6 damage again; possibly there's a floor on attacker damage for melee?).
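The 1-damage and 100-damage break-even points quoted above come from setting the noise term to zero and solving the formula for the strength difference. A quick sketch of that arithmetic (the function name is mine, not anything from the game):

```python
import math

BASE = 3.369311    # fitted base-damage term (log scale)
SLOPE = 0.037637   # fitted effect per point of strength difference

def strength_diff_for(damage):
    """Strength difference at which the noise-free model gives `damage`."""
    return (math.log(damage) - BASE) / SLOPE

if __name__ == "__main__":
    print(round(strength_diff_for(1), 5))    # about -89.52
    print(round(strength_diff_for(100), 5))  # about 32.84
```

Note that all four 100-damage cases in the data sit at a 39 to 41 point advantage, comfortably past the 33-ish break-even.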
Supporting this idea about damage floors, there is the fact that the lowest Defender damage we saw in the data is 13 (double a 6.5 damage attacker floor, perhaps?). There were seven cases of this in the data; four were District City Centers (versus other District City Centers, strangely), with strength differences ranging from -16 to -22. That said, our data didn't include any "terrible generalship", such as late-game units getting attacked by scouts and other trash, so it's not surprising that we didn't see any 1-damage pew-pew nonsense.
The range on the uniform distribution is based on the residuals of the model fit using the parameter estimates above. Those residuals have quartiles:
0%: -0.26960892
25%: -0.10573067
50%: 0.01500539
75%: 0.11655122
100%: 0.22684554
Those are pretty flat (as a uniform distribution should be) and supported by the Quantile/Quantile Plot below (against the aforementioned Uniform distribution). A nice diagonal line between the bottom left and top right corner of the plot indicates a 'good' fit.
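If you want to eyeball the flatness without the plot, you can line the residual quartiles up against what a U(-.27, .23) would give at the same probabilities. A rough sketch (the observed numbers are the ones listed above; the helper name is mine):

```python
LOW, HIGH = -0.27, 0.23   # bounds of the fitted uniform noise term

# Residual quantiles reported above, keyed by probability.
observed = {
    0.00: -0.26960892,
    0.25: -0.10573067,
    0.50: 0.01500539,
    0.75: 0.11655122,
    1.00: 0.22684554,
}

def uniform_quantile(p, low=LOW, high=HIGH):
    """Quantile of a uniform distribution on [low, high] at probability p."""
    return low + p * (high - low)

# Observed minus theoretical at each probability.
gaps = {p: obs - uniform_quantile(p) for p, obs in observed.items()}

if __name__ == "__main__":
    for p, obs in observed.items():
        print(f"{p:.2f}  observed {obs:+.4f}  uniform {uniform_quantile(p):+.4f}  gap {gaps[p]:+.4f}")
```

The largest gap is about 0.04, at the 25% mark.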
And below we have observed and theoretical ranges of Attacker Damage:
...and Defender Damage:
Here we have a plot of the source data, attacker and defender damage shown separately. Notice ranged units chilling at the bottom.
And here we have paired damage outcomes from our source data - notice how few outcomes we see above the 'tie' line (the line indicates attacker damage equal to defender damage, above the line indicates attacker damage higher than defender damage), showing at least that the AI and our intrepid player favor 'winning' battles. (Though I think we all agree that the A.I. could use a little more of the 'lose the battle, win the war' maxim.)
In any case, more data collection yields better estimates, but I think these give a pretty good understanding of the damage calculation.
Source: Statistician is my day job.