Napoleon was the Best General Ever, and the Math Proves it.

Cutlass

Ranking Every* General in the History of Warfare
*Almost every

When Africanus asked who, in Hannibal’s opinion, was the greatest general, Hannibal named Alexander… as to whom he would rank second, Hannibal selected Pyrrhus… asking whom Hannibal considered third, he named himself without hesitation. Then Scipio broke into a laugh and said, “What would you say if you had defeated me?”— Livy
Like Hannibal, I wanted to rank powerful leaders in the history of warfare. Unlike Hannibal, I sought to use data to determine a general’s abilities, rather than specific accounts of generals’ achievements. The result is a system for ranking every prominent commander in military history.

The Method
Inspired by baseball sabermetrics, I opted to use a system of Wins Above Replacement (WAR). WAR is often used as an estimate of a baseball player’s contributions to his team. It calculates the total wins added (or subtracted) by the player compared to a replacement-level player. For example, a baseball player with 5 WAR contributed 5 additional wins to his team, compared to the average contributions of a high-level minor league player. WAR is far from perfect, but provides a way to compare players based on one statistic.

I adopted WAR to estimate a given military tactician’s contributions beyond or below an average general. My model, which I explain below, provides an estimate for the performance of an average general in any given circumstances. I can then evaluate a general’s quality based on how much they exceeded or fell short of a replacement general in the same circumstances (assuming a replacement general would perform at an average level). In other words, I would find the generals’ WAR, in war.
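The "WAR in war" idea described above can be sketched in a few lines. This is an illustrative sketch only, not the article's actual model (whose details aren't given here): assume some model supplies an expected win probability for an average general in each battle's circumstances, and credit the general with the difference between actual and expected outcomes.

```python
# Hedged sketch of a WAR-style rating for generals (not the article's
# actual implementation). Each battle is a pair (actual, expected):
#   actual   = 1.0 for a win, 0.5 for a draw, 0.0 for a loss
#   expected = modeled win probability for an average general
#              facing the same circumstances

def war(battles):
    """Sum of (actual - expected) over a general's battles: the number
    of wins added beyond what an average general would be expected to
    produce in the same situations."""
    return sum(actual - expected for actual, expected in battles)

# A general who won three even-odds battles and lost a fourth:
sample = [(1.0, 0.5), (1.0, 0.5), (1.0, 0.5), (0.0, 0.5)]
print(war(sample))  # 1.0 wins above average
```

Where the `expected` values come from is the whole game, of course; the article builds them from Wikipedia battle-box data, which is where the criticism below comes in.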
https://towardsdatascience.com/napo...eral-ever-and-the-math-proves-it-86efed303eeb

I don't know how well this article really holds up. But it's an interesting take.
 
Hasn't this article made the rounds here before? Or am I thinking of something else?

Anyway, it was the subject of a brief but intense discussion shortly after its publication. The caveats list at the end of the article addresses it in some detail. Something to keep in mind is that the article was explicitly designed as an "interesting thought experiment" rather than a scholarly contribution. That's fair! Let's see if it works based on that.

First, the concept. The author used a data set derived from Wikipedia's collection of battles and casualties taken. This is unquestionably the easiest set to analyze and convert into usable data, but it is a deeply flawed set, as the author acknowledges later. It is extremely incomplete, with some of the incompleteness stemming from geographical bias (engagements outside Europe, and to a lesser extent outside modern America, are hard to find). It also does not compare like with like, because Wikipedia's definition of a "battle" is highly mutable and inconsistent. This is mostly a problem with the modern data. Starting in 1864, it becomes harder to tease out individual battles because of the changing nature of warfare. This is directly reflected in the data, as the author notes that his system seems to rate modern generals much lower; he does not, however, seem to understand or address why that is the case.

The limitations of the data are real and severe, but the underlying concept - rating tactical acumen based on an up/down number for victory - is also problematic. Wikipedia's casualty numbers are often wrong and could be improved through further research, but the fact that casualties were not incorporated into the model at all is a severe flaw. (I would also suggest normalizing casualties by era-specific rates, because the changing nature of warfare has led to dramatically different average casualty numbers over the last three millennia.) The up/down number for victory is also tricky, because what a victory actually meant is hard to define in many cases.
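The era-normalization suggestion above is easy to make concrete. A minimal sketch, with invented numbers: convert raw casualty rates into z-scores computed within each era, so that "bloody for an ancient battle" and "bloody for a modern battle" land on the same scale.

```python
from statistics import mean, stdev

def normalize_by_era(casualty_rates, era_labels):
    """Convert raw casualty rates into era-relative z-scores, so a
    'bloody' ancient battle and a 'bloody' modern one are comparable.
    Assumes at least two battles per era (stdev needs 2+ samples)."""
    by_era = {}
    for rate, era in zip(casualty_rates, era_labels):
        by_era.setdefault(era, []).append(rate)
    stats = {era: (mean(v), stdev(v)) for era, v in by_era.items()}
    return [(rate - stats[era][0]) / stats[era][1]
            for rate, era in zip(casualty_rates, era_labels)]

# Invented example: 10% vs 30% losses in antiquity, 20% vs 40% in
# the modern era; within each era the "bad" battle scores the same.
z = normalize_by_era([0.1, 0.3, 0.2, 0.4],
                     ["ancient", "ancient", "modern", "modern"])
```

This is only one possible normalization scheme; anything that conditions on era (or theater, or army size) would address the same objection.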

After each general receives a rating for each battle, the author totals up the ratings for the battles. This is after the fashion of WAR (Wins Above Replacement, the baseball stat) and it is the most obviously flawed aspect of the whole thing. WAR is a counting stat; longer careers at a reasonably high level are generally better than short careers at an extremely high peak, because the total number of wins generated is higher. This makes sense for baseball, because "health is a skill" - and a quite relevant one, for managers wanting to rack up wins and make the postseason - but it does not translate over to warfare very well! Generals don't have 162-battle seasons. Some of them have short careers by virtue of mostly leading during a time of peace. Some of them have short careers in the Wikipedia-derived data set because the data set isn't very good. Some of them have short careers because their style of warfare did not privilege set-piece battles. As we will see, this leads directly to some of the more counterintuitive and flawed results of the thought experiment.

Finally, there is the concept of the "replacement level", which the author admits isn't very useful in the caveats after spending entirely too much time on it in the article. I would agree that it's not very useful. First of all, he doesn't really use the equivalent of a "replacement level player" as in baseball, but rather the mean player skill, which is very much not the same thing. (Replacement-level players are significantly worse than the mean player in MLB, because they're basically the guys left over after the managers have already assembled their teams.) That's mostly just a nomenclature issue, though. The bigger issue is all of the data that the Wikipedia battle boxes don't actually cover. They don't cover technological differences, differences in troop quality or training, differences in firepower available (usually), differences in terrain effects, and so on. The author admits these omissions and suggests them for further investigation, and points out that the model's conclusions are sometimes difficult to understand because of these omissions!
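The replacement-vs-mean distinction above can be sketched numerically. In MLB, replacement level is conventionally pegged well below the league mean (roughly the freely available talent on waivers), not at the mean. A minimal sketch, where the 20th-percentile cutoff is an arbitrary illustrative choice, not the actual MLB convention:

```python
from statistics import mean

def baselines(skills, replacement_percentile=0.2):
    """Contrast the mean-skill baseline (what the article actually
    used) with a below-mean replacement baseline (what 'replacement
    level' means in baseball). The 20th percentile is an invented
    stand-in for 'freely available talent'."""
    ordered = sorted(skills)
    idx = int(replacement_percentile * (len(ordered) - 1))
    return {"mean": mean(ordered), "replacement": ordered[idx]}

# Invented per-general skill levels (e.g. expected win rates):
b = baselines([0.30, 0.40, 0.45, 0.50, 0.55, 0.60, 0.70])
# The replacement baseline sits below the mean, so "wins above
# replacement" and "wins above average" are different quantities.
print(b)
```

Measuring against the mean systematically understates everyone's value relative to a true replacement baseline, which is part of why the nomenclature complaint isn't purely pedantic.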

On to specific conclusions, then:

Unsurprisingly, Napoleon Bonaparte ranks at the top. There's an obvious reason for this. Napoleon's career coincided with an era in warfare that strongly favored fighting set-piece battles and during which set-piece battles were very thoroughly recorded, meaning he has a very large data set; for counting stats like WAR, that rates him very highly indeed. Only tactical engagements are considered, meaning that Napoleon's most egregious military failures, in Spain, the Levant, and Russia, don't enter the equation. But there's nothing wrong with this conclusion. In the highly subjective discussion of best generals, Napoleon is certainly a valid answer. There is a reason Clausewitz labeled him "the god of war". Napoleon possessed remarkable gambler's instincts and an excellent eye for terrain, and he was an aggressive troop leader at the tactical and operational levels. You could do worse than call him the greatest tactician of all time, even if his gambling did eventually catch up to him. The author touts Napoleon's extreme outlier status as a mark of his troop-leading ability, but I would suggest that it was rather a combination of troop-leading ability and these other factors.

As you can see, the facts that a) we're using a counting stat and b) focusing on tactical engagements define the nature of the conclusions. Other warlords that rank highly according to these criteria are Caesar and Alexander, both of whom fought a lot of battles and won a lot of them, and both of whom were sufficiently famous and whose careers were well-recorded enough to result in lots of Wikipedia battleboxes. No surprises there. They are also highly regarded commanders from military history, so again: nothing counterintuitive about that.

One of the conclusions that I have mixed feelings about is the model's extremely low rating of Robert E. Lee. The author suggests that, although Lee was "saddled" with severe disadvantages, his reputation as a tactician is "likely undeserved". I fully agree that there is a particular group of individuals who have inflated Lee's abilities beyond all reasonable levels. Many recent scholars of the war have chosen to focus on Lee's Virginia-centric military policy (not reflected in the ratings) and the high relative casualties his army absorbed (also not reflected in the ratings). But the notion of a "replacement general" here is problematic. Given those disadvantages, how well would a replacement-level general - the likes of Johnston or Beauregard, in this case - have done? Probably not very well! The Army of Northern Virginia was repeatedly saved from total disaster in the Overland Campaign by Lee's rapid responses and tactical maneuvers. He devised audacious plans that often redounded to his advantage. It is extremely hard to imagine another likely candidate in the same scenario, with the same army and the same opponents, succeeding even half as well as Lee. This conclusion points up some of the goofier omissions of the model - casualties on both sides not considered, and the deeply flawed concept of the "replacement level".

A conclusion that I believe to be totally unfounded is the model's conclusion on Pyrrhos of Epeiros, an ancient general mentioned in the Hannibal quotation from the beginning. The author finds it difficult to understand why Pyrrhos was rated so highly by Hannibal and suggests that the "Pyrrhic victory" - absorbing so many casualties that even victory becomes a defeat - should mean he ought to be rated even lower. This is because the data set for Pyrrhos is so disastrously incomplete! Only his three battles against the Romans are considered in the model - not his conquests of Macedonia, nor his war in Sicily, nor his battles against the various other Successors in southern Greece. Of those three battles, one is correctly rated a victory for Pyrrhos (Battle of Herakleia), one is correctly rated a defeat (Battle of Beneventum), and one is inexplicably described as a defeat despite being a victory (Battle of Asculum), presumably due to Roman propaganda that Pyrrhos' army absorbed too many casualties despite that, uh, not being the case. Pyrrhos held the field at Asculum and inflicted significantly higher casualties on the Roman army than his own army absorbed, yet Asculum often goes down as a "Pyrrhic victory" (read: defeat) due to Roman efforts to argue that Pyrrhos took so many casualties that he was unable to accomplish his goals. Even if this were true (it isn't), it's an odd time to suddenly decide that operational and strategic concerns matter in this alleged discussion of tactical prowess and only tactical prowess.

The author touts Moshe Dayan as one of the greater modern military leaders, and conversely points out that George S. Patton Jr. was not particularly highly rated. This makes a lot of sense when you look at the author's data, in which Patton is credited with "Operation Torch" (as a single battle) and "Battle of the Bulge" (as a single battle??!??!?!?!?) and...that is it. Conversely, Moshe Dayan is credited with...the 1948, 1956, 1967, and 1973 wars. In their entirety. The data set sucks. Even going by the limitations of the nature of warfare - Napoleon's era privileging tactical engagements defined as "battles" and modern "battles" usually being more like entire campaigns - this is a terrible list. The author suggests that modern generals are underrated in the model because of the changing nature of warfare and the decreasing participation of generals in various battles. The changing nature of command in an era of million-man armies is certainly a relevant consideration, but not even remotely the whole picture. I would also point out the changing nature of battle itself (transitioning to periods of more or less constant fighting for much longer times on much larger fronts) and also the fact that the data set apparently thinks that George Patton didn't command troops in Sicily, Normandy, Lorraine, or southern Germany. Frankly, the generals rating system based on tactical considerations for discrete, tactical-only engagements falls apart after about 1880 or so, and arguably before that (the Overland Campaign, for example).

Anyway. The article. The concept isn't that bad, but even as a thought experiment it isn't very useful because of a highly flawed data set and the problematic way that "replacement level" is described. There aren't really many valid conclusions here that military historians haven't already reached through other means.

Usually, authors like this one attempt to employ data science and statistics to add rigor to a subject in which rigor was severely lacking. That's a laudable goal! I agree that some aspects of military history could use more rigor and closer attention to the numbers (although only the worst of pop-historians ever try to avoid statistics entirely). However, this attempt generally applied less rigor than is now standard in the field. Even on its own terms, I can't help but think that it wasn't a very good attempt.
 
Napoleon lost battles, not to mention ruining his army in Russia. Others (e.g. Alexander) never lost a single battle :)

Austerlitz>Gaugamela, fite me


@Dachs we should separate commanders into tiers. The highest tier is Napoleon-tier and the lowest tier is Braxton Bragg-tier
 
Moshe Dayan looked like some sort of contemporary Jewish pirate with his eyepatch. He deserves points for that alone.

Thought experiments can be fun, and they often provoke insight and discussion, but don't read too much into this. Dachs nailed it. After all, the very concept of ranking generals is fairly stupid in the first place; how do you compare Napoleon to Giap, for instance? They fought in different wars, at different times, using different strategies against different foes, etc.
 