Both hex grids and square grids (and all discrete grids for that matter) have the property that there is some optimal direction of travel, the direction in which one can move the fastest. On a square grid where diagonal moves are allowed, the optimal directions are the 4 diagonals, since a diagonal step covers more ground than an orthogonal one. On a hex grid, the optimal directions are the perpendiculars to the 6 edges.
Consider the distance that is moved in a single move in one of these optimal directions. We'll call that distance d. If we then wanted to move some very large distance, l, in a direction which happened to be one of our optimal directions, then we could do so in l/d turns.
Suppose instead that we wish to move in a different, non-optimal direction. We'll call the angle between this new direction and the nearest optimal direction theta.
In order to move the same distance, l, in the new direction, one must expend more moves. We can think of movement along the optimal direction as being perfectly efficient in that it takes 1.0 * l/d turns to travel l distance, whereas movement in other directions takes E * l/d turns with E being some number greater than 1.0. E represents the inefficiency of moving in the new direction.
This inefficiency is a function of the angle theta. When theta = 0, we are moving directly along one of the optimal directions, so E = 1.0. The inefficiency climbs to its maximum value when the required direction lies directly between two optimal directions. This graph shows the proportional number of moves required to travel a set distance as theta varies.
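The text does not give a formula for E, but one can be sketched under a stated assumption: that any journey is made by mixing moves along the two optimal directions on either side of the target direction. If those two directions are an angle alpha apart (90 degrees on the square grid, 60 degrees on the hex grid), splitting the journey into those two components gives E = (sin(alpha - theta) + sin(theta)) / sin(alpha). The function name `inefficiency` and the constants below are illustrative, not from the original.

```python
import math


def inefficiency(theta, alpha):
    """Return E for a direction theta radians off the nearest optimal direction.

    alpha is the angle between neighbouring optimal directions.
    Assumes the path mixes moves along the two optimal directions that
    bracket the target direction, so theta must lie in [0, alpha / 2].
    """
    if not 0.0 <= theta <= alpha / 2.0:
        raise ValueError("theta must lie between 0 and alpha / 2")
    return (math.sin(alpha - theta) + math.sin(theta)) / math.sin(alpha)


# Spacing between optimal directions on each grid (assumed from the text:
# 4 diagonals on the square grid, 6 edge perpendiculars on the hex grid).
SQUARE_SPACING = math.pi / 2
HEX_SPACING = math.pi / 3

# Worst case: the target direction lies exactly between two optimal directions.
square_worst = inefficiency(SQUARE_SPACING / 2, SQUARE_SPACING)
hex_worst = inefficiency(HEX_SPACING / 2, HEX_SPACING)

print(f"square grid worst-case E: {square_worst:.4f}")
print(f"hex grid worst-case E:    {hex_worst:.4f}")
```

Under this assumption the worst cases come out to sqrt(2) for the square grid and 2/sqrt(3) for the hex grid, and E is 1.0 at theta = 0 on both.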
So the difference between hex and square grids is really just one of degree (at least in the case of large open areas, where journeys on the square grid can always be made with a combination of diagonal moves). On a square grid, the cost of moving can be up to about 41% higher depending on the direction of travel, whereas on the hex grid the least efficient directions cost only about 15% more.
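These percentages can also be checked by counting moves directly on each grid rather than using trigonometry. The sketch below makes several assumptions: the square grid allows 8-way movement, so the move count is the larger of the two coordinate offsets; the hex grid uses axial coordinates (q, r) with cell centres one unit apart; and the map is open with no obstacles. All function names are illustrative.

```python
import math


def square_moves(dx, dy):
    # With diagonal moves allowed, each move can change both coordinates,
    # so the move count is the larger of the two offsets.
    return max(abs(dx), abs(dy))


def hex_moves(q, r):
    # Standard move count between hex cells in axial coordinates.
    return (abs(q) + abs(r) + abs(q + r)) // 2


def hex_to_xy(q, r):
    # Position of a hex centre, assuming neighbouring centres are 1 unit apart.
    return (q + r / 2.0, r * math.sqrt(3) / 2.0)


def moves_per_unit_distance(moves, x, y):
    return moves / math.hypot(x, y)


n = 1000  # a long journey, so rounding to whole cells does not matter

# Square grid: best along a diagonal, worst along an axis.
square_best = moves_per_unit_distance(square_moves(n, n), n, n)
square_worst = moves_per_unit_distance(square_moves(n, 0), n, 0)

# Hex grid: best straight towards a neighbour, worst halfway between two.
hex_best = moves_per_unit_distance(hex_moves(n, 0), *hex_to_xy(n, 0))
hex_worst = moves_per_unit_distance(hex_moves(n, n), *hex_to_xy(n, n))

square_ratio = square_worst / square_best
hex_ratio = hex_worst / hex_best

print(f"square grid: worst direction costs {100 * (square_ratio - 1):.0f}% more")
print(f"hex grid:    worst direction costs {100 * (hex_ratio - 1):.0f}% more")
```

The ratios are sqrt(2) (roughly 1.41) for the square grid and 2/sqrt(3) (roughly 1.15) for the hex grid, which is where the 41% and 15% figures come from.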
While neither grid can really be said to be perfectly realistic, the greater uniformity of the hex grid is why it is generally used for wargames.