I agree that that's how it works in Civ 5, but you're phrasing it as something that is universally true, and it doesn't have to be. To give an example: let's reduce the free basic yields in Expansions and give the Palace +6 Production.
What you end up with is a scenario where every new city develops a lot slower than the capital. Now you put small, flat yield bonuses vs. big growth bonuses into Policies, make Internal Trade Routes produce Food and some Production at the same time, and limit their number so that a wide empire cannot come close to supplying all of its cities. There you go: you have a tall vs. wide system that does not rely on penalties per city and therefore doesn't discourage you from founding smaller Cities in suboptimal spaces later on.
After the early part of the game you'd be able to use Gold to get a new Expansion running relatively quickly. In the midgame, balancing wide empires against tall empires is really just a matter of creating bonuses that are too specialized to be useful for the other type of empire, and tuning them by hand so that both can somewhat compete with each other.
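To put rough numbers on the Palace example, here is a quick Python sketch. Only the +6 Palace Production comes from the example above; the base yield, the per-route bonus and the route cap are made-up values for illustration, and none of this reflects actual Civ 5 rules or data.

```python
# Toy model of the hypothetical "no per-city penalty" system.
# Only PALACE_PRODUCTION comes from the post; the rest are assumptions.
PALACE_PRODUCTION = 6      # flat bonus, capital only (from the example)
BASE_CITY_PRODUCTION = 1   # assumed: reduced free yield every city gets
ROUTE_PRODUCTION = 2       # assumed: Production from one internal trade route
MAX_TRADE_ROUTES = 3       # assumed: empire-wide cap on internal routes


def city_production(is_capital: bool, has_route: bool) -> int:
    """Production of a single city under the assumed numbers."""
    total = BASE_CITY_PRODUCTION
    if is_capital:
        total += PALACE_PRODUCTION
    if has_route:
        total += ROUTE_PRODUCTION
    return total


def empire_production(num_cities: int) -> list[int]:
    """Per-city Production, capital first.

    Internal routes are handed to the first expansions until the cap
    is reached; later expansions get only the reduced base yield.
    """
    if num_cities < 1:
        raise ValueError("an empire needs at least a capital")
    expansions = num_cities - 1
    routed = min(expansions, MAX_TRADE_ROUTES)
    yields = [city_production(is_capital=True, has_route=False)]
    for i in range(expansions):
        yields.append(city_production(is_capital=False, has_route=i < routed))
    return yields


if __name__ == "__main__":
    for n in (1, 2, 6):
        per_city = empire_production(n)
        print(n, per_city, "total:", sum(per_city))
```

With these placeholder values a two-city empire gets `[7, 3]` and a six-city empire gets `[7, 3, 3, 3, 1, 1]`. The later cities are weak, but founding them never lowers what the existing cities produce, which is the whole point of not using per-city penalties.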
Well, a similar system already exists in Civ 5: the "XY thinks we're building cities too quickly" modifier.