There are actually not that many calls over the DLL boundary that are problematic. The main issue is the few that pass or return vectors, CvString and similar types.
IIRC Koshling tried this early last year (at least the part about moving the DLL to C++11) and found that without a LOT of work it wasn't going to go anywhere.
The shim DLL might work, but given that it would, if I'm understanding correctly, deallocate everything the exe sends it, reallocate it in a manner that the C++11 DLL would be OK with, and then forward it to another interface, I can see performance issues here. That, and the additional issue of how to handle Python calls to and from the DLL. Another issue would be whether or not the C++11 debugger chokes when you try to look at something going on in the shim.
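To make the shim idea a bit more concrete, here is a rough sketch of what one forwarding layer could look like. All names in it (newDllSumValues, newDllGetName, shimSumValues, shimGetName) are made up for illustration and are not actual Civ4 exports. It assumes the shim is built with the old compiler, so its std::vector and std::string layouts match what the exe expects, and that the new DLL exposes only plain pointers and sizes. Both sides are shown in one file here just so it compiles on its own.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// --- Hypothetical flat interface of the new (C++11) DLL ---
// Only raw pointers and sizes cross the boundary, never STL objects.
extern "C" int newDllSumValues(const int* values, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

// Copies a name into a caller-owned buffer (truncating if needed) and
// returns the full length required, not counting the terminator.
extern "C" std::size_t newDllGetName(char* buffer, std::size_t capacity) {
    static const char name[] = "Example";
    const std::size_t length = sizeof(name) - 1;
    if (buffer != 0 && capacity > 0) {
        const std::size_t n = length < capacity - 1 ? length : capacity - 1;
        std::memcpy(buffer, name, n);
        buffer[n] = '\0';
    }
    return length;
}

// --- Hypothetical shim functions (built with the exe's compiler) ---
// Input case: the vector is only read, so the shim can hand over a pointer
// to its storage without copying or reallocating anything.
int shimSumValues(const std::vector<int>& values) {
    return newDllSumValues(values.empty() ? 0 : &values[0], values.size());
}

// Return case: the shim asks for the length, lets the new DLL fill a buffer
// the shim owns, then builds the string with the old runtime's allocator.
std::string shimGetName() {
    std::vector<char> buffer(newDllGetName(0, 0) + 1);
    newDllGetName(&buffer[0], buffer.size());
    return std::string(&buffer[0]);
}
```

If something like this holds up, read-only arguments would not need the deallocate/reallocate step at all, and only returned containers would cost one extra copy. Whether that matches what the exe really does with each call would have to be checked case by case.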
The benefits would probably be worth it though. Last year I tested one of alberts DLLs compiled with the Intel C++ compiler and turns went almost twice as fast.
It seems like several of the programmers at Firaxis were aware of the issues of passing STL classes over DLL boundaries and some were not.
The calls into and out of the DLL are not that relevant to performance in general, because the AI turns happen almost entirely inside the DLL.
Regarding Python: the calls from Python to the DLL do not touch the exe at all, but currently they do go through a Boost Python DLL. In my tests I bypassed that by statically linking a new version of Boost Python, which then communicates directly with Python24.dll.