If you're CPU bound in pure-python computations then the GIL (Global Interpreter Lock) will cause you to only make use of one core. This is intentional: threads are hard, processes are simple.
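To make the point above concrete, here is a minimal sketch, assuming a standard CPython build with the GIL enabled. The names `count_primes` and `run_with` are made up for illustration; the only thing that changes between the threaded and process-based runs is the executor class.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes(limit):
    """Pure-Python, CPU-bound work: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_with(executor_cls, limits, workers=4):
    # Same code path for threads and processes; only the executor class differs.
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))
```

Calling `run_with(ThreadPoolExecutor, [200_000] * 4)` and `run_with(ProcessPoolExecutor, [200_000] * 4)` from under an `if __name__ == "__main__":` guard should give identical results. With the GIL in place the threaded call should keep roughly one core busy, while the process-based call can spread across several.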
I know that, except most web application servers spend most of their time in I/O: sockets, database calls, etc. Threads are not hard; wanton abuse of shared state and concurrent modification of that state are what's hard to get right. Treat a threaded app like a message-passing app and you have a much simpler life.
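A rough sketch of the message-passing style described above, using only the stdlib `queue` and `threading` modules. The `worker` and `square_all` names are hypothetical; the idea is just that threads exchange messages through queues and never touch a shared mutable structure directly.

```python
import queue
import threading


def worker(inbox, outbox):
    # The worker owns no shared state: it only reads messages and posts replies.
    while True:
        item = inbox.get()
        if item is None:  # sentinel message: shut down
            break
        outbox.put((item, item * item))


def square_all(values, num_workers=4):
    values = list(values)
    inbox, outbox = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(inbox, outbox))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for v in values:
        inbox.put(v)
    for _ in threads:
        inbox.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    # All workers have exited, so exactly one reply per input is waiting.
    return dict(outbox.get() for _ in values)
```

`queue.Queue` does its own locking, so the application code here needs no explicit locks at all.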
I'm intimately familiar with the limitations of python threads - and right now I'm the maintainer of the multiprocessing module, which is the process-based "reply" to the stdlib threading module.
I wasn't trying to claim you don't understand Python, simply that the only way to be CPU bound on multiple cores with a single Python process (as is claimed) is to be running non-pure-Python code that releases the GIL before doing CPU-intensive work.
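As one illustration of non-pure-Python code that releases the GIL, here is a small sketch using the stdlib `hashlib` module, which is implemented in C and is documented to release the GIL while hashing data larger than 2047 bytes. The helper names `digest` and `digest_all` are made up for this example, and how much parallel speedup you actually see will depend on the machine and the build.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(data):
    # The hashing runs in C; for large buffers the GIL is released while it
    # works, so other threads can run Python or C code in the meantime.
    return hashlib.sha256(data).hexdigest()


def digest_all(buffers, workers=4):
    # Plain threads in a single process; any parallelism comes from the
    # C code dropping the GIL, not from the Python layer.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(digest, buffers))
```

Swapping `digest` for a pure-Python loop over the same bytes would put the threads back to taking turns on the GIL.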