The resolution cap in the article is due to Netflix's DRM, but current VR movie theater/media playback applications let you load your own media.
When you do this with HD content, it actually looks quite good despite the limited resolution of the HMD itself, thanks to the subtle movements of your POV. The in-game camera tracked by your head is never perfectly still, so the source-media pixels you actually see are constantly being resampled and blended differently.
What I meant is that you need a certain number of pixels just to fully represent the 1080p video, and if you can see "extra space" in the VR world around the screen, then you need the 1080p video plus that extra space in pixels, so the display should probably be at least 4k.
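To put rough numbers on that, here is a back-of-the-envelope sketch. The field-of-view figures are my own assumptions (a virtual screen spanning 48 degrees of a 100-degree headset view), not measurements of any real HMD, and it pretends pixels are spread evenly across the view, which real lenses don't do.

```python
import math


def required_panel_width(video_width, screen_fov_deg, hmd_fov_deg):
    """Estimate the horizontal panel pixels needed per eye.

    Assumes (hypothetically) a uniform pixels-per-degree density across
    the whole field of view, ignoring lens distortion.
    """
    # Density needed so the virtual screen gets one panel pixel per video pixel.
    pixels_per_degree = video_width / screen_fov_deg
    # The same density has to hold across the entire headset field of view.
    return math.ceil(pixels_per_degree * hmd_fov_deg)


# Assumed values: 1920-wide video on a screen spanning 48 of 100 degrees.
print(required_panel_width(1920, 48, 100))  # 4000
```

Shrink the virtual screen to 40 degrees and the same math asks for 4800 pixels across, so the answer depends heavily on how much of your view the screen fills.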
You need a certain number of pixels to represent the 1080p video in one particular frame. When the next frame comes along, your head has moved enough that you'll be seeing a different subset of those total pixels. At sufficient framerates (and head-tracking rates), this actually does a pretty good job of approximating the full resolution of the imagery, especially when it happens 2 or 3 times per source video frame, as can be the case with NTSC/PAL content.
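A toy model of that effect, as a sketch only: it treats one row of the display as point-sampling the source row, with a hypothetical per-frame offset standing in for head movement. A real compositor filters and blends rather than point-samples, so this only illustrates the "different subset each frame" idea.

```python
def sampled_indices(source_width, display_width, offset):
    """Source columns hit by one display row for a given sub-pixel offset.

    Assumes source_width is a multiple of display_width and that each
    display pixel point-samples exactly one source column.
    """
    step = source_width // display_width
    return {(i * step + offset) % source_width for i in range(display_width)}


def coverage(source_width, display_width, offsets):
    """Fraction of source columns seen at least once across the given frames."""
    seen = set()
    for offset in offsets:
        seen |= sampled_indices(source_width, display_width, offset)
    return len(seen) / source_width


# One static frame: 480 display pixels only ever show a quarter of 1920 columns.
print(coverage(1920, 480, [0]))           # 0.25
# Four frames with slightly different head positions touch every column.
print(coverage(1920, 480, [0, 1, 2, 3]))  # 1.0
```

In practice head motion won't produce such tidy offsets, so coverage builds up less evenly, but the more display frames you get per source frame, the more of the source you end up seeing.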
Even without the DRM resolution cap, you need somewhere north of 4096x4096 pixels per eye to adequately represent a 1920x1080 display. VR is not going to be as crisp as a normal monitor for quite a long time.