It shouldn't hurt your framerate much, since the font rendering only needs to happen once every few seconds. A new subtitle can be rendered once, kept in memory as a texture, and then just blended by the GPU as pixels on every frame it is shown. The titles are also known ahead of time, so it's possible to set up a pipeline that renders them in advance, with no sudden increases in processing load.
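
To make the idea concrete, here is a rough sketch of such a pipeline, not a real renderer. Everything in it is made up for illustration: `Cue`, `SubtitleCache`, `render_text`, and the `lookahead` and `budget` parameters are hypothetical names, and `render_text` just returns bytes where a real version would rasterise the font and upload a GPU texture. It assumes the cue list is available up front.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str


def render_text(text: str) -> bytes:
    # Stand-in for the expensive font rasterisation step (hypothetical).
    # A real version would return pixel data to upload as a GPU texture.
    return text.encode("utf-8")


class SubtitleCache:
    def __init__(self, cues, lookahead=5.0, render=render_text):
        self.cues = sorted(cues, key=lambda c: c.start)
        self.lookahead = lookahead  # how far ahead to pre-render, in seconds
        self.render = render
        self.textures = {}          # cue -> cached "texture"
        self.render_calls = 0       # how often the expensive step ran

    def prefetch(self, now, budget=1):
        # Render at most `budget` upcoming cues per frame to spread the cost.
        for cue in self.cues:
            if budget == 0:
                break
            if cue.end < now or cue.start > now + self.lookahead:
                continue
            if cue not in self.textures:
                self.textures[cue] = self.render(cue.text)
                self.render_calls += 1
                budget -= 1

    def evict(self, now):
        # Drop textures for cues that have already finished.
        for cue in [c for c in self.textures if c.end < now]:
            del self.textures[cue]

    def active_texture(self, now):
        # Per-frame lookup only: no font rendering happens here.
        for cue in self.cues:
            if cue.start <= now < cue.end:
                return self.textures.get(cue)
        return None


def simulate(cues, duration=10.0, fps=30):
    # Fake frame loop: returns the render count and the textures shown, in order.
    cache = SubtitleCache(cues)
    shown = []
    for frame in range(int(duration * fps)):
        now = frame / fps
        cache.prefetch(now)
        cache.evict(now)
        tex = cache.active_texture(now)
        if tex is not None and (not shown or shown[-1] != tex):
            shown.append(tex)
    return cache.render_calls, shown
```

Calling `simulate` with two cues over 300 simulated frames runs the expensive `render_text` step only twice; every other frame just looks up a cached texture. The `budget` cap is one way to keep several upcoming titles from all being rendered in the same frame.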