You're talking about support for checkpointing every epoch, but a 30 second shutoff would require checkpointing every N mini-batches, which is not a typical use for checkpointing.
My point is, you would have to write a nontrivial amount of code to get this working. It would not work out of the box with Keras right now, for example.