Reverse typesetting: reflowing page layouts when you don't know the typesetting structure, e.g. a scanned physical book or a PDF paper. The approach is naive rules-based heuristics over the dimensions of bounding boxes and the gaps between them. The point is to reflow pages so they can be resized for e-ink readers. (Specifically, the size of reader that fits in my pocket, which I carry around. User #1 is me.) I'm building it in Common Lisp and targeting an Emacs mode for interactive execution with manual feedback.
You're inferring the structure of the document from the printed result. If typesetting takes a set of layout directives and outputs a page, this is taking a finished page and guessing what layout directives could create it. Then you can take that inferred structure and reflow the page in a new layout.
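To make the "guess the structure from the page" idea concrete, here is a minimal sketch of a gap-based heuristic. It is written in Python for brevity rather than the project's Common Lisp, and everything in it is an assumption for illustration: the `Box` type, the function names, and the rule that a vertical gap larger than 1.5 times the median line gap starts a new paragraph. It assumes bounding boxes have already been extracted from the page image.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Box:
    """Hypothetical bounding box; y grows downward, as in image coordinates."""
    x: float
    y: float
    w: float
    h: float


def group_lines(boxes):
    """Group boxes into lines: a box joins the current line if its top
    edge falls above the current line's bottom edge."""
    lines, bottom = [], float("-inf")
    for b in sorted(boxes, key=lambda b: b.y):
        if b.y < bottom:
            lines[-1].append(b)
            bottom = max(bottom, b.y + b.h)
        else:
            lines.append([b])
            bottom = b.y + b.h
    # Within a line, read left to right.
    return [sorted(line, key=lambda b: b.x) for line in lines]


def split_paragraphs(lines, factor=1.5):
    """Start a new paragraph wherever the gap between consecutive lines
    exceeds `factor` times the median gap (an assumed threshold)."""
    if len(lines) < 2:
        return [lines] if lines else []
    tops = [min(b.y for b in line) for line in lines]
    bottoms = [max(b.y + b.h for b in line) for line in lines]
    gaps = [tops[i + 1] - bottoms[i] for i in range(len(lines) - 1)]
    threshold = factor * median(gaps)
    paragraphs = [[lines[0]]]
    for gap, line in zip(gaps, lines[1:]):
        if gap > threshold:
            paragraphs.append([line])
        else:
            paragraphs[-1].append(line)
    return paragraphs
```

A rule this simple will misfire on tall boxes, multi-column pages, and skewed scans, which is where interactive correction comes in.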
I'm not falling back on OCR because the general case is full of things, like math equations and inset graphics or diagrams, that can't be OCR'd. The only robust way to deal with those is to treat them as graphical atoms: "this bounding box can be moved around, but should not be split up into pieces".
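The "graphical atom" rule can be sketched as a greedy reflow that moves boxes but never splits them. This is an illustrative Python sketch, not the project's Common Lisp code; the `Atom` type, the `reflow` function, and the `gap` and `leading` spacing parameters are all hypothetical. An atom wider than the target page simply gets a row to itself here; scaling it down is left out of scope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """Hypothetical indivisible unit: a word image, an equation, a diagram."""
    name: str
    w: float
    h: float


def reflow(atoms, page_width, gap=2.0, leading=2.0):
    """Place atoms left to right in reading order, wrapping to a new row
    when the next atom would overflow. Returns (atom, x, y) placements."""
    placed = []
    x = y = row_h = 0.0
    for a in atoms:
        # Wrap only if the row already holds something, so an oversized
        # atom still gets placed instead of looping forever.
        if x > 0 and x + a.w > page_width:
            x = 0.0
            y += row_h + leading
            row_h = 0.0
        placed.append((a, x, y))
        x += a.w + gap
        row_h = max(row_h, a.h)  # row height follows its tallest atom
    return placed
```

Because each row's height follows its tallest atom, an equation or diagram pushes the following rows down instead of being cut.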