It's only "solved" if you're okay with a 50-90% retrieval rate or have particularly nice data. There's a lot of text like "referencing the techniques from Chapter 2 we do <blah>" in the wild. The chunk containing that sentence doesn't contain the Chapter 2 material it depends on, so any chunking solution is unlikely to correctly answer queries involving both Chapter 2 and <blah>, at least not without a significant false positive rate.
That said, the chunking most people are doing is worse than the SOTA. The core thing you want to do is understand your data well enough that, as far as possible, any question has its relevant data within a single chunk. Details vary (maybe the details are what you're asking for?).
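As one rough illustration of the "relevant data within a single chunk" idea (a sketch, not the SOTA and not any particular library's API): split by paragraph, tag each chunk with its chapter, and inline a short summary of any other chapter the paragraph points at. The function name `chunk_with_context`, the `CHAPTER_REF` pattern, and the idea that you already have per-chapter summaries are all assumptions made for the example; real data will need its own rules.

```python
import re

# Hypothetical cross-reference pattern; real documents need their own rules.
CHAPTER_REF = re.compile(r"\bChapter (\d+)\b")


def chunk_with_context(chapters: dict[int, str], summaries: dict[int, str]) -> list[str]:
    """Split chapters into paragraph chunks, inlining summaries of referenced chapters."""
    chunks = []
    for number, text in chapters.items():
        for paragraph in text.split("\n\n"):
            paragraph = paragraph.strip()
            if not paragraph:
                continue
            # Tag the chunk with where it came from.
            parts = [f"[Chapter {number}] {paragraph}"]
            # Find references to *other* chapters and pull their summaries in,
            # so a query touching both topics can match this one chunk.
            referenced = sorted({int(n) for n in CHAPTER_REF.findall(paragraph)} - {number})
            for ref in referenced:
                if ref in summaries:
                    parts.append(f"[Context: Chapter {ref}] {summaries[ref]}")
            chunks.append("\n".join(parts))
    return chunks


# Toy data, made up for the example.
chapters = {
    2: "We introduce overlap windows.",
    5: "Referencing the techniques from Chapter 2 we do reranking.\n\nUnrelated paragraph.",
}
summaries = {2: "Overlap windows for splitting text."}
chunks = chunk_with_context(chapters, summaries)
```

The tradeoff is that chunks get longer and partly redundant, and the quality now depends on how good the summaries and the reference detection are, which is exactly the "understand your data" part.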