I agree. Saving web links is a solved problem. Saving the CONTENT, afaik, is not, and I've wanted a solution for YEARS, as has the rest of humanity (including those who don't even know they need it, haha!).
The closest thing would be to "print" to PDF, but that always produces a trainwreck of a PDF and is never good enough for me. Looking forward to trying Omnom if it's open source.
That looks like a pretty heavyweight solution, with a lot of complexity, and I don't mean that as a criticism at all; I'm just not a Go developer myself. I've always wanted a pure JS solution (a browser extension, 200 lines of code maximum) that can capture the content of a web page, doing a virtual scroll to the bottom to capture the whole page. Since there's no perfect way to translate HTML to PDF, my idea has always been to capture the IMAGE of the page (aside from extracting keywords for DB indexing, which can be done separately just to support 'search' later on). A rough sketch of what I mean is below.
The fly in the ointment is, of course, the scrolling itself: some apps have "infinite" scrolling, so in many SPAs there's literally no such thing as "the whole page". Anyway, I haven't tried your app yet, for the not-JS and not-small reasons above, but I'm just sharing my perspective on this topic. Thanks for sharing your project!
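For concreteness, here's a rough, untested sketch of the virtual-scroll capture I have in mind. Everything here is an assumption for illustration: it presumes an MV3 extension whose background worker answers a made-up {type: "capture"} message with chrome.tabs.captureVisibleTab, and it naively ignores devicePixelRatio and the overlap of the final shot.

    // content script: scroll the page in viewport-sized steps and
    // collect one screenshot per step from the background worker
    async function snapshotPage() {
      const step = window.innerHeight;
      const total = document.documentElement.scrollHeight;
      const dataUrls = [];
      for (let y = 0; y < total; y += step) {
        window.scrollTo(0, y);
        // give lazy-loaded content a moment, and stay under
        // captureVisibleTab's per-second rate limit
        await new Promise(r => setTimeout(r, 500));
        dataUrls.push(await chrome.runtime.sendMessage({ type: "capture" }));
      }
      // stitch the viewport shots into one tall canvas
      const imgs = await Promise.all(dataUrls.map(src => new Promise(res => {
        const img = new Image();
        img.onload = () => res(img);
        img.src = src; // data: URL returned by captureVisibleTab
      })));
      const canvas = document.createElement("canvas");
      canvas.width = imgs[0].width;
      canvas.height = imgs.reduce((h, img) => h + img.height, 0);
      const ctx = canvas.getContext("2d");
      let offset = 0;
      for (const img of imgs) {
        ctx.drawImage(img, 0, offset);
        offset += img.height;
      }
      return canvas.toDataURL("image/png"); // one full-page PNG
    }

    // background service worker: take a screenshot of the visible tab
    chrome.runtime.onMessage.addListener((msg, sender, sendResponse) => {
      if (msg.type === "capture") {
        chrome.tabs.captureVisibleTab(sender.tab.windowId, { format: "png" }, sendResponse);
        return true; // keep the message channel open for the async reply
      }
    });

Not 200 lines yet, but the skeleton fits; the hard 80% is the edge cases (fixed headers repeating in every shot, lazy loaders, and the infinite-scroll problem above).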
I recently released a Chrome extension that converts webpages to PDF. It's free, but you need to register to get a key. Unfortunately, this solution isn't client-side JavaScript; I'm using an API underneath. To be honest, I mainly created it to promote the API, but if it's useful to people, I might develop it further. Perhaps it could be useful to you in some way: I don't know your requirements, but with this extension as a base, it might not be difficult to add something that meets your expectations; let me know. However, if you want to export a PDF from Ahrefs, for example, I'm afraid that might not be possible; currently, only basic authentication is supported. I could perhaps add an option, as in my API, to pass JavaScript code, but I doubt even that would work, because Ahrefs probably has some bot protection.
the difference is that archive.ph snapshots something in headless mode. omnom snapshots the exact same state that your browser is displaying to you. so if there are js interactions that change the dom, those will be snapshotted, unlike with archive.ph (tiny sketch below).
also, let's not forget that archive.ph wraps everything in their own frame and has their own way of mangling the result. not in a bad way, it's just not the original as it would have been rendered in your browser.
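to make the distinction concrete, here's a tiny hedged sketch (plain page js, nothing omnom-specific): an in-browser snapshotter can serialize the live dom, mutations and all, while a headless service starts from the raw server response.

    // what an in-browser snapshot sees: the live dom, after all js has run
    const liveState = new XMLSerializer().serializeToString(document);

    // what a headless fetcher starts from: the original server response,
    // before any client-side js (assumes an async context for await)
    const originalHtml = await (await fetch(location.href)).text();

on a js-heavy page those two strings can differ wildly, and only the first one is "what you were actually looking at".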
omnom is for snapshotting, not for circumventing paywalls. i'm merely comparing the snapshot feature of the two projects. circumventing paywalls is out of scope.
btw it is perfectly fine to circumvent a paywall with archive.ph and then snapshot it with omnom so your bookmark never linkrots away. also, when i say "js manipulation" i mean stuff like captchas, dynamic documents that you change by interacting with them, or even private services like e.g. rocket chat hidden behind some barrier like http auth or a private vpn. archive.ph will never have access to what your browser might have access to.
this must be parody. what exactly is the threat model where memory safety matters for a calculator? did these devs miss the point of popping calc.exe? surely bc has never been used for LPE or RCE.