I guess there's a lot of pressure from Cursor and Google's Antigravity. Also, with Zed you can bring your own API key, which VS Code didn't support for a long time.
Seventeen years ago I went on a summer vacation with my family (I was still a teenager). That meant 10 days without any internet connectivity. I had just gotten my first laptop and was allowed to take it with me. I was reverse engineering MSN Messenger's user-to-user and profile picture exchange protocol from TCP dumps; MSN Messenger did not use any encryption. Before I left for the vacation I recorded a bunch of sessions with Wireshark (maybe it was still Ethereal back then). Then for 10 days I just tried to figure out from the dumps how the binary protocol worked, writing code without any way to test it. When I came back I only had to fix some minor bugs and it worked. Fun times.
I've done business-logic sharing where the engine was written in Rust: WASM for the web with React for the UI, uniffi-rs for Android and iOS (Kotlin Compose on Android, SwiftUI on iOS), and Tauri for desktop.
There were no good examples of how to do this, but once it was set up it worked extremely well.
It uses tokio on Android/iOS/desktop and even embeds a web server serving a fake API for end-to-end testing (even on mobile).
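For anyone curious what the shared-core pattern looks like, here's a minimal sketch. Everything in it is my own illustration, not the actual project: the `mobile`/`web` feature names and the `validate_username` function are made up, and a real setup has much more plumbing.

```rust
// Shared core crate: one Rust implementation, multiple bindings.
// Feature names ("mobile", "web") and the example function are assumptions.

// The actual business logic: plain, platform-agnostic Rust.
pub fn validate_username(name: &str) -> bool {
    !name.is_empty() && name.chars().all(|c| c.is_alphanumeric() || c == '_')
}

// Kotlin/Swift bindings via uniffi-rs (proc-macro style).
#[cfg(feature = "mobile")]
uniffi::setup_scaffolding!();

#[cfg(feature = "mobile")]
#[uniffi::export]
pub fn validate_username_ffi(name: String) -> bool {
    validate_username(&name)
}

// Web bindings via wasm-bindgen; the React UI calls this from JS.
#[cfg(feature = "web")]
use wasm_bindgen::prelude::*;

#[cfg(feature = "web")]
#[wasm_bindgen]
pub fn validate_username_js(name: &str) -> bool {
    validate_username(name)
}
```

The point is that the core function has no platform dependencies at all; each binding layer is a thin, feature-gated wrapper, so Kotlin, Swift, and the React/WASM frontend all call the exact same logic.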
Llguidance implements constrained decoding: given the output token sequence so far, you know the fixed set of tokens that are allowed as the next token. You prepare token masks so that at each decoding step only those tokens can be sampled.
So if you expect a JSON object, the first token can only be whitespace or the token '{'. It can get more complex, because tokenizers usually use byte pair encoding, which means they can represent any UTF-8 sequence. So if your current tokens are '{"enabled": ' and your output JSON schema requires the 'enabled' field to be a boolean, the allowed-token mask can only contain whitespace tokens, the tokens 'true' and 'false', or the BPE tokens 't' and 'f' ('true' and 'false' are usually single tokens because they are so common).
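A toy sketch of the masking step itself, for the '{"enabled": ' example. This is my own illustration, not llguidance's API; the six-token vocabulary, the logits, and `apply_mask` are all made up:

```rust
use std::collections::HashSet;

// Apply a precomputed token mask: disallowed tokens get -inf logits,
// so after softmax they are sampled with probability 0.
fn apply_mask(logits: &mut [f32], allowed: &[usize]) {
    let allowed: HashSet<usize> = allowed.iter().copied().collect();
    for (id, logit) in logits.iter_mut().enumerate() {
        if !allowed.contains(&id) {
            *logit = f32::NEG_INFINITY;
        }
    }
}

fn main() {
    // Pretend vocab: 0='{', 1='}', 2=' ', 3='true', 4='false', 5='"'
    let mut logits = vec![0.3f32, 1.2, 0.1, 2.0, 0.5, 1.7];
    // Current text is '{"enabled": ' and the schema wants a boolean,
    // so the grammar only allows whitespace, 'true', or 'false'.
    apply_mask(&mut logits, &[2, 3, 4]);
    // Greedy pick over the masked logits.
    let next = logits
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
        .map(|(id, _)| id)
        .unwrap();
    println!("next token id: {next}"); // 3, i.e. 'true'
}
```

Note the '"' token (id 5) has the second-highest raw logit but can never be picked, because the mask has already excluded it before sampling.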
The JSON schema must first be converted into a grammar and then into token masks. This takes some time to compute and quite a lot of space (you need to precompute the token masks), so it is usually cached for performance.
Each token affects the probabilities of subsequent tokens. Let's say you want the model to produce Python code, and you are using a grammar to force JSON output. The model wasn't trained on JSON-serialized Python code. It was trained on normal Python code with real newlines. Wouldn't forcing JSON impair output quality in this case?
This "scam" applies to all AI technologies: they only "work" as far as we interpret them as working. LLMs generate text. If it answers our question, we say the LLM works; if it doesn't, we say it is "hallucinating".
I'm sort of beginning to think some LLM/AI stuff is the Wizard of Oz (a fake-it-before-you-make-it facade).
Like, why can an LLM create a nicely designed website for me, but asking it to make edits and changes to that design is a complete joke? A lot of the time it creates a brand-new design (not what I asked for at all), and its attempts at actually editing it, LOL. It makes me think it does no design at all; rather, it just grabbed one from the ethers of the Internet and acted like it created it.
It's not a scam, because it does make you code faster, even if you must review everything and possibly correct some things (either manually or via instruction).
As far as hallucinations go, it is useful as long as its reliability is above a certain (high) percentage.
That's the point: nobody really believes there is an intelligence generating Google results. It is a best-effort engine. People have this belief, however, that ChatGPT somehow has an intelligent engine generating results, which is incorrect. It only generates statistically good results; whether they are true or false depends on what the person using it does with them. If it is poetry, for example, it is always true. If it is how to find the cure for cancer, it will with very high probability be false. But if you're writing a novel about a scientist finding a cure for cancer, then that same response will be great.
I hear you, but GenAI also gets the opposite fork from people who hate it: it's a good result that used GenAI at any point => your prompting, curation, and editing are worthless and deserve no credit; it's not a good result => that proves AI isn't real intelligence.
As with Marmite, I find it very strange to be surrounded by a very big loud cultural divide where I am firmly in the middle.
Unlike Marmite, I wonder if I'm only in "the middle" because of the extremities on both ends…
And the hayro library is standalone and can easily be used outside of Typst. It is CPU-only and pure Rust, so it can also be used with WebAssembly. Link to demo below.