I would assume most websites would still set cookies even if you reject the consent, because the consent only covers cookies that aren't technically necessary. Just because the website sets cookies doesn't tell you whether it respects your selection. Only if it doesn't set any cookies at all can you be sure, and I would assume that's a small minority of websites.
Yeah, I was thinking the same. I have two Dyson vacuum cleaners, one purchased about 9 years ago, and the other two years ago. Both are excellent, and I still use the old one for my basement.
This really depends on the failure modes. In general, humans fail in predictable, and mostly safe, ways. AIs fail in highly unpredictable and potentially very dangerous ways. (A human might accidentally drop a knife, an AI might accidentally stab you with it.)
I've never used LLMs for this, but as someone who's been through a lot of sports-related injuries, I find doctors more or less useless (except for prescribing painkillers and performing surgeries.)
No doctor or physio has ever been able to fix my chronic issues, and I've always had to figure them out myself through lots of self-study and experimentation.
> If it was pulled into Rust stdlib, that team would be stuck handling it, and making changes to any of that code becomes more difficult.
I think Rust really needs to do more of this. I work with both Go and Rust daily at work; Go has its library game down -- the standard library is fantastic. With Rust it's really painful to find the right library and keep up with it for a lot of simple things (web, TLS, X.509, base64 encoding, heck, even generating random numbers).
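To make that concrete, here's a minimal sketch of the kind of thing Go covers in its standard library but Rust leaves to third-party crates (assuming the widely used base64 and rand crates; the APIs shown are from base64 0.21 and rand 0.8 and differ between versions):

    // None of this is std: both crates have to go in Cargo.toml
    // (base64 = "0.21", rand = "0.8" -- APIs differ in other versions).
    use base64::Engine as _; // brings the encode/decode methods into scope
    use rand::Rng;

    fn main() {
        let mut rng = rand::thread_rng();
        let bytes: [u8; 16] = rng.gen(); // random bytes: third-party crate
        let encoded = base64::engine::general_purpose::STANDARD.encode(bytes);
        println!("{encoded}"); // base64: also a third-party crate
    }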
I disagree. As I see it, Rust's core-lib should be for interacting with abstract features (intrinsics, registers, memory, borrow-checker, etc.), and std-lib for interacting with OS features (net, io, threads). Anything else is what Rust excels at implementing, and putting it into the stdlib would restrict the adoption of different implementations.
For example, there are currently 3 QUIC (HTTP/3) implementations for Rust: Quiche (Cloudflare), Quinn, and S2N-QUIC (AWS). They are all spec-compliant, but may use different SSL & I/O backends and support different options. 2 of them support C/C++ bindings. 2 are async, 1 is sync.
Having QUIC integrated into the stdlib would mean that all these choices were made beforehand and are stuck in place permanently, and likely no bindings for other languages would be possible.
There's no single ordering -- it really depends on what you're trying to do, how long you're willing to wait, and what kinds of modalities you're interested in.
This is not the case -- it's actually the opposite. The more of these tokens it generates, the more thinking time it gets (very much like humans going "ummm" all the time.) (Loosely speaking) every token generated is an iteration through the model, updating (and refining) the KV cache state and further extending the context.
If you look at how post-training works for logical questions, the preferred answers are front-loaded with "thinking tokens" -- they consistently perform better. So, if the question is "what is 1 + 1?", they're post-trained to prefer "1 + 1 is 2" as opposed to just "2".
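A toy sketch of that loop (hypothetical types only, nothing from a real inference library): the point is just that every emitted token, filler or not, costs one more forward pass over a growing context.

    // Toy model of causal autoregressive decoding -- not a real inference
    // stack. Every generated token, "ummm" included, is one more forward
    // pass, and it extends the context/KV cache for all later passes.
    #[derive(Default)]
    struct KvCache {
        positions: usize, // a real cache would hold per-layer key/value tensors
    }

    // Stand-in for a forward pass: covers the whole context and "predicts"
    // the next token id (a real model would compute logits over a vocabulary).
    fn forward(context: &[u32], cache: &mut KvCache) -> u32 {
        cache.positions = context.len();
        context.iter().sum::<u32>() % 1000
    }

    fn main() {
        let mut context = vec![3, 7, 42]; // prompt tokens
        let mut cache = KvCache::default();
        for _ in 0..5 {
            let next = forward(&context, &mut cache); // one pass per new token
            context.push(next); // filler tokens grow the context just the same
        }
        println!("context length: {}, cached positions: {}",
                 context.len(), cache.positions);
    }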
That's not how LLMs work. These filler word tokens eat petaflops of compute and don't buy time for it to think.
Unless they're doing some crazy speculative sampling pipeline where the smaller LLM is trained to generate filler words while instructing the pipeline to temporarily ignore the speculative predictions and generate full predictions from the larger LLM. That would be insane.
The filler tokens actually do make them think more. Even just allowing the models to output "." until they are confident enough to output something increases their performance. Of course, training the model to do this (use pause tokens) on purpose works too: https://arxiv.org/pdf/2310.02226
OK, that effect is super interesting, though if you assume all the computational pathways happen in parallel on a GPU, it doesn't necessarily increase the time the model spends thinking about the question; it just conditions the model to generate a better output when it actually decides to spit out a non-pause answer. If you condition models to generate pauses, they aren't really "thinking" about the problem while generating them; they're just learning to emit pauses and to do the actual thinking at the last step, when non-pause output is generated, using the additional pathways.
If, however, there were a way to keep passing hidden states to future autoregressive steps, and not just the final tokens from the previous step, that might give the model true "thinking" time.
> if you assume all the computational pathways happen in parallel on a GPU, that doesn't necessarily increase the time the model spends thinking about the question
The layout of the NN is actually quite complex, with a large amount of information computed alongside the tokens themselves and the weights (think "latent vectors").
Each token requires the same amount of compute. To a very crude approximation, model performance scales with total compute applied to the task. It’s not absurd that producing more tokens before an answer improves performance, in a way that’s akin to giving the model more time (compute) to think.
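As a rough illustration (using the common back-of-the-envelope estimate of ~2 x parameter-count FLOPs per token for a forward pass): a 70B-parameter model spends on the order of 1.4e11 FLOPs on every generated token, so 500 "thinking" tokens before the answer amount to roughly 7e13 extra FLOPs of computation conditioned on the question.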
It’s more like conditioning the posterior of a response on “Ok, so…” lets the model enter a better latent space for answering logically vs just spitting out a random token.
These tokens DO extend the thinking time. We are talking about causal autoregressive language models, and so these tokens can be used to guide the generation.
This. We've used GCP Appengine for years and it is rock solid. Their SRE game is top level, and when there is an outage, they do a serious investigation and make it fully public, even if they screwed up badly. Including the vital "this is how we're going to stop this ever happening again". The last outage (that we noticed) was several years ago.
Which is why humans use calculators. That is the key point being made, secondary to the reliability issue. The LLM "knows" it is bad at math, and it knows the purpose of calculators. However, it doesn't use this information to inform the user.
It could also propose writing the answer using code. It doesn't do that either.
How do you do that? Cookies are typically opaque (encrypted or hashed) bags of bits.