More

Deathmax · 2025-04-17T22:26:22 1744928782

It is gated behind the GOOGLE_INTERNAL visibility flag, which only internal Google projects and Cursor have at the moment as far as I know.

Deathmax · 2025-03-01T03:21:34 1740799294

That limitation should go away when Trusted Signing graduates from preview to GA. The current limitation is because the CA rules say you must perform identity validation of the requester for orgs younger than 3 years old, which Microsoft isn't set up for yet.

Deathmax · 2025-02-15T19:34:54 1739648094

Their primary business model nowadays is as an advertising agency, not book selling: https://www.guinnessworldrecords.com/business-marketing-solu...

Deathmax · 2025-02-11T14:50:51 1739285451

The cables between 12V-2x6 and 12VHPWR are identical, it's the port that has different pin lengths (shorter sensing pin, longer conductor pins) to allow for better detection of poorly seated cables and better conductivity while loose.

whywhywhywhy · 2025-02-11T17:41:54 1739295714

Pedantry between "cable" and "connectors" and claiming one if the same doesn't help people understand the situation.

Deathmax · 2025-02-08T03:33:52 1738985632

It's helpful for evading detection, because if you've compromised a machine, you can drop in the server binary and it'll have been added to the allowlist for devs to run.

Deathmax · 2025-01-31T13:31:46 1738330306

The most recent one of the top of my head is their horrendous aliasing of DeepSeek R1 on their model hub, misleading users into thinking they are running the full model but really anything but the 671b alias is one of the distilled models. This has already led to lots of people claiming that they are running R1 locally when they are not.

TeMPOraL · 2025-01-31T14:13:40 1738332820

The whole DeepSeek-R1 situation gets extra confusing because:

- The distilled models are also provided by DeepSeek;

- There's also dynamic quants of (non-distilled) R1 - see [0]. Those, as I understand it, are more "real R1" than the distilled models, and you can get as low as ~140GB file size with the 1.58-bit quant.

I actually managed to get the 1.58-bit dynamic quant running on my personal PC, with 32GB RAM, at about 0.11 tokens per second. That is, roughly six tokens per minute. That was with llama.cpp via LM Studio; using Vulkan for GPU offload (up to 4 layers for my RTX 4070 Ti with 12GB VRAM :/) actually slowed things down relative to running purely on the CPU, but either way, it's too slow to be useful with such specs.

--

[0] - https://unsloth.ai/blog/deepseekr1-dynamic

zozbot234 · 2025-01-31T14:17:30 1738333050

> it's too slow to be useful with such specs.

Only if you insist on realtime output: if you're OK with posting your question to the model and letting it run overnight (or, for some shorter questions, over your lunch break) it's great. I believe that this use case can fit local-AI especially well.

adastra22 · 2025-01-31T13:59:59 1738331999

I'm not sure that's fair, given that the distilled models are almost as good. Do you really think Deepseek's web interface is giving you access to 671b? They're going to be running distilled models there too.

Deathmax · 2025-01-31T14:39:37 1738334377

It's simple enough to test the tokenizer to determine the base model in use (DeepSeek V3, or a Llama 3/Qwen 2.5 distill).

Using the text "സ്മാർട്ട്", Qwen 2.5 tokenizes as 10 tokens, Llama 3 as 13, and DeepSeek V3 as 8.

Using DeepSeek's chat frontend, both DeepSeek V3 and R1 returns the following response (SSE events edited for brevity):

  {"content":"സ","type":"text"},"chunk_token_usage":1
  {"content":"്മ","type":"text"},"chunk_token_usage":2
  {"content":"ാ","type":"text"},"chunk_token_usage":1
  {"content":"ർ","type":"text"},"chunk_token_usage":1
  {"content":"ട","type":"text"},"chunk_token_usage":1
  {"content":"്ട","type":"text"},"chunk_token_usage":1
  {"content":"്","type":"text"},"chunk_token_usage":1

which totals to 8, as expected for DeepSeek V3's tokenizer.

adastra22 · 2025-01-31T15:49:25 1738338565

I’m not sure I understand what this comment is responding to. Wouldn’t a distilled Deepseek still use the same tokenizer? I’m not claiming they are using llama in their backend. I’m just saying they are likely using a lower-parameter model too.

zozbot234 · 2025-01-31T16:03:45 1738339425

The small models that have been published as part of the DeepSeek release are not a "distilled DeepSeek", they're fine-tuned varieties of Llama and Qwen. DeepSeek may have smaller models internally that are not Llama- or Qwen-based but if so they haven't released them.

adastra22 · 2025-01-31T17:32:17 1738344737

Thank you. I’m still learning as I’m sure everyone else is, and that’s a distinction I wasn’t aware of. (I assumed “distilled” meant a compressed parameter size, not necessarily the use of another model in its construction.)

zozbot234 · 2025-01-31T14:03:51 1738332231

Given that the 671B model is reportedly MoE-based, it definitely could be powering the web interface and API. MoE slashes the per-inference compute cost - and when serving the model for multiple users you only have to host a single copy of the model params in memory, so the bulk doesn't hurt you as much.

adastra22 · 2025-01-31T14:27:28 1738333648

They can still run a lot more users on the same number of GPUs (and they don't have a lot) using distilled models.

Deathmax · 2024-10-27T13:29:05 1730035745

According to ThinkBroadband's tracking [1], the headline figures are 85.20% of premises are gigabit capable (FTTP/FTTH/Cable [DOCSIS]) with 71.86% being full fibre.

[1]: https://www.thinkbroadband.com/news/10343-85-gigabit-coverag...

Deathmax · 2024-10-05T12:55:50 1728132950

As far as I know, there's no available tooling for the public to detect SynthID watermarks on generated text, image, or audio, outside of Google Search's About this Image feature.

Deathmax · 2024-09-26T15:31:02 1727364662

The free tier API isn't US-only, Google has removed the free tier restriction for UK/EEA countries for a while now, with the added bonus of not training on your data if making a request from the UK/CH/EEA.

Deathmax · 2024-09-24T22:08:56 1727215736

They do: https://cloud.google.com/vertex-ai/generative-ai/docs/partne...