BitNet: Scaling 1-bit Transformers for Large Language Models
But in the follow-up paper they switched to 1.58-bit weights (https://arxiv.org/pdf/2402.17764), the name coming from the fact that a ternary weight carries log2(3) ≈ 1.58 bits:
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
In this work, we introduce a 1-bit LLM variant, namely BitNet b1.58, in which every single parameter (or weight) of the LLM is ternary {-1, 0, 1}.
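The b1.58 paper describes an "absmean" weight quantization: divide the weight matrix by its mean absolute value, then round each entry and clip to [-1, 1], so every weight ends up in {-1, 0, 1}. Below is a minimal PyTorch sketch of that step, assuming per-tensor scaling; the function name, eps value, and usage snippet are illustrative, not taken from the paper's code.

```python
import torch

def absmean_ternary_quantize(w: torch.Tensor, eps: float = 1e-5):
    """Ternarize a weight tensor with an absmean scheme in the spirit of BitNet b1.58:
    W_q = RoundClip(W / (gamma + eps), -1, 1), with gamma = mean(|W|)."""
    gamma = w.abs().mean()                           # per-tensor scale: mean absolute value
    w_q = (w / (gamma + eps)).round().clamp(-1, 1)   # each entry becomes -1, 0, or 1
    return w_q, gamma                                # keep gamma to rescale outputs later

# Quick check: every quantized entry lands in {-1, 0, 1}
w = torch.randn(4, 4)
w_q, gamma = absmean_ternary_quantize(w)
print(w_q, gamma)
```

With ternary weights, the matrix multiplications in the Transformer's linear layers reduce to additions and subtractions (the zero entries are simply skipped), which is the main source of the efficiency gains the paper reports.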