Cool stuff! Is the goal of this project personal learning, inference performance...

nirw4nna · 2025-06-18T22:05:20 1750284320

Thanks! To be honest, it started purely as a learning project. I was really inspired when llama.cpp first came out and tried to build something similar in pure C++ (https://github.com/nirw4nna/YAMI), mostly for fun and to practice low-level coding. The idea for DSC came when I realized how hard it was to port new models to that C++ engine, especially since I don't have a deep ML background. I wanted something that felt more like PyTorch, where I could experiment with new architectures easily. As for llama.cpp, it's definitely faster! They have hand-optimized kernels for a whole bunch of architectures, models and data types. DSC is more of a general-purpose toolkit. I'm excited to work on performance later on, but for now, I'm focused on getting the API and core features right.

NalNezumi · 2025-06-19T05:00:54 1750309254

If someone wanted to learn the same thing, what material would you suggest is a good place to start?

nirw4nna · 2025-06-19T06:15:31 1750313731

You just need a foundation in C/C++. If you already have that, just start programming; it's way better than reading books/guides/blogs (at least until you're stuck!). Also, you can read the source code of similar projects on GitHub and get ideas from them, which is what I did at the beginning.

liuliu · 2025-06-18T18:16:15 1750270575

Both use cuBLAS under the hood, so I think prefill performance is similar (of course, this framework is still early and doesn't seem to have FP16 / BF16 support for GEMM yet). A hand-rolled GEMV is faster for token generation, which is why llama.cpp is better there.
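
For anyone wondering why prefill and token generation behave differently: prefill pushes the whole prompt through each weight matrix at once (a GEMM), while generation pushes one token at a time (a GEMV). A rough back-of-the-envelope sketch is below. The helper names and the sizes are made up for illustration and are not taken from DSC or llama.cpp, and it only counts bytes read for the weights, assuming FP32.

```python
def matmul_flops(n_tokens, d_in, d_out):
    # (n_tokens x d_in) @ (d_in x d_out): every output element
    # needs d_in multiplies and d_in adds.
    return 2 * n_tokens * d_in * d_out


def weight_bytes(d_in, d_out, bytes_per_elem=4):
    # Bytes read for the weight matrix alone (FP32 assumed by default).
    return d_in * d_out * bytes_per_elem


def flops_per_weight_byte(n_tokens, d_in, d_out, bytes_per_elem=4):
    # Crude arithmetic-intensity estimate: work done per weight byte loaded.
    return matmul_flops(n_tokens, d_in, d_out) / weight_bytes(
        d_in, d_out, bytes_per_elem
    )


if __name__ == "__main__":
    d = 4096  # hypothetical hidden size
    print("prefill (512 tokens, GEMM):", flops_per_weight_byte(512, d, d))
    print("decode  (1 token,   GEMV):", flops_per_weight_byte(1, d, d))
```

With one token the weights are loaded once and barely reused, so the operation tends to be limited by memory bandwidth rather than compute, which is where a kernel tuned specifically for GEMV can pay off.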

kajecounterhack · 2025-06-19T23:47:04 1750376824

Unrelated: my man, I loved your C vision library back in the day.