Hacker News | brainless's comments

I am interested in MetalRT. I am an indie builder, focused mostly on building LLM-assisted products that run locally, like: https://github.com/brainless/dwata

I would be interested to know whether MetalRT can be used by other products. Do you have any plans to open source it?


Yes, that's the plan. MetalRT will ship as part of the RunAnywhere SDK so other developers can integrate it into their own apps. We're working on making that available. If you want to be in the early access group, drop me a line at founder@runanywhere.ai or open an issue on the RCLI repo. Happy to look at your project.


Hey Reiss, I just checked Synthetic. So nice to see indie providers for smaller LLMs. I am personally building products that run only with small (under 20B) models. My aim is laptop usage. I would love to know what plans you have for models smaller than the ones you currently offer. Industrial use is all about smaller models, IMHO.


Local models, particularly the new ones, would be really useful in many situations. They are not great for general chat, but when tools use them in specific agents, the results are awesome.

I built https://github.com/brainless/dwata to submit to the Google Gemini Hackathon, focused on an agent that used regex over email content to extract financial data. I used Gemini 3 Flash.

After submitting to the contest, I kept working on the reverse-template-based-financial-data-extraction branch to use Ministral 3:3b. I moved away from regex detection to reverse template generation: like Jinja2 syntax, but generated in reverse from the source email.

Financial data extraction now works OK-ish, and I am constantly improving it, aiming for a launch soon. I will try Qwen 3.5 Small, maybe the 4B model. Both Ministral 3:3b and Qwen 3.5 Small 4B will fit on the smallest Mac Mini M4 or an RTX 3060 6GB (I have these devices). dwata should be able to process all sorts of financial data, transactions and metadata (vendor, reference #), at a pretty nice speed. Keep it running for a couple of hours and you can go through 20K or 30K emails. All local!
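To make the "reverse template" idea concrete, here is a minimal sketch. The function names and variable names are my own illustration, not dwata's actual code: the assumption is that an LLM identifies the variable values in one example email, we replace them with Jinja2-style placeholders to get a template, and that template then extracts the same fields from other emails with the same layout without any model call.

```python
import re

def make_reverse_template(email_body: str, variables: dict) -> str:
    """Replace each LLM-identified value with a {{ name }} placeholder,
    producing a Jinja2-style template 'in reverse' from the source email."""
    template = email_body
    for name, value in variables.items():
        template = template.replace(value, "{{ " + name + " }}")
    return template

def template_to_regex(template: str):
    """Compile a reverse template into a regex with named capture groups."""
    # re.split with a capture group alternates literal text and variable names.
    parts = re.split(r"\{\{\s*(\w+)\s*\}\}", template)
    pattern = "".join(
        re.escape(p) if i % 2 == 0 else f"(?P<{p}>.+?)"
        for i, p in enumerate(parts)
    )
    return re.compile(pattern, re.DOTALL)

def extract(template: str, email_body: str):
    """Apply a template learned from one email to another with the same
    layout. Returns a dict of variables, or None if the layout differs."""
    m = template_to_regex(template).fullmatch(email_body)
    return m.groupdict() if m else None
```

For example, learning from one transaction email and reusing the template on the next one:

```python
email1 = "Paid INR 450.00 to ACME Stores, ref 12345."
tpl = make_reverse_template(
    email1, {"amount": "450.00", "vendor": "ACME Stores", "ref": "12345"}
)
# tpl == "Paid INR {{ amount }} to {{ vendor }}, ref {{ ref }}."
extract(tpl, "Paid INR 1,299.00 to Blue Cafe, ref 98765.")
# → {"amount": "1,299.00", "vendor": "Blue Cafe", "ref": "98765"}
```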


New GPUs come out all the time. New phones come out (if you count all the manufacturers) all the time. We do not always need to buy the newest one.

Current open-weight models under 20B are already useful. At even 1K tokens/second, they would change what it means to interact with them, or for models to interact with the computer.


Hm, yeah, I guess if they stick to shitty models it works out. I was talking about the models people use to actually do things, instead of shitposting from openclaw and getting reminders about their next dentist appointment.


Considering that enamel regrowth is still experimental (only Curodont exists as a commercial product), those dentist appointments are probably the most important routine healthcare appointments in your life. Pick something that is actually useless.


If you need a full-blown LLM with root access to all your devices to remind you about an appointment, something is very wrong with your life.


The trick with small models is what you ask them to do. I am working on a data extraction app (from emails and files) that runs entirely locally. I applied for the Taalas API because it would be an awesome fit.

dwata: Entirely Local Financial Data Extraction from Emails Using Ministral 3 3B with Ollama: https://youtu.be/LVT-jYlvM18

https://github.com/brainless/dwata
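As a rough sketch of how a small local model can be asked to do just the extraction step, here is what a call to Ollama's HTTP API could look like. The model name, prompt wording, and function names are illustrative assumptions, not dwata's actual implementation; the endpoint, `stream`, and `format` fields are standard Ollama `/api/generate` options.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_request(email_body: str, model: str = "ministral-3b") -> dict:
    """Build a non-streaming Ollama /api/generate payload that asks the
    model to map variable names to exact substrings of the email."""
    prompt = (
        "From the email below, return JSON mapping variable names "
        "(amount, vendor, reference) to the exact substrings they match.\n\n"
        + email_body
    )
    # format="json" asks Ollama to constrain the output to valid JSON.
    return {"model": model, "prompt": prompt, "stream": False, "format": "json"}

def extract_variables(email_body: str, model: str = "ministral-3b") -> dict:
    """POST to a local Ollama server and parse the model's JSON answer."""
    payload = json.dumps(build_request(email_body, model)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return json.loads(body["response"])  # Ollama returns the model text here
```

Everything stays on the machine: the only network hop is to localhost.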


If we can print ASICs at low cost, it will change how we work with models.

Models would be available as USB plug-in devices. A dense sub-20B model may be all the assistant we need for personal use. It is like graphics cards all over again.

I hope lots of vendors will take note. Open-weight models are abundant now. With even a few thousand tokens/second, a low purchase cost, and a low operating cost, this is massive.


If each of the expert models were etched in silicon, it would still give a massive speed boost, wouldn't it?

I feel printing ASICs is the main blocker here.


It seems both Nvidia (Groq) and OpenAI (Codex Spark) are now invested in the ASIC route, one way or another.


I am in India and this is the reason I have not verified until now. I do not know how LinkedIn has the audacity to ask for this level of personal detail. This seems dystopian to me.

LinkedIn is a social network and I wish there was an alternative.


I am in the USA (regrettably; my nation was conquered and subjugated long ago) and it IS dystopian, but there IS an alternative.

The alternative is to stay far away from digital slavery. Keep out of the slaughterhouse. Never approach it, and denounce it with every breath and fiber of your being.

Do you have a phone? It's a surveillance device. Its entire purpose from day one was to enslave you. Do not participate.

The question is, how much are you willing to give up in order to obtain freedom? What lengths will you go to? How badly do you really want it?


I know it is not easy to see the benefits of small models, but this is what I am building for (1). I created a product for the Google Gemini 3 Hackathon using Gemini 3 Flash (2). I tested locally with Ministral 3B and it was promising. It will definitely need work, but 8B/14B models may give awesome results.

I am building data extraction software on top of emails, attachments, and cloud/local files. I use reverse template generation, with only the variable translation done by LLMs (3). Small models are awesome for this (4).

I just applied for API access. If the privacy policies are a fit, I would love to enable this for the MVP launch.

1. https://github.com/brainless/dwata

2. https://youtu.be/Uhs6SK4rocU

3. https://github.com/brainless/dwata/tree/feature/reverse-temp...

4. https://github.com/brainless/dwata/tree/feature/reverse-temp...
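One way "only variable translation done by LLMs" can play out in practice (sketched here with hypothetical function names, not dwata's API): keep the reverse templates learned so far, try them against each incoming email first, and only hand the email to the model when no known layout matches. Repeated layouts then cost zero model calls.

```python
import re

def template_to_regex(template: str):
    """Compile a {{ name }}-style reverse template into a regex with
    named capture groups for each variable."""
    parts = re.split(r"\{\{\s*(\w+)\s*\}\}", template)
    return re.compile("".join(
        re.escape(p) if i % 2 == 0 else f"(?P<{p}>.+?)"
        for i, p in enumerate(parts)
    ), re.DOTALL)

def route(email_body: str, templates: list):
    """Try every known template; return (template, variables) on the first
    full match, or None to signal that an LLM call is needed."""
    for tpl in templates:
        m = template_to_regex(tpl).fullmatch(email_body)
        if m:
            return tpl, m.groupdict()
    return None  # unseen layout: hand off to the small model once
```

With a handful of templates covering a mailbox's common senders, the model only sees each new layout once; everything after that is plain regex matching, which is what makes tens of thousands of emails in a couple of hours plausible on a small machine.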


More people are jumping in because of the thrill of it.

We are in the early days, and I believe things will get better as more people calm the f down. People who have built things for ages will continue to do so, with or without coding agents.

In the long term, I think Open Source will win. I can imagine content management systems, eCommerce software, CRMs, etc. all becoming coding-agent friendly: customers can customize the core software with agents, and the scaffold would provide fantastic guardrails.

Self-hosting is already becoming way more popular than it ever was. People are downloading all sorts of tools to build software. Building is better. A structure needs to emerge.

