> Why does AI need that folder structure? Why not a flat list of files and let the AI agent explore with BM25 / grep, etc.
It doesn't. The human creating the files needs it, to make it easier to traverse in future as the file count grows. At 52k files, that's a horrendous list to scroll through to find the thing you're looking for. Meanwhile, an AI can just `find . -type f -exec whatever {} \;` and be able to process it however it needs. Human doesn't need to change the way they work to appease the magic rock in the box under the desk.
why? The human would just talk to the AI agent. Why would they need to scroll through that many files?
I made a similar system with 232k files (one file might be a Slack message, a GitLab comment, etc.). It does a decent job of answering questions with only keyword search, but I think I can get better results with RAG+BM25.
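For the ranking side, BM25 is simple enough to sketch in a few lines. This is a minimal illustration with a hypothetical three-document corpus (the documents, the query, and the tokenizer are all placeholders; `k1` and `b` are the usual default parameters):

```python
import math
from collections import Counter

# Hypothetical corpus standing in for exported messages/comments.
docs = [
    "slack message about deploy failure on gitlab runner",
    "gitlab comment reviewing the deploy pipeline change",
    "weekly notes unrelated to infrastructure",
]
tokenized = [d.split() for d in docs]      # naive whitespace tokenizer
N = len(tokenized)
avgdl = sum(len(d) for d in tokenized) / N # average document length
df = Counter(t for d in tokenized for t in set(d))  # document frequencies

def bm25(query, doc, k1=1.5, b=0.75):
    """Score one document against a query with the Okapi BM25 formula."""
    tf = Counter(doc)
    score = 0.0
    for term in query.split():
        if term not in tf:
            continue
        idf = math.log(1 + (N - df[term] + 0.5) / (df[term] + 0.5))
        norm = k1 * (1 - b + b * len(doc) / avgdl)
        score += idf * tf[term] * (k1 + 1) / (tf[term] + norm)
    return score

# Rank documents for a query; shorter matching docs get a length-norm boost.
ranked = sorted(range(N), key=lambda i: bm25("gitlab deploy", tokenized[i]),
                reverse=True)
```

A real deployment would swap the naive tokenizer for something stemming-aware and combine these scores with embedding similarity for the RAG half.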
Just because AI exists doesn't mean we can neglect basic design principles.
If we throw everything out the window, why don't we just name every file as a hash of its content? Why bother with ASCII names at all?
Fundamentally, it's the human that needs to maintain the system and fix it when it breaks, and that becomes significantly easier if it's designed in a way a human would interact with it. Take the AI away, and you still have a perfectly reasonable data store that a human can continue using.
Why not think of it a different way; why do we need to put up with breaking changes at all?
I'd much rather stand up a replacement system adjacent to the current one, and then switch over, than run the headache of debugging breaking changes every single release.
To me, this is the difference between an update and an upgrade. An update just fixes things that are broken. An upgrade adds/removes/changes features from how they were before.
I'm all for keeping things up to date. And software vendors should support that as much as possible. But forcing me to deal with a new set of challenges every few weeks is ridiculous.
This idea of rapid releases with continuous development is great when that's the fundamental point of the product. But stability is a feature too, and a far more important one in my opinion. I'd much rather have a stable platform to build upon than a rickety one that keeps changing shape every other week, forcing me to figure out what changed and how it impacts my system. A stable platform means I can spend all of my time _using_ it rather than fixing it.
This is why bleeding edge releases exist. For people who want the latest and greatest, and are willing to deal with the instability issues and want to help find and squash bugs. For the rest of us, we just want to use the system, not help develop it. Give me a stable system, ship me bug fixes that don't fundamentally break how anything works, and let me focus on my specific task. If that costs money, so be it, but I don't want to have to take one day per week running updates to find something else is broken and have to debug and fix it. That's not what I'm here to do.
And as for cleaning the house - we always have the option of hiring a cleaner. That costs us money, but they keep the house cleanliness stable whilst we focus on something else to make enough money to cover the cleaner's cost plus some profit.
I assume duplicating every piece of infra we have to create a suitable 'replacement system' would cost more than the impact of the few times we've had downtime due to breaking changes in updates. Ymmv ofc
And also because, for everyone else, you have to migrate everybody from the "old" system to the "new" one: a large project with low perceived value, where nobody cares. "Just do your job and don't bother us with your shit."
> But let's ignore that web API worst case. Imagine that you have some semi-trusted software and because you don't want to take any risks, you run in nested VMs three layers deep. The software has some plausible excuse to require access to the Bluetooth (perhaps it's a beacon demo?) so you grant an exception. You're not happy with the result (the beacon demo does not work as promised?), you remove the software and you also reset all three VM layers for good measure. Gone for good, nice. Unfortunately, the guest the malware installed on the ESP when it had access is still there...
You're hopping through 4 security boundaries and granting direct hardware access. If you don't understand the decisions you're making by doing that, all bets are off.
Better to give a virtualised bluetooth device and let the hypervisor drive the real one. Will hit performance a little, but it's far more secure.
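To make the distinction concrete, here is a sketch of the risky option using QEMU's USB passthrough, which hands the guest the real adapter and, with it, the chip's command/firmware interface. The vendor/product IDs and disk image are placeholders for illustration:

```shell
# Direct hardware passthrough (the risky option): the guest OS drives the
# physical Bluetooth dongle itself, so guest compromise can reach the
# adapter's firmware and persist across VM resets.
# 0x0a12:0x0001 is a placeholder ID for the host's USB Bluetooth dongle.
qemu-system-x86_64 \
  -enable-kvm -m 2048 \
  -device qemu-xhci \
  -device usb-host,vendorid=0x0a12,productid=0x0001 \
  disk.img
```

The safer design keeps the host/hypervisor driving the physical adapter and exposes only a mediated, emulated device to the guest, so a guest reset really does wipe all guest-reachable state.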
It's absolutely a good thing, and arguably not a security issue at all.
It needs access to the command interface of the chip, which means you need to either have physical access to the device or compromise whatever is physically connected to the device.
It's practically like calling a read/write filesystem a security issue. Yes, an attacker can write to disk and persist there, and they can overwrite files, etc. But there needs to be a flaw that allows access to that behaviour first, else it's just a part of the interface.
And in this instance, it's a part of the debug interface of the chip. And practically makes it a perfect candidate for future bluetooth security tools, similar to the Atheros chipsets used for WiFi sniffing. Now we can do bluetooth impersonation attacks for $2 instead of hundreds.
Betting there'll be some good bluetooth research in the near future, showing all sorts of devices are vulnerable to attacks using $2 hardware. That's the real security problem here.
I would say `narrator voice` in this context is distinctly different from internal dialogue. Obviously the internal voice can narrate to yourself, but that voice tends to be slightly different in my experience.
Narrator voice is a specific voice, cadence, pattern etc. that your internal voice takes on when reading a narrative, rather than the narration of your experience. For me, I have a different voice when reading/writing technical documentation compared to reading/writing fictional works, which both are different from my day to day internal dialogue.
I have a narrator voice when reading code. It's the same narrator voice as when reading technical reports and white papers. I do sometimes struggle to process code that doesn't read well, likely because the cadence of the narrator voice gets disturbed and doesn't flow well.
I've found that my descriptions and explanations of how things work generally follow the same cadence as this narrator voice, which tends to help me explain things succinctly and transfer knowledge quickly.
I do agree it's distinctly different from the narrator voice for fiction books. I would assume that's due to the presence/lack of emotive language between the two types of writing.
Perhaps this is a similar phenomenon to the inner monologue that some people have but others don't? Or the ability to imagine various levels of detail of objects without physically seeing them? The mind is a strange beast.
...you can mark all the books you've checked with a hole so you know you've checked them.
Work harder, not smarter. /s
Jokes aside, I still agree with GP, in that there are more practical skills that are left out of education that would be far more useful in day to day life. The Dewey Decimal System has been replaced with search engines.
I don't need to know how a search engine works in order to type in "how to use a drill press" and read the results. But that's because the knowledge and understanding of how computers work at a high level circumvents the need to do that - a search engine is a form, give it something to search and hit the submit button. Easy.
Being taught by someone to use machining tools helps build an understanding of how everyday items are made using those tools, so you have a more fundamental understanding of how items could be combined into other, more interesting things, and how to repair them, take them apart, and service them.
It's almost the opposite problem to maths in schools. We're taught maths in increasingly complex ways, all the way up to calculus. Those methods teach us how to use maths to do clever things, but everyday maths doesn't need that. We're taught compound interest, yet we have to figure out how to do our taxes by ourselves, without any help. Wouldn't it be nice to have an overlap there? Hit two birds with one stone, and we all walk away with a stronger understanding of the world.
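Compound interest is a good example of that overlap: the textbook formula A = P(1 + r/n)^(nt) maps directly onto an everyday savings question. A minimal sketch, with hypothetical numbers:

```python
# Everyday use of the compound-interest formula taught in school:
#   A = P * (1 + r/n) ** (n * t)
# All figures below are hypothetical.
principal = 1000.0   # P: initial savings
rate = 0.05          # r: 5% annual interest
n = 12               # compounding periods per year (monthly)
years = 3            # t: how long the money sits

amount = principal * (1 + rate / n) ** (n * years)
interest_earned = amount - principal
```

The same three-line calculation answers "what will this loan actually cost me?", which is closer to the taxes-and-bills maths people need day to day.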
If we're not taught how to make things, we struggle to learn how things are made, which means fewer things get made. Learning how to make things early, and embedding the knowledge of how things are made, enables more things to be made in future.
Yeah, sure, today we can teach people to use a search engine and whether they should believe the first result. Is the chatbot always truthful? Not sure when or why it was decided that media literacy isn't a useful everyday skill.
But the only differing cost in that scenario is the two extra conductors to the local transformer, and more likely just to the edge of the property. The 3-phase power is still present in the area, as ideally alternating houses are on alternating phases to balance the load over the phases. The difference in cost would be a one-time hit during installation, and the ongoing maintenance would be the same as 3 separate houses (1 house per phase). That maintenance could be rolled into the cost per watt, so the more you have available the more you can use and the more you could pay.
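The alternating-phase assignment above is just round-robin allocation. A tiny sketch with hypothetical per-house loads shows how it keeps the three phases roughly even:

```python
# Round-robin assignment of houses to the three phases (L1, L2, L3).
# Loads are hypothetical peak draws in kW, one entry per house along the street.
house_loads_kw = [3.0, 4.5, 2.0, 5.0, 3.5, 4.0]

phase_totals = {"L1": 0.0, "L2": 0.0, "L3": 0.0}
phases = list(phase_totals)
for i, load in enumerate(house_loads_kw):
    # House i connects to phase i mod 3, so neighbours sit on different phases.
    phase_totals[phases[i % 3]] += load
```

With this allocation the imbalance between the heaviest and lightest phase stays small, which is exactly why a single 3-phase house doesn't add meaningful ongoing cost: the infrastructure is already sized and balanced for it.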
The installation cost should vary based on what the house wants access to, and the ongoing cost should be the same as every household. A standing charge for the cost of the infrastructure existing is ridiculous when that same infrastructure is what the power company relies on to deliver their chargeable commodity. It's effectively double dipping - how is it any different from ISPs charging for access and then charging for data on top?
I think an important consideration is that the overall grid is not designed for all houses to have the higher capacity connections. So enough connections and they're forced to make massive changes to the infrastructure.
Not to say utility companies don't make obscene profits instead of reinvesting much of that into the infrastructure.
Regarding the ISP, it's really the same argument. If I want 2 Gbps on a neighbourhood line that supports a maximum average service size of 1 Gbps, then they would be forced to upgrade their lines to service me. Granted, unlike power grids, they're liable to just not do that, let service quality degrade, and quote the "Up to X Gbps" clause.
IANAL, but I believe the terms of the purchased license would be to use the product with only the license key provided. Therefore using an alternative key would be a breach of the license terms, meaning you're using unlicensed software and subject to all relevant laws.
I can see an argument that the vendor is in breach of contract by failing to provide a valid key with the license, and therefore the contract is void and the vendor should refund.
And yet again, I can see an argument that the license has been paid for, and a different key is used in the interim to access the paid for software, therefore no loss has occurred to either party rendering the whole argument moot.
Would be interested to know if this has been tested in the court system.
Depending on jurisdiction there may be rules or case law related to re-engineering and patching of purchased software to keep it working. This is again a reminder how important the right to repair is.