Hacker News | Quothling's comments

We've got a rather extensive AI setup through our equity fund and I've set up a group of agents for data architecture at scale. One is the main agent I discuss with; it's set up to know our infrastructure and has access to image generation tools, web search, hand-off agents and other things. I tend to use Opus (4-6 currently) and I find it to be rather great. As you point out it comes with the danger of making mistakes, and again, as you point out, it's not an issue for things I'm an expert on. What I rely on it for, however, is analysing how specific tools would fit into our architecture. In the past you would likely have hired a group of consultants to do this research, but now you can have an AI agent tell you what the advantages and disadvantages of Microsoft Fabric are in your setup. Since I don't know the capabilities of Fabric I can't tell if the AI gives me the correct analysis of a Lakehouse and a Warehouse (Fabric tools).

What I do to mitigate this is that I have fact-checking agents configured to be extremely critical and non-biased on Opus, Gemini and GPT, which are then handed the entire conversation to review. Then it's handed off to an Opus agent which is set up to assume everything is wrong. After this, and if I'm convinced something is correct, I'll hand the entire thing off to a Sonnet agent, which is set up to go through the source material and give me a compiled list of exactly what I'll need to verify.

It's ridiculously effective, but I do wonder how it would work with someone who couldn't challenge the analytic agent on domain knowledge it gets wrong. Because despite knowing our architecture and needs, it'll often make conceptual errors in the "science" (I'm not sure what the English word for this is) of data architecture. Each iteration gets better though, and with the image generation tools, "drawing" the architecture for presentations from C-level to nerds is ridiculously easy.
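The review chain described above can be sketched roughly as follows. This is only an illustration of the hand-off order, not the actual setup: the `Agent` type, `run_review_chain`, and the `human_approves` gate are hypothetical names, and each agent is modelled as a plain function from text to text where a real one would call a model.

```python
from typing import Callable, Dict, Optional

# Hypothetical: an "agent" is any function that takes text and returns text.
Agent = Callable[[str], str]


def run_review_chain(
    conversation: str,
    fact_checkers: Dict[str, Agent],
    skeptic: Agent,
    compiler: Agent,
    human_approves: Callable[[str], bool],
) -> Optional[str]:
    """Pass a conversation through critical reviewers, a skeptic, and a compiler."""
    # 1. Every fact checker reviews the entire conversation independently.
    notes = [f"[{name}] {agent(conversation)}" for name, agent in fact_checkers.items()]
    bundle = "\n\n".join([conversation] + notes)

    # 2. A skeptic agent that assumes everything is wrong reviews the bundle.
    bundle = bundle + "\n\n[skeptic] " + skeptic(bundle)

    # 3. The human decides whether the result is convincing enough to continue.
    if not human_approves(bundle):
        return None

    # 4. A final agent compiles the list of claims to verify by hand.
    return compiler(bundle)
```

The point of keeping the human gate as an explicit step is that the last agent only produces a checklist; it does not replace the manual verification.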


Are you using this agent hive for any repeatable tasks? What you described, superficially, seems like a one-off. Genuinely curious.

I think it depends on what you mean by repeatable tasks. I reuse the critical hand-off agents quite a lot since they are basically just set up to help spot bias and errors. I kind of reuse the top agent. I have a few "core" configurations that I can add to. So one will know our network, one will know our data architecture and so on, to keep them a little more focused. So for this specific agent that I described, I'll add a few lines on what I'm considering to the configuration, which I won't reuse for anything unrelated to Microsoft Fabric. I've tried using these "core" agent configurations as hand-off agents in the past, but it doesn't seem to work well in our setup, which is very isolated because we're NIS2 compliant.

I don't usually go back to the original prompt. I've actually done it a few times for the presentation, to get some refined images, but usually I'll start a new prompt.


> Any examples how you see some engineers being left behind?

I don't know where you live, but around where I live in Denmark you'd fail for not using AI at a senior interview in a lot of places. Even places which aren't exactly AI fans use AI to some extent.

The biggest challenge we face right now is figuring out how you create developers who have enough experience to know how to use the AI tools in a critical manner. Especially because you're typically given agents for various tasks, which are already configured to know how we want things to be written.


Around here, in your southern neighbour, everyone is supposed to be doing AI and is evaluated on it, yet in many projects, if clients don't sign off on the use of AI tools, there is no AI to use anyway.

Additionally there is a gap between the AI targets set by C-suites based on what everyone is saying on TV and what we can actually deliver based on the available data sets, integration points, and naturally those sign-offs for data governance and hallucination guardrails.


I work for a Fortune 50 that is heavily tech based.

If you can’t interview without immediately reaching for an LLM you are considered unfit to work here.


Around here C levels have AI adoption goals and are actively pushing it throughout organisations. Even when it doesn't exactly make sense.

> Everyone is jumping off the cliff

> If you don't jump off the cliff you're falling behind


I was just giving them an anecdotal example of what they were asking for. I think the answer is somewhere in the middle, but I'm not in a position to push any form of change on the C levels.

I've noticed that back in Europe everyone's in panic mode, but that's because of the inferiority complex most people have vs both the US and China. It's unwarranted.

It has an annoying bug where approving PRs from the CLI won't delete branches when you squash commit, while clicking the button in the UI does it perfectly fine. It's been a bug for a while (as in several years), and if you find something like that, don't expect it to ever be fixed. As a whole it's not a bad tool though.

As you say it's limited, but that can be both good and bad.


How often are the consumers and users of tools like this also in positions to contribute financially? It's silly, but I can spin up $10000 worth of Azure resources and nobody would mind (as long as they actually had a purpose etc). In contrast I doubt I'd ever get a decision-maker to sign off on supporting an OSS project with even $50, even if we have tech that depends on it.


It's a little sad that tech now comes down to geopolitics, but if you're not in the USA then what is the difference? I'm Danish, would I rather give my data to China or to a country which recently threatened the kingdom I live in with military invasion? Ideally I'd give them to Mistral, but in reality we're probably going to continue building multi-model tools to make sure we share our data with everyone equally.


Lol EU pats you on the head

It's sad to see how you have regulated yourselves into a position where Mistral is your only claim.


Europe is already moving into the EU cloud. Hetzner, OGH Cloud and so on, as well as local data centers where partner companies set up their own cloud with various things to rival Office 365. So far it's mainly the public sector. My own city cut their IT budget by 70% by switching from Microsoft.

The key point is the partner companies. Almost nobody is actually running their own clouds the way they would with various 365 products, AWS or Azure. They buy the cloud from partners, similar to how they used to (and still do) buy solutions from Microsoft partners. So if you want to "sell cloud" you're probably going to struggle unless you get some of these onboard. Which again would probably be hard, because I imagine a lot of what they sell is sort of a package which basically runs on VMs set up as part of the package that they already have.


I got the smallest version of the M1 MacBook Air when it came out. It's still my daily driver when I'm not on my corporate T14 gen 6 i7 with 32 GB of RAM, and while it no longer outperforms my corporate computer it keeps up well enough while being cold to the touch and noiseless. It's also significantly lighter and has better battery life despite being old, though corporate does kill a lot of that on the PC.

Not being able to feel that it's turned on is basically the primary feature of a laptop for me. I've wanted to switch my personal device to Linux for a while, but there just... isn't... one. I know I could run Linux on the MacBook, but the point here is that there is nothing which compares, not even at higher prices.


Kattegat where I live is probably double the width of Hormuz and if you're in a small ship you can probably sail most of those 140 km. Not without risk, but you'd be relatively safe for the most part. Big ships can't though, so even though there might be 50km on each side of them they could potentially have a shipping lane which is only a few hundred meters wide.

I can't say that I know anything about Iran, but if we were to close our straits off so you couldn't enter the North Sea from the Baltic Sea, then our navy would rapidly deploy various different mines that lie on the bottom in the shallower parts and control the shipping lanes with things like suicide drones. I imagine Iran would do something similar, only they've probably been preparing for it a lot more than we have.


We've got access to Opus 4-6, GPT 5.4, Gemini Pro and a few others through corporate. I have customized agents on Claude, GPT and Gemini since we tend to run out of tokens for x model by the end of a month. Out of all of them I've consistently been using Sonnet for most tasks. Opus functions mainly as hand-off agent and reviewer. In my anecdotal experience Claude is miles ahead of the other models and has been for a long while... when it comes to writing code the way we want it. Which is explicit, no-abstraction, no-external-packages, fail-fast defensive programming. I imagine you'd get different mileage with different models and different coding styles.

The rest of the organisation, which is not software development or IT related, mainly uses GPT models. I just wish I hadn't taught risk management about Claude Code so they weren't wasting MY tokens.


You don't have to go that deep. 99% of the time when our analytics or risk management teams have some really memory-inefficient Python and they want me to write them one of our "magic C things", it turns out to be fixable by replacing their in-memory iterations with a generator.
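A minimal sketch of that kind of fix, assuming a made-up case where each input line is `name,amount` and only a running total is needed (`parse_amount` and both function names are hypothetical):

```python
from typing import Iterable, Iterator, List


def parse_amount(line: str) -> float:
    """Hypothetical parser: read the second comma-separated field as a number."""
    return float(line.split(",")[1])


def amounts_in_memory(lines: Iterable[str]) -> List[float]:
    # Builds the full list first, so memory grows with the size of the input.
    return [parse_amount(line) for line in lines]


def amounts_streaming(lines: Iterable[str]) -> Iterator[float]:
    # Yields one value at a time, so only the current line is held in memory.
    for line in lines:
        yield parse_amount(line)


sample = ["a,1.5", "b,2.5", "c,4.0"]

# Both give the same total; only the streaming version scales to huge inputs.
total_list = sum(amounts_in_memory(sample))
total_stream = sum(amounts_streaming(sample))
```

Since a file object is itself an iterable of lines, `amounts_streaming(open(path))` would process a large file without ever loading it whole. This only helps when the consumer needs a single pass over the data.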

