Hacker News | new | past | comments | ask | show | jobs | submit | maliker's comments

It might just be me, but this interface is the first time I felt the desire to interact with long-running agents even though I use chat interfaces all day long. Maybe it was the demo video on the landing page which was compelling with its examples. Maybe it was the feeling that I could see what was going on because I would be on a canvas. Nicely done!

Off to keep iterating on the prototype app I started...


This is really great to hear, thank you! Have fun with the prototype, let us know how it goes.

I'll play slight devil's advocate. The buttons in the toolbar duplicate the options in the menubar, and I don't want to learn two locations for every feature. You can't turn off the menubar items, so I end up turning off the toolbar; at that point I don't care what that part of the UI looks like. And the formatting sidebar they added, as pointed out in the article, uses horizontal screen space better than options stretched across the full width of the menu.

Now the visibility of the liquid glass stuff, that is definitely a problem. Can't recognize a UI element if it's constantly rendered differently and with very little contrast with the background elements.

Well, I guess someone is going to vibecode a decent Linux GUI or fix the driver pains there or something and we'll be free of this. Because Microsoft/Apple and to a lesser extent Google have jumped the shark with their UI these days.


When I used to use Pages frequently I just memorized all the relevant keyboard shortcuts and turned off the entire toolbar. It’s easy: for each button in the toolbar find the equivalent in the menu, and the shortcut is written on the menu item itself. That’s, however, entirely unacceptable for most users.

The formatting sidebar they added is strictly worse than the inspector UI in old Pages ’09. The sidebar is constrained not to overlap with content, but the user can choose to overlap the inspector, which is strictly more flexibility for the user. If you're doing a lot of fine adjustments to a single text box, it's fewer mouse movements when the inspector sits right next to that text box, even if it obscures other, irrelevant text boxes. I dearly miss Pages ’09.


> and I don't want to learn 2 locations for every feature.

No one forces you to; you can learn it just once for the toolbar, since it's one click instead of several clicks of menu navigation. It's like shortcuts: once you use them, you don't need to remember where the command lives in the menu.


I've standardized on having GitHub Actions create/pull a Docker image and run build/test inside it. So if something goes wrong, I have a decent live debug environment that's very similar to what GitHub Actions is running. For what it's worth.
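For what it's worth, that pattern is roughly this shape as a workflow (a sketch only: `Dockerfile.ci`, the `myapp-ci` tag, and `ci/build-and-test.sh` are hypothetical names, not anything from the comment):

```yaml
name: ci
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Build (or pull) the same image you would use locally.
      - run: docker build -f Dockerfile.ci -t myapp-ci .
      # Run build/test inside the container, not on the runner itself.
      - run: docker run --rm -v "$PWD:/src" -w /src myapp-ci ./ci/build-and-test.sh
```

When CI fails, running those same two `docker` commands locally drops you into nearly the same environment for debugging.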


I do the same with Nix, since it works for macOS builds as well.

It has the massive benefit of solving the lock-in problem. Your workflow file is generally very short, so it's easy to move to an alternative CI if (for example) GitHub were to jack up their prices for self-hosted runners...

That said, when using it in this way I personally love GitHub Actions.


Nix is so nice that you can put almost your entire workflow into a check or package. Like your code-coverage report step(s) become a package that you build (I'm not brave enough to do this)
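As a sketch of what that "coverage report as a package" idea might look like (hypothetical: the nixpkgs pin, the Python tooling, and the attribute names are all assumptions, not a tested setup):

```nix
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.05";
  outputs = { self, nixpkgs }:
    let
      pkgs = nixpkgs.legacyPackages.x86_64-linux;
      py = pkgs.python3.withPackages (ps: [ ps.pytest ps.coverage ]);
    in {
      # The coverage report is just a derivation: `nix build .#coverage`
      # runs the same steps locally and in CI.
      packages.x86_64-linux.coverage = pkgs.runCommand "coverage-report" {
        nativeBuildInputs = [ py ];
      } ''
        cp -r ${self}/. .
        coverage run -m pytest
        coverage html -d $out
      '';
      # Exposing it as a check means `nix flake check` runs the suite too.
      checks.x86_64-linux.coverage = self.packages.x86_64-linux.coverage;
    };
}
```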

I run my own Jenkins for personal stuff on top of NixOS. All jobs run inside a devenv shell, devenv handles whatever background services are required (e.g. a database), /nix/store is shared between workers, and there's an attic cache on the local network.

Oh, and there's also a nixosModule that is tested in a VM, which also smoke-tests the service.

The first build might take some time, but all future jobs run fast. The same can be done on GHA, but on GitHub-hosted runners you can't get a shared /nix/store.


I'm scared by all these references to nix in the replies here. Sounds like I'm going to have to learn nix. Sounds hard.


Gemini/ChatGPT help (a lot) when getting going. They make up for the poor documentation.


LLMs are awful at nix in my experience. Just learn the fundamentals of the language and build something with it.


Calling nix documentation poor is an insult to actually poor documentation.

Remember the old meme image about Vim and Emacs learning curves? Nix is both of those combined.

It's like it was custom-made for me at the idea level: declarative everything? Sign me up!

But holy crap I have wasted so much time and messed up a few laptops completely trying to make sense of it :D


What's the killer benefit of nix over, like, a Dockerfile or a package.lock or whatever?


package.lock is JSON only; Nix is for the entire system, similar to a Dockerfile.

Nix specifies dependencies declaratively, and more precisely than Docker does by default, so the resulting environment is reproducibly the same. It caches really well and doubles as a package manager.

Despite the initial learning curve, I now personally prefer Nix's declarative style to a Dockerfile


Same here, though I think Bazel is better for DAGs. I wish I could use it for my personal project (in conjunction with, and bootstrapped with, Nix), but that's a pretty serious tooling investment that I feel is just going to be a rabbit hole.


I tend to have most of my workflows set up as scripts that can run locally in a _scripts directory. I've also started to lean on Deno if I need anything more complex than I'm comfortable with in bash (even bash on Windows) or PowerShell, since it executes .ts directly and can refer directly to modules/repos without a separate install step.

This may also leverage Docker (Compose) to build/run different services depending on the stage of the action, sometimes creating "builder" containers with a mount point for the source that build and output the project for different OSes, etc. Docker + QEMU allows for some nice cross-compile options.

The less I rely on the GitHub Actions environment, the happier I am... my main points of use are checkout, the Deno runtime, release-please, and uploading assets to a release.

It sucks that the process is less connected and slower, but ensuring that as much as reasonably possible can run locally goes a very long way.


I just use the fact that any action run can trigger a webhook.

The action does nothing other than trigger the hook.

Then my server catches the hook and can do whatever I want.
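A minimal receiver for that pattern might look something like this (everything here is hypothetical: the token field, the payload shape, and `./deploy.sh`; a real setup should verify GitHub's HMAC signature header instead of a plain token):

```python
import json
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional

SECRET = "change-me"  # hypothetical shared secret sent by the workflow


def handle_payload(payload: dict, secret: str = SECRET) -> Optional[str]:
    """Return the command to run for this hook, or None to ignore it."""
    if payload.get("token") != secret:
        return None
    if payload.get("ref") == "refs/heads/main":
        return "./deploy.sh"  # hypothetical deploy script
    return None


class HookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        cmd = handle_payload(payload)
        if cmd:
            subprocess.Popen(cmd, shell=True)  # fire and forget
            self.send_response(202)
        else:
            self.send_response(403)
        self.end_headers()


if __name__ == "__main__":
    HTTPServer(("", 8080), HookHandler).serve_forever()
```

The nice part is that the CI side stays a one-line `curl` while all the real logic lives on a machine you control.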


I wish I had the courage to run my own CI server. But yes, I think your approach is the best for serious teams that can manage more infrastructure.


I am embarrassed that I didn't think to do this. Thank you :)


I was doing something similar when moving from Earthly, but I have since moved to Nix to manage the environment. It's a much better developer experience, and faster! I would check out an environment manager like Nix/Mise etc. so you can have the same tools locally and on CI.


Yeah, images seem to work very well as an abstraction layer for most CI/CD users. It's kind of unfortunate that they don't (can't) fully generalize across Windows and macOS runners as well, though, since in practice that's where a lot of people start to get snagged by needing to do things in GitHub Actions versus using GitHub Actions as an execution layer.


I’ve VNCed into CI to debug Selenium tests failing because of platform font and scrollbar rendering. I never really thought about doing that locally in a Docker container, but it definitely wouldn't be convenient to run those tests that way all the time. I guess having the option would simplify debugging somewhat, but I'd still have to VNC into the container, I think.


Me, too, though getting the trusted publisher NPM settings working didn't help with this. But it does help with most other CI issues.


Most of the npm modules I've built are fortunately pretty much feature-complete... I haven't had to deal with that in a while...

I do have plans to create a couple of libraries in the near future, so I'll have to work through the pain(s)... I also want to publish them to JSR.


So you've implemented GitLab CI in GitHub... We used to do this in Jenkins like 7 years ago.


Masterclass in turning a goodbye email into a "hire me after my next gig ends." I'm not being sarcastic; this is a great example of highlighting the value they added.


I wonder how hard it is to remove that SynthID watermark...

Looks like: "When tested on images marked with Google’s SynthID, the technique used in the example images above, Kassis says that UnMarker successfully removed 79 percent of watermarks." From https://spectrum.ieee.org/ai-watermark-remover



Berkeley National Lab did a great study on this recently [0]. Short answer for what's raised prices over the last 5 years (slide 22 in the linked doc): supply-chain disruption increasing hardware prices, wildfires, and renewable policies (ahem, net metering) that over-reimburse asset owners.

I'd love to be able to point at something that implicates data centers, but first I'd need to see the data. So far, no evidence. Hint: it would show up in bulk system prices not consumer rates, which are dominated by wires costs.

[0] https://eta-publications.lbl.gov/sites/default/files/2025-10...


I can live with the different visual style, but iOS 26 has cost me about 30% of my battery, even running all day on low power mode on an iPhone 14. It's horrendous. It's hard to even get through one day on a charge now.


Yeah, I've never understood this for lithium-ion systems. Maybe some vendors wire the cells in parallel or series differently to get different total max power outputs? But I wouldn't expect that to affect cost either way.

With flow batteries there are definite differences, since the power and energy components of the system can each be scaled independently. I.e., if you need more total energy, just expand the amount of liquid electrolyte storage you have.
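That independent scaling is easy to see in a toy sizing model (all of these densities and sizes are made-up illustration numbers, not real cell data):

```python
# Toy flow-battery sizing model: power comes from the stack, energy from
# the electrolyte tanks, so each can be scaled independently.
# All parameter values below are hypothetical illustration numbers.

def flow_battery_specs(stack_kw_per_m2: float, stack_area_m2: float,
                       kwh_per_m3: float, tank_volume_m3: float):
    """Return (power_kw, energy_kwh, hours_at_full_power)."""
    power_kw = stack_kw_per_m2 * stack_area_m2   # power set by stack size only
    energy_kwh = kwh_per_m3 * tank_volume_m3     # energy set by tank size only
    return power_kw, energy_kwh, energy_kwh / power_kw

# Doubling the tank volume doubles the discharge duration
# without touching the (expensive) stack at all.
base = flow_battery_specs(1.0, 100.0, 25.0, 16.0)         # 100 kW, 400 kWh, 4 h
more_energy = flow_battery_specs(1.0, 100.0, 25.0, 32.0)  # 100 kW, 800 kWh, 8 h
```

With a conventional lithium-ion pack, by contrast, adding a cell adds both power and energy in a fixed ratio, which is why the question above is a fair one.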


Interesting. What LLM model? 4o, o3, 3.5? I had horrible performance with earlier models, but o3 has helped me with health stuff (hearing issues).


Whichever the default free model is right now; I stopped paying for it when Gemini 2.5 came out in Google's AI Studio.

4o, o4? I'm certain it wasn't 3.5

Edit: while logged in


> Whichever the default free model is right now

Sigh. This is a point in favor of not allowing free access to ChatGPT at all, given that people are getting mad at GPT-4o-mini, which is complete garbage for anything remotely complex... and garbage for most other things, too.

Just give 5 free queries of 4o/o3 or whatever and call it good.


If you're logged in, 4o; if you're not logged in, 4o-mini. Neither scores well on the benchmark!


This gets at the UX issue with AI right now. How's a normie supposed to know and understand this nuance?


Or a non-normie. Even while logged in, I had no idea what ChatGPT model it was using, since it doesn't label it. All the label says is "great for everyday tasks".

And as a non-normie, I obviously didn't take its analysis seriously, and compared it to Grok and Gemini 2.5. The latter was the best.


Added context: While logged in


Might be worth trying again with Gemini 2.5. The reasoning models like that one are much better at health questions.


Gemini 2.5 in AI Studio gave by far the best analysis


I can’t believe you’re getting downvoted for answering the question about the next-token-predictor model you can’t recall using.

What is happening?


This is awesome, and as a bonus I learned about a mature reactive notebook for Python. Great stuff.

The data sharing is awesome. I previously used Google Colab to share runnable code with non-dev coworkers, but their file support requires some kludges to get it working decently.

I know I should just RTFM, but are you all working on tools to embed/cross-compile/emulate non-python binaries in here? I know this is not a good approach, but as a researcher I would love to shut down my server infrastructure and just use 3-4 crusty old binaries I rely on directly in the browser.

