Hacker News | new | past | comments | ask | show | jobs | submit | mnorris's comments

disclaimer: I work on a different project in the space but got excited by your comment

DeepSteve (deepsteve.com) has a similar premise: it spawns Claude Code processes and attaches terminals to them in a browser UI, so you can automate coordination in ways a regular terminal can’t: spawning new agents from GitHub issues, coordinating tasks via inter-agent chat, modifying its own UI, and forking terminals.

Re: native vs external orchestration, I think the external layer matters precisely because it doesn’t have to replicate traditional company hierarchies. I’m less interested in “AI org chart” setups like gstack (we don’t have to bring antiquated corporate hierarchies with us) and more in hackable, flat coordination where agents talk to each other via MCP and you decide the topology yourself.


I was intrigued and had a look at deepsteve.com, but I couldn't figure the website out. I'm guessing it won't give you any information about it until you install it?


Thanks for the feedback.

DeepSteve is a Node server that runs on your machine, so the website is designed to look like DeepSteve's UI. You access it at localhost:3000 in your browser, not via deepsteve.com.

But now I can see how that would be confusing.


I ran exiftool on an image I just generated:

    $ exiftool chatgpt_image.png
    ...
    Actions Software Agent Name  : GPT-4o
    Actions Digital Source Type  : http://cv.iptc.org/newscodes/digitalsourcetype/trainedAlgori...
    Name                         : jumbf manifest
    Alg                          : sha256
    Hash                         : (Binary data 32 bytes, use -b option to extract)
    Pad                          : (Binary data 8 bytes, use -b option to extract)
    Claim Generator Info Name    : ChatGPT
    ...


EXIF isn't all that robust, though.
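One reason it isn't robust: this kind of provenance rides along in ancillary metadata that any re-encoder can silently drop. A minimal stdlib sketch of the idea for PNG, where ancillary chunk types start with a lowercase letter (the 1x1 image and its tEXt tag here are invented for illustration, not real C2PA data):

```python
import struct
import zlib

def chunk(ctype: bytes, data: bytes) -> bytes:
    """Assemble one PNG chunk: length, type, data, CRC over type+data."""
    return (struct.pack(">I", len(data)) + ctype + data
            + struct.pack(">I", zlib.crc32(ctype + data)))

# Minimal 1x1 grayscale PNG carrying a tEXt chunk as stand-in provenance.
ihdr = chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
text = chunk(b"tEXt", b"Software\x00GPT-4o")
idat = chunk(b"IDAT", zlib.compress(b"\x00\x00"))  # filter byte + 1 pixel
png = b"\x89PNG\r\n\x1a\n" + ihdr + text + idat + chunk(b"IEND", b"")

def strip_ancillary(data: bytes) -> bytes:
    """Drop every chunk whose type starts lowercase (ancillary, e.g. tEXt)."""
    out, pos = data[:8], 8          # keep the 8-byte PNG signature
    while pos < len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        ctype = data[pos + 4:pos + 8]
        end = pos + 12 + length     # 4 length + 4 type + data + 4 CRC
        if not ctype[:1].islower(): # keep only critical chunks
            out += data[pos:end]
        pos = end
    return out

print(b"GPT-4o" in png)                    # True
print(b"GPT-4o" in strip_ancillary(png))   # False
```

Any tool that re-encodes the pixels, or just copies the critical chunks, loses the tag without warning, which is why metadata alone can't be the whole answer.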

I suppose I'm going to have to bite the bullet and actually train an AI detector that works roughly in real time.


Why food? It's static, and AI-generated 3D models don't make food that I want to eat. With photogrammetry, high-quality reconstructions of real food look tasty, which gives me an easy qualitative metric.

Previously the app only produced 3D models and threw away the original video. Incorporating the underlying videos both shows new users what kind of content they're supposed to record (i.e. a one-second video of a dimly lit pizza box is NOT going to produce good content) and makes the output shareable.


I've been working on Mukbang 3D for the past year and a half—an iOS app that converts food videos into interactive 3D models using photogrammetry. Record a short video of food, and viewers can rotate/zoom/explore it while the video plays.

I recently added pose tracking of the 3D model so I can overlay 3D effects onto the underlying video.

Here's a demo: https://mukba.ng/p?id=29265051-b9c7-400b-b15a-139ca5dfaf7e


This is awesome!

I'd love to boot this up and see how it runs on a Quest headset


Thanks for sharing!

I've been doing this for the last year with YouTube. For me, YouTube is just a search bar now, which prevents mindless browsing and distractions.

If I find myself using a site mindlessly, I add it to my /etc/hosts file to block it.
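For anyone who wants to try it, the entry looks like this. Sketched against a temp file, since editing the real /etc/hosts needs root, and the hostname is just a placeholder:

```shell
# Stand-in for /etc/hosts (the real file needs sudo to edit).
HOSTS=$(mktemp)
printf '127.0.0.1 localhost\n' > "$HOSTS"

# Point the distracting site at a dead address so requests go nowhere.
printf '0.0.0.0 www.example-distraction.com\n' >> "$HOSTS"

grep 'example-distraction' "$HOSTS"
```

On the real file you'd append the same line with `sudo`, and may need to flush the local DNS cache for it to take effect.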


The existence of LLMs has made me feel more alive and better able to live in the moment.

So what if a machine can write a novel? I don't even want to read it.

Human intention is what gives meaning and value to my life.

The things that I want to build need me to make them. Why would I wait for a machine to do it later without my imperfect human perspective?


It sounds like you want adaptive bitrate streaming. This blog post [1] probably does it better justice than I could.

I think it's similar to what you were describing: the lowest-latency solution is to send multiple streams at different bitrates simultaneously (simulcast), and WebRTC picks the best one it can receive.

[1] https://bloggeek.me/webrtc-simulcast-and-abr-two-sides-of-th...
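From my reading of that post, the browser side of simulcast boils down to declaring multiple encodings on a single transceiver. A hedged sketch of the standard WebRTC API (the function and rid names are my own, and it only does anything useful in a browser talking to an SFU that understands simulcast):

```javascript
// Browser-only sketch: the sender offers three encodings of the same
// video track, and the SFU forwards whichever layer each viewer's
// bandwidth allows. The rid labels are arbitrary.
async function setupSimulcast() {
  const pc = new RTCPeerConnection();
  const stream = await navigator.mediaDevices.getUserMedia({ video: true });
  pc.addTransceiver(stream.getVideoTracks()[0], {
    direction: "sendonly",
    sendEncodings: [
      { rid: "q", scaleResolutionDownBy: 4.0, maxBitrate: 150_000 },  // low
      { rid: "h", scaleResolutionDownBy: 2.0, maxBitrate: 500_000 },  // mid
      { rid: "f", maxBitrate: 1_500_000 },                            // full
    ],
  });
  return pc;
}
```

The receiver doesn't choose anything itself; the SFU switches layers per subscriber, which is what keeps the latency low compared to encoder-side adaptation.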


Thank you for being so helpful! That sounds good :) No, I wasn't fishing for super-detailed help or contributions, but thanks for considering it.


Great, glad that was helpful. Good luck!


Thanks for sharing this!

This is a cool game concept and I feel like it compressed a lot of geometry intuition into a short period of time. I have a math degree but managed to never take a geometry class in college or high school, so this was the first time I've had my (non-existent) knowledge of geometry "graded."

I hope more games like this can be incorporated into the formal educational process in the future; I feel like my childhood video game addiction could have been exploited by the education system just as much as the gaming companies, but with a better outcome.

Maybe the same type of game could be made for other subjects, too.

I'd like to see the concept extended into 3D with augmented reality and a limited set of construction tools. Maybe I'll try to do that if I get the time.

Also, I just realized that I only played the tutorial! There goes my morning.


How much prep have you done?

People have a wide range of reactions to FAANG interviews. One of my friends has panic attacks when thinking about interviews. I actually find LeetCode questions to be a fun way to spend an evening, and I like being interviewed, but I know I'm in the minority.

I consider myself a deep thinker too. I used to be slower at solving interview questions, and I got faster over time.

The key to these problems is not memorization. I haven't memorized binary search; that's not why I'm fast. But I know the concepts and can reproduce the algorithm at will, maybe with a bug or two that I can iron out while walking through it.
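To make that concrete, here's binary search rebuilt from its invariant rather than from memory (a quick sketch, not claiming any canonical version):

```python
def binary_search(xs, target):
    """Return an index of target in sorted xs, or -1 if absent.

    Invariant: if target is present, it lies in xs[lo:hi].
    """
    lo, hi = 0, len(xs)
    while lo < hi:
        mid = (lo + hi) // 2
        if xs[mid] == target:
            return mid
        if xs[mid] < target:
            lo = mid + 1   # target can only be to the right of mid
        else:
            hi = mid       # target can only be to the left of mid
    return -1

print(binary_search([1, 3, 5, 7, 9], 7))   # 3
print(binary_search([1, 3, 5, 7, 9], 4))   # -1
```

If you hold on to the invariant, the off-by-one details (half-open bounds, `mid + 1`) fall out on their own, which is what "reproducing, not memorizing" means in practice.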

Solving problems in 40 minutes is actually the very last step in the learning process. So not being able to do that just means you haven't completed your training yet.

The process of practicing different problems over and over helps you see the patterns across different types of problems. Similar problems will have similar code structure, data structures, and algorithmic choices.

If I hear a new song on the radio, I can guess which band it is before the singing starts, because bands often have the same style of songs. It's the same when I see a coding problem and choose how to solve it: an impulse learned from training, nothing to do with being a quick thinker.

Some people on Blind mention doing hundreds of LeetCode questions before their FAANG interviews. Some still fail. I've failed plenty.

