johncoogan's comments

Just shared a google sheet with you.


Do you mind sharing it with me as well?


Would you mind posting it?


Highly recommend the Space Engine subreddit: https://www.reddit.com/r/spaceengine/


I wonder how long it will take them to build a version that actually uses the on-screen images instead of all the data from the Bot API. It seems like it would be a lot harder once you add an image-processing step and restrict access to information from across the map unless the bot actually scrolls over there.


The point isn't to see if the model can process images. OpenAI's goal is to see if they can recreate the ability to plan and strategize in a partial-information, continuous, long-time-horizon environment.

You wouldn't want AlphaGo to have to input its commands using robotic hands, right? It's the same thing here: sure, it might be interesting, but that isn't what we care about. Image processing and robotics controls are largely solved. Showcasing that a model can gain the ability to plan and think is the novel stuff here, and is the path along which "artificial intelligence", if any, appears. That's the ultimate goal in playing any of these games.


>> Image processing and robotics controls are largely solved

Image processing and robotic control are very far from being solved problems. I guess you are saying that in the case of AlphaGo it would not be a super difficult step to have a camera and a robotic hand physically move pieces around, and that's probably true. But I think in the DOTA case there are new image-processing challenges that interact with the AI in interesting ways.

I'm mostly talking about the need to move the game's camera around to gain more information. Say you don't see your ally on your screen and need to see how they are handling a gank or something (full disclosure: I don't play DOTA at all, so this could be a silly scenario). The AI would then have to recognize this and move the camera to the ally's location in order to gain that information. So really the novelty here would be in the network somehow realizing what information it needs, and then further learning how to gather that information. I honestly think that sounds like an extremely difficult next step.


Human beings are restricted to what's on the screen, which includes not only the camera's perspective of the playfield but also, crucially, the minimap in the corner of the screen. Plus there's weird stuff like how you don't have perfect information about your teammates' health/mana unless you hold down Alt... so yeah, OpenAI is "cheating" somewhat, and it would be really cool to see, once it evolves further, restrictions that let it better mimic human player capabilities.

That said, everything they've done so far is absolutely incredible (especially now that the AI can draft!!)


This is something I noticed: a human initiator would get counter-initiated by OpenAI almost instantly, every single time, making the blink dagger much less effective. Pro humans do this too, but not every single time with perfect timing.

Humans don't concentrate on the whole screen; attention is directed...


It would be interesting to take this project up a few levels later on and see how it compares to direct API interaction.

I would love to see a camera/mechanical interface like the one mentioned by others. Similarly, as you said, humans don't focus on the whole screen. I would love to see how well the AI could perform if it were given something like blinders, where only a small portion of the screen is in focus at any one time, much like how human eyes work.


I believe human counter-initiation is only frame-perfect when the human is anticipating the initiation to happen (baiting).

Otherwise, you still have to add reaction time plus mouse travel time.


There is actually an interesting paper on that, you can find it on the youtube channel two minute papers.

Basically, they let an AI loose on a simplified version of Quake's Capture the Flag. The AI processes the game's video output only and has learned several key strategies. The latest update has the AI at a 71% win rate against top humans. Unlike the DOTA match, the AI has no restricted reaction time.

The AI seems to be jittering the camera left and right to reliably reconstruct a 3D image from the screen, which is quite an interesting way to compensate for the lack of 3D vision (and for the compensation our brain performs naturally to get a 3D intuition from a 2D image).


> OpenAI's goal is to see if they can recreate the ability to plan and strategize over a partial information continuous long time horizon environment.

If this were actually the goal, they would add further mechanical restrictions beyond the 200ms delay to simulate the way human players play. That way, human and AI would be on a roughly even mechanical playing field, leaving strategy/tactics as the only differentiating factor.

As it stands, it looks like their victory is as much based on raw mechanical superiority as it is strategic/tactical superiority. Computers being able to be pixel-perfect accurate at all times, issue commands at ludicrous speeds, etc. is kind of an uninteresting advantage in the context of building strategic AI.
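As a rough illustration, the kind of mechanical handicap being discussed could be imposed by buffering the bot's view of the game so it can only ever react to state that is already a human-reaction-time old. This is a hypothetical sketch, not how OpenAI actually implements its 200ms delay; the class and tick numbers are invented:

```python
from collections import deque

class DelayedPolicy:
    """Wrap a bot policy so it only sees game state that is
    `delay_ticks` old, roughly simulating human reaction time.
    (Illustrative only; OpenAI's real system is not built this way.)"""

    def __init__(self, policy, delay_ticks):
        self.policy = policy
        self.delay_ticks = delay_ticks
        self.buffer = deque()

    def act(self, current_state):
        self.buffer.append(current_state)
        if len(self.buffer) <= self.delay_ticks:
            return None  # nothing old enough to react to yet
        # React to the oldest buffered state, not the current one.
        return self.policy(self.buffer.popleft())

# At 30 ticks per second, a 6-tick buffer is roughly 200 ms of forced lag.
bot = DelayedPolicy(policy=lambda s: f"react to {s}", delay_ticks=6)
```

Further restrictions (mouse travel time, one action per tick, etc.) could be layered on the same way.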


What, in your opinion, would be a sufficient delay? 200 milliseconds is well within the average reaction time of a human, and at the pro level I'm betting it's much lower than that.


That doesn't account for focusing on other things. There are a multitude of things to take into account while playing dota that can pull your attention; you can't always directly focus on your character in expectation of a blink initiation.


But that's moving the goalposts: why is attention a factor here at all?


Because a ~200ms reaction time isn't exactly accurate when you compare focusing on one action versus focusing on many things at once. Human reaction time is going to be delayed (unless they happen to be expecting the action at that particular moment), but for bots that delay doesn't happen.

So this current AI is uninteresting because bots can always instantaneously begin to react to any feedback, whereas humans have to pan and drag the camera around just to look at different feedback in the first place, let alone react to it. Mechanically, humans also have to move the mouse all over the place and think of key combinations, in addition to reacting, not just clicking a static box on cue.

It -would- be interesting if bots were limited just like humans to the camera view, -not- an API that continuously feeds them information. The bot would then have to learn how to prioritize working the camera, and it would be limited to only what the camera sees, etc.
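A camera-limited observation space like the one described here could be approximated by filtering the full game state down to a viewport rectangle before the bot ever sees it. A hypothetical sketch, with unit and field names invented for illustration (the real Dota 2 camera and API look nothing like this):

```python
def camera_view(units, cam_x, cam_y, half_w, half_h):
    """Return only the units inside a rectangular viewport centered
    at (cam_x, cam_y). Illustrative stand-in for a camera-limited
    observation space; not the actual Bot API."""
    return [
        u for u in units
        if abs(u["x"] - cam_x) <= half_w and abs(u["y"] - cam_y) <= half_h
    ]

units = [
    {"name": "ally", "x": 100, "y": 100},
    {"name": "enemy", "x": 900, "y": 900},
]
visible = camera_view(units, cam_x=120, cam_y=110, half_w=200, half_h=150)
# Only "ally" is in view; the bot would have to move the camera
# (and spend actions doing so) to learn anything about "enemy".
```

The interesting research problem is then learning a camera-control policy on top of the game policy.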


I think the AI isn't winning on superior speed and reaction time alone (though those are indeed a factor).

Computers already have perfect memory and recall, so when the image recognition tech becomes good enough to only rely on the visual input, are you then going to say the bot must now limit its recall to "human" levels?


No, but that at least would be interesting, since it would be playing using the same mechanics as a human, and with the same limitations (of the camera, etc). Not using an API.


Reaction time is usually measured as time from stimulus to a button press.

For many things in Dota you also need to move the mouse cursor to a specific point on the screen which obviously takes longer than just pressing a button.


Per the architecture, the model does use a CNN to process minimap data.
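For intuition, the minimap is just a small 2D grid, and a convolution slides a kernel over it to produce spatial features. A toy, dependency-free sketch of one such layer (the actual OpenAI Five architecture is far larger and is not reproduced here):

```python
def conv2d(grid, kernel):
    """Naive single-channel 2D convolution with valid padding.
    A toy stand-in for a minimap-processing CNN layer."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(grid) - kh + 1):
        row = []
        for j in range(len(grid[0]) - kw + 1):
            row.append(sum(grid[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out

# 4x4 "minimap" of unit presence, scanned with a 2x2 summing kernel.
minimap = [[0, 1, 0, 0],
           [0, 1, 0, 0],
           [0, 0, 0, 1],
           [0, 0, 0, 1]]
features = conv2d(minimap, [[1, 1], [1, 1]])
# Each output cell counts the units in a 2x2 patch of the minimap.
```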


Yes but the comment you're replying to is saying that it would be interesting to only rely on information from visual input on the screen, rather than on getting (for example) the absolute XY position of every player as a direct input to the network.


It's like writing a bot that plays chess using a video feed from a camera. Yes, it's doable and an interesting problem on its own, but completely unrelated to what OpenAI is doing.


Chess is a poor example because it's a turn-based game whereas Dota is real-time and incredibly fast-paced. Visually parsing a chessboard to know which piece is which is trivial, and also you have all the time in the world to do it (between turns). In Dota things happen so quickly, particle effects pop off all over the place, and through all of it, you have to constantly manually re-place your camera in the optimal position.

In chess, both players always have perfect information about the game state, and that is far from the case in Dota. OpenAI does account for fog of war, so it's not COMPLETELY omniscient, but it is still more omniscient than human players can ever be without having to fiddle with the camera, etc.


Yes, I see what you are saying with the chess example, but I think in this case adding the visual layer actually adds interesting problems related to what OpenAI is trying to do. See my reply to lawrenceyan above.


According to the Q&A at the end of the event, one of the main obstacles to using the regular game's rendered output instead of the bot API is that self-play would become prohibitively expensive. The AI plays thousands and thousands of games every day, and you would need an enormous amount of GPU resources to render all of them.


That's a bullshit argument. You can separate the CV portion from the analysis portion and just train the analysis portion by giving it realistically limited information.


That's just equal footing, I say.


To be pedantic, equal footing here would be about GPU resources needed to interpret the images. The point I mentioned was about the GPU resources needed to render them to the screen in the first place.


I'd say the technology for building a bot with computer vision is already here or close to it. So building a working one would not take long.

The problem, as they stated in the Q&A, is that the image processing would take much more hardware and computing power, thus increasing both cost and training time, as you would no longer be able to run games as quickly.


Thanks for building this! I was wondering about this a few months ago, glad it’s finally here. This is such a good example. https://twitter.com/johncoogan/status/890780575480430593


Ah there you go! We had the same thought a few weeks ago.


You don't need to work 24/7. At most jobs, you just need to seem like you work 24/7. Batch up all your emails for the day, then space them out to be sent at odd hours like 11:30pm and 5:45am using something like Boomerang for Gmail.[1] Everyone will think you're working constantly.

[1] http://www.boomeranggmail.com/


I work at a company that's extremely supportive of good work life balance. We offer lots of flexibility on hours, facilitate working from home, and encourage people to work at a sustainable pace.

That didn't stop one fellow, many years ago, from trying exactly this. He'd commit code that was simply reindentation/style changes, send emails, etc, late at night, in an attempt to paper over the fact that he was simply a weak contributor that produced low quality code.

Everyone saw straight through him and recognized the absurdity for what it was.

Maintain work-life balance. If the company you work for doesn't support that choice, work somewhere else. It is an absolute lie that "most jobs" need you to "seem like you work 24/7", and perpetuating that lie only makes it easier for the bad actors to justify their behaviour.


Completely agree. Was joking in my original comment. Obviously having a healthy lifestyle will lead to increased productivity. Fetishizing workaholism is dumb and a big part of why Wall Street is struggling to retain talent these days.


Honestly, I figured that was the case, but, as my experience demonstrates, some people might actually take that suggestion seriously!


Please don't do this. It just makes everyone else feel like they have to work 24/7. It's very, very toxic.


You mean I don't have to be stupid, there's an app that just makes me look like I'm stupid! Praise the tech gods!


I've said it before and I'll say it again, Robertson Dean is the greatest audiobook narrator of all time. http://www.audiofilemagazine.com/narrators/robertson-dean/


He's great, we know his work. Thanks for the recommendation!


You, sir, are my people.


The Power Broker, am I right?



>Coffiest also includes 75 mg of L-theanine per bottle to promote relaxation without drowsiness and works in concert with caffeine to boost cognitive performance.


All recorded calls on UberConference begin with an automated message: "This call is being recorded"


Doctors record your health info too.


Soylent | Multiple Roles | Los Angeles, CA | On-site | Full-time | Visa / Immigration support | Python, Django, Postgres, Javascript, Business Intelligence.

Website: http://soylent.com | Open positions: http://jobs.soylent.com

About us: Soylent is a simple, nutritious, and affordable food that possesses all the essential ingredients a body needs to be healthy.

