Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Solving Path of Exile Item Crafting with Reinforcement Learning (dennybritz.com)
87 points by dennybritz on July 14, 2024 | hide | past | favorite | 19 comments


Wow. Absolutely incredible. I have a truly shameful amount of time playing PoE, crafting by jumping between craftofexile, poedb, spreadsheets, etc. I was always curious if something like this was possible and you’ve gone and done it. Fantastic read as well, thank you for sharing.


Thank you Exile :) I'm glad you enjoyed the read!


Is the code available by chance?

See you on the shore next Friday :)


I do want to release the code, but it may take a couple more weeks. It's currently a mess of research code that's mixed with some other stuff I was working on so I would need to do a cleanup first. See you on the boat!


Wait a couple weeks until I'm done with the league before you destroy the economy.


This is awesome! I don’t really play PoE anymore but was addicted to the game back in open beta/v1 launch. I still have my chaos orb key chain I got from my donor pack, unfortunately I lost my shirt somewhere. It’s to this day imo the best ARPG out there. I love seeing RL applied to this game because it is seriously complex! Nicely done and would love to see the full code at some point!


Great read, I followed it the best I could with my vast poe knowledge and zero knowledge about the techniques you used. I got the gist of it I think, well done. Love that poe attracts nerds from many domains and people come up with amazing stuff while impatiently waiting for the next season :)

Stay sane, Exile.


Chess also has a huge search space and allows cycles, still, wouldn't a better approach be refining a scoring function and then doing a deep search and suggest an action and min maxes your odds of victory?


I think there are some key differences to cycles in Chess. In Chess, the cycles come from your (or your opponents) actions. With minimax you would max (or min) over these actions, so if you pruned away the cycles at some depth your result is not incorrect because the min or max does not change. But here the cycles come from randomness in the game of which chess has none, and you need to take an expectation over the cycles. So you can't just prune them away without changing the result.

A scoring function could definitely help guide the exploration and/or prune the tree, but only at the action nodes, not the environment randomness nodes. Rolling out the full tree more than 1-2 levels would be infeasible because of the randomness in the environment. When you take an action, the randomness can transport you into an exponential number of states, so you have a huge branching factor that is much larger than chess. I think in chess you have a factor of 40ish? Here it's more like ~1000 or ~10,000 depending on the item.

I also wouldn't know how to design a scoring function for this. If you do something simple to take the number of missing modifiers you will end up stuck in bad states. Maybe there is something really clever here that you can do, but I don't know what it is.

If you have an idea how to make tree search work for this I'd love to try it.


I think the heuristic approach is not as bad as you think. Human crafters, and the current crafting techniques we have are pretty efficient. You don't need to explore the full graph of random states because the mod weights themselves are a representation of those states. As a human I can't think of any crafting item (alt and chaos spamming aside) where the probability isn't a simple to understand number. You think you can use something like modweight x currency cost (x some time modifier for acquisition and use of currency) for scoring. This is how craftofexile does it.


The more I think about this the more I feel you went really overkill on this. The complexity is several orderers of magnitude lower than what you claim. Like try crafting stuff on the emulator and see how easily you can narrow it down to usually 2 or 3 choices at most steps. There should be really easy heuristics to invalidate most branches. I've also used the COE simulator a ton and never had to model something that's on the order of 50+ states even.


Just a remark about the cycles not changing the min-max. That's not correct. Identical positions do not always have the same value. Sometimes, a reaching the same position again will allow a player to claim a draw (3 times repetition). Sometimes a repeated position will bring them closer to some other draw mechanism (fe 50 moves without a capture or pawn move).


However, reading this, It seems like a good fit to try a genetic algorithm. I know, GAs are regarded as a complete backwater in AI these days, but nature has shown that they work in these kind of situations.


It’s often associated in PoE that friction is what usually drives profits. Using a recent meta example, Graveyard crafting was extremely tedious while simultaneously being extremely lucrative. I am quite interested then to see what the implications on the economy are if adopted mainstream by the masses. People nowadays would rather pay 3x the prices of the craft rather than learning how to do it themselves, but if there was a convenient calculator that could derive the optimal algorithm I suspect we will see a different behavior emerge.


I think most experienced players are already pretty optimal at crafting. There might be some rare case where it might save you a couple div at best. It's usually obvious by design how to achieve something. Of course there is a market because some part of the playerbase can't do this (which might become lower) and some part of the playerbase still sees it as time efficient to just buy it precrafted. Likely there's less incentive to craft in bulk, large bulk currency trades become more expensive, making the crafted items more expensive for the unkowing new player who doesn't use this tool. Personally, I don't think it would change the economy that much.


Agreed, and I think my point around adoption by the masses was made a little too lightly - as you mentioned there will always be a part of the player base who prefer buying and playing the game rather than spending time crafting (regardless of how trivial the craft is).


It gets more interesting if you optimise for time instead of cost. If it takes you 15 minutes to craft something using 80c of currency, but you can instead run maps and average 300c an hour, then you still might prefer paying 100c for the guaranteed outcome (buying it already made) than going for "average" cost of 80c and having to craft it yourself.

Do it again, but optimise for fun instead of time and we've come full-circle :)


> does something like the following pseudocode:

The following code is Python. Hilarious!


I used to not like Python, but i gave it another try thanks to Sublime Text's plugin api, and i now love it, it allows you to be very creative very fast, perfect for prototyping

Pseudo code -> working example, same thing lol




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: