Ex Valve dev on CS:GO’s codebase (twitter.com/richgel999)
166 points by hnthwaccount on Nov 23, 2020 | hide | past | favorite | 121 comments



One of the best, and first, things we did when starting our machine learning platform was to design it using a plugin architecture. There's a lot of scar tissue from the previous ML products we built for enterprise.

Namely, it was extremely hard to onboard new developers to work on the product. They had to understand the whole thing in order to contribute.

Changing anything was also hard, since everything was intertwined: adding or removing a feature was painful. Especially given that we were not designing to a spec, and the domain experts we were building for could not give feedback. That constraint was outside of our control, so we were building for users we never met, based on what we thought would make sense.

The first commits on our own platform were to establish a plugin architecture. There's a core, and there are plugins. We could add or remove functionality by changing a config file. Applications are plugins, and onboarding is easy and smooth, since a junior developer can start working on one plugin and then expand their knowledge.
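A minimal sketch of that split (all class, plugin, and config names here are hypothetical, not the actual platform's code): the core only knows a plugin interface and a config file, so enabling or disabling a feature is a one-line config change.

```python
class Plugin:
    """Interface every plugin must implement (the "receptor")."""
    name = "base"

    def run(self, payload):
        raise NotImplementedError

class NotebookPlugin(Plugin):
    name = "notebook"

    def run(self, payload):
        return f"notebook handled {payload}"

class StoragePlugin(Plugin):
    name = "storage"

    def run(self, payload):
        return f"storage handled {payload}"

# All known plugins, keyed by name.
REGISTRY = {p.name: p for p in (NotebookPlugin, StoragePlugin)}

# Stands in for the config file.
CONFIG = {"enabled_plugins": ["notebook"]}

def load_plugins(config):
    # The core instantiates only what the config enables; removing a
    # feature is a config change, not a code change.
    return [REGISTRY[name]() for name in config["enabled_plugins"]]

plugins = load_plugins(CONFIG)
print([p.name for p in plugins])  # → ['notebook']
```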

We're reaping the rewards of that.


A way to tackle this further is to use an analytics tool like CodeScene. You link it with your repo and it shows you which people have knowledge of which parts of the code[0].

Then you can identify parts that share too few contributors and encourage people to work on them together. You might also find parts where the knowledge is already lost, since everyone who worked on it already left the company. In that case some team can volunteer to take ownership of the code and take time to make some sense of it.

Of course, easier said than done.

[0] https://codescene.io/projects/167/jobs/55946/results/social/...


That looks kind of useful. It would be nice if it handled Perforce and could be installed on private servers.

Don't think the company would want to release all the code publicly.


Are you sure it couldn't be installed locally?

My client is a bank with private GitHub servers, so I assume their codescene is self hosted as well.

edit: Yep, there is an on-premise option: https://codescene.com/pricing/


Yes, there's an on-prem version of CodeScene that can be run on private servers. The latest release is described here: https://codescene.com/blog/architectural-analyses-simplified...

To analyse Perforce, you set up an automated conversion to a read-only Git repo that you then point the analysis to.


I fully support those who want to use these tools; just don't lean on them as a crutch. You can still write hard-to-work-with code and know exactly who the SME is. That doesn't make the code any less difficult to work with, or any less error-prone to change. That can only be solved by design and enforced through discipline.


Fully agree! That's why I recommend monitoring code health trends continuously -- the earlier we can catch any potential issues, the better: https://codescene.io/projects/167/jobs/55946/results/warning...

We have these code health trends supervised as part of our pull requests and use them as (soft) quality gates. [1]

[1] https://codescene.com/blog/measure-code-health-of-your-codeb...


Absolutely. This only helps to nerf the knowledge loss from teammates leaving the project, nothing more.


And nerf knowledge loss from ourselves after a couple of [weeks/months/years].


Definitely; the biggest challenge in software development is keeping things manageable. I think this is why microservices are so popular; the individual services are manageable. But that's a misconception, because the overall architecture becomes a monstrosity, especially if there is no oversight (and in my experience with the pattern, there wasn't).


I think microservices are and aren't an overreaction.

If you think of microservices as contract enforcement, it should be harder to produce unintended consequences at the macro level, because everything should flow through the API (assuming you don't have something weird like several microservices manipulating the same data sources directly). Architecturally, it's easier to understand code/data flows than in an equivalent monolith that hasn't properly enforced modularity.
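To make that contract-enforcement point concrete, here's a minimal sketch (both services and all names are hypothetical): the orders service reaches user data only through the users service's public call, never its datastore.

```python
class UsersService:
    def __init__(self):
        # Private datastore; no other service may touch it directly.
        self._db = {"alice": {"email": "alice@example.com"}}

    def get_email(self, user_id):
        # The contract: the only way other services see user data.
        return self._db[user_id]["email"]

class OrdersService:
    def __init__(self, users_api):
        # Depends on the API, not on the underlying data source.
        self.users = users_api

    def receipt_address(self, user_id):
        return self.users.get_email(user_id)

orders = OrdersService(UsersService())
print(orders.receipt_address("alice"))  # → alice@example.com
```

If the users service later changes its storage, the orders service is unaffected, which is the macro-level safety the comment describes.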

The main problem is that most folks rushing headfirst into the silver bullet don't understand that everything has trade-offs, and in microservice land those are versioning, testing, deploying, and monitoring.

Nowadays, though, the tooling is pretty good. If you put your microservices in a monorepo (seems backwards, I know) you can avoid the versioning, testing, and deployment difficulties, and with GKE + Istio you've got tooling to help handle the ops problems. So maybe enforcing code quality is actually the harder problem, and limiting size and scope does sorta make sense.


Are there any drawbacks to using a plugin-centric approach? Typically there is a loss of expressiveness in code, a loss of performance, or a disconnect between core and plugin development.


> Are there any drawbacks of using a plugin centric approach?

Here are some:

1. You're perhaps more subject to Hyrum's law. If plugin devs can see it, they will use it. The general observation here is that it's harder to control the visible interfaces and implicit dependencies you export than the dependencies and interfaces you rely on. As one example, semantic versioning doesn't cater for this at all. Plus, most of the practical knowledge in software is about managing relied-upon dependencies.

2. Dog follows tail. It can happen that a plugin becomes so successful the overall system evolution slows down. The core system can upgrade, but adoption/deployment can be constrained when a particularly valuable plugin doesn't move up to the latest and the customer base sees more value in the plugin than its core platform. This can compound poorly over time, and in extreme cases the desirable plugin can become its own platform/system (something I think business savvy tech leaders are increasingly aware, and wary, of).

3. Operational complexity. It can be harder to run and maintain a plugin-based system than a closed one. Point 2 is a consideration here, but so are other concerns, such as security and resource isolation. Strategies vary, but who pays this cost on a relative basis is one of the more (and perhaps the most) interesting aspects of working on or using plugin systems. As one example of this, think about allocating responsibility for bugs.

4. R&D complexity. It may take more time to design and build a plugin system than a closed one. Incrementally evolving to a plugin system can be difficult if you didn't start there to begin with. So you usually need a clear opening motivation to delay reward (or avoid over-engineering) to invest in a system design where functionality can be extended by non-core developers.
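One common mitigation for the Hyrum's-law point (a sketch with hypothetical names, not any particular system's design): hand plugins an explicit, versioned facade rather than the core itself, so internals stay invisible and free to change.

```python
class Core:
    # Dozens of internal helpers live here; plugins must not see them.
    def logger(self, msg):
        return f"[core] {msg}"

class PluginAPI:
    """The only object handed to plugins; everything else is internal."""
    API_VERSION = (1, 0)  # plugins can check compatibility explicitly

    def __init__(self, core):
        self._core = core  # hidden behind the facade

    def log(self, msg):
        # A deliberately small, stable surface.
        return self._core.logger(msg)

api = PluginAPI(Core())
print(api.log("plugin loaded"))  # → [core] plugin loaded
```

The facade doesn't prevent determined misuse, but it makes the exported surface explicit, which is most of the battle.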


Not that I could see at this point. It is not a panacea, but for us, it was a way to contain scope at different levels.

For example, we have the platform and it has icons on the sidebar for Notebook, Object Storage, etc.

Every single one of these is a separate application and a separate repository. These applications are independent in how they deal with business logic, so there's no loss of expressiveness. They just have to present certain "receptors", or an interface, if they want to be plugged into the system. "Interface" is a big word here: someone can produce a valid minimal plugin (that does nothing except be loaded) in two minutes.
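The "receptor" really can be that small. Here's a hedged sketch (the `register()` hook and file names are made up for illustration): the core discovers plugin modules on disk and requires nothing but one callable.

```python
import importlib.util
import pathlib
import tempfile

def load_plugin(path):
    # Load the module from its file and call the one required hook.
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.register()

# A do-nothing-but-load plugin really is this short:
MINIMAL = 'def register():\n    return {"name": "hello", "hooks": {}}\n'

with tempfile.TemporaryDirectory() as d:
    plugin_file = pathlib.Path(d, "hello_plugin.py")
    plugin_file.write_text(MINIMAL)
    loaded = load_plugin(plugin_file)

print(loaded)  # → {'name': 'hello', 'hooks': {}}
```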

This allows us to contain the details of a plugin to the plugin itself, and not have them leak into other parts of the product. If we want to activate/deactivate a plugin, it takes less than 10 seconds manually.

Now, sometimes a plugin depends on another plugin. But they make their requests to that plugin, and fall back to something else in case that plugin is unavailable.
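A sketch of that fallback pattern (all names are illustrative): the caller looks the dependency up by name, and degrades gracefully when it's disabled.

```python
def export_report(registry, data):
    storage = registry.get("object_storage")
    if storage is not None:
        # The dependency is available; route the request through it.
        return storage(data)
    # Fall back to something else when the plugin is unavailable.
    return f"saved {data} locally"

# With the storage plugin disabled, the fallback kicks in:
print(export_report({}, "report.csv"))  # → saved report.csv locally

# With it enabled, the request goes to the plugin:
enabled = {"object_storage": lambda data: f"uploaded {data}"}
print(export_report(enabled, "report.csv"))  # → uploaded report.csv
```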

The amount of engineering time this has saved us is delightful. I think of all the code we did not have to write and it makes me smile.

That's for containment and encapsulation at the application level. But we also follow that mode at the functionality level, too. For example, model detection and tracking is done by plugins.

We like to have an abstraction so that we can churn out functionality for similar things "industrially", without thinking too much, but also so we can remove things easily without breaking the rest. Making code not just easy to add but easy to remove is important. When we did that, we were able to remove a lot of code, too.

It is a spectrum, and we started by using it to contain at the "app" level.


That’s a rosy picture, but I’d point out that atrocities such as Eclipse Rich Client platform and associated OSGi specs are where this plugin concept can lead and it has its own problems! Complexity and discoverability are two!


It's not a panacea, but as a guiding thought it has served us well. We don't go all in on things "just because", and we develop what we need as opposed to what we imagine we need, and stop at an abstraction level that gets the job done.


Thank you, I appreciate the detailed answer.


you need to have someone who is very comfortable saying "no" to be in charge of maintaining the interface. otherwise, and especially if the plugin devs have access to the core code, they will say stuff like "hey, I see you have a very convenient function in the core, can you expose it for my plugin?". once you open the door to this, your encapsulation suffers death by a thousand cuts. you can end up with a de facto monolithic codebase that also has a complicated plugin interface that doesn't really encapsulate anything.


> you need to have someone who is very comfortable saying "no" to be in charge of maintaining the interface.

The end result of which will be that the desired functionality gets hacked in within the plugin somehow, or is not available at all.

The problem with such a plugin-based architecture is that it relies on a well-designed interface. The person who designs the interface needs to have a very good idea of how it will be used in the future, which is a difficult, often impossible, thing to do.

When business requirements change, you then face a difficult dilemma: just insist on "no", introduce a minimal hack, or redesign the interfaces to support the use case in a clean way (possibly a big task), etc.

> your encapsulation suffers death by a thousand cuts. you can end up with a de facto monolithic codebase that also has a complicated plugin interface that doesn't really encapsulate anything.

Yes, the worst outcome of all. In reality, plugin-based architecture is no silver bullet. It can be very counterproductive, especially when you're figuring out what you actually want to build as you build it.


> When business requirements change, you then have the difficult dilemma - just insist on "no", introduce a minimal hack, redesign interfaces to support the use case in a clean way (possibly big task) etc.

one thing I will add is that every new feature does not have to be a plugin just because you have a plugin interface. "implement it directly in the core" is a perfectly valid fourth choice. some things just aren't suited to a plugin implementation.


>one thing I will add is that every new feature does not have to be a plugin just because you have a plugin interface. "implement it directly in the core" is a perfectly valid fourth choice. some things just aren't suited to a plugin implementation.

Yes. Quoting my answer to your reply's parent:

""" It depends on the scope of the functionality. For example, right now, authentication and token generation are in the core, but it's okay right now because authentication spans across the whole product.

We eventually will extract it out, so we could use it as a component in another product, but for now, it's not inappropriate to leave it in the core. """


>Of which the end result will be that the desired functionality will be somehow hacked within the plugin or will not be available at all.

It depends on the scope of the functionality. For example, right now, authentication and token generation are in the core, but it's okay right now because authentication spans across the whole product.

We eventually will extract it out, so we could use it as a component in another product, but for now, it's not inappropriate to leave it in the core.

>When business requirements change, you then have the difficult dilemma - just insist on "no", introduce a minimal hack, redesign interfaces to support the use case in a clean way (possibly big task) etc.

Some days are easier than others.

>Yes, worst outcome of all. In reality, plugin based architecture is no silver bullet. It can be very counter productive, especially when you're figuring out what you actually want to build, as you build it.

That's why I talked about scope and abstraction level. We tend to make the few, loose assumptions that save us a lot of legwork automatically. We won't make further assumptions just for a 1% advantage. And if we do, there's a fallback. For example, we say that a plugin has a certain structure and expects, say, an icon file. If it's not there, it's not there: the plugin is loaded but just not displayed. We issue a warning, in case it was a mistake, but the application does not break.
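That loose-spec loading can be sketched like this (the plugin dicts and field names are invented for illustration): a missing optional asset downgrades the plugin to hidden instead of breaking the app.

```python
import warnings

def load_sidebar(plugins):
    visible = []
    for plugin in plugins:
        if plugin.get("icon") is None:
            # Still loaded elsewhere, just not displayed; warn in
            # case the missing icon was a mistake.
            warnings.warn(f"plugin {plugin['name']} has no icon; hiding it")
            continue
        visible.append(plugin["name"])
    return visible

plugins = [
    {"name": "notebook", "icon": "nb.svg"},
    {"name": "tracking", "icon": None},  # forgot the icon
]
print(load_sidebar(plugins))  # → ['notebook']
```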

Very few and loose "specs" that one can go through quickly and easily without looking at a checklist or something.

Again, it's not a panacea. The underlying assumption in what I wrote is that neither I nor the reader believe in silver bullets. It's not a dichotomy. The question of course is not whether a plugin architecture solves all problems and makes bacon or solves nothing, and I may have been unclear in my message. My point was that it's one of the most useful things we have done because it reduced the amount of work we had to do. We still wake up and build product.


The core pretty much does nothing except load the plugins, plus a few functions that we will extract into their own components. This is what we did for the other parts.

One reason we did this was because we built custom, turn-key, ML products for large enterprise. Complete applications, from data acquisition and model training to "the JavaScript", admin interface, user management, etc.

Now... these large enterprise clients were in one sector. We could hardly sell the product to other, similar clients because we couldn't just pick and choose which components or features to put on a skeleton.

It took us a lot of time, because these projects were both "software engineering" and "machine learning". In other words, we were toast. The worst of both worlds, as we were doing complete applications that even allowed their people to train models themselves.

It took a toll on morale. At some point we were working on eight different projects with different codebases and subsets of the team. We were fed up with this. We wanted to do things differently. We wanted to get the time it took to ship a project as close as possible to the time it took to train models, which we historically did rapidly. It was all the rest that took time.

Total time = time to define problem + time to get data + time to produce models + time to write application + a big hairy epsilon

We wanted to bring "Total time" to its irreducible form. We didn't want to keep writing different applications for clients: we knew how to do it, but we had done it enough times for several clients to notice patterns we wanted to extract into components. We were also losing time with the ML project lifecycle (experiment tracking, model management, collaboration, etc.). We didn't want to keep asking "Which model is deployed again? What data produced that model?", or hearing "I tried your notebook on my machine, it doesn't work!" and "Hey, Jugurtha... Can you deploy my model?" answered with "I'm busy right now. I'll do it as soon as possible."

So we started building our ML platform[0] to remove as much overhead as possible, while being flexible. For example, one of our design goals is that everything one can do on the web app, they should be able to do with an API call.

- [0]: https://iko.ai


Could you point me to any open source projects/references you've used for your platform that showcases this plugin architecture? I'm a junior developer and I'd really like to learn and incorporate this in my projects.


Sure, here you go. That'll get you started. We didn't adopt any of that or a specific solution/library/way, but they were good resources to think about the matter.

That's from our internal wiki, and "Plugin Architecture" was the first entry in that, just to tell you about how important it was.

Also, given that you describe yourself as a "junior developer", here's a reply to an Ask HN about "How to become a senior developer".

https://news.ycombinator.com/item?id=25025253

Plugin Architecture:

Dynamic Code Patterns: Extending Your Application with Plugins - Doug Hellmann

Link: https://www.youtube.com/watch?v=7K72DPDOhWo

Description: Doug Hellmann talks about Stevedore, a library that provides classes for implementing common patterns for using dynamically loaded extensions

PluginBase - Armin Ronacher:

Link: http://pluginbase.pocoo.org/

Description: Armin Ronacher is the creator of Flask, Jinja, and a bunch of other stuff.

Miscellaneous links on plugin systems:

https://developer.wordpress.org/plugins

https://techsini.com/5-best-wordpress-plugin-frameworks/

https://eli.thegreenplace.net/2012/08/07/fundamental-concept...

https://pyvideo.org/pycon-us-2013/dynamic-code-patterns-exte...

https://en.wikipedia.org/wiki/Hooking

http://chateau-logic.com/content/designing-plugin-architectu...

https://www.odoo.com/documentation/user/11.0/odoo_sh/getting...
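As a tiny illustration of the hooking idea from the links above (purely hypothetical names, not taken from any of those projects): plugins register callbacks for named events, and the core fires them without knowing who is listening.

```python
from collections import defaultdict

HOOKS = defaultdict(list)

def hook(event):
    # Decorator plugins use to attach a callback to a named event.
    def decorator(fn):
        HOOKS[event].append(fn)
        return fn
    return decorator

def fire(event, *args):
    # The core fires events blindly; whoever registered gets called.
    return [fn(*args) for fn in HOOKS[event]]

@hook("model_trained")
def notify(model_name):
    return f"notified about {model_name}"

print(fire("model_trained", "resnet"))  # → ['notified about resnet']
```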


Thanks a lot for taking the time to list these resources (in this post as well as in the other Ask HN post)! This is really helpful for beginners like me. :)


This is an implementation of the "ports & adapters" or "hexagonal architecture" pattern. Seems like you may have independently discovered it. :)

https://en.wikipedia.org/wiki/Hexagonal_architecture_(softwa...


I was trained as an EE, so control theory, feedback systems, adapters, ports, connectors, buffers, impedance matching, and transfer functions all shaped my way of approaching system design, including human systems/organizations.

The reason I describe it with "receptors", as in neurotransmitters, is that this fascinates me: both nicotine and acetylcholine bind to nicotinic acetylcholine receptors (nAChRs). I found that amazing.


It’s also just a specific case of the more general idea of writing libraries and common APIs instead of monolithic applications. This can be seen in the Unix philosophy with standard IO and pipelines.


This has been my experience too - not in ML (I have zero experience there) but in software architecture more generally. The idea of locality of change, of encapsulation, of interfaces as a way to reduce the amount of stuff you need to know in order to change the system. It is a pretty effective strategy for keeping the "cost to change" a software product low.

There have been downsides for me though - interestingly, never the obvious problem. To me the obvious problem is performance; there's usually an overhead of some sort in a plugin system, but it hasn't been an issue for me yet. Maybe I'm just lucky.

An annoying real issue has been adding in dependencies between plugins; that has always introduced horrible issues later on. Based on my current life experiences, I think I'd now even choose to duplicate functionality over introducing dependencies - despite the fact that that idea is basically anathema to conventional wisdom.


>The idea of locality of change, of encapsulation, of interfaces as a way to reduce the amount of stuff you need to know in order to change the system. It is a pretty effective strategy for keeping the "cost to change" a software product low.

Yes, not only keeping the cost of change low, but unlocking potential in a way that seems prescient but is just common sense. I like to think about this in terms of "unit of thought", "protocols, interfaces, specs", "impedance matching" to maximize power transfer, unknown future consumers.

Doing that adaptation upstream prevents people from writing adapters downstream. It may be adding a "REST" API so that consumers you don't even know about can work with it. It may be using the CloudEvents spec for your events to make it easier for people to work with them. It may be making sure the output of your system is a Docker image, to make it easier to use elsewhere, whether the user is a human or something else.
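For instance, a sketch of emitting events in the CloudEvents envelope (the required attribute set follows the CloudEvents 1.0 spec; the event type, source, and payload here are made up): any downstream consumer that speaks CloudEvents can handle these without a custom adapter.

```python
import datetime
import uuid

def make_event(event_type, source, data):
    # specversion, id, type, and source are required CloudEvents
    # context attributes; time and data are optional.
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "type": event_type,
        "source": source,
        "time": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "data": data,
    }

event = make_event("com.example.model.trained", "/ml/platform", {"model": "churn-v3"})
print(event["type"])  # → com.example.model.trained
```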

Systems that produce known/standardized/understood building blocks unlock a lot of potential and avoid downstream work.


Doesn't surprise me. I was watching a video where they had devs watch a speedrun of HL2, and they said that there are a ton of hacky fixes that never got properly fixed. They said that same code is now in HL: Alyx, and that it's funny looking through the code for a modern AAA game and seeing comments like "Quick hack to get demo stable for E3 2005. Add permanent fix after show".


Yep, that happens all the time. Some quick fix is added before a trade show and 2-3 years later it's still there and going into the release. The reasons vary but it's usually one or more of: people forgot, issue was created but was never important enough to get worked on, the fix works fine and the comment should be removed, the code path is no longer executed.


Professional game development seems to revolve around the art of quick hacks. Gamers aren't going to care whether your code is clean or not, and they sure as hell won't want to wait for you to tidy it up.


But the weird thing here is that Source is based on GoldSrc which was based on Quake; apparently, instead of learning from years of experience and building a brand new engine without the baggage, they decided to just keep building on top of the old stuff?

I mean to a point I get it, but if some code is unmaintainable, you don't keep trying to fix it, you have to decide to replace it.

Valve has no excuse, they make crazy amounts of money, they can fund the development of a new engine from scratch easily. They just choose not to.


>Valve has no excuse, they make crazy amounts of money, they can fund the development of a new engine from scratch easily. They just choose not to.

It admittedly gets a bit more difficult when these hacks and quirks are part of what creates the unique feel of your game engine. People have played CS at such high levels for so long that switching engines at all is likely going to introduce some difference in feel, even if you think you've accounted for all the unique bugs and interactions. If you remade Quake 3 in a new engine, people would hate it. See Quake 4, for example.

Source 2 would still be a nice jump for CS:GO, but the team just doesn't have the resources at present to get this all done. Dota 2, being the style of game that it is, isn't as affected by a difference in feel as a first person twitch shooter.


> It admittedly gets a bit more difficult when these hacks and quirks are part of what create the unique feel of your game engine

Oh, the memories! I was so upset when bunny-hopping was mostly removed when CS:GO was released. I played the HNS (hide-n-seek) mode in CS 1.6 more than the normal game mode. Most of the HNS mechanics were based on game bugs: bunny-hopping, long-jump (syncing mouse movement with player movement), edge-bug (not dying if you fall from any distance onto a 90-degree edge at a specific distance), jump-bug (not dying if you fall from any distance if you jump exactly before you hit the ground), surfing (gaining almost infinite speed when sliding horizontally across a tilted surface).

I actually stopped playing CS entirely because this mode could not be accurately reproduced with the new CS:GO physics engine.


CSGO still has bhops, longjumps, surfing, and jumpbug; I think jumpbug is harder and I'm not sure about edgebug. Was there really such reliance on jumpbug and edgebug in this game mode?


For example, I found this question from 6 years ago: https://www.reddit.com/r/GlobalOffensive/comments/276pvz/can...

So apparently not only is bhopping harder (the jumping window is much smaller), the default speed is also capped at a very low value (300), where in 1.6 it was uncapped and you could easily reach speeds of 400-500 while bhopping.

Another related thread: https://www.hltv.org/forums/threads/1169582/16-movement

So, although the exact changes are not clear, all of those who played HNS before agree that the mechanics are different and it just doesn't feel the same: you don't have the same freedom, and it feels a lot more sluggish.


I am not sure this was always the case with CS:GO.

I know there are servers now that have those type of modes, but I think they have to use some custom server parameters for the physics to make it work more like it used to do.

I remember when I first tried CS:GO coming from 1.6 when it came out, there were no HNS servers as the default physics model really didn't allow for bhops and longjumps the way it used to work. Or maybe, even if somehow it still allowed for those things, the mechanical difference was too big so it didn't feel the same.

In the end I "just" played like 1000 hours of CS:GO and quit the game, as there was no real replacement for HNS as it used to be. I still played the normal mode for those 1k hours, as there were a few very nice 24vs24 servers where everyone was using a microphone and try-harding.


> Source 2 would still be a nice jump for CS:GO, but the team just doesn't have the resources at present to get this all done

That is surprising. I had assumed that CS:GO was an incredibly steady cash cow. IIRC, CS:GO pioneered the digital collectibles + loot box market, and has somewhere between 600k to 1.1M+ active players during any given day. https://steamcharts.com/app/730

The trading market also seems particularly active and Valve takes a cut of each transaction of their digital goods. It's an attractive model. There are dedicated companies that have sprung up around it and seem to prosper.

Other indirect indicators seem to be green as well. Back when it was a phenomenon, it sold at least 25M+ units before it was made free-to-play. This is on par with Minecraft. The installed player base is in the hundreds of millions. It's essentially a giant social network with multiple monetization opportunities.

I am struggling to see how this wouldn't be profitable.

Surely, Valve must make enough in a month to hire 50+ people and give them 18 to 24 months to re-write the engine?


The guy who wrote the tweet in the OP regularly posts publicly about the internal structure and politics at Valve. Reading through that gives me the impression that money is never a factor at Valve, and it's all about 1) whether anyone cares enough to work on a project and/or 2) whether someone is trying to impress/kiss-ass to climb the social pyramid.

I can't imagine the idea of rewriting CS:GO from scratch just to improve maintainability is going to get very far.

Plus players will 100% notice even the tiniest changes, and will complain about it forever. A game like CS:GO will never die, just look at the player numbers for its predecessor which has 6000-7000 daily active players.


> Plus players will 100% notice even the tiniest changes, and will complain about it forever. A game like CS:GO will never die, just look at the player numbers for its predecessor which has 6000-7000 daily active players.

this. to any readers unfamiliar with the community, it's hard to overstate how much cs players abhor change.


> It admittedly gets a bit more difficult when these hacks and quirks are part of what create the unique feel of your game engine.

An example of this in the Quake 3 engine (and now permanent behavior in the CS series) is air strafing. It's a glitch in how Quake 3 handles motion vectors. But it's now also enshrined behavior, complete with entire game modes in CS built around it (KZ & surf maps). If you went and made an entirely new engine, or even just used something off the shelf like Unity or Unreal, you'd have to add that bug back. It's core to the gameplay now.


To add to the list of 'bugs that are now features': in SF2, 'combos' were a bug; they were not intended to be part of gameplay. Their inclusion is arguably the basis of the 1v1 fighting genre.


That sounds interesting. Could you tell me more about that, or point me to a good article/video about it?

I never got into 1v1 fighters myself, but they were a big part of my childhood.


To add on: players like this feel so much that the behaviour is very convincingly replicated by Riot in Valorant, a CS-like game on the Unreal engine.


A more famous example of that same class of bugs is bunny-hopping, which has become an FPS staple far beyond the scope of the Quake engine.


I think it's reasonable to remember that Half-Life 2, and Source, go back to the days when Valve was still just a game studio, so they didn't have the resources they have now. Yes, the original Half-Life, its expansions, and Counter-Strike were popular, but Steam wasn't a thing yet and they weren't printing money. Spending time on a new engine alongside an ambitious game wasn't something they had the money or time for.

And Global Offensive is a title originally developed by a third party, so I don't think those guys were going to put huge effort into the engine either. Once it got going, there was little push to replace it. Especially when it's raking in massive amounts of money even in the state it's in...


> instead of [...] building a brand new engine without the baggage

Time to bring up this classic post: https://www.joelonsoftware.com/2000/04/06/things-you-should-...


Quake is full of hacks and bugs too.


There is nothing more permanent than a temporary solution.


There is a video like that for every game ever made.


I think anything that follows a traditional release cycle (i.e. you cut a release instead of push continuous updates to a server) has these accidentally-permanent hacks. Every mobile application I've worked on certainly has.


"Also, if you touched the renderer, even in a simple way, and a team later encountered a rendering bug, you would be blamed and have to fix it. Even if the bug had nothing to do with your change. This taught programmers to not change anything unless absolutely necessary."

I've seen this effect in code that wasn't nearly this bad, and I've even felt this way...

But in the end, I've decided to do it anyhow. The end result was that I became the guy that could fix anything (in other people's minds, anyhow) and my job was actually more secure than if I'd followed the path of least resistance.

Had these devs followed the hard path, too, I think it would have helped get things cleaned up, instead of continuing to pollute everything even worse.

Sometimes developing is hard, and you just can't shy away from the hard parts; shying away just makes everything else harder.


No unit tests?


The Source engine hails from the 90s, testing hadn't been invented back then ;)

But on a more serious note, writing automated tests for game engines involves a lot more than just "duh, unit tests" (especially when testability wasn't a concern in the original design).


Also, in this case we're talking about the graphics stack, and it's not like OpenGL or DirectX are set up to be testable. You can't really "unit test" shader code; the best you can do is render it in some test scenes and screenshot the results. Which ends up flaky & noisy due to valid-per-spec differences in GPU & driver behaviors.


Every language has valid-per-spec differences. That's exactly why you test.

For sure, OpenGL/DX requires more infrastructure to run unit tests than a generic block of C code. But it's absolutely possible to "unit test" shader code, with buffer read-back and/or vertex stream-out, among other options. It's more the game engines themselves that aren't set up for unit tests than the graphics stack.


> For sure OpenGL/DX requires more infrastructure to run unit tests than a generic block of C code. But it's absolutely possible to "unit test" shader code, with buffer read-back and/or vertex stream out, among other options.

Which is what I said, you can screenshot & compare. But it becomes a fuzzy compare due to acceptable precision differences.

And it ends up being more of an integration test and not a unit test.

> Every language has valid-per-spec differences.

They really don't, but that's not entirely what I'm talking about. I'm talking about valid hardware behavior differences, which don't broadly exist elsewhere. How a float behaves in Java is well-defined and never changes. How numbers behave in most languages is well-defined and does not vary.

GPU shaders are completely different. Numbers do not have consistent behavior across differing hardware & drivers. This is a highly unique situation. Even things that are nominally variable in other languages (like the size of int in C & C++) end up not actually varying in practice, because too much code doesn't cope well with it. Shaders don't play any such games.


> Which is what I said, you can screenshot & compare. But it becomes a fuzzy compare due to acceptable precision differences

You make it sound less rigorous than it can be. A readback doesn't need to be a "screenshot" and doesn't need to be of a full scene. A frame buffer can be a 1x1 value.

Regarding precision differences, it's not much different than testing floating point math anywhere else. Shaders allow fast-math style optimizations generally, but they can be disabled at least on some platforms[1][2], otherwise one can take care in floating point math, or provide tests just using integer math.

> And it ends up being more of an integration test and not a unit test.

Sure, if you just set up scenes, render, screenshot and do a fuzzy compare, that looks more like an integration test. And I agree it's more common to see integration tests for renderers. But really, it's a bit more involved in that you have to deal with uploads, command queues, readbacks; you really can set up the infrastructure to do proper unit tests, and then you can decide how you want to handle unit testing of flexible-precision code, either by toggling precision in the compilers, or by building your tests to properly bound your expected precision, or both.
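One way to "properly bound your expected precision" is to measure error in units-in-the-last-place of float32 rather than in absolute terms. A rough sketch (my own illustration, pure Python):

```python
import struct

def ordered(f):
    """Map a float32 bit pattern to an integer that preserves ordering,
    so adjacent representable float32 values differ by exactly 1."""
    u = struct.unpack("<I", struct.pack("<f", f))[0]
    return u if u < 0x80000000 else 0x80000000 - u  # fold negative floats below zero

def ulp_distance(a, b):
    """Distance in units-in-the-last-place between two float32 values."""
    return abs(ordered(a) - ordered(b))

# Exact match is distance 0; the next representable float32 is distance 1.
assert ulp_distance(1.0, 1.0) == 0
assert 1 <= ulp_distance(1.0, 1.0000001) <= 2
```

A shader test can then assert something like `ulp_distance(expected, got) <= N`, where N encodes how much hardware-to-hardware drift the test is willing to accept.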

> GPU shaders are completely different. Numbers do not have consistent behavior across differing hardware & drivers.

This is an outdated and simply not true view, every modern (PC-Spec?) GPU hardware has IEEE754 compliant floats. They have to, otherwise GPGPU wouldn't have taken off in scientific computing. compiler defaults may just not be right.

[1] https://github.com/Microsoft/DirectXShaderCompiler/blob/mast... [2] See: #pragma optionNV(fastmath off)


Game code can be hard to unit test; it needs integration tests on actual hardware. Lots of weird stuff on every chip needs to be taken care of.


There's also no easy way to test for things like "do the shadows render correctly".

About the only thing you can do is take before/after screenshots and compute a signal-to-noise ratio on a diff between the images. Which makes for an extremely fragile test definition. What if you change the default FOV of the camera? Now all your tests fail for no good reason.
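The signal-to-noise comparison itself is simple; here's a sketch of PSNR on flat pixel lists (my own illustration; a real pipeline would use numpy or an image library, and the threshold is the fragile part):

```python
import math

def psnr(img_a, img_b, max_val=255.0):
    """Peak signal-to-noise ratio between two equally sized images given
    as flat lists of pixel intensities; higher means more similar."""
    assert len(img_a) == len(img_b)
    mse = sum((a - b) ** 2 for a, b in zip(img_a, img_b)) / len(img_a)
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)

# A golden-image test then becomes a threshold check:
golden  = [100, 150, 200, 250]
current = [101, 149, 201, 249]
assert psnr(golden, current) > 40.0   # threshold tuned per test scene
```

Everything hard lives in the threshold: set it too tight and valid driver differences fail the test, too loose and real regressions slip through.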


What if you change the default FOV of the camera? Now all your tests fail for no good reason.

Which is completely fine, because you probably wouldn't want to accidentally change the FOV, would you?

High confidence tests fail on unexpected results. If only some aspects of the results are checked, the tests have obvious blind spots.


Yep, lots of interaction tests that can be quite brittle and can take a long time to run, even with a farm of servers and consoles.

A lot of small and low-level stuff can be unit tested, but during production things like writing good tests fall through the cracks.


And game development usually involves a lot of iteration, so setting up tests is at best a waste of time, and at worst a crutch that hurts productivity.


The best benefit we've found for unit tests is in low level platform specific code and generic containers. Things of that nature which absolutely must work and can themselves be tested in isolation.

I'm sure when we finish the project and look back on it we can go in, clean up, and implement far more unit tests for the code we already have.


I am surprised that game engines aren't set up to test simple scenarios programmatically. I play a lot of Overwatch and the bugs / patch notes about fixing those bugs amaze me every time; they tell me a lot about how the software is designed and tested.

There was one bug where a character has a deployable ability that doubles the damage and healing of all projectiles that pass through it. One day, the patch notes read "fixed an issue where healing was not amplified when passing through the amplification matrix". And, I totally get it... every conference talk I've seen out of Blizzard goes into details about all the infrastructure they've made for play testing their games. It sounds easy to get your coworkers into a build of your latest PR and try it out. But things like these subtle numbers adjustments just don't translate well to play testing -- sometimes the enemy is doing so much damage that you can't really be sure that the problem is the Amp Matrix isn't multiplying the healing by the right number. So, from time to time, refactors break it!

But, in a world where you could easily write integration tests, this problem would never happen. You'd write a simple scenario like "create empty room. place baptiste at position 0,0. deploy amp matrix at position 10,0 with orientation 90 degrees. place sombra at position 20,0. set her health to 80. make baptiste fire a healing grenade along vector 1,0 at an angle of 45 degrees. wait 10 ticks. ensure that sombra's health is now 200." The framework to be able to write tests like this is not difficult (you can do it in their "workshop"), and it's not difficult to write a test like this for every ability, and even every combination of abilities. And, it would mean that play testers never ever need to be suspicious of numbers; the automated tests already check that. You'd make developers more productive (the computer can check the basics like this), and play testers more productive (they don't need to test simple stuff anymore). But... I don't think they do it. The buggiest releases are when the team is under time pressure to hit a deadline (Overwatch has seasonal events; the patch that introduces a seasonal event always has some weird bugs), and I don't think automated tests miss things under time pressure -- but humans sure do.
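A toy version of that scenario test (entirely hypothetical, not Blizzard's tooling; the 60-healing grenade value is my assumption chosen to match the 80 → 200 numbers): the gameplay rule "amp matrix doubles healing that passes through it" becomes a plain assertion instead of something a playtester has to eyeball.

```python
from dataclasses import dataclass

@dataclass
class Unit:
    x: float
    health: float

def crosses(origin_x, target_x, zone_x):
    """Did a projectile travelling origin -> target pass through the zone?"""
    lo, hi = sorted((origin_x, target_x))
    return lo <= zone_x <= hi

def apply_heal(target, base_amount, origin_x, amp_zones, multiplier=2.0):
    """Apply healing, doubled for each amp zone the projectile crossed."""
    amount = base_amount
    for zone_x in amp_zones:
        if crosses(origin_x, target.x, zone_x):
            amount *= multiplier
    target.health += amount

# "place sombra at x=20 with health 80; amp matrix at x=10; heal from x=0"
sombra = Unit(x=20.0, health=80.0)
apply_heal(sombra, base_amount=60.0, origin_x=0.0, amp_zones=[10.0])
assert sombra.health == 200.0   # 80 + 60 * 2
```

If a refactor ever breaks the amplification path, this assertion fails on the next test run instead of surviving until a playtester happens to notice the numbers are off.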

The one thing I'll give Blizzard credit for is that their games are fun. All that playtesting is certainly a good idea. I'd supplement it with some gameplay-focused integration tests, though. They have the money and the tools teams, and their games last longer than a few months, so it just seems like a smart investment to me. So it just baffles me what bugs ship to production.


The only game I'm aware of with an extensive automated testing infrastructure is Minecraft:

https://youtu.be/vXaWOJTCYNg?t=993


Riot has written about their League of Legends test infrastructure https://technology.riotgames.com/news/automated-testing-leag...


Interesting! That looks exactly like the test I wanted to write.


The Source engine is actually a fork of the Quake engine.

Back then people often had no idea what they were doing, often because they were doing stuff nobody did before, ever.


In a repo like this, tests can degrade over time when someone just changes the test asserts to make them pass because they've got a deadline to get this out the door.

It becomes difficult to even know if the test is checking the right thing since you’re completely unfamiliar with the context.


Hence why game engines are now a whole business of their own, and before that, people used to just start fresh with every game, maybe keeping a drawer of useful snippets. Crunch time is fundamentally incompatible with the discipline needed to not end up in this place.


> Crunch time is fundamentally incompatible with the discipline needed to not end up in this place.

The hubris of trying to circumvent the law of fast, good, cheap; pick two.


I don't think it's only because of a lack of discipline.

I think sometimes in a project this complex, neither you nor anyone else has any idea where some code SHOULD be, and inserting it in the "wrong" place causes weird things later.


richgel999 has posted before about Valve's toxic culture - so bikeshedding about "well why don't you just... " is beside the point in the larger context of perverse incentives. https://twitter.com/richgel999/status/1330765701037101057?s=...


It's a refreshing perspective. It is still not that much of a knock on Valve - after all, they hired him, seems like they made the right choice, he is really articulate, voted with his feet, etc. etc. It actually makes him and Valve look pretty good, especially since it's such engineering- and strategy-focused, honest criticism, and none of it concerns illegal stuff either, like retribution.


TBH this issue exists in many larger codebases that don't take separation of concerns seriously. Solution: take it seriously. That's harder than it sounds, and depending on the problem domain it's next to impossible, but you should still try.


Absolutely true, the issue exists in many codebases and organizations. And even trying to keep things decoupled will only help so much. A 10+ year old codebase which is mostly driven by new feature development will very often have accumulated so much cruft and so many workarounds that trying to make any further modifications is rather likely to break something else.


The renderer in Source is one of the places I don’t have a great understanding of. Quake’s is fairly straightforward, but the shader system Source has complicates understanding a lot more than if you had licensee access.

The way they talk to entities to handoff determining visibility is fantastic, and there are a number of other small design details that make the engine very pleasant to work with as a modder or game developer—but there are some things that are rather hard as an engine developer working with Source, or black boxes because no one has public information of how particular systems work anymore.

Even internally at Valve they’ve broken particular portions of Source and the Half-Life codebases because they don’t understand how particular interfaces work anymore, but some older members of the hlcoders community still do.


Wasn't CS:GO developed by Hidden Path?


That's a very good name for codepaths which are difficult to find :-)


Initially but has been maintained by Valve for some time now.

Having said that, it's built on Valve's Source Engine, so it still faced many of the same issues.


Could a big part of the problem be just how old the engine is?

I assume many if not all the original devs have left.


I'm not familiar, are they still running "Source 1" or have they moved on? If so, I don't see how this is very relevant beyond mildly interesting because it relates to a game many love and still play.


They're on Source 2 now, but not all of their games have been ported to it. A former employee said that Source 2 is pretty much just Source 1 with some extra physics bolted on, not a completely new engine.


Source 2 is Source 1 with most of the key systems replaced. They may have started with physics, but they didn't stop there. Many game engines are a collection of modules, Source included. So it becomes a fuzzy line when it becomes a "new" engine. Does replacing one module make a new engine or not? How about 2? 3?

And I very much do mean "replaced" there. Physics, since you mentioned that, was switched from Havok to the in-house developed Rubikon. And since Havok is a licensed middleware, they couldn't just bolt some new stuff on and call it theirs. That's going to be a full from scratch replacement.

Similarly the "UI module" was fully replaced, from the Flash-based Scaleform to Valve's in-house Panorama which is fairly similar to HTML5/CSS/JS. This module replacement was also "ported" to Source 1, and was implemented in CSGO as well. Which gets back to the lines between game engines "versions" are blurry.


Another nice example of that is the engine used in Ubisoft's Splinter Cell games. Technically it's Unreal Engine 2.5 (and some of the file formats are similar), but so many subsystems have been replaced by now that only some of the tooling is still Unreal Engine 2.5 (which is arguably the important part to keep intact for game designers). Even the renderer isn't UE2.5 anymore.


It's the Engine of Theseus! ;)


That's a large understatement. The arguably most important part of any game engine is development tools, and those were completely rebuilt and are nothing like Source 1's tools. You can see this for yourself by installing and comparing CS:GO SDK and Dota 2 Workshop Tools in Steam.


> not all of their games have been ported to it


I still don't get why they can't make a basic anticheat or protect the process memory like most other games (even the Faceit AC client itself for CS:GO!). Is there some explanation like keeping compatibility with very low end PCs?

You can get wallhacks in multiplayer by simply using WriteProcessMemory calls. [0]

[0] https://github.com/Snaacky/Diamond/blob/master/diamond.py


To be even remotely safe from this, you need to use a kernel driver, which is invasive and widely seen as unacceptable - at the moment anyhow.

See Riot and Valorant from earlier this year. There was a lot of outcry and the response from the devs was basically "we don't give a damn".

Other games, for example, scan window titles or signatures for a variety of debuggers/hacking tools like IDA and x64dbg. There are many techniques and variations you can apply to make things like this more "annoying" - but never impossible.

Earlier this year, there was a PCI card PoC that would read memory and act as an "undetectable" wallhack - people are clearly crafty enough to always find their way around.


Is there a way to check what games/platforms install kernel drivers? I recently installed Trackmania, and after jumping through what felt like 10 hoops just to play the game (log in to Epic, download the Epic Games Store, log into Ubisoft, etc. etc.), I realized I have no idea how much stuff is being installed on my PC after all these steps.


Most recent games install their anticheats from their own folder, so if you open something like Process Explorer and look at the loaded drivers list, you could tell by the path.

However, AFAIK nearly all games with anti-cheat (save for Valorant of course) load their anti-cheat when the game starts and unload it when the game closes. You can run something like Process Explorer/Monitor before running the game, then notice what drivers & services it's loading.


There was backlash about Riot's Vanguard but I feel it was mostly driven by the gaming media as the vast majority of consumers don't understand the nuance of userspace vs kernelspace.

I'd expect more developers to begin deploying kernel-level anticheat in the future.


>There was a lot of outcry and the response from the devs was basically "we don't give a damn".

Because "we" gamers tend to prefer a fair game.

Playing against cheaters destroys the fun and the game itself.

It's a hard trade-off, but your average gamer would rather play a fair game.


I don't think you can attribute this universally to "gamers" - there's a lot of games that don't have obvious hacking problems by employing various other measures which aren't as invasive.

I'd call myself a "gamer" and would never install something like Valorant - and most of my friends didn't either. Some of us value our privacy more than getting rid of the one hacker we get per week.


>Some of us value our privacy more than getting rid of the one hacker we get per week.

oh c'mon.

It's heavily dependent on the game.

There's different % of cheaters in CS, in LoL, in Tibia and a lot of other games.

e.g. there aren't a lot of cheaters in LoL because (among other reasons) cheats do not have as huge an impact there as in other games like shooters.

On the other hand, cheats in Tibia are just bots that farm exp for people all day (in the majority of cases; at least before BattlEye).

>Some of us value our privacy

Installing and running a giant program which can do a shitton of crazy things under the hood already says that I trust that vendor enough.

Unfortunately their software does not need kernel-level permissions in order to be dangerous to my privacy, so what exactly is the difference?

All they have to do in order to compromise my privacy is send screenshots to the cloud showing that I'm writing snarky comments on HN.


Presumably, server admins, mods or players can votekick players who they think are cheating. If a player is cheating such that they are indistinguishable from another good player, then that's kind of mission accomplished and it doesn't really matter if they stay.

This general philosophy has been around in all CS games and has worked well IMO. Just gotta find a suitably well-maintained server to play on first.


That only takes care of ragehackers (e.g.: blatant aimlocks) and throws very good players under the bus. Most cheaters toggle their cheats, make sure to not do too well, allow someone else to finish the kill (to not appear in the killcam), etc. It's common enough that most cheats out there have a strong disclaimer of "don't be obvious".


Getting kicked (and sometimes even banned) from such a server just because you're playing well one day isn't much fun.


IMO it's way better than dealing with intrusive and oft-buggy anticheat, and there's always a chance that you'll find yourself with one of the rare but inevitable false-positive bans that these systems give out. More so when they ban you simply for having software they don't like on your system (e.g. AutoHotkey with Blizzard, reverse engineering software with many devs...)

Joining a new server from a list with filters is fast and easy.

Just as it is to find and acclimate to a server that is at your skill level and/or becomes familiar enough with you not to kick you for a hot streak.


When the code is chaotic enough, it may be impossible to write an anti-cheat that is guaranteed not to trip on ordinary in-game code.


They do, it's called VAC, this repo even says you will get banned using this.


Won't this hack be caught by VAC? And if not, what is VAC actually doing?


It's not caught by VAC. I changed the glow colour (which probably changed the signature), tested it a few months ago on a local server and then played some Valve deathmatch on official servers; my account is still good. I'd expect the game to at least throw me out of the server if I tamper with what objects should light up, but nope.


If I remember right, VAC doesn't necessarily immediately ban a cheater. It might instead put them on a list. After the hack's popularity increases, there is a mass ban of everyone detected using the hack at a later date.


They do bans in waves to try and keep people from knowing what cheat/hack they used that got them banned. VAC might not have caught you, or you might get banned in one of the waves where they ban thousands of people all at once.


If you changed a known hack you might be scheduled for banning at a later banwave instead. Otherwise cheat developers would just make changes and test what gets them banned.
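The wave strategy is essentially delayed batch enforcement; a toy model (my own illustration, obviously not Valve's implementation):

```python
import datetime as dt

class BanWaveQueue:
    """Toy model of delayed ban waves: detections are recorded silently
    and only enforced in a batch, so cheat authors can't correlate a ban
    with the specific change that triggered detection."""
    def __init__(self):
        self.pending = []   # list of (account, detected_at)

    def detect(self, account, now):
        self.pending.append((account, now))   # no immediate ban

    def flush_wave(self):
        """Ban everyone detected so far in one batch."""
        banned = [acct for acct, _ in self.pending]
        self.pending.clear()
        return banned

q = BanWaveQueue()
q.detect("cheater1", dt.datetime(2020, 11, 1))
q.detect("cheater2", dt.datetime(2020, 11, 5))
assert q.flush_wave() == ["cheater1", "cheater2"]
assert q.pending == []
```

Because weeks of detections land in one wave, testing "did change X get me detected?" by trial and error becomes far more expensive for cheat developers.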


It seems like code quality was a big factor in the delay of Source 2 and future Valve games as well. At what point do you cut and run from legacy?

The end result is pretty fantastic, but it was expected 5 or 6 years ago.



I meant to say that these comments remind me of Bob Martin's points about testing and proper architecture for the code.


Offtopic: I was checking his bio: "Entrepreneur at Binomial, open source dev. Previously SpaceX, Valve, and Ensemble Studios/Microsoft. SIBO survivor. From New Jersey. Opinions my own. He/him."

At the end wrote "He/him".

Anyone knows what does this mean and what is this trend? Is this to clearly state your gender and how you identify yourself?


there are plenty of people who, e.g.

- don't want to show their face on the internet

- people whose appearance is ambiguously gendered

- people who are not the gender that people assume from pictures of them

- people who used to be addressed as a different gender

and, when addressed, want to be referred to correctly.

for instance, i'll never show my face on the internet, but i like it better when someone says "_he_ wrote that program" rather than "_she_ wrote that program."

for all those people, it makes sense to put their pronouns in their bio.

but that leaves the problem: if only the people in the above categories put pronouns in their bio, and you have pronouns in your bio, that might imply you are, for instance, ambiguously gendered.

so people who are conventionally masculine like Rich put pronouns in their bio to normalize it, and to make sure that "having pronouns in one's bio" is not a "thing only OTHER people do".

i think it's a good thing to do. i'll do it too.


Thanks for the explanation, it makes sense!

Although this seems nice, I don't think it actually solves the bigger problem (if there actually is such a problem in the first place). You cannot wear a label everywhere you go with your preferred pronoun, and this would lead to people stuffing their bios with all their genes, preferences and beliefs (e.g. gender, race, religion, political view, etc.).

The actual problems are:

1) People assuming someone's gender.

2) People getting upset when their gender is incorrectly assumed.

3) Not having a well established social protocol to ask someone their gender without one or both of the parties feeling uncomfortable.

4) In my opinion it further emphasizes that gender is something really important, that should be mentioned immediately as it changes the way you look at someone. I think the correct progressive way of thinking is to disregard gender entirely and assume everyone is "genderless" unless it actually matters. Does it really matter if he is a he or a she? Does it matter that much if a stranger on the internet uses the wrong pronoun?

5) Same issue applies for all other previously mentioned characteristics of an individual (race, religion, political views, etc.).


Signaling that he agrees with and supports people choosing their preferred pronouns.

Maybe even supports compelled pronoun use.


[flagged]


if we’re being pithy, i feel like signaling “i’m woke” would be more apt.


this is yet another issue for programmers but not for the business. the product is seen as a black box by the business and if it gets expected output for provided input, it does not matter what is going on inside of it. it makes money and in the end that is the only thing that truly matters.

of course programmers will keep on complaining but in the end, it does not matter. if it works, don't fix it. doing rewrites brings nothing to the business, only to the developers. sure, the rewrite will save dev hours along the way but the rewrite itself is not free so all in all...if it works...

btw this is also why designing with composition instead of inheritance is so important for big projects. you will learn this way too late if you do not get it already.
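A small illustration of the composition point (my own sketch): the renderer is a component the entity holds, not a base class it is welded to, so behavior can be swapped without touching a class hierarchy.

```python
class GlowRenderer:
    def draw(self, name):
        return f"{name} [glowing]"

class PlainRenderer:
    def draw(self, name):
        return f"{name}"

class Entity:
    def __init__(self, name, renderer):
        self.name = name
        self.renderer = renderer    # injected dependency, swappable at runtime

    def render(self):
        return self.renderer.draw(self.name)

e = Entity("crate", PlainRenderer())
assert e.render() == "crate"
e.renderer = GlowRenderer()        # swap behavior without subclassing Entity
assert e.render() == "crate [glowing]"
```

With inheritance, the equivalent change means a new subclass per behavior combination; with composition it's one assignment, which is exactly what keeps large codebases from ossifying.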



