This is a rebranding, yes.
VSTS was a bad name that confused potential customers because they thought of the IDE, and so no one really knew what the product was from the name alone.
Hah, long-time TFS and VSTS user here - I recently got so annoyed with colleagues misusing these acronyms that I sent out an email with a glossary of terms for TFS, VSTS, TFVC, VSO etc :)
I find it concerning that one person was able to push malicious code to 'production'. To me, this suggests that Tesla, a company building highly sensitive software, does not employ basic branch policies. How is it that these changes could have made it through a code review process and been deployed?
If a company like Microsoft or Google announced that a disgruntled employee was able to push some code that, say, stole user information, I think the general reaction would be "why was this allowed to happen?" I'm not sure why Tesla gets a pass in this regard
This is a very naive comment. There will always be a small handful of engineers that can push the button to move code into PROD or even change code in PROD live. Ideally, with mature controls, that list is short. But to jump to the conclusion that Tesla doesn't use good practices is very short-sighted. Who's to say that external parties didn't target this person specifically because of their role/influence?
Obviously, not all details are available, but the wording in the email suggests that the parent comment is anything but naive:
> This included making direct code changes to the Tesla Manufacturing Operating System under false usernames and exporting large amounts of highly sensitive Tesla data to unknown third parties.
This sounds like something out of the 1990s, that dark and romantic era of version control when we thought CVS was pretty cool actually and we didn't know what key-based authentication and 2FA were.
There are volunteer-run projects that don't have this problem.
Edit: to be clear, I presume no one is debating the fact that someone with high enough credentials can push code to production. The questions that the email raises are:
1. Why can anyone, regardless of credentials, push mission-critical code without review (or, alternatively, if the changes did go through review, why did the review process not catch multiple malicious changes?)
2. Why can someone compromise several high-level credentials without anyone figuring it out (the changes were made, apparently, under "false usernames")?
Some manager asks IT multiple times over the course of a few weeks to create an account for a contractor, then give them permissions to access production type machines.
Or a contractor that was fired had their credentials appropriated by this manager, perhaps by that manager removing them from a "delete these accounts" list.
Those are a couple of mundane ways of getting a false username to a production machine. This is even easier when there is a lot of flux at the company with many people coming and going, a lot of account management happening etc.
It could have been that the accounts were local to specific machines and not managed by the company as a whole.
> Some manager asks IT multiple times over the course of a few weeks to create an account for a contractor, then give them permissions to access production type machines.
And -- keeping in mind that production type machines operate machinery that can kill -- this sounds okay to you?
Not to mention this:
> Or a contractor that was fired had their credentials appropriated by this manager, perhaps by that manager removing them from a "delete these accounts" list.
...keeping in mind that production type machines operate machinery that can kill, does it sound OK to you that anyone can get access to an account that they don't own and control it?
This particular case would be enough to have PCI certification come into question (if not for it to be revoked), and that's just about money, not life-and-death stuff.
Someone has to be responsible for managing people and organising access to the appropriate machines for them to do their job, if it isn't their manager then who is?
You can manage people and organize access without actually having the ability to gain access to their credentials. In fact, that's how it's supposed to work in safety-critical environments.
My point is, as a manager one can request that their subordinates get credentials to access systems. Therefore as a manager you could create a fictitious person (or use one that's recently left the company), and have them be given credentials to access those systems. Then you could use that fictional identity to do whatever nefarious things you want to do.
Then again, it could be just as simple to create an alternate fictitious identity without going through IT at all, just by using the systems you already have permission to access.
In a normal company, you could absolutely not create a fictitious account that way, or re-use the credentials of someone who just left. But more important, there is a very, very long way from having created a fictitious person to being able to push stuff to production in their name.
The former restriction is maybe difficult enough to implement efficiently in an organization that it's excusable (we have a scheme for it at $work, but it unfortunately means that sometimes people show up on their first day and the paperwork isn't done yet, so some of the accounts they need aren't ready).
The latter, on the other hand, is security 101 and not implementing it on the production floor is just irresponsible. I really hope it's not what happened.
If we're talking about changes to the software that's used to manufacture vehicles that are driving on public roads, I sure as hell hope the odds are zero.
I hope so too, but then again we constantly read stories where serious industrial equipment and critical infrastructure have their computer systems opened up to the wide Internet because someone thought they would like to control it from a crappy app on their phone. Etc.
We are talking about factory floor equipment, the kind that's designed to run air-gapped and where you find lots of old unpatched Windows 2012 installs because the machine was certified with that and patching would require recertification.
And I'm not joking - recently I was asked whether something (that was designed for a clustered Linux environment) could run on Windows XP, because that's what was on the machine they wanted it to run on.
> 1. Why can anyone, regardless of credentials, push mission-critical code without review (or, alternatively, if the changes did go through review, why did the review process not catch multiple malicious changes?)
Why do you suppose the unauthorized party was following the company's development practices? Maybe it was from the sysadmin side, somebody who worked on the toolchain used for reviewing and pushing things to production. So he was able to sidestep the normal review process. This can happen; what is important is that such things are discovered.
> So he was able to sidestep the normal review process.
He should not have been able to sidestep the normal review process. That's the problem in the first place. Even if you're from the sysadmin side. It should not be possible to do it.
You may think that looks exaggerated, but I've worked in two places where we implemented such a process, both of them far more boring than Tesla and with, I suspect, far less money to burn on infrastructure.
> This can happen, what is important is that such things are discovered.
No, what is important when working with mission-critical code is that such things are mitigated. Discovering such a problem in production code is already a problem, not a solution.
You don't try to keep someone with 'wheel' access from doing anything on the server. Instead, you:
1. Sign every review
2. Use the review signatures + the manufacturer's keys to sign reproducible builds of the production image (i.e. you cryptographically certify that "this image is authorized, and it includes this list of commits, which have gone through these reviews").
3. Use a secure boot scheme of your choice to ensure that only signed images can be installed on production servers
4. Keep anyone with 'wheel' access away from the image signing keys, and anyone who can generate images away from 'wheel' access.
This way, you make sure that no one who has 'wheel' access can install a sabotaged image, that any image that can be installed has gone through an auditable trail of reviews, and you reduce the attack surface a malicious developer controls to things that require root access (which is still a lot of surface, but is harder to sneak past a review).
Root access to production servers does not need to mean that you can install arbitrary code on them, and with the right systems engineering, you can ensure that it does not trivially result in arbitrary code being run on production equipment.
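To make step 2 concrete, here's a minimal sketch of the kind of release gate I mean, assuming ed25519 review signatures stored per commit (the function name and data shapes are invented for illustration, not anyone's real pipeline):

    # Refuse to sign an image manifest unless every commit it contains
    # has a valid reviewer signature over its hash (Python, using the
    # `cryptography` package).
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

    def verify_release(manifest_commits, review_records, reviewer_keys):
        # manifest_commits: commit hashes baked into the candidate image
        # review_records:   {commit_hash: (reviewer_id, signature_bytes)}
        # reviewer_keys:    {reviewer_id: raw 32-byte ed25519 public key}
        for commit in manifest_commits:
            record = review_records.get(commit)
            if record is None:
                raise RuntimeError("commit %s has no review on file" % commit)
            reviewer_id, signature = record
            key = Ed25519PublicKey.from_public_bytes(reviewer_keys[reviewer_id])
            try:
                key.verify(signature, commit.encode())
            except InvalidSignature:
                raise RuntimeError("bad review signature on %s by %s"
                                   % (commit, reviewer_id))
        # Only now does the manifest go to the (separately held) image
        # signing key from step 4.
        return True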
Edit: this is all in the context of "questions that Tesla's answer raises". For all I know, the answer might be that they hired some brilliant genius who figured out how to sneak by whatever secure boot scheme they're using. The point is -- the post that sparked all this is not naive. This is real stuff. Companies that are concerned about it can ensure that unauthorized commits are so difficult to get into production that a disgruntled employee would rather just quit than go through with it.
What are you saying? That because perfect security is impossible, we should give up and do nothing?
It's possible a sysadmin with low-level access can exploit that and a variety of zero-day exploits and escalations of privilege in the layers above to systematically compromise the boot images, steal or falsify credentials and signing keys, and circumvent the safeguards and alarm systems which should be in place to prevent malicious modifications of the source code and the compiled binaries, while hiding his actions from his co-workers, sure whatever.
And if that's what happened to Tesla, wow, sucks to be them, that's amazing.
But if there are no safeguards, no review process, no alarm bells to go off and any damn person can submit malicious code effortlessly and they were basically working off the honor system... I'm going to blame the victim a little bit.
1. It may have gone through the review process, but if there is massive pressure to ship (which there verifiably is at Tesla) then the review part of the process will be the first to degrade, quality-wise. Inexcusable but realistic.
Every company I've worked at had a responsible person who was the only one able to push code into production. If you allow anyone to do that you are an idiot.
> There will always be a small handful of engineers that can push the button to move code into PROD or even change code in PROD live.
There is really no reason for this to be the case. Certainly all code that actually runs on the car can be required to go through review and be verifiably built, even if server code standards are more lax.
I think you misunderstand. Even with code review policies, there is still a short list of people who can push to production without going through code review. Not from a policy standpoint but from a security and access perspective.
> It wouldn't prevent malicious code going out but at least would require a chain of cooperation between employees, which would be harder to achieve.
No, it just requires a chain of cooperation between authorized accounts. That's a very important distinction, especially here, where the email in question alleged the following:
This included making direct code changes to the Tesla Manufacturing Operating System under false usernames
As a data point, in most of the ${BIGCORP}s I've worked at, there are also infrastructure roles most people don't see or think about, which have access across wider environments.
* Storage engineers: Generally have access to most storage (all of dev/test/prod) in their group. Sometimes their access is silo'd, sometimes not.
* Backup engineers: Generally have read/write access to _everything_, and all historical versions of it, as backup systems need to be able to do both read/write. Fairly often there are ways for this access to be "unlogged" too, so the actions aren't captured into any system auditing logs (otherwise it can screw things up). I've not (yet) seen access for backup engineers ever be silo-d, but some places might be doing it and I've just never seen it. :)
They pretty much always can. Sure, sure, you can imagine some perfect system which would mitigate it, but no one - definitely not your bank - is doing that.
It very much sounds like that's the case here - production code was edited, and subsequent auditing has found that what should've been deployed and what is deployed differ.
Lots of people do work really hard on mitigating this problem. It's a tough and constant battle, but that doesn't mean you have to throw your hands up and not bother working on it. I'm sure you're right that my bank isn't working to mitigate insider threat to the extent I'd like, but Tesla's code is more safety critical than my bank and I think it would be worth their while to work very hard on keeping this from happening with their computer-on-wheels.
And what stops a single member of the staging or deployment team patching the build scripts, binaries or just installing their own software to a server?
It would not require any chain of cooperation. I highly doubt the staging team would catch some coefficient used for a particular industrial robot being off by 0.1% in some PR.
Which indicates a different problem if the saboteur was on that short list. And in this day and age of cryptographically signed commits, the number of people who could do this should go down even further.
Given this, I think it's much more likely that there were few or bad controls than that a person in an incredibly privileged position was working in the margins.
No I understood your point. I just don't think it must be true, in the strong formulation you are using, that there must be a short list of people who can independently cause new code to run on a vehicle. I believe security and access can be set up such that no one person can accomplish that. It may be very difficult to set that up, but it seems worthwhile in this case.
> Even with code review policies, there is still a short list of people who can push to production without going through code review.
That's completely unnecessary and should not be the case. If you need something pushed quickly, you can grab a colleague with the review bit and get them to ack for "urgency" reasons after a quick look-over.
I wrote the policy for our company (and got it through the audit and compliance processes, including SOX404 and PCI-DSS) that specifically and intentionally allows a specific group to take whatever action they determine is appropriate in the face of a production emergency, provided they declared the emergency, their intent, and documented/published what they did afterwards.
I believe this policy, used only a few times per year, has saved us 8 figures in outage costs over a decade. (More than half of the benefit is from a clear statement and instilled sense of ownership, and only secondarily the defusing/unraveling of people would otherwise wait or insert tangles of “best practice” red tape while the website or a factory was hard down.)
I based it on 14 CFR 91.3 (in intent) which says, in part:
91.3 Responsibility and authority of the pilot in command.
(a) The pilot in command of an aircraft is directly responsible for, and is the final authority as to, the operation of that aircraft.
(b) In an in-flight emergency requiring immediate action, the pilot in command may deviate from any rule of this part to the extent required to meet that emergency.
(I made part c of the law, the reporting requirement, mandatory where it’s only on-demand in the aviation law.)
When we explain the policy to new employees, we often cite the aviation law directly, to help clearly communicate our intent.
I've seen lots of places that think they have a process in place that prevents this. I've very rarely seen places that actually had sufficient security in place to prevent someone with malicious intent from actually finding ways of bypassing it if they were prepared to break company policies and/or the law. I'm sure they exist, and more places ought to take this seriously, but part of the problem is a lot of places think they have processes in place that ensures they're not vulnerable.
Often it boils down to not taking sufficient measures against social engineering. In this case the claim is they used fake usernames - most places I've worked, successfully getting a fake account if you already work there would tend to "only" require a willingness to lie on a form or two ("fake" a contractor) and then request elevated privileges. Very few places I've done work requires sufficient checks or counter-signatures to require additional accomplices or make it harder than that. They do exist, but they're rare.
The state of security most places is quite depressing at times. Then again, most of the time it's enough.
True. It's possible that the manufacturing software could be modified in some manner to introduce some fundamental flaw in the final product though. For that reason, I would say the code should also be held to a higher set of standards
While you're absolutely right, it is far more likely that someone was able to do this because there is a lack of security in their software engineering practices. This isn't aimed at you, but I think a lot of people are blinded by their support of Elon Musk and Tesla to acknowledge that ultimately he's one man, and Tesla are just a company that makes cars. People and companies are fallible.
Fully agree. In an ideal world all of this would be locked down, but you've gotta be pretty naive to assume that everything is perfectly locked down.
There are areas where I couldn't push code willy-nilly, namely in the security space. But I'd be willing to bet that a majority of teams at any bigN could have a single bad actor cause some damage... That's just the maturity of the industry.
>There will always be a small handful of engineers that can push the button to move code into PROD or even change code in PROD live.
What? Why?
Nobody on my development team has access to the production code signing keys. And nobody - at all - has the ability to remotely make a production system take an unsigned update.
Yes, but that changes the scenario from a "small handful of engineers" that can all do it unilaterally, to needing at least one person from N different teams to collaborate.
And in my specific case, the group with the signing keys is also the group paid to tell us "no" whenever a release is blocked for process reasons.
My guess is that most people with access to signing keys or prod environments would have enough skills to code in some sabotage bugs before deployment or siphon off some data, so a lone devops person with (physical) access could probably do a lot of harm just by themself.
Sure, I’m just saying our commitment to process is strong enough that the technical systems are a funnel into following a reasonable process.
I got dragged into defending my technical solution, but my point was that if the ability isn't needed, don't have it. You can break the glass when you need to. Good process makes deviation from it more visible. Our code signing keys belong to a team we already need wet ink signatures from to release software.
I can go into the biohazard labs with shorts on easier than I can leverage our technical disaster recovery. I’ve only ever had to do the latter.
"There will always be a small handful of engineers that can push the button to move code into PROD"
I have very different experiences from a SEC regulated company. With SOX there are controls to prevent such a thing to happen. If this is a SOX breakage, Tesla is in deep trouble with the SEC.
Can you explain how Sarbanes-Oxley applies to the situation of an employee sabotaging a production line? Specifically how it would inherently imply wrongdoing on Tesla's part.
Basically, SOX would apply if the numbers Tesla announce in quarterly reports were derived from metrics taken from production-line systems. The wrongdoing is the same in every case of SOX non-compliance: misleading investors to manipulate stock price.
So Sarbanes-Oxley came about because Enron were stating investor-facing metrics that didn't match reality. SOX compliance comes about if you:
* are publicly-listed
* announce numbers to investors
If those numbers are counted by a computer, you must show that you have procedures in place to prevent a single person from making a change to the code which would allow them to choose what number is produced.
So, if you were some web app that mentioned Monthly Active Users in your quarterly results: bam! Your release process now has to be SOX-compliant, since the MAU calculations come from analytics (from the frontend, or from access logs) which could be altered by some nefarious code.
Note that the intent of the act isn't to prevent such a thing from happening, but to make sure there is enough information for external auditors to detect it if it occurs.
I can confidently tell you that some of the largest companies in the world don't do this... Or at least don't do it to the level that an auditor could confidently say "nobody nefariously edited this code/data".
Part of Sarbanes-Oxley is to make sure IT systems are not manipulated. This includes regulating access to systems and controlling software development. Mainly this means, people who write code can't push code into production systems on their own.
One person writes a requirement, this needs to be OKed by another person, then a third person writes the code and it's pushed to production. Controls are set up - e.g. checking JIRA tickets in git logs - so that no code without proper authorization (correct JIRA status) is pushed and deployed.
People need to be able to trace every code change to the requirement and the OK.
At its core this only applies to systems that are in one way or another relevant to financial data (like ordering), but auditors usually want to be better safe than sorry. Then again, Tesla might have a SOX-IT and a non-SOX-IT.
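As a rough illustration of that "JIRA ticket in the git log" control - a sketch of a server-side check that flags commits with no ticket reference; a real control would also query the ticket system and confirm the ticket is in an approved state (the ticket-key pattern and hook wiring here are assumptions):

    # Minimal check, meant to run from a server-side hook: list commits in
    # a pushed range whose messages cite no ticket key (e.g. MFG-1234).
    import re
    import subprocess

    TICKET_RE = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")

    def commits_missing_ticket(old_rev, new_rev):
        log = subprocess.run(
            ["git", "log", "--format=%H%x00%B%x01", "%s..%s" % (old_rev, new_rev)],
            capture_output=True, text=True, check=True,
        ).stdout
        bad = []
        for entry in log.split("\x01"):
            entry = entry.strip()
            if not entry:
                continue
            sha, _, message = entry.partition("\x00")
            if not TICKET_RE.search(message):
                bad.append(sha)
        return bad  # non-empty -> reject the push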
> Part of Sarbanes-Oxley is to make sure IT systems are not manipulated.
I would expect that there are limits to this. If a rogue employee engages in fraudulent behaviour against you, using "false usernames" to subvert your security as this employee reportedly did, then I don't see how the organisation could be considered responsible.
I'm gonna guess they use github and this guy just pinged an admin to have a few new github users called things like "Legolas66" added to the organisation.
Many orgs don't track github-username-to-real-human mappings well, or at all, mostly because the single sign-on version of github is 3x the price.
In the financial industry, part of SOX is segregation of duties.
As a developer I'm not allowed to have write access to any production system, except in an emergency via a break-glass mechanism, which is audited to the hilt and back.
It also means we're not allowed to deploy software to production systems. This has to happen via a specific chain development > regression / user acceptance testing > production.
All those environments need to be physically separated, with very specific access requirements.
The deployment process needed to be signed off by our auditors.
I can't speak for other banks, but they probably need to implement the same, or a similar, system.
Neither can I speak for SOX requirements regarding software for car manufacturing.
I think it's your comment that's naive. Mature organizations have strict separation of duties, and a great deal of oversight over code reviews and code deployment. It is becoming obvious that Tesla's focus is on execution speed and other aspects (in this case internal security) are suffering.
You mean all the mature organizations that have security breaches announced every day?
Most companies do not have good security. Even when they do, it's hard to get it right, especially for internal attacks. Don't underestimate what a single individual can do when they're already inside and well-informed.
No, not for a company doing what Tesla does. In the environment I used to work in, that would have been flat out impossible. No change would have been allowed that did not follow process unless an explicit and documented exception was made.
Even if you did manage to commit code to the release trunk without review, every single change to the codebase was checked before the release process started in order to verify that the right process was followed. If we ever found a change that no one could find paperwork for, it would be reverted.
So, yeah, preventing this is possible if you want to do it badly enough.
You're describing how websites work, but that's not how embedded systems work, especially those where lives are at risk. If this happened then it can only be because good practices were not followed.
A good system will ensure that others are notified when this happens. For critical stuff you could also require 2 keys to push code to production so no one individual can do it alone.
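A trivial sketch of that two-key rule as a deploy gate (the names are made up): the author can't count toward their own quorum, and approvals have to come from distinct people.

    def deploy_allowed(author, approvers, required=2):
        # Two-person rule: drop the author and duplicates, then check quorum.
        independent = {a for a in approvers if a != author}
        return len(independent) >= required

    # The author approving their own change doesn't help:
    assert deploy_allowed("alice", ["bob", "carol"])
    assert not deploy_allowed("alice", ["alice", "bob"])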
Code review is a policy. If there's automated enforcement, it's through software written by someone, configured by someone, on a server that someone has root access to, in a room that someone has physical access to. If you have code signing, someone set up and patches the code signing server. Someone configures the code signing enforcement plane on the target devices. Someone gets to provision new accounts and enroll new keys.
Code reviews are a thing, but physical/mathematical assurance that zero people in your organization can bypass them are not. However amazing your tower of automation and policy may be, there's at least one sysadmin underneath it all, and at least one guy with keys to the cabinet underneath him. That only starts to diverge when you're running on FIPS 140-2 Level 3+ HSMs and you need to assemble a quorum of operators to do anything. And those are quite easy to sabotage - just trip the tamper protection mechanisms.
This++. Even at large, serious organizations with certifications and important government contracts, there are inevitably dozens of people who are/were involved at various levels of the security infrastructure and who happen to know of some aspect of one of the "base turtles" that is secretly a shit show amounting to "this set of people is special." We used to like to play "where's the bullshit" in security design review, because you know there's always something in there with a big fat TBD at some level, and the folks who know what they're doing will readily own up to it and have a future plan for mitigation (often something which will always be a "future" plan). In my experience, the best designs are the ones that don't try to be 100% impossible to subvert, but at least can be audited. Meaning you may be able to come up with a way to push your code into the production line, but the stakes are high because you're probably gonna get caught after the fact.
There may have been. The article speaks of the employee using "false usernames". Code review may stop some foolish person pushing broken code, but it's not going to prevent a determined saboteur who's masquerading as other authorised users.
I'm baffled at your lack of imagination, but: keyloggers, unlocked terminals, API token sniffing from cookies, reactivated old accounts, changing the reviewing account id in the database, …
At Facebook, you could alter code even after somebody had given the OK for code review. I know some people specifically kept some small commits open after being approved, just so they could quickly make changes without needing approval if they ever needed to.
I guess my thinking is if the disgruntled employee was so upset over not getting a promotion to cause so much damage they must have been in a high enough position already to warrant such a disposition, and "high enough" might mean high enough to have push access themselves.
Edit: see closeparens comment above. Complicated systems can always be subverted when trust is broken.
That's all good and nice but in most cases, this is a matter of policy. If someone actively tries to circumvent the policy or the process, odds are that most software shops would fall victim to the same thing. Especially in the case of internal software. Even if you have a system that attempts to enforce the process, odds are that the system isn't without flaw and is not too hard to circumvent.
Most businesses whose primary money maker is not defense or security related... are not running their operation with a focus on defending themselves from themselves.
> If someone actively tries to circumvent the policy or the process, odds are that most software shops would fall victim to the same thing.
Too often the focus is entirely on outside attacks, with little consideration given to insider attacks. My previous job was at a cyber security firm. We'd routinely come under attack from criminals and, we believed, the occasional nation state - our researchers attributed a few operations to "Axis-of-Evil" nation-state actors.
Then you'd come back from lunch and find the mantrap doors propped open, or someone left a workstation unlocked with root access to something important, or random guests just wandering around. It's a miracle we never were compromised by a disgruntled employee (of which there were many).
New job, new industry, same behavior. Folks leaving "secure" doors open, workstations unlocked with root access, and all that jazz. They're happy to cite a vague SEC regulation that may have applied at their last job, but doesn't apply to ours and even then, it's not like an attacker would ever give a flying fuck about what the SEC thinks.
I'm starting to wonder if this isn't all a comically bad dream.
> Then you'd come back from lunch and find the mantrap doors propped open, or someone left a workstation unlocked with root access to something important, or random guests just wandering around. It's a miracle we never were compromised by a disgruntled employee (of which there were many).
Maybe not the most technical solution to this, but one of my previous employers had a workplace culture of setting the desktop background to a rainbow with unicorns if someone left their computer unlocked. Sometimes even teams of two would work together to pwn some of the more alert members, by temporarily keeping them distracted while the other "attacker" did the business.
3 or more fails and the background bumped up to something pretty gross that nobody wanted, simply as a practical matter.
I actually thought gamifying it made it pretty fun, and at least for workstation locking, we were elite. Even today I'm sharp about it. Maybe it could be extended to physical access to areas as well.
Yup, we had two approaches. You could send out an "I love purple flowers" email to everyone, or the victim would be "Hoffed." That meant setting the desktop background to that picture of David Hasselhoff, nude with strategically-placed pups.
I worked at a place that did something similar. If you walked away from your unlocked computer, when you came back you might find you'd sent an email to everyone in the office saying that you're buying lunch today.
Yeah, this is funny, but a bad idea. You getting on somebody else's workstation, under their login - what happens if they are doing bad shit to the company? Now you have to explain why you were seen on their workstation.
We used to do this at LAN parties in the 90s... of course goatse was the preferred background. People quickly learned never to leave their seats for more than 30 seconds. Most people would just do a 5-10 minute power-nap in their chairs.
It doesn't even need to be unicorns or something gross; the simple fact that the person is seen as "the one who got caught" is enough. Yes, shaming has its uses...
Not sure if you've tried lately, but when you pay cash for a burner phone, most retail outlets require a driver license to scan your information into their system. When you activate it, you need quite a bit of personal information.
It used to be really easy, but law enforcement is cracking down on the ability to buy and activate these anonymously for obvious reasons.
Are there ways around these? Sure, but it's not nearly as easy as you're proposing.
> I find it concerning that one person was able to push malicious code to 'production'.
Production line software is almost certainly handled separately from the software that runs their vehicles, and isn't "production" in the usual web sense of being customer facing. This sounds more like someone messed with their factory automation setup.
Are you trying to tell us that less stringent controls on manufacturing software are in any way acceptable?
Manufacturing processes (including mfg software tools) are a huge potential source of product failure. I don't know about automotive QMS specifically, but I work in embedded software for safety-critical systems. If an unreleased procedure or unreleased software is used to build the hardware, or unreleased software is run on the hardware, it's not even suitable for QA (let alone going to the field).
>> Production line software is almost certainly handled separately from the software that runs their vehicles, and isn't "production" in the usual web sense of being customer facing.
> Are you trying to tell us that less stringent controls on manufacturing software are in any way acceptable?
Yes.
It is simply uneconomical to develop all software in a no-bugs-allowed NASA-style process. One of the classes of software that can easily be placed in a lower-scrutiny bucket is industrial software where the result of the industrial process will be separately checked by a quality control process.
This is in dramatic contrast to safety-critical software in cars, for example the battery management software or the fly-by-wire systems. Defects in these systems, in contrast to industrial software, are much more difficult to discover by inspection, and failures are much more costly.
Yeah, in that case the controls are moved from the development process to the downstream verification. Sorry if I wasn't clear; I originally had some wording about how known unknowns are handled in my post, but it was getting unnecessarily long and felt like a diversion from the hypothetical I was responding to. I was focused more on documentation and repeatability of process than on the quality of the tools themselves.
>> I find it concerning that one person was able to push malicious code to 'production'.
>Production line software is almost certainly handled separately [...] This sounds more like someone messed with their factory automation setup.
For a production line, I would still expect a level of control over exactly what software was running and how it was configured, even if it was SOUP.
edit: SOUP not SOAP. I need to stop posting in the morning.
No one is saying "no bugs" NASA-level code is required. Basic SQA must, however, be conducted - factory automation software is not something where you can settle for low quality, as you seem to imply. Something tells me you've never worked in factory/production environments. There are critical safety requirements that must be checked, especially in environments where machinery is operating in proximity to workers. Not only that, but there are huge financial incentives in place to ensure high quality. A bug in an early step of a manufacturing process can cause massive amounts of rework or even scrap, leading to huge losses.
Chill out. Every system that relies on human input can be fooled and broken. How do we know this employee wasn’t the one responsible for loading new code into the machines once validated? Anyone looking to spend years infiltrating a place for a goal will do whatever it takes to circumvent security measures.
Yes. It's not "manufacturing software" (if I'm right about it being the factory control system), it's more like glue code tying all of the different bits and bobs of the assembly line together. This kind of system is usually built in ladder logic or function block diagrams rather than as source code, is typically a whole lot of very simple bits side by side, and often needs to be tweaked in realtime during production. It's likely that quite a few people (definitely more than 2-3) have direct access to modify this sort of code and are probably monitoring and fine tuning it continually.
It's not really comparable to your work where you're writing the OS or system code for a safety PLC or something, and you quite rightly have to maintain a very strict process where everything is fully documented and triple-checked so you can prove that it meets your target performance level. All of the stuff that does need that kind of precision is locked away into modules (like the aforementioned safety PLC) which just gets plugged together like Lego by most of the controls team.
(All this is kind of moot, though, because it says that the culprit used a few different logins to make their mischief. Depending on the permissions set for those logins they could have bypassed pretty much any formal workflow, at least temporarily.)
That may indeed be the case, but I can still see Tesla ship "damaged" cars, that could lead to crashes, because of something like this.
Furthermore, what if Tesla has the same kind of "security" in the software division, too? Can just about anyone in the Tesla team develop and push malicious code to the Autopilot?
I've always thought that Tesla has "better than most" software security for its cars, but I've never thought even that was "good enough". I don't think any carmaker has the kind of extensive security measures put in place that should be required for something that drives you to places on its own at high-speed, can be accessed remotely, and receives OTA updates.
> To me, this suggests that Tesla, a company building highly sensitive software, does not employ basic branch policies.
Are they building highly sensitive software?
I would suggest that the factory automation systems would be sensitive to downtime more than anything else, and the ability to respond quickly with changes would be very valuable. Merging policies would only slow this down.
I would also suggest that there are many other internal tools, from machine learning infrastructure to small internal webapps, or supply chain management software that really aren't sensitive, and where the whole dev team having the _ability_ (whether that's used often or not) to push to master is advantageous.
Where is it not appropriate? Customer databases, HR tools, the autonomous driving software, perhaps?
Different development teams can work in areas with wildly different sensitivities and priorities, and in many cases it creates a better culture, faster delivery and more ownership to have developers able to push to master. Not always appropriate, but worth striving for when it is.
> I would suggest that the factory automation systems would be sensitive to downtime more than anything else
If somebody pulls some shenanigans on the software which is responsible for actually building the cars, I'd wager that such software is even more critical than the software in the cars.
Just imagine some twiddling with the brake systems being manufactured.
So I don't believe downtime is the major concern. Security of the final product, which is directly derived from such software, is.
Sure, changing factory automation in a way that might affect the quality of the finished product would be sensitive. I don't know how much of an opportunity there is to do that – I'd expect it's much more "it works or it doesn't" with factory automation, but I honestly don't know.
This was factory automation SW, not car SW. Robots could kill people too, but afaik those robots are separated from people there, so it's only economic sabotage.
And to those who were wondering about robots sabotaging security-sensitive car equipment: this should not be possible. In such situations you'll always have two independent chains of control, one in HW and one in SW, in the car. HW breaks down eventually and SW is always partially broken.
I know in Microsoft it would be relatively easy to push destructive code changes. It depends on the product though. I was able to connect to production machines.
From my experience on the O365 team, it would have been difficult to push code without approval because it would have violated a number of the compliance requirements that area of the business has to meet.
But yeah, I suppose there are other areas of the company that might have more relaxed requirements
I disagree on the O365 team, since most of the programmers are Wipro and Tata contractors who shuffle in and out.
While I was on the Enterprise services team, looking in IRadaS for something that broke after a push, we always debated whether it was malicious or incompetence.
You can enforce it on your git server. My experience is with Bitbucket but I'm sure Github has the same options. You could set up something like: nobody can push to master except through the PR UI, all PRs must have a reviewer, no PR can be merged to master without being approved by all reviewers. That way you guarantee that everything on master can be tracked to a dev + a reviewer giving sign off.
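If you'd rather script it than click through the UI, something like this should work for GitHub, assuming its REST branch-protection endpoint and field names (worth double-checking against the current API docs; the org/repo names are placeholders):

    # Require PR review before anything lands on master, and apply the
    # rule to admins too (Python + requests).
    import os
    import requests

    def protect_master(owner, repo, token):
        url = ("https://api.github.com/repos/%s/%s/branches/master/protection"
               % (owner, repo))
        payload = {
            "required_pull_request_reviews": {"required_approving_review_count": 1},
            "enforce_admins": True,
            "required_status_checks": None,
            "restrictions": None,
        }
        resp = requests.put(url, json=payload, timeout=10, headers={
            "Authorization": "Bearer %s" % token,
            "Accept": "application/vnd.github+json",
        })
        resp.raise_for_status()

    protect_master("example-org", "example-repo", os.environ["GITHUB_TOKEN"])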
Sure but who admins the git server? At work it's actually me (and my team) - we have backend access to the git repos as a necessary result of that. While it would be tricky to hide changes, if people only merge via PR it would be easy to bury a few extra commits no one worries about provided they don't break anything.
>I know in Microsoft it would be relatively easy to push destructive code changes.
So why did it never happen? Or if it did, why did Bill Gates not leak an email saying they found the culprit, and that they believe he was hired by those unscrupulous companies, Lotus or Oracle?
I would say it's like the difference between China and America having a major environmental accident - we treat developing economies and businesses differently from established ones.
Judging from their username, tslathrowaway31, this could be an insider who knows more than what's in the memo.
Regardless, I think configuration should be treated as code (use version control and reviews, write tests), since the damage you can do through misconfiguration is often as great as anything you can do with code.
As the under-handed C contest (http://www.underhanded-c.org/) demonstrates it's perfectly possible to write code that both looks innocent and apparently is innocent when run in a testing environment. You cannot produce a set of tests that show 'this code never causes anything bad to happen' so a sufficiently skilled malicious programmer could certainly write something that could cause issues that would pass stringent code review, testing and other quality control processes (especially when they have an in-depth understanding of those processes themselves).
I have no idea what has happened at Tesla, it's perfectly possible they lack processes they really should have, but you cannot prove that from the information available.
And even if they do lack review processes, it can't be said that putting in even the strictest of review procedures would stop a determined attacker.
It's almost child's play to hide a security vulnerability in a large 1000-line PR: down some error path, through a use-after-free, hidden deep in a library, a buffer overflow, a debug handler, or anything else really.
> How is it that these changes could have made it through a code review process
Most companies allow you to add code and push it after the PR has already been approved (mainly to account for small nitpick changes)
> get deployed?
I would assume through the normal process? You don't have to immediately try and blow up everything causing an immediate rollback. If you are smart, you'll pull a stuxnet [0] and carefully fuck up the machinery to try and make it go unnoticed.
Because it wasn't code pushed out to users. It was code pushed to the factory.
Code review is a methodology for getting high-quality code you can stand behind and put your brand to.
Maverick engineers making hodgepodge changes and knocking things together quickly with no oversight is the way you get a proof of concept prototype working fast.
For the factory, I take it they're still in (or have returned to) the latter methodology.
If they really had access to credentials for multiple accounts, then I don't see why they could not +2 their own code commits. Provided the code still builds and passes basic functional tests, I doubt anyone would see this.
Could it be that folks started looking at code commits when vehicles started behaving badly?
At the very least you would expect these images to be signed. There's no way code should be injectable on the floor. The boot loader would simply not accept it. And you would expect any signed images to be code reviewed, regression tested, and system tested. Still, no way for sabotage on the floor.
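Roughly what the boot-time check amounts to, sketched with an ed25519 signature over the image (Python's `cryptography` package here is purely for illustration; a real boot loader does this in firmware against a key baked into hardware):

    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

    def image_acceptable(image_bytes, signature, vendor_pubkey_raw):
        # Anyone with root on the floor can write files, but without the
        # signing key they can't produce an image this check accepts.
        key = Ed25519PublicKey.from_public_bytes(vendor_pubkey_raw)
        try:
            key.verify(signature, image_bytes)
            return True
        except InvalidSignature:
            return False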
Ok, but we are trusting this 'young and rambunctious' company to produce software for self driving cars, where lack of appropriate policy controls could result in serious consequences if another "disgruntled employee" decides to sabotage the codebase
Really, the most worrisome aspect of this story isn't the alleged sabotage: it's that a single unauthorized employee was able to make such widespread changes to the software.
Says horrible things about Tesla's security policies, especially for a company claiming that it will unleash true self-driving sometime in the next 6 months.
Are we really saying that Tesla, a company building highly sensitive software, is not employing basic branch policies? How is it that these changes could have made it through a code review process and deployed to 'production'?
What I take away from this is that one malicious actor was able to single handedly deploy malicious code and that there were no processes in place to stop this. That is a far more worrying issue
Not sure if it changes your point at all, but the changes were to the manufacturing operating system, not the car OS. I'm not familiar with manufacturing but I imagine there are a ton of different systems and perhaps not all are as secure as the other. I recall reading they needed some paint improvements, so perhaps they changed something there. I don't imagine they would have software controlling paint totally locked down.
Most manufacturing machines run G-code. Simply changing one of the G-code values (text editing) would cause the machine to make parts incorrectly, and this would screw up things down the line. https://en.wikipedia.org/wiki/G-code
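Since G-code is plain text, even a low-tech control helps: diff what's loaded on the controller against the version-controlled release and flag any drift (the file paths here are made up):

    # Compare the released G-code program with what's actually on the
    # machine controller; empty output means no drift.
    import difflib
    from pathlib import Path

    def gcode_drift(reference_path, controller_path):
        ref = Path(reference_path).read_text().splitlines()
        live = Path(controller_path).read_text().splitlines()
        return "\n".join(difflib.unified_diff(
            ref, live, fromfile="released", tofile="on-machine", lineterm=""))

    # A single edited feed rate or coordinate shows up as a one-line diff.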
On my Haas mill, the code is directly editable by the operator. On my robots, the code is 'taught' by moving the arm and telling the machine that the new location is the updated one. It is routine to have to adjust minor things on the robots, so the operator has to have access to do their job.
I had a welder put a washer under a sensor on my robot one time, making every weld 1 mm in the wrong spot. We couldn't figure it out and ended up reprogramming the entire setup, 8 hours, to fix it. A week later I noticed the washer, put there by a guy who I fired for bad attitude a week before. Once I had found the washer, I had the choice of going back to the old code, or leaving it with the new code.
I left it with the new code. I didn't have another 8 hours to waste to fix the fix.
Even tiny companies like mine can have sabotage. Why not Tesla?
"Are we really saying that Tesla, a company building highly sensitive software, is not employing basic branch policies?"
I suspect they aren't. They're trying to hit a number. Elon is on site personally stepping on anything that might impede that goal. They're in desperation mode, running a car company like a startup, erecting tents to house parts of the plant, firing contractors left and right. Are some of their coders banging in changes with little to no review and pushing to "production" -- such as it is -- right off master? Would not surprise me at all.
I don't know what to believe. On one hand Elon claims they've got some sort of admission from someone. On the other, I remember the theories he entertained when a Falcon 9 blew up on the pad; “We literally thought someone had shot the rocket.” I tried to explain how infeasible that would be, but people don't want to hear it; there's lots of irrational thinking around all things Musk/Tesla.
Anyhow, sabotage isn't uncommon in big plants. Grievances emerge and people sometimes pull some Bolt Weevil stunts to hit back. Usually manufacturers deal with the problem as quietly as possible. Elon, knowing perfectly well his email would leak, decided to make a splash instead. Maybe looking to line up some excuses if they miss the number. Who knows...
"That is a far more worrying issue"
Not for me. We have filled the world with safety; everyone playing by Marquess of Queensberry rules. Nice to see an exception. Yes, even if it costs a few lives.
(Work on VSTS team)