>“Here at Google, sometimes we just throw stuff together,” Dr. Norvig said, during a meeting of the Google Trips team, in Mountain View, Calif. “But other times, if you’re serving billions of users, it’s important to do that efficiently. A 10-per-cent improvement in efficiency can work out to billions of dollars, and in order to get that last level of efficiency, you have to understand what’s going on all the way down.”
I'm interested in this low-level systems optimization. What type of jobs hire for this work? What areas in graduate school should one focus on? Is this type of skill set still employable?
It's not necessarily low level optimization. You get better optimization results if you're willing to change things at all levels of the software stack (including user expectations); so it's really about understanding what's going on all the way down and all the way up.
As with all things, the way to get experience and knowledge is to try things. Take a look at some software you use that does something slowly, and try to make it faster; it's easier if you have the source available. Lots of applications these days are slow to start; startup time is easy to measure, and it's usually easy to bucket the work into things that don't need to be done at all and things that can be deferred. If you interact with any long data-processing loops, those are often good candidates for optimization as well.
This skill set is most employable at companies that have scale; when you have ten servers, being able to turn off one of them because of efficiency gains doesn't help that much, but it's nice to turn off ten percent of your 10,000-node cluster. It also leads to nice claims on resumes ("made efficiency gains in X, leading to Y% cost savings on servers"), and since you're likely to spend a bunch of time on a single issue, you'll be able to have a good discussion when it comes to 'discuss some of your interesting projects' in interviews.
The only downside is you'll be super frustrated about everyone else's software being slow. You'll likely gravitate towards classic games because they don't have load times; I hope you like pixel art. :D
There's a position like that in my company, open to new hires (only bachelor's needed). The goal is to optimize algorithms and data analysis models that run on our HPC cluster. It basically boils down to changing compiler settings and seeing what runs the fastest.
For example, you could read the C++ spec forwards and backwards for a year and know the big-O complexity of every data structure just as well, but neither will ever tell you that inserting into the middle of a vector is faster than inserting into the middle of a linked list in a great many common cases.
Sure, ideally you'd measure, but you need to have some idea of what options to even compare with your measurements.
I may be missing something here, but your example seems to be a textbook case for big-O analysis. Writing in a vector is O(1), while linked list is O(n), isn’t it?
It's about inserting in the middle of a vector, which requires shifting every element after that point (so, O(n)); for a linked list it can be done by allocating a new node, which, if you already have a pointer to the place where you want to insert, is O(1) - really, just a few operations.
However, in practice, the O(n) operation will often be faster - doing a hundred reads and a hundred writes to shift a hundred numbers within a few cache lines is much faster than a single read from out-of-cache memory in the linked-list case; the processor can do a lot of work while waiting for new data from RAM.
Writing into a vector is O(n), writing into a LL is O(1), assuming you have an iterator to the location to write to. If you don't, then it becomes O(n), and writing into a vector can be O(1) if you're writing to the end.
In the C++ STL, linked-list insertion is always O(1) given an iterator to the position, and inserting into the middle of a vector is O(n). There are plenty of caveats to this, and the documentation should be referenced.
He's right btw that insert into a vector is generally faster in the C++ stl than insert into a LL. I don't recall off the top of my head why that is, just remember it's the case.
Data locality is a big reason. Sure, it's O(N) writes, but those writes are neatly lined up in such a way that the likelihood of a cache miss is low. As opposed to a linked list, where just traversing to the middle may entail multiple round trips to main memory.
Working in any of the big cloud services (AWS, Azure, GCP) will involve this sort of work. Some services probably have more interesting scaling problems than others. (e.g. I bet AWS DynamoDB is more interesting than AWS CloudTrail).