Swift architecture at Uber (skilled.io)
305 points by tsycho on April 26, 2017 | 144 comments



One other big contributor to app size we've noticed at Khan Academy: extensive use of value types, particularly ones whose fields require more than a couple words of storage.

The big-picture observation is that for value types with storage larger than a few words, several instructions must be emitted per call and per storage word, because such values cannot be passed by value in registers. And Swift often emits more calls than you can see (e.g. thunks, protocol witnesses, weak sentinels, etc.). This is not new to Swift—the same requirement exists when passing large C values around—but we use value types a lot more in Swift for various reasons, so the issue becomes more salient.

In terms of remediation, I audited our app for all structs larger than a few words. I just did this manually; there were something like 120 structs to look at. For each, I converted it to a class then evaluated the impact on generated code size. Only four structs had meaningful impact on generated code size (to the tune of ~13MB), and happily, they were fully immutable, so they retained their value semantics even when converted to classes. If they had not already been fully immutable, I would have had to spend some time either adapting the classes to achieve value semantics, or adapting their clients to tolerate reference semantics.
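
For concreteness, the conversion looked roughly like this (field names invented for illustration, not our actual model):

    import Foundation

    // Before: several words of storage, so every pass-by-value copy
    // is emitted as per-word instructions at each call site.
    struct LessonProgress {
        let lessonID: String
        let title: String
        let hintsUsed: Int
        let completedAt: Date?
    }

    // After: a fully immutable final class. Copies are now a single
    // retained pointer, and because nothing is mutable, sharing a
    // reference is observably identical to value semantics.
    final class LessonProgressRef {
        let lessonID: String
        let title: String
        let hintsUsed: Int
        let completedAt: Date?

        init(lessonID: String, title: String, hintsUsed: Int, completedAt: Date?) {
            self.lessonID = lessonID
            self.title = title
            self.hintsUsed = hintsUsed
            self.completedAt = completedAt
        }
    }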

Then I audited our app for all enums larger than a few words. These can be made pass-by-reference by using Swift’s `indirect` feature, which implicitly boxes associated storage. We had one enum for which this made a substantial difference, to the tune of several MB.
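
For anyone unfamiliar with the feature, the shape is something like this (a sketch, not our actual enum):

    // Without `indirect`, the enum is as large as its biggest payload
    // and the whole thing is copied by value. With `indirect`, the
    // associated values are boxed on the heap and the enum itself
    // becomes pointer-sized.
    indirect enum ContentNode {
        case text(String)
        case image(url: String, width: Int, height: Int)
        case group(title: String, children: [ContentNode])
    }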

Then I had to make sure runtime performance hadn’t been too badly damaged by all the new dynamic allocations and indirections. In the end, I observed nothing noticeable. We don’t have formal repeatable performance tests, though—it would have been interesting to see the impact on those.

C++ has this issue, too; it largely handles it by using reference arguments when consuming large stack values. In the future, it's possible for Swift to optimize many cases where this occurs (especially intramodule) by allowing deeper stack frames to reference value types stored in parent stack frames when it can prove that's safe. Rust has a lot of fanciness here you might find interesting!


There is some work ongoing to improve this situation in Swift 4.0. Take/copy/destroy operations for enums are now outlined and shared between call sites, and a further improvement is in the works to pass large structs indirectly even when they are otherwise loadable (no address-only fields).

Another recent improvement is copy-on-write existentials. There's no code size win here, but it improves runtime performance by avoiding copying the payload when passing existentials by value.


These are both wonderful pieces of news! Thanks for sharing.


Servo has unit tests that check this kind of thing https://github.com/servo/servo/blob/412f4bbb6ff075de23d1bbfd...

I think they have some more infrastructure that can audit stuff automatically in the same way you did, but I'm basing that off of a conversation I had and I'm not familiar enough with the codebase to find it; maybe it's just this set of tests.
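
A rough Swift analogue of that kind of test, using MemoryLayout (the struct and the bound here are made up):

    import XCTest

    // Stand-in for one of the structs under audit.
    struct RideRequest {
        let riderID: Int
        let pickupLat: Double
        let pickupLng: Double
    }

    final class StructSizeTests: XCTestCase {
        // Fails if someone grows the struct past four machine words,
        // prompting the struct-vs-class re-evaluation described above.
        func testRideRequestStaysSmall() {
            XCTAssertLessThanOrEqual(MemoryLayout<RideRequest>.size,
                                     4 * MemoryLayout<Int>.size)
        }
    }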


.NET's recommendation is to use value types only for values that need at most 128 bits of storage.


We ran into this issue in the rewrite as well. We ended up having to convert all of our models from structs to classes (similar to your case, they were immutable, so it wasn't a huge difference).


> The reason for this is that, as much as I know, that a compiler does type checking for every single file. So if you spawn 200 processes of Swift compilers, it needs to 200x check all the other files and make sure that you're using the correct types.

I'm a bit baffled by that. Is the Swift compiler that naive?

Surely you know how to assess the number of processors/cores on your system and spawn threads in a way that doesn't lead to diminishing returns: you use a bounded thread pool and you stay within those bounds.

Seeing such a speed-up from simply merging text files is really puzzling to me. You have to type check the code anyway; surely the overhead of opening a new file is completely negligible compared to running the type checker on the same code? Especially since all these instances of the type checker have to share a lot of data anyway.

Swift is very cool and I'm excited to see it and Kotlin become our next generation languages for mobile, front end and back end alike, but it seems to me the Swift compiler is still very immature.


There are a lot of 'obvious optimizations' like this lurking in Swift. I got the impression from an interview with Chris Lattner that they are somewhat engineer-constrained.


Is there a link where Lattner says the Swift team is engineer-constrained?


Ex: "But the real reason for doing it is that it was a small amount of work that moved Objective-C forward, which allowed the compiler and language team to focus on Swift because Swift was a very large amount of work. " http://atp.fm/205-chris-lattner-interview-transcript/


"So if you spawn 200 processes of Swift compilers"

I don't think that implies those processes were all spawned at the same time (or even that Apple's compilation process did the spawning; given the effort they spent on their build processes, I wouldn't rule out that their own tooling does the spawning).

The thing is: all these processes are independent of each other. Uber likely has tons of files that are imported into many, many files. Each compilation process that imports such a file has to individually parse it.

I can see two ways around this: reusing intermediate results between processes and using fewer processes.

The first is like precompiled headers in C/C++. The second could mean having a queue of files to be compiled and a fixed number of compilation processes that pick up files from it.

The risk in both cases is that unintended state may leak between compilations. For example, compilation flags may be different for files f and g. Does that mean g can't use a preprocessed module m generated while compiling f? Because of that, I would go for something like precompiled headers, since it makes it easier to reason about what information flows between the compilations of f and g.

I don't know whether that would be sufficient, though. The biggest concern I have w.r.t. Swift is that all its cool features, combined, make it essentially impossible to build a fast compiler. (Yes, languages such as C# have most of them and aren't that slow, but adding protocols and overflow detection, which you may want to optimize away a lot even in unoptimized builds, may just tip the balance.)


The Swift compiler has some serious issues with the way it does type inference. Try compiling code with a line of basic arithmetic on a few numeric literals.


That has literally nothing to do with what the parent comment is talking about. And that's not naivety either. That's because the way type inference, operator overloading, and literals work together ends up being a combinatorial explosion.


Kotlin has all of these features and build times don't (noticeably) suffer.


As does Rust.


From what I've heard, Rust has serious build-time issues as well, mostly because of its inability to perform separate compilation of generic code. Unlike Swift, Rust specializes everything at compile time, similar to C++ templates.


That's not exactly true: while that is usually what Rust people write, you can choose to have stuff not monomorphized. I'm also not exactly sure what you mean by "separate" here.

As usual, compile times depend on what you're used to. It is something we're working on improving though; we want it to be very fast! We expect incremental recompilation to move out of nightly soonish; that will help quite a bit.


No it doesn't. Rust doesn't have overloading, or operator overloading, or literal overloading. It literally has none of the pieces that cause the combinatorial explosion in Swift.


Rust does have operator overloading, through the traits in the std::ops module. There are only a few operators (&&, ||, etc.) that can't be overloaded.


Oh you're right about that. I was thinking of custom operators when I said no operator overloading, but of course that's not actually what we were discussing. Still, no literal overloading. Also, there are other differences between Rust and Swift's type systems that allow Rust to do type inference in a way that Swift cannot.


In regards to the tool mentioned that provides information about binary size contribution... ("If you want to see this open-source. Just scream out loud")

I am screaming out loud. Please open source this!


> Android engineers are more welcome now. Especially if they write Kotlin.

I love Swift, and I love Kotlin.

To any young programmer out there (I'm in my fifties), learn these two languages and you will be highly employable for the next decade, on top of the wave.


A decade is a pretty short time for an investment of that magnitude. Learn C (yes, I know), COBOL, Java, Erlang, Python or any other non-niche language and you'll hopefully be employable for decades, or at least until general AI rolls around.


The languages actually aren't really the big investment. It's really two things:

1) Learning new concepts in computer science. For example, Swift uses a lot of modern concepts around type safety, closures, parallelism, synchronization, etc. These involve a big learning curve, but they are not specific to Swift, and you'll notice other modern languages adopting the same concepts.

2) The UI frameworks are fundamentally different. If you're an expert at Xcode constraints, storyboards, etc., none of that applies to Android, and vice versa.

I can tell you this: if you do bite the bullet and learn both very well, you'll be surprised how much it helps you learn other systems more quickly, because you've seen all the concepts before.


> For example Swift uses a lot of modern concepts around type safety, closures, parallelism, synchronization, etc.

None of those things seem particularly modern to me. Does Swift actually have any concepts that weren't already implemented in other languages by, say, 1980?

It might seem like I'm being pedantic, but the flip side is - why not learn older, simpler, more mature languages that already have those concepts?


>None of those things seem particularly modern to me. Does Swift actually have any concepts that weren't already implemented in other languages by, say, 1980?

Modern for mainstream languages. Non-mainstream languages are irrelevant to the discussion, since nobody cares about them except niche industries, hobbyists, and academics...

>but the flip side is - why not learn older, simpler, more mature languages that already have those concepts?

Because those languages are not tied to a $50 billion app industry, and they don't have major adoption and growing support.


It has optionals instead of null.


To which the same argument applies.


Which languages before 1980 had an option/maybe type?


https://en.wikipedia.org/wiki/Tagged_union#1960s (and so on in the rest of the article; you could make the argument that being able to express a sum type and using option/result pervasively are different things; I don't know much about early ML but it was from '73...)


No, a sum type is the essence.

TIL that ML is ooold.


ML has roots that go back a long way, but it wasn't developed as a general-purpose programming language (for use outside theorem provers) until the '80s, and I'd consider it properly "released" to the general public as something intended for real use only in the 1990s, with the publication of the Standard ML definition (1990) and the release of OCaml (1996).


Protocol extensions.


It's never just the language. The libraries are where the time goes and a language without libraries is fairly useless these days.


This, and package managers.

For example, node.js: many people say it's popular because lots of webdevs knew JS already, but that's not even close to true. Most people went to node.js because they found a couple of usable libraries on npm that gave them the trust necessary to start the project.


You got it backwards: when Node started getting adoption, npm wasn't even a thing, or was still small. Back in the early Node days it was all "we can run the JS we know on the server now" (plus some cargo-cult hype about it being "fast because async").


> For example Swift uses a lot of modern concepts around type safety, closures, parallelism, synchronization, etc.

Ada. [0]

It's been around since before '83 (when it became an ANSI standard), and was developed for the DOD to replace the hodgepodge of languages they were using, with an emphasis on safety.

It has all the above, whilst focusing on being plain English.

Ada also has a few things that are considered to be fairly modern, and has had them for a long time. Such as:

* No Primitive Data Types

* Type Contracts

* Subtyping, operator overloading

The more languages change... The more they stay the same.

These concepts aren't new.

[0] http://www.adacore.com/adaanswers/about/ada


@Coldtea had it right - I meant it has many concepts that are new to the most popular platforms. For example C# is very popular, but what it has of these features has trickled in slowly over the years and many .NET programmers have had no need to master them, same with Java.

Ada was a beautiful design for its time. Maybe its most fundamental flaw was government oversight. So much about designing a widely successful language is non-technical, choosing the right features, trends, hardware, that builds enough momentum to support a self-sustaining ecosystem. To do so often requires an agility that the DOD just doesn't have. Ada is one of so many examples that reminds us technical superiority commonly loses out to pragmatism.

I do like the name, it would have been fascinating to know Ada Lovelace.


Ada does not have parametric polymorphism, algebraic data types, Objective-C interoperability, etc.


I wasn't criticising Swift. I criticised certain features of Swift being regarded as modern.

But, if you insist:

> Ada does not have parametric polymorphism.

Yes, it does. [1][2] It's supported through the use of generic units.

> Ada does not have ... algebraic data types

Ada does have tagged records and other variant types, and has had them for quite some time. [0] They aren't quite sum types, but they are incredibly close.

> Ada does not have... Objective-C interoperability

GNAT does. It's part of the GNU Compiler Collection, and as such, can be linked against other languages supported by the toolchain. GCC also supports Objective-C.

[0] http://archive.adaic.com/standards/83rat/html/ratl-04-07.htm...

[1] https://rosettacode.org/wiki/Parametric_polymorphism#Ada

[2] https://en.wikibooks.org/wiki/Ada_Programming/Generics#Param...


> GNAT does. It's part of the GNU Compiler Collection

Yay! :-)

One thing that's probably almost unknown these days is that Objective-<X> was always supposed to be something you can easily add to any <X>, and in fact there were quite a few of these, including Objective-Assembler.


Turns out I was quite ignorant of Ada - thanks for the pointers.


Ada has had generics since 1983.

Ada 2012 probably even has more generic data containers in its standard library than Swift.

ADTs can be done via tagged records.


> Ada has had generics since 1983.

That's interesting, I did not know that.

From a quick glance it appears they do not have bounded polymorphism ("protocol-constrained generic parameters"), associated types, or existential types ("values of protocol type"). So for example if you have a generic Set data type you would have to pass in an equality and hash function to each operation instead of saying that the element type is Hashable, etc.

The explicit instantiation looks quaint, but it reminds me of ML functors for some reason:

    procedure Instance_Swap is new Swap (Float);
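
For contrast, a minimal Swift sketch of the bounded-polymorphism version, where the constraint is declared once on the type parameter and there's no explicit instantiation step:

    // `Element: Hashable` gives the implementation `==` and
    // `hashValue` without callers passing functions in.
    struct TinySet<Element: Hashable> {
        private var storage: [Element: Bool] = [:]

        mutating func insert(_ value: Element) {
            storage[value] = true
        }

        func contains(_ value: Element) -> Bool {
            return storage[value] ?? false
        }
    }

    var names = TinySet<String>()   // instantiated implicitly
    names.insert("ada")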


> From a quick glance it appears they do not have bounded polymorphism

You can use abstract classes and interfaces for that.


Do those first two strike you as extremely limiting if they are not present in a programming language?


Yes. Having done a fair bit of C programming in the kernel, where generic data structures are simulated with preprocessor macros and unsafe casts, I much prefer either static languages with generics, or dynamically typed languages.


Ada was designed for the DOD by the High Order Language Working Group with exactly this in mind.

They wanted something safe for embedded use, and the original '83 specification included generics, so you could handle data structures in a nice, safe, performant manner.


That's a fair point. Personally I don't like that style either so I avoid it at all costs but I've seen it practiced. It usually revolves around creative use of 'void' pointers and terribly hard to isolate bugs.


>Swift uses a lot of modern concepts around [...] parallelism, synchronization

What does Swift have in terms of parallelism/concurrency?


If you're in your fifties, a single decade is about all you need to take you through to retirement age.

To be honest, I would've said going deep on JavaScript right now (React, React Native, etc) would give you a shelf life of at least five years.

Kotlin seems like a bit of a random recommendation - it's not in the TIOBE top 50, and a quick search for Kotlin jobs on indeed.com (the first job search site I found from Google) reveals several orders of magnitude fewer jobs than for anything mainstream (hell, even Haskell beats it comprehensively). Perhaps it's the next big thing though!


Javascript frameworks change on an annual basis if not faster.


This is a trendy sentiment that is oft repeated but of dubious veracity. Seems to me that Angular and React have remained as the preeminent Javascript frameworks for quite a few years without much signs of a shakeup, with the exception of vue.js taking on some modest gains in popularity, but still decidedly in the shadow of Angular and React.


I feel like Node, Angular, React have all been around for a few years each


If you find Erlang a bit weird at first, try out Elixir. Its syntax is more familiar, so it's a bit more approachable. One nice perk is modules can be shared between both languages.


With all due respect, I would dispute that. That would be JavaScript based - today React Native... tomorrow something similar.

Already there are large apps that are built around React Native. But more importantly, js as a language is becoming indispensable.

A few days ago, there was a discussion about Netflix, which specifically re-engineered its back end infrastructure to let its app engineers (who are JS developers) write API services in nodejs. We are talking about 40% of the Internet here.

There is a large boost in hive-mind productivity with one programming language. If Java 9 Truffle/Graal.js is as good as everyone says, then it is pretty much js all the way down.

And for those who think js is a shitty language (god knows I did), please try out ES2017 or TypeScript: you will be pleasantly surprised.


Netflix is a polyglot shop which uses a number of different programming languages for different parts of their systems. Yes, Node and JS play a large part in that, but that really isn't the whole story.


This is part of a specific re-engineering effort to enable front end engineers to write their APIs in single-threaded nodejs, and the platform takes care of scaling.

Because "they already knew js". I know that Hystrix is still core, but IMHO the shift has started.


How about learning all 3?


"First, you have to be aware that structs can increase your binary size. If you have structs into lists they are created on the stack and they can increase your binary size."

I don't get this. Is it saying that structs can increase your binary size and, as a separate issue, that they are created on the stack? Or is it saying that because structs are created on the stack, they can increase your binary size? How would that work if stack allocation is something that happens at runtime and affects your memory footprint rather than binary size? (I don't use Swift so I might be missing something here.)


In C, static structs that are not zero-initialized or left uninitialized (i.e. that are statically initialized to something other than zero) increase the binary size: they go into the data segment, so even if you only initialize one field of a struct, the binary contains an image of the whole struct. This has nothing to do with the stack in C, but I don't know about Swift.


Funny that he didn't mention the command line flag that gives compile time per function (explained here: http://irace.me/swift-profiling). That proved to be the greatest help in my case for reducing compile times drastically.
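
If I recall correctly, the flag that post describes is the frontend option below; pass it via Other Swift Flags in Xcode, or directly to swiftc:

    swiftc -Xfrontend -debug-time-function-bodies MyFile.swift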

He did mention running a tool to add explicit types everywhere, but it's very often just a matter of writing out the most generic ones by hand. Maybe not in Uber's case, but everybody else should try it.


Especially with chained calls like .filter, .flatMap, .reduce, etc., explicitly declared types take a lot of burden off the compiler.

Apart from that, dictionaries and arrays should be typed.

Also, some overloads are really hard to process; the '+' operators in particular are pretty expensive when used on non-integers and values that aren't (previously) strictly typed.

Never do something like "This is a " + birdName + " on its nest"; use the "This is a \(birdName) on its nest" form instead. But sadly this also applies to libraries like Cartography that rely on such overloads to create more elegant declarations.
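
A sketch of both points (types and values invented for illustration):

    struct Order {
        let isPaid: Bool
        let amount: Int
    }

    let orders = [Order(isPaid: true, amount: 12)]
    let birdName = "swift"

    // Annotating the chain's element and result types narrows the
    // overload search space dramatically:
    let totals: [Int] = orders
        .filter { (o: Order) -> Bool in o.isPaid }
        .map { (o: Order) -> Int in o.amount }

    // Interpolation type-checks much faster than chained `+`:
    let caption = "This is a \(birdName) on its nest"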


I feel very appreciative to have Uber working through all these bugs in the tooling so that the rest of us can take advantage once things are more reliable.


Does anyone have any other good articles/examples of maintaining large iOS applications like this? I found this super helpful and relevant to my current project, so I'm interested in other cases like it.


https://www.objc.io/issues/22-scale/ has some nice articles.


I would love to see an implementation of the router component.


Why didn't they use react native?


Why should they have? Native is a great, if not arguably the best, option.


Because it's a high-performance app that thousands of people depend on daily for a living.


Fast forward to 9:53


...and there he talks about IDE crashes? I didn't catch any mention of React.


There is no mention of RN at all; Swift simply works well for Uber. UberEats uses RN because of multiplatform support with their partners.

Uber Everything is hiring iOS developers for its logistics service? I used to work in German logistics, which runs on Java technology and has lots of issues every day. Swift could serve the logistics industry in a new way, probably using Grand Central Dispatch (GCD), which is powerful. If Swift gets coroutines after 4.0, that would also be useful for server-side Swift.

Sorry, I find the UberEats UI distracting with its animations. It does not feel native to me however much I try, and the font is awkward to read when San Francisco looks much cleaner in native iOS apps.


There are no mentions of RN at all, but all the positive points for why they used Swift are probably an implicit argument for using that instead of RN.


It's so great that they shared all these findings: I hope the Xcode team takes notice and fixes some of the nagging compile-time and indexing issues!


Unfortunately the Xcode team has known that for a long time, since people inside Apple have to use the tool themselves. The reason they don't solve the issues crippling the tool is probably the same as everywhere else: the codebase has become unmanageable and nobody wants to budget a rewrite.


"Lastly, we started combining files, and we found out that combining all of our 200 models into one file decreased the compilation time from 1min35sec, to just 17sec. So we are like, "Hold on, this is interesting, combining everything into one makes it much faster." The reason for this is that, as much as I know, that a compiler does type checking for every single file. So if you spawn 200 processes of Swift compilers, it needs to 200x check all the other files and make sure that you're using the correct types. So combining everything into one makes it much faster. "

Good to know


The technique is known as amalgamation and was first introduced in the SQLite source tree. At my company, where we do embedded software[1], this has proven to be a powerful technique. Not only is compilation extremely fast (1.3 megabytes of C code took 9 seconds to compile on a Core i3), but a modern compiler can perform additional optimizations on code when it is contained within a single translation unit.

[1]: https://unqlite.org, http://ph7.symisc.net


Why don't compilers do amalgamation themselves as a pre-compilation step?


The compiler runs too early. If you're compiling C or C++, each individual .c/.cc/.cpp file gets compiled into a .o/.obj file by the compiler. Once these are all built, the linker combines the .o/.obj files and produces a library or executable. Compilation is finished before you get a chance to combine anything—the linker is what combines the code, and it can't do much optimization.

However, with LTO (link-time optimization), the compiler doesn't finish compiling and writes out .o/.obj files with partially processed outputs. The linker is modified to re-invoke the compiler to finish compiling all the files at once. In GCC this is available with -flto.

Actual amalgamation, where you combine many C files into one, is often not possible without modifying the files. It works for SQLite because they've made sure that their code works with amalgamation.


In practice, LTO-compiled .o/.obj files just contain compiler IR. Combining C/C++ files into one is harder due to things like name resolution, etc.


My guess is that it's to prevent compilation of code which is unused. If code which isn't actually called anywhere is amalgamated and compiled, compilation time could actually increase instead of decrease.

Seems easily avoidable through basic dependency analysis, though.


That's not actually true. You have to pass special linker flags to tell the linker to avoid including code which isn't used. With GNU Binutils and GCC, those flags are -Wl,--as-needed and -Wl,--gc-sections. Normally, unused code is only excluded if it is part of a static library. Even then it is only excluded or included an entire file at a time, unless you split files into multiple sections with the compiler (which has drawbacks—the compiler can do certain optimizations if it knows that two pieces of code or some code and data end up in the same section).


Ah, good to know – thanks for correcting.


Because it means rebuilding everything every time, rather than just the relevant .o and re-linking.

On a large enough codebase, being able to rebuild only what you touch and relink is the only way to get acceptable build times in dev.


You could argue that's the point of link time optimization.


How is this different from so-called unity builds?


It's probably time spent in the linker. The advice of using WMO without optimization is very circumstantial.


No, it's valid advice. Non-WMO starts a frontend job for each file, WMO runs one job.

The frontend jobs do not share state, so each one parses all the files in the module.

In general the parser is very fast, and the type checker tries to only type check declarations in files other than the primary file when absolutely necessary, so it's not always O(n^2). But there are pathological cases you can construct today where the type checker ends up doing too much work.


It just feels hacky to do it the way the author suggested. You can just pass all the files on the command line, `swiftc *.swift`, to the same effect I think.


`swiftc *.swift` spawns one frontend job per file and then runs the linker to link together the .o's. `swiftc -whole-module-optimization *.swift` compiles all files in a single frontend job. Note that -O is independent of -whole-module-optimization, which is perhaps a bit confusing.


hm, so they rewrote the whole platform from scratch in <totally hip language of the month> and it didn't all crash and burn? That's kind of surprising - this is usually a really stupid idea because you often end up mostly solving the problems of the v1 architecture but introducing a whole bunch of different, equally painful problems - but with the added headache of the whole codebase being newish.


We were all skeptical about doing a rewrite too. A bunch of us have worked at companies that went through bad rewrites. And we've read https://www.joelonsoftware.com/2000/04/06/things-you-should-... :p

But the old Uber app had hit its age limit. It was built on top of technology chosen for a small number of features (ex: a single global DI component, lack of typing, a small MVC hierarchy). So we either needed lots of large migrations or a rewrite. The incremental migrations required to fix these issues would have been extremely disruptive, they wouldn't have gotten us an entirely refreshed UI, and they wouldn't have given us a number of other benefits.

So we decided to do a rewrite. We had lots of engineers who knew the issues to watch for from the first time we wrote the app. And a handful of us spent months researching/building different architectures, static analysis and tooling that would ensure the rewrite's success. We weren't going to repeat the same mistakes twice.

On the very first day the app launched it was more reliable and performant than the version of our app that we had been maintaining for years.


I did assume that in your case the rewrite was valid - after all, you laid out the case. I was just surprised that it went as well as it did - I've seen so many rebuilds that went to shit that it was a genuine shock to see one go well.


While I can't speak for mobile rewrites, many systems at Uber have been rewritten several times as the business needs have changed and we've grown in scale. With each iteration the systems become more general and support a greater scale and a greater variety of business needs. The added complexity of our micro-services architecture has been worthwhile because it's reduced the complexity of making changes to systems that can't be taken offline because they are part of the core trip flow. Basically, it's as if we started with a small glider years ago and have upgraded it into a 787 Dreamliner piece by piece while still flying.

We have also taken advantage of the rewrites to rebuild in Go and Java. Most of our older systems were NodeJS and Python. Many of those have been rewritten in the two languages we've now mostly standardized on.

Pretty much every rewrite I have witnessed has gone well, so I can only guess there's something we're doing right in this respect.


Why did you abandon NodeJS?


Lots of reasons such as performance, type safety, static analysis, etc. NodeJS worked well when we were smaller, but it's not uncommon to work in code you didn't write and Golang and Java are languages where the compiler helps you a lot.

One of the best policies we have at Uber is our internal transfer policy. So long as you're in good standing perf-wise, you can transfer to other teams and projects within 1-2 months. For such a policy to work, it needs to be easy to work in unfamiliar codebases. I can't speak for Java, but it's much easier to drop into an unfamiliar golang codebase and be productive and not introduce bugs than it is to do that with NodeJS or Python.

I do exclusively golang work now but when I joined I was doing exclusively NodeJS. While the first two to three months with golang were a bit frustrating, I can't imagine writing a backend in NodeJS ever again. That said, I would still use NodeJS to build frontend developer tools, but that's about it.


Problems aside, Swift is the future of the Apple platform. It's a much less verbose language than Objective-C (no headers, type inference, no @, no ;).

https://h4labs.wordpress.com/2016/02/09/should-i-use-objecti...
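
A small taste of that in Swift alone (a toy example):

    // One file, no header; no @interface/@implementation split,
    // no semicolons, and every type below is inferred.
    struct Point {
        var x = 0.0
        var y = 0.0
    }

    let p = Point(x: 3, y: 4)
    let magnitude = (p.x * p.x + p.y * p.y).squareRoot()
    print("distance from origin: \(magnitude)")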


It may be the future, and I love the syntax, but I've run into many of the same problems mentioned in the article (long compile times, SourceKit issues, increase in binary size, etc.) with my own smaller apps. It's honestly frustrating, sometimes more so than the eyesore that is ObjC.


Uber and Facebook have large apps. Uber is 500,000 lines of Swift. This means someone is already working on these problems before you get there. The video had a lot of good advice. Hopefully, they'll contribute back to the Swift compiler too.


My point is that a full switch-over is usually a terrible idea. I don't know if an incremental change-over from ObjC to Swift is possible in the Apple ecosystem, but if it is, that's almost always the better option in my experience.


Swift is so far superior to Objective C that I switched permanently with version 1.2 three years ago. No more dangling pointers is huge, and worth the toolchain issues.


It seems the Uber app was extremely over-engineered. Call me crazy but I don't think you need 100 engineers to recreate the front-end of Uber.


Well, you're not taking into account all the A/B tests they are running, or the fact that they have custom experiences for different locations (SFO, India, etc.). I can easily see how they would get up to that number pretty quickly at the scale they are at.


If you have the app, move your map pin to a different city / country to see how it changes. Lots of little and interesting changes depending on which part of the world you're in.


At some point, consider the use of different apps altogether. Perhaps having a front loader to detect which local app should pop up based on information gleaned from the phone and GPS.


What would be the criteria for splitting something out? They do this with UberEats, but I can't imagine them doing this for anything else.

Why would I as a user want to download a variety of separate apps to hail cars, etc. in different locations? People got really pissed off when Messenger was split out of Facebook (it's still mentioned in app store reviews to this day); I'd imagine they would not be wild about Uber doing something similar. I could see this getting really annoying really quickly.

Imagine landing at a new airport and having to download a new app just to hail a car or a tuk tuk or whatever.


From a user perspective, you may not have to download a new app.

Different locale, different customs, regulations may dictate entire different screens.

How an "app" is designed and packaged internally to enable the business to respond quickly does not have to impact the use experience. There are plenty of design patterns and engineering experience to draw from. It's a somewhat boring topic.

I find it more interesting to find answers to questions such as:

1. A native person in India will be presented with an "Indian" app, satisfying the locale, regulations, laws, culture, etc.

2. A tourist landing in India, well, er, what version should that person use? Laws and regulations still apply (well, Uber may have a different take :) ), but how about the user experience and colloquial details? Would the tourist prefer something from home, or something more accurate and perhaps more apt for India?


So why is this better from a user or engineering perspective than having just one app?

"There are plenty of design patterns and engineering experience to draw from. It's a somewhat boring topic."

I disagree, I think it's a fascinating topic, but then again I might be biased as I'm a mobile developer ;) Very few organizations have single apps that are worked on by > 50 people. I worked on the FB iOS app for several years. When an app gets that big you run into all sorts of problems that are not obvious, both from an engineering and a product perspective.

So to me it's pretty interesting to read about how Uber tackled these problems. Especially as it pertains to Swift which has had quite a number of performance issues and language changes.

"1. A native person in India will be presented with an "Indian" app, satisfying the locale, regulations, laws, culture, etc."

Why not have the app detect that and adapt to that user rather than having a separate app?

"2. A tourist landing in India, well, er, what version should that person use? Laws, regulations still apply (well, Uber may have a different take :) ) but how about the user experience and colloquial details? Would the tourist prefer something from home or something that more accurately, and perhaps more apt for India?"

When you land at a different airport, Uber currently provides a customized experience for that airport/locale. It knows that you are in a different locale.


I think you two are saying the same thing. First sentence of the parent comment is:

"From a user perspective, you may not have to download a new app."

Point is that even if it isn't a separate download, in different locales it may be essentially a separate app with different screens and UIs.


Sorry, maybe I'm misunderstanding what he originally said. I thought he meant breaking the Uber app up into separate sub-apps.

What you describe is what Uber currently does?


Well that's not going to save engineering resources and will also be worse for users.


It seems like they've broken into teams organized around features, where teams own their feature in all locales. I think that's a better decomposition than one centered around locales. Centering development around locale is actually incredibly absurd.


> Call me crazy but I don't think you need 100 engineers to recreate the front-end of Uber.

Your sentiment is valid, but consider this: whenever you're working with a project with a ton of money on the table, it becomes possible to add small bits of functionality which more than pay for themselves in terms of return on investment for the engineering effort. This is why you can often write a basic version of a more established application in "a weekend". We've had extensive discussion and meditation on this topic on HN before[1]. Basically apps will grow and add features as long as the engineering effort will pay for itself (I suppose I use that term loosely in this case).

Also consider that the UX is different in different markets, and the reason for that is related to the first issue.

I sympathize with what you're saying, though, and couldn't help but think about Alan Kay's talk that we discussed the other day, about natural complexity vs artificial complication.[2]

[1] https://news.ycombinator.com/item?id=12626314

[2] https://news.ycombinator.com/item?id=14188759


Any software from a big company like Uber, Facebook, Google, etc. is far more complex than what you see. There are typically dozens of features and experiments rolled out to some users and not others. There is also a lot of work necessary to serve so many different markets, such as localization and internationalization. Furthermore, in Uber's case they have a lot more iOS applications than the ones riders use. In terms of complexity, Uber iOS apps are probably comparable to Facebook's apps.


It's definitely a lot of engineers. But we cram a lot of features into the app. I expect that most people will only experience 10% of them. Consider: 1) We exist in a tonne of countries, and different countries often require different product optimizations and payment methods. 2) We have our own map provider. 3) We experiment with everything. 4) We support countries that have terrible networking. That creates lots of challenges.


From an engineering perspective, that sounds uninteresting :)

Terrible networking? How many engineers does it take to implement a few flavors of retries? :)

Uber's growth in terms of getting drivers, that's what I want to know. It's not just ride subsidies; the flow of investor honey helps, of course, but they do have some alpha growth managers.


Oh, it goes well beyond retries. You have issues like unpredictable latencies and lost network connectivity. A feature that works great when RTT is 400ms may require an entirely different approach when RTT is 30000ms. There are also issues with clocks due to both fraud and misconfigured cell towers. India for example has notoriously bad RTT times and Jakarta and several cities in LatAm have issues with clocks.

At scale, across so many different markets, with so many different features, things get complex fast. We want the user experience for both riders and drivers to be magical. More magic requires more engineering. The goal is transportation as reliable and available as running water everywhere all the time.

Here's an example of telematics engineering used to improve safety by detecting harsh braking: https://eng.uber.com/telematics/

We also use the IMU measurements to detect drivers who aren't using a phone mount and are instead holding their iPhones while driving.

Features like these allow us to advise drivers on how to be safer and get better ratings since not using a phone mount and harsh braking both correlate with negative user ratings.

A lot more goes into the Uber app to make things magical. I used to read the front page of HN daily to read about all the cool things people are working on. Since joining Uber, I check far less often, because there are so many cool problems being worked on internally that many of the front page stories sometimes seem quaint in comparison (not that many HNers aren't working on awesome things, but the quantity and quality of cool problems my colleagues work on easily captures most of my attention these days).

The best way I can describe working at Uber: it's like building the transportation component of SimCity for every city, everywhere, but it's not a simulation. Despite all the click-bait negative press, it's bar none the best place I've ever worked.


It is easy to get lost in your work and think it is special. Prior to Uber, I'd seen quite a bit, and I can tell you some stuff is good and some stuff is not that good.


Completely agree that not everything is sunshine and happiness at Uber. The quality is highly variable across the company, but my observation has been that this is a feature, not a bug. Some systems need to be built very well such as software defined networking and container infrastructure, so you need to take the time to build it right. Other areas are one off features where speed to market matters most. Faster, Cheaper, Better, choose two. I would say on average there is a good balance between those three across the company, with each team/project making the right trade offs for the short and long term business goals. Most systems that have proven valuable end up being rewritten. I've helped sunset two older systems thus far.

It's easy to drop in and criticize the architecture of some older systems, but it's also instructive if you were around when they were built and knew the trade-offs and constraints that existed when they were first conceived. The growth we've experienced has been astronomical, and it would have been a non-trivial exercise to build systems 2-3 years ago that account for the scale and business needs of today. Even today, it's hard to plan farther than 2-3 years out with the growth we are seeing. I work on a system doing millions of QPS, and in 2-3 years it will be handling an order of magnitude more QPS and more business needs. We might scale horizontally or decide a different solution is needed. I don't think there is a silver bullet, and the microservice architecture means that rewrites and re-architectures are tractable problems.


He mentioned in the very beginning that there are city teams, so perhaps the number of people is proportional (not exponential, as he says) to the growing number of cities.


Hire lots of people... and complexity, arcaneness, process, and knowledge-hoarding become vital to job security.


Also, when you need to develop a new feature it needs to go through A/B testing frameworks, localization (including Arabic right-to-left text/layouts), phased feature roll-outs, 99.999% reliability, integration testing, and planning for use cases anywhere from LTE/Wi-Fi in the USA to EDGE networks in developing countries, and you need to build the tools that do those things too. Clearly the 100 engineers are sitting there twiddling their thumbs.


I think you might have missed the point... everyone has regulations and real problems at any scale, but consider the ability of BS job roles to hide in a large workforce, and of managers to build pyramidal layers of organization of little added value (but lots of expensive management layers with ever loftier titles).


I think you have been to a few rodeos :)

Yeah, when a team goes from 2 to 100, a serious business/engineering manager has got to ask a few questions. However, if the flow of investor honey is not a problem, it could be in everyone's interest just to ride along.


They have a few other apps too, right? Driver app, Uber Eats, Uber Freight, etc.


It's basically tens / hundreds of apps rolled into one, with frontends loaded based on your currently stated or GPS location. They all share a common foundation, but features vary wildly depending on city / country.


You're coming at this backwards. They had 100 engineers already. They needed to design their software so that the engineers could all keep working on it. They explicitly say as much:

> This application has served us very well for the past four years. But as we've expanded and exponentially grown our mobile engineering team, we started seeing breakages of the architecture, we started seeing problems where feature development would become pretty hard. We had to test multiple code-paths because we shared a lot of view controllers amongst different teams. We really started to hurt with the old architecture because it was written by two engineers and we had grown the team to over one hundred, at that point.

The actual quality of the software isn't as important, since Uber drivers and riders don't have a choice anyway.


Uber riders have more than one ride-sharing app on their phone. If Uber can't match me to a driver, I'm still going to get where I want to go.


[flagged]


We detached this subthread from https://news.ycombinator.com/item?id=14208159 and marked it off-topic.


Why the hate for react native? Also, they use react native for uber eats: https://eng.uber.com/ubereats-react-native/


I don't hate react native per se. I hate that people blindly throw that out there like it in and of itself will solve real engineering problems.

It's the naivety of it all that is so annoying.


Blindly shooting it down like it wouldn't solve engineering problems is just the other side of the same coin. You haven't really provided any constructive info why it wouldn't be a good fit here.


Blindly shooting it down probably isn't the right approach, but neither is treating it at as a sort of silver bullet. There are plenty of situations where RN isn't the best solution or just isn't a good fit.

In this particular case, a company the size of Uber isn't going to reap the benefits of RN the same way a smaller one would, and RN's downsides will be magnified. Further, to fix the issues it has with RN it'll need its engineers to do native work anyway, making the extra abstraction layer something of a distraction that could've been avoided.


How about the linked article at the top? Don't you think if React Native would have solved all of those engineering issues they would have said, "I guess we use React Native!"?

The comment I replied to read, "Why didn't they use react native?" As if this would have solved EVERYTHING.


It's an interesting question though. I didn't read anything further into it (such as "oh you're saying it will solve ALL engineering problems EVER conceived by all of humankind?!!")

It's just a question, and an interesting one at that. Given they were already prepared to (and did) spend so much time rewriting, why not kill two birds with one stone and get cross-platform on top of it? There very well may be good reasons not to have used React Native. It would be insightful to know what those were, or even (to a lesser degree) to speculate.


Let's assume the person read the article. They know the issues Uber solved/was trying to solve.

If they were asking that question having not much experience in React Native, then sure, it's just a question.

If they do have experience in React Native, then this does somewhat imply that the asker is saying React Native would have solved their problems.


I think Uber uses it for its restaurant portal. Download the UberEats app for Android and open up the archive to confirm, but I think it's Java.

I'd bet their restaurant app went from Web to app, and so that likely sealed the deal.


Awesome post!


It's really irritating that Apple picked the name of an existing software product for their language. When I saw "<large Company> swift architecture" I was pretty excited to see how they were using object storage.


And IOS had been the name of Cisco's router OS for ages... They don't really care.


I was hoping to find something about how the Swift architecture affected their ability to cleanly implement a tipping feature, since they've been a tad behind the curve on this very highly demanded functionality (just pinging my general group).


The functionality of that feature is completely subjective and I'm sure top level management has had many, many meetings with that very topic as the headliner since the first weeks of development. They're very likely ahead of the curve.


Pretty sure a tipping feature would be the single worst possible feature they could ever add.



