
Hi, I work in this field professionally - you are correct up until the tradeoffs. It is not the case that the spatial resolution is higher - in fact, semiconductor manufacturers have struggled to get ToF sensors much above VGA in mobile form factors. In general, ToF systems have a considerably lower maximum precision than structured light of the same resolution, especially at close range - typically structured light systems have quadratic error with range, but the blessing that that curse gives you is that at close ranges, precision goes up superlinearly with horizontal resolution. This is one reason that essentially all industrial 3D scanning is based on structured light. The other is multipath interference, which is specific to time of flight systems (and you can see the effects if you look at a ToF's results near a corner - the exact corner itself will be correct, but the walls immediately adjacent to it will be pushed out away from the camera).
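
To put rough numbers on that scaling - these are illustrative values I'm making up for the sketch, not figures from any shipping device:

    # Rough sketch of the first-order triangulation error model behind that
    # claim.  For structured light / stereo, depth z = f * b / d (f = focal
    # length in pixels, b = camera-illuminator baseline, d = disparity), so a
    # disparity error dd maps to a depth error dz ~= z^2 / (f * b) * dd:
    # quadratic in range, shrinking as horizontal resolution (f in pixels)
    # grows.  All numbers below are illustrative assumptions.

    def depth_error_m(z_m, focal_px, baseline_m, disparity_err_px=0.1):
        """Approximate depth error at range z_m, in metres."""
        return (z_m ** 2) / (focal_px * baseline_m) * disparity_err_px

    # VGA-ish optics (f ~ 600 px) vs 4x the horizontal resolution,
    # both with a 50 mm baseline and 0.1 px matching error:
    for focal_px in (600, 2400):
        for z_m in (0.3, 1.0, 3.0):
            err_mm = depth_error_m(z_m, focal_px, 0.05) * 1000
            print(f"f={focal_px}px  z={z_m:.1f}m  depth error ~{err_mm:.2f}mm")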

Temporal resolution is more debatable, because ToF is a lot more conducive to “extremely short, extremely bright flash” than structured light is. But for example, there are systems that run structured light (specifically, single-shot dot pattern structured light, like that seen in TrueDepth or Kinect), or its two-camera equivalent (active stereo) at exposure times of 1ms or less. It is all about the optics stack and sensitivity of the camera. So I don’t agree that temporal resolution is a compelling advantage either.

The main advantages of ToF are that it can be built in a single module (vs two for structured light), it does not require significant calibration in the field when the device is dropped or otherwise deformed, and it is easier to recover depth with decent edge quality. In general the software investment required for a good quality depth map is lower, though in this case Apple has been chewing on this for many years. Another potential advantage is outdoor performance at range - while historically this has been a significant weakness for ToFs, more modern ToFs have adopted techniques to improve this, such as deeper pixel wells, brighter and shorter duration exposures, and built-in ambient light compensation. These are hard to do with structured light without manufacturing your own custom camera module.

Finally - and I suspect this is why Apple ultimately picked it for their rear depth solution - because time of flight is a single module, it can be worked into the camera region on the rear of the device without having to add a separate opening for the illuminator and camera. The quadratic drop in accuracy with range that I mentioned above can be offset not just by resolution but by the distance between camera and illuminator - for a rear mounted device, the temptation is to make that baseline large, but this would put another hole in the back on the other side of the camera bump. I don't see Apple going for that.
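
The baseline point is easy to see with the same first-order model as the sketch above, plus a rough direct-ToF comparison whose error is set by timing rather than geometry - again, all of these numbers are illustrative assumptions, not anyone's specs:

    C = 3.0e8  # speed of light, m/s

    def triangulation_err_m(z_m, focal_px=600.0, baseline_m=0.05, disp_err_px=0.1):
        # quadratic in range, improved by a wider baseline
        return (z_m ** 2) / (focal_px * baseline_m) * disp_err_px

    def dtof_err_m(timing_jitter_s=200e-12):
        # round trip, so depth error ~ c * jitter / 2; roughly range-independent
        return C * timing_jitter_s / 2

    for z_m in (1.0, 2.0, 4.0):
        narrow = triangulation_err_m(z_m, baseline_m=0.025) * 1000  # fits by the camera bump
        wide = triangulation_err_m(z_m, baseline_m=0.08) * 1000     # needs a second opening
        print(f"z={z_m}m  b=25mm: ~{narrow:.0f}mm   b=80mm: ~{wide:.0f}mm   "
              f"dToF: ~{dtof_err_m()*1000:.0f}mm")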




Can you comment on the tradeoffs between indirect ToF (phase) and direct ToF (time)? What made Apple opt for direct ToF here - is it Microsoft's patents?


Indirect ToF: easier to manufacture in small form factors, relatively cheap, well established technology. Easier temperature calibration and lower precision required when manufacturing the emitter, meaning cheaper components and more vendors that can make them. Because the technology can only measure the shift in phase, there is phase ambiguity between waves. The way this is dealt with is to emit multiple frequencies and use the phase shifts from each to disambiguate, but you usually only get a few channels so there ends up being a maximum range, after which there is ambiguity (aliasing, if you will) about if an object falls in a near interval or a far one. Multipath can commonly also cause such artifacts in indirect ToF systems. Finally, because they are continuous wave systems, they can (though modern indirect ToFs try to mitigate this) interfere with each other like crazy if you have multiple running in the same area. I'll note that there are also gated systems that I would characterize as indirect ToF, that use a train of pulses and an ultrafast shutter to measure distance by how much of each pulse is blocked by the shutter. These suffer from more classical multipath (concave regions are pushed away from the camera), and are not very popular these days. You are right to call out that Microsoft is very heavily patented in the ToF space, and they ship probably the best indirect ToF money can buy on the HoloLens 2 and the Azure Kinect.
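
To put a number on the phase-ambiguity point: for a continuous-wave system the unambiguous range is c / (2 * f_mod), and combining two modulation frequencies stretches it out. Rough sketch with made-up frequencies (the 80/100 MHz pair is just an illustration, not any vendor's design):

    from math import gcd, pi

    C = 3.0e8  # m/s

    def unambiguous_range_m(f_mod_hz):
        # beyond this range, phase wraps and a far object aliases onto a near one
        return C / (2 * f_mod_hz)

    def measured_phase(true_range_m, f_mod_hz):
        # all the sensor actually observes, wrapped into [0, 2*pi)
        return (2 * pi * true_range_m / unambiguous_range_m(f_mod_hz)) % (2 * pi)

    f1, f2 = 80_000_000, 100_000_000
    print(unambiguous_range_m(f1))            # ~1.88 m alone
    print(unambiguous_range_m(f2))            # ~1.50 m alone
    print(unambiguous_range_m(gcd(f1, f2)))   # ~7.50 m once both are combined
    # A 0.5 m and an 8.0 m target look identical at 100 MHz -> aliasing:
    print(measured_phase(0.5, f2), measured_phase(8.0, f2))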

Direct ToF is a newer technology in the mobile space, because it has proven challenging to miniaturize SPADs, which are really the core technology enabling them. Additionally, the timing required is extremely precise, and there are not that many vendors who can supply components adequate for these systems. While there are patent advantages, there are also some significant technical advantages. Direct ToF systems have better long range performance, are much less affected by multipath, interference with other devices is minimal, and most critically - you can push a lot more power, because you emit a single burst instead of a continuous wave. This is really important for range and SNR, because all active IR imaging systems are limited by eye safety. For eye safety you care about not just instantaneous power but also energy delivered to the retina over time. It's helpful to recall that for all these active IR systems that go to consumers, they need to be safe after they've been run over by a car, dropped in a pool, shoved into a toddler's eye socket, etc - so this puts pretty strong limits on the amount of power they can deliver (and thus ultimately on range and accuracy). Direct ToFs are also really nice thermally, because your module has a chance to dissipate some heat before you fire it again (vs CW systems where you're firing it a much higher fraction of the time).
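
For a sense of scale on why the timing is so demanding - illustrative numbers, not anyone's datasheet:

    # Direct ToF: depth falls straight out of the pulse's round-trip time,
    # d = c * t / 2, so depth resolution is set by how finely the SPAD and
    # timing electronics can resolve t.  Values below are illustrative.

    C = 3.0e8  # m/s

    def depth_from_round_trip_m(t_s):
        return C * t_s / 2

    def depth_resolution_m(timing_resolution_s):
        return C * timing_resolution_s / 2

    print(depth_from_round_trip_m(20e-9))    # 20 ns round trip  -> 3.0 m target
    print(depth_resolution_m(100e-12))       # 100 ps timing bin -> ~15 mm depth
    print(depth_resolution_m(10e-12))        # 10 ps             -> ~1.5 mm depth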


Kudos on the explanation. I'd love to see you do a blog post that elaborates on these different methods with diagrams for laymen like me who find this fascinating.


Why did Sony and other chipmakers go as far as developing a true (direct) ToF sensor, rather than using continuous-wave sensing?

I've also heard that Apple went as far as trying to develop a ToF ladar by itself. What's the magic with, or the particular reason for using, direct ToF sensing instead of CW?


[flagged]


What part?


The only part that I think could be debated is my assertion that it’s single module. While technically the illuminator and receiver are different components manufactured by different vendors, they are integrated into one module well before final assembly - and generally when you buy ToFs from semiconductor manufacturers, you get a single mostly rigid part that has everything attached to it.

I suppose I could also imagine someone challenging my assertion that industrial 3D scanning uses structured light, because there are some vendors that use *temporal structured light*, where you flash different patterns to resolve different spatial frequencies on a stationary part. But beyond that I am as confused as you are.
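
For anyone curious what that looks like concretely, here's a toy sketch of one common variant (three-step phase shifting) - the pattern parameters are made up, and real scanners add coarser patterns for phase unwrapping plus calibration to turn phase into depth:

    import numpy as np

    def projected_intensities(phase_map, ambient=0.5, contrast=0.4):
        """Simulate the three captured fringe images, offset by 120 degrees."""
        return [ambient + contrast * np.cos(phase_map + k * 2 * np.pi / 3)
                for k in (-1, 0, 1)]

    def recover_phase(i1, i2, i3):
        """Standard three-step formula; result is wrapped to (-pi, pi]."""
        return np.arctan2(np.sqrt(3.0) * (i1 - i3), 2.0 * i2 - i1 - i3)

    # Round-trip check on a synthetic phase ramp:
    true_phase = np.linspace(-np.pi + 0.01, np.pi - 0.01, 256)
    i1, i2, i3 = projected_intensities(true_phase)
    print(np.allclose(recover_phase(i1, i2, i3), true_phase))  # True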


> where you flash different patterns to resolve different spatial frequencies on a stationary part

Oh, that's neat! Is the output from the sensor, when combined into a time-series, essentially a frequency-domain image of the part, such that you just apply an inverse-FFT and get a picture out?



