Hacker News | bleke's comments

Sorry for being off topic, but does Intel calculate bonuses based on HN karma (or, more officially, "impact")? I keep seeing this bf16 post, and it looks like the authors are dying for a Christmas bonus.


To me it looks like a clever optimization: the same range as FP32, but half the size and less precise, and it can be converted back and forth just by truncating or appending zero bits.
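That truncate-and-pad round trip can be sketched in a few lines of plain bit-twiddling (illustrative only, not any particular library's API):

```python
import struct

def fp32_to_bf16_bits(x: float) -> int:
    """Truncate FP32 to bfloat16 by keeping only the top 16 bits."""
    (bits,) = struct.unpack('<I', struct.pack('<f', x))
    return bits >> 16

def bf16_bits_to_fp32(b: int) -> float:
    """Widen bfloat16 back to FP32 by appending 16 zero bits."""
    (x,) = struct.unpack('<f', struct.pack('<I', b << 16))
    return x

# pi loses its lower mantissa bits but keeps FP32's full exponent range
print(bf16_bits_to_fp32(fp32_to_bf16_bits(3.14159265)))  # -> 3.140625
```

Note this sketch truncates rather than rounds to nearest, which is the simplest (and lossiest) way to do the narrowing.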

Is anyone else using it?


Google uses it on their TPUs [0]. If you're interested in how it would affect the numerical stability of an algorithm you want to use, there is a Julia package that makes prototyping linear algebra over this datatype pretty straightforward [1].

[0] https://cloud.google.com/tpu/docs/system-architecture

[1] https://github.com/JuliaComputing/BFloat16s.jl


And Facebook is taking this even further. While all these things are very cool, don't let ASIC designers claim they are barriers to entry for GPUs and CPUs. Whatever variants of this precision potpourri catch on are but a generation away from incarnation in general-purpose processors, IMO...

https://code.fb.com/ai-research/floating-point-math/


Google's TPUs use them, and have for a year now, so I don't agree with the "new" or "Intel's" in the title.


And the TPU uses them because TensorFlow uses them; the type has been present since the first public commit: https://github.com/tensorflow/tensorflow/blob/f41959ccb2d9d4...


I would be extremely surprised if the motivation for putting bfloat16 in tensorflow was not the TPU. That first public commit was ~1.5 years before TPUv2 was announced at I/O, so it was almost certainly already in development.


bfloat16 was first in DistBelief, so it actually predates TensorFlow and TPUs (I worked on both systems). IIRC the motivation was more about minimizing parameter exchange bandwidth for large-scale CPU clusters rather than minimizing memory bandwidth within accelerators, but the idea generalized.


Thank you! I didn't know this. I thought they introduced them shortly after announcing TPU v1 at Google I/O in 2016 (or 2017, I can't remember).


Why is it clever to change the mantissa and exponent sizes? I thought the clever one was Nervana's Flexpoint, which seemed at least partially novel. And it's interesting that Intel isn't pushing that format, given that Nervana's ASIC had it.


Or you go the hacker route :): Greasemonkey with 33 lines (including the filter) and you never see nytimes, wsj, washington, techcrunch and the usual offenders again.


Being cynical here: fines from GDPR & friends are incoming, so maximise profits before they're forced to pay. The market is already saturated, and without pissing off the user base there will be no double-digit "adoption" rate.


Beamforming is cheap; for a sector scan a phone CPU can easily handle it. Of course it won't give you a big HQ image, but for an initial scan it is enough. The ADC with pulse generator, though, is expensive: for decent-quality digital steering you need, for example, a minimum 16-channel ADC in the MHz range, and that is not cheap. If you think it's cheap, go and google a 128-channel ADC plus 128 pulse generators with protection circuits (I know somebody will say overkill, but for NDT it's pretty standard).

NDT = non-destructive testing
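To illustrate why the digital steering itself is computationally light (the expense is in acquiring the per-channel data), here is a minimal delay-and-sum sketch. The array geometry, sample rate, and sound speed below are made-up illustration values, not figures from the comment:

```python
import numpy as np

def delay_and_sum(rf, pitch, c, fs, angle_rad):
    """Steer a linear array toward angle_rad by delaying and summing channels.

    rf: (n_channels, n_samples) received data, one row per element.
    pitch: element spacing (m); c: sound speed (m/s); fs: sample rate (Hz).
    """
    n_ch, n_samp = rf.shape
    t = np.arange(n_samp) / fs
    out = np.zeros(n_samp)
    for ch in range(n_ch):
        # Advance each element by its plane-wave arrival delay so a
        # wavefront from angle_rad lines up across all channels.
        delay = ch * pitch * np.sin(angle_rad) / c
        out += np.interp(t + delay, t, rf[ch], left=0.0, right=0.0)
    return out / n_ch

# Demo: a plane-wave Gaussian pulse arriving from 30 degrees on a
# 16-element array (0.3 mm pitch, 20 MHz sampling, c = 1540 m/s).
fs, c, pitch = 20e6, 1540.0, 0.3e-3
theta = np.deg2rad(30)
t = np.arange(512) / fs
tau = pitch * np.sin(theta) / c  # per-element arrival delay
rf = np.array([np.exp(-((t - 10e-6 - ch * tau) / 0.2e-6) ** 2)
               for ch in range(16)])

on_target = delay_and_sum(rf, pitch, c, fs, theta).max()
off_target = delay_and_sum(rf, pitch, c, fs, -theta).max()
print(on_target > 2 * off_target)  # steering at the source gives a far stronger peak
```

The whole scan is one interpolation and one sum per channel per angle, which is well within a phone CPU's budget; what the code cannot conjure up is the 16+ synchronized MHz-rate ADC channels feeding `rf`.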


Googled it. The DDC1128ZKLT ADC costs 212€ a piece, but it can only sample at 6.25 kS/s, which is way too low for this application. Eight fast 16-channel ADCs cost a fortune, and they also require at least one FPGA with a high pin count, which isn't cheap either. My conclusion: a decent ultrasound scanner can't be cheap, from a bill-of-materials perspective alone.


If you don't like Linus, you can always fork Linux and create DyslexicAtheist OS :). But no, you are attacking a position he doesn't hold; he clearly says a few mails later that killing the machine without a very good reason is a big no:

I absolutely refuse to take any hardening patches at all that have BUG() or panic() or similar machine-killing in it.


I think the bulk of the argument is about the insult. Creating a new fork doesn't really help anyone, and creating a new standard just increases the number of standards.


Yes, you can, and you can start whatever program you want. Take a look at better-initramfs (https://github.com/slashbeast/better-initramfs), although it is geared more toward Gentoo/Funtoo.

Basically, when Linux boots from an initrd it starts the "/init" executable and chroots into your system (OK, it uses pivot_root and it is slightly more complex), but the idea is the same as manually mounting filesystems, doing a chroot, and then starting the executable /bin/init.
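As a sketch of what that "/init" can look like (the device names and paths here are made-up examples; a real initramfs also needs the tools themselves, e.g. busybox, copied in, which better-initramfs handles for you):

```shell
#!/bin/sh
# Minimal illustrative initramfs /init (example devices; adjust to taste).

# Pseudo-filesystems the rest of boot expects.
mount -t proc proc /proc
mount -t sysfs sysfs /sys
mount -t devtmpfs devtmpfs /dev

# Mount the real root filesystem read-only.
mount -o ro /dev/sda1 /newroot

# Make /newroot the root and exec the real init as PID 1.
# (switch_root wraps the pivot_root/chroot dance described above.)
exec switch_root /newroot /sbin/init
```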


There's also an example script on the Gentoo wiki that might serve as a good starting point: https://wiki.gentoo.org/wiki/Custom_Initramfs/Examples#Simpl...


I agree. I started to notice it in Google search about this time last year (or maybe I'm getting older, or maybe DDG started to serve better results). Nowadays I use Google rarely; this year I can't remember getting better results from it than from DDG, but maybe that's just my bias.


My couch-scientist theory is that REM sleep is the phase when the brain reconnects and tunes the quantum links between neuron groups (assuming the brain is a big quantum computer underneath); that's why there are chaotic movements.


That's a pretty big assumption there. There's a huge difference between "exploits quantum effects" (like classical computers today) and "is a quantum computer". What evidence leads you to suspect the latter?


I'm not a lawyer, but I'd say there's a high probability they have a "legitimate interest" in keeping the keys (something like preventing fraud), and that the keys have an expiry time.


It needs to be revised, but that is not as cool(tm) a project as creating a new language, and rethinking how to avoid any pointer arithmetic (starting from char *str, all the way to arrays like char str[999]) is not a very easy task.

