My state of mind is quite different when I'm doing "work" work and when I'm doing something else for fun. Completely different experiences, with only the facts that I'm staring at a screen and using my fingers on a keyboard being common to both.
Carpentry feels surprisingly similar to programming, and there is a lot of depth to it (it is one of, if not the, oldest and most diversely expressed forms of human creativity).
> [...] If a book bores you, leave it; don’t read it because it is famous, don’t read it because it is modern, don’t read a book because it is old. If a book is tedious to you, leave it, even if that book is 'Paradise Lost' — which is not tedious to me — or 'Don Quixote' — which also is not tedious to me. But if a book is tedious to you, don't read it; that book was not written for you. Reading should be a form of happiness, so I would advise all possible readers of my last will and testament—which I do not plan to write— I would advise them to read a lot, and not to get intimidated by writers' reputations, to continue to look for personal happiness, personal enjoyment. It is the only way to read.
Anyway, since science/academic books and papers are the point of discussion in the thread, I doubt one would always have the privilege to just leave them.
Math and adjacent literature are there to rewire your brain; they will always be a struggle to read, since rewiring your brain takes effort. The exception is if you already know the topic really, really well, so that you don't have to rewire anything and can just slot the material into places you have already created, but that is impossible for topics that are new to you.
[Edit] I guess someone doesn't get the joke about how bad Word's grammar checker is. It used to flag so much text as "fragmented" without suggesting improvements.
Also, I cannot believe that some cars (looking at you, Subaru) don’t have a pause button, only mute. Not an issue when you’re listening to music. But when you’re listening to an audiobook, it’s another story.
There are existing efforts to compile SYCL to Vulkan compute shaders. Plenty of "weird quirks" are involved, since the two are based on different underlying varieties of SPIR-V ("kernels" vs. "shaders") and seem to have evolved independently in other ways (Vulkan does not have the breadth of support for numerical computation that OpenCL/SYCL have) - but nothing too terrible, or anything that couldn't be addressed by future Vulkan extensions.
Vulkan 1.3 has pointers, thanks to buffer device address[1]. It took a while to get there, and earlier pointer support was flawed. I also don't know of any major applications that use this.
Modern Vulkan is looking pretty good now. Cooperative matrix multiplication has also landed (as a widely supported extension), and I think it's fair to say it's gone past OpenCL.
Whether we get significant adoption of all this I think is too early to say, but I think it's a plausible foundation for real stuff. It's no longer just a toy.
Is IREE the main runtime doing Vulkan or are there others? Who should we be listening to (oh wise @raphlinus)?
It's been awesome seeing projects like Keras 3.0 kicking out broad intercompatibility across JAX, TF, and PyTorch, powered by flexible execution engines. Looking forward to seeing more Vulkan-based runs getting socialized, benchmarked & compared. https://news.ycombinator.com/item?id=38446353
The two I know of are IREE and Kompute[1]. I'm not sure how much momentum the latter has; I don't see it referenced much. There's also a growing body of work that uses Vulkan indirectly through WebGPU. This is currently lagging in performance due to the lack of subgroups and cooperative matrix multiplication, but I see that gap closing. There, I think wonnx[2] has the most momentum, but I am aware of other efforts.
How feasible would it be to target Vulkan 1.3 or such from standard SYCL (as first seen in Sylkan, for earlier Vulkan Compute)? Is it still lacking the numerical properties for some math functions that OpenCL and SYCL seem to expect?
That's a really good question. I don't know enough about SYCL to be able to tell you the answer, but I've heard rumblings that it may be the thing to watch. I think there may be some other limitations; for example, SYCL 2020 depends on unified shared memory, which is definitely not something you can depend on in compute shader land (in some cases you can get some of it, for example with resizable BAR, but it depends).
In researching this answer, I came across a really interesting thread[1] on diagnosing performance problems with USM in SYCL (running on AMD HIP in this case). It's a good tour of why this is hard, and why for the vast majority of users it's far better to just use CUDA and not have to deal with any of this bullshit - things pretty much just work.
When targeting compute shaders, you pretty much have to manage buffers manually, and also do copying between host and device memory explicitly (when needed - on hardware such as Apple Silicon, you prefer to not copy). I personally don't have a problem with this, as I like things being explicit, but it is definitely one of the ergonomic advantages of modern CUDA, and one of the reasons why fully automated conversion to other runtimes is not going to work well.
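To make that ergonomic difference concrete, here's a minimal sketch (my own function names and buffer sizes; CUDA runtime API, error handling omitted, needs a CUDA toolchain to build). The first function shows the explicit staging pattern you end up hand-coding against compute shaders too, just with Vulkan buffers instead; the second shows the managed-memory style that gives modern CUDA its ergonomic edge:

```cpp
#include <cuda_runtime.h>
#include <vector>

// Explicit style: manual device allocation plus explicit host<->device
// copies, the same shape of code compute-shader targets force on you.
void explicit_copy(const std::vector<float>& host_in, std::vector<float>& host_out) {
    float* dev = nullptr;
    size_t bytes = host_in.size() * sizeof(float);
    cudaMalloc(&dev, bytes);                                        // device-only allocation
    cudaMemcpy(dev, host_in.data(), bytes, cudaMemcpyHostToDevice); // stage up
    // ... launch kernel on `dev` ...
    cudaMemcpy(host_out.data(), dev, bytes, cudaMemcpyDeviceToHost); // stage down
    cudaFree(dev);
}

// Managed style: one pointer valid on both host and device; the runtime
// handles migration, so there is no explicit memcpy at all.
void managed(size_t n) {
    float* buf = nullptr;
    cudaMallocManaged(&buf, n * sizeof(float)); // visible to host and device
    // ... touch `buf` on host, launch kernel on `buf` ...
    cudaDeviceSynchronize();
    cudaFree(buf);
}
```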
Unified shared memory is an Intel-specific extension of OpenCL.
SYCL builds on top of OpenCL, so you need to know the history of OpenCL. OpenCL 2.0 introduced shared virtual memory, which is basically the most insane way of doing it. Even with coarse-grained shared virtual memory, memory pages can transparently migrate from host to device on access. This is difficult to implement in hardware. The only good implementations were on iGPUs, simply because the memory is already shared. No vendor, not even AMD, could implement this demanding feature. You would need full cache coherence from the processor to the GPU, something that is only possible with something like CXL, and that isn't ready even to this day.
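For reference, coarse-grained SVM as OpenCL 2.0 specified it looks roughly like this (a sketch using real OpenCL 2.0 entry points, but the context/queue/kernel setup is assumed to exist elsewhere and error handling is omitted; it needs an OpenCL 2.0 runtime to actually run):

```cpp
#include <CL/cl.h>

// Coarse-grained SVM: one allocation whose pointer is meaningful to
// both the host and the device kernel.
void svm_sketch(cl_context ctx, cl_command_queue queue, cl_kernel kernel, size_t n) {
    float* buf = (float*)clSVMAlloc(ctx, CL_MEM_READ_WRITE, n * sizeof(float), 0);

    // Coarse-grained SVM still needs map/unmap bracketing host access...
    clEnqueueSVMMap(queue, CL_TRUE, CL_MAP_WRITE, buf, n * sizeof(float), 0, NULL, NULL);
    for (size_t i = 0; i < n; ++i) buf[i] = (float)i;  // host writes
    clEnqueueSVMUnmap(queue, buf, 0, NULL, NULL);

    // ...but the kernel takes the raw pointer directly, with the
    // implementation migrating pages behind the scenes.
    clSetKernelArgSVMPointer(kernel, 0, buf);
    // ... enqueue kernel ...

    clSVMFree(ctx, buf);
}
```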
So OpenCL 2.x was basically dead. It had unimplementable mandatory features, so nobody wrote software for it.
Khronos then decided to make OpenCL 3.0, which gets rid of all these difficult-to-implement features so vendors can finally move on.
So Intel, while building their Arc GPUs, decided to create a variant of shared virtual memory that is actually implementable, called unified shared memory.
The idea is the following: all USM buffers are accessible by CPU and GPU, but the location is defined by the developer. Host memory stays on the host, and the GPU must access it over PCIe. Device memory stays on the GPU, and the host must access it over PCIe. These two types of memory already cover the vast majority of use cases and can be implemented by anyone. Then, finally, there is "shared" memory, which can migrate between CPU and GPU in a coarse-grained manner. This isn't page level; the entire buffer gets moved, as far as I am aware. This allows you to do CPU work, then GPU work, then CPU work again. What doesn't exist is a fully cache-coherent form of shared memory.
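The three flavors map directly onto the SYCL 2020 USM allocation functions. A sketch (the `sycl::malloc_*` entry points are the real API; the buffer size and kernel are illustrative, and it needs a SYCL implementation such as DPC++ to build):

```cpp
#include <sycl/sycl.hpp>

int main() {
    sycl::queue q;          // default-selected device
    const size_t n = 1024;

    // Host USM: lives in host memory; the GPU reads it over PCIe.
    float* h = sycl::malloc_host<float>(n, q);

    // Device USM: lives on the GPU; the host cannot dereference it
    // and must transfer explicitly instead.
    float* d = sycl::malloc_device<float>(n, q);
    q.memcpy(d, h, n * sizeof(float)).wait();   // explicit transfer

    // Shared USM: may migrate between host and device, coarse-grained.
    float* s = sycl::malloc_shared<float>(n, q);
    s[0] = 1.0f;                                // touch on host
    q.parallel_for(sycl::range<1>{n},
                   [=](sycl::id<1> i) { s[i] += 1.0f; }).wait();  // then on device

    sycl::free(h, q);
    sycl::free(d, q);
    sycl::free(s, q);
}
```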
https://enccs.github.io/sycl-workshop/unified-shared-memory/ seems to suggest that USM is still a hardware-specific feature in SYCL 2020, so compatibility with hardware that requires a buffer copying approach is still maintained. Is this incorrect?
"Using a pointer in a shader - In Vulkan GLSL, there is the GL_EXT_buffer_reference extension "
That extension is utter garbage. I tried it. It was the last thing I tried before giving up on GLSL/Vulkan and switching to CUDA. It was the nail in the coffin that made me go "okay, if that's the best Vulkan can do, then I need to switch to CUDA". It's incredibly cumbersome, confusing and verbose.
What's needed are regular, simple, C-like pointers.
Here’s an idea. Instead of the rod being parallel with the wall, have multiple small rods perpendicular to the wall. They can easily accommodate full size coat hangers. Problem solved
> Here’s an idea. Instead of the rod being parallel with the wall, have multiple small rods perpendicular to the wall. They can easily accommodate full size coat hangers. Problem solved
That takes up more wall space, and makes it harder to see what everything is.
I get that this is not the product for you (or me, either), but don't compare the 3 years of on-again, off-again mulling over the design with your 20 seconds of thought on the problem space.
> with your 20 seconds of thought on the problem space
It's a solved problem, is what I imagine OP meant by the "idea". It's not theirs. Those products exist.
It's not harder to see what everything is, because they come in variants where they are stacked vertically, can pop out, or can slide out. I agree it might not be as space-efficient, but at least you are not limited to thin items that don't crease. And it's just a rack, so you can use $0.50 coat hangers instead of $6 ones.
I do see the beauty of this design, and it can be useful when you have limited width as well (as in the van pictured), so this is not hate on it; it's just that this is not a revolution, merely a different take.
Also, access to items behind other items becomes harder and slower. And it requires a shelf above to secure to.
The horizontal pole and foldable hanger design still seems like a simpler and better approach. The design seems so good that I bet you will be able to order these from China in a couple of years for less than a dollar.
So I've thought about the problem some more, and have my own idea.
You would still have a rod mounted a short distance from the wall, like the clothes hinger design, but a little further out. The rod in the clothes hinger design has grooves that are perpendicular to the rod's long axis.
But what if you cut the grooves at an angle instead, maybe 60 degrees? This would take up about the same distance out from the wall as the clothes hinger design, but slightly more wall space. That would be a factor for very constrained walls, but for longer rods the "overhead" would be insignificant. And it would only project out from the wall about as far as the clothes hinger design (that's trigonometry!).
You would use a smaller-diameter rod, so that with the grooves cut at an angle, the effective diameter is still small enough to take a regular clothes hanger.
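The trigonometry being waved at can be sketched as follows (the groove angle θ, measured from the wall, and the hanger width w are my own labels; the 60° figure is from the idea above):

```latex
% Hanger of width w hanging at angle \theta to the wall:
d_{\mathrm{depth}} = w \sin\theta, \qquad s_{\mathrm{wall}} = w \cos\theta
% At \theta = 60^\circ:
%   depth \approx 0.87\,w   (vs. the full w for a hanger perpendicular to the wall)
%   wall run per hanger = 0.5\,w   (the "slightly more wall space")
```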
It's common for them to be stacked like stairs or, if there's more space, to slide out, so that's not usually an issue. Sometimes it's kind of both (stairs that pop up).