> but I think it's fair to say the remote host is headless if it doesn't have a GPU nor a program to interface with the GPU at all
You can use software rendering for Wayland cases too. There are even OpenGL / Vulkan software implementations.
> All the remote host needs is code to interact with such a server over TCP or Unix domain sockets. And that code is tiny, even small computers without memory for frame buffer can do it
I don't really see much value in such a use case. A thin client (the reverse) makes more sense (i.e. where your side is a weak computer and the remote server is something more powerful).
But either way, running a compositor, even with software rendering, should be doable on low-end hardware.
> Sending the complete interaction as a video stream? That's what I'd call a hack
Why not? By the very nature of modern codecs, video is already heavily optimized to encode only the changes to the image, so it's the best option. You render things where they run, then send the video.
It works even for cases as change-intensive as gaming and actual video playback. Surely it works for GUIs too.
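To make the "only changes" point concrete, here is a toy sketch in C of the inter-frame idea: compare two frames in 16x16 blocks and count which blocks actually changed. Real codecs go much further (motion compensation, residual coding), but the principle is the same; the frame and block dimensions here are arbitrary assumptions.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Toy inter-frame comparison: only blocks that differ from the
       previous frame would need (re-)encoding. Sizes are arbitrary. */
    #define W 1920
    #define H 1080
    #define B 16

    static size_t changed_blocks(const uint8_t *prev, const uint8_t *cur) {
        size_t dirty = 0;
        for (int by = 0; by < H; by += B)
            for (int bx = 0; bx < W; bx += B) {
                int changed = 0;
                for (int y = by; y < by + B && !changed; y++)
                    changed = memcmp(prev + (size_t)y * W + bx,
                                     cur  + (size_t)y * W + bx, B) != 0;
                dirty += changed;
            }
        return dirty;
    }

    int main(void) {
        static uint8_t a[W * H], b[W * H];   /* two grayscale frames */
        b[123 * W + 456] = 255;              /* a single changed pixel */
        printf("dirty blocks: %zu of %d\n",
               changed_blocks(a, b), (W / B) * (H / B));
        return 0;
    }

A static desktop encodes to almost nothing; only the blocks that change cost bandwidth.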
> You can use software rendering for Wayland cases too. There are even OpenGL / Vulkan software implementations.
That's actually beside the point I intended, which was to give an example of how little code the X11 client actually needs. OpenGL/Vulkan software implementations are the opposite of little.
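To illustrate, a minimal xcb client fits in a couple dozen lines: it connects to whatever $DISPLAY names (TCP or a Unix domain socket) and asks the server to create and show a window, with no client-side framebuffer at all. Build with cc demo.c -lxcb.

    #include <unistd.h>
    #include <xcb/xcb.h>

    int main(void) {
        /* Connect to the X server named by $DISPLAY. */
        xcb_connection_t *c = xcb_connect(NULL, NULL);
        if (xcb_connection_has_error(c))
            return 1;

        /* Ask the server to create and map a window; all pixel
           storage stays on the server side. */
        xcb_screen_t *s = xcb_setup_roots_iterator(xcb_get_setup(c)).data;
        xcb_window_t w = xcb_generate_id(c);
        xcb_create_window(c, XCB_COPY_FROM_PARENT, w, s->root,
                          0, 0, 320, 240, 1,
                          XCB_WINDOW_CLASS_INPUT_OUTPUT,
                          s->root_visual, 0, NULL);
        xcb_map_window(c, w);
        xcb_flush(c);

        pause();            /* keep the connection (and window) alive */
        xcb_disconnect(c);
        return 0;
    }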
> I don't really see much value in such a use case. A thin client (the reverse) makes more sense (i.e. where your side is a weak computer and the remote server is something more powerful).
Yes, I can see that, e.g. remotely using a supercomputer. However, I think that GPU-capability-wise the devices people use to interact with graphical systems are sufficient for almost any interaction task. And if X11 were able to video-stream just the important bits (I imagine those would be large updated bitmap areas within the user interface, so video-encoded server-side bitmap transport would do it), it would be just as suitable for that kind of asymmetric case, while still being usable for IoT scenarios, where you have those tiny computers providing sensor data, of which there are probably hundreds of millions if not billions by now.
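A sketch of how such a hybrid transport could decide per damage rectangle; nothing like this exists in the protocol today, and the threshold is an invented tuning knob:

    /* Hypothetical dispatch for a hybrid X11 transport: small damaged
       areas travel as ordinary raw-pixel requests, large ones get
       handed to a video encoder. Threshold is a made-up assumption. */
    enum transport { RAW_PIXELS, VIDEO_STREAM };

    static enum transport pick_transport(int width, int height) {
        const long VIDEO_THRESHOLD = 256L * 256L;  /* pixels */
        return ((long)width * height >= VIDEO_THRESHOLD)
                   ? VIDEO_STREAM : RAW_PIXELS;
    }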
In principle it's also trivial to convert an X11-style display interaction into a video transport (just run the X11 server on the remote end and stream its output), while the inverse is impossible. So with the X11 style you could choose either one, depending on your devices and needs.
> But either way, running a compositor, even with software rendering, should be doable on low-end hardware.
And how about video-encoding that data on low-end hardware without hardware assistance? Even with hardware assistance there are limits: NVidia, for example, caps the number of video encoding sessions (and thus the number of distinct video streams) at five, and not all hardware can do even that. So it comes down to CPU time if you have multiple such sessions, and running high-quality video encoders is not a walk in the park for a CPU.
Because the streams would run between the same two endpoints, multiple streams could be packed inside a single one to save encoder sessions, but I don't think anyone's doing that.
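A sketch of what that packing could look like (everything here is hypothetical; no system I know of ships this): blit each application surface into one shared atlas frame, encode the atlas through a single session, and send the per-surface rectangles alongside the bitstream so the receiver can crop them back out.

    #include <stdint.h>
    #include <string.h>

    /* Hypothetical atlas packing: several application surfaces share
       one encoded frame so a single encoder session carries them all. */
    typedef struct {
        int x, y, w, h;          /* placement inside the atlas */
        const uint8_t *pixels;   /* the surface's own pixel rows */
        int stride;
    } surface_t;

    static void blit_into_atlas(uint8_t *atlas, int atlas_stride,
                                const surface_t *s) {
        for (int row = 0; row < s->h; row++)
            memcpy(atlas + (size_t)(s->y + row) * atlas_stride + s->x,
                   s->pixels + (size_t)row * s->stride, (size_t)s->w);
    }
    /* After blitting every surface, feed the atlas to one encoder and
       send the (x, y, w, h) table as side-channel metadata. */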
Alternatively, the encoder-count limitation matters less if you run a single stream (stream the whole desktop), but personally I consider per-application remote use a much more flexible system, and it's the default provided by X11.
> By the very nature of modern codecs, video is already heavily optimized to encode only the changes to the image, so it's the best option. You render things where they run, then send the video.
Surely scrolling a document by instructing the server to render a different part of the server-side bitmap to its own screen is going to be far more efficient than encoding and decoding video, when considering latency, quality, energy consumption, memory usage and bandwidth?
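That's a single CopyArea request over the wire. A minimal sketch with xcb, where the connection, window, graphics context and geometry are assumed to be set up elsewhere:

    #include <stdint.h>
    #include <xcb/xcb.h>

    /* Scroll a window's contents up by dy pixels entirely server-side:
       one tiny request, no pixels transported, nothing to video-encode.
       conn, win and gc are assumed to exist in the caller's context. */
    static void scroll_up(xcb_connection_t *conn, xcb_window_t win,
                          xcb_gcontext_t gc, uint16_t w, uint16_t h,
                          uint16_t dy) {
        xcb_copy_area(conn, win, win, gc,
                      0, (int16_t)dy,   /* source: below the offset   */
                      0, 0,             /* destination: top of window */
                      w, (uint16_t)(h - dy));
        /* The client then repaints only the newly exposed strip at
           the bottom, e.g. in response to graphics-expose events. */
        xcb_flush(conn);
    }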
I still find it a very questionable idea to run GUI stuff on an IoT device and stream the GUI to you. I just don't get why you would want to do that versus, say, HTTP access.
I have an HTTP server running on my printer and I'm pretty sure it doesn't have a powerful CPU. The web page it provides is more than adequate for controlling it.
A streaming-style remote interface only makes sense to me when the remote computer is strong enough to do everything the GUI needs, like the thin-client idea above, game streaming and so on.