Vulkan is the modern option; the difference is not being stuck with legacy paths and being able to use explicit sync.
Wayland is also the modern option, so I don't really worry about X11 use cases. For remote desktops, better to use something like FreeRDP anyway. X11 forwarding is much worse in every sense.
I think KDE are working on integrating FreeRDP server into Plasma for seamless usage.
Another thing to add for Firefox would maybe be switching to Vulkan video from VAAPI (or at least having it as an option, since ffmpeg already supports it) and using hardware acceleration for video encoding too, not just for video decoding.
X11 can also do remote window forwarding, not just desktops, which is super handy. Your windows appear on the other computer under its own window manager, just as if you ran them locally. One of the reasons I still use X.
For barebones window forwarding (no input) I use something like gpu-screen-recorder with SRT streaming output and play the result on the other end with mpv / ffplay.
Haven't looked into it, but FreeRDP might support specific window forwarding too rather than the whole desktop.
If you need something fancier there is Sunshine / Moonlight, but they still have the issue of not using PipeWire for window / screen capturing (and kmsgrab is not really the proper way to do it).
Anyway, X11 is a complete dead end in general so it's not really a viable option for anything serious.
X11 may be a dead end but Wayland sucks as a replacement, so for now, I see no other option than supporting them both.
It may be technically possible to do the equivalent of X11 forwarding with Wayland, that is, connect to a server over an ssh terminal (no remote desktop, headless server), run a GUI app, and have it display its windows on my own desktop as if it were running locally. The problem is that Wayland is 17 years old and I still can't.
FreeRDP is pretty feature rich, so I wouldn't call it a kludge.
For any kind of decent remote desktop access you need good performance, specifically low latency. X11 just isn't there.
Headless server is headless server - you can't have anything in such case there with X11 either. If you want to forward X11, you need X server, which means it's already not headless.
Instead of X server you can have any Wayland compositor (Wayland server) and whatever part that provides streaming (FreeRDP or what not).
So I don't see how X11 is any better - it's just worse due to having abysmal performance. X11 was never designed for real world remote desktop usage - it just happens to have network transparency. So it's X11 that's a kludge for such scenario if anything.
> Headless server is headless server - you can't have anything in such case there with X11 either. If you want to forward X11, you need X server, which means it's already not headless.
To me this reads a bit confused, but perhaps I'm misreading it? In X11 terminology the server is sitting in front of you (the one that draws to the screen), so no, you don't need the remote host to be running an X11 server.
You do need the program that draws to the screen, but I think it's fair to say the remote host is headless if it doesn't have a GPU nor a program to interface with the GPU at all. All the remote host needs is code to interact with such a server over TCP or Unix domain sockets. And that code is tiny, even small computers without memory for frame buffer can do it.
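To give a sense of how small that client side can be, here is a rough Python sketch of the first message an X11 client sends after opening the socket. The helper names are made up for illustration, the cookie is a dummy value, and the byte layout follows my reading of the connection setup in the X11 core protocol, so treat it as a sketch rather than a reference.

```python
import struct

def pad4(n):
    """Number of padding bytes needed to reach a multiple of four."""
    return (4 - n % 4) % 4

def x11_setup_request(auth_name=b"", auth_data=b""):
    """Build an X11 connection setup request (hypothetical helper name).

    Layout: byte order, pad, major version, minor version,
    auth name length, auth data length, 2 pad bytes, then the
    padded auth name and the padded auth data.
    """
    header = struct.pack(
        "<BxHHHHxx",
        ord("l"),          # 'l' marks a little-endian client
        11, 0,             # protocol version 11.0
        len(auth_name),
        len(auth_data),
    )
    return (header
            + auth_name + b"\0" * pad4(len(auth_name))
            + auth_data + b"\0" * pad4(len(auth_data)))

if __name__ == "__main__":
    # Dummy 16-byte cookie, purely for illustration.
    msg = x11_setup_request(b"MIT-MAGIC-COOKIE-1", bytes(16))
    print(len(msg), "bytes")
```

Everything after this is the same kind of fixed-layout packing, which is why a device with no frame buffer of its own can still speak the protocol.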
> So I don't see how X11 is any better - it's just worse due to having abysmal performance. X11 was never designed for real world remote desktop usage - it just happens to have network transparency. So it's X11 that's a kludge for such scenario if anything.
I think X11 was actually pretty great at the time it was created: clients can create IDs and use them in their requests (no round trip to the server), and the server can hold large client bitmaps that the client can operate on. But sometimes poor client coding can kill the performance over the network. As the worst offender, I once noticed VirtualBox did a looooot of synchronous property requests during its startup instead of issuing them concurrently, stretching the startup time from seconds to a minute or more. (Whether it truly needed those properties in the first place is another question.)
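A back-of-the-envelope model of why that matters, in Python. The request count and round-trip time are made-up numbers, not measurements of VirtualBox, and the model assumes server processing time is negligible next to the network round trip.

```python
def startup_seconds(n_requests, rtt_s, pipelined):
    """Estimate the time a client spends waiting on replies.

    Simplification: only network round trips are counted.
    """
    if pipelined:
        # All requests go out back to back; the replies come back
        # roughly one round trip later.
        return rtt_s
    # Each request waits for its reply before the next one is sent.
    return n_requests * rtt_s

if __name__ == "__main__":
    n, rtt = 2000, 0.03   # hypothetical: 2000 property requests, 30 ms RTT
    print(f"synchronous: {startup_seconds(n, rtt, False):.1f} s")
    print(f"pipelined:   {startup_seconds(n, rtt, True):.2f} s")
```

On a local socket the round trip is tiny, so nobody notices the synchronous pattern until the same program is run over a network.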
Sending the complete interaction as a video stream? That's what I'd call a hack—though X11 should be modernized in various aspects, for example to support more advanced encodings for media, controlled by the client.
In some sense the web is the direction where I would have liked to see X11 going: still controlled by the client, but some light server-side code could be used to render and interact with the widgets. This way clicks would react immediately, but you would still be interacting with the actual service running on the remote host, not just a local program.
(Another reason why I consider X11 better is the separation of the server and the compositor.)
> but I think it's fair to say the remote host is headless if it doesn't have a GPU nor a program to interface with the GPU at all
You can use software rendering for Wayland cases too. There are even OpenGL / Vulkan software implementations.
> All the remote host needs is code to interact with such a server over TCP or Unix domain sockets. And that code is tiny, even small computers without memory for frame buffer can do it
I don't really see much value in such use case. Thin client (the reverse) makes more sense (i.e. where your side is a weak computer and remote server is something more powerful).
But either way, running a compositor even with software rendering should be doable even on low end hardware.
> Sending the complete interaction as a video stream? That's what I'd call a hack
Why not? Video by the mere nature of modern codecs is already very optimized on focusing only on changes to the encoded image, so it's the best option. You render things where they run, then send the video.
It works even for such intense (changes wise) cases as gaming and actual video media. Surely it works for GUIs too.
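A toy Python illustration of the "only send what changed" idea. To be clear, this is plain frame differencing on tiles, a stand-in I picked for simplicity; real codecs use motion-compensated prediction and are far more sophisticated. The function name and tile size are arbitrary.

```python
def changed_tiles(prev, curr, tile=4):
    """Return (x, y) origins of the tiles that differ between two frames.

    Frames are equally sized lists of rows of pixel values.
    """
    height, width = len(curr), len(curr[0])
    dirty = []
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            a = [row[tx:tx + tile] for row in prev[ty:ty + tile]]
            b = [row[tx:tx + tile] for row in curr[ty:ty + tile]]
            if a != b:
                dirty.append((tx, ty))
    return dirty

if __name__ == "__main__":
    prev = [[0] * 16 for _ in range(16)]
    curr = [row[:] for row in prev]
    curr[5][9] = 255   # a single pixel changes, e.g. a blinking cursor
    print(changed_tiles(prev, curr))
```

For a mostly static GUI only a handful of tiles would need to be re-encoded per frame.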
> You can use software rendering for Wayland cases too. There are even OpenGL / Vulkan software implementations.
That's actually beside the point I intended, which was to provide an example of how little code the X11 client actually needs. OpenGL/Vulkan software implementations are the opposite of little.
> I don't really see much value in such use case. Thin client (the reverse) makes more sense (i.e. where your side is a weak computer and remote server is something more powerful).
Yes, I can see that, e.g. remotely using a supercomputer. However, I think that GPU-capability wise the devices people use to interact with graphical systems are quite sufficient for almost any interaction task. If X11 were able to video stream just the important bits (I imagine those would be large updated bitmap areas within the user interface, so video-encoded transport of server-side bitmaps would do it), it would be just as suitable for that kind of asymmetric case, while still being usable for IoT scenarios, where you have those tiny computers providing sensor data, of which there are probably hundreds of millions if not billions by now.
In principle it's also trivial to convert an X11-style display interaction into video transport (just run the X11 server on the remote end), while the inverse is impossible. So with the X11 style you could choose either one, depending on your devices and needs.
> But either way, running a compositor even with software rendering should be doable even on low end hardware.
And how about video-encoding that data on low end hardware without help from hardware? And even with the help of hardware, e.g. NVIDIA has limited the number of video encoding sessions (so the number of distinct video streams) to five, and not all hardware can even do that. So it's CPU time if you have multiple such sessions, and running high-quality video encoders is not a walk in the park for a CPU.
Because the streams would be between two end points, multiple streams could be packed inside the same stream to save encoder sessions, but I don't think anyone's doing that.
Running a single stream (streaming the whole desktop) is another way around the encoder session limit, but personally I consider per-application remote use a much more flexible system, and it is the default provided by X11.
> Video by the mere nature of modern codecs is already very optimized on focusing only on changes to the encoded image, so it's the best option. You render things where they run, then send the video.
Surely scrolling a document by instructing the server to render a different part of the server-side bitmap to its own screen is going to be way more effective than encoding+decoding video, i.e. when considering latency, quality, energy consumption, memory usage and bandwidth?
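To put a number on the X11 side of that comparison, here is a Python sketch that packs a core-protocol CopyArea request. The resource IDs and geometry are hypothetical, and the layout is my reading of the protocol spec. The raw pixel figure is only there for scale; a video encoder would of course compress well below it. The point is that the request size does not depend on the area being scrolled.

```python
import struct

COPY_AREA_OPCODE = 62  # CopyArea in the X11 core protocol

def copy_area_request(src, dst, gc, src_x, src_y, dst_x, dst_y, width, height):
    """Encode a little-endian CopyArea request (7 four-byte units)."""
    return struct.pack(
        "<BxHIIIhhhhHH",
        COPY_AREA_OPCODE, 7,          # opcode, pad, request length in 4-byte units
        src, dst, gc,                 # source drawable, destination drawable, GC
        src_x, src_y, dst_x, dst_y,
        width, height,
    )

if __name__ == "__main__":
    # Hypothetical IDs; scroll an 1800x1000 view by 40 pixels.
    req = copy_area_request(0x400001, 0x400002, 0x400003, 0, 40, 0, 0, 1800, 1000)
    raw_pixels = 1800 * 1000 * 4      # same area as uncompressed 32-bit pixels
    print(len(req), "bytes on the wire vs", raw_pixels, "bytes of raw pixels")
```

Only the newly exposed strip (40 rows here) would still need fresh pixel data from the client.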
I still find it a very questionable idea to run some GUI stuff on an IoT device that streams the GUI to you. I just don't get why you would want to do that, vs. let's say having HTTP access.
I have an HTTP server running on my printer and I'm pretty sure it doesn't have a powerful CPU. The web page it provides is more than adequate for controlling it.
A remote interface in streaming fashion only makes sense to me when the remote computer is strong enough to do all that's needed for the GUI to work. Like the thin client idea above, game streaming and so on.
Yeah, I've seen it in action (NoMachine/NX). It's not bad. But the problem is that it's not open source, so it's sort of DOA, unlike all the open options. They should have opened it from the start for it to be relevant.
Yes, it is a direct clone of the earlier NoMachine NX. That was open source, and later they moved to a new closed-source protocol. FreeNX took the earlier one over.
And no, it doesn't support Wayland, of course. It's an X11 accelerator, and its design is heavily tied to the X11 design. It doesn't replace X11's remote display support, it just augments it. Wayland doesn't have that at all, so there is no point there.
It basically removes the many round trips in the protocol that increase latency, by caching values locally. And it can also keep the session alive when disconnected, similar to what tmux or screen do for SSH.
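A toy Python sketch of the caching idea, just to show the shape of it. This is not how NX is actually implemented; the class and the property names are placeholders, and a real proxy also has to deal with invalidation when values change on the other side.

```python
class CachingProxy:
    """Answer repeated queries locally instead of asking the remote side again."""

    def __init__(self, remote_fetch):
        self._remote_fetch = remote_fetch   # callable that costs one round trip
        self._cache = {}
        self.round_trips = 0

    def get(self, key):
        if key not in self._cache:
            # Cache miss: pay for one round trip and remember the answer.
            self.round_trips += 1
            self._cache[key] = self._remote_fetch(key)
        return self._cache[key]

if __name__ == "__main__":
    proxy = CachingProxy(lambda key: f"value-of-{key}")
    for _ in range(100):
        proxy.get("WM_NAME")
        proxy.get("_NET_WM_PID")
    print("round trips:", proxy.round_trips)
```

Every repeated query after the first is answered locally, so the saving grows with how chatty the client is and how slow the link is.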
How much of a difference does it make?
> just implement their own full fledged Wayland handling
As long as they still support X11... (I often do ssh -X ... firefox when I need to see a webpage from a remote machine)
> Back Servo again as the future engine
100% yes, if they still can that is