There are some advantages and disadvantages of the approach.
However, there's just stuff that X assumes that isn't true anymore. A good example is how it handles input. How many cursors does a screen have? One per machine? Okay. One per input device?
I'll have to see the talk later. If a new input model is needed, why can't it be an incremental upgrade? I know that UI frameworks still have the traditional mouse down/up/move events, so they've been able to make it an incremental upgrade in that layer. If it's so incompatible that it can't coexist with the old model in the same connection, we make it an option for the client to request. If we're very very sure it's the future, we make it X11R8 or even X12 (the difference between X10 and X11 was smaller than this), we allow the server to support X11 and X12 clients and deprecate X11, and remove it after a very long time (if ever). All of this is still less invasive than Wayland.
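To make the "option for the client to request" idea concrete, here is a toy sketch of opt-in negotiation. It is not X11 code and not how any real server does it; the model names and the `Server.negotiate` helper are made up for illustration. The only point is that clients which never ask keep getting the old behaviour.

```python
from dataclasses import dataclass

# Hypothetical model names, invented for this sketch.
LEGACY = "core-pointer"   # old model: one cursor, classic down/up/move events
MULTI = "multi-device"    # newer model: events carry a per-device identity

@dataclass
class Server:
    # Input models this (imaginary) server can speak.
    supported: frozenset = frozenset({LEGACY, MULTI})

    def negotiate(self, requested):
        """Return the first requested model the server supports, else the legacy one."""
        for model in requested:
            if model in self.supported:
                return model
        return LEGACY

# An old client never asks for anything and keeps the legacy model.
old_client = Server().negotiate([])
# A new client lists its preferences, best first.
new_client = Server().negotiate([MULTI, LEGACY])
# A new client talking to an old server falls back to the legacy model.
old_server = Server(frozenset({LEGACY})).negotiate([MULTI, LEGACY])
```

The fallback to the legacy model is the design choice that keeps the upgrade incremental: nothing breaks for clients or servers that predate the new model.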
One of the folks behind both Wayland and X gave a fantastic talk on the issues and the hell. https://www.youtube.com/watch?v=HllUoT_WE7Y