
One aspect of the device that hasn't fully registered with people is that when mirroring your desktop/laptop display to the AVP, you can't break out its applications into different areas. You can't pull them away from the desktop window.

This is one of those things that Apple never claimed was supported, and yet there's something about that behavior that feels like such a natural, intuitive implication of the technology that a lot of people feel alarmed or even cheated when they realize it's not possible (yet). It's been funny to watch the various discussion threads as people pop up talking about their shocked realization and disappointed feelings.

Update: I did realize when watching the WSJ video that the "mirrored" display actually appeared to have greater "resolution" (more pixels in height and width) than what she had on her laptop. So that's something.




It seems very reasonable that this will be a future feature. I've long suspected that iPadOS's Stage Manager feature shipping so half-baked was really about getting the platform ready to support multiple apps and easier manipulation (from a developer perspective) of the double-buffered "window" textures, given that Vision Pro is based on iPadOS.

With Stage Manager on macOS now, it feels like they have all the primitives in place to "transpose" macOS Stage Manager window textures to visionOS / the iPadOS foundation.

Though this will be tricky to get right for all apps. It will be interesting to see whether it ends up as a Mac App Store-only feature/API, opt-in, or some other option.


They can already do this with the desktop composition software they use today. All the windows are virtualized onto backing layers that you can draw anywhere and add effects to. It’s how window shadows work, and how certain window effects are done.

They just haven’t done it.
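
Even from outside Apple you can get at a rough version of those per-window backing stores: the public CGWindowList API (now superseded by ScreenCaptureKit, but still the simplest illustration) hands you any window's contents as a standalone image. A minimal sketch, assuming a Mac with screen-recording permission granted:

    // Minimal sketch using the public CGWindowList API (deprecated in favor of
    // ScreenCaptureKit on recent macOS, but still illustrative).
    // Build with: clang demo.c -framework CoreGraphics -framework CoreFoundation
    #include <CoreGraphics/CoreGraphics.h>
    #include <stdio.h>

    int main(void) {
        // Enumerate on-screen windows; each dictionary carries an ID and bounds.
        CFArrayRef info = CGWindowListCopyWindowInfo(
            kCGWindowListOptionOnScreenOnly | kCGWindowListExcludeDesktopElements,
            kCGNullWindowID);
        for (CFIndex i = 0; i < CFArrayGetCount(info); i++) {
            CFDictionaryRef win = CFArrayGetValueAtIndex(info, i);
            CGWindowID wid = 0;
            CFNumberGetValue(CFDictionaryGetValue(win, kCGWindowNumber),
                             kCGWindowIDCFNumberType, &wid);
            // Grab just this window's backing contents, independent of the desktop.
            CGImageRef img = CGWindowListCreateImage(
                CGRectNull, kCGWindowListOptionIncludingWindow, wid,
                kCGWindowImageBoundsIgnoreFraming);
            if (img) {
                printf("window %u: %zu x %zu px\n", wid,
                       CGImageGetWidth(img), CGImageGetHeight(img));
                CGImageRelease(img);
            }
        }
        CFRelease(info);
        return 0;
    }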


Yes?

The first iPhone didn't have copy/paste.

Apple will always prioritize critical scenarios over nice-to-haves. None of these things are technically difficult; it's just time. I'm willing to believe they released too early, but at some point you have to start learning from real users.


I'm aware. I've worked in the space. It isn't as simple as you're making it out to be.


Sorry, but almost 10 years ago I could do this on Xorg, where the worst problem was that compositors cannot redirect input (so you had to kludge together new events from scratch). I cannot imagine it would take more than _half an hour_ for someone with macOS display compositor experience to implement it.
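
For reference, the Xorg mechanism is just a couple of Composite extension calls: redirect windows off-screen, then treat each window's pixmap as a texture you can draw wherever you like. A rough sketch (assuming X11 with the Composite extension; input redirection, error handling, and the actual texture upload are the parts left out, as noted above):

    // Rough sketch of per-window redirection with libXcomposite.
    // Build with: cc compositor_sketch.c -lX11 -lXcomposite
    #include <X11/Xlib.h>
    #include <X11/extensions/Xcomposite.h>
    #include <stdio.h>

    int main(void) {
        Display *dpy = XOpenDisplay(NULL);
        if (!dpy) return 1;
        Window root = DefaultRootWindow(dpy);

        // Redirect all top-level windows into off-screen storage; the server keeps
        // rendering them, but nothing hits the screen until a compositor draws it.
        XCompositeRedirectSubwindows(dpy, root, CompositeRedirectManual);

        Window root_ret, parent_ret, *children = NULL;
        unsigned int nchildren = 0;
        XQueryTree(dpy, root, &root_ret, &parent_ret, &children, &nchildren);

        for (unsigned int i = 0; i < nchildren; i++) {
            XWindowAttributes attr;
            if (!XGetWindowAttributes(dpy, children[i], &attr) ||
                attr.map_state != IsViewable || attr.class == InputOnly)
                continue;
            // Each mapped window's contents are now available as a named pixmap
            // that a compositor can upload as a texture and place anywhere,
            // at any angle, on any surface.
            Pixmap pix = XCompositeNameWindowPixmap(dpy, children[i]);
            printf("window 0x%lx (%dx%d) -> pixmap 0x%lx\n",
                   (unsigned long)children[i], attr.width, attr.height,
                   (unsigned long)pix);
            XFreePixmap(dpy, pix);
        }
        if (children) XFree(children);
        XCloseDisplay(dpy);
        return 0;
    }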


> I cannot imagine it would take more than _half an hour_ for someone with macOS display compositor experience to implement it.

What fools Apple engineering management must be, then!


It probably actually is a quick initial implementation, but like everything else, it needs to be planned, ticketed, developed, tested, and slated for release.

Compositor work isn't tremendously hard when the graphics primitives are already done.

UI state handling and input are the majority of the work from there. I've implemented this work before. A large portion of my background is in compositors and UI primitives.

What's really cool about having window backing layer handles is that you can do all sorts of crazy fun stuff in the office and show off to your coworkers, or build tech demos that will basically never make it to production because they're totally impractical.

The best example of that actually ending up in a final product, in my opinion, was Windows Flip 3D. Totally fun implementation, I'm sure.


And the reason Xorg is now dying in favor of a system that doesn’t have this capability is because the architecture that enabled it, while cool at the time, severely limited the graphics performance and capability of the applications.


What? Doesn't Wayland work this way by design?


Don't be letting actual experience get in the way of shitting on Xorg, now! Wayland will never win with that attitude!


Indeed it does not. It’s local only.


I see. Do you mean like the Xorg fast path for local 3d rendering that became the basis for Wayland?


Exactly that. A local only path that provides high performance.

Remember the original comment was the claim that Xorg could easily do remote windows years ago. This is only true using the low performance path that modern operating systems have universally rejected.


No, the original comment is that by writing an Xorg compositor I could project separate windows onto separate surfaces at arbitrary positions in the 3D world. There's no need for network transparency, and you can do perfectly well with an HDMI cable (in fact, I was using an HDMI cable -- this was 10 years ago).


Network transparency is exactly what people want. If you're going to say just tether it to your computer with a wire then you just aren't understanding the product.


Network transparency is not required for wireless video, either.


I doubt making the windows draw is what’s taking them time to get right.


You nailed it, down to how you'd do it without any help from Cupertino: https://github.com/saagarjha/Ensemble


Except that doesn’t actually do it. It is just a proof of concept.

Try it with a full suite of Mac apps and you'll find it falls apart because they aren't all well-behaved.


It falls apart how exactly? There's some app for which the popup menus are not shown at the right coordinates?

You are making this sound much more complicated than it really is.


Amongst many possible edge cases. Not to mention shearing with things that have rapid animations etc.

The trivial case is simple. You are imagining it's all the trivial case.


But they have already fixed shearing, otherwise there would be no MacBook display mirroring feature at all, multiple surfaces or not.


I’m talking about this open source project, not Apple’s implementation.


It's probably more down to getting the UI right on Apple's end.


Unfortunately I happen to live in Cupertino


Yes, it is.

In fact, there's already a project for it on GitHub. https://github.com/saagarjha/Ensemble

480 lines total, including comments, headers, whole shebang.


> Ensemble is currently "pre-alpha": it's really more of a demo at this point. There's a lot of things I need to work on and until then it is unlikely I will be taking any code contributions.... The code is definitely not designed for general-purpose use yet, so don't expect much of it :)


Tech demos are often easy to put together.

It's all of the edge cases and UX refinements that take time.


Yeah, sure. I'm gonna go ahead and say Apple probably could have found a way to ship this over the visionOS dev cycle.


Some of the code is factored out into individual libraries. For example the networking and serialization code is separate and it is likely that screen recording will get pulled out too at some point.


> They just haven’t done it.

Literally nobody has done it. It's beyond ridiculous that you can't already show or duplicate an application window on any display you want and allow it to be controlled from anywhere it is visible.

Searching for ways to do this leads one into extremely niche software ecosystems. Please, is there any collaboration app out there that makes it seamless to toss windows around like everyone actually wants?


Isn't this how spanning across ordinary monitors works? It might be slightly awkward with AR goggles, since the relative orientation of the displays will be constantly changing as your head moves, and what happens to a window you have half on and half off the MacBook's screen when you look away? Or do you want to have the application jump between devices, like appearing on your fridge when you go for a drink? With the old X11 protocol and a daemon in the middle this was possible, but the use cases were extremely limited and the security issues made it a pain in the ass to actually use. With distros moving away from X11 this is only going to get harder, and you have to ask yourself how much you really want it.

This would mean the goggles would be basically just a dumb display for the Mac. It would also be weird to try to move an AR app onto your Mac.


I feel like multiple people did it back in the original X11 days, and almost certainly when compiz was the new hotness


Nearly 20 years ago there was an OS X screensaver that would capture all the window buffers and then float them around the screen as they rotated various ways. Another app would save a snapshot of all windows as Photoshop layers with appropriate transparency (before blur was added). X11 was always network transparent, though I don’t think windows could easily be moved from one client to another.

Even classic Mac OS, which wasn't designed to be "rootless", could be run that way and have its windows mixed with OS X windows back in the Classic days.

As others have pointed out, there is already a PoC app that is doing it, so it seems if Apple wants to, it’s completely in their power. However, does this match their vision? (pun intended) Time will tell.


So Apple could have done this, but did not. Why? (Speculation and leaks welcome)

- [Profit on basic innovation] Did they want to wait and see how their customers would adopt VisionOS's native free-floating windows, so as to avoid cognitive overload by commingling with MacOS windows?

- [Benevolence to fellow competitors] Did they not want to take over the existing market of virtualized VR desktops?


I'm reasonably certain it's a combination of bandwidth and tech issues.

The Vision Pro is effectively using AirPlay to mirror the whole screen. If you used AirPlay to mirror each window as a whole screen, you'd run out of bandwidth pretty quickly.

The windowing system in macOS, Quartz Compositor, also isn't built to stream window information. Right now it has a big built-in assumption that any windows it's displaying are on a screen it also controls. It was probably too big a lift across teams to also rewrite the graphics stack for macOS for the launch of the Vision Pro. Hopefully they get it working in the future, but neither of these problems is easy to solve.


Makes a lot of sense; the future will solve it someday.


Most likely because it wasn't polished, so they got it out now in a polished but limited state, and now they have a fancy update to tout when the rest is done.


They shipped iPad Stage Manager half-baked, to get iPad developers ready for double-buffered windows, so they could eventually ship the visionOS macOS integration half-baked? Doesn't sound right at all to my ears, even though I'm stoked for my order!

EDIT: -5* doesn't make sense, this is the most polite way you can point out that getting macOS apps windowed on visionOS has ~0 to do with double-buffered windows on iPad OS. n.b. I didn't use half-baked, OP did.


I don't think it's half-baked. I think it's lightly toasted. :-)

I use iPad Pro as a kind of sidecar daily driver, in the magnetic dock magic keyboard w/ trackpad.

As I type this, the screen shows a traditional MacOS style dock across bottom, four Stage Manager window clusters I can tap with a thumb on the left, and Safari plus Messages taking 2/3 and 1/2 of screen respectively.

There's more app and pixel real estate than on most Windows laptops, and bringing screen sets to the foreground or swapping them back to the side is so natural I almost feel like giving up that space on my Mac as well.

The big thing I saw happen with apps over the past two versions of iOS is app devs realizing their windows will not always be full-screen or half-screen size, but arbitrary sizes.

By now, most iPad apps of any serious nature are effectively window-size independent, making them play well with others in Stage Manager. It's easy to see how that would make them play well with the headset one day.


No, I'm saying they shipped iPad Stage Manager half-baked for their own uses / to refine for AVP. I'm positing that a major reason for macOS Stage Manager's existence is as a transport layer / "texture formatter".


On another note, I use Stage Manager every day on Mac & iPad and it's pretty neat. I actually forgot I was using it until you mentioned it.


Same here, I actually really like stage manager on my Mac.


On Windows of all places (95ish to MEish) there was a remote tool called radmin and it had something that I wish companies had embraced: it hooked in to (maybe even before?) the window-rendering functions and sent the changes over the network. It’s hard to explain exactly what I mean because everyone is so used to streaming pictures of the screen over the network (if they even use remote access at all), but you could have less than 20ms latency while controlling over the internet while using tiny amounts of data (50kbps? 100? not sure but somewhere around there).

OS X had the opportunity to follow that path before settling on the "render windows, capture the screen, compress the image, send it over the network to be decompressed" VNC-style remote access that's bog-standard today, and if they had, the Vision Pro would be set up to be an absolutely mind-blowing macOS experience.
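
To make the size difference concrete, here's a toy comparison; the command struct below is invented purely for illustration and isn't radmin's or GDI's actual wire format:

    /* Toy illustration of draw-command forwarding vs. pixel streaming.
       The command struct is hypothetical, not any real remoting protocol. */
    #include <stdint.h>
    #include <stdio.h>

    typedef struct {
        uint8_t  opcode;   /* e.g. 1 = fill_rect, 2 = draw_text, ... */
        int16_t  x, y, w, h;
        uint32_t argb;     /* fill color */
    } FillRectCmd;

    int main(void) {
        FillRectCmd cmd = {1, 100, 100, 640, 480, 0xFF3366FFu};
        size_t as_command = sizeof cmd;                 /* ~16 bytes */
        size_t as_pixels  = (size_t)cmd.w * cmd.h * 4;  /* same rect as raw 32-bit pixels */
        printf("fill_rect as a command:  %zu bytes\n", as_command);
        printf("same rect as raw pixels: %zu bytes (~%.1f MB)\n",
               as_pixels, as_pixels / 1e6);
        return 0;
    }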


This is how Windows Remote Desktop used to work - it would forward GDI instructions to be rendered remotely.

It fell apart as UIs got richer, browsers in particular: they're entirely composited in-app and not via GDI, because GDI isn't an expressive enough interface. So you end up shipping a lot of bitmaps, and to optimize you need to compress them. At that point you might as well compress the whole screen.

https://www.anandtech.com/show/3972/nvidia-gtc-2010-wrapup/3


I wonder how this works exactly. RDP lets me connect to a single-monitor Win 11 host and display it on my client's three monitors. Everything is super smooth, including browsers (I am connecting via Ethernet). Is the host managing the three screens, or does the RDC do it on the client's side?


Note the "used to work" modern RDP will negotiate some form of image (or video) based compression for transferring data[1]. You can even share an X11 desktop over RDP using freerdp-shadow-cli.

1: e.g. https://learn.microsoft.com/en-us/openspecs/windows_protocol...


> On Windows of all places (95ish to MEish) there was a remote tool called radmin and it had something that I wish companies had embraced: it hooked in to (maybe even before?) the window-rendering functions and sent the changes over the network. It’s hard to explain exactly what I mean because everyone is so used to streaming pictures of the screen over the network (if they even use remote access at all), but you could have less than 20ms latency while controlling over the internet while using tiny amounts of data (50kbps? 100? not sure but somewhere around there).

This has been done many times before (see e.g. X Windows) and has known downsides. Off the top of my head:

- You need the same fonts installed on both sides for native font rendering to work

- Applications that don't use native drawing functions will tend to be very chatty, making the total amount of data larger than VNC/rdesktop/&c. style "send compressed pictures"

- Detaching and re-attaching to an application is hard to get right, so it's either disallowed or buggy.


Isn't that how Xorg remoting used to work as well? The display server and client are separate, so whether the pipe was local or remote didn't matter. In principle, Wayland could do it too, I think, if there were a way to synchronize texture handles (the Wayland protocol is also message-based, but it IPCs GPU handles around instead of copying bitmaps).

I guess one downside is that your pipe has to be lossless, and there's no way to recover from a broken pipe (unless you keep a shadow copy of the window state, have a protocol for resynchronizing from that, and have a way to ensure you don't get out of sync).


Yeah, you can do this with x forwarding on Linux. Not sure if there’s a modern Wayland equivalent.


Wayland clients don't draw things the way old-school X clients do (neither do modern X clients), so it doesn't make sense at the Wayland level. KDE or GTK could potentially implement something like this though.


This only works with two major assumptions, neither of which is true for the VisionPro:

1. The receiving side has to have at least as much rendering power as the original side, since it will be the one actually having to render things on screen. This is always going to be the opposite case with any kind of glasses, where you'll always want to put as little compute as possible for weight and warmth reasons.

2. Each application actually has to send draw instructions instead of displaying photos or directly taking control of the graphics hardware itself. Few if any modern applications work like this for any significant part of their UI.


IIRC many windows apps at that time were using MFC or otherwise composing a UI out of rects, lines, buttons, etc. Then came Winamp and the fad to draw crazy bitmaps as part of the UI. If everyone does that, shipping draw commands is less useful and shipping pixels makes a lot more sense.


This can only work until it doesn't, and it won't work in many situations because, e.g., 1. apps aren't going to bother being compatible with it, and 2. compositing has surprising performance and memory costs, and in this case the destination is more constrained than the source.


Windows has done this very well since at least a few years back. When connected via Remote Desktop, any native application gets the behavior you describe, so the UI is updated with almost no latency.

Applications which bypass the native APIs to render their window contents, in particular video players or games, get a compressed streamed video which has very decent performance. The video quality seems to be dynamic as well, so if there's a scene with very few changes you can see the quality progressively improve.

All of this is done per window, so a small VLC window playing a video in a corner gets the video treatment, while everything else still works like native UI.


X windows system basically does that iirc, and I remember the magic you speak of.


And yet in practice most X traffic these days is bitmaps.


Can you give some more details on this? The Google front page has lots of results, but it's not clear they're the same thing you mentioned.

How did you know it hooks the window-rendering functions? Was it some sort of binary serialization?


Yeah, I don't think "mirroring" is quite the right term. It's effectively a 4K monitor for the laptop, with the laptop screen going black. Most (all?) Mac laptops don't have a 4K screen, so you have more screen real estate than "mirroring" would suggest.

But this is sufficient for many use cases (or at least, mine). I pre-ordered one with the idea that my main work will be on the 4K monitor, with most of my superfluous apps floating around as native visionOS apps. That's mail, a web browser, and Zoom, which all have apps now, and Slack, which I could just use Safari for but which may have a native app in the future.


The screen real-estate is the same as for a 1440p screen. From The Verge’s review:

“There is a lot of very complicated display scaling going on behind the scenes here, but the easiest way to think about it is that you’re basically getting a 27-inch Retina display, like you’d find on an iMac or Studio Display. Your Mac thinks it’s connected to a 5K display with a resolution of 5120 x 2880, and it runs macOS at a 2:1 logical resolution of 2560 x 1440, just like a 5K display. (You can pick other resolutions, but the device warns you that they’ll be lower quality.) That virtual display is then streamed as a 4K 3560 x 2880 video to the Vision Pro, where you can just make it as big as you want. The upshot of all of this is that 4K content runs at a native 4K resolution — it has all the pixels to do it, just like an iMac — but you have a grand total of 2560 x 1440 to place windows in, regardless of how big you make the Mac display in space, and you’re not seeing a pixel-perfect 5K image.”


> 4K monitor

It's more like a 1080p monitor. The virtual monitor only covers a small part of the Vision Pro's display. You can compensate a bit for the lack of resolution by making the virtual screen bigger or by leaning in, but none of that gives you a 4K display.

To really take proper advantage of the VR environment you need the ability to pull out apps into their own windows, as then you can move lesser-used apps into your peripheral vision and leave only the important stuff right in front of you. You also miss out on the verticality that VR offers when you are stuck with a virtual 16:9 screen.


It is a 1440p display.

Which is the resolution that the majority of PC users are likely using.


4k is important because of perspective, rotation, and aliasing. Just sending 1080p would look terrible.



Slack's approach of being a glorified webview everywhere is really paying off (a bit of sarcasm here, but I see it as net positive)

We're pretty close to "Write once, (rewrite a bit,) Run everywhere"


lol yeah, the UX is great across all the devices I've tried the app on.

Even the recent big UI update has completely overwritten how the old app used to look in my memory.

The only feature I have tried and not really cared for is the "Canvas" feature.


Slack is largely UIKit- and Swift-based on iOS. I suspect they are bringing over their iOS app rather than a visionOS-focused one, though.


Is it mirrored as some HEVC video stream from the laptop, or are the UI elements actually rendered on the headset itself?


It's streamed from the laptop. Technology-wise it's very similar to a screen sharing session but initiated from Apple Vision Pro instead of a Mac.


I agree that this should be considered long term, however... you are able to snap visionOS / iPadOS apps anywhere around your MacBook view AND you are able to control those very apps with your MacBook trackpad.

So even though you have a sequestered Mac output alongside Vision apps, you can use the same controls for all of them simultaneously. This should help in the interim.


Yeah when I found this out, it resolved my concerns. Most of my apps will have a native Vision release (email, web browser, slack, etc.) and my actual monitor screen will only need more professional software (e.g. Photoshop, Illustrator, InDesign).


Discontent over this implementation detail shows users are fully sold on the basic idea. Like, if the main complaint about the first Fords was the colour range.


I think it's a more important feature than just a cosmetic color. Imagine if you bought a truck to haul cargo, but were then told it can only transport one type of cargo at a time. That would suck.


I meant it's a superficial rather than fundamental design flaw, easily rectified.


It looks like someone is working on a Mac app that does exactly this, and they seem to have a functional prototype: https://x.com/TheOriginaliTE/status/1751251567641346340?s=20


Unclear if Apple will allow this in the store

edit: Yes I know you can build apps before they're in the store


It is permitted on the App Store. The developer had a thread on the fediverse several days ago.


The visionOS side of this, which is the one that requires App Review, is a glorified VNC app. They will probably review it thoroughly but it doesn't seem like it breaks any rules.


As long as it's open source, you can sideload onto any iOS device by building it yourself.


Iff you pay an annual $100 dev fee.

Yes I know you can technically do this without a paid dev account, but it's practically useless because it has to be re-done every 7 days.


Surely considering the fact that you are streaming your Mac to Apple Vision Pro you should be able to easily resign it on that very device? :)


I don't know about you, but there's nothing easy about having to boot up an IDE every 7 days to sign and re-install an app you depend on. It's just one more thing to worry about and mentally keep on top of, the opposite to Apple's convenience and smooth UX ethos.


Further, I wish they added support to make multiple virtual monitors from macOS Workspaces, like what happens today when you attach another monitor. Switching workspaces can be bound to keys in the Keyboard Settings. Moving windows to other workspaces is easy to do with third-party apps like Amethyst.

It feels like the Vision Pro would definitely be a great replacement for people who (want to) buy multiple expensive monitors, but it doesn't fully reach that potential today, and mostly because of software? Although rendering 3 or 4 virtual workspaces through ad-hoc Wifi at 4K 60fps+ low-latency would certainly be a huge challenge.


I do this sometimes on my meta quest. Go into desktop VR and pull up a couple desktop views so I can see things happen in real time on different “screens”.


You can put VisionOS apps next to the Mac desktop, so it isn't as much of a problem as it seems.


I wouldn’t be surprised if this came in a visionOS update. On a shorter timescale it could also come in the form of third party apps, because there’s no technical limitations preventing a server app from cutting out windows on a desktop OS and sending them over a wire to a visionOS client.


> On a shorter timescale it could also come in the form of third party apps

If Apple approves it, of course. This is one of my major concerns; there's a lot of potentially useful functionality that could be implemented, but you have to jump through the app store hoops and hope that Apple doesn't decide that it conflicts with their idea of what you should be allowed to do.


Functionally speaking these apps would be scarcely distinguishable from the plethora of screen streaming apps that exist on the App Store already, like Screens and Moonlight. Of course Apple could reject these apps anyway but it seems unlikely.


There are third party apps that do this already on the Quest. I believe they can replicate Mac screens, they definitely can replicate Windows PC screens into the VR space. If Apple doesn't provide a 1st party solution I suspect someone else will soon.


Appears possible in theory: https://github.com/saagarjha/Ensemble


The inability to break out Mac windows curbed a lot of my enthusiasm for the AVP. I hope Apple will eventually add it, but I'm not going to spend $3500 on that hope.


It will be less of an issue for me if we start seeing native builds of popular IDEs like Xcode, IntelliJ/GoLand, etc. for Vision Pro (and other apps for other people, say Photoshop). I think of the "projected screen" feature more as a compatibility layer, like Rosetta 2. You use it until you get a native build, and then it stops being a thing you bother with.


That will never happen. Nobody will make real software for iOS because nobody wants to pay a 30% fee for the privilege of randomly having their whole business shut down when an app reviewer is in a bad mood.


What world do you live in? The iOS App Store is probably the most full-fledged and populated app ecosystem there is.


And yet none of the software GP mentioned is available on it.


It’s been 1 day.


No, iPads have been around for more than a decade.


Are you referring to IDEs? The iPad is not a productivity machine. It's for consumption. visionOS is for productivity, which is very different from anything on iOS, so I imagine it's going to be available on it.


My point exactly. I'm going to be hard-pressed to drop $3.5k on an iPad strapped to my face. If it can't replace my MacBook, why would I bother?


It has the exact same app review guidelines as iOS and iPadOS. If you can't put it on those, you can't put it on VisionOS.


I don't see why you couldn't port an IDE to iPadOS. The filesystem is a bit tricky with the way iCloud files work, but otherwise it's just a thing.

I think mostly people don't (for high-end IDEs) because they expect a keyboard and a pointing device that isn't a touchscreen, and those are fairly rare relative to iPads sold.

The iPad Pro certainly seems to be seeing some beefier productivity apps.


That's not how Apple advertises iPad Pro.


IntelliJ, GoLand etc are Java apps.

Windows, BeOS and Commodore 64 apps also don't run natively on the iPad or Vision Pro.


Without a tethered connection the bandwidth simply isn't there.


As long as you keep your expectations modest you can probably do an acceptable job for casual usecases.


The fact that pretty much everyone who owns both a Vision Pro and a Mac would want that feature means it's probably going to happen.


The largest portable MacBook Pro, at 16.2 inches, has a 3456-by-2234 native resolution at 254 pixels per inch, which by default is halved for scaling. So I don't know what she means exactly about 4K, but there are enough pixels to do a portable 4K display.


Streaming an arbitrary collection of windows instead of a single finished, composited framebuffer increases the bandwidth requirements by at least an order of magnitude. That's never going to work well over WiFi.


As long as the total number of pixels is less, I don't see why that has to be true, at least bandwidth-wise. Compute-wise, the Vision might have to do slightly more to separate the buffers and composite them into the AR view in different places, but the bandwidth should be directly proportional to the number/size of each window. If I can fit all the windows on a 4K screen, then I don't see why the software can't split that and lay it out separately in my view instead of in a single rectangle.


Some of the windows will be obscured by others. If you stream 100 windows to visionOS, it's possible to lay them out so that none of them cover each other, and then you have to render them all. On a flat screen there is a limit to how many pixels you need to paint.


It seems pretty obvious that nobody wants to be limited to the total number of pixels being merely equivalent to tiling across a 4k screen.


Just compress the stream. Total pixels increase the VRAM use on the device, but popping out a static window shouldn't take more than a trivial amount of streaming bandwidth.


Everything is already compressed. Uncompressed 1080p is 3Gbit/s, which is already well beyond what's actually achievable with WiFi. Allowing an unbounded number of window surfaces to be streamed invites people to use it in a way that current technology simply cannot provide the kind of slick experience that Apple needs from this product.
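
A quick back-of-the-envelope check of that 3 Gbit/s figure, assuming 8-bit RGB at 60 fps:

    /* Back-of-the-envelope check: uncompressed 1080p60 at 24 bits per pixel. */
    #include <stdio.h>

    int main(void) {
        double bits_per_second = 1920.0 * 1080 * 24 * 60;
        printf("uncompressed 1080p60 ~ %.2f Gbit/s\n", bits_per_second / 1e9); /* ~2.99 */
        return 0;
    }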


Seeing as you have the eye tracking, you could probably greatly reduce the FPS/bitrate of windows that are out of gaze. This kind of foveated rendering already seems to be used in their ecosystem, so I assume it's sufficiently slick for Apple. Apps will sleep when not being gazed at, and so on.

This allows you to restrict full bitrate to a single window with a maximum resolution and keep the user experience high and astonishment low.
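
Something like this toy allocation is what I have in mind; every name and number here (the window list, the weighting curve, the 80 Mbit/s budget) is invented for illustration and isn't anything visionOS actually exposes:

    /* Hypothetical gaze-weighted bitrate allocation across streamed windows.
       All names and numbers are illustrative; no real visionOS API is used.
       Build with: cc sketch.c -lm */
    #include <math.h>
    #include <stdio.h>

    typedef struct {
        const char *name;
        double gaze_angle_deg;   /* angular distance from the current gaze point */
        double bitrate_mbps;     /* output: allocated share of the budget */
    } StreamedWindow;

    int main(void) {
        StreamedWindow wins[] = {
            {"editor",    2.0, 0},
            {"terminal", 20.0, 0},
            {"chat",     45.0, 0},
        };
        const int n = sizeof wins / sizeof wins[0];
        const double budget_mbps = 80.0;  /* assumed total budget for window streams */

        /* Weight each window by proximity to the gaze point, with a small floor
           so peripheral windows still refresh occasionally. */
        double weights[3], total = 0;
        for (int i = 0; i < n; i++) {
            weights[i] = 0.05 + exp(-wins[i].gaze_angle_deg / 10.0);
            total += weights[i];
        }
        for (int i = 0; i < n; i++) {
            wins[i].bitrate_mbps = budget_mbps * weights[i] / total;
            printf("%-8s %5.1f Mbit/s\n", wins[i].name, wins[i].bitrate_mbps);
        }
        return 0;
    }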


Keep in mind that you are popping out a static window 60 times a second or so.


Use the touted eye tracking feature and compress that particular application's stream inversely to how much the user is focusing on it within the Vision Pro.


Does foveated rendering work with the extra latency of a round trip over WiFi? I can certainly see this working fine on a head tracking scale, but I'm less sure about eye tracking. I also wonder if existing video compression hardware can actually make use of such outside feedback to adapt bitrate and quality with the necessary flexibility.


I guess there is something to the Mac being able to handle the output for one screen at a time. If it has to render 4K output for 10 different screens simultaneously, performance is going to suffer.


That’s really weird, because even the hololens has this feature. Multiple windows, multiple desktops is how we want to work.


The iPhone debuted without copy/paste. They'll get to it, but maybe not immediately.



