Show HN: A tool for motion-capturing 3D characters using a VR headset (diegomacario.github.io)
103 points by diegomacario on Jan 17, 2023 | 31 comments
Hi everyone! I'm one of the authors of this project. The demo you see here is powered by a tool that I recently helped develop and open-source at Shopify called handy. You can find the repo here: https://github.com/Shopify/handy

Most people don't realize that VR headsets have become really capable motion capture platforms, so we decided to release this tool to bring motion capture into the hands of everyone who owns a headset.

With a cheap Quest 2 you can capture your head, and your hands via the headset's hand-tracking feature. With an expensive Quest Pro you could also capture your facial expressions using the headset's eye and face-tracking features.

Thanks for checking this project out! I'm here to answer questions if you have any.




This is great, well done, and thanks for open sourcing it.

I started to build something like this early last year, but got too busy with another pet project and having a baby. Here are some demos and notes if you're interested: https://twitter.com/convey_it/status/1433163282597171200


Holy smoke that's so cool, thanks for sharing it! I love that your work helped increase the number of morphTargets in Three.js. That's real impact!


Thanks! Yeah, that felt good.


> With an expensive Quest Pro you could capture your facial expressions using the headset's eye and face-tracking features.

Any more on this?

As for hands: is it able to track only hands and fingers, or also an object held in a hand? I can see this being useful for 'puppeteering' a 3D object, with the hand acting as a proxy for it... or with a simple object held in the hand.

Outside of maybe a cliff-climbing game, I've struggled to find a reason to buy a headset at all. This seems like a plausible use for animation.


Well, consider this sample from Meta's Movement SDK, which lets you control a character's facial expressions with the Quest Pro's face tracking: https://youtu.be/IQt4wTdGK64?t=315

Our tool could record those facial expressions into an Alembic file.

And as for holding objects, our tool could capture those too. Meta's Interaction SDK is excellent. Here's another sample they provide: https://youtu.be/WS4vtcwm8Zw?t=1

And I have to say that the cliff-climbing game is really fun.


This is extremely cool!

I haven't much of a clue about motion capture. But the capture I want to do requires feet, hands, face, shoulders, arms, knees, fingers, AND the ability to do it in any position. Curl up in a ball. Roll over on the floor. I want to capture more than just a head and hands upright, but AFAIK there's no solution under $20k.

See user name. If I want to make pr0n, I need to be able to capture those positions. Is there anything that can do it on the cheap?


Curling up in a ball and rolling around on the floor seems like a worst case for a lot of motion tracking setups. Systems based on wearable sensors won't work because you can't comfortably roll around in them, and single-viewpoint systems like the Kinect get confused when they can't see your whole body.

Have you tried getting a bunch of Kinects and doing some kind of sensor fusion? I don't know what the current state of the software is but you can buy the hardware for <$100 each used.


https://www.theverge.com/2022/11/30/23485732/sony-mocopi-mot...

Not open source but pretty cheap, and it does 80% of what you want, especially if you combine it with this.


That’s cool! What would the typical use case for this be (and why did Shopify build it)?


We actually built this to easily add hands to this concept video: https://twitter.com/StrangeNative/status/1613218237969494017...

Animating hands by hand (no pun intended) is horrendously difficult, so our 3D artist asked if we could help.

We built this within Shopify's Spatial Commerce team, which explores the intersection of AR/VR and commerce. Here are some of our other projects:

https://twitter.com/StrangeNative/status/1544666221181734912... https://twitter.com/StrangeNative/status/1562450155080597505...


I am constantly impressed by what your team shares and by the practical application of XR concepts. I find myself pointing people to your demos frequently, and I hope to see one or two implemented through to a purchase experience I can try at home. Thanks so much for all the output!!


Plugging one we did a long time ago, for reference! Capture + share dances: https://blog.mozvr.com/a-saturday-night/


I love this so much, thanks for sharing it! We have been thinking about similar concepts at Shopify. Mocapped performances convey so much more emotion than images or videos.


Thanks for this! I have been working on something similar for an upcoming education app: record a course and play it back in the same app with a compact file format (1 MB per minute, could be much less with some tricks). You can see a demo here: https://youtu.be/zcHAzQXm3Hg and more at https://explayn.me. I will definitely check out your lib, and will be happy to switch if it's better!


Wow that's super cool! That's definitely the biggest challenge - how to keep the size of the mocap data small.

Our library is not the best at that, since it simply records everything to Alembic files, which are quite heavy.

Best of luck with Explayn. I really like the idea of learning from mocapped lessons.


Thanks for the kind words! Yes, file size for live action or for shipping directly in an app is always an issue. I may actually contribute some code. My file format is no rocket science: just a bare set of floats with some metadata, not even any delta encoding between frames, so it's quite easy to parse across languages and platforms.
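
Very roughly, the shape of it looks like this (a Python sketch with made-up names and layout, not my exact format):

    import struct

    # Illustrative layout only: a small header, then raw little-endian
    # float32 values, frame after frame.
    #   header: floats per frame, frames per second, frame count

    def write_take(path, fps, frames):
        """frames: list of frames, each a flat list of floats (e.g. 3 per bone)."""
        floats_per_frame = len(frames[0])
        with open(path, "wb") as f:
            f.write(struct.pack("<3I", floats_per_frame, fps, len(frames)))
            for frame in frames:
                f.write(struct.pack(f"<{floats_per_frame}f", *frame))

    def read_take(path):
        with open(path, "rb") as f:
            floats_per_frame, fps, frame_count = struct.unpack("<3I", f.read(12))
            frames = [struct.unpack(f"<{floats_per_frame}f", f.read(4 * floats_per_frame))
                      for _ in range(frame_count)]
        return fps, frames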


For a body with bones, since bones can't stretch, all you need is a rotation at each joint. You can get away with 10 or 11 bits per axis.

So for a full body you should be able to compress to under 200 bytes per frame for 50 joints. That's roughly 340k for 1 minute of animation at 30fps. Interpolate to get 60fps. That doesn't include faces.

If you do faces like Apple does, which IIUC is just N morph targets where N is something like 15, those are 1 weight each, and you could easily make them 1 byte per weight or less. That's 27k for 1 minute of animation.

Both of those could probably be compressed further by storing deltas (like Draco does) or by fitting curves, for a lot more compression.
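
To make the idea concrete, here is a rough sketch of that kind of quantization (Python; the joint count, bit depth and packing are just examples, not any particular engine's format):

    import math

    BITS = 10                    # bits per rotation axis
    MAX_CODE = (1 << BITS) - 1   # 1023

    def quantize_angle(radians):
        """Map an angle in [-pi, pi] to a 10-bit integer code."""
        normalized = (radians + math.pi) / (2 * math.pi)   # -> [0, 1]
        return min(MAX_CODE, max(0, round(normalized * MAX_CODE)))

    def encode_frame(joint_rotations):
        """Pack per-joint Euler rotations (x, y, z in radians) into bytes.

        50 joints * 3 axes * 10 bits = 1500 bits, i.e. under 200 bytes per frame.
        """
        packed, bit_count = 0, 0
        for rx, ry, rz in joint_rotations:
            for angle in (rx, ry, rz):
                packed = (packed << BITS) | quantize_angle(angle)
                bit_count += BITS
        pad = (-bit_count) % 8   # pad to a whole number of bytes
        return (packed << pad).to_bytes((bit_count + pad) // 8, "big")

    # 1 minute at 30fps: 1800 frames * ~188 bytes ~= 340k for the body,
    # plus ~15 one-byte morph-target weights per frame (~27k) for the face.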


Thanks! Yes, that's the way to go. There is just a trade-off between code simplicity/specialization and compression performance. Another point is the ability to use the format in memory without too much decoding overhead when opening the file, plus random access for fast-forwarding, etc. (which matters when doing delta encoding). (And I actually need stretch too, because my NPC interacts with objects, so the arm/hand bones should be in exactly the right place at replay; that's 32 bones just for the fingers.)


Take a look at the MCAP file format (https://mcap.dev). We invented it for the robotics industry, but it's a generic, write-optimized container format for time-series data. Since it's a container format, you also need to choose a serialization format, such as FlatBuffers or Protobuf.


Thanks for the pointer, I will definitely have a look at this format, I was not aware of this.


This could be great for people doing work with various sign languages.


That's great!

I assume the hand capture (+ head position) doesn't give you enough to apply inverse kinematics reliably (i.e. better than VRchat), is that correct? It's a naive assumption on my part given that wrists aren't captured, but perhaps there are really smart ways to solve this. It feels like it could be a "next step" to have upper-body motion capture.


How well does it do with occlusion? For example, holding or manipulating an object, and capturing those interactions.


It does amazingly well. Last May Meta released their Hand Tracking 2.0 update, and the quality of the hand tracking improved enormously. You can see a comparison of 1.0 and 2.0 here: https://www.youtube.com/watch?v=K-l-RXxYnzs&t=4s&ab_channel=...


What about arms? Or would it have to rely on some realtime IK solution to sort them out?


For the arms you could use Meta's new Movement SDK: https://developer.oculus.com/documentation/unity/move-overvi...

Although in my opinion the arms still look a bit janky. You can see a demo of a sample they included with the SDK here: https://youtu.be/IQt4wTdGK64?t=419


Was wondering the same thing - if it does well there, it could be a really cheap way to, for example, motion-capture realistic weapon holding in an FPS, provided it can be extended to capture the arms as well using CV.


It would need some cleanup, but definitely usable.


This is a really great project, thanks. I will definitely have a play with it; it might enable prototyping for some projects of mine that are too small to justify a proper mocap setup.


Oh, while you're here: I recently started exploring the possibility of starting a small shop. When I looked into Shopify I had absolutely no idea how to get started or how much it cost, and your website failed to even come close to answering those things in the few minutes I had to investigate. SquareSpace, by contrast, provided those answers in seconds.


Just went back, have no idea how I could have gotten so confused. Had ended up on a list of add-ons and such.




