The Avatar Magna Carta : or, How to Puppeteer a 3D Humanoid with 6DoF Head and Hand Tracking

In this post, we present the workflow required to let a player live-puppeteer a rigged, first-person 3D avatar in-game, by:

  1. driving the avatar’s in-game hands in a 1:1 relationship with the player’s actual physical hands, and
  2. animating the avatar’s in-game head orientation to match the precise orientation of the player’s physical, real-world head,
  3. via 3D trackers on the player’s head and hands, and the application of a simple inverse kinematics (IK) model.

We spent a long time figuring out this path, so we thought we’d share it with the community. Note: this is an entirely technical workflow / pipeline post for readers who are currently developing VR applications; it’s not for the general consumer. This tutorial is specifically crafted for a Unity3D pipeline, and it is specific to the Oculus Rift DK2 HMD and the Razer Hydra hand trackers, powered by Sixense. It should work with other tracking solutions, with modification.

Go ahead, wave hello to the adoring fans… heads and hands fully tracked and puppeteered

First, the basic premise:

People want to relate to their own physical avatars in VR. They want to be able to look down at their feet, and see their body. They want to be able to wave their hands in front of their face, and see some representation of their appendages in front of them, superimposed on the virtual scene. In short, they want to feel like they are present in the experience, not just an ethereal viewer.

This problem proved a bit more difficult to solve in practice than one might imagine. So in the interest of fostering community, we are sharing our technical solution with everyone. It isn’t perfect yet; we’ve posted a number of tips and follow-on research topics at the end of the tutorial, places where this needs to go before it’s fully “ready for prime time.” However, the solution presented here IS functional, is leaps beyond the standard “avatar-less” VR being produced today, and should serve as a baseline from which improvements can and will be made.

Now, the actual tutorial:

  1. build and skin your Avatar Model
    1. we create humanoids in Mixamo Fuse
    2. …or use the modeling software of your choice : Maya, Blender, etc.
    3. make sure that your final model is in T-pose
  2. rig the character with bones
    1. this can be done by uploading a T-pose character to Mixamo’s online auto-rigger
    2. …or manually in your software
    3. use the maximum resolution possible : we use a 65-bone skeleton, which includes articulated fingers
  3. give the character at least one animation
    1. we will use an “idle” state
    2. you assign this online in Mixamo
  4. get the Avatar into Unity
    1. export from Mixamo to Unity FBX format
    2. import the resulting FBX (approx. 5-20MB) into Unity Assets : gChar folder
    3. this will generate a prefab in Unity along with 4 components in a subfolder:
      1. a mesh
      2. a bone structure
      3. an animation loop file
      4. an avatar object
    4. the prefab will have a name such as “YourModelName@yourAnimName”
  5. Configure the Avatar
    1. click on the prefab in the Assets folder
    2. In the inspector, select the “Rig” tab
      1. make sure that “Humanoid” is selected in the “Animation Type” pull-down
      2. if you selected that manually, hit “Apply”
    3. drag the prefab from the Assets into the Hierarchy
    4. select the prefab avatar in the Hierarchy
      1. In the Inspector:
      2. add an “Animator” component. we will fill in the details later
      3. add the g-IKcontrol.cs C# script. again, we will fill in details later
      4. you can copy the source of the script from here
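The linked script isn’t reproduced in this post, but for orientation, here is a minimal sketch of what an IK-pass script in the spirit of g-IKcontrol.cs typically contains. Treat this as our illustrative reconstruction, not the original source; the public field names are chosen so that Unity’s Inspector displays them as “Left Hand Obj”, “Right Hand Obj”, and “Look Obj”, matching the wiring in step 17.

```csharp
using UnityEngine;

// Illustrative sketch of an IK-pass script in the spirit of g-IKcontrol.cs.
// Field names are chosen so the Inspector shows "Left Hand Obj",
// "Right Hand Obj", and "Look Obj" (wired up in step 17).
[RequireComponent(typeof(Animator))]
public class gIKControl : MonoBehaviour
{
    public bool ikActive = true;
    public Transform leftHandObj;   // invisible Sixense left hand (IK target)
    public Transform rightHandObj;  // invisible Sixense right hand (IK target)
    public Transform lookObj;       // the "dristi-target" cube from step 10

    private Animator animator;

    void Start()
    {
        animator = GetComponent<Animator>();
    }

    // Mecanim calls this each frame, but ONLY if "IK Pass" is checked
    // on the Animator Controller's Base Layer (step 15).
    void OnAnimatorIK(int layerIndex)
    {
        if (!ikActive)
        {
            // Release the IK goals so the "idle" animation plays untouched.
            animator.SetIKPositionWeight(AvatarIKGoal.LeftHand, 0f);
            animator.SetIKPositionWeight(AvatarIKGoal.RightHand, 0f);
            animator.SetLookAtWeight(0f);
            return;
        }

        if (lookObj != null)
        {
            animator.SetLookAtWeight(1f);
            animator.SetLookAtPosition(lookObj.position);
        }

        if (leftHandObj != null)
        {
            animator.SetIKPositionWeight(AvatarIKGoal.LeftHand, 1f);
            animator.SetIKRotationWeight(AvatarIKGoal.LeftHand, 1f);
            animator.SetIKPosition(AvatarIKGoal.LeftHand, leftHandObj.position);
            animator.SetIKRotation(AvatarIKGoal.LeftHand, leftHandObj.rotation);
        }

        if (rightHandObj != null)
        {
            animator.SetIKPositionWeight(AvatarIKGoal.RightHand, 1f);
            animator.SetIKRotationWeight(AvatarIKGoal.RightHand, 1f);
            animator.SetIKPosition(AvatarIKGoal.RightHand, rightHandObj.position);
            animator.SetIKRotation(AvatarIKGoal.RightHand, rightHandObj.rotation);
        }
    }
}
```

Note that OnAnimatorIK never fires unless “IK Pass” is enabled on the controller’s Base Layer — a very common gotcha.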
  6. Add the latest Oculus SDK (OVR) to the project. 
    1. Download the latest Oculus SDK for Unity
    2. this is usually done by double-clicking “OculusUnityIntegration.unitypackage” in the OS, then accepting the import into your project by clicking “import all”
    3. You should now have a folder within Assets called “OVR”
  7. Add the latest Sixense Hydra / STEM SDK to the project
    1. Download the Hydra plug-in from the Unity Asset Store
    2. Import it into your project.
    3. You should now have a folder within Assets called “SixenseInput”
  8. Create a high level Empty in your hierarchy and name it “PLAYER-ONE”
    1. make your Avatar prefab a child of this parent
    2. Drag the OVR CameraRig from the OVR folder and also make it a child of PLAYER-ONE
  9. Properly position the Oculus camera in-scene
    1. The Oculus camera array should be placed just forward of the avatar’s eyes
    2. we typically reduce the forward clipping plane to around 0.15m
    3. If you’re using the OVRPlayerController, Character Controller settings work well:
      1. Center Y = -0.84m (standing),
      2. Center Z = -0.1m (prevents the camera from being “inside the head”)
      3. Radius = 0.23m
      4. Height = 2m
    4. This will require some trial and error. Make sure that you use the Oculus camera, and not the Oculus Player Controller. Experimentation will be required to bridge the spatial relationship between a seated player and a standing avatar; calibration software needs to be written. Trial and error here means a series of very fast cycles: build, test, make notes, modify, re-build, re-test, make notes, repeat until perfect. There are many gyrations, and you will become an expert at rapidly donning and removing the HMD, headphones, and hand controllers.
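If you get tired of re-entering those step-9 numbers in the Inspector during the build-test-modify cycle, you can pin them in code. A hedged sketch — the script and its placement are our own convention, not part of either SDK:

```csharp
using UnityEngine;

// Hypothetical helper: applies the step-9 calibration values in code so
// they survive the rapid build-test-modify cycle. Attach to the object
// carrying the Character Controller (e.g. the OVRPlayerController).
public class gPlayerCalibration : MonoBehaviour
{
    void Start()
    {
        var cc = GetComponent<CharacterController>();
        cc.center = new Vector3(0f, -0.84f, -0.1f); // standing offset; Z pulls the camera out of the head
        cc.radius = 0.23f;
        cc.height = 2f;

        // Tighten the near clipping plane so the avatar's own face
        // geometry doesn't clip into view.
        foreach (var cam in GetComponentsInChildren<Camera>())
            cam.nearClipPlane = 0.15f;
    }
}
```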
  10. Create the IK target for head orientation
    1. Right-click on the CenterEyeAnchor in the Hierarchy, and select “Create 3D Object > Cube”
    2. Name the cube “dristi-target”
    3. move the cube approx. 18in (0.5m) directly outward from the avatar’s third eye
    4. This will serve as the IK point at which the avatar’s head is “aimed”, i.e. where they are looking. In yoga, the direction of the gaze is called dristi.
  11. Get the Sixense code into your scene
    1. Open the SixenseDemoScene
    2. copy the HandsController and SixenseInput assets from the Hierarchy
    3. Re-open your scene
    4. paste the HandsController and SixenseInput assets into your Hierarchy
    5. drag both to make them children of the OVRCameraRig
  12. Make sure the Sixense hands are correctly wired.
    1. Each hand should have the “SixenseHandAnimator” controller assigned to it
    2. Root Motion should be UNchecked
    3. Each hand should have the SixenseHand Script attached to it
    4. On the pull-down menu for the SixenseHand script, the proper hand should be selected (L/R)
  13. Properly position the Sixense hands in-scene
    1. They should be at about the Y-pos height of the belly button
    2. The wrists should be about 12in (30cm) in Z-pos, forward of the abdomen
    3. In other words, they should be positioned as if you are sitting with your elbows glued to your sides, forearms extended parallel to the ground.
    4. You will want to adjust, tweak, and perfect this positioning. There is an intrinsic relationship between where you position the hands in the scene, and the Sixense trackers’ position in the real world relative to the Oculus camera. Trial and error and clever calibration software solve this. That’s another tutorial.
  14. Make the Sixense hands invisible. 
    1. we do this because they will merely serve as IK targets for the avatar’s hands
    2. do this by drilling down into HandsController : Hand_Right : Hand_MDL and unchecking the “Skinned Mesh Renderer” in the Inspector panel
    3. do the same with the left hand.
    4. this leaves the assets available as IK targets, but removes their rendered polys from the game
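The same effect can be had at runtime. A small hedged sketch (the script name is ours) that disables every Skinned Mesh Renderer beneath the HandsController, leaving the transforms alive as IK targets:

```csharp
using UnityEngine;

// Hypothetical alternative to unchecking the renderers by hand:
// attach to HandsController to hide both hand meshes at startup.
// The hand transforms stay active, so they still drive the IK.
public class gHideHandMeshes : MonoBehaviour
{
    void Start()
    {
        foreach (var r in GetComponentsInChildren<SkinnedMeshRenderer>())
            r.enabled = false; // removes the rendered polys, keeps the transform
    }
}
```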
  15. Create the Animator Controller
    1. create transitions from Entry to New State, and
    2. from New State to Idle (or what you named your created animation)
    3. On the Base Layer, click the gear, and make sure that “IK Pass” is checked.
    4. this will pass the IK data from the animation controller on down the script chain
  16. Assign the new Animation Controller to the Avatar
    1. select the avatar in the heirarchy
    2. assign the controller in the inspector
  17. Map the Avatar with Puppet IK targets for Hands and Head,
    1. drag the “Hand – Left” from the Sixense HandsController parent to the “Left Hand Obj” variable
    2. drag the “Hand – Right” from the Sixense HandsController parent to the “Right Hand Obj” variable
    3. drag the dristi-target (the cube created in step 10) from the OVRCameraRig to the “Look Obj” variable
      1. The dristi-target is nested deep:
      2. PLAYER-ONE : OVRCameraRig : TrackingSpace : CenterEyeAnchor : dristi-target
  18. THAT’S IT!
    1. Build a compiled runtime.
    2. Connect your Rift and Hydra
    3. launch the game
    4. activate the Hydras.
      1. grasp the left controller, aim it at the base, and squeeze the trigger.
      2. grasp the right controller, aim it at the base, and squeeze the trigger.
      3. hit the “start” button, just south of the joystick on the right controller
    5. When you tilt and rotate your head, the avatar’s head should also tilt and roll. When you move your hands, the avatar’s hands should move in a 1:1 ratio in-scene. Congratulations, you’re an Avatar Master.

Tips, and areas for further R&D
  1. Ideally, the avatar’s head should not be rendered for the player, yet it should still cast shadows and reflections
  2. the avatar’s head should, of course, still be rendered for other players in multi-player scenarios, as well as for third-person camera observer positions.
  3. An in-game shadow is a great way to ground the player to the avatar in a convincing manner. Even when the hands are outside the field of view, seeing the shadows of the head and hands triggers a very powerful sense of presence.
  4. While head rotation and orientation on the skull axis is fairly straightforward, head translation, i.e. significant leaning in, out, or to the side, is a bit more abstract in terms of puppeteering choices. You may wish to explore either locomotion animations, or “from the hip” IK solutions to move the torso along with the head.
  5. RL : VR / 1:1 spatial calibration is KEY to great experiences.
    1. See 9.4, above : Properly position the Oculus camera
    2. and 13.4 : Properly position the Sixense hands
  6. The built-in Unity IK leaves a lot to be desired when it comes to realistic approximations of elbow positions. We are investigating the FinalIK package and other professional-class solutions.
  7. This solution in its current form disables the ultra-cool “grasping” or “pointing” animations that are built-in to the Sixense template models. Investigate how to re-enable those animations on the rigged avatar’s MecAnim structure.
  8. You will also want to configure the remainder of the Hydra joysticks and buttons to control all game functions, because it sucks to be reaching and fumbling for a keyboard when you are fully immersed in a VR world.
  9. The majority of this configuration starts in the Unity Input Manager
    1. Edit | Project Settings | Input…
  10. There should be keyboard fallback controls for users who do not own Hydras…
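On R&D item 1 above: Unity 5 exposes a per-renderer shadow mode that does exactly this. A hedged sketch, assuming the head mesh is (or can be isolated as) its own renderer; in multi-player you would apply it only to the locally controlled avatar, which also satisfies item 2:

```csharp
using UnityEngine;
using UnityEngine.Rendering;

// Sketch for R&D item 1: stop rendering the head for the local player
// while keeping its shadow. Assumes the head mesh can be isolated as
// its own renderer; in multi-player, apply this only to the locally
// controlled avatar so other players still see the head (item 2).
public class gHeadShadowOnly : MonoBehaviour
{
    public Renderer headRenderer; // assign the head's renderer in the Inspector

    void Start()
    {
        if (headRenderer != null)
            headRenderer.shadowCastingMode = ShadowCastingMode.ShadowsOnly;
    }
}
```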
Have you tackled this same challenge? Have you created any solutions to the further investigations we’ve proposed above?
Share in the comments below,
because we’re all in this together! 


mobile : the final destination

Witness : the future. a truly mobile VR picnic.

Way back in 2005, I was in the business of creating massive multiplayer augmented reality systems. My team created playspaces which could read up to 200 players simultaneously, using high-powered projectors to paint the space with light, complex multi-camera arrays to sense the people and their movements… and very highly tuned algorithms to transform those raw camera feeds into usable structured data, in real time, with near-zero latency. This was called markerless motion capture, or markerless mocap. It was before Microsoft Kinect, before time-of-flight, and was considered one of the holy grails of computer vision.

We were able to package all this equipment: CPU, GPU, projector, cameras… all into a single piece of unified hardware. Our first build weighed in at 110 lbs.

The Gizmo v1 was a massive 89 pound steampunk joy machine


I lugged that beast all around North America, paying the steep fees to airlines and hurting my back all the while. Due to the physical stresses, I demanded that we bring the weight in under 50 lbs, the top limit of airlines, and indeed, with some clever mechanical engineering, we were able to accomplish that goal.

Gizmo v3 weighed in at a svelte, travel-ready 49.9 lbs. including high-lumen projector, camera-array, power supply, CPU, GPU, and fans.


Nonetheless, 49.9 lbs was still a hell of a lot to haul around, especially given my 13th-story walk-up apartment on the Lower East Side of Manhattan, where I was based at the time. On the 20th time I climbed those stairs, I swore to the gods above that never again would I haul heavy hardware around the planet.

That promise held true for many years. Until now. Now, in 2015, somehow we find ourselves again in need of high-powered GPUs, with the accompanying massive power supplies and cases. Thank the gods, I was able to engineer this thing to less than 20 lbs this time: the cameras are featherlight, and the projectors are replaced by goggles. Instead of projecting to an outerworld, we are creating rich innerworlds. However, it’s still a massive amount of heavy iron.

which brings us to the key event:
a seminal board meeting of my former company.

Matt, Suzanne, and I were sitting at the massive mahogany conference table, alongside our full Board of Advisors: brilliant businessmen, financiers, and researchers. We presented our new ultralight 49 lb. unit, the PlayBox Mark IV. My father was in attendance; he played a key role in ushering in the modern era of VR, having launched the military’s SimNet initiative waaay back in 1980. He simply looked at the schematics, and said:

“You do realise, that all that hardware is going to sit inside a cellphone, inside of 5 years?”

At the time, I scoffed:

“A cellphone? That’s ridiculous! Do you realise the graphics supercomputing power we are harnessing to make this a real-time, responsive, computer vision AR system?”

But as the days, months, and years went on, I realised the wisdom of my father’s words. First came the pico-projectors, medium-lumen LED-powered HD projectors the size of a matchbox. Next came the low-powered, high-resolution stereoscopic camera arrays, these the size of a dime. And finally came nVidia’s Tegra line of GPUs, ultra-fast graphics supercomputers purpose-designed for smartphones and tablets.

Before I knew it, all the parts were in place.

Which brings us to the present moment.

Once again, we have built graphics supercomputers to ease our entry into real-time, high-performance VR. We tweak and optimize every component to maintain the 75fps floor required for genuine presence.

the engine of our current VR-PC, the venerable Radeon 7990. 400 watts of energy draw, 4 teraflops of graphics supercomputing horsepower, and 75fps on our Oculus Rift.

And then, I got a Samsung Note4 and the GearVR peripheral, a hardware/software combo lovingly hand-architected by none other than John Carmack, designed to deliver high-performance VR in a truly mobile form factor.

Samsung GearVR : the harbinger of the final form factor of VR : light, wireless, fast, mobile.


The shocker? To date, my GearVR has outperformed all desktop solutions we’ve created.

Let me say that again:

A $900, battery-powered, 6-ounce smartphone currently outperforms my $2500, 1-kilowatt, 21-pound desktop beast…

AND, the added element of freedom of physical movement is not even factored in here. The ability to bring your GearVR on a picnic in an Adidas sport bag, as opposed to bringing people into your studio and holding the cords out of their way… that alone justifies the Gear.

In short: as soon as possible, dSky will be focusing all efforts on mobile as our lead platform. No worries, Oculus, Sony, and HTC: our apps and experiences will still perform insanely wonderfully on your platforms. It’s just, as with the world:

dSky is Mobile First.

Unity 5 port complete

Well, the port to Unity 5 took a bit longer than expected. Then again, what port doesn’t? Overall, we’re very happy with the more robust namespace support in code, and the physically based shader model. It took quite some time to re-tool all our custom shaders into a PBR model, but once done, the results are spectacular, no pun intended.


R2D2 with the new PBR in Unity5. We’re loving that blue-alloy metal look!

And, we finally solved the mascara issue with all our character models, which we created in Mixamo’s excellent Fuse product. For those techies / artists out there: the trick is to duplicate the existing Legacy/Diffuse-Bump shader for each character, keep the textures and normals, and set the shader model to “Standard / Specular / Fade” with a smoothness of 1.0. Do the same with the eyes, and you’ll have that beautiful “twinkle in the eyes” that all pseudo-living avatars should properly exhibit.
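For those who’d rather apply the fix from code across many materials, here is a hedged sketch of the same settings. The script name is ours; the property and keyword names follow Unity 5’s Standard shader conventions, and the duplicated material is assumed to already carry the original textures and normals:

```csharp
using UnityEngine;
using UnityEngine.Rendering;

// Hypothetical script-side version of the "mascara fix": push a duplicated
// character material into Standard (Specular setup), "Fade" rendering
// mode, smoothness 1.0. Textures/normals are assumed already assigned.
public class gMascaraFix : MonoBehaviour
{
    public Material characterMaterial; // the duplicated Legacy/Diffuse-Bump material

    void Start()
    {
        var m = characterMaterial;
        m.shader = Shader.Find("Standard (Specular setup)");
        m.SetFloat("_Glossiness", 1f); // smoothness = 1.0

        // These blend/keyword settings reproduce the Inspector's "Fade" mode.
        m.SetInt("_SrcBlend", (int)BlendMode.SrcAlpha);
        m.SetInt("_DstBlend", (int)BlendMode.OneMinusSrcAlpha);
        m.SetInt("_ZWrite", 0);
        m.EnableKeyword("_ALPHABLEND_ON");
        m.renderQueue = 3000; // transparent render queue
    }
}
```

Apply the same treatment to the eye materials for the “twinkle” effect described above.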

Luke finally drops the mascara and gets real eyebrows -- and a spark

In other news, our friends at Magic Leap released their first actual concept video. Just single-player for now, but fun stuff nonetheless.

That’s all for today.


From here forward… Optimization

We recently migrated primary build development from my trusty MacBook Air to our high-performance Windows PC rig. FPS was churning along at around 15fps on the MBA, so I figured we’d easily hit the 75fps required for presence on the PC. Lo and behold, that was simply not to be the case… yet.

Shockingly, the frame-rate on the PC is about the same, and the stutters are even worse. This makes me angry. That PC has what was, a year or so ago, the best graphics card that money could buy: a monster Sapphire ATI R9 HD7900, with 3GB of fast GDDR5 graphics RAM, pulling 400w of juice. How could it be delivering performance similar to my humble 2012 MacBook Air, with Intel HD 4000 graphics on a tiny chip welded to the motherboard?

Time to run some benchmarks. I used the beautiful Heaven and Valley benchmarks, both free from Unigine (not to be confused with Unreal and/or Unity, ha!). My PC rig scored a respectable(?) 901 on Heaven: average fps of 36, with a max of 66 and a low of 9. The same settings on the Mac drove a humbling score of 87: average 3.4fps, max 5.3, low 2.4. This implies a theoretical 10x performance advantage on the PC… if we were only talking about pure graphics. Good… at least the benchmarks showed as much. Conclusion: the raw horsepower is there.


Now I just have to do the hard work of getting Unity, and Oculus, and Sixense, to all perform similarly, and to optimize, optimize, optimize until we arrive at that fabled 75fps required for solid presence.

May the force be with me.


It appears that my CPU was totally the bottleneck. Since the machine was re-purposed from its past life as a cryptocurrency miner, we had invested heavily in the GPU and scantily in the CPU, currently an AMD Sempron 145 @ 2.6GHz. After sorting through the so-often-misinformed reddits on this matter…

OctopusRift has a most excellent resource on building the ultimate VR PC.

Tom’s Hardware also appears to be a great resource, specifically Tom’s Gaming CPU Hierarchy Chart, March 2015.


We’ve gone ahead and purchased a new brain for Anakin, moving up from the lil Sempron. Here are the new specs (and yes, 8GB of RAM comes next):

CPU : AMD FX-8350 8-Core Black
GPU : Sapphire Radeon R9 280X Dual-X 3GB GDDR5
HDD : 500gb SSD Samsung 850 EVO
RAM : 4GB Kingston (2 x 2GB)
MBO : ASRock 970 Extreme4 — full ATX socket AM3+
PSU : 1200w Cooler Master Silent Pro Gold BEAST
Stay tuned for the new benchmark results.