Poor Man’s Positional Tracking: Part 1

For those who don't know me, I'm rather poor. Like living-out-of-my-truck poor, for the most part. And that's okay, it's part of the adventure. But it also means that a $900 VR HMD, plus the grand and a half or so for the system to run it, is waaaaaaay out of my price range. However, I happen to think that VR is the future. It's got an insane number of possibilities, not just in the realm of storytelling or gameplay, but for things like educational experiences, psychological insights, and medical treatments as well. Because of that, I think it's imperative that we get VR into the hands of as many people as possible who will be able to innovate, experiment, and bring more insight to the table going into the future. Google's Cardboard is a great step forward in that arena, but it's still lacking the maneuverability that one expects from being in a virtual world. While I'm sure GearVR or a Daydream headset will offer positional tracking soon enough, even that is sure to be outside the price range of those of us who don't upgrade our phones more than once every couple of years. So with that in mind, I've been looking for a DIY positional tracking solution that would be affordable and work with my rather nomadic lifestyle.
 
Recently, I came across the VR Tracker project designed by Jules Thuillier, which streams three-dimensional coordinates over WiFi and websockets to a Cardboard headset. But a couple of hundred dollars for the system was still out of my price range. However, I liked the idea of offloading the tracking to a separate system, and I began working with some cheap PlayStation 3 Eye cameras to build a system of my own as inexpensively as possible. Thus, the Poor Man's Positional Tracking.
 

A Disclaimer: This is only my second time using Python and I've only been playing with Unity on and off for about a year. If you're expecting some incredibly majestic code, you're barking up the wrong tree. I treat programming languages like themed Lego sets: I've got a basic idea of how the standard Lego bricks click together and can build most anything in my head. Most of my time with a new language is spent looking for examples of how other people did what I have in my head using those special pieces that come with the themed sets. I'll do my best to include links to as many of the tutorials that I drew from as I can, but there's probably very little noticeable difference between me and a script kiddie, fair warning.

 
An Overview
This project is basically made up of two scripts. A Python script monitors two PS3 Eye cameras for the brightest point they see and calculates that point's coordinates in 3D space. Those 3D coordinates are then streamed over a WiFi websocket connection to your Unity-based HMD (GearVR, Daydream, Cardboard, etc.). On the Unity side is a websocket script that takes those 3D coordinates and constantly updates the Transform.position of a GameObject, which you can attach your head GameObject to. Move the light/IR marker and your head GameObject moves in Unity. Piece of cake.
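The per-frame job on the camera side boils down to "find the brightest pixel." Here's a minimal sketch of just that idea in plain NumPy; the actual script does this with OpenCV and smooths the frame first, so don't treat this as the real implementation:

```python
import numpy as np

def brightest_point(gray):
    """Return the (x, y) pixel of the brightest spot in a grayscale frame.

    Sketch only: the real OpenCV-based script blurs the frame before
    searching, so a single noisy hot pixel can't win.
    """
    row, col = np.unravel_index(np.argmax(gray), gray.shape)
    return int(col), int(row)  # (x, y), OpenCV-style ordering

# Toy 480x640 frame: dark everywhere except one bright marker.
frame = np.zeros((480, 640), dtype=np.uint8)
frame[120, 250] = 255
print(brightest_point(frame))  # (250, 120)
```

Do that in two cameras at once, and the two pixel positions are what get turned into one 3D coordinate.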
 

Another Disclaimer: If you have Cardboard and your phone has a tendency to drift, then this project will probably not work too well for you. Pitch, yaw, and roll are still dependent on whatever SDK your Unity application is using. This means if you drift, you're going to drift, and your forward/back, left/right movement is going to be offset by however many degrees you've drifted. This is not only confusing, but can lead to Pukeyville. Population: You. I've used this with a borrowed GearVR and an IR marker and was pleasantly surprised with how naturally the system worked. It's not the sub-20ms latency that presence requires, but it's smooth enough to play, develop, and experiment with the concept.

 
Okay, with all that out of the way, here’s what you need to implement this the way I did:

  • Source Code: https://github.com/steelejay/3d-positional-tracking
  • 2 PS3 Eye Cameras

    Another word of note: (It never ends!) There are multiple ways to make the bright point that the script tracks. If you have a playspace where you can turn off or at least dim the lights while working, then you can get away with something like a tap light velcroed to your head. If that doesn't work for you, there is the option of modifying the PS3 Eye cameras to filter out most non-infrared light and using an IR marker as your light source. The reason I'm telling you this now is because there are at least 2 kinds of PS3 Eye cameras, and one of them is not capable of being modified for use with IR. While I've ordered from that Amazon link and gotten 2 cameras capable of being modified, I also got 2 that weren't. If you think IR is going to be the only way for you, I would advise picking up a couple of used cameras from Gamestop or someplace where you can examine the cameras first. The modifiable cameras have a rounded, orb-like lens on the outside, while the non-modifiable cameras have a flat lens. You can see pictures here. These two videos will show you how to modify the cameras for IR if necessary: Part 1, Part 2

  • 1 Raspberry Pi and the fixings – optional; I'll be describing how to do it with a Pi, so if you've got another machine you'd like to use, you're on your own.
    • 1 HDMI Cable
    • An HDMI-capable screen of some sort for the Pi to help with calibration. Once it's all set up you can have the script auto-run after boot, so you don't even need a screen or anything hooked up to it. I've got mine set up for SSH so I can shut it down when necessary.
    • Keyboard, mouse, etc
  • 1 WiFi Network for your Pi and the HMD to communicate through
  • 1 Multi-directional bright light source – you need something that can be seen by the cameras even if you turn around. After experimenting with infrared LEDs that have a limited FOV, I eventually settled on these tap lights. They're really bright and get the job done without messing with wiring or pulling apart the cameras. If you can dim the room you're in, this is the way to go.
  • A tape measure and blue painters tape – they're both incredibly handy.
  • Basic Linux Skills and Unity Experience. Most of what I’ve done is drawn heavily from tutorials so you don’t need to be an expert to make this work, but you do need to know some basics about the environment you’re running the script in. I wrote and ran the script on a Raspberry Pi 2 using Python and OpenCV under Linux and that’s what I’m going to give you a roadmap for. Trust me, I fumbled my way through this with a lot of trial and error. If I can do it, you can do it too. That’s what computers are all about. When all else fails, either google the error message or google what you’re looking for with the word ‘tutorial’. That will get you farther than you can imagine in IT.
  • Cardboard, GearVR, Daydream, etc. Some sort of Unity-capable VR HMD with WiFi.

 
Part 1. The Gateway
I used a Raspberry Pi 2 Model B (I think) and got reasonably decent results. I'd recommend using whatever the latest version of the Pi is, if you're using a Pi at all! You don't necessarily have to; I'm only going to cover setting up the gateway's software environment for a Pi. You could probably run this in a VirtualBox on your badass gaming rig and save yourself the $40 for a Pi.
 
I'm going to stop here for a second to give heavy props to Adrian Rosebrock @ http://www.pyimagesearch.com/; most of this Python code of mine is just mixing and matching from his tutorials. I really have very little experience with Python or OpenCV and he does an amazing job of cutting through the bullshit. One of these days I intend to go through his 21-day course and I highly recommend you do so too if this sort of stuff interests you. I also just noticed he is currently offering Raspbian + OpenCV pre-configured and pre-installed with his ebook packages here for $94. Out of my price range, but if you're into this stuff and you've got the cash, probably worth it.
 
Part 1a. Install OpenCV and Python
So with that being said, I configured my Raspberry Pi using the tutorial found here.
I installed version 3.1.0 of OpenCV for Python 2.7 and that’s what I constructed the gateway script for.
There are also additional tutorials for installing OpenCV under different configurations here.
Awesome site, right?
 
Part 1b. Additional Libraries
(Make sure you are in your working environment when you install these libraries. workon cv3 is the command if you followed the same tutorials I did.)

NumPy, which should have been installed if you followed the same tutorial I did.
imutils (https://github.com/jrosebr1/imutils)
SimpleWebSocketServer from https://github.com/dpallot/simple-websocket-server
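Assuming you're inside the cv3 virtualenv from the install tutorial, the installs look roughly like this (installing SimpleWebSocketServer straight from its GitHub repo is one option; a clone plus setup.py works too):

```shell
workon cv3
pip install numpy imutils
pip install git+https://github.com/dpallot/simple-websocket-server.git
```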

Note: if you run into a permissions error message while trying to run pip, add "sudo" to the beginning of your command. Example:
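Assuming imutils was the install that hit the error:

```shell
sudo pip install imutils
```

One caveat: on some setups, sudo runs the system pip rather than your virtualenv's pip, so check which environment the package actually landed in.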

This will temporarily run the command as superuser and solve half your problems.

 
Part 1c. Run the Script.
Make sure your cameras are plugged into the Pi for this step. The PS3 Eye cameras have two settings for focusing their lens. You want the wide setting, which is achieved by turning the switch on the front all the way clockwise. Easy.
Copy the positionaltracking.py script into your home directory or wherever you want to store it (I chose /home/pi/PT) and then open up a CLI.
If you followed all the same tutorials I did, here are the commands you'll need to run:
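Assuming you dropped the script in /home/pi/PT like I did and used the cv3 virtualenv name from the install tutorial:

```shell
cd /home/pi/PT
workon cv3
python positionaltracking.py
```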

 
If you did it properly, you'll see two displays pop up on your screen showing what your cameras see. Pressing 'Q' while one of the displays is selected will quit the program. Each display will have a blue circle around the brightest point the camera sees.
Now comes the fun part.
 
Note: The default file path for saving your calibration data is /home/pi. If that is not the path to your home directory, you'll have to specify a different path like so:
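Something along these lines; I'm guessing at the flag name here, so check the argument parsing near the top of positionaltracking.py for what it's actually called:

```shell
python positionaltracking.py --path /home/yourusername
```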

 
Part 1d. Your Playspace
Next you need to figure out a place to put everything and where to mount your cameras. I found the best configuration was to mount the cameras as high as possible (I've had good results with anywhere from 6 to 7 feet) with about 10 feet between them, and then angle them downward. You need about 3 feet of space between the cameras and the front of your tracked area, so if you're going for an estimated 36 square foot area like I've got, you'll want a pretty decent amount of room. Unity measures things in meters, so in order to ensure the real-world scale is as close as possible, what I do is tape out 5 points on the floor with painters tape: basically a cross two meters long and two meters wide, about 6'7″ by 6'7″. What you're doing is marking out on the floor the 5 points (0,0,0), (1,0,0), (-1,0,0), (0,0,1), (0,0,-1), which will correspond with your Unity environment at exact scale. If you don't have enough room you can make the cross smaller, just make sure the five points of your cross on the ground are scaled properly; you can then scale your movement up in Unity if you need to. These are the 5 points you'll use to calibrate your two cameras so the Pi can do the 3D conversions.
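In coordinate terms, the ten calibration marks you'll collect later (five at one meter up, five at two meters) look like this. A sketch only: the ordering of the four non-front floor points is my guess, so don't treat it as the app's exact numbering:

```python
# The five crosses taped on the floor, as (x, z) offsets in meters
# from the center of the playspace. Front cross first, since that's
# where Mark 1 is taken; the order of the rest is illustrative.
floor = [(0, 1), (1, 0), (-1, 0), (0, -1), (0, 0)]

# Marks 1-5 are measured 1 m above the floor, Marks 6-10 at 2 m.
marks = [(x, 1, z) for x, z in floor] + [(x, 2, z) for x, z in floor]

print(len(marks))  # 10
print(marks[5])    # (0, 2, 1): Mark 6, two meters above the front cross
```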


Arrange the cameras so that you get as many of the 5 points in both camera displays as possible. It's okay if you don't get them all in; the important thing is that if you walk around the 5 points you can see the upper half of your body. This will ensure that when you're working within the area you'll be on camera and also able to calibrate the cameras.
 
Part 1e. Calibration
At this point you will begin calibrating the cameras. This is done by taking your light marker, whether an IR marker or a flashlight, holding it over each of the 5 spots at two different heights (ten marks total), and pressing the corresponding button in the app.
 
I’ll walk you through one.

Warning: Do not move your cameras once you’ve begun calibrating. If you do, press the Reset button and start over from Mark 1. Not doing so will mess up your calibration and again lead to Pukeyville.

  1. Install the websockettest.apk and go ahead and open it.
  2. Change the IP address in the upper right corner.
    • If you don't know your IP, you can use the command ifconfig to find it on your Pi. The address field should look something like this: 127.0.0.1:85, where 127.0.0.1 is your IP address and 85 is the port number for the Pi websocket server. The port should remain 85 unless you've messed with the code.
  3. Press the Connect button and the message in the top left should read “Connected to Gateway!”
  4. Now, take a tape measure and tape it so that it sticks out to approximately 3'3″ (1 meter) from the ground, and place it on the front X you've taped on the ground. This corresponds with Mark 1 in the calibration app.
  5. Then take your Light Marker and hold it at the top of the tape measure and watch the camera displays to ensure that the light is circled in both camera displays.
  6. Press the Mark 1 Button on the App. The Message in the Upper Right of the app should change to Answer:Received, Mark_Saved_Mark: 1.
  7. Do this with the remaining 4 X’s.
  8. Next, change your tape measure so that it's stuck out at 6'7″ (2 meters). Then press the Height=1 button on the phone app. This will change the 5 Marks to Mark 6 0,2,1 etc. The reason we do this is because if we only do the 5 original marks, for some reason OpenCV inverts our Y axis. So we need the additional Y layer of coordinates to calibrate with.
  9. Now go to the front X, set it up so the cameras can see your light at the top of the tape measure, and press the Mark 6 button. Rinse and repeat until you've done all ten marks.
  10. Then you just press the Calibrate button, and if everything has worked you'll get a message that the system has been calibrated and the file saved. You can now press the XYZ On button and you should receive a constantly updated stream of X,Y,Z coordinates tracking the brightest light your cameras see.
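For the curious: the Calibrate step is what lets the gateway turn two 2D detections into one 3D point. I won't walk through the actual OpenCV calibration code (it came from the VR Tracker project), but the core triangulation idea can be sketched in a few lines of NumPy. Given each camera's 3x4 projection matrix, the linear (DLT) method solves for the one world point consistent with both pixel positions:

```python
import numpy as np

def triangulate(P1, P2, pt1, pt2):
    """Linear (DLT) triangulation of one point seen by two cameras.

    P1, P2  : 3x4 projection matrices (what calibration recovers)
    pt1, pt2: (u, v) pixel coordinates of the same marker in each view
    Returns the (x, y, z) world coordinates.
    """
    (u1, v1), (u2, v2) = pt1, pt2
    # Each view contributes two linear constraints on the homogeneous
    # world point X: u * (P[2] @ X) = P[0] @ X, and similarly for v.
    A = np.array([
        u1 * P1[2] - P1[0],
        v1 * P1[2] - P1[1],
        u2 * P2[2] - P2[0],
        v2 * P2[2] - P2[1],
    ])
    # The solution is the right singular vector of A with the smallest
    # singular value; dehomogenize to get (x, y, z).
    X = np.linalg.svd(A)[2][-1]
    return X[:3] / X[3]
```

This is a textbook sketch, not the project's code; the ten marks you just collected are what pin down those projection matrices in the first place.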

 
Calibration also saves the calibration settings to a file, so if you've got your path set properly you can just press the Load button in the calibration app and it will automatically reload your settings; you don't need to re-calibrate every time you reboot your system. As long as you don't move the cameras, those settings should be usable. You can also reload your settings from the CLI like so:
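I'm guessing at the flag name here, so check positionaltracking.py's argument parsing for what the load option is actually called:

```shell
workon cv3
python positionaltracking.py --load
```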

That's it! For anywhere from under $30 to around $100, you've got a positional tracking playspace. You're now ready for developing and testing in Unity. The code is on my github. Just don't rely on the script in the calibration project; I haven't updated that in months.
 
Final Notes and Things:
I’m probably not going to update this code much more beyond what it currently is. My Samsung Galaxy S5 that I bought a year ago specifically for trying to experiment in Cardboard drifts horribly, making this project useless for me. Thanks Samsung 😛
However, I did add another feature to the code that lets you track a colored ball. If you're handy with Python you'll note it's currently set up for tracking a blue ballpit ball, like in the tap light photo above. You can turn that feature on with the following command:
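Something like the following, though the actual flag lives in the script's argument parsing, so double-check the name there:

```shell
python positionaltracking.py --ball
```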

In theory, you could modify that feature to track multiple colored balls in order to make an inexpensive trackable wand or whatever. I tried to make the code as simple as possible for people who aren't familiar with programming to experiment with Python, like me 😉 You'll notice I explicitly defined each of the Mark location variables when I could have just used a for-each loop to load an array. There are plenty of ways to make this whole thing more refined, I'm sure, and please feel free to do so. I'm just trying to make something that's accessible to as many people as possible. And yeah, I'm aware commenting my code better would probably have helped with that 😉
 
Handy Links
VR Tracker – where I got the idea for this
VR Tracker github – where I got the OpenCV calibration functions and the websockets DLL for Unity
pyimagesearch.com – such a helpful site
Ball Tracking in OpenCV
Installing OpenCV 3.0 for both Python 2.7 and Python 3+ on your Raspberry Pi 2
Simple Websocket Server