3D tracking


In 3D human-computer interaction, 3D tracking, also called pose tracking or positional tracking, is a process that tracks the position and/or orientation of head-mounted displays, controllers, or other input devices within Euclidean space. Pose tracking is often referred to as 6DOF tracking, for the six degrees of freedom in which the objects are often tracked.
In some consumer GPS systems, orientation data is added using magnetometers, which give partial orientation information but not the full orientation that pose tracking provides.
In virtual reality, it is paramount that pose tracking is both accurate and precise so as not to break the illusion of being in a virtual world. Several methods of tracking the position and orientation of a display and any associated objects or devices have been developed to achieve this. Many methods use sensors that repeatedly record signals from transmitters on or near the tracked object and then send that data to the computer in order to maintain an approximation of their physical locations. A popular tracking method is Lighthouse tracking. By and large, these physical locations are identified and defined using one or more of three coordinate systems: the Cartesian rectilinear system, the spherical polar system, and the cylindrical system. Many interfaces have also been designed to monitor and control one's movement within and interaction with the virtual 3D space; such interfaces must work closely with positional tracking systems to provide a seamless user experience.
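The relationship between the three coordinate systems named above can be illustrated with a minimal sketch. The function names are hypothetical, and the angle conventions (polar angle measured from the +z axis, azimuth measured from the +x axis) are an assumption; real tracking systems may use other conventions.

```python
import math

def cartesian_to_spherical(x, y, z):
    """Return (r, theta, phi): radius, polar angle from +z, azimuth from +x."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r) if r > 0 else 0.0  # polar angle is undefined at the origin
    phi = math.atan2(y, x)
    return r, theta, phi

def spherical_to_cartesian(r, theta, phi):
    """Inverse of cartesian_to_spherical under the same angle conventions."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def cartesian_to_cylindrical(x, y, z):
    """Return (rho, phi, z): distance from the z axis, azimuth from +x, height."""
    return math.hypot(x, y), math.atan2(y, x), z
```

Converting a point to spherical coordinates and back should return the original Cartesian values up to floating-point rounding.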
Another type of pose tracking used more often in newer systems is referred to as inside-out tracking, including simultaneous localization and mapping or visual-inertial odometry. An example of a device that uses inside-out positional tracking is the Oculus Quest 2.

Electromagnetic tracking

Magnetic tracking relies on measuring the intensity of inhomogeneous magnetic fields with electromagnetic sensors. A base station, often referred to as the system's transmitter or field generator, generates an alternating or a static electromagnetic field, depending on the system's architecture.
To cover all directions in three-dimensional space, three magnetic fields are generated sequentially by three mutually perpendicular electromagnetic coils. These coils are placed in a small housing mounted on the moving target whose position is to be tracked. Current passing sequentially through the coils turns them into electromagnets, which allows their position and orientation in space to be determined.
Electromagnetic tracking is used for 3D surgical navigation.
Because magnetic tracking does not require a head-mounted display, which is frequently used in virtual reality, it is often the tracking system used in fully immersive virtual reality displays. Conventional equipment such as head-mounted displays is obtrusive to the user in fully enclosed virtual reality experiences, so alternative equipment such as that used in magnetic tracking is favored. Magnetic tracking has been implemented by Polhemus and in the Razer Hydra by Sixense. The system works poorly near any electrically conductive material, such as metal objects and devices, that can affect an electromagnetic field. Tracking quality degrades as the user moves away from the base emitter, and the trackable area is limited and cannot exceed 5 meters.
Pros:
  • Uses unobtrusive equipment that does not need to be worn by the user and does not interfere with the virtual reality experience
  • Suitable for fully immersive virtual reality displays
Cons:
  • User needs to be close to the base emitter
  • Tracking worsens near metals or objects that interfere with the electromagnetic field
  • Tends to have a lot of error and jitter due to frequent calibration requirements

Camera-based tracking

Camera-based tracking, also known as optical tracking, uses cameras placed on or around the headset to determine position and orientation based on computer vision algorithms. Camera-based 3D tracking systems require a direct line of sight without occlusions; otherwise they produce incorrect data.
Optical tracking can be done either with or without markers. Tracking with markers involves targets with known patterns to serve as reference points, and cameras constantly seek these markers and then use various algorithms to extract the position of the object. Markers can be visible, such as printed QR codes, but many use infrared light that can only be picked up by cameras. Active implementations feature markers with built-in IR LED lights which can turn on and off to sync with the camera, making it easier to block out other IR lights in the tracking area. Passive implementations are retroreflectors which reflect the IR light back towards the source with little scattering. Markerless tracking does not require any pre-placed targets, instead using the natural features of the surrounding environment to determine position and orientation.

Outside-in tracking

In this method, cameras are placed in stationary locations in the environment to track the position of markers on the tracked device, such as a head-mounted display or controllers. Having multiple cameras allows for different views of the same markers, and this overlap allows for accurate readings of the device position. The Oculus Rift CV1 uses this technique, placing a constellation of IR LEDs on its headset and controllers to allow external cameras in the environment to read their positions. This method is the most mature, having applications not only in VR but also in motion capture technology for film. However, this solution is space-limited, needing external sensors in constant view of the device.
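How overlapping views yield a position can be sketched with a simple two-ray triangulation. This is an illustrative midpoint method, not the algorithm of any particular product: each camera is assumed to be calibrated already, so that a detected marker corresponds to a ray with a known origin and direction in a shared world frame. The function names are hypothetical.

```python
import math

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def triangulate(o1, d1, o2, d2):
    """Estimate a marker position from two camera rays.

    o1, o2: camera centres; d1, d2: ray directions toward the marker.
    Returns (midpoint of the closest points on the two rays, gap between them).
    """
    w = tuple(a - b for a, b in zip(o1, o2))
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; depth cannot be recovered")
    s = (b * e - c * d) / denom  # parameter along ray 1
    t = (a * e - b * d) / denom  # parameter along ray 2
    p1 = tuple(o + s * k for o, k in zip(o1, d1))
    p2 = tuple(o + t * k for o, k in zip(o2, d2))
    midpoint = tuple((u + v) / 2.0 for u, v in zip(p1, p2))
    return midpoint, math.dist(p1, p2)
```

With noisy measurements the two rays do not intersect exactly, and the returned gap gives a rough indication of how consistent the two views are. Additional cameras would contribute further rays that could be combined in a least-squares sense.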
Pros:
  • More accurate readings, can be improved by adding more cameras
  • Lower latency than inside-out tracking
Cons:
  • Occlusion: cameras need a direct line of sight, or tracking will not work
  • Necessity of outside sensors means limited play space area

Inside-out tracking

In this method, the camera is placed on the tracked device and looks outward to determine its location in the environment. Headsets that use this technology have multiple cameras facing different directions to get views of their entire surroundings. This method can work with or without markers. The Lighthouse system used by the HTC Vive is an example of active markers. Each external Lighthouse module contains IR LEDs as well as a laser array that sweeps in horizontal and vertical directions, and sensors on the headset and controllers can detect these sweeps and use the timings to determine position. Markerless tracking, such as on the Oculus Quest, does not require anything mounted in the outside environment. It uses cameras on the headset for a process called SLAM, or simultaneous localization and mapping, in which a 3D map of the environment is generated in real time. Machine learning algorithms then determine where the headset is positioned within that 3D map, using feature detection to reconstruct and analyze its surroundings. This technology allows high-end headsets like the Microsoft HoloLens to be self-contained, but it also opens the door for cheaper mobile headsets without the need for tethering to external computers or sensors.
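The way sweep timings can be turned into geometry may be sketched as follows. This is a simplified model under stated assumptions, not the actual Lighthouse protocol: the laser plane is assumed to rotate at a constant rate, a synchronization flash marks the start of each sweep, and the rotor period of 1/60 s is an assumed value. The function names are hypothetical.

```python
import math

ROTOR_PERIOD_S = 1.0 / 60.0  # assumed time for one full rotation of the sweeping laser

def sweep_angle(t_sync, t_hit, period=ROTOR_PERIOD_S):
    """Angle (radians) the laser plane has swept between the sync flash and the sensor hit."""
    return 2.0 * math.pi * (t_hit - t_sync) / period

def ray_from_sweeps(h_angle, v_angle):
    """Unit direction from the base station toward the sensor.

    Both angles are assumed to be measured from the base station's forward (+z) axis,
    i.e. already offset so that 0 means "straight ahead".
    """
    x, y, z = math.tan(h_angle), math.tan(v_angle), 1.0
    n = math.sqrt(x * x + y * y + z * z)
    return (x / n, y / n, z / n)
```

One horizontal and one vertical sweep give a direction to a single sensor. Recovering a full pose would additionally require several sensors at known positions on the device, or rays from more than one base station.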
Pros:
  • Enables larger play spaces, can expand to fit room
  • Adaptable to new environments
Cons:
  • More on-board processing required
  • Latency can be higher

Sensor fusion

Sensor fusion combines data from several tracking algorithms and can yield better outputs than any single technology. One variant of sensor fusion merges inertial and optical tracking. These two techniques are often used together because inertial sensors are optimal for tracking fast movements but accumulate errors quickly, while optical sensors offer absolute references that compensate for inertial weaknesses. Further, inertial tracking can offset some shortfalls of optical tracking. For example, optical tracking can be the main tracking method, but when an occlusion occurs, inertial tracking estimates the position until the objects are visible to the optical camera again. Inertial tracking can also generate position data between optical tracking samples because it has a higher update rate. Optical tracking, in turn, helps to correct the drift of inertial tracking. Combining optical and inertial tracking has been shown to reduce misalignment errors that commonly occur when a user moves their head too fast. Advancements in microelectrical magnetic systems have made magnetic/electric tracking more common due to their small size and low cost.
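The interplay described above can be illustrated with a minimal one-dimensional complementary filter. This is a sketch under simplifying assumptions rather than a production fusion algorithm: the tracked quantity is a single angle, the inertial sensor reports a rate with a constant bias, and the optical system occasionally reports an absolute value. All names, the gain, and the simulation parameters are hypothetical.

```python
def fuse_step(estimate, rate, dt, optical=None, gain=0.1):
    """Advance the estimate by one inertial sample, then correct it if an optical fix exists.

    optical=None models an occluded or not-yet-available optical measurement.
    """
    predicted = estimate + rate * dt  # inertial prediction (fast, but drifts)
    if optical is None:
        return predicted
    return predicted + gain * (optical - predicted)  # pull toward the absolute reference

def simulate(steps=10000, dt=0.001, gyro_bias=0.05, optical_every=10):
    """Stationary device (true angle 0) whose rate sensor reports only its bias."""
    inertial_only = fused = 0.0
    for i in range(1, steps + 1):
        inertial_only = fuse_step(inertial_only, gyro_bias, dt)
        fix = 0.0 if i % optical_every == 0 else None  # optical runs at a lower rate
        fused = fuse_step(fused, gyro_bias, dt, optical=fix)
    return inertial_only, fused
```

In this simulation the inertial-only estimate drifts steadily away from the true value, while the fused estimate stays close to it because each optical fix removes part of the accumulated error. Practical systems often use a Kalman filter or a similar estimator instead of a fixed gain.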

Radio triangulation-based 3D tracking

Wireless tracking uses a set of anchors placed around the perimeter of a tracking space and one or more tags that are tracked. This system is similar in concept to GPS but works both indoors and outdoors, and it is sometimes referred to as indoor GPS. The tags triangulate their 3D position using the anchors placed around the perimeter. A wireless technology called Ultra Wideband has enabled position tracking to reach a precision of under 100 mm. By using sensor fusion and high-speed algorithms, the tracking precision can reach the 5 mm level with update rates of 200 Hz, or 5 ms latency.
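One way a tag could compute its position from anchor distances is sketched below. It assumes that the tag already has a range measurement to each of exactly four anchors at known positions, and that the anchors are not all in one plane; how the ranges are obtained from the radio signal is outside the scope of the sketch. The function names are hypothetical.

```python
def _det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def locate_tag(anchors, ranges):
    """Solve for the tag position from four anchor positions and four ranges.

    Subtracting the first sphere equation from the others gives a linear system
    2 (a_i - a_0) . p = |a_i|^2 - |a_0|^2 - r_i^2 + r_0^2, solved here by Cramer's rule.
    """
    a0, r0 = anchors[0], ranges[0]
    n0 = sum(c * c for c in a0)
    rows, rhs = [], []
    for ai, ri in zip(anchors[1:4], ranges[1:4]):
        rows.append([2.0 * (ai[k] - a0[k]) for k in range(3)])
        rhs.append(sum(c * c for c in ai) - n0 - ri * ri + r0 * r0)
    det = _det3(rows)
    if abs(det) < 1e-12:
        raise ValueError("anchors are coplanar or repeated; position is ambiguous")
    solution = []
    for col in range(3):
        m = [row[:] for row in rows]
        for r in range(3):
            m[r][col] = rhs[r]  # replace one column with the right-hand side
        solution.append(_det3(m) / det)
    return tuple(solution)
```

With noisy ranges or more than four anchors, a least-squares solution over all anchors would normally replace the exact solve used here.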
Pros:
  • User experiences unconstrained movement
  • Allows wider range of motion
  • Provides absolute location instead of just relative location
Cons:
  • Low sampling rate can decrease accuracy
  • Low latency rate relative to other sensors

Inertial tracking

Inertial tracking is a native method of rotational tracking. It uses data from accelerometers and gyroscopes, and sometimes magnetometers. Accelerometers measure linear acceleration. Since the derivative of position with respect to time is velocity and the derivative of velocity is acceleration, the output of the accelerometer can theoretically be integrated to find the velocity and then integrated again to find the position relative to some initial point. Gyroscopes measure angular velocity, which can likewise be integrated to determine angular position relative to the initial point. Magnetometers measure magnetic fields and magnetic dipole moments. The direction of Earth's magnetic field can be incorporated to provide an absolute orientation reference and to compensate for gyroscopic drift. Modern inertial measurement unit (IMU) systems are based on MEMS technology, which allows orientation in space to be tracked with high update rates and minimal latency. Gyroscopes are always used for rotational tracking, but different techniques are used for positional tracking based on factors like cost, ease of setup, and tracking volume.
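The double integration described above, and the reason it is only theoretically sufficient, can be shown with a short sketch. It assumes a one-dimensional motion, a fixed sample interval, and simple Euler integration; the function name and the bias value used in the usage note are hypothetical.

```python
def integrate_position(accels, dt, v0=0.0, p0=0.0):
    """Integrate acceleration samples twice to obtain position and velocity.

    accels: acceleration samples in m/s^2; dt: sample interval in seconds.
    Returns (position, velocity) relative to the initial state (p0, v0).
    """
    v, p = v0, p0
    for a in accels:
        v += a * dt  # first integration: acceleration -> velocity
        p += v * dt  # second integration: velocity -> position
    return p, v
```

For a stationary device whose accelerometer has a small constant bias, for example `integrate_position([0.01] * 1000, 0.01)`, the computed position is not zero, and the error grows roughly with the square of the elapsed time. Ten times the duration therefore gives about a hundred times the position error.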
Dead reckoning is used to track positional data, which alters the virtual environment by updating motion changes of the user. The dead reckoning update rate and prediction algorithm used in a virtual reality system affect the user experience, but there is no consensus on best practices, as many different techniques have been used. It is hard to rely only on inertial tracking to determine the precise position because dead reckoning leads to drift, so this type of tracking is not used in isolation in virtual reality. A lag between the user's movement and the virtual reality display of more than 100 ms has been found to cause nausea.
Inertial sensors are capable of tracking not only rotational movement but also translational movement. These two types of movement together are known as the six degrees of freedom. Many applications of virtual reality need to track not only the users' head rotations but also how their bodies move with them. Six-degrees-of-freedom capability is not necessary for all virtual reality experiences, but it is useful when the user needs to move things other than their head.
Pros:
  • Can track fast movements well relative to other sensors, and especially well when combined with other sensors
  • Capable of high update rates
Cons:
  • Prone to errors, which accumulate quickly, due to dead reckoning
  • Any delay or miscalculations when determining position can lead to symptoms in the user such as nausea or headaches
  • May not be able to keep up with a user who is moving too fast
  • Inertial sensors can typically only be used in indoor and laboratory environments, so outdoor applications are limited