Technology Challenges

The entire goal of VR technology is to move toward a solution where the user is completely immersed in the virtual world. Ideally, this would include visual, aural, tactile, olfactory, vestibular, gustatory, and even proprioceptive stimulation. Today, however, the state of technology is such that VR is mostly limited to two main senses: seeing and hearing. Though not a completely immersive experience, when done correctly, a two-sense VR experience can still be extremely powerful.
Still, even within the visual and aural domains, many challenges remain, falling mostly into three areas: image fidelity and rendering, proper response and synchronization, and form factor and cost.

1. Image Fidelity & Rendering

Stereoscopic Rendering

Stereoscopic rendering is the first problem any VR hardware must solve. Unlike a conventional display, which presents a single image to both eyes, VR must drive a separate lens and display (or display region) for each eye in order to create the perception of depth. Rendering two independent images sounds simple, but it draws heavily on the device's graphics processor; only recently have GPUs and graphics libraries become capable enough to handle it. Samsung and Oculus provide an easy-to-use SDK for this function.
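As a rough illustration of the per-eye offset involved, the sketch below builds a simple view-offset matrix for each eye from an assumed average interpupillary distance; a production SDK (such as the one Samsung and Oculus provide) supplies calibrated per-eye transforms directly.

```python
# Sketch: per-eye view offsets for stereoscopic rendering.
# IPD_M is an assumed average interpupillary distance, not a value
# taken from any specific headset.

IPD_M = 0.064  # assumed average interpupillary distance, metres


def eye_view_matrix(eye):
    """Build a row-major 4x4 view offset that shifts the camera by
    half the IPD: the view matrix translates the world opposite to
    the eye, so the left eye's matrix shifts the world +IPD/2 in x."""
    half = IPD_M / 2.0
    dx = -half if eye == "left" else half
    return [
        [1.0, 0.0, 0.0, -dx],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


# The scene is then rendered twice per frame, once with each matrix.
left_view = eye_view_matrix("left")
right_view = eye_view_matrix("right")
```

Rendering the scene twice with these offsets is what roughly doubles the graphics load compared with a conventional single-view display.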

Exhibit 2 Stereoscopically rendered object [9] [10]

Compensating for the Magnifying Lens

Because VR headgear positions a screen immediately in front of the user's eyes, focus must be achieved through a magnifying lens. The ideal solution would use multiple lens elements to correct for optical problems, but cost and form-factor considerations make this impractical. A single lens per eye lets the eyes focus and keeps costs down, yet it also degrades the quality of the scene unacceptably, forcing significant real-time processing to compensate for the distortion.

Exhibit 3 Distortion Compensation [11]

The heart of this problem is that each lens has its own curvature, which introduces both barrel distortion (straight lines bowing outward) and chromatic aberration (color fringing as different wavelengths focus at different points). Compensating for this mathematically requires knowing both the physical measurements of the lens and its optical properties, which demands significant expertise.
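A common way to model the compensation is a radial polynomial: each frame is pre-warped with a pincushion distortion that the lens's barrel distortion then cancels. The coefficients below are hypothetical; real values must be measured for the specific lens.

```python
# Sketch: radial pre-distortion in normalized lens coordinates
# (0,0 at the lens centre). K1 and K2 are hypothetical example
# coefficients, not measurements of any real lens.

K1, K2 = 0.22, 0.24


def predistort(x, y):
    """Pre-warp a point with a pincushion (radial polynomial)
    distortion so the lens's barrel distortion cancels it,
    leaving the user a rectilinear image."""
    r2 = x * x + y * y
    scale = 1.0 + K1 * r2 + K2 * r2 * r2
    return x * scale, y * scale
```

Chromatic aberration can be handled the same way, by applying slightly different coefficients to each color channel.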

Display Resolution, Rendering and Refresh Rate

Another side effect of using a magnifying lens to achieve focus is that users can more easily see pixels (known as the ‘screen door effect’ or SDE) or detect blur (similar to the pixel lag you might notice while watching a sports game on an older LCD TV). In VR, these problems at best take the user out of the immersive experience. At worst, they can make the user physically sick.

Exhibit 4 Screen Door Effect (SDE) [12]

The best and most obvious way to eliminate SDE is to use higher pixel density screens. The goal, in this case, would be an 8K display, which, with the rate of advancement of display technology, we can expect in the near future.
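One way to quantify SDE is angular pixel density, in pixels per degree: the panel's horizontal pixels per eye divided by the field of view they span. The panel and field-of-view figures below are assumed for illustration only.

```python
def pixels_per_degree(px_per_eye, fov_deg):
    """Angular pixel density: horizontal pixels per eye divided by
    the horizontal field of view those pixels are stretched across."""
    return px_per_eye / fov_deg


# Assumed figures: a 2560x1440 panel split between two eyes
# (1280 px per eye) viewed through a ~96-degree field of view.
ppd = pixels_per_degree(1280, 96)  # ~13.3, far below the ~60 a sharp eye resolves
```

The gap between that figure and what the eye can resolve is why the magnified pixel grid becomes visible, and why much denser panels are needed to eliminate SDE.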
Additionally, hardware-based image blur can be reduced or eliminated by using OLED (Organic Light-Emitting Diode) technology. Conventional LCD displays use transistors and liquid crystals to control the picture and cannot respond quickly enough to fast movement. OLED has a far simpler structure and can respond up to 200 times faster than an LCD, making it a far more effective VR solution.
One of the most important factors influencing image quality, however, is the refresh rate, which can either create a dynamic, real-looking environment or completely break the illusion. A faster refresh rate is better, but raising it dramatically increases the processing power required, creating practical difficulties. At minimum, a 60 fps (frames per second) refresh rate is needed; 75 fps is more reasonable, and 120 fps or above is ideal.
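The processing cost of a higher refresh rate follows directly from the per-frame time budget, which this small sketch computes for the rates mentioned above.

```python
def frame_budget_ms(fps):
    """Milliseconds available to render one frame at a given
    refresh rate: the budget shrinks as the rate rises."""
    return 1000.0 / fps


# At 60 fps the renderer has ~16.7 ms per frame; at 120 fps, ~8.3 ms.
for fps in (60, 75, 120):
    print(f"{fps} fps -> {frame_budget_ms(fps):.1f} ms per frame")
```

Halving the frame budget while also rendering the scene twice (once per eye) is what makes high refresh rates so demanding in practice.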

Field of View

One aspect of VR hardware receiving growing attention is the field of view of the headset's display. A narrow field of view appears to the user like looking through a tunnel: without the benefit of peripheral vision, objects may seem to appear out of nowhere or sneak up on the user more easily.
While a wider field of view is of obvious benefit to the user, it is difficult to accommodate in the device design, since it raises lens cost while reducing image quality. Software can partially compensate for that loss, but only at the price of added complexity, and it never fully recovers the degraded quality. Multiple lens elements could solve the optical problem, but would add weight and cost to the headset, trading one problem for another.
Ultimately, field of view will be determined by cost and form-factor considerations: the user experience must be optimized within the boundaries of acceptable cost and physical comfort. For the time being, this will likely mean a limited field of view on consumer headsets.
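The lens trade-off can be seen in the simple thin-lens geometry relating display width and focal length to field of view; the dimensions below are assumed for illustration, not taken from any real headset.

```python
import math


def fov_degrees(display_width_m, focal_length_m):
    """Horizontal field of view through a single magnifier, with the
    panel at the focal plane: fov = 2 * atan(w / (2 * f)). A wider
    panel or a shorter focal length widens the view."""
    return 2.0 * math.degrees(
        math.atan(display_width_m / (2.0 * focal_length_m)))


# Assumed dimensions: 6 cm of panel per eye behind a lens with a
# 4 cm focal length gives roughly a 74-degree field of view.
print(round(fov_degrees(0.06, 0.04), 1))
```

Shortening the focal length to widen the view is exactly what magnifies distortion and pixel structure, which is why field of view, image quality, and lens cost pull against each other.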

2. Proper Response & Synchronization

Motion & Head Tracking

Because VR is an immersive experience, users expect to interact with it in a natural way: moving their heads to adjust the field of view, and using gestures or motions to control their environment. Even early VR headsets were therefore equipped with sensors to measure head movement and adjust what is displayed accordingly. If the image lags behind the user's head motions, it may trigger nausea; if the content does not respond quickly to gestures and motions, the experience is no longer immersive.

Exhibit 5 Head Tracking - Axes of Movement [13]

Overcoming these issues depends on the VR headset's ability both to render the necessary content quickly and to respond to movements (sensor response). If the sensors respond quickly but the content is not rendered correctly, the solution will still fail.
The standard sensor system in mobile devices is targeted at regular day-to-day use, where high resolution is not necessary. This means that, in systems using a mobile phone as the viewing device, the on-board sensors cannot be used very effectively for VR. In the Gear VR, for example, Samsung augmented the phone's sensors with a second, high-resolution sensor system located in the VR hardware itself. This is a necessary, but not sufficient, solution: additional software features, such as front buffer rendering and synchronous time warp, are also required to minimize latency on the headset. These techniques live within the graphics framework and highlight that care and attention are required on the software side as well.
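As an illustration of the kind of sensor processing involved, the sketch below shows one step of a simple complementary filter, a common way to fuse a fast-but-drifting gyroscope with a slow-but-stable accelerometer estimate; the blend weight is an assumed tuning value, not anything taken from a specific headset.

```python
ALPHA = 0.98  # assumed blend weight favoring the fast gyro path


def complementary_pitch(prev_pitch, gyro_rate, accel_pitch, dt):
    """One filter step: integrate the fast but drifting gyro rate
    over dt, then pull gently toward the slow but stable
    accelerometer pitch estimate to cancel the drift.
    Angles in radians, gyro_rate in rad/s, dt in seconds."""
    return ALPHA * (prev_pitch + gyro_rate * dt) + (1.0 - ALPHA) * accel_pitch
```

Run at the sensor's sample rate, a filter like this keeps orientation both responsive and drift-free, which is the whole point of the high-resolution sensor system: with low-rate sensors, no amount of filtering can keep motion-to-photon latency acceptable.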

Exhibit 6 Motion to Photon

Adding Audial Depth

In addition to the visual field, users of VR headgear also receive stimulus from an audio source, usually headphones. With standard headphones and standard audio drivers, sound is perceived as flat and close to the user. While this has become normal practice for listening to music or watching flat content on a screen, it can ruin the illusion of VR because it does not correspond to the user's visual experience. Indeed, our reactions to what is happening around us are often driven by sound before the stimulus has been identified visually.
Using the OpenSL [14] audio standard, a content creator can place sound within a 3D space to be rendered by the VR hardware later. Faithfully reproducing that 3D sound on non-dedicated headphones, however, would likely require a separate, third-party rendering solution.
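As a toy illustration of placing sound in space, the sketch below maps a source azimuth to constant-power left/right gains; this function is purely hypothetical, and real spatial-audio renderers use far richer HRTF-based processing than simple panning.

```python
import math


def pan_gains(azimuth_deg):
    """Constant-power pan: map a source azimuth (-90 = hard left,
    0 = centre, +90 = hard right) to (left, right) gains whose
    squared sum is always 1, so perceived loudness stays constant
    as the source moves across the stereo field."""
    theta = math.radians((azimuth_deg + 90.0) / 2.0)  # 0..90 degrees
    return math.cos(theta), math.sin(theta)
```

Panning only conveys left-right position; conveying elevation and front-back depth on ordinary headphones is what demands the heavier HRTF-style processing mentioned above.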

3. Form Factor & Cost

Form Factor Considerations

As VR headsets are worn on the user’s head, and the virtual worlds may inspire the user to move around, great attention must be paid to the weight, portability, and comfort of the headset during the design process. Poor design can prevent the user from fully committing to the virtual world due to discomfort from the weight, balance, or fit, or due to other technical design issues such as lens fog.
The first design consideration is that the user's face is extremely sensitive and does not bear weight well. Reducing weight should be a priority, but is not always practical: in the Gear VR, for example, the weight of the smartphone necessarily rests at the front of the headset. Where no more weight can be removed, the design should balance what remains between the front and the back of the head.
Comfort is also an important consideration. The interface between the user’s face and the headset should be soft and durable, and any wires necessary should not be felt.
Today, in order to run high-powered games, VR headsets will likely need to be tethered to a powerful PC, which greatly reduces portability and mobility; it also risks the user becoming tangled in the wires while turning within the VR experience. For most use cases, a smartphone-driven VR headset such as the Gear VR has the benefit of being fully portable while still providing users with a great experience, albeit with more limits on computing power.

Pricing for Mass Market

The Samsung Gear VR is currently the most portable and lowest-cost consumer VR option, selling for $99 (without the necessary smartphone). The consumer version of the Oculus Rift retails for $599 and requires a modern gaming PC to run. Even at these prices, demand has been very strong, with these and similar products selling out quickly or requiring long waits for delivery.
It is important to note that the volume of more expensive devices being sold is relatively low, and the strong sales of the Gear VR suggest there will be a large market for lower-priced VR headset options [3]. With that in mind, each manufacturer will have to trade off between addressing all of VR's technical concerns (and carrying a high price) and appealing to a broader, more price-sensitive market.

9. Mishra, Animesh. Rendering 3D Anaglyph in OpenGL. Animesh's Blog. [Online] 05 29, 2011. [Cited: 05 17, 2016.]
http://www.animesh.me/2011/05/rendering-3d-anaglyph-in-opengl.html.

10. How to Create Stereoscopic Cross-View 3D Pictures from Lytro Images. LightField Forum. [Online] 01 02, 2013. [Cited: 05 17, 2016.]
http://lightfield-forum.com/2013/01/how-to-create-stereoscopic-cross-view-3d-pictures-from-lytro-images/.

11. Mansurov, Nasim. What is Distortion? Photography Life. [Online] 08 11, 2011. [Cited: 05 17, 2016.]
https://photographylife.com/what-is-distortion.

12. Oculus VR. Oculus at E3 2013. Oculus Blog. [Online] 06 11, 2013. [Cited: 05 17, 2016.]
https://www.oculus.com/en-us/blog/oculus-at-e3-2013/.

13. —. Initialization and Sensor Enumeration. Developer Center — Documentation and SDKs. [Online] 2016. [Cited: 05 17, 2016.]
https://developer.oculus.com/documentation/pcsdk/latest/concepts/dg-sensor/.

14. Khronos Group. OpenSL ES - The Standard for Embedded Audio Acceleration. Khronos.org. [Online] 2016. [Cited: 05 11, 2016.]
https://www.khronos.org/opensles/.