Karl Guttag comments on Vision Pro (2): These hardware problems cannot be ignored

Last week, AR/VR optics expert Karl Guttag compared the Vision Pro with the Quest Pro in terms of price, VST passthrough, applications, interaction, and more, and concluded that the Vision Pro made more of the right design decisions. Nevertheless, Guttag believes the headset still has some hard-to-spot hardware problems, particularly in ergonomics and visual experience.

It is worth noting that he also corrected the guess about Apple's optical module from his previous article: most online reports said the Vision Pro uses pancake lenses, while his own description of them as aspheric lenses was inaccurate, because Apple's official press release explicitly mentions a "custom catadioptric lens", and pancake optics are one category of catadioptric lens. In addition, Limbak, an optics company Apple has acquired, is known for its catadioptric designs and previously introduced a kind of "super pancake" lens.

Next, let's look at what Guttag has to say about the Vision Pro's hardware. Note: because people's visual sensitivities differ, perceptions of display technology also differ, so the problems encountered in actual Vision Pro use may vary; some users may be able to adapt to certain adverse effects while others cannot.

30 minutes is too short

At WWDC 2023, Apple allowed on-site media to experience some functions of Vision Pro, and the experience lasted about 30 minutes. Guttag pointed out that a live demo based on a curated selection of the best experiences naturally sidesteps obvious issues, and that 30 minutes is too short to detect ergonomic and visual issues that may arise with prolonged use.

Judging from past 3D movies and 3D theme-park rides, some people feel dizziness, nausea, and other discomfort after only a few minutes. Similarly, AR/VR comfort can be problematic, and it requires long-term study of a large number of users. Guttag said that Apple may have conducted some user research, but judging from the existing experience reports and hardware configuration, the Vision Pro still made some mistakes or ill-considered decisions, and their impact on users deserves attention.

Low latency and high refresh rate are not enough to solve visual problems

The Vision Pro is equipped with a PC-class SoC, the M2, plus a custom R1 processor for sensor data, enabling dynamic foveated rendering and greatly reducing the latency between the sensors and the displays.

Apple described it this way at WWDC 2023: latency between the sensors and displays can easily cause discomfort, and the R1 processor virtually eliminates it, streaming new images to the displays within 12 milliseconds, 8 times faster than the blink of an eye.

Guttag believes that low-latency processing and high refresh rates are necessary to mitigate XR visual problems, but they address only one of the known issues. Moreover, while the Vision Pro's processing may be faster than comparable headsets, it is still not fast enough, because sensor-to-display latency is only part of the total motion-to-photon latency; other figures Apple has not published include camera latency and display latency.

Assuming the cameras and displays run at a 90Hz refresh rate and process one frame at a time, the total latency roughly triples. On top of that, there may be buffering delays, tracking delays, or other errors. Guttag therefore regards "sensor-to-display latency faster than the blink of an eye" as a marketing phrase that says little about whether the headset as a whole is fast.
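To make the "roughly triple" estimate concrete, here is a minimal back-of-the-envelope sketch. The 12 ms figure is Apple's quoted sensor-to-display number; the camera-capture and display-scanout terms are illustrative assumptions (one frame each at 90 Hz), not published Vision Pro specifications.

```python
# Rough motion-to-photon latency budget for a 90 Hz VST pipeline.
# Only the 12 ms sensor-to-display figure is Apple's; the other terms
# are assumptions for illustration.

REFRESH_HZ = 90
frame_time_ms = 1000 / REFRESH_HZ        # ~11.1 ms per frame at 90 Hz

camera_capture_ms = frame_time_ms        # assume one full frame to capture/read out
sensor_to_display_ms = 12                # Apple's quoted processing figure
display_scanout_ms = frame_time_ms       # assume one frame to scan out the display

total_ms = camera_capture_ms + sensor_to_display_ms + display_scanout_ms
print(f"Estimated motion-to-photon latency: ~{total_ms:.0f} ms "
      f"(vs. the quoted {sensor_to_display_ms} ms)")   # ~34 ms, roughly 3x
```

Under these assumptions the end-to-end figure lands near 34 ms, about three times the headline 12 ms, before any buffering or tracking error is counted.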

In theory, some dedicated low-latency systems achieve close to zero motion-to-photon latency, but in practice, once 3D virtual content must be rendered and cameras and displays must run in sync, system latency is not low enough.

Take military night-vision technology as an example: ENVG goggles based on monochrome photomultiplier tubes are still more common than designs based on semiconductor cameras, and the biggest reason is latency, because the army found during testing that even a slight image lag can be disorienting.

About Camera Arrays

On the Lynx R1 and other VST headsets, the main passthrough cameras are normally coaxial with the left and right eyes, sitting almost directly in front of them. The Vision Pro, however, has the EyeSight screen in its front cover, so the passthrough cameras can only be placed below it along with the other sensors. Guttag believes this is not the best choice, especially for image correction, which may be harder than on the Meta Quest Pro.

Why? Steve Mann, an advocate of VST headsets, emphasized in IEEE Spectrum in 2013 that placing the VST cameras at the center of the user's line of sight is key to long-term visual comfort; even a slight offset in camera position can cause noticeable discomfort later. For example, once a user has adapted to the passthrough view of a VST headset, it takes a while to re-adapt to normal eyesight after taking the headset off.

With the Vision Pro, since the passthrough cameras sit below the user's line of sight, Apple will need some kind of complex correction algorithm to match the passthrough view to the natural perspective of the human eye.

In addition, because of where the passthrough cameras sit, the Vision Pro's depth sensing and hand-eye coordination are harder and less accurate, and the problem is most obvious at close range. Guttag said: any error in the passthrough correction algorithm can cause coordination problems, but such problems are not obvious in a short demo and will only show up with longer-term use.
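As a rough illustration of why a below-the-eye camera makes correction sensitive to depth errors, here is a minimal pinhole reprojection sketch. The 5 cm offset, focal length, and depths are assumptions for illustration, not Vision Pro parameters.

```python
import numpy as np

# Toy viewpoint correction: unproject a camera pixel with an estimated depth,
# translate the point into the eye's frame, and reproject it.

CAM_TO_EYE = np.array([0.0, -0.05, 0.0])   # assumed: eye sits 5 cm above the camera
FOCAL_PX = 800.0                            # assumed pinhole focal length in pixels

def camera_to_eye_pixel(pixel_xy, depth_m, depth_error_m=0.0):
    """Reproject a camera pixel to the eye viewpoint using a (possibly wrong) depth."""
    d = depth_m + depth_error_m
    # Unproject: pixel -> 3D point in camera coordinates
    point_cam = np.array([pixel_xy[0] / FOCAL_PX * d,
                          pixel_xy[1] / FOCAL_PX * d,
                          d])
    # Express the same point in eye coordinates (translation only, no rotation)
    point_eye = point_cam + CAM_TO_EYE
    # Reproject into the eye's image plane
    return FOCAL_PX * point_eye[:2] / point_eye[2]

# A hand ~40 cm away: a 3 cm depth error shifts the corrected pixel by several
# pixels, while for a wall at 3 m the same error is a small fraction of a pixel.
for depth, err in [(0.4, 0.0), (0.4, 0.03), (3.0, 0.0), (3.0, 0.03)]:
    print(depth, err, camera_to_eye_pixel((0.0, 0.0), depth, err))
```

The point of the sketch is simply that the correction depends on accurate depth, and depth errors matter most for nearby objects such as the user's hands, which matches Guttag's close-range concern.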

Overall, the VST camera placement decisions seem to be driven mostly by the EyeSight screen. Apple sacrificed many other design options for the EyeSight screen as well, such as adding a highly transparent protective glass front plate: the upside is a premium look, the downsides are extra weight, worse heat dissipation, optical quality variation, and camera and sensor calibration problems.

About the VAC problem

One cause of dizziness when using XR headsets is the VAC problem, the vergence-accommodation conflict: the focal distance of the displayed image does not match where the eyes naturally focus and converge, causing discomfort such as double vision.

There are many papers on solving VAC, but no really effective product has shipped, which Guttag takes to mean that VAC still has no good solution. For example, ML1 previously implemented multiple focal planes by stacking waveguides, at the cost of image quality and price. Meta has pursued mechanical varifocal and liquid-crystal lenses, but these are still research prototypes and have not shipped in a production VR headset. There are also light-field displays, computational holography, and eye-tracking-based approaches, which come at the cost of heavy computation, hardware complexity, or image quality.

Judging from developer feedback at the WWDC 2023 hands-on sessions, the Vision Pro likely fixes the focal distance at about 2 meters and avoids VAC by placing virtual interfaces and content farther away, rather than within arm's reach (about 1 meter). According to developers, Apple seems to want objects and interfaces in apps to be designed larger and farther from the user.
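The "keep content beyond arm's reach" guidance can be sanity-checked with a quick diopter calculation. The 2 m focal plane is an inference from developer feedback, and the ±0.5 D comfort threshold is a commonly cited rule of thumb, not an Apple figure.

```python
# Back-of-the-envelope VAC check: accommodation is fixed by the optics at an
# assumed ~2 m focal plane, while vergence follows the rendered content depth.
# Mismatch is expressed in diopters (1 / distance in meters).

FOCAL_PLANE_M = 2.0      # assumed fixed focal distance
COMFORT_LIMIT_D = 0.5    # rule-of-thumb threshold, not a hard standard

accommodation_d = 1 / FOCAL_PLANE_M       # 0.5 D

for content_m in [0.5, 0.75, 1.0, 1.5, 2.0, 4.0]:
    vergence_d = 1 / content_m
    mismatch = abs(vergence_d - accommodation_d)
    flag = "uncomfortable?" if mismatch > COMFORT_LIMIT_D else "ok"
    print(f"content at {content_m:>4} m: mismatch {mismatch:.2f} D -> {flag}")
```

Under these assumptions, content at half a meter produces a 1.5 D mismatch, well outside the comfort zone, while anything beyond roughly 1 meter stays within it, which is consistent with the reported design guidance.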

If the Vision Pro is to replace computer monitors and TVs in the future, the distance of the UI also matters for different usage scenarios. For example, a virtual screen placed 2 meters away in a headset behaves more like a TV than a computer monitor, which usually sits about half a meter from the user.

A distant virtual screen means text has to be scaled up in proportion on the XR display, and the interaction still differs from a computer monitor. For example, when the user leans toward the virtual interface, the change in text sharpness is not as pronounced as in reality, and the user cannot simply reach out and touch content on the interface (it is beyond arm's length). But if the XR virtual interface is set to the size and distance of a typical monitor, VAC problems arise easily.
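To put numbers on "scale text up in proportion to distance", here is a quick geometric sketch. The 27-inch monitor at 0.5 m and the 2 m virtual distance are assumptions chosen to match the scenarios discussed above.

```python
import math

# How big must a virtual monitor at 2 m be to subtend the same angle as a
# physical 27-inch monitor (~60 cm wide) viewed from 0.5 m?

def angular_size_deg(width_m, distance_m):
    return math.degrees(2 * math.atan(width_m / (2 * distance_m)))

real_width_m = 0.60
real_distance_m = 0.5
virtual_distance_m = 2.0

fov = angular_size_deg(real_width_m, real_distance_m)            # ~62 degrees
# Solve for the virtual width that subtends the same angle at 2 m
virtual_width_m = 2 * virtual_distance_m * math.tan(math.radians(fov) / 2)

print(f"Same apparent size at 2 m needs a ~{virtual_width_m:.1f} m wide screen")
# Text scales the same way: 4x the distance needs ~4x the glyph height to keep
# the same angular (and thus pixel) size on the headset display.
```

In this example the 2 m virtual monitor has to be about 2.4 m wide to look the same size, i.e. it becomes a virtual TV rather than a desk monitor, which is exactly the behavioral difference noted above.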

The difference in usage behavior from a computer screen is also a problem for VR office work. Even though some people have challenged themselves to work in VR for a month, no one seems to think VR can fully replace the computer monitor yet.

Indeed, the Vision Pro may have higher resolution and better motion tracking than existing mainstream VR headsets, but those improvements are not enough; in terms of efficiency, it is not yet a good substitute for multi-monitor desktop work.

Guttag also pointed out that over the past 30 years, wearable and near-eye displays such as the Sony Glasstron have come and gone, yet few people use them on planes, trains, or in offices. So why do people prefer phones with lower resolution and smaller screens over high-resolution wearable big-screen solutions? He believes cost is not the main reason; there must be other reasons, such as efficiency.

About Foveated Rendering

Judging from WWDC 2023 hands-on feedback, the Vision Pro's eye tracking is quite responsive and accurate, better than other XR devices on the market. However, Guttag points out that existing eye tracking cannot capture every micro-movement of the human eye, such as saccades. Moreover, even if the eye-tracking system can capture exactly where the gaze lands, it is hard to know what the eye actually perceives there, the "image snapshot" captured by the eye. As a result, eye-tracked foveated rendering cannot reproduce the natural variation of a real scene.
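For context, here is a minimal sketch of what eccentricity-based foveation typically looks like, and why uncertainty from tracking lag and saccades forces the full-resolution region to be padded. The band sizes, shading rates, and 5° saccade margin are illustrative assumptions, not Apple's implementation.

```python
import math

# Toy foveation: pick a shading rate from the angular distance (eccentricity)
# between a pixel's view direction and the tracked gaze direction.

def shading_rate(pixel_dir, gaze_dir, saccade_margin_deg=5.0):
    """Return the fraction of full shading resolution for this pixel direction."""
    dot = sum(p * g for p, g in zip(pixel_dir, gaze_dir))
    ecc = math.degrees(math.acos(max(-1.0, min(1.0, dot))))  # eccentricity in degrees
    ecc = max(0.0, ecc - saccade_margin_deg)   # pad the fovea to cover eye movement
    if ecc < 5:      # near the fovea: full resolution
        return 1.0
    elif ecc < 20:   # mid-periphery
        return 0.5
    else:            # far periphery
        return 0.25

gaze = (0.0, 0.0, 1.0)                       # looking straight ahead
for direction in [(0.0, 0.0, 1.0), (0.26, 0.0, 0.97), (0.64, 0.0, 0.77)]:
    print(direction, shading_rate(direction, gaze))
```

The larger the margin needed to hide tracking error and saccades, the less rendering work foveation saves, which is the trade-off Guttag's criticism points at.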

In the long run, users may perceive the dynamic changes in rendering resolution and the unnatural rendering, which may cause headaches and other risks.

About Safety

Compared with OST headsets, VST headsets still have the usual problems of VR headsets, such as poor awareness of the surrounding environment (even in passthrough mode), and they are not well suited to walking around freely. A VR boundary system can limit the user's range of motion to some extent and warn when the user approaches an obstacle, but it cannot reliably keep users from hitting one.

(Image caption: the darkest regions mark where the headset blocks the wearer's line of sight.)

The fundamental problem with VR headsets is that they block the wearer's peripheral vision (above, below, and to the sides). Peripheral vision, which is sensitive to motion and flicker, is precisely what protects people while moving through an environment, and downward vision matters most, since hazards are usually on the ground rather than overhead.

In addition, although the Vision Pro is close to an all-in-one design, it still relies on a power cable, like PC VR. Because the Vision Pro uses a MagSafe-style magnetic connector, the cable is relatively easy to pull off. Apple has not disclosed whether the Vision Pro has a small built-in buffer battery; if it does not, a cut power connection means the passthrough view in front of the user's eyes could simply go dark, which in some situations could lead to further danger.

About Ergonomics

Guttag believes that Apple seems to value style more than function in the physical design of Vision Pro, especially in terms of weight distribution.

Indeed, most of the Vision Pro's weight appears to be concentrated on the face. To mitigate this, Apple equipped it with a soft, face-conforming Light Seal. Even so, some testers showed visible red marks on their foreheads and noses after only half an hour with the Vision Pro.

In fact, although Apple has optimized wearing comfort, the headset's materials (aluminum alloy plus glass) are simply heavy, so it presses on the face when worn for long periods. According to some testers, the Vision Pro headset weighs at least 450g, possibly over 500g, and the external cable and battery add roughly another 60g and 200g respectively.

In the future, Apple's remedies may include an additional over-the-head strap to share the load, or further optimization of the face pad. However, Guttag said that judging from the WWDC demo video, the Vision Pro's head strap does not pass through the device's center of gravity, so by basic physics this design can do little to take weight off the front of the headset.

One possible solution would be to place the battery at the back of the head as a counterweight, as the HoloLens 2 and Quest Pro do. However, given the Vision Pro's soft, flexible headband, a rear battery would not balance the headset's weight well. Of course, official or third-party head straps could also be supported in the future, such as rigid straps similar to the Quest Elite Strap.

If placing the battery at the back of the head with an extra strap added about 200g to the Vision Pro, the device's overall weight would fall somewhere between the Quest Pro and the HoloLens 2.
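The counterweight argument can be made concrete with a simple torque estimate about the point where the headset rests on the face. The masses echo the figures quoted above; the lever arms are illustrative guesses, not measured Vision Pro values.

```python
# Rough torque comparison about the face/nose contact point.

G = 9.81  # m/s^2

def torque_nm(mass_kg, lever_arm_m):
    return mass_kg * G * lever_arm_m

# Front-heavy baseline: ~500 g headset with its center of mass assumed ~7 cm
# in front of the pivot (face contact).
front = torque_nm(0.50, 0.07)

# Counterweight option: the ~200 g battery assumed ~12 cm behind the pivot.
rear = torque_nm(0.20, 0.12)

print(f"forward torque: {front:.2f} N*m")
print(f"counter torque: {rear:.2f} N*m")
print(f"net forward:    {front - rear:.2f} N*m")  # still front-heavy, but less so
```

Under these assumptions a rear battery cancels most but not all of the forward torque while adding 200g of total mass, which is why the trade-off lands between the Quest Pro and the HoloLens 2.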

Conclusion

Guttag pointed out that the Vision Pro still has many problems at this stage. Doing VST passthrough well is very difficult, and trade-offs between function and appearance are unavoidable; in particular, the addition of the EyeSight screen constrains where the passthrough cameras can be placed.

Although the Vision Pro's passthrough is far better than the Quest Pro's, Guttag doesn't think it is suited to long-term use, and some basic safety issues have yet to be resolved.


Source: blog.csdn.net/qingtingwang/article/details/131293752