Sunday, January 4

Intel RealSense Architecture

Introduction

The Intel® RealSense™ SDK is architecturally different from its predecessor, the Intel® Perceptual Computing SDK. If you’re a developer who used the Intel Perceptual Computing SDK for app development, you’ll quickly see that the new SDK provides an enhanced programming model for accessing the modalities through several of the popular application development frameworks. In this article, we highlight some of the key changes that you can expect to see in the Intel RealSense SDK.

Architecture Overview

The SDK Core, the I/O modules, and the algorithm modules constitute the foundation of the SDK stack. The SDK Core helps manage application pipeline execution and the I/O modules (camera). The algorithm modules constitute the middleware for hand tracking, gesture recognition, face detection, voice detection, and other modalities. They expose interfaces for application development through several app development frameworks: C++, C#, Unity* software, Java*, Processing*, and others. Intel RealSense SDK applications sit on top of this stack (Figure 1).

Figure 1. Intel® RealSense™ SDK Architecture
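The layering described above can be pictured with a small model. This is only an illustrative Python sketch under stated assumptions, not the SDK's API: the names CameraIO, AlgorithmModule, Core, and run_frame are hypothetical, and frames are plain dictionaries rather than camera samples.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


class CameraIO:
    """Hypothetical stand-in for the I/O module: returns one fake frame per call."""

    def __init__(self) -> None:
        self.frame_index = 0

    def read_frame(self) -> Dict[str, int]:
        self.frame_index += 1
        return {"index": self.frame_index}


@dataclass
class AlgorithmModule:
    """Hypothetical middleware module (hand tracking, face detection, ...)."""

    name: str
    process: Callable[[Dict[str, int]], str]


@dataclass
class Core:
    """Hypothetical core: owns the I/O module and drives the pipeline."""

    io: CameraIO
    modules: List[AlgorithmModule] = field(default_factory=list)

    def run_frame(self) -> Dict[str, str]:
        # The core pulls one frame from the I/O module...
        frame = self.io.read_frame()
        # ...and hands that same frame to every enabled algorithm module.
        return {m.name: m.process(frame) for m in self.modules}


# The application sits on top of the stack and only talks to the core.
core = Core(
    io=CameraIO(),
    modules=[
        AlgorithmModule("hand", lambda f: f"hand data for frame {f['index']}"),
        AlgorithmModule("face", lambda f: f"face data for frame {f['index']}"),
    ],
)
results = core.run_frame()
```

In this model the application never touches the camera directly, which mirrors the role the article assigns to the SDK Core.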
For app developers familiar with the Intel Perceptual Computing SDK, this improvement will be very obvious. The earlier SDK’s interfaces could only be used with the Unity and Java frameworks (Figure 2).

Figure 2. Intel® Perceptual Computing SDK Architecture
Most of the SDK functionality was available to C++/C# developers, which put developers using Unity and other application development frameworks at an undue disadvantage. The Intel RealSense SDK addresses this limitation by providing uniform access to the core and middleware capabilities through interfaces designed specifically for each framework. In Figure 3, you can see how the .dll support across all frameworks has been completely redesigned. The C++/C# interface is now available through a PInvoke interface, in place of the C++/CLI C# wrapper used in the Intel Perceptual Computing SDK. On the Unity side, the PXCUPipeline-based interface of the former SDK is replaced by a PInvoke-based C# interface. For the Java and Processing frameworks, the PXCUPipeline-based interface is replaced by a completely redesigned JNI wrapper.

Figure 3. Uniform API access across app development frameworks
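One way to picture "uniform access" is to compare which capabilities each framework binding exposes. The Python sketch below is a hypothetical model, not SDK code: the capability names come from the article, but the function names and the reduced subset assigned to the older Unity, Java, and Processing bindings are assumptions chosen purely for illustration.

```python
from typing import Dict, FrozenSet

# Capabilities named in the article; everything else here is a hypothetical model.
CORE_CAPABILITIES: FrozenSet[str] = frozenset(
    {"hand tracking", "gesture recognition", "face detection", "voice detection"}
)

FRAMEWORKS = ["C++", "C#", "Unity", "Java", "Processing"]


def old_style_bindings() -> Dict[str, FrozenSet[str]]:
    """Earlier model: some frameworks go through a reduced pipeline interface.

    The exact subset below is an assumption used only to illustrate the gap.
    """
    reduced = frozenset({"gesture recognition", "face detection"})
    return {
        "C++": CORE_CAPABILITIES,
        "C#": CORE_CAPABILITIES,
        "Unity": reduced,
        "Java": reduced,
        "Processing": reduced,
    }


def uniform_bindings() -> Dict[str, FrozenSet[str]]:
    """Redesigned model: every framework wrapper forwards to the same core."""
    return {name: CORE_CAPABILITIES for name in FRAMEWORKS}


def missing_capabilities(
    bindings: Dict[str, FrozenSet[str]]
) -> Dict[str, FrozenSet[str]]:
    """Report, per framework, the core capabilities its wrapper does not expose."""
    return {
        name: CORE_CAPABILITIES - caps
        for name, caps in bindings.items()
        if caps != CORE_CAPABILITIES
    }
```

With the uniform model, missing_capabilities returns an empty dictionary, which is the property the redesigned wrappers are meant to provide.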

Simplified Class Hierarchy


Figure 4. Intel® RealSense™ SDK Class Hierarchy
The structure of interfaces in the Intel Perceptual Computing SDK was extremely hierarchical. Performing simple tasks required a long sequence of initialization, configuration, and data retrieval operations. This hierarchical structure is now flatter in the Intel RealSense SDK, allowing for easier access to its modalities and capabilities.
As shown in Figure 4 above, PXC[M]SenseManager replaces the UtilPipeline class of the SDK core and is responsible for organizing and managing the execution pipeline. All camera devices and streams are managed by PXC[M]CaptureManager, which replaces the UtilCapture interface of the Intel Perceptual Computing SDK. The PXC[M]Capture interface also simplifies depth access: where developers had to read two separate streams for depth and vertices data in the Intel Perceptual Computing SDK, the Intel RealSense SDK delivers depth and vertices data in the same stream. PXC[M]Face3D manages the face tracking module, replacing the PXC[M]FaceAnalysis interface, and PXC[M]HandAnalysis performs gesture and hand tracking, replacing the PXC[M]Gesture interface.
Please refer to the next section, API Migration Guide, for more information about the capabilities of these modules.
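To make the relationship between depth and vertices data concrete: a vertex is the 3D camera-space point behind a depth pixel. The Python sketch below back-projects a depth map with the standard pinhole camera model. It is not the SDK's implementation; the function name depth_to_vertices and the intrinsics fx, fy, cx, cy are assumed values for illustration only.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]


def depth_to_vertices(
    depth_mm: List[List[float]], fx: float, fy: float, cx: float, cy: float
) -> List[List[Vertex]]:
    """Back-project a depth map (millimetres) into camera-space vertices.

    Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy, Z = depth,
    where (u, v) is the pixel column and row.
    """
    vertices = []
    for v, row in enumerate(depth_mm):
        vertices.append(
            [((u - cx) * z / fx, (v - cy) * z / fy, z) for u, z in enumerate(row)]
        )
    return vertices


# Tiny 2x2 depth map with made-up intrinsics; the principal point is pixel (0, 0).
depth = [
    [1000.0, 1000.0],
    [500.0, 0.0],
]
verts = depth_to_vertices(depth, fx=500.0, fy=500.0, cx=0.0, cy=0.0)
```

Because each vertex is computed from its own depth pixel, the two kinds of data line up one to one, which is why serving them together is convenient for the application.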

API Migration Guide

In addition to the redesigned SDK stack, you will find that more than 50% of the Intel Perceptual Computing SDK APIs are different in the Intel RealSense SDK. Most of these changes stem from the 3D element of the Intel RealSense SDK, which enhances the way existing modalities such as hand and face tracking behave. Some of these enhancements are described below.

Face Tracking Module

In terms of capability, the face tracking module now gives 78 landmark points with pose detection values in 2D and 3D, compared to a maximum of 7 landmark points in the Intel Perceptual Computing SDK. The introduction of depth adds to the robustness of the data obtained. In addition, the Intel Perceptual Computing SDK required separate configurations for face detection, landmark detection, and face recognition, which made the entire process of facial analysis very cumbersome. All of these aspects have led to a redesign of the face analysis module and its associated APIs. The PXC[M]FaceAnalysis module has been replaced by the PXC[M]Face3D module, which provides a flat structure. The module now needs to be configured just once before obtaining the face detection, landmark detection, pose detection, and face recognition values (see Figure 5).

Figure 5. Comparison of face analysis module between Intel® Perceptual Computing SDK and Intel® RealSense™ SDK

Hand Tracking Module

The hand tracking module has been significantly enhanced in the Intel RealSense SDK. Compared to the 7-point hand data and 10 standard gestures of the Intel Perceptual Computing SDK, the Intel RealSense SDK now provides 22 data points, finger identification, left and right hand identification with orientation and rotation parameters for 3D interaction, and a set of standard gestures. The PXC[M]Gesture interface of the Intel Perceptual Computing SDK is replaced by the PXC[M]HandAnalysis interface for easier hand data access.
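As an illustration of what per-point 3D hand data makes possible, the Python sketch below detects a simple pinch by measuring the distance between two fingertips. It is a hypothetical model, not the SDK's gesture recognizer: the HandData container, the joint names thumb_tip and index_tip, and the 2 cm threshold are all assumptions.

```python
import math
from dataclasses import dataclass
from typing import Dict, Tuple

Point3D = Tuple[float, float, float]


@dataclass
class HandData:
    """Hypothetical container for one tracked hand.

    A full hand would carry one entry per tracked point (22 in the article);
    only the two joints needed for this example are filled in below.
    """

    side: str                   # "left" or "right"
    joints: Dict[str, Point3D]  # joint name -> position in metres


def is_pinching(hand: HandData, threshold_m: float = 0.02) -> bool:
    """Return True when thumb tip and index tip are closer than threshold_m."""
    thumb = hand.joints["thumb_tip"]
    index = hand.joints["index_tip"]
    return math.dist(thumb, index) < threshold_m


# Made-up positions: fingertips about 1 cm apart versus about 8 cm apart.
pinched = HandData("right", {"thumb_tip": (0.10, 0.0, 0.40),
                             "index_tip": (0.11, 0.0, 0.40)})
open_hand = HandData("left", {"thumb_tip": (0.10, 0.0, 0.40),
                              "index_tip": (0.18, 0.0, 0.40)})
```

Here is_pinching(pinched) evaluates to True and is_pinching(open_hand) to False; a real application would also smooth the joint positions over several frames before deciding.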
Table 1 below summarizes most of the major improvements made to the Intel RealSense SDK. Developers are encouraged to read the SDK Reference manual accompanying the Beta SDK download for more detailed information on each of the APIs within these interfaces.
Table 1. Comparison of Intel® Perceptual Computing SDK and Intel® RealSense™ SDK

Summary

The Intel RealSense SDK provides many advantages over the previous-generation Intel Perceptual Computing SDK. Most of the existing modalities, such as face and hand tracking, have been enhanced, and the API access mechanism has been improved for most of the supported app development frameworks. The consistency in API access, together with the improved middleware, creates a very compelling platform for app developers to explore the world of computing using our senses!
