try_wait_for_frames slows down
We are building a 3D reconstruction app with multiple D415 cameras on one machine. The cameras are connected through a dedicated dual-controller USB card, and we have successfully recorded color and depth to disk with up to 8 cameras with no issues.
The problem we are running into now is that when 3D reconstruction is running on the GPU, the try_wait_for_frames call becomes slower once the 4th camera is added and can no longer keep up with 30 fps. The frame data has already been deep-copied out of the librealsense buffer and is being processed on other threads, so nothing is directly holding on to the librealsense buffer.
We had an Intel engineer help us profile our app with VTune, but we couldn't find a definitive answer. So I'm curious: what could cause this slowdown during the try_wait_for_frames call?
OS Name Microsoft Windows 10 Pro
Version 10.0.17134 Build 17134
System Type x64-based PC
Processor Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz, 2600 Mhz, 18 Core(s), 36 Logical Processor(s)
Installed Physical Memory (RAM) 64.0 GB
Total Physical Memory 63.7 GB
-
Hi Qi Yao, wait_for_frames is best suited to single-camera applications. For multi-camera applications, poll_for_frames should be used in combination with a 'sleep' instruction that controls when the CPU is put to sleep, and for how long, in order to save processing.
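As a rough illustration of the poll-and-sleep pattern described above, here is a minimal sketch that uses only the C++ standard library. `PollFn` and `poll_all` are hypothetical names invented for this example and are not part of librealsense; in a real application each `PollFn` might wrap a non-blocking call such as `poll_for_frames` on one camera's pipeline. The loop sleeps only when no source returned a frame in the current pass.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for one camera: returns true and fills `out`
// when a new frame is ready, returns false immediately otherwise.
using PollFn = std::function<bool(int& out)>;

// Polls every source once per pass until each has delivered `frames_wanted`
// frames. Sleeps for `idle_sleep` only when a full pass produced nothing,
// so the CPU is not spun at 100% while waiting.
// Returns the number of frames collected from each source.
std::vector<int> poll_all(std::vector<PollFn>& sources,
                          int frames_wanted,
                          std::chrono::milliseconds idle_sleep) {
    std::vector<int> counts(sources.size(), 0);
    bool done = false;
    while (!done) {
        bool got_any = false;
        done = true;
        for (std::size_t i = 0; i < sources.size(); ++i) {
            if (counts[i] >= frames_wanted) continue;  // this source is finished
            int frame = 0;
            if (sources[i](frame)) {  // non-blocking check for a new frame
                ++counts[i];
                got_any = true;
            }
            if (counts[i] < frames_wanted) done = false;
        }
        if (!got_any && !done) {
            std::this_thread::sleep_for(idle_sleep);  // nothing ready: yield briefly
        }
    }
    return counts;
}
```

The sleep interval here is an assumption to tune. A short interval (around a millisecond) keeps the added latency small, while a longer one saves more CPU time.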
In the link below, the RealSense SDK Manager explains the difference between the types of for_frames instructions.
-
Hi Marty,
I have read the explanation, but it doesn't seem to address the core of my issue. From my understanding of the thread, the reason to use poll_for_frames is that there is no built-in logic to wait for frames from multiple devices. The D415s are hardware synced, so my implementation for pulling from multiple cameras uses a dedicated std::thread per camera that pulls as quickly as possible, each using a separate pipe that corresponds to one camera. This doesn't create any slowdown when I'm simply saving the buffers to disk as images and previewing them on a monitor, and I was able to use the current implementation with 8 cameras with no problem at all. The slowdown only happens when I start doing 3D reconstruction by having the GPU combine all the color and depth data into a voxel grid. Just a note: processing power is not the issue here. My CPU is less than 50% utilized and my GPU is only about 50% utilized. Also, when using poll_for_frames, the computer's CPU is in charge of timing, so how do I prevent drift between the CPU clock and the camera clock? The CPU's 33 ms fire-and-sleep cycle is not going to be perfectly synced with the camera's 33 ms cycle.
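For readers following along, the thread-per-camera arrangement described above can be sketched roughly as follows, using only the C++ standard library. `Frame`, `FrameQueue`, `capture_loop`, and `run_capture` are hypothetical names for this example, and the capture loop fabricates frames where a real one would call try_wait_for_frames and deep-copy the data. The queue drops the oldest frame when full, which is one possible policy (an assumption, not something stated in this thread) for keeping capture threads from blocking on a slow consumer.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

struct Frame { int camera; int index; };  // hypothetical copied-out frame record

class FrameQueue {
public:
    explicit FrameQueue(std::size_t capacity) : capacity_(capacity) {}

    // Never blocks the capture thread: drops the oldest frame when full.
    void push(Frame f) {
        {
            std::lock_guard<std::mutex> lock(m_);
            if (q_.size() == capacity_) { q_.pop_front(); ++dropped_; }
            q_.push_back(f);
        }
        cv_.notify_one();
    }

    // Blocks until a frame is available; returns false once closed and empty.
    bool pop(Frame& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty() || closed_; });
        if (q_.empty()) return false;
        out = q_.front();
        q_.pop_front();
        return true;
    }

    void close() {
        { std::lock_guard<std::mutex> lock(m_); closed_ = true; }
        cv_.notify_all();
    }

    std::size_t dropped() const {
        std::lock_guard<std::mutex> lock(m_);
        return dropped_;
    }

private:
    mutable std::mutex m_;
    std::condition_variable cv_;
    std::deque<Frame> q_;
    std::size_t capacity_;
    std::size_t dropped_ = 0;
    bool closed_ = false;
};

// One of these runs per camera. A real loop would wait on that camera's
// pipeline and deep-copy the frame before pushing it.
void capture_loop(int camera, int frame_count, FrameQueue& q) {
    for (int i = 0; i < frame_count; ++i) q.push(Frame{camera, i});
}

struct CaptureStats { std::size_t consumed; std::size_t dropped; };

// Starts one capture thread per camera plus a single consumer thread
// (standing in for the reconstruction stage) and reports what happened.
CaptureStats run_capture(int num_cameras, int frames_per_camera, std::size_t capacity) {
    FrameQueue q(capacity);
    std::size_t consumed = 0;
    std::thread consumer([&] { Frame f; while (q.pop(f)) ++consumed; });
    std::vector<std::thread> producers;
    for (int c = 0; c < num_cameras; ++c)
        producers.emplace_back(capture_loop, c, frames_per_camera, std::ref(q));
    for (auto& t : producers) t.join();
    q.close();        // lets the consumer drain the queue and exit
    consumer.join();
    return CaptureStats{consumed, q.dropped()};
}
```

Counting dropped frames per run is one way to check whether the processing side, rather than the capture call itself, is the stage that is falling behind.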
-
If your GPU is not fully utilised but you need processing acceleration, an option may be to add specialist hardware acceleration to your vision application with an Intel Neural Compute Stick 2, which can plug into a USB port on the computer. It is available from the official RealSense online store, in a bundle with a camera or on its own.
https://store.intelrealsense.com/buy-intel-neural-compute-stick-2.html