How does the Box Dimensioner Multicam work exactly?
I've been looking into the Box Dimensioner Multicam tutorial code that Intel provides, but I am confused about how exactly it works. Does it define a plane after calibrating with the chessboard, and is that how it knows where to place the box around the object? Is there any open-source code available for drawing a physical plane that is constantly in front of the camera's view? Lastly, does the script use an object detection framework, such as PyTorch or TensorFlow, to detect the objects in front of the camera?
-
Hi Edenclaire8, the box_dimensioner_multicam.py example program does not use an object detection algorithm. Instead, it builds a combined pointcloud from the depth data of each attached camera, then uses coordinates on the RGB color image to calculate and draw a green bounding box around the pointcloud, so that the box fits the object on the chessboard in the RGB image.
The section of code that handles this is highlighted at the link below.
Regarding a plane that is constantly represented in the center of the camera's view, the RealSense SDK's open-source C++ Depth Quality Tool may fit your description. It uses a plane-fit algorithm to establish the plane without needing an IMU gyroscopic component.
https://github.com/IntelRealSense/librealsense/tree/master/tools/depth-quality

The plane-fit algorithm that the Depth Quality Tool uses is based on the logic described at the link below.
http://www.ilikebigbits.com/2015_03_04_plane_from_points.html
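To give a feel for what a plane fit of this kind involves, here is a minimal Python sketch of a covariance-based fit in the spirit of that article. It is an illustration only, not the Depth Quality Tool's actual code, and the function name plane_from_points is a hypothetical one chosen for this example.

```python
import math

def plane_from_points(points):
    """Fit a plane to a list of (x, y, z) points.

    Returns (centroid, unit_normal), or None if the points do not
    span a plane (fewer than 3 points, or all points collinear).
    Illustrative sketch only; not taken from the RealSense SDK.
    """
    n = len(points)
    if n < 3:
        return None

    # The fitted plane passes through the centroid of the points.
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n

    # Accumulate the (unnormalized) covariance terms about the centroid.
    xx = xy = xz = yy = yz = zz = 0.0
    for x, y, z in points:
        rx, ry, rz = x - cx, y - cy, z - cz
        xx += rx * rx
        xy += rx * ry
        xz += rx * rz
        yy += ry * ry
        yz += ry * rz
        zz += rz * rz

    # Pick the best-conditioned axis via the largest 2x2 determinant.
    det_x = yy * zz - yz * yz
    det_y = xx * zz - xz * xz
    det_z = xx * yy - xy * xy
    det_max = max(det_x, det_y, det_z)
    if det_max <= 0.0:
        return None  # degenerate input, no unique plane

    if det_max == det_x:
        nx, ny, nz = det_x, xz * yz - xy * zz, xy * yz - xz * yy
    elif det_max == det_y:
        nx, ny, nz = xz * yz - xy * zz, det_y, xy * xz - yz * xx
    else:
        nx, ny, nz = xy * yz - xz * yy, xy * xz - yz * xx, det_z

    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (cx, cy, cz), (nx / length, ny / length, nz / length)
```

Passing in 3D points sampled from a flat surface returns a point on the plane and a unit normal. The sign of the normal is arbitrary, so flip it if you need it to face the camera.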
-
Thanks for the thorough response, MartyX Grover!
Just to clarify, if the bounding box is drawn on the color frame using the point cloud from the depth data, is the program drawing a box around the object closest to the camera? I'm confused about how it draws the bounding boxes without using object detection.
-
The bounding box is calculated in the measurement_task.py script of the project. It calculates a cumulative pointcloud from the pointclouds of all attached cameras and then works out the coordinates of the corner points (upper and lower) on the color image. The corner points are expressed in the coordinate system of the color sensor rather than the depth sensor.
The script uses a distance threshold of 5 centimeters from the chessboard as the area where useful points are found. This is likely how the object on the chessboard is prioritized over the flat chessboard surface when calculating the cumulative pointcloud, avoiding the need for object detection.
Once it has found those corner points, it calculates the length, width and height of the bounding box. The measurement values are calculated in the real-world 3D 'world coordinates' of the pointcloud. Finally, the script draws the bounding box as an overlay on the color image.
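As a rough illustration of the idea (not the code in measurement_task.py), the Python sketch below separates object points from the board with a height threshold and then measures an axis-aligned box around what remains. It rests on assumptions of mine: the points are already expressed in a chessboard-aligned frame where z is the height above the board in meters, points no higher than the threshold are treated as board surface or noise, and the box is axis-aligned for simplicity. The name box_dimensions and its parameters are hypothetical.

```python
def box_dimensions(points, height_threshold=0.05):
    """Estimate (length, width, height) of an object resting on a board.

    points: iterable of (x, y, z) in meters, in an assumed board-aligned
            frame where the board lies in the plane z = 0 and z grows
            upward away from the board.
    height_threshold: points at or below this height are discarded as
            board surface or noise (assumed interpretation of the
            distance threshold described above).
    Returns None if no points remain after filtering.
    """
    # Keep only points that stand clearly above the board plane.
    above = [p for p in points if p[2] > height_threshold]
    if not above:
        return None

    xs = [p[0] for p in above]
    ys = [p[1] for p in above]
    zs = [p[2] for p in above]

    # Axis-aligned extents in the board plane (a simplification).
    length = max(xs) - min(xs)
    width = max(ys) - min(ys)
    # Height is measured from the board plane up to the highest point.
    height = max(zs)
    return length, width, height
```

Because the object is isolated purely by its geometry relative to the calibrated board, no object detector is involved. Anything that rises above the board by more than the threshold is treated as part of the object.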