6D Object Pose Estimation of multiple cardboard boxes
Dear Intel RealSense-Team,
I have to develop a method that detects multiple cardboard boxes in a shipping container and estimates their 6D pose relative to the camera. A friend told me that it would be a good idea to buy an Intel RealSense camera for this project.
But would a RealSense help me with this project? How could I use it to detect the 6D pose of the boxes? I have attached two images as examples of how the boxes are set up.
Thanks for any help.

-
Hi Hamid Rezaie, if you need to detect the pose of multiple boxes simultaneously rather than just one box, then a DOPE (Deep Object Pose Estimation) approach based on ROS may suit your requirements. Such a system uses an RGB image to recognize the object as a box and then calculates its 6DOF pose. A couple of RealSense-compatible example projects of this type are in the links below.
-
Hey MartyG,
first of all, thank you very much for your quick response. Good service! Thanks for the advice, but I already know these examples. The problem is that I would have to create a 3D model, but in the real world there are dozens of cardboard box shapes and sizes. If I follow this tutorial, my system would only detect cardboard boxes that have the same dimensions as the 3D model. Therefore, I'm looking for a method that generalizes to all cardboard boxes.
Or would the network also recognize the 6D pose of boxes of a different size?
-
Object detection systems based on trained objects tend to recognize objects based on their characteristics rather than their precise shape and dimensions. For example, in the RealSense 'Deep Neural Network (DNN)' example linked below, an object detection program classifies both a laptop and a monitor as "tv/monitor".
https://github.com/IntelRealSense/librealsense/tree/master/wrappers/openvino/dnn
Another example of a detection system that recognizes objects of many different sizes is a LEGO brick sorting project:
-
Hello MartyG,
you are completely right. But in the context of 6D pose estimation, you need a correspondence between 2D points and 3D points to obtain the pose. Usually this correspondence is based on a 3D model of the object of interest. There are a lot of good methods for object detection with cardboard boxes. I would obtain a precise bounding box around each cardboard box and then use the corners of the bounding box as the 2D points. Next, I need the same points represented in 3D. As I mentioned, most approaches use a 3D model for this.
But I'm looking for an approach without a 3D model, so that I can detect any cardboard box and not just the one the 3D model is based on.
-
You could use the RealSense SDK to convert a 2D color pixel to a depth pixel, find the depth value of that pixel and then deproject the pixel to a 3D point in the camera's coordinate space.
https://github.com/IntelRealSense/librealsense/issues/9749#issuecomment-919034059
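As a rough illustration of the deprojection step, here is a minimal Python sketch of the pinhole math for an undistorted image, which is the calculation that the SDK's rs2_deproject_pixel_to_point performs when no distortion model applies. The intrinsic values, pixel coordinates and depths are made-up placeholders, and PinholeIntrinsics and deproject_pixel are hypothetical helper names, not SDK APIs. With a real camera you would read the intrinsics from the stream profile and take the depth from a depth frame aligned to the color image.

```python
from dataclasses import dataclass


@dataclass
class PinholeIntrinsics:
    """Hypothetical stand-in for camera intrinsics (values in pixels)."""
    fx: float   # focal length along x
    fy: float   # focal length along y
    ppx: float  # principal point x
    ppy: float  # principal point y


def deproject_pixel(intr, u, v, depth_m):
    """Map pixel (u, v) with a depth in meters to a 3D point (x, y, z)
    in the camera coordinate frame, assuming no lens distortion."""
    x = (u - intr.ppx) / intr.fx
    y = (v - intr.ppy) / intr.fy
    return (depth_m * x, depth_m * y, depth_m)


# Placeholder intrinsics for a 640x480 image (assumed, not measured).
intr = PinholeIntrinsics(fx=600.0, fy=600.0, ppx=320.0, ppy=240.0)

# Four 2D bounding-box corners from an object detector (placeholder values)
# and the depth in meters read at each corner (placeholder values).
corners_px = [(200, 150), (440, 150), (440, 330), (200, 330)]
depths_m = [1.5, 1.5, 1.5, 1.5]

# 3D corner points in camera coordinates, one per 2D corner.
corners_3d = [
    deproject_pixel(intr, u, v, d)
    for (u, v), d in zip(corners_px, depths_m)
]

for (u, v), point in zip(corners_px, corners_3d):
    print((u, v), "->", point)
```

This gives the 3D counterparts of the bounding-box corners without a 3D model of the box. In practice the depth at an exact corner pixel can be missing or belong to the background, so sampling a few pixels just inside the box may be more reliable.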