Improving Accuracy of Single-Camera Box Dimensioner Based on Multi-Camera Implementation

I've adapted Intel's multi-camera box dimensioner code to work with a single RealSense D435 camera, but I'm experiencing accuracy issues. I'd appreciate guidance on optimizing my implementation.

My single-camera implementation uses:
- D435 camera mounted facing downward above the measurement area
- Checkerboard for plane calibration
- Detection of points above the calibrated plane
- Percentile-based (5th/95th) dimension calculation
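
For reference, the steps above can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the actual code from my project: the function names are hypothetical, the plane points are assumed to come from deprojected checkerboard corners, the camera is assumed to sit at the origin of the point cloud, and `depth_threshold` stands in for the DEPTH_THRESHOLD parameter.

```python
import numpy as np

def fit_plane_svd(points):
    """Fit a plane to an (N, 3) array; return (centroid, unit normal)."""
    centroid = points.mean(axis=0)
    # The right singular vector with the smallest singular value is the
    # direction of least variance, i.e. the plane normal.
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    return centroid, vt[-1] / np.linalg.norm(vt[-1])

def plane_frame(normal):
    """Rotation matrix whose rows are two in-plane axes and the normal."""
    helper = np.array([1.0, 0.0, 0.0])
    if abs(normal @ helper) > 0.9:  # avoid a helper nearly parallel to the normal
        helper = np.array([0.0, 1.0, 0.0])
    x_axis = np.cross(helper, normal)
    x_axis /= np.linalg.norm(x_axis)
    y_axis = np.cross(normal, x_axis)
    return np.vstack([x_axis, y_axis, normal])

def box_dimensions(cloud, plane_points, depth_threshold=0.01, lo=5, hi=95):
    """Return (length, width, height) in the units of the cloud, or None."""
    centroid, normal = fit_plane_svd(plane_points)
    # Assumption: the camera is at the origin, so flip the normal to point
    # toward it. Heights above the plane are then positive.
    if normal @ (-centroid) < 0:
        normal = -normal
    local = (cloud - centroid) @ plane_frame(normal).T
    above = local[local[:, 2] > depth_threshold]
    if len(above) == 0:
        return None
    length = np.percentile(above[:, 0], hi) - np.percentile(above[:, 0], lo)
    width = np.percentile(above[:, 1], hi) - np.percentile(above[:, 1], lo)
    height = np.percentile(above[:, 2], hi)
    return length, width, height
```

One property of this estimator worth noting: if the points on a face are spread roughly uniformly, the span between the 5th and 95th percentiles covers only about 90% of the true extent, so length and width are biased low unless the result is corrected or tighter percentiles are used.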

Key Differences Between My Code and Intel's Reference Implementation:
1. Camera Setup:
- My code: Single Intel RealSense D435 camera mounted facing downward
- Reference code: Multiple cameras working together, likely mounted at different angles

2. Calibration Method:
- My code: Uses OpenCV chessboard detection with SVD-based plane fitting
- Reference code: Uses Kabsch algorithm for pose estimation across multiple cameras

3. Segmentation Approach:
- My code: Creates a mask for points above the detected plane (DEPTH_THRESHOLD parameter)
- Reference code: Uses multiple viewpoints to generate a more complete point cloud

4. Coordinate System Transformation:
- My code: Creates a transformation matrix based on the plane normal
- Reference code: Uses the Kabsch method's inverse transformation

5. Dimension Calculation:
- My code: Uses percentiles (5th/95th) of the point cloud
- Reference code: Calculates measurements from a cumulative point cloud from multiple views

6. Post-Processing:
- My code: Uses median filtering over time for stabilization
- Reference code: Likely has better accuracy due to inherent redundancy of multiple viewpoints
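
To make the calibration difference in items 2 and 4 concrete, here is a minimal sketch of the Kabsch algorithm and the inverse of the rigid transform it returns. It is a generic NumPy version written for illustration, not the reference project's own code, and the function names are hypothetical.

```python
import numpy as np

def kabsch(source, target):
    """Rigid transform (R, t) that best maps source points onto target points.

    Both inputs are (N, 3) arrays of corresponding points; the result
    minimises the sum of squared distances |R @ s + t - g| over all pairs.
    """
    src_c = source.mean(axis=0)
    tgt_c = target.mean(axis=0)
    # Cross-covariance of the centred point sets.
    h = (source - src_c).T @ (target - tgt_c)
    u, _, vt = np.linalg.svd(h)
    # Guard against a reflection: force det(R) = +1.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    translation = tgt_c - rotation @ src_c
    return rotation, translation

def invert_rigid(rotation, translation):
    """Inverse of p = R @ s + t, i.e. s = R.T @ p - R.T @ t."""
    return rotation.T, -rotation.T @ translation
```

Unlike the SVD plane fit, which recovers only the plane normal and an offset, Kabsch needs point correspondences (for example, the checkerboard corners in both frames) and returns a full pose including the in-plane rotation.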

Issues Encountered
- Measurements aren't as accurate as expected
- Results vary with box orientation and position
- Edge detection sometimes misses portions of the box
- The downward-facing camera sometimes has difficulty detecting the side faces of boxes
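
One possible contributor to the orientation dependence, assuming extents are measured along fixed axes of the plane frame, is that an axis-aligned span grows when the box is rotated. The sketch below illustrates this on a synthetic footprint and compares it with extents taken along the principal axes of the points. The function names are hypothetical, and the principal-axis variant is only an illustration; it is ill-defined for a square footprint.

```python
import numpy as np

def rotated_footprint(length, width, angle_deg, n=41):
    """Grid of 2D points covering a length x width rectangle, rotated in-plane."""
    gx, gy = np.meshgrid(np.linspace(-length / 2, length / 2, n),
                         np.linspace(-width / 2, width / 2, n))
    pts = np.column_stack([gx.ravel(), gy.ravel()])
    a = np.radians(angle_deg)
    rot = np.array([[np.cos(a), -np.sin(a)],
                    [np.sin(a), np.cos(a)]])
    return pts @ rot.T

def axis_aligned_extents(xy):
    """Span of the points along the fixed x and y axes."""
    return xy.max(axis=0) - xy.min(axis=0)

def pca_extents(xy):
    """Span of the points along their principal axes, longest first."""
    centered = xy - xy.mean(axis=0)
    # Eigenvectors of the 2x2 covariance give the principal directions.
    _, vecs = np.linalg.eigh(np.cov(centered.T))
    projected = centered @ vecs
    extents = projected.max(axis=0) - projected.min(axis=0)
    return np.sort(extents)[::-1]
```

For a 0.3 m by 0.2 m footprint rotated by 30 degrees, the axis-aligned spans come out at roughly 0.36 m and 0.32 m, while the principal-axis spans stay at 0.3 m and 0.2 m.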

My Questions
1. What are the key considerations when adapting the multi-camera approach to single-camera, especially with a downward-facing mounting position?
2. Are there specific camera settings or filters I should adjust for the D435 to improve measurement accuracy in a top-down configuration?
3. Would a different segmentation approach work better for single-camera dimensioning?
4. Is there a better way to calculate dimensions from a single viewpoint than using percentiles?
5. Are there specific limitations I should be aware of when using just one D435 camera for this application?
6. Could optimizing the laser power or modifying the DEPTH_THRESHOLD parameter improve accuracy?
7. Would implementing a different plane fitting algorithm improve results?
8. Are there specific recommendations for downward-facing camera setups that differ from side-angle mounting?

Thank you for your assistance.
Onkar Chaudhari

MartyG

May 05, 2025 17:45

Hi Onkar, it should not be necessary to adapt the box_dimensioner_multicam project for use with a single camera. Despite the project's name, it can work fine with a single camera placed at the corner of the checkerboard in an elevated position, pointing diagonally downwards towards the top and sides of the box. I would strongly recommend that camera position rather than placing the camera directly above the box and pointing straight down.

If your project needs to measure the box from directly above, then you will need to take account of the inaccuracy that can be introduced by the box casting a shadow on the floor, as described here:

https://github.com/IntelRealSense/librealsense/issues/12605#issuecomment-1904381650

The ideal 'sweet spot' mounting distance of the camera from the checkerboard is 1 meter. Placing it nearer or further away (e.g. 0.5 m or 1.5 m) can result in problems, such as inaccurate values or the green bounding-box overlay not aligning correctly with the box being measured.

Adjusting settings such as laser power, or using a Visual Preset camera configuration, does not have a significant effect on the success of measurement. The most important factors are the distance from the checkerboard and positioning the camera so that it points diagonally downwards from above.

I am not aware of a previous attempt at using a plane fitting algorithm with this tool, so there is not an existing precedent to refer to regarding such an adaptation, unfortunately.

If you use a custom checkerboard whose squares are a different size from those of the official checkerboard image supplied with the box_dimensioner_multicam project, then you will need to adjust a few parameters in the box_dimensioner_multicam.py script relating to square size values.
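
As a rough illustration of why the square size matters, the sketch below builds the metric positions of the inner checkerboard corners from a square size. The function and parameter names are hypothetical and are not the names used in the script; check the script itself for the actual values to change.

```python
import numpy as np

def chessboard_object_points(corners_x, corners_y, square_size_m):
    """Metric 3D positions of inner chessboard corners, with the board at z = 0.

    corners_x and corners_y are the counts of inner corners along each side,
    and square_size_m is the edge length of one square in meters.
    """
    grid = np.mgrid[0:corners_x, 0:corners_y].T.reshape(-1, 2)
    points = np.zeros((corners_x * corners_y, 3))
    points[:, :2] = grid * square_size_m
    return points

# Example: a 6 x 9 corner grid with 25 mm squares (illustrative values only).
board = chessboard_object_points(6, 9, 0.025)
```

If the square size given to the software does not match the printed board, these model points no longer agree with the distances the depth camera observes between the same corners, so the calibration is fitted to inconsistent data.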

If you would prefer not to use a checkerboard at all, there is a discussion at the link below about modifying the box_dimensioner_multicam.py project to use the intrinsic parameters of a single camera, instead of a checkerboard, to calculate this information.

https://github.com/IntelRealSense/librealsense/issues/10054
