7.4 Visual-based navigation and computer vision techniques
6 min read • July 30, 2024
Visual-based navigation and computer vision are game-changers for underwater robots. These techniques allow robots to see and understand their surroundings, using cameras to estimate position, avoid obstacles, and map the environment. It's like giving the robot eyes and a brain to navigate the murky depths.
Underwater conditions pose unique challenges for visual systems. Water absorbs light differently, causing color distortion and reduced visibility. Particles in the water scatter light, creating blurry images. Overcoming these hurdles is crucial for effective underwater robot navigation and mapping.
Computer vision for underwater robots
Visual navigation and localization
Visual navigation involves using visual features and landmarks to estimate the robot's position and orientation relative to its surroundings
Visual localization techniques determine the robot's absolute position in a known map or reference frame using visual cues
Feature-based localization methods extract distinctive features from images (corners, edges) and match them against a pre-built map to estimate the robot's pose
Appearance-based localization methods compare the robot's current view with a database of reference images to determine its location
Visual servoing and stereo vision
Visual servoing is a technique that uses visual feedback to control the robot's motion, enabling it to track and follow visual targets or maintain a desired pose relative to an object
Stereo vision systems use two or more cameras to estimate depth and reconstruct 3D information about the underwater environment (obstacle avoidance, navigation)
By comparing the slight differences in the images captured by the cameras, the system can triangulate the positions of objects and create a depth map
Stereo correspondence algorithms, such as block matching or semi-global matching, find matching pixels between the left and right camera images to calculate disparity and depth
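As a rough sketch of the block-matching idea (pure NumPy, far simpler than production stereo code), the disparity for each left-image pixel can be found by sliding a small block along the same row of the right image and keeping the shift with the lowest sum-of-absolute-differences cost:

```python
import numpy as np

def block_matching_disparity(left, right, block=5, max_disp=16):
    """Naive stereo block matching: for each pixel in the left image,
    search along the same row of the right image for the best-matching
    block (sum of absolute differences) and record the shift (disparity)."""
    h, w = left.shape
    half = block // 2
    disp = np.zeros((h, w), dtype=np.float32)
    for y in range(half, h - half):
        for x in range(half + max_disp, w - half):
            ref = left[y - half:y + half + 1, x - half:x + half + 1]
            costs = [np.abs(ref - right[y - half:y + half + 1,
                                        x - d - half:x - d + half + 1]).sum()
                     for d in range(max_disp)]
            disp[y, x] = int(np.argmin(costs))
    return disp

# Synthetic test pair: a bright square shifted 4 px between the views,
# so its edges should be assigned a disparity of 4
left = np.zeros((32, 64), dtype=np.float32)
right = np.zeros_like(left)
left[12:20, 30:38] = 1.0
right[12:20, 26:34] = 1.0
d = block_matching_disparity(left, right, block=5, max_disp=8)
```

With disparity in hand, depth follows from triangulation as `depth = focal_length * baseline / disparity`. Real systems (e.g. semi-global matching) add smoothness constraints to resolve texture-less regions where this naive cost is ambiguous.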
Feature processing in underwater images
Feature extraction and description
Feature extraction involves detecting and describing distinctive visual features in underwater images, such as corners, edges, or regions with unique texture patterns
Corner detection algorithms, like the Harris corner detector or FAST (Features from Accelerated Segment Test), identify points of interest with high contrast changes in multiple directions
Edge detection techniques, such as the Canny edge detector or Sobel operator, identify boundaries and contours in the image based on intensity gradients
Blob detection methods, like the Laplacian of Gaussian (LoG) or the Difference of Gaussians (DoG), identify regions with distinct brightness or color compared to their surroundings
Feature description methods compute compact and distinctive representations of the extracted features, enabling efficient matching and tracking across multiple images
Local feature descriptors, such as SIFT (Scale-Invariant Feature Transform), SURF (Speeded Up Robust Features), or ORB (Oriented FAST and Rotated BRIEF), capture the local appearance and geometry of features in a scale and rotation-invariant manner
Global feature descriptors, like color histograms or texture descriptors, characterize the overall appearance of an image or region
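To make the corner-detection idea above concrete, here is a minimal Harris response computed with NumPy (an illustrative sketch, not a tuned detector): the response is large only where image gradients vary strongly in more than one direction.

```python
import numpy as np

def harris_response(img, k=0.05):
    """Harris corner response: det(M) - k * trace(M)^2, where M is the
    locally smoothed structure tensor of the image gradients. The
    response is high at corners, negative along edges, near zero in
    flat regions."""
    Ix = np.gradient(img, axis=1)   # horizontal gradient
    Iy = np.gradient(img, axis=0)   # vertical gradient

    def box(a):
        # Simple 3x3 box filter (edge-padded) to smooth the tensor entries
        p = np.pad(a, 1, mode='edge')
        return sum(p[i:i + a.shape[0], j:j + a.shape[1]]
                   for i in range(3) for j in range(3)) / 9.0

    Sxx, Syy, Sxy = box(Ix * Ix), box(Iy * Iy), box(Ix * Iy)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2

# A white square on black: corners respond positively, edges negatively,
# and the uniform interior gives zero response
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
R = harris_response(img)
```

The sign pattern of the response is what lets a detector keep corners while rejecting edges, which matter because an edge constrains motion in only one direction.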
Feature matching and tracking
Feature matching algorithms establish correspondences between features across different images, allowing for recognition and tracking of objects or scene elements
Brute-force matching compares each feature descriptor from one image against all descriptors from another image, finding the best matches based on a similarity metric (Euclidean distance)
Approximate nearest neighbor methods, such as FLANN (Fast Library for Approximate Nearest Neighbors), use efficient data structures and algorithms to speed up the matching process
Feature tracking techniques follow the motion and evolution of features across a sequence of images, enabling the estimation of camera motion and scene dynamics
Optical flow methods, like the Lucas-Kanade algorithm or the Farneback algorithm, estimate the apparent motion of pixels between consecutive frames
Keypoint tracking approaches, such as the Kanade-Lucas-Tomasi (KLT) tracker, track the movement of distinctive feature points across the image sequence
By analyzing the trajectories of tracked features, the robot can infer its own motion and the motion of objects in the scene
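The brute-force matching step described above can be sketched in a few lines (toy 2-D descriptors for illustration; real descriptors like SIFT are 128-dimensional). Lowe's ratio test, a standard refinement, keeps a match only when the best candidate is clearly better than the second best:

```python
import numpy as np

def match_descriptors(desc1, desc2, ratio=0.8):
    """Brute-force matching with a ratio test: each descriptor in desc1
    is compared against all of desc2 by Euclidean distance; a match is
    kept only if the best distance is clearly smaller than the second
    best, which filters out ambiguous correspondences."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:
            matches.append((i, int(best)))
    return matches

# desc1[0] has one unambiguous partner in desc2; desc1[1] is equally
# close to two entries, so the ratio test rejects it
desc1 = np.array([[1.0, 0.0], [0.0, 1.0]])
desc2 = np.array([[1.0, 0.1], [5.0, 5.0], [0.1, 1.0], [-0.1, 1.0]])
m = match_descriptors(desc1, desc2)
```

Ambiguity is exactly the failure mode that turbid, repetitive underwater scenes provoke, which is why the ratio test (or FLANN with cross-checking) is standard in this setting.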
Challenges of underwater image processing
Color attenuation and scattering
Underwater images suffer from color attenuation due to the selective absorption of light at different wavelengths in water
Red light is absorbed more strongly than blue light, leading to a bluish or greenish color cast in underwater images
Color correction techniques, such as white balancing or color transfer methods, can be applied to restore the natural colors of the scene
Scattering of light by suspended particles and organic matter in water leads to blurring and loss of contrast in underwater images
Forward scattering causes a veiling effect, reducing the visibility and sharpness of distant objects
Backscattering introduces bright artifacts and noise in the image due to light reflecting off particles back towards the camera
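One of the simplest white-balancing schemes mentioned above is the gray-world assumption: the average scene color should be neutral gray, so each channel is scaled so its mean matches the overall mean. A minimal NumPy sketch (synthetic pixel values, not a full restoration pipeline):

```python
import numpy as np

def gray_world_balance(img):
    """Gray-world white balance: scale each color channel so its mean
    matches the global mean intensity. Underwater, this boosts the
    heavily attenuated red channel relative to blue and green."""
    means = img.reshape(-1, 3).mean(axis=0)
    gain = means.mean() / means          # per-channel gain
    return np.clip(img * gain, 0.0, 1.0)

# Synthetic bluish-green underwater cast (RGB): red strongly attenuated
img = np.ones((4, 4, 3)) * np.array([0.1, 0.5, 0.6])
balanced = gray_world_balance(img)
```

On this uniform test image all three channels are pulled to the same value; on real imagery the assumption only holds approximately, which is why more elaborate color-transfer methods exist.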
Turbidity and non-uniform illumination
Turbidity and variations in water clarity affect the quality and range of visibility in underwater images
Turbid water contains a high concentration of suspended particles (silt, sand), resulting in reduced contrast and limited visual range
Adaptive contrast enhancement techniques, such as histogram equalization or contrast-limited adaptive histogram equalization (CLAHE), can improve the visibility of features in low-contrast regions
Non-uniform illumination and shadows cast by the underwater robot or surrounding structures can create challenging lighting conditions for image processing
Illumination normalization methods, like homomorphic filtering or Retinex-based approaches, can help mitigate the effects of uneven lighting
Shadow detection and removal techniques can be employed to identify and suppress shadow regions in the image
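The core remapping behind histogram equalization (CLAHE adds tiling and a clip limit on top of this, omitted here for brevity) can be sketched as follows: intensities are pushed through the normalized cumulative histogram so they spread across the full range.

```python
import numpy as np

def equalize_histogram(img, levels=256):
    """Global histogram equalization: remap intensities through the
    normalized cumulative histogram so a low-contrast image uses the
    full intensity range. CLAHE applies this per tile with a clip
    limit to avoid amplifying noise."""
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    cdf = hist.cumsum().astype(np.float64)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())   # normalize to [0, 1]
    lut = np.round(cdf * (levels - 1)).astype(np.uint8)  # lookup table
    return lut[img]

# A murky, low-contrast image squeezed into [100, 120] gets stretched
rng = np.random.default_rng(0)
img = rng.integers(100, 121, size=(32, 32), dtype=np.uint8)
out = equalize_histogram(img)
```

The contrast-limited, tiled variant (CLAHE) is preferred underwater because global equalization tends to amplify backscatter noise in empty water regions.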
Refractive distortions
Refraction at the water-air interface can cause geometric distortions in underwater images, particularly when using a flat port camera housing
Dome port housings can minimize refractive distortions by providing a constant viewing angle for the camera
Image rectification and calibration procedures can be applied to correct for geometric distortions caused by refraction
By modeling the refractive effects and applying appropriate transformations, the distortions can be compensated for and the images can be rectified to a more accurate representation of the scene
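To a first approximation, flat-port refraction acts like a radial distortion of the image, so the correction reduces to inverting a radial model. The sketch below (single-parameter model in normalized camera coordinates, an assumption for illustration) inverts the distortion by fixed-point iteration:

```python
import numpy as np

def undistort_points(pts, k1, iterations=10):
    """Iteratively invert a one-parameter radial distortion model
    x_d = x_u * (1 + k1 * r_u^2), a common first-order approximation
    of flat-port refraction, for points in normalized coordinates."""
    und = pts.copy()
    for _ in range(iterations):
        r2 = (und ** 2).sum(axis=1, keepdims=True)
        und = pts / (1.0 + k1 * r2)   # fixed-point update
    return und

# Distort a known point with the forward model, then recover it
true_pt = np.array([[0.30, 0.20]])
k1 = 0.1
r2 = (true_pt ** 2).sum()
distorted = true_pt * (1 + k1 * r2)
recovered = undistort_points(distorted, k1)
```

In practice the distortion coefficients come from an in-water calibration (e.g. imaging a checkerboard through the actual housing), since refraction depends on the port geometry and water properties.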
Visual odometry and SLAM for underwater robots
Visual odometry techniques
Visual odometry estimates the incremental motion of the underwater robot by analyzing the changes in sequential camera images
Feature-based visual odometry extracts and tracks visual features across consecutive frames to estimate the camera's motion
Direct visual odometry methods optimize the photometric error between pixel intensities in adjacent frames to estimate the camera's motion directly
By accumulating the incremental motion estimates over time, the robot can reconstruct its trajectory and create a sparse 3D map of the environment
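The accumulation step can be illustrated in 2D (real underwater odometry is 6-DOF, so this is a deliberately simplified sketch): each frame-to-frame estimate is a small rotation plus a body-frame translation, composed onto the running pose.

```python
import numpy as np

def accumulate_poses(increments):
    """Compose incremental (dtheta, dx, dy) motion estimates, as produced
    by frame-to-frame visual odometry, into a global 2D trajectory.
    Each translation is expressed in the robot's current body frame."""
    x, y, theta = 0.0, 0.0, 0.0
    trajectory = [(x, y, theta)]
    for dtheta, dx, dy in increments:
        # Rotate the body-frame translation into the world frame
        x += dx * np.cos(theta) - dy * np.sin(theta)
        y += dx * np.sin(theta) + dy * np.cos(theta)
        theta += dtheta
        trajectory.append((x, y, theta))
    return trajectory

# Four "move 1 m forward, turn 90 degrees left" increments trace a
# closed square, returning the robot to its start
square = [(np.pi / 2, 1.0, 0.0)] * 4
traj = accumulate_poses(square)
```

Because each increment carries a small error, the accumulated pose drifts over time; this drift is precisely what the loop-closure machinery of full SLAM (next subsection) is designed to correct.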
Visual SLAM approaches
Visual SLAM (Simultaneous Localization and Mapping) constructs a map of the underwater environment while simultaneously estimating the robot's pose within that map
Keyframe-based SLAM approaches select a subset of informative frames (keyframes) to build a sparse map of the environment and perform bundle adjustment to optimize the camera poses and 3D point positions
Filter-based SLAM methods, such as the Extended Kalman Filter (EKF) or the Particle Filter, maintain a probabilistic estimate of the robot's pose and the map using a prediction-update framework
Loop closure detection is a crucial component of visual SLAM that recognizes previously visited locations to correct accumulated drift in the estimated trajectory and map
Bag-of-Words (BoW) techniques create a visual vocabulary of local features and use it to efficiently search for similar images in the database
Appearance-based loop closure methods compare global image descriptors or learned features to identify revisited places
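At the heart of BoW loop-closure detection is a similarity score between word-count histograms. A minimal sketch (tiny 6-word vocabulary with made-up counts; real vocabularies have thousands of words) using cosine similarity:

```python
import numpy as np

def bow_similarity(hist_a, hist_b):
    """Cosine similarity between two bag-of-words histograms: each image
    is summarized by counts of visual words, and a loop closure is
    hypothesized when the similarity exceeds a threshold."""
    a = hist_a / np.linalg.norm(hist_a)
    b = hist_b / np.linalg.norm(hist_b)
    return float(a @ b)

# Word-count histograms over a hypothetical 6-word vocabulary
current   = np.array([9.0, 1.0, 0.0, 4.0, 0.0, 2.0])
revisited = np.array([8.0, 2.0, 0.0, 5.0, 0.0, 1.0])  # same place, new view
new_place = np.array([0.0, 0.0, 7.0, 0.0, 6.0, 0.0])  # different scenery
```

A candidate closure found this way is then verified geometrically (e.g. by checking that the matched features admit a consistent relative pose) before the trajectory is corrected, since underwater scenes with repetitive texture can produce false appearance matches.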
Map management and sensor fusion
Map management strategies are employed to handle the growth and complexity of the underwater map as the robot explores new areas
Keyframe culling removes redundant or less informative keyframes to maintain a compact and efficient map representation
Map segmentation or hierarchical mapping approaches divide the map into smaller, more manageable subsets based on spatial or semantic criteria
Integration of visual SLAM with other sensors, such as inertial measurement units (IMUs) or acoustic sensors, can improve the accuracy and robustness of underwater robot localization
Visual-inertial SLAM fuses visual and inertial measurements to estimate the robot's motion and map the environment, leveraging the complementary characteristics of both sensors
Visual-acoustic SLAM incorporates range measurements from sonar or acoustic beacons to constrain the visual SLAM solution and mitigate scale ambiguity
By combining multiple sensing modalities, the limitations of individual sensors can be overcome, resulting in a more reliable and accurate localization and mapping system for underwater robots
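The core of such fusion, stripped to one dimension, is inverse-variance weighting (the measurement-update step of a Kalman filter). The numbers below are illustrative, not from any real vehicle:

```python
def fuse_estimates(x1, var1, x2, var2):
    """Fuse two independent estimates of the same quantity by
    inverse-variance weighting: the less uncertain estimate gets more
    weight, and the fused variance is smaller than either input."""
    w1 = var2 / (var1 + var2)
    fused = w1 * x1 + (1 - w1) * x2
    fused_var = (var1 * var2) / (var1 + var2)
    return fused, fused_var

# Hypothetical example: visual odometry puts the robot 10.4 m along a
# transect (sigma 0.5 m, drifty), an acoustic beacon says 10.0 m
# (sigma 0.2 m, noisy but drift-free)
x, v = fuse_estimates(10.4, 0.5 ** 2, 10.0, 0.2 ** 2)
```

The fused estimate lands closer to the acoustic fix because it is the more certain of the two, and its variance is lower than either input's, which is the formal sense in which combining modalities overcomes individual sensor limitations.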