ABSTRACT
Autonomous underwater navigation poses a set of challenges that must be resolved before it becomes adequately accurate and reliable. This is particularly critical when human divers work in close collaboration with autonomous underwater vehicles (AUVs). In the absence of global positioning signals underwater, acoustic-based sensors such as LBL (long-baseline), SBL (short-baseline) and USBL (ultra-short-baseline) are commonly used for navigation and localization. In addition to these low-bandwidth, high-latency technologies, cameras and sonars can provide position measurements relative to the vehicle, which can be used as a navigation aid as well as for keeping a safe working distance between the diver and the AUV. While optical cameras are strongly affected by water turbidity and lighting conditions, sonar images are often hard to interpret using conventional image processing methods because of image granulation and high noise levels. This paper focuses on finding a robust and reliable sonar image processing method for detecting and tracking human divers using convolutional neural networks. Machine learning algorithms are having a major impact on computer vision applications but are not always considered for sonar image processing. After presenting commonly used image processing techniques, the paper gives an overview of state-of-the-art machine learning algorithms and explores their performance on a custom sonar image dataset. Finally, the performance of these algorithms is compared on a set of sonar recordings to determine their reliability and applicability in real-time operation.