Leveraging Foundation Models for Selecting the Most Effective Behavior Tree Action in Obstacle Avoidance

Moriconi, Michele <2000>

Author

Moriconi, Michele <2000>

Date

2024-10-15

Date available

2024-11-07

Abstract

The aim of this thesis is to integrate a Foundation Model-based semantic scene understanding pipeline into the Behavior Tree data structure used by the ROS2 navigation stack, Nav2. Obstacle avoidance is a crucial challenge in autonomous navigation, especially in warehouse environments where the robot has to transport goods efficiently and safely from one location to another. This integration allows the robot to understand the environment and the obstacles in it semantically, and thus to select the best action for the given scenario. The pipeline is composed of two modules: the perception module and the reasoning module. The perception module processes the sensor data and, using a Vision Language Model, generates a description of the obstacles in the robot’s path. The reasoning module processes the description generated by the perception module and selects the best action for the robot to take. The whole pipeline is integrated into a Behavior Tree, which handles the navigation of the robot and selects the sub-tree to execute based on the response generated by the reasoning module. A novel dataset was created to evaluate the performance of the pipeline. The dataset consists of fifty scenarios, each associated with the correct action to be selected. The pipeline was evaluated end to end: it selects the correct action 74% of the time using the obstacle descriptions generated by the perception module and 92% of the time using human-written descriptions. Future work will focus on improving the performance of the pipeline by fine-tuning the models used in the perception and reasoning modules, designing more modules to be integrated into the pipeline, and creating a richer dataset that covers more scenarios during evaluation.
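The two-module flow and the two evaluation settings described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the action labels, the sub-tree names, the keyword rules standing in for the reasoning model, the canned strings standing in for the Vision Language Model, and the four toy scenarios. None of it is taken from the thesis, its dataset, or the Nav2 API.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Scenario:
    """One evaluation case: an input, a human description, and the expected action."""
    image_id: str
    human_description: str
    correct_action: str


# Hypothetical mapping from an action label to a sub-tree name (not Nav2 identifiers).
SUBTREES = {
    "wait": "WaitSubtree",
    "replan": "ReplanSubtree",
    "ask_for_help": "AskForHelpSubtree",
}

# Canned descriptions standing in for the output of a Vision Language Model.
CANNED_DESCRIPTIONS = {
    "img_person": "a person standing in the aisle",
    "img_pallet": "a wooden pallet on the floor",
    "img_spill": "a wet floor",
    "img_worker": "a box in the aisle",  # deliberately misses the person
}


def perceive(image_id: str) -> str:
    """Perception stand-in: return a textual description of the obstacle."""
    return CANNED_DESCRIPTIONS.get(image_id, "unknown obstacle")


def reason(description: str) -> str:
    """Reasoning stand-in: keyword rules in place of a language model."""
    text = description.lower()
    if "person" in text:
        return "wait"
    if "pallet" in text or "box" in text:
        return "replan"
    return "ask_for_help"


def select_subtree(action: str) -> str:
    """Pick the sub-tree to execute for the chosen action."""
    return SUBTREES[action]


def accuracy(scenarios: Sequence[Scenario],
             describe: Callable[[Scenario], str]) -> float:
    """Fraction of scenarios where the selected action matches the label."""
    hits = sum(reason(describe(s)) == s.correct_action for s in scenarios)
    return hits / len(scenarios)


# Toy scenarios for illustration only; the thesis dataset has fifty.
SCENARIOS = [
    Scenario("img_person", "a person is standing in the aisle", "wait"),
    Scenario("img_pallet", "a pallet blocks the path", "replan"),
    Scenario("img_spill", "liquid spilled on the floor", "ask_for_help"),
    Scenario("img_worker", "a person carrying a box", "wait"),
]

if __name__ == "__main__":
    end_to_end = accuracy(SCENARIOS, lambda s: perceive(s.image_id))
    with_human = accuracy(SCENARIOS, lambda s: s.human_description)
    print(f"perception descriptions: {end_to_end:.2f}")
    print(f"human descriptions:      {with_human:.2f}")
```

Swapping only the `describe` argument mirrors the comparison reported in the abstract: the reasoning step is held fixed while the source of the obstacle description changes. On a fifty-scenario dataset, the reported 74% and 92% correspond to 37 and 46 correctly selected actions, respectively.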

Type

info:eu-repo/semantics/masterThesis