
Visual object recognition

Chapter description

In this tutorial, we will return to the subject of the camera. We will learn a simple tool that lets the robot quickly learn patterns in an image and use them to perform simple actions. Then we will modify the code a bit so that the robot is able to follow the recognized object, using the position of the object in the camera image to track it. As image processing is quite computationally complex, the speed of the algorithm will be low. Don't worry, because in the next chapter we'll show you an example of how to offload some of the calculations to our local hardware, which may speed things up.

This tutorial is dedicated to the ROSbot XL. However, most of the information and commands are compatible with all ROSbots, and most of the changes are simple and come down to launching the appropriate container, depending on your hardware configuration.


You can find the repository containing the final result after completing all the ROS 2 Tutorials here.

Introduction

Objects can be recognized by the robot using a vision system. The algorithm detects characteristic features of the image, such as points, lines, edges, colors and their relative positions. Object recognition processing consists of two steps. The first is training and should be done before the main operation of the robot. During this stage, the object of interest is presented to the vision system, and the algorithm automatically collects a set of features. This data is stored sequentially as a pattern. The second stage is the actual recognition, which is performed continuously while the robot is running. Each frame from the camera is processed, image features are extracted and compared with the data set in memory. If enough features match the pattern, the object is recognized.
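To make these two stages more concrete, below is a minimal, standalone OpenCV sketch of the same idea. It is only an illustration of the mechanism and is not part of the tutorial code; object.png and scene.png are placeholder file names.

#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main()
{
  // Stage 1 - teaching: extract features of the reference object once and keep them as a pattern.
  cv::Mat object_img = cv::imread("object.png", cv::IMREAD_GRAYSCALE);
  // Stage 2 would normally use a live camera frame; here a saved scene image stands in for it.
  cv::Mat scene_img = cv::imread("scene.png", cv::IMREAD_GRAYSCALE);
  if (object_img.empty() || scene_img.empty()) {
    return 1;
  }

  auto orb = cv::ORB::create();
  std::vector<cv::KeyPoint> obj_kp, scene_kp;
  cv::Mat obj_desc, scene_desc;
  orb->detectAndCompute(object_img, cv::noArray(), obj_kp, obj_desc);

  // Stage 2 - recognition: extract features of the current frame and compare them with the pattern.
  orb->detectAndCompute(scene_img, cv::noArray(), scene_kp, scene_desc);
  cv::BFMatcher matcher(cv::NORM_HAMMING, true);  // ORB descriptors are binary
  std::vector<cv::DMatch> matches;
  matcher.match(obj_desc, scene_desc, matches);

  // If enough features match, the object is considered recognized.
  std::cout << "matching features: " << matches.size() << std::endl;
  return 0;
}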

In this tutorial we will use the find_object_2d node from the find_object_2d package for both training and recognition. To install this package type:

"husarion@husarion:~$
sudo apt-get install ros-$ROS_DISTRO-find-object-2d

Teaching objects

Anything can be an object to recognize, but remember that the more edges and contrasting colors it has, the easier it will be to recognize. A piece of paper with something drawn on it will be enough for this tutorial.

To run this tutorial you should be familiar with starting the camera. Based on this knowledge we create a new launch file, which contains the Gazebo or Astra camera launching command and the newly installed find_object_2d. The find_object_2d node subscribes to the /image topic by default, so we need to remap it to the topic published by our camera, e.g. /camera/color/image_raw.

You should create a new image_recognition.launch.py file. You can use the launch file below:

~/ros2_ws/src/tutorial_pkg/launch/image_recognition.launch.py
from launch import LaunchDescription
from launch_ros.actions import Node
from launch_ros.parameter_descriptions import ParameterValue
from launch.actions import DeclareLaunchArgument
from launch.substitutions import LaunchConfiguration, PythonExpression
from ament_index_python.packages import get_package_share_directory

def generate_launch_description():
    teach = LaunchConfiguration('teach', default='false')

    teach_arg = DeclareLaunchArgument(
        'teach',
        default_value='false',
        description='Flag to enable teaching mode'
    )

    # Load previously saved objects only in recognition mode (teach:=false)
    objects_path = PythonExpression([
        "'' if '", teach, "' == 'true' else '",
        get_package_share_directory('tutorial_pkg'), "/img_data/'"
    ])

    # Object detection
    find_object_2d = Node(
        package='find_object_2d',
        executable='find_object_2d',
        name='find_object_2d',
        remappings=[('image', '/camera/color/image_raw')],
        parameters=[{
            'gui': teach,
            'objects_path': ParameterValue(objects_path, value_type=str)
        }]
    )

    return LaunchDescription([
        teach_arg,
        find_object_2d
    ])

Let's start teaching objects:

On ROSbot

Running the ROSbot is very easy thanks to the created docker services.

husarion@husarion:~$
docker compose up -d microros <rosbot_service> <camera_service>

Depending on your ROSbot model and specific configuration you should use an appropriate name for your <rosbot_service> and <camera_service>. To check the available docker services, use the command docker compose config --services, or find suitable Docker images prepared by Husarion.

Then, in a new terminal, start your image_recognition.launch.py file.

husarion@husarion:~$
ros2 launch tutorial_pkg image_recognition.launch.py teach:=true

Find object 2d

After launching the image_recognition.launch.py file with a properly adjusted image topic, a new window should appear:


On the left side of the window there are thumbnails of saved images (the list should be empty on the first run). The main application window contains a camera view. Yellow circles on the image mark the extracted image features.

To begin the teaching process, choose Edit → Add object from scene... from the main toolbar. A new window will appear. Now move the object and camera in order to cover as many features of the object as possible. While doing that, try not to catch the object's surroundings. When it's done, click Take picture.


In the next view, click Select region and select only the part of the taken picture that covers the desired object, then click Next. You will get a confirmation of the features extracted from the selected image area. If the presented image matches what you selected, click End. You should see a new thumbnail in the left panel. Notice the number outside the parentheses on the left of the image; this is the object ID.


Now you can add some more objects to be recognized. Remember their IDs; you will need them later:


When you have enough objects in the database, you can stop the camera with Edit → Stop camera, then choose File → Save objects... from the main toolbar and save them in the ~/ros2_ws/src/tutorial_pkg/img_data folder. Inside this folder you will now find all captured objects. Close the window and stop all running nodes.

The last thing we need to do is add the img_data folder to CMakeLists.txt

~/ros2_ws/src/tutorial_pkg/CMakeLists.txt
install(DIRECTORY
  launch
  img_data
  DESTINATION share/${PROJECT_NAME})

and build packages to apply changes.

husarion@husarion:~/ros2_ws$
colcon build --symlink-install

Recognizing objects

Objects will be recognized by the same find_object_2d node that was used for training, but it works in a slightly different configuration. In the created image_recognition.launch.py we set two parameters for the node. We set the gui parameter to false; this runs the node without the training window, because we no longer need it. The second parameter is objects_path; this should be the path to the folder you just chose to store the recognized objects. You can use the same launch file without the teach argument to observe the results of the training.

husarion@husarion:~$
ros2 launch tutorial_pkg image_recognition.launch.py

The node publishes to the /objects topic with the message type std_msgs/Float32MultiArray. The data in this message contains the object ID, the size of the recognized object and its pose in the image (a homography matrix, described later in this tutorial). The most interesting for us is the object ID, which is the first element of the array.

Whenever an object is recognized, it will be published on that topic. Place different objects in front of the camera and observe how their data is published on the topic.

To watch messages published on the topic, you can use the ros2 topic tool:

husarion@husarion:~$
ros2 topic echo /objects

Controlling ROSbot using an image

To perform a robot action based on a recognized object, we will make a new node. Its task will be to subscribe to the /objects topic and publish messages to /cmd_vel with a speed and direction depending on the recognized object.

Create a new file, name it img_control.cpp and place it in the ~/ros2_ws/src/tutorial_pkg/src folder. Then open it in a text editor and paste the code below:

~/ros2_ws/src/tutorial_pkg/src/img_control.cpp
#include <rclcpp/rclcpp.hpp>
#include <std_msgs/msg/float32_multi_array.hpp>
#include <geometry_msgs/msg/twist.hpp>

// Set value according to your image number in your img_data folder.
#define ARROW_RIGHT 1
#define ARROW_UP 2
#define ARROW_LEFT 3

using namespace std::placeholders;

class ImgControlNode : public rclcpp::Node
{
public:
  ImgControlNode() : Node("img_control")
  {
    vel_pub_ = this->create_publisher<geometry_msgs::msg::Twist>("/cmd_vel", 1);

    obj_sub_ = this->create_subscription<std_msgs::msg::Float32MultiArray>(
      "/objects", 1, std::bind(&ImgControlNode::objectCallback, this, _1));

    RCLCPP_INFO(get_logger(), "Node started!");
  }

private:
  void objectCallback(const std_msgs::msg::Float32MultiArray::SharedPtr object)
  {
    geometry_msgs::msg::Twist vel_msg;

    if (object->data.size() > 0)
    {
      int id = object->data[0];

      switch (id)
      {
        case ARROW_RIGHT:
          vel_msg.linear.x = 0;
          vel_msg.angular.z = -1;
          break;
        case ARROW_UP:
          vel_msg.linear.x = 0.5;
          vel_msg.angular.z = 0;
          break;
        case ARROW_LEFT:
          vel_msg.linear.x = 0;
          vel_msg.angular.z = 1;
          break;
      }
    }
    else
    {
      vel_msg.linear.x = 0;
      vel_msg.angular.z = 0;
    }

    vel_pub_->publish(vel_msg);
  }

  rclcpp::Publisher<geometry_msgs::msg::Twist>::SharedPtr vel_pub_;
  rclcpp::Subscription<std_msgs::msg::Float32MultiArray>::SharedPtr obj_sub_;
};

int main(int argc, char **argv)
{
  rclcpp::init(argc, argv);
  auto node = std::make_shared<ImgControlNode>();
  rclcpp::spin(node);
  rclcpp::shutdown();
  return 0;
}
The code explained
  1. Include the required headers and define constants for the recognized objects, adjusting the values to the IDs of the objects recognized by your system. The numbers should be set according to the names of the images in the img_data folder:

    #include <rclcpp/rclcpp.hpp>
    #include <std_msgs/msg/float32_multi_array.hpp>
    #include <geometry_msgs/msg/twist.hpp>

    #define ARROW_RIGHT 1
    #define ARROW_UP 2
    #define ARROW_LEFT 3
  2. Define publisher and subscriber.

    rclcpp::Publisher<geometry_msgs::msg::Twist>::SharedPtr vel_pub_;
    rclcpp::Subscription<std_msgs::msg::Float32MultiArray>::SharedPtr obj_sub_;
  3. Define speed control message.

    geometry_msgs::msg::Twist vel_msg;
  4. Check whether the size of the data field is non-zero. If it is, an object has been recognized. When the data field size is zero, no object was recognized and we stop the robot:

    if (object->data.size() > 0){
    ...
    }
    else{
    vel_msg.linear.x = 0;
    vel_msg.angular.z = 0;
    }
  5. Read the recognized object ID from the incoming message and perform the appropriate behavior depending on the object:

    int id = object->data[0];

    switch (id){
    case ARROW_RIGHT:
    vel_msg.linear.x = 0;
    vel_msg.angular.z = -1;
    break;
    case ARROW_UP:
    vel_msg.linear.x = 0.5;
    vel_msg.angular.z = 0;
    break;
    case ARROW_LEFT:
    vel_msg.linear.x = 0;
    vel_msg.angular.z = 1;
    break;
    }
  6. Publish the control message.

    vel_pub_->publish(vel_msg);

Build code

The last thing to do is edit the CMakeLists.txt file. It should contain the following lines:

cmake_minimum_required(VERSION 3.8)
project(tutorial_pkg)

# Set C++ standard to C++14
if(NOT CMAKE_CXX_STANDARD)
  set(CMAKE_CXX_STANDARD 14)
endif()

if(CMAKE_COMPILER_IS_GNUCXX OR CMAKE_CXX_COMPILER_ID MATCHES "Clang")
  add_compile_options(-Wall -Wextra -Wpedantic)
endif()

# find dependencies
find_package(ament_cmake REQUIRED)
find_package(geometry_msgs REQUIRED)
find_package(rclcpp REQUIRED)
find_package(sensor_msgs REQUIRED)
find_package(std_msgs REQUIRED)
find_package(std_srvs REQUIRED)

add_executable(my_first_node src/my_first_node.cpp)
ament_target_dependencies(my_first_node rclcpp sensor_msgs std_msgs std_srvs)

add_executable(img_control src/img_control.cpp)
ament_target_dependencies(img_control rclcpp geometry_msgs std_msgs)

install(TARGETS
  my_first_node
  img_control
  DESTINATION lib/${PROJECT_NAME}
)

install(DIRECTORY
  launch
  img_data
  DESTINATION share/${PROJECT_NAME})

ament_package()

Running your node

Now you can build your node (run colcon build --symlink-install in ~/ros2_ws again). First, start your find_object_2d node

husarion@husarion:~$
ros2 launch tutorial_pkg image_recognition.launch.py teach:=false

then run the img_control node.

husarion@husarion:~$
ros2 run tutorial_pkg img_control

Object tracking with ROSbot

Besides the ID of the recognized object, the find_object_2d node also publishes a homography matrix of the recognized object. In computer vision, a homography is used to define the position of an object relative to the camera. We will use it to obtain the horizontal position of the object in the image. The homography matrix is published on the same topic as the ID, in the next cells of the data array. They are formatted as

[objectId1, objectWidth1, objectHeight1, h11, h12, h13, h21, h22, h23, h31, h32, h33, objectId2, ...].
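Each recognized object therefore occupies 12 consecutive values, and the pattern repeats when several objects are detected at once. As a minimal sketch (the printObjects helper below is only an illustration, not part of the tutorial code), such a message could be unpacked like this:

#include <std_msgs/msg/float32_multi_array.hpp>
#include <cstdio>

// Sketch only: iterate over every recognized object in one /objects message.
// Each object occupies 12 consecutive floats: [id, width, height, h11 ... h33].
void printObjects(const std_msgs::msg::Float32MultiArray & msg)
{
  for (size_t i = 0; i + 11 < msg.data.size(); i += 12) {
    int id = static_cast<int>(msg.data[i]);
    float width = msg.data[i + 1];
    float height = msg.data[i + 2];
    // msg.data[i + 3] .. msg.data[i + 11] hold the 3x3 homography entries
    std::printf("object %d: %.0f x %.0f px\n", id, width, height);
  }
}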

The next example will be based largely on what has just been shown, so make sure you understand the previous example. To do this exercise you also need the OpenCV library installed.

Create a new track_obj.cpp file and use the following code.

~/ros2_ws/src/tutorial_pkg/src/track_obj.cpp
#include <rclcpp/rclcpp.hpp>
#include <std_msgs/msg/float32_multi_array.hpp>
#include <geometry_msgs/msg/twist.hpp>
#include <opencv2/opencv.hpp>

// Choose object to track and type camera width.
#define OBJECT_TO_FOLLOW 1
#define CAMERA_WIDTH 320 // left 0, right 319

#define MIN_ANG_VEL 0.15f
#define MAX_ANG_VEL 0.5f
#define ANGULAR_GAIN 1.7 // 1.7 rad/s * the distance of the object from the center of the image

using namespace std::placeholders;

class TrackObjNode : public rclcpp::Node
{
public:
  TrackObjNode() : Node("track_obj")
  {
    vel_pub_ = this->create_publisher<geometry_msgs::msg::Twist>("/cmd_vel", 1);

    obj_sub_ = this->create_subscription<std_msgs::msg::Float32MultiArray>(
      "/objects", 1, std::bind(&TrackObjNode::objectCallback, this, _1));

    RCLCPP_INFO(get_logger(), "Node started!");
  }

private:
  void objectCallback(const std_msgs::msg::Float32MultiArray::SharedPtr object)
  {
    geometry_msgs::msg::Twist vel_msg;

    if (object->data.size() > 0) {
      int id = object->data[0];
      float objectWidth = object->data[1];
      float objectHeight = object->data[2];

      cv::Mat cvHomography(3, 3, CV_32F);
      std::vector<cv::Point2f> inPts, outPts;
      if (id == OBJECT_TO_FOLLOW) {
        // Matrix completion
        for (int i = 0; i < 9; i++) {
          cvHomography.at<float>(i % 3, i / 3) = object->data[i + 3];
        }

        // Save corners to vector
        inPts.push_back(cv::Point2f(0, 0));
        inPts.push_back(cv::Point2f(objectWidth, 0));
        inPts.push_back(cv::Point2f(0, objectHeight));
        inPts.push_back(cv::Point2f(objectWidth, objectHeight));
        cv::perspectiveTransform(inPts, outPts, cvHomography);

        int obj_x_pos = (outPts.at(0).x + outPts.at(1).x + outPts.at(2).x + outPts.at(3).x) / 4;
        float ang_vel = ANGULAR_GAIN * (CAMERA_WIDTH / 2 - obj_x_pos) / CAMERA_WIDTH;

        // Set angular speed
        if (ang_vel <= -MIN_ANG_VEL || ang_vel >= MIN_ANG_VEL) {
          vel_msg.angular.z = std::max(-MAX_ANG_VEL, std::min(ang_vel, MAX_ANG_VEL));
        }
        RCLCPP_INFO(get_logger(), "id: %d\t ang_vel: %f", id, vel_msg.angular.z);
      }
    }

    vel_pub_->publish(vel_msg);
  }

  rclcpp::Publisher<geometry_msgs::msg::Twist>::SharedPtr vel_pub_;
  rclcpp::Subscription<std_msgs::msg::Float32MultiArray>::SharedPtr obj_sub_;
};

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  auto node = std::make_shared<TrackObjNode>();
  rclcpp::spin(node);
  rclcpp::shutdown();
  return 0;
}
caution

Choose an object to track and set the camera width. You can check your camera width by typing:

ros2 topic echo <camera_topic_info> --field width --once
The code explained
  1. As mentioned, the OpenCV library will be used, so we include it.

    #include <opencv2/opencv.hpp>
  2. Select one of the objects or add a new one in the same way as in Teaching objects.

    #define OBJECT_TO_FOLLOW 1
  3. Define the width of the camera image. This value will be used to determine the horizontal center pixel.

    #define CAMERA_WIDTH    320 // left 0, right 319
  4. Define some parameters to control the rotation speed.

    #define MIN_ANG_VEL     0.15f
    #define MAX_ANG_VEL     0.5f
    #define ANGULAR_GAIN    1.7 // 1.7 rad/s * the distance of the object from the center of the image, expressed as a fraction of the image width
  5. Getting object width and height:

    float objectWidth = object->data[1];
    float objectHeight = object->data[2];
  6. Object for homography matrix:

    cv::Mat cvHomography(3, 3, CV_32F);
  7. Vectors for storing input and output planes:

    std::vector<cv::Point2f> inPts, outPts;
  8. Transfer the coefficients into the homography matrix.

    for(int i=0; i<9; i++){
    cvHomography.at<float>(i%3, i/3) = object->data[i+3];
    }
  9. Defining corners of input plane.

    inPts.push_back(cv::Point2f(0, 0));
    inPts.push_back(cv::Point2f(objectWidth, 0));
    inPts.push_back(cv::Point2f(0, objectHeight));
    inPts.push_back(cv::Point2f(objectWidth, objectHeight));
  10. Calculating perspective transformation.

    cv::perspectiveTransform(inPts, outPts, cvHomography);
  11. Calculating center of object from its corners.

    int obj_x_pos = (outPts.at(0).x + outPts.at(1).x + outPts.at(2).x + outPts.at(3).x) / 4;
  12. Calculate the angular velocity in proportion to the difference between the center of the camera image and the horizontal position of the object. We check whether the calculated speed is high enough to be worth making the correction, and then the velocity value is saturated to the allowed range. A small numeric check of this controller is shown after this list.

    float ang_vel = ANGULAR_GAIN*(CAMERA_WIDTH/2 - obj_x_pos)/CAMERA_WIDTH;

    if(ang_vel <= -MIN_ANG_VEL || ang_vel >= MIN_ANG_VEL){
    vel_msg.linear.x = 0;
    vel_msg.angular.z = std::max(-MAX_ANG_VEL, std::min(ang_vel, MAX_ANG_VEL));
    }
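To get a feeling for how the gain, the dead band and the saturation interact, here is a small standalone sketch that evaluates the same rule for a few arbitrary sample positions, assuming CAMERA_WIDTH = 320 and the constants defined above:

#include <algorithm>
#include <cstdio>

// Minimal sketch: the same proportional rule as in track_obj.cpp, evaluated
// for a few sample horizontal object positions (assuming a 320 px wide image).
int main()
{
  const float gain = 1.7f, min_vel = 0.15f, max_vel = 0.5f, width = 320.0f;
  for (int obj_x_pos : {170, 200, 10}) {
    float ang_vel = gain * (width / 2 - obj_x_pos) / width;
    float cmd = 0.0f;  // below the dead band the robot does not turn
    if (ang_vel <= -min_vel || ang_vel >= min_vel) {
      cmd = std::max(-max_vel, std::min(ang_vel, max_vel));  // saturation
    }
    std::printf("x = %3d -> ang_vel = %6.3f, command = %6.3f\n", obj_x_pos, ang_vel, cmd);
  }
  return 0;
}

For x = 170 the computed velocity is below MIN_ANG_VEL, so the command stays 0; for x = 200 it is used directly (about -0.21 rad/s); and for x = 10 it is clamped to MAX_ANG_VEL = 0.5 rad/s.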

Build code

The last thing to do is to edit the CMakeLists.txt file. We are now using the external OpenCV library, so we need to declare this dependency in the file.

cmake_minimum_required(VERSION 3.8)
project(tutorial_pkg)

# Set C++ standard to C++14
if(NOT CMAKE_CXX_STANDARD)
  set(CMAKE_CXX_STANDARD 14)
endif()

if(CMAKE_COMPILER_IS_GNUCXX OR CMAKE_CXX_COMPILER_ID MATCHES "Clang")
  add_compile_options(-Wall -Wextra -Wpedantic)
endif()

# find dependencies
find_package(OpenCV REQUIRED)

find_package(ament_cmake REQUIRED)
find_package(geometry_msgs REQUIRED)
find_package(rclcpp REQUIRED)
find_package(sensor_msgs REQUIRED)
find_package(std_msgs REQUIRED)
find_package(std_srvs REQUIRED)

add_executable(my_first_node src/my_first_node.cpp)
ament_target_dependencies(my_first_node rclcpp sensor_msgs std_msgs std_srvs)

add_executable(img_control src/img_control.cpp)
ament_target_dependencies(img_control rclcpp geometry_msgs std_msgs)

add_executable(track_obj src/track_obj.cpp)
ament_target_dependencies(track_obj rclcpp OpenCV geometry_msgs std_msgs)

install(TARGETS
  my_first_node
  img_control
  track_obj
  DESTINATION lib/${PROJECT_NAME}
)

install(DIRECTORY
  launch
  img_data
  DESTINATION share/${PROJECT_NAME})

ament_package()

Add node to launch file

~/ros2_ws/src/tutorial_pkg/launch/image_recognition.launch.py
from launch import LaunchDescription
from launch_ros.actions import Node
from launch_ros.parameter_descriptions import ParameterValue
from launch.actions import DeclareLaunchArgument
from launch.conditions import LaunchConfigurationEquals
from launch.substitutions import LaunchConfiguration, PythonExpression
from ament_index_python.packages import get_package_share_directory

def generate_launch_description():
    teach = LaunchConfiguration('teach', default='false')

    teach_arg = DeclareLaunchArgument(
        'teach',
        default_value='false',
        description='Flag to enable teaching mode'
    )

    # Load previously saved objects only in recognition mode (teach:=false)
    objects_path = PythonExpression([
        "'' if '", teach, "' == 'true' else '",
        get_package_share_directory('tutorial_pkg'), "/img_data/'"
    ])

    # Object detection
    find_object_2d = Node(
        package='find_object_2d',
        executable='find_object_2d',
        name='find_object_2d',
        remappings=[('image', '/your/camera/image')],
        parameters=[{
            'gui': teach,
            'objects_path': ParameterValue(objects_path, value_type=str)
        }]
    )

    # Object tracking
    track_obj = Node(
        package='tutorial_pkg',
        executable='track_obj',
        name='track_obj',
        condition=LaunchConfigurationEquals('teach', 'false'),
    )

    return LaunchDescription([
        teach_arg,
        find_object_2d,
        track_obj
    ])

Running your node

Now you can build your node and test it by typing in a new terminal:

husarion@husarion:~$
ros2 launch tutorial_pkg image_recognition.launch.py
info

Image processing is quite computationally expensive, so the ROSbot may react with a slight delay to a change in the object's position.

Summary

After completing this tutorial you should be able to configure your ROSbot with a vision system to recognize objects. You should also be able to determine the position of a recognized object relative to the camera and create a node that performs specific actions related to the recognized objects.


by Rafał Górecki, Husarion

Do you need any support with completing this tutorial or have any difficulties with software or hardware? Feel free to describe your thoughts on our community forum: https://community.husarion.com/ or to contact our support: support@husarion.com