You will complete this exercise in pairs (ideally with your Q-learning project partner).
You will be programming a Turtlebot3 to respond to different images it "sees". The robot's behavior will change based on whether it recognizes a cat, a dog, or neither in its camera feed. You will need to pass the robot's camera footage to a trained neural network and use the output to command the robot to perform the different behaviors described in the following narrative.
To get started on this exercise, update the intro_robo class package to get the lab_f_image_classifier ROS package and starter code that we'll be using for this activity.
$ cd ~/catkin_ws/src/intro_robo
$ git pull
$ git submodule update --init --recursive
$ cd ~/catkin_ws && catkin_make
$ source devel/setup.bash
Additionally, you'll need to make sure you have PyTorch installed (e.g., by executing pip3 show torch). If you don't have PyTorch installed, run the following command to install it:
$ pip3 install torch torchvision torchaudio
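If you want to double-check the installation, printing the installed version is a quick sanity test:
$ python3 -c "import torch; print(torch.__version__)"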
In order for the robot to respond to different images (e.g., cat, dog, something else) it sees, we'll first need to train an image classifier that can distinguish these different classes. We'll do this work in the Jupyter Notebook defined in cifar10_tutorial.ipynb.
You can open the Jupyter Notebook by running:
$ jupyter notebook cifar10_tutorial.ipynb
Next, follow the instructions in the Jupyter Notebook to train an image classifier. Your classifier will be able to distinguish several different types of objects within an image; see below.
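Once training finishes, you'll want the trained weights available outside the notebook so your ROS node can use them. Here is a minimal sketch, assuming you save the weights to a file named cifar_net.pth (the file name is just an example; the Training a Classifier tutorial uses the same save/load pattern):

import torch
from image_classifier import Net  # the network class used in this lab

PATH = "./cifar_net.pth"  # example file name; pick one consistent with your code

# In the notebook, after training (where `net` is your trained model):
# torch.save(net.state_dict(), PATH)

# In robot_scout.py, rebuild the architecture and load the saved weights:
net = Net()
net.load_state_dict(torch.load(PATH))
net.eval()  # inference mode: disables dropout / batch-norm training behavior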
Now that you've trained a neural network to identify different objects within an image, we'll use that trained network to allow our Turtlebot to respond uniquely to seeing a cat, a dog, or neither from its RGB camera. Since there aren't any real cats or dogs in JCL, we recommend that you bring up images of cats, dogs, and/or other items on your phone or computer and place your phone/computer so that the robot can see it.
Robot Narrative: Imagine you are programming a curious robot named Scout. Scout is equipped with a camera that allows it to observe its surroundings in the real world. Scout is designed to be adaptive and responsive, capable of reacting to visual cues to navigate its environment safely. One day, Scout is exploring a park when its camera captures various images. Each time Scout's camera processes an image, it analyzes it to determine whether it depicts a cat, a dog, or something else entirely.
Now, let's get to programming. Open up robot_scout.py to see your TODOs for implementing your robot Scout.
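As you work through the TODOs, the core step is turning a camera frame into something the network can classify. Below is a hedged sketch of that preprocessing and inference step, assuming a BGR OpenCV image and the same normalization as the CIFAR-10 tutorial; classify_frame is a hypothetical helper name, not part of the starter code:

import cv2
import torch
import torchvision.transforms as transforms

# Same normalization as the CIFAR-10 tutorial: scale each channel to [-1, 1].
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

# The ten CIFAR-10 class names, in label order.
CLASSES = ('plane', 'car', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')

def classify_frame(net, bgr_image):
    """Return the predicted CIFAR-10 class name for one camera frame."""
    rgb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2RGB)  # OpenCV images are BGR
    small = cv2.resize(rgb, (32, 32))                 # CIFAR-10 input size
    batch = transform(small).unsqueeze(0)             # shape: (1, 3, 32, 32)
    with torch.no_grad():                             # no gradients at inference
        outputs = net(batch)
    _, predicted = torch.max(outputs, 1)              # index of the top logit
    return CLASSES[predicted.item()]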
Launch Bringup on the Turtlebot. In one terminal, run:
$ roscore
In a second terminal, run:
$ ssh pi@IP_OF_TURTLEBOT
$ set_ip LAST_THREE_DIGITS
$ bringup
In a third terminal, run the following commands to start receiving ROS messages from the Raspberry Pi camera:
$ ssh pi@IP_OF_TURTLEBOT
$ set_ip LAST_THREE_DIGITS
$ bringup_cam
bringup_cam is an alias for the command roslaunch turtlebot3_bringup turtlebot3_rpicamera.launch. Note that running bringup_cam does not open a camera feed window; you must run code to see the camera feed.
In a fourth terminal, run the following command to decompress the camera messages:
$ rosrun image_transport republish compressed in:=raspicam_node/image raw out:=camera/rgb/image_raw
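To confirm that decompressed images are flowing before you start your node, you can check the publish rate of the republished topic (rostopic hz is a standard ROS command-line tool):
$ rostopic hz camera/rgb/image_raw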
Finally, in a fifth terminal, run the ROS node for Robot Scout:
$ rosrun lab_f_image_classifier robot_scout.py
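If rosrun reports that it cannot find or execute robot_scout.py, a common fix is marking the script executable. The path below is an assumption about the package layout; adjust it to wherever robot_scout.py actually lives:
$ chmod +x ~/catkin_ws/src/intro_robo/lab_f_image_classifier/scripts/robot_scout.py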
The starter code implements a debugging window to help you visualize the camera view. OpenCV requires cv2.imshow() to run on the main thread, so the display window is created with cv2.namedWindow("window", 1).
robot_scout.py imports a class called Net from image_classifier.py; see how the neural network is defined there.
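To see how the display constraint plays out in practice, here is a sketch (not the exact starter code) in which the image callback only stores the latest frame and the main thread does the drawing:

import rospy
import cv2
from cv_bridge import CvBridge
from sensor_msgs.msg import Image

class ScoutDebugView:
    def __init__(self):
        self.bridge = CvBridge()
        self.image = None
        # Topic name matches the republished (decompressed) camera feed above.
        rospy.Subscriber('camera/rgb/image_raw', Image, self.image_callback)
        cv2.namedWindow("window", 1)

    def image_callback(self, msg):
        # Callbacks run on a ROS thread, so just store the frame here.
        self.image = self.bridge.imgmsg_to_cv2(msg, desired_encoding='bgr8')

    def run(self):
        rate = rospy.Rate(10)
        while not rospy.is_shutdown():
            if self.image is not None:
                # imshow/waitKey run here, on the main thread.
                cv2.imshow("window", self.image)
                cv2.waitKey(3)
            rate.sleep()

if __name__ == '__main__':
    rospy.init_node('scout_debug_view')
    ScoutDebugView().run()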
The code in this lab was created by Ting-Han (Timmy) Lin and Tewodros (Teddy) Ayalew. Part of the code is taken and modified from Training a Classifier.