Lab 4: Visual Servoing

Autonomous Race Car: Visual Servoing

Diego Contreras, Kevin Huang, Nathaniel Morgan, Weiming Zhou, Soe Wai Yan

Overview

Visual servoing enables real-time cone parking and line following on an autonomous racecar. The goal of this lab was to park in front of a cone and follow a line using camera-based perception.

The system is composed of four modules:

  1. Cone detection via color segmentation
  2. Object detection algorithms (SIFT, Template Matching, YOLO)
  3. Pixel-to-real-world coordinate transformation via homography
  4. A parking controller for steering and stopping at a target distance

These modules are then combined for a line-following application.

Module 1: Cone Detection via Color Segmentation

Color segmentation can effectively detect orange cones. The pipeline processes the camera image through the following steps:

  1. Original image is captured from the ZED camera
  2. Gaussian Blur is applied to reduce noise
  3. Mask is generated by filtering for orange color in HSV space
  4. Bounding Box is drawn around the largest detected contour

Our color segmentation achieves a median IOU of 0.79 (IQR = 0.18) on the cone dataset. Ground-truth bounding boxes (green) were compared against predicted bounding boxes (red). Test 1 achieved an IOU of 0.97, while test 7 (with a smaller, more distant cone) achieved 0.63.
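The IOU metric used here is the standard intersection-over-union of predicted and ground-truth boxes:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```

An IOU of 1.0 means the boxes coincide exactly; 0.0 means they do not overlap at all.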

Module 2: Object Detection Algorithms

Part 1: SIFT & Template Matching

Two classical object detection algorithms were evaluated: SIFT (Scale-Invariant Feature Transform) and Template Matching.

SIFT Detection Results

SIFT was tested on two datasets.

Template Matching Results

Template matching was tested on the Stata Map dataset, where it performed well with IOU scores ranging from 0.48 to 0.91.

Method            | Best Use Case
------------------|----------------------
SIFT              | Landmark Localization
Template Matching | Map Localization

Part 2: YOLO Object Detection

YOLO detects objects on the live ZED camera feed with tunable confidence and IOU thresholds, and we experimented with different values of each.
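The two thresholds act at the post-processing stage: the confidence threshold discards weak detections, and the IOU threshold controls non-maximum suppression of duplicates. A minimal pure-Python sketch of that filtering (not Ultralytics' internal implementation):

```python
def filter_detections(boxes, scores, conf_thresh=0.5, iou_thresh=0.45):
    """Greedy NMS: keep indices of boxes (x1, y1, x2, y2) above conf_thresh,
    suppressing any box whose IOU with an already-kept box is >= iou_thresh."""
    def iou(a, b):
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union if union else 0.0

    # Consider surviving detections in descending score order.
    order = sorted((i for i, s in enumerate(scores) if s >= conf_thresh),
                   key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_thresh for j in kept):
            kept.append(i)
    return kept
```

Raising `conf_thresh` trades missed detections for fewer false positives; lowering `iou_thresh` merges near-duplicate boxes more aggressively.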

Module 3: Pixel-to-Plane via Homography

Homography transforms pixel coordinates to robot frame coordinates with a mean error of 1.5 cm (standard deviation: 1.7 cm). The error is mainly in the forward direction.

Calibration

Manual calibration was performed using rqt_image_view to collect pixel-to-ground correspondences. The cone tip was identified in the camera image and its corresponding real-world position was measured.

Homography Computation

Four calibration point pairs were used to compute the homography matrix via cv2.findHomography():

Point | Pixel (u, v) | Ground (x cm, y cm)
------|--------------|--------------------
1     | (211, 162)   | (30.48, 7.62)
2     | (415, 154)   | (46.99, -12.70)
3     | (351, 145)   | (109.22, -13.97)
4     | (402, 167)   | (22.86, -6.35)

The homography matrix $H$ maps pixel coordinates $(u, v)$ to ground-plane coordinates $(x, y)$ via the relation:

$$s \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & 1 \end{bmatrix} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}$$

Module 4: Parking Controller

Our parking controller converges to the target distance across multiple trials.

Controller Design
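The controller code itself is not reproduced here; a minimal proportional sketch, assuming the cone position $(x, y)$ in the robot frame from Module 3 and the lab's 0.75 m parking distance (the gains are illustrative):

```python
import math

def parking_control(x, y, target_dist=0.75, k_dist=1.0, k_angle=2.0):
    """Return (speed, steering_angle) to park target_dist meters from a cone
    at (x, y) in the robot frame. Gains k_dist and k_angle are illustrative."""
    dist = math.hypot(x, y)
    angle = math.atan2(y, x)          # bearing to the cone
    speed = k_dist * (dist - target_dist)   # drive toward the parking distance
    # Mirror the steering command when reversing (overshoot or cone behind).
    steering = k_angle * angle if speed >= 0 else -k_angle * angle
    return speed, steering
```

A proportional law like this converges when the cone is ahead; the cone-behind case additionally relies on the sign flip to back-and-turn until the cone comes into view.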

Performance

Over $N = 9$ trials, the controller consistently converged to the target parking distance.

Simulation Tests

The parking controller was evaluated in simulation across three scenarios:

Sim Test 1: Cone in Front

When the cone is placed directly in front of the robot, the controller drives forward and converges to the desired parking distance. The distance error and y-error converge to zero, while the x-error settles to the target distance of 0.75 m.

Sim Test 2: Cone to the Side

When the cone is placed to the side, the controller first steers to align with the cone and then drives to the target distance. The y-error gradually decreases as the robot aligns itself, and the distance error converges to the parking distance.

Sim Test 3: Cone Behind

When the cone is placed behind the robot, the controller reverses and maneuvers to face the cone, then drives forward to park at the target distance. This scenario shows the largest initial transient as the robot must execute a more complex trajectory.

Robot Issues

During real-robot testing, we encountered hardware issues including connectivity problems with the racecar's onboard computer (SSH via WiFi) and a damaged USB cable for the ZED camera. These issues prevented deployment of the parking controller and line following on the physical robot.

Conclusion

Visual servoing enables reliable cone parking in simulation. The key results are summarized below.

Lab Goals Status

Goal                                                  | Status
------------------------------------------------------|-------------
Object detection algorithms implemented and evaluated | Complete
Homography computed and validated with error metric   | Complete
Parking controller deployed on real robot             | Not Complete
Line following demonstrated on track                  | Not Complete

Future Work: Deploy the parking controller and line following system on the physical robot once hardware issues are resolved.

Citations

  1. OpenCV Documentation. cv2.findHomography. https://docs.opencv.org/4.x/d9/d0c/group__calib3d.html
  2. Ultralytics. YOLOv8 Documentation. https://docs.ultralytics.com
  3. MIT RSS Lab 4: Visual Servoing. https://github.com/mit-rss/visual_servoing
  4. A. Shwaiheen, "Line Follower Robot - Very Fast Using Port Manipulation," Hackster.io, Jan. 3, 2020.