Presentation and Lab Briefing
Autonomous Race Car: Visual Servoing
Diego Contreras, Kevin Huang, Nathaniel Morgan, Weiming Zhou, Soe Wai Yan
Overview
Visual servoing enables real-time cone parking and line following on an autonomous racecar. The goal of this lab was to park in front of a cone and follow a line using camera-based perception.
The system is composed of four modules:
- Cone detection via color segmentation
- Object detection algorithms (SIFT, Template Matching, YOLO)
- Pixel-to-ground coordinate transformation via homography
- A parking controller for steering and stopping at a target distance
These modules are then combined in a line-following application.
Module 1: Cone Detection via Color Segmentation
Color segmentation can effectively detect orange cones. The pipeline processes the camera image through the following steps:
- Original image is captured from the ZED camera
- Gaussian Blur is applied to reduce noise
- Mask is generated by filtering for orange color in HSV space
- Bounding Box is drawn around the largest detected contour
Our color segmentation achieves a median IOU of 0.79 (IQR = 0.18) on the cone dataset. Ground-truth bounding boxes (green) were compared against predicted bounding boxes (red). Test 1 achieved an IOU score of 0.97, while Test 7 (with a smaller, more distant cone) achieved an IOU score of 0.63.
Module 2: Object Detection Algorithms
Part 1: SIFT & Template Matching
Two classical object detection algorithms were evaluated: SIFT (Scale-Invariant Feature Transform) and Template Matching.
SIFT Detection Results
SIFT was tested on two datasets:
- Citgo dataset — Works well. SIFT successfully matched features across different views of the Citgo sign with high IOU scores (up to 0.91).
- Stata Map dataset — Works poorly. SIFT failed to find enough matches on the map images, returning 0.0 IOU across all test images.
Template Matching Results
Template matching was tested on the Stata Map dataset, where it performed well with IOU scores ranging from 0.48 to 0.91.
| Method | Best Use Case |
|---|---|
| SIFT | Landmark Localization |
| Template Matching | Map Localization |
Part 2: YOLO Object Detection
YOLO detects objects on the live ZED camera feed with tunable confidence and IOU thresholds. We experimented with different threshold values:
- Confidence threshold = 0.2: More detections but with lower precision
- Confidence threshold = 0.9: Fewer detections but higher precision
- IOU threshold = 0.9: More overlapping boxes retained
- IOU threshold = 0.2: Aggressive non-maximum suppression removes overlapping boxes
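The IOU-threshold behavior above comes from non-maximum suppression (NMS). A minimal pure-NumPy sketch of greedy NMS, not the Ultralytics implementation (`iou` and `nms` are illustrative helpers; boxes are `(x1, y1, x2, y2)`):

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, conf_thresh, iou_thresh):
    """Drop boxes below conf_thresh, then greedily suppress any box whose
    IOU with an already-kept, higher-scoring box exceeds iou_thresh."""
    order = [i for i in np.argsort(scores)[::-1] if scores[i] >= conf_thresh]
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep
```

A low `iou_thresh` suppresses aggressively (fewer overlapping boxes survive); a high one keeps near-duplicates, matching the behavior observed above.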
Module 3: Pixel-to-Plane via Homography
Homography transforms pixel coordinates to robot frame coordinates with a mean error of 1.5 cm (standard deviation: 1.7 cm). The error is mainly in the forward direction.
Calibration
Manual calibration was performed using rqt_image_view to collect pixel-to-ground correspondences. The cone tip was identified in the camera image and its corresponding real-world position was measured.
Homography Computation
Four calibration point pairs were used to compute the homography matrix via cv2.findHomography():
| Point | Pixel (u, v) | Ground (x cm, y cm) |
|---|---|---|
| 1 | (211, 162) | (30.48, 7.62) |
| 2 | (415, 154) | (46.99, -12.70) |
| 3 | (351, 145) | (109.22, -13.97) |
| 4 | (402, 167) | (22.86, -6.35) |
The homography matrix $H$ maps pixel coordinates $(u, v)$ to ground-plane coordinates $(x, y)$ via the relation:
$$s \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & 1 \end{bmatrix} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}$$
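With $h_{33}$ fixed to 1, the four point pairs from the table give an exactly determined $8 \times 8$ linear system for the remaining entries of $H$. The lab used `cv2.findHomography()`; the NumPy sketch below (hypothetical helpers `fit_homography` and `pixel_to_ground`) is equivalent for exactly four non-degenerate correspondences:

```python
import numpy as np

def fit_homography(pixels, ground):
    """Solve for H (with h33 = 1) from four pixel->ground point pairs.

    Each pair contributes two rows derived from
    x = (h11 u + h12 v + h13) / (h31 u + h32 v + 1), and similarly for y.
    """
    A, b = [], []
    for (u, v), (x, y) in zip(pixels, ground):
        A.append([u, v, 1, 0, 0, 0, -u * x, -v * x]); b.append(x)
        A.append([0, 0, 0, u, v, 1, -u * y, -v * y]); b.append(y)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def pixel_to_ground(H, u, v):
    """Apply H to a pixel and divide by the scale s to get (x, y)."""
    x, y, s = H @ np.array([u, v, 1.0])
    return x / s, y / s

# Calibration pairs from the table above (pixels -> ground plane, cm)
pixels = [(211, 162), (415, 154), (351, 145), (402, 167)]
ground = [(30.48, 7.62), (46.99, -12.70), (109.22, -13.97), (22.86, -6.35)]
H = fit_homography(pixels, ground)
```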
Module 4: Parking Controller
Our parking controller converges to the target distance across multiple trials.
Controller Design
- If the cone is far: drive forward toward it
- If the cone is too close: reverse away from it
- If the cone is off-center: steer to align
- Desired parking distance: 0.75 meters
- Parameters: $K_p = 1.0$, $K_d = 0.1$
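A minimal sketch of the control law implied by the rules above, using the stated gains and target distance. The `parking_control` signature, the frame convention ($x$ forward, $y$ left), and the mirrored steering while reversing are assumptions for illustration, not the lab's exact implementation:

```python
import math

DESIRED_DIST = 0.75  # target parking distance (m)
KP, KD = 1.0, 0.1    # gains from the parameters above

def parking_control(x, y, prev_error, dt):
    """PD sketch: (x, y) is the cone in the robot frame (x forward, y left).

    Returns (speed, steering_angle, distance_error); positive speed drives
    forward, negative speed reverses.
    """
    dist = math.hypot(x, y)
    error = dist - DESIRED_DIST          # >0: too far, <0: too close
    d_error = (error - prev_error) / dt  # finite-difference derivative term
    speed = KP * error + KD * d_error
    angle = math.atan2(y, x)             # steer toward the cone
    if speed < 0:
        angle = -angle                   # mirror steering while reversing
    return speed, angle, error
```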
Performance
Over $N = 9$ trials, the controller achieved:
- Final $x$ distance to the cone: 0.75 m (the target distance)
- Final distance error: 0 m
Simulation Tests
The parking controller was evaluated in simulation across three scenarios:
Sim Test 1: Cone in Front
When the cone is placed directly in front of the robot, the controller drives forward and converges to the desired parking distance. The distance error and y-error converge to zero, while the x-error settles to the target distance of 0.75 m.
Sim Test 2: Cone to the Side
When the cone is placed to the side, the controller first steers to align with the cone and then drives to the target distance. The y-error gradually decreases as the robot aligns itself, and the distance error converges to the parking distance.
Sim Test 3: Cone Behind
When the cone is placed behind the robot, the controller reverses and maneuvers to face the cone, then drives forward to park at the target distance. This scenario shows the largest initial transient as the robot must execute a more complex trajectory.
Robot Issues
During real-robot testing, we encountered hardware issues including connectivity problems with the racecar's onboard computer (SSH via WiFi) and a damaged USB cable for the ZED camera. These issues prevented deployment of the parking controller and line following on the physical robot.
Conclusion
Visual servoing enables reliable cone parking in simulation. The key results are:
- Color segmentation achieved median 0.79 IOU on the cone dataset
- Homography achieved 1.5 cm mean error for pixel-to-ground transformation
- Parking controller converged to 0 m distance error in simulation
Lab Goals Status
| Goal | Status |
|---|---|
| Object detection algorithms implemented and evaluated | Complete |
| Homography computed and validated with error metric | Complete |
| Parking controller deployed on real robot | Not Complete |
| Line following demonstrated on track | Not Complete |
Future Work: Deploy the parking controller and line following system on the physical robot once hardware issues are resolved.
Citations
- OpenCV Documentation. cv2.findHomography. https://docs.opencv.org/4.x/d9/d0c/group__calib3d.html
- Ultralytics. YOLOv8 Documentation. https://docs.ultralytics.com
- MIT RSS Lab 4: Visual Servoing. https://github.com/mit-rss/visual_servoing
- A. Shwaiheen, "Line Follower Robot - Very Fast Using Port Manipulation," Hackster.io, Jan. 3, 2020.