Self Driving Car (Simulation)

Type: Tech Corner, Feature
Title: Self Driving Car (Simulation)
Published: February 14, 2024
Status: Live
Tags: Convolutional Neural Networks, Self-driving, Simulation
Note: This article is best read in conjunction with the associated Git project.


OVERVIEW


In this project, my aim is to implement the Convolutional Neural Network (CNN) architecture proposed by NVIDIA for self-driving vehicles, as detailed in the paper "End to End Learning for Self-Driving Cars" (https://arxiv.org/abs/1604.07316). The core idea is to feed images of the road captured by cameras into the network and have it predict the steering angle needed to keep the vehicle on its trajectory. This implementation targets steering angle prediction only, assuming a constant vehicle speed; it does not account for other categorical data such as traffic lights and signs, or for collision conditions involving objects like vehicles and pedestrians. Please use the Git project linked above to delve deeper into the implementation, and feel free to try it yourself.

The above image shows the Udacity simulator running in training mode

BASICS

As mentioned previously, the project uses the free Udacity simulator for training and experimenting with the machine learning model. The simulator comprises two modes:

  • Training
    • During training mode, we capture images of the road from three cameras (left, center, and right), along with their corresponding steering angles, as we navigate the map. The record button, shown in the simulator screenshot above, is used to start this recording.
  • Autonomous
    • During autonomous mode, the simulator connects to a local server on port 4567. We run our trained model behind this server, feeding it input images in real time; the model predicts the steering angle, which is then relayed back to the simulator for execution.

WORKING

What kind of data do we have?

  • IMG folder
    • Within this folder, you'll find all the images recorded while navigating the car on the map during training mode. Each image is named based on its respective camera position.
  • Tabular data
    • Alongside it, there is a CSV file whose first three columns contain the image locations for the three cameras, followed by numerical columns such as throttle, speed, and steering angle (a minimal loading snippet follows this list).
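
As a quick illustration, here is a minimal sketch of loading the driving log with pandas. It assumes the standard Udacity driving_log.csv export, which ships without a header row; the column names below are supplied by us.

import os
import pandas as pd

# driving_log.csv has no header row, so we name the columns ourselves
# (assumed standard layout: three image paths, then the numeric data)
columns = ['center', 'left', 'right', 'steering', 'throttle', 'brake', 'speed']
data = pd.read_csv('driving_log.csv', names=columns)

# Keep only the file names; the simulator records absolute paths
for cam in ['center', 'left', 'right']:
    data[cam] = data[cam].apply(lambda p: os.path.basename(p.strip()))

print(data.head())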

Proposed pipeline

  • In the image below, we can observe the pipeline that will be utilized to predict the steering angle.
  • The preprocessed image from the camera serves as input to the deep neural network, which subsequently predicts the steering angle of the vehicle.
  • [Figure: proposed pipeline, camera image → preprocessing → CNN → predicted steering angle]

Simple exploratory data analysis

  • Now that we've identified our prediction target, namely the steering angle, we can conduct an exploratory data analysis to gain insights into the distribution of the data points we'll be predicting.
  • Since our implementation does not utilize any additional numerical or categorical data, our analysis will focus solely on the steering angle.
  • In the histogram below, data points with a steering angle of zero far outnumber those at other angles, making the imbalance in the steering angle data evident.
  • This imbalance arises because the car is driven mostly in a straight line while recording, so the steering angle stays close to zero for a large portion of the time.
  • Histogram plot of steering angle values before removing excess data
  • To address this imbalance, we cap the number of data points in each histogram bin at 400 samples (a sketch of this balancing step follows this list). The resulting histogram, shown below, now resembles a more normal, or Gaussian, distribution.
  • Histogram plot of steering angle values after removing excess data
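
Below is a minimal sketch of this balancing step, continuing from the data frame loaded earlier. The 400-sample cap comes from the project; the bin count is an assumption for illustration.

import numpy as np

# Continuing from the `data` frame loaded above
steering = data['steering'].to_numpy()

num_bins = 25            # assumed bin count
samples_per_bin = 400    # cap on the samples kept per bin

_, bins = np.histogram(steering, num_bins)

remove_idx = []
for i in range(num_bins):
    # collect the indices of all samples that fall into bin i
    bin_idx = [j for j in range(len(steering))
               if bins[i] <= steering[j] <= bins[i + 1]]
    np.random.shuffle(bin_idx)
    # everything beyond the cap is marked for removal
    remove_idx.extend(bin_idx[samples_per_bin:])

data.drop(data.index[remove_idx], inplace=True)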

Image augmentation & preprocessing

  • We possess images captured from three cameras: left, center, and right. The images from the left and right cameras must be rectified with respect to the center camera, i.e. we need to determine how much to adjust their steering angles so that they align with the center view. Since the left camera sees the road shifted to the left, we add 0.15 to its steering angle; for the right camera image, we subtract 0.15.
  • Now that we have the images and their corresponding steering angle we have our data ready to be split for training and testing.
  • We apply various data augmentation techniques, such as rotation, flipping, and panning, to enrich the training data (a sketch of two of these follows this list).
  • The images must undergo preprocessing before training. First, we crop the images to remove unnecessary background such as trees. We then apply a Gaussian blur, convert the image to the YUV colour space required by NVIDIA's CNN, and resize it to 200x66. An example of a preprocessed image is shown below.
  • Original RAW image vs preprocessed image
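
As a minimal sketch of the augmentation step, the snippet below implements a horizontal flip and a random pan. The pan range is an assumption for illustration; the exact augmentation parameters live in the Git project.

import cv2
import numpy as np

def flip_augment(img, steering):
    # Mirroring the road left-to-right means the steering angle flips sign
    return cv2.flip(img, 1), -steering

def pan_augment(img, steering, max_shift=0.1):
    # Translate the image by up to 10% of its size (assumed range);
    # small pans leave the steering angle unchanged
    h, w = img.shape[:2]
    tx = max_shift * w * (np.random.rand() - 0.5)
    ty = max_shift * h * (np.random.rand() - 0.5)
    M = np.float32([[1, 0, tx], [0, 1, ty]])
    return cv2.warpAffine(img, M, (w, h)), steering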

Implementing the CNN architecture & training

  • Now that the data is prepared, we construct the CNN architecture depicted in the image below. This architecture, proposed by NVIDIA, forms the backbone of our model. Note, however, that the code does not use all of the layers, owing to the time and space constraints of the free GPU on Google Colab (a minimal Keras sketch follows this list).
  • Architecture proposed by NVIDIA [credits: link]
  • Below is an illustration showcasing the layers involved in the training of our model.
  • [Figure: summary of the layers used to train our model]
  • The model is trained with a batch size of 100, for 10 epochs with 300 steps per epoch. The resulting loss curve is shown below.
  • The trained model is serialized and downloaded to the local computer.
  • Loss curve obtained after training the model
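
For reference, here is a minimal Keras sketch of an NVIDIA-style model that matches the preprocessing above (66x200 YUV input, a single steering output). The layer sizes follow the paper; the exact subset of layers and the learning rate used in this project are assumptions here, and the real values live in the Git repo.

from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense
from keras.optimizers import Adam

def build_model():
    model = Sequential()
    # 66x200 YUV input, matching the output of img_preprocess below
    model.add(Conv2D(24, (5, 5), strides=(2, 2), activation='elu',
                     input_shape=(66, 200, 3)))
    model.add(Conv2D(36, (5, 5), strides=(2, 2), activation='elu'))
    model.add(Conv2D(48, (5, 5), strides=(2, 2), activation='elu'))
    model.add(Conv2D(64, (3, 3), activation='elu'))
    model.add(Conv2D(64, (3, 3), activation='elu'))
    model.add(Flatten())
    model.add(Dense(100, activation='elu'))
    model.add(Dense(50, activation='elu'))
    model.add(Dense(10, activation='elu'))
    model.add(Dense(1))  # single output: the steering angle
    # regression on the steering angle, hence mean squared error
    model.compile(loss='mse', optimizer=Adam(learning_rate=1e-3))
    return model

# Training with the parameters from the article (the generator is assumed
# to yield batches of (image, steering) pairs):
# model.fit(train_generator, steps_per_epoch=300, epochs=10)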

Driver snippet to put the trained model to action

  • Now that we possess the serialized model in .h5 format, we can construct the pipeline: the preprocessed image is fed to the CNN, which predicts the corresponding steering angle (refer to the Proposed pipeline section above).
  • The preprocessing applied during prediction must mirror the preprocessing used during training.
  • We use a Flask / Socket.IO server, listening on localhost port 4567, to communicate with the simulator.
  • We run the driver script and then open the simulator in "Autonomous" mode to accept the steering values.
  • Remember, we only predict the steering angle; the speed of the car is held near a fixed limit by a simple throttle rule in the script.
  • The following code snippet illustrates the driver script and its utilization of the model for prediction, as well as how it communicates the results to the simulator.
import socketio
import eventlet
from flask import Flask

import numpy as np
from keras.models import load_model

import base64
from io import BytesIO

from PIL import Image
import cv2


"""Here we will use a combination of Flask & SocketIO to communicate with the simulator"""

# define the socketio server, this initializes the socketio server
sio = socketio.Server()

# Flask application initialization;
# Flask is a web framework used for building web applications in Python
app = Flask(__name__)


# This variable is used to control the speed of the vehicle
speed_limit = 10

# ***************************************************************************************

# We will define the preprocessing function for the image data 
# Note that the preprocessing has to be the same as that which was used in training
def img_preprocess(img):
    img = img[60:135, :, :]                     # crop away the sky and the car bonnet
    img = cv2.cvtColor(img, cv2.COLOR_RGB2YUV)  # YUV colour space, as NVIDIA's network expects
    img = cv2.GaussianBlur(img, (3, 3), 0)      # light blur to reduce noise
    img = cv2.resize(img, (200, 66))            # resize to the network input size
    img = img/255                               # normalize pixel values to [0, 1]
    return img

# ***************************************************************************************

# Function that emits the steering angle and throttle to the simulator.
# It emits a "steer" event with the steering and throttle values.
# We do not predict the throttle, but the simulator expects one, so we send it as well.
def send_control(steering_angle, throttle):
    sio.emit('steer', data={
        'steering_angle': str(steering_angle),
        'throttle': str(throttle)
    })

# ***************************************************************************************

# This is a SocketIO event handler that listens for a connect event
# When the client connects to the server the connect function is executed
@sio.on('connect')
def connect(sid, environ):
    print('Connected')
    # start with neutral steering and zero throttle
    send_control(0, 0)

# ***************************************************************************************

@sio.on('telemetry')
def telemetry(sid,data):

    speed = float(data['speed'])

    # Read the bytes of image data received using BytesIO
    # Decode the base64 format of the image 
    # Use the Image module of PIL to read the image matrix
    image = Image.open(BytesIO(base64.b64decode(data['image'])))
    # convert the image to a numpy array from PIL format
    image = np.asarray(image)
    # Apply the image preprocessing on the image
    image = img_preprocess(image)
    # Here we add an extra dimension to the image 
    # This essentially creates a batch of one image (represents batch size)
    # example if image has shape (h,w,chan) then np.array([img]) => (1,h,w,chan)
    # Here 1 is the batch size
    image = np.array([image])

    # the model returns a (1, 1) array; extract the scalar steering angle
    steering_angle = float(model.predict(image)[0][0])
    # simple throttle rule: ease off the throttle as speed approaches speed_limit
    throttle = 1.0 - speed/speed_limit

    print('steering: {} throttle: {} speed: {}'.format(steering_angle, throttle, speed))
    # send the control with the steering_angle and the throttle
    send_control(steering_angle, throttle)



# ***************************************************************************************



if __name__ == "__main__":
    # Load the model
    model = load_model("model/model.h5")
    # call socketio, this combines both socketio server and flask app to work together
    app = socketio.Middleware(sio,app)
    # This launches the Flask server using the eventlet web server
    # It listens on port 4567 for incoming connections from the simulator
    eventlet.wsgi.server(eventlet.listen(('',4567)),app)


# ***************************************************************************************

SUMMARY

  • This article is best read alongside the code implementation on Git.
  • We use only the images and the steering angle in this project.
  • Alternative simulators, such as CARLA, offer dynamic environments featuring pedestrians, traffic signals, and various traffic signs.
  • There are several avenues for enhancing this model, including :
    • Hyper-parameter tuning (learning rate, number of epochs, batch size, and so on)
    • Data augmentation and other preprocessing techniques
    • Trying a different CNN architecture
    • Refining the layers of the existing model, for example by integrating BatchNormalization layers and replacing pooling operations with strided convolutions, to improve performance and efficiency (see the sketch after this list)
  • Incorporating a dynamic environment would necessitate adopting mixed modeling approaches to accommodate additional numerical and categorical data. This is an aspect I intend to explore in the next iteration of this project.
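
As a hypothetical illustration of that refinement (not part of the current implementation), a convolution block of the model could be rewritten with a strided convolution and BatchNormalization like this:

from keras.layers import Conv2D, BatchNormalization, Activation

# Hypothetical refinement: a strided convolution replaces pooling, with
# BatchNormalization inserted between the convolution and its activation
def refined_block(model, filters):
    model.add(Conv2D(filters, (3, 3), strides=(2, 2), padding='same'))
    model.add(BatchNormalization())
    model.add(Activation('elu'))
    return model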
Complete CNN architecture [credits: Cloning Safe Driving Behavior for Self-Driving Cars using CNN, link]
