Here is Part 1 of How to build Image classifier Robot using Raspberry Pi, with Deep Learning

Image Classifier Robot

Image classifier Robot Part 2 : Python code

In Part 2 of building the Image Classifier Robot, we start digging into the Python code, beginning with image classification. You can write code on your local machine and use git clone / push / pull to move it to the Pi, or you can use SSH or VNC and write code on the Pi directly. If you work from the Pi command line, nano is my favourite editor.

$ sudo nano

Now we import the necessary packages. The Keras library provides pre-trained models in its ‘applications’ module. The ‘subprocess’ library is used to run command-line programs from a Python file. We use it to run pico2wave and omxplayer for the robot’s speech.

# import the necessary packages

from keras.applications import ResNet50
import tensorflow as tf

#Uncomment in case you would like to try another pretrained model
#from keras.applications import InceptionV3
#from keras.applications import Xception # TensorFlow ONLY
#from keras.applications import VGG16
#from keras.applications import VGG19
#from keras.applications import inception_resnet_v2 

from keras.applications import imagenet_utils
from keras.preprocessing.image import img_to_array
from keras.preprocessing.image import load_img
import numpy as np
import h5py
import subprocess

Time Decorator

It’s easy to check the run time of your code by using a time decorator. Just put @fn_timer above any function you would like to time.

#Declare Time Decorator
import time
from functools import wraps

def fn_timer(function):
  @wraps(function) # keep the wrapped function's name and docstring
  def function_timer(*args, **kwargs):
    t0 = time.time()
    result = function(*args, **kwargs)
    t1 = time.time()
    print ("Total time running : %s seconds" % (str(t1-t0)))
    return result
  return function_timer
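As a quick check of how the decorator is applied, here is a minimal sketch. The function slow_sum is a hypothetical stand-in for a slow call such as classify(), and the printed message is slightly extended to include the function name.

```python
import time
from functools import wraps

def fn_timer(function):
  @wraps(function)  # keep the wrapped function's name and docstring
  def function_timer(*args, **kwargs):
    t0 = time.time()
    result = function(*args, **kwargs)
    t1 = time.time()
    print("Total time running %s : %s seconds" % (function.__name__, str(t1 - t0)))
    return result
  return function_timer

@fn_timer
def slow_sum(n):
  # Hypothetical stand-in for a slow function; the sleep mimics real work
  time.sleep(0.1)
  return sum(range(n))

total = slow_sum(1000)  # prints the elapsed time, then returns the result
print(total)
```

The decorated function still returns its normal result, so you can add or remove @fn_timer without touching the rest of the code.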


Create model

We start by defining our model with weights pre-trained on ImageNet.

model = ResNet50(weights="imagenet")

Here, we declare our classify() function. ResNet50 takes an input image of 224×224 pixels. This may vary if you use a different model, but most of them take 224×224 or 299×299.

Here is the trick: you should declare the model outside the classify() function. If you put it inside the function, the model is rebuilt on every call. Every time you call the function, you’ll have to wait for the model (or graph, in TensorFlow) to be created, and that may cost a minute on a Raspberry Pi. So we create the model just once, when this file is imported into our main file. It loads into memory and is good to go. You can try it yourself and see how much slower it is if you put the model inside classify().
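To see the difference without waiting for a real network to load, here is a small sketch. Everything in it is a stand-in: load_model() is a hypothetical replacement for ResNet50(weights="imagenet") that just sleeps briefly and counts how many times it was called.

```python
import time

load_count = 0  # how many times the (fake) model has been built

def load_model():
  # Hypothetical stand-in for ResNet50(weights="imagenet"); the sleep mimics slow loading
  global load_count
  load_count += 1
  time.sleep(0.05)
  return lambda image: "label-for-" + image

# Built once, when this file is imported
model = load_model()

def classify(image):
  # Reuses the module-level model on every call
  return model(image)

def classify_slow(image):
  # Rebuilds the model on every call, which is what we want to avoid
  local_model = load_model()
  return local_model(image)

for name in ["a.jpg", "b.jpg", "c.jpg"]:
  classify(name)
loads_fast = load_count  # only the single import-time load

for name in ["a.jpg", "b.jpg", "c.jpg"]:
  classify_slow(name)
loads_slow = load_count - loads_fast  # one load per call

print(loads_fast, loads_slow)
```

With three calls each, the module-level version pays the loading cost once, while the in-function version pays it three times.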

def classify(): 
  inputShape = (224, 224)
  preprocess = imagenet_utils.preprocess_input
  image_path = '/dev/shm/mjpeg/cam.jpg'
  image = load_img(image_path, target_size=inputShape)
  image = img_to_array(image)
  image = np.expand_dims(image, axis=0)
  image = preprocess(image)

The image path used by Rpi-Webcam_interface is '/dev/shm/mjpeg/cam.jpg', but you can change it to any other path, for example if you want to run classify() on a static picture file.

Then we classify the image by calling model.predict and decode the result.

  preds = model.predict(image)
  P = imagenet_utils.decode_predictions(preds)

Then we loop over the predictions and display the rank-5 predictions and probabilities in our terminal.

  for (i, (imagenetID, label, prob)) in enumerate(P[0]):
    output = ("{}. {}: {:.2f}%".format(i + 1, label, prob * 100))
    print(output)
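If you want to see what this loop produces before running the model, here is a sketch with made-up data. The list P below only imitates the shape of the decoded predictions (one list per image, each entry an ID, a label, and a probability); the IDs and numbers are invented for illustration.

```python
# Made-up values shaped like the decoded predictions for a single image
P = [[
  ("n00000001", "tabby", 0.62),
  ("n00000002", "tiger_cat", 0.21),
  ("n00000003", "Egyptian_cat", 0.09),
  ("n00000004", "lynx", 0.05),
  ("n00000005", "carton", 0.03),
]]

def format_predictions(P):
  # Build one "rank. label: probability%" line per prediction of the first image
  lines = []
  for (i, (imagenetID, label, prob)) in enumerate(P[0]):
    lines.append("{}. {}: {:.2f}%".format(i + 1, label, prob * 100))
  return lines

lines = format_predictions(P)
for line in lines:
  print(line)

# The top-ranked label is the second field of the first entry
winner = P[0][0][1]
print(winner)
```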

Now our classify() function is nearly finished. You can try running your code by changing image_path to point to any image file downloaded from the internet. To download an image file, search for it on Google and use

wget http://(file URL location)

Adding Speech

Now we will give our Image Classifier Robot a voice. I got the idea for the phrase “I’m thinking…” from Lukas’s blog. We will use our thinking.wav file from the previous part. Feel free to change the words to anything you want. We play it using the ‘subprocess’ library; just put the call above the prediction line.

  # classify the image
  print("I'm thinking...")
  subprocess.call(['omxplayer','think.wav'])
  preds = model.predict(image)
  P = imagenet_utils.decode_predictions(preds)

We also want the robot to say which class it thinks the image most likely is, so we take the winner from the object ‘P’, convert it to a temporary wav file, and let the robot speak.

  winner = P[0][0][1]
  speak = ("I think I see {}".format(winner))
  subprocess.call(['pico2wave','-w','/tmp/win.wav',speak])
  subprocess.call(['omxplayer','/tmp/win.wav'])
  subprocess.call(['rm','/tmp/win.wav'])
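The synthesize-then-play steps can also be wrapped in one helper, so the robot can say any sentence. This is only a sketch: build_speech_commands() and say() are hypothetical names, and say() skips speech when pico2wave or omxplayer is not installed instead of crashing. It deletes the temporary file with os.remove rather than calling rm.

```python
import os
import shutil
import subprocess

def build_speech_commands(text, wav_path="/tmp/win.wav"):
  # The two command lines: synthesize the text to a wav file, then play it
  return [
    ["pico2wave", "-w", wav_path, text],
    ["omxplayer", wav_path],
  ]

def say(text, wav_path="/tmp/win.wav"):
  # Hypothetical helper; returns False if a required tool is missing
  commands = build_speech_commands(text, wav_path)
  for command in commands:
    if shutil.which(command[0]) is None:
      print("Skipping speech, {} not found".format(command[0]))
      return False
  try:
    for command in commands:
      subprocess.call(command)
  finally:
    # Remove the temporary wav file even if playback failed
    if os.path.exists(wav_path):
      os.remove(wav_path)
  return True

commands = build_speech_commands("I think I see tabby")
print(commands)
```

Inside classify() you could then call say("I think I see {}".format(winner)) in place of the separate subprocess lines.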

OK, that’s very cool! Our robot can now see and classify images.

Bring motors to life

Now it’s time to get our Image Classifier Robot moving like a car. We’ll use the ‘RPi.GPIO’ library. This package provides a class to control the GPIO on a Raspberry Pi. Basically, it tells the Pi to send a signal to any GPIO channel we want to use. In this case, the channels drive a motor and a sonar. We’ll keep the sonar for the next post.

First, we create a new file ‘’ , then import the necessary packages.

import RPi.GPIO as GPIO
from time import sleep
import sys
import signal
import random
import tkinter as tk

Tkinter is Python’s standard GUI library, and it provides an event loop. We use it here to stand by and receive keypress commands telling the robot which direction to move.

To use RPi.GPIO, we start by setting the GPIO mode with setmode. I like to use BCM, which refers to the GPIO channel numbers, but you can use the physical pin numbers instead by setting the mode to GPIO.BOARD.

GPIO Raspberry Pi

Figure 8 : GPIO map on RaspberryPi 2 and 3 (source)
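To make the difference between the two numbering schemes concrete, here is a small lookup sketch. It covers only four channels of the 40-pin header (the PWM-capable ones), and the helper name to_board() is my own, not part of RPi.GPIO.

```python
# Partial map from BCM (GPIO channel) numbers to physical BOARD pin numbers
# on the 40-pin header; only a few channels are listed here
BCM_TO_BOARD = {
  12: 32,
  13: 33,
  18: 12,
  19: 35,
}

def to_board(bcm_number):
  # Hypothetical helper: look up the physical pin for a BCM channel
  if bcm_number not in BCM_TO_BOARD:
    raise ValueError("BCM {} is not in this partial map".format(bcm_number))
  return BCM_TO_BOARD[bcm_number]

for bcm in (12, 13):
  print("GPIO.BCM {} is GPIO.BOARD pin {}".format(bcm, to_board(bcm)))
```

So a channel you call 12 in BCM mode would be pin 32 if you switched the mode to GPIO.BOARD.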

Then we set up the GPIO/pin number as an output and set PWM for it. PWM stands for Pulse Width Modulation. You can set the frequency and duty cycle by using GPIO.PWM. has a very good post explanation of it. The code will look like this.

GPIO.setmode(GPIO.BCM) # GPIO numbering
GPIO.setup(Your_GPIO_number, GPIO.OUT) # set pin as output
p = GPIO.PWM(Your_GPIO_number, 100) # set Frequency to 100

p.start(100) # start at duty cycle 100
sleep(1) # let it run for 1 second 

GPIO.cleanup() # Free the GPIO pin after use
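If frequency and duty cycle are new to you, the arithmetic is simple: the frequency fixes the length of one cycle, and the duty cycle is the percentage of that cycle the pin stays high. Here is a small sketch of the calculation; pwm_timing() is a helper name I made up for illustration and it does not touch the GPIO at all.

```python
def pwm_timing(frequency_hz, duty_cycle_percent):
  # One PWM period in milliseconds, split into high (on) and low (off) time
  period_ms = 1000.0 / frequency_hz
  on_ms = period_ms * duty_cycle_percent / 100.0
  off_ms = period_ms - on_ms
  return period_ms, on_ms, off_ms

# At 100 Hz each cycle lasts 10 ms
for duty in (100, 50, 25):
  period_ms, on_ms, off_ms = pwm_timing(100, duty)
  print("duty {}%: on {} ms, off {} ms of every {} ms".format(duty, on_ms, off_ms, period_ms))
```

A lower duty cycle means the motor gets power for a smaller share of each cycle, so it turns more slowly.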

The logic for controlling the motor differs depending on the motor control board. You can look in the board’s manual or on the manufacturer’s website. Then we write a function for each moving direction. This is an example from my car, which uses Cytron’s motor hat. You can look at my full code on GitHub.

def forward(tf):
  print ("Forward")
  GPIO.output(12, 100)
  GPIO.output(13, 100)
  GPIO.output(12, GPIO.LOW)

Sentdex has made a wonderful step-by-step video on YouTube that is easy to follow.


Keyboard input to command the direction

Our Image Classifier Robot needs a controller now, so we start by declaring a keyboard input function. I also import our classify() function, so we can tell the robot when to do the image classification task.

import classify as cs

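def key_input(event):
  key_press = event.keysym.lower() # convert to lower-case letters
  sleep_time = 0.20
  if key_press == 'w':
    forward(sleep_time) # assumed call; the original body is missing here
  elif key_press == 's':
    pass # call your reverse function here
  elif key_press == 'a':
    pass # call your turn-left function here
  elif key_press == 'd':
    pass # call your turn-right function here
  elif key_press == 'q': # Just stop the car
    pass # call your stop function here
  elif key_press == 'z': # Stop the car and EXIT the program
    pass # stop the motors, clean up the GPIO and exit here
  elif key_press == 'space': # Doing image classification
    cs.classify()
  else:
    print('Wrong key press')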
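An alternative to a long if/elif chain is a dictionary that maps each key to a function. The sketch below can be tried without a robot or a window: the handlers are hypothetical stand-ins that only record their name, and the events are faked with SimpleNamespace objects carrying the same keysym attribute that Tkinter passes.

```python
from types import SimpleNamespace

actions_taken = []  # records what the stand-in handlers did

def make_handler(name):
  # Hypothetical stand-in for a real motor or classify function
  def handler():
    actions_taken.append(name)
  return handler

KEY_ACTIONS = {
  'w': make_handler("forward"),
  's': make_handler("reverse"),
  'a': make_handler("left"),
  'd': make_handler("right"),
  'q': make_handler("stop"),
  'space': make_handler("classify"),
}

def key_input(event):
  key_press = event.keysym.lower()  # convert to lower-case letters
  action = KEY_ACTIONS.get(key_press)
  if action is None:
    print('Wrong key press')
    return False
  action()
  return True

# Fake key events: an upper-case W, the space bar, and an unmapped key
for key in ["W", "space", "x"]:
  key_input(SimpleNamespace(keysym=key))

print(actions_taken)
```

Adding a new command then only takes one more entry in KEY_ACTIONS.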

Lastly, we use Tkinter to run the program loop and connect it to key_input.

command = tk.Tk()
command.bind_all('<Key>', key_input) # '<Key>' fires on any key press
command.mainloop() # start the event loop


Now you can test and play with your Image Classifier Robot. The next post is the last one (optional), where we’ll attach the sonar system!