Raspberry Pi Hardware Accelerated RTSP Camera

Raspberry Pis are wonderful little computers, but sometimes they lack the oomph to get stuff done. That may change with the new Raspberry Pi 4, but what to do with all those old ones? Or how about that pile of old webcams? Well, this article will help turn all of those into a full-on security system. (You can also use a Raspberry Pi camera if you've got one!)

Other posts I have read on this subject often only use motion to capture detection events locally. Sometimes they go a bit further and set the Raspberry Pi to stream MJPEG as an IP camera. Or they set up MotionEyeOS and turn it into a standalone video surveillance system.

With our IP camera, we are going to take it further and encode the video stream locally. Then we will send it over the network via RTSP. This will save huge amounts of bandwidth! It also means the client does not have to re-encode the stream before saving, which distributes the work. That way we can also hook it into a larger security suite without draining any of its resources. In this case I will use Blue Iris.

Now, the first thing I am going to do is discourage you. If you don’t already have a Pi and a webcam or Pi camera for the cause, don’t run out to buy them just for this. It’s just not economical. A 1080p WiFi camera that has ONVIF capabilities can be had for less than $50. So why do this at all? Well, because 1.) it’s all under your control with no worry about Chinaware, 2.) if you’ve already got the equipment, it’s another free security eye, and 3.) it’s fun.

Update: If you’re just looking for results, check out my helper script that does all the work for you!

wget https://raw.githubusercontent.com/cdgriffith/pi_streaming_setup/master/streaming_setup.py
sudo python3 streaming_setup.py --rtsp

Standard Raspbian setup

Not going to go into too much detail here. If you haven’t already, download Raspbian and get it onto an SD card. (I used Raspbian Buster for this tutorial.) If you aren’t going to connect a display and keyboard to it, make sure to add an empty file named ssh to the root of the boot (SD card) drive. That way you can just SSH to the Raspberry Pi via the command line or PuTTY on Windows.

# Default settings
host: raspberrypi 
username: pi
password: raspberry

Remember to run sudo raspi-config, change your password and don’t forget to set up wifi, then reboot. Also, good idea to update the system before continuing.

sudo apt update --fix-missing
sudo apt upgrade -y 
sudo reboot 

Install a node rtsp server

To start with, we need a place for ffmpeg to connect to for the RTSP connection. Most security systems expect to connect to an RTSP server, instead of listening as a server themselves, so we need a middleman.

There are a lot of RTSP server options out there. I wanted to go with a lightweight one that is easy to install and that we can just run on the Pi itself. This is what I run at my own house, so don’t think I’m skimping out for this post 😉

UPDATE: I have stopped using the below server, and instead use rtsp-simple-server, which has ARM builds pre-compiled. This is what is used with the helper script. (Not because the Node based one gave me issues, just that this other one is much more lightweight and easy to install.)

First off, we need to install Node JS. The easiest way I have found is to use the pre-created scripts to add the proper package links to the apt system for us.

If you are on an ARMv6 based system, such as the Pi Zero, you will need to do just a little extra work to install Node. For ARMv7 systems, like anything Raspberry Pi 3 or newer, we will use Node 12. Find out your ARM version by running uname -a and checking whether the string “armv6” or “armv7” appears (it usually shows up as armv6l or armv7l).
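If you would rather not eyeball the uname output, here is a minimal sketch that does the check for you. The function name arm_family is just something I made up for illustration, and it works on the machine string that uname -m prints.

```shell
#!/bin/sh
# arm_family MACHINE: classify a machine string (as printed by `uname -m`)
# into the ARM family this article cares about.
# Hypothetical helper for illustration, not part of any existing tool.
arm_family() {
    case "$1" in
        armv6*) echo "armv6" ;;   # e.g. the Pi Zero
        armv7*) echo "armv7" ;;   # e.g. a Pi 3 on 32-bit Raspbian
        *)      echo "other" ;;   # anything else (aarch64, x86_64, ...)
    esac
}

# Report the family of whatever machine this runs on.
arm_family "$(uname -m)"
```

If it prints armv6, plan on the extra Node work mentioned above.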

Now, let’s install Node JS and other needed tools, such as git and coffeescript. If you want to view the script itself before running it, it is available to view here.

curl -sL https://deb.nodesource.com/setup_12.x | sudo -E bash -

sudo apt-get install -y nodejs git
sudo npm install -g coffeescript

Once that is complete, we want to download the node rtsp server code and install all its dependencies. Note, I am assuming you are doing this in the root of your home folder, which we will later use as the base directory for the service.

cd ~ 
git clone https://github.com/iizukanao/node-rtsp-rtmp-server.git --depth 1
cd node-rtsp-rtmp-server
npm install -d

Now you should be good to go, you can test it out by running:

sudo coffee server.coffee

It takes about 60 seconds or more to start up, so give it a minute before you expect to see any text. Example output is below.

2019-12-16 14:24:18.465 attachRecordedDir: dir=file app=file
(node:6812) [DEP0005] DeprecationWarning: Buffer() is deprecated ...
2019-12-16 14:24:18.683 [rtmp] server started on port 1935
2019-12-16 14:24:18.691 [rtsp/http/rtmpt] server started on port 80

Simply make sure it starts up, and then you can stop it by hitting Ctrl+C. At this point you can also go into the server.coffee file and edit it to your heart’s content; however, I keep it standard myself.

Create rtsp server service

You probably want this to always start on boot, so let’s add it as a systemd service. Copy and paste the following code into /etc/systemd/system/rtsp_server.service.

# /etc/systemd/system/rtsp_server.service

[Unit]
Description=rtsp_server
After=network.target rc-local.service

[Service]
Restart=always
WorkingDirectory=/home/pi/node-rtsp-rtmp-server
ExecStart=coffee server.coffee

[Install]
WantedBy=multi-user.target

Now we can start it up via the service, and enable it to start on boot.

sudo systemctl start rtsp_server
# Can make sure it works with sudo systemctl status rtsp_server
sudo systemctl enable rtsp_server

Compile FFMPEG with Hardware Acceleration

If you are just using the raspberry pi camera, or another one with h264 or h265 built in support, you can use the distribution version of ffmpeg instead.

UPDATE: The built in FFmpeg now has hardware acceleration built in, so you can skip the compilation, or use my helper script to compile it for you with a lot of extras.

This is going to take a while to make. I suggest reading a good blog post or watching some Red vs Blue while it builds. This guide is just small modifications from another one. We are also adding libfreetype font package so we can add text (like a datetime) to the video stream, as well as the default libx264 so that we can use it with the Pi Camera if you have one.

sudo apt-get install libomxil-bellagio-dev libfreetype6-dev libmp3lame-dev checkinstall libx264-dev fonts-freefont-ttf libasound2-dev -y
cd ~
git clone https://github.com/FFmpeg/FFmpeg.git --depth 1
cd FFmpeg
sudo ./configure --arch=armel --extra-libs="-lpthread -lm" --extra-ldflags="-latomic" --target-os=linux --enable-gpl --enable-omx --enable-omx-rpi --enable-nonfree --enable-libfreetype --enable-libx264 --enable-libmp3lame --enable-mmal --enable-indev=alsa --enable-outdev=alsa

# For old hardware / Pi zero remove the `-j4` 
sudo make -j4

When that is finally done, run the steps below that will install it. We take the additional precaution of turning it into a standard system package and hold it so we don’t overwrite our ffmpeg version.

sudo checkinstall --pkgname=ffmpeg -y 
sudo apt-mark hold ffmpeg 
echo "ffmpeg hold" | sudo dpkg --set-selections

Figure out your camera details

If you haven’t already, plug the webcam into the raspberry pi. Then we are going to use video4linux2 to discover what it’s capable of.

v4l2-ctl --list-devices

Mine lists my webcam and two paths it’s located at. Sometimes a camera will have multiple devices for different types of formats it supports, so it’s a good idea to check each one out.

Microsoft® LifeCam Cinema(TM): (usb-3f980000.usb-1.2):
        /dev/video0
        /dev/video1

Now we need to see what resolutions and FPS it can handle. Be warned, MJPEG streams are much more taxing to encode than some of their counterparts. In this example we are going to specifically try to find YUYV 4:2:2 streams, as they are a lot easier to encode. (Unless you see h264, then use that!)

In my small testing group, MJPEG streams averaged only 70% of the FPS of the YUYV, while running the CPU up to 60%. Comparatively, YUYV encoding only took 20% of the CPU usage on average.

v4l2-ctl -d /dev/video0 --list-formats-ext

This pumps out a lot of info. Basically you want to find the subset under YUYV and figure out which resolution and fps you want. Here is an example of some of the ones my webcam supports.

ioctl: VIDIOC_ENUM_FMT
        Type: Video Capture

        [0]: 'YUYV' (YUYV 4:2:2)
                Size: Discrete 640x480
                        Interval: Discrete 0.033s (30.000 fps)
                        Interval: Discrete 0.050s (20.000 fps)
                        Interval: Discrete 0.067s (15.000 fps)
                        Interval: Discrete 0.100s (10.000 fps)
                        Interval: Discrete 0.133s (7.500 fps)
                Size: Discrete 1280x720
                        Interval: Discrete 0.100s (10.000 fps)
                        Interval: Discrete 0.133s (7.500 fps)
                Size: Discrete 960x544
                        Interval: Discrete 0.067s (15.000 fps)
                        Interval: Discrete 0.100s (10.000 fps)
                        Interval: Discrete 0.133s (7.500 fps)
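That listing gets long on cameras with many modes, so here is a rough sketch that boils it down to one resolution and fps per line for a single pixel format. The list_modes name is my own invention, and it assumes the output looks like the example above (a bracketed format header, then Size and Interval lines).

```shell
#!/bin/sh
# list_modes FORMAT: read `v4l2-ctl --list-formats-ext` output on stdin and
# print one "WIDTHxHEIGHT FPS" line for every mode of the given pixel format.
# Hypothetical helper; assumes the layout shown in the example listing.
list_modes() {
    awk -v want="$1" '
        # Format header line such as: [0]: YUYV in single quotes, then a description
        /^[ \t]*\[[0-9]+\]:/ { in_fmt = (index($0, "\047" want "\047") > 0) }
        # Remember the most recent resolution inside the wanted format
        in_fmt && /Size: Discrete/ { size = $3 }
        # Each interval line carries the fps in its fourth field, e.g. (30.000
        in_fmt && /Interval: Discrete/ {
            fps = $4
            sub(/^\(/, "", fps)
            print size, fps
        }
    '
}

# Usage on the Pi (needs a camera attached, so it is left commented out):
# v4l2-ctl -d /dev/video0 --list-formats-ext | list_modes YUYV
```

Swap YUYV for whatever format name your camera reports if you want to compare the options side by side.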

I am going to be using the max resolution of 1280×720 and the highest fps of 10. Now if it looks perfect as is, you can skip to the next section. Though if you need to tweak the brightness, contrast or other camera settings, read on.

Image tweaks

Let’s figure out what settings we can play with on the camera.

v4l2-ctl -d /dev/video0 --all
brightness (int)                : min=30 max=255 step=1 default=-8193 value=135
contrast (int)                  : min=0 max=10 step=1 default=57343 value=5
saturation (int)                : min=0 max=200 step=1 default=57343 value=100
power_line_frequency (menu)     : min=0 max=2 default=2 value=2
sharpness (int)                 : min=0 max=50 step=1 default=57343 value=27
backlight_compensation (int)    : min=0 max=10 step=1 default=57343 value=0
exposure_auto (menu)            : min=0 max=3 default=0 value=3
exposure_absolute (int)         : min=5 max=20000 step=1 default=156 value=156
pan_absolute (int)              : min=-201600 max=201600 step=3600 default=0 
tilt_absolute (int)             : min=-201600 max=201600 step=3600 default=0 
focus_absolute (int)            : min=0 max=40 step=1 default=57343 value=12
focus_auto (bool)               : default=0 value=0
zoom_absolute (int)             : min=0 max=10 step=1 default=57343 value=0

Plenty of options, excellent. Now, if you don’t have a method to look at the camera display just yet, come back to this part after you have the live stream going. You can change these settings while it is going thankfully.

v4l2-ctl -d /dev/video0 --set-ctrl <setting>=<value>
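Before setting a value, it can be worth checking it against the min and max the camera reported. Below is a small sketch of that check; ctrl_field and ctrl_in_range are names I made up, and they only work on control lines shaped like the ones in the listing above.

```shell
#!/bin/sh
# ctrl_field LINE NAME: pull the integer after " NAME=" out of one control
# line from the v4l2-ctl listing. Hypothetical helper for illustration.
ctrl_field() {
    printf '%s\n' "$1" | sed -n 's/.* '"$2"'=\(-\{0,1\}[0-9][0-9]*\).*/\1/p'
}

# ctrl_in_range LINE VALUE: succeed only if VALUE lies within min..max.
ctrl_in_range() {
    min=$(ctrl_field "$1" min)
    max=$(ctrl_field "$1" max)
    [ -n "$min" ] && [ -n "$max" ] && [ "$2" -ge "$min" ] && [ "$2" -le "$max" ]
}

# Example using the brightness line from my camera.
line='brightness (int)                : min=30 max=255 step=1 default=-8193 value=135'
if ctrl_in_range "$line" 135; then
    echo "135 is a valid brightness"
fi
```

On a live system you could feed it a line grepped out of the v4l2-ctl --all output instead of the hard-coded example.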

The main problems I had with my camera were that it was a little dark and liked to auto-focus every 5~10 seconds. So I added the following lines to my rc.local file, but there are various ways to run commands on startup.

#  I added these lines right before the exit 0

# dirty hack to make sure v4l2 has time to initialize the cameras
sleep 10 

v4l2-ctl -d /dev/video0 --set-ctrl focus_auto=0
v4l2-ctl -d /dev/video0 --set-ctrl focus_absolute=12
v4l2-ctl -d /dev/video0 --set-ctrl brightness=135

Now onto the fun stuff!

Real Time Encoding

Now we are going to use ffmpeg’s hardware accelerated encoder, h264_omx, to encode the webcam stream. That is, unless you happen to be using a camera that already supports h264, like the built-in Raspberry Pi camera. If you are lucky enough to have one, you can just copy the output directly to the rtsp stream.

# Only for cameras that support h264 natively!
ffmpeg -input_format h264 -f video4linux2 -video_size 1920x1080 -framerate 30 -i /dev/video0 -c:v copy -an -f rtsp rtsp://localhost:80/live/stream

If at any point you receive the error ioctl(VIDIOC_STREAMON) failure : 1, Operation not permitted, go into raspi-config and up the video memory (memory split) to 256 and reboot.

In the code below, make sure to change the -s 1280x720 to your video resolution (can also use -video_size instead of -s) and both -r 10 occurrences to your frame rate (can also use -framerate).

ffmpeg -input_format yuyv422 -f video4linux2 -s 1280x720 -r 10 -i /dev/video0 -c:v h264_omx -r 10 -b:v 2M -an -f rtsp rtsp://localhost:80/live/stream

So let’s break this down. The first part tells ffmpeg what to expect from your device, -i /dev/video0, which means all those arguments must go before the declaration of the device itself.

-input_format yuyv422 -f video4linux2 -s <your resolution> -r <your framerate>

We are making clear that we only want the yuyv format, as it is the best available for my two cameras; yours may be different. Then we specify what resolution and fps we want it at. Be warned that if you set one of them wrong, it may seem like it works (it still encodes), but it will print a message like one of these to look out for:

[video4linux2] The V4L2 driver changed the video from 1280x8000 to 1280x800
[video4linux2] The driver changed the time per frame from 1/30 to 1/10
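Those messages are easy to miss in a busy log, so here is a tiny sketch that filters for them. The driver_overrides name is made up, and the pattern is based only on the two example messages above, so other drivers or ffmpeg versions may word things differently.

```shell
#!/bin/sh
# driver_overrides: read ffmpeg log text on stdin and print only the lines
# where the V4L2 driver replaced a requested resolution or frame interval.
# Hypothetical helper; the pattern only covers the two messages shown above.
driver_overrides() {
    grep -E 'driver changed the (video|time per frame) from'
}

# Demo with the two sample messages plus one unrelated line.
printf '%s\n' \
    '[video4linux2] The V4L2 driver changed the video from 1280x8000 to 1280x800' \
    '[video4linux2] The driver changed the time per frame from 1/30 to 1/10' \
    'some unrelated log line' | driver_overrides
```

ffmpeg writes its log to stderr, so on the Pi you would pipe it in with 2>&1 before the filter.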

The next section is our conversion parameters.

-c:v h264_omx -r <your framerate> -b:v 2M

Here, with -c:v h264_omx, we are saying the video codec to use is h264, with the special omx hardware encoder. We are then telling it what the output frame rate will be as well, -r 10, and specifying the quality with -b:v 2M (aka bitrate), which determines how much bandwidth will be used when transmitting the video. Play around with different settings like -b:v 500k to see where you want it to be. You will need a higher bitrate for higher resolution and framerate, and a lot less for lower resolution.

After that, we are telling it to disable audio with -an for the moment. If you do want audio, there is an optional section below going over how to enable that.

-f rtsp rtsp://localhost:80/live/stream

Finally we are telling it where to send the video, and to send it in the expected rtsp format (RTSP is just the streaming protocol that carries it; the video itself is still h264). Notice that with the rtsp server we can have as many cameras as we want, each with its own sub url, so instead of live/stream at the end it could be live/camera1 and live/camera2.
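If you end up with several cameras, typing that command out each time gets old. Here is a minimal sketch that just prints the command for a given device, resolution, frame rate, bitrate and stream name, so you can review it before running it. The stream_cmd name and the second device path are assumptions for illustration.

```shell
#!/bin/sh
# stream_cmd DEVICE SIZE FPS BITRATE NAME: print (not run) the ffmpeg command
# from this section with the given values filled in.
# Hypothetical helper; it only assembles the flags already explained above.
stream_cmd() {
    dev=$1 size=$2 fps=$3 rate=$4 name=$5
    printf 'ffmpeg -input_format yuyv422 -f video4linux2 -s %s -r %s -i %s -c:v h264_omx -r %s -b:v %s -an -f rtsp rtsp://localhost:80/live/%s\n' \
        "$size" "$fps" "$dev" "$fps" "$rate" "$name"
}

# Two cameras, each with its own sub url (/dev/video2 is a made-up example).
stream_cmd /dev/video0 1280x720 10 2M camera1
stream_cmd /dev/video2 640x480 30 500k camera2
```

Printing instead of executing keeps it harmless to experiment with; paste the line into a terminal once it looks right.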

Adding audio

Optional, not included in my final service script

As most webcams have built-in microphones, it makes it easy to add it to our stream if you want. First we need to identify our audio device.

arecord -l

You should get a list of possible devices back, in this case only my webcam is showing up as expected. If you have more than one, make sure you check out the ffmpeg article on “surviving the reboot” so they don’t get randomly re-ordered.

**** List of CAPTURE Hardware Devices ****
card 1: CinemaTM [Microsoft® LifeCam Cinema(TM)], device 0: USB Audio [USB Audio]
  Subdevices: 0/1
  Subdevice #0: subdevice #0

Notice it says card 1 at the very beginning of the webcam’s line, and specifically device 0. Those are the IDs we are going to use to reference it with ffmpeg. I’m going to show the full command first like before and break it down again.

ffmpeg -input_format yuyv422 -f video4linux2 -s 1280x720 -r 10 -i /dev/video0 -f alsa -ac 1 -ar 44100 -i hw:1,0 -map 0:0 -map 1:0 -c:a aac -b:a 96k -c:v h264_omx -r 10 -b:v 2M -f rtsp -rtsp_transport tcp rtsp://127.0.0.1:80/live/webcam

So to start with, we are adding a new input of type ALSA (Advanced Linux Sound Architecture), -f alsa -i hw:1,0. Because it’s a webcam, which generally only has a single channel of audio (aka mono), it needs -ac 1 passed to it, as by default it tries to interpret it as stereo (-ac 2). If you get the error cannot set channel count to 1 (Invalid argument), that means it probably actually does have stereo, so you can remove it or set it to -ac 2.

Finally, I am setting a custom sampling rate of 44.1kHz, -ar 44100, the same used on CDs. All that giving us the new input of -f alsa -ac 1 -ar 44100 -i hw:1,0 .
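To save squinting at the arecord listing, here is a small sketch that turns it into the hw:card,device form ffmpeg wants. The alsa_ids name is my own, and it assumes the "card N: ..., device M: ..." line layout shown in the example output above.

```shell
#!/bin/sh
# alsa_ids: read `arecord -l` output on stdin and print one "hw:CARD,DEVICE"
# identifier per capture device found.
# Hypothetical helper; assumes the line layout shown in the example output.
alsa_ids() {
    sed -n 's/^card \([0-9][0-9]*\):.*, device \([0-9][0-9]*\):.*/hw:\1,\2/p'
}

# Usage on the Pi (needs ALSA tools installed, so it is left commented out):
# arecord -l | alsa_ids
```

With my webcam as the only capture device, that should come back as the same hw:1,0 used in the command above.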

Next we do a custom mapping to make sure our output streams are set up as we expect. Now ffmpeg is usually pretty good about doing this by default if we have a single input with video, and a single input with audio, this is really just to make sure that nobody out there has weird issues. -map 0:0 -map 1:0 is saying that we want the first track from the first source 0:0 and the first track from the second source 1:0.

Finally, our encoding for the audio is set up with -c:a aac -b:a 96k, which is saying to use the AAC audio type with a bitrate of 96k. Now this could be a lot higher, as the uncompressed bitrate of this source is about 706k, assuming 16-bit samples (sample rate × bit depth × channels), but I can’t tell the difference past 96k with my mic, which is why I stuck with that.
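The formula (sample rate × bit depth × channels) is easy to check with shell arithmetic. The bit depth is an assumption on my part: 16-bit samples work out to roughly 706k for mono at 44.1kHz, while 8-bit samples would give roughly 353k. The pcm_bitrate name is made up for this sketch.

```shell
#!/bin/sh
# pcm_bitrate RATE BITS CHANNELS: uncompressed audio bitrate in bits per
# second, computed as sample rate x bit depth x channels.
# Hypothetical helper for illustration.
pcm_bitrate() {
    echo $(( $1 * $2 * $3 ))
}

pcm_bitrate 44100 16 1   # 705600 (mono, 16-bit)
pcm_bitrate 44100 8 1    # 352800 (mono, 8-bit)
pcm_bitrate 44100 16 2   # 1411200 (stereo, 16-bit)
```

Either way, 96k of AAC is a small fraction of the raw rate.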

One gotcha with sound, is that if the ffmpeg encoding can’t keep up with the source, aka the fps output isn’t the same as the input, the audio will probably skip weirdly, so you may need to step it down to a lower framerate or resolution if it can’t keep up.

Adding timestamp

Optional, but is included in my service script

This is optional, but I find it handy to directly add the current timestamp to the stream. I also like to have the timestamp in a box so I can always read it in case the background is close to the same color as the font. Here is what we are going to add into the middle of our ffmpeg command.

-vf "drawtext=fontfile=/usr/share/fonts/truetype/freefont/FreeSerif.ttf:text='%{localtime}':x=8:y=8:fontcolor=white: box=1: boxcolor=black"

It’s a lot of text, but pretty self explanatory. We specify which font file to use, drawtext=fontfile=/usr/share/fonts/truetype/freefont/FreeSerif.ttf, and that the text will be the local time (make sure you have set your locale and timezone right!). Next we are going to start the box 8 pixels in and 8 pixels down from the top left corner. Then we set the font’s color, and that it will have a box around it with a different color.

ffmpeg -input_format yuyv422 -f video4linux2 -s 1280x720 -r 10 -i /dev/video0 -c:v h264_omx -r 10 -b:v 2M -vf "drawtext=fontfile=/usr/share/fonts/truetype/freefont/FreeSerif.ttf:text='%{localtime}':x=8:y=8:fontcolor=white: box=1: boxcolor=black" -an -f rtsp rtsp://localhost:80/live/stream

Making it systemd service ready

When running ffmpeg as a service, you probably don’t want to pollute the logs with standard output info. I also had a random issue with it trying to read info from stdin when run as a service, so I also added -nostdin for my own sake. You can add these at the start of the command.

-nostdin -hide_banner -loglevel error

You can hide even more if you want to up it to -loglevel panic, but I personally want to see any errors that come up just in case.

So now our full command is pretty hefty.

ffmpeg -nostdin -hide_banner -loglevel error -input_format yuyv422 -f video4linux2 -s 1280x720 -r 10 -i /dev/video0 -c:v h264_omx -r 10 -b:v 2M -vf "drawtext=fontfile=/usr/share/fonts/truetype/freefont/FreeSerif.ttf:text='%{localtime}':x=8:y=8:fontcolor=white: box=1: boxcolor=black" -an -f rtsp rtsp://localhost:80/live/stream

Our new full command is a lot in one line, but it gets the job done!

Viewing the stream

When you have the stream running, you can pull up VLC or another network enabled media player and point it to rtsp://raspberrypi:80/live/stream (if you changed your hostname, or it doesn’t resolve on your network, use the Pi’s IP address instead).

When you have the command massaged exactly how you want it, we are going to create a systemd file with it, just like we did for the rtsp server. In this case we will save it to the file /etc/systemd/system/encode_webcam.service, and we will also add the argument -nostdin right after ffmpeg for safety.

sudo vi /etc/systemd/system/encode_webcam.service

# /etc/systemd/system/encode_webcam.service

[Unit]
Description=encode_webcam
After=network.target rtsp_server.service rc-local.service

[Service]
Restart=always
RestartSec=20s
User=pi
# Keep ExecStart on a single line (or end each line with a backslash to continue it)
ExecStart=ffmpeg -nostdin -hide_banner -loglevel error -input_format yuyv422 -f video4linux2 -s 1280x720 -r 10 -i /dev/video0 -c:v h264_omx -r 10 -b:v 2M -vf "drawtext=fontfile=/usr/share/fonts/truetype/freefont/FreeSerif.ttf:text='%{localtime}':x=8:y=8:fontcolor=white: box=1: boxcolor=black" -an -f rtsp rtsp://localhost:80/live/stream


[Install]
WantedBy=multi-user.target

Now start it up, and enable it to run on boot.

sudo systemctl start encode_webcam
sudo systemctl enable encode_webcam

Connect it to your security center

I have looked at a few different security suites for my personal needs. They included iSpy (Windows) and ZoneMinder (Linux), but I finally decided upon the industry standard Blue Iris (Windows). I like it because of its feature set: mobile app, motion detection, mobile alerts, NAS and cloud storage, etc. Blue Iris also has a 15 day evaluation period to try before you buy. You don’t even need to register or provide credit info!

For our needs, the best part about Blue Iris is that it supports direct to disk recording. That way we don’t have to re-encode the stream! So let’s get rolling. On the top left, select the first menu and hit “Add new camera”.


It will then have a popup to name and configure the camera, here make sure to select the last option “Direct to disk recording”.

Next it will need the network info for the camera, so put in the same info as you did for VLC. Blue Iris should auto parse it into the fields it wants, then hit OK.

Voilà! Your Raspberry Pi is now added to your security suite!

Now you can have fun setting up recording schedules, motion detection recording, mobile alerts, and more!