
Encoding UHD 4K HDR10 videos with FFmpeg

I talked about this before in my encoding settings for Handbrake post, but there is a fundamental flaw in using Handbrake for HDR 10-bit video: it only has an 8-bit internal pipeline! So while you still get a 10-bit x265 video, you are losing the HDR10 data.

Thankfully, you can avoid that and save the HDR by using FFmpeg instead. (To learn about saving Dolby Vision or HDR10+ by remuxing, skip ahead.) So cutting straight to the chase, here are the two most basic commands you need to save your juicy HDR10 data. This will use the Dolby Vision (profile 8.1) Glass Blowing demo.

Extract the Mastering Display metadata

First, we need to use FFprobe to extract the Mastering Display and Content Light Level metadata. We are going to tell it to only read the first frame's metadata, -read_intervals "%+#1", for the file GlassBlowingUHD.mp4:

ffprobe -hide_banner -loglevel warning -select_streams v -print_format json -show_frames -read_intervals "%+#1" -show_entries "frame=color_space,color_primaries,color_transfer,side_data_list,pix_fmt" -i GlassBlowingUHD.mp4

A quick breakdown of what we are sending ffprobe:

  • -hide_banner -loglevel warning Don’t display what we don’t need
  • -select_streams v We only want the details for the video (v) stream
  • -print_format json Make it easier to parse
  • -read_intervals "%+#1" Only grab data from the first frame
  • -show_entries ... Pick only the relevant data we want
  • -i GlassBlowingUHD.mp4 input (-i) is our Dolby Vision demo file

That will output something like this:

{ "frames": [
        {
            "pix_fmt": "yuv420p10le",
            "color_space": "bt2020nc",
            "color_primaries": "bt2020",
            "color_transfer": "smpte2084",
            "side_data_list": [
                {
                    "side_data_type": "Mastering display metadata",
                    "red_x": "35400/50000",
                    "red_y": "14600/50000",
                    "green_x": "8500/50000",
                    "green_y": "39850/50000",
                    "blue_x": "6550/50000",
                    "blue_y": "2300/50000",
                    "white_point_x": "15635/50000",
                    "white_point_y": "16450/50000",
                    "min_luminance": "50/10000",
                    "max_luminance": "40000000/10000"
                },
                {
                    "side_data_type": "Content light level metadata",
                    "max_content": 0,
                    "max_average": 0
                }
            ]
        }
    ]
}

I chose to output it with json via the -print_format json option to make it more machine parsable, but you can omit that if you just want the text.

We are now going to take all that data and break it down into groups of <color abbreviation>(<x>, <y>), keeping only the number to the left of the slash (/). For example, we combine red_x "35400/50000" and red_y "14600/50000" into R(35400,14600). The one exception is the luminance pair, where x265 wants the max value first and then the min: L(40000000,50).

G(8500,39850)B(6550,2300)R(35400,14600)WP(15635,16450)L(40000000,50)

This data, as well as the Content light level (<max_content>,<max_average>) of 0,0, will be fed into the encoder command options.
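
If you would rather not shuffle those numbers around by hand, the string building can be scripted. Here is a minimal Python sketch of the idea; it assumes the exact ffprobe command and JSON layout shown above, with the demo filename hard-coded, so treat it as a starting point rather than a finished tool.

import json
from subprocess import run, PIPE

# Run the same ffprobe command from above and capture its JSON output
probe = run(
    'ffprobe -hide_banner -loglevel warning -select_streams v '
    '-print_format json -show_frames -read_intervals "%+#1" '
    '-show_entries "frame=color_space,color_primaries,color_transfer,'
    'side_data_list,pix_fmt" -i GlassBlowingUHD.mp4',
    shell=True, stdout=PIPE)

frame = json.loads(probe.stdout)['frames'][0]
side_data = {item['side_data_type']: item for item in frame['side_data_list']}
display = side_data['Mastering display metadata']
cll = side_data.get('Content light level metadata', {})

def left_side(value):
    # Values come back as fractions like "35400/50000"; x265 only wants the left side
    return value.split('/')[0]

master_display = (
    f"G({left_side(display['green_x'])},{left_side(display['green_y'])})"
    f"B({left_side(display['blue_x'])},{left_side(display['blue_y'])})"
    f"R({left_side(display['red_x'])},{left_side(display['red_y'])})"
    f"WP({left_side(display['white_point_x'])},{left_side(display['white_point_y'])})"
    f"L({left_side(display['max_luminance'])},{left_side(display['min_luminance'])})"
)
max_cll = f"{cll.get('max_content', 0)},{cll.get('max_average', 0)}"

print(master_display)  # G(8500,39850)B(6550,2300)R(35400,14600)WP(15635,16450)L(40000000,50)
print(max_cll)         # 0,0

This is essentially the bookkeeping that FastFlix (covered later) does for you.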

Convert the video

This command re-encodes the video while keeping the HDR10 data intact. We will have to pass these arguments not to ffmpeg itself, but to the x265 encoder directly via the -x265-params option. (If you're not familiar with FFmpeg, don't fret. FastFlix, which I talk about later, will do the work for you!)

ffmpeg -i GlassBlowingUHD.mp4 -map 0 -c:v libx265 -x265-params "hdr-opt=1:repeat-headers=1:colorprim=bt2020:transfer=smpte2084:colormatrix=bt2020nc:master-display=G(8500,39850)B(6550,2300)R(35400,14600)WP(15635,16450)L(40000000,50):max-cll=0,0" -crf 20 -preset veryfast GlassBlowingConverted.mkv

Let’s break down what we are throwing into the x265-params:

  • hdr-opt=1 we are telling it yes, we will be using HDR
  • repeat-headers=1 we want these headers on every frame as required
  • colorprim, transfer and colormatrix the same as ffprobe listed
  • master-display this is where we add our color string from above
  • max-cll Content light level data, in our case 0,0

During a conversion like this, when a Dolby Vision layer exists, you will see a lot of messages like [hevc @ 000001f93ece2e00] Skipping NAL unit 62 because there is an entire layer that ffmpeg does not yet know how to decode.

For the quality of the conversion, I was setting it to -crf 20 with a -preset veryfast to convert it quickly without a lot of quality loss. I dig deeper into how FFmpeg handles crf vs preset with regards to quality below.

All the sound, data, and other streams are carried over thanks to the -map 0 option, which is a blanket statement of "include every stream from the first (0 index) input".

That is really all you need to know for the basics of how to encode your video and save the HDR10 data!

FFmpeg conversion settings

I covered this a bit before in the other post, but I wanted to go through the full gauntlet of presets and CRFs one might feel inclined to use. I compared each encoding with the original using VMAF and SSIM calculations over an 11 second clip, creating over 100 conversions for this single chart, so it is a little cramped.

First takeaways are that there is no real difference between veryslow and slower, nor between veryfast and faster, as their lines are drawn on top of each other. The same is true for both VMAF and SSIM scores.

Second, no one in their right mind would ever keep a recording stored by using ultrafast. That is purely for real time streaming use.

Now for VMAF scores, 5~6 points away from the source is visually distinguishable when watching; in other words it will have very noticeable artifacts. Personally I can tell on my screen with just a single digit difference, and some people are even more sensitive, so this is by no means an exact tell-all. At minimum, let's zoom in a bit and get rid of anything that will produce video with very noticeable artifacts.

From these charts, it seems clear that there is no reason to use anything other than slow, which is what I personally do for anything I am encoding. However, slow lives up to its name.

Encoding Speed and Bitrate

I had to trim off veryslow and slower from the main chart to be able to even see the rest, and slow is still almost three times slower than medium. All the charts contain the same data, just with some of the longer running presets removed from each to better see details of the faster presets.

Please note, the first three crf datapoints are a little dirty, as the system was in use for the first three tests. However, there is enough clean data to see how that compares down the line.

To see a clearer picture of how long each preset takes, I will exclude those first three times and average the remaining data. The averages are then compared against the medium (default) preset and the original clip length of eleven seconds. For example, ultrafast averaged 11.204 seconds, so it ran at 37.764 / 11.204 ≈ 3.370x the speed of medium and 11 / 11.204 ≈ 0.982x realtime.

Preset      Time (s)    vs "medium"    vs clip length (11s)
ultrafast   11.204      3.370x         0.982x
superfast   12.175      3.101x         0.903x
veryfast    19.139      1.973x         0.575x
faster      19.169      1.970x         0.574x
fast        22.792      1.657x         0.482x
medium      37.764      1.000x         0.291x
slow        97.755      0.386x         0.112x
slower      315.900     0.120x         0.035x
veryslow    574.580     0.066x         0.019x

What is a little scary here is that even with the "ultrafast" preset we are not able to get realtime conversion, and these tests were run on a fairly high powered system wielding an i9-9900K! While it might be clear from the crf graph that slow is the winner, unless you have a beefy computer, it may be a non-option.

"Use the slowest preset that you have patience for" (FFmpeg encoding guide)

Also, unlike VBR encoding, the average bitrate and file size using crf will differ wildly depending on the source material. This next chart just shows the basic curve effect you will see; it cannot be compared directly to what you may see with your own files.

The two big jumps are between slow and medium as well as veryfast and superfast. That is interesting because while slow and medium are quite far apart on the VMAF comparison, veryfast and superfast are not. I expected a much larger dip from superfast to ultrafast but was wrong.

FastFlix, doing the heavy lifting for you!

I have written a GUI program, FastFlix, around FFmpeg and other tools to convert videos easily to HEVC, AV1 and other formats. While I won't promise it will provide everything you are looking for, it will do the work of extracting the HDR details of a video and passing them into a FFmpeg command for you. It also has a panel that shows you exactly the command(s) it is about to run, so you can copy it and modify it to your heart's content!



Currently I only automatically produce Windows builds, as it was originally designed for SVT-AV1, which also only had Windows builds, but it should work on other systems by cloning the git repo and running it via Python.

If you have any problems with it please help by raising an issue!

Saving Dolby Vision or HDR10+

Note I say "saving" and not "converting", because unless you have the original RPU file for the DV to pass to x265, you're out of luck as of now and cannot convert the video. (If you or anyone you know has figured out a way to parse / use the baked-in RPU info, let me know!) However, the x265 encoder is able to take an RPU file and create a Dolby Vision ready movie, we just don't have anything to extract that (yet).

Thankfully, it is possible to at least convert the audio and change around the streams with remuxers. For example, tsMuxeR (the nightly build, not the default download) is popular for taking mkv files, which most TVs won't recognize HDR in, and remuxing them into ts files so they do. If you also have TrueHD sound tracks, you may need to use eac3to first to break them into TrueHD and AC3 Core tracks before muxing.

Easily Viewing HDR / Video information

Another helpful program to quickly view what type of HDR a video has is MediaInfo. For example here is the original Dolby Vision Glass Blowing video info (some trimmed):

Video
ID                                       : 1
Format                                   : HEVC
Format/Info                              : High Efficiency Video Coding
Format profile                           : Main 10@L5.1@Main
HDR format                               : Dolby Vision, Version 1.0, dvhe.08.09, BL+RPU, HDR10 compatible / SMPTE ST 2086, HDR10 compatible
Codec ID                                 : hev1
Color space                              : YUV
Chroma subsampling                       : 4:2:0 (Type 2)
Bit depth                                : 10 bits
Color range                              : Limited
Color primaries                          : BT.2020
Transfer characteristics                 : PQ
Matrix coefficients                      : BT.2020 non-constant
Mastering display color primaries        : BT.2020
Mastering display luminance              : min: 0.0050 cd/m2, max: 4000 cd/m2
Codec configuration box                  : hvcC+dvvC

And here it is after conversion:

Video
ID                                       : 1
Format                                   : HEVC
Format/Info                              : High Efficiency Video Coding
Format profile                           : Main 10@L5.1@Main
HDR format                               : SMPTE ST 2086, HDR10 compatible
Codec ID                                 : V_MPEGH/ISO/HEVC
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 10 bits
Color range                              : Limited
Color primaries                          : BT.2020
Transfer characteristics                 : PQ
Matrix coefficients                      : BT.2020 non-constant
Mastering display color primaries        : BT.2020
Mastering display luminance              : min: 0.0050 cd/m2, max: 4000 cd/m2

Notice we have lost the Dolby Vision (BL+RPU) information, but at least we retained the HDR10 data, which Handbrake can't do!

That’s a wrap!

Hope you found this information useful, and please feel free to leave a comment for feedback, suggestions or questions!

Until next time, stay safe and love each other!

Top 10ish Python standard library modules

When interviewing Python programming candidates, my wife always likes to ask the simple question, "can you name ten Python standard library modules?" This is harder than most think, as many people will completely blank out and others will be dead wrong. "Requests?" one poor soul answered. It's a good interview question, as it gives insight into what people are familiar with and may use regularly. So I sat down and thought about which ones I use and enjoy the most. Here are my top ten(ish) useful, favorite, and unordered standard modules.

pathlib

Back in the dark days, you would have to store your path as a string, and call obscure functions under os.path to figure anything out about it. Pathlib removes the headache.

from pathlib import Path

my_path = Path('text_file.txt')
if not my_path.exists():
    my_path.write_text('File Content')
assert my_path.exists()
assert my_path.is_file()

Read more at the pathlib python docs.

tempfile

There are a boatload of uses for a temporary file or directory. Hence why it’s in the standard library. I find myself using them together, inside context managers more often than not.

from pathlib import Path
from tempfile import TemporaryDirectory, NamedTemporaryFile


with TemporaryDirectory(prefix='Code_', suffix='_Calamity') as temp_dir:
    # NamedTemporaryFile so the file has a real path on every platform
    with NamedTemporaryFile(dir=temp_dir) as temp_file:
        temp_file.write(b'Test')

        temp_file_path = Path(temp_file.name)
        assert temp_file_path.exists()

# Make sure the file only exists within the context
assert not temp_file_path.exists()

I usually end up using this when a tool or library wants to work with a file rather than standard input, so short lived files in a context manager make life a lot easier. Tempfile python docs.

subprocess

Python is pretty amazing, but sometimes you do need to call other programs. Subprocess makes it easy to execute and interact with other executables across operating systems. Check out my other post on it!

from subprocess import run, PIPE
 
response = run("echo 'Join the Dark Side!'", shell=True, stdout=PIPE)

print(response.stdout.decode('utf-8'))
# 'Join the Dark Side!'

logging

This is probably the most useful built-in library for debugging there is, and I see it either unused or misused more than anything else.

import logging
import sys
 
logger = logging.getLogger(__name__)
my_stream = logging.StreamHandler(stream=sys.stdout)
my_stream.setLevel(logging.DEBUG)
my_stream.setFormatter(
    logging.Formatter("%(asctime)s - %(name)-12s  "
                      "%(levelname)-8s %(message)s"))
logger.addHandler(my_stream)
logger.setLevel(logging.DEBUG)
 
logger.info("We the people")

If you haven't already, go on your first date with python logging! It's also possible to put all the configuration details into a separate ini or json file; learn more from the logging python docs.
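
For example, the same handler setup could live in a dictionary (which could just as easily be loaded from a JSON file) and be applied with logging.config.dictConfig. Here is a rough, unofficial equivalent of the code above, assuming it runs as a script so __name__ is "__main__":

import logging
import logging.config

LOGGING_CONFIG = {
    'version': 1,
    'formatters': {
        'standard': {
            'format': '%(asctime)s - %(name)-12s  %(levelname)-8s %(message)s',
        },
    },
    'handlers': {
        'stdout': {
            'class': 'logging.StreamHandler',
            'stream': 'ext://sys.stdout',
            'level': 'DEBUG',
            'formatter': 'standard',
        },
    },
    'loggers': {
        '__main__': {'handlers': ['stdout'], 'level': 'DEBUG'},
    },
}

logging.config.dictConfig(LOGGING_CONFIG)
logging.getLogger(__name__).info("We the people")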

threading and multiprocessing

Two very different things for widely different uses, but they have very similar interfaces and are easy to talk about at the same time. Quick and dirty difference: use threading for IO-heavy tasks (writing to files, reading websites, etc.) and multiprocessing for CPU-heavy tasks.

from multiprocessing.pool import ThreadPool, Pool

def square_it(x):
    return x*x

# On Windows, make sure that multiprocessing doesn't start
# until after "if __name__ == '__main__'"

# Pool and ThreadPool are interchangeable in this example
with Pool(processes=5) as pool:
    results = pool.map(square_it, [5, 4, 3, 2, 1])

print(results)
# [25, 16, 9, 4, 1]

I did a post on ThreadPools and multiprocessing Pools, as I find them the easiest way to work with threading and multiprocessing in Python.

os and sys

The Python world would not exist if we didn’t have all the power and functionality these built-ins bring. You really haven’t coded in Python if you haven’t used these yet, so I won’t even bother elaborating them here.

random and uuid

Maybe you’re making a game…

import random
random.choice(['Sneak Attack', 'High Kick', 'Low Kick'])

Or debugging a webserver…

from uuid import uuid4

# Bad example, but not writing out a whole webserver to prove a point
def get(*args):
    request = uuid4()
    logger.info(f'Request {request} called with args {args}')

Or turning your webserver into your own type of game…

if user_name == 'My Boss':
    time.sleep(random.randint(1, 5))

No matter which you are doing, it’s always handy to have randomly generated or unique numbers.

socket

The internet runs because of sockets. Modern technology exists because of sockets. Sockets are life, sockets are… annoyingly low level at times, but it's good to know the basics so you can appreciate everything written on top of them.

Thankfully the Python docs have good examples of them in use.
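
If you want the absolute bare minimum to play with, here is a tiny sketch of a local echo exchange using the higher-level helpers (socket.create_server needs Python 3.8+; the loopback address and port are just examples):

import socket

# A listening socket and a client socket talking over loopback in one process
with socket.create_server(('127.0.0.1', 9999)) as server:
    with socket.create_connection(('127.0.0.1', 9999)) as client:
        client.sendall(b'Hello, sockets!')
        connection, address = server.accept()
        with connection:
            print(connection.recv(1024))  # b'Hello, sockets!'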

hashlib

Need to check a file’s integrity? Hashlib is there with md5 and sha hashes! I created a reusable function to easily reference when I need to do it.
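
That function isn't reproduced here, but a minimal version of the idea looks something like this (the names and chunk size are just my choices; it reads the file in blocks so large files don't have to fit in memory):

import hashlib

def file_hash(path, algorithm='sha256', block_size=64 * 1024):
    # Hash the file a chunk at a time so even huge files stay memory friendly
    hasher = hashlib.new(algorithm)
    with open(path, 'rb') as file:
        for block in iter(lambda: file.read(block_size), b''):
            hasher.update(block)
    return hasher.hexdigest()

# file_hash('some_download.iso', 'md5')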

Need to securely store people's password hashes for a website? Hashlib now has scrypt support! Heck, here is my own function I always use to generate the scrypt hashes.

from collections import namedtuple
import hashlib
import os


Hashed = namedtuple('Hashed', ['hash', 'salt', 'n', 'r', 'p', 'key_length'])


def secure_hash(value: bytes, salt: bytes = None, key_length: int = 128, n: int = 2 ** 16, r: int = 8, p: int = 1):
    maxmem = n * r * 2 * key_length
    salt = salt or os.urandom(16)
    hashed = hashlib.scrypt(value, salt=salt, n=n, r=r, p=p, maxmem=maxmem, dklen=key_length)
    return Hashed(hash=hashed.hex(), salt=salt.hex(), n=n, r=r, p=p, key_length=key_length)

venv

You probably only think of it as a command when you run python -m venv python_virtual_env to create your environments, but it's run that way because it is a standard library module. Every new project you start or Python program you install should be using this library, so it is used a lot!
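
Because it is just a module, you can also drive it from Python itself. A tiny sketch using venv.EnvBuilder (the directory name is only an example):

import venv

# Programmatic equivalent of "python -m venv python_virtual_env", with pip included
builder = venv.EnvBuilder(with_pip=True)
builder.create('python_virtual_env')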

Summary

There ya go, 10 or so can't-live-without standard library modules! Isn't it so nice that Python comes "batteries included"?

Introducing FastFlix – AV1 encoder GUI and more!

First, straight to the fun, download FastFlix here to try it out! (Windows only builds right now.) FastFlix started out as a small clip and GIF maker for myself, but I quickly realized I could expand it for larger usage. And it just so happens that AV1 is an emerging codec that doesn't have a lot of GUI options yet, so it's like it was meant to be!

The main GUI

Before going out and re-encoding everything with AV1, which will be the next standard codec as everyone is on board with it, know that there are a few catches, so make sure to read on.

What makes FastFlix unique?

AV1 Support for multiple libraries

FastFlix is designed as a general command wrapper, so it can support multiple different programs. Right now it supports both the libaom-av1 and SVT-AV1 libraries.

Totally MIT open source code, reuse to your heart's content!

Unlike most converters, which are limited to the GPL license thanks to the libx265/libx264 libraries and others, FastFlix has been designed in a way that keeps it legal to use any of its core code in your own projects without forcing them to be open source.

Extensible

FastFlix was designed to have a plug-in architecture. That way anyone can develop or use their own plugins on top of what is already available to bring additional functionality.

What’s the catch?

New program using an experimental codec

There will be bugs, both on the GUI side and on the codec side. Report anything weird you see and we'll try to figure out if it's a GUI problem or something that needs to be passed along to the codec team.

SVT-AV1 makes it difficult to convert videos

Right now SVT-AV1 requires the source input to be broken up into smaller chunks (if it is longer than the segment size) as raw YUV video, which can take up gigabytes of space. They are automatically cleaned up as the encode goes along, but it is still silly that SVT-AV1 cannot take a regular video file as input yet.

I’m just one guy, and this is a side project

I don’t make any money (nor take donations. $10K+ bribes, please email me 😉) and FastFlix is not something I will spend all my free time on. So I am always looking for help and feedback!

Wrapping up

Please give a github star if you like FastFlix and be sure to send your love to SVT-AV1 as well if you find their program useful!

Again, you can download FastFlix on the Github release page.

For those of you interested more in how FastFlix works or was created I hope to do a follow up post that goes into using PySide2 and the full workflow of using Appveyor to deliver releases.

Encoding settings for HDR 4K videos using 10-bit x265

There is currently a serious lack of data on compressing 4K HDR videos out there, so I took it upon myself to get learned in the ways of the x265 encoding world.

First things first, this is NOT a guide for Dolby Vision or HDR10. This is simply for videos using the BT.2020 color primaries. Please read the new article for saving HDR.

I have historically been using the older x264 mp4s for my videos, as they just work on everything. However, most devices finally have some native h.265 decoding. (As a heads up, h.265 is the specification and x265 is the encoder for it. I may mix them up myself in this article; don't worry about the letter, just the numbers.)

Updated: 6/29/2020 – Please refer to the new guide

Updated: 4/14/2019 – New Preset Setting (tl;dr: use slow)

What are the best settings for me to use when encoding x265 videos?

The honest to god true answer is “it depends”, however I find that answer unsuitable for my own needs. I want a setting that I can use on any incoming 4K HDR video I buy.

I used to mainly use Handbrake, but switched to ffmpeg after learning that Handbrake only has an 8-bit internal pipeline. In the past, I went straight to Handbrake's documentation. It states that for 4K videos with x265 they suggest a Constant Rate Factor (CRF) encoding in the range of 22-28 (the larger the number, the lower the quality).

Through some experimentation I found that I personally can never really see a difference between anything lower than 22 using a Slow preset. Therefore I played it safe, bumped it down a notch, and just encode all of my stuff with x265 10-bit at CRF 20 on the Slow preset. That way I know I should never be disappointed.

Then I recently read YouTube's suggested guidelines for bitrates. They claim that a 4K video coming into their site should optimally be 35~45Mbps when encoded with the older x264 codec.

Now I know that x265 can be around 50% more efficient than x264, and that YouTube needs it higher quality coming in so that when they re-compress it, it will still look good. But when I looked at the videos I was enjoying just fine at CRF 22, they were mostly coming out with less than a 10Mbps bitrate. So I had to ask myself:

How much better is x265 than x264?

To find out I would need a lot of comparable data. I started with a 4K HDR example video. The first thing I did was chop out a one-minute segment and promptly remove the HDR, so I could compare the two encoders via their default 8-bit compressors.

I found this code to convert the 10-bit “HDR” yuv420p10le colorspace down to the standard yuv420p 8-bit colorspace from the colourspace blog so props to them for having a handy guide just for this.

ffmpeg -y -ss 07:48 -t 60 -i my_movie.mkv -vf zscale=t=linear:npl=100,format=gbrpf32le,zscale=p=bt709,tonemap=tonemap=hable:desat=0,zscale=t=bt709:m=bt709:r=tv,format=yuv420p -c:v libx265 -preset ultrafast -x265-params lossless=1 -an -sn -dn -reset_timestamps 1 movie_non_hdr.mkv

Average Overall SSIM

Then I ran multiple two-pass ABR runs using ffmpeg for both x264 and x265 at the same target bitrates. Afterwards, I compared them to the original using the Structural Similarity Index (SSIM). Put simply, the closer the result is to 1 the better; it means there are fewer differences between the original and the compressed version.

[Chart: average SSIM vs bitrate for x264 and x265, generated via Python and matplotlib]

The SSIM result is done frame by frame, so we have to average them all together to see which is best overall. On the section of video I chose, x264 needed considerably more bitrate to achieve the same score. The horizontal line shows this where x264 needs 14Mbps to match x265’s 9Mbps, a 5000kbps difference! If we wanted to go by YouTube’s recommendations for a video file that will be re-encoded again, you would only need a 25Mbps x265 file instead of a 35Mbps x264 video.

Sample commands I used to generate these files:

ffmpeg -i movie.mkv -c:v libx265 -b:v 500k -x265-params pass=1 -an -f mp4 NUL

ffmpeg -i movie.mkv -c:v libx265 -b:v 500k -x265-params pass=2 -an h265\movie_500.mp4

ffmpeg -i my_movie.mkv -i h265\movie_500.mp4 -lavfi  ssim=265_movie_500_ssim.log -f null -

Lowest 1% SSIM

However, the averages don't tell the whole story. If every frame were that good, we wouldn't need more than a 6Mbps x265 or 10Mbps x264 4K video. So let's take a step back and look at the lowest 1% of the frames.

[Chart: lowest 1% SSIM vs bitrate for x264 and x265, generated via Python and matplotlib]

Here we can see x264 has a much harder time at lower bitrates. Also note that the highest marker on this chart is 0.98, compared to the total average chart's 0.995.

This information alone confirmed for me that I will only be using x265 or newer encodings (maybe AV1 in 2020) for storing videos going forward.

Download the SSIM data as CSV.

How does CRF compare to ABR?

I have always read to use Constant Rate Factor over Average BitRate for stored video files (and especially over Constant Quality). CRF is the best of both worlds. If you have an easily compressible video, it won’t bloat the encoded video to meet some arbitrary bitrate. And bitrate directly correlates to file size. It also won’t be constrained to that limit if the video requires a lot more information to capture the complex scene.

But that is all hypothetical. We have some hard data, let's use it. So remember, Handbrake recommends a range of 22-28 CRF, and I personally cannot see any visual loss at CRF 20. So where does that show up on our chart?

[Chart: CRF results plotted against the ABR encodings, generated via Python and matplotlib]

Now this is an apples to oranges comparison. The CRF videos were done via Handbrake using x265 10-bit, whereas everything else was done via ffmpeg using x265 or x264 8-bit. Still, we get a good idea of where these show up. At both CRF 24 and CRF 22, even the lowest frames don’t dip below SSIM 0.95. I personally think the extra 2500kbps for the large jump in minimum quality from CRF 24 to CRF 22 is a must. To some, including myself, it could be worth the extra 4000kbps jump from CRF 22 to CRF 20.

So let's get a little more apples to apples. In this test, I encoded all videos with ffmpeg using the default presets. I did three CRF videos first, at 22, 20, and 18, then using their resulting bitrates created three ABR videos.

[Chart: CRF vs ABR at matching bitrates, generated via Python and matplotlib]

Their overall average SSIM scores were nearly identical. However, CRF shows its true edge on the lowest 1%, easily beating out ABR at every turn.

To 10-bit or not to 10-bit?

Thankfully there is a simple answer. If you are encoding to x264 or x265, encode to 10-bit if your devices support it. Even if your source video doesn’t use the HDR color space, it compresses better.

There is only one time to not use it. When the device you are going to watch it on doesn’t support it.
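
With ffmpeg and libx265, asking for 10-bit generally just means requesting a 10-bit pixel format. Here is a minimal sketch in the same subprocess style used elsewhere on this blog, assuming an ffmpeg build with 10-bit libx265 support (the filenames are placeholders):

from subprocess import run

# -pix_fmt yuv420p10le asks libx265 for 10-bit (Main 10) output
run('ffmpeg -i source_video.mkv -c:v libx265 -preset slow -crf 20 '
    '-pix_fmt yuv420p10le output_10bit.mkv', shell=True, check=True)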

Which preset should I use?

The normal wisdom is to use the slowest preset you can stand for the encoding time. Slower = better video quality and compression. However, that does not mean a smaller file size at the same CRF.

Even though others have tackled this issue, I wanted to use the same material I was already testing and make sure it held true with 4K HDR video.

[Chart: average SSIM per preset, generated via Python and matplotlib]

I used a three minute 4K HDR clip, using Handbrake to only modify which preset was used. The results were surprising to me, to be honest; I was expecting medium to have a better margin between fast and slow. But based on just the average, slow was the obvious choice, as even bumping up the CRF from 18 to 16 didn't match the quality. Even though the file size was much larger for the CRF 16 Medium encoding than it was for the CRF 18 Slow! (We'll get to that later.)

Okay, okay, let's back up a step and look at the bottom 1% again as well.

[Chart: lowest 1% SSIM per preset, generated via Python and matplotlib]

Well well wishing well, that is even more definitive. The jump from medium to slow is very significant in multiple ways. Even though it does cost double the time of medium it really delivers in the quality department. Easily beating out the lowest 1% of even CRF 16 medium, two entire steps away.

[Chart: bitrate per preset, generated via Excel]

The bitrates are as expected: the higher the quality, the more bitrate it needs. What is interesting is that if we put the CRF 16 - Medium encoding's bitrate on this chart, it would shoot off the top at a staggering 15510kbps! Keep in mind that is while still being lower quality than CRF 18 - Slow.

In this data set, slow is the clear winner in multiple ways, which is very similar to others' results as well, so I'm personally sticking to it. (And if I had run these tests first, I would have used slow for all the other testing!)

Conclusion

If you want a single go to setting for encoding, based on my personal testing CRF 20 with Slow preset looks amazing (but may take too long if you are using older hardware).

Now, if I had a supercomputer and unlimited storage, I might lean towards CRF 18 or maybe even 16, but I still wouldn't feel the need to take it the whole way to CRF 14 and veryslow or anything crazy.

I hope you found this information as useful as I did, if you have any thoughts or feedback please let me know!


Paint, Paper, Panoramas, and Python

I’m an artist and a python developer, two things that rarely occupy the same worlds, let alone the same sentence. However, I have recently found a way to combine these two passions: Panoramas.

My current smartphone takes excellent pictures. It does a great job at figuring out colors, lighting, and focus, even in low lighting. As an artist, this is important to me because I often use my phone to snap quick pictures of a scene as a reference to take back to my studio. It’s a huge improvement in the technology I had in my hands even five years ago. There is one thing about my old phone that I miss though – its ‘panorama’ photo mode, but not because it was better.

I miss how amazingly awful it used to be, and more importantly, the freedom to make awful pictures it allowed. I’d point the lens out the window of the car as it sped along (as a passenger of course) to make jagged and confusing images of tiny bits of the landscape that the phone struggled to hodgepodge together. I’d tilt and move the phone in random directions to make weird swirls of the horizon. Even when being used ‘as directed’, it would usually struggle with focus and lighting coming up with spontaneously and wonderfully terrible photos with abstract light glare or menacing dark patches. It’s hard to explain, but sometimes as an artist, a terrible photo can be just as inspiring as those picture perfect reference pics I take with me back to the studio.

My current phone is too smart for that though, and it snatches away any joy of bad photography by making consistently beautiful and seamless panoramas. Not only that, but it accomplishes this mostly by yelling at you (“You’re going too fast!”) or by using angry arrows to make sure you can only move the phone in one direction, and then abruptly ending the photograph when you don’t cooperate. So, I did what anyone does when they get nostalgic for awful photography – I made a python script to make my own terrible panoramas.

My plan was simple. First, I would shoot short videos where my phone wouldn’t yell at me for moving, tilting, and spinning the image as much as I wanted. Next, use Python to convert each frame of the video clip to an image, crop the image into a tiny sliver out of the center of the image and then glue them all together. The results are imperfect. And gloriously so.

Side note: Although I used my smartphone to shoot some video, this script could be applied to any video. Think of the wild panoramas you could create from some Russian dash cam footage, or a GoPro strapped to a fish, or a tiny clip from the Lord of the Rings. However, this script works best on videos that are less than 10 seconds long or else it produces mile long panoramic images. Currently, I don’t bother limiting the image size at all, but theoretically I could by using one out of every five frames for instance, or by cutting down the image slice size based on video length.

The Python

I used ffmpeg for turning each video frame into an image. It was simple to install, just download and unzip. Here’s a handy installation guide -> https://github.com/adaptlearning/adapt_authoring/wiki/Installing-FFmpeg

The Python Imaging Library (Pillow) is the only other requirement, installed with pip.

pip3 install pillow

The script works by pulling all videos out of a source directory based on file suffix and creating a panorama for each. This could easily be modified to convert just one video at a time by removing the loops and passing the path to the desired video directly to ffmpeg.


import os
import shutil
from pathlib import Path
from subprocess import run, PIPE
from PIL import Image

directory = Path('my\\videos\\dir')

vids = []
for vid in directory.iterdir():
    if vid.suffix.lower() in ('.mp4', '.mkv'):
        vids.append(vid)

Every frame pulled out by ffmpeg is stored as a file in a "pics" directory. I delete the directory and recreate it before ffmpeg runs, to clear out the old frames from the last run.


for vid in vids:
    shutil.rmtree("pics", ignore_errors=True)
    os.makedirs("pics", exist_ok=True)

    print(f'Creating panoramic {vid.stem}')
    result = run(
        f'ffmpeg -i {vid.absolute()} '
        f'-y pics\\thumb%04d.jpg -hide_banner', 
        shell=True, stderr=PIPE)
    result.check_returncode()
    print(result.stderr.decode('utf-8'))

After it finishes pulling out all the frames, I start the panorama by creating an empty image. I need to have the dimensions of the finished image to create it. To get the final width, I multiply the number of frames ffmpeg pulled out by the width of my image slice (40 pixels). For the height, I open up one of the frames and use its size as a reference. I also use the sample image's dimensions to figure out the center of the image for cropping everything down later.

Then, I loop through all the frame images in reverse order (because … long story short, it usually looks better that way) and then work on slicing each image down to 40 pixels wide to glue into the panorama.

    
    sample = Image.open("pics/{}".format(os.listdir("pics")[0]))
    width, height = sample.size
    center = width // 2

    panoramic = Image.new('RGB', (len(os.listdir("pics")) * 40, height))
    
    # This offset is so PIL knows where to start adding 
    # each image slice to the panorama
    x_offset = 0

    for i in reversed(os.listdir("pics")):
        img = Image.open("pics/{}".format(i))
        area = (center - 20, 0, center + 20, height)
        cropped_img = img.crop(area)
        panoramic.paste(cropped_img, (x_offset, 0))
        x_offset += 40

    panoramic.save(f'{vid.stem}.jpeg')
    panoramic.close()

The Painting

So far, I am quite happy with the results of this adorable little script.
It has definitely given me the creative inspiration I was missing. In the past two weeks, I have done three series of paintings based on panoramas I have created using it, with plans for many more. Here’s an example of how I used it to create some artwork!

I took this video:

turned it into this panorama:

played with some paint and smudged around some charcoal and pastels:

and came up with this:

Final Thoughts

It feels really amazing to apply Python to unusual problems, even if that challenge is finding a unique way of creating original art. Plus, if the inspiration ever dries up, I have some ideas for making this script even more fun:

  • grab each slice from a random spot rather than dead center of the image for something much more jumbled and abstract
  • options to not use every frame for longer videos (see the sketch after this list)
  • PIL ‘effects’, like a black and white mode, over saturation, or extra blurry images
  • an ‘up and down’ mode for tall panoramas
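
That frame-skipping idea would only take a couple of changed lines in the loop from the script above; here is a rough sketch, reusing the height, center, and "pics" directory from earlier and keeping one out of every five frames (the 5 is arbitrary):

    # Thin the frame list before slicing; everything else stays the same
    frames = sorted(os.listdir("pics"), reverse=True)[::5]
    panoramic = Image.new('RGB', (len(frames) * 40, height))

    x_offset = 0
    for i in frames:
        img = Image.open("pics/{}".format(i))
        cropped_img = img.crop((center - 20, 0, center + 20, height))
        panoramic.paste(cropped_img, (x_offset, 0))
        x_offset += 40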

I hope you enjoyed! Feel free to check out my website or my instagram for more artwork if you are interested.