There is currently a serious lack of data on compressing 4K HDR videos out there, so I took it upon myself to get learned in the ways of the x265 encoding world.
First things first, this is NOT a guide for Dolby Vision or HDR10. This is simply for videos using the BT.2020 color primaries. Please read the new article for saving HDR.
I have historically been using the older x264 mp4s for my videos, as it just works on everything. However, most devices finally have some native h.265 decoding. (As a heads up, h.265 is the specification, and x265 is an encoder for it. I may mix it up myself in this article; don’t worry about the letter, just the numbers.)
Updated: 6/29/2020 – Please refer to the new guide
Updated: 4/14/2019 – New Preset Setting (tl;dr: use slow)
What are the best settings for me to use when encoding x265 videos?
The honest to god true answer is “it depends”, however I find that answer unsuitable for my own needs. I want a setting that I can use on any incoming 4K HDR video I buy.
I used to mainly use Handbrake, but now use ffmpeg because I learned Handbrake only has an 8-bit internal pipeline. In the past, I went straight to Handbrake’s documentation. It states that for 4K videos with x265 they suggest a Constant Rate Factor (CRF) encoding in the range of 22-28 (the larger the number the lower the quality).
Through some experimentation I found that I personally can never really see a difference between anything lower than 22 using the Slow preset. Therefore I played it safe, bumped it down a notch, and now just encode all of my stuff with x265 10-bit at CRF 20 on the Slow preset. That way I know I should never be disappointed.
Then I recently read YouTube’s suggested guidelines for bitrates. They claim that a 4K video coming into their site should optimally be 35~45Mbps when encoded with the older x264 codecs.
Now I know that x265 can be around 50% more efficient than x264, and that YouTube needs higher quality coming in so that the video still looks good after they re-compress it. But when I looked at the videos I was enjoying just fine at CRF 22, they were mostly coming out with less than a 10Mbps bitrate. So I had to ask myself:
How much better is x265 than x264?
To find out I would need a lot of comparable data. I started with a 4K HDR example video. The first thing I did was chop out a one minute segment and promptly remove the HDR, so that I could compare the two encoders via their default 8-bit compressors.
I found this code to convert the 10-bit “HDR” yuv420p10le colorspace down to the standard yuv420p 8-bit colorspace on the colourspace blog, so props to them for having a handy guide just for this.
ffmpeg -y -ss 07:48 -t 60 -i my_movie.mkv -vf zscale=t=linear:npl=100,format=gbrpf32le,zscale=p=bt709,tonemap=tonemap=hable:desat=0,zscale=t=bt709:m=bt709:r=tv,format=yuv420p -c:v libx265 -preset ultrafast -x265-params lossless=1 -an -sn -dn -reset_timestamps 1 movie_non_hdr.mkv
Average Overall SSIM
Then I ran multiple two pass ABR runs using ffmpeg for both x264 and x265 using the same target bitrate. Afterwards I compared them to the original using the Structural Similarity Index (SSIM). Put simply, the closer the result is to 1 the better: it means there are fewer differences between the original and the compressed one.
The SSIM result is done frame by frame, so we have to average them all together to see which is best overall. On the section of video I chose, x264 needed considerably more bitrate to achieve the same score. The horizontal line shows this where x264 needs 14Mbps to match x265’s 9Mbps, a 5000kbps difference! If we wanted to go by YouTube’s recommendations for a video file that will be re-encoded again, you would only need a 25Mbps x265 file instead of a 35Mbps x264 video.
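As a rough check on those numbers, the saving can be worked out directly. This is only a sketch: the function names are mine, and scaling YouTube’s 35Mbps figure by the same ratio assumes the gap between the encoders stays constant at higher bitrates, which this test did not measure.

```python
def bitrate_savings(x264_kbps, x265_kbps):
    """Fraction of bitrate saved by x265 when both encodes reach the same SSIM."""
    return (x264_kbps - x265_kbps) / x264_kbps


def scaled_target(x264_target_kbps, x264_kbps, x265_kbps):
    """Scale an x264 bitrate target by the measured x265/x264 ratio (assumed constant)."""
    return x264_target_kbps * x265_kbps / x264_kbps


if __name__ == "__main__":
    # Figures read off the chart: x264 needs 14Mbps to match x265 at 9Mbps.
    print(f"savings: {bitrate_savings(14000, 9000):.1%}")
    # Same ratio applied to YouTube's 35Mbps x264 recommendation.
    print(f"x265 target: {scaled_target(35000, 14000, 9000):.0f} kbps")
```

That works out to roughly a 36% saving at this quality level. Scaling 35Mbps by the same ratio gives 22.5Mbps, so a 25Mbps x265 file leaves some headroom.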
Sample commands I used to generate these files:
ffmpeg -i movie.mkv -c:v libx265 -b:v 500k -x265-params pass=1 -an -f mp4 NUL
ffmpeg -i movie.mkv -c:v libx265 -b:v 500k -x265-params pass=2 -an h265\movie_500.mp4
ffmpeg -i my_movie.mkv -i h265\movie_500.mp4 -lavfi ssim=265_movie_500_ssim.log -f null -
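To turn one of those per-frame logs into a single number, something like the following would work. It is a minimal sketch, assuming the log uses ffmpeg’s usual one-line-per-frame layout and that the combined "All" value is the one being averaged (the charts may have been built differently). The function names are hypothetical.

```python
import re
from statistics import mean

# Each line of ffmpeg's ssim stats file is expected to look like:
# n:1 Y:0.937213 U:0.961733 V:0.945788 All:0.946234 (12.694954)
ALL_SCORE = re.compile(r"All:([0-9.]+)")


def read_ssim_scores(lines):
    """Return the per-frame 'All' SSIM value from each stats line."""
    scores = []
    for line in lines:
        match = ALL_SCORE.search(line)
        if match:  # skip anything that is not a per-frame stats line
            scores.append(float(match.group(1)))
    return scores


def average_ssim(path):
    """Average the per-frame 'All' SSIM values stored in a stats file."""
    with open(path, encoding="utf-8") as handle:
        return mean(read_ssim_scores(handle))
```

Calling average_ssim("265_movie_500_ssim.log") would then give the overall average for the 500k encode above.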
Lowest 1% SSIM
However, the averages don’t tell the whole story. If every frame was that good, we shouldn’t need more than a 6Mbps x265 or 10Mbps x264 4K video. So let’s take a step back and look at the lowest 1% of the frames.
Here we can see x264 has a much harder time at lower bitrates. Also note that the highest marker on this chart is 0.98, compared to the overall average chart’s 0.995.
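One way to get a "lowest 1%" figure from a list of per-frame SSIM scores is to sort them and average the worst slice. This is an assumption about the method (a 1st percentile cut-off would be another reasonable reading), and the function name is made up for illustration.

```python
import math
from statistics import mean


def lowest_percent_mean(scores, percent=1.0):
    """Average of the worst `percent` of frames, always using at least one frame."""
    if not scores:
        raise ValueError("no scores given")
    count = max(1, math.ceil(len(scores) * percent / 100))
    # Lowest SSIM values come first after sorting.
    return mean(sorted(scores)[:count])
```

For a 60 second clip at 24 frames per second that is the mean of the 15 worst frames out of 1440, which is why a handful of difficult scenes can drag this number down while barely moving the overall average.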
This information alone confirmed for me that I will only be using x265 or newer encodings (maybe AV1 in 2020) for storing videos going forward.
Download the SSIM data as CSV.
How does CRF compare to ABR?
I have always read to use Constant Rate Factor over Average BitRate for stored video files (and especially over Constant Quality). CRF is the best of both worlds. If you have an easily compressible video, it won’t bloat the encoded video to meet some arbitrary bitrate. And bitrate directly correlates to file size. It also won’t be constrained to that limit if the video requires a lot more information to capture the complex scene.
But that is all hypothetical. We have some hard data, so let’s use it. Remember, Handbrake recommends a range of 22-28 CRF, and I personally cannot see any visual loss at CRF 20. So where does that show up on our chart?
Now this is an apples to oranges comparison. The CRF videos were done via Handbrake using x265 10-bit, whereas everything else was done via ffmpeg using x265 or x264 8-bit. Still, we get a good idea of where these show up. At both CRF 24 and CRF 22, even the lowest frames don’t dip below SSIM 0.95. I personally think the extra 2500kbps for the large jump in minimum quality from CRF 24 to CRF 22 is a must. To some, including myself, it could be worth the extra 4000kbps jump from CRF 22 to CRF 20.
So let’s get a little more apples to apples. In this test, I encoded all videos with ffmpeg using the default presets. I did three CRF videos first, at 22, 20, and 18, then used their resulting bitrates to create three ABR videos.
Their overall average SSIM scores were nearly identical. However, CRF shows its true edge on the lowest 1%, easily beating out ABR at every turn.
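The CRF-then-ABR matching could be scripted roughly like this. It is a sketch rather than the script used for these tests: the helper names are hypothetical, the two-pass flags mirror the sample commands earlier in the article, and the bitrate is read with ffprobe’s format-level bit_rate field, which also counts any audio in the file.

```python
def probe_bitrate_cmd(path):
    """ffprobe command that prints the file's overall bitrate in bits per second."""
    return [
        "ffprobe", "-v", "error",
        "-show_entries", "format=bit_rate",
        "-of", "default=noprint_wrappers=1:nokey=1",
        path,
    ]


def two_pass_abr_cmds(source, output, bitrate_bps, null_sink="NUL"):
    """Two-pass x265 ABR commands targeting a bitrate measured from a CRF encode.

    null_sink is "NUL" on Windows; use "/dev/null" elsewhere.
    """
    kbps = round(bitrate_bps / 1000)
    common = ["ffmpeg", "-y", "-i", source, "-c:v", "libx265", "-b:v", f"{kbps}k"]
    first = common + ["-x265-params", "pass=1", "-an", "-f", "mp4", null_sink]
    second = common + ["-x265-params", "pass=2", "-an", output]
    return first, second
```

Each list can be passed to subprocess.run. The probe command’s stdout, converted with int(), supplies bitrate_bps for the matching ABR pair.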
To 10-bit or not to 10-bit?
Thankfully there is a simple answer. If you are encoding to x264 or x265, encode to 10-bit if your devices support it. Even if your source video doesn’t use the HDR color space, it compresses better.
There is only one time not to use it: when the device you are going to watch it on doesn’t support it.
Which preset should I use?
The normal wisdom is to use the slowest preset you can stand for the encoding time. Slower = better video quality and compression. However, that does not mean smaller file size at the same CRF.
Even though others have tackled this issue, I wanted to use the same material I was already testing and make sure it held true with 4K HDR video.
I used a three minute 4K HDR clip, using Handbrake to only modify which preset was used. The results were surprising to me, to be honest. I was expecting medium to have a better margin between fast and slow. But based on just the average, slow was the obvious choice, as even bumping the CRF from 18 to 16 on medium didn’t match the quality, even though the file size was much larger for the CRF 16 Medium encoding than for the CRF 18 Slow! (We’ll get to that later.)
Okay, okay, let’s back up a step and look at the bottom 1% again as well.
Well well wishing well, that is even more definitive. The jump from medium to slow is very significant in multiple ways. Even though it does cost double the time of medium, it really delivers in the quality department, easily beating out the lowest 1% of even CRF 16 medium, two entire steps away.
The bitrates are as expected: the higher the quality, the more bitrate it needs. What is interesting is that if we put the CRF 16 - Medium encoding’s bitrate on this chart, it would shoot off the top at a staggering 15510kbps! Keep in mind that is while still being lesser quality than CRF 18 - Slow.
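To put that bitrate in file size terms, the video stream size is just bitrate multiplied by duration. A small worked example follows, using decimal megabytes and ignoring audio and container overhead (both of those are my simplifications, and the function name is made up).

```python
def video_size_mb(bitrate_kbps, duration_seconds):
    """Approximate video stream size in megabytes (1 MB = 1000 kB)."""
    kilobits = bitrate_kbps * duration_seconds
    return kilobits / 8 / 1000  # kilobits -> kilobytes -> megabytes


# CRF 16 - Medium averaged 15510kbps over the three minute test clip.
print(f"{video_size_mb(15510, 180):.0f} MB")  # prints "349 MB"
```

So the three minute CRF 16 - Medium clip comes to about 349 MB of video on its own.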
In this data set, slow is the clear winner in multiple ways. That is very similar to others’ results as well, so I’m personally sticking to it. (And if I had run these tests first, I would have even used slow for all the other testing!)
Conclusion
If you want a single go-to setting for encoding, based on my personal testing CRF 20 with the Slow preset looks amazing (but may take too long if you are using older hardware).
Now, if I had a super computer and unlimited storage, I might lean towards CRF 18 or maybe even 16, but I still wouldn’t feel the need to take it the whole way to CRF 14 and veryslow or anything crazy.
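For reference, the go-to setting could be expressed as an ffmpeg command along these lines. Treat it as a sketch under assumptions: the helper name is mine, copying the audio untouched is my choice rather than something tested here, and it needs a libx265 build with 10-bit support. It does not carry over HDR10 metadata, which is outside the scope of this article.

```python
def crf_encode_cmd(source, output, crf=20, preset="slow"):
    """ffmpeg command for a 10-bit x265 CRF encode with audio copied as-is."""
    return [
        "ffmpeg", "-i", source,
        "-c:v", "libx265",
        "-preset", preset,
        "-crf", str(crf),
        "-pix_fmt", "yuv420p10le",  # 10-bit output
        "-c:a", "copy",             # assumption: leave the audio untouched
        output,
    ]
```

Passing the list to subprocess.run(crf_encode_cmd("my_movie.mkv", "my_movie_x265.mkv")) would start the encode. Swapping in crf=18 is the only change needed for the "unlimited storage" variant.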
I hope you found this information as useful as I did, if you have any thoughts or feedback please let me know!
I tried encoding some 4K HDR files using Handbrake and they look washed out. I tried H.265 10 bit, H.265 12 bit, and H.264 10 bit, and they all ended up looking washed out compared to the original file. The original was H.265 encoded with a .ts extension. Handbrake 1.2.2. Any idea what might cause this?
Try running ffprobe on it ( ffprobe -i "filename.ts" -show_streams ) to see what the format and colorspace are. For HDR they should be:
pix_fmt=yuv420p10le
color_space=bt2020nc
color_transfer=smpte2084
color_primaries=bt2020
(Windows ffmpeg/ffprobe package https://ffmpeg.zeranoe.com/builds/, others https://ffmpeg.org/download.html)
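If you have a lot of files to check, the comparison against those four values can be automated. The sketch below parses the plain key=value text that ffprobe -show_streams prints and assumes the first stream in the file is the video stream. The function names are hypothetical.

```python
EXPECTED_HDR = {
    "pix_fmt": "yuv420p10le",
    "color_space": "bt2020nc",
    "color_transfer": "smpte2084",
    "color_primaries": "bt2020",
}


def parse_show_streams(text):
    """Collect key=value pairs from the first [STREAM] block of ffprobe output."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if line == "[/STREAM]":  # stop after the first stream
            break
        if "=" in line:
            key, _, value = line.partition("=")
            fields[key] = value
    return fields


def hdr_mismatches(fields):
    """Return {key: (expected, actual)} for every field that differs."""
    return {
        key: (expected, fields.get(key))
        for key, expected in EXPECTED_HDR.items()
        if fields.get(key) != expected
    }
```

An empty result from hdr_mismatches means all four values match. Anything else shows which field differs, for example a color_transfer of bt709.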
Found this incredibly helpful, as I’ve recently begun ripping Blurays to x265 instead of x264, particularly when I got my new 12-core Ryzen 3900X. However, I’m really only dealing with 1080p rips… Curious if you have any data on the difference between Medium and Slow settings at 1080p? Should the results be the same, relatively? Not sure if they’re agnostic of resolution…
Hi David, good question!
I ran some quick tests this morning with a script I am using for a new version of this article, using ffmpeg instead of Handbrake.
crf  preset  vmaf       ssim
18   slow    68.298806  0.96016
18   medium  67.552191  0.960256
20   slow    67.835845  0.959723
20   medium  66.884759  0.9598
22   slow    67.187793  0.959167
22   medium  65.979806  0.959182
Using the Dolby Vision glass blowing demo for a 10 second clip starting at 1:03, it seems that the SSIM scores really couldn’t tell a difference between any of them, but VMAF (which I am using more heavily in the next article) does show a similar trend to what we saw before. It looks to take a CRF increase of about 2 to achieve near parity of medium to slow (aka 18 medium is close to 20 slow).
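The "about 2 CRF" observation can be checked against the VMAF column above with a few lines of Python. The data is copied from the table; the helper is just an illustration and only looks at the three slow encodes that were run.

```python
# (crf, preset, vmaf) rows copied from the table above.
RESULTS = [
    (18, "slow", 68.298806), (18, "medium", 67.552191),
    (20, "slow", 67.835845), (20, "medium", 66.884759),
    (22, "slow", 67.187793), (22, "medium", 65.979806),
]


def closest_slow(medium_crf):
    """Find the slow encode whose VMAF is nearest to the given medium encode."""
    medium_vmaf = next(v for c, p, v in RESULTS if c == medium_crf and p == "medium")
    slow = [(c, v) for c, p, v in RESULTS if p == "slow"]
    crf, vmaf = min(slow, key=lambda row: abs(row[1] - medium_vmaf))
    return crf, round(vmaf - medium_vmaf, 3)


print(closest_slow(18))  # nearest slow encode to CRF 18 medium
print(closest_slow(20))  # nearest slow encode to CRF 20 medium
```

CRF 18 medium lands nearest to CRF 20 slow, and CRF 20 medium nearest to CRF 22 slow, with the slow encode still scoring about 0.3 VMAF higher in both cases.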
Also nice choice on the new processor!
It is well known among pros that SSIM and PSNR are not good measures of SUBJECTIVELY PERCEIVABLE quality.
Default psychovisual tunings on x264 are much more mature and balanced for perceived quality than x265. (if you go higher than 4.1 preset).
By using CRF 15~18 vslow, Film TUNE, and 5.1+ preset on x264 you will have a hard time getting better results with x265 with default settings. You will have to tweak A LOT of the extra complex settings of the x265 encoder to even MATCH x264, let alone surpass it in size to perceivable quality, and by the time you do, you will see that x265 is just an order of magnitude slower than x264 to achieve marginally better results, which is very problematic with 4k encodes.
The only point of using x265 is for compatibility, lossless and HDR content, otherwise, even at SDR 4k x264 will fare better if given enough bitrate.
At very low bitrates, however, x265 will indeed surpass x264, but at the levels of quality it does with default settings, you will be smudging a lot of fine grain and dark detail and the source would benefit from downscale DNR and pre-processing anyway.
I don’t use x265 for anything besides 4k HDR content. Definitely not for archiving.
Keep in mind I’m talking exclusively of x264 level 5.1+. When comparing to the “standard” 4.1 profile then yes, x265 outperforms it in almost any metric.
Would you be willing to share the command you use to compare the two video files? I’ve got the right version of FFmpeg with libvmaf but am having a hard time getting the right CLI parameters. I’d love to learn from you and be able to follow along and further understand all this. Thank you!!
Hey Jamie, sure thing! https://gist.github.com/cdgriffith/20c8bbb220a05358da47710e2a512308 is the script I wrote to be able to compare a lot of different presets to a single source. You need to download the model you want to use (vmaf_4k_v0.6.1.pkl and vmaf_4k_v0.6.1.pkl.model in my case) from https://github.com/Netflix/vmaf/tree/master/model. The command I use boils down to
ffmpeg.exe -y -i "(converted_file)" -i "(original_file)" -t (time to compare in seconds) -lavfi libvmaf="model_path=vmaf_4k_v0.6.1.pkl:log_path=details.txt;[0:v][1:v]ssim=1" -report -f null -
Hope that helps, thanks for asking and happy to help!
Awesome, thank you!
I wanted to thank you for all this great work you’ve done on these articles. Really awesome! Do you think for H.265 but with 1080p and less content that all of what you’ve posted here would still be applicable or do you think it’s only for 4k content?
I know from some small scale testing that the gaps between the different presets will remain similar (the newer article has a better graph for that). Also, I always recommend CRF over a hard bitrate/filesize, because the bitrate needed to preserve quality is really source dependent. For example, an old movie with lots of film grain will need a much higher bitrate than a modern action / adventure movie, which will itself need a much higher bitrate than an animation.
The real question is HDR. If the source has HDR and I want to preserve it, it should be in 10-bit H.265. If it doesn’t have HDR, the time to encode it doesn’t make sense to me as it will take over twice as long vs H.264.
This is also all evolving, so next year the AV1 tools may be fast and user friendly enough to use that instead of H.265 for encoding. Then in three years H.266 / VVC toolsets will probably be popping up.
Pretty new to this, but I have one question I can’t seem to find an answer to.
If I have a 10 bit 4k movie and I use the H265 AMD VCE video encoder, would I lose the 10 bit with the new conversion?
Depends on the encoder. I don’t think the ffmpeg one supports 10-bit, but VCEEncC does. Linking VCEEncC in the Settings panel of FastFlix is the easiest way to get started with it, and it will automatically add the required setting of --output-depth 10 to make it 10-bit. If the source has HDR10 data and you are working from the command line, you will also need to provide --master-display copy --max-cll copy to VCEEncC (auto added in FastFlix as well).
https://i0.wp.com/codecalamity.com/wp-content/uploads/2019/04/4K_preset_avg_extra.png?ssl=1 Why are there CRF 16 and CRF 17 dots showing? From the chart the SSIM difference is very low. Is it worth spending more time encoding with a slower preset? Also, why is the SSIM for slower and veryslow lower than for the slow preset? Shouldn’t a slower preset produce better picture quality?