AMD improves video encoding yet again! This time with Pre-Analysis

AMD made some impressive quality jumps a few months ago when they re-introduced B-frames to their H.264/AVC hardware encoder. Now they went and did it again with AMF 1.4.26 by adding temporal Pre-Analysis, which detects scene changes and can better insert I-frames (intra frames) where needed.

While this won’t help with real-time streaming*, it’s another large boost for anybody doing video conversion. Before, we saw large dips in VMAF score during scene changes (Blue line). Now with Pre-Analysis enabled, those virtually go away (Orange line)!

VMAF score comparing Pre-Analysis On (Orange) vs Off (Blue) with FFMetrics
VMAF score is Y axis, Frame Number is X axis

Using the same Big Buck Bunny video as last time, we see great improvements. But let’s try with a clip that doesn’t really have a traditional scene change. I will use a video of Clara Griffith painting one of her absurd paintings. Still image from video:

Clara Griffith Painting “Consumed”

Here we can see the new Pre-Analysis (Red) still does better overall.

VMAF score comparing Pre-Analysis On (Red) vs Off (Green) with FFMetrics

What does Pre-Analysis do?

AMD has a great overview of what their Pre-Analysis encoding process does, so I won’t try to rephrase it myself:

During [Pre-Analysis] encoding, Content Adaptive Quantization (CAQ) is applied based on the results generated by [Pre-Analysis]. The encoder also makes various encoding decisions based on [Pre-Analysis] results. For example, depending on whether the PA scene change detection flag is triggered or not, the encoder may force an intra encoded frame and apply a new frame QP at the new scene. The encoder may also insert a skip frame based on whether the PA static scene detection flag is triggered or not.

AMD AMF Docs – https://github.com/GPUOpen-LibrariesAndSDKs/AMF/tree/master/amf/doc

tl;dr: The encoder can make smarter decisions about future frames if it knows what is coming.
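
To make that a bit more concrete, here is a toy sketch in Python of the general idea. This is NOT AMD's algorithm: the frames are tiny lists of made-up luma values, and the function names, the threshold, and the "mean absolute difference" metric are all assumptions picked for illustration. It only shows how knowing about a big change between frames lets you place an I-frame at the cut instead of waiting for the next scheduled one.

```python
from typing import List, Sequence


def mean_abs_diff(a: Sequence[int], b: Sequence[int]) -> float:
    """Average absolute per-pixel difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def plan_frame_types(frames: List[Sequence[int]], gop_len: int, threshold: float) -> List[str]:
    """Pick "I" or "P" for each frame (hypothetical helper, illustration only)."""
    types = []
    for i, frame in enumerate(frames):
        if i % gop_len == 0:
            # Regularly scheduled keyframe at the start of each GOP.
            types.append("I")
        elif mean_abs_diff(frame, frames[i - 1]) > threshold:
            # Large jump from the previous frame: treat it as a scene change.
            types.append("I")
        else:
            types.append("P")
    return types


# Six tiny 4-pixel "frames": a dark scene followed by a bright one.
dark = [10, 12, 11, 10]
bright = [200, 198, 201, 199]
frames = [dark, dark, dark, bright, bright, bright]

print(plan_frame_types(frames, gop_len=240, threshold=50.0))
```

Without the difference check, the bright scene would have to be predicted from the dark one until the next scheduled keyframe, which is where the quality dips come from.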

The Competition

First let’s see how it has improved against itself. We ran these tests using VCEEnc 7.03.

Against Itself

This is using the 1080p 24fps Big Buck Bunny video. We have a big score increase, at the cost of speed.

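Setting                          | VMAF Score | Speed      | Bitrate (set at 5000) | I-Frames
---------------------------------|------------|------------|-----------------------|---------
AMD No B-Frames, No Pre-Analysis | 94.1215    | 139.21 fps | 4808.76 kbps          | 13
AMD B-Frames, No Pre-Analysis    | 95.3877    | 83.75 fps  | 4807.49 kbps          | 13
AMD B-Frames, Pre-Analysis       | 96.0697    | 26.03 fps  | 4822.74 kbps          | 22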
Big Buck Bunny

Notice how with Pre-Analysis enabled, we have a lot more I-Frames. That means it was able to detect places where it would have had large quality losses using P or B frames, and inserted a higher quality I-Frame instead.
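
As a sanity check on those I-Frame counts: the logs at the end of this post report a GOP length of 240 and 2881 encoded frames. Assuming the encoder places a keyframe at frame 0 and then every 240 frames (an assumption, the log does not spell out the placement), a rough sketch of the arithmetic looks like this. The function name is made up for the example.

```python
import math


def fixed_gop_keyframes(total_frames: int, gop_len: int) -> int:
    """Keyframes when one lands on frame 0 and on every gop_len-th frame after it."""
    return math.ceil(total_frames / gop_len)


total_frames = 2881  # "encoded 2881 frames" from the logs
gop_len = 240        # "GOP Len: 240 frames" from the logs

baseline = fixed_gop_keyframes(total_frames, gop_len)
extra_with_pa = 22 - baseline  # 22 I-Frames were reported with Pre-Analysis on

print(f"scheduled keyframes: {baseline}")
print(f"additional I-Frames with Pre-Analysis: {extra_with_pa}")
```

That matches the 13 I-Frames in both runs without Pre-Analysis, and suggests roughly 9 additional I-Frames came from Pre-Analysis. Whether an inserted I-Frame resets the regular 240 frame cadence is not visible in the logs, so treat the 9 as an estimate.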

Next we’ll use Clara Griffith’s painting video, in 1080p 30fps, which doesn’t have a traditional scene change.

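Setting                       | VMAF Score | Speed     | Bitrate (set at 3000) | I-Frames
------------------------------|------------|-----------|-----------------------|---------
AMD B-Frames, No Pre-Analysis | 85.7606    | 76.08 fps | 3018.80 kbps          | 1
AMD B-Frames, Pre-Analysis    | 86.1705    | 25.16 fps | 3018.04 kbps          | 1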
Clara Griffith Painting “Consumed”

In this case we only see a 0.4 VMAF improvement, compared to the higher 0.7 VMAF from the Big Buck Bunny video. That is still impressive, and it shows that it’s not only scene changes making the difference.
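
For anyone who wants to redo the math, here is a small Python sketch that takes the VMAF scores and encode speeds from the two tables above and works out the gain and the slowdown for each clip. The `summarize` helper and the dictionary layout are just for this example.

```python
from typing import Tuple

# (VMAF score, encode speed in fps) with B-Frames, Pre-Analysis off vs on,
# copied from the two tables above.
results = {
    "Big Buck Bunny": {"off": (95.3877, 83.75), "on": (96.0697, 26.03)},
    "Consumed": {"off": (85.7606, 76.08), "on": (86.1705, 25.16)},
}


def summarize(off: Tuple[float, float], on: Tuple[float, float]) -> Tuple[float, float]:
    """Return (VMAF gain, slowdown factor) when turning Pre-Analysis on."""
    vmaf_gain = on[0] - off[0]
    slowdown = off[1] / on[1]
    return vmaf_gain, slowdown


for name, r in results.items():
    gain, slowdown = summarize(r["off"], r["on"])
    print(f"{name}: +{gain:.2f} VMAF, {slowdown:.1f}x slower")
```

Both clips pay about the same speed penalty (a bit over 3x) for the gain, at least with these settings.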

Against Intel QSV and Nvidia NVEnc

But does it stack up vs the others now? I am grabbing the scores from my last blog post for Intel and Nvidia, where you can also see the commands used for them.

Contender | VMAF Score | Bitrate (set at 5000)
----------|------------|----------------------
AMD       | 96.0697    | 4822 kbps
Intel     | 96.37      | 4890 kbps
Nvidia    | 96.13      | 4892 kbps

In this case, AMD is finally within arm’s length of the big boys!

Considerations

I want to add some clear “hold up” info here before this gets spread around as gospel.

  • This is a single test, at a single bitrate. There is not enough info to draw any definitive conclusions about true comparative quality vs other encoders. This could be the worst or best spot for one of them. The goal here is to show how it improved against itself.
  • This is with the example PA settings from the VCEEnc encoder, with no tweaking attempted. These could be the optimal Pre-Analysis settings for this movie, or the worst. I don’t know.
  • * This will not help with real-time streaming or game capture. It’s too slow to use for that, only available for VBR mode, and you need future knowledge of frames for the appropriately named “Pre-Analysis”. Some software may choose to add it with a large frame buffer, but I doubt it will be a standard feature anytime soon for those cases.
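
To put rough numbers on the real-time point, here is a back-of-the-envelope sketch in Python. It uses the `lookahead=32` value from the Pre-Analysis command below and the encode speeds from the tables above. The two helper functions are made up for this example, and the delay is only the minimum buffering needed to fill the lookahead window; it ignores any other latency in the encoder pipeline, which I have not measured.

```python
def lookahead_delay_seconds(lookahead_frames: int, source_fps: float) -> float:
    """Minimum delay from waiting for the lookahead window to fill."""
    return lookahead_frames / source_fps


def keeps_up(encode_fps: float, source_fps: float) -> bool:
    """True if the encoder is at least as fast as frames arrive."""
    return encode_fps >= source_fps


# Big Buck Bunny: 24 fps source, 26.03 fps encode speed with Pre-Analysis.
print(f"{lookahead_delay_seconds(32, 24):.2f} s of buffering at 24 fps")
print("keeps up with 24 fps source:", keeps_up(26.03, 24))

# Painting clip: 30 fps source, 25.16 fps encode speed with Pre-Analysis.
print("keeps up with 30 fps source:", keeps_up(25.16, 30))

# A typical 60 fps game capture would be well out of reach at these speeds.
print("keeps up with 60 fps source:", keeps_up(26.03, 60))
```

So even where the encoder barely keeps pace with a 24 fps source, you would still be more than a second behind live before the first frame comes out.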

Commands Used for Pre-Analysis

VCEEncC.exe -i big_buck_bunny_1080p24.y4m --trim 0:2880 --video-metadata clear --metadata clear --chapter-copy -c avc --vbr 5000 --bframes 3 --ref 3 --b-pyramid --preset slow --level auto --motion-est q-pel --pe --colorrange tv --avsync cfr -o big_buck_bunny_1080p-vce-5000-bp.mp4
--------------------------------------------------------------------------------
big_buck_bunny_1080p-vce-5000-bp.mp4
--------------------------------------------------------------------------------
storage->SetProperty(BPicturesDeltaQP)=6 failed: invalid param..
storage->SetProperty(ReferenceBPicturesDeltaQP)=4 failed: invalid param..
VCEEnc (x86) 7.03 (r1144) by rigaya, Aug 10 2022 10:08:40 (VC 1932/Win)
OS:            Windows 11 x64 (22000) [UTF-8]
CPU:           AMD Ryzen 9 5950X 16-Core Processor [4.58GHz] (16C/32T)
GPU:           AMD Radeon RX 6900 XT, AMF Runtime 1.4.26 / SDK 1.4.26
Input Info:    y4m(yv12)->nv12 [AVX2], 1920x1080, 24/1 fps
Vpp Filters    copyHtoD
Output:        H.264/AVC  High @ Level 4
               1920x1080p 1:1 24.000fps (24/1fps)
               avwriter: h264 => mp4
Quality:       slow
VBR:           5000 kbps
Max bitrate:   25000 kbps
QP:            Min: 0, Max: 0
VBV Bufsize:   31250 kbps
Bframes:       3 frames, b-pyramid: (null)
Delta QP:      Bframe: 0, RefBframe: 0
Pre Analysis:  off
Ref frames:    3 frames
LTR frames:    0 frames
Motion Est:    Q-pel
Slices:        1
GOP Len:       240 frames
VUI:              range:limited
Others:        deblock pe

encoded 2881 frames, 83.75 fps, 4807.49 kbps, 68.80 MB
encode time 0:00:34, CPU: 1.4%, GPU: 4.0%, VE: 99.4%
frame type IDR   13
frame type I     13,  avgQP  18.31,  total size   3.12 MB
frame type P    720,  avgQP  19.84,  total size  41.48 MB
frame type B   2148,  avgQP  23.03,  total size  24.20 MB


VCEEncC.exe -i big_buck_bunny_1080p24.y4m --trim 0:2880 --video-metadata clear --metadata clear --chapter-copy -c avc --vbr 5000 --b-pyramid --preset slow --level auto --motion-est q-pel --pe --colorrange tv --avsync cfr -o big_buck_bunny_1080p-vce-5000-no-b.mp4
--------------------------------------------------------------------------------
big_buck_bunny_1080p-vce-5000-no-b.mp4
--------------------------------------------------------------------------------
VCEEnc (x86) 7.03 (r1144) by rigaya, Aug 10 2022 10:08:40 (VC 1932/Win)
OS:            Windows 11 x64 (22000) [UTF-8]
CPU:           AMD Ryzen 9 5950X 16-Core Processor [4.55GHz] (16C/32T)
GPU:           AMD Radeon RX 6900 XT, AMF Runtime 1.4.26 / SDK 1.4.26
Input Info:    y4m(yv12)->nv12 [AVX2], 1920x1080, 24/1 fps
Vpp Filters    copyHtoD
Output:        H.264/AVC  High @ Level 4
               1920x1080p 1:1 24.000fps (24/1fps)
               avwriter: h264 => mp4
Quality:       slow
VBR:           5000 kbps
Max bitrate:   25000 kbps
QP:            Min: 0, Max: 0
VBV Bufsize:   31250 kbps
Bframes:       0 frames
Pre Analysis:  off
Ref frames:    2 frames
LTR frames:    0 frames
Motion Est:    Q-pel
Slices:        1
GOP Len:       240 frames
VUI:              range:limited
Others:        deblock pe

encoded 2881 frames, 139.21 fps, 4808.76 kbps, 68.81 MB
encode time 0:00:20, CPU: 2.1%, GPU: 6.9%, VE: 98.8%
frame type IDR   13
frame type I     13,  avgQP  19.15,  total size   2.88 MB
frame type P   2868,  avgQP  23.16,  total size  65.93 MB


VCEEncC.exe -i big_buck_bunny_1080p24.y4m --trim 0:2880 --video-metadata clear --metadata clear --chapter-copy -c avc --vbr 5000 --bframes 3 --ref 3 --b-pyramid --preset slow --level auto --motion-est q-pel --pe --pa sc=high,ss=high,activity-type=yuv,paq=caq,taq=on,lookahead=32 --colorrange tv --avsync cfr -o big_buck_bunny_1080p-vce-5000-pa.mp4

--------------------------------------------------------------------------------
big_buck_bunny_1080p-vce-5000-pa.mp4
--------------------------------------------------------------------------------
storage->SetProperty(BPicturesDeltaQP)=6 failed: invalid param..
storage->SetProperty(ReferenceBPicturesDeltaQP)=4 failed: invalid param..
VCEEnc (x86) 7.03 (r1144) by rigaya, Aug 10 2022 10:08:40 (VC 1932/Win)
OS:            Windows 11 x64 (22000) [UTF-8]
CPU:           AMD Ryzen 9 5950X 16-Core Processor [4.56GHz] (16C/32T)
GPU:           AMD Radeon RX 6900 XT, AMF Runtime 1.4.26 / SDK 1.4.26
Input Info:    y4m(yv12)->nv12 [AVX2], 1920x1080, 24/1 fps
Vpp Filters    copyHtoD
Output:        H.264/AVC  High @ Level 4
               1920x1080p 1:1 24.000fps (24/1fps)
               avwriter: h264 => mp4
Quality:       slow
VBR:           5000 kbps
Max bitrate:   25000 kbps
QP:            Min: 0, Max: 0
VBV Bufsize:   31250 kbps
Bframes:       3 frames, b-pyramid: (null)
Delta QP:      Bframe: 0, RefBframe: 0
Pre Analysis:  sc high, ss high, activity yuv
               lookahead 32, caq medium, paq caq, taq on, motion-qual none, ltr off
Ref frames:    3 frames
LTR frames:    0 frames
Motion Est:    Q-pel
Slices:        1
GOP Len:       240 frames
VUI:              range:limited
Others:        deblock pe

encoded 2881 frames, 26.03 fps, 4822.74 kbps, 69.01 MB
encode time 0:01:50, CPU: 2.4%, GPU: 4.0%, VE: 85.1%
frame type IDR   22
frame type I     22,  avgQP  15.91,  total size   6.76 MB
frame type P    723,  avgQP  18.41,  total size  44.68 MB
frame type B   2136,  avgQP  24.33,  total size  17.58 MB
Terminating internal PA thread
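
If you want to compare the frame type breakdown between runs without reading the logs by eye, a small Python parser is enough. This is a minimal sketch that assumes the "frame type" summary lines look exactly like the ones in the logs above; other VCEEnc versions may format them differently, and the function name and regular expression are mine, not part of VCEEnc.

```python
import re

# Matches lines such as:
#   frame type P    723,  avgQP  18.41,  total size  44.68 MB
# The "frame type IDR   22" line has no avgQP/size and is skipped on purpose.
FRAME_LINE = re.compile(
    r"frame type\s+(?P<type>[IPB])\s+(?P<count>\d+),"
    r"\s+avgQP\s+(?P<qp>[\d.]+),"
    r"\s+total size\s+(?P<size>[\d.]+) MB"
)


def parse_frame_stats(log: str) -> dict:
    """Collect count, average QP and total size per frame type from a log."""
    stats = {}
    for m in FRAME_LINE.finditer(log):
        stats[m["type"]] = {
            "count": int(m["count"]),
            "avg_qp": float(m["qp"]),
            "size_mb": float(m["size"]),
        }
    return stats


# Summary lines copied from the Pre-Analysis run above.
log = """\
frame type IDR   22
frame type I     22,  avgQP  15.91,  total size   6.76 MB
frame type P    723,  avgQP  18.41,  total size  44.68 MB
frame type B   2136,  avgQP  24.33,  total size  17.58 MB
"""

stats = parse_frame_stats(log)
total_mb = sum(s["size_mb"] for s in stats.values())
for frame_type, s in stats.items():
    share = s["size_mb"] / total_mb
    print(f"{frame_type}: {s['count']} frames, avgQP {s['avg_qp']}, {share:.1%} of video data")
```

Running the same thing on the log without Pre-Analysis makes it easy to see how the bits shift between I, P and B frames once it is turned on.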