AMD made some impressive quality jumps a few months ago when they re-introduced B-frames to their H.264/AVC hardware encoder. Now they have done it again with AMF 1.4.26 by adding temporal Pre-Analysis, which detects scene changes and can insert I-frames (intra frames) where they are needed.
While this won’t help with real time streaming*, it’s another large boost for anybody doing video conversion. Before, we saw large dips during scene changes (Blue line). Now with Pre-Analysis enabled, those virtually go away (Orange line)!
Using the same Big Buck Bunny video as last time, we see great improvements. But let’s try with a clip that doesn’t really have a traditional scene change. I will use a video of Clara Griffith painting one of her absurd paintings. Still image from video:
Here we can see the new Pre-Analysis (Red) still does better overall.
What does Pre-Analysis do?
AMD has a great overview of what their Pre-Analysis encoding process does, so I won’t try to rephrase it myself:
During [Pre-Analysis] encoding, Content Adaptive Quantization (CAQ) is applied based on the results generated by [Pre-Analysis]. The encoder also makes various encoding decisions based on [Pre-Analysis] results. For example, depending on whether the PA scene change detection flag is triggered or not, the encoder may force an intra encoded frame and apply a new frame QP at the new scene. The encoder may also insert a skip frame based on whether the PA static scene detection flag is triggered or not.
AMD AMF Docs – https://github.com/GPUOpen-LibrariesAndSDKs/AMF/tree/master/amf/doc
tl;dr: The encoder can make smarter decisions about future frames if it knows what is coming.
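To make the scene-change part of that idea concrete, here is a toy sketch in Python. It is not AMD's algorithm: the frame representation (flat lists of luma samples), the mean-absolute-difference metric, and the `threshold` value are all assumptions chosen for illustration. It only shows the general shape of the decision: force an I-frame when the picture changes sharply, otherwise fall back to the regular GOP interval.

```python
def choose_frame_types(frames, threshold=40.0, gop_len=240):
    """Return 'I' or 'P' for each frame (toy model, not AMF's logic).

    frames: list of equal-length lists of luma samples (hypothetical stand-in
            for real video frames).
    threshold: assumed mean-absolute-difference cutoff for a scene change.
    gop_len: regular keyframe interval used when no scene change is seen.
    """
    types = []
    frames_in_gop = 0
    for i, frame in enumerate(frames):
        if i == 0:
            scene_change = True  # the first frame always has to be intra coded
        else:
            prev = frames[i - 1]
            mad = sum(abs(a - b) for a, b in zip(frame, prev)) / len(frame)
            scene_change = mad > threshold
        if scene_change or frames_in_gop >= gop_len:
            types.append("I")
            frames_in_gop = 1
        else:
            types.append("P")
            frames_in_gop += 1
    return types


# Three dark frames followed by three bright frames: one hard cut at index 3.
clip = [[10] * 4] * 3 + [[200] * 4] * 3
print(choose_frame_types(clip))  # ['I', 'P', 'P', 'I', 'P', 'P']
```

A real encoder works on motion and activity statistics rather than raw pixel differences, but the trade-off is the same: an I-frame at a cut costs bits up front and avoids predicting from a frame that no longer resembles the picture.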
The Competition
First let’s see how it has improved against itself. We ran these tests using VCEEnc 7.03.
Against Itself
This is using the 1080p 24fps Big Buck Bunny video. VMAF climbs from 94.12 to 96.07, but encoding speed drops from about 139 fps to 26 fps.
Setting | VMAF Score | Speed | Bitrate (set at 5000) | I-Frames |
AMD No B-Frames, No Pre-Analysis | 94.1215 | 139.21 fps | 4808.76 kbps | 13 |
AMD B-Frames, No Pre-Analysis | 95.3877 | 83.75 fps | 4807.49 kbps | 13 |
AMD B-Frames, Pre-Analysis | 96.0697 | 26.03 fps | 4822.74 kbps | 22 |
Notice how with Pre-Analysis enabled, we have a lot more I-Frames. This means it was able to detect places where P or B frames would have caused large quality losses, and inserted a higher quality I-Frame instead.
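As a rough sanity check on those I-Frame counts, the arithmetic below uses the frame total and GOP length reported in the encoder logs at the end of this post. It assumes keyframes in the non-PA runs fall purely on the fixed GOP interval, and it treats the difference as "extra" keyframes, which is only an approximation since an inserted keyframe may restart the GOP counter.

```python
import math

total_frames = 2881  # "encoded 2881 frames" in the logs
gop_len = 240        # "GOP Len: 240 frames" in the logs

# With a fixed GOP, a keyframe lands on frame 0, 240, 480, ... 2880.
fixed_gop_keyframes = math.ceil(total_frames / gop_len)

keyframes_with_pa = 22  # I-Frames column for the Pre-Analysis run
extra_keyframes = keyframes_with_pa - fixed_gop_keyframes

print(fixed_gop_keyframes)  # 13
print(extra_keyframes)      # 9
```

The 13 matches both runs without Pre-Analysis, which suggests those keyframes come from the GOP interval alone, and that roughly 9 of the 22 in the Pre-Analysis run were placed by scene-change detection.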
Next we’ll use Clara Griffith’s painting video, in 1080p 30fps, which doesn’t have a traditional scene change.
Setting | VMAF Score | Speed | Bitrate (set at 3000) | I-Frames |
AMD B-Frames, No Pre-Analysis | 85.7606 | 76.08 fps | 3018.80 kbps | 1 |
AMD B-Frames, Pre-Analysis | 86.1705 | 25.16 fps | 3018.04 kbps | 1 |
In this case we see a 0.4 VMAF improvement, smaller than the 0.7 gain on the Big Buck Bunny video but still impressive. Both runs contain only a single I-Frame, which shows that it’s not only scene changes making the difference.
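For anyone who wants to reproduce those deltas, here is a small Python sketch using only the numbers from the two tables above. The `summarize` helper and the dictionary layout are just illustrative names, not part of any tool.

```python
# (VMAF score, encode fps) with B-frames, Pre-Analysis off vs. on
results = {
    "Big Buck Bunny": {"off": (95.3877, 83.75), "on": (96.0697, 26.03)},
    "Painting": {"off": (85.7606, 76.08), "on": (86.1705, 25.16)},
}


def summarize(off, on):
    """Return (VMAF gain, slowdown factor), both rounded to two decimals."""
    vmaf_gain = on[0] - off[0]
    slowdown = off[1] / on[1]
    return round(vmaf_gain, 2), round(slowdown, 2)


for name, runs in results.items():
    gain, slowdown = summarize(runs["off"], runs["on"])
    print(f"{name}: +{gain} VMAF, {slowdown}x slower")
# Big Buck Bunny: +0.68 VMAF, 3.22x slower
# Painting: +0.41 VMAF, 3.02x slower
```

In both clips the speed cost is about 3x, while the quality gain depends on the content.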
Against Intel QSV and Nvidia NVEnc
But does it stack up vs the others now? I am grabbing the scores from my last blog post for Intel and Nvidia, where you can also see the commands used for them.
Contender | VMAF Score | Bitrate (set at 5000) |
AMD | 96.0697 | 4822 kbps |
Intel | 96.37 | 4890 kbps |
Nvidia | 96.13 | 4892 kbps |
In this case, AMD is finally within arm’s length of the big boys!
Considerations
I want to add some clear “hold up” info here before this gets spread around as gospel.
- This is a single test, at a single bitrate. There is not enough info to draw any definitive conclusions about true comparative quality vs other encoders. This could be the worst or best spot for one of them. The goal here is to show how it improved against itself.
- This is with the example PA settings from the VCEEnc encoder, with no tweaking tried. These could be the optimal Pre-Analysis settings for this movie, or the worst. I don’t know.
- * This will not help with real time streaming or game capture. It’s too slow to use for that, only available for VBR mode, and you need future knowledge of frames for the appropriately named “Pre-Analysis”. Some software may choose to add it with a large frame buffer, but I doubt it will be a standard feature anytime soon for those cases.
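To put rough numbers on the real-time point, the sketch below compares the measured encode speeds against each source frame rate and estimates the minimum delay implied by the `lookahead=32` setting in the Pre-Analysis command. Both helper functions are hypothetical, and the delay figure assumes the encoder must buffer the full lookahead window before emitting a frame, which is my assumption rather than something the AMF docs state.

```python
def realtime_headroom(encode_fps, source_fps):
    """Ratio above 1.0 means the encoder keeps up with the source frame rate."""
    return encode_fps / source_fps


def lookahead_delay_seconds(lookahead_frames, source_fps):
    """Minimum added delay if the whole lookahead window must be buffered."""
    return lookahead_frames / source_fps


# Measured on an RX 6900 XT with Pre-Analysis enabled (from the tables above).
print(round(realtime_headroom(26.03, 24), 2))    # 1.08 for Big Buck Bunny at 24 fps
print(round(realtime_headroom(25.16, 30), 2))    # 0.84 for the painting video at 30 fps
print(round(lookahead_delay_seconds(32, 24), 2)) # 1.33 seconds of buffered frames
```

With these settings the encoder barely keeps pace with a 24 fps source and falls behind at 30 fps, before any capture or game load is added on top. Other settings or hardware may behave differently.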
Commands Used for Pre-Analysis
VCEEncC.exe -i big_buck_bunny_1080p24.y4m --trim 0:2880 --video-metadata clear --metadata clear --chapter-copy -c avc --vbr 5000 --bframes 3 --ref 3 --b-pyramid --preset slow --level auto --motion-est q-pel --pe --colorrange tv --avsync cfr -o big_buck_bunny_1080p-vce-5000-bp.mp4
--------------------------------------------------------------------------------
big_buck_bunny_1080p-vce-5000-bp.mp4
--------------------------------------------------------------------------------
storage->SetProperty(BPicturesDeltaQP)=6 failed: invalid param..
storage->SetProperty(ReferenceBPicturesDeltaQP)=4 failed: invalid param..
VCEEnc (x86) 7.03 (r1144) by rigaya, Aug 10 2022 10:08:40 (VC 1932/Win)
OS: Windows 11 x64 (22000) [UTF-8]
CPU: AMD Ryzen 9 5950X 16-Core Processor [4.58GHz] (16C/32T)
GPU: AMD Radeon RX 6900 XT, AMF Runtime 1.4.26 / SDK 1.4.26
Input Info: y4m(yv12)->nv12 [AVX2], 1920x1080, 24/1 fps
Vpp Filters copyHtoD
Output: H.264/AVC High @ Level 4 1920x1080p 1:1 24.000fps (24/1fps)
avwriter: h264 => mp4
Quality: slow
VBR: 5000 kbps
Max bitrate: 25000 kbps
QP: Min: 0, Max: 0
VBV Bufsize: 31250 kbps
Bframes: 3 frames, b-pyramid: (null)
Delta QP: Bframe: 0, RefBframe: 0
Pre Analysis: off
Ref frames: 3 frames
LTR frames: 0 frames
Motion Est: Q-pel
Slices: 1
GOP Len: 240 frames
VUI: range:limited
Others: deblock pe
encoded 2881 frames, 83.75 fps, 4807.49 kbps, 68.80 MB
encode time 0:00:34, CPU: 1.4%, GPU: 4.0%, VE: 99.4%
frame type IDR 13
frame type I 13, avgQP 18.31, total size 3.12 MB
frame type P 720, avgQP 19.84, total size 41.48 MB
frame type B 2148, avgQP 23.03, total size 24.20 MB

VCEEncC.exe -i big_buck_bunny_1080p24.y4m --trim 0:2880 --video-metadata clear --metadata clear --chapter-copy -c avc --vbr 5000 --b-pyramid --preset slow --level auto --motion-est q-pel --pe --colorrange tv --avsync cfr -o big_buck_bunny_1080p-vce-5000-no-b.mp4
--------------------------------------------------------------------------------
big_buck_bunny_1080p-vce-5000-no-b.mp4
--------------------------------------------------------------------------------
VCEEnc (x86) 7.03 (r1144) by rigaya, Aug 10 2022 10:08:40 (VC 1932/Win)
OS: Windows 11 x64 (22000) [UTF-8]
CPU: AMD Ryzen 9 5950X 16-Core Processor [4.55GHz] (16C/32T)
GPU: AMD Radeon RX 6900 XT, AMF Runtime 1.4.26 / SDK 1.4.26
Input Info: y4m(yv12)->nv12 [AVX2], 1920x1080, 24/1 fps
Vpp Filters copyHtoD
Output: H.264/AVC High @ Level 4 1920x1080p 1:1 24.000fps (24/1fps)
avwriter: h264 => mp4
Quality: slow
VBR: 5000 kbps
Max bitrate: 25000 kbps
QP: Min: 0, Max: 0
VBV Bufsize: 31250 kbps
Bframes: 0 frames
Pre Analysis: off
Ref frames: 2 frames
LTR frames: 0 frames
Motion Est: Q-pel
Slices: 1
GOP Len: 240 frames
VUI: range:limited
Others: deblock pe
encoded 2881 frames, 139.21 fps, 4808.76 kbps, 68.81 MB
encode time 0:00:20, CPU: 2.1%, GPU: 6.9%, VE: 98.8%
frame type IDR 13
frame type I 13, avgQP 19.15, total size 2.88 MB
frame type P 2868, avgQP 23.16, total size 65.93 MB

VCEEncC.exe -i big_buck_bunny_1080p24.y4m --trim 0:2880 --video-metadata clear --metadata clear --chapter-copy -c avc --vbr 5000 --bframes 3 --ref 3 --b-pyramid --preset slow --level auto --motion-est q-pel --pe --pa sc=high,ss=high,activity-type=yuv,paq=caq,taq=on,lookahead=32 --colorrange tv --avsync cfr -o big_buck_bunny_1080p-vce-5000-pa.mp4
--------------------------------------------------------------------------------
big_buck_bunny_1080p-vce-5000-pa.mp4
--------------------------------------------------------------------------------
storage->SetProperty(BPicturesDeltaQP)=6 failed: invalid param..
storage->SetProperty(ReferenceBPicturesDeltaQP)=4 failed: invalid param..
VCEEnc (x86) 7.03 (r1144) by rigaya, Aug 10 2022 10:08:40 (VC 1932/Win)
OS: Windows 11 x64 (22000) [UTF-8]
CPU: AMD Ryzen 9 5950X 16-Core Processor [4.56GHz] (16C/32T)
GPU: AMD Radeon RX 6900 XT, AMF Runtime 1.4.26 / SDK 1.4.26
Input Info: y4m(yv12)->nv12 [AVX2], 1920x1080, 24/1 fps
Vpp Filters copyHtoD
Output: H.264/AVC High @ Level 4 1920x1080p 1:1 24.000fps (24/1fps)
avwriter: h264 => mp4
Quality: slow
VBR: 5000 kbps
Max bitrate: 25000 kbps
QP: Min: 0, Max: 0
VBV Bufsize: 31250 kbps
Bframes: 3 frames, b-pyramid: (null)
Delta QP: Bframe: 0, RefBframe: 0
Pre Analysis: sc high, ss high, activity yuv lookahead 32, caq medium, paq caq, taq on, motion-qual none, ltr off
Ref frames: 3 frames
LTR frames: 0 frames
Motion Est: Q-pel
Slices: 1
GOP Len: 240 frames
VUI: range:limited
Others: deblock pe
encoded 2881 frames, 26.03 fps, 4822.74 kbps, 69.01 MB
encode time 0:01:50, CPU: 2.4%, GPU: 4.0%, VE: 85.1%
frame type IDR 22
frame type I 22, avgQP 15.91, total size 6.76 MB
frame type P 723, avgQP 18.41, total size 44.68 MB
frame type B 2136, avgQP 24.33, total size 17.58 MB
Terminating internal PA thread
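If you want to pull the numbers for the tables out of logs like these automatically, a small parser along the following lines should work. This is a sketch written against the summary lines shown above, not an official VCEEnc output format, so the regular expressions and the `parse_vceenc_log` name are my own assumptions and may need adjusting for other versions.

```python
import re

# Patterns based on the summary lines printed in the logs above.
SUMMARY_RE = re.compile(
    r"encoded (\d+) frames, ([\d.]+) fps, ([\d.]+) kbps, ([\d.]+) MB"
)
FRAME_TYPE_RE = re.compile(r"frame type (IDR|I|P|B)\s+(\d+)")


def parse_vceenc_log(text):
    """Extract frame count, speed, bitrate, size and per-type frame counts."""
    m = SUMMARY_RE.search(text)
    if m is None:
        raise ValueError("no 'encoded ... frames' summary line found")
    counts = {kind: int(n) for kind, n in FRAME_TYPE_RE.findall(text)}
    return {
        "frames": int(m.group(1)),
        "fps": float(m.group(2)),
        "kbps": float(m.group(3)),
        "size_mb": float(m.group(4)),
        "frame_counts": counts,
    }


# Tail of the Pre-Analysis run, copied from the log above.
sample = """encoded 2881 frames, 26.03 fps, 4822.74 kbps, 69.01 MB
encode time 0:01:50, CPU: 2.4%, GPU: 4.0%, VE: 85.1%
frame type IDR 22
frame type I 22, avgQP 15.91, total size 6.76 MB
frame type P 723, avgQP 18.41, total size 44.68 MB
frame type B 2136, avgQP 24.33, total size 17.58 MB"""

stats = parse_vceenc_log(sample)
print(stats["fps"], stats["kbps"], stats["frame_counts"])
```

One handy consistency check: the I, P and B counts should add up to the encoded frame total (22 + 723 + 2136 = 2881 for this run).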
Hello! Great articles so far about the new updates to the AMF encoder. I do have some more information that you may be interested in regarding both.
The PreAnalysis Component is useable with effectively all rate control methods; the documentation to reflect this has yet to be updated. Reference: AMF github issue ticket #318 https://github.com/GPUOpen-LibrariesAndSDKs/AMF/issues/318#issuecomment-1148055185
Regarding the PreAnalysis Component being useful for streaming: PreAnalysis works fantastically in the OBS Studio v28 beta. VCEEnc was only just updated yesterday to remedy an issue where B-frames and the PreAnalysis Component could not be used simultaneously. The 25fps you observed while using PreAnalysis does not match what I have observed over the past few weeks of testing GitHub builds and the beta builds of OBS Studio v28.
Hello, great info (especially the speed values). Could you also include those in the older table(s) with Intel / Nvidia? (With current energy costs, depending on your machine, it’s sometimes more economical to buy an extra HD than to spend the energy needed to shrink a movie while still preserving decent quality.) Also, do you have a comparison planned with an Apple M1/M2 machine (including quality vs. file size vs. speed), or do you know where I can find some decent info about the Apple HW? So far I have only found tests that “advertised” the speed but ignored quality / file size, which makes them useless.
Another topic: I’m trying to “reverse engineer” the used encoding pre-sets of encoded movies via MediaInfo. Could you give me a hint how I can identify the different HW-encoders?
Thanks and regards,
Sudder