Performance test – multithread decode
Media SDK version
Intel Media SDK (4.0.024-HSW)
Test environments
CPU: Intel Core i7-3770 (Ivy Bridge)
OS: Ubuntu Server 12.04 LTS, kernel version 3.2.0-23, x86_64 GNU/Linux
Test program
kdvcodec_msdkdec_mt (svn: 10)
Working mode
a. Decode in simulated blocking mode.
b. Return each decoded frame as an NV12 buffer.
c. Run N decoding threads. Each thread opens and caches the H.264 elementary stream from the one specified file.
d. Timing: the clock starts before the threads are created and stops after all threads have terminated, so the figures are ballpark estimates.
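The timing described in step d can be pictured with the minimal Python sketch below. It is not the test program's code: `decode_worker`, `run_test`, and the `frames_per_loop` value are hypothetical stand-ins, and the worker only counts frames instead of decoding. It shows why the overall figure is an estimate: thread creation and teardown both fall inside the measured interval.

```python
import threading
import time

def decode_worker(frames_per_loop, loops, results, index):
    """Hypothetical stand-in for one decoding thread; it only counts frames."""
    decoded = 0
    for _ in range(loops):
        for _ in range(frames_per_loop):
            decoded += 1  # a real worker would hand one frame to the decoder here
    results[index] = decoded

def run_test(instances, loops, frames_per_loop=100):
    """Time N worker threads; frames_per_loop is an arbitrary placeholder."""
    results = [0] * instances
    start = time.perf_counter()  # clock starts before any thread exists
    threads = [
        threading.Thread(target=decode_worker,
                         args=(frames_per_loop, loops, results, i))
        for i in range(instances)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # clock stops only after every thread has ended
    elapsed = time.perf_counter() - start
    total = sum(results)
    return total, elapsed, total / elapsed

if __name__ == "__main__":
    total, elapsed, fps = run_test(instances=4, loops=4)
    print(f"output frames: {total}, fps={fps:.5f}, Total used time: {elapsed:.5f} s")
```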
Command-line arguments
kdvcodec_msdkdec_mt InputBitstream [TestType] [Instance Number] [LOOP]
Options:
[TestType]
0 – Decode only (including copying the decoded buffer from mfxFrameSurface1 to an mfxU8* byte pointer)
1 – Decode and print status
2 – Decode, print status, and save the YUV output to aaa.yuv
[Instance Number]
Start N instances (threads) to run the test. Don't set this value too large if your RAM is limited.
[LOOP]
Number of times the cached stream is fed to the decoder
Example:
./kdvcodec_msdkdec_mt /home/jacky/test.264 2 4 4
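The argument layout above could be handled along these lines. This is only a sketch: the test program's real parser is not shown in this report, `parse_args` is a hypothetical name, and the defaults used when an optional argument is omitted are assumptions made for illustration.

```python
TEST_TYPES = {
    0: "decode only (with copy to a byte buffer)",
    1: "decode and print status",
    2: "decode, print status, save YUV to aaa.yuv",
}

def parse_args(argv):
    """Parse: InputBitstream [TestType] [Instance Number] [LOOP]."""
    usage = ("usage: kdvcodec_msdkdec_mt InputBitstream "
             "[TestType] [Instance Number] [LOOP]")
    if not 1 <= len(argv) <= 4:
        raise ValueError(usage)
    # Assumed defaults for omitted optional arguments (not from the real program).
    values = [0, 1, 1]
    for i, text in enumerate(argv[1:]):
        values[i] = int(text)
    test_type, instances, loops = values
    if test_type not in TEST_TYPES:
        raise ValueError(f"TestType must be one of {sorted(TEST_TYPES)}")
    if instances < 1 or loops < 1:
        raise ValueError("Instance Number and LOOP must be at least 1")
    return {"input": argv[0], "test_type": test_type,
            "instances": instances, "loops": loops}

if __name__ == "__main__":
    print(parse_args(["/home/jacky/test.264", "2", "4", "4"]))
```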
Test scenario 1: 4 threads
Command
./kdvcodec_msdkdec_mt ~/Videos/red_kayak_1080p.h264 0 4 4
Description
Run 4 threads, loop 4 times, and output NV12 only
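For readers unfamiliar with the NV12 output used here: NV12 stores a full-resolution Y plane followed by one half-height plane of interleaved U and V bytes, so a packed frame takes width × height × 3 / 2 bytes. Decoder surfaces usually have a row pitch larger than the visible width, so the copy has to go row by row. The sketch below illustrates that idea in Python; `copy_nv12` and its parameters are hypothetical and are not the test program's copy routine.

```python
def copy_nv12(y_plane, uv_plane, pitch, width, height):
    """Pack a pitched NV12 surface into one contiguous buffer.

    y_plane:  bytes-like, `height` rows of `pitch` bytes each
    uv_plane: bytes-like, `height // 2` rows of `pitch` bytes (interleaved U, V)
    """
    out = bytearray(width * height * 3 // 2)
    pos = 0
    # Luma: copy only the visible `width` bytes of every pitched row.
    for row in range(height):
        src = row * pitch
        out[pos:pos + width] = y_plane[src:src + width]
        pos += width
    # Chroma: half as many rows, each still `width` bytes (U and V interleaved).
    for row in range(height // 2):
        src = row * pitch
        out[pos:pos + width] = uv_plane[src:src + width]
        pos += width
    return bytes(out)

if __name__ == "__main__":
    pitch, width, height = 2048, 1920, 1088  # pitch value is an assumed example
    y = bytes(pitch * height)
    uv = bytes(pitch * (height // 2))
    print(len(copy_nv12(y, uv, pitch, width, height)))
```

At the 1920×1088 resolution reported in the logs, one packed frame is 3,133,440 bytes.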
Result
On my PC this scenario decodes 500 ~ 530 fps, but CPU usage stays high throughout.
Here is a sample test result:
Overall: 11430, output frames: 9101, fps=524.50256, Total used time: 17.35168 s, 17351679 usec
CPU usage: 40% ~ 60%
jacky@ubuntu-msdk:/opt/workspace/msdk/bin$ ./kdvcodec_msdkdec_mt ~/Videos/red_kayak_1080p.h264 0 4 4
libva info: VA-API version 0.34.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: VA-API version 0.34.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_0_32
libva info: Found init function __vaDriverInit_0_32
libva info: va_openDriver() returns 0
libva info: va_openDriver() returns 0
libva info: VA-API version 0.34.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
Output Type = 1
libva info: Found init function __vaDriverInit_0_32
Output Type = 1
libva info: va_openDriver() returns 0
Output Type = 1
libva info: VA-API version 0.34.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_0_32
libva info: va_openDriver() returns 0
Output Type = 1
VideoInfo:
Resolution: 1920×1088
FPS: 60(2)
VideoInfo:
Resolution: 1920×1088
FPS: 60(2)
VideoInfo:
Resolution: 1920×1088
FPS: 60(2)
VideoInfo:
Resolution: 1920×1088
FPS: 60(2)
MFX_WRN_VIDEO_PARAM_CHANGED
…
MFX_WRN_VIDEO_PARAM_CHANGED
(0x00DD4010): 1143, output frames: 2278, fps=131.90689, Total used time: 17.26976 s, 17269757 usec
(0x00DDA1D0): 2286, output frames: 2258, fps=130.49756, Total used time: 17.30301 s, 17303006 usec
(0x00DE00C0): 3429, output frames: 2288, fps=132.09439, Total used time: 17.32095 s, 17320947 usec
(0x00DF7440): 4572, output frames: 2278, fps=131.37719, Total used time: 17.33939 s, 17339388 usec
————————————————-
Overall: 11430, output frames: 9101, fps=524.50256, Total used time: 17.35168 s, 17351679 usec
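The fps values in the log above are consistent with output frames divided by elapsed time. The short Python sketch below recomputes the figure from a result line as a sanity check; the regular expression and the `check_line` helper are written for this report's log format and are not part of the test program.

```python
import re

# Matches both the per-thread lines and the "Overall" line of the log.
RESULT_LINE = re.compile(
    r"output frames:\s*(\d+),\s*fps=([\d.]+),"
    r"\s*Total used time:\s*([\d.]+) s,\s*(\d+) usec"
)

def check_line(line):
    """Return (frames, reported_fps, recomputed_fps) for one result line."""
    match = RESULT_LINE.search(line)
    if match is None:
        raise ValueError("not a result line")
    frames = int(match.group(1))
    reported = float(match.group(2))
    usec = int(match.group(4))
    recomputed = frames / (usec / 1_000_000)  # frames per second
    return frames, reported, recomputed

if __name__ == "__main__":
    sample = ("Overall: 11430, output frames: 9101, fps=524.50256, "
              "Total used time: 17.35168 s, 17351679 usec")
    frames, reported, recomputed = check_line(sample)
    print(f"{frames} frames, reported {reported:.5f}, recomputed {recomputed:.5f}")
```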
Test scenario 2: 8 threads
Command
./kdvcodec_msdkdec_mt ~/Videos/red_kayak_1080p.h264 0 8 4
Description
Run 8 threads, loop 4 times, and output NV12 only
Result
Overall: 41148, output frames: 17686, fps=344.67759, Total used time: 51.31172 s, 51311720 usec
CPU usage: 30% ~ 50%
jacky@ubuntu-msdk:/opt/workspace/msdk/bin$ ./kdvcodec_msdkdec_mt ~/Videos/red_kayak_1080p.h264 0 8 4
libva info: VA-API version 0.34.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_0_32
libva info: va_openDriver() returns 0
Output Type = 1
libva info: VA-API version 0.34.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_0_32
libva info: va_openDriver() returns 0
Output Type = 1
libva info: VA-API version 0.34.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_0_32
libva info: va_openDriver() returns 0
libva info: VA-API version 0.34.0
Output Type = 1
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_0_32
libva info: va_openDriver() returns 0
Output Type = 1
libva info: VA-API version 0.34.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_0_32
libva info: va_openDriver() returns 0
Output Type = 1
libva info: VA-API version 0.34.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_0_32
libva info: va_openDriver() returns 0
Output Type = 1
libva info: VA-API version 0.34.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_0_32
libva info: va_openDriver() returns 0
Output Type = 1
libva info: VA-API version 0.34.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_0_32
libva info: va_openDriver() returns 0
VideoInfo:
Resolution: 1920×1088
FPS: 60(2)
Output Type = 1
VideoInfo:
Resolution: 1920×1088
FPS: 60(2)
…
VideoInfo:
Resolution: 1920×1088
FPS: 60(2)
MFX_WRN_VIDEO_PARAM_CHANGED
…
MFX_WRN_VIDEO_PARAM_CHANGED
(0x01D52010): 1143, output frames: 2278, fps=44.57920, Total used time: 51.10006 s, 51100059 usec
(0x01D581D0): 2286, output frames: 2164, fps=42.30637, Total used time: 51.15068 s, 51150684 usec
(0x01D5E0C0): 3429, output frames: 2203, fps=43.07164, Total used time: 51.14734 s, 51147344 usec
(0x01D75440): 4572, output frames: 2136, fps=41.71776, Total used time: 51.20122 s, 51201217 usec
(0x01D6A0D0): 5715, output frames: 2264, fps=44.20714, Total used time: 51.21345 s, 51213451 usec
(0x01D8F6B0): 6858, output frames: 2238, fps=43.66062, Total used time: 51.25900 s, 51259003 usec
(0x01D987D0): 8001, output frames: 2253, fps=43.93440, Total used time: 51.28100 s, 51281004 usec
(0x01DA18F0): 9144, output frames: 2152, fps=41.95352, Total used time: 51.29486 s, 51294859 usec
————————————————-
Overall: 41148, output frames: 17686, fps=344.67759, Total used time: 51.31172 s, 51311720 usec
Test scenario 3: 1 thread
Command
./kdvcodec_msdkdec_mt ~/Videos/red_kayak_1080p.h264 0 1 4
Description
Run 1 thread, loop 4 times, and output NV12 only
Result
Overall: 1143, output frames: 2278, fps=354.90440, Total used time: 6.41863 s, 6418630 usec
CPU usage: 10% ~ 20%
jacky@ubuntu-msdk:/opt/workspace/msdk/bin$ ./kdvcodec_msdkdec_mt ~/Videos/red_kayak_1080p.h264 0 1 4
libva info: VA-API version 0.34.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_0_32
libva info: va_openDriver() returns 0
Output Type = 1
VideoInfo:
Resolution: 1920×1088
FPS: 60(2)
MFX_WRN_VIDEO_PARAM_CHANGED
…
MFX_WRN_VIDEO_PARAM_CHANGED
(0x0146C010): 1143, output frames: 2278, fps=355.55328, Total used time: 6.40692 s, 6406916 usec
————————————————-
Overall: 1143, output frames: 2278, fps=354.90440, Total used time: 6.41863 s, 6418630 usec
Test scenario 4: 2 threads
Command
./kdvcodec_msdkdec_mt ~/Videos/red_kayak_1080p.h264 0 2 20
Description
Run 2 threads, loop 20 times, and output NV12 only
Result
Overall: 3429, output frames: 22785, fps=478.96642, Total used time: 47.57118 s, 47571185 usec
jacky@ubuntu-msdk:/opt/workspace/msdk/bin$ ./kdvcodec_msdkdec_mt ~/Videos/red_kayak_1080p.h264 0 2 20
libva info: VA-API version 0.34.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: VA-API version 0.34.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_0_32
libva info: Found init function __vaDriverInit_0_32
libva info: va_openDriver() returns 0
libva info: va_openDriver() returns 0
Output Type = 1
Output Type = 1
VideoInfo:
Resolution: 1920×1088
FPS: 60(2)
VideoInfo:
Resolution: 1920×1088
FPS: 60(2)
MFX_WRN_VIDEO_PARAM_CHANGED
…
MFX_WRN_VIDEO_PARAM_CHANGED
(0x00A0D010): 1143, output frames: 11398, fps=240.22250, Total used time: 47.44768 s, 47447679 usec
(0x00A131D0): 2286, output frames: 11390, fps=239.48212, Total used time: 47.56096 s, 47560961 usec
————————————————-
Overall: 3429, output frames: 22785, fps=478.96642, Total used time: 47.57118 s, 47571185 usec
Test scenario 5: 4 instances, no NV12 copy
Command
./kdvcodec_msdkdec_mt ~/Videos/riverbed_1920x1080_25.h264 0 4 40
Description
Run 4 parallel instances, loop 40 times, and don't output any buffer (to check whether the memcpy is what causes the CPU load)
Result
Overall: 5010, output frames: 39796, fps=594.05649, Total used time: 66.99026 s, 66990262 usec
CPU usage: 40% ~ 70%
jacky@ubuntu-msdk:/opt/workspace/msdk/bin$ ./kdvcodec_msdkdec_mt ~/Videos/riverbed_1920x1080_25.h264 0 4 40
libva info: VA-API version 0.34.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: VA-API version 0.34.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: VA-API version 0.34.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_0_32
libva info: Found init function __vaDriverInit_0_32
libva info: Found init function __vaDriverInit_0_32
libva info: va_openDriver() returns 0
libva info: va_openDriver() returns 0
libva info: va_openDriver() returns 0
Output Type = 1
Output Type = 1
Output Type = 1
libva info: VA-API version 0.34.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_0_32
libva info: va_openDriver() returns 0
Output Type = 1
VideoInfo:
Resolution: 1920×1088
FPS: 60(2)
VideoInfo:
Resolution: 1920×1088
FPS: 60(2)
VideoInfo:
Resolution: 1920×1088
FPS: 60(2)
VideoInfo:
Resolution: 1920×1088
FPS: 60(2)
MFX_WRN_VIDEO_PARAM_CHANGED
…
MFX_WRN_VIDEO_PARAM_CHANGED
(0x024CE010): 501, output frames: 9998, fps=150.19414, Total used time: 66.56718 s, 66567176 usec
(0x024D3190): 1002, output frames: 9973, fps=149.44981, Total used time: 66.73144 s, 66731435 usec
(0x024D8280): 1503, output frames: 9936, fps=148.62010, Total used time: 66.85502 s, 66855020 usec
(0x024DE380): 2004, output frames: 9920, fps=148.11366, Total used time: 66.97559 s, 66975591 usec
————————————————-
Overall: 5010, output frames: 39796, fps=594.05649, Total used time: 66.99026 s, 66990262 usec
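To put the memcpy question in numbers, the copy traffic can be estimated from the frame size and the frame rate. The Python sketch below does that arithmetic for the 1920×1088 resolution in the logs and the 524.5 fps measured in scenario 1. It assumes exactly one full-frame copy per output frame, which is an assumption about the test program rather than something the logs confirm; the function names are hypothetical.

```python
def nv12_frame_bytes(width, height):
    """Bytes in one packed NV12 frame: Y plane plus half-size interleaved UV."""
    return width * height * 3 // 2

def copy_bandwidth(width, height, fps):
    """Bytes copied per second, assuming one full-frame copy per output frame."""
    return nv12_frame_bytes(width, height) * fps

if __name__ == "__main__":
    per_frame = nv12_frame_bytes(1920, 1088)
    per_second = copy_bandwidth(1920, 1088, 524.50256)
    print(f"{per_frame} bytes per frame, {per_second / 1e9:.2f} GB/s copied")
```

Under that assumption the copy moves roughly 1.6 GB per second in scenario 1.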
Test scenario 5: extended test
This test used another version of the test program that adds a usleep() call after every input. The goal is to run the decoder more like it runs in practice, where video frames arrive at the codec periodically rather than back to back without pause.
Command
./kdvcodec_msdkdec_mt ~/Videos/red_kayak_1080p.h264 0 4 40
Description
Run 4 parallel instances, loop 40 times; each decode thread sleeps 30 ms after feeding one frame to the codec.
Result
Overall: 68136, output frames: 7993, fps=295.29221, Total used time: 27.06810 s, 27068103 usec
CPU usage: 5% ~ 10%
(0x02291010): 501, output frames: 498, fps=18.70292, Total used time: 26.62686 s, 26626861 usec
(0x02296190): 1002, output frames: 493, fps=18.37663, Total used time: 26.82755 s, 26827547 usec
(0x022B1660): 3507, output frames: 492, fps=18.32293, Total used time: 26.85159 s, 26851594 usec
(0x022A8FF0): 5010, output frames: 505, fps=18.77590, Total used time: 26.89618 s, 26896182 usec
(0x0229B280): 1503, output frames: 499, fps=18.54937, Total used time: 26.90118 s, 26901183 usec
(0x022A1380): 2004, output frames: 495, fps=18.38211, Total used time: 26.92836 s, 26928363 usec
(0x022AD0D0): 5511, output frames: 492, fps=18.36819, Total used time: 26.78543 s, 26785435 usec
(0x022B5740): 4008, output frames: 506, fps=18.81383, Total used time: 26.89511 s, 26895107 usec
(0x022B9820): 4509, output frames: 503, fps=18.72153, Total used time: 26.86746 s, 26867461 usec
(0x0229F0B0): 3006, output frames: 511, fps=19.08146, Total used time: 26.77992 s, 26779920 usec
(0x022D1C00): 6513, output frames: 499, fps=18.51492, Total used time: 26.95124 s, 26951235 usec
(0x022A5460): 2505, output frames: 496, fps=18.47320, Total used time: 26.84970 s, 26849702 usec
(0x022DDEA0): 8016, output frames: 498, fps=18.48051, Total used time: 26.94731 s, 26947310 usec
(0x022CDB20): 6012, output frames: 518, fps=19.27356, Total used time: 26.87620 s, 26876199 usec
(0x022D9DC0): 7515, output frames: 494, fps=18.36024, Total used time: 26.90597 s, 26905971 usec
(0x022D5CE0): 7014, output frames: 494, fps=18.40993, Total used time: 26.83335 s, 26833345 usec
————————————————-
Overall: 68136, output frames: 7993, fps=295.29221, Total used time: 27.06810 s, 27068103 usec
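The pacing used in this extended test can be pictured as below. It is a sketch under assumptions: `feed_frame` is a hypothetical callback standing in for handing one frame to the decoder, and Python's `time.sleep` replaces the usleep() call mentioned in the description. With a fixed sleep after every input, one thread can never feed more than 1 / interval frames per second, about 33.3 per second at 30 ms; the per-thread rates logged above (roughly 18 to 19 fps) sit below that ceiling.

```python
import time

def rate_ceiling(interval_s):
    """Upper bound on inputs per second when sleeping interval_s after each one."""
    return 1.0 / interval_s

def paced_feed(frames, feed_frame, interval_s=0.030):
    """Feed frames one at a time, sleeping after each; return achieved rate."""
    start = time.perf_counter()
    for frame in frames:
        feed_frame(frame)       # hypothetical: submit one frame to the decoder
        time.sleep(interval_s)  # periodic arrival instead of back-to-back input
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed

if __name__ == "__main__":
    rate = paced_feed(list(range(20)), lambda frame: None)
    print(f"achieved {rate:.2f} inputs/s, ceiling {rate_ceiling(0.030):.2f}")
```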
Test analysis & conclusion
According to the test results, scenario 1, running 4 instances, appears to give the best performance at 500+ 1080p frames per second; however, its CPU usage is also among the highest.
Best performance: scenario 1, peaking at 530+ fps (1080p) with CPU usage of 40% ~ 60%. Reusing the mfxFrameSurface1 pointer instead of copying may reach 600+ fps.
Lowest CPU load: scenario 3, with CPU usage of 10% ~ 20% while decoding 1080p video at 350+ fps.
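The scaling behind these conclusions can be tabulated from the overall fps figures of scenarios 1 to 4. The Python sketch below computes the per-thread rate and the speedup relative to the single-thread run. The numbers are copied from the logs above; `scaling_table` is a hypothetical helper, and since the runs used different loop counts this is only a rough comparison.

```python
# Overall fps by thread count, taken from scenarios 3, 4, 1 and 2.
RESULTS = {1: 354.90440, 2: 478.96642, 4: 524.50256, 8: 344.67759}

def scaling_table(results):
    """Return rows of (threads, total_fps, fps_per_thread, speedup_vs_1_thread)."""
    base = results[1]
    rows = []
    for threads in sorted(results):
        fps = results[threads]
        rows.append((threads, fps, fps / threads, fps / base))
    return rows

if __name__ == "__main__":
    for threads, fps, per_thread, speedup in scaling_table(RESULTS):
        print(f"{threads:>2} threads: {fps:9.2f} fps total, "
              f"{per_thread:7.2f} fps/thread, {speedup:4.2f}x vs 1 thread")
```

Four threads give about 1.48 times the single-thread throughput, while eight threads fall slightly below it.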
Tips for running this test:
1. Don't run too many instances if you don't have sufficient RAM.
2. Enlarge the shared memory for the graphics driver and see whether MSDK performance improves further.