JPEG Codec

Operator Effect

Schematic Image
VP5out_jpegcodec

Principle

JPEG is an international standard and a widely used lossy compression method for photographic images, developed by the Joint Photographic Experts Group. Its scope of application is wide: in addition to still image coding, it also extends to intra-frame compression of television image sequences.

JPEG Encoding Principle

JPEG encoding encodes YUV format images into JPEG compressed image files, such as *.jpg.

Schematic Image
VP5out_jpegencode

JPEG Decoding Principle

JPEG decoding decodes .jpg, .jpeg, .JPG, and .JPEG image files.

Schematic Image
VP5out_jpegdecode

The main process of JPEG encoding is described below, taking an 8x8 subregion of an image as an example. The values of the 8x8 image subregion are shown below:

\begin{aligned}\begin{bmatrix}52 & 55 & 61 & 66 & 70 & 61 & 64 & 73\\ 63 & 59 & 55 & 90 & 109 & 85 & 69 & 72\\62 & 59 & 68 & 113 & 144 & 104 & 66 & 73\\63 & 58 & 71 & 122 & 154 & 106 & 70 & 69\\67 & 61 & 68 & 104 & 126 & 88 & 68 & 70\\79 & 65 & 60 & 70 & 77 & 68 & 58 & 75\\85 & 71 & 64 & 59 & 55 & 61 & 65 & 83\\87 & 79 & 69 & 68 & 65 & 76 & 78 & 94\\\end{bmatrix}\end{aligned}

A two-dimensional Discrete Cosine Transform (DCT) is first performed on the 8x8 subregion, transforming the image data from the spatial domain into the frequency domain. Since the DCT accepts input values in the range [-128, 127], 128 is subtracted from each value of the 8x8 image subregion to obtain the following matrix:

\begin{aligned}\begin{bmatrix} -76&-73&-67&-62&-58&-67&-64&-55\\ -65&-69&-73&-38&-19&-43&-59&-56\\-66&-69&-60&-15&16&-24&-62&-55\\-65&-70&-57&-6&26&-22&-58&-59\\-61&-67&-60&-24&-2&-40&-60&-58\\-49&-63&-68&-58&-51&-60&-70&-53\\-43&-57&-64&-69&-73&-67&-63&-45\\-41&-49&-59&-60&-63&-52&-50&-34\end{bmatrix}\end{aligned}

The level-shifted matrix is then transformed by the DCT, using the formula:

F(u,v)=\cfrac{1}{4}C(u)C(v)\left[\displaystyle\sum_{m=0}^{7} \displaystyle\sum_{n=0}^{7}f(m,n)\cos\cfrac{(2m+1)u\pi}{16}\cos\cfrac{(2n+1)v\pi}{16}\right]

where C(k)=\cfrac{1}{\sqrt{2}} for k=0 and C(k)=1 otherwise.

The DCT coefficient matrix is obtained after transformation:

\begin{aligned}\begin{bmatrix} -415&-30&-61&27&56&-20&-2&0\\ 4&-22&-61&10&13&-7&-9&5\\-47&7&77&-25&-29&10&5&-6\\-49&12&34&-15&-10&6&2&2\\12&-7&-13&-4&-2&2&-3&3\\-8&3&2&-6&-2&1&4&2\\-1&0&0&-2&-1&-3&4&-1\\0&0&-1&-4&-1&0&1&2\end{bmatrix}\end{aligned}

The coefficient at point (0, 0) is called the direct current component (DC coefficient), and the coefficients at the remaining 63 points are called the alternating current components (AC coefficients).

Next, the DCT coefficients of the luminance and chrominance components are quantized: each DCT coefficient is divided by the corresponding entry of a quantization table and rounded to the nearest integer. Since the human eye is more sensitive to luminance signals than to chrominance signals, separate quantization tables are used for the luminance and chrominance components. The default quantization tables were derived from extensive experiments; custom quantization tables may also be used.

The default quantization table for the luminance component:

\begin{aligned}\begin{bmatrix} 16& 11& 10& 16& 24& 40& 51& 61 \\12& 12& 14& 19& 26& 58& 60& 55 \\14& 13& 16& 24& 40& 57& 69& 56\\14& 17& 22& 29& 51& 87& 80& 62\\18& 22& 37& 56& 68& 109& 103& 77\\24& 35& 55& 64& 81& 104& 113& 92\\49& 64& 78& 87& 103& 121& 120& 101\\72& 92& 95& 98& 112& 100& 103& 99\\\end{bmatrix}\end{aligned}

The default quantization table for the chrominance components:

\begin{aligned}\begin{bmatrix} 17 & 18 & 24 & 47 & 99 & 99 & 99 & 99\\18 & 21 & 26 & 66 & 99 & 99 & 99 & 99\\24 & 26 & 56 & 99 & 99 & 99 & 99 & 99\\47 & 66 & 99 & 99 & 99 & 99 & 99 & 99\\99 & 99 & 99 & 99 & 99 & 99 & 99 & 99\\99 & 99 & 99 & 99 & 99 & 99 & 99 & 99\\99 & 99 & 99 & 99 & 99 & 99 & 99 & 99\\99 & 99 & 99 & 99 & 99 & 99 & 99 & 99\\\end{bmatrix}\end{aligned}

The quantized DCT coefficients are obtained by dividing the DCT coefficient matrix above element-wise by the default luminance quantization table and rounding to the nearest integer:

\begin{aligned}\begin{bmatrix} -26 & -3 & -6 & 2 & 2 & -1 & 0 & 0\\0 & -2 & -4 & 1 & 1 & 0 & 0 & 0\\-3 & 1 & 5 & -1 & -1 & 0 & 0 & 0\\-4 & 1 & 2 & -1 & 0 & 0 & 0 & 0\\1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\\end{bmatrix}\end{aligned}

Observe that after quantization the DC coefficient is large relative to the AC coefficients, and that the AC coefficients contain a large number of zeros. Zigzag (Z-shaped) scanning therefore concatenates long runs of zeros, reducing the encoded size. The idea is to traverse the quantized DCT coefficients in a zigzag pattern starting from the top-left element:

Schematic Image
VP5out_jpegzigzag

Since the DC coefficients after zigzag scanning have large values, and the DC coefficients of neighboring 8x8 image regions do not vary much, differential pulse-code modulation (DPCM) is used to encode the difference between the DC coefficients of neighboring regions; for the AC coefficients, where the same value (mostly zero) is repeated many times in a row, run-length encoding is used. Both encoding methods produce intermediate formats intended to further reduce storage.

After obtaining the intermediate formats of the DC and AC coefficients, both are entropy coded to further compress the image data. Compression is achieved by encoding symbols with a higher probability of occurrence with fewer bits. The JPEG baseline system specifies Huffman coding. Different Huffman tables are used for the DC and AC coefficients, and different tables are used for luminance and chrominance, so four Huffman tables are needed to complete the entropy coding; the actual Huffman coding is then performed efficiently via table lookups. The JPEG standard does not define default Huffman tables, so the user is free to choose them for the application: the tables recommended by the JPEG standard can be used, a generic table can be predefined, or tables can be computed for a particular image by collecting its statistics before compression.

API Interface

JPEG Encoding Interface

// create JPEG encoding context
int32_t hbVPCreateJPEGEncContext(hbVPJPEGContext *context, hbVPJPEGEncParam *param);
// JPEG encoding API interface
int32_t hbVPJPEGEncode(hbUCPTaskHandle_t *taskHandle, hbVPImage const *srcImg, hbVPJPEGContext context);
// get JPEG encoding output buffer
int32_t hbVPGetJPEGEncOutputBuffer(hbUCPTaskHandle_t taskHandle, hbVPArray *dstBuf);
// release JPEG encoding context
int32_t hbVPReleaseJPEGEncContext(hbVPJPEGContext context);

JPEG Decoding Interface

// create JPEG decoding context
int32_t hbVPCreateJPEGDecContext(hbVPJPEGContext *context, uint32_t outBufCount, uint8_t imageFormat);
// JPEG decoding API interface
int32_t hbVPJPEGDecode(hbUCPTaskHandle_t *taskHandle, hbVPArray const *srcBuf, hbVPJPEGContext context);
// get JPEG decoding output buffer
int32_t hbVPGetJPEGDecOutputBuffer(hbUCPTaskHandle_t taskHandle, hbVPImage *dstImg);
// release JPEG decoding context
int32_t hbVPReleaseJPEGDecContext(hbVPJPEGContext context);

For detailed interface information, please refer to hbVPJPEGEncode and hbVPJPEGDecode.

Usage

JPEG Encoding Usage

// Include the header
#include "hobot/hb_ucp.h"
#include "hobot/vp/hb_vp.h"
#include "hobot/vp/hb_vp_jpeg_codec.h"

// init image_buf, alloc memory for image data
hbUCPSysMem image_mem;
hbUCPMalloc(&image_mem, yuv_size, 0);
hbVPImage image_buf = hbVPImage{HB_VP_IMAGE_FORMAT_YUV420, HB_VP_IMAGE_TYPE_U8C1,
                                width, height, stride, image_mem.virAddr,
                                image_mem.phyAddr, nullptr, 0, 0};

// init jpeg_buf
hbVPArray jpeg_buf{0};

// init encoding param
hbVPJPEGEncParam enc_param;
enc_param.qualityFactor = 50;
enc_param.extendedSequential = 0;
enc_param.width = width;
enc_param.height = height;
enc_param.imageFormat = image_format;
enc_param.outBufCount = 5;

// create encoding context
hbVPJPEGContext enc_context{nullptr};
hbVPCreateJPEGEncContext(&enc_context, &enc_param);

// init task handle and schedule param
hbUCPTaskHandle_t task_handle{nullptr};
hbUCPSchedParam sched_param;
HB_UCP_INITIALIZE_SCHED_PARAM(&sched_param);
sched_param.backend = HB_UCP_JPU_CORE_0;
sched_param.priority = 0;

// create encoding task
hbVPJPEGEncode(&task_handle, &image_buf, enc_context);
// submit encoding task
hbUCPSubmitTask(task_handle, &sched_param);
// wait for encoding task done
hbUCPWaitTaskDone(task_handle, 0);
// get encoded buffer
hbVPGetJPEGEncOutputBuffer(task_handle, &jpeg_buf);
// process jpeg data

// release task handle
hbUCPReleaseTask(task_handle);
// release encoding context
hbVPReleaseJPEGEncContext(enc_context);
// release memory
hbUCPFree(&image_mem);

JPEG Decoding Usage

// Include the header
#include "hobot/hb_ucp.h"
#include "hobot/vp/hb_vp.h"
#include "hobot/vp/hb_vp_jpeg_codec.h"

// init jpeg_buf, alloc memory for jpeg data
hbUCPSysMem jpeg_mem;
hbUCPMalloc(&jpeg_mem, jpeg_size, 0);
hbVPArray jpeg_buf;
jpeg_buf.phyAddr = jpeg_mem.phyAddr;
jpeg_buf.virAddr = jpeg_mem.virAddr;
jpeg_buf.memSize = jpeg_mem.memSize;
jpeg_buf.size = jpeg_mem.memSize;
jpeg_buf.capacity = jpeg_mem.memSize;

// init image_buf
hbVPImage image_buf{0};

// create decoding context
hbVPJPEGContext dec_context{nullptr};
hbVPCreateJPEGDecContext(&dec_context, outbuf_count, image_format);

// init task handle and schedule param
hbUCPTaskHandle_t task_handle{nullptr};
hbUCPSchedParam sched_param;
HB_UCP_INITIALIZE_SCHED_PARAM(&sched_param);
sched_param.backend = HB_UCP_JPU_CORE_0;
sched_param.priority = 0;

// create decoding task
hbVPJPEGDecode(&task_handle, &jpeg_buf, dec_context);
// submit decoding task
hbUCPSubmitTask(task_handle, &sched_param);
// wait for decoding task done
hbUCPWaitTaskDone(task_handle, 0);
// get decoded buffer
hbVPGetJPEGDecOutputBuffer(task_handle, &image_buf);
// process image data

// release task handle
hbUCPReleaseTask(task_handle);
// release decoding context
hbVPReleaseJPEGDecContext(dec_context);
// release memory
hbUCPFree(&jpeg_mem);