Changes of Revision 12
libde265.changes
Changed
@@ -1,4 +1,48 @@
 -------------------------------------------------------------------
+Tue Jun 16 07:40:59 UTC 2026 - Bjørn Lie <zaitor@opensuse.org>
+
+- Update to version 1.1.1:
+  + The decoding speed has been improved by about 8% on x86 CPUs
+    thanks to more SIMD acceleration and optimized CABAC code. Also
+    the startup time has been improved, which gives a 3% speed
+    improvement when decoding HEIC files with similar-sized tiles.
+  + Security:
+    - CVE-2026-54240 (GHSA-ccfw-29x7-rrx3) - Pixel accessor signed
+      integer overflow causes heap OOB read/write
+    - CVE-2026-54241 (GHSA-j2qq-x2xq-g9wr) - SAO sequential filter
+      heap buffer overflow via signed integer overflow
+- Changes from version 1.1.0:
+  + Added de265_security_limits parameters to limit the maximum
+    image size and memory that libde265 will use during decoding.
+  + Security fixes:
+    - CVE-2026-49295 (GHSA-g2rg-wj66-w594) - Out-of-bounds write in
+      process_reference_picture_set via predicted short-term RPS
+    - CVE-2026-49337 (GHSA-g5hj-rf9f-7vxm) - Unbounded memory
+      accumulation via orphaned slice headers in read_slice_NAL
+    - CVE-2026-49346 (GHSA-vv8h-932h-7r86) - Heap buffer overflow
+      in de265_image_get_buffer via SPS dimension integer overflow
+    - (GHSA-x27c-jp65-g395) - Quadratic CPU consumption in NAL
+      parser (remove_stuffing_bytes, resize)
+- Changes from version 1.0.19:
+  + This release contains a number of security fixes, correctness
+    fixes for edge cases, and build/packaging improvements.
+    The public API is binary-compatible with v1.0.18.
+  + Security fixes:
+    - CVE-2026-45382 (GHSA-hwhx-x2mq-ccr9) : Heap-buffer-overflow
+      READ in decode_slice_unit_tiles via unvalidated PPS tile
+      geometry.
+    - CVE-2026-45383 (GHSA-wg9q-ppqw-6q38) : Heap buffer overflow
+      (OOB read) in decode_slice_unit_WPP() via out-of-bounds
+      CtbAddrRStoTS access
+- Changes from version 1.0.18:
+  + libde265ConfigVersion.cmake renamed to
+    libde265-config-version.cmake
+  + fix pkg-config when installing to absolute paths
+  + fix compilation with MSVC in Debug mode
+  + removed the (defunct) encoder code and the internal development
+    tools from the tarball.
+
+-------------------------------------------------------------------
 Wed Mar 18 13:35:29 UTC 2026 - Bjørn Lie <zaitor@opensuse.org>
 
 - Update to version 1.0.17:
libde265.spec
Changed
@@ -18,7 +18,7 @@
 %define so_ver 0
 Name: libde265
-Version: 1.0.17
+Version: 1.1.1
 Release: 0
 Summary: Open H.265 video codec implementation
 License: LGPL-3.0-only
libde265-1.0.17.tar.gz/enc265
Deleted
-(directory)
libde265-1.0.17.tar.gz/enc265/CMakeLists.txt
Deleted
@@ -1,15 +0,0 @@
-add_executable (enc265
-  enc265.cc
-  image-io-png.cc
-)
-
-if(MSVC)
-  target_sources(enc265 PRIVATE
-    ../extra/getopt.c
-    ../extra/getopt_long.c
-  )
-endif()
-
-target_link_libraries (enc265 PRIVATE de265)
-
-install (TARGETS enc265 DESTINATION ${CMAKE_INSTALL_BINDIR})
libde265-1.0.17.tar.gz/enc265/COPYING
Deleted
@@ -1,20 +0,0 @@
-
-MIT License
-
-Permission is hereby granted, free of charge, to any person obtaining a copy
-of this software and associated documentation files (the "Software"), to deal
-in the Software without restriction, including without limitation the rights
-to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
-copies of the Software, and to permit persons to whom the Software is
-furnished to do so, subject to the following conditions:
-
-The above copyright notice and this permission notice shall be included in all
-copies or substantial portions of the Software.
-
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
-OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-SOFTWARE.
libde265-1.0.17.tar.gz/enc265/enc265.cc
Deleted
@@ -1,361 +0,0 @@
-/*
-  libde265 example application "enc265".
-
-  MIT License
-
-  Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de>
-
-  Permission is hereby granted, free of charge, to any person obtaining a copy
-  of this software and associated documentation files (the "Software"), to deal
-  in the Software without restriction, including without limitation the rights
-  to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
-  copies of the Software, and to permit persons to whom the Software is
-  furnished to do so, subject to the following conditions:
-
-  The above copyright notice and this permission notice shall be included in all
-  copies or substantial portions of the Software.
-
-  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-  IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-  FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-  AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-  LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
-  OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-  SOFTWARE.
-*/
-
-#include "libde265/en265.h"
-#ifdef HAVE_CONFIG_H
-#include "config.h"
-#endif
-
-#include "libde265/encoder/configparam.h"
-#include "libde265/image-io.h"
-#include "libde265/encoder/encoder-core.h"
-#include "libde265/util.h"
-#include "image-io-png.h"
-
-#include <getopt.h>
-
-
-
-#if HAVE_VIDEOGFX
-#include <libvideogfx.hh>
-using namespace videogfx;
-
-void debug_show_image_libvideogfx(const de265_image* input, int slot)
-{
-  static X11Win debugwin;
-  static bool opened=false;
-  int w = input->get_width();
-  int h = input->get_height();
-  if (!opened) {
-    opened=true;
-    debugwin.Create(w,h, "debug");
-  }
-
-  Image<Pixel> img;
-  img.Create(w,h,Colorspace_YUV, Chroma_420);
-
-  for (int y=0;y<h;y++)
-    memcpy(img.AskFrameY()[y], input->get_image_plane_at_pos(0,0,y), w);
-
-  for (int y=0;y<h/2;y++) {
-    memcpy(img.AskFrameU()[y], input->get_image_plane_at_pos(1,0,y), w/2);
-    memcpy(img.AskFrameV()[y], input->get_image_plane_at_pos(2,0,y), w/2);
-  }
-
-  debugwin.Display(img);
-  //debugwin.WaitForKeypress();
-}
-#endif
-
-
-
-int show_help=false;
-int verbosity=0;
-
-static struct option long_options[] = {
-  {"help", no_argument, &show_help, 1 },
-  {"verbose", no_argument, 0, 'v' },
-  {0, 0, 0, 0 }
-};
-
-
-struct inout_params
-{
-  inout_params();
-
-  // input
-
-  option_int first_frame;
-  option_int max_number_of_frames;
-
-  option_string input_yuv;
-  option_int input_width;
-  option_int input_height;
-
-  option_bool input_is_rgb;
-
-  // output
-
-  option_string output_filename;
-
-  // debug
-
-  option_string reconstruction_yuv;
-
-
-  void register_params(config_parameters& config);
-};
-
-
-inout_params::inout_params()
-{
-  input_yuv.set_ID("input"); input_yuv.set_short_option('i');
-  input_yuv.set_default("paris_cif.yuv");
-
-  output_filename.set_ID("output"); output_filename.set_short_option('o');
-  output_filename.set_default("out.bin");
-
-  reconstruction_yuv.set_ID("input");
-  reconstruction_yuv.set_default("recon.yuv");
-
-  first_frame.set_ID("first-frame");
-  first_frame.set_default(0);
-  first_frame.set_minimum(0);
-
-  max_number_of_frames.set_ID("frames");
-  max_number_of_frames.set_short_option('f');
-  max_number_of_frames.set_minimum(1);
-  //max_number_of_frames.set_default(INT_MAX);
-
-  input_width.set_ID("width"); input_width.set_short_option('w');
-  input_width.set_minimum(1); input_width.set_default(352);
-
-  input_height.set_ID("height"); input_height.set_short_option('h');
-  input_height.set_minimum(1); input_height.set_default(288);
-
-  input_is_rgb.set_ID("rgb");
-  input_is_rgb.set_default(false);
-  input_is_rgb.set_description("input is sequence of RGB PNG images");
-}
-
-
-void inout_params::register_params(config_parameters& config)
-{
-  config.add_option(&input_yuv);
-  config.add_option(&output_filename);
-  config.add_option(&first_frame);
-  config.add_option(&max_number_of_frames);
-  config.add_option(&input_width);
-  config.add_option(&input_height);
-#if HAVE_VIDEOGFX
-  if (videogfx::PNG_Supported()) {
-    config.add_option(&input_is_rgb);
-  }
-#endif
-}
-
-
-void test_parameters_API(en265_encoder_context* ectx)
-{
-  const char** param = en265_list_parameters(ectx);
-  if (param) {
-    for (int i=0; param[i]; i++) {
-      printf("|%s| ",param[i]);
-
-      enum en265_parameter_type type = en265_get_parameter_type(ectx, param[i]);
-      const char* type_name="unknown";
-      switch (type) {
-      case en265_parameter_int: type_name="int"; break;
-      case en265_parameter_bool: type_name="bool"; break;
-      case en265_parameter_string: type_name="string"; break;
-      case en265_parameter_choice: type_name="choice"; break;
-      }
-
-      printf("(%s)",type_name);
-
-      if (type==en265_parameter_choice) {
-        const char** choices = en265_list_parameter_choices(ectx, param[i]);
-        if (choices) {
-          for (int k=0; choices[k]; k++) {
-            printf(" %s",choices[k]);
-          }
-        }
-      }
-
-      printf("\n");
-    }
-  }
-
-  // en265_set_parameter_int(ectx, "min-tb-size", 8);
-}
-
-
-extern int skipTBSplit, noskipTBSplit;
-extern int zeroBlockCorrelation[6][2][5];
-
-/*LIBDE265_API*/ ImageSink_YUV reconstruction_sink;
-
-
-int main(int argc, char** argv)
-{
-  de265_init();
-
-  en265_encoder_context* ectx = en265_new_encoder();
-
-
-  bool cmdline_errors = false;
-
-  // --- in/out parameters ---
-
-  struct inout_params inout_params;
-  config_parameters inout_param_config;
-  inout_params.register_params(inout_param_config);
-
-  int first_idx=1;
-  if (!inout_param_config.parse_command_line_params(&argc,argv, &first_idx, true)) {
-    cmdline_errors = true;
-  }
-
-
-  // --- read encoder parameters ---
-
-  if (en265_parse_command_line_parameters(ectx, &argc, argv) != DE265_OK) {
-    cmdline_errors = true;
-  }
-
-
-
-  while (1) {
-    int option_index = 0;
-
-    int c = getopt_long(argc, argv, "v"
-                        , long_options, &option_index);
-    if (c == -1)
-      break;
-
-    switch (c) {
-    case 'v': verbosity++; break;
-    }
-  }
-
-
-  // --- show usage information ---
-
-  if (optind != argc || cmdline_errors || show_help) {
-    fprintf(stderr," enc265 v%s\n", de265_get_version());
-    fprintf(stderr,"--------------\n");
-    fprintf(stderr,"usage: enc265 [options]\n");
-    fprintf(stderr,"The video file must be a raw YUV file or a PNG sequence for RGB input\n");
-    fprintf(stderr,"\n");
-    fprintf(stderr,"options:\n");
-    fprintf(stderr," --help show help\n");
-    fprintf(stderr," -v, --verbose increase verbosity level (up to 3 times)\n");
-
-    inout_param_config.print_params();
-    fprintf(stderr,"\n");
-    en265_show_parameters(ectx);
-
-    exit(show_help ? 0 : 5);
-  }
-
-
-
-  de265_set_verbosity(verbosity);
-
-#if HAVE_VIDEOGFX
-  //debug_set_image_output(debug_show_image_libvideogfx);
-#endif
-
-  //test_parameters_API(ectx);
-
-
-  if (strlen(inout_params.reconstruction_yuv.get().c_str()) != 0) {
-    reconstruction_sink.set_filename(inout_params.reconstruction_yuv.get().c_str());
-    //ectx.reconstruction_sink = &reconstruction_sink;
-  }
-
-  ImageSource* image_source;
-  ImageSource_YUV image_source_yuv;
-#if HAVE_VIDEOGFX
-  ImageSource_PNG image_source_png;
-#endif
-
-
-  if (inout_params.input_is_rgb) {
-#if HAVE_VIDEOGFX
-    image_source_png.set_input_file(inout_params.input_yuv.get().c_str());
-    image_source = &image_source_png;
-#endif
-  }
-  else {
-    image_source_yuv.set_input_file(inout_params.input_yuv.get().c_str(),
-                                    inout_params.input_width,
-                                    inout_params.input_height);
-    image_source = &image_source_yuv;
-  }
-
-
-  PacketSink_File packet_sink;
-  packet_sink.set_filename(inout_params.output_filename.get().c_str());
-
-
-  // --- run encoder ---
-
-  image_source->skip_frames( inout_params.first_frame );
-
-  en265_start_encoder(ectx, 0);
-
-  int maxPoc = INT_MAX;
-  if (inout_params.max_number_of_frames.is_defined()) {
-    maxPoc = inout_params.max_number_of_frames;
-  }
-
-  bool eof = false;
-  for (int poc=0; poc<maxPoc && !eof ;poc++)
-    {
-      // push one image into the encoder
-
-      de265_image* input_image = image_source->get_image();
-      if (input_image==NULL) {
-        en265_push_eof(ectx);
-        eof=true;
-      }
-      else {
-        en265_push_image(ectx, input_image);
-      }
-
-
-
-      // encode images while more are available
-
-      en265_encode(ectx);
-
-
-      // write all pending packets
-
-      for (;;) {
-        en265_packet* pck = en265_get_packet(ectx,0);
-        if (pck==NULL)
-          break;
-
-        packet_sink.send_packet(pck->data, pck->length);
-
-        en265_free_packet(ectx,pck);
-      }
-    }
-
-
-  // --- print statistics ---
-
-  en265_print_logging((encoder_context*)ectx, "tb-split", NULL);
-
-
-  en265_free_encoder(ectx);
-
-  de265_free();
-
-  return 0;
-}
libde265-1.0.17.tar.gz/enc265/image-io-png.cc
Deleted
@@ -1,114 +0,0 @@
-/*
-  libde265 example application "enc265".
-
-  MIT License
-
-  Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de>
-
-  Permission is hereby granted, free of charge, to any person obtaining a copy
-  of this software and associated documentation files (the "Software"), to deal
-  in the Software without restriction, including without limitation the rights
-  to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
-  copies of the Software, and to permit persons to whom the Software is
-  furnished to do so, subject to the following conditions:
-
-  The above copyright notice and this permission notice shall be included in all
-  copies or substantial portions of the Software.
-
-  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-  IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-  FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-  AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-  LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
-  OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-  SOFTWARE.
-*/
-
-#include "image-io-png.h"
-#include <assert.h>
-
-#if HAVE_VIDEOGFX
-#include <libvideogfx.hh>
-using namespace videogfx;
-
-
-ImageSource_PNG::ImageSource_PNG()
-{
-  mFilenameTemplate = NULL;
-  mNextImageNumber = 1;
-
-  mReachedEndOfStream = false;
-
-  mWidth=mHeight=0;
-}
-
-ImageSource_PNG::~ImageSource_PNG()
-{
-}
-
-bool ImageSource_PNG::set_input_file(const char* filename)
-{
-  mFilenameTemplate = filename;
-  return true;
-}
-
-de265_image* ImageSource_PNG::get_image(bool block)
-{
-  if (mReachedEndOfStream) return NULL;
-
-
-  // --- construct image filename ---
-
-  char filename[1000];
-  sprintf(filename,mFilenameTemplate,mNextImageNumber);
-  mNextImageNumber++;
-
-
-  // --- load image ---
-
-  Image<Pixel> input;
-  bool success = videogfx::ReadImage_PNG(input, filename);
-  if (!success) {
-    mReachedEndOfStream = true;
-    return NULL;
-  }
-
-
-  mWidth = input.AskWidth();
-  mHeight= input.AskHeight();
-
-  de265_image* img = new de265_image;
-  img->alloc_image(mWidth,mHeight,de265_chroma_444, NULL, false,
-                   NULL, /*NULL,*/ 0, NULL, false);
-  assert(img); // TODO: error handling
-
-
-  uint8_t* p;
-  int stride;
-
-  for (int c=0;c<3;c++) {
-    int h265channel;
-    switch (c) {
-    case 0: h265channel=2; break; // R
-    case 1: h265channel=0; break; // G
-    case 2: h265channel=1; break; // B
-    }
-
-    p = img->get_image_plane(h265channel);
-    stride = img->get_image_stride(h265channel);
-
-    for (int y=0;y<mHeight;y++) {
-      memcpy(p, input.AskFrame((BitmapChannel(c)))[y], mWidth);
-      p += stride;
-    }
-  }
-
-  return img;
-}
-
-void ImageSource_PNG::skip_frames(int n)
-{
-  mNextImageNumber += n;
-}
-
-#endif
libde265-1.0.17.tar.gz/enc265/image-io-png.h
Deleted
@@ -1,60 +0,0 @@
-/*
-  libde265 example application "enc265".
-
-  MIT License
-
-  Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de>
-
-  Permission is hereby granted, free of charge, to any person obtaining a copy
-  of this software and associated documentation files (the "Software"), to deal
-  in the Software without restriction, including without limitation the rights
-  to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
-  copies of the Software, and to permit persons to whom the Software is
-  furnished to do so, subject to the following conditions:
-
-  The above copyright notice and this permission notice shall be included in all
-  copies or substantial portions of the Software.
-
-  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-  IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-  FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-  AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-  LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
-  OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-  SOFTWARE.
- */
-
-#ifndef IMAGE_IO_PNG_H
-#define IMAGE_IO_PNG_H
-
-#include "libde265/image-io.h"
-#include <deque>
-
-
-#if HAVE_VIDEOGFX
-class ImageSource_PNG : public ImageSource
-{
- public:
-  LIBDE265_API ImageSource_PNG();
-  virtual LIBDE265_API ~ImageSource_PNG();
-
-  bool LIBDE265_API set_input_file(const char* filename);
-
-  //virtual ImageStatus get_status();
-  virtual LIBDE265_API de265_image* get_image(bool block=true);
-  virtual LIBDE265_API void skip_frames(int n);
-
-  virtual LIBDE265_API int get_width() const { return mWidth; }
-  virtual LIBDE265_API int get_height() const { return mHeight; }
-
- private:
-  const char* mFilenameTemplate;
-  int mNextImageNumber;
-
-  bool mReachedEndOfStream;
-
-  int mWidth,mHeight;
-};
-#endif
-
-#endif
libde265-1.0.17.tar.gz/libde265/arm
Deleted
-(directory)
libde265-1.0.17.tar.gz/libde265/arm/CMakeLists.txt
Deleted
@@ -1,20 +0,0 @@
-add_library(arm OBJECT arm.cc arm.h)
-
-if(HAVE_NEON)
-  add_library(arm_neon OBJECT
-    cpudetect.S
-    hevcdsp_qpel_neon.S
-  )
-
-  target_compile_options(arm_neon PRIVATE
-    -mfpu=neon
-    -DHAVE_NEON
-    -DEXTERN_ASM=
-    -DHAVE_AS_FUNC
-    -DHAVE_SECTION_DATA_REL_RO
-  )
-
-  set(ARM_OBJECTS $<TARGET_OBJECTS:arm> $<TARGET_OBJECTS:arm_neon> PARENT_SCOPE)
-else()
-  set(ARM_OBJECTS $<TARGET_OBJECTS:arm> PARENT_SCOPE)
-endif()
libde265-1.0.17.tar.gz/libde265/en265.cc
Deleted
@@ -1,321 +0,0 @@
-/*
- * H.265 video codec.
- * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de>
- *
- * Authors: Dirk Farin <farin@struktur.de>
- *
- * This file is part of libde265.
- *
- * libde265 is free software: you can redistribute it and/or modify
- * it under the terms of the GNU Lesser General Public License as
- * published by the Free Software Foundation, either version 3 of
- * the License, or (at your option) any later version.
- *
- * libde265 is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public License
- * along with libde265. If not, see <http://www.gnu.org/licenses/>.
- */
-
-#include "libde265/en265.h"
-#include "libde265/encoder/encoder-context.h"
-
-
-LIBDE265_API en265_encoder_context* en265_new_encoder(void)
-{
-  de265_error init_err = de265_init();
-  if (init_err != DE265_OK) {
-    return NULL;
-  }
-
-  encoder_context* ectx = new encoder_context();
-  return reinterpret_cast<en265_encoder_context*>(ectx);
-}
-
-
-LIBDE265_API de265_error en265_free_encoder(en265_encoder_context* e)
-{
-  assert(e);
-  encoder_context* ectx = reinterpret_cast<encoder_context*>(e);
-  delete ectx;
-
-  return de265_free();
-}
-
-
-/*
-LIBDE265_API void en265_set_image_release_function(en265_encoder_context* e,
-                                                   void (*release_func)(en265_encoder_context*,
-                                                                        de265_image*,
-                                                                        void* userdata),
-                                                   void* alloc_userdata)
-{
-  assert(e);
-  encoder_context* ectx = reinterpret_cast<encoder_context*>(e);
-
-  ectx->param_image_allocation_userdata = alloc_userdata;
-  ectx->release_func = release_func;
-}
-*/
-
-
-// ========== encoder parameters ==========
-
-LIBDE265_API de265_error en265_parse_command_line_parameters(en265_encoder_context* e,
-                                                             int* argc, char** argv)
-{
-  assert(e);
-  encoder_context* ectx = reinterpret_cast<encoder_context*>(e);
-
-  //if (!ectx->params_config.parse_command_line_params(argc,argv, &ectx->params, true)) {
-  int first_idx=1;
-  if (!ectx->params_config.parse_command_line_params(argc,argv, &first_idx, true)) {
-    return DE265_ERROR_PARAMETER_PARSING;
-  }
-  else {
-    return DE265_OK;
-  }
-}
-
-LIBDE265_API void en265_show_parameters(en265_encoder_context* e)
-{
-  assert(e);
-  encoder_context* ectx = reinterpret_cast<encoder_context*>(e);
-
-  //ectx->params_config.show_params(&ectx->params);
-
-  ectx->params_config.print_params();
-}
-
-
-LIBDE265_API const char** en265_list_parameters(en265_encoder_context* e)
-{
-  assert(e);
-  encoder_context* ectx = reinterpret_cast<encoder_context*>(e);
-
-  return ectx->params_config.get_parameter_string_table();
-}
-
-
-LIBDE265_API enum en265_parameter_type en265_get_parameter_type(en265_encoder_context* e,
-                                                                const char* parametername)
-{
-  assert(e);
-  encoder_context* ectx = reinterpret_cast<encoder_context*>(e);
-
-  return ectx->params_config.get_parameter_type(parametername);
-}
-
-
-LIBDE265_API de265_error en265_set_parameter_bool(en265_encoder_context* e,
-                                                  const char* param,int value)
-{
-  assert(e);
-  encoder_context* ectx = reinterpret_cast<encoder_context*>(e);
-
-  return ectx->params_config.set_bool(param,value) ? DE265_OK : DE265_ERROR_PARAMETER_PARSING;
-}
-
-
-LIBDE265_API de265_error en265_set_parameter_int(en265_encoder_context* e,
-                                                 const char* param,int value)
-{
-  assert(e);
-  encoder_context* ectx = reinterpret_cast<encoder_context*>(e);
-
-  return ectx->params_config.set_int(param,value) ? DE265_OK : DE265_ERROR_PARAMETER_PARSING;
-}
-
-LIBDE265_API de265_error en265_set_parameter_string(en265_encoder_context* e,
-                                                    const char* param,const char* value)
-{
-  assert(e);
-  encoder_context* ectx = reinterpret_cast<encoder_context*>(e);
-
-  return ectx->params_config.set_string(param,value) ? DE265_OK : DE265_ERROR_PARAMETER_PARSING;
-}
-
-LIBDE265_API de265_error en265_set_parameter_choice(en265_encoder_context* e,
-                                                    const char* param,const char* value)
-{
-  assert(e);
-  encoder_context* ectx = reinterpret_cast<encoder_context*>(e);
-
-  return ectx->params_config.set_choice(param,value) ? DE265_OK : DE265_ERROR_PARAMETER_PARSING;
-}
-
-
-LIBDE265_API const char** en265_list_parameter_choices(en265_encoder_context* e,
-                                                       const char* parametername)
-{
-  assert(e);
-  encoder_context* ectx = reinterpret_cast<encoder_context*>(e);
-
-  return ectx->params_config.get_parameter_choices_table(parametername);
-}
-
-
-
-// ========== encoding loop ==========
-
-
-LIBDE265_API de265_error en265_start_encoder(en265_encoder_context* e, int number_of_threads)
-{
-  assert(e);
-  encoder_context* ectx = reinterpret_cast<encoder_context*>(e);
-
-  ectx->start_encoder();
-
-  return DE265_OK;
-}
-
-
-LIBDE265_API struct de265_image* en265_allocate_image(en265_encoder_context* e,
-                                                      int width, int height, de265_chroma chroma,
-                                                      de265_PTS pts, void* image_userdata)
-{
-  assert(e);
-  encoder_context* ectx = reinterpret_cast<encoder_context*>(e);
-
-  de265_image* img = new de265_image;
-  if (img->alloc_image(width,height,de265_chroma_420, NULL, false,
-                       NULL, /*ectx,*/ pts, image_userdata, true) != DE265_OK) {
-    delete img;
-    return NULL;
-  }
-
-  return img;
-}
-
-// Request a specification of the image memory layout for an image of the specified dimensions.
-LIBDE265_API void en265_get_image_spec(en265_encoder_context* e,
-                                       int width, int height, de265_chroma chroma,
-                                       struct de265_image_spec* out_spec)
-{
-  out_spec->format = de265_image_format_YUV420P8;
-  out_spec->width = width;
-  out_spec->height= height;
-  out_spec->alignment = 1;
-
-  out_spec->crop_left =0;
-  out_spec->crop_right =0;
-  out_spec->crop_top =0;
-  out_spec->crop_bottom=0;
-
-  out_spec->visible_width = out_spec->width - out_spec->crop_left - out_spec->crop_right;
-  out_spec->visible_height = out_spec->height - out_spec->crop_top - out_spec->crop_bottom;
-}
-
-// Image memory layout specification for an image returned by en265_allocate_image().
-//LIBDE265_API void de265_get_image_spec_from_image(de265_image* img, struct de265_image_spec* spec);
-
-
-
-LIBDE265_API de265_error en265_push_image(en265_encoder_context* e,
-                                          struct de265_image* img)
-{
-  assert(e);
-  encoder_context* ectx = reinterpret_cast<encoder_context*>(e);
-
-  ectx->sop->insert_new_input_image(img);
-  return DE265_OK;
-}
-
-
-LIBDE265_API de265_error en265_push_eof(en265_encoder_context* e)
-{
-  assert(e);
-  encoder_context* ectx = reinterpret_cast<encoder_context*>(e);
-
-  ectx->sop->insert_end_of_stream();
-  return DE265_OK;
-}
-
-
-LIBDE265_API de265_error en265_block_on_input_queue_length(en265_encoder_context*,
-                                                           int max_pending_images,
-                                                           int timeout_ms)
-{
-  // TODO
-  return DE265_OK;
-}
-
-LIBDE265_API de265_error en265_trim_input_queue(en265_encoder_context*, int max_pending_images)
-{
-  // TODO
-  return DE265_OK;
-}
-
-LIBDE265_API int en265_current_input_queue_length(en265_encoder_context*)
-{
-  // TODO
-  return DE265_OK;
-}
-
-LIBDE265_API de265_error en265_encode(en265_encoder_context* e)
-{
-  assert(e);
-  encoder_context* ectx = reinterpret_cast<encoder_context*>(e);
-
-  while (ectx->picbuf.have_more_frames_to_encode())
-    {
-      de265_error result = ectx->encode_picture_from_input_buffer();
-      if (result != DE265_OK) return result;
-    }
-
-  return DE265_OK;
-}
-
-LIBDE265_API enum en265_encoder_state en265_get_encoder_state(en265_encoder_context* e)
-{
-  // TODO
-  return EN265_STATE_IDLE;
-}
-
-LIBDE265_API struct en265_packet* en265_get_packet(en265_encoder_context* e, int timeout_ms)
-{
-  assert(e);
-  encoder_context* ectx = reinterpret_cast<encoder_context*>(e);
-
-  assert(timeout_ms==0); // TODO: blocking not implemented yet
-
-  if (ectx->output_packets.size()>0) {
-    en265_packet* pck = ectx->output_packets.front();
-    ectx->output_packets.pop_front();
-
-    return pck;
-  }
-  else {
-    return NULL;
-  }
-}
-
-LIBDE265_API void en265_free_packet(en265_encoder_context* e, struct en265_packet* pck)
-{
-  assert(e);
-  encoder_context* ectx = reinterpret_cast<encoder_context*>(e);
-
-  // Do not delete images here. They are owned by the EncPicBuf.
-  //delete pck->input_image;
-  //delete pck->reconstruction;
-
-  if (pck->frame_number >= 0) {
-    ectx->mark_image_is_outputted(pck->frame_number);
-
-    ectx->release_input_image(pck->frame_number);
-  }
-
-  delete pck->data;
-  delete pck;
-}
-
-LIBDE265_API int en265_number_of_queued_packets(en265_encoder_context* e)
-{
-  assert(e);
-  encoder_context* ectx = reinterpret_cast<encoder_context*>(e);
-
-  return ectx->output_packets.size();
-}
libde265-1.0.17.tar.gz/libde265/en265.h
Deleted
@@ -1,218 +0,0 @@
-/*
- * H.265 video codec.
- * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de>
- *
- * Authors: Dirk Farin <farin@struktur.de>
- *
- * This file is part of libde265.
- *
- * libde265 is free software: you can redistribute it and/or modify
- * it under the terms of the GNU Lesser General Public License as
- * published by the Free Software Foundation, either version 3 of
- * the License, or (at your option) any later version.
- *
- * libde265 is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public License
- * along with libde265. If not, see <http://www.gnu.org/licenses/>.
- */
-
-#ifndef EN265_H
-#define EN265_H
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#include <libde265/de265.h>
-
-
-// ========== encoder context ==========
-
-struct en265_encoder_context; // private structure
-
-/* Get a new encoder context. Must be freed with en265_free_encoder(). */
-LIBDE265_API en265_encoder_context* en265_new_encoder(void);
-
-/* Free encoder context. May only be called once on a context. */
-LIBDE265_API de265_error en265_free_encoder(en265_encoder_context*);
-
-/* The alloc_userdata pointer will be given to the release_func(). */
-/*
-LIBDE265_API void en265_set_image_release_function(en265_encoder_context*,
-                                                   void (*release_func)(en265_encoder_context*,
-                                                                        struct de265_image*,
-                                                                        void* userdata),
-                                                   void* alloc_userdata);
-*/
-
-// ========== encoder parameters ==========
-
-LIBDE265_API de265_error en265_set_parameter_bool(en265_encoder_context*,
-                                                  const char* parametername,int value);
-LIBDE265_API de265_error en265_set_parameter_int(en265_encoder_context*,
-                                                 const char* parametername,int value);
-LIBDE265_API de265_error en265_set_parameter_string(en265_encoder_context*,
-                                                    const char* parametername,const char* value);
-LIBDE265_API de265_error en265_set_parameter_choice(en265_encoder_context*,
-                                                    const char* parametername,const char* value);
-
-
-LIBDE265_API const char** en265_list_parameters(en265_encoder_context*);
-
-enum en265_parameter_type {
-  en265_parameter_bool,
-  en265_parameter_int,
-  en265_parameter_string,
-  en265_parameter_choice
-};
-
-LIBDE265_API enum en265_parameter_type en265_get_parameter_type(en265_encoder_context*,
-                                                                const char* parametername);
-
-LIBDE265_API const char** en265_list_parameter_choices(en265_encoder_context*,
-                                                       const char* parametername);
-
-
-// --- convenience functions for command-line parameters ---
-
-LIBDE265_API de265_error en265_parse_command_line_parameters(en265_encoder_context*,
-                                                             int* argc, char** argv);
-LIBDE265_API void en265_show_parameters(en265_encoder_context*);
-
-
-
-// ========== encoding loop ==========
-
-LIBDE265_API de265_error en265_start_encoder(en265_encoder_context*, int number_of_threads);
-
-// If we have provided our own memory release function, no image memory will be allocated.
-LIBDE265_API struct de265_image* en265_allocate_image(en265_encoder_context*,
-                                                      int width, int height,
-                                                      enum de265_chroma chroma,
-                                                      de265_PTS pts, void* image_userdata);
-
-LIBDE265_API void* de265_alloc_image_plane(struct de265_image* img, int cIdx,
-                                           void* inputdata, int inputstride, void *userdata);
-LIBDE265_API void de265_free_image_plane(struct de265_image* img, int cIdx);
-
-
-// Request a specification of the image memory layout for an image of the specified dimensions.
-LIBDE265_API void en265_get_image_spec(en265_encoder_context*,
-                                       int width, int height, enum de265_chroma chroma,
-                                       struct de265_image_spec* out_spec);
-
-// Image memory layout specification for an image returned by en265_allocate_image().
-/* TODO: do we need this?
-LIBDE265_API void de265_get_image_spec_from_image(de265_image* img, struct de265_image_spec* spec);
-*/
-
-
-LIBDE265_API de265_error en265_push_image(en265_encoder_context*,
-                                          struct de265_image*); // non-blocking
-
-LIBDE265_API de265_error en265_push_eof(en265_encoder_context*);
-
-// block when there are more than max_input_images in the input queue
-LIBDE265_API de265_error en265_block_on_input_queue_length(en265_encoder_context*,
-                                                           int max_pending_images,
-                                                           int timeout_ms);
-
-LIBDE265_API de265_error en265_trim_input_queue(en265_encoder_context*, int max_pending_images);
-
-LIBDE265_API int en265_current_input_queue_length(en265_encoder_context*);
-
-// Run encoder in main thread. Only use this when not using background threads.
-LIBDE265_API de265_error en265_encode(en265_encoder_context*);
-
-enum en265_encoder_state
-{
-  EN265_STATE_IDLE,
-  EN265_STATE_WAITING_FOR_INPUT,
-  EN265_STATE_WORKING,
-  EN265_STATE_OUTPUT_QUEUE_FULL,
-  EN265_STATE_EOS
-};
-
-
-LIBDE265_API enum en265_encoder_state en265_get_encoder_state(en265_encoder_context*);
-
-
-enum en265_packet_content_type {
-  EN265_PACKET_VPS,
-  EN265_PACKET_SPS,
-  EN265_PACKET_PPS,
-  EN265_PACKET_SEI,
-  EN265_PACKET_SLICE,
-  EN265_PACKET_SKIPPED_IMAGE
-};
-
-
-enum en265_nal_unit_type {
-  EN265_NUT_TRAIL_N = 0,
-  EN265_NUT_TRAIL_R = 1,
-  EN265_NUT_TSA_N = 2,
-  EN265_NUT_TSA_R = 3,
-  EN265_NUT_STSA_N = 4,
-  EN265_NUT_STSA_R = 5,
-  EN265_NUT_RADL_N = 6,
-  EN265_NUT_RADL_R = 7,
-  EN265_NUT_RASL_N = 8,
-  EN265_NUT_RASL_R = 9,
-  EN265_NUT_BLA_W_LP = 16,
-  EN265_NUT_BLA_W_RADL= 17,
-  EN265_NUT_BLA_N_LP = 18,
-  EN265_NUT_IDR_W_RADL= 19,
-  EN265_NUT_IDR_N_LP = 20,
-  EN265_NUT_CRA = 21,
-  EN265_NUT_VPS = 32,
-  EN265_NUT_SPS = 33,
-  EN265_NUT_PPS = 34,
-  EN265_NUT_AUD = 35,
-  EN265_NUT_EOS = 36,
-  EN265_NUT_EOB = 37,
-  EN265_NUT_FD = 38,
-  EN265_NUT_PREFIX_SEI = 39,
-  EN265_NUT_SUFFIX_SEI = 40
-};
-
-
-struct en265_packet
-{
-  int version; // currently: 1
-
-  const uint8_t* data;
-  int length;
-
-  int frame_number;
-
-  enum en265_packet_content_type content_type;
-  char complete_picture : 1;
-  char final_slice : 1;
-  char dependent_slice : 1;
-
-  enum en265_nal_unit_type nal_unit_type;
-  unsigned char nuh_layer_id;
-  unsigned char nuh_temporal_id;
-
-  en265_encoder_context* encoder_context;
-
-  const struct de265_image* input_image;
-  const struct de265_image* reconstruction;
-};
-
-// timeout_ms - timeout in milliseconds. 0 - no timeout, -1 - block forever
-LIBDE265_API struct en265_packet* en265_get_packet(en265_encoder_context*, int timeout_ms);
-LIBDE265_API void en265_free_packet(en265_encoder_context*, struct en265_packet*);
-
-LIBDE265_API int en265_number_of_queued_packets(en265_encoder_context*);
-
-#ifdef __cplusplus
-}
-#endif
-
-
-#endif
View file
libde265-1.0.17.tar.gz/libde265/encoder
Deleted
-(directory)
View file
libde265-1.0.17.tar.gz/libde265/encoder/CMakeLists.txt
Deleted
@@ -1,16 +0,0 @@ -set (encoder_sources - configparam.h configparam.cc - encoder-core.cc encoder-core.h - encoder-types.h encoder-types.cc - encoder-params.h encoder-params.cc - encoder-context.h encoder-context.cc - encoder-syntax.h encoder-syntax.cc - encoder-intrapred.h encoder-intrapred.cc - encoder-motion.h encoder-motion.cc - encpicbuf.h encpicbuf.cc - sop.h sop.cc -) - -add_subdirectory (algo) -add_library(encoder OBJECT ${encoder_sources}) -set(ENCODER_OBJECTS $<TARGET_OBJECTS:encoder> ${ALGO_OBJECTS} PARENT_SCOPE)
View file
libde265-1.0.17.tar.gz/libde265/encoder/algo
Deleted
-(directory)
View file
libde265-1.0.17.tar.gz/libde265/encoder/algo/CMakeLists.txt
Deleted
@@ -1,19 +0,0 @@ -set (algo_sources - algo.h algo.cc - coding-options.h coding-options.cc - ctb-qscale.h ctb-qscale.cc - cb-split.h cb-split.cc - cb-intrapartmode.h cb-intrapartmode.cc - cb-interpartmode.h cb-interpartmode.cc - cb-skip.h cb-skip.cc - cb-intra-inter.h cb-intra-inter.cc - cb-mergeindex.h cb-mergeindex.cc - tb-split.h tb-split.cc - tb-transform.h tb-transform.cc - tb-intrapredmode.h tb-intrapredmode.cc - tb-rateestim.h tb-rateestim.cc - pb-mv.h pb-mv.cc -) - -add_library(algo OBJECT ${algo_sources}) -set(ALGO_OBJECTS $<TARGET_OBJECTS:algo> PARENT_SCOPE)
View file
libde265-1.0.17.tar.gz/libde265/encoder/algo/algo.cc
Deleted
@@ -1,95 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#include "libde265/encoder/algo/algo.h" -#include "libde265/encoder/encoder-context.h" - -#include <stdarg.h> - - - -#ifdef DE265_LOG_DEBUG -static int descendLevel = 0; - -void Algo::enter() -{ - if (logdebug_enabled(LogEncoder)) { - printf("%d",descendLevel+1); - for (int i=0;i<descendLevel+1;i++) { printf(" "); } - printf(":%s\n",name()); - } -} - -void Algo::descend(const enc_node* node, const char* option, ...) -{ - if (logdebug_enabled(LogEncoder)) { - descendLevel++; - printf("%d ",descendLevel); - for (int i=0;i<descendLevel;i++) { printf(" "); } - - va_list va; - va_start(va, option); - va_end(va); - - fprintf(stdout, ">%s(", name()); - vfprintf(stdout, option, va); - fprintf(stdout, ") %d;%d %dx%d %p\n",node->x,node->y,1<<node->log2Size,1<<node->log2Size,static_cast<void*>(node)); - } -} - -void Algo::ascend(const enc_node* resultNode, const char* fmt, ...) 
-{ - if (logdebug_enabled(LogEncoder)) { - if (fmt != NULL) { - printf("%d ",descendLevel); - for (int i=0;i<descendLevel;i++) { printf(" "); } - - va_list va; - va_start(va, fmt); - va_end(va); - - fprintf(stdout, "<%s(", name()); - vfprintf(stdout, fmt, va); - fprintf(stdout, ") <- %p\n",static_cast<void*>(resultNode)); - } - - descendLevel--; - } -} - -void Algo::leaf(const enc_node* node, const char* option, ...) -{ - if (logdebug_enabled(LogEncoder)) { - printf("%d ",descendLevel+1); - for (int i=0;i<descendLevel+1;i++) { printf(" "); } - - va_list va; - va_start(va, option); - va_end(va); - - fprintf(stdout, "%s(", name()); - vfprintf(stdout, option, va); - fprintf(stdout, ") %d;%d %dx%d\n",node->x,node->y,1<<node->log2Size,1<<node->log2Size); - } -} - -#endif
View file
libde265-1.0.17.tar.gz/libde265/encoder/algo/algo.h
Deleted
@@ -1,95 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#ifndef ALGO_H -#define ALGO_H - -#include "libde265/encoder/encoder-types.h" - - -/* When entering the next recursion level, it is assumed that - a valid CB structure is passed down. Ownership is transferred to - the new algorithm. That algorithm passes back a (possibly different) - CB structure that the first algorithm should use. The receiving - algorithm will be the owner of the passed back algorithm. - The original CB structure might have been deleted in the called algorithm. - - When using CodingOptions, it is important to set the passed back - enc_node in the CodingOption (set_node()), so that the CodingOption - can correctly handle ownership and delete nodes as needed. - - The context_model_table passed down is at the current state. - When the algorithm returns, the state should represent the state - after running this algorithm. 
- */ - -class Algo -{ - public: - virtual ~Algo() { } - - virtual const char* name() const { return "noname"; } - -#ifdef DE265_LOG_DEBUG - void enter(); - void descend(const enc_node* node,const char* option_description, ...); - void ascend(const enc_node* resultNode=NULL, const char* fmt=NULL, ...); - void leaf(const enc_node* node,const char* option_description, ...); -#else - inline void enter() { } - inline void descend(const enc_node*,const char*, ...) { } - inline void ascend(const enc_node* resultNode=NULL,const char* fmt=NULL, ...) { } - inline void leaf(const enc_node*,const char*, ...) { } -#endif -}; - - -class Algo_CB : public Algo -{ - public: - virtual ~Algo_CB() { } - - /* The context_model_table that is provided can be modified and - even released in the function. On exit, it should be filled with - a (optionally new) context_model_table that represents the state - after encoding the syntax element. However, to speed up computation, - it is also allowed to not modify the context_model_table at all. - */ - virtual enc_cb* analyze(encoder_context*, - context_model_table&, - enc_cb* cb) = 0; -}; - - -class Algo_PB : public Algo -{ - public: - virtual ~Algo_PB() { } - - virtual enc_cb* analyze(encoder_context*, - context_model_table&, - enc_cb* cb, - int PBidx, int x,int y,int w,int h) = 0; -}; - - -#endif
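The ownership rule described in the comment of the removed algo.h (a node is handed down, and a possibly different node is handed back) can be sketched with `std::unique_ptr`. This is only an illustration of the contract: the removed code used raw `enc_cb*` pointers, and `Node`, `analyze_keep` and `analyze_replace` are hypothetical names that do not appear in libde265.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for an encoder tree node (not the libde265 enc_cb).
struct Node { int cost; };

// Takes ownership of the node and hands the very same node back.
std::unique_ptr<Node> analyze_keep(std::unique_ptr<Node> in)
{
  return in;
}

// Takes ownership of the node and hands back a different one.
// The node that was passed in is destroyed when this function returns.
std::unique_ptr<Node> analyze_replace(std::unique_ptr<Node> in)
{
  assert(in != nullptr);
  auto better = std::make_unique<Node>(Node{in->cost - 1});
  return better;
}
```

A caller would write `node = analyze_replace(std::move(node));` and must not keep any pointer to the old node, which mirrors the warning that the original CB structure might have been deleted in the called algorithm.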
View file
libde265-1.0.17.tar.gz/libde265/encoder/algo/cb-interpartmode.cc
Deleted
@@ -1,113 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: struktur AG, Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - - -#include "libde265/encoder/algo/cb-interpartmode.h" -#include "libde265/encoder/algo/coding-options.h" -#include "libde265/encoder/encoder-context.h" -#include <assert.h> -#include <limits> -#include <math.h> - - - -enc_cb* Algo_CB_InterPartMode::codeAllPBs(encoder_context* ectx, - context_model_table& ctxModel, - enc_cb* cb) -{ - int x = cb->x; - int y = cb->y; - int log2Size = cb->log2Size; - int w = 1<<log2Size; - int s; // splitSize; - - int nPB; - switch (cb->PartMode) { - case PART_2Nx2N: - cb = mChildAlgo->analyze(ectx, ctxModel, cb, 0, x,y,1<<log2Size,1<<log2Size); - break; - - case PART_NxN: - s = 1<<(log2Size-1); - descend(cb,"NxN(1/4)"); cb = mChildAlgo->analyze(ectx, ctxModel, cb, 0, x ,y ,s,s); ascend(); - descend(cb,"NxN(2/4)"); cb = mChildAlgo->analyze(ectx, ctxModel, cb, 1, x+s,y ,s,s); ascend(); - descend(cb,"NxN(3/4)"); cb = mChildAlgo->analyze(ectx, ctxModel, cb, 2, x ,y+s,s,s); ascend(); - descend(cb,"NxN(4/4)"); cb = mChildAlgo->analyze(ectx, ctxModel, cb, 3, x+s,y+s,s,s); ascend(); - break; - - case PART_2NxN: - s = 1<<(log2Size-1); - descend(cb,"2NxN(1/2)"); cb = 
mChildAlgo->analyze(ectx, ctxModel, cb, 0, x,y ,w,s); ascend(); - descend(cb,"2NxN(2/2)"); cb = mChildAlgo->analyze(ectx, ctxModel, cb, 1, x,y+s,w,s); ascend(); - break; - - case PART_Nx2N: - s = 1<<(log2Size-1); - descend(cb,"Nx2N(1/2)"); cb = mChildAlgo->analyze(ectx, ctxModel, cb, 0, x ,y,s,w); ascend(); - descend(cb,"Nx2N(2/2)"); cb = mChildAlgo->analyze(ectx, ctxModel, cb, 1, x+s,y,s,w); ascend(); - break; - - case PART_2NxnU: - s = 1<<(log2Size-2); - descend(cb,"2NxnU(1/2)"); cb = mChildAlgo->analyze(ectx, ctxModel, cb, 0, x,y ,w,s); ascend(); - descend(cb,"2NxnU(2/2)"); cb = mChildAlgo->analyze(ectx, ctxModel, cb, 1, x,y+s,w,w-s); ascend(); - break; - - case PART_2NxnD: - s = 1<<(log2Size-2); - descend(cb,"2NxnD(1/2)"); cb = mChildAlgo->analyze(ectx, ctxModel, cb, 0, x,y ,w,w-s); ascend(); - descend(cb,"2NxnD(2/2)"); cb = mChildAlgo->analyze(ectx, ctxModel, cb, 1, x,y+w-s,w,s); ascend(); - break; - - case PART_nLx2N: - s = 1<<(log2Size-2); - descend(cb,"nLx2N(1/2)"); cb = mChildAlgo->analyze(ectx, ctxModel, cb, 0, x ,y,s ,w); ascend(); - descend(cb,"nLx2N(2/2)"); cb = mChildAlgo->analyze(ectx, ctxModel, cb, 1, x+s,y,w-s,w); ascend(); - break; - - case PART_nRx2N: - s = 1<<(log2Size-2); - descend(cb,"nRx2N(1/2)"); cb = mChildAlgo->analyze(ectx, ctxModel, cb, 0, x ,y,w-s,w); ascend(); - descend(cb,"nRx2N(2/2)"); cb = mChildAlgo->analyze(ectx, ctxModel, cb, 1, x+w-s,y,s ,w); ascend(); - break; - } - - return cb; -} - - -enc_cb* Algo_CB_InterPartMode_Fixed::analyze(encoder_context* ectx, - context_model_table& ctxModel, - enc_cb* cb) -{ - const int x = cb->x; - const int y = cb->y; - - enum PartMode partMode = mParams.partMode(); - - cb->PartMode = partMode; - ectx->img->set_PartMode(x,y, partMode); - - cb = codeAllPBs(ectx,ctxModel,cb); - - return cb; -}
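The partition geometry used by the removed `codeAllPBs()` can be summarised in a small stand-alone sketch. The rectangles below follow the arguments passed to `mChildAlgo->analyze()` above; the names `PartMode`, `PbRect` and `pb_rects` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for this sketch; not the libde265 types.
enum class PartMode { P2Nx2N, PNxN, P2NxN, PNx2N, P2NxnU, P2NxnD, PnLx2N, PnRx2N };

struct PbRect { int x, y, w, h; };

// Returns the prediction blocks of a coding block at (x,y) with edge length
// 1<<log2Size, in the same order as the calls in codeAllPBs().
std::vector<PbRect> pb_rects(PartMode mode, int x, int y, int log2Size)
{
  assert(log2Size >= 3);        // HEVC coding blocks are at least 8x8
  const int w = 1 << log2Size;  // full edge length
  const int h2 = w / 2;         // half, used by the symmetric splits
  const int q = w / 4;          // quarter, used by the asymmetric splits

  switch (mode) {
  case PartMode::P2Nx2N:
    return { PbRect{x, y, w, w} };
  case PartMode::PNxN:
    return { PbRect{x, y, h2, h2}, PbRect{x + h2, y, h2, h2},
             PbRect{x, y + h2, h2, h2}, PbRect{x + h2, y + h2, h2, h2} };
  case PartMode::P2NxN:
    return { PbRect{x, y, w, h2}, PbRect{x, y + h2, w, h2} };
  case PartMode::PNx2N:
    return { PbRect{x, y, h2, w}, PbRect{x + h2, y, h2, w} };
  case PartMode::P2NxnU:
    return { PbRect{x, y, w, q}, PbRect{x, y + q, w, w - q} };
  case PartMode::P2NxnD:
    return { PbRect{x, y, w, w - q}, PbRect{x, y + w - q, w, q} };
  case PartMode::PnLx2N:
    return { PbRect{x, y, q, w}, PbRect{x + q, y, w - q, w} };
  case PartMode::PnRx2N:
    return { PbRect{x, y, w - q, w}, PbRect{x + w - q, y, q, w} };
  }
  return {};
}
```

For a 16x16 block, `pb_rects(PartMode::P2NxnU, 0, 0, 4)` yields a 16x4 block on top of a 16x12 block.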
View file
libde265-1.0.17.tar.gz/libde265/encoder/algo/cb-interpartmode.h
Deleted
@@ -1,108 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#ifndef CB_INTERPARTMODE_H -#define CB_INTERPARTMODE_H - -#include "libde265/nal-parser.h" -#include "libde265/decctx.h" -#include "libde265/slice.h" -#include "libde265/scan.h" -#include "libde265/intrapred.h" -#include "libde265/transform.h" -#include "libde265/fallback-dct.h" -#include "libde265/quality.h" -#include "libde265/fallback.h" -#include "libde265/encoder/configparam.h" - -#include "libde265/encoder/algo/algo.h" -#include "libde265/encoder/algo/tb-intrapredmode.h" -#include "libde265/encoder/algo/tb-split.h" -#include "libde265/encoder/algo/cb-intrapartmode.h" - - -// ========== CB Intra/Inter decision ========== - -class Algo_CB_InterPartMode : public Algo_CB -{ - public: - virtual ~Algo_CB_InterPartMode() { } - - void setChildAlgo(Algo_PB* algo) { mChildAlgo = algo; } - - virtual const char* name() const { return "cb-interpartmode"; } - - protected: - Algo_PB* mChildAlgo; - - enc_cb* codeAllPBs(encoder_context*, - context_model_table&, - enc_cb* cb); -}; - - - - -class option_InterPartMode : public choice_option<enum PartMode> // choice_option -{ - public: - option_InterPartMode() { - 
add_choice("2Nx2N", PART_2Nx2N, true); - add_choice("NxN", PART_NxN); - add_choice("Nx2N", PART_Nx2N); - add_choice("2NxN", PART_2NxN); - add_choice("2NxnU", PART_2NxnU); - add_choice("2NxnD", PART_2NxnD); - add_choice("nLx2N", PART_nLx2N); - add_choice("nRx2N", PART_nRx2N); - } -}; - -class Algo_CB_InterPartMode_Fixed : public Algo_CB_InterPartMode -{ - public: - struct params - { - params() { - partMode.set_ID("CB-InterPartMode-Fixed-partMode"); - } - - option_InterPartMode partMode; - }; - - void registerParams(config_parameters& config) { - config.add_option(&mParams.partMode); - } - - void setParams(const params& p) { mParams=p; } - - virtual enc_cb* analyze(encoder_context*, - context_model_table&, - enc_cb* cb); - - virtual const char* name() const { return "cb-interpartmode-fixed"; } - - private: - params mParams; -}; - -#endif
View file
libde265-1.0.17.tar.gz/libde265/encoder/algo/cb-intra-inter.cc
Deleted
@@ -1,132 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: struktur AG, Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - - -#include "libde265/encoder/algo/cb-intra-inter.h" -#include "libde265/encoder/algo/coding-options.h" -#include "libde265/encoder/encoder-context.h" -#include <assert.h> -#include <limits> -#include <math.h> - - - -enc_cb* Algo_CB_IntraInter_BruteForce::analyze(encoder_context* ectx, - context_model_table& ctxModel, - enc_cb* cb) -{ - assert(cb->pcm_flag==0); - - bool try_intra = true; - bool try_inter = (ectx->shdr->slice_type != SLICE_TYPE_I); - - bool debug_halt = try_inter; - try_inter = false; - //try_intra = !try_inter; // TODO HACK: no intra in inter frames - - if (ectx->imgdata->frame_number > 0) { - //printf("%d\n",ectx->imgdata->frame_number); - } - - // 0: intra - // 1: inter - - CodingOptions<enc_cb> options(ectx,cb,ctxModel); - - CodingOption<enc_cb> option_intra = options.new_option(try_intra); - CodingOption<enc_cb> option_inter = options.new_option(try_inter); - - options.start(); - - enc_cb* cb_inter = NULL; - enc_cb* cb_intra = NULL; - - const int log2CbSize = cb->log2Size; - const int x = cb->x; - const int y = cb->y; - - - // try encoding with inter - - if 
(option_inter) { - option_inter.begin(); - cb_inter = option_inter.get_node(); - - cb_inter->PredMode = MODE_INTER; - ectx->img->set_pred_mode(x,y, log2CbSize, MODE_INTER); - - enc_cb* cb_result; - descend(cb,"inter"); - cb_result=mInterAlgo->analyze(ectx, option_inter.get_context(), cb_inter); - ascend(); - - if (cb_result->PredMode != MODE_SKIP) { - CABAC_encoder_estim* cabac = option_inter.get_cabac(); - cabac->reset(); - - cabac->write_CABAC_bit(CONTEXT_MODEL_PRED_MODE_FLAG, 0); // 0 - inter - float rate_pred_mode_flag = cabac->getRDBits(); - //printf("inter bits: %f\n", rate_pred_mode_flag); - - cb_result->rate += rate_pred_mode_flag; - } - - option_inter.set_node(cb_result); - - option_inter.end(); - } - - - // try intra - - if (option_intra) { - option_intra.begin(); - cb_intra = option_intra.get_node(); - - cb_intra->PredMode = MODE_INTRA; - ectx->img->set_pred_mode(x,y, log2CbSize, MODE_INTRA); - - enc_cb* cb_result; - descend(cb,"intra"); - cb_result=mIntraAlgo->analyze(ectx, option_intra.get_context(), cb_intra); - ascend(); - - if (ectx->shdr->slice_type != SLICE_TYPE_I) { - CABAC_encoder_estim* cabac = option_intra.get_cabac(); - cabac->reset(); - - cabac->write_CABAC_bit(CONTEXT_MODEL_PRED_MODE_FLAG, 1); // 1 - intra - float rate_pred_mode_flag = cabac->getRDBits(); - //printf("intra bits: %f\n", rate_pred_mode_flag); - - cb_result->rate += rate_pred_mode_flag; - } - - option_intra.set_node(cb_result); - - option_intra.end(); - } - - - options.compute_rdo_costs(); - return options.return_best_rdo_node(); -}
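The brute-force decision above ends in `compute_rdo_costs()` and `return_best_rdo_node()`, whose bodies are not part of this hunk. A minimal sketch of such a selection, assuming the usual Lagrangian cost D + lambda * R, might look like this; `Candidate` and `best_rdo_index` are hypothetical names, and the real `CodingOptions` class may compute its cost differently.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical candidate record; `enabled` plays the role of the flag
// passed to new_option() (for example try_intra / try_inter).
struct Candidate { bool enabled; float distortion; float rate; };

// Returns the index of the enabled candidate with the smallest
// distortion + lambda * rate, or -1 if no candidate is enabled.
int best_rdo_index(const std::vector<Candidate>& c, float lambda)
{
  assert(lambda >= 0.0f);
  int best = -1;
  float bestCost = std::numeric_limits<float>::max();
  for (std::size_t i = 0; i < c.size(); i++) {
    if (!c[i].enabled) continue;  // disabled options are never evaluated
    const float cost = c[i].distortion + lambda * c[i].rate;
    if (cost < bestCost) { bestCost = cost; best = static_cast<int>(i); }
  }
  return best;
}
```

Note that in the removed code `try_inter` is forced to `false`, so in practice only the intra option would ever have been a candidate here.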
View file
libde265-1.0.17.tar.gz/libde265/encoder/algo/cb-intra-inter.h
Deleted
@@ -1,68 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#ifndef CB_INTRA_INTER_H -#define CB_INTRA_INTER_H - -#include "libde265/nal-parser.h" -#include "libde265/decctx.h" -#include "libde265/slice.h" -#include "libde265/scan.h" -#include "libde265/intrapred.h" -#include "libde265/transform.h" -#include "libde265/fallback-dct.h" -#include "libde265/quality.h" -#include "libde265/fallback.h" -#include "libde265/encoder/configparam.h" - -#include "libde265/encoder/algo/algo.h" -#include "libde265/encoder/algo/tb-intrapredmode.h" -#include "libde265/encoder/algo/tb-split.h" -#include "libde265/encoder/algo/cb-intrapartmode.h" - - -// ========== CB Intra/Inter decision ========== - -class Algo_CB_IntraInter : public Algo_CB -{ - public: - virtual ~Algo_CB_IntraInter() { } - - void setIntraChildAlgo(Algo_CB* algo) { mIntraAlgo = algo; } - void setInterChildAlgo(Algo_CB* algo) { mInterAlgo = algo; } - - virtual const char* name() const { return "cb-intra-inter"; } - - protected: - Algo_CB* mIntraAlgo; - Algo_CB* mInterAlgo; -}; - -class Algo_CB_IntraInter_BruteForce : public Algo_CB_IntraInter -{ - public: - virtual enc_cb* analyze(encoder_context*, - 
context_model_table&, - enc_cb* cb); -}; - -#endif
View file
libde265-1.0.17.tar.gz/libde265/encoder/algo/cb-intrapartmode.cc
Deleted
@@ -1,185 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: struktur AG, Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - - -#include "libde265/encoder/algo/cb-intrapartmode.h" -#include "libde265/encoder/algo/coding-options.h" -#include "libde265/encoder/encoder-context.h" -#include <assert.h> -#include <limits> -#include <math.h> -#include <iostream> - - -#define ENCODER_DEVELOPMENT 1 - - - -enc_cb* Algo_CB_IntraPartMode_BruteForce::analyze(encoder_context* ectx, - context_model_table& ctxModel, - enc_cb* cb_in) -{ - const int log2CbSize = cb_in->log2Size; - const int x = cb_in->x; - const int y = cb_in->y; - - const bool can_use_NxN = ((log2CbSize == ectx->get_sps().Log2MinCbSizeY) && - (log2CbSize > ectx->get_sps().Log2MinTrafoSize)); - - - // test all modes - - assert(cb_in->pcm_flag==0); - - - // 0: 2Nx2N (always checked) - // 1: NxN (only checked at MinCbSize) - - CodingOptions<enc_cb> options(ectx,cb_in,ctxModel); - CodingOption<enc_cb> option[2]; - option[0] = options.new_option(true); - option[1] = options.new_option(can_use_NxN); - - options.start(); - - for (int p=0;p<2;p++) - if (option[p]) { - option[p].begin(); - - enc_cb* cb = option[p].get_node(); - *(cb_in->downPtr) = cb; - - // --- set intra prediction
mode --- - - cb->PartMode = (p==0 ? PART_2Nx2N : PART_NxN); - - ectx->img->set_pred_mode(x,y, log2CbSize, cb->PredMode); // TODO: probably unnecessary - ectx->img->set_PartMode (x,y, cb->PartMode); - - - // encode transform tree - - int IntraSplitFlag= (cb->PredMode == MODE_INTRA && cb->PartMode == PART_NxN); - int MaxTrafoDepth = ectx->get_sps().max_transform_hierarchy_depth_intra + IntraSplitFlag; - - descend(cb,p==0 ? "2Nx2N" : "NxN"); - - enc_tb* tb = new enc_tb(x,y,log2CbSize,cb); - tb->downPtr = &cb->transform_tree; - - cb->transform_tree = mTBIntraPredModeAlgo->analyze(ectx, option[p].get_context(), - ectx->imgdata->input, tb, - 0, MaxTrafoDepth, IntraSplitFlag); - - ascend(); - - cb->distortion = cb->transform_tree->distortion; - cb->rate = cb->transform_tree->rate; - - - // rate for cu syntax - - logtrace(LogSymbols,"$1 part_mode=%d\n",cb->PartMode); - if (log2CbSize == ectx->get_sps().Log2MinCbSizeY) { - int bin = (cb->PartMode==PART_2Nx2N); - option[p].get_cabac()->reset(); - option[p].get_cabac()->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+0, bin); - cb->rate += option[p].get_cabac()->getRDBits(); - } - - option[p].end(); - } - - options.compute_rdo_costs(); - enc_cb* bestCB = options.return_best_rdo_node(); - - return bestCB; -} - - -enc_cb* Algo_CB_IntraPartMode_Fixed::analyze(encoder_context* ectx, - context_model_table& ctxModel, - enc_cb* cb) -{ - enum PartMode PartMode = mParams.partMode(); - - - const int log2CbSize = cb->log2Size; - const int x = cb->x; - const int y = cb->y; - - - // NxN can only be applied at minimum CB size. - // If we are not at the minimum size, we have to use 2Nx2N.
- - if (PartMode==PART_NxN && log2CbSize != ectx->get_sps().Log2MinCbSizeY) { - PartMode = PART_2Nx2N; - } - - - // --- set intra prediction mode --- - - cb->PartMode = PartMode; - ectx->img->set_PartMode(x,y, PartMode); - - - // encode transform tree - - int IntraSplitFlag= (cb->PredMode == MODE_INTRA && cb->PartMode == PART_NxN); - int MaxTrafoDepth = ectx->get_sps().max_transform_hierarchy_depth_intra + IntraSplitFlag; - - enc_tb* tb = new enc_tb(x,y,log2CbSize,cb); - tb->blkIdx = 0; - tb->downPtr = &cb->transform_tree; - - descend(cb,"fixed:%s", (PartMode==PART_2Nx2N ? "2Nx2N":"NxN")); - cb->transform_tree = mTBIntraPredModeAlgo->analyze(ectx, ctxModel, - ectx->imgdata->input, tb, - 0, MaxTrafoDepth, IntraSplitFlag); - ascend(); - - - // rate and distortion for this CB - - cb->distortion = cb->transform_tree->distortion; - cb->rate = cb->transform_tree->rate; - - - // rate for cu syntax - - CABAC_encoder_estim estim; - estim.set_context_models(&ctxModel); - - //encode_coding_unit(ectx,&estim,cb,x,y,log2CbSize, false); - - //encode_part_mode(ectx,&estim, MODE_INTRA, PartMode, 0); - - logtrace(LogSymbols,"$1 part_mode=%d\n",PartMode); - if (log2CbSize == ectx->get_sps().Log2MinCbSizeY) { - int bin = (PartMode==PART_2Nx2N); - estim.write_CABAC_bit(CONTEXT_MODEL_PART_MODE+0, bin); - } - - cb->rate += estim.getRDBits(); - - return cb; -}
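The two eligibility rules for intra NxN in the removed cb-intrapartmode.cc (the brute-force test and the fixed-mode fallback to 2Nx2N) can be isolated as follows. This is a sketch under assumptions: `IntraPart`, `SpsLimits`, `can_use_NxN` and `effective_part_mode` are hypothetical names, and the two struct fields merely stand in for `Log2MinCbSizeY` and `Log2MinTrafoSize` from the sequence parameter set.

```cpp
#include <cassert>

// Hypothetical stand-ins for this sketch; not the libde265 types.
enum class IntraPart { P2Nx2N, PNxN };

struct SpsLimits {
  int log2MinCbSize;     // stands in for Log2MinCbSizeY
  int log2MinTrafoSize;  // stands in for Log2MinTrafoSize
};

// Brute-force variant: NxN is only tried at the minimum CB size, and only
// if the CB is still larger than the smallest transform block.
bool can_use_NxN(int log2CbSize, const SpsLimits& sps)
{
  assert(log2CbSize >= sps.log2MinCbSize);
  return log2CbSize == sps.log2MinCbSize &&
         log2CbSize > sps.log2MinTrafoSize;
}

// Fixed variant: a requested NxN silently falls back to 2Nx2N when the CB
// is not at the minimum size.
IntraPart effective_part_mode(IntraPart requested, int log2CbSize,
                              const SpsLimits& sps)
{
  if (requested == IntraPart::PNxN && log2CbSize != sps.log2MinCbSize) {
    return IntraPart::P2Nx2N;
  }
  return requested;
}
```

The fixed variant checks only the minimum CB size, exactly as the removed code does; the additional transform-size condition appears only in the brute-force path.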
View file
libde265-1.0.17.tar.gz/libde265/encoder/algo/cb-intrapartmode.h
Deleted
@@ -1,149 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#ifndef CB_INTRAPARTMODE_H -#define CB_INTRAPARTMODE_H - -#include "libde265/nal-parser.h" -#include "libde265/decctx.h" -#include "libde265/encoder/encoder-types.h" -#include "libde265/slice.h" -#include "libde265/scan.h" -#include "libde265/intrapred.h" -#include "libde265/transform.h" -#include "libde265/fallback-dct.h" -#include "libde265/quality.h" -#include "libde265/fallback.h" -#include "libde265/encoder/configparam.h" - -#include "libde265/encoder/algo/algo.h" -#include "libde265/encoder/algo/tb-intrapredmode.h" -#include "libde265/encoder/algo/tb-split.h" - - -/* Encoder search tree, bottom up: - - - Algo_TB_Split - whether TB is split or not - - - Algo_TB_IntraPredMode - choose the intra prediction mode (or NOP, if at the wrong tree level) - - - Algo_CB_IntraPartMode - choose between NxN and 2Nx2N intra parts - - - Algo_CB_Split - whether CB is split or not - - - Algo_CTB_QScale - select QScale on CTB granularity - */ - - -// ========== CB intra NxN vs. 
2Nx2N decision ========== - -enum ALGO_CB_IntraPartMode { - ALGO_CB_IntraPartMode_BruteForce, - ALGO_CB_IntraPartMode_Fixed -}; - -class option_ALGO_CB_IntraPartMode : public choice_option<enum ALGO_CB_IntraPartMode> -{ - public: - option_ALGO_CB_IntraPartMode() { - add_choice("fixed", ALGO_CB_IntraPartMode_Fixed); - add_choice("brute-force",ALGO_CB_IntraPartMode_BruteForce, true); - } -}; - - -class Algo_CB_IntraPartMode : public Algo_CB -{ - public: - Algo_CB_IntraPartMode() : mTBIntraPredModeAlgo(NULL) { } - virtual ~Algo_CB_IntraPartMode() { } - - virtual enc_cb* analyze(encoder_context*, - context_model_table&, - enc_cb* cb) = 0; - - void setChildAlgo(Algo_TB_IntraPredMode* algo) { mTBIntraPredModeAlgo = algo; } - - virtual const char* name() const { return "cb-intrapartmode"; } - - protected: - Algo_TB_IntraPredMode* mTBIntraPredModeAlgo; -}; - -/* Try both NxN, 2Nx2N and choose better one. - */ -class Algo_CB_IntraPartMode_BruteForce : public Algo_CB_IntraPartMode -{ - public: - virtual enc_cb* analyze(encoder_context*, - context_model_table&, - enc_cb* cb); - - virtual const char* name() const { return "cb-intrapartmode-bruteforce"; } -}; - - -class option_PartMode : public choice_option<enum PartMode> // choice_option -{ - public: - option_PartMode() { - add_choice("NxN", PART_NxN); - add_choice("2Nx2N", PART_2Nx2N, true); - } -}; - - -/* Always use choose selected part mode. - If NxN is chosen but cannot be applied (CB tree not at maximum depth), 2Nx2N is used instead. 
- */ -class Algo_CB_IntraPartMode_Fixed : public Algo_CB_IntraPartMode -{ - public: - Algo_CB_IntraPartMode_Fixed() { } - - struct params - { - params() { - partMode.set_ID("CB-IntraPartMode-Fixed-partMode"); - } - - option_PartMode partMode; - }; - - void registerParams(config_parameters& config) { - config.add_option(&mParams.partMode); - } - - void setParams(const params& p) { mParams=p; } - - virtual enc_cb* analyze(encoder_context* ectx, - context_model_table& ctxModel, - enc_cb* cb); - - virtual const char* name() const { return "cb-intrapartmode-fixed"; } - - private: - params mParams; -}; - - -#endif
View file
libde265-1.0.17.tar.gz/libde265/encoder/algo/cb-mergeindex.cc
Deleted
@@ -1,176 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: struktur AG, Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - - -#include "libde265/encoder/algo/cb-mergeindex.h" -#include "libde265/encoder/encoder-context.h" -#include "libde265/encoder/encoder-syntax.h" -#include "libde265/encoder/encoder-motion.h" -#include <assert.h> -#include <limits> -#include <math.h> - - - - -enc_cb* Algo_CB_MergeIndex_Fixed::analyze(encoder_context* ectx, - context_model_table& ctxModel, - enc_cb* cb) -{ - assert(cb->split_cu_flag==false); - assert(cb->PredMode==MODE_SKIP); // TODO: || (cb->PredMode==MODE_INTER && cb->inter.skip_flag)); - - - PBMotion mergeCandList[5]; - - int partIdx = 0; - - int cbSize = 1 << cb->log2Size; - - get_merge_candidate_list_from_tree(ectx, ectx->shdr, - cb->x, cb->y, // xC/yC - cb->x, cb->y, // xP/yP - cbSize, // nCS - cbSize,cbSize, // nPbW/nPbH - partIdx, // partIdx - mergeCandList); - - PBMotionCoding& spec = cb->inter.pb[partIdx].spec; - - spec.merge_flag = 1; // we use merge mode - spec.merge_idx = 0; - - - // build prediction - - // previous frame (TODO) - const de265_image* refimg = ectx->get_image(ectx->imgdata->frame_number -1); - - //printf("prev frame: %p
%d\n",refimg,ectx->imgdata->frame_number); - - /* - printf("#l0: %d\n",ectx->imgdata->shdr.num_ref_idx_l0_active); - printf("#l1: %d\n",ectx->imgdata->shdr.num_ref_idx_l1_active); - - for (int i=0;i<ectx->imgdata->shdr.num_ref_idx_l0_active;i++) - printf("RefPixList[0][%d] = %d\n", i, ectx->imgdata->shdr.RefPicList[0][i]); - */ - - // TODO: fake motion data - - const PBMotion& vec = mergeCandList[spec.merge_idx]; - cb->inter.pb[partIdx].motion = vec; - - //ectx->img->set_mv_info(cb->x,cb->y, 1<<cb->log2Size,1<<cb->log2Size, vec); - - /* - generate_inter_prediction_samples(ectx, ectx->shdr, ectx->prediction, - cb->x,cb->y, // int xC,int yC, - 0,0, // int xB,int yB, - 1<<cb->log2Size, // int nCS, - 1<<cb->log2Size, - 1<<cb->log2Size, // int nPbW,int nPbH, - &vec); - */ - - generate_inter_prediction_samples(ectx, ectx->shdr, ectx->img, - cb->x,cb->y, // int xC,int yC, - 0,0, // int xB,int yB, - 1<<cb->log2Size, // int nCS, - 1<<cb->log2Size, - 1<<cb->log2Size, // int nPbW,int nPbH, - &vec); - - /* - printBlk("merge prediction:", - ectx->img->get_image_plane_at_pos(0, cb->x,cb->y), 1<<cb->log2Size, - ectx->img->get_image_stride(0), - "merge "); - */ - - // estimate rate for sending merge index - - //CABAC_encoder_estim cabac; - //cabac.write_bits(); - - int IntraSplitFlag = 0; - int MaxTrafoDepth = ectx->get_sps().max_transform_hierarchy_depth_inter; - - if (mCodeResidual) { - assert(false); - descend(cb,"with residual"); - assert(false); - /* TODO - cb->transform_tree = mTBSplit->analyze(ectx,ctxModel, ectx->imgdata->input, NULL, cb, - cb->x,cb->y,cb->x,cb->y, cb->log2Size,0, - 0, MaxTrafoDepth, IntraSplitFlag); - */ - ascend(); - - cb->inter.rqt_root_cbf = !
cb->transform_tree->isZeroBlock(); - - cb->distortion = cb->transform_tree->distortion; - cb->rate = cb->transform_tree->rate; - } - else { - const de265_image* input = ectx->imgdata->input; - //de265_image* img = ectx->prediction; - int x0 = cb->x; - int y0 = cb->y; - int tbSize = 1<<cb->log2Size; - - CABAC_encoder_estim cabac; - cabac.set_context_models(&ctxModel); - encode_merge_idx(ectx, &cabac, spec.merge_idx); - - leaf(cb,"no residual"); - - cb->rate = cabac.getRDBits(); - - cb->inter.rqt_root_cbf = 0; - - - enc_tb* tb = new enc_tb(x0,y0,cb->log2Size,cb); - tb->downPtr = &cb->transform_tree; - cb->transform_tree = tb; - - tb->reconstruct(ectx, ectx->img); // reconstruct luma - - /* - printBlk("distortion input:", - input->get_image_plane_at_pos(0,x0,y0), 1<<cb->log2Size, - input->get_image_stride(0), - "input "); - - printBlk("distortion prediction:", - ectx->img->get_image_plane_at_pos(0,x0,y0), 1<<cb->log2Size, - ectx->img->get_image_stride(0), - "pred "); - */ - - cb->distortion = compute_distortion_ssd(input, ectx->img, x0,y0, cb->log2Size, 0); - } - - //printf("%d;%d rqt_root_cbf=%d\n",cb->x,cb->y,cb->inter.rqt_root_cbf); - - return cb; -}
libde265-1.0.17.tar.gz/libde265/encoder/algo/cb-mergeindex.h
Deleted
@@ -1,70 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#ifndef CB_MERGEINDEX_H -#define CB_MERGEINDEX_H - -#include "libde265/nal-parser.h" -#include "libde265/decctx.h" -#include "libde265/slice.h" -#include "libde265/scan.h" -#include "libde265/intrapred.h" -#include "libde265/transform.h" -#include "libde265/fallback-dct.h" -#include "libde265/quality.h" -#include "libde265/fallback.h" -#include "libde265/encoder/configparam.h" - -#include "libde265/encoder/algo/algo.h" -#include "libde265/encoder/algo/tb-split.h" - - -// ========== CB Skip/Inter decision ========== - -class Algo_CB_MergeIndex : public Algo_CB -{ - public: - Algo_CB_MergeIndex() : mCodeResidual(false) { } - virtual ~Algo_CB_MergeIndex() { } - - void set_code_residual(bool flag=true) { mCodeResidual=flag; } - - void setChildAlgo(Algo_TB_Split* algo) { mTBSplit = algo; } - // TODO void setInterChildAlgo(Algo_CB_IntraPartMode* algo) { mInterPartModeAlgo = algo; } - - virtual const char* name() const { return "cb-mergeindex"; } - - protected: - Algo_TB_Split* mTBSplit; - - bool mCodeResidual; -}; - -class Algo_CB_MergeIndex_Fixed : public Algo_CB_MergeIndex -{ - public: - virtual enc_cb* 
analyze(encoder_context*, - context_model_table&, - enc_cb* cb); -}; - -#endif
libde265-1.0.17.tar.gz/libde265/encoder/algo/cb-skip.cc
Deleted
@@ -1,114 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: struktur AG, Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - - -#include "libde265/encoder/algo/cb-skip.h" -#include "libde265/encoder/algo/coding-options.h" -#include "libde265/encoder/encoder-syntax.h" -#include "libde265/encoder/encoder-context.h" -#include <assert.h> -#include <limits> -#include <math.h> - - - - -enc_cb* Algo_CB_Skip_BruteForce::analyze(encoder_context* ectx, - context_model_table& ctxModel, - enc_cb* cb) -{ - bool try_skip = (ectx->shdr->slice_type != SLICE_TYPE_I); - bool try_nonskip = true; - - //try_nonskip = !try_skip; - - CodingOptions<enc_cb> options(ectx,cb,ctxModel); - CodingOption<enc_cb> option_skip = options.new_option(try_skip); - CodingOption<enc_cb> option_nonskip = options.new_option(try_nonskip); - options.start(); - - for (int i=0;i<CONTEXT_MODEL_TABLE_LENGTH;i++) { - //printf("%i: %d/%d\n",i, ctxModel[i].state, ctxModel[i].MPSbit); - } - - if (option_skip) { - CodingOption<enc_cb>& opt = option_skip; - opt.begin(); - - enc_cb* cb = opt.get_node(); - - // calc rate for skip flag (=true) - - CABAC_encoder_estim* cabac = opt.get_cabac(); - encode_cu_skip_flag(ectx, cabac, cb, true); - float rate_pred_mode =
cabac->getRDBits(); - cabac->reset(); - - // set skip flag - - cb->PredMode = MODE_SKIP; - ectx->img->set_pred_mode(cb->x,cb->y, cb->log2Size, cb->PredMode); - - // encode CB - - descend(cb,"yes"); - cb = mSkipAlgo->analyze(ectx, opt.get_context(), cb); - ascend(); - - // add rate for PredMode - - cb->rate += rate_pred_mode; - opt.set_node(cb); - opt.end(); - } - - if (option_nonskip) { - CodingOption<enc_cb>& opt = option_nonskip; - enc_cb* cb = opt.get_node(); - - opt.begin(); - - // calc rate for skip flag (=true) - - float rate_pred_mode = 0; - - if (try_skip) { - CABAC_encoder_estim* cabac = opt.get_cabac(); - encode_cu_skip_flag(ectx, cabac, cb, false); - rate_pred_mode = cabac->getRDBits(); - cabac->reset(); - } - - descend(cb,"no"); - cb = mNonSkipAlgo->analyze(ectx, opt.get_context(), cb); - ascend(); - - // add rate for PredMode - - cb->rate += rate_pred_mode; - opt.set_node(cb); - opt.end(); - } - - options.compute_rdo_costs(); - return options.return_best_rdo_node(); -}
libde265-1.0.17.tar.gz/libde265/encoder/algo/cb-skip.h
Deleted
@@ -1,72 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#ifndef CB_SKIP_H -#define CB_SKIP_H - -#include "libde265/nal-parser.h" -#include "libde265/decctx.h" -#include "libde265/slice.h" -#include "libde265/scan.h" -#include "libde265/intrapred.h" -#include "libde265/transform.h" -#include "libde265/fallback-dct.h" -#include "libde265/quality.h" -#include "libde265/fallback.h" -#include "libde265/encoder/configparam.h" - -#include "libde265/encoder/algo/algo.h" -#include "libde265/encoder/algo/cb-mergeindex.h" - - -// ========== CB Skip/Inter decision ========== - -class Algo_CB_Skip : public Algo_CB -{ - public: - virtual ~Algo_CB_Skip() { } - - void setSkipAlgo(Algo_CB_MergeIndex* algo) { - mSkipAlgo = algo; - mSkipAlgo->set_code_residual(false); - } - - void setNonSkipAlgo(Algo_CB* algo) { mNonSkipAlgo = algo; } - - const char* name() const { return "cb-skip"; } - - protected: - Algo_CB_MergeIndex* mSkipAlgo; - Algo_CB* mNonSkipAlgo; -}; - -class Algo_CB_Skip_BruteForce : public Algo_CB_Skip -{ - public: - virtual enc_cb* analyze(encoder_context*, - context_model_table&, - enc_cb* cb); - - const char* name() const { return "cb-skip-bruteforce"; } -}; - 
-#endif
libde265-1.0.17.tar.gz/libde265/encoder/algo/cb-split.cc
Deleted
@@ -1,178 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: struktur AG, Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - - -#include "libde265/encoder/algo/cb-split.h" -#include "libde265/encoder/algo/coding-options.h" -#include "libde265/encoder/encoder-context.h" -#include "libde265/encoder/encoder-syntax.h" -#include <assert.h> -#include <limits> -#include <math.h> -#include <iostream> - - -// Utility function to encode all four children in a split CB. -// Children are coded with the specified algo_cb_split. 
-enc_cb* Algo_CB_Split::encode_cb_split(encoder_context* ectx, - context_model_table& ctxModel, - enc_cb* cb) -{ - int w = ectx->imgdata->input->get_width(); - int h = ectx->imgdata->input->get_height(); - - - cb->split_cu_flag = true; - - - // encode all 4 children and sum their distortions and rates - - for (int i=0;i<4;i++) { - cb->children[i] = NULL; - } - - for (int i=0;i<4;i++) { - int child_x = cb->x + ((i&1) << (cb->log2Size-1)); - int child_y = cb->y + ((i>>1) << (cb->log2Size-1)); - - if (child_x>=w || child_y>=h) { - // NOP - } - else { - enc_cb* childCB = new enc_cb; - childCB->log2Size = cb->log2Size-1; - childCB->ctDepth = cb->ctDepth+1; - - childCB->x = child_x; - childCB->y = child_y; - childCB->parent = cb; - childCB->downPtr = &cb->children[i]; - - descend(cb,"yes %d/4",i+1); - cb->children[i] = analyze(ectx, ctxModel, childCB); - ascend(); - - cb->distortion += cb->children[i]->distortion; - cb->rate += cb->children[i]->rate; - } - } - - return cb; -} - - - - -enc_cb* Algo_CB_Split_BruteForce::analyze(encoder_context* ectx, - context_model_table& ctxModel, - enc_cb* cb_input) -{ - assert(cb_input->pcm_flag==0); - - // --- prepare coding options --- - - const SplitType split_type = get_split_type(&ectx->get_sps(), - cb_input->x, cb_input->y, - cb_input->log2Size); - - - bool can_split_CB = (split_type != ForcedNonSplit); - bool can_nosplit_CB = (split_type != ForcedSplit); - - //if (can_split_CB) { can_nosplit_CB=false; } // TODO TMP - //if (can_nosplit_CB) { can_split_CB=false; } // TODO TMP - - CodingOptions<enc_cb> options(ectx, cb_input, ctxModel); - - CodingOption<enc_cb> option_no_split = options.new_option(can_nosplit_CB); - CodingOption<enc_cb> option_split = options.new_option(can_split_CB); - - options.start(); - - /* - cb_input->writeSurroundingMetadata(ectx, ectx->img, - enc_node::METADATA_CT_DEPTH, // for estimation cb-split bits - cb_input->get_rectangle()); - */ - - // --- encode without splitting --- - - if (option_no_split) { -
CodingOption<enc_cb>& opt = option_no_split; // abbrev. - - opt.begin(); - - enc_cb* cb = opt.get_node(); - *cb_input->downPtr = cb; - - // set CB size in image data-structure - //ectx->img->set_ctDepth(cb->x,cb->y,cb->log2Size, cb->ctDepth); - //ectx->img->set_log2CbSize(cb->x,cb->y,cb->log2Size, true); - - /* We set QP here, because this is required at in non-split CBs only. - */ - cb->qp = ectx->active_qp; - - // analyze subtree - assert(mChildAlgo); - - descend(cb,"no"); - cb = mChildAlgo->analyze(ectx, opt.get_context(), cb); - ascend(); - - // add rate for split flag - if (split_type == OptionalSplit) { - encode_split_cu_flag(ectx,opt.get_cabac(), cb->x,cb->y, cb->ctDepth, 0); - cb->rate += opt.get_cabac_rate(); - } - - opt.set_node(cb); - opt.end(); - } - - // --- encode with splitting --- - - if (option_split) { - option_split.begin(); - - enc_cb* cb = option_split.get_node(); - *cb_input->downPtr = cb; - - cb = encode_cb_split(ectx, option_split.get_context(), cb); - - // add rate for split flag - if (split_type == OptionalSplit) { - encode_split_cu_flag(ectx,option_split.get_cabac(), cb->x,cb->y, cb->ctDepth, 1); - cb->rate += option_split.get_cabac_rate(); - } - - option_split.set_node(cb); - option_split.end(); - } - - options.compute_rdo_costs(); - enc_cb* bestCB = options.return_best_rdo_node(); - - //bestCB->debug_assertTreeConsistency(ectx->img); - - return bestCB; -}
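Note on the removed hunk above: encode_cb_split derives each child block's position from the child index i, using bit 0 for the horizontal offset and bit 1 for the vertical offset, and skips children that start outside the picture. The sketch below isolates only that index arithmetic. `ChildPos` and `split_children` are hypothetical names introduced for illustration and are not part of libde265.

```cpp
#include <cassert>
#include <vector>

// Hypothetical illustration type; not a libde265 structure.
struct ChildPos {
  int x;
  int y;
  int log2Size;
};

// Enumerate the quadtree children of a block at (x,y) with size 1<<log2Size.
// Children whose top-left corner lies outside a picW x picH picture are
// skipped, mirroring the "NOP" branch in the removed code.
std::vector<ChildPos> split_children(int x, int y, int log2Size,
                                     int picW, int picH)
{
  assert(log2Size > 0);  // a 1x1 block cannot be split further

  std::vector<ChildPos> children;
  const int childLog2 = log2Size - 1;

  for (int i = 0; i < 4; i++) {
    int cx = x + ((i & 1) << childLog2);   // bit 0 selects left/right
    int cy = y + ((i >> 1) << childLog2);  // bit 1 selects top/bottom

    if (cx >= picW || cy >= picH) {
      continue;  // child lies completely outside the picture
    }
    children.push_back({cx, cy, childLog2});
  }
  return children;
}
```

For a 16x16 block at the origin of a picture that is only 8 pixels wide, this yields just the two left-hand children, which is why the removed loop can leave some child pointers as NULL.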
libde265-1.0.17.tar.gz/libde265/encoder/algo/cb-split.h
Deleted
@@ -1,88 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#ifndef CB_SPLIT_H -#define CB_SPLIT_H - -#include "libde265/nal-parser.h" -#include "libde265/decctx.h" -#include "libde265/slice.h" -#include "libde265/scan.h" -#include "libde265/intrapred.h" -#include "libde265/transform.h" -#include "libde265/fallback-dct.h" -#include "libde265/quality.h" -#include "libde265/fallback.h" -#include "libde265/encoder/configparam.h" - -#include "libde265/encoder/algo/algo.h" -#include "libde265/encoder/algo/tb-intrapredmode.h" -#include "libde265/encoder/algo/tb-split.h" - - -/* Encoder search tree, bottom up: - - - Algo_TB_Split - whether TB is split or not - - - Algo_TB_IntraPredMode - choose the intra prediction mode (or NOP, if at the wrong tree level) - - - Algo_CB_IntraPartMode - choose between NxN and 2Nx2N intra parts - - - Algo_CB_Split - whether CB is split or not - - - Algo_CTB_QScale - select QScale on CTB granularity - */ - - -// ========== CB split decision ========== - -class Algo_CB_Split : public Algo_CB -{ - public: - virtual ~Algo_CB_Split() { } - - // TODO: probably, this will later be a intra/inter decision which again - // has two child 
algorithms, depending on the coding mode. - void setChildAlgo(Algo_CB* algo) { mChildAlgo = algo; } - - const char* name() const { return "cb-split"; } - - protected: - Algo_CB* mChildAlgo; - - enc_cb* encode_cb_split(encoder_context* ectx, - context_model_table& ctxModel, - enc_cb* cb); -}; - - -class Algo_CB_Split_BruteForce : public Algo_CB_Split -{ - public: - virtual enc_cb* analyze(encoder_context*, - context_model_table&, - enc_cb* cb); - - const char* name() const { return "cb-split-bruteforce"; } -}; - -#endif
libde265-1.0.17.tar.gz/libde265/encoder/algo/coding-options.cc
Deleted
@@ -1,202 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#include "libde265/encoder/algo/coding-options.h" -#include "libde265/encoder/encoder-context.h" - - -template <class node> -CodingOptions<node>::CodingOptions(encoder_context* ectx, node* _node, context_model_table& tab) -{ - //mCBMode = true; - mInputNode = _node; - mContextModelInput = &tab; - - mBestRDO=-1; - - mECtx = ectx; -} - -template <class node> -CodingOptions<node>::~CodingOptions() -{ -} - -template <class node> -CodingOption<node> CodingOptions<node>::new_option(bool active) -{ - if (!active) { - return CodingOption<node>(); - } - - - CodingOptionData opt; - - bool firstOption = mOptions.empty(); - if (firstOption) { - opt.mNode = mInputNode; - } - else { - opt.mNode = new node(*mInputNode); - } - - opt.context = *mContextModelInput; - opt.computed = false; - - CodingOption<node> option(this, mOptions.size()); - - mOptions.push_back( std::move(opt) ); - - return option; -} - - -template <class node> -void CodingOptions<node>::start(enum RateEstimationMethod rateMethod) -{ - /* We don't need the input context model anymore. - Releasing it now may save a copy during a later decouple(). 
- */ - mContextModelInput->release(); - - bool adaptiveContext; - switch (rateMethod) { - case Rate_Default: - adaptiveContext = mECtx->use_adaptive_context; - break; - case Rate_FixedContext: - adaptiveContext = false; - break; - case Rate_AdaptiveContext: - adaptiveContext = true; - break; - } - - if (adaptiveContext) { - /* If we modify the context models in this algorithm, - we need separate models for each option. - */ - for (auto& option : mOptions) { - option.context.decouple(); - } - - cabac = &cabac_adaptive; - } - else { - cabac = &cabac_constant; - } -} - - -template <class node> -void CodingOptions<node>::compute_rdo_costs() -{ - for (size_t i=0;i<mOptions.size();i++) { - if (mOptions[i].computed) { - //printf("compute_rdo_costs %d: %f\n",i, mOptions[i].mNode->rate); - mOptions[i].rdoCost = mOptions[i].mNode->distortion + mECtx->lambda * mOptions[i].mNode->rate; - } - } -} - - -template <class node> -int CodingOptions<node>::find_best_rdo_index() -{ - assert(mOptions.size()>0); - - - float bestRDOCost = 0; - bool first=true; - int bestRDO=-1; - - for (size_t i=0;i<mOptions.size();i++) { - if (mOptions[i].computed) { - float cost = mOptions[i].rdoCost; - - //printf("option %d cost: %f\n",i,cost); - - if (first || cost < bestRDOCost) { - bestRDOCost = cost; - first = false; - bestRDO = i; - } - } - } - - return bestRDO; -} - - -template <class node> -node* CodingOptions<node>::return_best_rdo_node() -{ - int bestRDO = find_best_rdo_index(); - - assert(bestRDO>=0); - - *mContextModelInput = mOptions[bestRDO].context; - - - // delete all CBs except the best one - - for (size_t i=0;i<mOptions.size();i++) { - if (i != bestRDO) - { - delete mOptions[i].mNode; - mOptions[i].mNode = NULL; - } - } - - return mOptions[bestRDO].mNode; -} - - -template <class node> -void CodingOption<node>::begin() -{ - assert(mParent); - assert(mParent->cabac); // did you call CodingOptions.start() ?
- mParent->cabac->reset(); - mParent->cabac->set_context_models( &get_context() ); - - mParent->mOptions[mOptionIdx].computed = true; - - // link this node into the coding tree - - node* n = get_node(); - *(n->downPtr) = n; -} - - -template <class node> -void CodingOption<node>::end() -{ -} - - -template class CodingOptions<enc_tb>; -template class CodingOptions<enc_cb>; - -template class CodingOption<enc_tb>; -template class CodingOption<enc_cb>;
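Note on the removed hunk above: compute_rdo_costs and find_best_rdo_index together pick the coding option that minimizes the Lagrangian cost D + lambda*R over the options that were actually evaluated. The following is a minimal standalone sketch of that selection rule. `OptionSketch` and `best_rdo_index` are hypothetical names for illustration, and the sketch merges the two removed functions into one loop rather than reproducing them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one coding option; not a libde265 type.
struct OptionSketch {
  bool  computed;    // true once the option has been evaluated
  float distortion;  // D, for example SSD against the input block
  float rate;        // R, estimated number of bits
};

// Return the index of the computed option with the lowest D + lambda*R,
// or -1 if no option has been computed.
int best_rdo_index(const std::vector<OptionSketch>& options, float lambda)
{
  assert(lambda >= 0.0f);  // a negative lambda would reward spending bits

  int best = -1;
  float bestCost = 0.0f;

  for (std::size_t i = 0; i < options.size(); i++) {
    if (!options[i].computed) {
      continue;  // inactive or unevaluated options never win
    }

    float cost = options[i].distortion + lambda * options[i].rate;
    if (best < 0 || cost < bestCost) {
      bestCost = cost;
      best = static_cast<int>(i);
    }
  }
  return best;
}
```

A larger lambda weights rate more heavily, so the same pair of options can rank differently: with D=100, R=10 against D=80, R=30, the second option wins at lambda=0.5 and the first wins at lambda=2.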
libde265-1.0.17.tar.gz/libde265/encoder/algo/coding-options.h
Deleted
@@ -1,151 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#ifndef CODING_OPTIONS_H -#define CODING_OPTIONS_H - -#include "libde265/encoder/encoder-types.h" - - -template <class node> class CodingOption; - - -template <class node> -class CodingOptions -{ - public: - CodingOptions(encoder_context*, node*, context_model_table& tab); - ~CodingOptions(); - - typedef CodingOption<node> Option; - - // --- init --- call before object use - - CodingOption<node> new_option(bool active=true); - - enum RateEstimationMethod - { - Rate_Default, // take default value from encoder_context - Rate_AdaptiveContext, - Rate_FixedContext - }; - - void start(enum RateEstimationMethod = Rate_Default); - - - // --- processing --- - - // compute RDO cost (D + lambda*R) for all options - void compute_rdo_costs(); - - - // --- end processing --- do not call any function after this one - - /* Return the CB with the lowest RDO cost. All other CBs are destroyed. - If the current metadata stored in the image are not from the returned block, - its metadata flags are set to zero. 
- */ - node* return_best_rdo_node(); - - private: - struct CodingOptionData - { - node* mNode; - - context_model_table context; - bool mOptionActive; - bool computed; - float rdoCost; - }; - - - encoder_context* mECtx; - - bool mCBMode; - node* mInputNode; - - context_model_table* mContextModelInput; - - int mBestRDO; - - std::vector<CodingOptionData> mOptions; - - CABAC_encoder_estim cabac_adaptive; - CABAC_encoder_estim_constant cabac_constant; - CABAC_encoder_estim* cabac; - - friend class CodingOption<node>; - - int find_best_rdo_index(); -}; - - -template <class node> -class CodingOption -{ - public: - CodingOption() { - mParent = NULL; - mOptionIdx = 0; - } - - node* get_node() { return mParent->mOptions[mOptionIdx].mNode; } - void set_node(node* _node) { - if (_node != mParent->mOptions[mOptionIdx].mNode) { - //printf("delete TB %p\n", mParent->mOptions[mOptionIdx].tb); - //delete mParent->mOptions[mOptionIdx].mNode; - } - mParent->mOptions[mOptionIdx].mNode = _node; - } - - context_model_table& get_context() { return mParent->mOptions[mOptionIdx].context; } - - /** @return True if the option is active. - */ - operator bool() const { return mParent; } - - /* When modifying the metadata stored in the image, you have to - encapsulate the modification between these two functions to ensure - that the correct reconstruction will be active after return_best_rdo(). - */ - void begin(); - void end(); - - // Manually set RDO costs instead of computing them with compute_rdo_costs. - // Only required when using custom costs. - void set_rdo_cost(float rdo) { mParent->mOptions[mOptionIdx].rdoCost=rdo; } - - CABAC_encoder_estim* get_cabac() { return mParent->cabac; } - float get_cabac_rate() const { return mParent->cabac->getRDBits(); } - -private: - CodingOption(class CodingOptions<node>* parent, int idx) - : mParent(parent), mOptionIdx(idx) { } - - class CodingOptions<node>* mParent; - int mOptionIdx; - - friend class CodingOptions<node>; -}; - - -#endif
libde265-1.0.17.tar.gz/libde265/encoder/algo/ctb-qscale.cc
Deleted
@@ -1,61 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: struktur AG, Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - - -#include "libde265/encoder/algo/ctb-qscale.h" -#include "libde265/encoder/encoder-context.h" -#include <assert.h> -#include <limits> -#include <math.h> - - -#define ENCODER_DEVELOPMENT 1 - - -enc_cb* Algo_CTB_QScale_Constant::analyze(encoder_context* ectx, - context_model_table& ctxModel, - int x,int y) -{ - enc_cb* cb = new enc_cb(); - - cb->log2Size = ectx->get_sps().Log2CtbSizeY; - cb->ctDepth = 0; - cb->x = x; - cb->y = y; - cb->downPtr = ectx->ctbs.getCTBRootPointer(x,y); - *cb->downPtr = cb; - - cb->qp = ectx->active_qp; - - // write currently unused coding options - cb->cu_transquant_bypass_flag = false; - cb->pcm_flag = false; - - assert(mChildAlgo); - descend(cb, "Q=%d",ectx->active_qp); - enc_cb* result_cb = mChildAlgo->analyze(ectx,ctxModel,cb); - ascend(); - - *cb->downPtr = result_cb; - - return result_cb; -}
libde265-1.0.17.tar.gz/libde265/encoder/algo/ctb-qscale.h
Deleted
@@ -1,109 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#ifndef CTB_QSCALE_H -#define CTB_QSCALE_H - -#include "libde265/nal-parser.h" -#include "libde265/decctx.h" -#include "libde265/encoder/algo/algo.h" -#include "libde265/slice.h" -#include "libde265/scan.h" -#include "libde265/intrapred.h" -#include "libde265/transform.h" -#include "libde265/fallback-dct.h" -#include "libde265/quality.h" -#include "libde265/fallback.h" -#include "libde265/encoder/configparam.h" - -#include "libde265/encoder/algo/cb-split.h" - - -/* Encoder search tree, bottom up: - - - Algo_TB_Split - whether TB is split or not - - - Algo_TB_IntraPredMode - choose the intra prediction mode (or NOP, if at the wrong tree level) - - - Algo_CB_IntraPartMode - choose between NxN and 2Nx2N intra parts - - - Algo_CB_Split - whether CB is split or not - - - Algo_CTB_QScale - select QScale on CTB granularity - */ - - -// ========== choose a qscale at CTB level ========== - -class Algo_CTB_QScale : public Algo -{ - public: - Algo_CTB_QScale() : mChildAlgo(NULL) { } - virtual ~Algo_CTB_QScale() { } - - virtual enc_cb* analyze(encoder_context*, - context_model_table&, - int ctb_x,int ctb_y) = 0; 
- - void setChildAlgo(Algo_CB_Split* algo) { mChildAlgo = algo; } - - protected: - Algo_CB_Split* mChildAlgo; -}; - - - -class Algo_CTB_QScale_Constant : public Algo_CTB_QScale -{ - public: - struct params - { - params() { - mQP.set_range(1,51); - mQP.set_default(27); - mQP.set_ID("CTB-QScale-Constant"); - mQP.set_cmd_line_options("qp",'q'); - } - - option_int mQP; - }; - - void setParams(const params& p) { mParams=p; } - - void registerParams(config_parameters& config) { - config.add_option(&mParams.mQP); - } - - virtual enc_cb* analyze(encoder_context*, - context_model_table&, - int ctb_x,int ctb_y); - - int getQP() const { return mParams.mQP; } - - const char* name() const { return "ctb-qscale-constant"; } - - private: - params mParams; -}; - - -#endif
libde265-1.0.17.tar.gz/libde265/encoder/algo/pb-mv.cc
Deleted
@@ -1,318 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: struktur AG, Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - - -#include "libde265/encoder/algo/pb-mv.h" -#include "libde265/encoder/algo/coding-options.h" -#include "libde265/encoder/encoder-context.h" -#include <assert.h> -#include <limits> -#include <math.h> - - - -enc_cb* Algo_PB_MV_Test::analyze(encoder_context* ectx, - context_model_table& ctxModel, - enc_cb* cb, - int PBidx, int x,int y,int w,int h) -{ - enum MVTestMode testMode = mParams.testMode(); - - - MotionVector mvp[2]; - - fill_luma_motion_vector_predictors(ectx, ectx->shdr, ectx->img, - cb->x,cb->y,1<<cb->log2Size, x,y,w,h, - 0, // l - 0, 0, // int refIdx, int partIdx, - mvp); - - //printf("%d/%d: [%d;%d] [%d;%d]\n",cb->x,cb->y, mvp[0].x,mvp[0].y, mvp[1].x,mvp[1].y); - - - PBMotionCoding& spec = cb->inter.pb[PBidx].spec; - PBMotion& vec = cb->inter.pb[PBidx].motion; - - spec.merge_flag = 0; - spec.merge_idx = 0; - - spec.inter_pred_idc = PRED_L0; - spec.refIdx[0] = vec.refIdx[0] = 0; - spec.mvp_l0_flag = 0; - - int value = mParams.range(); - - switch (testMode) { - case MVTestMode_Zero: - spec.mvd[0][0]=0; - spec.mvd[0][1]=0; - break; - - case MVTestMode_Random: - spec.mvd[0][0] = (rand() % (2*value+1)) - value; -
spec.mvd[0][1] = (rand() % (2*value+1)) - value; - break; - - case MVTestMode_Horizontal: - spec.mvd[0][0]=value; - spec.mvd[0][1]=0; - break; - - case MVTestMode_Vertical: - spec.mvd[0][0]=0; - spec.mvd[0][1]=value; - break; - } - - spec.mvd[0][0] -= mvp[0].x; - spec.mvd[0][1] -= mvp[0].y; - - vec.mv[0].x = mvp[0].x + spec.mvd[0][0]; - vec.mv[0].y = mvp[0].y + spec.mvd[0][1]; - vec.predFlag[0] = 1; - vec.predFlag[1] = 0; - - ectx->img->set_mv_info(x,y,w,h, vec); - - /* TMP REMOVE: ectx->prediction does not exist anymore - generate_inter_prediction_samples(ectx, ectx->shdr, ectx->prediction, - cb->x,cb->y, // int xC,int yC, - 0,0, // int xB,int yB, - 1<<cb->log2Size, // int nCS, - 1<<cb->log2Size, - 1<<cb->log2Size, // int nPbW,int nPbH, - &vec); - */ - - // TODO estimate rate for sending MV - - int IntraSplitFlag = 0; - int MaxTrafoDepth = ectx->get_sps().max_transform_hierarchy_depth_inter; - - mCodeResidual=true; - if (mCodeResidual) { - assert(mTBSplitAlgo); - assert(false); - /* - cb->transform_tree = mTBSplitAlgo->analyze(ectx,ctxModel, ectx->imgdata->input, NULL, cb, - cb->x,cb->y,cb->x,cb->y, cb->log2Size,0, - 0, MaxTrafoDepth, IntraSplitFlag); - */ - - cb->inter.rqt_root_cbf = !
cb->transform_tree->isZeroBlock(); - - cb->distortion = cb->transform_tree->distortion; - cb->rate = cb->transform_tree->rate; - } - else { - const de265_image* input = ectx->imgdata->input; - /* TODO TMP REMOVE: prediction does not exist anymore - de265_image* img = ectx->prediction; - int x0 = cb->x; - int y0 = cb->y; - int tbSize = 1<<cb->log2Size; - - cb->distortion = compute_distortion_ssd(input, img, x0,y0, cb->log2Size, 0); - cb->rate = 5; // fake (MV) - - cb->inter.rqt_root_cbf = 0; - */ - } - - return cb; -} - - - - -int sad(const uint8_t* p1,int stride1, - const uint8_t* p2,int stride2, - int w,int h) -{ - int cost=0; - - for (int y=0;y<h;y++) { - for (int x=0;x<w;x++) { - cost += abs_value(*p1 - *p2); - p1++; - p2++; - } - - p1 += stride1-w; - p2 += stride2-w; - } - - return cost; -} - - -enc_cb* Algo_PB_MV_Search::analyze(encoder_context* ectx, - context_model_table& ctxModel, - enc_cb* cb, - int PBidx, int x,int y,int pbW,int pbH) -{ - enum MVSearchAlgo searchAlgo = mParams.mvSearchAlgo(); - - - MotionVector mvp[2]; - - fill_luma_motion_vector_predictors(ectx, ectx->shdr, ectx->img, - cb->x,cb->y,1<<cb->log2Size, x,y,pbW,pbH, - 0, // l - 0, 0, // int refIdx, int partIdx, - mvp); - - PBMotionCoding& spec = cb->inter.pb[PBidx].spec; - PBMotion& vec = cb->inter.pb[PBidx].motion; - - spec.merge_flag = 0; - spec.merge_idx = 0; - - spec.inter_pred_idc = PRED_L0; - spec.refIdx[0] = vec.refIdx[0] = 0; - spec.mvp_l0_flag = 0; - - int hrange = mParams.hrange(); - int vrange = mParams.vrange(); - - // previous frame (TODO) - const de265_image* refimg = ectx->get_image(ectx->imgdata->frame_number -1); - const de265_image* inputimg = ectx->imgdata->input; - - int w = refimg->get_width(); - int h = refimg->get_height(); - - int mincost = 0x7fffffff; - - double lambda = 10.0; - - double *bits_h = new double[2*hrange+1]; - double *bits_v = new double[2*vrange+1]; - - for (int i=-hrange;i<=hrange;i++) { - int diff = (i - mvp[0].x); - int b; - - if (diff==0) { b=0; } - else if (diff==1
|| diff==-1) { b=2; } - else { b=abs_value(b+2); } - - bits_hi+hrange=b; - } - - for (int i=-vrange;i<=vrange;i++) { - int diff = (i - mvp0.y); - int b; - - if (diff==0) { b=0; } - else if (diff==1 || diff==-1) { b=2; } - else { b=abs_value(b+2); } - - bits_vi+vrange=b; - } - - for (int my = y-vrange; my<=y+vrange; my++) - for (int mx = x-hrange; mx<=x+hrange; mx++) - { - if (mx<0 || mx+pbW>w || my<0 || my+pbH>h) continue; - - int cost = sad(refimg->get_image_plane_at_pos(0,mx,my), - refimg->get_image_stride(0), - inputimg->get_image_plane_at_pos(0,x,y), - inputimg->get_image_stride(0), - pbW,pbH); - - int bits = bits_hmx-x+hrange + bits_vmy-y+vrange; - - cost += lambda * bits; - - //printf("%d %d : %d\n",mx,my,cost); - - if (cost<mincost) { - mincost=cost; - - spec.mvd00=(mx-x)<<2; - spec.mvd01=(my-y)<<2; - } - } - - spec.mvd00 -= mvp0.x; - spec.mvd01 -= mvp0.y; - - vec.mv0.x = mvp0.x + spec.mvd00; - vec.mv0.y = mvp0.y + spec.mvd01; - vec.predFlag0 = 1; - vec.predFlag1 = 0; - - ectx->img->set_mv_info(x,y,pbW,pbH, vec); - - /* TMP REMOVE: ectx->prediction does not exist anymore - generate_inter_prediction_samples(ectx, ectx->shdr, ectx->prediction, - cb->x,cb->y, // int xC,int yC, - 0,0, // int xB,int yB, - 1<<cb->log2Size, // int nCS, - 1<<cb->log2Size, - 1<<cb->log2Size, // int nPbW,int nPbH, - &vec); - */ - - // --- create residual --- - - - - // TODO estimate rate for sending MV - - int IntraSplitFlag = 0; - int MaxTrafoDepth = ectx->get_sps().max_transform_hierarchy_depth_inter; - - mCodeResidual=true; - if (mCodeResidual) { - assert(false); - /* - cb->transform_tree = mTBSplitAlgo->analyze(ectx,ctxModel, ectx->imgdata->input, NULL, cb, - cb->x,cb->y,cb->x,cb->y, cb->log2Size,0, - 0, MaxTrafoDepth, IntraSplitFlag); - */ - - cb->inter.rqt_root_cbf = ! 
cb->transform_tree->isZeroBlock(); - - cb->distortion = cb->transform_tree->distortion; - cb->rate = cb->transform_tree->rate; - } - else { - const de265_image* input = ectx->imgdata->input; - de265_image* img = ectx->img; - int x0 = cb->x; - int y0 = cb->y; - int tbSize = 1<<cb->log2Size; - - cb->distortion = compute_distortion_ssd(input, img, x0,y0, cb->log2Size, 0); - cb->rate = 5; // fake (MV) - - cb->inter.rqt_root_cbf = 0; - } - - delete bits_h; - delete bits_v; - - return cb; -}
libde265-1.0.17.tar.gz/libde265/encoder/algo/pb-mv.h
Deleted
@@ -1,177 +0,0 @@
-/*
- * H.265 video codec.
- * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de>
- *
- * Authors: Dirk Farin <farin@struktur.de>
- *
- * This file is part of libde265.
- *
- * libde265 is free software: you can redistribute it and/or modify
- * it under the terms of the GNU Lesser General Public License as
- * published by the Free Software Foundation, either version 3 of
- * the License, or (at your option) any later version.
- *
- * libde265 is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public License
- * along with libde265. If not, see <http://www.gnu.org/licenses/>.
- */
-
-#ifndef PB_MV_H
-#define PB_MV_H
-
-#include "libde265/nal-parser.h"
-#include "libde265/decctx.h"
-#include "libde265/slice.h"
-#include "libde265/scan.h"
-#include "libde265/intrapred.h"
-#include "libde265/transform.h"
-#include "libde265/fallback-dct.h"
-#include "libde265/quality.h"
-#include "libde265/fallback.h"
-#include "libde265/encoder/configparam.h"
-
-#include "libde265/encoder/algo/algo.h"
-
-
-// ========== CB Intra/Inter decision ==========
-
-class Algo_TB_Split;
-
-
-class Algo_PB_MV : public Algo_PB
-{
- public:
-  Algo_PB_MV() : mTBSplitAlgo(NULL) { }
-  virtual ~Algo_PB_MV() { }
-
-  void setChildAlgo(Algo_TB_Split* algo) { mTBSplitAlgo = algo; }
-
- protected:
-  Algo_TB_Split* mTBSplitAlgo;
-};
-
-
-
-
-enum MVTestMode
-  {
-    MVTestMode_Zero,
-    MVTestMode_Random,
-    MVTestMode_Horizontal,
-    MVTestMode_Vertical
-  };
-
-class option_MVTestMode : public choice_option<enum MVTestMode>
-{
- public:
-  option_MVTestMode() {
-    add_choice("zero", MVTestMode_Zero);
-    add_choice("random", MVTestMode_Random);
-    add_choice("horiz", MVTestMode_Horizontal, true);
-    add_choice("verti", MVTestMode_Vertical);
-  }
-};
-
-
-class Algo_PB_MV_Test : public Algo_PB_MV
-{
- public:
-  Algo_PB_MV_Test() : mCodeResidual(false) { }
-
-  struct params
-  {
-    params() {
-      testMode.set_ID("PB-MV-TestMode");
-      range.set_ID ("PB-MV-Range");
-      range.set_default(4);
-    }
-
-    option_MVTestMode testMode;
-    option_int range;
-  };
-
-  void registerParams(config_parameters& config) {
-    config.add_option(&mParams.testMode);
-    config.add_option(&mParams.range);
-  }
-
-  void setParams(const params& p) { mParams=p; }
-
-  virtual enc_cb* analyze(encoder_context*,
-                          context_model_table&,
-                          enc_cb* cb,
-                          int PBidx, int x,int y,int w,int h);
-
- private:
-  params mParams;
-
-  bool mCodeResidual;
-};
-
-
-
-
-enum MVSearchAlgo
-  {
-    MVSearchAlgo_Zero,
-    MVSearchAlgo_Full,
-    MVSearchAlgo_Diamond,
-    MVSearchAlgo_PMVFast
-  };
-
-class option_MVSearchAlgo : public choice_option<enum MVSearchAlgo>
-{
- public:
-  option_MVSearchAlgo() {
-    add_choice("zero", MVSearchAlgo_Zero);
-    add_choice("full", MVSearchAlgo_Full, true);
-    add_choice("diamond",MVSearchAlgo_Diamond);
-    add_choice("pmvfast",MVSearchAlgo_PMVFast);
-  }
-};
-
-
-class Algo_PB_MV_Search : public Algo_PB_MV
-{
- public:
-  Algo_PB_MV_Search() : mCodeResidual(false) { }
-
-  struct params
-  {
-    params() {
-      mvSearchAlgo.set_ID("PB-MV-Search-Algo");
-      hrange.set_ID ("PB-MV-Search-HRange");
-      vrange.set_ID ("PB-MV-Search-VRange");
-      hrange.set_default(8);
-      vrange.set_default(8);
-    }
-
-    option_MVSearchAlgo mvSearchAlgo;
-    option_int hrange;
-    option_int vrange;
-  };
-
-  void registerParams(config_parameters& config) {
-    config.add_option(&mParams.mvSearchAlgo);
-    config.add_option(&mParams.hrange);
-    config.add_option(&mParams.vrange);
-  }
-
-  void setParams(const params& p) { mParams=p; }
-
-  virtual enc_cb* analyze(encoder_context*,
-                          context_model_table&,
-                          enc_cb* cb,
-                          int PBidx, int x,int y,int w,int h);
-
- private:
-  params mParams;
-
-  bool mCodeResidual;
-};
-
-#endif
libde265-1.0.17.tar.gz/libde265/encoder/algo/tb-intrapredmode.cc
Deleted
@@ -1,532 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: struktur AG, Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - - -#include "libde265/encoder/encoder-context.h" -#include "libde265/encoder/algo/tb-split.h" -#include "libde265/encoder/algo/coding-options.h" -#include "libde265/encoder/encoder-intrapred.h" -#include <assert.h> -#include <limits> -#include <math.h> -#include <algorithm> -#include <iostream> - - -float get_intra_pred_mode_bits(const enum IntraPredMode candidates3, - enum IntraPredMode intraMode, - enum IntraPredMode intraModeC, - context_model_table& context_models, - bool includeChroma) -{ - float rate; - int enc_bin; - - /**/ if (candidates0==intraMode) { rate = 1; enc_bin=1; } - else if (candidates1==intraMode) { rate = 2; enc_bin=1; } - else if (candidates2==intraMode) { rate = 2; enc_bin=1; } - else { rate = 5; enc_bin=0; } - - CABAC_encoder_estim estim; - estim.set_context_models(&context_models); - logtrace(LogSymbols,"$1 prev_intra_luma_pred_flag=%d\n",enc_bin); - estim.write_CABAC_bit(CONTEXT_MODEL_PREV_INTRA_LUMA_PRED_FLAG, enc_bin); - - // TODO: currently we make the chroma-pred-mode decision for each part even - // in NxN part mode. 
Since we always set this to the same value, it does not - // matter. However, we should only add the rate for it once (for blkIdx=0). - - if (includeChroma) { - assert(intraMode == intraModeC); - - logtrace(LogSymbols,"$1 intra_chroma_pred_mode=%d\n",0); - estim.write_CABAC_bit(CONTEXT_MODEL_INTRA_CHROMA_PRED_MODE,0); - } - rate += estim.getRDBits(); - - return rate; -} - - - - -float estim_TB_bitrate(const encoder_context* ectx, - const de265_image* input, - const enc_tb* tb, - enum TBBitrateEstimMethod method) -{ - int x0 = tb->x; - int y0 = tb->y; - int blkSize = 1 << tb->log2Size; - - float distortion; - - switch (method) - { - case TBBitrateEstim_SSD: - return SSD(input->get_image_plane_at_pos(0, x0,y0), - input->get_image_stride(0), - tb->intra_prediction0->get_buffer_u8(), - tb->intra_prediction0->getStride(), - blkSize, blkSize); - break; - - case TBBitrateEstim_SAD: - return SAD(input->get_image_plane_at_pos(0, x0,y0), - input->get_image_stride(0), - tb->intra_prediction0->get_buffer_u8(), - tb->intra_prediction0->getStride(), - blkSize, blkSize); - break; - - case TBBitrateEstim_SATD_DCT: - case TBBitrateEstim_SATD_Hadamard: - { - int16_t coeffs64*64; - int16_t diff64*64; - - // Usually, TBs are max. 32x32 big. However, it may be that this is still called - // for 64x64 blocks, because we are sometimes computing an intra pred mode for a - // whole CTB at once. 
- assert(blkSize <= 64); - - diff_blk(diff,blkSize, - input->get_image_plane_at_pos(0, x0,y0), input->get_image_stride(0), - tb->intra_prediction0->get_buffer_u8(), - tb->intra_prediction0->getStride(), - blkSize); - - void (*transform)(int16_t *coeffs, const int16_t *src, ptrdiff_t stride); - - - if (tb->log2Size == 6) { - // hack for 64x64 blocks: compute 4 times 32x32 blocks - - if (method == TBBitrateEstim_SATD_Hadamard) { - transform = ectx->acceleration.hadamard_transform_86-1-2; - - transform(coeffs, &diff0 , 64); - transform(coeffs+1*32*32, &diff32 , 64); - transform(coeffs+2*32*32, &diff32*64 , 64); - transform(coeffs+3*32*32, &diff32*64+32, 64); - } - else { - transform = ectx->acceleration.fwd_transform_86-1-2; - - transform(coeffs, &diff0 , 64); - transform(coeffs+1*32*32, &diff32 , 64); - transform(coeffs+2*32*32, &diff32*64 , 64); - transform(coeffs+3*32*32, &diff32*64+32, 64); - } - } - else { - assert(tb->log2Size-2 <= 3); - - if (method == TBBitrateEstim_SATD_Hadamard) { - ectx->acceleration.hadamard_transform_8tb->log2Size-2(coeffs, diff, &diffblkSize - &diff0); - } - else { - ectx->acceleration.fwd_transform_8tb->log2Size-2(coeffs, diff, &diffblkSize - &diff0); - } - } - - float distortion=0; - for (int i=0;i<blkSize*blkSize;i++) { - distortion += abs_value((int)coeffsi); - } - - return distortion; - } - break; - - /* - case TBBitrateEstim_AccurateBits: - assert(false); - return 0; - */ - } - - assert(false); - return 0; -} - - - -enc_tb* -Algo_TB_IntraPredMode_BruteForce::analyze(encoder_context* ectx, - context_model_table& ctxModel, - const de265_image* input, - enc_tb* tb, - int TrafoDepth, int MaxTrafoDepth, - int IntraSplitFlag) -{ - enter(); - - enc_cb* cb = tb->cb; - - bool selectIntraPredMode = false; - selectIntraPredMode |= (cb->PredMode==MODE_INTRA && cb->PartMode==PART_2Nx2N && TrafoDepth==0); - selectIntraPredMode |= (cb->PredMode==MODE_INTRA && cb->PartMode==PART_NxN && TrafoDepth==1); - - if (selectIntraPredMode) { - - 
CodingOptions<enc_tb> options(ectx, tb, ctxModel); - CodingOption<enc_tb> option35; - - for (int i=0;i<35;i++) { - bool computeIntraMode = isPredModeEnabled((enum IntraPredMode)i); - optioni = options.new_option(computeIntraMode); - } - - options.start(); - - - const seq_parameter_set* sps = &ectx->get_sps(); - enum IntraPredMode candidates3; - fillIntraPredModeCandidates(candidates, tb->x,tb->y, - tb->x > 0, tb->y > 0, ectx->ctbs, &ectx->get_sps()); - - - for (int i = 0; i<35; i++) { - if (!optioni) { - continue; - } - - - enum IntraPredMode intraMode = (IntraPredMode)i; - - - optioni.begin(); - - enc_tb* tb_option = optioni.get_node(); - - *tb_option->downPtr = tb_option; - - tb_option->intra_mode = intraMode; - - // set chroma mode to same mode is its luma mode - enum IntraPredMode intraModeC; - - if (cb->PartMode==PART_2Nx2N || ectx->get_sps().ChromaArrayType==CHROMA_444) { - intraModeC = intraMode; - } - else { - intraModeC = tb_option->parent->children0->intra_mode; - } - - tb_option->intra_mode_chroma = intraModeC; - - - descend(tb_option,"%d",intraMode); - tb_option = mTBSplitAlgo->analyze(ectx,optioni.get_context(),input,tb_option, - TrafoDepth, MaxTrafoDepth, IntraSplitFlag); - optioni.set_node(tb_option); - ascend(); - - float intraPredModeBits = get_intra_pred_mode_bits(candidates, - intraMode, - intraModeC, - optioni.get_context(), - tb_option->blkIdx == 0); - - tb_option->rate_withoutCbfChroma += intraPredModeBits; - tb_option->rate += intraPredModeBits; - - optioni.end(); - } - - - options.compute_rdo_costs(); - - enc_tb* bestTB = options.return_best_rdo_node(); - - return bestTB; - } - else { - descend(tb,"NOP"); // TODO: not parent - enc_tb* new_tb = mTBSplitAlgo->analyze(ectx, ctxModel, input, tb, - TrafoDepth, MaxTrafoDepth, IntraSplitFlag); - ascend(); - - return new_tb; - } - - assert(false); - return NULL; -} - - - -enc_tb* -Algo_TB_IntraPredMode_MinResidual::analyze(encoder_context* ectx, - context_model_table& ctxModel, - const de265_image* 
input, - enc_tb* tb, - int TrafoDepth, int MaxTrafoDepth, int IntraSplitFlag) -{ - enter(); - - enc_cb* cb = tb->cb; - - int x0 = tb->x; - int y0 = tb->y; - int xBase = cb->x; - int yBase = cb->y; - int log2TbSize = tb->log2Size; - - bool selectIntraPredMode = false; - selectIntraPredMode |= (cb->PredMode==MODE_INTRA && cb->PartMode==PART_2Nx2N && TrafoDepth==0); - selectIntraPredMode |= (cb->PredMode==MODE_INTRA && cb->PartMode==PART_NxN && TrafoDepth==1); - - //printf("tb-intrapredmode: %d %d %d\n", cb->PredMode, cb->PartMode, TrafoDepth); - - if (selectIntraPredMode) { - - *tb->downPtr = tb; - - enum IntraPredMode intraMode; - float minDistortion = std::numeric_limits<float>::max(); - - assert(nPredModesEnabled()>=1); - - if (nPredModesEnabled()==1) { - intraMode = getPredMode(0); - } - else { - tb->intra_prediction0 = std::make_shared<small_image_buffer>(log2TbSize, sizeof(uint8_t)); - - for (int idx=0;idx<nPredModesEnabled();idx++) { - enum IntraPredMode mode = getPredMode(idx); - - tb->intra_mode = mode; - decode_intra_prediction_from_tree(ectx->img, tb, ectx->ctbs, ectx->get_sps(), 0); - - float distortion; - distortion = estim_TB_bitrate(ectx, input, tb, - mParams.bitrateEstimMethod()); - - if (distortion<minDistortion) { - minDistortion = distortion; - intraMode = mode; - } - } - } - - //cb->intra.pred_modeblkIdx = intraMode; - //if (blkIdx==0) { cb->intra.chroma_mode = intraMode; } - - //intraMode=(IntraPredMode)17; // HACK - - //printf("INTRA MODE (%d;%d) = %d\n",x0,y0,intraMode); - - tb->intra_mode = intraMode; - - // set chroma mode to same mode is its luma mode - enum IntraPredMode intraModeC; - - if (cb->PartMode==PART_2Nx2N || ectx->get_sps().ChromaArrayType==CHROMA_444) { - intraModeC = intraMode; - } - else { - intraModeC = tb->parent->children0->intra_mode; - } - - tb->intra_mode_chroma = intraModeC; - - - // Note: cannot prepare intra prediction pixels here, because this has to - // be done at the lowest TB split level. 
- - - descend(tb,"%d",intraMode); - tb = mTBSplitAlgo->analyze(ectx,ctxModel,input,tb, - TrafoDepth, MaxTrafoDepth, IntraSplitFlag); - ascend(); - - debug_show_image(ectx->img, 0); - - - enum IntraPredMode candidates3; - fillIntraPredModeCandidates(candidates, x0,y0, - x0>0, y0>0, ectx->ctbs, &ectx->get_sps()); - - float intraPredModeBits = get_intra_pred_mode_bits(candidates, - intraMode, - intraModeC, - ctxModel, - tb->blkIdx == 0); - - tb->rate_withoutCbfChroma += intraPredModeBits; - tb->rate += intraPredModeBits; - - return tb; - } - else { - descend(tb,"NOP"); - enc_tb* nop_tb = mTBSplitAlgo->analyze(ectx, ctxModel, input, tb, - TrafoDepth, MaxTrafoDepth, - IntraSplitFlag); - ascend(); - return nop_tb; - } - - assert(false); - return NULL; -} - -static bool sortDistortions(std::pair<enum IntraPredMode,float> i, - std::pair<enum IntraPredMode,float> j) -{ - return i.second < j.second; -} - - -enc_tb* -Algo_TB_IntraPredMode_FastBrute::analyze(encoder_context* ectx, - context_model_table& ctxModel, - const de265_image* input, - enc_tb* tb, - int TrafoDepth, int MaxTrafoDepth, int IntraSplitFlag) -{ - enc_cb* cb = tb->cb; - - bool selectIntraPredMode = false; - selectIntraPredMode |= (cb->PredMode==MODE_INTRA && cb->PartMode==PART_2Nx2N && TrafoDepth==0); - selectIntraPredMode |= (cb->PredMode==MODE_INTRA && cb->PartMode==PART_NxN && TrafoDepth==1); - - if (selectIntraPredMode) { - float minCost = std::numeric_limits<float>::max(); - int minCostIdx=0; - float minCandCost; - - const seq_parameter_set* sps = &ectx->get_sps(); - enum IntraPredMode candidates3; - fillIntraPredModeCandidates(candidates, tb->x,tb->y, - tb->x>0, tb->y>0, ectx->ctbs, &ectx->get_sps()); - - - - std::vector< std::pair<enum IntraPredMode,float> > distortions; - - int log2TbSize = tb->log2Size; - tb->intra_prediction0 = std::make_shared<small_image_buffer>(log2TbSize, sizeof(uint8_t)); - - for (int idx=0;idx<35;idx++) - if (idx!=candidates0 && idx!=candidates1 && idx!=candidates2 && - 
isPredModeEnabled((enum IntraPredMode)idx)) - { - enum IntraPredMode mode = (enum IntraPredMode)idx; - - tb->intra_mode = mode; - decode_intra_prediction_from_tree(ectx->img, tb, ectx->ctbs, ectx->get_sps(), 0); - - float distortion; - distortion = estim_TB_bitrate(ectx, input, tb, - mParams.bitrateEstimMethod()); - - distortions.push_back( std::make_pair((enum IntraPredMode)idx, distortion) ); - } - - std::sort( distortions.begin(), distortions.end(), sortDistortions ); - - - for (int i=0;i<distortions.size();i++) - { - //printf("%d -> %f\n",i,distortionsi.second); - } - - int keepNBest=std::min((int)mParams.keepNBest, (int)distortions.size()); - distortions.resize(keepNBest); - distortions.push_back(std::make_pair((enum IntraPredMode)candidates0,0)); - distortions.push_back(std::make_pair((enum IntraPredMode)candidates1,0)); - distortions.push_back(std::make_pair((enum IntraPredMode)candidates2,0)); - - - CodingOptions<enc_tb> options(ectx, tb, ctxModel); - std::vector<CodingOption<enc_tb> > option; - - for (size_t i=0;i<distortions.size();i++) { - enum IntraPredMode intraMode = (IntraPredMode)distortionsi.first; - if (!isPredModeEnabled(intraMode)) { continue; } - - CodingOption<enc_tb> opt = options.new_option(isPredModeEnabled(intraMode)); - opt.get_node()->intra_mode = intraMode; - option.push_back(opt); - } - - options.start(); - - - for (int i=0;i<option.size();i++) { - - enc_tb* opt_tb = optioni.get_node(); - - *opt_tb->downPtr = opt_tb; - - // set chroma mode to same mode is its luma mode - enum IntraPredMode intraModeC; - if (cb->PartMode==PART_2Nx2N || ectx->get_sps().ChromaArrayType==CHROMA_444) { - intraModeC = opt_tb->intra_mode; - } - else { - intraModeC = opt_tb->parent->children0->intra_mode; - } - - opt_tb->intra_mode_chroma = intraModeC; - - optioni.begin(); - - descend(opt_tb,"%d",opt_tb->intra_mode); - opt_tb = mTBSplitAlgo->analyze(ectx,optioni.get_context(),input,opt_tb, - TrafoDepth, MaxTrafoDepth, IntraSplitFlag); - 
optioni.set_node(opt_tb); - ascend(); - - - float intraPredModeBits = get_intra_pred_mode_bits(candidates, - opt_tb->intra_mode, - intraModeC, - optioni.get_context(), - tb->blkIdx == 0); - - opt_tb->rate_withoutCbfChroma += intraPredModeBits; - opt_tb->rate += intraPredModeBits; - - optioni.end(); - } - - - options.compute_rdo_costs(); - - return options.return_best_rdo_node(); - } - else { - descend(tb,"NOP"); - enc_tb* new_tb = mTBSplitAlgo->analyze(ectx, ctxModel, input, tb, - TrafoDepth, MaxTrafoDepth, IntraSplitFlag); - ascend(); - return new_tb; - } - - assert(false); - return NULL; -}
libde265-1.0.17.tar.gz/libde265/encoder/algo/tb-intrapredmode.h
Deleted
@@ -1,297 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#ifndef TB_INTRAPREDMODE_H -#define TB_INTRAPREDMODE_H - -#include "libde265/nal-parser.h" -#include "libde265/decctx.h" -#include "libde265/encoder/encoder-types.h" -#include "libde265/encoder/algo/algo.h" -#include "libde265/slice.h" -#include "libde265/scan.h" -#include "libde265/intrapred.h" -#include "libde265/transform.h" -#include "libde265/fallback-dct.h" -#include "libde265/quality.h" -#include "libde265/fallback.h" -#include "libde265/encoder/configparam.h" - - -// ========== TB intra prediction mode ========== - -enum ALGO_TB_IntraPredMode { - ALGO_TB_IntraPredMode_BruteForce, - ALGO_TB_IntraPredMode_FastBrute, - ALGO_TB_IntraPredMode_MinResidual -}; - -class option_ALGO_TB_IntraPredMode : public choice_option<enum ALGO_TB_IntraPredMode> -{ - public: - option_ALGO_TB_IntraPredMode() { - add_choice("min-residual",ALGO_TB_IntraPredMode_MinResidual); - add_choice("brute-force" ,ALGO_TB_IntraPredMode_BruteForce); - add_choice("fast-brute" ,ALGO_TB_IntraPredMode_FastBrute, true); - } -}; - - -enum TBBitrateEstimMethod { - //TBBitrateEstim_AccurateBits, - TBBitrateEstim_SSD, - TBBitrateEstim_SAD, 
- TBBitrateEstim_SATD_DCT, - TBBitrateEstim_SATD_Hadamard -}; - -class option_TBBitrateEstimMethod : public choice_option<enum TBBitrateEstimMethod> -{ - public: - option_TBBitrateEstimMethod() { - add_choice("ssd",TBBitrateEstim_SSD); - add_choice("sad",TBBitrateEstim_SAD); - add_choice("satd-dct",TBBitrateEstim_SATD_DCT); - add_choice("satd",TBBitrateEstim_SATD_Hadamard, true); - } -}; - -class Algo_TB_Split; - - -/** Base class for intra prediction-mode algorithms. - Selects one of the 35 prediction modes. - */ -class Algo_TB_IntraPredMode : public Algo -{ - public: - Algo_TB_IntraPredMode() : mTBSplitAlgo(NULL) { } - virtual ~Algo_TB_IntraPredMode() { } - - virtual enc_tb* analyze(encoder_context*, - context_model_table&, - const de265_image* input, - enc_tb* tb, - int TrafoDepth, int MaxTrafoDepth, int IntraSplitFlag) = 0; - - void setChildAlgo(Algo_TB_Split* algo) { mTBSplitAlgo = algo; } - - const char* name() const { return "tb-intrapredmode"; } - - protected: - Algo_TB_Split* mTBSplitAlgo; -}; - - -enum ALGO_TB_IntraPredMode_Subset { - ALGO_TB_IntraPredMode_Subset_All, - ALGO_TB_IntraPredMode_Subset_HVPlus, - ALGO_TB_IntraPredMode_Subset_DC, - ALGO_TB_IntraPredMode_Subset_Planar -}; - -class option_ALGO_TB_IntraPredMode_Subset : public choice_option<enum ALGO_TB_IntraPredMode_Subset> -{ - public: - option_ALGO_TB_IntraPredMode_Subset() { - add_choice("all" ,ALGO_TB_IntraPredMode_Subset_All, true); - add_choice("HV+" ,ALGO_TB_IntraPredMode_Subset_HVPlus); - add_choice("DC" ,ALGO_TB_IntraPredMode_Subset_DC); - add_choice("planar",ALGO_TB_IntraPredMode_Subset_Planar); - } -}; - - -/** Utility class for intra prediction-mode algorithm that uses a subset of modes. 
- */ -class Algo_TB_IntraPredMode_ModeSubset : public Algo_TB_IntraPredMode -{ - public: - Algo_TB_IntraPredMode_ModeSubset() { - enableAllIntraPredModes(); - } - - void enableAllIntraPredModes() { - for (int i=0;i<35;i++) { - mPredMode_enabledi = true; - mPredModei = (enum IntraPredMode)i; - } - - mNumPredModesEnabled = 35; - } - - void disableAllIntraPredModes() { - for (int i=0;i<35;i++) { - mPredMode_enabledi = false; - } - - mNumPredModesEnabled = 0; - } - - void enableIntraPredMode(enum IntraPredMode mode) { - if (!mPredMode_enabledmode) { - mPredModemNumPredModesEnabled = mode; - mPredMode_enabledmode = true; - mNumPredModesEnabled++; - } - } - - // TODO: method to disable modes - - void enableIntraPredModeSubset(enum ALGO_TB_IntraPredMode_Subset subset) { - switch (subset) - { - case ALGO_TB_IntraPredMode_Subset_All: // activate all is the default - for (int i=0;i<35;i++) { enableIntraPredMode((enum IntraPredMode)i); } - break; - case ALGO_TB_IntraPredMode_Subset_DC: - disableAllIntraPredModes(); - enableIntraPredMode(INTRA_DC); - break; - case ALGO_TB_IntraPredMode_Subset_Planar: - disableAllIntraPredModes(); - enableIntraPredMode(INTRA_PLANAR); - break; - case ALGO_TB_IntraPredMode_Subset_HVPlus: - disableAllIntraPredModes(); - enableIntraPredMode(INTRA_DC); - enableIntraPredMode(INTRA_PLANAR); - enableIntraPredMode(INTRA_ANGULAR_10); - enableIntraPredMode(INTRA_ANGULAR_26); - break; - } - } - - - enum IntraPredMode getPredMode(int idx) const { - assert(idx<mNumPredModesEnabled); - return mPredModeidx; - } - - int nPredModesEnabled() const { - return mNumPredModesEnabled; - } - - bool isPredModeEnabled(enum IntraPredMode mode) { - return mPredMode_enabledmode; - } - - private: - IntraPredMode mPredMode35; - bool mPredMode_enabled35; - int mNumPredModesEnabled; -}; - - -/** Algorithm that brute-forces through all intra prediction mode. 
- */ -class Algo_TB_IntraPredMode_BruteForce : public Algo_TB_IntraPredMode_ModeSubset -{ - public: - - virtual enc_tb* analyze(encoder_context*, - context_model_table&, - const de265_image* input, - enc_tb* tb, - int TrafoDepth, int MaxTrafoDepth, int IntraSplitFlag); - - - const char* name() const { return "tb-intrapredmode_BruteForce"; } -}; - - -/** Algorithm that makes a quick pre-selection of modes and then brute-forces through them. - */ -class Algo_TB_IntraPredMode_FastBrute : public Algo_TB_IntraPredMode_ModeSubset -{ - public: - - struct params - { - params() { - keepNBest.set_ID("IntraPredMode-FastBrute-keepNBest"); - keepNBest.set_range(0,32); - keepNBest.set_default(5); - - bitrateEstimMethod.set_ID("IntraPredMode-FastBrute-estimator"); - } - - option_TBBitrateEstimMethod bitrateEstimMethod; - option_int keepNBest; - }; - - void registerParams(config_parameters& config) { - config.add_option(&mParams.keepNBest); - config.add_option(&mParams.bitrateEstimMethod); - } - - void setParams(const params& p) { mParams=p; } - - - virtual enc_tb* analyze(encoder_context*, - context_model_table&, - const de265_image* input, - enc_tb* tb, - int TrafoDepth, int MaxTrafoDepth, int IntraSplitFlag); - - - const char* name() const { return "tb-intrapredmode_FastBrute"; } - - private: - params mParams; -}; - - -/** Algorithm that selects the intra prediction mode on minimum residual only. 
- */ -class Algo_TB_IntraPredMode_MinResidual : public Algo_TB_IntraPredMode_ModeSubset -{ - public: - - struct params - { - params() { - bitrateEstimMethod.set_ID("IntraPredMode-MinResidual-estimator"); - } - - option_TBBitrateEstimMethod bitrateEstimMethod; - }; - - void setParams(const params& p) { mParams=p; } - - void registerParams(config_parameters& config) { - config.add_option(&mParams.bitrateEstimMethod); - } - - enc_tb* analyze(encoder_context*, - context_model_table&, - const de265_image* input, - enc_tb* tb, - int TrafoDepth, int MaxTrafoDepth, int IntraSplitFlag); - - const char* name() const { return "tb-intrapredmode_MinResidual"; } - - private: - params mParams; -}; - -#endif
libde265-1.0.17.tar.gz/libde265/encoder/algo/tb-rateestim.cc
Deleted
@@ -1,46 +0,0 @@
-/*
- * H.265 video codec.
- * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de>
- *
- * Authors: struktur AG, Dirk Farin <farin@struktur.de>
- *
- * This file is part of libde265.
- *
- * libde265 is free software: you can redistribute it and/or modify
- * it under the terms of the GNU Lesser General Public License as
- * published by the Free Software Foundation, either version 3 of
- * the License, or (at your option) any later version.
- *
- * libde265 is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public License
- * along with libde265. If not, see <http://www.gnu.org/licenses/>.
- */
-
-
-#include "libde265/encoder/algo/tb-rateestim.h"
-#include "libde265/encoder/encoder-syntax.h"
-#include <assert.h>
-#include <iostream>
-
-
-float Algo_TB_RateEstimation_Exact::encode_transform_unit(encoder_context* ectx,
-                                                          context_model_table& ctxModel,
-                                                          const enc_tb* tb, const enc_cb* cb,
-                                                          int x0,int y0, int xBase,int yBase,
-                                                          int log2TrafoSize, int trafoDepth,
-                                                          int blkIdx)
-{
-  CABAC_encoder_estim estim;
-  estim.set_context_models(&ctxModel);
-
-  leaf(cb, NULL);
-
-  ::encode_transform_unit(ectx, &estim, tb,cb, x0,y0, xBase,yBase,
-                          log2TrafoSize, trafoDepth, blkIdx);
-
-  return estim.getRDBits();
-}
libde265-1.0.17.tar.gz/libde265/encoder/algo/tb-rateestim.h
Deleted
@@ -1,101 +0,0 @@
-/*
- * H.265 video codec.
- * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de>
- *
- * Authors: Dirk Farin <farin@struktur.de>
- *
- * This file is part of libde265.
- *
- * libde265 is free software: you can redistribute it and/or modify
- * it under the terms of the GNU Lesser General Public License as
- * published by the Free Software Foundation, either version 3 of
- * the License, or (at your option) any later version.
- *
- * libde265 is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public License
- * along with libde265. If not, see <http://www.gnu.org/licenses/>.
- */
-
-#ifndef TB_RATEESTIM_H
-#define TB_RATEESTIM_H
-
-#include "libde265/nal-parser.h"
-#include "libde265/decctx.h"
-#include "libde265/encoder/encoder-types.h"
-#include "libde265/encoder/algo/algo.h"
-#include "libde265/slice.h"
-#include "libde265/scan.h"
-#include "libde265/intrapred.h"
-#include "libde265/transform.h"
-#include "libde265/fallback-dct.h"
-#include "libde265/quality.h"
-#include "libde265/fallback.h"
-#include "libde265/encoder/configparam.h"
-
-
-enum ALGO_TB_RateEstimation {
-  ALGO_TB_RateEstimation_None,
-  ALGO_TB_RateEstimation_Exact
-};
-
-class option_ALGO_TB_RateEstimation : public choice_option<enum ALGO_TB_RateEstimation>
-{
- public:
-  option_ALGO_TB_RateEstimation() {
-    add_choice("none" ,ALGO_TB_RateEstimation_None);
-    add_choice("exact",ALGO_TB_RateEstimation_Exact, true);
-  }
-};
-
-
-
-class Algo_TB_RateEstimation : public Algo
-{
- public:
-  virtual ~Algo_TB_RateEstimation() { }
-
-  virtual float encode_transform_unit(encoder_context* ectx,
-                                      context_model_table& ctxModel,
-                                      const enc_tb* tb, const enc_cb* cb,
-                                      int x0,int y0, int xBase,int yBase,
-                                      int log2TrafoSize, int trafoDepth, int blkIdx) = 0;
-
-  virtual const char* name() const { return "tb-rateestimation"; }
-};
-
-
-class Algo_TB_RateEstimation_None : public Algo_TB_RateEstimation
-{
- public:
-  virtual float encode_transform_unit(encoder_context* ectx,
-                                      context_model_table& ctxModel,
-                                      const enc_tb* tb, const enc_cb* cb,
-                                      int x0,int y0, int xBase,int yBase,
-                                      int log2TrafoSize, int trafoDepth, int blkIdx)
-  {
-    leaf(cb, NULL);
-    return 0.0f;
-  }
-
-  virtual const char* name() const { return "tb-rateestimation-none"; }
-};
-
-
-class Algo_TB_RateEstimation_Exact : public Algo_TB_RateEstimation
-{
- public:
-  virtual float encode_transform_unit(encoder_context* ectx,
-                                      context_model_table& ctxModel,
-                                      const enc_tb* tb, const enc_cb* cb,
-                                      int x0,int y0, int xBase,int yBase,
-                                      int log2TrafoSize, int trafoDepth, int blkIdx);
-
-  virtual const char* name() const { return "tb-rateestimation-exact"; }
-};
-
-
-#endif
libde265-1.0.17.tar.gz/libde265/encoder/algo/tb-split.cc
Deleted
@@ -1,378 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: struktur AG, Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - - -#include "libde265/encoder/encoder-core.h" -#include "libde265/encoder/encoder-context.h" -#include "libde265/encoder/encoder-syntax.h" -#include "libde265/encoder/encoder-intrapred.h" -#include "libde265/encoder/algo/coding-options.h" -#include <assert.h> -#include <limits> -#include <math.h> -#include <iostream> - - -struct Logging_TB_Split : public Logging -{ - int skipTBSplit, noskipTBSplit; - int zeroBlockCorrelation[6][2][5]; - - const char* name() const { return "tb-split"; } - - void print(const encoder_context* ectx, const char* filename) - { - for (int tb=3;tb<=5;tb++) { - for (int z=0;z<=1;z++) { - float total = 0; - - for (int c=0;c<5;c++) - total += zeroBlockCorrelation[tb][z][c]; - - for (int c=0;c<5;c++) { - printf("%d %d %d : %d %5.2f\n", tb,z,c, - zeroBlockCorrelation[tb][z][c], - total==0 ? 
0 : zeroBlockCorrelation[tb][z][c]/total*100); - } - } - } - - - for (int z=0;z<2;z++) { - printf("\n"); - for (int tb=3;tb<=5;tb++) { - float total = 0; - - for (int c=0;c<5;c++) - total += zeroBlockCorrelation[tb][z][c]; - - printf("%dx%d ",1<<tb,1<<tb); - - for (int c=0;c<5;c++) { - printf("%5.2f ", total==0 ? 0 : zeroBlockCorrelation[tb][z][c]/total*100); - } - printf("\n"); - } - } - } -} logging_tb_split; - - -template <class pixel_t> -void diff_blk(int16_t* out,int out_stride, - const pixel_t* a_ptr, int a_stride, - const pixel_t* b_ptr, int b_stride, - int blkSize) -{ - for (int by=0;by<blkSize;by++) - for (int bx=0;bx<blkSize;bx++) - { - out[by*out_stride+bx] = a_ptr[by*a_stride+bx] - b_ptr[by*b_stride+bx]; - } -} - - -template <class pixel_t> -void compute_residual_channel(encoder_context* ectx, enc_tb* tb, const de265_image* input, - int cIdx, int x,int y,int log2Size) -{ - int blkSize = (1<<log2Size); - - enum IntraPredMode mode; - - if (cIdx==0) { - mode = tb->intra_mode; - } - else { - mode = tb->intra_mode_chroma; - } - - // decode intra prediction - - tb->intra_prediction[cIdx] = std::make_shared<small_image_buffer>(log2Size, sizeof(pixel_t)); - - decode_intra_prediction_from_tree(ectx->img, tb, ectx->ctbs, ectx->get_sps(), cIdx); - - // create residual buffer and compute differences - - tb->residual[cIdx] = std::make_shared<small_image_buffer>(log2Size, sizeof(int16_t)); - - diff_blk<pixel_t>(tb->residual[cIdx]->get_buffer_s16(), blkSize, - input->get_image_plane_at_pos(cIdx,x,y), - input->get_image_stride(cIdx), - tb->intra_prediction[cIdx]->get_buffer<pixel_t>(), blkSize, - blkSize); -} - - -template <class pixel_t> -void compute_residual(encoder_context* ectx, enc_tb* tb, const de265_image* input, int blkIdx) -{ - int tbSize = 1<<tb->log2Size; - - /* - tb->writeSurroundingMetadata(ectx, ectx->img, - enc_node::METADATA_RECONSTRUCTION_BORDERS, - tb->get_rectangle_with_width(1<<(tb->log2Size+1))); - */ - - compute_residual_channel<pixel_t>(ectx,tb,input, 
0,tb->x,tb->y,tb->log2Size); - - if (ectx->get_sps().chroma_format_idc == CHROMA_444) { - compute_residual_channel<pixel_t>(ectx,tb,input, 1,tb->x,tb->y,tb->log2Size); - compute_residual_channel<pixel_t>(ectx,tb,input, 2,tb->x,tb->y,tb->log2Size); - } - else if (tb->log2Size > 2) { - int x = tb->x / input->SubWidthC; - int y = tb->y / input->SubHeightC; - int log2BlkSize = tb->log2Size -1; // TODO chroma 422/444 - - compute_residual_channel<pixel_t>(ectx,tb,input, 1,x,y,log2BlkSize); - compute_residual_channel<pixel_t>(ectx,tb,input, 2,x,y,log2BlkSize); - } - else if (blkIdx==3) { - int x = tb->parent->x / input->SubWidthC; - int y = tb->parent->y / input->SubHeightC; - int log2BlkSize = tb->log2Size; - - compute_residual_channel<pixel_t>(ectx,tb,input, 1,x,y,log2BlkSize); - compute_residual_channel<pixel_t>(ectx,tb,input, 2,x,y,log2BlkSize); - } -} - - -enc_tb* -Algo_TB_Split_BruteForce::analyze(encoder_context* ectx, - context_model_table& ctxModel, - const de265_image* input, - enc_tb* tb, - int TrafoDepth, int MaxTrafoDepth, int IntraSplitFlag) -{ - enter(); - - enc_cb* cb = tb->cb; - - int log2TbSize = tb->log2Size; - - bool test_split = (log2TbSize > 2 && - TrafoDepth < MaxTrafoDepth && - log2TbSize > ectx->get_sps().Log2MinTrafoSize); - - bool test_no_split = true; - if (IntraSplitFlag && TrafoDepth==0) test_no_split=false; // we have to split - if (log2TbSize > ectx->get_sps().Log2MaxTrafoSize) test_no_split=false; - - assert(test_no_split || test_split); - - CodingOptions<enc_tb> options(ectx, tb, ctxModel); - - CodingOption<enc_tb> option_no_split = options.new_option(test_no_split); - CodingOption<enc_tb> option_split = options.new_option(test_split); - - //if (test_no_split) test_split = false; - //if (test_split) test_no_split = false; // HACK for debugging - - options.start(); - - - enc_tb* tb_no_split = NULL; - enc_tb* tb_split = NULL; - - if (test_no_split) { - descend(tb,"no split"); - option_no_split.begin(); - tb_no_split = 
option_no_split.get_node(); - //tb_no_split = new enc_tb(*tb); - *tb->downPtr = tb_no_split; - - if (cb->PredMode == MODE_INTRA) { - compute_residual<uint8_t>(ectx, tb_no_split, input, tb->blkIdx); - } - - tb_no_split = mAlgo_TB_Residual->analyze(ectx, option_no_split.get_context(), - input, tb_no_split, TrafoDepth,MaxTrafoDepth,IntraSplitFlag); - ascend(tb_no_split,"bits:%f/%f",tb_no_split->rate,tb_no_split->rate_withoutCbfChroma); - - - option_no_split.set_node(tb_no_split); - option_no_split.end(); - - - // --- some statistics --- - - if (log2TbSize <= mParams.zeroBlockPrune()) { - bool zeroBlock = tb_no_split->isZeroBlock(); - - if (zeroBlock) { - test_split = false; - logging_tb_split.skipTBSplit++; - } - else - logging_tb_split.noskipTBSplit++; - } - } - - - if (test_split) { - option_split.begin(); - - tb_split = option_split.get_node(); - *tb->downPtr = tb_split; - - //descend(tb,"split"); - tb_split = encode_transform_tree_split(ectx, option_split.get_context(), input, tb_split, cb, - TrafoDepth, MaxTrafoDepth, IntraSplitFlag); - option_split.set_node(tb_split); - //ascend("bits:%f/%f",tb_split->rate,tb_split->rate_withoutCbfChroma); - - option_split.end(); - } - - - // --- do some statistics that will help us develop a fast algorithm --- - - if (test_split && test_no_split) { - bool zero_block = tb_no_split->isZeroBlock(); - - int nChildZero = 0; - for (int i=0;i<4;i++) { - if (tb_split->children[i]->isZeroBlock()) nChildZero++; - } - - logging_tb_split.zeroBlockCorrelation[log2TbSize][zero_block ? 
0 : 1][nChildZero]++; - } - - - //bool split = (rd_cost_split < rd_cost_no_split); - - //if (test_split) split=true; /// DEBUGGING HACK - - options.compute_rdo_costs(); - - enc_tb* bestTB = options.return_best_rdo_node(); - return bestTB; -} - - - -enc_tb* Algo_TB_Split::encode_transform_tree_split(encoder_context* ectx, - context_model_table& ctxModel, - const de265_image* input, - enc_tb* tb, - enc_cb* cb, - int TrafoDepth, int MaxTrafoDepth, - int IntraSplitFlag) -{ - const de265_image* img = ectx->img; - - int log2TbSize = tb->log2Size; - int x0 = tb->x; - int y0 = tb->y; - - context_model ctxModelCbfChroma[4]; - for (int i=0;i<4;i++) { - ctxModelCbfChroma[i] = ctxModel[CONTEXT_MODEL_CBF_CHROMA+i]; - } - - tb->split_transform_flag = true; - tb->rate_withoutCbfChroma = 0; - tb->distortion = 0; - - // --- encode all child nodes --- - - for (int i=0;i<4;i++) { - tb->children[i] = NULL; - } - - for (int i=0;i<4;i++) { - - // generate child node and propagate values down - - int dx = (i&1) << (log2TbSize-1); - int dy = (i>>1) << (log2TbSize-1); - - enc_tb* child_tb = new enc_tb(x0+dx,y0+dy, log2TbSize-1,cb); - - child_tb->intra_mode = tb->intra_mode; - child_tb->intra_mode_chroma = tb->intra_mode_chroma; - child_tb->TrafoDepth = tb->TrafoDepth + 1; - child_tb->parent = tb; - child_tb->blkIdx = i; - child_tb->downPtr = &tb->children[i]; - - descend(tb,"split %d/4",i+1); - - if (cb->PredMode == MODE_INTRA) { - //descend(tb,"intra"); - tb->children[i] = mAlgo_TB_IntraPredMode->analyze(ectx, ctxModel, input, - child_tb, - TrafoDepth+1, MaxTrafoDepth, - IntraSplitFlag); - //ascend("bits:%f",tb->rate); - } - else { - //descend(tb,"inter"); - tb->children[i] = this->analyze(ectx, ctxModel, input, - child_tb, TrafoDepth+1, MaxTrafoDepth, IntraSplitFlag); - //ascend(); - } - - ascend(); - - tb->distortion += tb->children[i]->distortion; - tb->rate_withoutCbfChroma += tb->children[i]->rate_withoutCbfChroma; - } - - tb->set_cbf_flags_from_children(); - - - // --- add rate for 
this TB level --- - - 
CABAC_encoder_estim estim; - estim.set_context_models(&ctxModel); - - - - - const seq_parameter_set* sps = &ectx->img->get_sps(); - - if (log2TbSize <= sps->Log2MaxTrafoSize && - log2TbSize > sps->Log2MinTrafoSize && - TrafoDepth < MaxTrafoDepth && - !(IntraSplitFlag && TrafoDepth==0)) - { - encode_split_transform_flag(ectx, &estim, log2TbSize, 1); - tb->rate_withoutCbfChroma += estim.getRDBits(); - estim.reset(); - } - - // restore chroma CBF context models - - for (int i=0;i<4;i++) { - ctxModel[CONTEXT_MODEL_CBF_CHROMA+i] = ctxModelCbfChroma[i]; - } - - tb->rate = (tb->rate_withoutCbfChroma + - recursive_cbfChroma_rate(&estim,tb, log2TbSize, TrafoDepth)); - - return tb; -}
libde265-1.0.17.tar.gz/libde265/encoder/algo/tb-split.h
Deleted
@@ -1,126 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#ifndef TB_SPLIT_H -#define TB_SPLIT_H - -#include "libde265/nal-parser.h" -#include "libde265/decctx.h" -#include "libde265/encoder/encoder-types.h" -#include "libde265/encoder/algo/algo.h" -#include "libde265/slice.h" -#include "libde265/scan.h" -#include "libde265/intrapred.h" -#include "libde265/transform.h" -#include "libde265/fallback-dct.h" -#include "libde265/quality.h" -#include "libde265/fallback.h" -#include "libde265/encoder/configparam.h" - -#include "libde265/encoder/algo/tb-intrapredmode.h" -#include "libde265/encoder/algo/tb-rateestim.h" -#include "libde265/encoder/algo/tb-transform.h" - - -// ========== TB split decision ========== - -class Algo_TB_Split : public Algo -{ - public: - Algo_TB_Split() : mAlgo_TB_IntraPredMode(NULL) { } - virtual ~Algo_TB_Split() { } - - virtual enc_tb* analyze(encoder_context*, - context_model_table&, - const de265_image* input, - enc_tb* tb, - int TrafoDepth, int MaxTrafoDepth, int IntraSplitFlag) = 0; - - void setAlgo_TB_IntraPredMode(Algo_TB_IntraPredMode* algo) { mAlgo_TB_IntraPredMode=algo; } - void setAlgo_TB_Residual(Algo_TB_Residual* algo) { 
mAlgo_TB_Residual=algo; } - - protected: - enc_tb* encode_transform_tree_split(encoder_context* ectx, - context_model_table& ctxModel, - const de265_image* input, - enc_tb* tb, - enc_cb* cb, - int TrafoDepth, int MaxTrafoDepth, int IntraSplitFlag); - - Algo_TB_IntraPredMode* mAlgo_TB_IntraPredMode; - Algo_TB_Residual* mAlgo_TB_Residual; -}; - - - -enum ALGO_TB_Split_BruteForce_ZeroBlockPrune { - // numeric value specifies the maximum size for log2Tb for which the pruning is applied - ALGO_TB_BruteForce_ZeroBlockPrune_off = 0, - ALGO_TB_BruteForce_ZeroBlockPrune_8x8 = 3, - ALGO_TB_BruteForce_ZeroBlockPrune_8x8_16x16 = 4, - ALGO_TB_BruteForce_ZeroBlockPrune_all = 5 -}; - -class option_ALGO_TB_Split_BruteForce_ZeroBlockPrune -: public choice_option<enum ALGO_TB_Split_BruteForce_ZeroBlockPrune> -{ - public: - option_ALGO_TB_Split_BruteForce_ZeroBlockPrune() { - add_choice("off" ,ALGO_TB_BruteForce_ZeroBlockPrune_off); - add_choice("8x8" ,ALGO_TB_BruteForce_ZeroBlockPrune_8x8); - add_choice("8-16" ,ALGO_TB_BruteForce_ZeroBlockPrune_8x8_16x16); - add_choice("all" ,ALGO_TB_BruteForce_ZeroBlockPrune_all, true); - } -}; - - -class Algo_TB_Split_BruteForce : public Algo_TB_Split -{ - public: - struct params - { - params() { - zeroBlockPrune.set_ID("TB-Split-BruteForce-ZeroBlockPrune"); - } - - option_ALGO_TB_Split_BruteForce_ZeroBlockPrune zeroBlockPrune; - }; - - void setParams(const params& p) { mParams=p; } - - void registerParams(config_parameters& config) { - config.add_option(&mParams.zeroBlockPrune); - } - - virtual enc_tb* analyze(encoder_context*, - context_model_table&, - const de265_image* input, - enc_tb* tb, - int TrafoDepth, int MaxTrafoDepth, int IntraSplitFlag); - - const char* name() const { return "tb-split-bruteforce"; } - - private: - params mParams; -}; - -#endif
libde265-1.0.17.tar.gz/libde265/encoder/algo/tb-transform.cc
Deleted
@@ -1,254 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: struktur AG, Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - - -#include "libde265/encoder/algo/tb-transform.h" -#include "libde265/encoder/encoder-core.h" -#include "libde265/encoder/encoder-context.h" -#include "libde265/encoder/encoder-syntax.h" -#include <assert.h> -#include <limits> -#include <math.h> -#include <iostream> - - -// DEPRECATED IN THIS FILE -void diff_blk(int16_t* out,int out_stride, - const uint8_t* a_ptr, int a_stride, - const uint8_t* b_ptr, int b_stride, - int blkSize) -{ - for (int by=0;by<blkSize;by++) - for (int bx=0;bx<blkSize;bx++) - { - out[by*out_stride+bx] = a_ptr[by*a_stride+bx] - b_ptr[by*b_stride+bx]; - } -} - - -static bool has_nonzero_value(const int16_t* data, int n) -{ - for (int i=0;i<n;i++) - if (data[i]) return true; - - return false; -} - - -void compute_transform_coeffs(encoder_context* ectx, - enc_tb* tb, - const de265_image* input, // TODO: probably pass pixels/stride directly - //int16_t* residual, int stride, - int x0,int y0, // luma position - int log2TbSize, // chroma adapted - const enc_cb* cb, - int cIdx) -{ - int xC = x0; - int yC = y0; - int tbSize = 1<<log2TbSize; - if (cIdx>0) { xC>>=1; yC>>=1; } - - 
enum PredMode predMode = cb->PredMode; - - int16_t blk[32*32]; // residual - int16_t* residual; - - - //printf("transform-coeffs %d;%d size:%d cIdx:%d\n", tb->x,tb->y,1<<tb->log2Size,cIdx); - - // --- do intra prediction --- - - if (predMode==MODE_INTRA) { - residual = tb->residual[cIdx]->get_buffer_s16(); - - //printf("intra residual: %p stride: %d\n",residual, tb->residual[cIdx]->get_stride()); - } - else { - // --- subtract prediction from input image --- - - /* TMP REMOVE: ectx->prediction does not exist anymore - - uint8_t* pred = ectx->prediction->get_image_plane(cIdx); - int stride = ectx->prediction->get_image_stride(cIdx); - - //printBlk("input",input->get_image_plane_at_pos(cIdx,xC,yC), tbSize, input->get_image_stride(cIdx)); - //printBlk("prediction", pred,tbSize, stride); - - diff_blk(blk,tbSize, - input->get_image_plane_at_pos(cIdx,xC,yC), input->get_image_stride(cIdx), - &pred[yC*stride+xC],stride, tbSize); - - residual=blk; - */ - - //printBlk("residual", blk,tbSize,tbSize); - } - - - - - - // --- forward transform --- - - tb->alloc_coeff_memory(cIdx, tbSize); - - - // transformation mode (DST or DCT) - - int trType; - if (cIdx==0 && log2TbSize==2 && predMode==MODE_INTRA) trType=1; // TODO: inter mode - else trType=0; - - - // do forward transform - - fwd_transform(&ectx->acceleration, tb->coeff[cIdx], tbSize, log2TbSize, trType, residual, tbSize); - - - // --- quantization --- - - quant_coefficients(tb->coeff[cIdx], tb->coeff[cIdx], log2TbSize, cb->qp, true); - - - // set CBF to 0 if there are no non-zero coefficients - - tb->cbf[cIdx] = has_nonzero_value(tb->coeff[cIdx], 1<<(log2TbSize<<1)); -} - - -enc_tb* Algo_TB_Transform::analyze(encoder_context* ectx, - context_model_table& ctxModel, - const de265_image* input, - enc_tb* tb, - int trafoDepth, int MaxTrafoDepth, - int IntraSplitFlag) -{ - enter(); - - const enc_cb* cb = tb->cb; - *tb->downPtr = tb; // TODO: should be obsolet - - de265_image* img = ectx->img; - - int stride = 
ectx->img->get_image_stride(0); - - 
uint8_t* luma_plane = ectx->img->get_image_plane(0); - uint8_t* cb_plane = ectx->img->get_image_plane(1); - uint8_t* cr_plane = ectx->img->get_image_plane(2); - - // --- compute transform coefficients --- - - int x0 = tb->x; - int y0 = tb->y; - int xBase = cb->x; - int yBase = cb->y; - int log2TbSize = tb->log2Size; - - // luma block - - compute_transform_coeffs(ectx, tb, input, x0,y0, log2TbSize, cb, 0 /* Y */); - - - // chroma blocks - - if (ectx->get_sps().chroma_format_idc == CHROMA_444) { - compute_transform_coeffs(ectx, tb, input, x0,y0, log2TbSize, cb, 1 /* Cb */); - compute_transform_coeffs(ectx, tb, input, x0,y0, log2TbSize, cb, 2 /* Cr */); - } - else if (log2TbSize > 2) { - // if TB is > 4x4, do chroma transform of half size - compute_transform_coeffs(ectx, tb, input, x0,y0, log2TbSize-1, cb, 1 /* Cb */); - compute_transform_coeffs(ectx, tb, input, x0,y0, log2TbSize-1, cb, 2 /* Cr */); - } - else if (tb->blkIdx==3) { - // if TB size is 4x4, do chroma transform for last sub-block - compute_transform_coeffs(ectx, tb, input, xBase,yBase, log2TbSize, cb, 1 /* Cb */); - compute_transform_coeffs(ectx, tb, input, xBase,yBase, log2TbSize, cb, 2 /* Cr */); - } - - - // --- reconstruction --- - - /* We could compute the reconstruction lazy on first access. However, we currently - use it right away for computing the distortion. 
- */ - tb->reconstruct(ectx, ectx->img); - - - // measure rate - - CABAC_encoder_estim estim; - estim.set_context_models(&ctxModel); - - - tb->rate_withoutCbfChroma = 0; - - const seq_parameter_set* sps = &ectx->img->get_sps(); - - - if (log2TbSize <= sps->Log2MaxTrafoSize && - log2TbSize > sps->Log2MinTrafoSize && - trafoDepth < MaxTrafoDepth && - !(IntraSplitFlag && trafoDepth==0)) - { - encode_split_transform_flag(ectx, &estim, log2TbSize, 0); - tb->rate_withoutCbfChroma += estim.getRDBits(); - estim.reset(); - } - - // --- CBF CB/CR --- - - float luma_cbf_bits = 0; - if (cb->PredMode == MODE_INTRA || trafoDepth != 0 || - tb->cbf[1] || tb->cbf[2]) { - encode_cbf_luma(&estim, trafoDepth==0, tb->cbf[0]); - luma_cbf_bits = estim.getRDBits(); - } - - descend(tb,"DCT"); - float bits = mAlgo_TB_RateEstimation->encode_transform_unit(ectx,ctxModel, - tb,cb, x0,y0, xBase,yBase, - log2TbSize, trafoDepth, tb->blkIdx); - ascend(); - - tb->rate_withoutCbfChroma += bits + luma_cbf_bits; - - estim.reset(); // TODO: not needed ? - - tb->rate = (tb->rate_withoutCbfChroma + - recursive_cbfChroma_rate(&estim,tb,log2TbSize,trafoDepth)); - - //float rate_cbfChroma = estim.getRDBits(); - //tb->rate = tb->rate_withoutCbfChroma + rate_cbfChroma; - - - // measure distortion - - int tbSize = 1<<log2TbSize; - tb->distortion = SSD(input->get_image_plane_at_pos(0, x0,y0), input->get_image_stride(0), - tb->reconstruction[0]->get_buffer_u8(), - tb->reconstruction[0]->getStride(), - tbSize, tbSize); - - return tb; -}
libde265-1.0.17.tar.gz/libde265/encoder/algo/tb-transform.h
Deleted
@@ -1,86 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#ifndef TB_TRANSFORM_H -#define TB_TRANSFORM_H - -#include "libde265/nal-parser.h" -#include "libde265/decctx.h" -#include "libde265/encoder/encoder-types.h" -#include "libde265/encoder/algo/algo.h" -#include "libde265/slice.h" -#include "libde265/scan.h" -#include "libde265/intrapred.h" -#include "libde265/transform.h" -#include "libde265/fallback-dct.h" -#include "libde265/quality.h" -#include "libde265/fallback.h" -#include "libde265/encoder/configparam.h" - -#include "libde265/encoder/algo/tb-intrapredmode.h" -#include "libde265/encoder/algo/tb-rateestim.h" - - -void diff_blk(int16_t* out,int out_stride, - const uint8_t* a_ptr, int a_stride, - const uint8_t* b_ptr, int b_stride, - int blkSize); - - -// ========== TB split decision ========== - -class Algo_TB_Residual : public Algo -{ -public: - Algo_TB_Residual() { } - - virtual enc_tb* analyze(encoder_context*, - context_model_table&, - const de265_image* input, - enc_tb* tb, - int TrafoDepth, int MaxTrafoDepth, int IntraSplitFlag) = 0; - - const char* name() const { return "residual-unknown"; } -}; - - -class Algo_TB_Transform : public 
Algo_TB_Residual -{ -public: - Algo_TB_Transform() : mAlgo_TB_RateEstimation(NULL) { } - - virtual enc_tb* analyze(encoder_context*, - context_model_table&, - const de265_image* input, - enc_tb* parent, - int TrafoDepth, int MaxTrafoDepth, int IntraSplitFlag); - - void setAlgo_TB_RateEstimation(Algo_TB_RateEstimation* algo) { mAlgo_TB_RateEstimation=algo; } - - const char* name() const { return "residual-FDCT"; } - - protected: - Algo_TB_RateEstimation* mAlgo_TB_RateEstimation; -}; - - -#endif
libde265-1.0.17.tar.gz/libde265/encoder/configparam.cc
Deleted
@@ -1,471 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: struktur AG, Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#include "configparam.h" - -#include <string.h> -#include <ctype.h> -#include <sstream> -#include <iomanip> -#include <iostream> -#include <algorithm> -#include <typeinfo> - -#ifndef RTTI_ENABLED -#error "Need to compile with RTTI enabled." 
-#endif - -static void remove_option(int* argc,char** argv,int idx, int n=1) -{ - for (int i=idx+n;i<*argc;i++) { - argv[i-n] = argv[i]; - } - - *argc-=n; -} - - -bool option_string::processCmdLineArguments(char** argv, int* argc, int idx) -{ - if (argv==NULL) { return false; } - if (idx >= *argc) { return false; } - - value = argv[idx]; - value_set = true; - - remove_option(argc,argv,idx,1); - - return true; -} - - -void option_int::set_range(int mini,int maxi) -{ - have_low_limit =true; - have_high_limit=true; - low_limit =mini; - high_limit=maxi; -} - -std::string option_int::getTypeDescr() const -{ - std::stringstream sstr; - sstr << "(int)"; - - if (have_low_limit || have_high_limit) { sstr << " "; } - if (have_low_limit) { sstr << low_limit << " <= "; } - if (have_low_limit || have_high_limit) { sstr << "x"; } - if (have_high_limit) { sstr << " <= " << high_limit; } - - if (!valid_values_set.empty()) { - sstr << " {"; - bool first=true; - for (int v : valid_values_set) { - if (!first) sstr << ","; else first=false; - sstr << v; - } - sstr << "}"; - } - - return sstr.str(); -} - -bool option_int::processCmdLineArguments(char** argv, int* argc, int idx) -{ - if (argv==NULL) { return false; } - if (idx >= *argc) { return false; } - - int v = atoi(argv[idx]); - if (!is_valid(v)) { return false; } - - value = v; - value_set = true; - - remove_option(argc,argv,idx,1); - - return true; -} - -bool option_int::is_valid(int v) const -{ - if (have_low_limit && v<low_limit) { return false; } - if (have_high_limit && v>high_limit) { return false; } - - if (!valid_values_set.empty()) { - auto iter = std::find(valid_values_set.begin(), valid_values_set.end(), v); - if (iter==valid_values_set.end()) { return false; } - } - - return true; -} - -std::string option_int::get_default_string() const -{ - std::stringstream sstr; - sstr << default_value; - return sstr.str(); -} - - -std::string choice_option_base::getTypeDescr() const -{ - std::vector<std::string> choices = 
get_choice_names(); - - std::stringstream sstr; - sstr << "{"; - - bool first=true; - for (const auto& c : choices) { - if (first) { first=false; } - else { sstr << ","; } - - sstr << c; - } - - sstr << "}"; - return sstr.str(); -} - - -bool choice_option_base::processCmdLineArguments(char** argv, int* argc, int idx) -{ - if (argv==NULL) { return false; } - if (idx >= *argc) { return false; } - - std::string value = argv[idx]; - - std::cout << "set " << value << "\n"; - bool success = set_value(value); - std::cout << "success " << success << "\n"; - - remove_option(argc,argv,idx,1); - - return success; -} - - -static char* fill_strings_into_memory(const std::vector<std::string>& strings_list) -{ - // calculate memory requirement - - int totalStringLengths = 0; - for (const auto& str : strings_list) { - totalStringLengths += str.length() +1; // +1 for null termination - } - - int numStrings = strings_list.size(); - - int pointersSize = (numStrings+1) * sizeof(const char*); - - char* memory = new char[pointersSize + totalStringLengths]; - - - // copy strings to memory area - - char* stringPtr = memory + (numStrings+1) * sizeof(const char*); - const char** tablePtr = (const char**)memory; - - for (const auto& str : strings_list) { - *tablePtr++ = stringPtr; - - strcpy(stringPtr, str.c_str()); - stringPtr += str.length()+1; - } - - *tablePtr = NULL; - - return memory; -} - - -const char** choice_option_base::get_choices_string_table() const -{ - if (choice_string_table==NULL) { - choice_string_table = fill_strings_into_memory(get_choice_names()); - } - - return (const char**)choice_string_table; -} - - - -bool config_parameters::parse_command_line_params(int* argc, char** argv, int* first_idx_ptr, - bool ignore_unknown_options) -{ - int first_idx=1; - if (first_idx_ptr) { first_idx = *first_idx_ptr; } - - for (int i=first_idx;i < *argc;i++) { - - if (argv[i][0]=='-') { - // option - - if (argv[i][1]=='-') { - // long option - - bool option_found=false; - - for 
(size_t 
o=0;o<mOptions.size();o++) { - if (mOptions[o]->hasLongOption() && strcmp(mOptions[o]->getLongOption().c_str(), - argv[i]+2)==0) { - option_found=true; - - printf("FOUND %s\n",argv[i]); - - bool success = mOptions[o]->processCmdLineArguments(argv,argc, i+1); - if (!success) { - if (first_idx_ptr) { *first_idx_ptr = i; } - return false; - } - - remove_option(argc,argv,i); - i--; - - break; - } - } - - if (option_found == false && !ignore_unknown_options) { - return false; - } - } - else { - // short option - - bool is_single_option = (argv[i][1] != 0 && argv[i][2]==0); - bool do_remove_option = true; - - for (int n=1; argv[i][n]; n++) { - char option = argv[i][n]; - - bool option_found=false; - - for (size_t o=0;o<mOptions.size();o++) { - if (mOptions[o]->getShortOption() == option) { - option_found=true; - - bool success; - if (is_single_option) { - success = mOptions[o]->processCmdLineArguments(argv,argc, i+1); - } - else { - success = mOptions[o]->processCmdLineArguments(NULL,NULL, 0); - } - - if (!success) { - if (first_idx_ptr) { *first_idx_ptr = i; } - return false; - } - - break; - } - } - - if (!option_found) { - if (!ignore_unknown_options) { - fprintf(stderr, "unknown option -%c\n",option); - return false; - } - else { - do_remove_option=false; - } - } - - } // all short options - - if (do_remove_option) { - remove_option(argc,argv,i); - i--; - } - } // is short option - } // is option - } // all command line arguments - - return true; -} - - -void config_parameters::print_params() const -{ - for (size_t i=0;i<mOptions.size();i++) { - const option_base* o = mOptions[i]; - - std::stringstream sstr; - sstr << " "; - if (o->hasShortOption()) { - sstr << '-' << o->getShortOption(); - } else { - sstr << " "; - } - - if (o->hasShortOption() && o->hasLongOption()) { - sstr << ", "; - } else { - sstr << " "; - } - - if (o->hasLongOption()) { - sstr << "--" << std::setw(12) << std::left << o->getLongOption(); - } else { - sstr << " "; - } - - sstr << " "; - sstr << 
o->getTypeDescr(); - - if 
(o->has_default()) { - sstr << ", default=" << o->get_default_string(); - } - - if (o->has_description()) { - sstr << " : " << o->get_description(); - } - - sstr << "\n"; - - std::cerr << sstr.str(); - } -} - - -void config_parameters::add_option(option_base* o) -{ - mOptions.push_back(o); - delete param_string_table; // delete old table, since we got a new parameter - param_string_table = NULL; -} - - -std::vector<std::string> config_parameters::get_parameter_IDs() const -{ - std::vector<std::string> ids; - - for (auto option : mOptions) { - ids.push_back(option->get_name()); - } - - return ids; -} - - -enum en265_parameter_type config_parameters::get_parameter_type(const char* param) const -{ - option_base* option = find_option(param); - assert(option); - - if (dynamic_cast<option_int*> (option)) { return en265_parameter_int; } - if (dynamic_cast<option_bool*> (option)) { return en265_parameter_bool; } - if (dynamic_cast<option_string*>(option)) { return en265_parameter_string; } - if (dynamic_cast<choice_option_base*>(option)) { return en265_parameter_choice; } - - assert(false); - return en265_parameter_bool; -} - - -std::vector<std::string> config_parameters::get_parameter_choices(const char* param) const -{ - option_base* option = find_option(param); - assert(option); - - choice_option_base* o = dynamic_cast<choice_option_base*>(option); - assert(o); - - return o->get_choice_names(); -} - - -option_base* config_parameters::find_option(const char* param) const -{ - for (auto o : mOptions) { - if (strcmp(o->get_name().c_str(), param)==0) { return o; } - } - - return NULL; -} - - -bool config_parameters::set_bool(const char* param, bool value) -{ - option_base* option = find_option(param); - assert(option); - - option_bool* o = dynamic_cast<option_bool*>(option); - assert(o); - - return o->set(value); -} - -bool config_parameters::set_int(const char* param, int value) -{ - option_base* option = find_option(param); - assert(option); - - option_int* o = 
dynamic_cast<option_int*>(option); - assert(o); - - return o->set(value); -} - -bool config_parameters::set_string(const char* param, const char* value) -{ - option_base* option = find_option(param); - assert(option); - - option_string* o = dynamic_cast<option_string*>(option); - assert(o); - - return o->set(value); -} - -bool config_parameters::set_choice(const char* param, const char* value) -{ - option_base* option = find_option(param); - assert(option); - - choice_option_base* o = dynamic_cast<choice_option_base*>(option); - assert(o); - - return o->set(value); -} - - - -const char** config_parameters::get_parameter_choices_table(const char* param) const -{ - option_base* option = find_option(param); - assert(option); - - choice_option_base* o = dynamic_cast<choice_option_base*>(option); - assert(o); - - return o->get_choices_string_table(); -} - -const char** config_parameters::get_parameter_string_table() const -{ - if (param_string_table==NULL) { - param_string_table = fill_strings_into_memory(get_parameter_IDs()); - } - - return (const char**)param_string_table; -}
libde265-1.0.17.tar.gz/libde265/encoder/configparam.h
Deleted
@@ -1,386 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: struktur AG, Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#ifndef CONFIG_PARAM_H -#define CONFIG_PARAM_H - -#include "en265.h" -#include "util.h" - -#include <climits> -#include <vector> -#include <string> -#include <stddef.h> -#include <assert.h> - - -/* Notes: probably best to keep cmd-line-options here. So it will be: - - automatically consistent even when having different combinations of algorithms - - no other place to edit - - if needed, one can still override it at another place - */ - -// TODO: set a stack of default prefixes in config_parameters, such that all options added -// will receive this namespace prefix. 
- -// TODO: add the possibility to remove long options again, i.e., not use the default id name -class option_base -{ - public: - option_base() : mShortOption(0), mLongOption(NULL) { } - option_base(const char* name) : mIDName(name), mShortOption(0), mLongOption(NULL) { } - virtual ~option_base() { } - - - // --- option identifier --- - - void set_ID(const char* name) { mIDName=name; } - void add_namespace_prefix(std::string prefix) { mPrefix = prefix + ":" + mPrefix; } - - std::string get_name() const { return mPrefix + mIDName; } - - - // --- description --- - - void set_description(std::string descr) { mDescription = descr; } - std::string get_description() const { return mDescription; } - bool has_description() const { return !mDescription.empty(); } - - - // --- value --- - - virtual bool is_defined() const = 0; - bool is_undefined() const { return !is_defined(); } - - virtual bool has_default() const = 0; - - - // --- command line options ---- - - void set_cmd_line_options(const char* long_option, char short_option = 0) - { - mShortOption = short_option; - mLongOption = long_option; - } - - void set_short_option(char short_option) { mShortOption=short_option; } - - void unsetCmdLineOption() - { - mShortOption = 0; - mLongOption = NULL; - } - - bool hasShortOption() const { return mShortOption!=0; } - char getShortOption() const { return mShortOption; } - bool hasLongOption() const { return true; } //mLongOption!=NULL; } - std::string getLongOption() const { return mLongOption ? 
std::string(mLongOption) : get_name(); } - - virtual LIBDE265_API bool processCmdLineArguments(char** argv, int* argc, int idx) { return false; } - - - - virtual std::string getTypeDescr() const = 0; - - virtual std::string get_default_string() const { return "N/A"; } - - private: - std::string mPrefix; - std::string mIDName; - - std::string mDescription; - - char mShortOption; - const char* mLongOption; -}; - - - -class option_bool : public option_base -{ -public: - option_bool() : value_set(false), default_set(false) { } - - operator bool() const { - assert(value_set || default_set); - return value_set ? value : default_value; - } - - virtual bool is_defined() const { return value_set || default_set; } - virtual bool has_default() const { return default_set; } - - void set_default(bool v) { default_value=v; default_set=true; } - virtual std::string get_default_string() const { return default_value ? "true":"false"; } - - virtual std::string getTypeDescr() const { return "(boolean)"; } - virtual LIBDE265_API bool processCmdLineArguments(char** argv, int* argc, int idx) { set(true); return true; } - - bool set(bool v) { value_set=true; value=v; return true; } - - private: - bool value_set; - bool value; - - bool default_set; - bool default_value; -}; - - -class option_string : public option_base -{ -public: - option_string() : value_set(false), default_set(false) { } - - const option_string& operator=(std::string v) { value=v; value_set=true; return *this; } - - operator std::string() const { return get(); } - std::string get() const { - assert(value_set || default_set); - return value_set ? 
value : default_value; - } - - virtual bool is_defined() const { return value_set || default_set; } - virtual bool has_default() const { return default_set; } - - void set_default(std::string v) { default_value=v; default_set=true; } - virtual LIBDE265_API std::string get_default_string() const { return default_value; } - - virtual LIBDE265_API std::string getTypeDescr() const { return "(string)"; } - virtual LIBDE265_API bool processCmdLineArguments(char** argv, int* argc, int idx); - - bool set(std::string v) { value_set=true; value=v; return true; } - - private: - bool value_set; - std::string value; - - bool default_set; - std::string default_value; -}; - - -class option_int : public option_base -{ -public: - option_int() : value_set(false), default_set(false), - have_low_limit(false), have_high_limit(false) { } - - void set_minimum(int mini) { have_low_limit =true; low_limit =mini; } - void set_maximum(int maxi) { have_high_limit=true; high_limit=maxi; } - void set_range(int mini,int maxi); - void set_valid_values(const std::vector<int>& v) { valid_values_set = v; } - - const option_int& operator=(int v) { value=v; value_set=true; return *this; } - - int operator() () const { - assert(value_set || default_set); - return value_set ? 
value : default_value; - } - operator int() const { return operator()(); } - - virtual bool is_defined() const { return value_set || default_set; } - virtual bool has_default() const { return default_set; } - - void set_default(int v) { default_value=v; default_set=true; } - virtual LIBDE265_API std::string get_default_string() const; - - virtual LIBDE265_API std::string getTypeDescr() const; - virtual LIBDE265_API bool processCmdLineArguments(char** argv, int* argc, int idx); - - bool set(int v) { - if (is_valid(v)) { value_set=true; value=v; return true; } - else { return false; } - } - - private: - bool value_set; - int value; - - bool default_set; - int default_value; - - bool have_low_limit, have_high_limit; - int low_limit, high_limit; - - std::vector<int> valid_values_set; - - bool is_valid(int v) const; -}; - - - -class choice_option_base : public option_base -{ -public: - choice_option_base() : choice_string_table(NULL) { } - ~choice_option_base() { delete choice_string_table; } - - bool set(std::string v) { return set_value(v); } - virtual bool set_value(const std::string& val) = 0; - virtual std::vector<std::string> get_choice_names() const = 0; - - virtual std::string getTypeDescr() const; - virtual LIBDE265_API bool processCmdLineArguments(char** argv, int* argc, int idx); - - const char** get_choices_string_table() const; - - protected: - void invalidate_choices_string_table() { - delete choice_string_table; - choice_string_table = NULL; - } - - private: - mutable char* choice_string_table; -}; - - -template <class T> class choice_option : public choice_option_base -{ - public: - choice_option() : default_set(false), value_set(false) { } - - // --- initialization --- - - void add_choice(const std::string& s, T id, bool default_value=false) { - choices.push_back( std::make_pair(s,id) ); - if (default_value) { - defaultID = id; - defaultValue = s; - default_set = true; - } - - invalidate_choices_string_table(); - } - - void set_default(T val) { - for 
(const auto& c : choices) { - if (c.second == val) { - defaultID = val; - defaultValue = c.first; - default_set = true; - return; - } - } - - assert(false); // value does not exist - } - - - // --- usage --- - - bool set_value(const std::string& val) // returns false if it is not a valid option - { - value_set = true; - selectedValue=val; - - validValue = false; - - for (const auto& c : choices) { - if (val == c.first) { - selectedID = c.second; - validValue = true; - } - } - - return validValue; - } - - bool isValidValue() const { return validValue; } - - const std::string& getValue() const { - assert(value_set || default_set); - return value_set ? selectedValue : defaultValue; - } - void setID(T id) { selectedID=id; validValue=true; } - const T getID() const { return value_set ? selectedID : defaultID; } - - virtual bool is_defined() const { return value_set || default_set; } - virtual bool has_default() const { return default_set; } - - std::vector<std::string> get_choice_names() const - { - std::vector<std::string> names; - for (const auto& p : choices) { - names.push_back(p.first); - } - return names; - } - - std::string get_default_string() const { return defaultValue; } - - T operator() () const { return (T)getID(); } - - private: - std::vector< std::pair<std::string,T> > choices; - - bool default_set; - std::string defaultValue; - T defaultID; - - bool value_set; - std::string selectedValue; - T selectedID; - - bool validValue; -}; - - - - -class config_parameters -{ - public: - config_parameters() : param_string_table(NULL) { } - ~config_parameters() { delete param_string_table; } - - void LIBDE265_API add_option(option_base* o); - - void LIBDE265_API print_params() const; - bool LIBDE265_API parse_command_line_params(int* argc, char** argv, int* first_idx=NULL, - bool ignore_unknown_options=false); - - - // --- connection to C API --- - - std::vector<std::string> get_parameter_IDs() const; - enum en265_parameter_type get_parameter_type(const char* param) 
const; - - std::vector<std::string> get_parameter_choices(const char* param) const; - - bool set_bool(const char* param, bool value); - bool set_int(const char* param, int value); - bool set_string(const char* param, const char* value); - bool set_choice(const char* param, const char* value); - - const char** get_parameter_string_table() const; - const char** get_parameter_choices_table(const char* param) const; - - private: - std::vector<option_base*> mOptions; - - option_base* find_option(const char* param) const; - - mutable char* param_string_table; -}; - -#endif
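Note on the removed configparam.h: every option class in it keeps an explicit value and a default separately, reads back the explicit value when one was set, and (for option_int) rejects values that fail validation without changing state. The following is a minimal sketch of that pattern only. The class name IntOption and its members are hypothetical, validation is reduced to a single inclusive range, and it is not the libde265 API.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the removed option_int class.
// It illustrates the value-or-default pattern, not the real interface.
class IntOption {
 public:
  void set_default(int v) { default_value_ = v; default_set_ = true; }

  // Inclusive range check (assumption: the original also supported an
  // explicit list of valid values, which is left out here).
  void set_range(int lo, int hi) { low_ = lo; high_ = hi; has_range_ = true; }

  // Returns false and leaves the stored value untouched if v is out of range.
  bool set(int v) {
    if (has_range_ && (v < low_ || v > high_)) return false;
    value_ = v;
    value_set_ = true;
    return true;
  }

  bool is_defined() const { return value_set_ || default_set_; }

  // The explicit value wins; otherwise fall back to the default.
  int get() const {
    assert(is_defined());
    return value_set_ ? value_ : default_value_;
  }

 private:
  bool value_set_ = false;
  int value_ = 0;
  bool default_set_ = false;
  int default_value_ = 0;
  bool has_range_ = false;
  int low_ = 0;
  int high_ = 0;
};
```

Keeping the default separate from the value is what lets print_params() show "default=..." even after the user has overridden the option.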
libde265-1.0.17.tar.gz/libde265/encoder/encoder-context.cc
Deleted
@@ -1,313 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#include "encoder/encoder-context.h" -#include "libde265/util.h" - -#include <math.h> - - -encoder_context::encoder_context() -{ - encoder_started=false; - - vps = std::make_shared<video_parameter_set>(); - sps = std::make_shared<seq_parameter_set>(); - pps = std::make_shared<pic_parameter_set>(); - - //img_source = NULL; - //reconstruction_sink = NULL; - //packet_sink = NULL; - - image_spec_is_defined = false; - parameters_have_been_set = false; - headers_have_been_sent = false; - - param_image_allocation_userdata = NULL; - //release_func = NULL; - - use_adaptive_context = true; //false; - - //enc_coeff_pool.set_blk_size(64*64*20); // TODO: this a guess - - //switch_CABAC_to_bitstream(); - - - params.registerParams(params_config); - algo.registerParams(params_config); -} - - -encoder_context::~encoder_context() -{ - while (!output_packets.empty()) { - en265_free_packet((en265_encoder_context*)this, output_packets.front()); - output_packets.pop_front(); - } -} - - -void encoder_context::start_encoder() -{ - if (encoder_started) { - return; - } - - - if (params.sop_structure() == SOP_Intra) { - sop = 
std::shared_ptr<sop_creator_intra_only>(new sop_creator_intra_only()); - } - else { - auto s = std::shared_ptr<sop_creator_trivial_low_delay>(new sop_creator_trivial_low_delay()); - s->setParams(params.mSOP_LowDelay); - sop = s; - } - - sop->set_encoder_context(this); - sop->set_encoder_picture_buffer(&picbuf); - - - encoder_started=true; -} - - -en265_packet* encoder_context::create_packet(en265_packet_content_type t) -{ - en265_packet* pck = new en265_packet; - - uint8_t* data = new uint8_t[cabac_encoder.size()]; - memcpy(data, cabac_encoder.data(), cabac_encoder.size()); - - pck->version = 1; - - pck->data = data; - pck->length = cabac_encoder.size(); - - pck->frame_number = -1; - pck->content_type = t; - pck->complete_picture = 0; - pck->final_slice = 0; - pck->dependent_slice = 0; - //pck->pts = 0; - //pck->user_data = NULL; - pck->nuh_layer_id = 0; - pck->nuh_temporal_id = 0; - - pck->encoder_context = (en265_encoder_context*)this; - - pck->input_image = NULL; - pck->reconstruction = NULL; - - cabac_encoder.reset(); - - return pck; -} - - -de265_error encoder_context::encode_headers() -{ - nal_header nal; - - // VPS - - vps->set_defaults(Profile_Main, 6,2); - - - // SPS - - sps->set_defaults(); - sps->set_CB_log2size_range( Log2(params.min_cb_size), Log2(params.max_cb_size)); - sps->set_TB_log2size_range( Log2(params.min_tb_size), Log2(params.max_tb_size)); - sps->max_transform_hierarchy_depth_intra = params.max_transform_hierarchy_depth_intra; - sps->max_transform_hierarchy_depth_inter = params.max_transform_hierarchy_depth_inter; - - if (imgdata->input->get_chroma_format() == de265_chroma_444) { - sps->chroma_format_idc = CHROMA_444; - } - - sps->set_resolution(image_width, image_height); - sop->set_SPS_header_values(); - de265_error err = sps->compute_derived_values(true); - if (err != DE265_OK) { - fprintf(stderr,"invalid SPS parameters\n"); - exit(10); - } - - - // PPS - - pps->set_defaults(); - pps->sps = sps; //sps.get(); - pps->pic_init_qp = 
algo.getPPS_QP(); - - // turn off deblocking filter - pps->deblocking_filter_control_present_flag = true; - pps->deblocking_filter_override_enabled_flag = false; - pps->pic_disable_deblocking_filter_flag = true; - pps->pps_loop_filter_across_slices_enabled_flag = false; - - pps->set_derived_values(sps.get()); - - - - // write headers - - en265_packet* pck; - - nal.set(NAL_UNIT_VPS_NUT); - nal.write(cabac_encoder); - vps->write(this, cabac_encoder); - cabac_encoder.add_trailing_bits(); - cabac_encoder.flush_VLC(); - pck = create_packet(EN265_PACKET_VPS); - pck->nal_unit_type = EN265_NUT_VPS; - output_packets.push_back(pck); - - nal.set(NAL_UNIT_SPS_NUT); - nal.write(cabac_encoder); - sps->write(this, cabac_encoder); - cabac_encoder.add_trailing_bits(); - cabac_encoder.flush_VLC(); - pck = create_packet(EN265_PACKET_SPS); - pck->nal_unit_type = EN265_NUT_SPS; - output_packets.push_back(pck); - - nal.set(NAL_UNIT_PPS_NUT); - nal.write(cabac_encoder); - pps->write(this, cabac_encoder, sps.get()); - cabac_encoder.add_trailing_bits(); - cabac_encoder.flush_VLC(); - pck = create_packet(EN265_PACKET_PPS); - pck->nal_unit_type = EN265_NUT_PPS; - output_packets.push_back(pck); - - - - headers_have_been_sent = true; - - return DE265_OK; -} - - -de265_error encoder_context::encode_picture_from_input_buffer() -{ - if (!picbuf.have_more_frames_to_encode()) { - return DE265_OK; - } - - - if (!image_spec_is_defined) { - const image_data* id = picbuf.peek_next_picture_to_encode(); - image_width = id->input->get_width(); - image_height = id->input->get_height(); - image_spec_is_defined = true; - - ctbs.alloc(image_width, image_height, Log2(params.max_cb_size)); - } - - - if (!parameters_have_been_set) { - algo.setParams(params); - - - // TODO: must be <30, because Y->C mapping (tab8_22) is not implemented yet - int qp = algo.getPPS_QP(); - - //lambda = ectx->params.lambda; - lambda = 0.0242 * pow(1.27245, qp); - - parameters_have_been_set = true; - } - - - - - - image_data* imgdata; 
- imgdata = picbuf.get_next_picture_to_encode(); - assert(imgdata); - picbuf.mark_encoding_started(imgdata->frame_number); - - this->imgdata = imgdata; - this->shdr = &imgdata->shdr; - loginfo(LogEncoder,"encoding frame %d\n",imgdata->frame_number); - - - // write headers if not written yet - - if (!headers_have_been_sent) { - encode_headers(); - } - - - // write slice header - - // slice - - imgdata->shdr.slice_deblocking_filter_disabled_flag = true; - imgdata->shdr.slice_loop_filter_across_slices_enabled_flag = false; - imgdata->shdr.compute_derived_values(pps.get()); - - imgdata->shdr.pps = pps; - - //shdr.slice_pic_order_cnt_lsb = poc & 0xFF; - - imgdata->nal.write(cabac_encoder); - imgdata->shdr.write(this, cabac_encoder, sps.get(), pps.get(), imgdata->nal.nal_unit_type); - cabac_encoder.add_trailing_bits(); - cabac_encoder.flush_VLC(); - - - // encode image - - cabac_encoder.init_CABAC(); - double psnr = encode_image(this,imgdata->input, algo); - loginfo(LogEncoder," PSNR-Y: %f\n", psnr); - cabac_encoder.flush_CABAC(); - cabac_encoder.add_trailing_bits(); - cabac_encoder.flush_VLC(); - - - // set reconstruction image - - picbuf.set_reconstruction_image(imgdata->frame_number, img); - //picbuf.set_prediction_image(imgdata->frame_number, prediction); - img=NULL; - this->imgdata = NULL; - this->shdr = NULL; - - // build output packet - - en265_packet* pck = create_packet(EN265_PACKET_SLICE); - pck->input_image = imgdata->input; - pck->reconstruction = imgdata->reconstruction; - pck->frame_number = imgdata->frame_number; - pck->nal_unit_type = (enum en265_nal_unit_type)imgdata->nal.nal_unit_type; - pck->nuh_layer_id = imgdata->nal.nuh_layer_id; - pck->nuh_temporal_id= imgdata->nal.nuh_temporal_id; - output_packets.push_back(pck); - - - picbuf.mark_encoding_finished(imgdata->frame_number); - - return DE265_OK; -}
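Note on the removed rate control: encode_picture_from_input_buffer() derives the Lagrange multiplier from the picture QP as lambda = 0.0242 * pow(1.27245, qp), and a disabled code path in the removed encoder-core.cc combines it with distortion and rate as distortion + lambda * rate. The sketch below reproduces just that arithmetic. The function names lambda_for_qp and rd_cost are hypothetical helpers introduced for illustration; only the two constants and the cost formula come from the deleted sources.

```cpp
#include <cassert>
#include <cmath>

// Lagrange multiplier as a function of QP, using the constants from the
// removed encode_picture_from_input_buffer(). Hypothetical helper name.
double lambda_for_qp(int qp) {
  return 0.0242 * std::pow(1.27245, qp);
}

// Lagrangian rate-distortion cost D + lambda * R. Hypothetical helper name;
// the units of distortion and rate are whatever the caller measures them in.
double rd_cost(double distortion, double rate_bits, double lambda) {
  assert(lambda >= 0.0);
  return distortion + lambda * rate_bits;
}
```

Because lambda grows by a constant factor per QP step, a higher QP makes each bit of rate more expensive relative to distortion, which pushes the mode decision toward cheaper, coarser choices.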
libde265-1.0.17.tar.gz/libde265/encoder/encoder-context.h
Deleted
@@ -1,173 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#ifndef ENCODER_CONTEXT_H -#define ENCODER_CONTEXT_H - -#include "libde265/image.h" -#include "libde265/decctx.h" -#include "libde265/image-io.h" -#include "libde265/encoder/encoder-params.h" -#include "libde265/encoder/encpicbuf.h" -#include "libde265/encoder/sop.h" -#include "libde265/en265.h" -#include "libde265/util.h" - -#include <memory> - - -class encoder_context : public base_context -{ - public: - encoder_context(); - ~encoder_context(); - - virtual const de265_image* get_image(uint16_t frame_id) const { - return picbuf.get_picture(frame_id)->reconstruction; - } - - virtual bool has_image(uint16_t frame_id) const { - return picbuf.has_picture(frame_id); - } - - bool encoder_started; - - encoder_params params; - config_parameters params_config; - - EncoderCore_Custom algo; - - int image_width, image_height; - bool image_spec_is_defined; // whether we know the input image size - - void* param_image_allocation_userdata; - /* - void (*release_func)(en265_encoder_context*, - de265_image*, - void* userdata); - */ - - //error_queue errqueue; - //acceleration_functions accel; - - // quick links - 
de265_image* img; // reconstruction - //de265_image* prediction; - image_data* imgdata; // input image - slice_segment_header* shdr; - - CTBTreeMatrix ctbs; - - // temporary memory for motion compensated pixels (when CB-algo passes this down to TB-algo) - //uint8_t prediction[3][64*64]; // stride: 1<<(cb->log2Size) - //int prediction_x0,prediction_y0; - - - int active_qp; // currently active QP - /*int target_qp;*/ /* QP we want to code at. - (Not actually the real QP. Check image.get_QPY() for that.) */ - - const seq_parameter_set& get_sps() const { return *sps; } - const pic_parameter_set& get_pps() const { return *pps; } - - seq_parameter_set& get_sps() { return *sps; } - pic_parameter_set& get_pps() { return *pps; } - - std::shared_ptr<video_parameter_set>& get_shared_vps() { return vps; } - std::shared_ptr<seq_parameter_set>& get_shared_sps() { return sps; } - std::shared_ptr<pic_parameter_set>& get_shared_pps() { return pps; } - - private: - std::shared_ptr<video_parameter_set> vps; - std::shared_ptr<seq_parameter_set> sps; - std::shared_ptr<pic_parameter_set> pps; - //slice_segment_header shdr; - - public: - bool parameters_have_been_set; - bool headers_have_been_sent; - - encoder_picture_buffer picbuf; - std::shared_ptr<sop_creator> sop; - - std::deque<en265_packet*> output_packets; - - - // --- rate-control --- - - float lambda; - - - // --- CABAC output and rate estimation --- - - //CABAC_encoder* cabac; // currently active CABAC output (estim or bitstream) - //context_model_table2* ctx_model; // currently active ctx models (estim or bitstream) - - // CABAC bitstream writer - CABAC_encoder_bitstream cabac_encoder; - context_model_table cabac_ctx_models; - - //std::shared_ptr<CABAC_encoder> cabac_estim; - - bool use_adaptive_context; - - - /*** TODO: Pass the CABAC_encoder directly to the encode function instead of - caching it here on the side (with undefined lifetime). - The context model can then go straight into the encoder: cabac_encoder(ctxtable). 
- write_bits() is then called with the context index, not with the model directly. - ***/ - - - /* - void switch_CABAC(context_model_table2* model) { - cabac = cabac_estim.get(); - ctx_model = model; - } - - void switch_CABAC_to_bitstream() { - cabac = &cabac_bitstream; - ctx_model = &ctx_model_bitstream; - } - */ - - en265_packet* create_packet(en265_packet_content_type t); - - - // --- encoding control --- - - void start_encoder(); - de265_error encode_headers(); - de265_error encode_picture_from_input_buffer(); - - - // Input images can be released after encoding and when the output packet is released. - // This is important to do as soon as possible, as the image might actually wrap - // scarce resources like camera picture buffers. - // This function does release (only) the raw input data. - void release_input_image(int frame_number) { picbuf.release_input_image(frame_number); } - - void mark_image_is_outputted(int frame_number) { picbuf.mark_image_is_outputted(frame_number); } -}; - - -#endif
libde265-1.0.17.tar.gz/libde265/encoder/encoder-core.cc
Deleted
@@ -1,428 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: struktur AG, Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - - -#include "libde265/encoder/encoder-core.h" -#include "libde265/encoder/encoder-context.h" -#include "libde265/encoder/encoder-syntax.h" -#include <assert.h> -#include <limits> -#include <math.h> -#include <iostream> -#include <fstream> - - -#define ENCODER_DEVELOPMENT 0 -#define COMPARE_ESTIMATED_RATE_TO_REAL_RATE 0 - - -static int IntraPredModeCnt[7][35]; -static int MPM_used[7][35]; - -static int IntraPredModeCnt_total[35]; -static int MPM_used_total[35]; - -/* -void statistics_IntraPredMode(const encoder_context* ectx, int x,int y, const enc_cb* cb) -{ - if (cb->split_cu_flag) { - for (int i=0;i<4;i++) - if (cb->children[i]) { - statistics_IntraPredMode(ectx, childX(x,i,cb->log2Size), childY(y,i,cb->log2Size), cb->children[i]); - } - } - else { - int cnt; - int size = cb->log2Size; - - if (cb->PartMode == PART_NxN) { cnt=4; size--; } else cnt=1; - - for (int i=0;i<cnt;i++) { - IntraPredModeCnt[size][ cb->intra.pred_mode[i] ]++; - IntraPredModeCnt_total[ cb->intra.pred_mode[i] ]++; - - int xi = childX(x,i,cb->log2Size); - int yi = childY(y,i,cb->log2Size); - - enum IntraPredMode candModeList[3]; - 
fillIntraPredModeCandidates(candModeList,xi,yi, xi>0, yi>0, ectx->img); - - int predmode = cb->intra.pred_mode[i]; - if (candModeList[0]==predmode || - candModeList[1]==predmode || - candModeList[2]==predmode) { - MPM_used[size][predmode]++; - MPM_used_total[predmode]++; - } - } - } -} -*/ - -void statistics_print() -{ - for (int i=0;i<35;i++) { - printf("%d",i); - printf(" %d %d",IntraPredModeCnt_total[i], MPM_used_total[i]); - - for (int k=2;k<=6;k++) { - printf(" %d %d",IntraPredModeCnt[k][i], MPM_used[k][i]); - } - - printf("\n"); - } -} - - -void print_tb_tree_rates(const enc_tb* tb, int level) -{ - for (int i=0;i<level;i++) - std::cout << " "; - - std::cout << "TB rate=" << tb->rate << " (" << tb->rate_withoutCbfChroma << ")\n"; - if (tb->split_transform_flag) { - for (int i=0;i<4;i++) - print_tb_tree_rates(tb->children[i], level+1); - } -} - - -void print_cb_tree_rates(const enc_cb* cb, int level) -{ - for (int i=0;i<level;i++) - std::cout << " "; - - std::cout << "CB rate=" << cb->rate << "\n"; - if (cb->split_cu_flag) { - for (int i=0;i<4;i++) - print_cb_tree_rates(cb->children[i], level+1); - } - else { - print_tb_tree_rates(cb->transform_tree, level+1); - } -} - - -// /*LIBDE265_API*/ ImageSink_YUV reconstruction_sink; - -double encode_image(encoder_context* ectx, - const de265_image* input, - EncoderCore& algo) -{ - int stride=input->get_image_stride(0); - - int w = ectx->get_sps().pic_width_in_luma_samples; - int h = ectx->get_sps().pic_height_in_luma_samples; - - // --- create reconstruction image --- - ectx->img = new de265_image; - ectx->img->set_headers(ectx->get_shared_vps(), ectx->get_shared_sps(), ectx->get_shared_pps()); - ectx->img->PicOrderCntVal = input->PicOrderCntVal; - - ectx->img->alloc_image(w,h, input->get_chroma_format(), ectx->get_shared_sps(), true, - NULL /* no decctx */, /*ectx,*/ 0,NULL,false); - //ectx->img->alloc_encoder_data(&ectx->sps); - ectx->img->clear_metadata(); - -#if 0 - if (1) { - ectx->prediction = new de265_image; - 
ectx->prediction->alloc_image(w,h, input->get_chroma_format(), &ectx->sps, false /* no metadata */, - NULL /* no decctx */, NULL /* no encctx */, 0,NULL,false); - ectx->prediction->vps = ectx->vps; - ectx->prediction->sps = ectx->sps; - ectx->prediction->pps = ectx->pps; - } -#endif - - ectx->active_qp = ectx->get_pps().pic_init_qp; // TODO take current qp from slice - - - ectx->cabac_ctx_models.init(ectx->shdr->initType, ectx->shdr->SliceQPY); - ectx->cabac_encoder.set_context_models(&ectx->cabac_ctx_models); - - - context_model_table modelEstim; - CABAC_encoder_estim cabacEstim; - - modelEstim.init(ectx->shdr->initType, ectx->shdr->SliceQPY); - cabacEstim.set_context_models(&modelEstim); - - - int Log2CtbSize = ectx->get_sps().Log2CtbSizeY; - - uint8_t* luma_plane = ectx->img->get_image_plane(0); - uint8_t* cb_plane = ectx->img->get_image_plane(1); - uint8_t* cr_plane = ectx->img->get_image_plane(2); - - double mse=0; - - - // encode CTB by CTB - - ectx->ctbs.clear(); - - for (int y=0;y<ectx->get_sps().PicHeightInCtbsY;y++) - for (int x=0;x<ectx->get_sps().PicWidthInCtbsY;x++) - { - ectx->img->set_SliceAddrRS(x, y, ectx->shdr->SliceAddrRS); - - int x0 = x<<Log2CtbSize; - int y0 = y<<Log2CtbSize; - - logtrace(LogSlice,"encode CTB at %d %d\n",x0,y0); - - // make a copy of the context model that we can modify for testing alternatives - - context_model_table ctxModel; - //copy_context_model_table(ctxModel, ectx->ctx_model_bitstream); - ctxModel = ectx->cabac_ctx_models.copy(); - ctxModel = modelEstim.copy(); // TODO TMP - - disable_logging(LogSymbols); - enable_logging(LogSymbols); // TODO TMP - - //printf("================================================== ANALYZE\n"); - -#if 1 - /* - enc_cb* cb = encode_cb_may_split(ectx, ctxModel, - input, x0,y0, Log2CtbSize, 0, qp); - */ - - enc_cb* cb = algo.getAlgoCTBQScale()->analyze(ectx,ctxModel, x0,y0); -#else - float minCost = std::numeric_limits<float>::max(); - int bestQ = 0; - int qp = ectx->params.constant_QP; - - 
enc_cb* cb; - for (int q=1;q<51;q++) { - copy_context_model_table(ctxModel, ectx->ctx_model_bitstream); - - enc_cb* cbq = encode_cb_may_split(ectx, ctxModel, - input, x0,y0, Log2CtbSize, 0, q); - - float cost = cbq->distortion + ectx->lambda * cbq->rate; - if (cost<minCost) { minCost=cost; bestQ=q; } - - if (q==qp) { cb=cbq; } - } - - printf("Q %d\n",bestQ); - fflush(stdout); -#endif - - //print_cb_tree_rates(cb,0); - - //statistics_IntraPredMode(ectx, x0,y0, cb); - - - // --- write bitstream --- - - //ectx->switch_CABAC_to_bitstream(); - - enable_logging(LogSymbols); - - logdebug(LogEncoder,"write CTB %d;%d\n",x,y); - - if (logdebug_enabled(LogEncoder)) { - cb->debug_dumpTree(enc_tb::DUMPTREE_ALL); - } - - /* - cb->debug_assertTreeConsistency(ectx->img); - - //cb->invalidateMetadataInSubTree(ectx->img); - cb->writeMetadata(ectx, ectx->img, - enc_node::METADATA_INTRA_MODES | - enc_node::METADATA_RECONSTRUCTION | - enc_node::METADATA_CT_DEPTH); - - cb->debug_assertTreeConsistency(ectx->img); - */ - - encode_ctb(ectx, &ectx->cabac_encoder, cb, x,y); - - //printf("================================================== WRITE\n"); - - - if (COMPARE_ESTIMATED_RATE_TO_REAL_RATE) { - float realPre = cabacEstim.getRDBits(); - encode_ctb(ectx, &cabacEstim, cb, x,y); - float realPost = cabacEstim.getRDBits(); - - printf("estim: %f real: %f diff: %f\n", - cb->rate, - realPost-realPre, - cb->rate - (realPost-realPre)); - } - - - int last = (y==ectx->get_sps().PicHeightInCtbsY-1 && - x==ectx->get_sps().PicWidthInCtbsY-1); - ectx->cabac_encoder.write_CABAC_term_bit(last); - - //delete cb; - - //ectx->free_all_pools(); - - mse += cb->distortion; - } - - mse /= ectx->img->get_width() * ectx->img->get_height(); - - - //reconstruction_sink.send_image(ectx->img); - - - //statistics_print(); - - - //delete ectx->prediction; - - - // frame PSNR - - ectx->ctbs.writeReconstructionToImage(ectx->img, &ectx->get_sps()); - -#if 0 - std::ofstream ostr("out.pgm"); - ostr << "P5\n" << 
ectx->img->get_width() << " " << ectx->img->get_height() << "\n255\n"; - for (int y=0;y<ectx->img->get_height();y++) { - ostr.write( (char*)ectx->img->get_image_plane_at_pos(0,0,y), ectx->img->get_width() ); - } -#endif - - double psnr = 10*log10(255.0*255.0 / mse); - -#if 0 - double psnr2 = PSNR(MSE(input->get_image_plane(0), input->get_image_stride(0), - luma_plane, ectx->img->get_image_stride(0), - input->get_width(), input->get_height())); - - printf("rate-estim PSNR: %f vs %f\n",psnr,psnr2); -#endif - - return psnr; -} - - - -void EncoderCore_Custom::setParams(encoder_params& params) -{ - // build algorithm tree - - mAlgo_CTB_QScale_Constant.setChildAlgo(&mAlgo_CB_Split_BruteForce); - mAlgo_CB_Split_BruteForce.setChildAlgo(&mAlgo_CB_Skip_BruteForce); - - mAlgo_CB_Skip_BruteForce.setSkipAlgo(&mAlgo_CB_MergeIndex_Fixed); - mAlgo_CB_Skip_BruteForce.setNonSkipAlgo(&mAlgo_CB_IntraInter_BruteForce); - //&mAlgo_CB_InterPartMode_Fixed); - - Algo_CB_IntraPartMode* algo_CB_IntraPartMode = NULL; - switch (params.mAlgo_CB_IntraPartMode()) { - case ALGO_CB_IntraPartMode_BruteForce: - algo_CB_IntraPartMode = &mAlgo_CB_IntraPartMode_BruteForce; - break; - case ALGO_CB_IntraPartMode_Fixed: - algo_CB_IntraPartMode = &mAlgo_CB_IntraPartMode_Fixed; - break; - } - - mAlgo_CB_IntraInter_BruteForce.setIntraChildAlgo(algo_CB_IntraPartMode); - mAlgo_CB_IntraInter_BruteForce.setInterChildAlgo(&mAlgo_CB_InterPartMode_Fixed); - - mAlgo_CB_MergeIndex_Fixed.setChildAlgo(&mAlgo_TB_Split_BruteForce); - - Algo_PB_MV* pbAlgo = NULL; - switch (params.mAlgo_MEMode()) { - case MEMode_Test: - pbAlgo = &mAlgo_PB_MV_Test; - break; - case MEMode_Search: - pbAlgo = &mAlgo_PB_MV_Search; - break; - } - - mAlgo_CB_InterPartMode_Fixed.setChildAlgo(pbAlgo); - pbAlgo->setChildAlgo(&mAlgo_TB_Split_BruteForce); - - - Algo_TB_IntraPredMode_ModeSubset* algo_TB_IntraPredMode = NULL; - switch (params.mAlgo_TB_IntraPredMode()) { - case ALGO_TB_IntraPredMode_BruteForce: - algo_TB_IntraPredMode = 
&mAlgo_TB_IntraPredMode_BruteForce; - break; - case ALGO_TB_IntraPredMode_FastBrute: - algo_TB_IntraPredMode = &mAlgo_TB_IntraPredMode_FastBrute; - break; - case ALGO_TB_IntraPredMode_MinResidual: - algo_TB_IntraPredMode = &mAlgo_TB_IntraPredMode_MinResidual; - break; - } - - algo_CB_IntraPartMode->setChildAlgo(algo_TB_IntraPredMode); - - mAlgo_TB_Split_BruteForce.setAlgo_TB_IntraPredMode(algo_TB_IntraPredMode); - mAlgo_TB_Split_BruteForce.setAlgo_TB_Residual(&mAlgo_TB_Transform); - - Algo_TB_RateEstimation* algo_TB_RateEstimation = NULL; - switch (params.mAlgo_TB_RateEstimation()) { - case ALGO_TB_RateEstimation_None: algo_TB_RateEstimation = &mAlgo_TB_RateEstimation_None; break; - case ALGO_TB_RateEstimation_Exact: algo_TB_RateEstimation = &mAlgo_TB_RateEstimation_Exact; break; - } - mAlgo_TB_Transform.setAlgo_TB_RateEstimation(algo_TB_RateEstimation); - //mAlgo_TB_Split_BruteForce.setParams(params.TB_Split_BruteForce); - - algo_TB_IntraPredMode->setChildAlgo(&mAlgo_TB_Split_BruteForce); - - - // ===== set algorithm parameters ====== - - //mAlgo_CB_IntraPartMode_Fixed.setParams(params.CB_IntraPartMode_Fixed); - - //mAlgo_TB_IntraPredMode_FastBrute.setParams(params.TB_IntraPredMode_FastBrute); - //mAlgo_TB_IntraPredMode_MinResidual.setParams(params.TB_IntraPredMode_MinResidual); - - - //mAlgo_CTB_QScale_Constant.setParams(params.CTB_QScale_Constant); - - - algo_TB_IntraPredMode->enableIntraPredModeSubset( params.mAlgo_TB_IntraPredMode_Subset() ); -} - - -void Logging::print_logging(const encoder_context* ectx, const char* id, const char* filename) -{ -#if 000 - if (strcmp(id,logging_tb_split.name())==0) { - logging_tb_split.print(ectx,filename); - } -#endif -} - - -void en265_print_logging(const encoder_context* ectx, const char* id, const char* filename) -{ - Logging::print_logging(ectx,id,filename); -}
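Note on the removed encode_image(): the value it returns is a frame PSNR derived from the per-CTB distortion accumulated in `mse`, divided by the pixel count. A minimal sketch of that arithmetic follows, assuming the distortion values are sums of squared errors over 8-bit samples. The helper name `psnr_from_sse` and the zero-error guard are assumptions for illustration and are not part of libde265.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper (not libde265 API): converts a summed squared error
// over a width x height plane of 8-bit samples into PSNR in dB, following
// the same formula as the removed code: 10*log10(255^2 / mse).
double psnr_from_sse(double sse, int width, int height)
{
  assert(width > 0 && height > 0);

  // Multiply in double so large frames cannot overflow an int.
  double mse = sse / (static_cast<double>(width) * height);

  // Guard added for this sketch: identical images would divide by zero.
  if (mse <= 0.0) {
    return std::numeric_limits<double>::infinity();
  }

  return 10.0 * std::log10(255.0 * 255.0 / mse);
}
```

With a mean squared error of exactly 1 the result is about 48.13 dB, which is a convenient sanity check for the constant.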
libde265-1.0.17.tar.gz/libde265/encoder/encoder-core.h
Deleted
@@ -1,151 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#ifndef ANALYZE_H -#define ANALYZE_H - -#include "libde265/nal-parser.h" -#include "libde265/decctx.h" -#include "libde265/encoder/encoder-types.h" -#include "libde265/slice.h" -#include "libde265/scan.h" -#include "libde265/intrapred.h" -#include "libde265/transform.h" -#include "libde265/fallback-dct.h" -#include "libde265/quality.h" -#include "libde265/fallback.h" -#include "libde265/encoder/configparam.h" - -#include "libde265/encoder/algo/tb-intrapredmode.h" -#include "libde265/encoder/algo/tb-transform.h" -#include "libde265/encoder/algo/tb-split.h" -#include "libde265/encoder/algo/cb-intrapartmode.h" -#include "libde265/encoder/algo/cb-interpartmode.h" -#include "libde265/encoder/algo/cb-split.h" -#include "libde265/encoder/algo/ctb-qscale.h" -#include "libde265/encoder/algo/cb-mergeindex.h" -//#include "libde265/encoder/algo/cb-skip-or-inter.h" -#include "libde265/encoder/algo/pb-mv.h" -#include "libde265/encoder/algo/cb-skip.h" -#include "libde265/encoder/algo/cb-intra-inter.h" - - -/* Encoder search tree, bottom up: - - - Algo_TB_Split - whether TB is split or not - - - Algo_TB_IntraPredMode 
- choose the intra prediction mode (or NOP, if at the wrong tree level) - - - Algo_CB_IntraPartMode - choose between NxN and 2Nx2N intra parts - - - Algo_CB_PredMode - intra / inter - - - Algo_CB_Split - whether CB is split or not - - - Algo_CTB_QScale - select QScale on CTB granularity - */ - - -// ========== an encoding algorithm combines a set of algorithm modules ========== - -class EncoderCore -{ - public: - virtual ~EncoderCore() { } - - virtual Algo_CTB_QScale* getAlgoCTBQScale() = 0; - - virtual int getPPS_QP() const = 0; - virtual int getSlice_QPDelta() const { return 0; } -}; - - -class EncoderCore_Custom : public EncoderCore -{ - public: - - void setParams(struct encoder_params& params); - - void registerParams(config_parameters& config) { - mAlgo_CTB_QScale_Constant.registerParams(config); - mAlgo_CB_IntraPartMode_Fixed.registerParams(config); - mAlgo_CB_InterPartMode_Fixed.registerParams(config); - mAlgo_PB_MV_Test.registerParams(config); - mAlgo_PB_MV_Search.registerParams(config); - mAlgo_TB_IntraPredMode_FastBrute.registerParams(config); - mAlgo_TB_IntraPredMode_MinResidual.registerParams(config); - mAlgo_TB_Split_BruteForce.registerParams(config); - } - - virtual Algo_CTB_QScale* getAlgoCTBQScale() { return &mAlgo_CTB_QScale_Constant; } - - virtual int getPPS_QP() const { return mAlgo_CTB_QScale_Constant.getQP(); } - - private: - Algo_CTB_QScale_Constant mAlgo_CTB_QScale_Constant; - - Algo_CB_Split_BruteForce mAlgo_CB_Split_BruteForce; - Algo_CB_Skip_BruteForce mAlgo_CB_Skip_BruteForce; - Algo_CB_IntraInter_BruteForce mAlgo_CB_IntraInter_BruteForce; - - Algo_CB_IntraPartMode_BruteForce mAlgo_CB_IntraPartMode_BruteForce; - Algo_CB_IntraPartMode_Fixed mAlgo_CB_IntraPartMode_Fixed; - - Algo_CB_InterPartMode_Fixed mAlgo_CB_InterPartMode_Fixed; - Algo_CB_MergeIndex_Fixed mAlgo_CB_MergeIndex_Fixed; - - Algo_PB_MV_Test mAlgo_PB_MV_Test; - Algo_PB_MV_Search mAlgo_PB_MV_Search; - - Algo_TB_Split_BruteForce mAlgo_TB_Split_BruteForce; - - 
Algo_TB_IntraPredMode_BruteForce mAlgo_TB_IntraPredMode_BruteForce; - Algo_TB_IntraPredMode_FastBrute mAlgo_TB_IntraPredMode_FastBrute; - Algo_TB_IntraPredMode_MinResidual mAlgo_TB_IntraPredMode_MinResidual; - - Algo_TB_Transform mAlgo_TB_Transform; - Algo_TB_RateEstimation_None mAlgo_TB_RateEstimation_None; - Algo_TB_RateEstimation_Exact mAlgo_TB_RateEstimation_Exact; -}; - - - -double encode_image(encoder_context*, const de265_image* input, EncoderCore&); - -void encode_sequence(encoder_context*); - - -class Logging -{ -public: - virtual ~Logging() { } - - static void print_logging(const encoder_context* ectx, const char* id, const char* filename); - - virtual const char* name() const = 0; - virtual void print(const encoder_context* ectx, const char* filename) = 0; -}; - - -LIBDE265_API void en265_print_logging(const encoder_context* ectx, const char* id, const char* filename); - -#endif
libde265-1.0.17.tar.gz/libde265/encoder/encoder-intrapred.cc
Deleted
@@ -1,340 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#include "intrapred.h" -#include "encoder/encoder-intrapred.h" -#include "encoder/encoder-types.h" -#include "transform.h" -#include "util.h" -#include <assert.h> - - -#include <sys/types.h> -#include <string.h> - - -void fillIntraPredModeCandidates(enum IntraPredMode candModeList[3], - int x,int y, - bool availableA, // left - bool availableB, // top - const CTBTreeMatrix& ctbs, - const seq_parameter_set* sps) -{ - - // block on left side - - enum IntraPredMode candIntraPredModeA, candIntraPredModeB; - - if (availableA==false) { - candIntraPredModeA=INTRA_DC; - } - else { - const enc_cb* cbL = ctbs.getCB(x-1,y); - assert(cbL != NULL); - - if (cbL->PredMode != MODE_INTRA || - cbL->pcm_flag) { - candIntraPredModeA=INTRA_DC; - } - else { - const enc_tb* tbL = cbL->getTB(x-1,y); - assert(tbL); - candIntraPredModeA = tbL->intra_mode; - } - } - - // block above - - if (availableB==false) { - candIntraPredModeB=INTRA_DC; - } - else { - const enc_cb* cbA = ctbs.getCB(x,y-1); - assert(cbA != NULL); - - if (cbA->PredMode != MODE_INTRA || - cbA->pcm_flag) { - candIntraPredModeB=INTRA_DC; - } - else if (y-1 < ((y >> sps->Log2CtbSizeY) << sps->Log2CtbSizeY)) { - 
candIntraPredModeB=INTRA_DC; - } - else { - const enc_tb* tbA = cbA->getTB(x,y-1); - assert(tbA); - - candIntraPredModeB = tbA->intra_mode; - } - } - - - logtrace(LogSlice,"%d;%d candA:%d / candB:%d\n", x,y, - availableA ? candIntraPredModeA : -999, - availableB ? candIntraPredModeB : -999); - - - fillIntraPredModeCandidates(candModeList, - candIntraPredModeA, - candIntraPredModeB); -} - - -template <class pixel_t> -class intra_border_computer_ctbtree : public intra_border_computer<pixel_t> -{ -public: - void fill_from_ctbtree(const enc_tb* tb, - const CTBTreeMatrix& ctbs); -}; - - -template <class pixel_t> -void intra_border_computer_ctbtree<pixel_t>::fill_from_ctbtree(const enc_tb* blkTb, - const CTBTreeMatrix& ctbs) -{ - int xBLuma = this->xB * this->SubWidth; - int yBLuma = this->yB * this->SubHeight; - - int currBlockAddr = this->pps->MinTbAddrZS[ (xBLuma >> this->sps->Log2MinTrafoSize) + - (yBLuma >> this->sps->Log2MinTrafoSize) * this->sps->PicWidthInTbsY ]; - - - // copy pixels at left column - - for (int y=this->nBottom-1 ; y>=0 ; y-=4) - if (this->availableLeft) - { - int NBlockAddr = this->pps->MinTbAddrZS[ (((this->xB-1)*this->SubWidth )>>this->sps->Log2MinTrafoSize) + - (((this->yB+y)*this->SubHeight)>>this->sps->Log2MinTrafoSize) - * this->sps->PicWidthInTbsY ]; - - bool availableN = NBlockAddr <= currBlockAddr; - - int xN = this->xB-1; - int yN = this->yB+y; - - const enc_cb* cb = ctbs.getCB(xN*this->SubWidth, yN*this->SubHeight); - - if (this->pps->constrained_intra_pred_flag) { - if (cb->PredMode != MODE_INTRA) - availableN = false; - } - - if (availableN) { - PixelAccessor pa = cb->transform_tree->getPixels(xN,yN, this->cIdx, *this->sps); - - if (!this->nAvail) this->firstValue = pa[this->yB+y][this->xB-1]; - - for (int i=0;i<4;i++) { - this->available[-y+i-1] = availableN; - this->out_border[-y+i-1] = pa[this->yB+y-i][this->xB-1]; - } - - this->nAvail+=4; - } - } - - // copy pixel at top-left position - - if (this->availableTopLeft) - { - int NBlockAddr = 
this->pps->MinTbAddrZS[ (((this->xB-1)*this->SubWidth )>>this->sps->Log2MinTrafoSize) + - (((this->yB-1)*this->SubHeight)>>this->sps->Log2MinTrafoSize) - * this->sps->PicWidthInTbsY ]; - - bool availableN = NBlockAddr <= currBlockAddr; - - int xN = this->xB-1; - int yN = this->yB-1; - - const enc_cb* cb = ctbs.getCB(xN*this->SubWidth, yN*this->SubHeight); - - if (this->pps->constrained_intra_pred_flag) { - if (cb->PredMode!=MODE_INTRA) { - availableN = false; - } - } - - if (availableN) { - PixelAccessor pa = cb->transform_tree->getPixels(xN,yN, this->cIdx, *this->sps); - - this->out_border[0] = pa[this->yB-1][this->xB-1]; - this->available[0] = availableN; - - if (!this->nAvail) this->firstValue = this->out_border[0]; - this->nAvail++; - } - } - - - // copy pixels at top row - - for (int x=0 ; x<this->nRight ; x+=4) { - bool borderAvailable; - if (x<this->nT) borderAvailable = this->availableTop; - else borderAvailable = this->availableTopRight; - - if (borderAvailable) - { - int NBlockAddr = this->pps->MinTbAddrZS[ (((this->xB+x)*this->SubWidth )>>this->sps->Log2MinTrafoSize) + - (((this->yB-1)*this->SubHeight)>>this->sps->Log2MinTrafoSize) - * this->sps->PicWidthInTbsY ]; - - bool availableN = NBlockAddr <= currBlockAddr; - - int xN = this->xB+x; - int yN = this->yB-1; - - const enc_cb* cb = ctbs.getCB(xN*this->SubWidth, yN*this->SubHeight); - - if (this->pps->constrained_intra_pred_flag) { - if (cb->PredMode!=MODE_INTRA) { - availableN = false; - } - } - - - if (availableN) { - PixelAccessor pa = cb->transform_tree->getPixels(xN,yN, this->cIdx, *this->sps); - - if (!this->nAvail) this->firstValue = pa[this->yB-1][this->xB+x]; - - for (int i=0;i<4;i++) { - this->out_border[x+i+1] = pa[this->yB-1][this->xB+x+i]; - this->available[x+i+1] = availableN; - } - - this->nAvail+=4; - } - } - } -} - - -// (8.4.4.2.2) -template <class pixel_t> -void fill_border_samples_from_tree(const de265_image* img, - const enc_tb* tb, - const CTBTreeMatrix& ctbs, - int cIdx, - pixel_t* out_border) -{ - 
intra_border_computer_ctbtree<pixel_t> c; - - // xB,yB in component specific resolution - int xB,yB; - int nT = 1<<tb->log2Size; - - xB = tb->x; - yB = tb->y; - - if (img->get_sps().chroma_format_idc == CHROMA_444) { - } - else if (cIdx > 0) { - // TODO: proper chroma handling - xB >>= 1; - yB >>= 1; - nT >>= 1; - - if (tb->log2Size==2) { - xB = tb->parent->x >> 1; - yB = tb->parent->y >> 1; - nT = 4; - } - } - - c.init(out_border, img, nT, cIdx, xB, yB); - c.preproc(); - c.fill_from_ctbtree(tb, ctbs); - c.reference_sample_substitution(); -} - - - -template <class pixel_t> -void decode_intra_prediction_from_tree_internal(const de265_image* img, - const enc_tb* tb, - const CTBTreeMatrix& ctbs, - const seq_parameter_set& sps, - int cIdx) -{ - enum IntraPredMode intraPredMode; - if (cIdx==0) intraPredMode = tb->intra_mode; - else intraPredMode = tb->intra_mode_chroma; - - pixel_t* dst = tb->intra_prediction[cIdx]->get_buffer<pixel_t>(); - int dstStride = tb->intra_prediction[cIdx]->getStride(); - - pixel_t border_pixels_mem[4*MAX_INTRA_PRED_BLOCK_SIZE+1]; - pixel_t* border_pixels = &border_pixels_mem[2*MAX_INTRA_PRED_BLOCK_SIZE]; - - fill_border_samples_from_tree(img, tb, ctbs, cIdx, border_pixels); - - if (cIdx==0) { - // memcpy(tb->debug_intra_border, border_pixels_mem, 2*64+1); - } - - int nT = 1<<tb->log2Size; - if (cIdx>0 && tb->log2Size>2 && sps.chroma_format_idc == CHROMA_420) { - nT >>= 1; // TODO: 4:2:2 - } - - - if (sps.range_extension.intra_smoothing_disabled_flag == 0 && - (cIdx==0 || sps.ChromaArrayType==CHROMA_444)) - { - intra_prediction_sample_filtering(sps, border_pixels, nT, cIdx, intraPredMode); - } - - - switch (intraPredMode) { - case INTRA_PLANAR: - intra_prediction_planar(dst,dstStride, nT, cIdx, border_pixels); - break; - case INTRA_DC: - intra_prediction_DC(dst,dstStride, nT, cIdx, border_pixels); - break; - default: - { - //int bit_depth = img->get_bit_depth(cIdx); - int bit_depth = 8; // TODO - - bool disableIntraBoundaryFilter = - 
(sps.range_extension.implicit_rdpcm_enabled_flag && - tb->cb->cu_transquant_bypass_flag); - - intra_prediction_angular(dst,dstStride, bit_depth,disableIntraBoundaryFilter, - tb->x,tb->y,intraPredMode,nT,cIdx, border_pixels); - } - break; - } -} - - -void decode_intra_prediction_from_tree(const de265_image* img, - const enc_tb* tb, - const CTBTreeMatrix& ctbs, - const seq_parameter_set& sps, - int cIdx) -{ - // TODO: high bit depths - - decode_intra_prediction_from_tree_internal<uint8_t>(img ,tb, ctbs, sps, cIdx); -}
libde265-1.0.17.tar.gz/libde265/encoder/encoder-intrapred.h
Deleted
@@ -1,40 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#ifndef DE265_ENCODER_INTRAPRED_H -#define DE265_ENCODER_INTRAPRED_H - -#include "libde265/decctx.h" - - -void fillIntraPredModeCandidates(enum IntraPredMode candModeList[3], - int x,int y, - bool availableA, // left - bool availableB, // top - const class CTBTreeMatrix& ctbs, - const seq_parameter_set* sps); - -void decode_intra_prediction_from_tree(const de265_image* img, - const class enc_tb* tb, - const class CTBTreeMatrix& ctbs, - const class seq_parameter_set& sps, - int cIdx); - -#endif
libde265-1.0.17.tar.gz/libde265/encoder/encoder-motion.cc
Deleted
@@ -1,80 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#include "encoder/encoder-motion.h" -#include "encoder/encoder-context.h" -#include "decctx.h" -#include "util.h" -#include "dpb.h" -#include "motion.h" - -#include <assert.h> - - -#include <sys/types.h> -#include <signal.h> -#include <string.h> - -#if defined(_MSC_VER) || defined(__MINGW32__) -# include <malloc.h> -#elif defined(HAVE_ALLOCA_H) -# include <alloca.h> -#endif - - -class MotionVectorAccess_encoder_context : public MotionVectorAccess -{ -public: - MotionVectorAccess_encoder_context(const encoder_context* e) : ectx(e) { } - - enum PartMode get_PartMode(int x,int y) const override { return ectx->ctbs.getCB(x,y)->PartMode; } - const PBMotion& get_mv_info(int x,int y) const override { return ectx->ctbs.getPB(x,y)->motion; } - -private: - const encoder_context* ectx; -}; - - - -void get_merge_candidate_list_from_tree(encoder_context* ectx, - const slice_segment_header* shdr, - int xC,int yC, int xP,int yP, - int nCS, int nPbW,int nPbH, int partIdx, - PBMotion* mergeCandList) -{ - int max_merge_idx = 5-shdr->five_minus_max_num_merge_cand -1; - - get_merge_candidate_list_without_step_9(ectx, shdr, - MotionVectorAccess_encoder_context(ectx), 
ectx->img, - xC,yC,xP,yP,nCS,nPbW,nPbH, partIdx, - max_merge_idx, mergeCandList); - - // 9. for encoder: modify all merge candidates - - for (int i=0;i<=max_merge_idx;i++) { - if (mergeCandList[i].predFlag[0] && - mergeCandList[i].predFlag[1] && - nPbW+nPbH==12) - { - mergeCandList[i].refIdx[1] = 0; - mergeCandList[i].predFlag[1] = 0; - } - } -}
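Note on step 9 in the removed get_merge_candidate_list_from_tree(): HEVC does not allow bi-prediction for 8x4 and 4x8 prediction blocks, which is what the `nPbW+nPbH==12` test detects. The following is a minimal sketch of that restriction in isolation. `MotionSketch` is a simplified, hypothetical stand-in for libde265's `PBMotion`, and `restrict_small_block_bipred` is a name invented here for illustration.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for PBMotion (illustration only).
struct MotionSketch {
  bool predFlag[2];  // whether reference list 0 / list 1 is used
  int  refIdx[2];    // reference index per list
};

// For 8x4 and 4x8 blocks (width + height == 12), turn every bi-predictive
// candidate into a list-0-only candidate. Other block sizes are untouched.
void restrict_small_block_bipred(MotionSketch* cands, int count,
                                 int nPbW, int nPbH)
{
  if (nPbW + nPbH != 12) {
    return;
  }

  for (int i = 0; i < count; i++) {
    if (cands[i].predFlag[0] && cands[i].predFlag[1]) {
      cands[i].refIdx[1]   = 0;      // same value the removed encoder code wrote
      cands[i].predFlag[1] = false;  // list 1 is no longer used
    }
  }
}
```

In the removed code the loop runs over `max_merge_idx+1` entries, where `max_merge_idx` is `5 - five_minus_max_num_merge_cand - 1`; the sketch takes the count as a plain parameter instead.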
libde265-1.0.17.tar.gz/libde265/encoder/encoder-motion.h
Deleted
@@ -1,32 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#ifndef DE265_ENCODER_MOTION_H -#define DE265_ENCODER_MOTION_H - -#include "libde265/motion.h" - -void get_merge_candidate_list_from_tree(class encoder_context* ectx, - const slice_segment_header* shdr, - int xC,int yC, int xP,int yP, - int nCS, int nPbW,int nPbH, int partIdx, - PBMotion* mergeCandList); - -#endif
libde265-1.0.17.tar.gz/libde265/encoder/encoder-params.cc
Deleted
@@ -1,83 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: struktur AG, Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#include "encoder-params.h" - - - -static std::vector<int> power2range(int low,int high) -{ - std::vector<int> vals; - for (int i=low; i<=high; i*=2) - vals.push_back(i); - return vals; -} - -encoder_params::encoder_params() -{ - //rateControlMethod = RateControlMethod_ConstantQP; - - min_cb_size.set_ID("min-cb-size"); min_cb_size.set_valid_values(power2range(8,64)); min_cb_size.set_default(8); - max_cb_size.set_ID("max-cb-size"); max_cb_size.set_valid_values(power2range(8,64)); max_cb_size.set_default(32); - min_tb_size.set_ID("min-tb-size"); min_tb_size.set_valid_values(power2range(4,32)); min_tb_size.set_default(4); - max_tb_size.set_ID("max-tb-size"); max_tb_size.set_valid_values(power2range(8,32)); max_tb_size.set_default(32); - - max_transform_hierarchy_depth_intra.set_ID("max-transform-hierarchy-depth-intra"); - max_transform_hierarchy_depth_intra.set_range(0,4); - max_transform_hierarchy_depth_intra.set_default(3); - - max_transform_hierarchy_depth_inter.set_ID("max-transform-hierarchy-depth-inter"); - max_transform_hierarchy_depth_inter.set_range(0,4); - 
max_transform_hierarchy_depth_inter.set_default(3); - - sop_structure.set_ID("sop-structure"); - - mAlgo_TB_IntraPredMode.set_ID("TB-IntraPredMode"); - mAlgo_TB_IntraPredMode_Subset.set_ID("TB-IntraPredMode-subset"); - mAlgo_CB_IntraPartMode.set_ID("CB-IntraPartMode"); - - mAlgo_TB_RateEstimation.set_ID("TB-RateEstimation"); - - mAlgo_MEMode.set_ID("MEMode"); -} - - -void encoder_params::registerParams(config_parameters& config) -{ - config.add_option(&min_cb_size); - config.add_option(&max_cb_size); - config.add_option(&min_tb_size); - config.add_option(&max_tb_size); - config.add_option(&max_transform_hierarchy_depth_intra); - config.add_option(&max_transform_hierarchy_depth_inter); - - config.add_option(&sop_structure); - - config.add_option(&mAlgo_TB_IntraPredMode); - config.add_option(&mAlgo_TB_IntraPredMode_Subset); - config.add_option(&mAlgo_CB_IntraPartMode); - - config.add_option(&mAlgo_MEMode); - config.add_option(&mAlgo_TB_RateEstimation); - - mSOP_LowDelay.registerParams(config); -}
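Note on the removed encoder_params constructor: the valid values for the block-size options come from the static helper `power2range`, which doubles from `low` up to `high`. The sketch below restates that loop as a standalone function. The name `power2range_sketch`, the guard for non-positive `low`, and the wider loop counter are assumptions added here and were not in the original.

```cpp
#include <cassert>
#include <vector>

// Standalone restatement of the doubling loop (illustration only).
// Returns low, 2*low, 4*low, ... for every value that does not exceed high.
std::vector<int> power2range_sketch(int low, int high)
{
  std::vector<int> vals;

  // Guard added for this sketch: with low <= 0 the doubling never advances.
  if (low <= 0) {
    return vals;
  }

  // A 64-bit counter avoids signed overflow when high is near INT_MAX.
  for (long long i = low; i <= high; i *= 2) {
    vals.push_back(static_cast<int>(i));
  }

  return vals;
}
```

For example, `power2range_sketch(8, 64)` yields 8, 16, 32, 64, which matches the values accepted for `min-cb-size` and `max-cb-size`, and `power2range_sketch(4, 32)` yields 4, 8, 16, 32 for `min-tb-size`.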
libde265-1.0.17.tar.gz/libde265/encoder/encoder-params.h
Deleted
@@ -1,143 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#ifndef ENCODER_PARAMS_H -#define ENCODER_PARAMS_H - -#include "libde265/encoder/encoder-types.h" -#include "libde265/encoder/encoder-core.h" -#include "libde265/encoder/sop.h" - - -enum RateControlMethod - { - RateControlMethod_ConstantQP, - RateControlMethod_ConstantLambda - }; - -enum IntraPredSearch - { - IntraPredSearch_Complete - }; - - -enum SOP_Structure - { - SOP_Intra, - SOP_LowDelay - }; - -class option_SOP_Structure : public choice_option<enum SOP_Structure> -{ - public: - option_SOP_Structure() { - add_choice("intra", SOP_Intra); - add_choice("low-delay", SOP_LowDelay, true); - } -}; - - -enum MEMode - { - MEMode_Test, - MEMode_Search - }; - -class option_MEMode : public choice_option<enum MEMode> -{ - public: - option_MEMode() { - add_choice("test", MEMode_Test, true); - add_choice("search", MEMode_Search); - } -}; - - -struct encoder_params -{ - encoder_params(); - - void registerParams(config_parameters& config); - - - // CB quad-tree - - option_int min_cb_size; - option_int max_cb_size; - - option_int min_tb_size; - option_int max_tb_size; - - option_int 
max_transform_hierarchy_depth_intra; - option_int max_transform_hierarchy_depth_inter; - - - option_SOP_Structure sop_structure; - - sop_creator_trivial_low_delay::params mSOP_LowDelay; - - - // --- Algo_TB_IntraPredMode - - option_ALGO_TB_IntraPredMode mAlgo_TB_IntraPredMode; - option_ALGO_TB_IntraPredMode_Subset mAlgo_TB_IntraPredMode_Subset; - - //Algo_TB_IntraPredMode_FastBrute::params TB_IntraPredMode_FastBrute; - //Algo_TB_IntraPredMode_MinResidual::params TB_IntraPredMode_MinResidual; - - - // --- Algo_TB_Split_BruteForce - - //Algo_TB_Split_BruteForce::params TB_Split_BruteForce; - - - // --- Algo_CB_IntraPartMode - - option_ALGO_CB_IntraPartMode mAlgo_CB_IntraPartMode; - - //Algo_CB_IntraPartMode_Fixed::params CB_IntraPartMode_Fixed; - - // --- Algo_CB_Split - - // --- Algo_CTB_QScale - - //Algo_CTB_QScale_Constant::params CTB_QScale_Constant; - - option_MEMode mAlgo_MEMode; - - - // intra-prediction - - enum IntraPredSearch intraPredSearch; - - - // rate-control - - enum RateControlMethod rateControlMethod; - option_ALGO_TB_RateEstimation mAlgo_TB_RateEstimation; - - //int constant_QP; - //int lambda; -}; - - -#endif
libde265-1.0.17.tar.gz/libde265/encoder/encoder-syntax.cc
Deleted
@@ -1,1730 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: struktur AG, Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#include "encoder-syntax.h" -#include "encoder-context.h" -#include "encoder-intrapred.h" -#include "slice.h" -#include "scan.h" -#include "intrapred.h" -#include "libde265/transform.h" -#include "libde265/fallback-dct.h" -#include <iostream> - - -#ifdef DE265_LOG_DEBUG -#define ESTIM_BITS_BEGIN \ - CABAC_encoder_estim* log_estim; \ - float log_bits_pre = 0; \ - if (logdebug_enabled(LogEncoder)) { \ - log_estim = dynamic_cast<CABAC_encoder_estim*>(cabac); \ - if (log_estim) { \ - log_bits_pre = log_estim->getRDBits(); \ - } \ - } - -#define ESTIM_BITS_END(name) \ - if (logdebug_enabled(LogEncoder)) { \ - if (log_estim) { \ - float bits_post = log_estim->getRDBits(); \ - printf("%s=%f\n",name,bits_post - log_bits_pre); \ - } \ - } -#else -#define ESTIM_BITS_BEGIN -#define ESTIM_BITS_END(name) -#endif - - - -static void internal_recursive_cbfChroma_rate(CABAC_encoder_estim* cabac, - enc_tb* tb, int log2TrafoSize, int trafoDepth) -{ - // --- CBF CB/CR --- - - // For 4x4 luma, there is no signaling of chroma CBF, because only the - // chroma CBF for 8x8 is relevant. 
- if (log2TrafoSize>2) { - if (trafoDepth==0 || tb->parent->cbf[1]) { - encode_cbf_chroma(cabac, trafoDepth, tb->cbf[1]); - } - if (trafoDepth==0 || tb->parent->cbf[2]) { - encode_cbf_chroma(cabac, trafoDepth, tb->cbf[2]); - } - } - - if (tb->split_transform_flag) { - for (int i=0;i<4;i++) { - internal_recursive_cbfChroma_rate(cabac, tb->children[i], log2TrafoSize-1, trafoDepth+1); - } - } -} - - -float recursive_cbfChroma_rate(CABAC_encoder_estim* cabac, - enc_tb* tb, int log2TrafoSize, int trafoDepth) -{ - float bits_pre = cabac->getRDBits(); - - internal_recursive_cbfChroma_rate(cabac, tb, log2TrafoSize, trafoDepth); - - float bits_post = cabac->getRDBits(); - - return bits_post - bits_pre; -} - - - -void encode_split_cu_flag(encoder_context* ectx, - CABAC_encoder* cabac, - int x0, int y0, int ctDepth, int split_flag) -{ - logtrace(LogSymbols,"$1 split_cu_flag=%d\n",split_flag); - - // check if neighbors are available - - int availableL = check_CTB_available(ectx->img, x0,y0, x0-1,y0); - int availableA = check_CTB_available(ectx->img, x0,y0, x0,y0-1); - - int condL = 0; - int condA = 0; - - if (availableL && ectx->ctbs.getCB(x0-1,y0)->ctDepth > ctDepth) condL=1; - if (availableA && ectx->ctbs.getCB(x0,y0-1)->ctDepth > ctDepth) condA=1; - - int contextOffset = condL + condA; - int context = contextOffset; - - // decode bit - - logtrace(LogSlice,"> split_cu_flag = %d (context=%d)\n",split_flag,context); - - cabac->write_CABAC_bit(CONTEXT_MODEL_SPLIT_CU_FLAG + context, split_flag); -} - - -void encode_part_mode(encoder_context* ectx, - CABAC_encoder* cabac, - enum PredMode PredMode, enum PartMode PartMode, int cLog2CbSize) -{ - logtrace(LogSymbols,"$1 part_mode=%d\n",PartMode); - logtrace(LogSlice,"> part_mode = %d\n",PartMode); - - if (PredMode == MODE_INTRA) { - int bin = (PartMode==PART_2Nx2N); - cabac->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+0, bin); - } - else { - if (PartMode==PART_2Nx2N) { - cabac->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+0, 1); - return; - } - else { 
- cabac->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+0, 0); - } - - if (cLog2CbSize > ectx->get_sps().Log2MinCbSizeY) { - if (ectx->get_sps().amp_enabled_flag) { - switch (PartMode) { - case PART_2NxN: - cabac->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+1, 1); - cabac->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+3, 1); - break; - case PART_Nx2N: - cabac->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+1, 0); - cabac->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+3, 1); - break; - case PART_2NxnU: - cabac->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+1, 1); - cabac->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+3, 0); - cabac->write_CABAC_bypass(0); - break; - case PART_2NxnD: - cabac->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+1, 1); - cabac->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+3, 0); - cabac->write_CABAC_bypass(1); - break; - case PART_nLx2N: - cabac->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+1, 0); - cabac->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+3, 0); - cabac->write_CABAC_bypass(0); - break; - case PART_nRx2N: - cabac->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+1, 0); - cabac->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+3, 0); - cabac->write_CABAC_bypass(1); - break; - case PART_NxN: - case PART_2Nx2N: - assert(false); - break; - } - } - else { - if (PartMode==PART_2NxN) { - cabac->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+1, 1); - } - else { - assert(PartMode==PART_Nx2N); - cabac->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+1, 0); - } - } - } - else { - if (PartMode==PART_2NxN) { - cabac->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+1, 1); - } - else { - cabac->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+1, 0); - - if (cLog2CbSize==3) { - assert(PartMode==PART_Nx2N); - } - else { - if (PartMode==PART_Nx2N) { - cabac->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+3, 1); - } - else { - assert(PartMode==PART_NxN); - cabac->write_CABAC_bit(CONTEXT_MODEL_PART_MODE+3, 0); - } - } - } - } - } -} - - -static void encode_pred_mode_flag(encoder_context* ectx, - CABAC_encoder* cabac, - enum PredMode PredMode) -{ - logtrace(LogSlice,"> 
pred_mode = %d\n",PredMode); - - int flag = (PredMode == MODE_INTRA) ? 1 : 0; - - logtrace(LogSymbols,"$1 pred_mode=%d\n",flag); - - cabac->write_CABAC_bit(CONTEXT_MODEL_PRED_MODE_FLAG, flag); -} - - -static void encode_prev_intra_luma_pred_flag(encoder_context* ectx, - CABAC_encoder* cabac, - int intraPred) -{ - logtrace(LogSymbols,"$1 prev_intra_luma_pred_flag=%d\n",intraPred>=0); - int bin = (intraPred>=0); - - logtrace(LogSlice,"> prev_intra_luma_pred_flag = %d\n",bin); - - cabac->write_CABAC_bit(CONTEXT_MODEL_PREV_INTRA_LUMA_PRED_FLAG, bin); -} - -static void encode_intra_mpm_or_rem(encoder_context* ectx, - CABAC_encoder* cabac, - int intraPred) -{ - if (intraPred>=0) { - logtrace(LogSymbols,"$1 mpm_idx=%d\n",intraPred); - logtrace(LogSlice,"> mpm_idx = %d\n",intraPred); - assert(intraPred<=2); - cabac->write_CABAC_TU_bypass(intraPred, 2); - } - else { - logtrace(LogSymbols,"$1 rem_intra_luma_pred_mode=%d\n",-intraPred-1); - logtrace(LogSlice,"> rem_intra_luma_pred_mode = %d\n",-intraPred-1); - cabac->write_CABAC_FL_bypass(-intraPred-1, 5); - } -} - - -static void encode_intra_chroma_pred_mode(encoder_context* ectx, - CABAC_encoder* cabac, - int mode) -{ - logtrace(LogSymbols,"$1 intra_chroma_pred_mode=%d\n",mode); - logtrace(LogSlice,"> intra_chroma_pred_mode = %d\n",mode); - - if (mode==4) { - cabac->write_CABAC_bit(CONTEXT_MODEL_INTRA_CHROMA_PRED_MODE,0); - } - else { - assert(mode<4); - - cabac->write_CABAC_bit(CONTEXT_MODEL_INTRA_CHROMA_PRED_MODE,1); - cabac->write_CABAC_FL_bypass(mode, 2); - } -} - - -/* Optimized variant that tests most likely branch first. 
- */ -enum IntraChromaPredMode find_chroma_pred_mode(enum IntraPredMode chroma_mode, - enum IntraPredMode luma_mode) -{ - // most likely mode: chroma mode = luma mode - - if (luma_mode==chroma_mode) { - return INTRA_CHROMA_LIKE_LUMA; - } - - - // check remaining candidates - - IntraPredMode mode = chroma_mode; - - // angular-34 is coded by setting the coded mode equal to the luma_mode - if (chroma_mode == INTRA_ANGULAR_34) { - mode = luma_mode; - } - - switch (mode) { - case INTRA_PLANAR: return INTRA_CHROMA_PLANAR_OR_34; - case INTRA_ANGULAR_26: return INTRA_CHROMA_ANGULAR_26_OR_34; - case INTRA_ANGULAR_10: return INTRA_CHROMA_ANGULAR_10_OR_34; - case INTRA_DC: return INTRA_CHROMA_DC_OR_34; - default: - assert(false); - return INTRA_CHROMA_DC_OR_34; - } -} - - - -void encode_split_transform_flag(encoder_context* ectx, - CABAC_encoder* cabac, - int log2TrafoSize, int split_flag) -{ - logtrace(LogSymbols,"$1 split_transform_flag=%d\n",split_flag); - logtrace(LogSlice,"> split_transform_flag = %d\n",split_flag); - - int context = 5-log2TrafoSize; - assert(context >= 0 && context <= 2); - - cabac->write_CABAC_bit(CONTEXT_MODEL_SPLIT_TRANSFORM_FLAG + context, split_flag); -} - - -void encode_cbf_luma(CABAC_encoder* cabac, - bool zeroTrafoDepth, int cbf_luma) -{ - logtrace(LogSymbols,"$1 cbf_luma=%d\n",cbf_luma); - logtrace(LogSlice,"> cbf_luma = %d\n",cbf_luma); - - int context = (zeroTrafoDepth ? 
1 : 0); - - cabac->write_CABAC_bit(CONTEXT_MODEL_CBF_LUMA + context, cbf_luma); -} - - -void encode_cbf_chroma(CABAC_encoder* cabac, - int trafoDepth, int cbf_chroma) -{ - logtrace(LogSymbols,"$1 cbf_chroma=%d\n",cbf_chroma); - logtrace(LogSlice,"> cbf_chroma = %d\n",cbf_chroma); - - int context = trafoDepth; - assert(context >= 0 && context <= 3); - - cabac->write_CABAC_bit(CONTEXT_MODEL_CBF_CHROMA + context, cbf_chroma); -} - -static inline void encode_coded_sub_block_flag(encoder_context* ectx, - CABAC_encoder* cabac, - int cIdx, - uint8_t coded_sub_block_neighbors, - int flag) -{ - logtrace(LogSymbols,"$1 coded_sub_block_flag=%d\n",flag); - logtrace(LogSlice,"# coded_sub_block_flag = %d\n",flag); - - // tricky computation of csbfCtx - int csbfCtx = ((coded_sub_block_neighbors & 1) | // right neighbor set or - (coded_sub_block_neighbors >> 1)); // bottom neighbor set -> csbfCtx=1 - - int ctxIdxInc = csbfCtx; - if (cIdx!=0) { - ctxIdxInc += 2; - } - - cabac->write_CABAC_bit(CONTEXT_MODEL_CODED_SUB_BLOCK_FLAG + ctxIdxInc, flag); -} - -static inline void encode_significant_coeff_flag_lookup(encoder_context* ectx, - CABAC_encoder* cabac, - uint8_t ctxIdxInc, - int significantFlag) -{ - logtrace(LogSymbols,"$1 significant_coeff_flag=%d\n",significantFlag); - logtrace(LogSlice,"# significant_coeff_flag = significantFlag\n"); - logtrace(LogSlice,"context: %d\n",ctxIdxInc); - - cabac->write_CABAC_bit(CONTEXT_MODEL_SIGNIFICANT_COEFF_FLAG + ctxIdxInc, significantFlag); -} - -static inline void encode_coeff_abs_level_greater1(encoder_context* ectx, - CABAC_encoder* cabac, - int cIdx, int i, - bool firstCoeffInSubblock, - bool firstSubblock, - int lastSubblock_greater1Ctx, - int* lastInvocation_greater1Ctx, - int* lastInvocation_coeff_abs_level_greater1_flag, - int* lastInvocation_ctxSet, int c1, - int value) -{ - logtrace(LogSymbols,"$1 coeff_abs_level_greater1=%d\n",value); - logtrace(LogSlice,"# coeff_abs_level_greater1 = %d\n",value); - - logtrace(LogSlice," cIdx:%d 
i:%d firstCoeffInSB:%d firstSB:%d lastSB>1:%d last>1Ctx:%d lastLev>1:%d lastCtxSet:%d\n", cIdx,i,firstCoeffInSubblock,firstSubblock,lastSubblock_greater1Ctx, - *lastInvocation_greater1Ctx, - *lastInvocation_coeff_abs_level_greater1_flag, - *lastInvocation_ctxSet); - - int lastGreater1Ctx; - int greater1Ctx; - int ctxSet; - - logtrace(LogSlice,"c1: %d\n",c1); - - if (firstCoeffInSubblock) { - // block with real DC -> ctx 0 - if (i==0 || cIdx>0) { ctxSet=0; } - else { ctxSet=2; } - - if (firstSubblock) { lastGreater1Ctx=1; } - else { lastGreater1Ctx = lastSubblock_greater1Ctx; } - - if (lastGreater1Ctx==0) { ctxSet++; } - - logtrace(LogSlice,"ctxSet: %d\n",ctxSet); - - greater1Ctx=1; - } - else { // !firstCoeffInSubblock - ctxSet = *lastInvocation_ctxSet; - logtrace(LogSlice,"ctxSet (old): %d\n",ctxSet); - - greater1Ctx = *lastInvocation_greater1Ctx; - if (greater1Ctx>0) { - int lastGreater1Flag=*lastInvocation_coeff_abs_level_greater1_flag; - if (lastGreater1Flag==1) greater1Ctx=0; - else { /*if (greater1Ctx>0)*/ greater1Ctx++; } - } - } - - ctxSet = c1; // use HM algo - - int ctxIdxInc = (ctxSet*4) + (greater1Ctx>=3 ? 
3 : greater1Ctx); - - if (cIdx>0) { ctxIdxInc+=16; } - - cabac->write_CABAC_bit(CONTEXT_MODEL_COEFF_ABS_LEVEL_GREATER1_FLAG + ctxIdxInc, value); - - *lastInvocation_greater1Ctx = greater1Ctx; - *lastInvocation_coeff_abs_level_greater1_flag = value; - *lastInvocation_ctxSet = ctxSet; -} - -static void encode_coeff_abs_level_greater2(encoder_context* ectx, - CABAC_encoder* cabac, - int cIdx, // int i,int n, - int ctxSet, - int value) -{ - logtrace(LogSymbols,"$1 coeff_abs_level_greater2=%d\n",value); - logtrace(LogSlice,"# coeff_abs_level_greater2 = %d\n",value); - - int ctxIdxInc = ctxSet; - - if (cIdx>0) ctxIdxInc+=4; - - cabac->write_CABAC_bit(CONTEXT_MODEL_COEFF_ABS_LEVEL_GREATER2_FLAG + ctxIdxInc, value); -} - - -bool TU(int val, int maxi) -{ - for (int i=0;i<val;i++) { - printf("1"); - } - if (val<maxi) { printf("0"); return false; } - else return true; -} - -void bin(int val, int bits) -{ - for (int i=0;i<bits;i++) { - int bit = (1<<(bits-1-i)); - if (val&bit) printf("1"); else printf("0"); - } -} - -void ExpG(int level, int riceParam) -{ - int prefix = level >> riceParam; - int suffix = level - (prefix<<riceParam); - - //printf("%d %d ",prefix,suffix); - - int base=0; - int range=1; - int nBits=0; - while (prefix >= base+range) { - printf("1"); - base+=range; - range*=2; - nBits++; - } - - printf("0."); - bin(prefix-base, nBits); - printf(":"); - bin(suffix,riceParam); -} - -int blamain() -{ - int riceParam=2; - int TRMax = 4<<riceParam; - - for (int level=0;level<128;level++) - { - printf("%d: ",level); - - int prefixPart = std::min(TRMax, level); - - // code TR prefix - - bool isMaxi = TU(prefixPart>>riceParam, TRMax>>riceParam); - printf(":"); - if (TRMax>prefixPart) { - int remain = prefixPart & ((1<<riceParam)-1); - bin(remain, riceParam); - } - printf("|"); - - if (isMaxi) { - ExpG(level-TRMax, riceParam+1); - } - - printf("\n"); - } - - return 0; -} - - -static void encode_coeff_abs_level_remaining(encoder_context* ectx, - CABAC_encoder* cabac, - int 
cRiceParam, - int level) -{ - logtrace(LogSymbols,"$1 coeff_abs_level_remaining=%d\n",level); - logtrace(LogSlice,"# encode_coeff_abs_level_remaining = %d\n",level); - - int cTRMax = 4<<cRiceParam; - int prefixPart = std::min(level, cTRMax); - - // --- code prefix with TR --- - - // TU part, length 4 (cTRMax>>riceParam) - - int nOnes = (prefixPart>>cRiceParam); - cabac->write_CABAC_TU_bypass(nOnes, 4); - - // TR suffix - - if (cTRMax > prefixPart) { - int remain = prefixPart & ((1<<cRiceParam)-1); - cabac->write_CABAC_FL_bypass(remain, cRiceParam); - } - - - // --- remainder suffix --- - - if (nOnes==4) { - int remain = level-cTRMax; - int ExpGRiceParam = cRiceParam+1; - - int prefix = remain >> ExpGRiceParam; - int suffix = remain - (prefix<<ExpGRiceParam); - - int base=0; - int range=1; - int nBits=0; - while (prefix >= base+range) { - cabac->write_CABAC_bypass(1); - base+=range; - range*=2; - nBits++; - } - - cabac->write_CABAC_bypass(0); - cabac->write_CABAC_FL_bypass(prefix-base, nBits); - cabac->write_CABAC_FL_bypass(suffix, ExpGRiceParam); - } -} - -// --------------------------------------------------------------------------- - -void findLastSignificantCoeff(const position* sbScan, const position* cScan, - const int16_t* coeff, int log2TrafoSize, - int* lastSignificantX, int* lastSignificantY, - int* lastSb, int* lastPos) -{ - int nSb = 1<<((log2TrafoSize-2)<<1); // number of sub-blocks - - // find last significant coefficient - - for (int i=nSb ; i-->0 ;) { - int x0 = sbScan[i].x << 2; - int y0 = sbScan[i].y << 2; - for (int c=16 ; c-->0 ;) { - int x = x0 + cScan[c].x; - int y = y0 + cScan[c].y; - - if (coeff[x+(y<<log2TrafoSize)]) { - *lastSignificantX = x; - *lastSignificantY = y; - *lastSb = i; - *lastPos= c; - - logtrace(LogSlice,"last significant coeff at: %d;%d, Sb:%d Pos:%d\n", x,y,i,c); - - return; - } - } - } - - // all coefficients == 0 ? 
cannot be since cbf should be false in this case - assert(false); -} - - -bool subblock_has_nonzero_coefficient(const int16_t* coeff, int coeffStride, - const position& sbPos) -{ - int x0 = sbPos.x << 2; - int y0 = sbPos.y << 2; - - coeff += x0 + y0*coeffStride; - - for (int y=0;y<4;y++) { - if (coeff[0] || coeff[1] || coeff[2] || coeff[3]) { return true; } - coeff += coeffStride; - } - - return false; -} - -/* - Example 16x16: prefix in [0;7] - - prefix | last pos - =============|============= - 0 | 0 - 1 | 1 - 2 | 2 - 3 | 3 - -------------+------------- - lsb nBits | - 4 0 1 | 4, 5 - 5 1 1 | 6, 7 - 6 0 2 | 8, 9,10,11 - 7 1 2 | 12,13,14,15 -*/ -void encode_last_signficiant_coeff_prefix(encoder_context* ectx, - CABAC_encoder* cabac, - int log2TrafoSize, - int cIdx, int lastSignificant, - int context_model_index) -{ - logtrace(LogSlice,"> last_significant_coeff_prefix=%d log2TrafoSize:%d cIdx:%d\n", - lastSignificant,log2TrafoSize,cIdx); - - int cMax = (log2TrafoSize<<1)-1; - - int ctxOffset, ctxShift; - if (cIdx==0) { - ctxOffset = 3*(log2TrafoSize-2) + ((log2TrafoSize-1)>>2); - ctxShift = (log2TrafoSize+1)>>2; - } - else { - ctxOffset = 15; - ctxShift = log2TrafoSize-2; - } - - for (int binIdx=0;binIdx<lastSignificant;binIdx++) - { - int ctxIdxInc = (binIdx >> ctxShift); - cabac->write_CABAC_bit(context_model_index + ctxOffset + ctxIdxInc, 1); - } - - if (lastSignificant != cMax) { - int binIdx = lastSignificant; - int ctxIdxInc = (binIdx >> ctxShift); - cabac->write_CABAC_bit(context_model_index + ctxOffset + ctxIdxInc, 0); - } -} - - -void split_last_significant_position(int pos, int* prefix, int* suffix, int* nSuffixBits) -{ - logtrace(LogSlice,"split position %d : ",pos); - - // most frequent case - - if (pos<=3) { - *prefix=pos; - *suffix=-1; // just to have some defined value - *nSuffixBits=0; - logtrace(LogSlice,"prefix=%d suffix=%d (%d bits)\n",*prefix,*suffix,*nSuffixBits); - return; - } - - pos -= 4; - int nBits=1; - int range=4; - while (pos>=range) { - nBits++; 
- pos-=range; - range<<=1; - } - - *prefix = (1+nBits)<<1; - if (pos >= (range>>1)) { - *prefix |= 1; - pos -= (range>>1); - } - *suffix = pos; - *nSuffixBits = nBits; - - logtrace(LogSlice,"prefix=%d suffix=%d (%d bits)\n",*prefix,*suffix,*nSuffixBits); -} - - -extern uint8_t* ctxIdxLookup[4 /* 4-log2-32 */][2 /* !!cIdx */][2 /* !!scanIdx */][4 /* prevCsbf */]; - -/* These values are read from the image metadata: - - intra prediction mode (x0;y0) - */ -void encode_residual(encoder_context* ectx, - CABAC_encoder* cabac, - const enc_tb* tb, const enc_cb* cb, - int x0,int y0,int log2TrafoSize,int cIdx) -{ - logdebug(LogEncoder,"encode_residual %s\n",typeid(*cabac).name()); - - const de265_image* img = ectx->img; - const seq_parameter_set& sps = img->get_sps(); - const pic_parameter_set& pps = img->get_pps(); - - int16_t* coeff = tb->coeff[cIdx]; - - if (pps.transform_skip_enabled_flag && true /* TODO */) { - } - - -#if 1 - logdebug(LogEncoder,"write coefficients\n"); - for (int y=0;y<(1<<log2TrafoSize);y++) - { - for (int x=0;x<(1<<log2TrafoSize);x++) - { - logdebug(LogEncoder,"*%4d ",coeff[x+y*(1<<log2TrafoSize)]); - } - logdebug(LogEncoder,"*\n"); - } -#endif - - - // --- get scan orders --- - - enum PredMode PredMode = cb->PredMode; - int scanIdx; - - if (PredMode == MODE_INTRA) { - if (cIdx==0) { - scanIdx = get_intra_scan_idx(log2TrafoSize, tb->intra_mode, cIdx, &sps); - //printf("luma scan idx=%d <- intra mode=%d\n",scanIdx, tb->intra_mode); - } - else { - scanIdx = get_intra_scan_idx(log2TrafoSize, tb->intra_mode_chroma, cIdx, &sps); - //printf("chroma scan idx=%d <- intra mode=%d chroma:%d trsize:%d\n",scanIdx, - // tb->intra_mode_chroma, sps.chroma_format_idc, 1<<log2TrafoSize); - } - } - else { - scanIdx=0; - } - - - const position* ScanOrderSub = get_scan_order(log2TrafoSize-2, scanIdx); - const position* ScanOrderPos = get_scan_order(2, scanIdx); - - int lastSignificantX, lastSignificantY; - int lastScanPos; - int lastSubBlock; - 
findLastSignificantCoeff(ScanOrderSub, ScanOrderPos, - coeff, log2TrafoSize, - &lastSignificantX, &lastSignificantY, - &lastSubBlock, &lastScanPos); - - int codedSignificantX = lastSignificantX; - int codedSignificantY = lastSignificantY; - - if (scanIdx==2) { - std::swap(codedSignificantX, codedSignificantY); - } - - - - int prefixX, suffixX, suffixBitsX; - int prefixY, suffixY, suffixBitsY; - - split_last_significant_position(codedSignificantX, &prefixX,&suffixX,&suffixBitsX); - split_last_significant_position(codedSignificantY, &prefixY,&suffixY,&suffixBitsY); - - encode_last_signficiant_coeff_prefix(ectx, cabac, log2TrafoSize, cIdx, prefixX, - CONTEXT_MODEL_LAST_SIGNIFICANT_COEFFICIENT_X_PREFIX); - - encode_last_signficiant_coeff_prefix(ectx, cabac, log2TrafoSize, cIdx, prefixY, - CONTEXT_MODEL_LAST_SIGNIFICANT_COEFFICIENT_Y_PREFIX); - - - if (codedSignificantX > 3) { - cabac->write_CABAC_FL_bypass(suffixX, suffixBitsX); - } - if (codedSignificantY > 3) { - cabac->write_CABAC_FL_bypass(suffixY, suffixBitsY); - } - - - - int sbWidth = 1<<(log2TrafoSize-2); - int CoeffStride = 1<<log2TrafoSize; - - uint8_t coded_sub_block_neighbors[32/4*32/4]; // 64*2 flags - memset(coded_sub_block_neighbors,0,sbWidth*sbWidth); - - int c1 = 1; - bool firstSubblock = true; // for coeff_abs_level_greater1_flag context model - int lastSubblock_greater1Ctx=false; /* for coeff_abs_level_greater1_flag context model - (initialization not strictly needed) - */ - - int lastInvocation_greater1Ctx=0; - int lastInvocation_coeff_abs_level_greater1_flag=0; - int lastInvocation_ctxSet=0; - - - - // ----- encode coefficients ----- - - //tctx->nCoeff[cIdx] = 0; - - - // i - subblock index - // n - coefficient index in subblock - - for (int i=lastSubBlock;i>=0;i--) { - position S = ScanOrderSub[i]; - int inferSbDcSigCoeffFlag=0; - - logtrace(LogSlice,"sub block scan idx: %d\n",i); - - - // --- check whether this sub-block has to be coded --- - - int sub_block_is_coded = 0; - - if ((i<lastSubBlock) && 
(i>0)) { - sub_block_is_coded = subblock_has_nonzero_coefficient(coeff, CoeffStride, S); - encode_coded_sub_block_flag(ectx, cabac, cIdx, - coded_sub_block_neighbors[S.x+S.y*sbWidth], - sub_block_is_coded); - inferSbDcSigCoeffFlag=1; - } - else if (i==0 || i==lastSubBlock) { - // first (DC) and last sub-block are always coded - // - the first will most probably contain coefficients - // - the last obviously contains the last coded coefficient - - sub_block_is_coded = 1; - } - - if (sub_block_is_coded) { - if (S.x > 0) coded_sub_block_neighbors[S.x-1 + S.y *sbWidth] |= 1; - if (S.y > 0) coded_sub_block_neighbors[S.x + (S.y-1)*sbWidth] |= 2; - } - - logtrace(LogSlice,"subblock is coded: %s\n", sub_block_is_coded ? "yes":"no"); - - - // --- write significant coefficient flags --- - - int16_t coeff_value[16]; - int16_t coeff_baseLevel[16]; - int8_t coeff_scan_pos[16]; - int8_t coeff_sign[16]; - int8_t coeff_has_max_base_level[16]; - int nCoefficients=0; - - - if (sub_block_is_coded) { - int x0 = S.x<<2; - int y0 = S.y<<2; - - int log2w = log2TrafoSize-2; - int prevCsbf = coded_sub_block_neighbors[S.x+S.y*sbWidth]; - uint8_t* ctxIdxMap = ctxIdxLookup[log2w][!!cIdx][!!scanIdx][prevCsbf]; - - logdebug(LogSlice,"log2w:%d cIdx:%d scanIdx:%d prevCsbf:%d\n", - log2w,cIdx,scanIdx,prevCsbf); - - - // set the last coded coefficient in the last subblock - - if (i==lastSubBlock) { - coeff_value[nCoefficients] = coeff[lastSignificantX+(lastSignificantY<<log2TrafoSize)]; - coeff_has_max_base_level[nCoefficients] = 1; // TODO - coeff_scan_pos[nCoefficients] = lastScanPos; - nCoefficients++; - } - - - // --- encode all coefficients' significant_coeff flags except for the DC coefficient --- - - int last_coeff = (i==lastSubBlock) ? 
lastScanPos-1 : 15; - - for (int n= last_coeff ; n>0 ; n--) { - int subX = ScanOrderPos[n].x; - int subY = ScanOrderPos[n].y; - int xC = x0 + subX; - int yC = y0 + subY; - - - // for all AC coefficients in sub-block, a significant_coeff flag is coded - - int isSignificant = !!tb->coeff[cIdx][xC + (yC<<log2TrafoSize)]; - - logtrace(LogSlice,"coeff %d is significant: %d\n", n, isSignificant); - - logtrace(LogSlice,"trafoSize: %d\n",1<<log2TrafoSize); - logtrace(LogSlice,"context idx: %d;%d\n",xC,yC); - - encode_significant_coeff_flag_lookup(ectx, cabac, - ctxIdxMap[xC+(yC<<log2TrafoSize)], - isSignificant); - //ctxIdxMap[(i<<4)+n]); - - if (isSignificant) { - coeff_value[nCoefficients] = coeff[xC+(yC<<log2TrafoSize)]; - coeff_has_max_base_level[nCoefficients] = 1; - coeff_scan_pos[nCoefficients] = n; - nCoefficients++; - - // since we have a coefficient in the sub-block, - // we cannot infer the DC coefficient anymore - inferSbDcSigCoeffFlag = 0; - } - } - - - // --- decode DC coefficient significance --- - - if (last_coeff>=0) // last coded coefficient (always set to 1) is not the DC coefficient - { - if (inferSbDcSigCoeffFlag==0) { - // if we cannot infert the DC coefficient, it is coded - int isSignificant = !!tb->coeff[cIdx][x0 + (y0<<log2TrafoSize)]; - - logtrace(LogSlice,"DC coeff is significant: %d\n", isSignificant); - - encode_significant_coeff_flag_lookup(ectx, cabac, - ctxIdxMap[x0+(y0<<log2TrafoSize)], - isSignificant); - - if (isSignificant) { - coeff_value[nCoefficients] = coeff[x0+(y0<<log2TrafoSize)]; - coeff_has_max_base_level[nCoefficients] = 1; - coeff_scan_pos[nCoefficients] = 0; - nCoefficients++; - } - } - else { - // we can infer that the DC coefficient must be present - coeff_value[nCoefficients] = coeff[x0+(y0<<log2TrafoSize)]; - coeff_has_max_base_level[nCoefficients] = 1; - coeff_scan_pos[nCoefficients] = 0; - nCoefficients++; - } - } - } - - - - // --- encode coefficient values --- - - if (nCoefficients) { - - // separate absolute coefficient value and sign - - 
logtrace(LogSlice,"coefficients to code: "); - - for (int l=0;l<nCoefficients;l++) { - logtrace(LogSlice,"%d ",coeff_value[l]); - - if (coeff_value[l]<0) { - coeff_value[l] = -coeff_value[l]; - coeff_sign[l] = 1; - } - else { - coeff_sign[l] = 0; - } - - coeff_baseLevel[l] = 1; - - logtrace(LogSlice,"(%d) ",coeff_scan_pos[l]); - } - - logtrace(LogSlice,"\n"); - - - int ctxSet; - if (i==0 || cIdx>0) { ctxSet=0; } - else { ctxSet=2; } - - if (c1==0) { ctxSet++; } - c1=1; - - - // --- encode greater-1 flags --- - - int newLastGreater1ScanPos=-1; - - int lastGreater1Coefficient = libde265_min(8,nCoefficients); - for (int c=0;c<lastGreater1Coefficient;c++) { - int greater1_flag = (coeff_value[c]>1); - - encode_coeff_abs_level_greater1(ectx, cabac, cIdx,i, - c==0, - firstSubblock, - lastSubblock_greater1Ctx, - &lastInvocation_greater1Ctx, - &lastInvocation_coeff_abs_level_greater1_flag, - &lastInvocation_ctxSet, ctxSet, - greater1_flag); - - if (greater1_flag) { - coeff_baseLevel[c]++; - - c1=0; - - if (newLastGreater1ScanPos == -1) { - newLastGreater1ScanPos=c; - } - } - else { - coeff_has_max_base_level[c] = 0; - - if (c1<3 && c1>0) { - c1++; - } - } - } - - firstSubblock = false; - lastSubblock_greater1Ctx = lastInvocation_greater1Ctx; - - - // --- decode greater-2 flag --- - - if (newLastGreater1ScanPos != -1) { - int greater2_flag = (coeff_value[newLastGreater1ScanPos]>2); - encode_coeff_abs_level_greater2(ectx,cabac, cIdx, lastInvocation_ctxSet, greater2_flag); - coeff_baseLevel[newLastGreater1ScanPos] += greater2_flag; - coeff_has_max_base_level[newLastGreater1ScanPos] = greater2_flag; - } - - - // --- encode coefficient signs --- - - int signHidden = (coeff_scan_pos[0]-coeff_scan_pos[nCoefficients-1] > 3 && - !cb->cu_transquant_bypass_flag); - - for (int n=0;n<nCoefficients-1;n++) { - cabac->write_CABAC_bypass(coeff_sign[n]); - //logtrace(LogSlice,"a) sign[%d] = %d\n", n, coeff_sign[n]); - } - - // n==nCoefficients-1 - if (!pps.sign_data_hiding_flag || !signHidden) { - 
cabac->write_CABAC_bypass(coeff_sign[nCoefficients-1]); - //logtrace(LogSlice,"b) sign[%d] = %d\n", nCoefficients-1, coeff_sign[nCoefficients-1]); - } - else { - assert(coeff_sign[nCoefficients-1] == 0); - } - - // --- decode coefficient value --- - - int sumAbsLevel=0; - int uiGoRiceParam=0; - - for (int n=0;n<nCoefficients;n++) { - int baseLevel = coeff_baseLevel[n]; - - int coeff_abs_level_remaining; - - if (coeff_has_max_base_level[n]) { - logtrace(LogSlice,"value[%d]=%d, base level: %d\n",n,coeff_value[n],coeff_baseLevel[n]); - - coeff_abs_level_remaining = coeff_value[n] - coeff_baseLevel[n]; - - encode_coeff_abs_level_remaining(ectx, cabac, uiGoRiceParam, - coeff_abs_level_remaining); - - // (9-462) - if (baseLevel + coeff_abs_level_remaining > 3*(1<<uiGoRiceParam)) { - uiGoRiceParam++; - if (uiGoRiceParam>4) uiGoRiceParam=4; - } - } - else { - coeff_abs_level_remaining = 0; - } - - - // --- DEBUG: check coefficient --- - -#if 0 - int16_t currCoeff = baseLevel + coeff_abs_level_remaining; - if (coeff_sign[n]) { - currCoeff = -currCoeff; - } - - if (pps.sign_data_hiding_flag && signHidden) { - sumAbsLevel += baseLevel + coeff_abs_level_remaining; - - if (n==nCoefficients-1 && (sumAbsLevel & 1)) { - currCoeff = -currCoeff; - } - } - - assert(currCoeff == coeff_value[n]); -#endif - } // iterate through coefficients in sub-block - } // if nonZero - - } -} - - -void encode_transform_unit(encoder_context* ectx, - CABAC_encoder* cabac, - const enc_tb* tb, const enc_cb* cb, - int x0,int y0, int xBase,int yBase, - int log2TrafoSize, int trafoDepth, int blkIdx) -{ - ESTIM_BITS_BEGIN; - - if (tb->cbf[0] || tb->cbf[1] || tb->cbf[2]) { - if (ectx->img->get_pps().cu_qp_delta_enabled_flag && - true /*!ectx->IsCuQpDeltaCoded*/) { - assert(0); - } - - if (tb->cbf[0]) { - encode_residual(ectx,cabac, tb,cb,x0,y0,log2TrafoSize,0); - } - - if (ectx->get_sps().chroma_format_idc == CHROMA_444) { - if (tb->cbf[1]) { - encode_residual(ectx,cabac, tb,cb,x0,y0,log2TrafoSize,1); - } - if (tb->cbf[2]) { - 
encode_residual(ectx,cabac, tb,cb,x0,y0,log2TrafoSize,2); - } - } - else if (log2TrafoSize>2) { - // larger than 4x4 - - if (tb->cbf[1]) { - encode_residual(ectx,cabac,tb,cb,x0,y0,log2TrafoSize-1,1); - } - if (tb->cbf[2]) { - encode_residual(ectx,cabac,tb,cb,x0,y0,log2TrafoSize-1,2); - } - } - else if (blkIdx==3) { - // cannot check for tb->parent->cbf, because this may not yet be set - if (tb->cbf[1]) { - encode_residual(ectx,cabac,tb,cb,xBase,yBase,log2TrafoSize,1); - } - if (tb->cbf[2]) { - encode_residual(ectx,cabac,tb,cb,xBase,yBase,log2TrafoSize,2); - } - } - } - - ESTIM_BITS_END("encode_transform_unit"); -} - - -void encode_transform_tree(encoder_context* ectx, - CABAC_encoder* cabac, - const enc_tb* tb, const enc_cb* cb, - int x0,int y0, int xBase,int yBase, - int log2TrafoSize, int trafoDepth, int blkIdx, - int MaxTrafoDepth, int IntraSplitFlag, bool recurse) -{ - ESTIM_BITS_BEGIN; - - //de265_image* img = ectx->img; - const seq_parameter_set& sps = ectx->img->get_sps(); - - if (log2TrafoSize <= sps.Log2MaxTrafoSize && - log2TrafoSize > sps.Log2MinTrafoSize && - trafoDepth < MaxTrafoDepth && - !(IntraSplitFlag && trafoDepth==0)) - { - int split_transform_flag = tb->split_transform_flag; - encode_split_transform_flag(ectx, cabac, log2TrafoSize, split_transform_flag); - } - else - { - int interSplitFlag=0; // TODO - - bool split_transform_flag = (log2TrafoSize > sps.Log2MaxTrafoSize || - (IntraSplitFlag==1 && trafoDepth==0) || - interSplitFlag==1) ? 1:0; - - /* - printf("split_transform_flag log2TrafoSize:%d Log2MaxTrafoSize:%d " - "IntraSplitFlag:%d trafoDepth:%d -> %d\n", - log2TrafoSize,sps->Log2MaxTrafoSize, - IntraSplitFlag, trafoDepth, - split_transform_flag); - */ - - assert(tb->split_transform_flag == split_transform_flag); - } - - // --- CBF CB/CR --- - - // For 4x4 luma, there is no signaling of chroma CBF, because only the - // chroma CBF for 8x8 is relevant. 
- if (log2TrafoSize>2 || sps.ChromaArrayType == CHROMA_444) { - if (trafoDepth==0 || tb->parent->cbf[1]) { - encode_cbf_chroma(cabac, trafoDepth, tb->cbf[1]); - } - if (trafoDepth==0 || tb->parent->cbf[2]) { - encode_cbf_chroma(cabac, trafoDepth, tb->cbf[2]); - } - } - - if (tb->split_transform_flag) { - if (recurse) { - int x1 = x0 + (1<<(log2TrafoSize-1)); - int y1 = y0 + (1<<(log2TrafoSize-1)); - - encode_transform_tree(ectx, cabac, tb->children[0], cb, x0,y0,x0,y0,log2TrafoSize-1, - trafoDepth+1, 0, MaxTrafoDepth, IntraSplitFlag, true); - encode_transform_tree(ectx, cabac, tb->children[1], cb, x1,y0,x0,y0,log2TrafoSize-1, - trafoDepth+1, 1, MaxTrafoDepth, IntraSplitFlag, true); - encode_transform_tree(ectx, cabac, tb->children[2], cb, x0,y1,x0,y0,log2TrafoSize-1, - trafoDepth+1, 2, MaxTrafoDepth, IntraSplitFlag, true); - encode_transform_tree(ectx, cabac, tb->children[3], cb, x1,y1,x0,y0,log2TrafoSize-1, - trafoDepth+1, 3, MaxTrafoDepth, IntraSplitFlag, true); - } - } - else { - if (cb->PredMode == MODE_INTRA || trafoDepth != 0 || - tb->cbf[1] || tb->cbf[2]) { - encode_cbf_luma(cabac, trafoDepth==0, tb->cbf[0]); - } - else { - /* Note: usually, cbf[0] should be TRUE, but while estimating the bitrate, this - function can also be called with all CBFs FALSE. Usually, this is handled by - the rqt_root_cbf flag, but during analysis, this is set after the bitrate is estimated. 
- */ - // assert(tb->cbf[0]==true); - } - - encode_transform_unit(ectx,cabac, tb,cb, x0,y0, xBase,yBase, log2TrafoSize, trafoDepth, blkIdx); - } - - ESTIM_BITS_END("encode_transform_tree"); -} - - -void encode_cu_skip_flag(encoder_context* ectx, - CABAC_encoder* cabac, - const enc_cb* cb, - bool skip) -{ - logtrace(LogSymbols,"$1 cu_skip_flag=%d\n",skip); - - const de265_image* img = ectx->img; - - int x0 = cb->x; - int y0 = cb->y; - - // check if neighbors are available - - int availableL = check_CTB_available(img, x0,y0, x0-1,y0); - int availableA = check_CTB_available(img, x0,y0, x0,y0-1); - - int condL = 0; - int condA = 0; - - if (availableL && ectx->ctbs.getCB(x0-1,y0)->PredMode == MODE_SKIP) condL=1; - if (availableA && ectx->ctbs.getCB(x0,y0-1)->PredMode == MODE_SKIP) condA=1; - - int contextOffset = condL + condA; - int context = contextOffset; - - // decode bit - - int bit = skip; - - logtrace(LogSlice,"> cu_skip_flag ctx=%d, bit=%d\n", context,bit); - - cabac->write_CABAC_bit(CONTEXT_MODEL_CU_SKIP_FLAG + context, bit); -} - - -void encode_merge_idx(encoder_context* ectx, - CABAC_encoder* cabac, - int mergeIdx) -{ - logtrace(LogSymbols,"$1 merge_idx=%d\n",mergeIdx); - logtrace(LogSlice,"# merge_idx %d\n", mergeIdx); - - if (ectx->shdr->MaxNumMergeCand <= 1) { - return; // code nothing, we use only a single merge candidate - } - - // TU coding, first bin is CABAC, remaining are bypass. - // cMax = MaxNumMergeCand-1 - - cabac->write_CABAC_bit(CONTEXT_MODEL_MERGE_IDX, mergeIdx ? 
1 : 0); - - if (mergeIdx>0) { - int idx=1; - - while (idx<ectx->shdr->MaxNumMergeCand-1) { - int increase = (idx < mergeIdx); - - cabac->write_CABAC_bypass(increase); - if (increase) { - idx++; - } - else { - break; - } - } - } -} - - -static inline void encode_rqt_root_cbf(encoder_context* ectx, - CABAC_encoder* cabac, - int rqt_root_cbf) -{ - logtrace(LogSymbols,"$1 rqt_root_cbf=%d\n",rqt_root_cbf); - cabac->write_CABAC_bit(CONTEXT_MODEL_RQT_ROOT_CBF, rqt_root_cbf); -} - - -void encode_mvd(encoder_context* ectx, - CABAC_encoder* cabac, - const int16_t mvd[2]) -{ - int mvd0abs = abs_value(mvd[0]); - int mvd1abs = abs_value(mvd[1]); - - int mvd0_greater_0 = !!(mvd0abs); - int mvd1_greater_0 = !!(mvd1abs); - - cabac->write_CABAC_bit(CONTEXT_MODEL_ABS_MVD_GREATER01_FLAG+0, mvd0_greater_0); - cabac->write_CABAC_bit(CONTEXT_MODEL_ABS_MVD_GREATER01_FLAG+0, mvd1_greater_0); - - if (mvd0_greater_0) { - cabac->write_CABAC_bit(CONTEXT_MODEL_ABS_MVD_GREATER01_FLAG+1, mvd0abs>1); - } - if (mvd1_greater_0) { - cabac->write_CABAC_bit(CONTEXT_MODEL_ABS_MVD_GREATER01_FLAG+1, mvd1abs>1); - } - - if (mvd0abs) { - if (mvd0abs>1) { - cabac->write_CABAC_EGk(mvd0abs-2,1); - } - cabac->write_CABAC_bypass(mvd[0]<0); - } - - if (mvd1abs) { - if (mvd1abs>1) { - cabac->write_CABAC_EGk(mvd1abs-2,1); - } - cabac->write_CABAC_bypass(mvd[1]<0); - } -} - - -void encode_prediction_unit(encoder_context* ectx, - CABAC_encoder* cabac, - const enc_cb* cb, int pbIdx, - int x0,int y0, int w, int h) -{ - const enc_pb_inter& pb = cb->inter.pb[pbIdx]; - - logtrace(LogSymbols,"$1 merge_flag=%d\n",pb.spec.merge_flag); - cabac->write_CABAC_bit(CONTEXT_MODEL_MERGE_FLAG, pb.spec.merge_flag); - - if (pb.spec.merge_flag) { - assert(false); // TODO - } - else { - if (ectx->shdr->slice_type == SLICE_TYPE_B) { - assert(false); // TODO - } - - if (pb.spec.inter_pred_idc != PRED_L1) { - if (ectx->shdr->num_ref_idx_l0_active > 1) { - assert(false); // TODO - //cabac->write_CABAC_bit(CONTEXT_MODEL_REF_IDX_LX, pb.spec.mvp_l0_flag); 
- } - - encode_mvd(ectx,cabac, pb.spec.mvd0); - - logtrace(LogSymbols,"$1 mvp_lx_flag=%d\n",pb.spec.mvp_l0_flag); - cabac->write_CABAC_bit(CONTEXT_MODEL_MVP_LX_FLAG, pb.spec.mvp_l0_flag); - } - - if (pb.spec.inter_pred_idc != PRED_L0) { - assert(false); // TODO - } - - /* -enum InterPredIdc - InterPredIdc::PRED_L0=0, - InterPredIdc::PRED_L1=1, - InterPredIdc::PRED_BI=2 - */ - } -} - - -void encode_coding_unit(encoder_context* ectx, - CABAC_encoder* cabac, - const enc_cb* cb, int x0,int y0, int log2CbSize, bool recurse) -{ - logtrace(LogSlice,"--- encode CU (%d;%d) ---\n",x0,y0); - - de265_image* img = ectx->img; - const slice_segment_header* shdr = &ectx->imgdata->shdr; - const seq_parameter_set& sps = ectx->img->get_sps(); - - - int nCbS = 1<<log2CbSize; - - - // write skip_flag - - if (shdr->slice_type != SLICE_TYPE_I) { - encode_cu_skip_flag(ectx,cabac, cb, cb->PredMode==MODE_SKIP); - } - - if (cb->PredMode==MODE_SKIP) { - assert(cb->inter.pb0.spec.merge_flag); - encode_merge_idx(ectx,cabac, cb->inter.pb0.spec.merge_idx); - } - else { - - enum PredMode PredMode = cb->PredMode; - enum PartMode PartMode = PART_2Nx2N; - int IntraSplitFlag=0; - - if (shdr->slice_type != SLICE_TYPE_I) { - encode_pred_mode_flag(ectx,cabac, PredMode); - } - - if (PredMode != MODE_INTRA || - log2CbSize == sps.Log2MinCbSizeY) { - PartMode = cb->PartMode; - encode_part_mode(ectx,cabac, PredMode, PartMode, log2CbSize); - } - - if (PredMode == MODE_INTRA) { - - assert(cb->split_cu_flag == 0); - - int availableA0 = check_CTB_available(img, x0,y0, x0-1,y0); - int availableB0 = check_CTB_available(img, x0,y0, x0,y0-1); - - if (PartMode==PART_2Nx2N) { - logtrace(LogSlice,"x0,y0: %d,%d\n",x0,y0); - int PUidx = (x0>>sps.Log2MinPUSize) + (y0>>sps.Log2MinPUSize)*sps.PicWidthInMinPUs; - - enum IntraPredMode candModeList3; - fillIntraPredModeCandidates(candModeList,x0,y0, - availableA0,availableB0, ectx->ctbs, &sps); - - for (int i=0;i<3;i++) - logtrace(LogSlice,"candModeList%d = %d\n", i, 
candModeListi); - - enum IntraPredMode mode = cb->transform_tree->intra_mode; - - int intraPred = find_intra_pred_mode(mode, candModeList); - encode_prev_intra_luma_pred_flag(ectx,cabac, intraPred); - encode_intra_mpm_or_rem(ectx,cabac, intraPred); - - logtrace(LogSlice,"IntraPredMode: %d (candidates: %d %d %d)\n", mode, - candModeList0, candModeList1, candModeList2); - logtrace(LogSlice," MPM/REM = %d\n",intraPred); - - - IntraChromaPredMode chromaPredMode; - chromaPredMode = find_chroma_pred_mode(cb->transform_tree->intra_mode_chroma, - cb->transform_tree->intra_mode); - encode_intra_chroma_pred_mode(ectx,cabac, chromaPredMode); - } - else { - IntraSplitFlag=1; - - int pbOffset = nCbS/2; - int PUidx; - - int intraPred4; - int childIdx=0; - - for (int j=0;j<nCbS;j+=pbOffset) - for (int i=0;i<nCbS;i+=pbOffset, childIdx++) - { - int x=x0+i, y=y0+j; - - int availableA = availableA0 || (i>0); // left candidate always available for right blk - int availableB = availableB0 || (j>0); // top candidate always available for bottom blk - - PUidx = (x>>sps.Log2MinPUSize) + (y>>sps.Log2MinPUSize)*sps.PicWidthInMinPUs; - - enum IntraPredMode candModeList3; - fillIntraPredModeCandidates(candModeList,x,y, - availableA,availableB, ectx->ctbs, &sps); - - for (int i=0;i<3;i++) - logtrace(LogSlice,"candModeList%d = %d\n", i, candModeListi); - - enum IntraPredMode mode = cb->transform_tree->childrenchildIdx->intra_mode; - - intraPredchildIdx = find_intra_pred_mode(mode, candModeList); - - logtrace(LogSlice,"IntraPredMode: %d (candidates: %d %d %d)\n", mode, - candModeList0, candModeList1, candModeList2); - logtrace(LogSlice," MPM/REM = %d\n",intraPredchildIdx); - } - - for (int i=0;i<4;i++) - encode_prev_intra_luma_pred_flag(ectx,cabac, intraPredi); - - for (int i=0;i<4;i++) { - encode_intra_mpm_or_rem(ectx,cabac, intraPredi); - } - - - // send chroma mode - - if (sps.ChromaArrayType == CHROMA_444) { - for (int i=0;i<4;i++) { - IntraChromaPredMode chromaPredMode; - chromaPredMode = 
find_chroma_pred_mode(cb->transform_tree->childreni->intra_mode_chroma, - cb->transform_tree->childreni->intra_mode); - encode_intra_chroma_pred_mode(ectx,cabac, chromaPredMode); - } - } - else { - IntraChromaPredMode chromaPredMode; - chromaPredMode = find_chroma_pred_mode(cb->transform_tree->children0->intra_mode_chroma, - cb->transform_tree->children0->intra_mode); - encode_intra_chroma_pred_mode(ectx,cabac, chromaPredMode); - } - } - - /* - printf("write intra modes. Luma=%d Chroma=%d\n", - cb->intra.pred_mode0, - cb->intra.chroma_mode); - */ - } - else { - switch (cb->PartMode) { - case PART_2Nx2N: - encode_prediction_unit(ectx,cabac,cb, 0, cb->x,cb->y,1<<cb->log2Size,1<<cb->log2Size); - break; - case PART_2NxN: - case PART_Nx2N: - case PART_NxN: - case PART_2NxnU: - case PART_2NxnD: - case PART_nLx2N: - case PART_nRx2N: - assert(false); // TODO - } - } - - - if (true) { // !pcm - - if (cb->PredMode != MODE_INTRA && - !(cb->PartMode == PART_2Nx2N && cb->inter.pb0.spec.merge_flag)) { - - //printf("%d %d %d\n",cb->PredMode,cb->PartMode,cb->inter.pb0.merge_flag); - - encode_rqt_root_cbf(ectx,cabac, cb->inter.rqt_root_cbf); - } - - //printf("%d;%d encode rqt_root_cbf=%d\n",x0,y0,cb->inter.rqt_root_cbf); - - if (cb->PredMode == MODE_INTRA || cb->inter.rqt_root_cbf) { - int MaxTrafoDepth; - if (PredMode == MODE_INTRA) - { MaxTrafoDepth = sps.max_transform_hierarchy_depth_intra + IntraSplitFlag; } - else - { MaxTrafoDepth = sps.max_transform_hierarchy_depth_inter; } - - - if (recurse) { - //printf("%d;%d store transform tree\n",x0,y0); - - encode_transform_tree(ectx,cabac, cb->transform_tree, cb, - x0,y0, x0,y0, log2CbSize, 0, 0, MaxTrafoDepth, IntraSplitFlag, true); - } - } - } - } -} - - -SplitType get_split_type(const seq_parameter_set* sps, - int x0,int y0, int log2CbSize) -{ - /* - CU split flag: - - | overlaps | minimum || - case | border | size || split - -----+----------+---------++---------- - A | 0 | 0 || optional - B | 0 | 1 || 0 - C | 1 | 0 || 1 - D | 1 | 
1 || 0 - */ - if (x0+(1<<log2CbSize) <= sps->pic_width_in_luma_samples && - y0+(1<<log2CbSize) <= sps->pic_height_in_luma_samples && - log2CbSize > sps->Log2MinCbSizeY) { - - // case A - - return OptionalSplit; - } else { - // case B/C/D - - if (log2CbSize > sps->Log2MinCbSizeY) { return ForcedSplit; } - else { return ForcedNonSplit; } - } -} - - -void encode_quadtree(encoder_context* ectx, - CABAC_encoder* cabac, - const enc_cb* cb, int x0,int y0, int log2CbSize, int ctDepth, - bool recurse) -{ - //de265_image* img = ectx->img; - const seq_parameter_set& sps = ectx->img->get_sps(); - - int split_flag = get_split_type(&sps,x0,y0,log2CbSize); - - // if it is an optional split, take the decision from the CU flag - if (split_flag == OptionalSplit) { - split_flag = cb->split_cu_flag; - - encode_split_cu_flag(ectx,cabac, x0,y0, ctDepth, split_flag); - } - - - if (split_flag) { - if (recurse) { - int x1 = x0 + (1<<(log2CbSize-1)); - int y1 = y0 + (1<<(log2CbSize-1)); - - encode_quadtree(ectx,cabac, cb->children[0], x0,y0, log2CbSize-1, ctDepth+1, true); - - if (x1<sps.pic_width_in_luma_samples) - encode_quadtree(ectx,cabac, cb->children[1], x1,y0, log2CbSize-1, ctDepth+1, true); - - if (y1<sps.pic_height_in_luma_samples) - encode_quadtree(ectx,cabac, cb->children[2], x0,y1, log2CbSize-1, ctDepth+1, true); - - if (x1<sps.pic_width_in_luma_samples && - y1<sps.pic_height_in_luma_samples) - encode_quadtree(ectx,cabac, cb->children[3], x1,y1, log2CbSize-1, ctDepth+1, true); - } - } - else { - encode_coding_unit(ectx,cabac, cb,x0,y0, log2CbSize, true); - } -} - - -void encode_ctb(encoder_context* ectx, - CABAC_encoder* cabac, - enc_cb* cb, int ctbX,int ctbY) -{ - logtrace(LogSlice,"----- encode CTB (%d;%d) -----\n",ctbX,ctbY); - -#if 0 - printf("MODEL:\n"); - for (int i=0;i<CONTEXT_MODEL_TABLE_LENGTH;i++) - { - printf("%d;%d ", - ectx->ctx_model[i].state, - ectx->ctx_model[i].MPSbit); - - if ((i%16)==15) printf("\n"); - } - printf("\n"); -#endif - - de265_image* img = ectx->img; - int 
log2ctbSize = img->get_sps().Log2CtbSizeY; - - encode_quadtree(ectx,cabac, cb, ctbX<<log2ctbSize, ctbY<<log2ctbSize, log2ctbSize, 0, true); -} - - -// ---------------------------------------------------------------------------
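Note on the removed get_split_type() in the file above: the following is a minimal standalone sketch of its decision table (cases A to D), written for illustration only. It assumes a hypothetical PictureLimits struct in place of seq_parameter_set and a hypothetical function name split_type; neither is part of the libde265 tarball.

```cpp
// Hypothetical stand-in for the three SPS fields that the removed
// get_split_type() reads (names here are assumptions, not libde265 API).
struct PictureLimits {
  int width;          // corresponds to pic_width_in_luma_samples
  int height;         // corresponds to pic_height_in_luma_samples
  int log2MinCbSize;  // corresponds to Log2MinCbSizeY
};

// Same enumerator values as the SplitType enum in the removed header.
enum SplitType { ForcedNonSplit = 0, ForcedSplit = 1, OptionalSplit = 2 };

// Classify the coding block at (x0,y0) with edge length 1<<log2CbSize.
SplitType split_type(const PictureLimits& p, int x0, int y0, int log2CbSize)
{
  const int size = 1 << log2CbSize;
  const bool insidePicture = (x0 + size <= p.width) && (y0 + size <= p.height);
  const bool aboveMinimum  = log2CbSize > p.log2MinCbSize;

  // Case A: block lies fully inside the picture and may still be split,
  // so the encoder decides and signals split_cu_flag.
  if (insidePicture && aboveMinimum) { return OptionalSplit; }

  // Case C: block crosses the border and can still be split -> must split.
  // Cases B and D: block is already at minimum size -> cannot split.
  return aboveMinimum ? ForcedSplit : ForcedNonSplit;
}
```

For example, with a 100x100 picture and a minimum block size of 8 (log2 = 3), a 64x64 block at (0,0) is an optional split, while a 64x64 block at (64,0) crosses the right border and is a forced split.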
libde265-1.0.17.tar.gz/libde265/encoder/encoder-syntax.h
Deleted
@@ -1,102 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#ifndef ENCODER_SYNTAX_H -#define ENCODER_SYNTAX_H - -#include "libde265/image.h" -#include "libde265/encoder/encoder-types.h" - - -void encode_split_cu_flag(encoder_context* ectx, - CABAC_encoder* cabac, - int x0, int y0, int ctDepth, int split_flag); - -void encode_transform_tree(encoder_context* ectx, - CABAC_encoder* cabac, - const enc_tb* tb, const enc_cb* cb, - int x0,int y0, int xBase,int yBase, - int log2TrafoSize, int trafoDepth, int blkIdx, - int MaxTrafoDepth, int IntraSplitFlag, bool recurse); - -void encode_coding_unit(encoder_context* ectx, - CABAC_encoder* cabac, - const enc_cb* cb, int x0,int y0, int log2CbSize, bool recurse); - -/* returns - 1 - forced split - 0 - forced non-split - -1 - optional split -*/ -enum SplitType { - ForcedNonSplit = 0, - ForcedSplit = 1, - OptionalSplit = 2 -}; - -SplitType get_split_type(const seq_parameter_set* sps, - int x0,int y0, int log2CbSize); - - -/* Compute how much rate is required for sending the chroma CBF flags - in the whole TB tree. 
- */ -float recursive_cbfChroma_rate(CABAC_encoder_estim* cabac, - enc_tb* tb, int log2TrafoSize, int trafoDepth); - - -void encode_split_transform_flag(encoder_context* ectx, - CABAC_encoder* cabac, - int log2TrafoSize, int split_flag); - -void encode_merge_idx(encoder_context* ectx, - CABAC_encoder* cabac, - int mergeIdx); - -void encode_cu_skip_flag(encoder_context* ectx, - CABAC_encoder* cabac, - const enc_cb* cb, - bool skip); - -void encode_cbf_luma(CABAC_encoder* cabac, - bool zeroTrafoDepth, int cbf_luma); - -void encode_cbf_chroma(CABAC_encoder* cabac, - int trafoDepth, int cbf_chroma); - -void encode_transform_unit(encoder_context* ectx, - CABAC_encoder* cabac, - const enc_tb* tb, const enc_cb* cb, - int x0,int y0, int xBase,int yBase, - int log2TrafoSize, int trafoDepth, int blkIdx); - - -void encode_quadtree(encoder_context* ectx, - CABAC_encoder* cabac, - const enc_cb* cb, int x0,int y0, int log2CbSize, int ctDepth, - bool recurse); - -void encode_ctb(encoder_context* ectx, - CABAC_encoder* cabac, - enc_cb* cb, int ctbX,int ctbY); - -#endif
libde265-1.0.17.tar.gz/libde265/encoder/encoder-types.cc
Deleted
@@ -1,766 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: struktur AG, Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#include "encoder-types.h" -#include "encoder-context.h" -#include "slice.h" -#include "scan.h" -#include "intrapred.h" -#include "libde265/transform.h" -#include "libde265/fallback-dct.h" -#include <iostream> - - -int allocTB = 0; -int allocCB = 0; - -#define DEBUG_ALLOCS 0 - - -small_image_buffer::small_image_buffer(int log2Size,int bytes_per_pixel) -{ - mWidth = 1<<log2Size; - mHeight = 1<<log2Size; - mStride = 1<<log2Size; - mBytesPerRow = bytes_per_pixel * (1<<log2Size); - - int nBytes = mWidth*mHeight*bytes_per_pixel; - mBuf = new uint8_tnBytes; -} - - -small_image_buffer::~small_image_buffer() -{ - delete mBuf; -} - - -void enc_cb::set_rqt_root_bf_from_children_cbf() -{ - assert(transform_tree); - inter.rqt_root_cbf = (transform_tree->cbf0 | - transform_tree->cbf1 | - transform_tree->cbf2); -} - - - - -alloc_pool enc_tb::mMemPool(sizeof(enc_tb)); - -enc_tb::enc_tb(int x,int y,int log2TbSize, enc_cb* _cb) - : enc_node(x,y,log2TbSize) -{ - parent = NULL; - cb = _cb; - downPtr = NULL; - blkIdx = 0; - - split_transform_flag = false; - coeff0=coeff1=coeff2=NULL; - - TrafoDepth = 0; - 
cbf0 = cbf1 = cbf2 = 0; - - distortion = 0; - rate = 0; - rate_withoutCbfChroma = 0; - - intra_mode = INTRA_PLANAR; - intra_mode_chroma = INTRA_PLANAR; - - if (DEBUG_ALLOCS) { allocTB++; printf("TB : %d\n",allocTB); } -} - - -enc_tb::~enc_tb() -{ - if (split_transform_flag) { - for (int i=0;i<4;i++) { - delete childreni; - } - } - else { - for (int i=0;i<3;i++) { - delete coeffi; - } - } - - if (DEBUG_ALLOCS) { allocTB--; printf("TB ~: %d\n",allocTB); } -} - - -void enc_tb::alloc_coeff_memory(int cIdx, int tbSize) -{ - assert(coeffcIdx==NULL); - coeffcIdx = new int16_ttbSize*tbSize; -} - - -void enc_tb::reconstruct_tb(encoder_context* ectx, - de265_image* img, - int x0,int y0, // luma - int log2TbSize, // chroma adapted - int cIdx) const -{ - // chroma adapted position - int xC=x0; - int yC=y0; - - if (cIdx>0 && ectx->get_sps().chroma_format_idc == CHROMA_420) { - xC>>=1; - yC>>=1; - } - - - if (!reconstructioncIdx) { - - reconstructioncIdx = std::make_shared<small_image_buffer>(log2TbSize, sizeof(uint8_t)); - - if (cb->PredMode == MODE_SKIP) { - PixelAccessor dstPixels(*reconstructioncIdx, xC,yC); - dstPixels.copyFromImage(img, cIdx); - } - else { // not SKIP mode - if (cb->PredMode == MODE_INTRA) { - - //enum IntraPredMode intraPredMode = img->get_IntraPredMode(x0,y0); - enum IntraPredMode intraPredMode = intra_mode; - - if (cIdx>0) { - intraPredMode = intra_mode_chroma; - } - - //printf("reconstruct TB (%d;%d): intra mode (cIdx=%d) = %d\n",xC,yC,cIdx,intraPredMode); - - //decode_intra_prediction(img, xC,yC, intraPredMode, 1<< log2TbSize , cIdx); - - //printf("access intra-prediction of TB %p\n",this); - - intra_predictioncIdx->copy_to(*reconstructioncIdx); - /* - copy_subimage(img->get_image_plane_at_pos(cIdx,xC,yC), - img->get_image_stride(cIdx), - intra_predictioncIdx->get_buffer<uint8_t>(), 1<<log2TbSize, - 1<<log2TbSize, 1<<log2TbSize); - */ - } - else { - assert(0); // -> TODO: now only store in tb_enc - - int size = 1<<log2TbSize; - - uint8_t* dst_ptr = 
img->get_image_plane_at_pos(cIdx, xC, yC ); - int dst_stride = img->get_image_stride(cIdx); - - /* TODO TMP HACK: prediction does not exist anymore - uint8_t* src_ptr = ectx->prediction->get_image_plane_at_pos(cIdx, xC, yC ); - int src_stride = ectx->prediction->get_image_stride(cIdx); - - for (int y=0;y<size;y++) { - for (int x=0;x<size;x++) { - dst_ptry*dst_stride+x = src_ptry*src_stride+x; - } - } - */ - } - - ALIGNED_16(int16_t) dequant_coeff32*32; - - if (cbfcIdx) dequant_coefficients(dequant_coeff, coeffcIdx, log2TbSize, cb->qp); - - if (0 && cbfcIdx) { - printf("--- quantized coeffs ---\n"); - printBlk("qcoeffs",coeff0,1<<log2TbSize,1<<log2TbSize); - - printf("--- dequantized coeffs ---\n"); - printBlk("dequant",dequant_coeff,1<<log2TbSize,1<<log2TbSize); - } - -#if 0 - uint8_t* ptr = img->get_image_plane_at_pos(cIdx, xC, yC ); - int stride = img->get_image_stride(cIdx); -#endif - - int trType = (cIdx==0 && log2TbSize==2); // TODO: inter - - //printf("--- prediction %d %d / %d ---\n",x0,y0,cIdx); - //printBlk("prediction",ptr,1<<log2TbSize,stride); - - if (cbfcIdx) inv_transform(&ectx->acceleration, - reconstructioncIdx->get_buffer<uint8_t>(), 1<<log2TbSize, - dequant_coeff, log2TbSize, trType); - - //printBlk("RECO",reconstructioncIdx->get_buffer_u8(),1<<log2TbSize, - // reconstructioncIdx->getStride()); - } - } - - - // copy reconstruction into image - -#if 0 - copy_subimage(img->get_image_plane_at_pos(cIdx,xC,yC), - img->get_image_stride(cIdx), - reconstructioncIdx->get_buffer<uint8_t>(), 1<<log2TbSize, - 1<<log2TbSize, 1<<log2TbSize); -#endif - - //printf("--- RECO intra prediction %d %d ---\n",x0,y0); - //printBlk("RECO",ptr,1<<log2TbSize,stride); -} - - -void enc_tb::debug_writeBlack(encoder_context* ectx, de265_image* img) const -{ - if (split_transform_flag) { - for (int i=0;i<4;i++) { - childreni->debug_writeBlack(ectx,img); - } - } - else { - //reconstruct_tb(ectx, img, x,y, log2Size, 0); - - int size = 1<<(log2Size<<1); - std::vector<uint8_t> 
buf(size); - memset(&buf0,0x12,size); - - int cIdx=0; - int xC=x,yC=y; - - copy_subimage(img->get_image_plane_at_pos(cIdx,xC,yC), - img->get_image_stride(cIdx), - &buf0, 1<<log2Size, - 1<<log2Size, 1<<log2Size); - } -} - - -void enc_tb::reconstruct(encoder_context* ectx, de265_image* img) const -{ - if (split_transform_flag) { - for (int i=0;i<4;i++) { - childreni->reconstruct(ectx,img); - } - } - else { - reconstruct_tb(ectx, img, x,y, log2Size, 0); - - if (ectx->get_sps().chroma_format_idc == CHROMA_444) { - reconstruct_tb(ectx, img, x,y, log2Size, 1); - reconstruct_tb(ectx, img, x,y, log2Size, 2); - } - else if (log2Size>2) { - reconstruct_tb(ectx, img, x,y, log2Size-1, 1); - reconstruct_tb(ectx, img, x,y, log2Size-1, 2); - } - else if (blkIdx==3) { - int xBase = x - (1<<log2Size); - int yBase = y - (1<<log2Size); - - reconstruct_tb(ectx, img, xBase,yBase, log2Size, 1); - reconstruct_tb(ectx, img, xBase,yBase, log2Size, 2); - } - } -} - - -void enc_tb::set_cbf_flags_from_children() -{ - assert(split_transform_flag); - - cbf0 = 0; - cbf1 = 0; - cbf2 = 0; - - for (int i=0;i<4;i++) { - cbf0 |= childreni->cbf0; - cbf1 |= childreni->cbf1; - cbf2 |= childreni->cbf2; - } -} - - - -alloc_pool enc_cb::mMemPool(sizeof(enc_cb), 200); - - -enc_cb::enc_cb() - : split_cu_flag(false), - cu_transquant_bypass_flag(false), - pcm_flag(false), - transform_tree(NULL), - distortion(0), - rate(0) -{ - parent = NULL; - downPtr = NULL; - - if (DEBUG_ALLOCS) { allocCB++; printf("CB : %d\n",allocCB); } -} - -enc_cb::~enc_cb() -{ - if (split_cu_flag) { - for (int i=0;i<4;i++) { - delete childreni; - } - } - else { - delete transform_tree; - } - - if (DEBUG_ALLOCS) { allocCB--; printf("CB ~: %d\n",allocCB); } -} - - -/* -void enc_cb::write_to_image(de265_image* img) const -{ - //printf("write_to_image %d %d size:%d\n",x,y,1<<log2Size); - - - if (!split_cu_flag) { - img->set_log2CbSize(x,y,log2Size, true); - img->set_ctDepth(x,y,log2Size, ctDepth); - assert(pcm_flag==0); - 
img->set_pcm_flag(x,y,log2Size, pcm_flag); - img->set_cu_transquant_bypass(x,y,log2Size, cu_transquant_bypass_flag); - img->set_QPY(x,y,log2Size, qp); - img->set_pred_mode(x,y, log2Size, PredMode); - img->set_PartMode(x,y, PartMode); - - if (PredMode == MODE_INTRA) { - - if (PartMode == PART_NxN) { - int h = 1<<(log2Size-1); - img->set_IntraPredMode(x ,y ,log2Size-1, transform_tree->children0->intra_mode); - img->set_IntraPredMode(x+h,y ,log2Size-1, transform_tree->children1->intra_mode); - img->set_IntraPredMode(x ,y+h,log2Size-1, transform_tree->children2->intra_mode); - img->set_IntraPredMode(x+h,y+h,log2Size-1, transform_tree->children3->intra_mode); - } - else { - img->set_IntraPredMode(x,y,log2Size, transform_tree->intra_mode); - } - } - else { - int nC = 1<<log2Size; - int nC2 = nC>>1; - int nC4 = nC>>2; - int nC3 = nC-nC4; - switch (PartMode) { - case PART_2Nx2N: - img->set_mv_info(x,y,nC,nC, inter.pb0.motion); - break; - case PART_NxN: - img->set_mv_info(x ,y ,nC2,nC2, inter.pb0.motion); - img->set_mv_info(x+nC2,y ,nC2,nC2, inter.pb1.motion); - img->set_mv_info(x ,y+nC2,nC2,nC2, inter.pb2.motion); - img->set_mv_info(x+nC2,y+nC2,nC2,nC2, inter.pb3.motion); - break; - case PART_2NxN: - img->set_mv_info(x,y ,nC,nC2, inter.pb0.motion); - img->set_mv_info(x,y+nC2,nC,nC2, inter.pb1.motion); - break; - case PART_Nx2N: - img->set_mv_info(x ,y,nC2,nC, inter.pb0.motion); - img->set_mv_info(x+nC2,y,nC2,nC, inter.pb1.motion); - break; - case PART_2NxnU: - img->set_mv_info(x,y ,nC,nC4, inter.pb0.motion); - img->set_mv_info(x,y+nC4,nC,nC3, inter.pb1.motion); - break; - case PART_2NxnD: - img->set_mv_info(x,y ,nC,nC3, inter.pb0.motion); - img->set_mv_info(x,y+nC3,nC,nC4, inter.pb1.motion); - break; - case PART_nLx2N: - img->set_mv_info(x ,y,nC4,nC, inter.pb0.motion); - img->set_mv_info(x+nC4,y,nC3,nC, inter.pb1.motion); - break; - case PART_nRx2N: - img->set_mv_info(x ,y,nC3,nC, inter.pb0.motion); - img->set_mv_info(x+nC3,y,nC4,nC, inter.pb1.motion); - break; - } - } - } 
- else { - for (int i=0;i<4;i++) { - if (childreni) { - childreni->write_to_image(img); - } - } - } -} -*/ - -void enc_cb::reconstruct(encoder_context* ectx, de265_image* img) const -{ - assert(0); - if (split_cu_flag) { - for (int i=0;i<4;i++) { - childreni->reconstruct(ectx, img); - } - } - else { - //write_to_image(img); - transform_tree->reconstruct(ectx,img); - } -} - - -void enc_cb::debug_dumpTree(int flags, int indent) const -{ - std::string indentStr; - indentStr.insert(0,indent,' '); - - std::cout << indentStr << "CB " << x << ";" << y << " " - << (1<<log2Size) << "x" << (1<<log2Size) << " " << this << "\n"; - - std::cout << indentStr << "| split_cu_flag: " << int(split_cu_flag) << "\n"; - std::cout << indentStr << "| ctDepth: " << int(ctDepth) << "\n"; - - if (split_cu_flag) { - for (int i=0;i<4;i++) - if (childreni) { - std::cout << indentStr << "| child CB " << i << ":\n"; - childreni->debug_dumpTree(flags, indent+2); - } - } - else { - std::cout << indentStr << "| qp: " << int(qp) << "\n"; - std::cout << indentStr << "| PredMode: " << PredMode << "\n"; - std::cout << indentStr << "| PartMode: " << part_mode_name(PartMode) << "\n"; - std::cout << indentStr << "| transform_tree:\n"; - - transform_tree->debug_dumpTree(flags, indent+2); - } -} - - -void enc_tb::debug_dumpTree(int flags, int indent) const -{ - std::string indentStr; - indentStr.insert(0,indent,' '); - - std::cout << indentStr << "TB " << x << ";" << y << " " - << (1<<log2Size) << "x" << (1<<log2Size) << " " << this << "\n"; - - std::cout << indentStr << "| split_transform_flag: " << int(split_transform_flag) << "\n"; - std::cout << indentStr << "| TrafoDepth: " << int(TrafoDepth) << "\n"; - std::cout << indentStr << "| blkIdx: " << int(blkIdx) << "\n"; - - std::cout << indentStr << "| intra_mode: " << int(intra_mode) << "\n"; - std::cout << indentStr << "| intra_mode_chroma: " << int(intra_mode_chroma) << "\n"; - - std::cout << indentStr << "| cbf: " - << int(cbf0) << ":" - << int(cbf1) << 
":" - << int(cbf2) << "\n"; - - - if (flags & DUMPTREE_RECONSTRUCTION) { - for (int i=0;i<3;i++) - if (reconstructioni) { - std::cout << indentStr << "| Reconstruction, channel " << i << ":\n"; - printBlk(NULL, - reconstructioni->get_buffer_u8(), - reconstructioni->getWidth(), - reconstructioni->getStride(), - indentStr + "| "); - } - } - - if (flags & DUMPTREE_INTRA_PREDICTION) { - for (int i=0;i<3;i++) - if (intra_predictioni) { - //if (i==0) print_border(debug_intra_border+64, NULL, 1<<log2Size); - - std::cout << indentStr << "| Intra prediction, channel " << i << ":\n"; - printBlk(NULL, - intra_predictioni->get_buffer_u8(), - intra_predictioni->getWidth(), - intra_predictioni->getStride(), - indentStr + "| "); - } - } - - if (split_transform_flag) { - for (int i=0;i<4;i++) - if (childreni) { - std::cout << indentStr << "| child TB " << i << ":\n"; - childreni->debug_dumpTree(flags, indent+2); - } - } -} - - - -const enc_tb* enc_cb::getTB(int x,int y) const -{ - assert(!split_cu_flag); - assert(transform_tree); - - return transform_tree->getTB(x,y); -} - - -const enc_tb* enc_tb::getTB(int px,int py) const -{ - if (split_transform_flag) { - int xHalf = x + (1<<(log2Size-1)); - int yHalf = y + (1<<(log2Size-1)); - - enc_tb* child; - - if (px<xHalf) { - if (py<yHalf) { - child = children0; - } - else { - child = children2; - } - } - else { - if (py<yHalf) { - child = children1; - } - else { - child = children3; - } - } - - if (!child) { return NULL; } - - return child->getTB(px,py); - } - - return this; -} - - - -void CTBTreeMatrix::alloc(int w,int h, int log2CtbSize) -{ - free(); - - int ctbSize = 1<<log2CtbSize; - - mWidthCtbs = (w+ctbSize-1) >> log2CtbSize; - mHeightCtbs = (h+ctbSize-1) >> log2CtbSize; - mLog2CtbSize = log2CtbSize; - - mCTBs.resize(mWidthCtbs * mHeightCtbs, NULL); -} - - -const enc_cb* CTBTreeMatrix::getCB(int px,int py) const -{ - int xCTB = px>>mLog2CtbSize; - int yCTB = py>>mLog2CtbSize; - - int idx = xCTB + yCTB*mWidthCtbs; - assert(idx < 
mCTBs.size()); - - enc_cb* cb = mCTBsidx; - if (!cb) { return NULL; } - - while (cb->split_cu_flag) { - int xHalf = cb->x + (1<<(cb->log2Size-1)); - int yHalf = cb->y + (1<<(cb->log2Size-1)); - - if (px<xHalf) { - if (py<yHalf) { - cb = cb->children0; - } - else { - cb = cb->children2; - } - } - else { - if (py<yHalf) { - cb = cb->children1; - } - else { - cb = cb->children3; - } - } - - if (!cb) { return NULL; } - } - - return cb; -} - - -const enc_tb* CTBTreeMatrix::getTB(int x,int y) const -{ - const enc_cb* cb = getCB(x,y); - if (!cb) { return NULL; } - - const enc_tb* tb = cb->transform_tree; - if (!tb) { return NULL; } - - return tb->getTB(x,y); -} - - -const enc_pb_inter* CTBTreeMatrix::getPB(int x,int y) const -{ - const enc_cb* cb = getCB(x,y); - - // TODO: get PB block based on partitioning - - return &cb->inter.pb0; -} - - -void CTBTreeMatrix::writeReconstructionToImage(de265_image* img, - const seq_parameter_set* sps) const -{ - for (size_t i=0;i<mCTBs.size();i++) { - const enc_cb* cb = mCTBsi; - cb->writeReconstructionToImage(img, sps); - } -} - -void enc_cb::writeReconstructionToImage(de265_image* img, - const seq_parameter_set* sps) const -{ - if (split_cu_flag) { - for (int i=0;i<4;i++) { - if (childreni) { - childreni->writeReconstructionToImage(img,sps); - } - } - } - else { - transform_tree->writeReconstructionToImage(img,sps); - } -} - -void enc_tb::writeReconstructionToImage(de265_image* img, - const seq_parameter_set* sps) const -{ - if (split_transform_flag) { - for (int i=0;i<4;i++) { - if (childreni) { - childreni->writeReconstructionToImage(img,sps); - } - } - } - else { - // luma pixels - - PixelAccessor lumaPixels(*reconstruction0, x,y); - lumaPixels.copyToImage(img, 0); - - // chroma pixels - - if (sps->chroma_format_idc == CHROMA_444) { - PixelAccessor chroma1Pixels(*reconstruction1, x,y); - chroma1Pixels.copyToImage(img, 1); - PixelAccessor chroma2Pixels(*reconstruction2, x,y); - chroma2Pixels.copyToImage(img, 2); - } - else if 
(log2Size>2) { - PixelAccessor chroma1Pixels(*reconstruction1, x>>1,y>>1); - chroma1Pixels.copyToImage(img, 1); - PixelAccessor chroma2Pixels(*reconstruction2, x>>1,y>>1); - chroma2Pixels.copyToImage(img, 2); - - //reconstruct_tb(ectx, img, x,y, log2Size-1, 1); - //reconstruct_tb(ectx, img, x,y, log2Size-1, 2); - } - else if (blkIdx==3) { - int xBase = x - (1<<log2Size); - int yBase = y - (1<<log2Size); - - PixelAccessor chroma1Pixels(*reconstruction1, xBase>>1,yBase>>1); - chroma1Pixels.copyToImage(img, 1); - PixelAccessor chroma2Pixels(*reconstruction2, xBase>>1,yBase>>1); - chroma2Pixels.copyToImage(img, 2); - - //reconstruct_tb(ectx, img, xBase,yBase, log2Size, 1); - //reconstruct_tb(ectx, img, xBase,yBase, log2Size, 2); - } - } -} - - -PixelAccessor enc_tb::getPixels(int x,int y, int cIdx, const seq_parameter_set& sps) -{ - int xL = x << sps.get_chroma_shift_W(cIdx); - int yL = y << sps.get_chroma_shift_H(cIdx); - - const enc_tb* tb = getTB(xL,yL); - - if (cIdx==0 || sps.chroma_format_idc == CHROMA_444) { - return PixelAccessor(*tb->reconstructioncIdx, tb->x, tb->y); - } - else if (sps.chroma_format_idc == CHROMA_420) { - if (tb->log2Size > 2) { - return PixelAccessor(*tb->reconstructioncIdx, - tb->x >> 1, - tb->y >> 1); - } - else { - enc_tb* parent = tb->parent; - tb = parent->children3; - - return PixelAccessor(*tb->reconstructioncIdx, - parent->x >> 1, - parent->y >> 1); - } - } - else { - assert(sps.chroma_format_idc == CHROMA_422); - - assert(false); // not supported yet - - return PixelAccessor::invalid(); - } -} - - -void PixelAccessor::copyToImage(de265_image* img, int cIdx) const -{ - uint8_t* p = img->get_image_plane_at_pos(cIdx, mXMin, mYMin); - int stride = img->get_image_stride(cIdx); - - for (int y=0;y<mHeight;y++) { - memcpy(p, mBase+mXMin+(y+mYMin)*mStride, mWidth); - p += stride; - } -} - -void PixelAccessor::copyFromImage(const de265_image* img, int cIdx) -{ - const uint8_t* p = img->get_image_plane_at_pos(cIdx, mXMin, mYMin); - int stride = 
img->get_image_stride(cIdx); - - for (int y=0;y<mHeight;y++) { - memcpy(mBase+mXMin+(y+mYMin)*mStride, p, mWidth); - p += stride; - } -}
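Note on the removed PixelAccessor code in the file above: it addresses a small per-block buffer with absolute picture coordinates, as in copyToImage() and copyFromImage(). The sketch below illustrates that idea under stated assumptions: BlockView and at() are hypothetical names, and instead of pre-shifting the base pointer as the removed class does, it stores the block origin and subtracts it on each access, which avoids forming a pointer outside the buffer.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical view onto a block-local buffer that is indexed with
// absolute picture coordinates (not part of libde265).
class BlockView {
public:
  // buf points to the block's first pixel, stride is pixels per row,
  // (x0,y0) is the block's top-left position in the picture.
  BlockView(uint8_t* buf, int stride, int x0, int y0)
    : mBuf(buf), mStride(stride), mX0(x0), mY0(y0) {}

  // Translate absolute (x,y) into an offset inside the local buffer.
  uint8_t& at(int x, int y) const {
    return mBuf[(x - mX0) + (y - mY0) * mStride];
  }

private:
  uint8_t* mBuf;
  int mStride;
  int mX0, mY0;
};
```

With a 4x4 block placed at picture position (8,12), at(9,13) refers to the element at local row 1, column 1, i.e. offset 5 in the buffer.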
libde265-1.0.17.tar.gz/libde265/encoder/encoder-types.h
Deleted
@@ -1,409 +0,0 @@ -/* - * H.265 video codec. - * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de> - * - * Authors: Dirk Farin <farin@struktur.de> - * - * This file is part of libde265. - * - * libde265 is free software: you can redistribute it and/or modify - * it under the terms of the GNU Lesser General Public License as - * published by the Free Software Foundation, either version 3 of - * the License, or (at your option) any later version. - * - * libde265 is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. - * - * You should have received a copy of the GNU Lesser General Public License - * along with libde265. If not, see <http://www.gnu.org/licenses/>. - */ - -#ifndef ENCODE_H -#define ENCODE_H - -#include "libde265/image.h" -#include "libde265/decctx.h" -#include "libde265/image-io.h" -#include "libde265/alloc_pool.h" - -#include <memory> - -class encoder_context; -class enc_cb; - - -class small_image_buffer -{ - public: - explicit small_image_buffer(int log2Size,int bytes_per_pixel=1); - ~small_image_buffer(); - - uint8_t* get_buffer_u8() const { return mBuf; } - int16_t* get_buffer_s16() const { return (int16_t*)mBuf; } - uint16_t* get_buffer_u16() const { return (uint16_t*)mBuf; } - template <class pixel_t> pixel_t* get_buffer() const { return (pixel_t*)mBuf; } - - void copy_to(small_image_buffer& b) const { - assert(b.mHeight==mHeight); - assert(b.mBytesPerRow==mBytesPerRow); - memcpy(b.mBuf, mBuf, mBytesPerRow*mHeight); - } - - int getWidth() const { return mWidth; } - int getHeight() const { return mHeight; } - - int getStride() const { return mStride; } // pixels per row - - private: - uint8_t* mBuf; - uint16_t mStride; - uint16_t mBytesPerRow; - - uint8_t mWidth, mHeight; - - // small_image_buffer cannot be copied - - small_image_buffer(const 
small_image_buffer&) { assert(false); } // = delete; - small_image_buffer& operator=(const small_image_buffer&) { assert(false); return *this; } // = delete; -}; - - -class enc_node -{ -public: - enc_node() { } - enc_node(int _x,int _y, int _log2Size) : x(_x), y(_y), log2Size(_log2Size) { } - virtual ~enc_node() { } - - uint16_t x,y; - uint8_t log2Size : 3; - - - static const int DUMPTREE_INTRA_PREDICTION = (1<<0); - static const int DUMPTREE_RESIDUAL = (1<<1); - static const int DUMPTREE_RECONSTRUCTION = (1<<2); - static const int DUMPTREE_ALL = 0xFFFF; - - virtual void debug_dumpTree(int flags, int indent=0) const = 0; -}; - - -class PixelAccessor -{ - public: - PixelAccessor(small_image_buffer& buf, int x0,int y0) { - mBase = buf.get_buffer_u8(); - mStride = buf.getStride(); - mXMin = x0; - mYMin = y0; - mWidth = buf.getWidth(); - mHeight= buf.getHeight(); - - mBase -= x0 + y0*mStride; - } - - const uint8_t* operator[](int y) const { return mBase+y*mStride; } - - int getLeft() const { return mXMin; } - int getWidth() const { return mWidth; } - int getTop() const { return mYMin; } - int getHeight() const { return mHeight; } - - void copyToImage(de265_image* img, int cIdx) const; - void copyFromImage(const de265_image* img, int cIdx); - - static PixelAccessor invalid() { - return PixelAccessor(); - } - - private: - uint8_t* mBase; - short mStride; - short mXMin, mYMin; - uint8_t mWidth, mHeight; - - PixelAccessor() { - mBase = NULL; - mStride = mXMin = mYMin = mWidth = mHeight = 0; - } -}; - - -class enc_tb : public enc_node -{ - public: - enc_tb(int x,int y,int log2TbSize, enc_cb* _cb); - ~enc_tb(); - - enc_tb* parent; - enc_cb* cb; - enc_tb** downPtr; - - uint8_t split_transform_flag : 1; - uint8_t TrafoDepth : 2; // 2 bits enough ? (TODO) - uint8_t blkIdx : 2; - - enum IntraPredMode intra_mode; - - // Note: in NxN partition mode, the chroma mode is always derived from - // the top-left child's intra mode (for chroma 4:2:0). 
- enum IntraPredMode intra_mode_chroma; - - uint8_t cbf3; - - - /* intra_prediction and residual is filled in tb-split, because this is where we decide - on the final block-size the TB is coded with. - */ - //mutable uint8_t debug_intra_border2*64+1; - std::shared_ptr<small_image_buffer> intra_prediction3; - std::shared_ptr<small_image_buffer> residual3; - - /* Reconstruction is computed on-demand in writeMetadata(). - */ - mutable std::shared_ptr<small_image_buffer> reconstruction3; - - union { - // split - struct { - enc_tb* children4; - }; - - // leaf node - struct { - int16_t* coeff3; - - bool skip_transform32; - uint8_t explicit_rdpcm32; - }; - }; - - float distortion; // total distortion for this level of the TB tree (including all children) - float rate; // total rate for coding this TB level and all children - float rate_withoutCbfChroma; - - void set_cbf_flags_from_children(); - - void reconstruct(encoder_context* ectx, de265_image* img) const; - void debug_writeBlack(encoder_context* ectx, de265_image* img) const; - - bool isZeroBlock() const { return cbf0==false && cbf1==false && cbf2==false; } - - void alloc_coeff_memory(int cIdx, int tbSize); - - const enc_tb* getTB(int x,int y) const; - - PixelAccessor getPixels(int x,int y, int cIdx, const seq_parameter_set& sps); - - void writeReconstructionToImage(de265_image* img, - const seq_parameter_set* sps) const; - - /* - static void* operator new(const size_t size) { return mMemPool.new_obj(size); } - static void operator delete(void* obj) { mMemPool.delete_obj(obj); } - */ - - - virtual void debug_dumpTree(int flags, int indent=0) const; - -private: - static alloc_pool mMemPool; - - void reconstruct_tb(encoder_context* ectx, - de265_image* img, int x0,int y0, int log2TbSize, - int cIdx) const; -}; - - -struct enc_pb_inter -{ - /* absolute motion information (for MV-prediction candidates) - */ - PBMotion motion; - - /* specification how to code the motion vector in the bitstream - */ - PBMotionCoding spec; 
- - - // NOT TRUE: refIdx in 'spec' is not used. It is taken from 'motion' - // Currently, information is duplicated. Same as with inter_pred_idc/predFlag. - - /* SPEC: - int8_t refIdx2; // not used - int16_t mvd22; - - uint8_t inter_pred_idc : 2; // enum InterPredIdc - uint8_t mvp_l0_flag : 1; - uint8_t mvp_l1_flag : 1; - uint8_t merge_flag : 1; - uint8_t merge_idx : 3; - */ -}; - - -class enc_cb : public enc_node -{ -public: - enc_cb(); - ~enc_cb(); - - enc_cb* parent; - enc_cb** downPtr; - - uint8_t split_cu_flag : 1; - uint8_t ctDepth : 2; - - - union { - // split - struct { - enc_cb* children4; // undefined when split_cu_flag==false - }; - - // non-split - struct { - uint8_t qp : 6; - uint8_t cu_transquant_bypass_flag : 1; // currently unused - uint8_t pcm_flag : 1; - - enum PredMode PredMode; // : 6; - enum PartMode PartMode; // : 3; - - union { - struct { - //enum IntraPredMode pred_mode4; - //enum IntraPredMode chroma_mode; - } intra; - - struct { - enc_pb_inter pb4; - - uint8_t rqt_root_cbf : 1; - } inter; - }; - - enc_tb* transform_tree; - }; - }; - - - float distortion; - float rate; - - - void set_rqt_root_bf_from_children_cbf(); - - /* Save CB reconstruction in the node and restore it again to the image. - Pixel data and metadata. - */ - //virtual void save(const de265_image*); - //virtual void restore(de265_image*); - - - /* Decode this CB: pixel data and write metadata to image. 
- */ - void reconstruct(encoder_context* ectx,de265_image* img) const; - - // can only be called on the lowest-level CB (with TB-tree as its direct child) - const enc_tb* getTB(int x,int y) const; - - void writeReconstructionToImage(de265_image* img, - const seq_parameter_set* sps) const; - - - virtual void debug_dumpTree(int flags, int indent=0) const; - - - // memory management - - static void* operator new(const size_t size) { - void* p = mMemPool.new_obj(size); - //printf("ALLOC %p\n",p); - return p; - } - static void operator delete(void* obj) { - //printf("DELETE %p\n",obj); - mMemPool.delete_obj(obj); - } - - private: - //void write_to_image(de265_image*) const; - - static alloc_pool mMemPool; -}; - - - -class CTBTreeMatrix -{ - public: - CTBTreeMatrix() : mWidthCtbs(0), mHeightCtbs(0), mLog2CtbSize(0) { } - ~CTBTreeMatrix() { free(); } - - void alloc(int w,int h, int log2CtbSize); - void clear() { free(); } - - void setCTB(int xCTB, int yCTB, enc_cb* ctb) { - int idx = xCTB + yCTB*mWidthCtbs; - assert(idx < mCTBs.size()); - if (mCTBsidx) { delete mCTBsidx; } - mCTBsidx = ctb; - } - - const enc_cb* getCTB(int xCTB, int yCTB) const { - int idx = xCTB + yCTB*mWidthCtbs; - assert(idx < mCTBs.size()); - return mCTBsidx; - } - - enc_cb** getCTBRootPointer(int x, int y) { - x >>= mLog2CtbSize; - y >>= mLog2CtbSize; - - int idx = x + y*mWidthCtbs; - assert(idx < mCTBs.size()); - return &mCTBsidx; - } - - const enc_cb* getCB(int x,int y) const; - const enc_tb* getTB(int x,int y) const; - const enc_pb_inter* getPB(int x,int y) const; - - void writeReconstructionToImage(de265_image* img, - const seq_parameter_set*) const; - - private: - std::vector<enc_cb*> mCTBs; - int mWidthCtbs; - int mHeightCtbs; - int mLog2CtbSize; - - void free() { - for (int i=0 ; i<mWidthCtbs*mHeightCtbs ; i++) { - if (mCTBsi) { - delete mCTBsi; - mCTBsi=NULL; - } - } - } -}; - - - - -inline int childX(int x0, int idx, int log2CbSize) -{ - return x0 + ((idx&1) << (log2CbSize-1)); -} - -inline 
int childY(int y0, int idx, int log2CbSize) -{ - return y0 + ((idx>>1) << (log2CbSize-1)); -} - - - -#endif
libde265-1.0.17.tar.gz/libde265/encoder/encpicbuf.cc
Deleted
@@ -1,313 +0,0 @@
-/*
- * H.265 video codec.
- * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de>
- *
- * This file is part of libde265.
- *
- * libde265 is free software: you can redistribute it and/or modify
- * it under the terms of the GNU Lesser General Public License as
- * published by the Free Software Foundation, either version 3 of
- * the License, or (at your option) any later version.
- *
- * libde265 is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public License
- * along with libde265. If not, see <http://www.gnu.org/licenses/>.
- */
-
-#include "libde265/encoder/encpicbuf.h"
-#include "libde265/util.h"
-
-
-encoder_picture_buffer::encoder_picture_buffer()
-{
-}
-
-encoder_picture_buffer::~encoder_picture_buffer()
-{
-  flush_images();
-}
-
-
-image_data::image_data()
-{
-  //printf("new %p\n",this);
-
-  frame_number = 0;
-
-  input = NULL;
-  prediction = NULL;
-  reconstruction = NULL;
-
-  // SOP metadata
-
-  sps_index = -1;
-  skip_priority = 0;
-  is_intra = true;
-
-  state = state_unprocessed;
-
-  is_in_output_queue = true;
-}
-
-image_data::~image_data()
-{
-  //printf("delete %p\n",this);
-
-  delete input;
-  // TODO: this could still be referenced in the packet output queue, so the
-  // images should really be refcounted. release for now to prevent leaks
-  delete reconstruction;
-  delete prediction;
-}
-
-
-// --- input pushed by the input process ---
-
-void encoder_picture_buffer::reset()
-{
-  flush_images();
-
-  mEndOfStream = false;
-}
-
-
-void encoder_picture_buffer::flush_images()
-{
-  while (!mImages.empty()) {
-    delete mImages.front();
-    mImages.pop_front();
-  }
-}
-
-
-image_data* encoder_picture_buffer::insert_next_image_in_encoding_order(const de265_image* img,
-                                                                        int frame_number)
-{
-  image_data* data = new image_data();
-  data->frame_number = frame_number;
-  data->input = img;
-  data->shdr.set_defaults();
-
-  mImages.push_back(data);
-
-  return data;
-}
-
-void encoder_picture_buffer::insert_end_of_stream()
-{
-  mEndOfStream = true;
-}
-
-
-// --- SOP structure ---
-
-void image_data::set_intra()
-{
-  is_intra = true;
-}
-
-void image_data::set_NAL_type(uint8_t nalType)
-{
-  nal.nal_unit_type = nalType;
-}
-
-void image_data::set_references(int sps_index, // -1 -> custom
-                                const std::vector<int>& l0,
-                                const std::vector<int>& l1,
-                                const std::vector<int>& lt,
-                                const std::vector<int>& keepMoreReferences)
-{
-  this->sps_index = sps_index;
-  ref0 = l0;
-  ref1 = l1;
-  longterm = lt;
-  keep = keepMoreReferences;
-
-
-  // TODO: pps.num_ref_idx_l0_default_active
-
-  shdr.num_ref_idx_l0_active = l0.size();
-  //shdr.num_ref_idx_l1_active = l1.size();
-
-  assert(l0.size() < MAX_NUM_REF_PICS);
-  for (size_t i=0;i<l0.size();i++) {
-    shdr.RefPicList[0][i] = l0[i];
-  }
-
-  /*
-  assert(l1.size() < MAX_NUM_REF_PICS);
-  for (int i=0;i<l1.size();i++) {
-    shdr.RefPicList[1][i] = l1[i];
-  }
-  */
-}
-
-void image_data::set_NAL_temporal_id(int temporal_id)
-{
-  this->nal.nuh_temporal_id = temporal_id;
-}
-
-void image_data::set_skip_priority(int skip_priority)
-{
-  this->skip_priority = skip_priority;
-}
-
-void encoder_picture_buffer::sop_metadata_commit(int frame_number)
-{
-  image_data* data = mImages.back();
-  assert(data->frame_number == frame_number);
-
-  data->state = image_data::state_sop_metadata_available;
-}
-
-
-
-// --- infos pushed by encoder ---
-
-void encoder_picture_buffer::mark_encoding_started(int frame_number)
-{
-  image_data* data = get_picture(frame_number);
-
-  data->state = image_data::state_encoding;
-}
-
-void encoder_picture_buffer::set_prediction_image(int frame_number, de265_image* pred)
-{
-  image_data* data = get_picture(frame_number);
-
-  data->prediction = pred;
-}
-
-void encoder_picture_buffer::set_reconstruction_image(int frame_number, de265_image* reco)
-{
-  image_data* data = get_picture(frame_number);
-
-  data->reconstruction = reco;
-}
-
-void encoder_picture_buffer::mark_encoding_finished(int frame_number)
-{
-  image_data* data = get_picture(frame_number);
-
-  data->state = image_data::state_keep_for_reference;
-
-
-  // --- delete images that are not required anymore ---
-
-  // first, mark all images unused
-
-  for (auto imgdata : mImages) {
-    imgdata->mark_used = false;
-  }
-
-  // mark all images that will be used later
-
-  for (int f : data->ref0) { get_picture(f)->mark_used=true; }
-  for (int f : data->ref1) { get_picture(f)->mark_used=true; }
-  for (int f : data->longterm) { get_picture(f)->mark_used=true; }
-  for (int f : data->keep) { get_picture(f)->mark_used=true; }
-  data->mark_used=true;
-
-  // copy over all images that we still keep
-
-  std::deque<image_data*> newImageSet;
-  for (auto imgdata : mImages) {
-    if (imgdata->mark_used || imgdata->is_in_output_queue) {
-      imgdata->reconstruction->PicState = UsedForShortTermReference; // TODO: this is only a hack
-
-      newImageSet.push_back(imgdata);
-    }
-    else {
-      // image is not needed anymore for reference, remove it from EncPicBuf
-
-      delete imgdata;
-    }
-  }
-
-  mImages = newImageSet;
-}
-
-
-
-// --- data access ---
-
-bool encoder_picture_buffer::have_more_frames_to_encode() const
-{
-  for (size_t i=0;i<mImages.size();i++) {
-    if (mImages[i]->state < image_data::state_encoding) {
-      return true;
-    }
-  }
-
-  return false;
-}
-
-
-image_data* encoder_picture_buffer::get_next_picture_to_encode()
-{
-  for (size_t i=0;i<mImages.size();i++) {
-    if (mImages[i]->state < image_data::state_encoding) {
-      return mImages[i];
-    }
-  }
-
-  return NULL;
-}
-
-
-const image_data* encoder_picture_buffer::get_picture(int frame_number) const
-{
-  for (size_t i=0;i<mImages.size();i++) {
-    if (mImages[i]->frame_number == frame_number)
-      return mImages[i];
-  }
-
-  assert(false);
-  return NULL;
-}
-
-
-image_data* encoder_picture_buffer::get_picture(int frame_number)
-{
-  for (size_t i=0;i<mImages.size();i++) {
-    if (mImages[i]->frame_number == frame_number)
-      return mImages[i];
-  }
-
-  assert(false);
-  return NULL;
-}
-
-
-bool encoder_picture_buffer::has_picture(int frame_number) const
-{
-  for (size_t i=0;i<mImages.size();i++) {
-    if (mImages[i]->frame_number == frame_number)
-      return true;
-  }
-
-  return false;
-}
-
-
-void encoder_picture_buffer::mark_image_is_outputted(int frame_number)
-{
-  image_data* idata = get_picture(frame_number);
-  assert(idata);
-
-  idata->is_in_output_queue = false;
-}
-
-
-void encoder_picture_buffer::release_input_image(int frame_number)
-{
-  image_data* idata = get_picture(frame_number);
-  assert(idata);
-
-  delete idata->input;
-  idata->input = NULL;
-}
libde265-1.0.17.tar.gz/libde265/encoder/encpicbuf.h
Deleted
@@ -1,144 +0,0 @@
-/*
- * H.265 video codec.
- * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de>
- *
- * This file is part of libde265.
- *
- * libde265 is free software: you can redistribute it and/or modify
- * it under the terms of the GNU Lesser General Public License as
- * published by the Free Software Foundation, either version 3 of
- * the License, or (at your option) any later version.
- *
- * libde265 is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public License
- * along with libde265. If not, see <http://www.gnu.org/licenses/>.
- */
-
-#ifndef DE265_ENCPICBUF_H
-#define DE265_ENCPICBUF_H
-
-#include "libde265/image.h"
-#include "libde265/sps.h"
-
-#include <deque>
-#include <vector>
-
-
-/* TODO: we need a way to quickly access pictures with a stable ID, like in the DPB.
- */
-
-struct image_data
-{
-  image_data();
-  ~image_data();
-
-  int frame_number;
-
-  const de265_image* input; // owner
-  de265_image* prediction; // owner
-  de265_image* reconstruction; // owner
-
-  // SOP metadata
-
-  nal_header nal; // TODO: image split into several NALs (always same NAL header?)
-
-  slice_segment_header shdr; // TODO: multi-slice pictures
-
-  std::vector<int> ref0;
-  std::vector<int> ref1;
-  std::vector<int> longterm;
-  std::vector<int> keep;
-  int sps_index;
-  int skip_priority;
-  bool is_intra; // TODO: remove, use shdr.slice_type instead
-
-  /* unprocessed              only input image has been inserted, no metadata
-     sop_metadata_available   sop-creator has filled in references and skipping metadata
-     a) encoding              encoding started for this frame, reconstruction image was created
-     .  keep_for_reference    encoding finished, picture is kept in the buffer for reference
-     b) skipped               image was skipped, no encoding was done, no reconstruction image
-  */
-  enum state {
-    state_unprocessed,
-    state_sop_metadata_available,
-    state_encoding,
-    state_keep_for_reference,
-    state_skipped
-  } state;
-
-  bool is_in_output_queue;
-
-  bool mark_used;
-
-
-  // --- SOP structure ---
-
-  void set_intra();
-  void set_NAL_type(uint8_t nalType);
-  void set_NAL_temporal_id(int temporal_id);
-  void set_references(int sps_index, // -1 -> custom
-                      const std::vector<int>& l0, const std::vector<int>& l1,
-                      const std::vector<int>& lt,
-                      const std::vector<int>& keepMoreReferences);
-  void set_skip_priority(int skip_priority);
-};
-
-
-class encoder_picture_buffer
-{
- public:
-  encoder_picture_buffer();
-  ~encoder_picture_buffer();
-
-
-  // --- input pushed by the input process ---
-
-  void reset();
-
-  image_data* insert_next_image_in_encoding_order(const de265_image*, int frame_number);
-  void insert_end_of_stream();
-
-
-  // --- SOP structure ---
-
-  void sop_metadata_commit(int frame_number); // note: frame_number is only for consistency checking
-
-
-  // --- infos pushed by encoder ---
-
-  void mark_encoding_started(int frame_number);
-  void set_prediction_image(int frame_number, de265_image*); // store it just for debugging fun
-  void set_reconstruction_image(int frame_number, de265_image*);
-  void mark_encoding_finished(int frame_number);
-
-
-
-  // --- data access ---
-
-  bool have_more_frames_to_encode() const;
-  image_data* get_next_picture_to_encode(); // or return NULL if no picture is available
-  const image_data* get_picture(int frame_number) const;
-  bool has_picture(int frame_number) const;
-
-  const image_data* peek_next_picture_to_encode() const {
-    assert(!mImages.empty());
-    return mImages.front();
-  }
-
-  void mark_image_is_outputted(int frame_number);
-  void release_input_image(int frame_number);
-
- private:
-  bool mEndOfStream;
-  std::deque<image_data*> mImages;
-
-  void flush_images();
-  image_data* get_picture(int frame_number);
-};
-
-
-#endif
libde265-1.0.17.tar.gz/libde265/encoder/sop.cc
Deleted
@@ -1,106 +0,0 @@
-/*
- * H.265 video codec.
- * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de>
- *
- * This file is part of libde265.
- *
- * libde265 is free software: you can redistribute it and/or modify
- * it under the terms of the GNU Lesser General Public License as
- * published by the Free Software Foundation, either version 3 of
- * the License, or (at your option) any later version.
- *
- * libde265 is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public License
- * along with libde265. If not, see <http://www.gnu.org/licenses/>.
- */
-
-#include "libde265/encoder/sop.h"
-#include "libde265/encoder/encoder-context.h"
-
-
-sop_creator_intra_only::sop_creator_intra_only()
-{
-}
-
-
-void sop_creator_intra_only::set_SPS_header_values()
-{
-  mEncCtx->get_sps().log2_max_pic_order_cnt_lsb = get_num_poc_lsb_bits();
-}
-
-
-void sop_creator_intra_only::insert_new_input_image(de265_image* img)
-{
-  img->PicOrderCntVal = get_pic_order_count();
-
-  reset_poc();
-  int poc = get_pic_order_count();
-
-  assert(mEncPicBuf);
-  image_data* imgdata = mEncPicBuf->insert_next_image_in_encoding_order(img, get_frame_number());
-
-  imgdata->set_intra();
-  imgdata->set_NAL_type(NAL_UNIT_IDR_N_LP);
-  imgdata->shdr.slice_type = SLICE_TYPE_I;
-  imgdata->shdr.slice_pic_order_cnt_lsb = get_pic_order_count_lsb();
-
-  mEncPicBuf->sop_metadata_commit(get_frame_number());
-
-  advance_frame();
-}
-
-
-// ---------------------------------------------------------------------------
-
-
-sop_creator_trivial_low_delay::sop_creator_trivial_low_delay()
-{
-}
-
-
-void sop_creator_trivial_low_delay::set_SPS_header_values()
-{
-  ref_pic_set rps;
-  rps.DeltaPocS0[0] = -1;
-  rps.UsedByCurrPicS0[0] = true;
-  rps.NumNegativePics = 1;
-  rps.NumPositivePics = 0;
-  rps.compute_derived_values();
-  mEncCtx->get_sps().ref_pic_sets.push_back(rps);
-  mEncCtx->get_sps().log2_max_pic_order_cnt_lsb = get_num_poc_lsb_bits();
-}
-
-
-void sop_creator_trivial_low_delay::insert_new_input_image(de265_image* img)
-{
-  img->PicOrderCntVal = get_pic_order_count();
-
-  int frame = get_frame_number();
-
-  std::vector<int> l0, l1, empty;
-  if (!isIntra(frame)) {
-    l0.push_back(frame-1);
-  }
-
-  assert(mEncPicBuf);
-  image_data* imgdata = mEncPicBuf->insert_next_image_in_encoding_order(img, get_frame_number());
-
-  if (isIntra(frame)) {
-    reset_poc();
-    imgdata->set_intra();
-    imgdata->set_NAL_type(NAL_UNIT_IDR_N_LP);
-    imgdata->shdr.slice_type = SLICE_TYPE_I;
-  } else {
-    imgdata->set_references(0, l0,l1, empty,empty);
-    imgdata->set_NAL_type(NAL_UNIT_TRAIL_R);
-    imgdata->shdr.slice_type = SLICE_TYPE_P;
-  }
-  imgdata->shdr.slice_pic_order_cnt_lsb = get_pic_order_count_lsb();
-  mEncPicBuf->sop_metadata_commit(get_frame_number());
-
-  advance_frame();
-}
libde265-1.0.17.tar.gz/libde265/encoder/sop.h
Deleted
@@ -1,147 +0,0 @@
-/*
- * H.265 video codec.
- * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de>
- *
- * This file is part of libde265.
- *
- * libde265 is free software: you can redistribute it and/or modify
- * it under the terms of the GNU Lesser General Public License as
- * published by the Free Software Foundation, either version 3 of
- * the License, or (at your option) any later version.
- *
- * libde265 is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public License
- * along with libde265. If not, see <http://www.gnu.org/licenses/>.
- */
-
-#ifndef DE265_SOP_H
-#define DE265_SOP_H
-
-#include "libde265/image.h"
-#include "libde265/sps.h"
-#include "libde265/encoder/configparam.h"
-//#include "libde265/encoder/encoder-context.h"
-#include "libde265/encoder/encpicbuf.h"
-
-#include <deque>
-#include <vector>
-
-/*
-struct refpic_set
-{
-  std::vector<int> l0;
-  std::vector<int> l1;
-};
-*/
-
-class encoder_context;
-
-
-class pic_order_counter
-{
- public:
-  pic_order_counter() { mFrameNumber=0; mPOC=0; mNumLsbBits=6; }
-
-  void reset_poc() { mPOC=0; }
-
-  int get_frame_number() const { return mFrameNumber; }
-
-  int get_pic_order_count() const { return mPOC; }
-  int get_pic_order_count_lsb() const {
-    return mPOC & ((1<<mNumLsbBits)-1);
-  }
-
-  void advance_frame(int n=1) { mFrameNumber+=n; mPOC+=n; }
-
-  void set_num_poc_lsb_bits(int n) { mNumLsbBits=n; }
-  int get_num_poc_lsb_bits() const { return mNumLsbBits; }
-
- private:
-  int mFrameNumber;
-  int mPOC;
-  int mNumLsbBits;
-};
-
-
-class sop_creator : public pic_order_counter
-{
- public:
-  sop_creator() { mEncCtx=NULL; mEncPicBuf=NULL; }
-  virtual ~sop_creator() { }
-
-  void set_encoder_context(encoder_context* encctx) { mEncCtx=encctx; }
-  void set_encoder_picture_buffer(encoder_picture_buffer* encbuf) { mEncPicBuf=encbuf; }
-
-  /* Fills in the following fields:
-     - SPS.ref_pic_sets
-     - SPS.log2_max_pic_order_cnt_lsb
-   */
-  virtual void set_SPS_header_values() = 0;
-
-  /* Fills in the following fields:
-     - NAL.nal_type
-     - SHDR.slice_type
-     - SHDR.slice_pic_order_cnt_lsb
-     - IMGDATA.references
-   */
-  virtual void insert_new_input_image(de265_image*) = 0;
-  virtual void insert_end_of_stream() { mEncPicBuf->insert_end_of_stream(); }
-
-  virtual int get_number_of_temporal_layers() const { return 1; }
-
-  //virtual std::vector<refpic_set> get_sps_refpic_sets() const = 0;
-
- protected:
-  encoder_context* mEncCtx;
-  encoder_picture_buffer* mEncPicBuf;
-};
-
-
-
-class sop_creator_intra_only : public sop_creator
-{
- public:
-  sop_creator_intra_only();
-
-  virtual void set_SPS_header_values();
-  virtual void insert_new_input_image(de265_image* img);
-};
-
-
-
-class sop_creator_trivial_low_delay : public sop_creator
-{
- public:
-  struct params {
-    params() {
-      intraPeriod.set_ID("sop-lowDelay-intraPeriod");
-      intraPeriod.set_minimum(1);
-      intraPeriod.set_default(250);
-    }
-
-    void registerParams(config_parameters& config) {
-      config.add_option(&intraPeriod);
-    }
-
-    option_int intraPeriod;
-  };
-
-  sop_creator_trivial_low_delay();
-
-  void setParams(const params& p) { mParams=p; }
-
-  virtual void set_SPS_header_values();
-  virtual void insert_new_input_image(de265_image* img);
-
- private:
-  params mParams;
-
-  bool isIntra(int frame) const { return (frame % mParams.intraPeriod)==0; }
-};
-
-
-#endif
libde265-1.0.17.tar.gz/tools
Deleted
-(directory)
libde265-1.0.17.tar.gz/tools/CMakeLists.txt
Deleted
@@ -1,22 +0,0 @@
-set(tool_targets
-  gen-enc-table
-  yuv-distortion
-  rd-curves
-  block-rate-estim
-  tests
-  bjoentegaard
-)
-
-set(gen-enc-table_SOURCES gen-entropy-table.cc)
-set(yuv-distortion_SOURCES yuv-distortion.cc)
-set(rd-curves_SOURCES rd-curves.cc)
-set(block-rate-estim_SOURCES block-rate-estim.cc)
-set(tests_SOURCES tests.cc)
-set(bjoentegaard_SOURCES bjoentegaard.cc)
-
-foreach(tool ${tool_targets})
-  add_executable(${tool} ${${tool}_SOURCES})
-  target_link_libraries(${tool} PRIVATE de265)
-endforeach()
-
-install(TARGETS ${tool_targets} DESTINATION ${CMAKE_INSTALL_BINDIR})
libde265-1.0.17.tar.gz/tools/bjoentegaard.cc
Deleted
@@ -1,344 +0,0 @@
-/*
- * H.265 video codec.
- * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de>
- *
- * This file is part of libde265.
- *
- * libde265 is free software: you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation, either version 3 of the License, or
- * (at your option) any later version.
- *
- * libde265 is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with libde265. If not, see <http://www.gnu.org/licenses/>.
- */
-
-#include <stdio.h>
-#include <stdlib.h>
-#include <fstream>
-#include <sstream>
-#include <string>
-#include <vector>
-#include <math.h>
-#include <unistd.h>
-
-
-const bool D = false;
-
-
-/* There are numerical stability problems in the matrix inverse.
-   Switching to long double seems to provide enough accuracy.
-   TODO: in the long term, use a better regression algorithm.
- */
-typedef long double FP;
-
-
-struct datapoint
-{
-  double rate;
-  double distortion;
-};
-
-struct BjoentegaardParams
-{
-  // a*log^3 R + b*log^2 R + c*log R + d
-  double a,b,c,d;
-
-  double minRate, maxRate;
-};
-
-std::vector<datapoint> curveA,curveB;
-BjoentegaardParams paramsA,paramsB;
-
-#define RATE_NORMALIZATION_FACTOR 1 //(1/1000.0)
-
-
-
-FP invf(int i,int j,const FP* m)
-{
-  int o = 2+(j-i);
-
-  i += 4+o;
-  j += 4-o;
-
-#define e(a,b) m[ ((j+b)%4)*4 + ((i+a)%4) ]
-
-  FP inv =
-    + e(+1,-1)*e(+0,+0)*e(-1,+1)
-    + e(+1,+1)*e(+0,-1)*e(-1,+0)
-    + e(-1,-1)*e(+1,+0)*e(+0,+1)
-    - e(-1,-1)*e(+0,+0)*e(+1,+1)
-    - e(-1,+1)*e(+0,-1)*e(+1,+0)
-    - e(+1,-1)*e(-1,+0)*e(+0,+1);
-
-  return (o%2)?inv : -inv;
-
- #undef e
-
-}
-
-bool inverseMatrix4x4(const FP *m, FP *out)
-{
-  FP inv[16];
-
-  for(int i=0;i<4;i++)
-    for(int j=0;j<4;j++)
-      inv[j*4+i] = invf(i,j,m);
-
-  FP D = 0;
-
-  for(int k=0;k<4;k++) D += m[k] * inv[k*4];
-
-  if (D == 0) return false;
-
-  D = 1.0 / D;
-
-  for (int i = 0; i < 16; i++)
-    out[i] = inv[i] * D;
-
-  return true;
-
-}
-
-
-
-BjoentegaardParams fitParams(const std::vector<datapoint>& curve)
-{
-  // build regression matrix
-
-  int n = curve.size();
-
-  FP X[4*n]; // regression matrix
-  FP XT[n*4]; // transpose of X
-
-  for (int i=0;i<n;i++) {
-    FP x = log(curve[i].rate) * RATE_NORMALIZATION_FACTOR;
-
-    X[4*i + 0] = 1;
-    X[4*i + 1] = x;
-    X[4*i + 2] = x*x;
-    X[4*i + 3] = x*x*x;
-
-    if (D) printf("%f %f %f %f ;\n",1.0,(double)x,(double)(x*x),(double)(x*x*x));
-
-    XT[i+0*n] = 1;
-    XT[i+1*n] = x;
-    XT[i+2*n] = x*x;
-    XT[i+3*n] = x*x*x;
-  }
-
-  if (D) {
-    printf("rate: ");
-    for (int i=0;i<n;i++) {
-      printf("%f ; ",curve[i].rate);
-    }
-    printf("\n");
-
-    printf("distortion: ");
-    for (int i=0;i<n;i++) {
-      printf("%f ; ",curve[i].distortion);
-    }
-    printf("\n");
-  }
-
-  // calc X^T * X
-
-  FP XTX[4*4];
-  for (int y=0;y<4;y++)
-    for (int x=0;x<4;x++) {
-      FP sum=0;
-
-      for (int i=0;i<n;i++)
-        {
-          sum += XT[y*n + i] * X[x + i*4];
-        }
-
-      XTX[y*4+x] = sum;
-    }
-
-  FP XTXinv[4*4];
-
-  inverseMatrix4x4(XTX, XTXinv);
-
-  if (D) {
-    for (int y=0;y<4;y++) {
-      for (int x=0;x<4;x++) {
-        printf("%f ",(double)XTXinv[y*4+x]);
-      }
-      printf("\n");
-    }
-  }
-
-  // calculate pseudo-inverse XP = (X^T * X)^-1 * X^T
-
-  FP XP[n*4];
-
-  for (int y=0;y<4;y++) {
-    for (int x=0;x<n;x++) {
-      FP sum=0;
-
-      for (int i=0;i<4;i++)
-        {
-          sum += XTXinv[y*4 + i] * XT[x + i*n];
-        }
-
-      XP[y*n+x] = sum;
-    }
-  }
-
-  // calculate regression parameters
-
-  FP p[4];
-
-  for (int k=0;k<4;k++)
-    {
-      FP sum=0;
-
-      for (int i=0;i<n;i++) {
-        sum += XP[k*n + i] * curve[i].distortion;
-      }
-
-      p[k]=sum;
-    }
-
-
-  BjoentegaardParams param;
-  param.d = p[0];
-  param.c = p[1];
-  param.b = p[2];
-  param.a = p[3];
-
-
-  // find min and max rate
-
-  param.minRate = curve[0].rate;
-  param.maxRate = curve[0].rate;
-
-  for (int i=1;i<n;i++) {
-    param.minRate = std::min(param.minRate, curve[i].rate);
-    param.maxRate = std::max(param.maxRate, curve[i].rate);
-  }
-
-  return param;
-}
-
-FP evalIntegralAt(const BjoentegaardParams& p, double x)
-{
-  FP sum = 0;
-
-  // integral of: d
-
-  sum += p.d * x;
-
-  // integral of: c*log(x)
-
-  sum += p.c * x* (log(x)-1);
-
-  // integral of: b*log(x)^2
-
-  sum += p.b * x * ((log(x)-2)*log(x)+2);
-
-  // integral of: a*log(x)^3
-
-  sum += p.a * x * (log(x)*((log(x)-3)*log(x)+6)-6);
-
-  return sum;
-}
-
-
-double calcBjoentegaard(const BjoentegaardParams& paramsA,
-                        const BjoentegaardParams& paramsB,
-                        double min_rate, double max_rate)
-{
-  double mini = std::max(paramsA.minRate, paramsB.minRate);
-  double maxi = std::min(paramsA.maxRate, paramsB.maxRate);
-
-  if (min_rate >= 0) mini = std::max(mini, min_rate);
-  if (max_rate >= 0) maxi = std::min(maxi, max_rate);
-
-  if (D) printf("range: %f %f\n",mini,maxi);
-
-  FP intA = evalIntegralAt(paramsA, maxi) - evalIntegralAt(paramsA, mini);
-  FP intB = evalIntegralAt(paramsB, maxi) - evalIntegralAt(paramsB, mini);
-
-  if (D) printf("int1:%f int2:%f\n",(double)intA,(double)intB);
-
-  return (intA-intB)/(maxi-mini);
-}
-
-
-std::vector<datapoint> readRDFile(const char* filename, float min_rate, float max_rate)
-{
-  std::vector<datapoint> curve;
-  std::ifstream istr(filename);
-
-  for (;;)
-    {
-      std::string line;
-      getline(istr, line);
-      if (istr.eof())
-        break;
-
-      if (line[0]=='#') continue;
-
-      std::stringstream sstr(line);
-      datapoint p;
-      sstr >> p.rate >> p.distortion;
-
-      if (min_rate>=0 && p.rate < min_rate) continue;
-      if (max_rate>=0 && p.rate > max_rate) continue;
-
-      curve.push_back(p);
-    }
-
-  return curve;
-}
-
-
-int main(int argc, char** argv)
-{
-  float min_rate = -1;
-  float max_rate = -1;
-
-  int c;
-  while ((c=getopt(argc,argv, "l:h:")) != -1) {
-    switch (c) {
-    case 'l': min_rate = atof(optarg); break;
-    case 'h': max_rate = atof(optarg); break;
-    }
-  }
-
-  curveA = readRDFile(argv[optind], min_rate, max_rate);
-  paramsA = fitParams(curveA);
-
-  printf("params A: %f %f %f %f\n",paramsA.a,paramsA.b,paramsA.c,paramsA.d);
-
-  printf("gnuplot: %f*log(x)**3+%f*log(x)**2+%f*log(x)+%f\n",paramsA.a,paramsA.b,paramsA.c,paramsA.d);
-
-  if (optind+1<argc) {
-    curveB = readRDFile(argv[optind+1], min_rate, max_rate);
-    paramsB = fitParams(curveB);
-
-    printf("params B: %f %f %f %f\n",paramsB.a,paramsB.b,paramsB.c,paramsB.d);
-
-    printf("gnuplot: %f*log(x)**3+%f*log(x)**2+%f*log(x)+%f\n",paramsB.a,paramsB.b,paramsB.c,paramsB.d);
-
-    double delta = calcBjoentegaard(paramsA,paramsB, min_rate,max_rate);
-
-    printf("Bjoentegaard delta: %f dB (A-B -> >0 -> first (A) is better)\n",delta);
-
-    if (delta>=0) {
-      printf("-> first is better by %f dB\n",delta);
-    }
-    else {
-      printf("-> second is better by %f dB\n",-delta);
-    }
-  }
-
-  return 0;
-}
libde265-1.0.17.tar.gz/tools/block-rate-estim.cc
Deleted
@@ -1,127 +0,0 @@
-/*
- * H.265 video codec.
- * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de>
- *
- * This file is part of libde265.
- *
- * libde265 is free software: you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation, either version 3 of the License, or
- * (at your option) any later version.
- *
- * libde265 is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with libde265. If not, see <http://www.gnu.org/licenses/>.
- */
-
-#include <vector>
-#include <string>
-#include <fstream>
-#include <math.h>
-
-
-struct datapoint {
-  int log2blksize;
-  float rate;
-  float estim;
-};
-
-std::vector<datapoint> pts;
-
-#define NBINS 100
-
-/*
-  #define ESTIMDIV 100
-  #define MAXESTIM 80000
-  std::vector<float> bitestim2MAXESTIM/ESTIMDIV;
-*/
-
-
-
-void print_bitestim_results(int log2blksize)
-{
-  float max_estim=0;
-
-  for (int i=0;i<pts.size();i++) {
-    if (log2blksize==0 || pts[i].log2blksize==log2blksize) {
-      max_estim = std::max(max_estim, pts[i].estim);
-    }
-  }
-
-
-
-  float epsilon = 0.0001;
-  float interval = (max_estim+epsilon) / NBINS;
-
-  for (int b=0;b<NBINS;b++) {
-
-    int cnt=0;
-    double sum=0;
-    float mini=999999;
-    float maxi=0;
-
-    for (int i=0;i<pts.size();i++)
-      if (log2blksize==0 || pts[i].log2blksize==log2blksize) {
-        int bin = pts[i].estim/interval;
-        if (bin==b) {
-          sum += pts[i].rate;
-
-          mini = std::min(mini,pts[i].rate);
-          maxi = std::max(maxi,pts[i].rate);
-          cnt++;
-        }
-      }
-
-    if (cnt>0) {
-      double mean = sum/cnt;
-
-      double var = 0;
-
-      for (int i=0;i<pts.size();i++)
-        if (log2blksize==0 || pts[i].log2blksize==log2blksize) {
-          int bin = pts[i].estim/interval;
-          if (bin==b) {
-            var += (pts[i].rate-mean)*(pts[i].rate-mean);
-          }
-        }
-
-      var /= cnt;
-      double stddev = sqrt(var);
-
-      printf("%f %f %f %f %f %f %f %d\n",
-             (b+0.5)*interval,mean,var,
-             mean-stddev,mean+stddev, mini,maxi,
-             cnt);
-    }
-  }
-}
-
-
-int main(int argc,char** argv)
-{
-  std::string tag = argv[1];
-
-  std::ifstream istr(argv[2]);
-  for (;;)
-    {
-      std::string t;
-      int log2blksize;
-      datapoint pt;
-
-      istr >> t >> pt.log2blksize >> pt.rate >> pt.estim;
-
-      if (istr.eof()) break;
-
-      if (t == tag) {
-        pts.push_back(pt);
-      }
-    }
-
-  print_bitestim_results(0);
-
-  return 0;
-}
libde265-1.0.17.tar.gz/tools/gen-entropy-table.cc
Deleted
@@ -1,477 +0,0 @@
-/*
- * H.265 video codec.
- * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de>
- *
- * This file is part of libde265.
- *
- * libde265 is free software: you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation, either version 3 of the License, or
- * (at your option) any later version.
- *
- * libde265 is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with libde265. If not, see <http://www.gnu.org/licenses/>.
- */
-
-
-#include "libde265/cabac.h"
-#include <assert.h>
-#include <time.h>
-#include <stdlib.h>
-#include <stdio.h>
-
-
-void simple_getline(char** lineptr,size_t* linelen,FILE* fh)
-{
-  const int LINESIZE=1000;
-
-  if (*lineptr==NULL) {
-    *linelen = LINESIZE;
-    *lineptr = (char*)malloc(LINESIZE);
-  }
-
-  char* p = *lineptr;
-
-  for (;;) {
-    assert(p - *lineptr < LINESIZE);
-
-    int c = fgetc(fh);
-    if (c == EOF || c == '\n') {
-      *p = 0;
-      break;
-    }
-
-    *p++ = c;
-  }
-}
-
-
-void generate_entropy_table()
-{
-#if 000
-  const int nSymbols=1000*1000*10;
-  const int oversample = 10;
-
-  double tab[64][2];
-
-  for (int i=0;i<64;i++)
-    for (int k=0;k<2;k++)
-      tab[i][k]=0;
-
-  srand(time(0));
-  //srand(123);
-
-  int cnt=1;
-  for (;;cnt++) {
-    printf("-------------------- %d --------------------\n",cnt);
-
-    for (int s=0;s<63;s++) {
-      CABAC_encoder_bitstream cabac_mix0;
-      CABAC_encoder_bitstream cabac_mix1;
-      CABAC_encoder_bitstream cabac_ref;
-
-      for (int i=0;i<nSymbols*oversample;i++) {
-        int r = rand();
-        int n = (r>>2) % 63;
-        int m = (r>>1) & 1;
-        int b = r & 1;
-
-        context_model model;
-        model.MPSbit = 1;
-        model.state = n;
-        cabac_ref.write_CABAC_bit(&model, b);
-
-        model.MPSbit = 1;
-        model.state = n;
-        cabac_mix0.write_CABAC_bit(&model, b);
-
-        model.MPSbit = 1;
-        model.state = n;
-        cabac_mix1.write_CABAC_bit(&model, b);
-
-        if (i%oversample == oversample/2) {
-          model.MPSbit = 1;
-          model.state = s;
-          cabac_mix0.write_CABAC_bit(&model, 0);
-
-          model.MPSbit = 1;
-          model.state = s;
-          cabac_mix1.write_CABAC_bit(&model, 1);
-
-          //b = rand() & 1;
-          //cabac_mix.write_CABAC_bypass(1);
-        }
-
-      }
-
-      cabac_ref.flush_CABAC();
-      cabac_mix0.flush_CABAC();
-      cabac_mix1.flush_CABAC();
-
-      int bits_ref = cabac_ref.size()*8;
-      int bits_mix0 = cabac_mix0.size()*8;
-      int bits_mix1 = cabac_mix1.size()*8;
-
-      //printf("bits: %d %d\n",bits_ref,bits_mix);
-      int bits_diff0 = bits_mix0-bits_ref;
-      int bits_diff1 = bits_mix1-bits_ref;
-      //printf("bits diff: %d\n",bits_diff);
-
-      double bits_per_symbol0 = bits_diff0 / double(nSymbols);
-      double bits_per_symbol1 = bits_diff1 / double(nSymbols);
-
-      tab[s][0] += bits_per_symbol0;
-      tab[s][1] += bits_per_symbol1;
-
-      double bps0 = tab[s][0]/cnt;
-      double bps1 = tab[s][1]/cnt;
-
-      printf("/* state=%2d */ 0x%05x /* %f */, 0x%05x /* %f */,\n", s,
-             (int)(bps1*0x8000), bps1,
-             (int)(bps0*0x8000), bps0);
-    }
-
-    printf(" 0x0010c, 0x3bfbb /* dummy, should never be used */\n");
-  }
-#endif
-}
-
-int probTab[128+2] = {
-  1537234,1602970,
-  1608644,1815493,
-  1822246,2245961,
-  916773,1329391,
-  1337504,1930659,
-  1063692,1707588,
-  868294,1532108,
-  842934,1555538,
-  689043,1396941,
-  860184,1789964,
-  534165,1258482,
-  672508,1598821,
-  578782,1476240,
-  602247,1613140,
-  409393,1206638,
-  459294,1356779,
-  430124,1359893,
-  308326,1050647,
-  313100,1099956,
-  293887,1088978,
-  220901,869582,
-  214967,881695,
-  197226,856990,
-  166131,767761,
-  152514,737406,
-  128332,663998,
-  117638,632653,
-  106178,595539,
-  90898,539506,
-  83437,509231,
-  76511,492801,
-  64915,443096,
-  57847,409809,
-  52730,385395,
-  45707,354059,
-  42018,333028,
-  37086,308073,
-  33256,284497,
-  36130,299172,
-  28831,270716,
-  25365,244840,
-  22850,221896,
-  19732,201462,
-  17268,183729,
-  15252,168106,
-  13787,153979,
-  12187,141455,
-  10821,130337,
-  9896,120165,
-  8626,112273,
-  8162,103886,
-  7201,96441,
-  6413,89805,
-  5886,83733,
-  5447,78084,
-  4568,73356,
-  4388,68831,
-  3959,64688,
-  3750,60804,
-  3407,57271,
-  3109,54024,
-  2820,51099,
-  48569,1451987,
-  0, 0,
-  0*22686225, 0
-};
-
-
-void generate_entropy_table_probTableWeighted()
-{
-#if 000
-  int64_t probTabSum=0;
-  for (int i=0;i<130;i++)
-    probTabSum += probTab[i];
-
-
-  const int nSymbols=1000*1000*10;
-  const int oversample = 10;
-
-  double tab[64][2];
-
-  for (int i=0;i<64;i++)
-    for (int k=0;k<2;k++)
-      tab[i][k]=0;
-
-  srand(time(0));
-  //srand(123);
-
-  int cnt=1;
-  for (;;cnt++) {
-    printf("-------------------- %d --------------------\n",cnt);
-
-    for (int s=0;s<63;s++) {
-      CABAC_encoder_bitstream cabac_mix0;
-      CABAC_encoder_bitstream cabac_mix1;
-      CABAC_encoder_bitstream cabac_ref;
-
-      for (int i=0;i<nSymbols*oversample;i++) {
-        int r = rand();
-
-        r %= probTabSum;
-        int idx=0;
-        while (r>probTab[idx]) {
-          r-=probTab[idx];
-          idx++;
-        }
-
-        assert(idx<=128);
-
-        int n = idx/2;
-        int b = idx&1;
-        bool bypass = (idx==128);
-
-        printf("%d %d %d\n",n,b,bypass);
-
-        context_model model;
-        model.MPSbit = 1;
-        model.state = n;
-        if (bypass) cabac_ref.write_CABAC_bypass(1);
-        else cabac_ref.write_CABAC_bit(&model, b);
-
-        model.MPSbit = 1;
-        model.state = n;
-        if (bypass) cabac_mix0.write_CABAC_bypass(1);
-        else cabac_mix0.write_CABAC_bit(&model, b);
-
-        model.MPSbit = 1;
-        model.state = n;
-        if (bypass) cabac_mix1.write_CABAC_bypass(1);
-        else cabac_mix1.write_CABAC_bit(&model, b);
-
-        if (i%oversample == oversample/2) {
-          model.MPSbit = 1;
-          model.state = s;
-          cabac_mix0.write_CABAC_bit(&model, 0);
-
-          model.MPSbit = 1;
-          model.state = s;
-          cabac_mix1.write_CABAC_bit(&model, 1);
-
-          //b = rand() & 1;
-          //cabac_mix.write_CABAC_bypass(1);
-        }
-
-      }
-
-      cabac_ref.flush_CABAC();
-      cabac_mix0.flush_CABAC();
-      cabac_mix1.flush_CABAC();
-
-      int bits_ref = cabac_ref.size()*8;
-      int bits_mix0 = cabac_mix0.size()*8;
-      int bits_mix1 = cabac_mix1.size()*8;
-
-      //printf("bits: %d %d\n",bits_ref,bits_mix);
-      int bits_diff0 = bits_mix0-bits_ref;
-      int bits_diff1 = bits_mix1-bits_ref;
-      //printf("bits diff: %d\n",bits_diff);
-
-      double bits_per_symbol0 = bits_diff0 / double(nSymbols);
-      double bits_per_symbol1 = bits_diff1 / double(nSymbols);
-
-      tab[s][0] += bits_per_symbol0;
-      tab[s][1] += bits_per_symbol1;
-
-      double bps0 = tab[s][0]/cnt;
-      double bps1 = tab[s][1]/cnt;
-
-      printf("/* state=%2d */ 0x%05x /* %f */, 0x%05x /* %f */,\n", s,
-             (int)(bps1*0x8000), bps1,
-             (int)(bps0*0x8000), bps0);
-    }
-
-    printf(" 0x0010c, 0x3bfbb /* dummy, should never be used */\n");
-  }
-#endif
-}
-
-
-void generate_entropy_table_replay()
-{
-#if 000
-  const int oversample = 10;
-
-  char* lineptr = NULL;
-  size_t linelen = 0;
-
-  for (int s=0;s<63;s++) {
-    CABAC_encoder_bitstream cabac_mix0;
-    CABAC_encoder_bitstream cabac_mix1;
-    CABAC_encoder_bitstream cabac_ref;
-
-    int nSymbols = 0;
-
-    FILE* fh = fopen("streamdump-paris-intra","r");
-
-    for (int i=0;i<80000000;i++) {
-      simple_getline(&lineptr,&linelen,fh);
-      if (feof(fh))
-        break;
-
-      int n,b;
-      sscanf(lineptr,"%d %d",&n,&b);
-
-      bool bypass = (n==64);
-
-      if ((i%10000)==0)
-        { printf("%d %d %d \r",i,n,b);
-        }
-
-      //printf("%d %d %d\n",n,b,bypass);
-
-      context_model model;
-      model.MPSbit = 1;
-      model.state = n;
-      if (bypass) cabac_ref.write_CABAC_bypass(1);
-      else cabac_ref.write_CABAC_bit(&model, b);
-
-      model.MPSbit = 1;
-      model.state = n;
-      if (bypass) cabac_mix0.write_CABAC_bypass(1);
-      else cabac_mix0.write_CABAC_bit(&model, b);
-
-      model.MPSbit = 1;
-      model.state = n;
-      if (bypass) cabac_mix1.write_CABAC_bypass(1);
-      else cabac_mix1.write_CABAC_bit(&model, b);
-
-      if (i%oversample == oversample/2) {
-        model.MPSbit = 1;
-        model.state = s;
-        cabac_mix0.write_CABAC_bit(&model, 0);
-
-        model.MPSbit = 1;
-        model.state = s;
-        cabac_mix1.write_CABAC_bit(&model, 1);
-
-        nSymbols++;
-
-        //b = rand() & 1;
-        //cabac_mix.write_CABAC_bypass(1);
-      }
-    }
-
-    fclose(fh);
-
-    cabac_ref.flush_CABAC();
-    cabac_mix0.flush_CABAC();
-    cabac_mix1.flush_CABAC();
-
-    int bits_ref = cabac_ref.size()*8;
-    int bits_mix0 = cabac_mix0.size()*8;
-    int bits_mix1 = cabac_mix1.size()*8;
-
-    //printf("bits: %d %d\n",bits_ref,bits_mix);
-    int bits_diff0 = bits_mix0-bits_ref;
-    int bits_diff1 = bits_mix1-bits_ref;
-    //printf("bits diff: %d\n",bits_diff);
-
-    double bits_per_symbol0 = bits_diff0 / double(nSymbols);
-    double bits_per_symbol1 = bits_diff1 / double(nSymbols);
-
-    double bps0 = bits_per_symbol0;
-    double bps1 = bits_per_symbol1;
-
-    printf("/* state=%2d */ 0x%05x /* %f */, 0x%05x /* %f */,\n", s,
-           (int)(bps1*0x8000), bps1,
-           (int)(bps0*0x8000), bps0);
-  }
-
-  printf(" 0x0010c, 0x3bfbb /* dummy, should never be used */\n");
-#endif
-}
-
-
-void test_entropy_table_replay()
-{
-#if 000
-  char* lineptr = NULL;
-  size_t linelen = 0;
-
-
-  CABAC_encoder_bitstream cabac_bs;
-  CABAC_encoder_estim cabac_estim;
-
-  //FILE* fh = fopen("y","r");
-  //FILE* fh = fopen("own-dump","r");
-  //FILE* fh = fopen("rawstream-dump","r");
-  //FILE* fh = fopen("johnny-stream-dump","r");
-  FILE* fh = fopen("streamdump-paris-intra","r");
-
-  for (int i=0;i<80000000;i++) {
-    simple_getline(&lineptr,&linelen,fh);
-    if (feof(fh))
-      break;
-
-    int n,b;
-    sscanf(lineptr,"%d %d",&n,&b);
-    b=!b;
-    bool bypass = (n==64);
-
-    if ((i%10000)==0)
-      { printf("%d %d %d \n",i,n,b);
-      }
-
-    context_model model;
-    model.MPSbit = 1;
-    model.state = n;
-    if (bypass) cabac_bs.write_CABAC_bypass(1);
-    else cabac_bs.write_CABAC_bit(&model, b);
-
-    model.MPSbit = 1;
-    model.state = n;
-    if (bypass) cabac_estim.write_CABAC_bypass(1);
-    else cabac_estim.write_CABAC_bit(&model, b);
-  }
-
-  fclose(fh);
-
-  printf("bs:%d estim:%d\n",cabac_bs.size(),cabac_estim.size());
-#endif
-}
-
-
-int main(int argc, char** argv)
-{
-  //generate_entropy_table();
-  //generate_entropy_table_replay();
-
-  test_entropy_table_replay();
-
-  return 0;
-}
libde265-1.0.17.tar.gz/tools/rd-curves.cc
Deleted
@@ -1,1075 +0,0 @@
-/*
- * H.265 video codec.
- * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de>
- *
- * This file is part of libde265.
- *
- * libde265 is free software: you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation, either version 3 of the License, or
- * (at your option) any later version.
- *
- * libde265 is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with libde265. If not, see <http://www.gnu.org/licenses/>.
- */
-
-#ifdef HAVE_CONFIG_H
-#include "config.h"
-#endif
-
-#include <stdio.h>
-#include <stdlib.h>
-#include <sys/types.h>
-#include <sys/stat.h>
-#include <unistd.h>
-#include <assert.h>
-#include <cstdint>
-#include <string.h>
-#include <getopt.h>
-
-#include <sys/time.h>
-
-#if !defined(_WIN32) && !defined(WIN32)
-#include <sys/times.h>
-#endif
-
-#include <string>
-#include <sstream>
-#include <iostream>
-#include <iomanip>
-#include <fstream>
-#include <vector>
-
-
-
-static struct {
-  const char* name;
-  const char* value;
-} variables[] = {
-  { "$HOME" , "/home/domain/farindk" },
-  { "$ROOT" , "/home/domain/farindk/prog/h265" },
-  { "$ENC265" , "$ROOT/libde265/enc265/enc265" },
-  { "$DEC265" , "$ROOT/libde265/dec265/dec265" },
-  { "$YUVDIST" , "$ROOT/libde265/tools/yuv-distortion" },
-  { "$YUVTMP" , "/mnt/temp/dirk/yuv/ftp.tnt.uni-hannover.de/testsequences" },
-  { "$YUV" , "/storage/users/farindk/yuv" },
-  { "$HMENC" , "HM13enc" },
-  { "$HM13CFG" , "$ROOT/HM/HM-13.0-dev/cfg" },
-  { "$HMSCCENC", "HM-SCC-enc" },
-  { "$HMSCCCFG", "$ROOT/HM/HM-SCC-extensions/cfg" },
-  { "$X265ENC" , "$ROOT/x265/build/linux/x265" },
-  { "$X264" , "x264" },
-  { "$FFMPEG" , "ffmpeg" },
-  { "$F265" , "$ROOT/f265/build/f265cli" },
-  { 0,0 }
-};
-
-
-bool keepStreams = false;
-int maxFrames = 0;
-std::string encoderParameters;
-
-
-std::string replace_variables(std::string str)
-{
-  bool replaced = false;
-  for (int i=0;variables[i].name;i++) {
-    size_t pos = str.find(variables[i].name);
-    if (pos != std::string::npos) {
-      replaced = true;
-      str = str.replace(pos, strlen(variables[i].name), variables[i].value);
-      break;
-    }
-  }
-
-  if (!replaced) return str;
-  else return replace_variables(str);
-}
-
-
-// ---------------------------------------------------------------------------
-
-struct Preset
-{
-  const int ID;
-  const char* name;
-  const char* descr;
-
-  const char* options_de265;
-  const char* options_hm;
-  const char* options_hm_scc;
-  const char* options_x265;
-  const char* options_f265;
-  const char* options_x264;
-  const char* options_x264_ffmpeg;
-  const char* options_ffmpeg_mpeg2;
-
-  //int nFrames;
-};
-
-
-Preset preset[] = {
-  { 1, "pre01-intra-noLF", "intra, no LF, no SBH, CTB-size 32, min CB=8",
-    /* de265 */ "--sop-structure intra",
-    /* HM */ "-c $HM13CFG/encoder_intra_main.cfg -SBH 0 --SAO=0 --LoopFilterDisable --DeblockingFilterControlPresent --MaxCUSize=32 --MaxPartitionDepth=2",
-    /* HM SCC */ "-c $HMSCCCFG/encoder_intra_main_scc.cfg -SBH 0 --SAO=0 --LoopFilterDisable --DeblockingFilterControlPresent --MaxCUSize=32 --MaxPartitionDepth=2",
-    /* x265 */ "--no-lft -I 1 --no-signhide",
-    /* f265 */ "key-frame-spacing=1",
-    /* x264 */ "-I 1",
-    /* ffmpeg */ "-g 1",
-    /* mpeg-2 */ "-g 1"
-    // 0 // all frames
-  },
-
-  { 2, "pre02-fastIntra", "intra, no LF, no SBH, CTB-size 32, min CB=8",
-    /* de265 */ "--sop-structure intra --TB-IntraPredMode minSSD",
-    /* HM */ "-c $HM13CFG/encoder_intra_main.cfg -SBH 0 --SAO=0 --LoopFilterDisable --DeblockingFilterControlPresent --MaxCUSize=32 --MaxPartitionDepth=2",
-    /* HM SCC */ "-c $HMSCCCFG/encoder_intra_main_scc.cfg -SBH 0 --SAO=0 --LoopFilterDisable --DeblockingFilterControlPresent --MaxCUSize=32 --MaxPartitionDepth=2",
-    /* x265 */ "--no-lft -I 1 --no-signhide",
-    /* f265 */ "key-frame-spacing=1",
-    /* x264 */ "-I 1",
-    /* ffmpeg */ "-g 1",
-    /* mpeg-2 */ "-g 1"
-    // 0 // all frames
-  },
-
-  { 3, "pre03-fastIntra", "pre02, but fast-brute",
-    /* de265 */ "--sop-structure intra --TB-IntraPredMode fast-brute",
-    /* HM */ "-c $HM13CFG/encoder_intra_main.cfg -SBH 0 --SAO=0 --LoopFilterDisable --DeblockingFilterControlPresent --MaxCUSize=32 --MaxPartitionDepth=2",
-    /* HM SCC */ "-c $HMSCCCFG/encoder_intra_main_scc.cfg -SBH 0 --SAO=0 --LoopFilterDisable --DeblockingFilterControlPresent --MaxCUSize=32 --MaxPartitionDepth=2",
-    /* x265 */ "--no-lft -I 1 --no-signhide",
-    /* f265 */ "key-frame-spacing=1",
-    /* x264 */ "-I 1",
-    /* ffmpeg */ "-g 1",
-    /* mpeg-2 */ "-g 1"
-    // 0 // all frames
-  },
-
-  { 50, "cb-auto16", "(development test)",
-    /* de265 */ "--max-cb-size 16 --min-cb-size 8",
-    /* HM */ "-c $HM13CFG/encoder_intra_main.cfg -SBH 0 --SAO=0 --LoopFilterDisable --DeblockingFilterControlPresent --MaxCUSize=32 --MaxPartitionDepth=2",
-    /* HM SCC */ "-c $HMSCCCFG/encoder_intra_main_scc.cfg -SBH 0 --SAO=0 --LoopFilterDisable --DeblockingFilterControlPresent --MaxCUSize=32 --MaxPartitionDepth=2",
-    /* x265 */ "--no-lft -I 1 --no-signhide",
-    /* f265 */ "key-frame-spacing=1",
-    /* x264 */ "-I 1",
-    /* ffmpeg */ "-g 1",
-    /* mpeg-2 */ "-g 1"
-    // 0 // all frames
-  },
-
-  { 80, "lowdelay", "default (low-default) encoder parameters",
-    "--MEMode search --max-cb-size 32 --min-cb-size 8 --min-tb-size 4 --CB-IntraPartMode-Fixed-partMode 2Nx2N --CB-IntraPartMode fixed --TB-IntraPredMode min-residual --PB-MV-TestMode zero",
-    /* de265 */ //"--sop-structure low-delay --MEMode search --max-cb-size 32 --min-cb-size 8 --min-tb-size 4 --CB-IntraPartMode fixed --TB-IntraPredMode min-residual",
-    /* HM */ "-c $HM13CFG/encoder_lowdelay_main.cfg -ip 248",
-    /* HM SCC */ "-c $HMSCCCFG/encoder_lowdelay_main_scc.cfg -ip 248",
-    /* x265 */ "-I 248 --no-wpp --bframes 0", // GOP size: 248
-    /* f265 */ 0, //"key-frame-spacing=248",
-    /* x264 */ "",
-    /* ffmpeg */ "-g 248 -bf 0",
-    /* mpeg-2 */ "" // GOP size 248 does not make sense here
-    // 0 // all frames
-  },
-
-  { 98, "best", "default (random-access) encoder parameters",
-    /* de265 */ "--max-cb-size 16 --min-cb-size 8",
-    /* HM */ "-c $HM13CFG/encoder_randomaccess_main.cfg",
-    /* HM SCC */ "-c $HMSCCCFG/encoder_randomaccess_main_scc.cfg",
-    /* x265 */ "",
-    /* f265 */ "",
-    /* x264 */ "",
-    /* ffmpeg */ "",
-    /* mpeg-2 */ ""
-    // 0 // all frames
-  },
-
-  { 99, "besteq", "default (random-access) encoder parameters, I-frame distance = 248",
-    /* de265 */ "",
-    /* HM */ "-c $HM13CFG/encoder_randomaccess_main.cfg -ip 248",
-    /* HM SCC */ "-c $HMSCCCFG/encoder_randomaccess_main_scc.cfg -ip 248",
-    /* x265 */ "-I 248 --no-wpp", // GOP size: 248
-    /* f265 */ "key-frame-spacing=248",
-    /* x264 */ "",
-    /* ffmpeg */ "-g 248",
-    /* mpeg-2 */ "" // GOP size 248 does not make sense here
-    // 0 // all frames
-  },
-
-  { 0, NULL }
-};
-
-// ---------------------------------------------------------------------------
-
-class Input
-{
-public:
-  Input() {
-    width=height=0;
-    maxFrames=0;
-  }
-
-  void setInput(const char* yuvfilename,int w,int h, float fps) {
-    mInputFilename = yuvfilename;
-    width = w;
-    height = h;
-    mFPS = fps;
-  }
-
-  void setMaxFrames(int n) { maxFrames=n; }
-
-  std::string options_de265() const {
-    std::stringstream sstr;
-    sstr << " -i " << mInputFilename << " --width " << width << " --height " << height;
-    if (maxFrames) sstr << " --frames " << maxFrames;
-
-    return sstr.str();
-  }
-
-  std::string options_HM() const {
-    std::stringstream sstr;
-    sstr << "-i " << mInputFilename << " -wdt " << width << " -hgt " << height
-         << " -fr " << mFPS;
-    if (maxFrames) sstr << " -f " << maxFrames;
-
-    return sstr.str();
-  }
-
-  std::string options_x265() const {
-    std::stringstream sstr;
-    sstr << mInputFilename << " --input-res " << width << "x" << height
-         << " --fps " << mFPS;
-    if (maxFrames) sstr << " -f " << maxFrames;
-
-    return sstr.str();
-  }
-
-  std::string options_x264() const {
-    std::stringstream sstr;
-    sstr << mInputFilename << " --input-res " << width << "x" << height;
-    sstr << " --fps 25"; // TODO: check why crf/qp rate-control freaks out when fps is != 25
-    if (maxFrames) sstr << " --frames " << maxFrames;
-
-    return sstr.str();
-  }
-
-  std::string options_ffmpeg() const {
-    std::stringstream sstr;
-    sstr << "-f rawvideo -vcodec rawvideo -s " << width << "x" << height; // << " -r " << mFPS
-    sstr << " -pix_fmt yuv420p -i " << mInputFilename;
-    if (maxFrames) sstr << " -vframes " << maxFrames;
-
-    return sstr.str();
-  }
-
-  std::string options_f265() const {
-    std::stringstream sstr;
-    sstr << mInputFilename << " -w " << width << ":" << height;
-    if (maxFrames) sstr << " -c " << maxFrames;
-
-    return sstr.str();
-  }
-
-  std::string getFilename() const { return mInputFilename; }
-  float getFPS() const { return mFPS; }
-  int getNFrames() const { return maxFrames; }
-  int getWidth() const { return width; }
-  int getHeight() const { return height; }
-
-private:
-  std::string mInputFilename;
-  int width, height;
-  int maxFrames;
-  float mFPS;
-};
-
-Input input;
-
-struct InputSpec
-{
-  const char* name;
-  const char* filename;
-  int width,height, nFrames;
-  float fps;
-} inputSpec[] = {
-  { "paris", "$YUV/paris_cif.yuv",352,288,1065, 30.0 },
-  { "paris10", "$YUV/paris_cif.yuv",352,288, 10, 30.0 },
-  { "paris100", "$YUV/paris_cif.yuv",352,288, 100, 30.0 },
-  { "johnny", "$YUV/Johnny_1280x720_60.yuv",1280,720,600,60.0 },
-  { "johnny10", "$YUV/Johnny_1280x720_60.yuv",1280,720, 10,60.0 },
-  { "johnny100", "$YUV/Johnny_1280x720_60.yuv",1280,720,100,60.0 },
-  { "cactus", "$YUV/Cactus_1920x1080_50.yuv",1920,1080,500,50.0 },
-  { "cactus10", "$YUV/Cactus_1920x1080_50.yuv",1920,1080, 10,50.0 },
-  { "4people", "$YUVTMP/FourPeople_1280x720_60.yuv",1280,720,600,60.0 },
-  { "4people100", "$YUVTMP/FourPeople_1280x720_60.yuv",1280,720,100,60.0 },
-  { "slideedit", "$YUVTMP/SlideEditing_1280x720_30.yuv",1280,720,300,30.0 },
-  { "slideedit100","$YUVTMP/SlideEditing_1280x720_30.yuv",1280,720,100,30.0 },
-  { "slideshow", "$YUVTMP/SlideShow_1280x720_20.yuv",1280,720,500,20.0 },
-  { "slideshow100","$YUVTMP/SlideShow_1280x720_20.yuv",1280,720,100,20.0 },
-  { "screensharing","$HOME/test-screensharing-encoding/Screensharing.yuv",1360,768,4715,60.0 },
-  { NULL }
-};
-
-
-void setInput(const char* input_preset)
-{
-  bool presetFound=false;
-
-  for (int i=0;inputSpec[i].name;i++) {
-    if (strcmp(input_preset, inputSpec[i].name)==0) {
-      input.setInput(inputSpec[i].filename,
-                     inputSpec[i].width,
-                     inputSpec[i].height,
-                     inputSpec[i].fps);
-      input.setMaxFrames(inputSpec[i].nFrames);
-      presetFound=true;
-      break;
-    }
-  }
-
-  if (!presetFound) {
-    fprintf(stderr,"no input preset '%s'\n",input_preset);
-    exit(5);
-  }
-}
-
-
-float bitrate(const char* filename)
-{
-  struct stat s;
-  stat(filename,&s);
-
-  long size = s.st_size;
-
-  int frames = input.getNFrames();
-  assert(frames!=0);
-
-  float bitrate = size*8/(frames/input.getFPS());
-  return bitrate;
-}
-
-
-// ---------------------------------------------------------------------------
-
-class Quality
-{
-public:
-  virtual ~Quality() { }
-
-  virtual void measure(const char* h265filename);
-  virtual void measure_yuv(const char* yuvfilename);
-
-  float psnr, ssim;
-};
-
-
-void Quality::measure(const char* h265filename)
-{
-  std::stringstream sstr;
-  sstr << "$DEC265 " << h265filename << " -q -t6 -m " << input.getFilename() << " | grep total "
-    //"| awk '{print $2}' "
-    ">/tmp/xtmp";
-
-  //std::cout << sstr.str() << "\n";
-  int retval = system(replace_variables(sstr.str()).c_str());
-
-  std::ifstream istr;
-  istr.open("/tmp/xtmp");
-  std::string dummy;
-  istr >> dummy >> psnr >> dummy >> dummy >> ssim;
-
-  unlink("/tmp/xtmp");
-}
-
-
-void Quality::measure_yuv(const char* yuvfilename)
-{
-  std::stringstream sstr;
-
-  sstr << "$YUVDIST " << input.getFilename() << " " << yuvfilename
-       << " " << input.getWidth() << " " << input.getHeight()
-       << "|grep total "
-    //"|awk '{print $2}' "
-    ">/tmp/ytmp";
-
-  //std::cout << sstr.str() << "\n";
-  int retval = system(replace_variables(sstr.str()).c_str());
-
-  std::ifstream istr;
-  istr.open("/tmp/ytmp");
-  std::string dummy;
-  istr >> dummy >> psnr >> ssim;
-
-  unlink("/tmp/ytmp");
-}
-
-Quality quality;
-
-// ---------------------------------------------------------------------------
-
-long ticks_per_second;
-
-void init_clock()
-{
-#if !defined(_WIN32) && !defined(WIN32)
-  ticks_per_second = sysconf(_SC_CLK_TCK);
-#endif
-}
-
-double get_cpu_time()
-{
-#if !defined(_WIN32) && !defined(WIN32)
-  struct tms t;
-  times(&t);
-  return double(t.tms_cutime)/ticks_per_second;
-#else
-  return 0; // not supported on windows (TODO)
-#endif
-}
-
-double get_wall_time()
-{
-  struct timeval tv;
-  gettimeofday(&tv, NULL);
-  double t = tv.tv_sec;
-  double ut = tv.tv_usec/1000000.0f;
-  t += ut;
-  return t;
-}
-
-
-struct RDPoint
-{
-  float rate;
-  float psnr;
-  float ssim;
-  double cpu_time; // computation time in seconds
-  double wall_time;
-
-
-  RDPoint() { }
-
-  void compute_from_h265(std::string stream_name) {
-    rate = bitrate(stream_name.c_str());
-    quality.measure(stream_name.c_str());
-    psnr = quality.psnr;
-    ssim = quality.ssim;
-  }
-
-  void compute_from_yuv(std::string stream_name, std::string yuv_name) {
-    rate = bitrate(stream_name.c_str());
-    quality.measure_yuv(yuv_name.c_str());
-    psnr = quality.psnr;
-    ssim = quality.ssim;
-  }
-
-  void start_timer() {
-    cpu_time = get_cpu_time();
-    wall_time= get_wall_time();
-  }
-
-  void end_timer() {
-    cpu_time = get_cpu_time() - cpu_time;
-    wall_time= get_wall_time()- wall_time;
-  }
-};
-
-
-FILE* output_fh;
-
-void write_rd_line(RDPoint p)
-{
-  fprintf(output_fh,"%9.2f %6.4f %5.3f %5.4f %5.4f\n",
-          p.rate/1024, p.psnr, p.ssim,
-          p.cpu_time/60, p.wall_time/60);
-  fflush(output_fh);
-}
-
-
-
-
-class Encoder
-{
-public:
-  virtual ~Encoder() { }
-
-  virtual std::vector<RDPoint> encode_curve(const Preset& preset) const = 0;
-
-private:
-};
-
-
-class Encoder_de265 : public Encoder
-{
-public:
-  Encoder_de265();
-  void setQPRange(int low,int high,int step) { mQPLow=low; mQPHigh=high; mQPStep=step; }
-
-  virtual std::vector<RDPoint> encode_curve(const Preset& preset) const;
-
-private:
-  RDPoint encode(const Preset& preset,int qp) const;
-
-  int mQPLow,mQPHigh,mQPStep;
-};
-
-
-Encoder_de265::Encoder_de265()
-{
-  mQPLow = 14;
-  mQPHigh= 40;
-  mQPStep= 2;
-}
-
-
-std::vector<RDPoint> Encoder_de265::encode_curve(const Preset& preset) const
-{
-  std::vector<RDPoint> curve;
-
-  for (int qp=mQPHigh ; qp>=mQPLow ; qp-=mQPStep) {
-    curve.push_back(encode(preset, qp));
-  }
-
-  return curve;
-}
-
-
-RDPoint Encoder_de265::encode(const Preset& preset,int qp) const
-{
-  std::stringstream streamname;
-  streamname << "de265-" << preset.name << "-" << qp << ".265";
-
-  std::stringstream cmd1;
-  cmd1 << "$ENC265 " << input.options_de265()
-       << " " << preset.options_de265
-       << " -q " << qp << " -o " << streamname.str()
-       << " " << encoderParameters;
-
-  std::string cmd2 = replace_variables(cmd1.str());
-
-  printf("cmdline: %s\n",cmd2.c_str());
-
-  RDPoint rd;
-  rd.start_timer();
-  int retval = system(cmd2.c_str());
-  rd.end_timer();
-
-  rd.compute_from_h265(streamname.str());
-
-  if (!keepStreams) { unlink(streamname.str().c_str()); }
-
-  write_rd_line(rd);
-
-  return rd;
-}
-
-
-
-
-class Encoder_HM : public Encoder
-{
-public:
-  Encoder_HM();
-
-  void enableSCC(bool flag=true) { useSCC = flag; }
-  void setQPRange(int low,int high,int step) { mQPLow=low; mQPHigh=high; mQPStep=step; }
-
-  virtual std::vector<RDPoint> encode_curve(const Preset& preset) const;
-
-private:
-  RDPoint encode(const Preset& preset,int qp) const;
-
-  bool useSCC;
-  int mQPLow,mQPHigh,mQPStep;
-};
-
-
-Encoder_HM::Encoder_HM()
-{
-  mQPLow = 14;
-  mQPHigh= 40;
-  mQPStep= 2;
-
-  useSCC = false;
-}
-
-
-std::vector<RDPoint> Encoder_HM::encode_curve(const Preset& preset) const
-{
-  std::vector<RDPoint> curve;
-
-  for (int qp=mQPHigh ; qp>=mQPLow ; qp-=mQPStep) {
-    curve.push_back(encode(preset, qp));
-  }
-
-  return curve;
-}
-
-
-RDPoint Encoder_HM::encode(const Preset& preset,int qp) const
-{
-  std::stringstream streamname;
-  streamname << (useSCC ? "hmscc-" : "hm-") << preset.name << "-" << qp << ".265";
-
-  char recoyuv_prefix[] = "/tmp/reco-XXXXXX";
-  char *tempfile = mktemp(recoyuv_prefix);
-  assert(tempfile != NULL && tempfile[0] != 0);
-  std::string recoyuv = std::string(recoyuv_prefix) + ".yuv";
-
-  std::stringstream cmd1;
-  cmd1 << (useSCC ? "$HMSCCENC " : "$HMENC ")
-       << input.options_HM()
-       << " " << (useSCC ? preset.options_hm_scc : preset.options_hm)
-       << " -q " << qp << " -o " << recoyuv << " -b " << streamname.str()
-       << " " << encoderParameters << " >&2";
-
-  std::string cmd2 = replace_variables(cmd1.str());
-
-  std::cout << "CMD: '" << cmd2 << "'\n";
-  RDPoint rd;
-  rd.start_timer();
-  int retval = system(cmd2.c_str());
-  rd.end_timer();
-
-  rd.compute_from_yuv(streamname.str(), recoyuv);
-  if (!keepStreams) { unlink(streamname.str().c_str()); }
-  unlink(recoyuv.c_str());
-
-  write_rd_line(rd);
-
-  return rd;
-}
-
-
-
-class Encoder_x265 : public Encoder
-{
-public:
-  Encoder_x265();
-  void setQPRange(int low,int high,int step) { mQPLow=low; mQPHigh=high; mQPStep=step; }
-
-  virtual std::vector<RDPoint> encode_curve(const Preset& preset) const;
-
-private:
-  RDPoint encode(const Preset& preset,int qp) const;
-
-  int mQPLow,mQPHigh,mQPStep;
-};
-
-
-Encoder_x265::Encoder_x265()
-{
-  /* CRF
-  mQPLow = 4;
-  mQPHigh= 34;
-  mQPStep= 2;
-  */
-
-  mQPLow = 14;
-  mQPHigh= 40;
-  mQPStep= 2;
-}
-
-
-std::vector<RDPoint> Encoder_x265::encode_curve(const Preset& preset) const
-{
-  std::vector<RDPoint> curve;
-
-  for (int qp=mQPHigh ; qp>=mQPLow ; qp-=mQPStep) {
-    curve.push_back(encode(preset, qp));
-  }
-
-  return curve;
-}
-
-
-RDPoint Encoder_x265::encode(const Preset& preset,int qp) const
-{
-  std::stringstream streamname;
-  streamname << "x265-" << preset.name << "-" << qp << ".265";
-
-  std::stringstream cmd1;
-  cmd1 << "$X265ENC " << input.options_x265()
-       << " " << preset.options_x265
-       << " --qp " << qp << " " << streamname.str()
-       << " " << encoderParameters
-       << " >&2";
-
-  std::string cmd2 = replace_variables(cmd1.str());
-
-  //std::cout << "CMD: '" << cmd2 << "'\n";
-  RDPoint rd;
-  rd.start_timer();
-  int retval = system(cmd2.c_str());
-  rd.end_timer();
-
-  rd.compute_from_h265(streamname.str());
-  if (!keepStreams) { unlink(streamname.str().c_str()); }
-
-  write_rd_line(rd);
-
-  return rd;
-}
-
-
-
-
-class Encoder_f265 : public Encoder
-{
-public:
-  Encoder_f265();
-  void setQPRange(int low,int high,int step) { mQPLow=low; mQPHigh=high; mQPStep=step; }
-
-  virtual std::vector<RDPoint> encode_curve(const Preset& preset) const;
-
-private:
-  RDPoint encode(const Preset& preset,int qp) const;
-
-  int mQPLow,mQPHigh,mQPStep;
-};
-
-
-Encoder_f265::Encoder_f265()
-{
-  mQPLow = 14;
-  mQPHigh= 40;
-  mQPStep= 2;
-}
-
-
-std::vector<RDPoint> Encoder_f265::encode_curve(const Preset& preset) const
-{
-  std::vector<RDPoint> curve;
-
-  for (int qp=mQPHigh ; qp>=mQPLow ; qp-=mQPStep) {
-    curve.push_back(encode(preset, qp));
-  }
-
-  return curve;
-}
-
-
-RDPoint Encoder_f265::encode(const Preset& preset,int qp) const
-{
-  std::stringstream cmd1;
-  cmd1 << "$F265 " << input.options_f265()
-       << " f265.out -v -p\"" << preset.options_f265 << " qp=" << qp
-       << " " << encoderParameters
-       << "\" >&2";
-
-  std::string cmd2 = replace_variables(cmd1.str());
-
-  std::cout << "CMD: '" << cmd2 << "'\n";
-  RDPoint rd;
-  rd.start_timer();
-  int retval = system(cmd2.c_str());
-  rd.end_timer();
-
-  rd.compute_from_h265("f265.out");
-  if (!keepStreams) { unlink("f265.out"); }
-
-  write_rd_line(rd);
-
-  return rd;
-}
-
-
-
-class Encoder_x264 : public Encoder
-{
-public:
-  Encoder_x264();
-  //void setCRFRange(int low,int high,int step) { mCRFLow=low; mCRFHigh=high; mCRFStep=step; }
-
-  virtual std::vector<RDPoint> encode_curve(const Preset& preset) const;
-
-private:
-  RDPoint encode(const Preset& preset,int crf) const;
-
-  int mCRFLow,mCRFMid,mCRFHigh;
-  int mCRFStepHigh, mCRFStepLow;
-};
-
-
-Encoder_x264::Encoder_x264()
-{
-  // in the upper bit-rate range [mid;high], use larger CRF step-size 'StepHigh'
-  // in the lower bit-rate range [low;mid], use smaller CRF step-size 'StepLow'
-
-  mCRFLow = 10;
-  mCRFMid = 20;
-  mCRFHigh= 36;
-  mCRFStepHigh= 2;
-  mCRFStepLow = 1;
-}
-
-
-std::vector<RDPoint> Encoder_x264::encode_curve(const Preset& preset) const
-{
-  std::vector<RDPoint> curve;
-
-  for (int crf=mCRFLow ; crf<mCRFMid ; crf+=mCRFStepHigh) {
-    curve.push_back(encode(preset, crf));
-  }
-
-  for (int crf=mCRFMid ; crf<=mCRFHigh ; crf+=mCRFStepLow) {
-    curve.push_back(encode(preset, crf));
-  }
-
-  return curve;
-}
-
-
-RDPoint Encoder_x264::encode(const Preset& preset,int qp_crf) const
-{
-  std::stringstream streamname;
-  streamname << "x264-" << preset.name << "-" << qp_crf << ".264";
-
-  std::stringstream cmd1;
-#if 0
-  cmd1 << "$X264 " << input.options_x264()
-       << " " << preset.options_x264
-       << " --crf " << qp_crf
-       << " -o " << streamname.str();
-#else
-  cmd1 << "$FFMPEG " << input.options_ffmpeg()
-       << " " << preset.options_x264_ffmpeg
-       << " -crf " << qp_crf
-       << " -threads 6"
-       << " -f h264 " << streamname.str()
-       << " " << encoderParameters;
-#endif
-
-  std::string cmd2 = replace_variables(cmd1.str());
-
-  std::cerr << "-----------------------------\n";
-
-  std::cerr << "CMD: '" << cmd2 << "'\n";
-
-  RDPoint rd;
-  rd.start_timer();
-  int retval = system(cmd2.c_str());
-  rd.end_timer();
-
-  char tmpyuv_prefix[] = "/tmp/rdout-XXXXXX";
-  char *tempfile = mktemp(tmpyuv_prefix);
-  assert(tempfile != NULL && tempfile[0] != 0);
-  std::string tmpyuv = std::string(tmpyuv_prefix) + ".yuv";
-
-  std::string cmd3 = "ffmpeg -i " + streamname.str() + " -threads 6 " + tmpyuv;
-
-  retval = system(cmd3.c_str());
-
-  rd.compute_from_yuv(streamname.str(), tmpyuv);
-
-  unlink(tmpyuv.c_str());
-  if (!keepStreams) { unlink(streamname.str().c_str()); }
-
-  write_rd_line(rd);
-
-  return rd;
-}
-
-
-class Encoder_mpeg2 : public Encoder
-{
-public:
-  Encoder_mpeg2();
-
-  virtual std::vector<RDPoint> encode_curve(const Preset& preset) const;
-
-private:
-  RDPoint encode(const Preset& preset,int bitrate) const;
-};
-
-
-Encoder_mpeg2::Encoder_mpeg2()
-{
-}
-
-
-std::vector<RDPoint> Encoder_mpeg2::encode_curve(const Preset& preset) const
-{
-  std::vector<RDPoint> curve;
-
-  int bitrates[] = { 250,500,750,1000,1250,1500,1750,2000,2500,3000,3500,4000,4500,5000,
-                     6000,7000,8000,9000,10000,12000,14000,16000,18000,20000,25000,30000,
-                     -1 };
-
-  for (int i=0; bitrates[i]>0; i++) {
-    curve.push_back(encode(preset, bitrates[i]));
-  }
-
-  return curve;
-}
-
-
-RDPoint Encoder_mpeg2::encode(const Preset& preset,int br) const
-{
-  std::stringstream streamname;
-  streamname << "mpeg2-" << preset.name << "-"
-             << std::setfill('0') << std::setw(5) << br << ".mp2";
-
-  std::stringstream cmd1;
-  cmd1 << "$FFMPEG " << input.options_ffmpeg()
-       << " " << preset.options_x264_ffmpeg
-       << " -b " << br << "k "
-       << " -threads 6"
-       << " -f mpeg2video " << streamname.str()
-       << " " << encoderParameters;
-
-  std::string cmd2 = replace_variables(cmd1.str());
-
-  std::cerr << "-----------------------------\n";
-
-  std::cerr << "CMD: '" << cmd2 << "'\n";
-
-  RDPoint rd;
-  rd.start_timer();
-  int retval = system(cmd2.c_str());
-  rd.end_timer();
-
-  char tmpyuv_prefix[] = "/tmp/rdout-XXXXXX";
-  char *tempfile = mktemp(tmpyuv_prefix);
-  assert(tempfile != NULL && tempfile[0] != 0);
-  std::string tmpyuv = std::string(tmpyuv_prefix) + ".yuv";
-
-  std::string cmd3 = "ffmpeg -i " + streamname.str() + " -threads 6 " + tmpyuv;
-
-  retval = system(cmd3.c_str());
-
-  rd.compute_from_yuv(streamname.str(), tmpyuv);
-
-  unlink(tmpyuv.c_str());
-  if (!keepStreams) { unlink(streamname.str().c_str()); }
-
-  write_rd_line(rd);
-
-  return rd;
-}
-
-
-Encoder_de265 enc_de265;
-Encoder_HM enc_hm;
-Encoder_x265 enc_x265;
-Encoder_f265 enc_f265;
-Encoder_x264 enc_x264;
-Encoder_mpeg2 enc_mpeg2;
-
-// ---------------------------------------------------------------------------
-
-static struct option long_options[] = {
-  {"keep-streams", no_argument, 0, 'k' },
-  //{"write-bytestream", required_argument,0, 'B' },
-  {0, 0, 0, 0 }
-};
-
-
-void show_usage()
-{
-  fprintf(stderr,
-          "usage: rd-curves 'preset_id' 'input_preset' 'encoder'\n"
-          "supported encoders: de265 / hm / hmscc / x265 / f265 / x264 / mpeg2\n");
-  fprintf(stderr,
-          "presets:\n");
-
-  for (int i=0;preset[i].name!=NULL;i++) {
-    fprintf(stderr,
-            " %2d %-20s %s\n",preset[i].ID,preset[i].name,preset[i].descr);
-  }
-
-  fprintf(stderr,
-          "\ninput presets:\n");
-  for (int i=0;inputSpec[i].name;i++) {
-    fprintf(stderr,
-            " %-12s %-30s %4dx%4d, %4d frames, %5.2f fps\n",
-            inputSpec[i].name,
-            inputSpec[i].filename,
-            inputSpec[i].width,
-            inputSpec[i].height,
-            inputSpec[i].nFrames,
-            inputSpec[i].fps);
-  }
-}
-
-int main(int argc, char** argv)
-{
-  init_clock();
-
-  while (1) {
-    int option_index = 0;
-
-    int c = getopt_long(argc, argv, "kf:p:",
-                        long_options, &option_index);
-    if (c == -1)
-      break;
-
-    switch (c) {
-    case 'k': keepStreams=true; break;
-    case 'f': maxFrames=atoi(optarg); break;
-    case 'p': encoderParameters=optarg; break;
-    }
-  }
-
-  if (optind != argc-3) {
-    show_usage();
-    exit(5);
-  }
-
-  int presetID = atoi( argv[optind] );
-  const char* inputName = argv[optind+1];
-  const char* encoderName = argv[optind+2];
-
-  int presetIdx = -1;
-
-  for (int i=0;preset[i].name != NULL;i++) {
-    if (preset[i].ID == presetID) {
-      presetIdx = i;
-      break;
-    }
-  }
-
-  if (presetIdx == -1) {
-    fprintf(stderr,"preset ID %d does not exist\n",presetID);
-    exit(5);
-  }
-
-  setInput(inputName);
-  if (maxFrames) input.setMaxFrames(maxFrames);
-
-
-  Encoder* enc = NULL;
-  /**/ if (strcmp(encoderName,"de265")==0) { enc = &enc_de265; }
-  else if (strcmp(encoderName,"hm" )==0) { enc = &enc_hm; }
-  else if (strcmp(encoderName,"hmscc")==0) { enc = &enc_hm;
enc_hm.enableSCC(); } - else if (strcmp(encoderName,"x265" )==0) { enc = &enc_x265; } - else if (strcmp(encoderName,"f265" )==0) { enc = &enc_f265; } - else if (strcmp(encoderName,"x264" )==0) { enc = &enc_x264; } - else if (strcmp(encoderName,"mpeg2")==0) { enc = &enc_mpeg2; } - - if (enc==NULL) { - fprintf(stderr, "unknown encoder"); - exit(5); - } - - - std::stringstream data_filename; - data_filename << encoderName << "-" << inputName << "-" << presetpresetIdx.name << ".rd"; - output_fh = fopen(data_filename.str().c_str(), "wb"); - - fprintf(output_fh,"# %s\n", presetpresetIdx.descr); - fprintf(output_fh,"# 1:rate 2:psnr 3:ssim 4:cputime(min) 5:walltime(min)\n"); - - std::vector<RDPoint> curve = enc->encode_curve(presetpresetIdx); - - for (int i=0;i<curve.size();i++) { - //fprintf(out_fh,"%7.2f %6.4f\n", curvei.rate/1024, curvei.psnr); - } - - fclose(output_fh); - - return 0; -}
libde265-1.0.17.tar.gz/tools/tests.cc
Deleted
@@ -1,100 +0,0 @@
-/*
- * H.265 video codec.
- * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de>
- *
- * This file is part of libde265.
- *
- * libde265 is free software: you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation, either version 3 of the License, or
- * (at your option) any later version.
- *
- * libde265 is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with libde265. If not, see <http://www.gnu.org/licenses/>.
- */
-
-#include <stdio.h>
-#include <iostream>
-#include <string.h>
-
-
-class Test
-{
-public:
-  Test() { next=s_firstTest; s_firstTest=this; }
-  virtual ~Test() { }
-
-  virtual const char* getName() const { return "noname"; }
-  virtual const char* getDescription() const { return "no description"; }
-  virtual bool work(bool quiet=false) = 0;
-
-  static void runTest(const char* name) {
-    Test* t = s_firstTest;
-    while (t) {
-      if (strcmp(t->getName(), name)==0) {
-        t->work();
-        break;
-      }
-      t=t->next;
-    }
-  }
-
-  static void runAllTests() {
-    Test* t = s_firstTest;
-    while (t) {
-      printf("%s ... ",t->getName());
-      fflush(stdout);
-      if (t->work(true) == false) {
-        printf("*** FAILED ***\n");
-      }
-      else {
-        printf("passed\n");
-      }
-
-      t=t->next;
-    }
-  }
-
-public:
-  Test* next;
-  static Test* s_firstTest;
-};
-
-Test* Test::s_firstTest = NULL;
-
-
-class ListTests : public Test
-{
-public:
-  const char* getName() const { return "list"; }
-  const char* getDescription() const { return "list all available tests"; }
-  bool work(bool quiet) {
-    if (!quiet) {
-      Test* t = s_firstTest;
-      while (t) {
-        printf("- %s: %s\n",t->getName(), t->getDescription());
-        t=t->next;
-      }
-    }
-    return true;
-  }
-} listtest;
-
-
-
-int main(int argc,char** argv)
-{
-  if (argc>=2) {
-    Test::runTest(argv[1]);
-  }
-  else {
-    Test::runAllTests();
-  }
-
-  return 0;
-}
libde265-1.0.17.tar.gz/tools/yuv-distortion.cc
Deleted
@@ -1,113 +0,0 @@
-/*
- * H.265 video codec.
- * Copyright (c) 2013-2014 struktur AG, Dirk Farin <farin@struktur.de>
- *
- * This file is part of libde265.
- *
- * libde265 is free software: you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation, either version 3 of the License, or
- * (at your option) any later version.
- *
- * libde265 is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with libde265. If not, see <http://www.gnu.org/licenses/>.
- */
-
-#ifdef HAVE_CONFIG_H
-#include "config.h"
-#endif
-
-#include <stdio.h>
-#include <stdlib.h>
-
-#include <libde265/quality.h>
-
-#if HAVE_VIDEOGFX
-#include <libvideogfx.hh>
-using namespace videogfx;
-#endif
-
-
-float ssim(const uint8_t* img1,
-           const uint8_t* img2,
-           int width, int height)
-{
-#if HAVE_VIDEOGFX
-  Bitmap<Pixel> ref, coded;
-  ref .Create(width, height); // reference image
-  coded.Create(width, height); // coded image
-
-  for (int y=0;y<height;y++) {
-    memcpy(coded[y], img1 + y*width, width);
-    memcpy(ref[y], img2 + y*width, width);
-  }
-
-  SSIM ssimAlgo;
-  return ssimAlgo.calcMSSIM(ref,coded);
-#else
-  return 0;
-#endif
-}
-
-
-int main(int argc, char** argv)
-{
-  if (argc != 5) {
-    fprintf(stderr,"need two YUV files and image size as input: FILE1 FILE2 WIDTH HEIGHT\n");
-    exit(5);
-  }
-
-
-  FILE* fh_ref = fopen(argv[1],"rb");
-  FILE* fh_cmp = fopen(argv[2],"rb");
-
-  int width = atoi(argv[3]);
-  int height = atoi(argv[4]);
-
-  uint8_t* yp_ref = (uint8_t*)malloc(width*height);
-  uint8_t* yp_cmp = (uint8_t*)malloc(width*height);
-
-  double mse_y=0.0, ssim_y=0.0;
-  int nFrames=0;
-
-  for (;;)
-    {
-      if (fread(yp_ref,1,width*height,fh_ref) != width*height) {
-        break;
-      }
-      if (fread(yp_cmp,1,width*height,fh_cmp) != width*height) {
-        break;
-      }
-
-      if (feof(fh_ref)) break;
-      if (feof(fh_cmp)) break;
-
-      fprintf(stderr,"yuv-distortion processing frame %d\r",nFrames+1);
-
-      fseek(fh_ref,width*height/2,SEEK_CUR);
-      fseek(fh_cmp,width*height/2,SEEK_CUR);
-
-      double curr_mse_y = MSE(yp_ref, width, yp_cmp, width, width, height);
-      mse_y += curr_mse_y;
-
-      double curr_ssim_y = ssim(yp_ref, yp_cmp, width, height);
-      ssim_y += curr_ssim_y;
-
-      printf("%4d %f %f\n",nFrames,PSNR(curr_mse_y),curr_ssim_y);
-
-      nFrames++;
-    }
-
-  printf("total: %f %f\n",PSNR(mse_y/nFrames),ssim_y/nFrames);
-  fprintf(stderr,"\n");
-
-  fclose(fh_ref);
-  fclose(fh_cmp);
-
-  return 0;
-}
libde265-1.0.17.tar.gz/CMakeLists.txt -> libde265-1.1.1.tar.gz/CMakeLists.txt
Changed
@@ -2,7 +2,7 @@ project (libde265 LANGUAGES C CXX - VERSION 1.0.17 + VERSION 1.1.1 ) # Auto-compute BCD-encoded numeric version from project version. @@ -19,10 +19,10 @@ # This controls the shared library filename: # libde265.so -> libde265.so.0 -> libde265.so.0.1.9 # ^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^ -# SOVERSION VERSION +# DE265_SOVERSION DE265_LIBRARY_VERSION # -# VERSION is a three-part number: MAJOR.MINOR.PATCH -# - MAJOR = the ABI major version (same as SOVERSION). +# DE265_LIBRARY_VERSION is a three-part number: MAJOR.MINOR.PATCH +# - MAJOR = the ABI major version (same as DE265_SOVERSION). # Bump when the ABI breaks (exported functions removed or signatures changed). # Reset MINOR and PATCH to 0. # - MINOR = backward-compatible ABI additions. @@ -31,20 +31,21 @@ # - PATCH = implementation-only changes (bug fixes, performance improvements). # Bump when the ABI is unchanged. # -# SOVERSION must always equal the MAJOR part of VERSION. +# DE265_SOVERSION must always equal the MAJOR part of DE265_LIBRARY_VERSION. # Programs linked against libde265.so.0 will work with any libde265.so.0.x.y. 
# set(DE265_SOVERSION 0) -set(DE265_LIBRARY_VERSION "0.1.10") +set(DE265_LIBRARY_VERSION "0.2.1") set(CMAKE_CXX_STANDARD 17) set(CMAKE_CXX_STANDARD_REQUIRED ON) set(CMAKE_CXX_EXTENSIONS OFF) set(CMAKE_POSITION_INDEPENDENT_CODE ON) -if (CMAKE_BUILD_TYPE STREQUAL "Debug") - add_compile_options(-Wall -Wextra -Wpedantic -Wno-unused-parameter) -endif() +add_compile_options( + "$<$<AND:$<CONFIG:Debug>,$<NOT:$<CXX_COMPILER_ID:MSVC>>>:-Wall;-Wextra;-Wpedantic;-Wno-unused-parameter>" + "$<$<AND:$<CONFIG:Debug>,$<CXX_COMPILER_ID:MSVC>>:/W4>" +) option(USE_IWYU "Run include-what-you-use analysis during build" OFF) if (USE_IWYU) @@ -72,42 +73,66 @@ CHECK_INCLUDE_FILE(malloc.h HAVE_MALLOC_H) CHECK_FUNCTION_EXISTS(posix_memalign HAVE_POSIX_MEMALIGN) +option(ENABLE_SIMD "Enable SIMD acceleration (SSE on x86, NEON on ARM)" ON) +option(ENABLE_AVX2 "Enable AVX2 kernels on x86 (in addition to SSE)" ON) +option(ENABLE_AVX512 "Enable AVX-512 kernels on x86 (in addition to AVX2)" ON) + include(CheckCSourceCompiles) -check_c_source_compiles( - "#if !defined(__x86_64) && !defined(__i386__) \ - && !defined(_M_IX86) && !defined(_M_AMD64) \ - || defined(_M_ARM64EC) || defined(_ARM64EC_) - #error not x86 - #endif - int main(){return 0;}" - HAVE_X86) - -if(HAVE_X86) - if (MSVC) - set(SUPPORTS_SSE2 1) - set(SUPPORTS_SSSE3 1) - set(SUPPORTS_SSE4_1 1) - else() - check_c_compiler_flag(-msse2 SUPPORTS_SSE2) - check_c_compiler_flag(-mssse3 SUPPORTS_SSSE3) - check_c_compiler_flag(-msse4.1 SUPPORTS_SSE4_1) - endif() - if(SUPPORTS_SSE4_1) - set(HAVE_SSE4_1 TRUE) +if(ENABLE_SIMD) + check_c_source_compiles( + "#if !defined(__x86_64) && !defined(__i386__) \ + && !defined(_M_IX86) && !defined(_M_AMD64) \ + || defined(_M_ARM64EC) || defined(_ARM64EC_) + #error not x86 + #endif + int main(){return 0;}" + HAVE_X86) + + if(HAVE_X86) + if (MSVC) + set(SUPPORTS_SSE2 1) + set(SUPPORTS_SSSE3 1) + set(SUPPORTS_SSE4_1 1) + else() + check_c_compiler_flag(-msse2 SUPPORTS_SSE2) + check_c_compiler_flag(-mssse3 
SUPPORTS_SSSE3) + check_c_compiler_flag(-msse4.1 SUPPORTS_SSE4_1) + check_c_compiler_flag(-mavx2 SUPPORTS_AVX2) + check_c_compiler_flag("-mavx512f -mavx512bw" SUPPORTS_AVX512) + endif() + if(SUPPORTS_SSE4_1) + set(HAVE_SSE4_1 TRUE) + endif() + # AVX2 kernels back the SSE4.1 ones; only built on 64-bit targets. + if(ENABLE_AVX2 AND SUPPORTS_AVX2 AND HAVE_SSE4_1 AND CMAKE_SIZEOF_VOID_P EQUAL 8) + set(HAVE_AVX2 TRUE) + endif() + # AVX-512 (F+BW) kernels back the AVX2 ones. + if(ENABLE_AVX512 AND SUPPORTS_AVX512 AND HAVE_AVX2) + set(HAVE_AVX512 TRUE) + endif() endif() -endif() -check_c_source_compiles( - "#if !defined(__arm__) && !defined(__aarch64__) \ - && !defined(_M_ARM) && !defined(_M_ARM64) - #error not ARM - #endif - int main(){return 0;}" - HAVE_ARM) + check_c_source_compiles( + "#if !defined(__arm__) && !defined(_M_ARM) + #error not 32-bit ARM + #endif + int main(){return 0;}" + HAVE_ARM32) + + check_c_source_compiles( + "#if !defined(__aarch64__) && !defined(_M_ARM64) + #error not 64-bit ARM + #endif + int main(){return 0;}" + HAVE_ARM64) -if(HAVE_ARM) - enable_language(ASM) - check_c_compiler_flag(-mfpu=neon HAVE_NEON) + if(HAVE_ARM32) + enable_language(ASM) + check_c_compiler_flag(-mfpu=neon HAVE_NEON) + endif() +else() + message(STATUS "SIMD acceleration disabled (ENABLE_SIMD=OFF) — using scalar fallback only") endif() configure_file (libde265/de265-version.h.in libde265/de265-version.h) @@ -118,16 +143,31 @@ add_definitions(-Wall -Werror=return-type -Werror=unused-result -Werror=reorder) endif() +set(THREADS_PREFER_PTHREAD_FLAG TRUE) +find_package(Threads REQUIRED) + include(CheckCXXSymbolExists) check_cxx_symbol_exists(_LIBCPP_VERSION cstdlib HAVE_LIBCPP) -if(HAVE_LIBCPP) - set(LIBS_PRIVATE "-lc++") -else() - set(LIBS_PRIVATE "-lstdc++") + +# Build Libs.private contents for libde265.pc. 
+set(_libs_private "") +if(NOT MSVC) + if(HAVE_LIBCPP) + list(APPEND _libs_private "-lc++") + else() + list(APPEND _libs_private "-lstdc++") + endif() endif() +# Emit '-lpthread' rather than '-pthread' so consumers using +# `pkg-config --libs --static` get an unambiguous linker flag (see #453). +if(CMAKE_USE_PTHREADS_INIT) + list(APPEND _libs_private "-lpthread") +endif() +if(UNIX) + list(APPEND _libs_private "-lm") +endif() +string(JOIN " " LIBS_PRIVATE ${_libs_private}) -configure_file (libde265.pc.in libde265.pc @ONLY) -install(FILES ${CMAKE_CURRENT_BINARY_DIR}/libde265.pc DESTINATION ${CMAKE_INSTALL_LIBDIR}/pkgconfig) option(BUILD_SHARED_LIBS "Build shared library" ON) if(NOT BUILD_SHARED_LIBS) @@ -145,12 +185,10 @@ include_directories ("${PROJECT_SOURCE_DIR}/extra") endif() -find_package(Threads) - option(ENABLE_DECODER "Enable Decoder" ON) option(ENABLE_ENCODER "Enable Encoder" OFF) option(ENABLE_SHERLOCK265 "Build sherlock265 visual inspection tool" OFF) -option(ENABLE_TOOLS "Build tools" OFF) +option(ENABLE_INTERNAL_DEVELOPMENT_TOOLS "Build internal development tools (not for end users)" OFF) option(WITH_FUZZERS "Build the fuzzers" OFF) set(FUZZING_SANITIZER_OPTIONS "-fsanitize=address,shift,integer" "-fno-sanitize=unsigned-shift-base" "-fno-sanitize-recover=shift,integer" CACHE STRING "Sanitizer flags for fuzzing builds") @@ -159,6 +197,47 @@ add_link_options(${FUZZING_SANITIZER_OPTIONS}) endif() +# --- Symbol visibility ------------------------------------------------------- +# +# Hiding the library's non-API symbols turns cross-translation-unit calls inside +# libde265 from indirect PLT/GOT dispatch into direct calls — a measurable win +# on the hot CABAC decode path — and shrinks the dynamic symbol table. Only the +# symbols tagged LIBDE265_API (the public de265_*/en265_* C API) stay exported. 
+# +# By default we reduce visibility for optimized (non-Debug) builds and keep +# every symbol exported for Debug builds, so a debug build stays fully available +# for introspection. The encoder CLI, sherlock265 and the dev tools all link +# against *internal* (non-API) symbols of the library, so hiding would break +# their link; visibility is therefore never reduced when any of those is built. +# +# Set FORCE_FULL_VISIBILITY=ON to override the default and export all symbols +# even from an optimized build. +option(FORCE_FULL_VISIBILITY + "Export all symbols, even from optimized builds (default: hide non-API symbols in optimized builds without the encoder/tools)" + OFF) + +set(_de265_internal_consumers OFF) +if(ENABLE_ENCODER OR ENABLE_SHERLOCK265 OR ENABLE_INTERNAL_DEVELOPMENT_TOOLS) + set(_de265_internal_consumers ON) +endif() + +set(_de265_optimized_build OFF) +if(CMAKE_BUILD_TYPE) + string(TOUPPER "${CMAKE_BUILD_TYPE}" _de265_build_type_upper) + if(NOT _de265_build_type_upper STREQUAL "DEBUG") + set(_de265_optimized_build ON) + endif() +endif() + +set(DE265_REDUCED_VISIBILITY OFF) +if(_de265_optimized_build AND NOT _de265_internal_consumers AND NOT FORCE_FULL_VISIBILITY) + set(DE265_REDUCED_VISIBILITY ON) + set(CMAKE_C_VISIBILITY_PRESET hidden) + set(CMAKE_CXX_VISIBILITY_PRESET hidden) + set(CMAKE_VISIBILITY_INLINES_HIDDEN ON) + message(STATUS "Reduced symbol visibility enabled (only LIBDE265_API symbols exported)") +endif() + add_subdirectory (libde265) if (ENABLE_DECODER) add_subdirectory (dec265) @@ -169,8 +248,8 @@ if (ENABLE_SHERLOCK265) add_subdirectory (sherlock265) endif() -if (ENABLE_TOOLS) - add_subdirectory (tools) +if (ENABLE_INTERNAL_DEVELOPMENT_TOOLS) + add_subdirectory (dev-tools) endif() if (WITH_FUZZERS) add_subdirectory (fuzzing)
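The versioning comments in the CMakeLists.txt diff above tie the shared library filename to a three-part library version whose first component must equal the SOVERSION. The sketch below restates that rule in C++ as an illustration only. The type and function names (`LibraryVersion`, `parse_library_version`, `soname`, `real_name`, `soversion_matches`) are hypothetical and do not exist in libde265; the actual check, if any, lives in the build system.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper type: MAJOR.MINOR.PATCH of the *library* version
// (DE265_LIBRARY_VERSION), not the project version.
struct LibraryVersion {
  int abi_major = 0;  // bumped on ABI breaks; must equal the SOVERSION
  int abi_minor = 0;  // bumped on backward-compatible ABI additions
  int patch = 0;      // bumped on implementation-only changes
};

// Parses "MAJOR.MINOR.PATCH". Returns false on malformed input.
bool parse_library_version(const std::string& text, LibraryVersion& out) {
  int consumed = 0;
  int fields = std::sscanf(text.c_str(), "%d.%d.%d%n",
                           &out.abi_major, &out.abi_minor, &out.patch, &consumed);
  return fields == 3 &&
         consumed == static_cast<int>(text.size()) &&
         out.abi_major >= 0 && out.abi_minor >= 0 && out.patch >= 0;
}

// Name that programs are linked against, e.g. "libde265.so.0".
std::string soname(int soversion) {
  assert(soversion >= 0);
  return "libde265.so." + std::to_string(soversion);
}

// Full file name of the installed library, e.g. "libde265.so.0.2.1".
std::string real_name(const LibraryVersion& v) {
  return soname(v.abi_major) + "." + std::to_string(v.abi_minor) +
         "." + std::to_string(v.patch);
}

// The rule stated in the comments: SOVERSION equals the MAJOR component.
bool soversion_matches(const LibraryVersion& v, int soversion) {
  return v.abi_major == soversion;
}
```

With the values from this revision (`DE265_SOVERSION` 0 and `DE265_LIBRARY_VERSION` "0.2.1"), the helpers yield the soname `libde265.so.0` and the file name `libde265.so.0.2.1`, and the consistency check passes.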
libde265-1.0.17.tar.gz/README.md -> libde265-1.1.1.tar.gz/README.md
Changed
@@ -54,7 +54,7 @@ !Build Status(https://github.com/strukturag/libde265/workflows/build/badge.svg)(https://github.com/strukturag/libde265/actions) !Build Status(https://ci.appveyor.com/api/projects/status/github/strukturag/libde265?svg=true)(https://ci.appveyor.com/project/strukturag/libde265) -libde265 uses the CMake build system. Please do not use to deprecated autotools scripts. +libde265 uses the CMake build system. To compile libde265, run ```` mkdir build @@ -81,20 +81,6 @@ http://github.com/farindk/libvideogfx -You can disable building of the example programs by running `./configure` with -<pre> - --disable-dec265 Do not build the dec265 decoder program. - --disable-sherlock265 Do not build the sherlock265 visual inspection program. -</pre> - -Additional logging information can be turned on and off using these `./configure` flags: -<pre> - --enable-log-error turn on logging at error level (default=yes) - --enable-log-info turn on logging at info level (default=no) - --enable-log-trace turn on logging at trace level (default=no) -</pre> - - Build using cmake ================= @@ -108,8 +94,16 @@ make ``` -See the cmake documentation(http://www.cmake.org) for further information on -using cmake on other platforms. +You can disable building of the example programs by running `cmake` with +<pre> + -DENABLE_DECODER=off Do not build the dec265 decoder program. + -DENABLE_SHERLOCK265=off Do not build the sherlock265 visual inspection program. +</pre> + +Additional logging information can be turned on and off using these `./configure` flags: +<pre> + -DDE265_LOG_LEVEL={error;info;debug;trace} +</pre> Building using vcpkg
View file
libde265-1.1.1.tar.gz/SECURITY.md
Added
@@ -0,0 +1,58 @@
+# Security Policy
+
+## Reporting a Vulnerability
+
+If you believe you have found a security vulnerability in libde265, **please do
+not open a public GitHub issue**. Instead, report it privately via one of the
+following channels:
+
+- **Preferred:** Use GitHub's private vulnerability reporting by clicking
+  *Report a vulnerability* on the
+  [Security tab](https://github.com/strukturag/libde265/security/advisories/new)
+  of this repository. This creates a private draft advisory that only the
+  maintainers can see.
+- **Alternative:** If you cannot use GitHub, send an email to
+  <dirk.farin@gmail.com> describing the issue.
+
+When reporting, please include as much of the following as you can:
+
+- A description of the vulnerability and its potential impact.
+- Steps to reproduce, ideally with a minimal H.265 bitstream that triggers the
+  issue.
+- The libde265 version (release tag or commit hash) and build configuration
+  (compiler, sanitizers, etc.) where the issue was observed.
+- Any suggested fix or mitigation, if known.
+
+We will acknowledge your report, work with you on a fix, and credit you in the
+release notes and advisory unless you prefer to remain anonymous.
+
+## Supported Versions
+
+Security fixes are applied to the latest release on the `master` branch. Older
+releases are not maintained; please update to the current release before
+reporting.
+
+## Scope
+
+In scope:
+
+- Memory safety issues in the decoder (out-of-bounds reads/writes, use-after-
+  free, double-free, uninitialized memory use).
+- Crashes, hangs, or excessive resource consumption triggered by crafted H.265
+  bitstreams.
+- Issues in the public C API (`libde265/de265.h`) that lead to the above when
+  used as documented.
+
+Out of scope:
+
+- Bugs in the experimental encoder (`libde265/encoder/`, built with
+  `-DENABLE_ENCODER=ON`).
+- Bugs in the sample applications (`dec265/`, `enc265/`, `sherlock265/`) that
+  do not also affect the library.
+- Issues that require a non-default, unsafe build configuration or modified
+  source.
+
+## Disclosure
+
+We aim to coordinate disclosure with the reporter. Once a fix is available, we
+will publish a GitHub Security Advisory and request a CVE where appropriate.
libde265-1.0.17.tar.gz/cmake/config.h.in -> libde265-1.1.1.tar.gz/cmake/config.h.in
Changed
@@ -3,5 +3,8 @@
 #cmakedefine HAVE_MALLOC_H 1
 #cmakedefine HAVE_POSIX_MEMALIGN 1
 #cmakedefine HAVE_SSE4_1 1
-#cmakedefine HAVE_ARM 1
+#cmakedefine HAVE_AVX2 1
+#cmakedefine HAVE_AVX512 1
+#cmakedefine HAVE_ARM32 1
+#cmakedefine HAVE_ARM64 1
 #cmakedefine HAVE_NEON 1
libde265-1.1.1.tar.gz/cmake/toolchains/arm-linux-gnueabihf-clang.cmake
Added
@@ -0,0 +1,16 @@
+set(CMAKE_SYSTEM_NAME Linux)
+set(CMAKE_SYSTEM_PROCESSOR arm)
+
+set(triple arm-linux-gnueabihf)
+
+set(CMAKE_C_COMPILER clang)
+set(CMAKE_C_COMPILER_TARGET ${triple})
+set(CMAKE_CXX_COMPILER clang++)
+set(CMAKE_CXX_COMPILER_TARGET ${triple})
+set(CMAKE_ASM_COMPILER clang)
+set(CMAKE_ASM_COMPILER_TARGET ${triple})
+
+set(CMAKE_FIND_ROOT_PATH /usr/arm-linux-gnueabihf)
+set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
+set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
+set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)
libde265-1.0.17.tar.gz/dec265/dec265.cc -> libde265-1.1.1.tar.gz/dec265/dec265.cc
Changed
@@ -271,18 +271,20 @@
 
   de265_chroma chroma = de265_get_chroma_format(img);
 
+  enum SDL_YUV_Display::SDL_Chroma sdlChroma;
+  switch (chroma) {
+  case de265_chroma_420: sdlChroma = SDL_YUV_Display::SDL_CHROMA_420; break;
+  case de265_chroma_422: sdlChroma = SDL_YUV_Display::SDL_CHROMA_422; break;
+  case de265_chroma_444: sdlChroma = SDL_YUV_Display::SDL_CHROMA_444; break;
+  case de265_chroma_mono: sdlChroma = SDL_YUV_Display::SDL_CHROMA_MONO; break;
+  default: assert(false); sdlChroma = SDL_YUV_Display::SDL_CHROMA_MONO;
+  }
+
   if (!sdl_active) {
     sdl_active=true;
-    enum SDL_YUV_Display::SDL_Chroma sdlChroma;
-    switch (chroma) {
-    case de265_chroma_420: sdlChroma = SDL_YUV_Display::SDL_CHROMA_420; break;
-    case de265_chroma_422: sdlChroma = SDL_YUV_Display::SDL_CHROMA_422; break;
-    case de265_chroma_444: sdlChroma = SDL_YUV_Display::SDL_CHROMA_444; break;
-    case de265_chroma_mono: sdlChroma = SDL_YUV_Display::SDL_CHROMA_MONO; break;
-    default: assert(false); sdlChroma = SDL_YUV_Display::SDL_CHROMA_MONO;
-    }
-
-    sdlWin.init(width,height, sdlChroma);
+    if (!sdlWin.init(width,height, sdlChroma)) return true;
+  } else {
+    if (!sdlWin.resize(width,height, sdlChroma)) return true;
   }
 
   int stride,chroma_stride;
libde265-1.0.17.tar.gz/dec265/sdl-display.cc -> libde265-1.1.1.tar.gz/dec265/sdl-display.cc
Changed
@@ -25,6 +25,7 @@
  */
 
 #include "sdl-display.h"
+#include <cstdio>
 #include <cstring>
 
 
@@ -94,6 +95,43 @@
   return true;
 }
 
+bool SDL_YUV_Display::resize(int frame_width, int frame_height, enum SDL_Chroma chroma)
+{
+  if (!mWindowOpen) {
+    return init(frame_width, frame_height, chroma);
+  }
+
+  // SDL_PIXELFORMAT_YV12 requires even dimensions; init() rounds down to a
+  // multiple of 8, so we do the same here for consistency.
+  frame_width &= ~7;
+  frame_height &= ~7;
+
+  if (frame_width == rect.w && frame_height == rect.h && mChroma == chroma) {
+    return true;
+  }
+
+  // All chroma formats currently map to SDL_PIXELFORMAT_YV12 (we down-convert
+  // 4:2:2 and 4:4:4 ourselves), so the texture pixel format never changes.
+  // Only the texture dimensions and the window size need updating.
+  SDL_DestroyTexture(mTexture);
+  mTexture = SDL_CreateTexture(mRenderer, SDL_PIXELFORMAT_YV12,
+                               SDL_TEXTUREACCESS_STREAMING,
+                               frame_width, frame_height);
+  if (!mTexture) {
+    printf("SDL: Couldn't recreate SDL texture: %s\n", SDL_GetError());
+    mWindowOpen = false;
+    return false;
+  }
+
+  SDL_SetWindowSize(mWindow, frame_width, frame_height);
+
+  mChroma = chroma;
+  rect.w = frame_width;
+  rect.h = frame_height;
+
+  return true;
+}
+
 void SDL_YUV_Display::display(const unsigned char *Y,
                               const unsigned char *U,
                               const unsigned char *V,
@@ -247,7 +285,7 @@
   }
 
   uint8_t *startV = mPixels + (rect.h*mStride);
-  uint8_t *startU = startV + (rect.h*mStride/2);
+  uint8_t *startU = startV + (rect.h*mStride/4);
 
   for (int y=0;y<rect.h;y+=2) {
     unsigned char* u = startU + y/2*mStride/2;
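The last hunk of the sdl-display.cc diff changes the U-plane offset from `rect.h*mStride/2` to `rect.h*mStride/4`. The reasoning can be sketched as follows, assuming a tightly packed YV12 buffer in which the planes are stored in the order Y, V, U and each chroma row is half as long as a luma row (which is what `y/2*mStride/2` in the diff suggests). Each chroma plane then has half the rows and half the row length, so it occupies a quarter of the luma plane's bytes. The struct and function names are hypothetical and not part of dec265.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of where each plane starts in a packed YV12 buffer.
struct Yv12Layout {
  int width = 0;             // frame width, rounded down to a multiple of 8
  int height = 0;            // frame height, rounded down to a multiple of 8
  std::size_t offsetY = 0;   // luma plane starts at the beginning
  std::size_t offsetV = 0;   // V plane follows the luma plane
  std::size_t offsetU = 0;   // U plane follows the V plane
  std::size_t totalBytes = 0;
};

// 'stride' is the number of bytes per luma row.
Yv12Layout yv12_layout(int frame_width, int frame_height, int stride) {
  Yv12Layout l;
  l.width = frame_width & ~7;    // same rounding as in resize()
  l.height = frame_height & ~7;
  assert(stride >= l.width && stride % 2 == 0);

  const std::size_t lumaBytes = static_cast<std::size_t>(l.height) * stride;
  // Half the rows, half the bytes per row: a quarter of the luma plane.
  const std::size_t chromaBytes =
      static_cast<std::size_t>(l.height / 2) * (stride / 2);

  l.offsetY = 0;
  l.offsetV = lumaBytes;
  l.offsetU = lumaBytes + chromaBytes;   // lumaBytes + h*stride/4, not /2
  l.totalBytes = lumaBytes + 2 * chromaBytes;
  return l;
}
```

With the old `/2` offset, the U plane would have started at the very end of a buffer sized for this layout, so writes to it would have landed outside the buffer.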
libde265-1.0.17.tar.gz/dec265/sdl-display.h -> libde265-1.1.1.tar.gz/dec265/sdl-display.h
Changed
@@ -39,6 +39,7 @@
   };
 
   bool init(int frame_width, int frame_height, enum SDL_Chroma chroma = SDL_CHROMA_420);
+  bool resize(int frame_width, int frame_height, enum SDL_Chroma chroma);
   void display(const unsigned char *Y, const unsigned char *U, const unsigned char *V,
                int stride, int chroma_stride);
   void close();
libde265-1.0.17.tar.gz/libde265.pc.in -> libde265-1.1.1.tar.gz/libde265.pc.in
Changed
@@ -1,7 +1,7 @@
-prefix=@CMAKE_INSTALL_PREFIX@
-exec_prefix=${prefix}
-libdir=${prefix}/@CMAKE_INSTALL_LIBDIR@
-includedir=${prefix}/@CMAKE_INSTALL_INCLUDEDIR@
+prefix=@prefix@
+exec_prefix=@exec_prefix@
+libdir=@libdir@
+includedir=@includedir@
 
 Name: libde265
 Description: H.265/HEVC video decoder.
@@ -11,3 +11,4 @@
 Libs: -L${libdir} -lde265
 Libs.private: @LIBS_PRIVATE@
 Cflags: -I${includedir}
+Cflags.private: -DLIBDE265_STATIC_BUILD
libde265-1.0.17.tar.gz/libde265/CMakeLists.txt -> libde265-1.1.1.tar.gz/libde265/CMakeLists.txt
Changed
@@ -10,7 +10,9 @@ decctx.cc dpb.cc fallback-dct.cc - fallback-motion.cc + fallback-deblk.cc + fallback-intrapred.cc + fallback-motion.cc fallback.cc image-io.cc image.cc @@ -47,6 +49,7 @@ decctx.h dpb.h fallback-dct.h + fallback-intrapred.h fallback-motion.h fallback.h image-io.h @@ -87,6 +90,13 @@ add_definitions(-DLIBDE265_EXPORTS) +# With hidden default visibility, LIBDE265_API must expand to +# visibility("default") so the public C API stays exported. de265.h gates that +# on HAVE_VISIBILITY (see the symbol-visibility block in the top CMakeLists). +if(DE265_REDUCED_VISIBILITY) + add_definitions(-DHAVE_VISIBILITY) +endif() + if (ENABLE_ENCODER) add_subdirectory (encoder) list(APPEND libde265_sources en265.cc) @@ -100,14 +110,18 @@ endif() endif() -if(HAVE_ARM) - add_subdirectory (arm) +if(HAVE_ARM32) + add_subdirectory (arm32) endif() -add_library(de265 ${libde265_sources} ${libde265_public_headers} ${ENCODER_OBJECTS} ${X86_OBJECTS} ${ARM_OBJECTS}) +add_library(de265 ${libde265_sources} ${libde265_public_headers} ${ENCODER_OBJECTS} ${X86_OBJECTS} ${ARM32_OBJECTS}) target_include_directories(de265 PRIVATE ${CMAKE_BINARY_DIR} ${CMAKE_CURRENT_BINARY_DIR}) +target_link_libraries(de265 PRIVATE Threads::Threads) +if(UNIX) + target_link_libraries(de265 PRIVATE m) +endif() -write_basic_package_version_file(libde265ConfigVersion.cmake COMPATIBILITY ExactVersion) +write_basic_package_version_file(libde265-config-version.cmake COMPATIBILITY ExactVersion) # --- debug output @@ -155,7 +169,7 @@ MACOSX_RPATH TRUE) endif() -install(TARGETS de265 EXPORT libde265Config +install(TARGETS de265 EXPORT libde265-config RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR} LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR} ARCHIVE DESTINATION ${CMAKE_INSTALL_LIBDIR} @@ -165,9 +179,9 @@ ) install(FILES ${libde265_public_headers} DESTINATION ${CMAKE_INSTALL_INCLUDEDIR}/libde265) -install(EXPORT libde265Config DESTINATION "${CMAKE_INSTALL_LIBDIR}/cmake/libde265") +install(EXPORT libde265-config 
DESTINATION "${CMAKE_INSTALL_LIBDIR}/cmake/libde265") -install(FILES ${CMAKE_CURRENT_BINARY_DIR}/libde265ConfigVersion.cmake DESTINATION +install(FILES ${CMAKE_CURRENT_BINARY_DIR}/libde265-config-version.cmake DESTINATION "${CMAKE_INSTALL_LIBDIR}/cmake/libde265") @@ -186,6 +200,5 @@ set(includedir "\${prefix}/${CMAKE_INSTALL_INCLUDEDIR}") endif() -set(VERSION ${PROJECT_VERSION}) # so that the replacement in libde265.pc will work with both autotools and CMake configure_file(../libde265.pc.in ${CMAKE_CURRENT_BINARY_DIR}/libde265.pc @ONLY) install(FILES ${CMAKE_CURRENT_BINARY_DIR}/libde265.pc DESTINATION ${CMAKE_INSTALL_LIBDIR}/pkgconfig)
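The comment added in this diff says that, with hidden default visibility, `LIBDE265_API` must expand to `visibility("default")` and that `de265.h` gates this on `HAVE_VISIBILITY`. The header itself is not part of this page, so the following is only a sketch of how such an export macro is commonly written. `DEMO_API` and both function names are hypothetical stand-ins, and the attribute syntax assumes GCC or Clang.

```cpp
#include <cassert>

// Hypothetical stand-in for an export macro such as LIBDE265_API.
#if defined(_WIN32)
  #define DEMO_API __declspec(dllexport)
#elif defined(HAVE_VISIBILITY)
  // Built with -fvisibility=hidden: re-export the public API explicitly.
  #define DEMO_API __attribute__((visibility("default")))
#else
  // Default visibility: every symbol is exported anyway.
  #define DEMO_API
#endif

// Internal helper: hidden from the dynamic symbol table when the library is
// compiled with -fvisibility=hidden, so calls to it can be bound directly.
int demo_internal_helper(int x) {
  return 2 * x + 1;
}

// Public entry point: stays exported in every configuration.
extern "C" DEMO_API int demo_public_entry_point(int x) {
  assert(x >= 0);
  return demo_internal_helper(x);
}
```

This also illustrates why the build never hides symbols when the encoder CLI, sherlock265 or the dev tools are enabled: those programs call functions in the position of `demo_internal_helper`, which would no longer be linkable from outside the library.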
libde265-1.0.17.tar.gz/libde265/acceleration.h -> libde265-1.1.1.tar.gz/libde265/acceleration.h
Changed
@@ -172,6 +172,20 @@ template <class pixel_t> void add_residual(pixel_t *dst, ptrdiff_t stride, const int32_t* r, int nT, int bit_depth) const; + // Inverse quantization (no scaling list): for each of the nCoeff entries, + // coeffBufcoeffPosi = Clip16( (coeffListi*fact + offset) >> bdShift ). + // Contract: fact small enough that coeffListi*fact+offset fits in int32 + // (caller checks this; the rare int64 case stays scalar). bdShift >= 1. + void (*dequant_coeff_block)(int16_t* coeffBuf, const int16_t* coeffList, + const int16_t* coeffPos, int nCoeff, + int32_t fact, int32_t offset, int32_t bdShift); + + // --- deblocking (8 bit; one 4-line edge segment) --- + void (*deblock_luma_8)(uint8_t* ptr, ptrdiff_t stride, int vertical, + int dE, int dEp, int dEq, int tc, int filterP, int filterQ); + void (*deblock_chroma_8)(uint8_t* ptr, ptrdiff_t stride, int vertical, + int tc, int filterP, int filterQ); + void (*rdpcm_v)(int32_t* residual, const int16_t* coeffs, int nT,int tsShift,int bdShift); void (*rdpcm_h)(int32_t* residual, const int16_t* coeffs, int nT,int tsShift,int bdShift); @@ -186,6 +200,22 @@ template <class pixel_t> void transform_add(int sizeIdx, pixel_t *dst, const int16_t *coeffs, ptrdiff_t stride, int bit_depth) const; + // --- intra prediction --- + + void (*intra_pred_dc_8 )(uint8_t* dst, ptrdiff_t stride, int nT, int cIdx, const uint8_t* border); + void (*intra_pred_dc_16)(uint16_t* dst, ptrdiff_t stride, int nT, int cIdx, const uint16_t* border); + void (*intra_pred_planar_8 )(uint8_t* dst, ptrdiff_t stride, int nT, int cIdx, const uint8_t* border); + void (*intra_pred_planar_16)(uint16_t* dst, ptrdiff_t stride, int nT, int cIdx, const uint16_t* border); + void (*intra_pred_angular_8 )(uint8_t* dst, ptrdiff_t stride, int bit_depth, int disableBoundaryFilter, + int xB0, int yB0, int mode, int nT, int cIdx, const uint8_t* border); + void (*intra_pred_angular_16)(uint16_t* dst, ptrdiff_t stride, int bit_depth, int disableBoundaryFilter, + int xB0, int 
yB0, int mode, int nT, int cIdx, const uint16_t* border); + + template <class pixel_t> void intra_pred_dc(pixel_t* dst, ptrdiff_t stride, int nT, int cIdx, const pixel_t* border) const; + template <class pixel_t> void intra_pred_planar(pixel_t* dst, ptrdiff_t stride, int nT, int cIdx, const pixel_t* border) const; + template <class pixel_t> void intra_pred_angular(pixel_t* dst, ptrdiff_t stride, int bit_depth, int disableBoundaryFilter, + int xB0, int yB0, int mode, int nT, int cIdx, const pixel_t* border) const; + // --- forward transforms --- @@ -356,4 +386,13 @@ template <> inline void acceleration_functions::add_residual(uint8_t *dst, ptrdiff_t stride, const int32_t* r, int nT, int bit_depth) const { add_residual_8(dst,stride,r,nT,bit_depth); } template <> inline void acceleration_functions::add_residual(uint16_t *dst, ptrdiff_t stride, const int32_t* r, int nT, int bit_depth) const { add_residual_16(dst,stride,r,nT,bit_depth); } +template <> inline void acceleration_functions::intra_pred_dc<uint8_t> (uint8_t* dst, ptrdiff_t stride, int nT, int cIdx, const uint8_t* border) const { intra_pred_dc_8 (dst,stride,nT,cIdx,border); } +template <> inline void acceleration_functions::intra_pred_dc<uint16_t>(uint16_t* dst, ptrdiff_t stride, int nT, int cIdx, const uint16_t* border) const { intra_pred_dc_16(dst,stride,nT,cIdx,border); } + +template <> inline void acceleration_functions::intra_pred_planar<uint8_t> (uint8_t* dst, ptrdiff_t stride, int nT, int cIdx, const uint8_t* border) const { intra_pred_planar_8 (dst,stride,nT,cIdx,border); } +template <> inline void acceleration_functions::intra_pred_planar<uint16_t>(uint16_t* dst, ptrdiff_t stride, int nT, int cIdx, const uint16_t* border) const { intra_pred_planar_16(dst,stride,nT,cIdx,border); } + +template <> inline void acceleration_functions::intra_pred_angular<uint8_t> (uint8_t* dst, ptrdiff_t stride, int bit_depth, int disableBoundaryFilter, int xB0, int yB0, int mode, int nT, int cIdx, const uint8_t* border) 
const { intra_pred_angular_8 (dst,stride,bit_depth,disableBoundaryFilter,xB0,yB0,mode,nT,cIdx,border); } +template <> inline void acceleration_functions::intra_pred_angular<uint16_t>(uint16_t* dst, ptrdiff_t stride, int bit_depth, int disableBoundaryFilter, int xB0, int yB0, int mode, int nT, int cIdx, const uint16_t* border) const { intra_pred_angular_16(dst,stride,bit_depth,disableBoundaryFilter,xB0,yB0,mode,nT,cIdx,border); } + #endif
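The contract comment on dequant_coeff_block above can be read as a plain scalar loop. The following is a minimal sketch of that reading, not the library's implementation: the names clip16 and dequant_coeff_block_ref are hypothetical, and it assumes the garbled comment means coeffBuf[coeffPos[i]] = Clip16((coeffList[i]*fact + offset) >> bdShift), that the product fits in int32 as the contract states, and that right-shifting a negative value is an arithmetic shift (true on mainstream compilers, guaranteed from C++20).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: saturate a 32-bit value to the int16_t range ("Clip16").
static inline int16_t clip16(int32_t v) {
  return static_cast<int16_t>(std::clamp<int32_t>(v, INT16_MIN, INT16_MAX));
}

// Hypothetical scalar reference for the dequant_coeff_block function pointer.
// Assumes coeffList[i]*fact + offset fits in int32 (the caller's contract).
void dequant_coeff_block_ref(int16_t* coeffBuf, const int16_t* coeffList,
                             const int16_t* coeffPos, int nCoeff,
                             int32_t fact, int32_t offset, int32_t bdShift) {
  for (int i = 0; i < nCoeff; i++) {
    // Scale, round via the offset, shift down, then saturate to 16 bits.
    const int32_t scaled = (coeffList[i] * fact + offset) >> bdShift;
    // coeffPos[i] is the scatter position of coefficient i in the block.
    coeffBuf[coeffPos[i]] = clip16(scaled);
  }
}
```

A SIMD kernel installed into this function pointer would be expected to produce the same values as this loop for inputs that satisfy the contract.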
libde265-1.1.1.tar.gz/libde265/arm32
Added
+(directory)
libde265-1.1.1.tar.gz/libde265/arm32/CMakeLists.txt
Added
@@ -0,0 +1,26 @@
+add_library(arm32 OBJECT arm.cc arm.h)
+
+if(HAVE_NEON)
+  add_library(arm32_neon OBJECT
+    cpudetect.S
+    hevcdsp_qpel_neon.S
+  )
+
+  # Clang's assembler does not support .func/.endfunc directives.
+  # See issue #510.
+  if(NOT CMAKE_ASM_COMPILER_ID MATCHES Clang)
+    set(AS_FUNC_FLAG -DHAVE_AS_FUNC)
+  endif()
+
+  target_compile_options(arm32_neon PRIVATE
+    -mfpu=neon
+    -DHAVE_NEON
+    -DEXTERN_ASM=
+    ${AS_FUNC_FLAG}
+    -DHAVE_SECTION_DATA_REL_RO
+  )
+
+  set(ARM32_OBJECTS $<TARGET_OBJECTS:arm32> $<TARGET_OBJECTS:arm32_neon> PARENT_SCOPE)
+else()
+  set(ARM32_OBJECTS $<TARGET_OBJECTS:arm32> PARENT_SCOPE)
+endif()
libde265-1.1.1.tar.gz/libde265/arm32/arm.cc
Changed
(renamed from libde265/arm/arm.cc)
libde265-1.1.1.tar.gz/libde265/arm32/arm.h
Changed
(renamed from libde265/arm/arm.h)
libde265-1.1.1.tar.gz/libde265/arm32/asm.S
Changed
(renamed from libde265/arm/asm.S)
libde265-1.1.1.tar.gz/libde265/arm32/cpudetect.S
Changed
(renamed from libde265/arm/cpudetect.S)
libde265-1.1.1.tar.gz/libde265/arm32/hevcdsp_qpel_neon.S
Changed
(renamed from libde265/arm/hevcdsp_qpel_neon.S)
libde265-1.1.1.tar.gz/libde265/arm32/neon.S
Changed
(renamed from libde265/arm/neon.S)
libde265-1.0.17.tar.gz/libde265/cabac.cc -> libde265-1.1.1.tar.gz/libde265/cabac.cc
Changed
@@ -158,8 +158,101 @@ } +#if defined(__x86_64__) && defined(__GNUC__) && !defined(DE265_LOG_TRACE) +#define DE265_CABAC_ASM_X86_64 1 +// Combined state-transition table (folds the MPS-flip at state 0), indexed by +// the packed context byte (state*2+MPSbit) for MPS and (packed ^ 0xFF) for LPS. +static const uint8_t cabac_transition256 = + { + 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, + 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, + 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, + 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, + 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, + 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, + 98, 99,100,101,102,103,104,105,106,107,108,109,110,111,112,113, + 114,115,116,117,118,119,120,121,122,123,124,125,124,125,126,127, + 127,126, 77, 76, 77, 76, 75, 74, 75, 74, 75, 74, 73, 72, 73, 72, + 73, 72, 71, 70, 71, 70, 71, 70, 69, 68, 69, 68, 67, 66, 67, 66, + 67, 66, 65, 64, 65, 64, 63, 62, 61, 60, 61, 60, 61, 60, 59, 58, + 59, 58, 57, 56, 55, 54, 55, 54, 53, 52, 53, 52, 51, 50, 49, 48, + 49, 48, 47, 46, 45, 44, 45, 44, 43, 42, 43, 42, 39, 38, 39, 38, + 37, 36, 37, 36, 33, 32, 33, 32, 31, 30, 31, 30, 27, 26, 27, 26, + 25, 24, 23, 22, 23, 22, 19, 18, 19, 18, 17, 16, 15, 14, 13, 12, + 11, 10, 9, 8, 9, 8, 5, 4, 5, 4, 3, 2, 1, 0, 0, 1, + }; +#endif + int CABAC_decoder::decode_bit(context_model* model) { +#ifdef DE265_CABAC_ASM_X86_64 + // x86-64 branchless arithmetic decoder. Bit-identical to the C fallback below. 
+ uint32_t range_l = range, value_l = value; + int bn_l = bits_needed; + uint8_t* curr_l = bitstream_curr; + uint8_t* sp = reinterpret_cast<uint8_t*>(model); + int bit_l; + + __asm__ ( + "movzbl (%sp), %%eax \n\t" // eax = b (state*2 + MPS) + "mov %range, %%ecx \n\t" + "shr $6, %%ecx \n\t" + "sub $4, %%ecx \n\t" // ecx = (range>>6)-4 + "mov %%eax, %%edx \n\t" + "shr $1, %%edx \n\t" // edx = state + "lea (%%rcx,%%rdx,4), %%rcx \n\t" // rcx = state*4 + ridx + "lea %lps, %%rdx \n\t" // rdx = &LPS_table00 + "movzbl (%%rdx,%%rcx,1), %%edx \n\t" // edx = RangeLPS + "sub %%edx, %range \n\t" // range = range - RangeLPS (MPS range) + "mov %range, %%r8d \n\t" + "shl $7, %%r8d \n\t" // r8d = scaled_range + "mov %value, %%r9d \n\t" + "sub %%r8d, %%r9d \n\t" + "sar $31, %%r9d \n\t" // r9d = mps_mask (-1 if MPS) + "not %%r9d \n\t" // r9d = lps_mask (-1 if LPS) + "mov %%r8d, %%ecx \n\t" + "and %%r9d, %%ecx \n\t" + "sub %%ecx, %value \n\t" // value -= scaled & lps_mask + "mov %%edx, %%ecx \n\t" // RangeLPS + "sub %range, %%ecx \n\t" + "and %%r9d, %%ecx \n\t" + "add %%ecx, %range \n\t" // range = RangeLPS if LPS, else unchanged + "and $0xFF, %%r9d \n\t" + "xor %%r9d, %%eax \n\t" // eax = idx + "mov %%eax, %bit \n\t" + "and $1, %bit \n\t" // bit = idx & 1 + "lea %trans, %%rcx \n\t" // rcx = &cabac_transition0 + "movzbl (%%rcx,%%rax,1), %%edx \n\t" + "mov %%dl, (%sp) \n\t" // *sp = cabac_transitionidx + "bsr %range, %%ecx \n\t" // ecx = MSB index + "mov $8, %%edx \n\t" + "sub %%ecx, %%edx \n\t" // edx = renorm shift = 8 - bsr + "mov %%edx, %%ecx \n\t" + "shl %%cl, %value \n\t" + "shl %%cl, %range \n\t" + "add %%edx, %bn \n\t" // bits_needed += shift + "test %bn, %bn \n\t" + "js 1f \n\t" // bits_needed < 0: no refill + "cmp %end, %curr \n\t" + "jae 2f \n\t" // curr >= end: no read + "movzbl (%curr), %%eax \n\t" + "mov %bn, %%ecx \n\t" + "shl %%cl, %%eax \n\t" + "or %%eax, %value \n\t" // value |= (*curr) << bits_needed + "inc %curr \n\t" + "2: \n\t" + "sub $8, %bn \n\t" + "1: \n\t" + 
: range"+r"(range_l), value"+r"(value_l), bn"+r"(bn_l), + curr"+r"(curr_l), bit"=&r"(bit_l) + : sp"r"(sp), end"r"(bitstream_end), + lps"m"(LPS_table00), trans"m"(cabac_transition0) + : "rax","rcx","rdx","r8","r9","cc","memory" + ); + + range = range_l; value = value_l; bits_needed = (int16_t)bn_l; bitstream_curr = curr_l; + return bit_l; +#else logtrace(LogCABAC,"%3d decodeBin r:%x v:%x state:%d\n",logcnt,range, value, model->state); int decoded_bit; @@ -237,6 +330,7 @@ #endif return decoded_bit; +#endif } int CABAC_decoder::decode_term_bit() @@ -279,6 +373,43 @@ // we will eventually only return zeros. int CABAC_decoder::decode_bypass() { +#ifdef DE265_CABAC_ASM_X86_64 + uint32_t value_l = value; + int bn_l = bits_needed; + uint8_t* curr_l = bitstream_curr; + int bit_l; + + __asm__ ( + "add %value, %value \n\t" // value <<= 1 + "addl $1, %bn \n\t" // bits_needed++ + "js 1f \n\t" // bits_needed < 0: no refill + "cmp %end, %curr \n\t" + "jae 2f \n\t" // curr >= end: no read + "movzbl (%curr), %%eax \n\t" + "or %%eax, %value \n\t" // value |= *curr (bits_needed==0) + "inc %curr \n\t" + "2: \n\t" + "movl $-8, %bn \n\t" // bits_needed = -8 + "1: \n\t" + "mov %range, %%ecx \n\t" + "shl $7, %%ecx \n\t" // ecx = scaled_range + "mov %value, %%edx \n\t" + "sub %%ecx, %%edx \n\t" // edx = value - scaled_range + "mov %%edx, %%eax \n\t" + "sar $31, %%eax \n\t" // eax = mask (-1 if value<scaled -> bit 0) + "mov %%eax, %bit \n\t" + "add $1, %bit \n\t" // bit = mask + 1 + "not %%eax \n\t" // eax = ~mask + "and %%eax, %%ecx \n\t" // ecx = scaled & ~mask + "sub %%ecx, %value \n\t" // value -= scaled when bit==1 + : value"+r"(value_l), bn"+r"(bn_l), curr"+r"(curr_l), bit"=&r"(bit_l) + : range"r"(range), end"r"(bitstream_end) + : "rax","rcx","rdx","cc","memory" + ); + + value = value_l; bits_needed = (int16_t)bn_l; bitstream_curr = curr_l; + return bit_l; +#else logtrace(LogCABAC,"%3d bypass r:%x v:%x\n",logcnt,range, value); value <<= 1; @@ -314,6 +445,7 @@ #endif return bit; 
+#endif }
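The comments in the decode_bypass assembly above describe a branchless compare: the sign of value - scaled_range becomes an all-ones or all-zero mask that both yields the decoded bit and gates the subtraction. The following is a minimal C++ sketch of that idea, reconstructed from the assembly comments only. BypassState and decode_bypass_sketch are hypothetical names, not libde265 types, and the sketch assumes value stays below 2^31 so the sign test is valid, and that converting to int32_t and shifting right behave as two's complement arithmetic (guaranteed from C++20, true on mainstream compilers before that).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the decoder fields the assembly touches.
struct BypassState {
  uint32_t range;
  uint32_t value;
  int bits_needed;
  const uint8_t* curr;
  const uint8_t* end;
};

// Sketch of a bypass-bin decode with a mask instead of a data-dependent branch.
int decode_bypass_sketch(BypassState& s) {
  s.value <<= 1;
  s.bits_needed++;
  if (s.bits_needed >= 0) {
    // Refill one byte if any input is left; past the end we shift in zeros.
    if (s.curr < s.end) {
      s.value |= *s.curr++;
    }
    s.bits_needed = -8;
  }

  const uint32_t scaled = s.range << 7;
  // mask is -1 (all ones) when value < scaled, otherwise 0.
  const int32_t mask = static_cast<int32_t>(s.value - scaled) >> 31;
  const int bit = mask + 1;                      // 0 when value < scaled, else 1
  s.value -= scaled & ~static_cast<uint32_t>(mask);  // subtract only when bit == 1
  return bit;
}
```

The refill test remains a branch here, as it does in the assembly; only the decision between bit 0 and bit 1 is made without one.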
libde265-1.0.17.tar.gz/libde265/de265.cc -> libde265-1.1.1.tar.gz/libde265/de265.cc
Changed
@@ -82,6 +82,8 @@
   case DE265_ERROR_OUT_OF_MEMORY: return "out of memory";
   case DE265_ERROR_CODED_PARAMETER_OUT_OF_RANGE: return "coded parameter out of range";
   case DE265_ERROR_IMAGE_BUFFER_FULL: return "DPB/output queue full";
+  case DE265_ERROR_IMAGE_SIZE_EXCEEDS_SECURITY_LIMIT: return "image size exceeds security limit";
+  case DE265_ERROR_NAL_SIZE_EXCEEDS_SECURITY_LIMIT: return "NAL unit size exceeds security limit";
   case DE265_ERROR_CANNOT_START_THREADPOOL: return "cannot start decoding threads";
   case DE265_ERROR_LIBRARY_INITIALIZATION_FAILED: return "global library initialization failed";
   case DE265_ERROR_LIBRARY_NOT_INITIALIZED: return "cannot free library data (not initialized";
@@ -178,6 +180,10 @@
     return "Access with invalid slice header index";
   case DE265_WARNING_INVALID_TU_BLOCK_SPLIT:
     return "Transform block split below minimum transform size";
+  case DE265_WARNING_RICE_PARAMETER_OUT_OF_RANGE:
+    return "Rice parameter or StatCoeff out of range, clamped";
+  case DE265_WARNING_MAX_NUMBER_OF_SEI_MESSAGES_EXCEEDED:
+    return "number of SEI messages exceeds security limit, dropped";
   default: return "unknown error";
   }
 
@@ -215,8 +221,10 @@
   // do initializations
 
   init_scan_orders();
+  pps_scan_cache_init();
 
   if (!alloc_and_init_significant_coeff_ctxIdx_lookupTable()) {
+    pps_scan_cache_free();
     de265_init_count--;
     return DE265_ERROR_LIBRARY_INITIALIZATION_FAILED;
   }
@@ -236,6 +244,7 @@
 
   if (de265_init_count==0) {
     free_significant_coeff_ctxIdx_lookupTable();
+    pps_scan_cache_free();
   }
 
   return DE265_OK;
@@ -610,6 +619,53 @@
 }
 
 
+static const de265_security_limits disabled_security_limits = {
+  1, // version
+  0, // max_image_size_pixels
+  0, // max_NAL_size_bytes
+  0  // max_SEI_messages
+};
+
+
+LIBDE265_API de265_security_limits* de265_get_security_limits(de265_decoder_context* de265ctx)
+{
+  decoder_context* ctx = reinterpret_cast<decoder_context*>(de265ctx);
+  return &ctx->param_security_limits;
+}
+
+
+// Copy security limits field by field, but only those fields that exist in both
+// structs, i.e. up to min(dst->version, src->version). This keeps copying safe
+// when 'src' was compiled against a different (older or newer) header version
+// than 'dst'. Fields not covered by the common version keep their value in 'dst'.
+static void copy_security_limits(de265_security_limits* dst, const de265_security_limits* src)
+{
+  uint8_t version = (dst->version < src->version) ? dst->version : src->version;
+
+  if (version >= 1) {
+    dst->max_image_size_pixels = src->max_image_size_pixels;
+    dst->max_NAL_size_bytes = src->max_NAL_size_bytes;
+    dst->max_SEI_messages = src->max_SEI_messages;
+  }
+}
+
+
+LIBDE265_API void de265_set_security_limits(de265_decoder_context* de265ctx,
+                                            const de265_security_limits* limits)
+{
+  decoder_context* ctx = reinterpret_cast<decoder_context*>(de265ctx);
+  if (limits) {
+    copy_security_limits(&ctx->param_security_limits, limits);
+  }
+}
+
+
+LIBDE265_API const de265_security_limits* de265_get_disabled_security_limits()
+{
+  return &disabled_security_limits;
+}
+
+
 LIBDE265_API int de265_get_number_of_input_bytes_pending(de265_decoder_context* de265ctx)
 {
   decoder_context* ctx = reinterpret_cast<decoder_context*>(de265ctx);
@@ -676,7 +732,7 @@
 
   uint8_t* data = img->pixels_confwin[channel];
 
-  if (stride) *stride = img->get_image_stride(channel) * ((de265_get_bits_per_pixel(img, channel)+7) / 8);
+  if (stride) *stride = static_cast<int>(img->get_image_stride(channel) * ((de265_get_bits_per_pixel(img, channel)+7) / 8));
 
   return data;
 }
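The comment on copy_security_limits above explains why only fields up to min(dst->version, src->version) are copied. The following self-contained sketch illustrates how that pattern would extend if a later header added fields. It is an illustration under assumptions, not libde265 code: limits_sketch and copy_limits_sketch are hypothetical names, and hypothetical_v2_field does not exist in the library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a versioned limits struct.
struct limits_sketch {
  uint8_t version;

  // --- version 1 (mirrors the three fields shown in the diff) ---
  uint32_t max_image_size_pixels;
  uint32_t max_NAL_size_bytes;
  uint32_t max_SEI_messages;

  // --- version 2 (purely hypothetical, for illustration only) ---
  uint32_t hypothetical_v2_field;
};

// Copy only the fields that both sides know about; everything beyond the
// common version keeps its existing value in 'dst'.
void copy_limits_sketch(limits_sketch& dst, const limits_sketch& src) {
  const uint8_t common = std::min(dst.version, src.version);

  if (common >= 1) {
    dst.max_image_size_pixels = src.max_image_size_pixels;
    dst.max_NAL_size_bytes = src.max_NAL_size_bytes;
    dst.max_SEI_messages = src.max_SEI_messages;
  }
  if (common >= 2) {
    dst.hypothetical_v2_field = src.hypothetical_v2_field;
  }
}
```

With a version-1 source and a version-2 destination, the version-1 fields are overwritten and the newer field keeps whatever default the destination already held, which is the behaviour the comment describes.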
libde265-1.0.17.tar.gz/libde265/de265.h -> libde265-1.1.1.tar.gz/libde265/de265.h
Changed
@@ -28,8 +28,6 @@
 
 #include <libde265/de265-version.h>
 
-//#define inline static __inline
-
 #ifndef __STDC_LIMIT_MACROS
 #define __STDC_LIMIT_MACROS 1
 
@@ -60,12 +58,6 @@
 #define LIBDE265_DEPRECATED
 #endif
 
-#if defined(_MSC_VER)
-#define LIBDE265_INLINE __inline
-#else
-#define LIBDE265_INLINE inline
-#endif
-
 /* === version numbers === */
 
 // version of linked libde265 library
@@ -103,6 +95,8 @@
   DE265_ERROR_NO_INITIAL_SLICE_HEADER=16,
   DE265_ERROR_PREMATURE_END_OF_SLICE=17,
   DE265_ERROR_UNSPECIFIED_DECODING_ERROR=18,
+  DE265_ERROR_IMAGE_SIZE_EXCEEDS_SECURITY_LIMIT=19,
+  DE265_ERROR_NAL_SIZE_EXCEEDS_SECURITY_LIMIT=20,
 
   // --- errors that should become obsolete in later libde265 versions ---
 
@@ -147,7 +141,9 @@
   DE265_WARNING_BIT_DEPTH_OF_CURRENT_IMAGE_DOES_NOT_MATCH_SPS=1031,
   DE265_WARNING_REFERENCE_IMAGE_CHROMA_FORMAT_DOES_NOT_MATCH=1032,
   DE265_WARNING_INVALID_SLICE_HEADER_INDEX_ACCESS=1033,
-  DE265_WARNING_INVALID_TU_BLOCK_SPLIT=1034
+  DE265_WARNING_INVALID_TU_BLOCK_SPLIT=1034,
+  DE265_WARNING_RICE_PARAMETER_OUT_OF_RANGE=1035,
+  DE265_WARNING_MAX_NUMBER_OF_SEI_MESSAGES_EXCEEDED=1036
 } de265_error;
 
 LIBDE265_API const char* de265_get_error_text(de265_error err);
@@ -209,6 +205,17 @@
 
 typedef void de265_decoder_context; // private structure
 
+/* Thread-safety:
+   A de265_decoder_context must not be accessed concurrently from multiple
+   threads. All API calls that take a de265_decoder_context (push data, decode,
+   query state, retrieve images, free) must be serialized by the caller.
+   To decode multiple streams in parallel, create one context per thread.
+
+   This is independent from de265_start_worker_threads(), which only enables
+   internal worker threads inside a single context to parallelize WPP/tile
+   decoding. Those internal threads are managed by libde265 and do not relax
+   the single-owner-thread requirement above.
+*/
 /* Get a new decoder context. Must be freed with de265_free_decoder(). */
 LIBDE265_API de265_decoder_context* de265_new_decoder(void);
 
@@ -428,6 +435,25 @@
 LIBDE265_API int de265_get_parameter_bool(de265_decoder_context*, de265_param param);
 
 
+/* --- security limits --- */
+
+typedef struct de265_security_limits {
+  uint8_t version;
+
+  // --- version 1 ---
+
+  uint32_t max_image_size_pixels;
+  uint32_t max_NAL_size_bytes;
+  uint32_t max_SEI_messages; // max number of SEI messages per access unit (0 = unlimited)
+
+} de265_security_limits;
+
+LIBDE265_API de265_security_limits* de265_get_security_limits(de265_decoder_context*);
+
+LIBDE265_API void de265_set_security_limits(de265_decoder_context*, const de265_security_limits* limits);
+
+LIBDE265_API const de265_security_limits* de265_get_disabled_security_limits();
+
 /* --- optional library initialization --- */
 
 
libde265-1.0.17.tar.gz/libde265/deblock.cc -> libde265-1.1.1.tar.gz/libde265/deblock.cc
Changed
@@ -22,6 +22,8 @@ #include "util.h" #include "transform.h" #include "de265.h" +#include "decctx.h" +#include "fallback-deblk.h" #include <assert.h> @@ -187,8 +189,8 @@ filterLeftCbEdge = 0; } else if (pps.loop_filter_across_tiles_enabled_flag == 0 && - pps.TileIdRS x0ctb +y0ctb*picWidthInCtbs != - pps.TileIdRS((x0-1)>>ctbshift)+y0ctb*picWidthInCtbs) { + pps.scan->TileIdRS x0ctb +y0ctb*picWidthInCtbs != + pps.scan->TileIdRS((x0-1)>>ctbshift)+y0ctb*picWidthInCtbs) { filterLeftCbEdge = 0; } } @@ -201,8 +203,8 @@ filterTopCbEdge = 0; } else if (pps.loop_filter_across_tiles_enabled_flag == 0 && - pps.TileIdRSx0ctb+ y0ctb *picWidthInCtbs != - pps.TileIdRSx0ctb+((y0-1)>>ctbshift)*picWidthInCtbs) { + pps.scan->TileIdRSx0ctb+ y0ctb *picWidthInCtbs != + pps.scan->TileIdRSx0ctb+((y0-1)>>ctbshift)*picWidthInCtbs) { filterTopCbEdge = 0; } } @@ -250,8 +252,8 @@ (DEBLOCK_FLAG_HORIZ | DEBLOCK_PB_EDGE_HORIZ); int transformEdgeMask = vertical ? DEBLOCK_FLAG_VERTI : DEBLOCK_FLAG_HORIZ; - xEnd = libde265_min(xEnd,img->get_deblk_width()); - yEnd = libde265_min(yEnd,img->get_deblk_height()); + xEnd = std::min(xEnd,img->get_deblk_width()); + yEnd = std::min(yEnd,img->get_deblk_height()); //int TUShift = img->get_sps().Log2MinTrafoSize; //int TUStride= img->get_sps().PicWidthInTbsY; @@ -326,18 +328,18 @@ if (refPicP0 != refPicP1) { if (refPicP0 == refPicQ0) { - if (abs_value(mvP0.x-mvQ0.x) >= 4 || - abs_value(mvP0.y-mvQ0.y) >= 4 || - abs_value(mvP1.x-mvQ1.x) >= 4 || - abs_value(mvP1.y-mvQ1.y) >= 4) { + if (std::abs(mvP0.x-mvQ0.x) >= 4 || + std::abs(mvP0.y-mvQ0.y) >= 4 || + std::abs(mvP1.x-mvQ1.x) >= 4 || + std::abs(mvP1.y-mvQ1.y) >= 4) { bS = 1; } } else { - if (abs_value(mvP0.x-mvQ1.x) >= 4 || - abs_value(mvP0.y-mvQ1.y) >= 4 || - abs_value(mvP1.x-mvQ0.x) >= 4 || - abs_value(mvP1.y-mvQ0.y) >= 4) { + if (std::abs(mvP0.x-mvQ1.x) >= 4 || + std::abs(mvP0.y-mvQ1.y) >= 4 || + std::abs(mvP1.x-mvQ0.x) >= 4 || + std::abs(mvP1.y-mvQ0.y) >= 4) { bS = 1; } } @@ -345,15 +347,15 @@ else { 
assert(refPicQ0==refPicQ1); - if ((abs_value(mvP0.x-mvQ0.x) >= 4 || - abs_value(mvP0.y-mvQ0.y) >= 4 || - abs_value(mvP1.x-mvQ1.x) >= 4 || - abs_value(mvP1.y-mvQ1.y) >= 4) + if ((std::abs(mvP0.x-mvQ0.x) >= 4 || + std::abs(mvP0.y-mvQ0.y) >= 4 || + std::abs(mvP1.x-mvQ1.x) >= 4 || + std::abs(mvP1.y-mvQ1.y) >= 4) && - (abs_value(mvP0.x-mvQ1.x) >= 4 || - abs_value(mvP0.y-mvQ1.y) >= 4 || - abs_value(mvP1.x-mvQ0.x) >= 4 || - abs_value(mvP1.y-mvQ0.y) >= 4)) { + (std::abs(mvP0.x-mvQ1.x) >= 4 || + std::abs(mvP0.y-mvQ1.y) >= 4 || + std::abs(mvP1.x-mvQ0.x) >= 4 || + std::abs(mvP1.y-mvQ0.y) >= 4)) { bS = 1; } } @@ -422,8 +424,8 @@ int bitDepth_Y = sps.BitDepth_Y; - xEnd = libde265_min(xEnd,img->get_deblk_width()); - yEnd = libde265_min(yEnd,img->get_deblk_height()); + xEnd = std::min(xEnd,img->get_deblk_width()); + yEnd = std::min(yEnd,img->get_deblk_height()); for (int y=yStart;y<yEnd;y+=yIncr) for (int x=xStart;x<xEnd;x+=xIncr) { @@ -532,10 +534,10 @@ int dE=0, dEp=0, dEq=0; - int dp0 = abs_value(p02 - 2*p01 + p00); - int dp3 = abs_value(p32 - 2*p31 + p30); - int dq0 = abs_value(q02 - 2*q01 + q00); - int dq3 = abs_value(q32 - 2*q31 + q30); + int dp0 = std::abs(p02 - 2*p01 + p00); + int dp3 = std::abs(p32 - 2*p31 + p30); + int dq0 = std::abs(q02 - 2*q01 + q00); + int dq3 = std::abs(q32 - 2*q31 + q30); int dpq0 = dp0 + dq0; int dpq3 = dp3 + dq3; @@ -547,12 +549,12 @@ if (d < beta) { //int dpq = 2*dpq0; bool dSam0 = (2 * dpq0 < (beta >> 2) && - abs_value(p03-p00) + abs_value(q00-q03) < (beta >> 3) && - abs_value(p00-q00) < ((5 * tc + 1) >> 1)); + std::abs(p03-p00) + std::abs(q00-q03) < (beta >> 3) && + std::abs(p00-q00) < ((5 * tc + 1) >> 1)); bool dSam3 = (2 * dpq3 < (beta >> 2) && - abs_value(p33-p30) + abs_value(q30-q33) < (beta >> 3) && - abs_value(p30-q30) < ((5 * tc + 1) >> 1)); + std::abs(p33-p30) + std::abs(q30-q33) < (beta >> 3) && + std::abs(p30-q30) < ((5 * tc + 1) >> 1)); if (dSam0 && dSam3) { dE = 2; @@ -589,109 +591,13 @@ if (img->get_cu_transquant_bypass(xDi,yDi)) 
filterQ=false; } - for (int k=0;k<4;k++) { - //int nDp,nDq; - - logtrace(LogDeblock,"line:%d\n",k); - - const pixel_t p0 = pk0; - const pixel_t p1 = pk1; - const pixel_t p2 = pk2; - const pixel_t p3 = pk3; - const pixel_t q0 = qk0; - const pixel_t q1 = qk1; - const pixel_t q2 = qk2; - const pixel_t q3 = qk3; - - if (dE==2) { - // strong filtering - - //nDp=nDq=3; - - pixel_t pnew3,qnew3; - pnew0 = Clip3(p0-2*tc,p0+2*tc, (p2 + 2*p1 + 2*p0 + 2*q0 + q1 +4)>>3); - pnew1 = Clip3(p1-2*tc,p1+2*tc, (p2 + p1 + p0 + q0+2)>>2); - pnew2 = Clip3(p2-2*tc,p2+2*tc, (2*p3 + 3*p2 + p1 + p0 + q0 + 4)>>3); - qnew0 = Clip3(q0-2*tc,q0+2*tc, (p1+2*p0+2*q0+2*q1+q2+4)>>3); - qnew1 = Clip3(q1-2*tc,q1+2*tc, (p0+q0+q1+q2+2)>>2); - qnew2 = Clip3(q2-2*tc,q2+2*tc, (p0+q0+q1+3*q2+2*q3+4)>>3); - - logtrace(LogDeblock,"strong filtering\n"); - - if (vertical) { - for (int i=0;i<3;i++) { - if (filterP) { ptr-i-1+k*stride = pnewi; } - if (filterQ) { ptr i + k*stride = qnewi; } - } - - // ptr-1+k*stride = ptr 0+k*stride = 200; - } - else { - for (int i=0;i<3;i++) { - if (filterP) { ptr k -(i+1)*stride = pnewi; } - if (filterQ) { ptr k + i *stride = qnewi; } - } - } - } - else { - // weak filtering - - //nDp=nDq=0; - - int delta = (9*(q0-p0) - 3*(q1-p1) + 8)>>4; - logtrace(LogDeblock,"delta=%d, tc=%d\n",delta,tc); - - if (abs_value(delta) < tc*10) { - - delta = Clip3(-tc,tc,delta); - logtrace(LogDeblock," deblk + %d;%d %02x->%02x - %d;%d %02x->%02x delta:%d\n", - vertical ? xDi-1 : xDi+k, - vertical ? yDi+k : yDi-1, p0,Clip_BitDepth(p0+delta, bitDepth_Y), - vertical ? xDi : xDi+k, - vertical ? 
yDi+k : yDi, q0,Clip_BitDepth(q0-delta, bitDepth_Y), - delta); - - if (vertical) { - if (filterP) { ptr-0-1+k*stride = Clip_BitDepth(p0+delta, bitDepth_Y); } - if (filterQ) { ptr 0 +k*stride = Clip_BitDepth(q0-delta, bitDepth_Y); } - } - else { - if (filterP) { ptr k -1*stride = Clip_BitDepth(p0+delta, bitDepth_Y); } - if (filterQ) { ptr k +0*stride = Clip_BitDepth(q0-delta, bitDepth_Y); } - } - - //ptr 0+k*stride = 200; - - if (dEp==1 && filterP) { - int delta_p = Clip3(-(tc>>1), tc>>1, (((p2+p0+1)>>1)-p1+delta)>>1); - - logtrace(LogDeblock," deblk dEp %d;%d delta:%d\n", - vertical ? xDi-2 : xDi+k, - vertical ? yDi+k : yDi-2, - delta_p); - - if (vertical) { ptr-1-1+k*stride = Clip_BitDepth(p1+delta_p, bitDepth_Y); } - else { ptr k -2*stride = Clip_BitDepth(p1+delta_p, bitDepth_Y); } - } - - if (dEq==1 && filterQ) { - int delta_q = Clip3(-(tc>>1), tc>>1, (((q2+q0+1)>>1)-q1-delta)>>1); - - logtrace(LogDeblock," delkb dEq %d;%d delta:%d\n", - vertical ? xDi+1 : xDi+k, - vertical ? yDi+k : yDi+1, - delta_q); - - if (vertical) { ptr 1 +k*stride = Clip_BitDepth(q1+delta_q, bitDepth_Y); } - else { ptr k +1*stride = Clip_BitDepth(q1+delta_q, bitDepth_Y); } - } - - //nDp = dEp+1; - //nDq = dEq+1; - - //logtrace(LogDeblock,"weak filtering (%d:%d)\n",nDp,nDq); - } - } + if constexpr (sizeof(pixel_t)==1) { + img->decctx->acceleration.deblock_luma_8((uint8_t*)ptr, stride, vertical, + dE,dEp,dEq,tc, filterP,filterQ); + } + else { + deblock_luma_kernel<pixel_t>(ptr, stride, vertical, + dE,dEp,dEq,tc, filterP,filterQ, bitDepth_Y); } } } @@ -746,8 +652,8 @@ const int stride = img->get_image_stride(1); - xEnd = libde265_min(xEnd,img->get_deblk_width()); - yEnd = libde265_min(yEnd,img->get_deblk_height()); + xEnd = std::min(xEnd,img->get_deblk_width()); + yEnd = std::min(yEnd,img->get_deblk_height()); int bitDepth_C = sps.BitDepth_C; @@ -770,23 +676,6 @@ pixel_t* ptr = img->get_image_plane_at_pos_NEW<pixel_t>(cplane+1, xDi,yDi); - pixel_t p24; - pixel_t q24; - - 
logtrace(LogDeblock,"-%s- %d %d\n",cplane==0 ? "Cb" : "Cr",xDi,yDi); - - for (int i=0;i<2;i++) - for (int k=0;k<4;k++) - { - if (vertical) { - qik = ptr i +k*stride; - pik = ptr-i-1+k*stride; - } - else { - qik = ptrk + i *stride; - pik = ptrk -(i+1)*stride; - } - } #if 0 for (int k=0;k<4;k++) @@ -815,7 +704,7 @@ if (sps.ChromaArrayType == CHROMA_420) { QP_C = table8_22(qP_i); } else { - QP_C = libde265_min(qP_i, 51); + QP_C = std::min(qP_i, 51); } @@ -843,11 +732,11 @@ if (img->get_cu_transquant_bypass(SubWidthC*xDi,SubHeightC*yDi)) filterQ=false; - for (int k=0;k<4;k++) { - int delta = Clip3(-tc,tc, ((((q0k-p0k)*4)+p1k-q1k+4)>>3)); // standard says <<2 in eq. (8-356), but the value can also be negative - logtrace(LogDeblock,"delta=%d\n",delta); - if (filterP) { ptr-1+k*stride = Clip_BitDepth(p0k+delta, bitDepth_C); } - if (filterQ) { ptr 0+k*stride = Clip_BitDepth(q0k-delta, bitDepth_C); } + if constexpr (sizeof(pixel_t)==1) { + img->decctx->acceleration.deblock_chroma_8((uint8_t*)ptr, stride, 1, tc, filterP,filterQ); + } + else { + deblock_chroma_kernel<pixel_t>(ptr, stride, true, tc, filterP,filterQ, bitDepth_C); } } else { @@ -859,10 +748,11 @@ if (sps.pcm_loop_filter_disable_flag && img->get_pcm_flag(SubWidthC*xDi,SubHeightC*yDi)) filterQ=false; if (img->get_cu_transquant_bypass(SubWidthC*xDi,SubHeightC*yDi)) filterQ=false; - for (int k=0;k<4;k++) { - int delta = Clip3(-tc,tc, ((((q0k-p0k)*4)+p1k-q1k+4)>>3)); // standard says <<2, but the value can also be negative - if (filterP) { ptr k-1*stride = Clip_BitDepth(p0k+delta, bitDepth_C); } - if (filterQ) { ptr k+0*stride = Clip_BitDepth(q0k-delta, bitDepth_C); } + if constexpr (sizeof(pixel_t)==1) { + img->decctx->acceleration.deblock_chroma_8((uint8_t*)ptr, stride, 0, tc, filterP,filterQ); + } + else { + deblock_chroma_kernel<pixel_t>(ptr, stride, false, tc, filterP,filterQ, bitDepth_C); } } }
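The chroma loops removed in the hunks above compute one clipped delta per line and apply it with opposite signs on the two sides of the edge. The following is a minimal sketch of a kernel with that behaviour, reconstructed from the removed loop bodies. The name deblock_chroma_sketch is hypothetical, and although the parameter list follows the deblock_chroma_kernel call sites shown in the diff, the body is an assumption, not the contents of fallback-deblk.h. It assumes tc >= 0 and an arithmetic right shift for negative values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sketch of a 4-line chroma edge filter. 'ptr' points at q0 of the first line.
// vertical == true: the edge is vertical, samples across it are 1 apart and
// the four lines are 'stride' apart; for a horizontal edge the roles swap.
template <class pixel_t>
void deblock_chroma_sketch(pixel_t* ptr, std::ptrdiff_t stride, bool vertical,
                           int tc, bool filterP, bool filterQ, int bitDepth) {
  const int maxVal = (1 << bitDepth) - 1;
  const std::ptrdiff_t across = vertical ? 1 : stride;  // step across the edge
  const std::ptrdiff_t along = vertical ? stride : 1;   // step along the edge

  for (int k = 0; k < 4; k++) {
    pixel_t* e = ptr + k * along;
    const int p1 = e[-2 * across];
    const int p0 = e[-across];
    const int q0 = e[0];
    const int q1 = e[across];

    // Multiplication by 4 instead of <<2 because (q0-p0) can be negative,
    // as the removed source comment points out.
    const int delta =
        std::clamp((((q0 - p0) * 4) + p1 - q1 + 4) >> 3, -tc, tc);

    if (filterP) e[-across] = static_cast<pixel_t>(std::clamp(p0 + delta, 0, maxVal));
    if (filterQ) e[0] = static_cast<pixel_t>(std::clamp(q0 - delta, 0, maxVal));
  }
}
```

Passing the edge direction as a pair of strides keeps a single loop body for both orientations, where the removed code had two near-identical loops.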
libde265-1.0.17.tar.gz/libde265/decctx.cc -> libde265-1.1.1.tar.gz/libde265/decctx.cc
Changed
@@ -41,8 +41,8 @@ #include "x86/sse.h" #endif -#ifdef HAVE_ARM -#include "arm/arm.h" +#ifdef HAVE_ARM32 +#include "arm32/arm.h" #endif #define SAVE_INTERMEDIATE_IMAGES 0 @@ -133,6 +133,7 @@ decoder_context::decoder_context() { param_image_allocation_functions = de265_image::default_image_allocation; + nal_parser.set_security_limits(¶m_security_limits); compute_framedrop_table(); } @@ -249,7 +250,19 @@ init_acceleration_functions_sse(&acceleration); } #endif -#ifdef HAVE_ARM +#if HAVE_AVX2 + // layered on top of SSE: overrides a few transform kernels (runtime-checked) + if (l>=de265_acceleration_AVX2) { + init_acceleration_functions_avx2(&acceleration); + } +#endif +#if HAVE_AVX512 + // layered on top of AVX2: overrides the 32x32 transform (runtime-checked) + if (l>=de265_acceleration_AVX2) { + init_acceleration_functions_avx512(&acceleration); + } +#endif +#ifdef HAVE_ARM32 if (l>=de265_acceleration_ARM) { init_acceleration_functions_arm(&acceleration); } @@ -276,7 +289,7 @@ if (tctx->shdr->slice_segment_address > 0) { - int prevCtb = pps.CtbAddrTStoRS pps.CtbAddrRStoTStctx->shdr->slice_segment_address -1 ; + int prevCtb = pps.scan->CtbAddrTStoRS pps.scan->CtbAddrRStoTStctx->shdr->slice_segment_address -1 ; int ctbX = prevCtb % sps.PicWidthInCtbsY; int ctbY = prevCtb / sps.PicWidthInCtbsY; @@ -301,7 +314,7 @@ void decoder_context::add_task_decode_CTB_row(thread_context* tctx, bool firstSliceSubstream, - int ctbRow) + uint16_t ctbRow) { thread_task_ctb_row* task = new thread_task_ctb_row; task->firstSliceSubstream = firstSliceSubstream; @@ -316,7 +329,7 @@ void decoder_context::add_task_decode_slice_segment(thread_context* tctx, bool firstSliceSubstream, - int ctbx,int ctby) + uint16_t ctbx, uint16_t ctby) { thread_task_slice_segment* task = new thread_task_slice_segment; task->firstSliceSubstream = firstSliceSubstream; @@ -413,6 +426,14 @@ dump_sei(&sei, current_sps.get()); if (image_units.empty()==false && suffix) { + uint32_t max_SEI_messages = 
param_security_limits.max_SEI_messages; + if (max_SEI_messages != 0 && + image_units.back()->suffix_SEIs.size() >= max_SEI_messages) { + // too many SEI messages for this access unit -> drop to bound memory usage + add_warning(DE265_WARNING_MAX_NUMBER_OF_SEI_MESSAGES_EXCEEDED, false); + return DE265_WARNING_MAX_NUMBER_OF_SEI_MESSAGES_EXCEEDED; + } + image_units.back()->suffix_SEIs.push_back(sei); } } @@ -466,7 +487,7 @@ // modify entry_point_offsets uint32_t headerLength = reader.data - nal->data(); - for (int i=0;i<shdr->num_entry_point_offsets;i++) { + for (uint32_t i=0;i<shdr->num_entry_point_offsets;i++) { uint32_t skipped = nal->num_skipped_bytes_before(shdr->entry_point_offseti, headerLength); if (skipped > shdr->entry_point_offseti) { @@ -478,16 +499,18 @@ shdr->entry_point_offseti -= skipped; } - this->img->add_slice_segment_header(shdr); - - - // --- start a new image if this is the first slice --- if (shdr->first_slice_segment_in_pic_flag) { image_unit* imgunit = new image_unit; imgunit->img = this->img; image_units.push_back(imgunit); + + // A new picture starts here. Drop the reference to the previous picture's + // slice header, whose storage may be released independently of this decoder + // state. Dependent slices only ever reference a preceding slice header + // within the same picture, which is set below as slices are retained. + previous_slice_header = nullptr; } @@ -495,6 +518,18 @@ if ( ! image_units.empty() ) { + // Hand the slice header to the picture (which takes ownership and frees it + // on release). Only do this when there is an active image unit to decode + // the slice; otherwise the header would be retained on img->slices forever, + // which a crafted stream of non-first slice NALs can exploit to grow memory + // without bound. 
+ this->img->add_slice_segment_header(shdr); + + // The header is now owned by the image and stays alive at least until the + // image is released, so it is safe for a following dependent slice to copy + // from it. Only retained headers may become 'previous_slice_header'. + previous_slice_header = shdr; + slice_unit* sliceunit = new slice_unit(this); sliceunit->nal = nal; sliceunit->shdr = shdr; @@ -507,6 +542,7 @@ } else { nal_parser.free_NAL_unit(nal); + delete shdr; } bool did_work; @@ -631,7 +667,7 @@ remove_images_from_dpb(sliceunit->shdr->RemoveReferencesList); - if (sliceunit->shdr->slice_segment_address >= imgunit->img->get_pps().CtbAddrRStoTS.size()) { + if (sliceunit->shdr->slice_segment_address >= imgunit->img->get_pps().scan->CtbAddrRStoTS.size()) { return DE265_ERROR_CTB_OUTSIDE_IMAGE_AREA; } @@ -643,7 +679,7 @@ tctx.decctx = this; tctx.imgunit = imgunit; tctx.sliceunit= sliceunit; - tctx.CtbAddrInTS = imgunit->img->get_pps().CtbAddrRStoTStctx.shdr->slice_segment_address; + tctx.CtbAddrInTS = imgunit->img->get_pps().scan->CtbAddrRStoTStctx.shdr->slice_segment_address; tctx.task = nullptr; init_thread_context(&tctx); @@ -660,6 +696,7 @@ if (imgunit->img->get_pps().entropy_coding_sync_enabled_flag && sliceunit->shdr->first_slice_segment_in_pic_flag) { imgunit->ctx_models.resize( (img->get_sps().PicHeightInCtbsY-1) ); //* CONTEXT_MODEL_TABLE_LENGTH ); + imgunit->StatCoeff_models.assign( (img->get_sps().PicHeightInCtbsY-1), {{0,0,0,0}} ); } sliceunit->nThreads=1; @@ -809,8 +846,8 @@ slice_segment_header* shdr = sliceunit->shdr; const pic_parameter_set& pps = img->get_pps(); - int nRows = shdr->num_entry_point_offsets +1; - int ctbsWidth = img->get_sps().PicWidthInCtbsY; + uint16_t nRows = shdr->num_entry_point_offsets +1; + uint16_t ctbsWidth = img->get_sps().PicWidthInCtbsY; assert(img->num_threads_active() == 0); @@ -821,6 +858,7 @@ if (shdr->first_slice_segment_in_pic_flag) { // reserve space for nRows-1 because we don't need to save the CABAC model in 
the last CTB row imgunit->ctx_models.resize( (img->get_sps().PicHeightInCtbsY-1) ); //* CONTEXT_MODEL_TABLE_LENGTH ); + imgunit->StatCoeff_models.assign( (img->get_sps().PicHeightInCtbsY-1), {{0,0,0,0}} ); } @@ -828,10 +866,14 @@ // first CTB in this slice - int ctbAddrRS = shdr->slice_segment_address; - int ctbRow = ctbAddrRS / ctbsWidth; + uint32_t ctbAddrRS = shdr->slice_segment_address; + uint16_t ctbRow = ctbAddrRS / ctbsWidth; - for (int entryPt=0;entryPt<nRows;entryPt++) { + if (ctbRow + nRows > img->get_sps().PicHeightInCtbsY) { + return DE265_WARNING_SLICEHEADER_INVALID; + } + + for (uint16_t entryPt=0;entryPt<nRows;entryPt++) { // entry points other than the first start at CTB rows if (entryPt>0) { ctbRow++; @@ -857,7 +899,12 @@ tctx->img = img; tctx->imgunit = imgunit; tctx->sliceunit= sliceunit; - tctx->CtbAddrInTS = pps.CtbAddrRStoTSctbAddrRS; + + if (ctbAddrRS >= pps.scan->CtbAddrRStoTS.size()) { + err = DE265_WARNING_SLICEHEADER_INVALID; + break; + } + tctx->CtbAddrInTS = pps.scan->CtbAddrRStoTSctbAddrRS; init_thread_context(tctx); @@ -922,8 +969,8 @@ slice_segment_header* shdr = sliceunit->shdr; const pic_parameter_set& pps = img->get_pps(); - int nTiles = shdr->num_entry_point_offsets +1; - int ctbsWidth = img->get_sps().PicWidthInCtbsY; + uint16_t nTiles = shdr->num_entry_point_offsets +1; + uint16_t ctbsWidth = img->get_sps().PicWidthInCtbsY; assert(img->num_threads_active() == 0); @@ -932,10 +979,16 @@ // first CTB in this slice - int ctbAddrRS = shdr->slice_segment_address; - int tileID = pps.TileIdRSctbAddrRS; + uint32_t ctbAddrRS = shdr->slice_segment_address; - for (int entryPt=0;entryPt<nTiles;entryPt++) { + // pps.scan->TileIdRS and pps.scan->CtbAddrRStoTS are both sized to PicSizeInCtbsY in + // set_derived_values(), so one bound covers both accesses below. 
+ if (ctbAddrRS >= pps.scan->CtbAddrRStoTS.size()) { + return DE265_WARNING_SLICEHEADER_INVALID; + } + int tileID = pps.scan->TileIdRSctbAddrRS; + + for (uint16_t entryPt=0;entryPt<nTiles;entryPt++) { // entry points other than the first start at tile beginnings if (entryPt>0) { tileID++; @@ -945,9 +998,14 @@ break; } - int ctbX = pps.colBdtileID % pps.num_tile_columns; - int ctbY = pps.rowBdtileID / pps.num_tile_columns; + uint16_t ctbX = pps.colBdtileID % pps.num_tile_columns; + uint16_t ctbY = pps.rowBdtileID / pps.num_tile_columns; ctbAddrRS = ctbY * ctbsWidth + ctbX; + + if (ctbAddrRS >= pps.scan->CtbAddrRStoTS.size()) { + err = DE265_WARNING_SLICEHEADER_INVALID; + break; + } } // set thread context @@ -959,7 +1017,7 @@ tctx->img = img; tctx->imgunit = imgunit; tctx->sliceunit= sliceunit; - tctx->CtbAddrInTS = pps.CtbAddrRStoTSctbAddrRS; + tctx->CtbAddrInTS = pps.scan->CtbAddrRStoTSctbAddrRS; init_thread_context(tctx); @@ -989,8 +1047,8 @@ img->thread_start(1); sliceunit->nThreads++; add_task_decode_slice_segment(tctx, entryPt==0, - ctbAddrRS % ctbsWidth, - ctbAddrRS / ctbsWidth); + static_cast<uint16_t>(ctbAddrRS % ctbsWidth), + static_cast<uint16_t>(ctbAddrRS / ctbsWidth)); } img->wait_for_completion(); @@ -1571,7 +1629,7 @@ bool decoder_context::construct_reference_picture_lists(slice_segment_header* hdr) { int NumPocTotalCurr = hdr->NumPocTotalCurr; - int NumRpsCurrTempList0 = libde265_max(hdr->num_ref_idx_l0_active, NumPocTotalCurr); + int NumRpsCurrTempList0 = std::max((int)hdr->num_ref_idx_l0_active, NumPocTotalCurr); // TODO: fold code for both lists together @@ -1643,7 +1701,7 @@ */ if (hdr->slice_type == SLICE_TYPE_B) { - int NumRpsCurrTempList1 = libde265_max(hdr->num_ref_idx_l1_active, NumPocTotalCurr); + int NumRpsCurrTempList1 = std::max((int)hdr->num_ref_idx_l1_active, NumPocTotalCurr); int rIdx=0; while (rIdx < NumRpsCurrTempList1) { @@ -1966,7 +2024,9 @@ hdr->SliceAddrRS = previous_slice_header->SliceAddrRS; } - previous_slice_header = hdr; + 
// Note: previous_slice_header is updated by the caller (read_slice_NAL) only + // once the slice header is actually retained by the image. Setting it here + // would leave a dangling pointer when the caller discards/deletes 'hdr'. loginfo(LogHeaders,"SliceAddrRS = %d\n",hdr->SliceAddrRS);
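Note on the WPP entry-point check in the decctx.cc hunks above: the added test rejects a slice whose entry points would run past the last CTB row. A minimal stand-alone sketch of that bound follows. The helper name `wpp_rows_fit` and its parameter names are hypothetical and not part of libde265; only the comparison mirrors the `ctbRow + nRows > PicHeightInCtbsY` test in the diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a libde265 function): returns true when a slice that
// starts at raster-scan CTB address ctbAddrRS and carries nRows WPP substreams
// stays inside a picture of picHeightInCtbs CTB rows.
bool wpp_rows_fit(uint32_t ctbAddrRS, uint16_t nRows,
                  uint16_t picWidthInCtbs, uint16_t picHeightInCtbs)
{
  assert(picWidthInCtbs > 0);
  uint32_t ctbRow = ctbAddrRS / picWidthInCtbs;
  // Add in 32 bits so that ctbRow + nRows cannot wrap around.
  return ctbRow + static_cast<uint32_t>(nRows) <= picHeightInCtbs;
}
```

With a picture that is 10 CTBs wide and 4 CTB rows high, a slice starting at address 10 (row 1) with 4 entry points would need rows 1 to 4 and is rejected.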
libde265-1.0.17.tar.gz/libde265/decctx.h -> libde265-1.1.1.tar.gz/libde265/decctx.h
Changed
@@ -35,6 +35,7 @@ #include "libde265/acceleration.h" #include "libde265/nal-parser.h" +#include <array> #include <memory> constexpr int DE265_MAX_VPS_SETS = 16; // this is the maximum as defined in the standard @@ -262,6 +263,11 @@ There is one saved model for the initialization of each CTB row. The array is unused for non-WPP streams. */ std::vector<context_model_table> ctx_models; // TODO: move this into image ? + + /* Saved StatCoeff (persistent_rice_adaptation state) parallel to ctx_models. + Per HEVC RExt, this state must be carried across WPP CTB rows together + with the CABAC context. */ + std::vector<std::array<uint8_t, 4>> StatCoeff_models; }; @@ -350,6 +356,13 @@ //bool param_disable_mc_residual_idct; // not implemented yet //bool param_disable_intra_residual_idct; // not implemented yet + de265_security_limits param_security_limits = { + 1, // version + 8192 * 8192, // max_image_size_pixels + 16 * 1024 * 1024, // max_NAL_size_bytes + 256 // max_SEI_messages + }; + void set_image_allocation_functions(de265_image_allocation* allocfunc, void* userdata); de265_image_allocation param_image_allocation_functions; // initialized in constructor @@ -499,9 +512,9 @@ private: void init_thread_context(thread_context* tctx); - void add_task_decode_CTB_row(thread_context* tctx, bool firstSliceSubstream, int ctbRow); + void add_task_decode_CTB_row(thread_context* tctx, bool firstSliceSubstream, uint16_t ctbRow); void add_task_decode_slice_segment(thread_context* tctx, bool firstSliceSubstream, - int ctbX,int ctbY); + uint16_t ctbX, uint16_t ctbY); void mark_whole_slice_as_processed(image_unit* imgunit, slice_unit* sliceunit,
libde265-1.0.17.tar.gz/libde265/dpb.cc -> libde265-1.1.1.tar.gz/libde265/dpb.cc
Changed
@@ -251,6 +251,14 @@
   int w = sps->pic_width_in_luma_samples;
   int h = sps->pic_height_in_luma_samples;
 
+  // --- enforce maximum image size before allocating the image buffer ---
+
+  uint32_t max_image_size_pixels = decctx->param_security_limits.max_image_size_pixels;
+  if (max_image_size_pixels != 0 &&
+      (uint64_t)w * h > max_image_size_pixels) {
+    return -DE265_ERROR_IMAGE_SIZE_EXCEEDS_SECURITY_LIMIT;
+  }
+
   enum de265_chroma chroma;
   switch (sps->chroma_format_idc) {
   case 0: chroma = de265_chroma_mono; break;
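Note on the size check in the dpb.cc hunk above: the product is widened to 64 bits before the comparison, and a limit of 0 disables the check. The following is a minimal sketch of that logic in isolation. The function name `image_size_allowed` is an assumption made for illustration; it is not a libde265 API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-alone version of the check (not a libde265 function).
// A limit of 0 means "no limit", as in the diff.
bool image_size_allowed(int width, int height, uint32_t max_image_size_pixels)
{
  assert(width >= 0 && height >= 0);
  if (max_image_size_pixels == 0) return true;
  // Widen before multiplying: 65536 * 65536 does not fit into 32 bits.
  uint64_t pixels = static_cast<uint64_t>(width) * static_cast<uint64_t>(height);
  return pixels <= max_image_size_pixels;
}
```

With the default limit of 8192 * 8192 shown in decctx.h, an 8192x8192 picture passes and a 65536x65536 picture is refused before any buffer is allocated.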
libde265-1.0.17.tar.gz/libde265/fallback-dct.cc -> libde265-1.1.1.tar.gz/libde265/fallback-dct.cc
Changed
@@ -1207,3 +1207,14 @@
 {
   hadamard_transform_8(coeffs,32, input,stride);
 }
+
+
+void dequant_coeff_block_fallback(int16_t* coeffBuf, const int16_t* coeffList,
+                                  const int16_t* coeffPos, int nCoeff,
+                                  int32_t fact, int32_t offset, int32_t bdShift)
+{
+  for (int i=0;i<nCoeff;i++) {
+    int32_t v = Clip3(-32768, 32767, (coeffList[i]*fact + offset) >> bdShift);
+    coeffBuf[coeffPos[i]] = (int16_t)v;
+  }
+}
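Note on the dequantization fallback added above: each listed coefficient is scaled, rounded, shifted, saturated to the int16 range and scattered to its position in the block. A self-contained sketch of the same steps follows, assuming a local `clip3` helper in place of the library's `Clip3`. The names `clip3` and `dequant_sketch` are illustrative only.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-in for the library's Clip3 (assumption for this sketch).
static int32_t clip3(int32_t lo, int32_t hi, int32_t v)
{
  return v < lo ? lo : (v > hi ? hi : v);
}

// Sketch of the scalar path: scale, add rounding offset, shift, saturate, scatter.
void dequant_sketch(int16_t* coeffBuf, const int16_t* coeffList,
                    const int16_t* coeffPos, int nCoeff,
                    int32_t fact, int32_t offset, int32_t bdShift)
{
  assert(bdShift >= 0 && bdShift < 31);
  for (int i = 0; i < nCoeff; i++) {
    int32_t v = clip3(-32768, 32767, (coeffList[i] * fact + offset) >> bdShift);
    coeffBuf[coeffPos[i]] = static_cast<int16_t>(v);
  }
}
```

For example, with `fact = 64`, `offset = 0` and `bdShift = 0`, a coefficient of 1000 scales to 64000 and is saturated to 32767.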
libde265-1.0.17.tar.gz/libde265/fallback-dct.h -> libde265-1.1.1.tar.gz/libde265/fallback-dct.h
Changed
@@ -72,6 +72,11 @@
   }
 }
 
+// Inverse quantization without scaling list (int32 fast path; see acceleration.h).
+void dequant_coeff_block_fallback(int16_t* coeffBuf, const int16_t* coeffList,
+                                  const int16_t* coeffPos, int nCoeff,
+                                  int32_t fact, int32_t offset, int32_t bdShift);
+
 void rdpcm_v_fallback(int32_t* residual, const int16_t* coeffs, int nT, int tsShift,int bdShift);
 void rdpcm_h_fallback(int32_t* residual, const int16_t* coeffs, int nT, int tsShift,int bdShift);
libde265-1.1.1.tar.gz/libde265/fallback-deblk.cc
Added
@@ -0,0 +1,34 @@
+/*
+ * H.265 video codec.
+ * Copyright (c) 2026 Dirk Farin <dirk.farin@gmail.com>
+ *
+ * This file is part of libde265.
+ *
+ * libde265 is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as
+ * published by the Free Software Foundation, either version 3 of
+ * the License, or (at your option) any later version.
+ *
+ * libde265 is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public License
+ * along with libde265. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "fallback-deblk.h"
+
+void deblock_luma_8_fallback(uint8_t* ptr, ptrdiff_t stride, int vertical,
+                             int dE, int dEp, int dEq, int tc, int filterP, int filterQ)
+{
+  deblock_luma_kernel<uint8_t>(ptr, stride, vertical!=0, dE, dEp, dEq, tc,
+                               filterP!=0, filterQ!=0, 8);
+}
+
+void deblock_chroma_8_fallback(uint8_t* ptr, ptrdiff_t stride, int vertical,
+                               int tc, int filterP, int filterQ)
+{
+  deblock_chroma_kernel<uint8_t>(ptr, stride, vertical!=0, tc, filterP!=0, filterQ!=0, 8);
+}
libde265-1.1.1.tar.gz/libde265/fallback-deblk.h
Added
@@ -0,0 +1,133 @@
+/*
+ * H.265 video codec.
+ * Copyright (c) 2026 Dirk Farin <dirk.farin@gmail.com>
+ *
+ * This file is part of libde265.
+ *
+ * libde265 is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as
+ * published by the Free Software Foundation, either version 3 of
+ * the License, or (at your option) any later version.
+ *
+ * libde265 is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public License
+ * along with libde265. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef DE265_FALLBACK_DEBLK_H
+#define DE265_FALLBACK_DEBLK_H
+
+#include <stddef.h>
+#include <stdint.h>
+#include "util.h"
+
+// One luma edge-filter segment (4 lines along the edge), spec 8.7.2.4.4.
+// 'ptr' points at the q0 sample of line 0. dE in {1,2} (weak/strong); the
+// caller guarantees dE != 0. The filterP/filterQ flags disable the p- resp.
+// q-side (PCM / transquant-bypass).
+template <class pixel_t>
+void deblock_luma_kernel(pixel_t* ptr, ptrdiff_t stride, bool vertical,
+                         int dE, int dEp, int dEq, int tc,
+                         bool filterP, bool filterQ, int bitDepth)
+{
+  for (int k=0;k<4;k++) {
+    pixel_t p0,p1,p2,p3,q0,q1,q2,q3;
+    if (vertical) {
+      p0=ptr[-1+k*stride]; p1=ptr[-2+k*stride]; p2=ptr[-3+k*stride]; p3=ptr[-4+k*stride];
+      q0=ptr[ 0+k*stride]; q1=ptr[ 1+k*stride]; q2=ptr[ 2+k*stride]; q3=ptr[ 3+k*stride];
+    } else {
+      p0=ptr[k-1*stride]; p1=ptr[k-2*stride]; p2=ptr[k-3*stride]; p3=ptr[k-4*stride];
+      q0=ptr[k+0*stride]; q1=ptr[k+1*stride]; q2=ptr[k+2*stride]; q3=ptr[k+3*stride];
+    }
+
+    if (dE==2) {
+      // strong filtering
+      pixel_t pnew[3],qnew[3];
+      pnew[0] = Clip3(p0-2*tc,p0+2*tc, (p2 + 2*p1 + 2*p0 + 2*q0 + q1 +4)>>3);
+      pnew[1] = Clip3(p1-2*tc,p1+2*tc, (p2 + p1 + p0 + q0+2)>>2);
+      pnew[2] = Clip3(p2-2*tc,p2+2*tc, (2*p3 + 3*p2 + p1 + p0 + q0 + 4)>>3);
+      qnew[0] = Clip3(q0-2*tc,q0+2*tc, (p1+2*p0+2*q0+2*q1+q2+4)>>3);
+      qnew[1] = Clip3(q1-2*tc,q1+2*tc, (p0+q0+q1+q2+2)>>2);
+      qnew[2] = Clip3(q2-2*tc,q2+2*tc, (p0+q0+q1+3*q2+2*q3+4)>>3);
+
+      if (vertical) {
+        for (int i=0;i<3;i++) {
+          if (filterP) { ptr[-i-1+k*stride] = pnew[i]; }
+          if (filterQ) { ptr[ i + k*stride] = qnew[i]; }
+        }
+      } else {
+        for (int i=0;i<3;i++) {
+          if (filterP) { ptr[ k -(i+1)*stride] = pnew[i]; }
+          if (filterQ) { ptr[ k + i *stride] = qnew[i]; }
+        }
+      }
+    }
+    else {
+      // weak filtering
+      int delta = (9*(q0-p0) - 3*(q1-p1) + 8)>>4;
+
+      if (std::abs(delta) < tc*10) {
+        delta = Clip3(-tc,tc,delta);
+
+        if (vertical) {
+          if (filterP) { ptr[-0-1+k*stride] = Clip_BitDepth(p0+delta, bitDepth); }
+          if (filterQ) { ptr[ 0 +k*stride] = Clip_BitDepth(q0-delta, bitDepth); }
+        } else {
+          if (filterP) { ptr[ k -1*stride] = Clip_BitDepth(p0+delta, bitDepth); }
+          if (filterQ) { ptr[ k +0*stride] = Clip_BitDepth(q0-delta, bitDepth); }
+        }
+
+        if (dEp==1 && filterP) {
+          int delta_p = Clip3(-(tc>>1), tc>>1, (((p2+p0+1)>>1)-p1+delta)>>1);
+          if (vertical) { ptr[-1-1+k*stride] = Clip_BitDepth(p1+delta_p, bitDepth); }
+          else { ptr[ k -2*stride] = Clip_BitDepth(p1+delta_p, bitDepth); }
+        }
+
+        if (dEq==1 && filterQ) {
+          int delta_q = Clip3(-(tc>>1), tc>>1, (((q2+q0+1)>>1)-q1-delta)>>1);
+          if (vertical) { ptr[ 1 +k*stride] = Clip_BitDepth(q1+delta_q, bitDepth); }
+          else { ptr[ k +1*stride] = Clip_BitDepth(q1+delta_q, bitDepth); }
+        }
+      }
+    }
+  }
+}
+
+
+// One chroma edge-filter segment (4 lines), spec 8.7.2.4.5.
+template <class pixel_t>
+void deblock_chroma_kernel(pixel_t* ptr, ptrdiff_t stride, bool vertical,
+                           int tc, bool filterP, bool filterQ, int bitDepth)
+{
+  for (int k=0;k<4;k++) {
+    pixel_t p0,p1,q0,q1;
+    if (vertical) {
+      q0=ptr[ 0+k*stride]; q1=ptr[ 1+k*stride]; p0=ptr[-1+k*stride]; p1=ptr[-2+k*stride];
+    } else {
+      q0=ptr[k+0*stride]; q1=ptr[k+1*stride]; p0=ptr[k-1*stride]; p1=ptr[k-2*stride];
+    }
+
+    int delta = Clip3(-tc,tc, ((((q0-p0)*4)+p1-q1+4)>>3));
+
+    if (vertical) {
+      if (filterP) { ptr[-1+k*stride] = Clip_BitDepth(p0+delta, bitDepth); }
+      if (filterQ) { ptr[ 0+k*stride] = Clip_BitDepth(q0-delta, bitDepth); }
+    } else {
+      if (filterP) { ptr[ k-1*stride] = Clip_BitDepth(p0+delta, bitDepth); }
+      if (filterQ) { ptr[ k+0*stride] = Clip_BitDepth(q0-delta, bitDepth); }
+    }
+  }
+}
+
+
+// 8-bit fallback wrappers stored in the acceleration table.
+void deblock_luma_8_fallback(uint8_t* ptr, ptrdiff_t stride, int vertical,
+                             int dE, int dEp, int dEq, int tc, int filterP, int filterQ);
+void deblock_chroma_8_fallback(uint8_t* ptr, ptrdiff_t stride, int vertical,
+                               int tc, int filterP, int filterQ);
+
+#endif
libde265-1.1.1.tar.gz/libde265/fallback-intrapred.cc
Added
@@ -0,0 +1,54 @@ +/* + * H.265 video codec. + * Copyright (c) 2026 Dirk Farin <dirk.farin@gmail.com> + * + * This file is part of libde265. + * + * libde265 is free software: you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation, either version 3 of + * the License, or (at your option) any later version. + * + * libde265 is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public License + * along with libde265. If not, see <http://www.gnu.org/licenses/>. + */ + +#include "fallback-intrapred.h" +#include "intrapred.h" + + +template <class pixel_t> +void intra_pred_dc_fallback(pixel_t* dst, ptrdiff_t stride, int nT, int cIdx, const pixel_t* border) +{ + intra_prediction_DC<pixel_t>(dst, (int)stride, nT, cIdx, const_cast<pixel_t*>(border)); +} + +template <class pixel_t> +void intra_pred_planar_fallback(pixel_t* dst, ptrdiff_t stride, int nT, int cIdx, const pixel_t* border) +{ + intra_prediction_planar<pixel_t>(dst, (int)stride, nT, cIdx, const_cast<pixel_t*>(border)); +} + +template <class pixel_t> +void intra_pred_angular_fallback(pixel_t* dst, ptrdiff_t stride, int bit_depth, int disableBoundaryFilter, + int xB0, int yB0, int mode, int nT, int cIdx, const pixel_t* border) +{ + intra_prediction_angular<pixel_t>(dst, (int)stride, bit_depth, (bool)disableBoundaryFilter, + xB0, yB0, (enum IntraPredMode)mode, nT, cIdx, + const_cast<pixel_t*>(border)); +} + + +// explicit instantiations so the symbols can be installed into the acceleration table + +template void intra_pred_dc_fallback<uint8_t> (uint8_t*, ptrdiff_t, int, int, const uint8_t*); +template void intra_pred_dc_fallback<uint16_t>(uint16_t*, ptrdiff_t, int, int, const 
uint16_t*); +template void intra_pred_planar_fallback<uint8_t> (uint8_t*, ptrdiff_t, int, int, const uint8_t*); +template void intra_pred_planar_fallback<uint16_t>(uint16_t*, ptrdiff_t, int, int, const uint16_t*); +template void intra_pred_angular_fallback<uint8_t> (uint8_t*, ptrdiff_t, int, int, int, int, int, int, int, const uint8_t*); +template void intra_pred_angular_fallback<uint16_t>(uint16_t*, ptrdiff_t, int, int, int, int, int, int, int, const uint16_t*);
libde265-1.1.1.tar.gz/libde265/fallback-intrapred.h
Added
@@ -0,0 +1,41 @@ +/* + * H.265 video codec. + * Copyright (c) 2026 Dirk Farin <dirk.farin@gmail.com> + * + * This file is part of libde265. + * + * libde265 is free software: you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation, either version 3 of + * the License, or (at your option) any later version. + * + * libde265 is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public License + * along with libde265. If not, see <http://www.gnu.org/licenses/>. + */ + +#ifndef DE265_FALLBACK_INTRAPRED_H +#define DE265_FALLBACK_INTRAPRED_H + +#include <stddef.h> +#include <stdint.h> + +// Scalar fallback wrappers around the intra-prediction kernels in intrapred.h. +// They have plain (non-templated mode/flag) signatures so they can be stored in +// the acceleration_functions function-pointer table. + +template <class pixel_t> +void intra_pred_dc_fallback(pixel_t* dst, ptrdiff_t stride, int nT, int cIdx, const pixel_t* border); + +template <class pixel_t> +void intra_pred_planar_fallback(pixel_t* dst, ptrdiff_t stride, int nT, int cIdx, const pixel_t* border); + +template <class pixel_t> +void intra_pred_angular_fallback(pixel_t* dst, ptrdiff_t stride, int bit_depth, int disableBoundaryFilter, + int xB0, int yB0, int mode, int nT, int cIdx, const pixel_t* border); + +#endif
libde265-1.0.17.tar.gz/libde265/fallback-motion.cc -> libde265-1.1.1.tar.gz/libde265/fallback-motion.cc
Changed
@@ -165,9 +165,11 @@
                         const int16_t *src, ptrdiff_t srcstride,
                         int width, int height, int bit_depth)
 {
-  int shift1 = 14-bit_depth;
-  int offset1 = 0;
-  if (shift1>0) { offset1 = 1<<(shift1-1); }
+  // shift1 per HEVC v2 (10/2014) spec 8.5.3.3.4.2: Max(2, 14 - BitDepth).
+  // The Max() was added with the Range Extensions in v2 to handle BitDepth up to 16;
+  // the v1 (04/2013) formula was just (14 - BitDepth), valid only for BitDepth <= 14.
+  int shift1 = std::max(2, 14-bit_depth);
+  int offset1 = 1<<(shift1-1);
 
   assert((width&1)==0);
 
@@ -232,7 +234,10 @@
                         ptrdiff_t srcstride,
                         int width, int height, int bit_depth)
 {
-  int shift2 = 15-bit_depth;
+  // shift2 per HEVC v2 (10/2014) spec 8.5.3.3.4.2: Max(3, 15 - BitDepth).
+  // The Max() was added with the Range Extensions in v2 to handle BitDepth up to 16;
+  // the v1 (04/2013) formula was just (15 - BitDepth), valid only for BitDepth <= 14.
+  int shift2 = std::max(3, 15-bit_depth);
   int offset2 = 1<<(shift2-1);
 
   assert((width&1)==0);
@@ -279,7 +284,10 @@
                         int width, int height,
                         int mx, int my, int16_t* mcbuffer, int bit_depth)
 {
-  int shift3 = 14 - bit_depth;
+  // shift3 per HEVC v2 (10/2014) spec 8.5.3.3.3.3 (chroma): Max(2, 14 - BitDepth).
+  // The Max() was added with the Range Extensions in v2 to handle BitDepth up to 16;
+  // the v1 (04/2013) formula was just (14 - BitDepth), valid only for BitDepth <= 14.
+  int shift3 = std::max(2, 14 - bit_depth);
 
   for (int y=0;y<height;y++) {
     int16_t* o = &out[y*out_stride];
@@ -459,7 +467,10 @@
 {
   //const int shift1 = bit_depth-8;
   //const int shift2 = 6;
-  const int shift3 = 14-bit_depth;
+  // shift3 per HEVC v2 (10/2014) spec 8.5.3.3.3.2 (luma): Max(2, 14 - BitDepth).
+  // The Max() was added with the Range Extensions in v2 to handle BitDepth up to 16;
+  // the v1 (04/2013) formula was just (14 - BitDepth), valid only for BitDepth <= 14.
+  const int shift3 = std::max(2, 14-bit_depth);
 
   // straight copy
 
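Note on the shift changes in fallback-motion.cc above: the shift is now clamped from below, so the rounding offset `1 << (shift - 1)` is always well defined. A small sketch of the two computations follows. The helper names `unweighted_shift` and `rounding_offset` are hypothetical and only restate the expressions from the diff.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: shift1 as in the diff, Max(2, 14 - BitDepth).
int unweighted_shift(int bit_depth)
{
  assert(bit_depth >= 8 && bit_depth <= 16);
  return std::max(2, 14 - bit_depth);
}

// Hypothetical helper: rounding offset for a right shift by 'shift' bits.
int rounding_offset(int shift)
{
  assert(shift >= 1);  // guaranteed by the clamp above
  return 1 << (shift - 1);
}
```

At 8 bits this yields a shift of 6 with offset 32. At 16 bits the unclamped expression `14 - bit_depth` would be -2, which is not a valid shift count, while the clamped value is 2 with offset 2.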
libde265-1.0.17.tar.gz/libde265/fallback.cc -> libde265-1.1.1.tar.gz/libde265/fallback.cc
Changed
@@ -21,6 +21,8 @@
 #include "fallback.h"
 #include "fallback-motion.h"
 #include "fallback-dct.h"
+#include "fallback-intrapred.h"
+#include "fallback-deblk.h"
 
 
 void init_acceleration_functions_fallback(struct acceleration_functions* accel)
@@ -104,6 +106,7 @@
   accel->rotate_coefficients = rotate_coefficients_fallback;
   accel->add_residual_8 = add_residual_fallback<uint8_t>;
   accel->add_residual_16 = add_residual_fallback<uint16_t>;
+  accel->dequant_coeff_block = dequant_coeff_block_fallback;
   accel->rdpcm_h = rdpcm_h_fallback;
   accel->rdpcm_v = rdpcm_v_fallback;
   accel->transform_skip_residual = transform_skip_residual_fallback;
@@ -124,4 +127,14 @@
   accel->hadamard_transform_8[1] = hadamard_8x8_8_fallback;
   accel->hadamard_transform_8[2] = hadamard_16x16_8_fallback;
   accel->hadamard_transform_8[3] = hadamard_32x32_8_fallback;
+
+  accel->intra_pred_dc_8 = intra_pred_dc_fallback<uint8_t>;
+  accel->intra_pred_dc_16 = intra_pred_dc_fallback<uint16_t>;
+  accel->intra_pred_planar_8 = intra_pred_planar_fallback<uint8_t>;
+  accel->intra_pred_planar_16 = intra_pred_planar_fallback<uint16_t>;
+  accel->intra_pred_angular_8 = intra_pred_angular_fallback<uint8_t>;
+  accel->intra_pred_angular_16 = intra_pred_angular_fallback<uint16_t>;
+
+  accel->deblock_luma_8 = deblock_luma_8_fallback;
+  accel->deblock_chroma_8 = deblock_chroma_8_fallback;
 }
libde265-1.0.17.tar.gz/libde265/image-io.cc -> libde265-1.1.1.tar.gz/libde265/image-io.cc
Changed
@@ -76,7 +76,7 @@ // --- load image --- uint8_t* p; - int stride; + ptrdiff_t stride; p = img->get_image_plane(0); stride = img->get_image_stride(0); for (uint32_t y=0;y<height;y++) { @@ -174,7 +174,7 @@ // --- write image --- const uint8_t* p; - int stride; + ptrdiff_t stride; int width = img->get_width(); int height= img->get_height();
libde265-1.0.17.tar.gz/libde265/image.cc -> libde265-1.1.1.tar.gz/libde265/image.cc
Changed
@@ -20,7 +20,6 @@ #include "image.h" #include "decctx.h" -#include "en265.h" #include <atomic> @@ -45,10 +44,10 @@ #define STANDARD_ALIGNMENT 16 -#ifdef HAVE___MINGW_ALIGNED_MALLOC +#if defined(__MINGW32__) #define ALLOC_ALIGNED(alignment, size) __mingw_aligned_malloc((size), (alignment)) #define FREE_ALIGNED(mem) __mingw_aligned_free((mem)) -#elif _WIN32 +#elif defined(_MSC_VER) #define ALLOC_ALIGNED(alignment, size) _aligned_malloc((size), (alignment)) #define FREE_ALIGNED(mem) _aligned_free((mem)) #elif defined(HAVE_POSIX_MEMALIGN) @@ -71,10 +70,11 @@ void* inputdata, int inputstride, void *userdata) { int alignment = STANDARD_ALIGNMENT; - int stride = (img->get_width(cIdx) + alignment-1) / alignment * alignment; - int height = img->get_height(cIdx); + uint32_t stride = (img->get_width(cIdx) + alignment-1) / alignment * alignment; + uint32_t height = img->get_height(cIdx); - uint8_t* p = static_cast<uint8_t*>(ALLOC_ALIGNED_16(stride * height + MEMORY_PADDING)); + // size computed in size_t: stride*height can exceed UINT32_MAX for large planes + uint8_t* p = static_cast<uint8_t*>(ALLOC_ALIGNED_16(static_cast<size_t>(stride) * height + MEMORY_PADDING)); if (p==nullptr) { return nullptr; } @@ -83,12 +83,14 @@ // copy input data if provided if (inputdata != nullptr) { - if (inputstride == stride) { - memcpy(p, inputdata, stride*height); + if (inputstride == static_cast<int>(stride)) { + memcpy(p, inputdata, static_cast<size_t>(stride) * height); } else { - for (int y=0;y<height;y++) { - memcpy(p+y*stride, static_cast<char*>(inputdata) + inputstride*y, inputstride); + for (uint32_t y=0;y<height;y++) { + memcpy(p + static_cast<size_t>(y) * stride, + static_cast<char*>(inputdata) + static_cast<size_t>(inputstride) * y, + inputstride); } } } @@ -108,30 +110,35 @@ static int de265_image_get_buffer(de265_decoder_context* ctx, de265_image_spec* spec, de265_image* img, void* userdata) { - const int rawChromaWidth = spec->width / img->SubWidthC; - const int rawChromaHeight = 
spec->height / img->SubHeightC; + const uint32_t rawChromaWidth = spec->width / img->SubWidthC; + const uint32_t rawChromaHeight = spec->height / img->SubHeightC; - int luma_stride = (spec->width + spec->alignment-1) / spec->alignment * spec->alignment; - int chroma_stride = (rawChromaWidth + spec->alignment-1) / spec->alignment * spec->alignment; + uint32_t luma_stride = (spec->width + spec->alignment-1) / spec->alignment * spec->alignment; + uint32_t chroma_stride = (rawChromaWidth + spec->alignment-1) / spec->alignment * spec->alignment; assert(img->BitDepth_Y >= 8 && img->BitDepth_Y <= 16); assert(img->BitDepth_C >= 8 && img->BitDepth_C <= 16); - int luma_bpl = luma_stride * ((img->BitDepth_Y+7)/8); - int chroma_bpl = chroma_stride * ((img->BitDepth_C+7)/8); + uint32_t luma_bpl = luma_stride * ((img->BitDepth_Y+7)/8); + uint32_t chroma_bpl = chroma_stride * ((img->BitDepth_C+7)/8); - int luma_height = spec->height; - int chroma_height = rawChromaHeight; + uint32_t luma_height = spec->height; + uint32_t chroma_height = rawChromaHeight; bool alloc_failed = false; - uint8_t* p3 = { 0,0,0 }; - p0 = static_cast<uint8_t*>(ALLOC_ALIGNED_16(luma_height * luma_bpl + MEMORY_PADDING)); + // Compute the plane sizes in size_t. Each operand fits in uint32_t, but the + // height * bytes-per-line product can exceed UINT32_MAX for large frames, so + // the multiplication must be done in 64 bits. Computing it in 32 bits wraps + // the allocation size to a small value while fill_image() later writes the + // real (size_t) size -> heap buffer overflow (GHSA-vv8h-932h-7r86). 
+ uint8_t* p3 = { nullptr,nullptr,nullptr }; + p0 = static_cast<uint8_t*>(ALLOC_ALIGNED_16(static_cast<size_t>(luma_height) * luma_bpl + MEMORY_PADDING)); if (p0==nullptr) { alloc_failed=true; } if (img->get_chroma_format() != de265_chroma_mono) { - p1 = static_cast<uint8_t*>(ALLOC_ALIGNED_16(chroma_height * chroma_bpl + MEMORY_PADDING)); - p2 = static_cast<uint8_t*>(ALLOC_ALIGNED_16(chroma_height * chroma_bpl + MEMORY_PADDING)); + p1 = static_cast<uint8_t*>(ALLOC_ALIGNED_16(static_cast<size_t>(chroma_height) * chroma_bpl + MEMORY_PADDING)); + p2 = static_cast<uint8_t*>(ALLOC_ALIGNED_16(static_cast<size_t>(chroma_height) * chroma_bpl + MEMORY_PADDING)); if (p1==nullptr || p2==nullptr) { alloc_failed=true; } } @@ -177,7 +184,7 @@ }; -void de265_image::set_image_plane(int cIdx, uint8_t* mem, int stride, void *userdata) +void de265_image::set_image_plane(int cIdx, uint8_t* mem, ptrdiff_t stride, void *userdata) { pixelscIdx = mem; plane_user_datacIdx = userdata; @@ -490,24 +497,21 @@ int bytes_per_pixel = get_bytes_per_pixel(channel); assert(value >= 0); // needed for the shift operation in the check below + // Each plane is allocated with MEMORY_PADDING trailing bytes for safe SSE overread; the + // memsets below cover that padding too so it never contains uninitialized heap data. + const size_t plane_bytes = + (channel == 0 ? 
static_cast<size_t>(stride) * height + : static_cast<size_t>(chroma_stride) * chroma_height) + * bytes_per_pixel; + if (bytes_per_pixel == 1) { - if (channel==0) { - memset(pixelschannel, value, stride * height); - } - else { - memset(pixelschannel, value, chroma_stride * chroma_height); - } + memset(pixelschannel, value, plane_bytes + MEMORY_PADDING); } else if ((value >> 8) == (value & 0xFF)) { assert(bytes_per_pixel == 2); // if we fill the same byte value to all bytes, we can still use memset() - if (channel==0) { - memset(pixelschannel, 0, stride * height * bytes_per_pixel); - } - else { - memset(pixelschannel, 0, chroma_stride * chroma_height * bytes_per_pixel); - } + memset(pixelschannel, 0, plane_bytes + MEMORY_PADDING); } else { assert(bytes_per_pixel == 2); @@ -535,6 +539,10 @@ memcpy(pixelschannel + y * chroma_stride * 2, pixelschannel, chroma_width * 2); } } + +#if MEMORY_PADDING > 0 + memset(pixelschannel + plane_bytes, 0, MEMORY_PADDING); +#endif } } @@ -782,9 +790,9 @@ if (xN>=sps->pic_width_in_luma_samples || yN>=sps->pic_height_in_luma_samples) return false; - int minBlockAddrN = pps->MinTbAddrZS (xN>>sps->Log2MinTrafoSize) + + int minBlockAddrN = pps->scan->MinTbAddrZS (xN>>sps->Log2MinTrafoSize) + (yN>>sps->Log2MinTrafoSize) * sps->PicWidthInTbsY ; - int minBlockAddrCurr = pps->MinTbAddrZS (xCurr>>sps->Log2MinTrafoSize) + + int minBlockAddrCurr = pps->scan->MinTbAddrZS (xCurr>>sps->Log2MinTrafoSize) + (yCurr>>sps->Log2MinTrafoSize) * sps->PicWidthInTbsY ; if (minBlockAddrN > minBlockAddrCurr) return false; @@ -799,8 +807,8 @@ return false; } - if (pps->TileIdRSxCurrCtb + yCurrCtb*sps->PicWidthInCtbsY != - pps->TileIdRSxNCtb + yNCtb *sps->PicWidthInCtbsY) { + if (pps->scan->TileIdRSxCurrCtb + yCurrCtb*sps->PicWidthInCtbsY != + pps->scan->TileIdRSxNCtb + yNCtb *sps->PicWidthInCtbsY) { return false; }
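Note on the allocation changes in image.cc above: the comment in the diff states that a height times bytes-per-line product computed in 32 bits can wrap to a small value. The sketch below shows that wrap next to the widened computation. The function names are hypothetical, and `uint64_t` is used here instead of `size_t` so the result does not depend on the platform's pointer width.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration: the product as a 32-bit computation yields it.
// Unsigned arithmetic wraps modulo 2^32.
uint32_t plane_bytes_32(uint32_t height, uint32_t bytes_per_line)
{
  return height * bytes_per_line;
}

// Hypothetical illustration: one operand is widened first, so the full
// product is kept.
uint64_t plane_bytes_64(uint32_t height, uint32_t bytes_per_line)
{
  return static_cast<uint64_t>(height) * bytes_per_line;
}
```

For a 65536 x 65536 byte plane the 32-bit product is 0, so an allocation sized from it would hold only the padding, while the widened product is 4294967296 bytes.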
libde265-1.0.17.tar.gz/libde265/image.h -> libde265-1.1.1.tar.gz/libde265/image.h
Changed
@@ -26,6 +26,7 @@ #endif #include <assert.h> +#include <stddef.h> #include <stdint.h> #include <stdlib.h> #include <string.h> @@ -235,11 +236,11 @@ /* */ uint8_t* get_image_plane(int cIdx) { return pixelscIdx; } const uint8_t* get_image_plane(int cIdx) const { return pixelscIdx; } - void set_image_plane(int cIdx, uint8_t* mem, int stride, void *userdata); + void set_image_plane(int cIdx, uint8_t* mem, ptrdiff_t stride, void *userdata); uint8_t* get_image_plane_at_pos(int cIdx, int xpos,int ypos) { - int stride = get_image_stride(cIdx); + ptrdiff_t stride = get_image_stride(cIdx); return pixelscIdx + xpos + ypos*stride; } @@ -248,38 +249,38 @@ template <class pixel_t> pixel_t* get_image_plane_at_pos_NEW(int cIdx, int xpos,int ypos) { - int stride = get_image_stride(cIdx); + ptrdiff_t stride = get_image_stride(cIdx); return (pixel_t*)(pixelscIdx + (xpos + ypos*stride)*sizeof(pixel_t)); } const uint8_t* get_image_plane_at_pos(int cIdx, int xpos,int ypos) const { - int stride = get_image_stride(cIdx); + ptrdiff_t stride = get_image_stride(cIdx); return pixelscIdx + xpos + ypos*stride; } void* get_image_plane_at_pos_any_depth(int cIdx, int xpos,int ypos) { - int stride = get_image_stride(cIdx); + ptrdiff_t stride = get_image_stride(cIdx); return pixelscIdx + ((xpos + ypos*stride) << bpp_shiftcIdx); } const void* get_image_plane_at_pos_any_depth(int cIdx, int xpos,int ypos) const { - int stride = get_image_stride(cIdx); + ptrdiff_t stride = get_image_stride(cIdx); return pixelscIdx + ((xpos + ypos*stride) << bpp_shiftcIdx); } /* Number of pixels in one row (not number of bytes). 
*/ - int get_image_stride(int cIdx) const + ptrdiff_t get_image_stride(int cIdx) const { if (cIdx==0) return stride; else return chroma_stride; } - int get_luma_stride() const { return stride; } - int get_chroma_stride() const { return chroma_stride; } + ptrdiff_t get_luma_stride() const { return stride; } + ptrdiff_t get_chroma_stride() const { return chroma_stride; } int get_width (int cIdx=0) const { return cIdx==0 ? width : chroma_width; } int get_height(int cIdx=0) const { return cIdx==0 ? height : chroma_height; } @@ -333,7 +334,7 @@ int width = 0, height = 0; // size in luma pixels int chroma_width = 0, chroma_height = 0; - int stride = 0, chroma_stride = 0; + ptrdiff_t stride = 0, chroma_stride = 0; public: uint8_t BitDepth_Y = 0, BitDepth_C = 0;
libde265-1.0.17.tar.gz/libde265/intrapred.cc -> libde265-1.1.1.tar.gz/libde265/intrapred.cc
Changed
@@ -21,6 +21,7 @@ #include "intrapred.h" #include "transform.h" #include "util.h" +#include "decctx.h" #include <assert.h> @@ -292,12 +293,14 @@ } + const acceleration_functions& acceleration = img->decctx->acceleration; + switch (intraPredMode) { case INTRA_PLANAR: - intra_prediction_planar(dst,dstStride, nT,cIdx, border_pixels); + acceleration.intra_pred_planar<pixel_t>(dst,dstStride, nT,cIdx, border_pixels); break; case INTRA_DC: - intra_prediction_DC(dst,dstStride, nT,cIdx, border_pixels); + acceleration.intra_pred_dc<pixel_t>(dst,dstStride, nT,cIdx, border_pixels); break; default: { @@ -306,8 +309,8 @@ (img->get_sps().range_extension.implicit_rdpcm_enabled_flag && img->get_cu_transquant_bypass(xB0,yB0)); - intra_prediction_angular(dst,dstStride, bit_depth,disableIntraBoundaryFilter, - xB0,yB0,intraPredMode,nT,cIdx, border_pixels); + acceleration.intra_pred_angular<pixel_t>(dst,dstStride, bit_depth,disableIntraBoundaryFilter, + xB0,yB0,intraPredMode,nT,cIdx, border_pixels); } break; }
libde265-1.0.17.tar.gz/libde265/intrapred.h -> libde265-1.1.1.tar.gz/libde265/intrapred.h
Changed
@@ -195,9 +195,8 @@
   if (intraPredMode==INTRA_DC || nT==4) {
     filterFlag = 0;
   } else {
-    // int-cast below prevents a typing problem that leads to wrong results when abs_value is a macro
-    int minDistVerHor = libde265_min( abs_value((int)intraPredMode-26),
-                                      abs_value((int)intraPredMode-10) );
+    int minDistVerHor = std::min( std::abs((int)intraPredMode-26),
+                                  std::abs((int)intraPredMode-10) );

     //printf("mindist: %d\n",minDistVerHor);

@@ -217,8 +216,8 @@
       int biIntFlag = (sps.strong_intra_smoothing_enable_flag &&
                        cIdx==0 &&
                        nT==32 &&
-                       abs_value(p[0]+p[ 64]-2*p[ 32]) < (1<<(sps.bit_depth_luma-5)) &&
-                       abs_value(p[0]+p[-64]-2*p[-32]) < (1<<(sps.bit_depth_luma-5)))
+                       std::abs(p[0]+p[ 64]-2*p[ 32]) < (1<<(sps.bit_depth_luma-5)) &&
+                       std::abs(p[0]+p[-64]-2*p[-32]) < (1<<(sps.bit_depth_luma-5)))
         ? 1 : 0;

       pixel_t  pF_mem[4*32+1];
@@ -491,17 +490,17 @@
     int topleftCTBSlice = availableTopLeft ? img->get_SliceAddrRS(xLeftCtb, yTopCtb) : -1;

     /*
-    printf("size: %d\n",pps->TileIdRS.size());
+    printf("size: %d\n",pps->scan->TileIdRS.size());
     printf("curr: %d left: %d top: %d\n",
            xCurrCtb+yCurrCtb*picWidthInCtbs,
            availableLeft ? xLeftCtb+yCurrCtb*picWidthInCtbs : 9999,
            availableTop ? xCurrCtb+yTopCtb*picWidthInCtbs : 9999);
     */
-    uint32_t currCTBTileID = pps->TileIdRS[xCurrCtb+yCurrCtb*picWidthInCtbs];
-    uint32_t leftCTBTileID = availableLeft ? pps->TileIdRS[xLeftCtb+yCurrCtb*picWidthInCtbs] : UINT32_MAX;
-    uint32_t topCTBTileID = availableTop ? pps->TileIdRS[xCurrCtb+yTopCtb*picWidthInCtbs] : UINT32_MAX;
-    uint32_t topleftCTBTileID = availableTopLeft ? pps->TileIdRS[xLeftCtb+yTopCtb*picWidthInCtbs] : UINT32_MAX;
-    uint32_t toprightCTBTileID= availableTopRight? pps->TileIdRS[xRightCtb+yTopCtb*picWidthInCtbs] : UINT32_MAX;
+    uint32_t currCTBTileID = pps->scan->TileIdRS[xCurrCtb+yCurrCtb*picWidthInCtbs];
+    uint32_t leftCTBTileID = availableLeft ? pps->scan->TileIdRS[xLeftCtb+yCurrCtb*picWidthInCtbs] : UINT32_MAX;
+    uint32_t topCTBTileID = availableTop ? pps->scan->TileIdRS[xCurrCtb+yTopCtb*picWidthInCtbs] : UINT32_MAX;
+    uint32_t topleftCTBTileID = availableTopLeft ? pps->scan->TileIdRS[xLeftCtb+yTopCtb*picWidthInCtbs] : UINT32_MAX;
+    uint32_t toprightCTBTileID= availableTopRight? pps->scan->TileIdRS[xRightCtb+yTopCtb*picWidthInCtbs] : UINT32_MAX;

     if (leftCTBSlice != currCTBSlice || leftCTBTileID != currCTBTileID ) availableLeft = false;
     if (topCTBSlice != currCTBSlice || topCTBTileID != currCTBTileID ) availableTop = false;
@@ -533,14 +532,14 @@
     assert(nT<=32);

     pixel_t* image;
-    int stride;
+    ptrdiff_t stride;

     image = (pixel_t*)img->get_image_plane(cIdx);
     stride = img->get_image_stride(cIdx);

     int xBLuma = xB * SubWidth;
     int yBLuma = yB * SubHeight;
-    int currBlockAddr = pps->MinTbAddrZS[ (xBLuma>>sps->Log2MinTrafoSize) +
+    int currBlockAddr = pps->scan->MinTbAddrZS[ (xBLuma>>sps->Log2MinTrafoSize) +
                                           (yBLuma>>sps->Log2MinTrafoSize) * sps->PicWidthInTbsY ];


@@ -549,7 +548,7 @@
       for (int y=nBottom-1 ; y>=0 ; y-=4)
         if (availableLeft)
           {
-            int NBlockAddr = pps->MinTbAddrZS[ (((xB-1)*SubWidth )>>sps->Log2MinTrafoSize) +
+            int NBlockAddr = pps->scan->MinTbAddrZS[ (((xB-1)*SubWidth )>>sps->Log2MinTrafoSize) +
                                                (((yB+y)*SubHeight)>>sps->Log2MinTrafoSize)
                                                * sps->PicWidthInTbsY ];

@@ -576,7 +575,7 @@

       if (availableTopLeft)
         {
-          int NBlockAddr = pps->MinTbAddrZS[ (((xB-1)*SubWidth )>>sps->Log2MinTrafoSize) +
+          int NBlockAddr = pps->scan->MinTbAddrZS[ (((xB-1)*SubWidth )>>sps->Log2MinTrafoSize) +
                                              (((yB-1)*SubHeight)>>sps->Log2MinTrafoSize)
                                              * sps->PicWidthInTbsY ];

@@ -606,7 +605,7 @@

         if (borderAvailable)
           {
-            int NBlockAddr = pps->MinTbAddrZS[ (((xB+x)*SubWidth )>>sps->Log2MinTrafoSize) +
+            int NBlockAddr = pps->scan->MinTbAddrZS[ (((xB+x)*SubWidth )>>sps->Log2MinTrafoSize) +
                                                (((yB-1)*SubHeight)>>sps->Log2MinTrafoSize)
                                                * sps->PicWidthInTbsY ];

libde265-1.0.17.tar.gz/libde265/motion.cc -> libde265-1.1.1.tar.gz/libde265/motion.cc
Changed
@@ -50,7 +50,7 @@
                  const seq_parameter_set* sps,
                  int mv_x, int mv_y, int xP,int yP,
                  int16_t* out, int out_stride,
-                 const pixel_t* ref, int ref_stride,
+                 const pixel_t* ref, ptrdiff_t ref_stride,
                  int nPbW, int nPbH, int bitDepth_L)
 {
   int xFracL = mv_x & 3;
@@ -63,7 +63,7 @@

   //const int shift1 = sps->BitDepth_Y-8;
   //const int shift2 = 6;
-  const int shift3 = 14 - sps->BitDepth_Y;
+  const int shift3 = std::max(2, 14 - sps->BitDepth_Y);

   int w = sps->pic_width_in_luma_samples;
   int h = sps->pic_height_in_luma_samples;
@@ -129,7 +129,7 @@
     pixel_t padbuf[(MAX_CU_SIZE+16)*(MAX_CU_SIZE+7)];

     const pixel_t* src_ptr;
-    int src_stride;
+    ptrdiff_t src_stride;

     if (-extra_left + xIntOffsL >= 0 &&
         -extra_top + yIntOffsL >= 0 &&
@@ -181,14 +181,14 @@
                    int mv_x, int mv_y,
                    int xP,int yP,
                    int16_t* out, int out_stride,
-                   const pixel_t* ref, int ref_stride,
+                   const pixel_t* ref, ptrdiff_t ref_stride,
                    int nPbWC, int nPbHC, int bit_depth_C)
 {
   // chroma sample interpolation process (8.5.3.2.2.2)

   //const int shift1 = sps->BitDepth_C-8;
   //const int shift2 = 6;
-  const int shift3 = 14 - sps->BitDepth_C;
+  const int shift3 = std::max(2, 14 - sps->BitDepth_C);

   int wC = sps->pic_width_in_luma_samples /sps->SubWidthC;
   int hC = sps->pic_height_in_luma_samples/sps->SubHeightC;
@@ -227,7 +227,7 @@
   pixel_t padbuf[(MAX_CU_SIZE+16)*(MAX_CU_SIZE+3)];

   const pixel_t* src_ptr;
-  int src_stride;
+  ptrdiff_t src_stride;

   int extra_top = 1;
   int extra_left = 1;
@@ -453,9 +453,9 @@

   // weighted sample prediction (8.5.3.2.3)

-  const int shift1_L = libde265_max(2,14-sps->BitDepth_Y);
+  const int shift1_L = std::max(2,14-sps->BitDepth_Y);
   const int offset_shift1_L = img->get_sps().WpOffsetBdShiftY;
-  const int shift1_C = libde265_max(2,14-sps->BitDepth_C);
+  const int shift1_C = std::max(2,14-sps->BitDepth_C);
   const int offset_shift1_C = img->get_sps().WpOffsetBdShiftC;

   /*
@@ -1074,7 +1074,7 @@
     numRefIdx = shdr->num_ref_idx_l0_active;
   }
   else {
-    numRefIdx = libde265_min(shdr->num_ref_idx_l0_active,
+    numRefIdx = std::min(shdr->num_ref_idx_l0_active,
                              shdr->num_ref_idx_l1_active);
   }

@@ -1128,12 +1128,12 @@
     return false;
   }
   else {
-    int tx = (16384 + (abs_value(td)>>1)) / td;
+    int tx = (16384 + (std::abs(td)>>1)) / td;
     int distScaleFactor = Clip3(-4096,4095, (tb*tx+32)>>6);
     out_mv->x = Clip3(-32768,32767,
-                      Sign(distScaleFactor*mv.x)*((abs_value(distScaleFactor*mv.x)+127)>>8));
+                      Sign(distScaleFactor*mv.x)*((std::abs(distScaleFactor*mv.x)+127)>>8));
     out_mv->y = Clip3(-32768,32767,
-                      Sign(distScaleFactor*mv.y)*((abs_value(distScaleFactor*mv.y)+127)>>8));
+                      Sign(distScaleFactor*mv.y)*((std::abs(distScaleFactor*mv.y)+127)>>8));
     return true;
   }
 }
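The last hunk of motion.cc touches the motion-vector scaling arithmetic. As a reading aid, here is a minimal standalone sketch of that computation. The names `clip3`, `sign`, `mv_sketch` and `scale_mv_sketch` are illustrative stand-ins and not libde265 identifiers, and the `td == 0` guard is an assumption, since the condition preceding `return false;` is not visible in the diff.

```cpp
#include <cstdint>
#include <cstdlib>

// Stand-ins for the Clip3() and Sign() helpers used in the diff (assumed semantics).
static int clip3(int lo, int hi, int v) { return v < lo ? lo : (v > hi ? hi : v); }
static int sign(int v) { return (v > 0) - (v < 0); }

struct mv_sketch { int16_t x, y; };

// Scales 'mv' by the ratio tb/td using the fixed-point steps shown in the diff.
// Returns false for td == 0, where the division would be undefined (assumed guard).
bool scale_mv_sketch(mv_sketch* out, mv_sketch mv, int tb, int td)
{
  if (td == 0) return false;

  // tx approximates 16384/td with rounding; distScaleFactor is tb/td in 1/256 units.
  int tx = (16384 + (std::abs(td) >> 1)) / td;
  int distScaleFactor = clip3(-4096, 4095, (tb * tx + 32) >> 6);

  int sx = distScaleFactor * mv.x;
  int sy = distScaleFactor * mv.y;

  // Round the magnitude, then reapply the sign, so rounding is symmetric around zero.
  out->x = static_cast<int16_t>(clip3(-32768, 32767, sign(sx) * ((std::abs(sx) + 127) >> 8)));
  out->y = static_cast<int16_t>(clip3(-32768, 32767, sign(sy) * ((std::abs(sy) + 127) >> 8)));
  return true;
}
```

With `tb == td` the vector passes through unchanged, and with `tb == 2*td` both components double.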
libde265-1.0.17.tar.gz/libde265/nal-parser.cc -> libde265-1.1.1.tar.gz/libde265/nal-parser.cc
Changed
@@ -24,6 +24,8 @@
 #include <assert.h>
 #include <stdlib.h>
 #include <stdio.h>
+#include <stdint.h>
+#include <limits.h>

 #ifdef HAVE_CONFIG_H
 #include "config.h"
@@ -55,7 +57,21 @@
 LIBDE265_CHECK_RESULT bool NAL_unit::resize(int new_size)
 {
   if (capacity < new_size) {
-    unsigned char* newbuffer = static_cast<unsigned char*>(malloc(new_size));
+    // Grow the buffer geometrically (1.5x) rather than to the exact requested
+    // size. NAL_Parser::push_data() appends to the pending NAL one input chunk
+    // at a time, increasing the request by a roughly constant amount each call.
+    // With exact-size allocation every chunk would reallocate and copy the
+    // whole accumulated buffer (O(n^2) for a single oversized NAL); spare
+    // capacity amortizes the total copying to O(n). Here new_size > capacity >= 0,
+    // so the 1.5x term is computed in 64 bits and only used when it both exceeds
+    // the request and still fits in 'int'.
+    int alloc_size = new_size;
+    int64_t grow = static_cast<int64_t>(capacity) + capacity / 2;
+    if (grow > new_size && grow <= INT_MAX) {
+      alloc_size = static_cast<int>(grow);
+    }
+
+    unsigned char* newbuffer = static_cast<unsigned char*>(malloc(alloc_size));
     if (newbuffer == nullptr) {
       return false;
     }
@@ -66,7 +82,7 @@
     }

     nal_data = newbuffer;
-    capacity = new_size;
+    capacity = alloc_size;
   }
   return true;
 }
@@ -113,38 +129,36 @@

 void NAL_unit::remove_stuffing_bytes()
 {
-  uint8_t* p = data();
-
-  for (int i=0;i<size()-2;i++)
-    {
-#if 0
-      for (int k=i;k<i+64;k++)
-        if (i*0+k<size()) {
-          printf("%c%02x", (k==i) ? '[':' ', data()[k]);
-        }
-      printf("\n");
-#endif
-
-      if (p[2]!=3 && p[2]!=0) {
-        // fast forward 3 bytes (2+1)
-        p+=2;
-        i+=2;
-      }
-      else {
-        if (p[0]==0 && p[1]==0 && p[2]==3) {
-          //printf("SKIP NAL @ %d\n",i+2+num_skipped_bytes);
-          insert_skipped_byte(i+2 + num_skipped_bytes());
-
-          memmove(p+2, p+3, size()-i-3);
-          set_size(size()-1);
+  // Remove emulation-prevention bytes: every 0x03 that immediately follows two
+  // 0x00 bytes is dropped (and the zero-run reset, so 00 00 03 03 keeps the
+  // trailing 03). This is done in a single in-place forward-compaction pass in
+  // O(n) time. A previous implementation called memmove() on the remaining tail
+  // for each removed byte, which is O(n^2) and can be abused by a payload that
+  // is densely packed with 00 00 03 triplets.
+
+  uint8_t* d = data();
+  const int n = size();
+
+  int w = 0; // write position == length of the compacted output so far
+  int zeros = 0; // number of consecutive 0x00 bytes already written to output
+
+  for (int r=0; r<n; r++) {
+    uint8_t b = d[r];
+
+    if (zeros >= 2 && b == 3) {
+      // 'r' is the position of this byte in the original (uncompacted) NAL,
+      // which equals (compacted position) + num_skipped_bytes() — the value the
+      // previous memmove-based code recorded here.
+      insert_skipped_byte(r);
+      zeros = 0;
+      continue;
+    }

-          p++;
-          i++;
-        }
-      }
+    d[w++] = b;
+    zeros = (b == 0) ? zeros + 1 : 0;
+  }

-      p++;
-    }
+  set_size(w);
 }


@@ -321,6 +335,14 @@
     }
 #endif

+    // enforce the maximum NAL size: drop an oversized NAL and resync
+    if (!nal_size_within_limit(out - nal->data())) {
+      free_NAL_unit(pending_input_NAL);
+      pending_input_NAL = nullptr;
+      input_push_state = 0;
+      return DE265_ERROR_NAL_SIZE_EXCEEDS_SECURITY_LIMIT;
+    }
+
     nal->set_size(out - nal->data());;

     // push this NAL decoder queue
@@ -355,6 +377,18 @@
   }

   nal->set_size(out - nal->data());
+
+  // Enforce the maximum NAL size on the still-incomplete pending NAL. This bounds
+  // memory when a single NAL grows across many push_data() calls without ever
+  // reaching a start code. The oversized pending NAL is dropped and the parser
+  // resyncs at the next start code.
+  if (!nal_size_within_limit(nal->size())) {
+    free_NAL_unit(pending_input_NAL);
+    pending_input_NAL = nullptr;
+    input_push_state = 0;
+    return DE265_ERROR_NAL_SIZE_EXCEEDS_SECURITY_LIMIT;
+  }
+
   return DE265_OK;
 }

@@ -368,6 +402,11 @@

   end_of_frame = false;

+  // enforce the maximum NAL size to bound memory usage
+  if (!nal_size_within_limit(len)) {
+    return DE265_ERROR_NAL_SIZE_EXCEEDS_SECURITY_LIMIT;
+  }
+
   NAL_unit* nal = alloc_NAL_unit(len);
   if (nal == nullptr || !nal->set_data(data, len)) {
     free_NAL_unit(nal);
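The comment in remove_stuffing_bytes() describes a single forward-compaction pass. The following standalone sketch shows the same idea on a `std::vector`. The function name `strip_emulation_prevention` and the returned list of skipped positions are assumptions made for illustration; the library records positions through `insert_skipped_byte()` instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Drops every 0x03 that directly follows two 0x00 bytes, compacting in place.
// Returns the positions (in the original buffer) of the dropped bytes.
std::vector<std::size_t> strip_emulation_prevention(std::vector<uint8_t>& buf)
{
  std::vector<std::size_t> skipped;
  std::size_t w = 0;  // write position == length of the compacted output so far
  int zeros = 0;      // consecutive 0x00 bytes already written to the output

  for (std::size_t r = 0; r < buf.size(); r++) {
    uint8_t b = buf[r];

    if (zeros >= 2 && b == 3) {
      skipped.push_back(r);
      zeros = 0;       // reset the run so that 00 00 03 03 keeps the second 03
      continue;
    }

    buf[w++] = b;      // w <= r always holds, so unread input is never overwritten
    zeros = (b == 0) ? zeros + 1 : 0;
  }

  buf.resize(w);
  return skipped;
}
```

Each input byte is read once and written at most once, so the cost stays linear even when the payload consists almost entirely of 00 00 03 triplets.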
libde265-1.0.17.tar.gz/libde265/nal-parser.h -> libde265-1.1.1.tar.gz/libde265/nal-parser.h
Changed
@@ -25,6 +25,7 @@
 #include "libde265/pps.h"
 #include "libde265/nal.h"
 #include "libde265/util.h"
+#include "libde265/de265.h"

 #include <vector>
 #include <queue>
@@ -90,6 +91,10 @@
   NAL_Parser();
   ~NAL_Parser();

+  // Point the parser at the live security limits struct so that runtime
+  // changes (via de265_get_security_limits()) take effect immediately.
+  void set_security_limits(const de265_security_limits* limits) { m_security_limits = limits; }
+
   de265_error push_data(const unsigned char* data, int len,
                         de265_PTS pts, void* user_data = nullptr);

@@ -134,6 +139,8 @@

   NAL_unit* pending_input_NAL = nullptr;

+  const de265_security_limits* m_security_limits = nullptr;
+

   // NAL level

@@ -142,6 +149,14 @@

   void push_to_NAL_queue(NAL_unit*);

+  // Returns true if a NAL unit of the given size is within the configured
+  // security limit (or if no limit is set).
+  bool nal_size_within_limit(int64_t nal_size) const {
+    return m_security_limits == nullptr ||
+           m_security_limits->max_NAL_size_bytes == 0 ||
+           nal_size <= m_security_limits->max_NAL_size_bytes;
+  }
+

   // pool of unused NAL memory
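The header stores a pointer to the limits struct rather than a copy, so a later change to the limit is seen by the next check. A minimal sketch of that pattern follows. `limits_sketch` and `parser_sketch` are hypothetical stand-ins for `de265_security_limits` and `NAL_Parser`, and the `int64_t` field type is an assumption; only the field name `max_NAL_size_bytes` is taken from the diff.

```cpp
#include <cstdint>

// Hypothetical stand-in for de265_security_limits; 0 means "no limit".
struct limits_sketch {
  int64_t max_NAL_size_bytes;
};

class parser_sketch {
public:
  // Keep a pointer, not a copy, so later edits to *limits are observed.
  void set_security_limits(const limits_sketch* limits) { m_limits = limits; }

  // True if no limits are attached, the limit is disabled (0), or the size fits.
  bool nal_size_within_limit(int64_t nal_size) const {
    return m_limits == nullptr ||
           m_limits->max_NAL_size_bytes == 0 ||
           nal_size <= m_limits->max_NAL_size_bytes;
  }

private:
  const limits_sketch* m_limits = nullptr;
};
```

The trade-off of this design is that the limits object must outlive the parser that points at it.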
libde265-1.0.17.tar.gz/libde265/pps.cc -> libde265-1.1.1.tar.gz/libde265/pps.cc
Changed
@@ -25,6 +25,9 @@
 #include <assert.h>
 #include <stdlib.h>
 #include <string.h>
+#include <mutex>
+#include <memory>
+#include <atomic>
 #if defined(_MSC_VER) || defined(__MINGW32__)
 # include <malloc.h>
 #elif defined(HAVE_ALLOCA_H)
@@ -62,15 +65,19 @@
   }

   cross_component_prediction_enabled_flag = br->get_bits(1);
+  // shall be 0 when ChromaArrayType is not 3 (Sec. 7.4.3.3.2)
   if (sps->ChromaArrayType != CHROMA_444 &&
       cross_component_prediction_enabled_flag) {
     ctx->add_warning(DE265_WARNING_PPS_HEADER_INVALID, false);
+    return false;
   }

   chroma_qp_offset_list_enabled_flag = br->get_bits(1);
+  // shall be 0 when ChromaArrayType is 0 (mono) (Sec. 7.4.3.3.2)
   if (sps->ChromaArrayType == CHROMA_MONO &&
       chroma_qp_offset_list_enabled_flag) {
     ctx->add_warning(DE265_WARNING_PPS_HEADER_INVALID, false);
+    return false;
   }

   if (chroma_qp_offset_list_enabled_flag) {
@@ -118,7 +125,7 @@

   uvlc = br->get_uvlc();
   if (uvlc == UVLC_ERROR ||
-      uvlc > static_cast<uint32_t>(libde265_max(0, sps->BitDepth_Y-10))) {
+      uvlc > static_cast<uint32_t>(std::max(0, sps->BitDepth_Y-10))) {
     ctx->add_warning(DE265_WARNING_PPS_HEADER_INVALID, false);
     return false;
   }
@@ -127,7 +134,7 @@

   uvlc = br->get_uvlc();
   if (uvlc == UVLC_ERROR ||
-      uvlc > static_cast<uint32_t>(libde265_max(0, sps->BitDepth_C-10))) {
+      uvlc > static_cast<uint32_t>(std::max(0, sps->BitDepth_C-10))) {
     ctx->add_warning(DE265_WARNING_PPS_HEADER_INVALID, false);
     return false;
   }
@@ -231,11 +238,7 @@
   for (int i=0;i<=DE265_MAX_TILE_COLUMNS;i++) { colBd[i]=0; }
   for (int i=0;i<=DE265_MAX_TILE_ROWS;i++) { rowBd[i]=0; }

-  CtbAddrRStoTS.clear();
-  CtbAddrTStoRS.clear();
-  TileId.clear();
-  TileIdRS.clear();
-  MinTbAddrZS.clear();
+  scan.reset();

   Log2MinCuQpDeltaSize = 0;

@@ -260,6 +263,8 @@
   pps_range_extension_flag = 0;
   pps_multilayer_extension_flag = 0;
   pps_extension_6bits = 0;
+
+  range_extension.reset();
 }


@@ -312,7 +317,9 @@
 {
   int32_t svlc;

-  if ((svlc = br->get_svlc()) == SVLC_ERROR) {
+  // init_qp_minus26 shall be in [-(26 + QpBdOffset_Y), +25] (Sec. 7.4.3.3.1)
+  if ((svlc = br->get_svlc()) == SVLC_ERROR ||
+      svlc < -(26 + sps->QpBdOffset_Y) || svlc > 25) {
     ctx->add_warning(DE265_WARNING_PPS_HEADER_INVALID, false);
     return false;
   }
@@ -365,18 +372,20 @@

   if (tiles_enabled_flag) {
     if ((uvlc = br->get_uvlc()) == UVLC_ERROR ||
-        uvlc+1 > DE265_MAX_TILE_COLUMNS) {
+        uvlc + 1 > DE265_MAX_TILE_COLUMNS ||
+        uvlc + 1 > sps->PicWidthInCtbsY) {
       ctx->add_warning(DE265_WARNING_PPS_HEADER_INVALID, false);
       return false;
     }
-    num_tile_columns = uvlc+1;
+    num_tile_columns = uvlc + 1;

     if ((uvlc = br->get_uvlc()) == UVLC_ERROR ||
-        uvlc+1 > DE265_MAX_TILE_ROWS) {
+        uvlc + 1 > DE265_MAX_TILE_ROWS ||
+        uvlc + 1 > sps->PicHeightInCtbsY) {
       ctx->add_warning(DE265_WARNING_PPS_HEADER_INVALID, false);
       return false;
     }
-    num_tile_rows = uvlc+1;
+    num_tile_rows = uvlc + 1;

     uniform_spacing_flag = br->get_bits(1);

@@ -384,30 +393,29 @@
       uint16_t lastColumnWidth = sps->PicWidthInCtbsY;
       uint16_t lastRowHeight = sps->PicHeightInCtbsY;

-      for (int i=0; i<num_tile_columns-1; i++)
-        {
-          if ((uvlc = br->get_uvlc()) == UVLC_ERROR ||
-              uvlc >= lastColumnWidth) {
-            ctx->add_warning(DE265_WARNING_PPS_HEADER_INVALID, false);
-            return false;
-          }
-          colWidth[i] = uvlc+1;
-
-          lastColumnWidth -= colWidth[i];
+      for (int i = 0; i < num_tile_columns - 1; i++) {
+        if ((uvlc = br->get_uvlc()) == UVLC_ERROR ||
+            uvlc + 1 >= lastColumnWidth) {
+          ctx->add_warning(DE265_WARNING_PPS_HEADER_INVALID, false);
+          return false;
         }

-      colWidth[num_tile_columns-1] = lastColumnWidth;
+        colWidth[i] = uvlc + 1;

-      for (int i=0; i<num_tile_rows-1; i++)
-        {
-          if ((uvlc = br->get_uvlc()) == UVLC_ERROR ||
-              uvlc >= lastRowHeight) {
-            ctx->add_warning(DE265_WARNING_PPS_HEADER_INVALID, false);
-            return false;
-          }
-          rowHeight[i] = uvlc+1;
-          lastRowHeight -= rowHeight[i];
+        lastColumnWidth -= colWidth[i];
+      }
+
+      colWidth[num_tile_columns - 1] = lastColumnWidth;
+
+      for (int i = 0; i < num_tile_rows - 1; i++) {
+        if ((uvlc = br->get_uvlc()) == UVLC_ERROR ||
+            uvlc + 1 >= lastRowHeight) {
+          ctx->add_warning(DE265_WARNING_PPS_HEADER_INVALID, false);
+          return false;
         }
+        rowHeight[i] = uvlc + 1;
+        lastRowHeight -= rowHeight[i];
+      }

       rowHeight[num_tile_rows-1] = lastRowHeight;

@@ -481,7 +489,7 @@
     }
   }
   else {
-    memcpy(&scaling_list, &sps->scaling_list, sizeof(scaling_list_data));
+    scaling_list = sps->scaling_list;
   }


@@ -514,16 +522,12 @@
       }
     }

-    //assert(false);
-    /*
-    while( more_rbsp_data() )
-
-    pps_extension_data_flag
-    u(1)
-    rbsp_trailing_bits()
-
-    }
-    */
+    // Multilayer extension and the 6 reserved extension bits would carry
+    // additional payload that we do not parse. Reject the stream.
+    if (pps_multilayer_extension_flag || pps_extension_6bits) {
+      ctx->add_warning(DE265_ERROR_NOT_IMPLEMENTED_YET, false);
+      return false;
+    }
   }


@@ -535,6 +539,153 @@
 }


+//----------------------------------------------------------------------------
+// Library-scope cache for the geometry-derived scan tables (HEVC Sec. 6.5).
+//
+// The tables depend only on the picture/tile geometry. Many independent decoder
+// contexts (e.g. libheif tile grids) decode images of the same geometry, so we
+// compute the tables once and share them read-only via shared_ptr. A small LRU
+// cache (a few distinct geometries) protected by a mutex serves concurrent
+// decoders. The compute is done while holding the lock on purpose: a burst of
+// contexts with the same new geometry then computes the tables exactly once
+// (the others block briefly and pick up the cached result).
+//----------------------------------------------------------------------------
+
+namespace {
+
+struct pps_scan_key {
+  uint8_t log2CtbSize;
+  uint8_t log2MinTrafo;
+  uint16_t picWidthInCtbs, picHeightInCtbs;
+  uint16_t picWidthInTbs, picHeightInTbs;
+  uint32_t picSizeInCtbs, picSizeInTbs;
+  uint16_t numTileCols, numTileRows;
+  uint16_t colBd[DE265_MAX_TILE_COLUMNS+1];
+  uint16_t rowBd[DE265_MAX_TILE_ROWS+1];
+
+  bool operator==(const pps_scan_key& o) const {
+    if (log2CtbSize != o.log2CtbSize || log2MinTrafo != o.log2MinTrafo ||
+        picWidthInCtbs != o.picWidthInCtbs || picHeightInCtbs!= o.picHeightInCtbs||
+        picWidthInTbs != o.picWidthInTbs || picHeightInTbs != o.picHeightInTbs ||
+        picSizeInCtbs != o.picSizeInCtbs || picSizeInTbs != o.picSizeInTbs ||
+        numTileCols != o.numTileCols || numTileRows != o.numTileRows) return false;
+    for (int i=0;i<=numTileCols;i++) if (colBd[i]!=o.colBd[i]) return false;
+    for (int i=0;i<=numTileRows;i++) if (rowBd[i]!=o.rowBd[i]) return false;
+    return true;
+  }
+};
+
+// Build the five scan tables from the geometry key (HEVC 6.5.1 + 6.5.2).
+std::shared_ptr<const pps_scan_tables> compute_scan_tables(const pps_scan_key& k)
+{
+  std::shared_ptr<pps_scan_tables> t = std::make_shared<pps_scan_tables>();
+  t->CtbAddrRStoTS.resize(k.picSizeInCtbs);
+  t->CtbAddrTStoRS.resize(k.picSizeInCtbs);
+  t->TileId .resize(k.picSizeInCtbs);
+  t->TileIdRS .resize(k.picSizeInCtbs);
+  t->MinTbAddrZS .resize(k.picSizeInTbs);
+
+  // 6.5.1 raster (RS) <-> tile scan (TS) conversion + tile-ID assignment.
+  uint32_t ctbAddrTS = 0;
+  uint32_t tIdx = 0;
+  for (int tileY=0; tileY<k.numTileRows; tileY++) {
+    for (int tileX=0; tileX<k.numTileCols; tileX++) {
+      for (int y=k.rowBd[tileY]; y<k.rowBd[tileY+1]; y++) {
+        for (int x=k.colBd[tileX]; x<k.colBd[tileX+1]; x++) {
+          uint32_t ctbAddrRS = y * k.picWidthInCtbs + x;
+          t->CtbAddrRStoTS[ctbAddrRS] = ctbAddrTS;
+          t->CtbAddrTStoRS[ctbAddrTS] = ctbAddrRS;
+          t->TileId [ctbAddrTS] = tIdx;
+          t->TileIdRS[ctbAddrRS] = tIdx;
+          ctbAddrTS++;
+        }
+      }
+      tIdx++;
+    }
+  }
+  assert(ctbAddrTS == k.picSizeInCtbs);
+
+  // 6.5.2 Z-scan order array initialization process.
+  const int shift = k.log2CtbSize - k.log2MinTrafo;
+  for (int y=0; y<k.picHeightInTbs; y++)
+    for (int x=0; x<k.picWidthInTbs; x++) {
+      int tbX = (x<<k.log2MinTrafo)>>k.log2CtbSize;
+      int tbY = (y<<k.log2MinTrafo)>>k.log2CtbSize;
+      int ctbAddrRS = k.picWidthInCtbs*tbY + tbX;
+
+      uint32_t v = t->CtbAddrRStoTS[ctbAddrRS] << (shift*2);
+      int p=0;
+      for (int i=0;i<shift;i++) {
+        int m=1<<i;
+        p += (m & x ? m*m : 0) + (m & y ? 2*m*m : 0);
+      }
+      t->MinTbAddrZS[x + y*k.picWidthInTbs] = v + p;
+    }
+
+  return t;
+}
+
+class pps_scan_cache {
+public:
+  std::shared_ptr<const pps_scan_tables> get(const pps_scan_key& key) {
+    std::lock_guard<std::mutex> lock(mMutex);
+
+    for (size_t i=0; i<mEntries.size(); i++) {
+      if (mEntries[i].key == key) {
+        std::shared_ptr<const pps_scan_tables> tables = mEntries[i].tables;
+        if (i != 0) { // move-to-front (LRU)
+          Entry e = mEntries[i];
+          mEntries.erase(mEntries.begin()+i);
+          mEntries.insert(mEntries.begin(), e);
+        }
+        return tables;
+      }
+    }
+
+    // Miss: compute while holding the lock so that a burst of concurrent decoders
+    // with the same new geometry computes the tables exactly once.
+    std::shared_ptr<const pps_scan_tables> tables = compute_scan_tables(key);
+    mEntries.insert(mEntries.begin(), Entry{key, tables});
+    if (mEntries.size() > kMaxEntries) mEntries.pop_back(); // evict LRU
+    return tables;
+  }
+
+private:
+  static const size_t kMaxEntries = 3;
+  struct Entry { pps_scan_key key; std::shared_ptr<const pps_scan_tables> tables; };
+  std::mutex mMutex;
+  std::vector<Entry> mEntries;
+};
+
+// Owned by the de265_init()/de265_free() lifecycle (see de265.cc). It is created
+// and destroyed (under de265's init mutex) while no decoder is running, so it is
+// read locklessly during decoding; the cache's own mutex guards concurrent get()
+// calls. Atomic so the publish/read of the pointer is well-defined.
+std::atomic<pps_scan_cache*> g_pps_scan_cache{nullptr};
+
+std::shared_ptr<const pps_scan_tables> get_pps_scan_tables(const pps_scan_key& key)
+{
+  pps_scan_cache* cache = g_pps_scan_cache.load(std::memory_order_acquire);
+  if (cache) return cache->get(key);
+  return compute_scan_tables(key); // library not initialized: compute without caching
+}
+
+} // namespace
+
+
+void pps_scan_cache_init()
+{
+  if (!g_pps_scan_cache.load(std::memory_order_relaxed)) {
+    g_pps_scan_cache.store(new pps_scan_cache(), std::memory_order_release);
+  }
+}
+
+void pps_scan_cache_free()
+{
+  delete g_pps_scan_cache.exchange(nullptr, std::memory_order_acq_rel);
+}
+
+
 void pic_parameter_set::set_derived_values(const seq_parameter_set* sps)
 {
   Log2MinCuQpDeltaSize = sps->Log2CtbSizeY - diff_cu_qp_delta_depth;
@@ -582,143 +733,28 @@



-  // alloc raster scan arrays
-
-  CtbAddrRStoTS.resize(sps->PicSizeInCtbsY);
-  CtbAddrTStoRS.resize(sps->PicSizeInCtbsY);
-  TileId .resize(sps->PicSizeInCtbsY);
-  TileIdRS .resize(sps->PicSizeInCtbsY);
-  MinTbAddrZS .resize(sps->PicSizeInTbsY );
-
-
-  // raster scan (RS) <-> tile scan (TS) conversion
-
-  for (uint32_t ctbAddrRS=0 ; ctbAddrRS < sps->PicSizeInCtbsY ; ctbAddrRS++)
-    {
-      int tbX = ctbAddrRS % sps->PicWidthInCtbsY;
-      int tbY = ctbAddrRS / sps->PicWidthInCtbsY;
-      int tileX=-1,tileY=-1;
-
-      for (int i=0;i<num_tile_columns;i++)
-        if (tbX >= colBd[i])
-          tileX=i;
-
-      for (int j=0;j<num_tile_rows;j++)
-        if (tbY >= rowBd[j])
-          tileY=j;
-
-      CtbAddrRStoTS[ctbAddrRS] = 0;
-      for (int i=0;i<tileX;i++)
-        CtbAddrRStoTS[ctbAddrRS] += rowHeight[tileY]*colWidth[i];
-
-      for (int j=0;j<tileY;j++)
-        {
-          //pps->CtbAddrRStoTS[ctbAddrRS] += (tbY - pps->rowBd[tileY])*pps->colWidth[tileX];
-          //pps->CtbAddrRStoTS[ctbAddrRS] += tbX - pps->colBd[tileX];
-
-          CtbAddrRStoTS[ctbAddrRS] += sps->PicWidthInCtbsY * rowHeight[j];
-        }
-
-      assert(tileX>=0 && tileY>=0);
-
-      CtbAddrRStoTS[ctbAddrRS] += (tbY-rowBd[tileY])*colWidth[tileX];
-      CtbAddrRStoTS[ctbAddrRS] += tbX - colBd[tileX];
-
-
-      // inverse mapping
-
-      CtbAddrTStoRS[ CtbAddrRStoTS[ctbAddrRS] ] = ctbAddrRS;
-    }
-
-
-#if 0
-  logtrace(LogHeaders,"6.5.1 CtbAddrRSToTS\n");
-  for (int y=0;y<sps->PicHeightInCtbsY;y++)
-    {
-      for (int x=0;x<sps->PicWidthInCtbsY;x++)
-        {
-          logtrace(LogHeaders,"%3d ", CtbAddrRStoTS[x + y*sps->PicWidthInCtbsY]);
-        }
-
-      logtrace(LogHeaders,"\n");
-    }
-#endif
-
-  // tile id
-
-  for (int j=0, tIdx=0 ; j<num_tile_rows ; j++)
-    for (int i=0 ; i<num_tile_columns;i++)
-      {
-        for (int y=rowBd[j] ; y<rowBd[j+1] ; y++)
-          for (int x=colBd[i] ; x<colBd[i+1] ; x++) {
-            TileId [ CtbAddrRStoTS[y*sps->PicWidthInCtbsY + x] ] = tIdx;
-            TileIdRS[ y*sps->PicWidthInCtbsY + x ] = tIdx;
-
-            //logtrace(LogHeaders,"tileID[%d,%d] = %d\n",x,y,pps->TileIdRS[ y*sps->PicWidthInCtbsY + x ]);
-          }
-
-        tIdx++;
-      }
-
-#if 0
-  logtrace(LogHeaders,"Tile IDs RS:\n");
-  for (int y=0;y<sps->PicHeightInCtbsY;y++) {
-    for (int x=0;x<sps->PicWidthInCtbsY;x++) {
-      logtrace(LogHeaders,"%2d ",TileIdRS[y*sps->PicWidthInCtbsY+x]);
-    }
-    logtrace(LogHeaders,"\n");
-  }
-#endif
-
-  // 6.5.2 Z-scan order array initialization process
-
-  for (int y=0;y<sps->PicHeightInTbsY;y++)
-    for (int x=0;x<sps->PicWidthInTbsY;x++)
-      {
-        int tbX = (x<<sps->Log2MinTrafoSize)>>sps->Log2CtbSizeY;
-        int tbY = (y<<sps->Log2MinTrafoSize)>>sps->Log2CtbSizeY;
-        int ctbAddrRS = sps->PicWidthInCtbsY*tbY + tbX;
-
-        MinTbAddrZS[x + y*sps->PicWidthInTbsY] = CtbAddrRStoTS[ctbAddrRS]
-          << ((sps->Log2CtbSizeY-sps->Log2MinTrafoSize)*2);
-
-        int p=0;
-        for (int i=0 ; i<(sps->Log2CtbSizeY - sps->Log2MinTrafoSize) ; i++) {
-          int m=1<<i;
-          p += (m & x ? m*m : 0) + (m & y ? 2*m*m : 0);
-        }
-
-        MinTbAddrZS[x + y*sps->PicWidthInTbsY] += p;
-      }
-
-
-  // --- debug logging ---
-
-  /*
-  logtrace(LogHeaders,"6.5.2 Z-scan order array\n");
-  for (int y=0;y<sps->PicHeightInTbsY;y++)
-    {
-      for (int x=0;x<sps->PicWidthInTbsY;x++)
-        {
-          logtrace(LogHeaders,"%4d ", pps->MinTbAddrZS[x + y*sps->PicWidthInTbsY]);
-        }
-
-      logtrace(LogHeaders,"\n");
-    }
-
-  for (int i=0;i<sps->PicSizeInTbsY;i++)
-    {
-      for (int y=0;y<sps->PicHeightInTbsY;y++)
-        {
-          for (int x=0;x<sps->PicWidthInTbsY;x++)
-            {
-              if (pps->MinTbAddrZS[x + y*sps->PicWidthInTbsY] == i) {
-                logtrace(LogHeaders,"%d %d\n",x,y);
-              }
-            }
-        }
-    }
-  */
+  // The derived scan tables (Sec. 6.5.1 + 6.5.2) depend only on the picture/tile
+  // geometry computed above. Build the geometry key and fetch the shared tables
+  // from the library-scope cache (computing+caching them on a miss). This avoids
+  // recomputing the (potentially large) MinTbAddrZS table for every decoder
+  // context when many contexts decode images of the same geometry.
+
+  pps_scan_key key;
+  memset(&key, 0, sizeof(key)); // zero padding/unused tile entries for clean compares
+  key.log2CtbSize = sps->Log2CtbSizeY;
+  key.log2MinTrafo = sps->Log2MinTrafoSize;
+  key.picWidthInCtbs = sps->PicWidthInCtbsY;
+  key.picHeightInCtbs = sps->PicHeightInCtbsY;
+  key.picWidthInTbs = sps->PicWidthInTbsY;
+  key.picHeightInTbs = sps->PicHeightInTbsY;
+  key.picSizeInCtbs = sps->PicSizeInCtbsY;
+  key.picSizeInTbs = sps->PicSizeInTbsY;
+  key.numTileCols = num_tile_columns;
+  key.numTileRows = num_tile_rows;
+  for (int i=0;i<=num_tile_columns;i++) key.colBd[i] = colBd[i];
+  for (int i=0;i<=num_tile_rows; i++) key.rowBd[i] = rowBd[i];
+
+  scan = get_pps_scan_tables(key);
 }


@@ -731,7 +767,7 @@
   }
   out.write_uvlc(pic_parameter_set_id);

-  if (seq_parameter_set_id >= DE265_MAX_PPS_SETS) {
+  if (seq_parameter_set_id >= DE265_MAX_SPS_SETS) {
     errqueue->add_warning(DE265_WARNING_NONEXISTING_SPS_REFERENCED, false);
     return false;
   }
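The pps_scan_cache class above is a small move-to-front LRU that computes a missing entry while holding its mutex. The same pattern can be sketched generically as below. `small_lru_cache` and its `compute` callback are illustrative names and not part of libde265, and passing the compute function per call is a simplification chosen for this sketch.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <vector>

// Tiny LRU cache: a linear scan over a few entries, most recently used in front.
template <class Key, class Value>
class small_lru_cache {
public:
  explicit small_lru_cache(std::size_t max_entries) : m_max(max_entries) {}

  std::shared_ptr<const Value> get(const Key& key,
                                   const std::function<Value(const Key&)>& compute)
  {
    std::lock_guard<std::mutex> lock(m_mutex);

    for (std::size_t i = 0; i < m_entries.size(); i++) {
      if (m_entries[i].key == key) {
        Entry hit = m_entries[i];
        m_entries.erase(m_entries.begin() + i);    // move-to-front on a hit
        m_entries.insert(m_entries.begin(), hit);
        return hit.value;
      }
    }

    // Miss: compute under the lock, so concurrent callers asking for the same
    // new key wait for this result instead of computing it again.
    std::shared_ptr<const Value> value = std::make_shared<Value>(compute(key));
    m_entries.insert(m_entries.begin(), Entry{key, value});
    if (m_entries.size() > m_max) m_entries.pop_back();   // evict least recently used
    return value;
  }

private:
  struct Entry { Key key; std::shared_ptr<const Value> value; };
  std::size_t m_max;
  std::mutex m_mutex;
  std::vector<Entry> m_entries;
};
```

Because callers hold a `shared_ptr`, an evicted value stays alive until its last user releases it, which is what makes eviction safe while decoders are still reading a table.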
libde265-1.0.17.tar.gz/libde265/pps.h -> libde265-1.1.1.tar.gz/libde265/pps.h
Changed
@@ -56,6 +56,26 @@
 };


+// Picture-geometry-derived scan-order tables (HEVC Sec. 6.5). They depend only
+// on the picture/tile geometry, are immutable after construction, and are shared
+// read-only across pic_parameter_set instances (and across independent decoder
+// contexts) through a small library-scope cache -- see pps.cc. Sharing avoids
+// recomputing the (potentially large) MinTbAddrZS table once per decoder when
+// many contexts decode images of the same geometry (e.g. libheif tile grids).
+struct pps_scan_tables {
+  std::vector<uint32_t> CtbAddrRStoTS; // #CTBs
+  std::vector<uint32_t> CtbAddrTStoRS; // #CTBs
+  std::vector<uint32_t> TileId; // #CTBs // index in tile-scan order
+  std::vector<uint32_t> TileIdRS; // #CTBs // index in raster-scan order
+  std::vector<uint32_t> MinTbAddrZS; // #TBs [x + y*PicWidthInTbsY]
+};
+
+// Library-scope scan-table cache lifecycle, tied to de265_init() / de265_free()
+// so the cache is released when the library is de-initialized (no leak at exit).
+void pps_scan_cache_init();
+void pps_scan_cache_free();
+
+
 class pic_parameter_set {
 public:
   pic_parameter_set();
@@ -126,7 +146,7 @@
   int8_t tc_offset; // [-12;12]

   bool pic_scaling_list_data_present_flag;
-  struct scaling_list_data scaling_list; // contains valid data if sps->scaling_list_enabled_flag set
+  scaling_list_data scaling_list; // contains valid data if sps->scaling_list_enabled_flag set

   bool lists_modification_present_flag;
   uint8_t log2_parallel_merge_level; // [2 ; log2(max CB size)]
@@ -146,16 +166,14 @@
   int Log2MinCuChromaQpOffsetSize;
   int Log2MaxTransformSkipSize;

-  int colWidth [ DE265_MAX_TILE_COLUMNS ];
-  int rowHeight[ DE265_MAX_TILE_ROWS ];
-  int colBd [ DE265_MAX_TILE_COLUMNS+1 ];
-  int rowBd [ DE265_MAX_TILE_ROWS+1 ];
+  uint16_t colWidth [ DE265_MAX_TILE_COLUMNS ];
+  uint16_t rowHeight[ DE265_MAX_TILE_ROWS ];
+  uint16_t colBd [ DE265_MAX_TILE_COLUMNS+1 ];
+  uint16_t rowBd [ DE265_MAX_TILE_ROWS+1 ];

-  std::vector<uint32_t> CtbAddrRStoTS; // #CTBs
-  std::vector<uint32_t> CtbAddrTStoRS; // #CTBs
-  std::vector<uint32_t> TileId; // #CTBs // index in tile-scan order
-  std::vector<uint32_t> TileIdRS; // #CTBs // index in raster-scan order
-  std::vector<uint32_t> MinTbAddrZS; // #TBs [x + y*PicWidthInTbsY]
+  // Derived scan-order tables (Sec. 6.5), shared read-only via the library-scope
+  // cache. Access as scan->CtbAddrRStoTS[...], scan->MinTbAddrZS[...], etc.
+  std::shared_ptr<const pps_scan_tables> scan;

   void set_derived_values(const seq_parameter_set* sps);
 };
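MinTbAddrZS is indexed as [x + y*PicWidthInTbsY] and holds the Z-scan address of each minimum transform block. The sketch below reproduces the bit-interleaving step from the pps.cc diff for the simplified case of a single tile, where raster-scan and tile-scan CTB addresses coincide. The function name `build_min_tb_addr_zs` and the single-tile shortcut are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds the Z-scan address table for a picture with exactly one tile
// (so the CTB raster address can be used directly as the tile-scan address).
std::vector<uint32_t> build_min_tb_addr_zs(int log2CtbSize, int log2MinTrafo,
                                           int picWidthInCtbs,
                                           int picWidthInTbs, int picHeightInTbs)
{
  const int shift = log2CtbSize - log2MinTrafo;   // log2 of TBs per CTB side
  std::vector<uint32_t> zs(static_cast<std::size_t>(picWidthInTbs) * picHeightInTbs);

  for (int y = 0; y < picHeightInTbs; y++) {
    for (int x = 0; x < picWidthInTbs; x++) {
      int ctbX = (x << log2MinTrafo) >> log2CtbSize;
      int ctbY = (y << log2MinTrafo) >> log2CtbSize;
      uint32_t ctbAddr = static_cast<uint32_t>(picWidthInCtbs * ctbY + ctbX);

      // Each CTB owns a contiguous range of 4^shift Z-scan addresses.
      uint32_t v = ctbAddr << (shift * 2);

      // Interleave the low 'shift' bits of x and y: x bits land on even
      // positions, y bits on odd positions.
      uint32_t p = 0;
      for (int i = 0; i < shift; i++) {
        int m = 1 << i;
        p += (m & x ? m * m : 0) + (m & y ? 2 * m * m : 0);
      }

      zs[x + y * picWidthInTbs] = v + p;
    }
  }
  return zs;
}
```

For a 16x16 CTB with 4x4 minimum transform blocks, the first CTB's blocks are numbered 0, 1, 4, 5 along the top row and 2, 3, 6, 7 along the second, which is the familiar Z pattern.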
libde265-1.0.17.tar.gz/libde265/quality.cc -> libde265-1.1.1.tar.gz/libde265/quality.cc
Changed
@@ -57,7 +57,7 @@
   for (int y=0;y<height;y++) {
     for (int x=0;x<width;x++) {
       int diff = iPtr[x] - rPtr[x];
-      sum += abs_value(diff);
+      sum += std::abs(diff);
     }

     iPtr += imgStride;
libde265-1.0.17.tar.gz/libde265/refpic.cc -> libde265-1.1.1.tar.gz/libde265/refpic.cc
Changed
@@ -86,7 +86,7 @@
                                 const seq_parameter_set* sps,
                                 bitreader* br,
                                 ref_pic_set* out_set, // where to store the read set
-                                int idxRps, // index of the set to be read
+                                uint32_t idxRps, // index of the set to be read
                                 const std::vector<ref_pic_set>& sets, // previously read sets
                                 bool sliceRefPicSet) // is this in the slice header?
 {
@@ -109,16 +109,13 @@
     /* Only for the last ref_pic_set (that's the one coded in the slice header),
        we can specify relative to which reference set we code the set. */

-    int delta_idx;
+    uint32_t delta_idx;
     if (sliceRefPicSet) { // idxRps == num_short_term_ref_pic_sets) {
-      delta_idx = vlc = br->get_uvlc();
-      if (vlc==UVLC_ERROR) {
-        return false;
-      }
-
-      if (delta_idx>=idxRps) {
+      vlc = br->get_uvlc();
+      if (vlc==UVLC_ERROR || vlc >= idxRps) {
         return false;
       }
+      delta_idx = vlc;
       delta_idx++;
     }
     else {
@@ -325,6 +322,22 @@

   out_set->compute_derived_values();

+  // The unused short-term references are all collected into a single PocStFoll array
+  // of MAX_NUM_REF_PICS entries (see decoder_context::process_reference_picture_set).
+  // While each individual list is bounded above, the predicted-RPS construction can
+  // append the current-picture delta to an already-full source set, pushing the
+  // combined count past MAX_NUM_REF_PICS. Reject such sets to avoid an out-of-bounds
+  // write when filling PocStFoll.
+  if (out_set->NumDeltaPocs > MAX_NUM_REF_PICS) {
+    out_set->NumNegativePics = 0;
+    out_set->NumPositivePics = 0;
+    out_set->NumDeltaPocs = 0;
+    out_set->NumPocTotalCurr_shortterm_only = 0;
+
+    errqueue->add_warning(DE265_WARNING_MAX_NUM_REF_PICS_EXCEEDED, false);
+    return false;
+  }
+
   return true;
 }

@@ -333,7 +346,7 @@
                                  const seq_parameter_set* sps,
                                  CABAC_encoder& out,
                                  const ref_pic_set* in_set, // which set to write
-                                 int idxRps, // index of the set to be written
+                                 uint32_t idxRps, // index of the set to be written
                                  const std::vector<ref_pic_set>& sets, // previously read sets
                                  bool sliceRefPicSet) // is this in the slice header?
 {
@@ -384,7 +397,7 @@
                                  const seq_parameter_set* sps,
                                  CABAC_encoder& out,
                                  const ref_pic_set* in_set, // which set to write
-                                 int idxRps, // index of the set to be read
+                                 uint32_t idxRps, // index of the set to be read
                                  const std::vector<ref_pic_set>& sets, // previously read sets
                                  bool sliceRefPicSet) // is this in the slice header?
 {
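The second refpic.cc hunk moves the range check in front of the assignment and keeps everything unsigned. A minimal sketch of that ordering follows. `parse_delta_idx_sketch` and `kUvlcErrorSketch` are hypothetical names, and the sentinel value used here is an assumption, since the real value of `UVLC_ERROR` is not shown in the diff.

```cpp
#include <cstdint>

// Hypothetical error sentinel; the actual UVLC_ERROR value is not visible here.
const uint32_t kUvlcErrorSketch = UINT32_MAX;

// Validates the raw code word first, then derives delta_idx from it.
// Comparing as uint32_t means a huge code word cannot turn into a negative
// int and slip past a signed ">= idxRps" comparison.
bool parse_delta_idx_sketch(uint32_t vlc, uint32_t idxRps, uint32_t* delta_idx)
{
  if (vlc == kUvlcErrorSketch || vlc >= idxRps) {
    return false;
  }
  *delta_idx = vlc + 1;   // cannot overflow: vlc < idxRps <= UINT32_MAX
  return true;
}
```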
libde265-1.0.17.tar.gz/libde265/sao.cc -> libde265-1.1.1.tar.gz/libde265/sao.cc
Changed
@@ -28,8 +28,8 @@ template <class pixel_t> void apply_sao_internal(de265_image* img, int xCtb,int yCtb, const slice_segment_header* shdr, int cIdx, int nSW,int nSH, - const pixel_t* in_img, int in_stride, - /* */ pixel_t* out_img, int out_stride) + const pixel_t* in_img, ptrdiff_t in_stride, + /* */ pixel_t* out_img, ptrdiff_t out_stride) { const sao_info* saoinfo = img->get_sao_info(xCtb,yCtb); @@ -77,7 +77,7 @@ if (SaoTypeIdx==2) { int hPos2, vPos2; - int vPosStride2; // vPos multiplied by image stride + ptrdiff_t vPosStride2; // vPos multiplied by image stride int SaoEoClass = (saoinfo->SaoEoClass >> (2*cIdx)) & 0x3; switch (SaoEoClass) { @@ -156,8 +156,8 @@ if (pps->loop_filter_across_tiles_enabled_flag==0 && - pps->TileIdRS(xS>>ctbshiftW) + (yS>>ctbshiftH)*picWidthInCtbs != - pps->TileIdRS(xC>>ctbshiftW) + (yC>>ctbshiftH)*picWidthInCtbs) { + pps->scan->TileIdRS(xS>>ctbshiftW) + (yS>>ctbshiftH)*picWidthInCtbs != + pps->scan->TileIdRS(xC>>ctbshiftW) + (yC>>ctbshiftH)*picWidthInCtbs) { edgeIdx=0; break; } @@ -266,8 +266,8 @@ template <class pixel_t> void apply_sao(de265_image* img, int xCtb,int yCtb, const slice_segment_header* shdr, int cIdx, int nSW,int nSH, - const pixel_t* in_img, int in_stride, - /* */ pixel_t* out_img, int out_stride) + const pixel_t* in_img, ptrdiff_t in_stride, + /* */ pixel_t* out_img, ptrdiff_t out_stride) { if (img->high_bit_depth(cIdx)) { apply_sao_internal<uint16_t>(img,xCtb,yCtb, shdr,cIdx,nSW,nSH, @@ -332,10 +332,10 @@ return; } - int lumaImageSize = img->get_image_stride(0) * img->get_height(0) * img->get_bytes_per_pixel(0); - int chromaImageSize = img->get_image_stride(1) * img->get_height(1) * img->get_bytes_per_pixel(1); + size_t lumaImageSize = static_cast<size_t>(img->get_image_stride(0)) * img->get_height(0) * img->get_bytes_per_pixel(0); + size_t chromaImageSize = static_cast<size_t>(img->get_image_stride(1)) * img->get_height(1) * img->get_bytes_per_pixel(1); - uint8_t* inputCopy = new uint8_t libde265_max(lumaImageSize, 
chromaImageSize) ; + uint8_t* inputCopy = new uint8_t std::max(lumaImageSize, chromaImageSize) ; if (inputCopy == nullptr) { img->decctx->add_warning(DE265_WARNING_CANNOT_APPLY_SAO_OUT_OF_MEMORY,false); return; @@ -347,10 +347,10 @@ for (int cIdx=0;cIdx<nChannels;cIdx++) { - int stride = img->get_image_stride(cIdx); + ptrdiff_t stride = img->get_image_stride(cIdx); int height = img->get_height(cIdx); - memcpy(inputCopy, img->get_image_plane(cIdx), stride * height * img->get_bytes_per_pixel(cIdx)); + memcpy(inputCopy, img->get_image_plane(cIdx), static_cast<size_t>(stride) * height * img->get_bytes_per_pixel(cIdx)); for (int yCtb=0; yCtb<sps.PicHeightInCtbsY; yCtb++) for (int xCtb=0; xCtb<sps.PicWidthInCtbsY; xCtb++)
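The SAO hunks above widen strides to `ptrdiff_t` and compute the plane sizes in `size_t`. A minimal sketch of the reason follows, assuming the point is to keep `stride * height * bytes_per_pixel` out of `int` arithmetic. The helper names `plane_bytes` and `sample_offset` are hypothetical and are not part of libde265.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a libde265 function): byte size of one image plane.
// Converting every operand to size_t first means the whole product is
// evaluated in size_t, so large planes cannot overflow a 32-bit int.
inline std::size_t plane_bytes(std::ptrdiff_t stride, int height, int bytes_per_pixel)
{
  assert(stride >= 0 && height >= 0 && bytes_per_pixel > 0);
  return static_cast<std::size_t>(stride)
       * static_cast<std::size_t>(height)
       * static_cast<std::size_t>(bytes_per_pixel);
}

// Hypothetical helper: element offset of sample (x, y) in a plane.
// With a ptrdiff_t stride the multiplication y * stride is carried out in
// pointer-sized arithmetic instead of int.
inline std::ptrdiff_t sample_offset(std::ptrdiff_t stride, int x, int y)
{
  return static_cast<std::ptrdiff_t>(y) * stride + x;
}
```

With `int` operands, a 65535 x 65535 plane would need a product above `INT_MAX`, which is undefined behaviour for signed arithmetic; the widened version returns the exact size.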
libde265-1.0.17.tar.gz/libde265/scan.cc -> libde265-1.1.1.tar.gz/libde265/scan.cc
Changed
@@ -105,30 +105,25 @@ return scanposscanIdxlog2BlkSize y*(1<<log2BlkSize) + x ; } -static void fill_scan_pos(scan_position* pos, int x,int y,int scanIdx, int log2TrafoSize) +static void fill_scan_pos_table(scan_position* pos, int scanIdx, int log2TrafoSize) { - int lastScanPos = 16; - int lastSubBlock = (1<<(log2TrafoSize-2)) * (1<<(log2TrafoSize-2)) -1; + int numSubBlocks = (1<<(log2TrafoSize-2)) * (1<<(log2TrafoSize-2)); + int blkSize = 1 << log2TrafoSize; const position* ScanOrderSub = get_scan_order(log2TrafoSize-2, scanIdx); const position* ScanOrderPos = get_scan_order(2, scanIdx); - int xC,yC; - do { - if (lastScanPos==0) { - lastScanPos=16; - lastSubBlock--; + for (int sb = 0; sb < numSubBlocks; sb++) + { + position S = ScanOrderSubsb; + for (int sp = 0; sp < 16; sp++) + { + int xC = (S.x<<2) + ScanOrderPossp.x; + int yC = (S.y<<2) + ScanOrderPossp.y; + posyC * blkSize + xC.subBlock = sb; + posyC * blkSize + xC.scanPos = sp; + } } - lastScanPos--; - - position S = ScanOrderSublastSubBlock; - xC = (S.x<<2) + ScanOrderPoslastScanPos.x; - yC = (S.y<<2) + ScanOrderPoslastScanPos.y; - - } while ( (xC != x) || (yC != y)); - - pos->subBlock = lastSubBlock; - pos->scanPos = lastScanPos; } @@ -141,12 +136,7 @@ init_scan_d(scan_dlog2size, 1<<log2size); } - for (int log2size=2;log2size<=5;log2size++) for (int scanIdx=0;scanIdx<3;scanIdx++) - for (int y=0;y<(1<<log2size);y++) - for (int x=0;x<(1<<log2size);x++) - { - fill_scan_pos(&scanposscanIdxlog2size y*(1<<log2size) + x ,x,y,scanIdx,log2size); - } + fill_scan_pos_table(scanposscanIdxlog2size, scanIdx, log2size); }
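The scan.cc change above appears to replace a per-position backward search with a single forward pass that fills the whole lookup table. The sketch below shows that one-pass construction under simplifying assumptions: `horizontal_scan` is a hypothetical stand-in for `get_scan_order()` and only produces a row-major order, and the struct layouts are illustrative rather than copied from libde265.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct position      { uint8_t x, y; };
struct scan_position { uint8_t subBlock, scanPos; };

// Hypothetical stand-in for get_scan_order(): row-major scan of a square
// block with side length (1 << log2Size).
inline std::vector<position> horizontal_scan(int log2Size)
{
  const int size = 1 << log2Size;
  std::vector<position> order;
  for (int y = 0; y < size; y++)
    for (int x = 0; x < size; x++)
      order.push_back({ static_cast<uint8_t>(x), static_cast<uint8_t>(y) });
  return order;
}

// Visit every (sub-block, position-in-sub-block) pair exactly once and store
// the pair at the pixel coordinate it maps to. This yields the inverse
// mapping (x,y) -> (subBlock, scanPos) without searching.
inline std::vector<scan_position> build_scan_pos_table(int log2TrafoSize)
{
  assert(log2TrafoSize >= 2 && log2TrafoSize <= 5);
  const int blkSize = 1 << log2TrafoSize;
  const std::vector<position> sub = horizontal_scan(log2TrafoSize - 2);
  const std::vector<position> pos = horizontal_scan(2);   // 4x4 sub-block

  std::vector<scan_position> table(static_cast<std::size_t>(blkSize) * blkSize);
  for (std::size_t sb = 0; sb < sub.size(); sb++) {
    for (std::size_t sp = 0; sp < pos.size(); sp++) {
      const int xC = (sub[sb].x << 2) + pos[sp].x;
      const int yC = (sub[sb].y << 2) + pos[sp].y;
      table[static_cast<std::size_t>(yC) * blkSize + xC] =
        { static_cast<uint8_t>(sb), static_cast<uint8_t>(sp) };
    }
  }
  return table;
}
```

Each table entry is written once, so the cost is linear in the number of samples of the block, whereas a search per position grows quadratically.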
libde265-1.0.17.tar.gz/libde265/sei.cc -> libde265-1.1.1.tar.gz/libde265/sei.cc
Changed
@@ -99,7 +99,7 @@
 class raw_hash_data
 {
 public:
-  raw_hash_data(int w, int stride);
+  raw_hash_data(int w, ptrdiff_t stride);
   ~raw_hash_data();
   struct data_chunk {
@@ -111,13 +111,14 @@
   data_chunk prepare_16bit(const uint8_t* data,int y);
 private:
-  int mWidth, mStride;
+  int mWidth;
+  ptrdiff_t mStride;
   uint8_t* mMem;
 };
-raw_hash_data::raw_hash_data(int w, int stride)
+raw_hash_data::raw_hash_data(int w, ptrdiff_t stride)
 {
   mWidth=w;
   mStride=stride;
@@ -157,7 +158,7 @@
 }
-static uint32_t compute_checksum(uint8_t* data,int w,int h,int stride, int bit_depth)
+static uint32_t compute_checksum(uint8_t* data,int w,int h,ptrdiff_t stride, int bit_depth)
 {
   uint32_t sum = 0;
@@ -170,7 +171,7 @@
   } else {
     auto* data16 = reinterpret_cast<uint16_t*>(data);
-    int stride16 = stride / 2;
+    ptrdiff_t stride16 = stride / 2;
     for (int y=0; y<h; y++)
       for(int x=0; x<w; x++) {
         uint8_t xorMask = ( x & 0xFF ) ^ ( y & 0xFF ) ^ ( x >> 8 ) ^ ( y >> 8 );
@@ -224,7 +225,7 @@
   (t << 12)) & 0xFFFF;
 }
-static uint32_t compute_CRC_8bit_fast(const uint8_t* data,int w,int h,int stride, int bit_depth)
+static uint32_t compute_CRC_8bit_fast(const uint8_t* data,int w,int h,ptrdiff_t stride, int bit_depth)
 {
   raw_hash_data raw_data(w,stride);
@@ -250,7 +251,7 @@
 }
-static void compute_MD5(uint8_t* data,int w,int h,int stride, uint8_t* result, int bit_depth)
+static void compute_MD5(uint8_t* data,int w,int h,ptrdiff_t stride, uint8_t* result, int bit_depth)
 {
   MD5_CTX md5;
   MD5_Init(&md5);
@@ -289,7 +290,8 @@
   int nHashes = img->get_sps().chroma_format_idc==0 ? 1 : 3;
   for (int i=0;i<nHashes;i++) {
     uint8_t* data;
-    int w,h,stride;
+    int w,h;
+    ptrdiff_t stride;
     w = img->get_width(i);
     h = img->get_height(i);
libde265-1.0.17.tar.gz/libde265/slice.cc -> libde265-1.1.1.tar.gz/libde265/slice.cc
Changed
@@ -42,7 +42,7 @@ const seq_parameter_set* sps, bitreader* br, ref_pic_set* out_set, - int idxRps, // index of the set to be read + uint32_t idxRps, // index of the set to be read const std::vector<ref_pic_set>& sets, bool sliceRefPicSet); @@ -360,6 +360,9 @@ RemoveReferencesList.clear(); + for (int i = 0; i < 4; i++) { + ctx_model_storage_StatCoeffi = 0; + } ctx_model_storage_defined = false; } @@ -851,7 +854,7 @@ } offset_len = uvlc + 1; - for (int i = 0; i < num_entry_point_offsets; i++) { + for (uint32_t i = 0; i < num_entry_point_offsets; i++) { { uint32_t offset_minus1 = br->get_bits(offset_len); if (offset_minus1 == UINT32_MAX) { @@ -1235,7 +1238,7 @@ if (num_entry_point_offsets > 0) { out.write_uvlc(offset_len - 1); - for (int i = 0; i < num_entry_point_offsets; i++) { + for (uint32_t i = 0; i < num_entry_point_offsets; i++) { { uint32_t prev = 0; if (i > 0) prev = entry_point_offseti - 1; @@ -1505,7 +1508,7 @@ if (num_entry_point_offsets > 0) { LOG1("offset_len : %d\n", offset_len); - for (int i = 0; i < num_entry_point_offsets; i++) { + for (uint32_t i = 0; i < num_entry_point_offsets; i++) { LOG2("entry point %i : %d\n", i, entry_point_offseti); } } @@ -1599,7 +1602,7 @@ static uint8_t decode_sao_offset_abs(thread_context* tctx, int bitDepth) { logtrace(LogSlice, "# sao_offset_abs\n"); - int cMax = (1 << (libde265_min(bitDepth, 10) - 5)) - 1; + int cMax = (1 << (std::min(bitDepth, 10) - 5)) - 1; assert(cMax >= 7 && cMax<=31); uint8_t value = static_cast<uint8_t>(tctx->cabac_decoder.decode_TU_bypass( cMax)); logtrace(LogSymbols, "$1 sao_offset_abs=%d\n", value); @@ -2474,12 +2477,21 @@ #define MAX_PREFIX (15+3) +// Defensive bounds against non-conforming bitstreams. The spec (eq. 9-25 / 9-23) does +// not impose an explicit upper bound on cRiceParam or StatCoeff in the persistent-rice +// path, but a malformed stream can push them arbitrarily high. 
We clamp so that the +// signed-int shift expressions in residual_coding stay well-defined: +// - 3 * (1 << uiGoRiceParam) requires uiGoRiceParam <= 29 (else int32 overflow) +// - 3 << (StatCoeff/4) requires StatCoeff/4 <= 29 (same) +#define MAX_RICE_PARAM 29 +#define MAX_STAT_COEFF (4 * MAX_RICE_PARAM + 3) // 119: largest value with /4 <= 29 + static int32_t decode_coeff_abs_level_remaining(thread_context* tctx, int cRiceParam) { logtrace(LogSlice, "# decode_coeff_abs_level_remaining\n"); - uint16_t prefix = 0; + uint32_t prefix = 0; while (tctx->cabac_decoder.decode_bypass()) { prefix++; if (prefix > MAX_PREFIX) { @@ -2502,7 +2514,7 @@ // included in the 'prefix' counter above. int codeword = tctx->cabac_decoder.decode_FL_bypass( prefix - 3 + cRiceParam); - value = (((UINT16_C(1) << (prefix - 3)) + 3 - 1) << cRiceParam) + codeword; + value = (((UINT32_C(1) << (prefix - 3)) + 3 - 1) << cRiceParam) + codeword; } logtrace(LogSymbols, "$1 coeff_abs_level_remaining=%d\n", value); @@ -2696,7 +2708,7 @@ const seq_parameter_set& sps = tctx->img->get_sps(); if (tctx->CtbAddrInTS < sps.PicSizeInCtbsY) { - tctx->CtbAddrInRS = tctx->img->get_pps().CtbAddrTStoRStctx->CtbAddrInTS; + tctx->CtbAddrInRS = tctx->img->get_pps().scan->CtbAddrTStoRStctx->CtbAddrInTS; tctx->CtbX = tctx->CtbAddrInRS % sps.PicWidthInCtbsY; tctx->CtbY = tctx->CtbAddrInRS / sps.PicWidthInCtbsY; @@ -2741,8 +2753,8 @@ if (xCtb > 0) { //char leftCtbInSliceSeg = (CtbAddrInSliceSeg>0); char leftCtbInSliceSeg = (tctx->CtbAddrInRS > shdr->SliceAddrRS); - char leftCtbInTile = (pps.TileIdRSxCtb + yCtb * sps.PicWidthInCtbsY == - pps.TileIdRSxCtb - 1 + yCtb * sps.PicWidthInCtbsY); + char leftCtbInTile = (pps.scan->TileIdRSxCtb + yCtb * sps.PicWidthInCtbsY == + pps.scan->TileIdRSxCtb - 1 + yCtb * sps.PicWidthInCtbsY); if (leftCtbInSliceSeg && leftCtbInTile) { sao_merge_left_flag = decode_sao_merge_flag(tctx); @@ -2756,8 +2768,8 @@ sps.PicWidthInCtbsY, shdr->slice_segment_address); bool upCtbInSliceSeg = 
(tctx->CtbAddrInRS - sps.PicWidthInCtbsY) >= shdr->SliceAddrRS; - bool upCtbInTile = (pps.TileIdRSxCtb + yCtb * sps.PicWidthInCtbsY == - pps.TileIdRSxCtb + (yCtb - 1) * sps.PicWidthInCtbsY); + bool upCtbInTile = (pps.scan->TileIdRSxCtb + yCtb * sps.PicWidthInCtbsY == + pps.scan->TileIdRSxCtb + (yCtb - 1) * sps.PicWidthInCtbsY); if (upCtbInSliceSeg && upCtbInTile) { sao_merge_up_flag = decode_sao_merge_flag(tctx); @@ -2888,7 +2900,7 @@ } -LIBDE265_INLINE static int luma_pos_to_ctbAddrRS(const seq_parameter_set* sps, int x, int y) +inline static int luma_pos_to_ctbAddrRS(const seq_parameter_set* sps, int x, int y) { int ctbX = x >> sps->Log2CtbSizeY; int ctbY = y >> sps->Log2CtbSizeY; @@ -2919,8 +2931,8 @@ // check if both CTBs are in the same tile. - if (img->get_pps().TileIdRScurrent_ctbAddrRS != - img->get_pps().TileIdRSneighbor_ctbAddrRS) { + if (img->get_pps().scan->TileIdRScurrent_ctbAddrRS != + img->get_pps().scan->TileIdRSneighbor_ctbAddrRS) { return 0; } @@ -3258,7 +3270,7 @@ int newLastGreater1ScanPos = -1; - int lastGreater1Coefficient = libde265_min(8, nCoefficients); + int lastGreater1Coefficient = std::min(8, nCoefficients); for (int c = 0; c < lastGreater1Coefficient; c++) { int greater1_flag = decode_coeff_abs_level_greater1(tctx, cIdx, i, @@ -3372,15 +3384,24 @@ } } else { - if (baseLevel + coeff_abs_level_remaining > 3 * (1 << uiGoRiceParam)) + if (baseLevel + coeff_abs_level_remaining > 3 * (1 << uiGoRiceParam)) { uiGoRiceParam++; + if (uiGoRiceParam > MAX_RICE_PARAM) { + uiGoRiceParam = MAX_RICE_PARAM; + tctx->decctx->add_warning(DE265_WARNING_RICE_PARAMETER_OUT_OF_RANGE, true); + } + } } // persistent_rice_adaptation_enabled_flag if (sps.range_extension.persistent_rice_adaptation_enabled_flag && firstCoeffWithAbsLevelRemaining) { if (coeff_abs_level_remaining >= (3 << (tctx->StatCoeffsbType / 4))) { - tctx->StatCoeffsbType++; + if (tctx->StatCoeffsbType < MAX_STAT_COEFF) { + tctx->StatCoeffsbType++; + } else { + 
tctx->decctx->add_warning(DE265_WARNING_RICE_PARAMETER_OUT_OF_RANGE, true); + } } else if (2 * coeff_abs_level_remaining < (1 << (tctx->StatCoeffsbType / 4)) && tctx->StatCoeffsbType > 0) { @@ -3581,7 +3602,7 @@ const int ChromaArrayType = sps.ChromaArrayType; int log2TrafoSizeC = (ChromaArrayType == CHROMA_444 ? log2TrafoSize : log2TrafoSize - 1); - log2TrafoSizeC = libde265_max(2, log2TrafoSizeC); + log2TrafoSizeC = std::max(2, log2TrafoSizeC); const int cbfLuma = cbf_luma; const int cbfChroma = cbf_cb | cbf_cr; @@ -4739,6 +4760,11 @@ // copy CABAC model from previous CTB row tctx->ctx_model = tctx->imgunit->ctx_models(tctx->CtbY - 1); tctx->imgunit->ctx_models(tctx->CtbY - 1).release(); // not used anymore + + // also restore the StatCoeff state for persistent_rice_adaptation + for (int i = 0; i < 4; i++) { + tctx->StatCoeffi = tctx->imgunit->StatCoeff_models(tctx->CtbY - 1)i; + } } else { tctx->img->wait_for_progress(tctx->task, 0, tctx->CtbY - 1,CTB_PROGRESS_PREFILTER); @@ -4751,7 +4777,7 @@ const uint32_t ctbx = tctx->CtbX; const uint32_t ctby = tctx->CtbY; - if (ctbx + ctby * ctbW >= pps.CtbAddrRStoTS.size()) { + if (ctbx + ctby * ctbW >= pps.scan->CtbAddrRStoTS.size()) { return Decode_Error; } @@ -4792,6 +4818,11 @@ tctx->imgunit->ctx_modelsctby = tctx->ctx_model; tctx->imgunit->ctx_modelsctby.decouple(); // store an independent copy + + // also save the StatCoeff state for persistent_rice_adaptation + for (int i = 0; i < 4; i++) { + tctx->imgunit->StatCoeff_modelsctbyi = tctx->StatCoeffi; + } } @@ -4808,6 +4839,11 @@ tctx->shdr->ctx_model_storage = tctx->ctx_model; tctx->shdr->ctx_model_storage.decouple(); // store an independent copy + // also save the StatCoeff state for persistent_rice_adaptation + for (int i = 0; i < 4; i++) { + tctx->shdr->ctx_model_storage_StatCoeffi = tctx->StatCoeffi; + } + tctx->shdr->ctx_model_storage_defined = true; } } @@ -4851,7 +4887,7 @@ if (!end_of_slice_segment_flag) { bool end_of_sub_stream = false; end_of_sub_stream |= 
(pps.tiles_enabled_flag && - pps.TileIdtctx->CtbAddrInTS != pps.TileIdtctx->CtbAddrInTS - 1); + pps.scan->TileIdtctx->CtbAddrInTS != pps.scan->TileIdtctx->CtbAddrInTS - 1); end_of_sub_stream |= (pps.entropy_coding_sync_enabled_flag && lastCtbY != tctx->CtbY); @@ -4879,7 +4915,7 @@ slice_segment_header* shdr = tctx->shdr; if (shdr->dependent_slice_segment_flag) { - int prevCtb = pps.CtbAddrTStoRSpps.CtbAddrRStoTSshdr->slice_segment_address - 1; + int prevCtb = pps.scan->CtbAddrTStoRSpps.scan->CtbAddrRStoTSshdr->slice_segment_address - 1; uint16_t sliceIdx = img->get_SliceHeaderIndex_atIndex(prevCtb); if (sliceIdx >= img->slices.size()) { @@ -4920,6 +4956,11 @@ tctx->ctx_model = prevCtbHdr->ctx_model_storage; prevCtbHdr->ctx_model_storage.release(); + + // also restore the StatCoeff state for persistent_rice_adaptation + for (int i = 0; i < 4; i++) { + tctx->StatCoeffi = prevCtbHdr->ctx_model_storage_StatCoeffi; + } } } else {
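The comment introduced with `MAX_RICE_PARAM` in the slice.cc diff explains that the Rice parameter is clamped so that `3 * (1 << uiGoRiceParam)` stays within a signed 32-bit value. The sketch below restates that arithmetic as compile-time checks and shows the clamp in isolation. The names `kMaxRiceParam`, `kMaxStatCoeff` and `next_rice_param` are hypothetical; the actual decoder performs the update inline and reports a warning through `add_warning`.

```cpp
#include <cassert>
#include <cstdint>

constexpr int kMaxRiceParam = 29;                     // mirrors MAX_RICE_PARAM
constexpr int kMaxStatCoeff = 4 * kMaxRiceParam + 3;  // mirrors MAX_STAT_COEFF (119)

// 3 * 2^29 = 1610612736 still fits in int32, 3 * 2^30 = 3221225472 does not.
static_assert(INT64_C(3) * (INT64_C(1) << kMaxRiceParam) <= INT32_MAX,
              "threshold at the clamp limit must fit in int32");
static_assert(INT64_C(3) * (INT64_C(1) << (kMaxRiceParam + 1)) > INT32_MAX,
              "one step above the clamp limit would overflow int32");
// 119 is the largest StatCoeff whose quotient by 4 is still 29.
static_assert(kMaxStatCoeff / 4 == kMaxRiceParam, "StatCoeff/4 at the limit");
static_assert((kMaxStatCoeff + 1) / 4 > kMaxRiceParam, "StatCoeff/4 above the limit");

// Hypothetical helper: returns the updated Rice parameter and reports through
// *clamped whether the upper bound had to be enforced.
inline int next_rice_param(int riceParam, int32_t baseLevel, int32_t remaining,
                           bool* clamped)
{
  assert(riceParam >= 0 && riceParam <= kMaxRiceParam);
  *clamped = false;
  // Evaluated in 64 bits here so that the sketch itself cannot overflow.
  const int64_t threshold = INT64_C(3) * (INT64_C(1) << riceParam);
  if (static_cast<int64_t>(baseLevel) + remaining > threshold) {
    riceParam++;
    if (riceParam > kMaxRiceParam) {
      riceParam = kMaxRiceParam;
      *clamped = true;
    }
  }
  return riceParam;
}
```

A conforming stream is not expected to reach the limit, so the clamp only changes behaviour for malformed input.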
libde265-1.0.17.tar.gz/libde265/slice.h -> libde265-1.1.1.tar.gz/libde265/slice.h
Changed
@@ -218,8 +218,8 @@
   bool slice_loop_filter_across_slices_enabled_flag;
-  int num_entry_point_offsets;
-  int offset_len;
+  uint32_t num_entry_point_offsets;
+  int offset_len;
   std::vector<uint32_t> entry_point_offset;
   int slice_segment_header_extension_length;
@@ -255,6 +255,8 @@
   // context storage for dependent slices (stores CABAC model at end of slice segment)
   context_model_table ctx_model_storage;
+  // StatCoeff (persistent_rice_adaptation state) saved alongside ctx_model_storage
+  uint8_t ctx_model_storage_StatCoeff[4];
   bool ctx_model_storage_defined; // whether there is valid data in ctx_model_storage
   std::vector<int> RemoveReferencesList; // images that can be removed from the DPB before decoding this slice
@@ -286,7 +288,7 @@
 {
 public:
   bool firstSliceSubstream;
-  int debug_startCtbRow;
+  uint16_t debug_startCtbRow;
   thread_context* tctx;
   void work() override;
@@ -297,7 +299,7 @@
 {
 public:
   bool firstSliceSubstream;
-  int debug_startCtbX, debug_startCtbY;
+  uint16_t debug_startCtbX, debug_startCtbY;
   thread_context* tctx;
   void work() override;
libde265-1.0.17.tar.gz/libde265/sps.cc -> libde265-1.1.1.tar.gz/libde265/sps.cc
Changed
@@ -50,7 +50,7 @@ const seq_parameter_set* sps, bitreader* br, ref_pic_set* out_set, - int idxRps, // index of the set to be read + uint32_t idxRps, // index of the set to be read const std::vector<ref_pic_set>& sets, bool sliceRefPicSet); @@ -58,7 +58,7 @@ const seq_parameter_set* sps, CABAC_encoder& out, const ref_pic_set* in_set, // which set to write - int idxRps, // index of the set to be read + uint32_t idxRps, // index of the set to be read const std::vector<ref_pic_set>& sets, // previously read sets bool sliceRefPicSet); // is this in the slice header? @@ -570,7 +570,7 @@ PicHeightInCtbsY = ceil_div(pic_height_in_luma_samples,CtbSizeY); PicSizeInMinCbsY = PicWidthInMinCbsY * PicHeightInMinCbsY; PicSizeInCtbsY = PicWidthInCtbsY * PicHeightInCtbsY; - PicSizeInSamplesY = pic_width_in_luma_samples * pic_height_in_luma_samples; + PicSizeInSamplesY = static_cast<uint32_t>(pic_width_in_luma_samples) * pic_height_in_luma_samples; if (chroma_format_idc==0 || separate_colour_plane_flag) { CtbWidthC = 0; @@ -583,6 +583,7 @@ Log2MinTrafoSize = log2_min_transform_block_size; Log2MaxTrafoSize = log2_min_transform_block_size + log2_diff_max_min_transform_block_size; + assert(Log2MaxTrafoSize >= 2); // log2_min_transform_block_size >= 2 by spec; relied on by pps.cc if (max_transform_hierarchy_depth_inter > Log2CtbSizeY - Log2MinTrafoSize) { if (sanitize_values) { @@ -656,7 +657,7 @@ return DE265_ERROR_CODED_PARAMETER_OUT_OF_RANGE; } - if (Log2MaxTrafoSize > libde265_min(Log2CtbSizeY,5)) { + if (Log2MaxTrafoSize > std::min((int)Log2CtbSizeY,5)) { if (D) fprintf(stderr,"SPS error: TB_max > 32 or CTB\n"); return DE265_ERROR_CODED_PARAMETER_OUT_OF_RANGE; } @@ -1324,7 +1325,7 @@ PicHeightInCtbsY = ceil_div(pic_height_in_luma_samples,CtbSizeY); PicSizeInMinCbsY = PicWidthInMinCbsY * PicHeightInMinCbsY; PicSizeInCtbsY = PicWidthInCtbsY * PicHeightInCtbsY; - PicSizeInSamplesY = pic_width_in_luma_samples * pic_height_in_luma_samples; + PicSizeInSamplesY = 
static_cast<uint32_t>(pic_width_in_luma_samples) * pic_height_in_luma_samples; if (chroma_format_idc==0 || separate_colour_plane_flag) { CtbWidthC = 0; CtbHeightC = 0; @@ -1335,6 +1336,7 @@ } Log2MinTrafoSize = log2_min_transform_block_size; Log2MaxTrafoSize = log2_min_transform_block_size + log2_diff_max_min_transform_block_size; + assert(Log2MaxTrafoSize >= 2); // log2_min_transform_block_size >= 2 by spec; relied on by pps.cc Log2MinPUSize = Log2MinCbSizeY-1; PicWidthInMinPUs = PicWidthInCtbsY << (Log2CtbSizeY - Log2MinPUSize); PicHeightInMinPUs = PicHeightInCtbsY << (Log2CtbSizeY - Log2MinPUSize);
libde265-1.0.17.tar.gz/libde265/sps.h -> libde265-1.1.1.tar.gz/libde265/sps.h
Changed
@@ -39,6 +39,13 @@
 constexpr int MAX_PICTURE_WIDTH = 65535;
 constexpr int MAX_PICTURE_HEIGHT = 65535;
+// pic_width/height_in_luma_samples are stored as uint16_t and PicSizeInSamplesY as uint32_t,
+// so these limits must keep width/height in 16 bits and their product in 32 bits.
+static_assert(MAX_PICTURE_WIDTH <= 0xFFFF, "picture width must fit in uint16_t");
+static_assert(MAX_PICTURE_HEIGHT <= 0xFFFF, "picture height must fit in uint16_t");
+static_assert((uint64_t)MAX_PICTURE_WIDTH * MAX_PICTURE_HEIGHT <= 0xFFFFFFFFu,
+              "total luma sample count must fit in uint32_t");
+
 enum {
   CHROMA_MONO = 0,
   CHROMA_420 = 1,
@@ -111,8 +118,8 @@
   uint8_t chroma_format_idc; // 0;3
   bool separate_colour_plane_flag;
-  int pic_width_in_luma_samples;
-  int pic_height_in_luma_samples;
+  uint16_t pic_width_in_luma_samples; // <= MAX_PICTURE_WIDTH (validated on parse)
+  uint16_t pic_height_in_luma_samples; // <= MAX_PICTURE_HEIGHT (validated on parse)
   bool conformance_window_flag;
   int conf_win_left_offset;
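The `static_assert` lines in sps.h and the `static_cast<uint32_t>` added in sps.cc belong together: two `uint16_t` operands are promoted to `int` before multiplication, so the product of two maximal dimensions would overflow a 32-bit `int` unless one operand is widened to an unsigned type first. A small sketch of that interaction, with hypothetical names (`kMaxPictureWidth`, `checked_dimension`, `luma_sample_count`) and assuming a platform with 32-bit `int`:

```cpp
#include <cassert>
#include <cstdint>

constexpr int kMaxPictureWidth  = 65535;   // same value as MAX_PICTURE_WIDTH
constexpr int kMaxPictureHeight = 65535;   // same value as MAX_PICTURE_HEIGHT

static_assert(kMaxPictureWidth  <= 0xFFFF, "width must fit in uint16_t");
static_assert(kMaxPictureHeight <= 0xFFFF, "height must fit in uint16_t");
static_assert(static_cast<uint64_t>(kMaxPictureWidth) * kMaxPictureHeight <= 0xFFFFFFFFu,
              "sample count must fit in uint32_t");

// Hypothetical parse-time check: only values within the limit are narrowed.
inline uint16_t checked_dimension(int value, int limit)
{
  assert(value > 0 && value <= limit && limit <= 0xFFFF);
  return static_cast<uint16_t>(value);
}

// Without the cast both operands would be promoted to int, and
// 65535 * 65535 = 4294836225 does not fit into a 32-bit signed int.
// Widening one operand to uint32_t makes the product unsigned 32-bit.
inline uint32_t luma_sample_count(uint16_t width, uint16_t height)
{
  return static_cast<uint32_t>(width) * height;
}
```

The compile-time checks guarantee that any pair of dimensions accepted by the parser produces a sample count that fits the result type.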
libde265-1.0.17.tar.gz/libde265/transform.cc -> libde265-1.1.1.tar.gz/libde265/transform.cc
Changed
@@ -110,7 +110,7 @@ if (tctx->img->available_zscan(xQG,yQG, xQG-1,yQG)) { int xTmp = (xQG-1) >> sps.Log2MinTrafoSize; int yTmp = (yQG ) >> sps.Log2MinTrafoSize; - int minTbAddrA = pps.MinTbAddrZSxTmp + yTmp*sps.PicWidthInTbsY; + int minTbAddrA = pps.scan->MinTbAddrZSxTmp + yTmp*sps.PicWidthInTbsY; uint32_t ctbAddrA = minTbAddrA >> (2 * (sps.Log2CtbSizeY-sps.Log2MinTrafoSize)); if (ctbAddrA == tctx->CtbAddrInTS) { qPYA = tctx->img->get_QPY(xQG-1,yQG); @@ -126,7 +126,7 @@ if (tctx->img->available_zscan(xQG,yQG, xQG,yQG-1)) { int xTmp = (xQG ) >> sps.Log2MinTrafoSize; int yTmp = (yQG-1) >> sps.Log2MinTrafoSize; - uint32_t minTbAddrB = pps.MinTbAddrZSxTmp + yTmp*sps.PicWidthInTbsY; + uint32_t minTbAddrB = pps.scan->MinTbAddrZSxTmp + yTmp*sps.PicWidthInTbsY; uint32_t ctbAddrB = minTbAddrB >> (2 * (sps.Log2CtbSizeY-sps.Log2MinTrafoSize)); if (ctbAddrB == tctx->CtbAddrInTS) { qPYB = tctx->img->get_QPY(xQG,yQG-1); @@ -467,19 +467,23 @@ const int offset = (1<<(bdShift-1)); const int fact = m_x_y * levelScaleqP%6 << (qP/6); - for (int i=0;i<tctx->nCoeffcIdx;i++) { - - int64_t currCoeff = tctx->coeffListcIdxi; - - //logtrace(LogTransform,"coefficient%d = %d\n",tctx->coeffPoscIdxi, - //tctx->coeffListcIdxi); - - currCoeff = Clip3(-32768,32767, - ( (currCoeff * fact + offset ) >> bdShift)); - - //logtrace(LogTransform," -> %d\n",currCoeff); - - tctx->coeffBuf tctx->coeffPoscIdxi = currCoeff; + // Fast path: when coeffListi*fact (|coeffList| <= 32767) fits in int32, + // the dequant can run in int32 SIMD. Otherwise (very high QP, high bit + // depth) fall back to the scalar int64 loop. 
+ if (fact <= 32767) { + tctx->decctx->acceleration.dequant_coeff_block(tctx->coeffBuf, + tctx->coeffListcIdx, + tctx->coeffPoscIdx, + tctx->nCoeffcIdx, + fact, offset, bdShift); + } + else { + for (int i=0;i<tctx->nCoeffcIdx;i++) { + int64_t currCoeff = tctx->coeffListcIdxi; + currCoeff = Clip3(-32768,32767, + ( (currCoeff * fact + offset ) >> bdShift)); + tctx->coeffBuf tctx->coeffPoscIdxi = currCoeff; + } } } else { @@ -545,8 +549,8 @@ int extended_precision_processing_flag = 0; int Log2nTbS = Log2(nT); - int bdShift = libde265_max( 20 - bit_depth, extended_precision_processing_flag ? 11 : 0 ); - int tsShift = (extended_precision_processing_flag ? libde265_min( 5, bdShift - 2 ) : 5 ) + int bdShift = std::max( 20 - bit_depth, extended_precision_processing_flag ? 11 : 0 ); + int tsShift = (extended_precision_processing_flag ? std::min( 5, bdShift - 2 ) : 5 ) + Log2nTbS; if (rotateCoeffs) { @@ -699,7 +703,7 @@ //logtrace(LogTransform,"(%d,%d) %d -> ", x,y,level); sign = (level < 0 ? -1: 1); - level = (abs_value(level) * uiQ + rnd ) >> qBits; + level = (std::abs(level) * uiQ + rnd ) >> qBits; level *= sign; out_coeffblockPos = Clip3(-32768, 32767, level); //logtrace(LogTransform,"%d\n", out_coeffblockPos);
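The transform.cc hunk above selects a 32-bit dequantization path when `fact <= 32767` and keeps the 64-bit loop otherwise. The following scalar sketch illustrates why that bound is sufficient. It is not the library's `dequant_coeff_block` implementation: `clip16`, `dequant_ref` and `dequant_fast` are hypothetical names, and the upper limit assumed for `offset` (at most 2^30) is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Clip a value into the int16 coefficient range, like Clip3(-32768,32767,v).
inline int16_t clip16(int64_t v)
{
  return static_cast<int16_t>(v < -32768 ? -32768 : (v > 32767 ? 32767 : v));
}

// Reference path: all intermediate values are 64-bit.
inline int16_t dequant_ref(int16_t coeff, int32_t fact, int32_t offset, int bdShift)
{
  const int64_t v = (static_cast<int64_t>(coeff) * fact + offset) >> bdShift;
  return clip16(v);
}

// 32-bit path. |coeff| <= 32768 and fact <= 32767 give
// |coeff * fact| <= 1073709056 < 2^30, so adding an offset of at most 2^30
// stays below 2^31 - 1 and the int32 arithmetic cannot overflow.
// Right-shifting a negative value is arithmetic on common compilers
// (and guaranteed since C++20), matching the 64-bit path.
inline int16_t dequant_fast(int16_t coeff, int32_t fact, int32_t offset, int bdShift)
{
  assert(fact >= 0 && fact <= 32767);
  assert(offset >= 0 && offset <= (INT32_C(1) << 30));
  assert(bdShift >= 0 && bdShift < 31);
  const int32_t v = (static_cast<int32_t>(coeff) * fact + offset) >> bdShift;
  return clip16(v);
}
```

Because both paths compute the same expression and only differ in intermediate width, they agree whenever the preconditions of the fast path hold.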
libde265-1.0.17.tar.gz/libde265/transform.h -> libde265-1.1.1.tar.gz/libde265/transform.h
Changed
@@ -26,7 +26,7 @@
 extern const int tab8_22[];
-LIBDE265_INLINE static int table8_22(int qPi)
+inline static int table8_22(int qPi)
 {
   if (qPi<30) return qPi;
   if (qPi>=43) return qPi-6;
libde265-1.0.17.tar.gz/libde265/util.h -> libde265-1.1.1.tar.gz/libde265/util.h
Changed
@@ -31,6 +31,7 @@
 #include <stdio.h>
 #include <string>
+#include <cstdlib>
 #include "libde265/de265.h"
@@ -74,22 +75,35 @@
 #endif
 #endif
-//inline uint8_t Clip1_8bit(int16_t value) { if (value<=0) return 0; else if (value>=255) return 255; else return value; }
-#define Clip1_8bit(value) ((value)<0 ? 0 : (value)>255 ? 255 : (value))
-#define Clip_BitDepth(value, bit_depth) ((value)<0 ? 0 : (value)>((1<<bit_depth)-1) ? ((1<<bit_depth)-1) : (value))
-#define Clip3(low,high,value) ((value)<(low) ? (low) : (value)>(high) ? (high) : (value))
-#define Sign(value) (((value)<0) ? -1 : ((value)>0) ? 1 : 0)
-#define abs_value(a) (((a)<0) ? -(a) : (a))
-#define libde265_min(a,b) (((a)<(b)) ? (a) : (b))
-#define libde265_max(a,b) (((a)>(b)) ? (a) : (b))
+inline static int Clip1_8bit(int value)
+{
+  return value<0 ? 0 : value>255 ? 255 : value;
+}
+
+inline static int Clip_BitDepth(int value, int bit_depth)
+{
+  const int maxval = (1<<bit_depth)-1;
+  return value<0 ? 0 : value>maxval ? maxval : value;
+}
+
+inline static int Clip3(int low, int high, int value)
+{
+  return value<low ? low : value>high ? high : value;
+}
+
+// three-valued sign: returns -1, 0, or +1
+template <typename T> inline int Sign(T value)
+{
+  return (T(0) < value) - (value < T(0));
+}
-LIBDE265_INLINE static int ceil_div(int num,int denom)
+inline static int ceil_div(int num,int denom)
 {
   num += denom-1;
   return num/denom;
 }
-LIBDE265_INLINE static int ceil_log2(int val)
+inline static int ceil_log2(int val)
 {
   int n=0;
   while (val > (1<<n)) {
@@ -99,7 +113,7 @@
   return n;
 }
-LIBDE265_INLINE static int Log2(int v)
+inline static int Log2(int v)
 {
   int n=0;
   while (v>1) {
@@ -110,7 +124,7 @@
   return n;
 }
-LIBDE265_INLINE static int Log2SizeToArea(int v)
+inline static int Log2SizeToArea(int v)
 {
   return (1<<(v<<1));
 }
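The util.h diff replaces function-like macros with inline functions. One practical difference is that a macro may evaluate its argument several times, while a function evaluates it exactly once. The sketch below demonstrates this with a copy of the old `Clip3` macro under the hypothetical name `CLIP3_MACRO`; the lower-case `clip3` and `sign` functions and the `next_value` helper are illustrative names, not the library's.

```cpp
#include <cassert>

// Copy of the removed macro, renamed so that it can coexist with the function.
#define CLIP3_MACRO(low,high,value) \
  ((value)<(low) ? (low) : (value)>(high) ? (high) : (value))

// Function form: the argument expression is evaluated once at the call site.
inline int clip3(int low, int high, int value)
{
  return value<low ? low : value>high ? high : value;
}

// Three-valued sign, same expression as the Sign template in the diff.
template <typename T> inline int sign(T value)
{
  return (T(0) < value) - (value < T(0));
}

// Hypothetical helper with a side effect, used to count evaluations.
inline int next_value(int& counter)
{
  return ++counter;
}
```

Passing `next_value(calls)` to the macro increments the counter three times and yields the third value, whereas the function increments it once and yields the first.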
libde265-1.0.17.tar.gz/libde265/visualize.cc -> libde265-1.1.1.tar.gz/libde265/visualize.cc
Changed
@@ -467,7 +467,7 @@
     int ctbAddrRS = ctby*sps.PicWidthInCtbsY + ctbx;
     int prevCtbRS = -1;
-    if (ctbx>0 || ctby>0) { prevCtbRS = img->get_pps().CtbAddrTStoRS[ img->get_pps().CtbAddrRStoTS[ctbAddrRS] -1 ]; }
+    if (ctbx>0 || ctby>0) { prevCtbRS = img->get_pps().scan->CtbAddrTStoRS[ img->get_pps().scan->CtbAddrRStoTS[ctbAddrRS] -1 ]; }
     if (prevCtbRS<0 || img->get_SliceHeaderIndex_atIndex(ctbAddrRS) !=
libde265-1.0.17.tar.gz/libde265/x86/CMakeLists.txt -> libde265-1.1.1.tar.gz/libde265/x86/CMakeLists.txt
Changed
@@ -2,8 +2,17 @@ sse.cc sse.h ) -set (x86_sse_sources - sse-motion.cc sse-motion.h sse-dct.h sse-dct.cc +set (x86_sse_sources + sse-motion.cc sse-motion.h sse-dct.h sse-dct.cc sse-intrapred.h sse-intrapred.cc + sse-deblk.h sse-deblk.cc +) + +set (x86_avx2_sources + transform-avx2.cc transform-avx2.h transform-dct-tables.h +) + +set (x86_avx512_sources + transform-avx512.cc transform-avx512.h ) add_library(x86 OBJECT ${x86_sources}) @@ -20,6 +29,31 @@ endif(CMAKE_SIZEOF_VOID_P EQUAL 8) endif() -set(X86_OBJECTS $<TARGET_OBJECTS:x86> $<TARGET_OBJECTS:x86_sse> PARENT_SCOPE) - SET_TARGET_PROPERTIES(x86_sse PROPERTIES COMPILE_FLAGS "${sse_flags}") + +set(X86_OBJECTS $<TARGET_OBJECTS:x86> $<TARGET_OBJECTS:x86_sse>) + +# AVX2 kernels (only compiled where supported). The dispatch/detection code lives +# in the plain x86 object (sse.cc) so it never runs AVX2 instructions before the +# runtime CPU check. +if(HAVE_AVX2) + add_library(x86_avx2 OBJECT ${x86_avx2_sources}) + if(MSVC) + SET_TARGET_PROPERTIES(x86_avx2 PROPERTIES COMPILE_FLAGS "/arch:AVX2") + else() + SET_TARGET_PROPERTIES(x86_avx2 PROPERTIES COMPILE_FLAGS "-mavx2") + endif() + set(X86_OBJECTS ${X86_OBJECTS} $<TARGET_OBJECTS:x86_avx2>) +endif() + +if(HAVE_AVX512) + add_library(x86_avx512 OBJECT ${x86_avx512_sources}) + if(MSVC) + SET_TARGET_PROPERTIES(x86_avx512 PROPERTIES COMPILE_FLAGS "/arch:AVX512") + else() + SET_TARGET_PROPERTIES(x86_avx512 PROPERTIES COMPILE_FLAGS "-mavx512f -mavx512bw") + endif() + set(X86_OBJECTS ${X86_OBJECTS} $<TARGET_OBJECTS:x86_avx512>) +endif() + +set(X86_OBJECTS ${X86_OBJECTS} PARENT_SCOPE)
libde265-1.0.17.tar.gz/libde265/x86/sse-dct.cc -> libde265-1.1.1.tar.gz/libde265/x86/sse-dct.cc
Changed
@@ -2909,6 +2909,13 @@ #if HAVE_SSE4_1 +// All m128iS0..m128iS31 are unconditionally loaded at function entry before any +// use, but GCC's path analysis gives up inside this very large inlined function +// and emits spurious -Wmaybe-uninitialized warnings for them. Suppress them here. +#if defined(__GNUC__) && !defined(__clang__) +#pragma GCC diagnostic push +#pragma GCC diagnostic ignored "-Wmaybe-uninitialized" +#endif void ff_hevc_transform_32x32_add_8_sse4(uint8_t *_dst, const int16_t *coeffs, ptrdiff_t _stride) { uint8_t shift_2nd = 12; // 20 - Bit depth @@ -5197,6 +5204,9 @@ } } } +#if defined(__GNUC__) && !defined(__clang__) +#pragma GCC diagnostic pop +#endif #endif @@ -7092,3 +7102,135 @@ } #endif + +#if HAVE_SSE4_1 +// Add the int32 residual block 'r' (nT x nT, row-major, nT values per row) to +// the prediction samples in 'dst' and clip into the valid pixel range. +// Equivalent to add_residual_fallback<uint8_t> (bit_depth is always 8 here). +void add_residual_8_sse4(uint8_t *dst, ptrdiff_t stride, + const int32_t* r, int nT, int bit_depth) +{ + if (nT==4) { + for (int y=0;y<4;y++) { + uint8_t* drow = dst + y*stride; + + __m128i res = _mm_loadu_si128((const __m128i*)(r + y*4)); // 4 x int32 residual + __m128i pix = _mm_cvtsi32_si128(*(const int32_t*)drow); // 4 x uint8 + pix = _mm_cvtepu8_epi32(pix); // -> 4 x int32 + + __m128i sum = _mm_add_epi32(res, pix); + sum = _mm_packs_epi32(sum, sum); // -> int16 (saturate) + sum = _mm_packus_epi16(sum, sum); // -> uint8 (clip 0..255) + + *(int32_t*)drow = _mm_cvtsi128_si32(sum); + } + } + else { + // nT is 8, 16 or 32 -> always a multiple of 8 + for (int y=0;y<nT;y++) { + const int32_t* rrow = r + y*nT; + uint8_t* drow = dst + y*stride; + + for (int x=0;x<nT;x+=8) { + __m128i r0 = _mm_loadu_si128((const __m128i*)(rrow + x)); // 4 x int32 + __m128i r1 = _mm_loadu_si128((const __m128i*)(rrow + x+4)); // 4 x int32 + __m128i pix = _mm_loadl_epi64((const __m128i*)(drow + x)); // 8 x uint8 + + __m128i p0 = 
_mm_cvtepu8_epi32(pix); // 4 x int32 + __m128i p1 = _mm_cvtepu8_epi32(_mm_srli_si128(pix,4)); // 4 x int32 + + __m128i s0 = _mm_add_epi32(r0, p0); + __m128i s1 = _mm_add_epi32(r1, p1); + + __m128i p16 = _mm_packs_epi32(s0, s1); // 8 x int16 (saturate) + __m128i p8 = _mm_packus_epi16(p16, p16); // 8 x uint8 (clip 0..255) + + _mm_storel_epi64((__m128i*)(drow + x), p8); + } + } + } +} + + +// 16-bit (high bit-depth) variant. Equivalent to add_residual_fallback<uint16_t>. +void add_residual_16_sse4(uint16_t *dst, ptrdiff_t stride, + const int32_t* r, int nT, int bit_depth) +{ + const int32_t maxval = (1<<bit_depth)-1; + const __m128i vmax = _mm_set1_epi32(maxval); + const __m128i vzero = _mm_setzero_si128(); + + if (nT==4) { + for (int y=0;y<4;y++) { + uint16_t* drow = dst + y*stride; + + __m128i res = _mm_loadu_si128((const __m128i*)(r + y*4)); // 4 x int32 residual + __m128i pix = _mm_loadl_epi64((const __m128i*)drow); // 4 x uint16 + pix = _mm_cvtepu16_epi32(pix); // -> 4 x int32 + + __m128i sum = _mm_add_epi32(res, pix); + sum = _mm_min_epi32(_mm_max_epi32(sum, vzero), vmax); // clip 0..maxval + sum = _mm_packus_epi32(sum, sum); // -> uint16 + + _mm_storel_epi64((__m128i*)drow, sum); + } + } + else { + // nT is 8, 16 or 32 -> always a multiple of 8 + for (int y=0;y<nT;y++) { + const int32_t* rrow = r + y*nT; + uint16_t* drow = dst + y*stride; + + for (int x=0;x<nT;x+=8) { + __m128i r0 = _mm_loadu_si128((const __m128i*)(rrow + x)); // 4 x int32 + __m128i r1 = _mm_loadu_si128((const __m128i*)(rrow + x+4)); // 4 x int32 + __m128i pix = _mm_loadu_si128((const __m128i*)(drow + x)); // 8 x uint16 + + __m128i p0 = _mm_cvtepu16_epi32(pix); // 4 x int32 + __m128i p1 = _mm_cvtepu16_epi32(_mm_srli_si128(pix,8)); // 4 x int32 + + __m128i s0 = _mm_add_epi32(r0, p0); + __m128i s1 = _mm_add_epi32(r1, p1); + + s0 = _mm_min_epi32(_mm_max_epi32(s0, vzero), vmax); // clip 0..maxval + s1 = _mm_min_epi32(_mm_max_epi32(s1, vzero), vmax); + + __m128i out = _mm_packus_epi32(s0, s1); // 8 
x uint16 + _mm_storeu_si128((__m128i*)(drow + x), out); + } + } + } +} + + +// Inverse quantization without scaling list, int32 fast path (see acceleration.h). +// Vectorizes the multiply/round/clip 8 coefficients at a time; the scatter into +// coeffBufcoeffPosi stays scalar (no 16-bit SIMD scatter exists). +void dequant_coeff_block_sse4(int16_t* coeffBuf, const int16_t* coeffList, + const int16_t* coeffPos, int nCoeff, + int32_t fact, int32_t offset, int32_t bdShift) +{ + const __m128i vfact = _mm_set1_epi32(fact); + const __m128i voff = _mm_set1_epi32(offset); + const __m128i vsh = _mm_cvtsi32_si128(bdShift); + + alignas(16) int16_t tmp8; + int i = 0; + for (; i+8 <= nCoeff; i += 8) { + __m128i c = _mm_loadu_si128((const __m128i*)(coeffList + i)); // 8 int16 + __m128i lo = _mm_cvtepi16_epi32(c); // ci..i+3 + __m128i hi = _mm_cvtepi16_epi32(_mm_srli_si128(c, 8)); // ci+4..i+7 + lo = _mm_sra_epi32(_mm_add_epi32(_mm_mullo_epi32(lo, vfact), voff), vsh); + hi = _mm_sra_epi32(_mm_add_epi32(_mm_mullo_epi32(hi, vfact), voff), vsh); + __m128i r = _mm_packs_epi32(lo, hi); // signed sat == Clip3 + _mm_store_si128((__m128i*)tmp, r); + for (int k=0;k<8;k++) coeffBuf coeffPosi+k = tmpk; // scatter + } + for (; i < nCoeff; i++) { + int32_t v = (coeffListi*fact + offset) >> bdShift; + v = v < -32768 ? -32768 : (v > 32767 ? 32767 : v); + coeffBuf coeffPosi = (int16_t)v; + } +} +#endif +
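The comments in the sse-dct.cc diff describe the new SSE4.1 kernels as equivalent to a scalar fallback that adds an int32 residual block to the prediction samples and clips the result to the valid pixel range. A scalar sketch of that behaviour is given below for comparison. It is written from the comments only: `add_residual_scalar` is a hypothetical name, and the real `add_residual_fallback` may differ in signature and details.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Adds the nT x nT residual block 'r' (row-major, nT values per row) to the
// samples in 'dst' (rows separated by 'stride' elements) and clips each sum
// to [0, (1 << bit_depth) - 1].
template <class pixel_t>
void add_residual_scalar(pixel_t* dst, std::ptrdiff_t stride,
                         const int32_t* r, int nT, int bit_depth)
{
  assert(bit_depth >= 1 && bit_depth <= 8 * static_cast<int>(sizeof(pixel_t)));
  const int64_t maxval = (INT64_C(1) << bit_depth) - 1;

  for (int y = 0; y < nT; y++) {
    pixel_t* drow = dst + y * stride;
    const int32_t* rrow = r + static_cast<std::ptrdiff_t>(y) * nT;
    for (int x = 0; x < nT; x++) {
      // 64-bit sum so that extreme residual values cannot overflow here.
      int64_t sum = static_cast<int64_t>(drow[x]) + rrow[x];
      if (sum < 0)      sum = 0;
      if (sum > maxval) sum = maxval;
      drow[x] = static_cast<pixel_t>(sum);
    }
  }
}
```

In the 8-bit SIMD kernel the same clipping falls out of the saturating pack instructions, while the 16-bit kernel clamps explicitly against `maxval` because the valid range depends on `bit_depth`.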
libde265-1.0.17.tar.gz/libde265/x86/sse-dct.h -> libde265-1.1.1.tar.gz/libde265/x86/sse-dct.h
Changed
@@ -32,4 +32,12 @@
 void ff_hevc_transform_16x16_add_8_sse4(uint8_t *dst, const int16_t *coeffs, ptrdiff_t stride);
 void ff_hevc_transform_32x32_add_8_sse4(uint8_t *dst, const int16_t *coeffs, ptrdiff_t stride);
+void add_residual_8_sse4 (uint8_t *dst, ptrdiff_t stride, const int32_t* r, int nT, int bit_depth);
+void add_residual_16_sse4(uint16_t *dst, ptrdiff_t stride, const int32_t* r, int nT, int bit_depth);
+
+// Inverse quantization (no scaling list, int32 fast path). See acceleration.h.
+void dequant_coeff_block_sse4(int16_t* coeffBuf, const int16_t* coeffList,
+                              const int16_t* coeffPos, int nCoeff,
+                              int32_t fact, int32_t offset, int32_t bdShift);
+
 #endif
libde265-1.1.1.tar.gz/libde265/x86/sse-deblk.cc
Added
@@ -0,0 +1,185 @@
+/*
+ * H.265 video codec.
+ * Copyright (c) 2026 Dirk Farin <dirk.farin@gmail.com>
+ *
+ * This file is part of libde265.
+ *
+ * libde265 is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as
+ * published by the Free Software Foundation, either version 3 of
+ * the License, or (at your option) any later version.
+ *
+ * libde265 is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public License
+ * along with libde265. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+// SSE4.1 8-bit deblocking. One edge segment = 4 lines along the edge. The four
+// lines all use the same per-edge parameters and (for luma) the same strong/
+// weak choice, so they are processed in parallel as the 4 int32 lanes of an
+// xmm register. Each sample position (p3..q3) becomes one vector-of-4-lines.
+// For horizontal edges those vectors are 4 contiguous bytes (one per line);
+// for vertical edges the 4 lines are strided, so load/store transpose a small
+// block. The arithmetic is identical to the scalar kernels -> bit-exact.
+
+#include "x86/sse-deblk.h"
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+#if HAVE_SSE4_1
+
+#include <string.h>
+#include <smmintrin.h> // SSE4.1
+
+namespace {
+
+inline __m128i load4(const uint8_t* p) {        // 4 bytes -> 4 int32
+  int32_t t; memcpy(&t, p, 4);
+  return _mm_cvtepu8_epi32(_mm_cvtsi32_si128(t));
+}
+inline void store4(uint8_t* p, __m128i v) {     // 4 int32 (0..255) -> 4 bytes
+  __m128i b = _mm_packus_epi16(_mm_packus_epi32(v, v), _mm_setzero_si128());
+  int32_t t = _mm_cvtsi128_si32(b);
+  memcpy(p, &t, 4);
+}
+
+inline __m128i clip3(__m128i lo, __m128i hi, __m128i v) {
+  return _mm_min_epi32(_mm_max_epi32(v, lo), hi);
+}
+inline __m128i clip_u8(__m128i v) {
+  return _mm_min_epi32(_mm_max_epi32(v, _mm_setzero_si128()), _mm_set1_epi32(255));
+}
+inline __m128i x2(__m128i a){ return _mm_add_epi32(a,a); }
+inline __m128i add3(__m128i a,__m128i b,__m128i c){ return _mm_add_epi32(_mm_add_epi32(a,b),c); }
+
+// --- vertical: load/store transpose a 4(lines) x 8(samples) block ----------
+
+inline void load_vert(const uint8_t* ptr, ptrdiff_t stride, __m128i s[8]) {
+  __m128i r0 = _mm_loadl_epi64((const __m128i*)(ptr-4));
+  __m128i r1 = _mm_loadl_epi64((const __m128i*)(ptr-4+stride));
+  __m128i r2 = _mm_loadl_epi64((const __m128i*)(ptr-4+2*stride));
+  __m128i r3 = _mm_loadl_epi64((const __m128i*)(ptr-4+3*stride));
+  __m128i e = _mm_unpacklo_epi8(r0, r1);
+  __m128i f = _mm_unpacklo_epi8(r2, r3);
+  __m128i lo = _mm_unpacklo_epi16(e, f);   // samples p3 p2 p1 p0 (4 bytes each, 4 lines)
+  __m128i hi = _mm_unpackhi_epi16(e, f);   // samples q0 q1 q2 q3
+  s[0]=_mm_cvtepu8_epi32(lo);
+  s[1]=_mm_cvtepu8_epi32(_mm_srli_si128(lo,4));
+  s[2]=_mm_cvtepu8_epi32(_mm_srli_si128(lo,8));
+  s[3]=_mm_cvtepu8_epi32(_mm_srli_si128(lo,12));
+  s[4]=_mm_cvtepu8_epi32(hi);
+  s[5]=_mm_cvtepu8_epi32(_mm_srli_si128(hi,4));
+  s[6]=_mm_cvtepu8_epi32(_mm_srli_si128(hi,8));
+  s[7]=_mm_cvtepu8_epi32(_mm_srli_si128(hi,12));
+}
+
+inline void store_vert(uint8_t* ptr, ptrdiff_t stride, const __m128i s[8]) {
+  __m128i lo = _mm_packus_epi16(_mm_packus_epi32(s[0],s[1]), _mm_packus_epi32(s[2],s[3]));
+  __m128i hi = _mm_packus_epi16(_mm_packus_epi32(s[4],s[5]), _mm_packus_epi32(s[6],s[7]));
+  const __m128i base = _mm_setr_epi8(0,4,8,12, (char)0x80,(char)0x80,(char)0x80,(char)0x80,
+                                     (char)0x80,(char)0x80,(char)0x80,(char)0x80,
+                                     (char)0x80,(char)0x80,(char)0x80,(char)0x80);
+  for (int k=0;k<4;k++) {
+    __m128i mask = _mm_add_epi8(base, _mm_set1_epi8((char)k)); // {k,4+k,8+k,12+k, 0x80..}
+    __m128i a = _mm_shuffle_epi8(lo, mask);   // low4 = s0..s3 of line k
+    __m128i b = _mm_shuffle_epi8(hi, mask);   // low4 = s4..s7 of line k
+    __m128i row = _mm_unpacklo_epi32(a, b);   // low8 = the 8 samples of line k
+    _mm_storel_epi64((__m128i*)(ptr-4+k*stride), row);
+  }
+}
+
+inline void load_horiz(const uint8_t* ptr, ptrdiff_t stride, __m128i s[8]) {
+  s[0]=load4(ptr-4*stride); s[1]=load4(ptr-3*stride); s[2]=load4(ptr-2*stride); s[3]=load4(ptr-1*stride);
+  s[4]=load4(ptr+0*stride); s[5]=load4(ptr+1*stride); s[6]=load4(ptr+2*stride); s[7]=load4(ptr+3*stride);
+}
+inline void store_horiz(uint8_t* ptr, ptrdiff_t stride, const __m128i s[8]) {
+  store4(ptr-4*stride,s[0]); store4(ptr-3*stride,s[1]); store4(ptr-2*stride,s[2]); store4(ptr-1*stride,s[3]);
+  store4(ptr+0*stride,s[4]); store4(ptr+1*stride,s[5]); store4(ptr+2*stride,s[6]); store4(ptr+3*stride,s[7]);
+}
+
+} // namespace
+
+
+void deblock_luma_8_sse4(uint8_t* ptr, ptrdiff_t stride, int vertical,
+                         int dE, int dEp, int dEq, int tc, int filterP, int filterQ)
+{
+  __m128i s[8];
+  if (vertical) load_vert(ptr, stride, s); else load_horiz(ptr, stride, s);
+
+  const __m128i p3=s[0], p2=s[1], p1=s[2], p0=s[3];
+  const __m128i q0=s[4], q1=s[5], q2=s[6], q3=s[7];
+
+  if (dE==2) {
+    // strong filtering
+    const __m128i v2tc = _mm_set1_epi32(2*tc);
+    const __m128i c4 = _mm_set1_epi32(4);
+    const __m128i c2 = _mm_set1_epi32(2);
+
+    __m128i pn0 = _mm_srai_epi32(_mm_add_epi32(add3(p2, x2(p1), x2(p0)), add3(x2(q0), q1, c4)), 3);
+    pn0 = clip3(_mm_sub_epi32(p0,v2tc), _mm_add_epi32(p0,v2tc), pn0);
+    __m128i pn1 = _mm_srai_epi32(_mm_add_epi32(add3(p2,p1,p0), _mm_add_epi32(q0,c2)), 2);
+    pn1 = clip3(_mm_sub_epi32(p1,v2tc), _mm_add_epi32(p1,v2tc), pn1);
+    __m128i pn2 = _mm_srai_epi32(_mm_add_epi32(add3(x2(p3), _mm_add_epi32(x2(p2),p2), p1), add3(p0,q0,c4)), 3);
+    pn2 = clip3(_mm_sub_epi32(p2,v2tc), _mm_add_epi32(p2,v2tc), pn2);
+
+    __m128i qn0 = _mm_srai_epi32(_mm_add_epi32(add3(p1, x2(p0), x2(q0)), add3(x2(q1), q2, c4)), 3);
+    qn0 = clip3(_mm_sub_epi32(q0,v2tc), _mm_add_epi32(q0,v2tc), qn0);
+    __m128i qn1 = _mm_srai_epi32(_mm_add_epi32(add3(p0,q0,q1), _mm_add_epi32(q2,c2)), 2);
+    qn1 = clip3(_mm_sub_epi32(q1,v2tc), _mm_add_epi32(q1,v2tc), qn1);
+    __m128i qn2 = _mm_srai_epi32(_mm_add_epi32(add3(p0,q0,q1), add3(_mm_add_epi32(x2(q2),q2), x2(q3), c4)), 3);
+    qn2 = clip3(_mm_sub_epi32(q2,v2tc), _mm_add_epi32(q2,v2tc), qn2);
+
+    if (filterP) { s[3]=pn0; s[2]=pn1; s[1]=pn2; }
+    if (filterQ) { s[4]=qn0; s[5]=qn1; s[6]=qn2; }
+  }
+  else {
+    // weak filtering
+    const __m128i vtc = _mm_set1_epi32(tc);
+    const __m128i delta0 = _mm_srai_epi32(
+        _mm_add_epi32(_mm_sub_epi32(_mm_mullo_epi32(_mm_set1_epi32(9), _mm_sub_epi32(q0,p0)),
+                                    _mm_mullo_epi32(_mm_set1_epi32(3), _mm_sub_epi32(q1,p1))),
+                      _mm_set1_epi32(8)), 4);
+    // per-line mask: abs(delta) < tc*10
+    __m128i mask = _mm_cmpgt_epi32(_mm_set1_epi32(tc*10), _mm_abs_epi32(delta0));
+    __m128i delta = clip3(_mm_set1_epi32(-tc), vtc, delta0);
+
+    if (filterP) {
+      __m128i p0n = clip_u8(_mm_add_epi32(p0, delta));
+      s[3] = _mm_blendv_epi8(p0, p0n, mask);
+    }
+    if (filterQ) {
+      __m128i q0n = clip_u8(_mm_sub_epi32(q0, delta));
+      s[4] = _mm_blendv_epi8(q0, q0n, mask);
+    }
+    if (dEp && filterP) {
+      const __m128i htc = _mm_set1_epi32(tc>>1);
+      __m128i dp = _mm_srai_epi32(_mm_add_epi32(_mm_sub_epi32(_mm_srai_epi32(_mm_add_epi32(_mm_add_epi32(p2,p0),_mm_set1_epi32(1)),1), p1), delta), 1);
+      dp = clip3(_mm_sub_epi32(_mm_setzero_si128(),htc), htc, dp);
+      __m128i p1n = clip_u8(_mm_add_epi32(p1, dp));
+      s[2] = _mm_blendv_epi8(p1, p1n, mask);
+    }
+    if (dEq && filterQ) {
+      const __m128i htc = _mm_set1_epi32(tc>>1);
+      __m128i dq = _mm_srai_epi32(_mm_sub_epi32(_mm_sub_epi32(_mm_srai_epi32(_mm_add_epi32(_mm_add_epi32(q2,q0),_mm_set1_epi32(1)),1), q1), delta), 1);
+      dq = clip3(_mm_sub_epi32(_mm_setzero_si128(),htc), htc, dq);
+      __m128i q1n = clip_u8(_mm_add_epi32(q1, dq));
+      s[5] = _mm_blendv_epi8(q1, q1n, mask);
+    }
+  }
+
+  if (vertical) store_vert(ptr, stride, s); else store_horiz(ptr, stride, s);
+}
+
+// Note: an SSE chroma deblock filter was implemented and benchmarked too, but
+// the chroma filter is a single delta per line -- so trivial that it is fully
+// load/store-bound (the vertical case needs a strided 2-column scatter), and
+// SSE measured slower than scalar (~0.5-0.9x). Chroma deblock therefore stays
+// on the scalar fallback; only luma is accelerated here.
+
+#endif // HAVE_SSE4_1
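The weak-filter branch of `deblock_luma_8_sse4` is easier to follow for a single line across the edge. The scalar sketch below mirrors the vector arithmetic of that branch step by step. The helper names `clip3i`, `clip_u8i` and `weak_filter_line` are illustrative and are not part of libde265, and the sketch assumes arithmetic right shifts on negative values (matching `_mm_srai_epi32`).

```cpp
#include <cstdlib>

static inline int clip3i(int lo, int hi, int v) { return v < lo ? lo : (v > hi ? hi : v); }
static inline int clip_u8i(int v) { return clip3i(0, 255, v); }

// Illustrative scalar version of the weak luma filter for one line.
// s[0..7] = p3 p2 p1 p0 q0 q1 q2 q3, the same sample order as in the SSE code.
void weak_filter_line(int s[8], int tc, bool dEp, bool dEq, bool filterP, bool filterQ)
{
  const int p2 = s[1], p1 = s[2], p0 = s[3];
  const int q0 = s[4], q1 = s[5], q2 = s[6];

  const int delta0 = (9 * (q0 - p0) - 3 * (q1 - p1) + 8) >> 4;
  if (std::abs(delta0) >= tc * 10) return;   // per-line mask is false: line untouched

  const int delta = clip3i(-tc, tc, delta0);
  if (filterP) s[3] = clip_u8i(p0 + delta);
  if (filterQ) s[4] = clip_u8i(q0 - delta);

  const int htc = tc >> 1;
  if (dEp && filterP) {
    const int dp = clip3i(-htc, htc, (((p2 + p0 + 1) >> 1) - p1 + delta) >> 1);
    s[2] = clip_u8i(p1 + dp);
  }
  if (dEq && filterQ) {
    const int dq = clip3i(-htc, htc, (((q2 + q0 + 1) >> 1) - q1 - delta) >> 1);
    s[5] = clip_u8i(q1 + dq);
  }
}
```

For a flat step from 100 to 110 with `tc = 4`, `delta0` is 4, so the samples next to the edge move to 104 and 106 and their neighbours to 102 and 108. The vector version computes the same values for four lines at once and uses the comparison mask with `_mm_blendv_epi8` in place of the early return.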
libde265-1.1.1.tar.gz/libde265/x86/sse-deblk.h
Added
@@ -0,0 +1,33 @@
+/*
+ * H.265 video codec.
+ * Copyright (c) 2026 Dirk Farin <dirk.farin@gmail.com>
+ *
+ * This file is part of libde265.
+ *
+ * libde265 is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as
+ * published by the Free Software Foundation, either version 3 of
+ * the License, or (at your option) any later version.
+ *
+ * libde265 is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public License
+ * along with libde265. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef SSE_DEBLK_H
+#define SSE_DEBLK_H
+
+#include <stddef.h>
+#include <stdint.h>
+
+// SSE4.1 8-bit deblocking of one 4-line edge segment. Bit-identical to the
+// scalar kernels in fallback-deblk.h (verified by dev-tools/test-deblk).
+void deblock_luma_8_sse4(uint8_t* ptr, ptrdiff_t stride, int vertical,
+                         int dE, int dEp, int dEq, int tc, int filterP, int filterQ);
+// (chroma deblock stays scalar — SSE measured slower; see sse-deblk.cc)
+
+#endif
libde265-1.1.1.tar.gz/libde265/x86/sse-intrapred.cc
Added
@@ -0,0 +1,224 @@
+/*
+ * H.265 video codec.
+ * Copyright (c) 2026 Dirk Farin <dirk.farin@gmail.com>
+ *
+ * This file is part of libde265.
+ *
+ * libde265 is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as
+ * published by the Free Software Foundation, either version 3 of
+ * the License, or (at your option) any later version.
+ *
+ * libde265 is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public License
+ * along with libde265. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "x86/sse-intrapred.h"
+#include "libde265/util.h"
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+#include <string.h>
+#include <emmintrin.h> // SSE2
+
+#if HAVE_SSE4_1
+#include <smmintrin.h> // SSE4.1
+
+// angle / inverse-angle lookup tables, defined in intrapred.cc
+extern const int intraPredAngle_table[1+34];
+extern const int invAngle_table[25-10];
+
+namespace {
+
+const int kMaxBlk = 64; // == MAX_INTRA_PRED_BLOCK_SIZE
+
+// load 4 / 8 consecutive uint8_t and zero-extend to 8x int16
+inline __m128i load4_epi16(const uint8_t* p) {
+  int32_t t; memcpy(&t, p, 4);
+  return _mm_cvtepu8_epi16(_mm_cvtsi32_si128(t));
+}
+inline __m128i load8_epi16(const uint8_t* p) {
+  return _mm_cvtepu8_epi16(_mm_loadl_epi64((const __m128i*)p));
+}
+
+// store nT bytes from the low lanes of an u8-packed register (nT in {4,8,16,32})
+inline void store_row(uint8_t* d, __m128i v, int nT) {
+  switch (nT) {
+    case 4:  { int32_t t = _mm_cvtsi128_si32(v); memcpy(d, &t, 4); } break;
+    case 8:  _mm_storel_epi64((__m128i*)d, v); break;
+    case 16: _mm_storeu_si128((__m128i*)d, v); break;
+    default: _mm_storeu_si128((__m128i*)d, v);
+             _mm_storeu_si128((__m128i*)(d+16), v); break;
+  }
+}
+
+// copy exactly nT bytes (nT in {4,8,16,32})
+inline void copy_row(uint8_t* d, const uint8_t* s, int nT) {
+  switch (nT) {
+    case 4:  memcpy(d, s, 4); break;
+    case 8:  memcpy(d, s, 8); break;
+    case 16: _mm_storeu_si128((__m128i*)d, _mm_loadu_si128((const __m128i*)s)); break;
+    default: _mm_storeu_si128((__m128i*)d, _mm_loadu_si128((const __m128i*)s));
+             _mm_storeu_si128((__m128i*)(d+16), _mm_loadu_si128((const __m128i*)(s+16))); break;
+  }
+}
+
+inline int shift_for(int nT) { return (nT==4)?3 : (nT==8)?4 : (nT==16)?5 : 6; } // Log2(nT)+1
+
+} // namespace
+
+
+void intra_pred_dc_8_sse4(uint8_t* dst, ptrdiff_t stride, int nT, int cIdx, const uint8_t* border)
+{
+  const int shift = shift_for(nT);
+
+  int dcVal = 0;
+  for (int i=0;i<nT;i++) { dcVal += border[i+1]; dcVal += border[-i-1]; }
+  dcVal += nT;
+  dcVal >>= shift;
+
+  const __m128i v = _mm_set1_epi8((char)dcVal);
+  for (int y=0;y<nT;y++) store_row(dst + y*stride, v, nT);
+
+  // luma edge smoothing overwrites first row and first column (disjoint cells)
+  if (cIdx==0 && nT<32) {
+    dst[0] = (uint8_t)((border[-1] + 2*dcVal + border[1] + 2) >> 2);
+    for (int x=1;x<nT;x++) dst[x] = (uint8_t)((border[ x+1] + 3*dcVal + 2) >> 2);
+    for (int y=1;y<nT;y++) dst[y*stride] = (uint8_t)((border[-y-1] + 3*dcVal + 2) >> 2);
+  }
+}
+
+
+void intra_pred_planar_8_sse4(uint8_t* dst, ptrdiff_t stride, int nT, int cIdx, const uint8_t* border)
+{
+  const int shift = shift_for(nT);
+  const int TR = border[ 1+nT];   // top-right corner sample
+  const int BL = border[-1-nT];   // bottom-left corner sample
+
+  const __m128i base = _mm_setr_epi16(0,1,2,3,4,5,6,7);
+  const __m128i vTR = _mm_set1_epi16((short)TR);
+  const __m128i vNTm1 = _mm_set1_epi16((short)(nT-1));
+  const __m128i one = _mm_set1_epi16(1);
+
+  for (int y=0;y<nT;y++) {
+    const int left_y = border[-1-y];
+    const int Cy = (y+1)*BL + nT;   // constant term for this row
+    const __m128i vL = _mm_set1_epi16((short)left_y);
+    const __m128i vNT1Y = _mm_set1_epi16((short)(nT-1-y));
+    const __m128i vC = _mm_set1_epi16((short)Cy);
+
+    for (int x=0;x<nT;x+=8) {
+      const __m128i xidx = _mm_add_epi16(_mm_set1_epi16((short)x), base);
+      const __m128i vA = _mm_sub_epi16(vNTm1, xidx);   // (nT-1-x)
+      const __m128i vB = _mm_add_epi16(xidx, one);     // (x+1)
+      const __m128i top = (nT==4) ? load4_epi16(border+1+x) : load8_epi16(border+1+x);
+
+      __m128i acc = _mm_mullo_epi16(vA, vL);                   // (nT-1-x)*border[-1-y]
+      acc = _mm_add_epi16(acc, _mm_mullo_epi16(vB, vTR));      // (x+1)*border[1+nT]
+      acc = _mm_add_epi16(acc, _mm_mullo_epi16(top, vNT1Y));   // (nT-1-y)*border[1+x]
+      acc = _mm_add_epi16(acc, vC);                            // (y+1)*border[-1-nT] + nT
+      acc = _mm_srli_epi16(acc, shift);
+
+      const __m128i p = _mm_packus_epi16(acc, acc);
+      store_row(dst + y*stride + x, p, (nT<8)?nT:8);
+    }
+  }
+}
+
+
+void intra_pred_angular_8_sse4(uint8_t* dst, ptrdiff_t stride, int bit_depth, int disableBoundaryFilter,
+                               int xB0, int yB0, int mode, int nT, int cIdx, const uint8_t* border)
+{
+  const int intraPredAngle = intraPredAngle_table[mode];
+
+  uint8_t ref_mem[4*kMaxBlk+1];
+  uint8_t* ref = &ref_mem[2*kMaxBlk];
+
+  if (mode >= 18) {
+    for (int x=0;x<=nT;x++) ref[x] = border[x];
+
+    if (intraPredAngle<0) {
+      const int invAngle = invAngle_table[mode-11];
+      if (((nT*intraPredAngle)>>5) < -1) {
+        for (int x=(nT*intraPredAngle)>>5; x<=-1; x++)
+          ref[x] = border[0-((x*invAngle+128)>>8)];
+      }
+    } else {
+      for (int x=nT+1; x<=2*nT; x++) ref[x] = border[x];
+    }
+
+    for (int y=0;y<nT;y++) {
+      const int iIdx = ((y+1)*intraPredAngle)>>5;
+      const int iFact = ((y+1)*intraPredAngle)&31;
+      const uint8_t* src = ref + iIdx + 1;
+      uint8_t* d = dst + y*stride;
+
+      if (iFact==0) {
+        copy_row(d, src, nT);   // dst[x] = ref[x+iIdx+1]
+      } else {
+        const __m128i w0 = _mm_set1_epi16((short)(32-iFact));
+        const __m128i w1 = _mm_set1_epi16((short)iFact);
+        const __m128i r16 = _mm_set1_epi16(16);
+        if (nT==4) {
+          __m128i a = load4_epi16(src), b = load4_epi16(src+1);
+          __m128i acc = _mm_add_epi16(_mm_add_epi16(_mm_mullo_epi16(a,w0), _mm_mullo_epi16(b,w1)), r16);
+          acc = _mm_srli_epi16(acc, 5);
+          store_row(d, _mm_packus_epi16(acc,acc), 4);
+        } else {
+          for (int x=0;x<nT;x+=8) {
+            __m128i a = load8_epi16(src+x), b = load8_epi16(src+x+1);
+            __m128i acc = _mm_add_epi16(_mm_add_epi16(_mm_mullo_epi16(a,w0), _mm_mullo_epi16(b,w1)), r16);
+            acc = _mm_srli_epi16(acc, 5);
+            store_row(d+x, _mm_packus_epi16(acc,acc), 8);
+          }
+        }
+      }
+    }
+
+    if (mode==26 && cIdx==0 && nT<32 && !disableBoundaryFilter) {
+      for (int y=0;y<nT;y++)
+        dst[y*stride] = (uint8_t)Clip_BitDepth(border[1] + ((border[-1-y] - border[0])>>1), bit_depth);
+    }
+  }
+  else {
+    // Modes 2..17: the reference projection is transposed (per-column iIdx/iFact and
+    // a row-indexed reference fetch), which does not map onto contiguous SIMD loads or
+    // stores. Use the scalar reference path here -- kept bit-identical to
+    // intra_prediction_angular() in intrapred.h.
+    for (int x=0;x<=nT;x++) ref[x] = border[-x];
+
+    if (intraPredAngle<0) {
+      const int invAngle = invAngle_table[mode-11];
+      if (((nT*intraPredAngle)>>5) < -1) {
+        for (int x=(nT*intraPredAngle)>>5; x<=-1; x++)
+          ref[x] = border[(x*invAngle+128)>>8];
+      }
+    } else {
+      for (int x=nT+1; x<=2*nT; x++) ref[x] = border[-x];
+    }
+
+    for (int y=0;y<nT;y++)
+      for (int x=0;x<nT;x++) {
+        const int iIdx = ((x+1)*intraPredAngle)>>5;
+        const int iFact = ((x+1)*intraPredAngle)&31;
+        if (iFact != 0)
+          dst[x+y*stride] = (uint8_t)(((32-iFact)*ref[y+iIdx+1] + iFact*ref[y+iIdx+2] + 16)>>5);
+        else
+          dst[x+y*stride] = ref[y+iIdx+1];
+      }
+
+    if (mode==10 && cIdx==0 && nT<32 && !disableBoundaryFilter) {
+      for (int x=0;x<nT;x++)
+        dst[x] = (uint8_t)Clip_BitDepth(border[-1] + ((border[1+x] - border[0])>>1), bit_depth);
+    }
+  }
+}
+
+#endif // HAVE_SSE4_1
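The four-term accumulation in `intra_pred_planar_8_sse4` can be checked against a plain scalar loop. The sketch below computes the same sum per pixel, as read off the comments in the vector code. `planar_pred_scalar_sketch` is a hypothetical helper, not the scalar kernel from intrapred.h, and the border layout (top row at positive indices, left column at negative indices) is taken from the indexing used in the diff.

```cpp
#include <cstddef>
#include <cstdint>

// Scalar planar prediction sketch (hypothetical helper, not a libde265 function).
// border[1..] is the row above the block (left to right),
// border[-1..] is the column left of the block (top to bottom).
void planar_pred_scalar_sketch(uint8_t* dst, ptrdiff_t stride, int nT, const uint8_t* border)
{
  int shift = 1;                                   // Log2(nT)+1
  for (int n = nT; n > 1; n >>= 1) shift++;

  const int TR = border[ 1 + nT];                  // top-right corner sample
  const int BL = border[-1 - nT];                  // bottom-left corner sample

  for (int y = 0; y < nT; y++)
    for (int x = 0; x < nT; x++) {
      const int v = (nT - 1 - x) * border[-1 - y] + (x + 1) * TR
                  + (nT - 1 - y) * border[ 1 + x] + (y + 1) * BL + nT;
      dst[y * stride + x] = static_cast<uint8_t>(v >> shift);
    }
}
```

The four weights always add up to `2*nT`, so with 8-bit samples the sum is at most `2*32*255 + 32 = 16352` for the largest block. That fits in 16 bits, which is consistent with the vector code accumulating in int16 lanes.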
libde265-1.1.1.tar.gz/libde265/x86/sse-intrapred.h
Added
@@ -0,0 +1,35 @@
+/*
+ * H.265 video codec.
+ * Copyright (c) 2026 Dirk Farin <dirk.farin@gmail.com>
+ *
+ * This file is part of libde265.
+ *
+ * libde265 is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as
+ * published by the Free Software Foundation, either version 3 of
+ * the License, or (at your option) any later version.
+ *
+ * libde265 is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public License
+ * along with libde265. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef SSE_INTRAPRED_H
+#define SSE_INTRAPRED_H
+
+#include <stddef.h>
+#include <stdint.h>
+
+// SSE4.1 accelerated 8-bit intra prediction. All three are bit-identical to the
+// scalar kernels in intrapred.h (verified by dev-tools/test-intrapred.cc).
+
+void intra_pred_dc_8_sse4 (uint8_t* dst, ptrdiff_t stride, int nT, int cIdx, const uint8_t* border);
+void intra_pred_planar_8_sse4(uint8_t* dst, ptrdiff_t stride, int nT, int cIdx, const uint8_t* border);
+void intra_pred_angular_8_sse4(uint8_t* dst, ptrdiff_t stride, int bit_depth, int disableBoundaryFilter,
+                               int xB0, int yB0, int mode, int nT, int cIdx, const uint8_t* border);
+
+#endif
libde265-1.0.17.tar.gz/libde265/x86/sse.cc -> libde265-1.1.1.tar.gz/libde265/x86/sse.cc
Changed
@@ -25,11 +25,20 @@
 #include "x86/sse.h"
 #include "x86/sse-motion.h"
 #include "x86/sse-dct.h"
+#include "x86/sse-intrapred.h"
+#include "x86/sse-deblk.h"
 
 #ifdef HAVE_CONFIG_H
 #include "config.h"
 #endif
 
+#if HAVE_AVX2
+#include "x86/transform-avx2.h"
+#endif
+#if HAVE_AVX512
+#include "x86/transform-avx512.h"
+#endif
+
 #if defined(__GNUC__) && !defined(__EMSCRIPTEN__)
 #include <cpuid.h>
 #endif
@@ -109,7 +118,55 @@
     accel->transform_add_8[1] = ff_hevc_transform_8x8_add_8_sse4;
     accel->transform_add_8[2] = ff_hevc_transform_16x16_add_8_sse4;
     accel->transform_add_8[3] = ff_hevc_transform_32x32_add_8_sse4;
+
+    accel->add_residual_8 = add_residual_8_sse4;
+    accel->add_residual_16 = add_residual_16_sse4;
+    accel->dequant_coeff_block = dequant_coeff_block_sse4;
+
+    accel->intra_pred_dc_8 = intra_pred_dc_8_sse4;
+    accel->intra_pred_planar_8 = intra_pred_planar_8_sse4;
+    accel->intra_pred_angular_8 = intra_pred_angular_8_sse4;
+
+    accel->deblock_luma_8 = deblock_luma_8_sse4;
+    // chroma deblock stays on the scalar fallback: the filter is too cheap to
+    // amortize the SIMD load/transpose/scatter overhead (SSE measured slower).
+  }
+#endif
+}
+
+
+void init_acceleration_functions_avx2(struct acceleration_functions* accel)
+{
+#if HAVE_AVX2
+  // __builtin_cpu_supports("avx2") handles the OSXSAVE / XGETBV (YMM-enabled)
+  // checks internally, so this is safe to call on any CPU. This TU is *not*
+  // compiled with -mavx2, so reaching here never executes an AVX2 instruction.
+#if defined(__GNUC__) && !defined(__EMSCRIPTEN__)
+  __builtin_cpu_init();
+  if (__builtin_cpu_supports("avx2")) {
+    accel->transform_add_8[2] = transform_16x16_add_8_avx2;
+    accel->transform_add_8[3] = transform_32x32_add_8_avx2;
+    // NB: dequant intentionally stays on the SSE version. An AVX2 variant was
+    // implemented and benchmarked, but inverse quantization is scatter-bound, so
+    // the wider arithmetic gave no benefit and AVX2 measured actually slightly
+    // slower than SSE. (AVX-512 would be no better, for the same reason.)
   }
 #endif
+#endif
+}
+
+
+void init_acceleration_functions_avx512(struct acceleration_functions* accel)
+{
+#if HAVE_AVX512
+#if defined(__GNUC__) && !defined(__EMSCRIPTEN__)
+  __builtin_cpu_init();
+  // need AVX-512F + AVX-512BW (16-bit ops). __builtin_cpu_supports handles the
+  // OS (XCR0/ZMM-enabled) check. This TU is not compiled with -mavx512*.
+  if (__builtin_cpu_supports("avx512f") && __builtin_cpu_supports("avx512bw")) {
+    accel->transform_add_8[3] = transform_32x32_add_8_avx512;
+  }
+#endif
+#endif
 }
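The init functions in sse.cc follow a layered-override pattern: a table of function pointers starts with a baseline, and each later init call replaces entries only after a runtime CPU check. The sketch below shows that pattern in isolation. Everything in it is hypothetical (`accel_sketch`, `sum16_scalar`, `sum16_wide_standin`, the two init functions), the "wide" kernel is ordinary C++ standing in for code that would live in a separately compiled AVX2 translation unit, and the extra x86 architecture test in the preprocessor guard is an addition of this sketch rather than something the diff contains.

```cpp
#include <cstdint>

// Hypothetical one-entry dispatch table standing in for acceleration_functions.
struct accel_sketch {
  int32_t (*sum16)(const int16_t* v, int n);
  const char* impl;
};

static int32_t sum16_scalar(const int16_t* v, int n) {
  int32_t s = 0;
  for (int i = 0; i < n; i++) s += v[i];
  return s;
}

// Stand-in for a kernel that would be built in its own -mavx2 translation unit.
// Plain C++ here so the sketch runs on any CPU; it must return the same result.
static int32_t sum16_wide_standin(const int16_t* v, int n) {
  int32_t s0 = 0, s1 = 0;
  int i = 0;
  for (; i + 1 < n; i += 2) { s0 += v[i]; s1 += v[i + 1]; }
  if (i < n) s0 += v[i];
  return s0 + s1;
}

void init_sketch_scalar(accel_sketch* a) {
  a->sum16 = sum16_scalar;
  a->impl = "scalar";
}

// Safe to call on any CPU: the pointer is swapped only after a runtime check,
// and this function itself contains no AVX2 instructions.
void init_sketch_avx2(accel_sketch* a) {
#if defined(__GNUC__) && !defined(__EMSCRIPTEN__) && (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init();
  if (__builtin_cpu_supports("avx2")) {
    a->sum16 = sum16_wide_standin;
    a->impl = "avx2 (stand-in)";
  }
#else
  (void)a;   // no-op on other compilers and architectures
#endif
}
```

A caller would run `init_sketch_scalar` first and `init_sketch_avx2` afterwards. Later calls only overwrite entries, so the table stays fully populated when a feature is missing.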
libde265-1.0.17.tar.gz/libde265/x86/sse.h -> libde265-1.1.1.tar.gz/libde265/x86/sse.h
Changed
@@ -25,4 +25,13 @@
 
 void init_acceleration_functions_sse(struct acceleration_functions* accel);
 
+// Overrides selected transform kernels with AVX2 versions, but only if the
+// running CPU actually supports AVX2 (checked at runtime). Safe to call on any
+// CPU; a no-op when AVX2 is unavailable.
+void init_acceleration_functions_avx2(struct acceleration_functions* accel);
+
+// Overrides selected transform kernels with AVX-512 versions, runtime-checked.
+// Safe to call on any CPU; a no-op when AVX-512 is unavailable.
+void init_acceleration_functions_avx512(struct acceleration_functions* accel);
+
 #endif
libde265-1.1.1.tar.gz/libde265/x86/transform-avx2.cc
Added
@@ -0,0 +1,318 @@
+/*
+ * H.265 video codec.
+ * Copyright (c) 2026 Dirk Farin <dirk.farin@gmail.com>
+ *
+ * This file is part of libde265.
+ *
+ * libde265 is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as
+ * published by the Free Software Foundation, either version 3 of
+ * the License, or (at your option) any later version.
+ *
+ * libde265 is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public License
+ * along with libde265. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+// AVX2 inverse transforms (16x16, 32x32) for 8-bit.
+//
+// These produce bit-identical results to the SSE4.1 / scalar kernels but
+// process all columns of a block in 256-bit registers (16 int16 per register)
+// instead of 8-column slabs. The per-column butterfly arithmetic matches the
+// SSE version: every 256-bit op acts independently on its two 128-bit halves,
+// so each half does exactly what the SSE code does for 8 columns. The only
+// genuinely new piece is a full 16x16 / 32x32 in-register transpose between the
+// vertical and horizontal passes (the SSE code only transposed 8x8 sub-blocks).
+//
+// The 32-point inverse transform's even part is itself a 16-point inverse
+// transform of the even-indexed input rows, so idct16_core() is reused for it.
+
+#include "x86/transform-avx2.h"
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+#if HAVE_AVX2
+
+#include <immintrin.h>
+#include "x86/transform-dct-tables.h" // idct_T16_{1,2,3}, idct_T32
+
+namespace {
+
+#define shift_1st 7
+#define add_1st (1 << (shift_1st - 1)) // 64
+// 8-bit: shift_2nd = 20 - 8 = 12, add_2nd = 1<<11
+#define shift_2nd 12
+#define add_2nd (1 << (shift_2nd - 1)) // 2048
+
+static inline __m256i bc(const int16_t* p) {
+  return _mm256_broadcastsi128_si256(_mm_load_si128((const __m128i*)p));
+}
+
+// --- transposes ----------------------------------------------------------
+
+// 8x8 int16 transpose performed independently in each 128-bit lane.
+static inline void transpose8x8_lanes(__m256i a[8]) {
+  __m256i b0=_mm256_unpacklo_epi16(a[0],a[1]), b1=_mm256_unpackhi_epi16(a[0],a[1]);
+  __m256i b2=_mm256_unpacklo_epi16(a[2],a[3]), b3=_mm256_unpackhi_epi16(a[2],a[3]);
+  __m256i b4=_mm256_unpacklo_epi16(a[4],a[5]), b5=_mm256_unpackhi_epi16(a[4],a[5]);
+  __m256i b6=_mm256_unpacklo_epi16(a[6],a[7]), b7=_mm256_unpackhi_epi16(a[6],a[7]);
+  __m256i c0=_mm256_unpacklo_epi32(b0,b2), c1=_mm256_unpackhi_epi32(b0,b2);
+  __m256i c2=_mm256_unpacklo_epi32(b1,b3), c3=_mm256_unpackhi_epi32(b1,b3);
+  __m256i c4=_mm256_unpacklo_epi32(b4,b6), c5=_mm256_unpackhi_epi32(b4,b6);
+  __m256i c6=_mm256_unpacklo_epi32(b5,b7), c7=_mm256_unpackhi_epi32(b5,b7);
+  a[0]=_mm256_unpacklo_epi64(c0,c4); a[1]=_mm256_unpackhi_epi64(c0,c4);
+  a[2]=_mm256_unpacklo_epi64(c1,c5); a[3]=_mm256_unpackhi_epi64(c1,c5);
+  a[4]=_mm256_unpacklo_epi64(c2,c6); a[5]=_mm256_unpackhi_epi64(c2,c6);
+  a[6]=_mm256_unpacklo_epi64(c3,c7); a[7]=_mm256_unpackhi_epi64(c3,c7);
+}
+
+// Full 16x16 int16 transpose of S[0..15] (each ymm = one row: lane0 cols0-7,
+// lane1 cols8-15): four 8x8 quadrant transposes + a lane swap.
+static inline void transpose16x16(__m256i S[16]) {
+  __m256i top[8], bot[8];
+  for (int r=0;r<8;r++){ top[r]=S[r]; bot[r]=S[r+8]; }
+  transpose8x8_lanes(top);
+  transpose8x8_lanes(bot);
+  for (int r=0;r<8;r++){
+    S[r] = _mm256_permute2x128_si256(top[r], bot[r], 0x20);
+    S[r+8] = _mm256_permute2x128_si256(top[r], bot[r], 0x31);
+  }
+}
+
+// Full 32x32 int16 transpose. Lo[r]=cols0-15 of row r, Hi[r]=cols16-31.
+// Built from four 16x16 block transposes: out = [[A^T,C^T],[B^T,D^T]].
+static inline void transpose32x32(__m256i Lo[32], __m256i Hi[32]) {
+  __m256i A[16],B[16],C[16],D[16];
+  for (int r=0;r<16;r++){ A[r]=Lo[r]; B[r]=Hi[r]; C[r]=Lo[r+16]; D[r]=Hi[r+16]; }
+  transpose16x16(A); transpose16x16(B); transpose16x16(C); transpose16x16(D);
+  for (int r=0;r<16;r++){ Lo[r]=A[r]; Hi[r]=C[r]; Lo[r+16]=B[r]; Hi[r+16]=D[r]; }
+}
+
+// --- 16-point core -------------------------------------------------------
+
+// Computes the 16 outputs of a 1-D 16-point inverse transform applied down the
+// 16 rows S[0..15], for 16 columns at once, as int32 (no rounding/shift). The
+// result lanes split lo = cols{0-3,8-11}, hi = cols{4-7,12-15} (recombined to
+// natural order by packs_epi32 in the caller).
+static inline void idct16_core(const __m256i S16, __m256i resL16, __m256i resH16) { + const __m256i T00=bc(idct_T16_100),T01=bc(idct_T16_101),T02=bc(idct_T16_102),T03=bc(idct_T16_103); + const __m256i T04=bc(idct_T16_104),T05=bc(idct_T16_105),T06=bc(idct_T16_106),T07=bc(idct_T16_107); + const __m256i T10=bc(idct_T16_110),T11=bc(idct_T16_111),T12=bc(idct_T16_112),T13=bc(idct_T16_113); + const __m256i T14=bc(idct_T16_114),T15=bc(idct_T16_115),T16=bc(idct_T16_116),T17=bc(idct_T16_117); + const __m256i T20=bc(idct_T16_120),T21=bc(idct_T16_121),T22=bc(idct_T16_122),T23=bc(idct_T16_123); + const __m256i T24=bc(idct_T16_124),T25=bc(idct_T16_125),T26=bc(idct_T16_126),T27=bc(idct_T16_127); + const __m256i T30=bc(idct_T16_130),T31=bc(idct_T16_131),T32_=bc(idct_T16_132),T33=bc(idct_T16_133); + const __m256i T34=bc(idct_T16_134),T35=bc(idct_T16_135),T36=bc(idct_T16_136),T37=bc(idct_T16_137); + const __m256i U00=bc(idct_T16_200),U01=bc(idct_T16_201),U02=bc(idct_T16_202),U03=bc(idct_T16_203); + const __m256i U10=bc(idct_T16_210),U11=bc(idct_T16_211),U12=bc(idct_T16_212),U13=bc(idct_T16_213); + const __m256i V00=bc(idct_T16_300),V01=bc(idct_T16_301),V10=bc(idct_T16_310),V11=bc(idct_T16_311); + + __m256i m0,m1,m2,m3,m4,m5,m6,m7; + __m256i E0l,E1l,E2l,E3l,E0h,E1h,E2h,E3h; + __m256i O0l,O1l,O2l,O3l,O4l,O5l,O6l,O7l,O0h,O1h,O2h,O3h,O4h,O5h,O6h,O7h; + __m256i E00l,E01l,E00h,E01h,EE0l,EE1l,EE2l,EE3l,EE0h,EE1h,EE2h,EE3h; + __m256i E4l,E5l,E6l,E7l,E4h,E5h,E6h,E7h; + + m0=_mm256_unpacklo_epi16(S1,S3); E0l=_mm256_madd_epi16(m0,T00); + m1=_mm256_unpackhi_epi16(S1,S3); E0h=_mm256_madd_epi16(m1,T00); + m2=_mm256_unpacklo_epi16(S5,S7); E1l=_mm256_madd_epi16(m2,T10); + m3=_mm256_unpackhi_epi16(S5,S7); E1h=_mm256_madd_epi16(m3,T10); + m4=_mm256_unpacklo_epi16(S9,S11); E2l=_mm256_madd_epi16(m4,T20); + m5=_mm256_unpackhi_epi16(S9,S11); E2h=_mm256_madd_epi16(m5,T20); + m6=_mm256_unpacklo_epi16(S13,S15);E3l=_mm256_madd_epi16(m6,T30); + m7=_mm256_unpackhi_epi16(S13,S15);E3h=_mm256_madd_epi16(m7,T30); 
+ O0l=_mm256_add_epi32(_mm256_add_epi32(E0l,E1l),_mm256_add_epi32(E2l,E3l)); + O0h=_mm256_add_epi32(_mm256_add_epi32(E0h,E1h),_mm256_add_epi32(E2h,E3h)); + + E0l=_mm256_madd_epi16(m0,T01);E0h=_mm256_madd_epi16(m1,T01);E1l=_mm256_madd_epi16(m2,T11);E1h=_mm256_madd_epi16(m3,T11); + E2l=_mm256_madd_epi16(m4,T21);E2h=_mm256_madd_epi16(m5,T21);E3l=_mm256_madd_epi16(m6,T31);E3h=_mm256_madd_epi16(m7,T31); + O1l=_mm256_add_epi32(_mm256_add_epi32(E0l,E1l),_mm256_add_epi32(E2l,E3l)); + O1h=_mm256_add_epi32(_mm256_add_epi32(E0h,E1h),_mm256_add_epi32(E2h,E3h)); + + E0l=_mm256_madd_epi16(m0,T02);E0h=_mm256_madd_epi16(m1,T02);E1l=_mm256_madd_epi16(m2,T12);E1h=_mm256_madd_epi16(m3,T12); + E2l=_mm256_madd_epi16(m4,T22);E2h=_mm256_madd_epi16(m5,T22);E3l=_mm256_madd_epi16(m6,T32_);E3h=_mm256_madd_epi16(m7,T32_); + O2l=_mm256_add_epi32(_mm256_add_epi32(E0l,E1l),_mm256_add_epi32(E2l,E3l)); + O2h=_mm256_add_epi32(_mm256_add_epi32(E0h,E1h),_mm256_add_epi32(E2h,E3h)); + + E0l=_mm256_madd_epi16(m0,T03);E0h=_mm256_madd_epi16(m1,T03);E1l=_mm256_madd_epi16(m2,T13);E1h=_mm256_madd_epi16(m3,T13); + E2l=_mm256_madd_epi16(m4,T23);E2h=_mm256_madd_epi16(m5,T23);E3l=_mm256_madd_epi16(m6,T33);E3h=_mm256_madd_epi16(m7,T33); + O3l=_mm256_add_epi32(_mm256_add_epi32(E0l,E1l),_mm256_add_epi32(E2l,E3l)); + O3h=_mm256_add_epi32(_mm256_add_epi32(E0h,E1h),_mm256_add_epi32(E2h,E3h)); + + E0l=_mm256_madd_epi16(m0,T04);E0h=_mm256_madd_epi16(m1,T04);E1l=_mm256_madd_epi16(m2,T14);E1h=_mm256_madd_epi16(m3,T14); + E2l=_mm256_madd_epi16(m4,T24);E2h=_mm256_madd_epi16(m5,T24);E3l=_mm256_madd_epi16(m6,T34);E3h=_mm256_madd_epi16(m7,T34); + O4l=_mm256_add_epi32(_mm256_add_epi32(E0l,E1l),_mm256_add_epi32(E2l,E3l)); + O4h=_mm256_add_epi32(_mm256_add_epi32(E0h,E1h),_mm256_add_epi32(E2h,E3h)); + + E0l=_mm256_madd_epi16(m0,T05);E0h=_mm256_madd_epi16(m1,T05);E1l=_mm256_madd_epi16(m2,T15);E1h=_mm256_madd_epi16(m3,T15); + 
E2l=_mm256_madd_epi16(m4,T25);E2h=_mm256_madd_epi16(m5,T25);E3l=_mm256_madd_epi16(m6,T35);E3h=_mm256_madd_epi16(m7,T35); + O5l=_mm256_add_epi32(_mm256_add_epi32(E0l,E1l),_mm256_add_epi32(E2l,E3l)); + O5h=_mm256_add_epi32(_mm256_add_epi32(E0h,E1h),_mm256_add_epi32(E2h,E3h)); + + E0l=_mm256_madd_epi16(m0,T06);E0h=_mm256_madd_epi16(m1,T06);E1l=_mm256_madd_epi16(m2,T16);E1h=_mm256_madd_epi16(m3,T16); + E2l=_mm256_madd_epi16(m4,T26);E2h=_mm256_madd_epi16(m5,T26);E3l=_mm256_madd_epi16(m6,T36);E3h=_mm256_madd_epi16(m7,T36); + O6l=_mm256_add_epi32(_mm256_add_epi32(E0l,E1l),_mm256_add_epi32(E2l,E3l)); + O6h=_mm256_add_epi32(_mm256_add_epi32(E0h,E1h),_mm256_add_epi32(E2h,E3h)); + + E0l=_mm256_madd_epi16(m0,T07);E0h=_mm256_madd_epi16(m1,T07);E1l=_mm256_madd_epi16(m2,T17);E1h=_mm256_madd_epi16(m3,T17); + E2l=_mm256_madd_epi16(m4,T27);E2h=_mm256_madd_epi16(m5,T27);E3l=_mm256_madd_epi16(m6,T37);E3h=_mm256_madd_epi16(m7,T37); + O7l=_mm256_add_epi32(_mm256_add_epi32(E0l,E1l),_mm256_add_epi32(E2l,E3l)); + O7h=_mm256_add_epi32(_mm256_add_epi32(E0h,E1h),_mm256_add_epi32(E2h,E3h)); + + // even part + m0=_mm256_unpacklo_epi16(S2,S6); E0l=_mm256_madd_epi16(m0,U00); + m1=_mm256_unpackhi_epi16(S2,S6); E0h=_mm256_madd_epi16(m1,U00); + m2=_mm256_unpacklo_epi16(S10,S14);E0l=_mm256_add_epi32(E0l,_mm256_madd_epi16(m2,U10)); + m3=_mm256_unpackhi_epi16(S10,S14);E0h=_mm256_add_epi32(E0h,_mm256_madd_epi16(m3,U10)); + E1l=_mm256_madd_epi16(m0,U01);E1h=_mm256_madd_epi16(m1,U01); + E1l=_mm256_add_epi32(E1l,_mm256_madd_epi16(m2,U11));E1h=_mm256_add_epi32(E1h,_mm256_madd_epi16(m3,U11)); + E2l=_mm256_madd_epi16(m0,U02);E2h=_mm256_madd_epi16(m1,U02); + E2l=_mm256_add_epi32(E2l,_mm256_madd_epi16(m2,U12));E2h=_mm256_add_epi32(E2h,_mm256_madd_epi16(m3,U12)); + E3l=_mm256_madd_epi16(m0,U03);E3h=_mm256_madd_epi16(m1,U03); + E3l=_mm256_add_epi32(E3l,_mm256_madd_epi16(m2,U13));E3h=_mm256_add_epi32(E3h,_mm256_madd_epi16(m3,U13)); + + m0=_mm256_unpacklo_epi16(S4,S12); E00l=_mm256_madd_epi16(m0,V00); + 
m1=_mm256_unpackhi_epi16(S4,S12); E00h=_mm256_madd_epi16(m1,V00); + m2=_mm256_unpacklo_epi16(S0,S8); EE0l=_mm256_madd_epi16(m2,V10); + m3=_mm256_unpackhi_epi16(S0,S8); EE0h=_mm256_madd_epi16(m3,V10); + E01l=_mm256_madd_epi16(m0,V01);E01h=_mm256_madd_epi16(m1,V01); + EE1l=_mm256_madd_epi16(m2,V11);EE1h=_mm256_madd_epi16(m3,V11); + + EE2l=_mm256_sub_epi32(EE1l,E01l);EE3l=_mm256_sub_epi32(EE0l,E00l); + EE2h=_mm256_sub_epi32(EE1h,E01h);EE3h=_mm256_sub_epi32(EE0h,E00h); + EE0l=_mm256_add_epi32(EE0l,E00l);EE1l=_mm256_add_epi32(EE1l,E01l); + EE0h=_mm256_add_epi32(EE0h,E00h);EE1h=_mm256_add_epi32(EE1h,E01h); + + E4l=_mm256_sub_epi32(EE3l,E3l); E5l=_mm256_sub_epi32(EE2l,E2l); + E6l=_mm256_sub_epi32(EE1l,E1l); E7l=_mm256_sub_epi32(EE0l,E0l); + E4h=_mm256_sub_epi32(EE3h,E3h); E5h=_mm256_sub_epi32(EE2h,E2h); + E6h=_mm256_sub_epi32(EE1h,E1h); E7h=_mm256_sub_epi32(EE0h,E0h); + E0l=_mm256_add_epi32(EE0l,E0l); E1l=_mm256_add_epi32(EE1l,E1l); + E2l=_mm256_add_epi32(EE2l,E2l); E3l=_mm256_add_epi32(EE3l,E3l); + E0h=_mm256_add_epi32(EE0h,E0h); E1h=_mm256_add_epi32(EE1h,E1h); + E2h=_mm256_add_epi32(EE2h,E2h); E3h=_mm256_add_epi32(EE3h,E3h); + + resL0=_mm256_add_epi32(E0l,O0l); resH0=_mm256_add_epi32(E0h,O0h); + resL1=_mm256_add_epi32(E1l,O1l); resH1=_mm256_add_epi32(E1h,O1h); + resL2=_mm256_add_epi32(E2l,O2l); resH2=_mm256_add_epi32(E2h,O2h); + resL3=_mm256_add_epi32(E3l,O3l); resH3=_mm256_add_epi32(E3h,O3h); + resL4=_mm256_add_epi32(E4l,O4l); resH4=_mm256_add_epi32(E4h,O4h); + resL5=_mm256_add_epi32(E5l,O5l); resH5=_mm256_add_epi32(E5h,O5h); + resL6=_mm256_add_epi32(E6l,O6l); resH6=_mm256_add_epi32(E6h,O6h); + resL7=_mm256_add_epi32(E7l,O7l); resH7=_mm256_add_epi32(E7h,O7h); + resL15=_mm256_sub_epi32(E0l,O0l); resH15=_mm256_sub_epi32(E0h,O0h); + resL14=_mm256_sub_epi32(E1l,O1l); resH14=_mm256_sub_epi32(E1h,O1h); + resL13=_mm256_sub_epi32(E2l,O2l); resH13=_mm256_sub_epi32(E2h,O2h); + resL12=_mm256_sub_epi32(E3l,O3l); resH12=_mm256_sub_epi32(E3h,O3h); + resL11=_mm256_sub_epi32(E4l,O4l); 
resH[11]=_mm256_sub_epi32(E4h,O4h); + resL[10]=_mm256_sub_epi32(E5l,O5l); resH[10]=_mm256_sub_epi32(E5h,O5h); + resL[9] =_mm256_sub_epi32(E6l,O6l); resH[9] =_mm256_sub_epi32(E6h,O6h); + resL[8] =_mm256_sub_epi32(E7l,O7l); resH[8] =_mm256_sub_epi32(E7h,O7h); +} + +static inline __m256i round_shift_pack(__m256i lo, __m256i hi, __m256i vr, int shift) { + return _mm256_packs_epi32( + _mm256_srai_epi32(_mm256_add_epi32(lo,vr),shift), + _mm256_srai_epi32(_mm256_add_epi32(hi,vr),shift)); +} + +// 1-D 16-point inverse transform in place on S[0..15] (16 columns). +static inline void idct16_vpass(__m256i S[16], int add, int shift) { + __m256i resL[16], resH[16]; + idct16_core(S, resL, resH); + const __m256i vr = _mm256_set1_epi32(add); + for (int k=0;k<16;k++) S[k] = round_shift_pack(resL[k], resH[k], vr, shift); +} + +// 1-D 32-point inverse transform in place on S[0..31] (16 columns). Even part +// reuses idct16_core on the even rows; odd part uses the T32 coefficients. +static inline void idct32_vpass(__m256i S[32], int add, int shift) { + __m256i ev[16]; + for (int i=0;i<16;i++) ev[i]=S[2*i]; + __m256i EL[16], EH[16]; + idct16_core(ev, EL, EH); + + __m256i pl[8], ph[8]; + for (int g=0; g<8; g++) { + pl[g]=_mm256_unpacklo_epi16(S[4*g+1], S[4*g+3]); + ph[g]=_mm256_unpackhi_epi16(S[4*g+1], S[4*g+3]); + } + + const __m256i vr = _mm256_set1_epi32(add); + for (int k=0;k<16;k++) { + __m256i OL=_mm256_madd_epi16(pl[0], bc(idct_T32[0][k])); + __m256i OH=_mm256_madd_epi16(ph[0], bc(idct_T32[0][k])); + for (int g=1; g<8; g++) { + OL=_mm256_add_epi32(OL,_mm256_madd_epi16(pl[g], bc(idct_T32[g][k]))); + OH=_mm256_add_epi32(OH,_mm256_madd_epi16(ph[g], bc(idct_T32[g][k]))); + } + S[k] = round_shift_pack(_mm256_add_epi32(EL[k],OL), _mm256_add_epi32(EH[k],OH), vr, shift); + S[31-k] = round_shift_pack(_mm256_sub_epi32(EL[k],OL), _mm256_sub_epi32(EH[k],OH), vr, shift); + } +} + +} // namespace + + +void transform_16x16_add_8_avx2(uint8_t *dst, const int16_t *coeffs, ptrdiff_t stride) +{ + __m256i S[16]; + for (int i=0;i<16;i++) S[i]=_mm256_loadu_si256((const __m256i*)(coeffs + i*16)); + 
+ idct16_vpass(S, add_1st, shift_1st); + transpose16x16(S); + idct16_vpass(S, add_2nd, shift_2nd); + transpose16x16(S); + + for (int r=0;r<16;r++) { + uint8_t* d = dst + r*stride; + __m256i pred = _mm256_cvtepu8_epi16(_mm_loadu_si128((const __m128i*)d)); + __m256i sum = _mm256_adds_epi16(S[r], pred); + __m256i pk = _mm256_packus_epi16(sum, sum); // c0-7|c0-7 ; c8-15|c8-15 + pk = _mm256_permute4x64_epi64(pk, 0x08); // low 128 = c0-7,c8-15 + _mm_storeu_si128((__m128i*)d, _mm256_castsi256_si128(pk)); + } +} + + +void transform_32x32_add_8_avx2(uint8_t *dst, const int16_t *coeffs, ptrdiff_t stride) +{ + __m256i Lo[32], Hi[32]; + for (int r=0;r<32;r++) { + Lo[r]=_mm256_loadu_si256((const __m256i*)(coeffs + r*32)); + Hi[r]=_mm256_loadu_si256((const __m256i*)(coeffs + r*32 + 16)); + } + + idct32_vpass(Lo, add_1st, shift_1st); + idct32_vpass(Hi, add_1st, shift_1st); + transpose32x32(Lo, Hi); + idct32_vpass(Lo, add_2nd, shift_2nd); + idct32_vpass(Hi, add_2nd, shift_2nd); + transpose32x32(Lo, Hi); + + for (int r=0;r<32;r++) { + uint8_t* d = dst + r*stride; + __m256i predLo = _mm256_cvtepu8_epi16(_mm_loadu_si128((const __m128i*)d)); + __m256i predHi = _mm256_cvtepu8_epi16(_mm_loadu_si128((const __m128i*)(d+16))); + __m256i sumLo = _mm256_adds_epi16(Lo[r], predLo); + __m256i sumHi = _mm256_adds_epi16(Hi[r], predHi); + // pack to bytes: qwords come out as c0-7, c16-23, c8-15, c24-31 + __m256i pk = _mm256_packus_epi16(sumLo, sumHi); + pk = _mm256_permute4x64_epi64(pk, _MM_SHUFFLE(3,1,2,0)); // -> c0-31 + _mm256_storeu_si256((__m256i*)d, pk); + } +} + +// Note: an AVX2 inverse-quantization kernel was implemented and benchmarked too, +// but inverse quantization is scatter-bound (the result is scattered to +// coeffBuf[coeffPos[i]], and there is no 16-bit SIMD scatter), so the wider +// arithmetic gave no benefit -- AVX2 measured ~equal-to-slightly-slower than the +// SSE version. The SSE version (dequant_coeff_block_sse4) is used instead. + +#endif // HAVE_AVX2
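The qword reordering after `_mm256_packus_epi16` in the two store loops above can be checked without AVX2 hardware. The following is a minimal scalar sketch, not code from libde265: `Qwords`, `pack_lanes`, `permute4x64` and `shuffle_imm` are hypothetical names that only model which 8-column group ends up in which 64-bit slot of the register.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical model: a 256-bit register viewed as four 64-bit qwords, each
// labelled by the first column index of the 8 bytes it holds.
using Qwords = std::array<int, 4>;

// Model of _mm256_packus_epi16(lo, hi): packing works per 128-bit lane, so
// each result lane holds that lane of `lo` followed by that lane of `hi`.
Qwords pack_lanes(int lo_lane0, int lo_lane1, int hi_lane0, int hi_lane1) {
    return { lo_lane0, hi_lane0, lo_lane1, hi_lane1 };
}

// Model of _mm256_permute4x64_epi64(v, imm): result qword i is
// v[(imm >> (2*i)) & 3].
Qwords permute4x64(const Qwords& v, unsigned imm) {
    Qwords out{};
    for (int i = 0; i < 4; i++) out[i] = v[(imm >> (2 * i)) & 3];
    return out;
}

// Same bit layout as the _MM_SHUFFLE(z, y, x, w) macro.
constexpr unsigned shuffle_imm(unsigned z, unsigned y, unsigned x, unsigned w) {
    return (z << 6) | (y << 4) | (x << 2) | w;
}
```

For the 32x32 case, `pack_lanes(0, 8, 16, 24)` gives the order stated in the source comment (c0-7, c16-23, c8-15, c24-31), and the `(3,1,2,0)` shuffle restores ascending column order. For the 16x16 case, immediate `0x08` moves qwords 0 and 2 into the low 128 bits.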
libde265-1.1.1.tar.gz/libde265/x86/transform-avx2.h
Added
@@ -0,0 +1,33 @@ +/* + * H.265 video codec. + * Copyright (c) 2026 Dirk Farin <dirk.farin@gmail.com> + * + * This file is part of libde265. + * + * libde265 is free software: you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation, either version 3 of + * the License, or (at your option) any later version. + * + * libde265 is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public License + * along with libde265. If not, see <http://www.gnu.org/licenses/>. + */ + +#ifndef SSE_TRANSFORM_AVX2_H +#define SSE_TRANSFORM_AVX2_H + +#include <stddef.h> +#include <stdint.h> + +// AVX2 inverse-DCT + add. Bit-identical to the SSE4.1 / scalar versions +// (verified by dev-tools/test-transform). Same signatures as transform_add_8. + +void transform_16x16_add_8_avx2(uint8_t *dst, const int16_t *coeffs, ptrdiff_t stride); +void transform_32x32_add_8_avx2(uint8_t *dst, const int16_t *coeffs, ptrdiff_t stride); + +#endif
libde265-1.1.1.tar.gz/libde265/x86/transform-avx512.cc
Added
@@ -0,0 +1,275 @@ +/* + * H.265 video codec. + * Copyright (c) 2026 Dirk Farin <dirk.farin@gmail.com> + * + * This file is part of libde265. + * + * libde265 is free software: you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation, either version 3 of + * the License, or (at your option) any later version. + * + * libde265 is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public License + * along with libde265. If not, see <http://www.gnu.org/licenses/>. + */ + +// AVX-512 (F + BW) inverse transform for 32x32, 8-bit. +// +// A full 32-column row fits in one zmm (32 int16), so the vertical pass +// processes all 32 columns at once (twice the AVX2 width). The per-column +// butterfly is the same arithmetic as the AVX2/SSE versions: every 512-bit op +// acts independently per 128-bit lane, so each of the 4 lanes does what the SSE +// does for 8 columns -> 32 columns, bit-identical. The even part reuses a +// 16-point core (a 32-pt inverse DCT's even half is a 16-pt inverse DCT of the +// even rows). The 32x32 transpose between passes is done at 256-bit width using +// the proven AVX2 path (split each zmm into its two ymm halves, transpose, +// recombine) since the transpose is shuffle-bound, not the hot path. 
+ +#include "x86/transform-avx512.h" + +#ifdef HAVE_CONFIG_H +#include "config.h" +#endif + +#if HAVE_AVX512 + +#include <immintrin.h> +#include "x86/transform-dct-tables.h" // idct_T16_{1,2,3}, idct_T32 + +namespace { + +#define shift_1st 7 +#define add_1st (1 << (shift_1st - 1)) // 64 +#define shift_2nd 12 +#define add_2nd (1 << (shift_2nd - 1)) // 2048 + +// broadcast a 128-bit (8 int16) madd pattern to all four 128-bit lanes +static inline __m512i bc(const int16_t* p) { + return _mm512_broadcast_i32x4(_mm_load_si128((const __m128i*)p)); +} + +// --- 256-bit transpose helpers (identical to transform-avx2.cc) ---------- + +static inline void transpose8x8_lanes(__m256i a[8]) { + __m256i b0=_mm256_unpacklo_epi16(a[0],a[1]), b1=_mm256_unpackhi_epi16(a[0],a[1]); + __m256i b2=_mm256_unpacklo_epi16(a[2],a[3]), b3=_mm256_unpackhi_epi16(a[2],a[3]); + __m256i b4=_mm256_unpacklo_epi16(a[4],a[5]), b5=_mm256_unpackhi_epi16(a[4],a[5]); + __m256i b6=_mm256_unpacklo_epi16(a[6],a[7]), b7=_mm256_unpackhi_epi16(a[6],a[7]); + __m256i c0=_mm256_unpacklo_epi32(b0,b2), c1=_mm256_unpackhi_epi32(b0,b2); + __m256i c2=_mm256_unpacklo_epi32(b1,b3), c3=_mm256_unpackhi_epi32(b1,b3); + __m256i c4=_mm256_unpacklo_epi32(b4,b6), c5=_mm256_unpackhi_epi32(b4,b6); + __m256i c6=_mm256_unpacklo_epi32(b5,b7), c7=_mm256_unpackhi_epi32(b5,b7); + a[0]=_mm256_unpacklo_epi64(c0,c4); a[1]=_mm256_unpackhi_epi64(c0,c4); + a[2]=_mm256_unpacklo_epi64(c1,c5); a[3]=_mm256_unpackhi_epi64(c1,c5); + a[4]=_mm256_unpacklo_epi64(c2,c6); a[5]=_mm256_unpackhi_epi64(c2,c6); + a[6]=_mm256_unpacklo_epi64(c3,c7); a[7]=_mm256_unpackhi_epi64(c3,c7); +} +static inline void transpose16x16(__m256i S[16]) { + __m256i top[8], bot[8]; + for (int r=0;r<8;r++){ top[r]=S[r]; bot[r]=S[r+8]; } + transpose8x8_lanes(top); transpose8x8_lanes(bot); + for (int r=0;r<8;r++){ + S[r] = _mm256_permute2x128_si256(top[r], bot[r], 0x20); + S[r+8] = _mm256_permute2x128_si256(top[r], bot[r], 0x31); + } +} +static inline void transpose32x32_ymm(__m256i Lo[32], __m256i Hi[32]) { + __m256i A[16],B[16],C[16],D[16]; + for (int r=0;r<16;r++){ 
A[r]=Lo[r]; B[r]=Hi[r]; C[r]=Lo[r+16]; D[r]=Hi[r+16]; } + transpose16x16(A); transpose16x16(B); transpose16x16(C); transpose16x16(D); + for (int r=0;r<16;r++){ Lo[r]=A[r]; Hi[r]=C[r]; Lo[r+16]=B[r]; Hi[r+16]=D[r]; } +} + +// 32x32 transpose of zmm rows, via the 256-bit path. +static inline void transpose32x32_z(__m512i S[32]) { + __m256i Lo[32], Hi[32]; + for (int r=0;r<32;r++){ + Lo[r]=_mm512_castsi512_si256(S[r]); + Hi[r]=_mm512_extracti64x4_epi64(S[r], 1); + } + transpose32x32_ymm(Lo, Hi); + for (int r=0;r<32;r++) + S[r]=_mm512_inserti64x4(_mm512_castsi256_si512(Lo[r]), Hi[r], 1); +} + +// --- 16-point core (32 columns) ------------------------------------------ + +static inline void idct16_core_512(const __m512i S[16], __m512i resL[16], __m512i resH[16]) { + const __m512i T00=bc(idct_T16_1[0][0]),T01=bc(idct_T16_1[0][1]),T02=bc(idct_T16_1[0][2]),T03=bc(idct_T16_1[0][3]); + const __m512i T04=bc(idct_T16_1[0][4]),T05=bc(idct_T16_1[0][5]),T06=bc(idct_T16_1[0][6]),T07=bc(idct_T16_1[0][7]); + const __m512i T10=bc(idct_T16_1[1][0]),T11=bc(idct_T16_1[1][1]),T12=bc(idct_T16_1[1][2]),T13=bc(idct_T16_1[1][3]); + const __m512i T14=bc(idct_T16_1[1][4]),T15=bc(idct_T16_1[1][5]),T16=bc(idct_T16_1[1][6]),T17=bc(idct_T16_1[1][7]); + const __m512i T20=bc(idct_T16_1[2][0]),T21=bc(idct_T16_1[2][1]),T22=bc(idct_T16_1[2][2]),T23=bc(idct_T16_1[2][3]); + const __m512i T24=bc(idct_T16_1[2][4]),T25=bc(idct_T16_1[2][5]),T26=bc(idct_T16_1[2][6]),T27=bc(idct_T16_1[2][7]); + const __m512i T30=bc(idct_T16_1[3][0]),T31=bc(idct_T16_1[3][1]),T32_=bc(idct_T16_1[3][2]),T33=bc(idct_T16_1[3][3]); + const __m512i T34=bc(idct_T16_1[3][4]),T35=bc(idct_T16_1[3][5]),T36=bc(idct_T16_1[3][6]),T37=bc(idct_T16_1[3][7]); + const __m512i U00=bc(idct_T16_2[0][0]),U01=bc(idct_T16_2[0][1]),U02=bc(idct_T16_2[0][2]),U03=bc(idct_T16_2[0][3]); + const __m512i U10=bc(idct_T16_2[1][0]),U11=bc(idct_T16_2[1][1]),U12=bc(idct_T16_2[1][2]),U13=bc(idct_T16_2[1][3]); + const __m512i V00=bc(idct_T16_3[0][0]),V01=bc(idct_T16_3[0][1]),V10=bc(idct_T16_3[1][0]),V11=bc(idct_T16_3[1][1]); + + __m512i m0,m1,m2,m3,m4,m5,m6,m7; + __m512i E0l,E1l,E2l,E3l,E0h,E1h,E2h,E3h; + __m512i O0l,O1l,O2l,O3l,O4l,O5l,O6l,O7l,O0h,O1h,O2h,O3h,O4h,O5h,O6h,O7h; + __m512i 
E00l,E01l,E00h,E01h,EE0l,EE1l,EE2l,EE3l,EE0h,EE1h,EE2h,EE3h; + __m512i E4l,E5l,E6l,E7l,E4h,E5h,E6h,E7h; + + m0=_mm512_unpacklo_epi16(S[1],S[3]); E0l=_mm512_madd_epi16(m0,T00); + m1=_mm512_unpackhi_epi16(S[1],S[3]); E0h=_mm512_madd_epi16(m1,T00); + m2=_mm512_unpacklo_epi16(S[5],S[7]); E1l=_mm512_madd_epi16(m2,T10); + m3=_mm512_unpackhi_epi16(S[5],S[7]); E1h=_mm512_madd_epi16(m3,T10); + m4=_mm512_unpacklo_epi16(S[9],S[11]); E2l=_mm512_madd_epi16(m4,T20); + m5=_mm512_unpackhi_epi16(S[9],S[11]); E2h=_mm512_madd_epi16(m5,T20); + m6=_mm512_unpacklo_epi16(S[13],S[15]);E3l=_mm512_madd_epi16(m6,T30); + m7=_mm512_unpackhi_epi16(S[13],S[15]);E3h=_mm512_madd_epi16(m7,T30); + O0l=_mm512_add_epi32(_mm512_add_epi32(E0l,E1l),_mm512_add_epi32(E2l,E3l)); + O0h=_mm512_add_epi32(_mm512_add_epi32(E0h,E1h),_mm512_add_epi32(E2h,E3h)); + + E0l=_mm512_madd_epi16(m0,T01);E0h=_mm512_madd_epi16(m1,T01);E1l=_mm512_madd_epi16(m2,T11);E1h=_mm512_madd_epi16(m3,T11); + E2l=_mm512_madd_epi16(m4,T21);E2h=_mm512_madd_epi16(m5,T21);E3l=_mm512_madd_epi16(m6,T31);E3h=_mm512_madd_epi16(m7,T31); + O1l=_mm512_add_epi32(_mm512_add_epi32(E0l,E1l),_mm512_add_epi32(E2l,E3l)); + O1h=_mm512_add_epi32(_mm512_add_epi32(E0h,E1h),_mm512_add_epi32(E2h,E3h)); + + E0l=_mm512_madd_epi16(m0,T02);E0h=_mm512_madd_epi16(m1,T02);E1l=_mm512_madd_epi16(m2,T12);E1h=_mm512_madd_epi16(m3,T12); + E2l=_mm512_madd_epi16(m4,T22);E2h=_mm512_madd_epi16(m5,T22);E3l=_mm512_madd_epi16(m6,T32_);E3h=_mm512_madd_epi16(m7,T32_); + O2l=_mm512_add_epi32(_mm512_add_epi32(E0l,E1l),_mm512_add_epi32(E2l,E3l)); + O2h=_mm512_add_epi32(_mm512_add_epi32(E0h,E1h),_mm512_add_epi32(E2h,E3h)); + + E0l=_mm512_madd_epi16(m0,T03);E0h=_mm512_madd_epi16(m1,T03);E1l=_mm512_madd_epi16(m2,T13);E1h=_mm512_madd_epi16(m3,T13); + E2l=_mm512_madd_epi16(m4,T23);E2h=_mm512_madd_epi16(m5,T23);E3l=_mm512_madd_epi16(m6,T33);E3h=_mm512_madd_epi16(m7,T33); + O3l=_mm512_add_epi32(_mm512_add_epi32(E0l,E1l),_mm512_add_epi32(E2l,E3l)); + O3h=_mm512_add_epi32(_mm512_add_epi32(E0h,E1h),_mm512_add_epi32(E2h,E3h)); 
+ + E0l=_mm512_madd_epi16(m0,T04);E0h=_mm512_madd_epi16(m1,T04);E1l=_mm512_madd_epi16(m2,T14);E1h=_mm512_madd_epi16(m3,T14); + E2l=_mm512_madd_epi16(m4,T24);E2h=_mm512_madd_epi16(m5,T24);E3l=_mm512_madd_epi16(m6,T34);E3h=_mm512_madd_epi16(m7,T34); + O4l=_mm512_add_epi32(_mm512_add_epi32(E0l,E1l),_mm512_add_epi32(E2l,E3l)); + O4h=_mm512_add_epi32(_mm512_add_epi32(E0h,E1h),_mm512_add_epi32(E2h,E3h)); + + E0l=_mm512_madd_epi16(m0,T05);E0h=_mm512_madd_epi16(m1,T05);E1l=_mm512_madd_epi16(m2,T15);E1h=_mm512_madd_epi16(m3,T15); + E2l=_mm512_madd_epi16(m4,T25);E2h=_mm512_madd_epi16(m5,T25);E3l=_mm512_madd_epi16(m6,T35);E3h=_mm512_madd_epi16(m7,T35); + O5l=_mm512_add_epi32(_mm512_add_epi32(E0l,E1l),_mm512_add_epi32(E2l,E3l)); + O5h=_mm512_add_epi32(_mm512_add_epi32(E0h,E1h),_mm512_add_epi32(E2h,E3h)); + + E0l=_mm512_madd_epi16(m0,T06);E0h=_mm512_madd_epi16(m1,T06);E1l=_mm512_madd_epi16(m2,T16);E1h=_mm512_madd_epi16(m3,T16); + E2l=_mm512_madd_epi16(m4,T26);E2h=_mm512_madd_epi16(m5,T26);E3l=_mm512_madd_epi16(m6,T36);E3h=_mm512_madd_epi16(m7,T36); + O6l=_mm512_add_epi32(_mm512_add_epi32(E0l,E1l),_mm512_add_epi32(E2l,E3l)); + O6h=_mm512_add_epi32(_mm512_add_epi32(E0h,E1h),_mm512_add_epi32(E2h,E3h)); + + E0l=_mm512_madd_epi16(m0,T07);E0h=_mm512_madd_epi16(m1,T07);E1l=_mm512_madd_epi16(m2,T17);E1h=_mm512_madd_epi16(m3,T17); + E2l=_mm512_madd_epi16(m4,T27);E2h=_mm512_madd_epi16(m5,T27);E3l=_mm512_madd_epi16(m6,T37);E3h=_mm512_madd_epi16(m7,T37); + O7l=_mm512_add_epi32(_mm512_add_epi32(E0l,E1l),_mm512_add_epi32(E2l,E3l)); + O7h=_mm512_add_epi32(_mm512_add_epi32(E0h,E1h),_mm512_add_epi32(E2h,E3h)); + + // even part + m0=_mm512_unpacklo_epi16(S[2],S[6]); E0l=_mm512_madd_epi16(m0,U00); + m1=_mm512_unpackhi_epi16(S[2],S[6]); E0h=_mm512_madd_epi16(m1,U00); + m2=_mm512_unpacklo_epi16(S[10],S[14]);E0l=_mm512_add_epi32(E0l,_mm512_madd_epi16(m2,U10)); + m3=_mm512_unpackhi_epi16(S[10],S[14]);E0h=_mm512_add_epi32(E0h,_mm512_madd_epi16(m3,U10)); + E1l=_mm512_madd_epi16(m0,U01);E1h=_mm512_madd_epi16(m1,U01); + 
E1l=_mm512_add_epi32(E1l,_mm512_madd_epi16(m2,U11));E1h=_mm512_add_epi32(E1h,_mm512_madd_epi16(m3,U11)); + E2l=_mm512_madd_epi16(m0,U02);E2h=_mm512_madd_epi16(m1,U02); + E2l=_mm512_add_epi32(E2l,_mm512_madd_epi16(m2,U12));E2h=_mm512_add_epi32(E2h,_mm512_madd_epi16(m3,U12)); + E3l=_mm512_madd_epi16(m0,U03);E3h=_mm512_madd_epi16(m1,U03); + E3l=_mm512_add_epi32(E3l,_mm512_madd_epi16(m2,U13));E3h=_mm512_add_epi32(E3h,_mm512_madd_epi16(m3,U13)); + + m0=_mm512_unpacklo_epi16(S[4],S[12]); E00l=_mm512_madd_epi16(m0,V00); + m1=_mm512_unpackhi_epi16(S[4],S[12]); E00h=_mm512_madd_epi16(m1,V00); + m2=_mm512_unpacklo_epi16(S[0],S[8]); EE0l=_mm512_madd_epi16(m2,V10); + m3=_mm512_unpackhi_epi16(S[0],S[8]); EE0h=_mm512_madd_epi16(m3,V10); + E01l=_mm512_madd_epi16(m0,V01);E01h=_mm512_madd_epi16(m1,V01); + EE1l=_mm512_madd_epi16(m2,V11);EE1h=_mm512_madd_epi16(m3,V11); + + EE2l=_mm512_sub_epi32(EE1l,E01l);EE3l=_mm512_sub_epi32(EE0l,E00l); + EE2h=_mm512_sub_epi32(EE1h,E01h);EE3h=_mm512_sub_epi32(EE0h,E00h); + EE0l=_mm512_add_epi32(EE0l,E00l);EE1l=_mm512_add_epi32(EE1l,E01l); + EE0h=_mm512_add_epi32(EE0h,E00h);EE1h=_mm512_add_epi32(EE1h,E01h); + + E4l=_mm512_sub_epi32(EE3l,E3l); E5l=_mm512_sub_epi32(EE2l,E2l); + E6l=_mm512_sub_epi32(EE1l,E1l); E7l=_mm512_sub_epi32(EE0l,E0l); + E4h=_mm512_sub_epi32(EE3h,E3h); E5h=_mm512_sub_epi32(EE2h,E2h); + E6h=_mm512_sub_epi32(EE1h,E1h); E7h=_mm512_sub_epi32(EE0h,E0h); + E0l=_mm512_add_epi32(EE0l,E0l); E1l=_mm512_add_epi32(EE1l,E1l); + E2l=_mm512_add_epi32(EE2l,E2l); E3l=_mm512_add_epi32(EE3l,E3l); + E0h=_mm512_add_epi32(EE0h,E0h); E1h=_mm512_add_epi32(EE1h,E1h); + E2h=_mm512_add_epi32(EE2h,E2h); E3h=_mm512_add_epi32(EE3h,E3h); + + resL[0]=_mm512_add_epi32(E0l,O0l); resH[0]=_mm512_add_epi32(E0h,O0h); + resL[1]=_mm512_add_epi32(E1l,O1l); resH[1]=_mm512_add_epi32(E1h,O1h); + resL[2]=_mm512_add_epi32(E2l,O2l); resH[2]=_mm512_add_epi32(E2h,O2h); + resL[3]=_mm512_add_epi32(E3l,O3l); resH[3]=_mm512_add_epi32(E3h,O3h); + resL[4]=_mm512_add_epi32(E4l,O4l); resH[4]=_mm512_add_epi32(E4h,O4h); + 
resL[5]=_mm512_add_epi32(E5l,O5l); resH[5]=_mm512_add_epi32(E5h,O5h); + resL[6]=_mm512_add_epi32(E6l,O6l); resH[6]=_mm512_add_epi32(E6h,O6h); + resL[7]=_mm512_add_epi32(E7l,O7l); resH[7]=_mm512_add_epi32(E7h,O7h); + resL[15]=_mm512_sub_epi32(E0l,O0l); resH[15]=_mm512_sub_epi32(E0h,O0h); + resL[14]=_mm512_sub_epi32(E1l,O1l); resH[14]=_mm512_sub_epi32(E1h,O1h); + resL[13]=_mm512_sub_epi32(E2l,O2l); resH[13]=_mm512_sub_epi32(E2h,O2h); + resL[12]=_mm512_sub_epi32(E3l,O3l); resH[12]=_mm512_sub_epi32(E3h,O3h); + resL[11]=_mm512_sub_epi32(E4l,O4l); resH[11]=_mm512_sub_epi32(E4h,O4h); + resL[10]=_mm512_sub_epi32(E5l,O5l); resH[10]=_mm512_sub_epi32(E5h,O5h); + resL[9] =_mm512_sub_epi32(E6l,O6l); resH[9] =_mm512_sub_epi32(E6h,O6h); + resL[8] =_mm512_sub_epi32(E7l,O7l); resH[8] =_mm512_sub_epi32(E7h,O7h); +} + +static inline __m512i round_shift_pack(__m512i lo, __m512i hi, __m512i vr, int shift) { + return _mm512_packs_epi32( + _mm512_srai_epi32(_mm512_add_epi32(lo,vr),shift), + _mm512_srai_epi32(_mm512_add_epi32(hi,vr),shift)); +} + +// 1-D 32-point inverse transform in place on S[0..31] (32 columns). 
+static inline void idct32_vpass_512(__m512i S[32], int add, int shift) { + __m512i ev[16]; + for (int i=0;i<16;i++) ev[i]=S[2*i]; + __m512i EL[16], EH[16]; + idct16_core_512(ev, EL, EH); + + __m512i pl[8], ph[8]; + for (int g=0; g<8; g++) { + pl[g]=_mm512_unpacklo_epi16(S[4*g+1], S[4*g+3]); + ph[g]=_mm512_unpackhi_epi16(S[4*g+1], S[4*g+3]); + } + + const __m512i vr = _mm512_set1_epi32(add); + for (int k=0;k<16;k++) { + __m512i OL=_mm512_madd_epi16(pl[0], bc(idct_T32[0][k])); + __m512i OH=_mm512_madd_epi16(ph[0], bc(idct_T32[0][k])); + for (int g=1; g<8; g++) { + OL=_mm512_add_epi32(OL,_mm512_madd_epi16(pl[g], bc(idct_T32[g][k]))); + OH=_mm512_add_epi32(OH,_mm512_madd_epi16(ph[g], bc(idct_T32[g][k]))); + } + S[k] = round_shift_pack(_mm512_add_epi32(EL[k],OL), _mm512_add_epi32(EH[k],OH), vr, shift); + S[31-k] = round_shift_pack(_mm512_sub_epi32(EL[k],OL), _mm512_sub_epi32(EH[k],OH), vr, shift); + } +} + +} // namespace + + +void transform_32x32_add_8_avx512(uint8_t *dst, const int16_t *coeffs, ptrdiff_t stride) +{ + __m512i S[32]; + for (int r=0;r<32;r++) S[r]=_mm512_loadu_si512((const void*)(coeffs + r*32)); + + idct32_vpass_512(S, add_1st, shift_1st); + transpose32x32_z(S); + idct32_vpass_512(S, add_2nd, shift_2nd); + transpose32x32_z(S); + + // index to gather the low qword of each 128-bit lane into a contiguous 256 + const __m512i gather = _mm512_setr_epi64(0,2,4,6, 1,3,5,7); + for (int r=0;r<32;r++) { + uint8_t* d = dst + r*stride; + __m512i pred = _mm512_cvtepu8_epi16(_mm256_loadu_si256((const __m256i*)d)); // 32 int16 + __m512i sum = _mm512_adds_epi16(S[r], pred); + __m512i pk = _mm512_packus_epi16(sum, sum); // per lane: c|c + pk = _mm512_permutexvar_epi64(gather, pk); // low 256 = c0..c31 + _mm256_storeu_si256((__m256i*)d, _mm512_castsi512_si256(pk)); + } +} + +#endif // HAVE_AVX512
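The per-element arithmetic of `round_shift_pack` with the two stage constants defined in this file (`shift_1st` 7, `shift_2nd` 12) can be written as a scalar reference. This is only a sketch for reasoning about rounding and saturation: `round_shift_sat` and the `k`-prefixed constants are hypothetical names, not part of libde265, and the sketch assumes an arithmetic right shift for negative values (guaranteed from C++20, and the behaviour of mainstream compilers before that).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Stage constants mirroring the #defines in the diff above.
constexpr int kShift1st = 7;
constexpr int kAdd1st   = 1 << (kShift1st - 1);   // 64
constexpr int kShift2nd = 12;
constexpr int kAdd2nd   = 1 << (kShift2nd - 1);   // 2048

// Scalar model of one element of round_shift_pack(): add the rounding offset,
// shift right arithmetically, then saturate to int16 as a signed 32->16 pack
// (packs_epi32) does.
inline int16_t round_shift_sat(int32_t v, int add, int shift) {
    const int32_t r = (v + add) >> shift;
    const int32_t clamped = std::max<int32_t>(INT16_MIN, std::min<int32_t>(INT16_MAX, r));
    return static_cast<int16_t>(clamped);
}
```

Adding half of the divisor before the shift rounds to nearest, with ties going toward positive infinity; the saturation only matters for out-of-range intermediate values.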
libde265-1.1.1.tar.gz/libde265/x86/transform-avx512.h
Added
@@ -0,0 +1,32 @@ +/* + * H.265 video codec. + * Copyright (c) 2026 Dirk Farin <dirk.farin@gmail.com> + * + * This file is part of libde265. + * + * libde265 is free software: you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation, either version 3 of + * the License, or (at your option) any later version. + * + * libde265 is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public License + * along with libde265. If not, see <http://www.gnu.org/licenses/>. + */ + +#ifndef SSE_TRANSFORM_AVX512_H +#define SSE_TRANSFORM_AVX512_H + +#include <stddef.h> +#include <stdint.h> + +// AVX-512 (F+BW) inverse DCT + add. Bit-identical to the SSE4.1 / AVX2 / scalar +// versions (verified by dev-tools/test-transform). Same signature as +// transform_add_8. +void transform_32x32_add_8_avx512(uint8_t *dst, const int16_t *coeffs, ptrdiff_t stride); + +#endif
libde265-1.1.1.tar.gz/libde265/x86/transform-dct-tables.h
Added
@@ -0,0 +1,88 @@ +/* + * H.265 video codec. + * Copyright (c) 2026 Dirk Farin <dirk.farin@gmail.com> + * + * This file is part of libde265. + * + * libde265 is free software: you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation, either version 3 of + * the License, or (at your option) any later version. + * + * libde265 is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public License + * along with libde265. If not, see <http://www.gnu.org/licenses/>. + */ + +// Coefficient tables for the AVX2 / AVX-512 inverse transforms. Each inner row +// holds one madd coefficient pattern (a,b,a,b,...) and is broadcast to all +// 128-bit lanes of the SIMD register. Values match transform16x16_{1,2,3} / +// transform32x32 in sse-dct.cc. Included by transform-avx2.cc and +// transform-avx512.cc (static -> a private copy per translation unit). 
+ +#ifndef X86_TRANSFORM_DCT_TABLES_H +#define X86_TRANSFORM_DCT_TABLES_H + +#include <stdint.h> + +#ifndef ALIGNED_64 +#define ALIGNED_64(decl) decl __attribute__((aligned(64))) +#endif + +ALIGNED_64(static const int16_t) idct_T16_1[4][8][8] = { + {{ 90, 87, 90, 87, 90, 87, 90, 87},{ 87, 57, 87, 57, 87, 57, 87, 57}, + { 80, 9, 80, 9, 80, 9, 80, 9},{ 70,-43, 70,-43, 70,-43, 70,-43}, + { 57,-80, 57,-80, 57,-80, 57,-80},{ 43,-90, 43,-90, 43,-90, 43,-90}, + { 25,-70, 25,-70, 25,-70, 25,-70},{ 9,-25, 9,-25, 9,-25, 9,-25}}, + {{ 80, 70, 80, 70, 80, 70, 80, 70},{ 9,-43, 9,-43, 9,-43, 9,-43}, + {-70,-87,-70,-87,-70,-87,-70,-87},{-87, 9,-87, 9,-87, 9,-87, 9}, + {-25, 90,-25, 90,-25, 90,-25, 90},{ 57, 25, 57, 25, 57, 25, 57, 25}, + { 90,-80, 90,-80, 90,-80, 90,-80},{ 43,-57, 43,-57, 43,-57, 43,-57}}, + {{ 57, 43, 57, 43, 57, 43, 57, 43},{-80,-90,-80,-90,-80,-90,-80,-90}, + {-25, 57,-25, 57,-25, 57,-25, 57},{ 90, 25, 90, 25, 90, 25, 90, 25}, + { -9,-87, -9,-87, -9,-87, -9,-87},{-87, 70,-87, 70,-87, 70,-87, 70}, + { 43, 9, 43, 9, 43, 9, 43, 9},{ 70,-80, 70,-80, 70,-80, 70,-80}}, + {{ 25, 9, 25, 9, 25, 9, 25, 9},{-70,-25,-70,-25,-70,-25,-70,-25}, + { 90, 43, 90, 43, 90, 43, 90, 43},{-80,-57,-80,-57,-80,-57,-80,-57}, + { 43, 70, 43, 70, 43, 70, 43, 70},{ 9,-80, 9,-80, 9,-80, 9,-80}, + {-57, 87,-57, 87,-57, 87,-57, 87},{ 87,-90, 87,-90, 87,-90, 87,-90}} +}; + +ALIGNED_64(static const int16_t) idct_T16_2[2][4][8] = { + {{ 89, 75, 89, 75, 89, 75, 89, 75},{ 75,-18, 75,-18, 75,-18, 75,-18}, + { 50,-89, 50,-89, 50,-89, 50,-89},{ 18,-50, 18,-50, 18,-50, 18,-50}}, + {{ 50, 18, 50, 18, 50, 18, 50, 18},{-89,-50,-89,-50,-89,-50,-89,-50}, + { 18, 75, 18, 75, 18, 75, 18, 75},{ 75,-89, 75,-89, 75,-89, 75,-89}} +}; + +ALIGNED_64(static const int16_t) idct_T16_3[2][2][8] = { + {{ 83, 36, 83, 36, 83, 36, 83, 36},{ 36,-83, 36,-83, 36,-83, 36,-83}}, + {{ 64, 64, 64, 64, 64, 64, 64, 64},{ 64,-64, 64,-64, 64,-64, 64,-64}} +}; + +#define IDCT_R8(a,b) { a,b,a,b,a,b,a,b } +ALIGNED_64(static const int16_t) idct_T32[8][16][8] 
= { + { IDCT_R8(90,90),IDCT_R8(90,82),IDCT_R8(88,67),IDCT_R8(85,46),IDCT_R8(82,22),IDCT_R8(78,-4),IDCT_R8(73,-31),IDCT_R8(67,-54), + IDCT_R8(61,-73),IDCT_R8(54,-85),IDCT_R8(46,-90),IDCT_R8(38,-88),IDCT_R8(31,-78),IDCT_R8(22,-61),IDCT_R8(13,-38),IDCT_R8(4,-13) }, + { IDCT_R8(88,85),IDCT_R8(67,46),IDCT_R8(31,-13),IDCT_R8(-13,-67),IDCT_R8(-54,-90),IDCT_R8(-82,-73),IDCT_R8(-90,-22),IDCT_R8(-78,38), + IDCT_R8(-46,82),IDCT_R8(-4,88),IDCT_R8(38,54),IDCT_R8(73,-4),IDCT_R8(90,-61),IDCT_R8(85,-90),IDCT_R8(61,-78),IDCT_R8(22,-31) }, + { IDCT_R8(82,78),IDCT_R8(22,-4),IDCT_R8(-54,-82),IDCT_R8(-90,-73),IDCT_R8(-61,13),IDCT_R8(13,85),IDCT_R8(78,67),IDCT_R8(85,-22), + IDCT_R8(31,-88),IDCT_R8(-46,-61),IDCT_R8(-90,31),IDCT_R8(-67,90),IDCT_R8(4,54),IDCT_R8(73,-38),IDCT_R8(88,-90),IDCT_R8(38,-46) }, + { IDCT_R8(73,67),IDCT_R8(-31,-54),IDCT_R8(-90,-78),IDCT_R8(-22,38),IDCT_R8(78,85),IDCT_R8(67,-22),IDCT_R8(-38,-90),IDCT_R8(-90,4), + IDCT_R8(-13,90),IDCT_R8(82,13),IDCT_R8(61,-88),IDCT_R8(-46,-31),IDCT_R8(-88,82),IDCT_R8(-4,46),IDCT_R8(85,-73),IDCT_R8(54,-61) }, + { IDCT_R8(61,54),IDCT_R8(-73,-85),IDCT_R8(-46,-4),IDCT_R8(82,88),IDCT_R8(31,-46),IDCT_R8(-88,-61),IDCT_R8(-13,82),IDCT_R8(90,13), + IDCT_R8(-4,-90),IDCT_R8(-90,38),IDCT_R8(22,67),IDCT_R8(85,-78),IDCT_R8(-38,-22),IDCT_R8(-78,90),IDCT_R8(54,-31),IDCT_R8(67,-73) }, + { IDCT_R8(46,38),IDCT_R8(-90,-88),IDCT_R8(38,73),IDCT_R8(54,-4),IDCT_R8(-90,-67),IDCT_R8(31,90),IDCT_R8(61,-46),IDCT_R8(-88,-31), + IDCT_R8(22,85),IDCT_R8(67,-78),IDCT_R8(-85,13),IDCT_R8(13,61),IDCT_R8(73,-90),IDCT_R8(-82,54),IDCT_R8(4,22),IDCT_R8(78,-82) }, + { IDCT_R8(31,22),IDCT_R8(-78,-61),IDCT_R8(90,85),IDCT_R8(-61,-90),IDCT_R8(4,73),IDCT_R8(54,-38),IDCT_R8(-88,-4),IDCT_R8(82,46), + IDCT_R8(-38,-78),IDCT_R8(-22,90),IDCT_R8(73,-82),IDCT_R8(-90,54),IDCT_R8(67,-13),IDCT_R8(-13,-31),IDCT_R8(-46,67),IDCT_R8(85,-88) }, + { IDCT_R8(13,4),IDCT_R8(-38,-13),IDCT_R8(61,22),IDCT_R8(-78,-31),IDCT_R8(88,38),IDCT_R8(-90,-46),IDCT_R8(85,54),IDCT_R8(-73,-61), + 
IDCT_R8(54,67),IDCT_R8(-31,-73),IDCT_R8(4,78),IDCT_R8(22,-82),IDCT_R8(-46,85),IDCT_R8(67,-88),IDCT_R8(-82,90),IDCT_R8(90,-90) } +}; +#undef IDCT_R8 + +#endif
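The header comment describes each inner table row as a madd pattern (a,b,a,b,...). A scalar model of one 128-bit lane shows why that layout pairs with the `unpacklo`/`madd` sequence used by the transforms: interleaving two coefficient rows and multiply-adding against (a,b,...) yields `a*x[i] + b*y[i]` for each column. `unpacklo16` and `madd16` below are hypothetical stand-ins for the per-lane behaviour of `_mm256_unpacklo_epi16` and `_mm256_madd_epi16`, not library functions.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Lane16 = std::array<int16_t, 8>;   // one 128-bit lane of int16
using Lane32 = std::array<int32_t, 4>;   // one 128-bit lane of int32

// Interleave the low four elements of two rows: x0,y0,x1,y1,x2,y2,x3,y3.
Lane16 unpacklo16(const Lane16& x, const Lane16& y) {
    return { x[0], y[0], x[1], y[1], x[2], y[2], x[3], y[3] };
}

// Multiply int16 pairs and add adjacent products into int32 results.
Lane32 madd16(const Lane16& v, const Lane16& c) {
    Lane32 out{};
    for (int i = 0; i < 4; i++)
        out[i] = int32_t(v[2 * i]) * c[2 * i] + int32_t(v[2 * i + 1]) * c[2 * i + 1];
    return out;
}
```

With the first pattern of `idct_T16_1` (90, 87 repeated) and rows 1 and 3 as inputs, each output element is `90*row1[i] + 87*row3[i]`, which is the first term of the sum that forms `O0l` in the core above.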
libde265-1.0.17.tar.gz/sherlock265/VideoDecoder.cc -> libde265-1.1.1.tar.gz/sherlock265/VideoDecoder.cc
Changed
@@ -25,6 +25,7 @@ */ #include "VideoDecoder.h" +#include <chrono> #ifdef HAVE_VIDEOGFX #include <libvideogfx.hh> #endif @@ -44,6 +45,7 @@ img(NULL), mNextBuffer(0), mFrameCount(0), + mFramerate(30), mPlayingVideo(false), mVideoEnded(false), mSingleStep(false), @@ -114,9 +116,15 @@ void VideoDecoder::decoder_loop() { + using std::chrono::steady_clock; + using std::chrono::microseconds; + using std::chrono::duration_cast; + for (;;) { if (mPlayingVideo) { + auto frame_start_time = steady_clock::now(); + mutex.lock(); if (img) { @@ -140,8 +148,13 @@ else if (more && err == DE265_ERROR_WAITING_FOR_INPUT_DATA) { uint8_t buf[4096]; int buf_size = fread(buf,1,sizeof(buf),mFH); - int err = de265_push_data(ctx,buf,buf_size ,0,0); - (void)err; + if (buf_size > 0) { + int err = de265_push_data(ctx,buf,buf_size ,0,0); + (void)err; + } + if (feof(mFH)) { + de265_flush_data(ctx); // signal end-of-stream so trailing frames are emitted + } } else if (!more) { @@ -168,6 +181,16 @@ // process events QCoreApplication::processEvents(); + + // Throttle to target frame rate (skip when we are about to go idle anyway) + int framerate = mFramerate.load(); + if (mPlayingVideo && framerate > 0) { + auto target_interval = microseconds(1000000 / framerate); + auto elapsed = duration_cast<microseconds>(steady_clock::now() - frame_start_time); + if (elapsed < target_interval) { + QThread::usleep((target_interval - elapsed).count()); + } + } } else { exec(); @@ -430,6 +453,10 @@ } +void VideoDecoder::setFramerate(int framerate) +{ + mFramerate = framerate; +} void VideoDecoder::init_decoder(const char* filename)
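The throttling arithmetic added to `decoder_loop()` can be isolated into a pure function, which makes the interval computation easy to check independently of Qt. This is a sketch only: `remaining_frame_time` is a hypothetical helper, not something present in sherlock265.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: time still to wait so that one loop iteration lasts
// at least 1/framerate seconds. Returns zero when the frame already took
// longer than the target interval or when throttling is disabled.
std::chrono::microseconds remaining_frame_time(std::chrono::microseconds elapsed,
                                               int framerate) {
    using std::chrono::microseconds;
    if (framerate <= 0) return microseconds(0);        // throttling disabled
    const microseconds target(1000000 / framerate);    // integer division, as in the diff
    return elapsed < target ? target - elapsed : microseconds(0);
}
```

At 30 FPS the target interval is 33333 µs, so a frame that took 10 ms to decode and display would sleep for a further 23333 µs.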
libde265-1.0.17.tar.gz/sherlock265/VideoDecoder.h -> libde265-1.1.1.tar.gz/sherlock265/VideoDecoder.h
Changed
@@ -32,6 +32,7 @@ #endif #include <QtGui> +#include <atomic> #ifdef HAVE_SWSCALE #ifdef __cplusplus extern "C" { @@ -63,6 +64,7 @@ void startDecoder(); void stopDecoder(); void singleStepDecoder(); + void setFramerate(int framerate); void showCBPartitioning(bool flag); void showTBPartitioning(bool flag); @@ -93,6 +95,8 @@ int mNextBuffer; int mFrameCount; + std::atomic<int> mFramerate; + bool mPlayingVideo; bool mVideoEnded; bool mSingleStep;
libde265-1.0.17.tar.gz/sherlock265/VideoPlayer.cc -> libde265-1.1.1.tar.gz/sherlock265/VideoPlayer.cc
Changed
@@ -49,6 +49,14 @@ videoWidget, SLOT(setImage(QImage*)), Qt::QueuedConnection); + QSpinBox* framerateSpinbox = new QSpinBox(); + framerateSpinbox->setMinimum(1); + framerateSpinbox->setMaximum(300); + framerateSpinbox->setValue(30); + framerateSpinbox->setSuffix(" FPS"); + QObject::connect(framerateSpinbox, QOverload<int>::of(&QSpinBox::valueChanged), + mDecoder, &VideoDecoder::setFramerate); + QPushButton* showCBPartitioningButton = new QPushButton("CB-tree"); showCBPartitioningButton->setCheckable(true); @@ -106,6 +114,7 @@ layout->addWidget(startButton, 1,0,1,1); layout->addWidget(stopButton, 1,1,1,1); layout->addWidget(stepButton, 1,2,1,1); + layout->addWidget(framerateSpinbox, 1,3,1,1); layout->addWidget(showDecodedImageButton, 1,6,1,1); layout->addWidget(showTilesButton, 1,5,1,1); layout->addWidget(showSlicesButton, 1,4,1,1);