Changes of Revision 24
x265.changes
Changed
-------------------------------------------------------------------
+Thu Jul 27 08:33:52 UTC 2017 - joerg.lorenzen@ki.tng.de
+
+- Update to version 2.5
+ Encoder enhancements
+ * Improved grain handling with --tune grain option by throttling
+ VBV operations to limit QP jumps.
+ * Frame threads are now decided based on number of threads
+ specified in the --pools, as opposed to the number of hardware
+ threads available. The mapping was also adjusted to improve
+ quality of the encodes with minimal impact to performance.
+ * CSV logging feature (enabled by --csv) is now part of the
+ library; it was previously part of the x265 application.
+ Applications that integrate libx265 can now extract frame level
+ statistics for their encodes by exercising this option in the
+ library.
+ * Globals that track min and max CU sizes, number of slices, and
+ other parameters have now been moved into instance-specific
+ variables. Consequently, applications that invoke multiple
+ instances of x265 library are no longer restricted to use the
+ same settings for these parameter options across the multiple
+ instances.
+ * x265 can now generate a seprate library that exports the HDR10+
+ parsing API. Other libraries that wish to use this API may do
+ so by linking against this library. Enable ENABLE_HDR10_PLUS in
+ CMake options and build to generate this library.
+ * SEA motion search receives a 10% performance boost from AVX2
+ optimization of its kernels.
+ * The CSV log is now more elaborate with additional fields such
+ as PU statistics, average-min-max luma and chroma values, etc.
+ Refer to documentation of --csv for details of all fields.
+ * x86inc.asm cleaned-up for improved instruction handling.
+ API changes
+ * New API x265_encoder_ctu_info() introduced to specify suggested
+ partition sizes for various CTUs in a frame. To be used in
+ conjunction with --ctu-info to react to the specified
+ partitions appropriately.
+ * Rate-control statistics passed through the x265_picture object
+ for an incoming frame are now used by the encoder.
+ * Options to scale, reuse, and refine analysis for incoming
+ analysis shared through the x265_analysis_data field in
+ x265_picture for runs that use --analysis-reuse-mode load; use
+ options --scale, --refine-mv, --refine-inter, and
+ --refine-intra to explore.
+ * VBV now has a deterministic mode. Use --const-vbv to exercise.
+ Bug fixes
+ * Several fixes for HDR10+ parsing code including incompatibility
+ with user-specific SEI, removal of warnings, linking issues in
+ linux, etc.
+ * SEI messages for HDR10 repeated every keyint when HDR options
+ (--hdr-opt, --master-display) specified.
+- soname bump to 130.
+
+-------------------------------------------------------------------
Thu Apr 27 14:15:13 UTC 2017 - joerg.lorenzen@ki.tng.de

- Update to version 2.4
x265.spec
Changed
# based on the spec file from https://build.opensuse.org/package/view_file/home:Simmphonie/libx265/

Name: x265
-%define soname 116
+%define soname 130
%define libname lib%{name}
%define libsoname %{libname}-%{soname}
-Version: 2.4
+Version: 2.5
Release: 0
License: GPL-2.0+
Summary: A free h265/HEVC encoder - encoder binary
baselibs.conf
Changed
-libx265-116
+libx265-130
x265_2.4.tar.gz/source/dynamicHDR10/BasicStructures.cpp
Deleted
-/**
- * @file BasicStructures.cpp
- * @brief Defines the structure of metadata parameters
- * @author Daniel Maximiliano Valenzuela, Seongnam Oh.
- * @create date 03/01/2017
- * @version 0.0.1
- *
- * Copyright @ 2017 Samsung Electronics, DMS Lab, Samsung Research America and Samsung Research Tijuana
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version 2
- * of the License, or (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
- * MA 02110-1301, USA.
-**/
-
-#include "BasicStructures.h"
-#include "vector"
-
-struct PercentileLuminance{
-
- float averageLuminance = 0.0;
- float maxRLuminance = 0.0;
- float maxGLuminance = 0.0;
- float maxBLuminance = 0.0;
- int order;
- std::vector<unsigned int> percentiles;
-};
-
-
-
x265_2.4.tar.gz/.hg_archival.txt -> x265_2.5.tar.gz/.hg_archival.txt
Changed
repo: 09fe40627f03a0f9c3e6ac78b22ac93da23f9fdf
-node: e7a4dd48293b7956d4a20df257d23904cc78e376
+node: 64b2d0bf45a52511e57a6b7299160b961ca3d51c
branch: stable
-tag: 2.4
+tag: 2.5
x265_2.4.tar.gz/.hgtags -> x265_2.5.tar.gz/.hgtags
Changed
981e3bfef16a997bce6f46ce1b15631a0e234747 2.1
be14a7e9755e54f0fd34911c72bdfa66981220bc 2.2
3037c1448549ca920967831482c653e5892fa8ed 2.3
+e7a4dd48293b7956d4a20df257d23904cc78e376 2.4
x265_2.4.tar.gz/doc/reST/api.rst -> x265_2.5.tar.gz/doc/reST/api.rst
Changed
* presets is not recommended without a more fine-grained breakdown of
* parameters to take this into account. */
int x265_encoder_reconfig(x265_encoder *, x265_param *);
+**x265_encoder_ctu_info**
+ /* x265_encoder_ctu_info:
+ * Copy CTU information such as ctu address and ctu partition structure of all
+ * CTUs in each frame. The function is invoked only if "--ctu-info" is enabled and
+ * the encoder will wait for this copy to complete if enabled.
+ */

Pictures
========

Cleanup
=======

+At the end of the encode, the application will want to trigger logging
+of the final encode statistics, if :option:`--csv` had been specified::
+
+ /* x265_encoder_log:
+ * write a line to the configured CSV file. If a CSV filename was not
+ * configured, or file open failed, this function will perform no write. */
+ void x265_encoder_log(x265_encoder *encoder, int argc, char **argv);
+
Finally, the encoder must be closed in order to free all of its
resources. An encoder that has been flushed cannot be restarted and
reused. Once **x265_encoder_close()** has been called, the encoder
x265_2.4.tar.gz/doc/reST/cli.rst -> x265_2.5.tar.gz/doc/reST/cli.rst
Changed
2. unable to open encoder
3. unable to generate stream headers
4. encoder abort
- 5. unable to open csv file
-
+
Logging/Statistic Options
=========================

it adds one line per run. If :option:`--csv-log-level` is greater than
0, it writes one line per frame. Default none

- Several frame performance statistics are available when
- :option:`--csv-log-level` is greater than or equal to 2:
-
+ The following statistics are available when :option:`--csv-log-level` is
+ greater than or equal to 1:
+
+ **Encode Order** The frame order in which the encoder encodes.
+
+ **Type** Slice type of the frame.
+
+ **POC** Picture Order Count - The display order of the frames.
+
+ **QP** Quantization Parameter decided for the frame.
+
+ **Bits** Number of bits consumed by the frame.
+
+ **Scenecut** 1 if the frame is a scenecut, 0 otherwise.
+
+ **RateFactor** Applicable only when CRF is enabled. The rate factor depends
+ on the CRF given by the user. This is used to determine the QP so as to
+ target a certain quality.
+
+ **BufferFill** Bits available for the next frame. Includes bits carried
+ over from the current frame.
+
+ **Latency** Latency in terms of number of frames between when the frame
+ was given in and when the frame is given out.
+
+ **PSNR** Peak signal to noise ratio for Y, U and V planes.
+
+ **SSIM** A quality metric that denotes the structural similarity between frames.
+
+ **Ref lists** POC of references in lists 0 and 1 for the frame.
+
+ Several statistics about the encoded bitstream and encoder performance are
+ available when :option:`--csv-log-level` is greater than or equal to 2:
+
+ **I/P cost ratio:** The ratio between the cost when a frame is decided as an
+ I frame to that when it is decided as a P frame as computed from the
+ quarter-resolution frame in look-ahead. This, in combination with other parameters
+ such as position of the frame in the GOP, is used to decide scene transitions.
+
+ **Analysis statistics:**
+
+ **CU Statistics** percentage of CU modes.
+
+ **Distortion** Average luma and chroma distortion. Calculated as
+ SSE is done on fenc and recon(after quantization).
+
+ **Psy Energy** Average psy energy calculated as the sum of absolute
+ difference between source and recon energy. Energy is measured by sa8d
+ minus SAD.
+
+ **Residual Energy** Average residual energy. SSE is calculated on fenc
+ and pred(before quantization).
+
+ **Luma/Chroma Values** minumum, maximum and average(averaged by area)
+ luma and chroma values of source for each frame.
+
+ **PU Statistics** percentage of PU modes at each depth.
+
+ **Performance statistics:**
+
**DecideWait ms** number of milliseconds the frame encoder had to
wait, since the previous frame was retrieved by the API thread,
before a new frame has been given to it. This is the latency

**Stall Time ms** the number of milliseconds of the reported wall
time that were spent with zero worker threads, aka all compression
was completely stalled.
+
+ **Total frame time** Total time spent to encode the frame.

**Avg WPP** the average number of worker threads working on this
frame, at any given time. This value is sampled at the completion of

is more of a problem for P frames where some blocks are much more
expensive than others.

- **CLI ONLY**
-
.. option:: --csv-log-level <integer>

Controls the level of detail (and size) of --csv log files

1. frame level logging
2. frame level logging with performance statistics

- **CLI ONLY**
-
.. option:: --ssim, --no-ssim

Calculate and report Structural Similarity values. It is

Analysis re-use options, to improve performance when encoding the same
sequence multiple times (presumably at varying bitrates). The encoder
-will not reuse analysis if the resolution and slice type parameters do
-not match.
+will not reuse analysis if slice type parameters do not match.

-.. option:: --analysis-mode <string|int>
+.. option:: --analysis-reuse-mode <string|int>

- Specify whether analysis information of each frame is output by encoder
- or input for reuse. By reading the analysis data writen by an
- earlier encode of the same sequence, substantial redundant work may
- be avoided.
-
- The following data may be stored and reused:
- I frames - split decisions and luma intra directions of all CUs.
- P/B frames - motion vectors are dumped at each depth for all CUs.
+ This option allows reuse of analysis information from first pass to second pass.
+ :option:`--analysis-reuse-mode save` specifies that encoder outputs analysis information of each frame.
+ :option:`--analysis-reuse-mode load` specifies that encoder reuses analysis information from first pass.
+ There is no benefit using load mode without running encoder in save mode. Analysis data from save mode is
+ written to a file specified by :option:`--analysis-reuse-file`. The amount of analysis data stored/reused
+ is determined by :option:`--analysis-reuse-level`. By reading the analysis data writen by an earlier encode
+ of the same sequence, substantial redundant work may be avoided. Requires cutree, pmode to be off. Default 0.

**Values:** off(0), save(1): dump analysis data, load(2): read analysis data

-.. option:: --analysis-file <filename>
+.. option:: --analysis-reuse-file <filename>

- Specify a filename for analysis data (see :option:`--analysis-mode`)
+ Specify a filename for analysis data (see :option:`--analysis-reuse-mode`)
If no filename is specified, x265_analysis.dat is used.

-.. option:: --refine-level <1..10>
+.. option:: --analysis-reuse-level <1..10>

- Amount of information stored/reused in :option:`--analysis-mode` is distributed across levels.
+ Amount of information stored/reused in :option:`--analysis-reuse-mode` is distributed across levels.
Higher the value, higher the information stored/reused, faster the encode. Default 5.

- Note that --refine-level must be paired with analysis-mode.
+ Note that --analysis-reuse-level must be paired with analysis-reuse-mode.

+--------+-----------------------------------------+
| Level | Description |

| 10 | Level 5 + Full CU analysis-info |
+--------+-----------------------------------------+

+.. option:: --scale-factor
+
+ Factor by which input video is scaled down for analysis save mode.
+ This option should be coupled with analysis-reuse-mode option, --analysis-reuse-level 10.
+ The ctu size of load should be double the size of save. Default 0.
+
+.. option:: --refine-intra <0|1|2>
+
+ Enables refinement of intra blocks in current encode.
+
+ Level 0 - Forces both mode and depth from the previous encode.
+
+ Level 1 - Evaluates all intra modes for blocks of size one smaller than
+ the min-cu-size of the incoming analysis data from the previous encode,
+ forces modes for blocks of larger size.
+
+ Level 2 - Evaluates all intra modes for blocks of size one smaller than
+ the min-cu-size of the incoming analysis data from the previous encode.
+ For larger blocks, force only depth when angular mode is chosen by the
+ previous encode, force depth and mode when other intra modes are chosen.
+
+ Default 0.
+
+.. option:: --refine-inter-depth
+
+ Enables refinement of inter blocks in current encode. Evaluates all
+ inter modes for blocks of size one smaller than the min-cu-size of the
+ incoming analysis data from the previous encode. Default disabled.
+
+.. option:: --refine-mv
+
+ Enables refinement of motion vector for scaled video. Evaluates the best
+ motion vector by searching the surrounding eight integer and subpel pixel
+ positions.
+
Options which affect the transform unit quad-tree, sometimes referred to
as the residual quad-tree (RQT).

intra cost of a frame used in scenecut detection. For example, a value of 5 indicates,
if the inter cost of a frame is greater than or equal to 95 percent of the intra cost of the frame,
x265_2.4.tar.gz/doc/reST/releasenotes.rst -> x265_2.5.tar.gz/doc/reST/releasenotes.rst
Changed
Release Notes
*************

-Release Notes
-*************
+Version 2.5
+===========
+
+Release date - 13th July, 2017.
+
+Encoder enhancements
+--------------------
+1. Improved grain handling with :option:`--tune` grain option by throttling VBV operations to limit QP jumps.
+2. Frame threads are now decided based on number of threads specified in the :option:`--pools`, as opposed to the number of hardware threads available. The mapping was also adjusted to improve quality of the encodes with minimal impact to performance.
+3. CSV logging feature (enabled by :option:`--csv`) is now part of the library; it was previously part of the x265 application. Applications that integrate libx265 can now extract frame level statistics for their encodes by exercising this option in the library.
+4. Globals that track min and max CU sizes, number of slices, and other parameters have now been moved into instance-specific variables. Consequently, applications that invoke multiple instances of x265 library are no longer restricted to use the same settings for these parameter options across the multiple instances.
+5. x265 can now generate a seprate library that exports the HDR10+ parsing API. Other libraries that wish to use this API may do so by linking against this library. Enable ENABLE_HDR10_PLUS in CMake options and build to generate this library.
+6. SEA motion search receives a 10% performance boost from AVX2 optimization of its kernels.
+7. The CSV log is now more elaborate with additional fields such as PU statistics, average-min-max luma and chroma values, etc. Refer to documentation of :option:`--csv` for details of all fields.
+8. x86inc.asm cleaned-up for improved instruction handling.
+
+API changes
+-----------
+1. New API x265_encoder_ctu_info() introduced to specify suggested partition sizes for various CTUs in a frame. To be used in conjunction with :option:`--ctu-info` to react to the specified partitions appropriately.
+2. Rate-control statistics passed through the x265_picture object for an incoming frame are now used by the encoder.
+3. Options to scale, reuse, and refine analysis for incoming analysis shared through the x265_analysis_data field in x265_picture for runs that use :option:`--analysis-reuse-mode` load; use options :option:`--scale`, :option:`--refine-mv`, :option:`--refine-inter`, and :option:`--refine-intra` to explore.
+4. VBV now has a deterministic mode. Use :option:`--const-vbv` to exercise.
+
+Bug fixes
+---------
+1. Several fixes for HDR10+ parsing code including incompatibility with user-specific SEI, removal of warnings, linking issues in linux, etc.
+2. SEI messages for HDR10 repeated every keyint when HDR options (:option:`--hdr-opt`, :option:`--master-display`) specified.

Version 2.4
===========
x265_2.4.tar.gz/source/CMakeLists.txt -> x265_2.5.tar.gz/source/CMakeLists.txt
Changed
option(STATIC_LINK_CRT "Statically link C runtime for release builds" OFF)
mark_as_advanced(FPROFILE_USE FPROFILE_GENERATE NATIVE_BUILD)
# X265_BUILD must be incremented each time the public API is changed
-set(X265_BUILD 116)
+set(X265_BUILD 130)
configure_file("${PROJECT_SOURCE_DIR}/x265.def.in"
"${PROJECT_BINARY_DIR}/x265.def")
configure_file("${PROJECT_SOURCE_DIR}/x265_config.h.in"

add_definitions(-O3 -qstrict -qhot -qaltivec)
add_definitions(-qinline=level=10 -qpath=IL:/data/video_files/latest.tpo/)
endif()
-
-
+# this option is to enable the inclusion of dynamic HDR10 library to the libx265 compilation
+option(ENABLE_HDR10_PLUS "Enable dynamic HDR10 compilation" OFF)
if(GCC)
add_definitions(-Wall -Wextra -Wshadow)
add_definitions(-D__STDC_LIMIT_MACROS=1)
- add_definitions(-std=gnu++98)
+ if(ENABLE_HDR10_PLUS)
+ if(CMAKE_CXX_COMPILER_VERSION VERSION_LESS "4.8")
+ message(FATAL_ERROR "gcc version above 4.8 required to support hdr10plus")
+ endif()
+ add_definitions(-std=gnu++11)
+ else()
+ add_definitions(-std=gnu++98)
+ endif()
if(ENABLE_PIC)
add_definitions(-fPIC)
endif(ENABLE_PIC)

else(HIGH_BIT_DEPTH)
add_definitions(-DHIGH_BIT_DEPTH=0 -DX265_DEPTH=8)
endif(HIGH_BIT_DEPTH)
-# this option is to enable the inclusion of dynamic HDR10 library to the libx265 compilation
-option(ENABLE_DYNAMIC_HDR10 "Enable dynamic HDR10 compilation" OFF)
-if (ENABLE_DYNAMIC_HDR10)
- add_subdirectory(dynamicHDR10)
- include_directories(dynamicHDR10)
- add_definitions(-DENABLE_DYNAMIC_HDR10)
-endif(ENABLE_DYNAMIC_HDR10)

+if (ENABLE_HDR10_PLUS)
+ include_directories(. dynamicHDR10 "${PROJECT_BINARY_DIR}")
+ add_subdirectory(dynamicHDR10)
+ add_definitions(-DENABLE_HDR10_PLUS)
+endif(ENABLE_HDR10_PLUS)
# this option can only be used when linking multiple libx265 libraries
# together, and some alternate API access method is implemented.
option(EXPORT_C_API "Implement public C programming interface" ON)

endif()
endif()
source_group(ASM FILES ${ASM_SRCS})
-if(ENABLE_DYNAMIC_HDR10)
+if(ENABLE_HDR10_PLUS)
add_library(x265-static STATIC $<TARGET_OBJECTS:encoder> $<TARGET_OBJECTS:common> $<TARGET_OBJECTS:dynamicHDR10> ${ASM_OBJS} ${ASM_SRCS})
+ add_library(hdr10plus-static STATIC $<TARGET_OBJECTS:dynamicHDR10>)
+ set_target_properties(hdr10plus-static PROPERTIES OUTPUT_NAME hdr10plus)
else()
add_library(x265-static STATIC $<TARGET_OBJECTS:encoder> $<TARGET_OBJECTS:common> ${ASM_OBJS} ${ASM_SRCS})
endif()

install(TARGETS x265-static
LIBRARY DESTINATION ${LIB_INSTALL_DIR}
ARCHIVE DESTINATION ${LIB_INSTALL_DIR})
+
+if(ENABLE_HDR10_PLUS)
+ install(TARGETS hdr10plus-static
+ LIBRARY DESTINATION ${LIB_INSTALL_DIR}
+ ARCHIVE DESTINATION ${LIB_INSTALL_DIR})
+endif()
install(FILES x265.h "${PROJECT_BINARY_DIR}/x265_config.h" DESTINATION include)

if(CMAKE_RC_COMPILER)

endif()
option(ENABLE_SHARED "Build shared library" ON)
if(ENABLE_SHARED)
-
- if(ENABLE_DYNAMIC_HDR10)
+ if(ENABLE_HDR10_PLUS)
add_library(x265-shared SHARED "${PROJECT_BINARY_DIR}/x265.def" ${ASM_OBJS}
${X265_RC_FILE} $<TARGET_OBJECTS:encoder> $<TARGET_OBJECTS:common> $<TARGET_OBJECTS:dynamicHDR10>)
+ add_library(hdr10plus-shared SHARED $<TARGET_OBJECTS:dynamicHDR10>)
+
+ if(MSVC)
+ set_target_properties(hdr10plus-shared PROPERTIES OUTPUT_NAME libhdr10plus)
+ else()
+ set_target_properties(hdr10plus-shared PROPERTIES OUTPUT_NAME hdr10plus)
+ endif()
else()
add_library(x265-shared SHARED "${PROJECT_BINARY_DIR}/x265.def" ${ASM_OBJS}
${X265_RC_FILE} $<TARGET_OBJECTS:encoder> $<TARGET_OBJECTS:common>)

ARCHIVE DESTINATION ${LIB_INSTALL_DIR}
RUNTIME DESTINATION ${BIN_INSTALL_DIR})
endif()
+ if(ENABLE_HDR10_PLUS)
+ install(TARGETS hdr10plus-shared
+ LIBRARY DESTINATION ${LIB_INSTALL_DIR}
+ ARCHIVE DESTINATION ${LIB_INSTALL_DIR})
+ endif()
if(LINKER_OPTIONS)
# set_target_properties can't do list expansion
string(REPLACE ";" " " LINKER_OPTION_STR "${LINKER_OPTIONS}")

endif(WIN32)
if(XCODE)
# Xcode seems unable to link the CLI with libs, so link as one targget
- if(ENABLE_DYNAMIC_HDR10)
+ if(ENABLE_HDR10_PLUS)
add_executable(cli ../COPYING ${InputFiles} ${OutputFiles} ${GETOPT}
- x265.cpp x265.h x265cli.h x265-extras.h x265-extras.cpp
+ x265.cpp x265.h x265cli.h
$<TARGET_OBJECTS:encoder> $<TARGET_OBJECTS:common> $<TARGET_OBJECTS:dynamicHDR10> ${ASM_OBJS} ${ASM_SRCS})
else()
add_executable(cli ../COPYING ${InputFiles} ${OutputFiles} ${GETOPT}
- x265.cpp x265.h x265cli.h x265-extras.h x265-extras.cpp
+ x265.cpp x265.h x265cli.h
$<TARGET_OBJECTS:encoder> $<TARGET_OBJECTS:common> ${ASM_OBJS} ${ASM_SRCS})
endif()
else()
add_executable(cli ../COPYING ${InputFiles} ${OutputFiles} ${GETOPT} ${X265_RC_FILE}
- ${ExportDefs} x265.cpp x265.h x265cli.h x265-extras.h x265-extras.cpp)
+ ${ExportDefs} x265.cpp x265.h x265cli.h)
if(WIN32 OR NOT ENABLE_SHARED OR INTEL_CXX)
# The CLI cannot link to the shared library on Windows, it
# requires internal APIs not exported from the DLL
x265_2.4.tar.gz/source/common/CMakeLists.txt -> x265_2.5.tar.gz/source/common/CMakeLists.txt
Changed
set(VEC_PRIMITIVES vec/vec-primitives.cpp ${PRIMITIVES})
source_group(Intrinsics FILES ${VEC_PRIMITIVES})

- set(C_SRCS asm-primitives.cpp pixel.h mc.h ipfilter8.h blockcopy8.h dct8.h loopfilter.h)
+ set(C_SRCS asm-primitives.cpp pixel.h mc.h ipfilter8.h blockcopy8.h dct8.h loopfilter.h seaintegral.h)
set(A_SRCS pixel-a.asm const-a.asm cpu-a.asm ssd-a.asm mc-a.asm
mc-a2.asm pixel-util8.asm blockcopy8.asm
- pixeladd8.asm dct8.asm)
+ pixeladd8.asm dct8.asm seaintegral.asm)
if(HIGH_BIT_DEPTH)
set(A_SRCS ${A_SRCS} sad16-a.asm intrapred16.asm ipfilter16.asm loopfilter.asm)
else()
x265_2.4.tar.gz/source/common/common.h -> x265_2.5.tar.gz/source/common/common.h
Changed
#define LOG2_RASTER_SIZE (MAX_LOG2_CU_SIZE - LOG2_UNIT_SIZE)
#define RASTER_SIZE (1 << LOG2_RASTER_SIZE)
#define MAX_NUM_PARTITIONS (RASTER_SIZE * RASTER_SIZE)
-#define NUM_4x4_PARTITIONS (1U << (g_unitSizeDepth << 1)) // number of 4x4 units in max CU size

#define MIN_PU_SIZE 4
#define MIN_TU_SIZE 4
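Note on the hunk above: the removed NUM_4x4_PARTITIONS macro derived the number of 4x4 units per CTU from the global g_unitSizeDepth, while the cudata.cpp hunks in this revision read the same quantity from per-instance fields (param.num4x4Partitions, param.unitSizeDepth, param.maxLog2CUSize). The following is a minimal C++ sketch of that arithmetic, not the x265 implementation. CtuParams, makeCtuParams, numPartitionsAtDepth and checkQuartering are hypothetical names used only for illustration, and the relation unitSizeDepth = maxLog2CUSize - 2 is an assumption based on the 4x4 minimum unit size (MIN_PU_SIZE 4).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the per-instance fields that the diff reads from
// x265_param; the real struct has many more members.
struct CtuParams
{
    uint32_t maxLog2CUSize;    // log2 of the CTU width, e.g. 6 for 64x64
    uint32_t unitSizeDepth;    // depth from the CTU down to a 4x4 unit
    uint32_t num4x4Partitions; // number of 4x4 units in one CTU
};

// Assumption: 4x4 is the smallest unit, so its log2 size is 2.
constexpr uint32_t kLog2UnitSize = 2;

// Derive the dependent fields from the CTU size alone, using the same
// expression as the removed macro: 1U << (unitSizeDepth << 1).
constexpr CtuParams makeCtuParams(uint32_t maxLog2CUSize)
{
    return CtuParams{ maxLog2CUSize,
                      maxLog2CUSize - kLog2UnitSize,
                      1u << ((maxLog2CUSize - kLog2UnitSize) << 1) };
}

// Same expression as the updated CUData::initialize(): each extra depth
// level quarters the number of 4x4 partitions covered by a CU.
constexpr uint32_t numPartitionsAtDepth(const CtuParams& p, uint32_t depth)
{
    return p.num4x4Partitions >> (depth * 2);
}

// Runtime spot check: descending one depth level quarters the count.
inline void checkQuartering(const CtuParams& p)
{
    for (uint32_t d = 0; d < p.unitSizeDepth; d++)
        assert(numPartitionsAtDepth(p, d) == 4 * numPartitionsAtDepth(p, d + 1));
}
```

Because the values live in a struct rather than in globals, two encoder instances with different CTU sizes can each carry their own partition counts, which is the point of the "instance-specific variables" item in the changelog.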
x265_2.4.tar.gz/source/common/constants.cpp -> x265_2.5.tar.gz/source/common/constants.cpp
Changed
65535
};

-int g_ctuSizeConfigured = 0;
uint32_t g_maxLog2CUSize = MAX_LOG2_CU_SIZE;
uint32_t g_maxCUSize = MAX_CU_SIZE;
uint32_t g_unitSizeDepth = NUM_CU_DEPTH;
x265_2.4.tar.gz/source/common/constants.h -> x265_2.5.tar.gz/source/common/constants.h
Changed
namespace X265_NS {
// private namespace

-extern int g_ctuSizeConfigured;
-
extern double x265_lambda_tab[QP_MAX_MAX + 1];
extern double x265_lambda2_tab[QP_MAX_MAX + 1];
extern const uint16_t x265_chroma_lambda2_offset_tab[MAX_CHROMA_LAMBDA_OFFSET + 1];
x265_2.4.tar.gz/source/common/cpu.cpp -> x265_2.5.tar.gz/source/common/cpu.cpp
Changed
{ "SSE2Slow", SSE2 | X265_CPU_SSE2_IS_SLOW },
{ "SSE2", SSE2 },
{ "SSE2Fast", SSE2 | X265_CPU_SSE2_IS_FAST },
+ { "LZCNT", X265_CPU_LZCNT },
{ "SSE3", SSE2 | X265_CPU_SSE3 },
{ "SSSE3", SSE2 | X265_CPU_SSE3 | X265_CPU_SSSE3 },
{ "SSE4.1", SSE2 | X265_CPU_SSE3 | X265_CPU_SSSE3 | X265_CPU_SSE4 },

{ "AVX", AVX },
{ "XOP", AVX | X265_CPU_XOP },
{ "FMA4", AVX | X265_CPU_FMA4 },
- { "AVX2", AVX | X265_CPU_AVX2 },
{ "FMA3", AVX | X265_CPU_FMA3 },
+ { "BMI1", AVX | X265_CPU_LZCNT | X265_CPU_BMI1 },
+ { "BMI2", AVX | X265_CPU_LZCNT | X265_CPU_BMI1 | X265_CPU_BMI2 },
+#define AVX2 AVX | X265_CPU_FMA3 | X265_CPU_LZCNT | X265_CPU_BMI1 | X265_CPU_BMI2 | X265_CPU_AVX2
+ { "AVX2", AVX2},
+#undef AVX2
#undef AVX
#undef SSE2
#undef MMX2
{ "Cache32", X265_CPU_CACHELINE_32 },
{ "Cache64", X265_CPU_CACHELINE_64 },
- { "LZCNT", X265_CPU_LZCNT },
- { "BMI1", X265_CPU_BMI1 },
- { "BMI2", X265_CPU_BMI1 | X265_CPU_BMI2 },
{ "SlowCTZ", X265_CPU_SLOW_CTZ },
{ "SlowAtom", X265_CPU_SLOW_ATOM },
{ "SlowPshufb", X265_CPU_SLOW_PSHUFB },
x265_2.4.tar.gz/source/common/cudata.cpp -> x265_2.5.tar.gz/source/common/cudata.cpp
Changed
201
1
2
#include "picyuv.h"
3
#include "mv.h"
4
#include "cudata.h"
5
+#define MAX_MV 1 << 14
6
7
using namespace X265_NS;
8
9
10
11
}
12
13
-cubcast_t CUData::s_partSet[NUM_FULL_DEPTH] = { NULL, NULL, NULL, NULL, NULL };
14
-uint32_t CUData::s_numPartInCUSize;
15
-
16
CUData::CUData()
17
{
18
memset(this, 0, sizeof(*this));
19
}
20
21
-void CUData::initialize(const CUDataMemPool& dataPool, uint32_t depth, int csp, int instance)
22
+void CUData::initialize(const CUDataMemPool& dataPool, uint32_t depth, const x265_param& param, int instance)
23
{
24
+ int csp = param.internalCsp;
25
m_chromaFormat = csp;
26
m_hChromaShift = CHROMA_H_SHIFT(csp);
27
m_vChromaShift = CHROMA_V_SHIFT(csp);
28
- m_numPartitions = NUM_4x4_PARTITIONS >> (depth * 2);
29
+ m_numPartitions = param.num4x4Partitions >> (depth * 2);
30
31
if (!s_partSet[0])
32
{
33
- s_numPartInCUSize = 1 << g_unitSizeDepth;
34
- switch (g_maxLog2CUSize)
35
+ s_numPartInCUSize = 1 << param.unitSizeDepth;
36
+ switch (param.maxLog2CUSize)
37
{
38
case 6:
39
s_partSet[0] = bcast256;
40
41
42
m_distortion = dataPool.distortionMemBlock + instance * m_numPartitions;
43
44
- uint32_t cuSize = g_maxCUSize >> depth;
45
+ uint32_t cuSize = param.maxCUSize >> depth;
46
m_trCoeff[0] = dataPool.trCoeffMemBlock + instance * (cuSize * cuSize);
47
m_trCoeff[1] = m_trCoeff[2] = 0;
48
m_transformSkip[1] = m_transformSkip[2] = m_cbf[1] = m_cbf[2] = 0;
49
50
51
m_distortion = dataPool.distortionMemBlock + instance * m_numPartitions;
52
53
- uint32_t cuSize = g_maxCUSize >> depth;
54
+ uint32_t cuSize = param.maxCUSize >> depth;
55
uint32_t sizeL = cuSize * cuSize;
56
uint32_t sizeC = sizeL >> (m_hChromaShift + m_vChromaShift); // block chroma part
57
m_trCoeff[0] = dataPool.trCoeffMemBlock + instance * (sizeL + sizeC * 2);
58
59
m_encData = frame.m_encData;
60
m_slice = m_encData->m_slice;
61
m_cuAddr = cuAddr;
62
- m_cuPelX = (cuAddr % m_slice->m_sps->numCuInWidth) << g_maxLog2CUSize;
63
- m_cuPelY = (cuAddr / m_slice->m_sps->numCuInWidth) << g_maxLog2CUSize;
64
+ m_cuPelX = (cuAddr % m_slice->m_sps->numCuInWidth) << m_slice->m_param->maxLog2CUSize;
65
+ m_cuPelY = (cuAddr / m_slice->m_sps->numCuInWidth) << m_slice->m_param->maxLog2CUSize;
66
m_absIdxInCTU = 0;
67
- m_numPartitions = NUM_4x4_PARTITIONS;
68
+ m_numPartitions = m_encData->m_param->num4x4Partitions;
69
m_bFirstRowInSlice = (uint8_t)firstRowInSlice;
70
m_bLastRowInSlice = (uint8_t)lastRowInSlice;
71
m_bLastCuInSlice = (uint8_t)lastCuInSlice;
72
73
/* sequential memsets */
74
m_partSet((uint8_t*)m_qp, (uint8_t)qp);
75
- m_partSet(m_log2CUSize, (uint8_t)g_maxLog2CUSize);
76
+ m_partSet(m_log2CUSize, (uint8_t)m_slice->m_param->maxLog2CUSize);
77
m_partSet(m_lumaIntraDir, (uint8_t)ALL_IDX);
78
m_partSet(m_chromaIntraDir, (uint8_t)ALL_IDX);
79
m_partSet(m_tqBypass, (uint8_t)frame.m_encData->m_param->bLossless);
80
81
82
memcpy(m_distortion + offset, subCU.m_distortion, childGeom.numPartitions * sizeof(sse_t));
83
84
- uint32_t tmp = 1 << ((g_maxLog2CUSize - childGeom.depth) * 2);
85
+ uint32_t tmp = 1 << ((m_slice->m_param->maxLog2CUSize - childGeom.depth) * 2);
86
uint32_t tmp2 = subPartIdx * tmp;
memcpy(m_trCoeff[0] + tmp2, subCU.m_trCoeff[0], sizeof(coeff_t)* tmp);

memcpy(ctu.m_distortion + m_absIdxInCTU, m_distortion, m_numPartitions * sizeof(sse_t));

- uint32_t tmpY = 1 << ((g_maxLog2CUSize - depth) * 2);
+ uint32_t tmpY = 1 << ((m_slice->m_param->maxLog2CUSize - depth) * 2);
uint32_t tmpY2 = m_absIdxInCTU << (LOG2_UNIT_SIZE * 2);
memcpy(ctu.m_trCoeff[0] + tmpY2, m_trCoeff[0], sizeof(coeff_t)* tmpY);

m_partCopy(ctu.m_tuDepth + m_absIdxInCTU, m_tuDepth);
m_partCopy(ctu.m_cbf[0] + m_absIdxInCTU, m_cbf[0]);

- uint32_t tmpY = 1 << ((g_maxLog2CUSize - depth) * 2);
+ uint32_t tmpY = 1 << ((m_slice->m_param->maxLog2CUSize - depth) * 2);
uint32_t tmpY2 = m_absIdxInCTU << (LOG2_UNIT_SIZE * 2);
memcpy(ctu.m_trCoeff[0] + tmpY2, m_trCoeff[0], sizeof(coeff_t)* tmpY);

return m_cuLeft;
}

- alPartUnitIdx = NUM_4x4_PARTITIONS - 1;
+ alPartUnitIdx = m_encData->m_param->num4x4Partitions - 1;
return m_cuAboveLeft;
}

/* Get left QpMinCu */
const CUData* CUData::getQpMinCuLeft(uint32_t& lPartUnitIdx, uint32_t curAbsIdxInCTU) const
{
- uint32_t absZorderQpMinCUIdx = curAbsIdxInCTU & (0xFF << (g_unitSizeDepth - m_slice->m_pps->maxCuDQPDepth) * 2);
+ uint32_t absZorderQpMinCUIdx = curAbsIdxInCTU & (0xFF << (m_encData->m_param->unitSizeDepth - m_slice->m_pps->maxCuDQPDepth) * 2);
uint32_t absRorderQpMinCUIdx = g_zscanToRaster[absZorderQpMinCUIdx];

// check for left CTU boundary

/* Get above QpMinCu */
const CUData* CUData::getQpMinCuAbove(uint32_t& aPartUnitIdx, uint32_t curAbsIdxInCTU) const
{
- uint32_t absZorderQpMinCUIdx = curAbsIdxInCTU & (0xFF << (g_unitSizeDepth - m_slice->m_pps->maxCuDQPDepth) * 2);
+ uint32_t absZorderQpMinCUIdx = curAbsIdxInCTU & (0xFF << (m_encData->m_param->unitSizeDepth - m_slice->m_pps->maxCuDQPDepth) * 2);
uint32_t absRorderQpMinCUIdx = g_zscanToRaster[absZorderQpMinCUIdx];

// check for top CTU boundary

int8_t CUData::getLastCodedQP(uint32_t absPartIdx) const
{
- uint32_t quPartIdxMask = 0xFF << (g_unitSizeDepth - m_slice->m_pps->maxCuDQPDepth) * 2;
+ uint32_t quPartIdxMask = 0xFF << (m_encData->m_param->unitSizeDepth - m_slice->m_pps->maxCuDQPDepth) * 2;
int lastValidPartIdx = getLastValidPartIdx(absPartIdx & quPartIdxMask);

if (lastValidPartIdx >= 0)

if (m_absIdxInCTU)
return m_encData->getPicCTU(m_cuAddr)->getLastCodedQP(m_absIdxInCTU);
else if (m_cuAddr > 0 && !(m_slice->m_pps->bEntropyCodingSyncEnabled && !(m_cuAddr % m_slice->m_sps->numCuInWidth)))
- return m_encData->getPicCTU(m_cuAddr - 1)->getLastCodedQP(NUM_4x4_PARTITIONS);
+ return m_encData->getPicCTU(m_cuAddr - 1)->getLastCodedQP(m_encData->m_param->num4x4Partitions);
else
return (int8_t)m_slice->m_sliceQp;
}

bool CUData::setQPSubCUs(int8_t qp, uint32_t absPartIdx, uint32_t depth)
{
- uint32_t curPartNumb = NUM_4x4_PARTITIONS >> (depth << 1);
+ uint32_t curPartNumb = m_encData->m_param->num4x4Partitions >> (depth << 1);
uint32_t curPartNumQ = curPartNumb >> 2;

if (m_cuDepth[absPartIdx] > depth)

dir |= (1 << list);
candMvField[count][list].mv = colmv;
candMvField[count][list].refIdx = refIdx;
+ if (m_encData->m_param->scaleFactor && m_encData->m_param->analysisReuseMode == X265_ANALYSIS_SAVE && m_log2CUSize[0] < 4)
+ {
+ MV dist(MAX_MV, MAX_MV);
+ candMvField[count][list].mv = dist;
+ }
}
}

int curRefPOC = m_slice->m_refPOCList[picList][refIdx];
int curPOC = m_slice->m_poc;

- pmv[numMvc++] = amvpCand[num++] = scaleMvByPOCDist(neighbours[MD_COLLOCATED].mv[picList], curPOC, curRefPOC, colPOC, colRefPOC);
+ if (m_encData->m_param->scaleFactor && m_encData->m_param->analysisReuseMode == X265_ANALYSIS_SAVE && (m_log2CUSize[0] < 4))
+ {
+ MV dist(MAX_MV, MAX_MV);
+ pmv[numMvc++] = amvpCand[num++] = dist;
+ }
+ else
+ pmv[numMvc++] = amvpCand[num++] = scaleMvByPOCDist(neighbours[MD_COLLOCATED].mv[picList], curPOC, curRefPOC, colPOC, colRefPOC);
}
}

uint32_t offset = 8;

int16_t xmax = (int16_t)((m_slice->m_sps->picWidthInLumaSamples + offset - m_cuPelX - 1) << mvshift);
- int16_t xmin = -(int16_t)((g_maxCUSize + offset + m_cuPelX - 1) << mvshift);
+ int16_t xmin = -(int16_t)((m_encData->m_param->maxCUSize + offset + m_cuPelX - 1) << mvshift);

int16_t ymax = (int16_t)((m_slice->m_sps->picHeightInLumaSamples + offset - m_cuPelY - 1) << mvshift);
- int16_t ymin = -(int16_t)((g_maxCUSize + offset + m_cuPelY - 1) << mvshift);
+ int16_t ymin = -(int16_t)((m_encData->m_param->maxCUSize + offset + m_cuPelY - 1) << mvshift);

outMV.x = X265_MIN(xmax, X265_MAX(xmin, outMV.x));
outMV.y = X265_MIN(ymax, X265_MAX(ymin, outMV.y));
x265_2.4.tar.gz/source/common/cudata.h -> x265_2.5.tar.gz/source/common/cudata.h
Changed
{
public:

- static cubcast_t s_partSet[NUM_FULL_DEPTH]; // pointer to broadcast set functions per absolute depth
- static uint32_t s_numPartInCUSize;
+ cubcast_t s_partSet[NUM_FULL_DEPTH]; // pointer to broadcast set functions per absolute depth
+ uint32_t s_numPartInCUSize;

bool m_vbvAffected;

CUData();

- void initialize(const CUDataMemPool& dataPool, uint32_t depth, int csp, int instance);
+ void initialize(const CUDataMemPool& dataPool, uint32_t depth, const x265_param& param, int instance);
static void calcCTUGeoms(uint32_t ctuWidth, uint32_t ctuHeight, uint32_t maxCUSize, uint32_t minCUSize, CUGeom cuDataArray[CUGeom::MAX_GEOMS]);

void initCTU(const Frame& frame, uint32_t cuAddr, int qp, uint32_t firstRowInSlice, uint32_t lastRowInSlice, uint32_t lastCUInSlice);

void getInterTUQtDepthRange(uint32_t tuDepthRange[2], uint32_t absPartIdx) const;
uint32_t getBestRefIdx(uint32_t subPartIdx) const { return ((m_interDir[subPartIdx] & 1) << m_refIdx[0][subPartIdx]) |
(((m_interDir[subPartIdx] >> 1) & 1) << (m_refIdx[1][subPartIdx] + 16)); }
- uint32_t getPUOffset(uint32_t puIdx, uint32_t absPartIdx) const { return (partAddrTable[(int)m_partSize[absPartIdx]][puIdx] << (g_unitSizeDepth - m_cuDepth[absPartIdx]) * 2) >> 4; }
+ uint32_t getPUOffset(uint32_t puIdx, uint32_t absPartIdx) const { return (partAddrTable[(int)m_partSize[absPartIdx]][puIdx] << (m_slice->m_param->unitSizeDepth - m_cuDepth[absPartIdx]) * 2) >> 4; }

uint32_t getNumPartInter(uint32_t absPartIdx) const { return nbPartsTable[(int)m_partSize[absPartIdx]]; }
bool isIntra(uint32_t absPartIdx) const { return m_predMode[absPartIdx] == MODE_INTRA; }

void getAllowedChromaDir(uint32_t absPartIdx, uint32_t* modeList) const;
int getIntraDirLumaPredictor(uint32_t absPartIdx, uint32_t* intraDirPred) const;

- uint32_t getSCUAddr() const { return (m_cuAddr << g_unitSizeDepth * 2) + m_absIdxInCTU; }
+ uint32_t getSCUAddr() const { return (m_cuAddr << m_slice->m_param->unitSizeDepth * 2) + m_absIdxInCTU; }
uint32_t getCtxSplitFlag(uint32_t absPartIdx, uint32_t depth) const;
uint32_t getCtxSkipFlag(uint32_t absPartIdx) const;
void getTUEntropyCodingParameters(TUEntropyCodingParameters &result, uint32_t absPartIdx, uint32_t log2TrSize, bool bIsLuma) const;

CUDataMemPool() { charMemBlock = NULL; trCoeffMemBlock = NULL; mvMemBlock = NULL; distortionMemBlock = NULL; }

- bool create(uint32_t depth, uint32_t csp, uint32_t numInstances)
+ bool create(uint32_t depth, uint32_t csp, uint32_t numInstances, const x265_param& param)
{
- uint32_t numPartition = NUM_4x4_PARTITIONS >> (depth * 2);
- uint32_t cuSize = g_maxCUSize >> depth;
+ uint32_t numPartition = param.num4x4Partitions >> (depth * 2);
+ uint32_t cuSize = param.maxCUSize >> depth;
uint32_t sizeL = cuSize * cuSize;
if (csp == X265_CSP_I400)
{
x265_2.4.tar.gz/source/common/frame.cpp -> x265_2.5.tar.gz/source/common/frame.cpp
Changed
m_rcData = NULL;
m_encodeStartTime = 0;
m_reconfigureRc = false;
+ m_ctuInfo = NULL;
+ m_prevCtuInfoChange = NULL;
+ m_addOnDepth = NULL;
+ m_addOnCtuInfo = NULL;
+ m_addOnPrevChange = NULL;
}

bool Frame::create(x265_param *param, float* quantOffsets)

m_param = param;
CHECKED_MALLOC_ZERO(m_rcData, RcStats, 1);

- if (m_fencPic->create(param->sourceWidth, param->sourceHeight, param->internalCsp) &&
- m_lowres.create(m_fencPic, param->bframes, !!param->rc.aqMode || !!param->bAQMotion, param->rc.qgSize))
+ if (param->bCTUInfo)
+ {
+ uint32_t widthInCTU = (m_param->sourceWidth + param->maxCUSize - 1) >> m_param->maxLog2CUSize;
+ uint32_t heightInCTU = (m_param->sourceHeight + param->maxCUSize - 1) >> m_param->maxLog2CUSize;
+ uint32_t numCTUsInFrame = widthInCTU * heightInCTU;
+ CHECKED_MALLOC_ZERO(m_addOnDepth, uint8_t *, numCTUsInFrame);
+ CHECKED_MALLOC_ZERO(m_addOnCtuInfo, uint8_t *, numCTUsInFrame);
+ CHECKED_MALLOC_ZERO(m_addOnPrevChange, int *, numCTUsInFrame);
+ for (uint32_t i = 0; i < numCTUsInFrame; i++)
+ {
+ CHECKED_MALLOC_ZERO(m_addOnDepth[i], uint8_t, uint32_t(param->num4x4Partitions));
+ CHECKED_MALLOC_ZERO(m_addOnCtuInfo[i], uint8_t, uint32_t(param->num4x4Partitions));
+ CHECKED_MALLOC_ZERO(m_addOnPrevChange[i], int, uint32_t(param->num4x4Partitions));
+ }
+ }
+
+ if (m_fencPic->create(param) && m_lowres.create(m_fencPic, param->bframes, !!param->rc.aqMode || !!param->bAQMotion, param->rc.qgSize))
{
X265_CHECK((m_reconColCount == NULL), "m_reconColCount was initialized");
- m_numRows = (m_fencPic->m_picHeight + g_maxCUSize - 1) / g_maxCUSize;
+ m_numRows = (m_fencPic->m_picHeight + param->maxCUSize - 1) / param->maxCUSize;
m_reconRowFlag = new ThreadSafeInteger[m_numRows];
m_reconColCount = new ThreadSafeInteger[m_numRows];

m_reconPic = new PicYuv;
m_param = param;
m_encData->m_reconPic = m_reconPic;
- bool ok = m_encData->create(*param, sps, m_fencPic->m_picCsp) && m_reconPic->create(param->sourceWidth, param->sourceHeight, param->internalCsp);
+ bool ok = m_encData->create(*param, sps, m_fencPic->m_picCsp) && m_reconPic->create(param);
if (ok)
{
/* initialize right border of m_reconpicYuv as SAO may read beyond the
* end of the picture accessing uninitialized pixels */
- int maxHeight = sps.numCuInHeight * g_maxCUSize;
+ int maxHeight = sps.numCuInHeight * param->maxCUSize;
memset(m_reconPic->m_picOrg[0], 0, sizeof(pixel)* m_reconPic->m_stride * maxHeight);

/* use pre-calculated cu/pu offsets cached in the SPS structure */

delete[] m_userSEI.payloads;
}

+ if (m_ctuInfo)
+ {
+ uint32_t widthInCU = (m_param->sourceWidth + m_param->maxCUSize - 1) >> m_param->maxLog2CUSize;
+ uint32_t heightInCU = (m_param->sourceHeight + m_param->maxCUSize - 1) >> m_param->maxLog2CUSize;
+ uint32_t numCUsInFrame = widthInCU * heightInCU;
+ for (uint32_t i = 0; i < numCUsInFrame; i++)
+ {
+ X265_FREE((*m_ctuInfo + i)->ctuInfo);
+ (*m_ctuInfo + i)->ctuInfo = NULL;
+ X265_FREE(m_addOnDepth[i]);
+ m_addOnDepth[i] = NULL;
+ X265_FREE(m_addOnCtuInfo[i]);
+ m_addOnCtuInfo[i] = NULL;
+ X265_FREE(m_addOnPrevChange[i]);
+ m_addOnPrevChange[i] = NULL;
+ }
+ X265_FREE(*m_ctuInfo);
+ *m_ctuInfo = NULL;
+ X265_FREE(m_ctuInfo);
+ m_ctuInfo = NULL;
+ X265_FREE(m_prevCtuInfoChange);
+ m_prevCtuInfoChange = NULL;
+ X265_FREE(m_addOnDepth);
+ m_addOnDepth = NULL;
+ X265_FREE(m_addOnCtuInfo);
+ m_addOnCtuInfo = NULL;
+ X265_FREE(m_addOnPrevChange);
+ m_addOnPrevChange = NULL;
+ }
m_lowres.destroy();
X265_FREE(m_rcData);
}
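The `widthInCTU` / `heightInCTU` expressions in the frame.cpp hunks above round the picture size up to a whole number of CTUs by adding `maxCUSize - 1` and shifting right by `maxLog2CUSize`. A minimal standalone sketch of that arithmetic follows; the function names `ctusAcross` and `ctusInFrame` are hypothetical and not part of x265, and the CTU size is assumed to be one of the HEVC sizes 16, 32 or 64.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an x265 function): number of CTUs needed to cover
// `samples` luma samples when each CTU is (1 << log2CtuSize) samples wide.
// Adding (ctuSize - 1) before the shift turns the truncating shift into a
// ceiling division.
inline uint32_t ctusAcross(uint32_t samples, uint32_t log2CtuSize)
{
    assert(log2CtuSize >= 4 && log2CtuSize <= 6); // CTU size 16, 32 or 64
    const uint32_t ctuSize = 1u << log2CtuSize;
    return (samples + ctuSize - 1) >> log2CtuSize;
}

// Hypothetical helper: total CTU count for a width x height picture.
inline uint32_t ctusInFrame(uint32_t width, uint32_t height, uint32_t log2CtuSize)
{
    return ctusAcross(width, log2CtuSize) * ctusAcross(height, log2CtuSize);
}
```

For a 1920x1080 picture with 64x64 CTUs this gives 30 columns and 17 rows, so 510 CTUs; the last row covers 1088 lines, of which only 1080 belong to the picture.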
x265_2.4.tar.gz/source/common/frame.h -> x265_2.5.tar.gz/source/common/frame.h
Changed
double shortTermCplxCount;
int64_t totalBits;
int64_t encodedBits;
+ double coeff[4];
+ double count[4];
+ double offset[4];
+ double bufferFillFinal;
};

class Frame

x265_analysis_2Pass m_analysis2Pass;
RcStats* m_rcData;

+ x265_ctu_info_t** m_ctuInfo;
+ Event m_copied;
+ int* m_prevCtuInfoChange;
int64_t m_encodeStartTime;
+
+ uint8_t** m_addOnDepth;
+ uint8_t** m_addOnCtuInfo;
+ int** m_addOnPrevChange;
Frame();

bool create(x265_param *param, float* quantOffsets);
x265_2.4.tar.gz/source/common/framedata.cpp -> x265_2.5.tar.gz/source/common/framedata.cpp
Changed
if (param.rc.bStatWrite)
m_spsrps = const_cast<RPS*>(sps.spsrps);

- m_cuMemPool.create(0, param.internalCsp, sps.numCUsInFrame);
+ m_cuMemPool.create(0, param.internalCsp, sps.numCUsInFrame, param);
for (uint32_t ctuAddr = 0; ctuAddr < sps.numCUsInFrame; ctuAddr++)
- m_picCTU[ctuAddr].initialize(m_cuMemPool, 0, param.internalCsp, ctuAddr);
+ m_picCTU[ctuAddr].initialize(m_cuMemPool, 0, param, ctuAddr);

CHECKED_MALLOC_ZERO(m_cuStat, RCStatCU, sps.numCUsInFrame);
CHECKED_MALLOC(m_rowStat, RCStatRow, sps.numCuInHeight);
x265_2.4.tar.gz/source/common/framedata.h -> x265_2.5.tar.gz/source/common/framedata.h
Changed
double percentMergeCu[NUM_CU_DEPTH];
double percentIntraDistribution[NUM_CU_DEPTH][INTRA_MODES];
double percentInterDistribution[NUM_CU_DEPTH][3]; // 2Nx2N, RECT, AMP modes percentage
+ double ipCostRatio;

uint64_t cntIntraNxN;
uint64_t totalCu;

uint64_t cuInterDistribution[NUM_CU_DEPTH][INTER_MODES];
uint64_t cuIntraDistribution[NUM_CU_DEPTH][INTRA_MODES];

+
+ uint64_t totalPu[NUM_CU_DEPTH + 1];
+ uint64_t cntSkipPu[NUM_CU_DEPTH];
+ uint64_t cntIntraPu[NUM_CU_DEPTH];
+ uint64_t cntAmp[NUM_CU_DEPTH];
+ uint64_t cnt4x4;
+ uint64_t cntInterPu[NUM_CU_DEPTH][INTER_MODES - 1];
+ uint64_t cntMergePu[NUM_CU_DEPTH][INTER_MODES - 1];
+
FrameStats()
{
memset(this, 0, sizeof(FrameStats));
x265_2.4.tar.gz/source/common/ipfilter.cpp -> x265_2.5.tar.gz/source/common/ipfilter.cpp
Changed
const int16_t* coeff = (N == 4) ? g_chromaFilter[coeffIdx] : g_lumaFilter[coeffIdx];
int headRoom = IF_INTERNAL_PREC - X265_DEPTH;
int shift = IF_FILTER_PREC - headRoom;
- int offset = -IF_INTERNAL_OFFS << shift;
+ int offset = (unsigned)-IF_INTERNAL_OFFS << shift;
int blkheight = height;
-
src -= N / 2 - 1;

if (isRowExt)

const int16_t* c = (N == 4) ? g_chromaFilter[coeffIdx] : g_lumaFilter[coeffIdx];
int headRoom = IF_INTERNAL_PREC - X265_DEPTH;
int shift = IF_FILTER_PREC - headRoom;
- int offset = -IF_INTERNAL_OFFS << shift;
-
+ int offset = (unsigned)-IF_INTERNAL_OFFS << shift;
src -= (N / 2 - 1) * srcStride;
-
int row, col;
for (row = 0; row < height; row++)
{
Changed
bool bKeyframe;
bool bLastMiniGopBFrame;

+ double ipCostRatio;
+
/* lookahead output data */
int64_t costEst[X265_BFRAME_MAX + 2][X265_BFRAME_MAX + 2];
int64_t costEstAq[X265_BFRAME_MAX + 2][X265_BFRAME_MAX + 2];
x265_2.4.tar.gz/source/common/param.cpp -> x265_2.5.tar.gz/source/common/param.cpp
Changed
param->frameNumThreads = 0;

param->logLevel = X265_LOG_INFO;
+ param->csvLogLevel = 0;
param->csvfn = NULL;
param->rc.lambdaFileName = NULL;
param->bLogCuStats = 0;

param->rdPenalty = 0;
param->psyRd = 2.0;
param->psyRdoq = 0.0;
- param->analysisMode = 0;
+ param->analysisReuseMode = 0;
param->analysisMultiPassRefine = 0;
param->analysisMultiPassDistortion = 0;
- param->analysisFileName = NULL;
+ param->analysisReuseFileName = NULL;
param->bIntraInBFrames = 0;
param->bLossless = 0;
param->bCULossless = 0;

param->rc.bEnableGrain = 0;
param->rc.qpMin = 0;
param->rc.qpMax = QP_MAX_MAX;
+ param->rc.bEnableConstVbv = 0;

/* Video Usability Information (VUI) */
param->vui.aspectRatioIdc = 0;

param->bOptCUDeltaQP = 0;
param->bAQMotion = 0;
param->bHDROpt = 0;
- param->analysisRefineLevel = 5;
+ param->analysisReuseLevel = 5;

param->toneMapFile = NULL;
param->bDhdr10opt = 0;
+ param->bCTUInfo = 0;
+ param->bUseRcStats = 0;
+ param->scaleFactor = 0;
+ param->intraRefine = 0;
+ param->interRefine = 0;
+ param->mvRefine = 0;
+ param->bUseAnalysisFile = 1;
+ param->csvfpt = NULL;
}

int x265_param_default_preset(x265_param* param, const char* preset, const char* tune)

param->psyRd = 4.0;
param->psyRdoq = 10.0;
param->bEnableSAO = 0;
+ param->rc.bEnableConstVbv = 1;
}
else
return -1;

p->rc.bStrictCbr = atobool(value);
p->rc.pbFactor = 1.0;
}
- OPT("analysis-mode") p->analysisMode = parseName(value, x265_analysis_names, bError);
+ OPT("analysis-reuse-mode") p->analysisReuseMode = parseName(value, x265_analysis_names, bError);
OPT("sar")
{
p->vui.aspectRatioIdc = parseName(value, x265_sar_names, bError);

OPT("scaling-list") p->scalingLists = strdup(value);
OPT2("pools", "numa-pools") p->numaPools = strdup(value);
OPT("lambda-file") p->rc.lambdaFileName = strdup(value);
- OPT("analysis-file") p->analysisFileName = strdup(value);
+ OPT("analysis-reuse-file") p->analysisReuseFileName = strdup(value);
OPT("qg-size") p->rc.qgSize = atoi(value);
OPT("master-display") p->masteringDisplayColorVolume = strdup(value);
OPT("max-cll") bError |= sscanf(value, "%hu,%hu", &p->maxCLL, &p->maxFALL) != 2;

if (bExtraParams)
{
if (0) ;
+ OPT("csv") p->csvfn = strdup(value);
+ OPT("csv-log-level") p->csvLogLevel = atoi(value);
OPT("qpmin") p->rc.qpMin = atoi(value);
OPT("analyze-src-pics") p->bSourceReferenceEstimation = atobool(value);
OPT("log2-max-poc-lsb") p->log2MaxPocLsb = atoi(value);

OPT("multi-pass-opt-distortion") p->analysisMultiPassDistortion = atobool(value);
OPT("aq-motion") p->bAQMotion = atobool(value);
OPT("dynamic-rd") p->dynamicRd = atof(value);
- OPT("refine-level") p->analysisRefineLevel = atoi(value);
+ OPT("analysis-reuse-level") p->analysisReuseLevel = atoi(value);
OPT("ssim-rd")
{
int bval = atobool(value);

OPT("limit-sao") p->bLimitSAO = atobool(value);
OPT("dhdr10-info") p->toneMapFile = strdup(value);
OPT("dhdr10-opt") p->bDhdr10opt = atobool(value);
+ OPT("const-vbv") p->rc.bEnableConstVbv = atobool(value);
+ OPT("ctu-info") p->bCTUInfo = atoi(value);
+ OPT("scale-factor") p->scaleFactor = atoi(value);
+ OPT("refine-intra")p->intraRefine = atoi(value);
+ OPT("refine-inter")p->interRefine = atobool(value);
+ OPT("refine-mv")p->mvRefine = atobool(value);
else
return X265_PARAM_BAD_NAME;
}

"Constant QP is incompatible with 2pass");
CHECK(param->rc.bStrictCbr && (param->rc.bitrate <= 0 || param->rc.vbvBufferSize <=0),
"Strict-cbr cannot be applied without specifying target bitrate or vbv bufsize");
- CHECK(param->analysisMode && (param->analysisMode < X265_ANALYSIS_OFF || param->analysisMode > X265_ANALYSIS_LOAD),
+ CHECK(param->analysisReuseMode && (param->analysisReuseMode < X265_ANALYSIS_OFF || param->analysisReuseMode > X265_ANALYSIS_LOAD),
"Invalid analysis mode. Analysis mode 0: OFF 1: SAVE : 2 LOAD");
- CHECK(param->analysisMode && (param->analysisRefineLevel < 1 || param->analysisRefineLevel > 10),
+ CHECK(param->analysisReuseMode && (param->analysisReuseLevel < 1 || param->analysisReuseLevel > 10),
"Invalid analysis refine level. Value must be between 1 and 10 (inclusive)");
+ CHECK(param->scaleFactor > 2, "Invalid scale-factor. Supports factor <= 2");
CHECK(param->rc.qpMax < QP_MIN || param->rc.qpMax > QP_MAX_MAX,
"qpmax exceeds supported range (0 to 69)");
CHECK(param->rc.qpMin < QP_MIN || param->rc.qpMin > QP_MAX_MAX,
"qpmin exceeds supported range (0 to 69)");
CHECK(param->log2MaxPocLsb < 4 || param->log2MaxPocLsb > 16,
"Supported range for log2MaxPocLsb is 4 to 16");
+ CHECK(param->bCTUInfo < 0 || (param->bCTUInfo != 0 && param->bCTUInfo != 1 && param->bCTUInfo != 2 && param->bCTUInfo != 4 && param->bCTUInfo != 6) || param->bCTUInfo > 6,
+ "Supported values for bCTUInfo are 0, 1, 2, 4, 6");
#if !X86_64
CHECK(param->searchMethod == X265_SEA && (param->sourceWidth > 840 || param->sourceHeight > 480),
"SEA motion search does not support resolutions greater than 480p in 32 bit build");

}
}

-int x265_set_globals(x265_param* param)
-{
- uint32_t maxLog2CUSize = (uint32_t)g_log2Size[param->maxCUSize];
- uint32_t minLog2CUSize = (uint32_t)g_log2Size[param->minCUSize];
-
- Lock gLock;
- ScopedLock sLock(gLock);
-
- if (++g_ctuSizeConfigured > 1)
- {
- if (g_maxCUSize != param->maxCUSize)
- {
- x265_log(param, X265_LOG_WARNING, "maxCUSize must be the same for all encoders in a single process");
- }
- if (g_maxCUDepth != maxLog2CUSize - minLog2CUSize)
- {
- x265_log(param, X265_LOG_WARNING, "maxCUDepth must be the same for all encoders in a single process");
- }
- param->maxCUSize = g_maxCUSize;
- return x265_check_params(param); /* Check again, since param may have changed */
- }
- else
- {
- // set max CU width & height
- g_maxCUSize = param->maxCUSize;
- g_maxLog2CUSize = maxLog2CUSize;
-
- // compute actual CU depth with respect to config depth and max transform size
- g_maxCUDepth = maxLog2CUSize - minLog2CUSize;
- g_unitSizeDepth = maxLog2CUSize - LOG2_UNIT_SIZE;
- }
-
- g_maxSlices = param->maxSlices;
- return 0;
-}
-
static void appendtool(x265_param* param, char* buf, size_t size, const char* toolstr)
{
static const int overhead = (int)strlen("x265 [info]: tools: ");

TOOLOPT(param->bEnableStrongIntraSmoothing, "strong-intra-smoothing");
TOOLVAL(param->lookaheadSlices, "lslices=%d");
TOOLVAL(param->lookaheadThreads, "lthreads=%d")
+ TOOLVAL(param->bCTUInfo, "ctu-info=%d");
if (param->maxSlices > 1)
TOOLVAL(param->maxSlices, "slices=%d");
if (param->bEnableLoopFilter)

TOOLOPT(!param->bSaoNonDeblocked && param->bEnableSAO, "sao");
TOOLOPT(param->rc.bStatWrite, "stats-write");
TOOLOPT(param->rc.bStatRead, "stats-read");
-#if ENABLE_DYNAMIC_HDR10
- TOOLVAL(param->toneMapFile != NULL, "dhdr10-info");
+#if ENABLE_HDR10_PLUS
+ TOOLOPT(param->toneMapFile != NULL, "dhdr10-info");
#endif
x265_log(param, X265_LOG_INFO, "tools:%s\n", buf);
fflush(stderr);

BOOL(p->bEnablePsnr, "psnr");
BOOL(p->bEnableSsim, "ssim");
s += sprintf(s, " log-level=%d", p->logLevel);
+ if (p->csvfn)
+ s += sprintf(s, " csvfn=%s csv-log-level=%d", p->csvfn, p->csvLogLevel);
s += sprintf(s, " bitdepth=%d", p->internalBitDepth);
s += sprintf(s, " input-csp=%d", p->internalCsp);
s += sprintf(s, " fps=%u/%u", p->fpsNum, p->fpsDenom);
x265_2.4.tar.gz/source/common/param.h -> x265_2.5.tar.gz/source/common/param.h
Changed
namespace X265_NS {

int x265_check_params(x265_param *param);
-int x265_set_globals(x265_param *param);
void x265_print_params(x265_param *param);
void x265_param_apply_fastfirstpass(x265_param *p);
char* x265_param2string(x265_param *param, int padx, int pady);
x265_2.4.tar.gz/source/common/picyuv.cpp -> x265_2.5.tar.gz/source/common/picyuv.cpp
Changed
m_maxLumaLevel = 0;
m_avgLumaLevel = 0;
+
+ m_maxChromaULevel = 0;
+ m_avgChromaULevel = 0;
+
+ m_maxChromaVLevel = 0;
+ m_avgChromaVLevel = 0;
+
+#if (X265_DEPTH > 8)
+ m_minLumaLevel = 0xFFFF;
+ m_minChromaULevel = 0xFFFF;
+ m_minChromaVLevel = 0xFFFF;
+#else
+ m_minLumaLevel = 0xFF;
+ m_minChromaULevel = 0xFF;
+ m_minChromaVLevel = 0xFF;
+#endif
+
m_stride = 0;
m_strideC = 0;
m_hChromaShift = 0;
m_vChromaShift = 0;
}

-bool PicYuv::create(uint32_t picWidth, uint32_t picHeight, uint32_t picCsp)
+bool PicYuv::create(x265_param* param, pixel *pixelbuf)
{
+ m_param = param;
+ uint32_t picWidth = m_param->sourceWidth;
+ uint32_t picHeight = m_param->sourceHeight;
+ uint32_t picCsp = m_param->internalCsp;
m_picWidth = picWidth;
m_picHeight = picHeight;
m_hChromaShift = CHROMA_H_SHIFT(picCsp);
m_vChromaShift = CHROMA_V_SHIFT(picCsp);
m_picCsp = picCsp;

- uint32_t numCuInWidth = (m_picWidth + g_maxCUSize - 1) / g_maxCUSize;
- uint32_t numCuInHeight = (m_picHeight + g_maxCUSize - 1) / g_maxCUSize;
+ uint32_t numCuInWidth = (m_picWidth + param->maxCUSize - 1) / param->maxCUSize;
+ uint32_t numCuInHeight = (m_picHeight + param->maxCUSize - 1) / param->maxCUSize;

- m_lumaMarginX = g_maxCUSize + 32; // search margin and 8-tap filter half-length, padded for 32-byte alignment
- m_lumaMarginY = g_maxCUSize + 16; // margin for 8-tap filter and infinite padding
- m_stride = (numCuInWidth * g_maxCUSize) + (m_lumaMarginX << 1);
+ m_lumaMarginX = param->maxCUSize + 32; // search margin and 8-tap filter half-length, padded for 32-byte alignment
+ m_lumaMarginY = param->maxCUSize + 16; // margin for 8-tap filter and infinite padding
+ m_stride = (numCuInWidth * param->maxCUSize) + (m_lumaMarginX << 1);

- int maxHeight = numCuInHeight * g_maxCUSize;
- CHECKED_MALLOC(m_picBuf[0], pixel, m_stride * (maxHeight + (m_lumaMarginY * 2)));
- m_picOrg[0] = m_picBuf[0] + m_lumaMarginY * m_stride + m_lumaMarginX;
+ int maxHeight = numCuInHeight * param->maxCUSize;
+ if (pixelbuf)
+ m_picOrg[0] = pixelbuf;
+ else
+ {
+ CHECKED_MALLOC(m_picBuf[0], pixel, m_stride * (maxHeight + (m_lumaMarginY * 2)));
+ m_picOrg[0] = m_picBuf[0] + m_lumaMarginY * m_stride + m_lumaMarginX;
+ }

if (picCsp != X265_CSP_I400)
{
m_chromaMarginX = m_lumaMarginX; // keep 16-byte alignment for chroma CTUs
m_chromaMarginY = m_lumaMarginY >> m_vChromaShift;
- m_strideC = ((numCuInWidth * g_maxCUSize) >> m_hChromaShift) + (m_chromaMarginX * 2);
+ m_strideC = ((numCuInWidth * m_param->maxCUSize) >> m_hChromaShift) + (m_chromaMarginX * 2);

CHECKED_MALLOC(m_picBuf[1], pixel, m_strideC * ((maxHeight >> m_vChromaShift) + (m_chromaMarginY * 2)));
CHECKED_MALLOC(m_picBuf[2], pixel, m_strideC * ((maxHeight >> m_vChromaShift) + (m_chromaMarginY * 2)));

return false;
}

+int PicYuv::getLumaBufLen(uint32_t picWidth, uint32_t picHeight, uint32_t picCsp)
+{
+ m_picWidth = picWidth;
+ m_picHeight = picHeight;
+ m_hChromaShift = CHROMA_H_SHIFT(picCsp);
+ m_vChromaShift = CHROMA_V_SHIFT(picCsp);
+ m_picCsp = picCsp;
+
+ uint32_t numCuInWidth = (m_picWidth + m_param->maxCUSize - 1) / m_param->maxCUSize;
+ uint32_t numCuInHeight = (m_picHeight + m_param->maxCUSize - 1) / m_param->maxCUSize;
+
+ m_lumaMarginX = m_param->maxCUSize + 32; // search margin and 8-tap filter half-length, padded for 32-byte alignment
+ m_lumaMarginY = m_param->maxCUSize + 16; // margin for 8-tap filter and infinite padding
+ m_stride = (numCuInWidth * m_param->maxCUSize) + (m_lumaMarginX << 1);
+
+ int maxHeight = numCuInHeight * m_param->maxCUSize;
+ int bufLen = (int)(m_stride * (maxHeight + (m_lumaMarginY * 2)));
+
+ return bufLen;
+}
+
/* the first picture allocated by the encoder will be asked to generate these
* offset arrays. Once generated, they will be provided to all future PicYuv
* allocated by the same encoder. */
bool PicYuv::createOffsets(const SPS& sps)
{
- uint32_t numPartitions = 1 << (g_unitSizeDepth * 2);
+ uint32_t numPartitions = 1 << (m_param->unitSizeDepth * 2);

if (m_picCsp != X265_CSP_I400)
{

{
for (uint32_t cuCol = 0; cuCol < sps.numCuInWidth; cuCol++)
{
- m_cuOffsetY[cuRow * sps.numCuInWidth + cuCol] = m_stride * cuRow * g_maxCUSize + cuCol * g_maxCUSize;
- m_cuOffsetC[cuRow * sps.numCuInWidth + cuCol] = m_strideC * cuRow * (g_maxCUSize >> m_vChromaShift) + cuCol * (g_maxCUSize >> m_hChromaShift);
+ m_cuOffsetY[cuRow * sps.numCuInWidth + cuCol] = m_stride * cuRow * m_param->maxCUSize + cuCol * m_param->maxCUSize;
+ m_cuOffsetC[cuRow * sps.numCuInWidth + cuCol] = m_strideC * cuRow * (m_param->maxCUSize >> m_vChromaShift) + cuCol * (m_param->maxCUSize >> m_hChromaShift);
}
}

CHECKED_MALLOC(m_cuOffsetY, intptr_t, sps.numCuInWidth * sps.numCuInHeight);
for (uint32_t cuRow = 0; cuRow < sps.numCuInHeight; cuRow++)
for (uint32_t cuCol = 0; cuCol < sps.numCuInWidth; cuCol++)
- m_cuOffsetY[cuRow * sps.numCuInWidth + cuCol] = m_stride * cuRow * g_maxCUSize + cuCol * g_maxCUSize;
+ m_cuOffsetY[cuRow * sps.numCuInWidth + cuCol] = m_stride * cuRow * m_param->maxCUSize + cuCol * m_param->maxCUSize;

CHECKED_MALLOC(m_buOffsetY, intptr_t, (size_t)numPartitions);
for (uint32_t idx = 0; idx < numPartitions; ++idx)

X265_CHECK(pic.bitDepth >= 8, "pic.bitDepth check failure");

+ uint64_t lumaSum;
+ uint64_t cbSum;
+ uint64_t crSum;
+ lumaSum = cbSum = crSum = 0;
+
if (pic.bitDepth == 8)
{
#if (X265_DEPTH > 8)

pixel *U = m_picOrg[1];
pixel *V = m_picOrg[2];

+ pixel *yPic = m_picOrg[0];
+ pixel *uPic = m_picOrg[1];
+ pixel *vPic = m_picOrg[2];
+
+ for (int r = 0; r < height; r++)
+ {
+ for (int c = 0; c < width; c++)
+ {
+ m_maxLumaLevel = X265_MAX(yPic[c], m_maxLumaLevel);
+ m_minLumaLevel = X265_MIN(yPic[c], m_minLumaLevel);
+ lumaSum += yPic[c];
+ }
+ yPic += m_stride;
+ }
+ m_avgLumaLevel = (double)lumaSum / (m_picHeight * m_picWidth);
+
+ if (param.csvLogLevel >= 2)
+ {
+ if (param.internalCsp != X265_CSP_I400)
+ {
+ for (int r = 0; r < height >> m_vChromaShift; r++)
+ {
+ for (int c = 0; c < width >> m_hChromaShift; c++)
+ {
+ m_maxChromaULevel = X265_MAX(uPic[c], m_maxChromaULevel);
+ m_minChromaULevel = X265_MIN(uPic[c], m_minChromaULevel);
+ cbSum += uPic[c];
+
+ m_maxChromaVLevel = X265_MAX(vPic[c], m_maxChromaVLevel);
+ m_minChromaVLevel = X265_MIN(vPic[c], m_minChromaVLevel);
+ crSum += vPic[c];
+ }
+
+ uPic += m_strideC;
+ vPic += m_strideC;
+ }
+ m_avgChromaULevel = (double)cbSum / ((height >> m_vChromaShift) * (width >> m_hChromaShift));
+ m_avgChromaVLevel = (double)crSum / ((height >> m_vChromaShift) * (width >> m_hChromaShift));
+ }
+ }
+
#if HIGH_BIT_DEPTH
bool calcHDRParams = !!param.minLuma || (param.maxLuma != PIXEL_MAX);
/* Apply min/max luma bounds for HDR pixel manipulations */
Changed
30
1
2
uint32_t m_chromaMarginX;
3
uint32_t m_chromaMarginY;
4
5
- pixel m_maxLumaLevel;
6
- double m_avgLumaLevel;
7
+ pixel m_maxLumaLevel;
8
+ pixel m_minLumaLevel;
9
+ double m_avgLumaLevel;
10
+
11
+ pixel m_maxChromaULevel;
12
+ pixel m_minChromaULevel;
13
+ double m_avgChromaULevel;
14
+
15
+ pixel m_maxChromaVLevel;
16
+ pixel m_minChromaVLevel;
17
+ double m_avgChromaVLevel;
18
+ x265_param *m_param;
19
20
PicYuv();
21
22
- bool create(uint32_t picWidth, uint32_t picHeight, uint32_t csp);
23
+ bool create(x265_param* param, pixel *pixelbuf = NULL);
24
bool createOffsets(const SPS& sps);
25
void destroy();
26
+ int getLumaBufLen(uint32_t picWidth, uint32_t picHeight, uint32_t picCsp);
27
28
void copyFromPicture(const x265_picture&, const x265_param& param, int padx, int pady);
29
30
x265_2.4.tar.gz/source/common/primitives.cpp -> x265_2.5.tar.gz/source/common/primitives.cpp
Changed
void setupIntraPrimitives_c(EncoderPrimitives &p);
void setupLoopFilterPrimitives_c(EncoderPrimitives &p);
void setupSaoPrimitives_c(EncoderPrimitives &p);
+void setupSeaIntegralPrimitives_c(EncoderPrimitives &p);

void setupCPrimitives(EncoderPrimitives &p)
{

setupIntraPrimitives_c(p); // intrapred.cpp
setupLoopFilterPrimitives_c(p); // loopfilter.cpp
setupSaoPrimitives_c(p); // sao.cpp
+ setupSeaIntegralPrimitives_c(p); // framefilter.cpp
}

void setupAliasPrimitives(EncoderPrimitives &p)
x265_2.4.tar.gz/source/common/primitives.h -> x265_2.5.tar.gz/source/common/primitives.h
Changed
BLOCK_422_32x64
};

+enum IntegralSize
+{
+ INTEGRAL_4,
+ INTEGRAL_8,
+ INTEGRAL_12,
+ INTEGRAL_16,
+ INTEGRAL_24,
+ INTEGRAL_32,
+ NUM_INTEGRAL_SIZE
+};
+
typedef int (*pixelcmp_t)(const pixel* fenc, intptr_t fencstride, const pixel* fref, intptr_t frefstride); // fenc is aligned
typedef int (*pixelcmp_ss_t)(const int16_t* fenc, intptr_t fencstride, const int16_t* fref, intptr_t frefstride);
typedef sse_t (*pixel_sse_t)(const pixel* fenc, intptr_t fencstride, const pixel* fref, intptr_t frefstride); // fenc is aligned

typedef void (*pelFilterLumaStrong_t)(pixel* src, intptr_t srcStep, intptr_t offset, int32_t tcP, int32_t tcQ);
typedef void (*pelFilterChroma_t)(pixel* src, intptr_t srcStep, intptr_t offset, int32_t tc, int32_t maskP, int32_t maskQ);

+typedef void (*integralv_t)(uint32_t *sum, intptr_t stride);
+typedef void (*integralh_t)(uint32_t *sum, pixel *pix, intptr_t stride);
+
/* Function pointers to optimized encoder primitives. Each pointer can reference
* either an assembly routine, a SIMD intrinsic primitive, or a C function */
struct EncoderPrimitives

pelFilterLumaStrong_t pelFilterLumaStrong[2]; // EDGE_VER = 0, EDGE_HOR = 1
pelFilterChroma_t pelFilterChroma[2]; // EDGE_VER = 0, EDGE_HOR = 1

+ integralv_t integral_initv[NUM_INTEGRAL_SIZE];
+ integralh_t integral_inith[NUM_INTEGRAL_SIZE];
+
/* There is one set of chroma primitives per color space. An encoder will
* have just a single color space and thus it will only ever use one entry
* in this array. However we always fill all entries in the array in case
x265_2.4.tar.gz/source/common/slice.cpp -> x265_2.5.tar.gz/source/common/slice.cpp
Changed
30
1
2
uint32_t Slice::realEndAddress(uint32_t endCUAddr) const
3
{
4
// Calculate end address
5
- uint32_t internalAddress = (endCUAddr - 1) % NUM_4x4_PARTITIONS;
6
- uint32_t externalAddress = (endCUAddr - 1) / NUM_4x4_PARTITIONS;
7
- uint32_t xmax = m_sps->picWidthInLumaSamples - (externalAddress % m_sps->numCuInWidth) * g_maxCUSize;
8
- uint32_t ymax = m_sps->picHeightInLumaSamples - (externalAddress / m_sps->numCuInWidth) * g_maxCUSize;
9
+ uint32_t internalAddress = (endCUAddr - 1) % m_param->num4x4Partitions;
10
+ uint32_t externalAddress = (endCUAddr - 1) / m_param->num4x4Partitions;
11
+ uint32_t xmax = m_sps->picWidthInLumaSamples - (externalAddress % m_sps->numCuInWidth) * m_param->maxCUSize;
12
+ uint32_t ymax = m_sps->picHeightInLumaSamples - (externalAddress / m_sps->numCuInWidth) * m_param->maxCUSize;
13
14
while (g_zscanToPelX[internalAddress] >= xmax || g_zscanToPelY[internalAddress] >= ymax)
15
internalAddress--;
16
17
internalAddress++;
18
- if (internalAddress == NUM_4x4_PARTITIONS)
19
+ if (internalAddress == m_param->num4x4Partitions)
20
{
21
internalAddress = 0;
22
externalAddress++;
23
}
24
25
- return externalAddress * NUM_4x4_PARTITIONS + internalAddress;
26
+ return externalAddress * m_param->num4x4Partitions + internalAddress;
27
}
28
29
30
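The hunk above swaps the compile-time `NUM_4x4_PARTITIONS` for the per-encoder `m_param->num4x4Partitions`, but the arithmetic itself is unchanged: the end address is split into a CTU index (external) and a 4x4 partition index inside that CTU (internal), then recombined. A small sketch of that split, assuming a 64x64 CTU (256 partitions). `CuAddress`, `splitAddress`, and `joinAddress` are hypothetical helpers for illustration, and the z-scan boundary walk of the real function is left out:

```cpp
#include <cstdint>
#include <cassert>

// Hypothetical pair: CTU index plus 4x4 partition index inside that CTU.
struct CuAddress { uint32_t external; uint32_t internal; };

static CuAddress splitAddress(uint32_t endCUAddr, uint32_t num4x4Partitions)
{
    // The diff subtracts 1 before splitting, so endCUAddr is treated here
    // as one past the last partition.
    return { (endCUAddr - 1) / num4x4Partitions,
             (endCUAddr - 1) % num4x4Partitions };
}

static uint32_t joinAddress(CuAddress a, uint32_t num4x4Partitions)
{
    a.internal++;                          // back to one-past-the-end
    if (a.internal == num4x4Partitions)    // wrapped into the next CTU
    {
        a.internal = 0;
        a.external++;
    }
    return a.external * num4x4Partitions + a.internal;
}
```

Because the partition count is now a parameter, two encoder instances with different CTU sizes can run the same code with different divisors.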
x265_2.4.tar.gz/source/common/slice.h -> x265_2.5.tar.gz/source/common/slice.h
Changed
int m_iPPSQpMinus26;
int numRefIdxDefault[2];
int m_iNumRPSInSPS;
+ const x265_param *m_param;

Slice()
{
x265_2.4.tar.gz/source/common/threadpool.cpp -> x265_2.5.tar.gz/source/common/threadpool.cpp
Changed
73
1
2
int cpusPerNode[MAX_NODE_NUM + 1];
3
int threadsPerPool[MAX_NODE_NUM + 2];
4
uint64_t nodeMaskPerPool[MAX_NODE_NUM + 2];
5
+ int totalNumThreads = 0;
6
7
memset(cpusPerNode, 0, sizeof(cpusPerNode));
8
memset(threadsPerPool, 0, sizeof(threadsPerPool));
9
10
if (bNumaSupport)
11
x265_log(p, X265_LOG_DEBUG, "NUMA node %d may use %d logical cores\n", i, cpusPerNode[i]);
12
if (threadsPerPool[i])
13
+ {
14
numPools += (threadsPerPool[i] + MAX_POOL_THREADS - 1) / MAX_POOL_THREADS;
15
+ totalNumThreads += threadsPerPool[i];
16
+ }
17
}
18
+ if (!isThreadsReserved)
19
+ {
20
+ if (!numPools)
21
+ {
22
+ x265_log(p, X265_LOG_DEBUG, "No pool thread available. Deciding frame-threads based on detected CPU threads\n");
23
+ totalNumThreads = ThreadPool::getCpuCount(); // auto-detect frame threads
24
+ }
25
26
+ if (!p->frameNumThreads)
27
+ ThreadPool::getFrameThreadsCount(p, totalNumThreads);
28
+ }
29
+
30
if (!numPools)
31
return NULL;
32
33
34
node++;
35
int numThreads = X265_MIN(MAX_POOL_THREADS, threadsPerPool[node]);
36
int origNumThreads = numThreads;
37
- if (p->lookaheadThreads > numThreads / 2)
38
+ if (i == 0 && p->lookaheadThreads > numThreads / 2)
39
{
40
p->lookaheadThreads = numThreads / 2;
41
x265_log(p, X265_LOG_DEBUG, "Setting lookahead threads to a maximum of half the total number of threads\n");
42
43
maxProviders = 1;
44
}
45
46
- else
47
+ else if (i == 0)
48
numThreads -= p->lookaheadThreads;
49
if (!pools[i].create(numThreads, maxProviders, nodeMaskPerPool[node]))
50
{
51
52
#endif
53
}
54
55
+void ThreadPool::getFrameThreadsCount(x265_param* p, int cpuCount)
56
+{
57
+ int rows = (p->sourceHeight + p->maxCUSize - 1) >> g_log2Size[p->maxCUSize];
58
+ if (!p->bEnableWavefront)
59
+ p->frameNumThreads = X265_MIN3(cpuCount, (rows + 1) / 2, X265_MAX_FRAME_THREADS);
60
+ else if (cpuCount >= 32)
61
+ p->frameNumThreads = (p->sourceHeight > 2000) ? 6 : 5;
62
+ else if (cpuCount >= 16)
63
+ p->frameNumThreads = 4;
64
+ else if (cpuCount >= 8)
65
+ p->frameNumThreads = 3;
66
+ else if (cpuCount >= 4)
67
+ p->frameNumThreads = 2;
68
+ else
69
+ p->frameNumThreads = 1;
70
+}
71
+
72
} // end namespace X265_NS
73
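The new `getFrameThreadsCount()` maps a thread count to a frame-thread count. The same mapping can be sketched as a pure function, which makes the thresholds easy to check. This is a standalone approximation: `frameThreadsFor` is a hypothetical name, the cap of 16 is an assumed stand-in for `X265_MAX_FRAME_THREADS`, and the table lookup `g_log2Size[maxCUSize]` is replaced by a division that is equivalent for power-of-two CTU sizes:

```cpp
#include <algorithm>
#include <cassert>

// Assumption: stand-in for X265_MAX_FRAME_THREADS in this sketch.
static const int kMaxFrameThreads = 16;

static int frameThreadsFor(int cpuCount, int sourceHeight, int maxCUSize, bool wavefront)
{
    // Number of CTU rows, rounded up.
    int rows = (sourceHeight + maxCUSize - 1) / maxCUSize;

    if (!wavefront)
        return std::min({cpuCount, (rows + 1) / 2, kMaxFrameThreads});
    if (cpuCount >= 32)
        return sourceHeight > 2000 ? 6 : 5;
    if (cpuCount >= 16)
        return 4;
    if (cpuCount >= 8)
        return 3;
    if (cpuCount >= 4)
        return 2;
    return 1;
}
```

In the diff the `cpuCount` argument is the sum of the pool thread counts, falling back to the detected CPU count only when no pool is available.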
x265_2.4.tar.gz/source/common/threadpool.h -> x265_2.5.tar.gz/source/common/threadpool.h
Changed
static ThreadPool* allocThreadPools(x265_param* p, int& numPools, bool isThreadsReserved);
static int getCpuCount();
static int getNumaNodeCount();
+ static void getFrameThreadsCount(x265_param* p,int cpuCount);
};

/* Any worker thread may enlist the help of idle worker threads from the same
x265_2.4.tar.gz/source/common/x86/asm-primitives.cpp -> x265_2.5.tar.gz/source/common/x86/asm-primitives.cpp
Changed
47
1
2
#include "blockcopy8.h"
3
#include "intrapred.h"
4
#include "dct8.h"
5
+#include "seaintegral.h"
6
}
7
8
#define ALL_LUMA_CU_TYPED(prim, fncdef, fname, cpu) \
9
10
p.fix8Unpack = PFX(cutree_fix8_unpack_avx2);
11
p.fix8Pack = PFX(cutree_fix8_pack_avx2);
12
13
+ p.integral_initv[INTEGRAL_4] = PFX(integral4v_avx2);
14
+ p.integral_initv[INTEGRAL_8] = PFX(integral8v_avx2);
15
+ p.integral_initv[INTEGRAL_12] = PFX(integral12v_avx2);
16
+ p.integral_initv[INTEGRAL_16] = PFX(integral16v_avx2);
17
+ p.integral_initv[INTEGRAL_24] = PFX(integral24v_avx2);
18
+ p.integral_initv[INTEGRAL_32] = PFX(integral32v_avx2);
19
+ p.integral_inith[INTEGRAL_4] = PFX(integral4h_avx2);
20
+ p.integral_inith[INTEGRAL_8] = PFX(integral8h_avx2);
21
+ p.integral_inith[INTEGRAL_12] = PFX(integral12h_avx2);
22
+ p.integral_inith[INTEGRAL_16] = PFX(integral16h_avx2);
23
+
24
/* TODO: This kernel needs to be modified to work with HIGH_BIT_DEPTH only
25
p.planeClipAndMax = PFX(planeClipAndMax_avx2); */
26
27
28
p.fix8Unpack = PFX(cutree_fix8_unpack_avx2);
29
p.fix8Pack = PFX(cutree_fix8_pack_avx2);
30
31
+ p.integral_initv[INTEGRAL_4] = PFX(integral4v_avx2);
32
+ p.integral_initv[INTEGRAL_8] = PFX(integral8v_avx2);
33
+ p.integral_initv[INTEGRAL_12] = PFX(integral12v_avx2);
34
+ p.integral_initv[INTEGRAL_16] = PFX(integral16v_avx2);
35
+ p.integral_initv[INTEGRAL_24] = PFX(integral24v_avx2);
36
+ p.integral_initv[INTEGRAL_32] = PFX(integral32v_avx2);
37
+ p.integral_inith[INTEGRAL_4] = PFX(integral4h_avx2);
38
+ p.integral_inith[INTEGRAL_8] = PFX(integral8h_avx2);
39
+ p.integral_inith[INTEGRAL_12] = PFX(integral12h_avx2);
40
+ p.integral_inith[INTEGRAL_16] = PFX(integral16h_avx2);
41
+ p.integral_inith[INTEGRAL_24] = PFX(integral24h_avx2);
42
+ p.integral_inith[INTEGRAL_32] = PFX(integral32h_avx2);
43
+
44
}
45
#endif
46
}
47
x265_2.4.tar.gz/source/common/x86/loopfilter.asm -> x265_2.5.tar.gz/source/common/x86/loopfilter.asm
Changed
pshufb m1, m4, m0
pcmpgtb m0, [pb_15] ; m0 = [mask]

- pblendvb m6, m6, m1, m0 ; NOTE: don't use 3 parameters style, x264 macro have some bug!
+ pblendvb m6, m1, m0

pmovsxbw m0, m6 ; offset
punpckhbw m6, m6

pshufb m6, m3, m1
pshufb m5, m4, m1

- pblendvb m6, m6, m5, m0 ; NOTE: don't use 3 parameters style, x264 macro have some bug!
+ pblendvb m6, m5, m0

pmovzxbw m1, m2 ; rec
punpckhbw m2, m7

sub r3, r4
movu xmm0, [r3]
movu m3, [r0]
- pblendvb m5, m5, m3, xmm0
+ pblendvb m5, m3, xmm0
movu [r0], m5

.end:
x265_2.4.tar.gz/source/common/x86/pixel-a.asm -> x265_2.5.tar.gz/source/common/x86/pixel-a.asm
Changed
; clobber: m3..m7
; out: %1 = satd
%macro SATD_4x4_MMX 3
- %xdefine %%n n%1
+ %xdefine %%n nn%1
%assign offset %2*SIZEOF_PIXEL
LOAD_DIFF m4, m3, none, [r0+ offset], [r2+ offset]
LOAD_DIFF m5, m3, none, [r0+ r1+offset], [r2+ r3+offset]
x265_2.4.tar.gz/source/common/x86/pixel-util8.asm -> x265_2.5.tar.gz/source/common/x86/pixel-util8.asm
Changed
.widthLess8:
movu m6, [r1]
- pblendvb m6, m6, m7, m0
+ pblendvb m6, m7, m0
movu [r1], m6

.nextH:
x265_2.5.tar.gz/source/common/x86/seaintegral.asm
Added
201
1
2
+;*****************************************************************************
3
+;* Copyright (C) 2013-2017 MulticoreWare, Inc
4
+;*
5
+;* Authors: Jayashri Murugan <jayashri@multicorewareinc.com>
6
+;* Vignesh V Menon <vignesh@multicorewareinc.com>
7
+;* Praveen Tiwari <praveen@multicorewareinc.com>
8
+;*
9
+;* This program is free software; you can redistribute it and/or modify
10
+;* it under the terms of the GNU General Public License as published by
11
+;* the Free Software Foundation; either version 2 of the License, or
12
+;* (at your option) any later version.
13
+;*
14
+;* This program is distributed in the hope that it will be useful,
15
+;* but WITHOUT ANY WARRANTY; without even the implied warranty of
16
+;* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
17
+;* GNU General Public License for more details.
18
+;*
19
+;* You should have received a copy of the GNU General Public License
20
+;* along with this program; if not, write to the Free Software
21
+;* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02111, USA.
22
+;*
23
+;* This program is also available under a commercial proprietary license.
24
+;* For more information, contact us at license @ x265.com.
25
+;*****************************************************************************/
26
+
27
+%include "x86inc.asm"
28
+%include "x86util.asm"
29
+
30
+SECTION .text
31
+
32
+;-----------------------------------------------------------------------------
33
+;void integral_init4v_c(uint32_t *sum4, intptr_t stride)
34
+;-----------------------------------------------------------------------------
35
+INIT_YMM avx2
36
+cglobal integral4v, 2, 3, 2
37
+ mov r2, r1
38
+ shl r2, 4
39
+
40
+.loop
41
+ movu m0, [r0]
42
+ movu m1, [r0 + r2]
43
+ psubd m1, m0
44
+ movu [r0], m1
45
+ add r0, 32
46
+ sub r1, 8
47
+ jnz .loop
48
+ RET
49
+
50
+;-----------------------------------------------------------------------------
51
+;void integral_init8v_c(uint32_t *sum8, intptr_t stride)
52
+;-----------------------------------------------------------------------------
53
+INIT_YMM avx2
54
+cglobal integral8v, 2, 3, 2
55
+ mov r2, r1
56
+ shl r2, 5
57
+
58
+.loop
59
+ movu m0, [r0]
60
+ movu m1, [r0 + r2]
61
+ psubd m1, m0
62
+ movu [r0], m1
63
+ add r0, 32
64
+ sub r1, 8
65
+ jnz .loop
66
+ RET
67
+
68
+;-----------------------------------------------------------------------------
69
+;void integral_init12v_c(uint32_t *sum12, intptr_t stride)
70
+;-----------------------------------------------------------------------------
71
+INIT_YMM avx2
72
+cglobal integral12v, 2, 4, 2
73
+ mov r2, r1
74
+ mov r3, r1
75
+ shl r2, 5
76
+ shl r3, 4
77
+ add r2, r3
78
+
79
+.loop
80
+ movu m0, [r0]
81
+ movu m1, [r0 + r2]
82
+ psubd m1, m0
83
+ movu [r0], m1
84
+ add r0, 32
85
+ sub r1, 8
86
+ jnz .loop
87
+ RET
88
+
89
+;-----------------------------------------------------------------------------
90
+;void integral_init16v_c(uint32_t *sum16, intptr_t stride)
91
+;-----------------------------------------------------------------------------
92
+INIT_YMM avx2
93
+cglobal integral16v, 2, 3, 2
94
+ mov r2, r1
95
+ shl r2, 6
96
+
97
+.loop
98
+ movu m0, [r0]
99
+ movu m1, [r0 + r2]
100
+ psubd m1, m0
101
+ movu [r0], m1
102
+ add r0, 32
103
+ sub r1, 8
104
+ jnz .loop
105
+ RET
106
+
107
+;-----------------------------------------------------------------------------
108
+;void integral_init24v_c(uint32_t *sum24, intptr_t stride)
109
+;-----------------------------------------------------------------------------
110
+INIT_YMM avx2
111
+cglobal integral24v, 2, 4, 2
112
+ mov r2, r1
113
+ mov r3, r1
114
+ shl r2, 6
115
+ shl r3, 5
116
+ add r2, r3
117
+
118
+.loop
119
+ movu m0, [r0]
120
+ movu m1, [r0 + r2]
121
+ psubd m1, m0
122
+ movu [r0], m1
123
+ add r0, 32
124
+ sub r1, 8
125
+ jnz .loop
126
+ RET
127
+
128
+;-----------------------------------------------------------------------------
129
+;void integral_init32v_c(uint32_t *sum32, intptr_t stride)
130
+;-----------------------------------------------------------------------------
131
+INIT_YMM avx2
132
+cglobal integral32v, 2, 3, 2
133
+ mov r2, r1
134
+ shl r2, 7
135
+
136
+.loop
137
+ movu m0, [r0]
138
+ movu m1, [r0 + r2]
139
+ psubd m1, m0
140
+ movu [r0], m1
141
+ add r0, 32
142
+ sub r1, 8
143
+ jnz .loop
144
+ RET
145
+
146
+%macro INTEGRAL_FOUR_HORIZONTAL_16 0
147
+ pmovzxbw m0, [r1]
148
+ pmovzxbw m1, [r1 + 1]
149
+ paddw m0, m1
150
+ pmovzxbw m1, [r1 + 2]
151
+ paddw m0, m1
152
+ pmovzxbw m1, [r1 + 3]
153
+ paddw m0, m1
154
+%endmacro
155
+
156
+%macro INTEGRAL_FOUR_HORIZONTAL_4 0
157
+ movd xm0, [r1]
158
+ movd xm1, [r1 + 1]
159
+ pmovzxbw xm0, xm0
160
+ pmovzxbw xm1, xm1
161
+ paddw xm0, xm1
162
+ movd xm1, [r1 + 2]
163
+ pmovzxbw xm1, xm1
164
+ paddw xm0, xm1
165
+ movd xm1, [r1 + 3]
166
+ pmovzxbw xm1, xm1
167
+ paddw xm0, xm1
168
+%endmacro
169
+
170
+%macro INTEGRAL_FOUR_HORIZONTAL_8_HBD 0
171
+ pmovzxwd m0, [r1]
172
+ pmovzxwd m1, [r1 + 2]
173
+ paddd m0, m1
174
+ pmovzxwd m1, [r1 + 4]
175
+ paddd m0, m1
176
+ pmovzxwd m1, [r1 + 6]
177
+ paddd m0, m1
178
+%endmacro
179
+
180
+%macro INTEGRAL_FOUR_HORIZONTAL_4_HBD 0
181
+ pmovzxwd xm0, [r1]
182
+ pmovzxwd xm1, [r1 + 2]
183
+ paddd xm0, xm1
184
+ pmovzxwd xm1, [r1 + 4]
185
+ paddd xm0, xm1
186
+ pmovzxwd xm1, [r1 + 6]
187
+ paddd xm0, xm1
188
+%endmacro
189
+
190
+;-----------------------------------------------------------------------------
191
+;static void integral_init4h(uint32_t *sum, pixel *pix, intptr_t stride)
192
+;-----------------------------------------------------------------------------
193
+INIT_YMM avx2
194
+%if HIGH_BIT_DEPTH
195
+cglobal integral4h, 3, 5, 3
196
+ lea r3, [4 * r2]
197
+ sub r0, r3
198
+ sub r2, 4 ;stride - 4
199
+ mov r4, r2
200
+ shr r4, 3
201
x265_2.5.tar.gz/source/common/x86/seaintegral.h
Added
44
1
2
+/*****************************************************************************
3
+* Copyright (C) 2013-2017 MulticoreWare, Inc
4
+*
5
+* Authors: Vignesh V Menon <vignesh@multicorewareinc.com>
6
+* Jayashri Murugan <jayashri@multicorewareinc.com>
7
+* Praveen Tiwari <praveen@multicorewareinc.com>
8
+*
9
+* This program is free software; you can redistribute it and/or modify
10
+* it under the terms of the GNU General Public License as published by
11
+* the Free Software Foundation; either version 2 of the License, or
12
+* (at your option) any later version.
13
+*
14
+* This program is distributed in the hope that it will be useful,
15
+* but WITHOUT ANY WARRANTY; without even the implied warranty of
16
+* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
17
+* GNU General Public License for more details.
18
+*
19
+* You should have received a copy of the GNU General Public License
20
+* along with this program; if not, write to the Free Software
21
+* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02111, USA.
22
+*
23
+* This program is also available under a commercial proprietary license.
24
+* For more information, contact us at license @ x265.com.
25
+*****************************************************************************/
26
+
27
+#ifndef X265_SEAINTEGRAL_H
28
+#define X265_SEAINTEGRAL_H
29
+
30
+void PFX(integral4v_avx2)(uint32_t *sum, intptr_t stride);
31
+void PFX(integral8v_avx2)(uint32_t *sum, intptr_t stride);
32
+void PFX(integral12v_avx2)(uint32_t *sum, intptr_t stride);
33
+void PFX(integral16v_avx2)(uint32_t *sum, intptr_t stride);
34
+void PFX(integral24v_avx2)(uint32_t *sum, intptr_t stride);
35
+void PFX(integral32v_avx2)(uint32_t *sum, intptr_t stride);
36
+void PFX(integral4h_avx2)(uint32_t *sum, pixel *pix, intptr_t stride);
37
+void PFX(integral8h_avx2)(uint32_t *sum, pixel *pix, intptr_t stride);
38
+void PFX(integral12h_avx2)(uint32_t *sum, pixel *pix, intptr_t stride);
39
+void PFX(integral16h_avx2)(uint32_t *sum, pixel *pix, intptr_t stride);
40
+void PFX(integral24h_avx2)(uint32_t *sum, pixel *pix, intptr_t stride);
41
+void PFX(integral32h_avx2)(uint32_t *sum, pixel *pix, intptr_t stride);
42
+
43
+#endif //X265_SEAINTEGRAL_H
44
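The six `integralNv` kernels in seaintegral.asm differ only in the byte offset they compute (`shl r2, 4` is 4 rows of `uint32_t`, `shl r2, 5` is 8 rows, and so on). Each loop iteration loads eight `uint32_t` values, subtracts them from the row N strides below, and stores the difference back. A scalar model of that visible behaviour, as a sketch: `integralVerticalRef` and `demoWindowSum` are hypothetical names, and the stride is assumed to be a multiple of 8 because the AVX2 loop steps by 8 and exits on zero:

```cpp
#include <cstdint>
#include <vector>
#include <cassert>

// Scalar model of the vertical pass: sum[x] = sum[x + rows * stride] - sum[x].
static void integralVerticalRef(uint32_t* sum, intptr_t stride, int rows)
{
    for (intptr_t x = 0; x < stride; x++)
        sum[x] = sum[x + rows * stride] - sum[x];
}

// Hypothetical demo: fill a buffer with running column totals of an all-ones
// image, apply the pass to the first row, and return the result in column 0.
static uint32_t demoWindowSum(int rows)
{
    const intptr_t stride = 8; // multiple of 8, matching the AVX2 loop step
    std::vector<uint32_t> sum((rows + 1) * stride);
    for (int y = 0; y <= rows; y++)
        for (intptr_t x = 0; x < stride; x++)
            sum[y * stride + x] = (uint32_t)y; // count of ones above row y
    integralVerticalRef(sum.data(), stride, rows);
    return sum[0];
}
```

With running totals as input, the difference of two rows that are N apart is the sum over an N-row window, which is why one subtraction per element is enough.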
x265_2.4.tar.gz/source/common/x86/x86inc.asm -> x265_2.5.tar.gz/source/common/x86/x86inc.asm
Changed
201
1
2
SECTION .rodata align=%1
3
%endmacro
4
5
-%macro SECTION_TEXT 0-1 16
6
- SECTION .text align=%1
7
-%endmacro
8
-
9
%if WIN64
10
%define PIC
11
%elif ARCH_X86_64 == 0
12
13
%define r%1w %2w
14
%define r%1b %2b
15
%define r%1h %2h
16
+ %define %2q %2
17
%if %0 == 2
18
%define r%1m %2d
19
%define r%1mp %2
20
21
%define e%1h %3
22
%define r%1b %2
23
%define e%1b %2
24
-%if ARCH_X86_64 == 0
25
- %define r%1 e%1
26
-%endif
27
+ %if ARCH_X86_64 == 0
28
+ %define r%1 e%1
29
+ %endif
30
%endmacro
31
32
DECLARE_REG_SIZE ax, al, ah
33
34
35
%macro ASSERT 1
36
%if (%1) == 0
37
- %error assert failed
38
+ %error assertion ``%1'' failed
39
%endif
40
%endmacro
41
42
43
%ifnum %1
44
%if %1 != 0 && required_stack_alignment > STACK_ALIGNMENT
45
%if %1 > 0
46
+ ; Reserve an additional register for storing the original stack pointer, but avoid using
47
+ ; eax/rax for this purpose since it can potentially get overwritten as a return value.
48
%assign regs_used (regs_used + 1)
49
- %elif ARCH_X86_64 && regs_used == num_args && num_args <= 4 + UNIX64 * 2
50
- %warning "Stack pointer will overwrite register argument"
51
+ %if ARCH_X86_64 && regs_used == 7
52
+ %assign regs_used 8
53
+ %elif ARCH_X86_64 == 0 && regs_used == 1
54
+ %assign regs_used 2
55
+ %endif
56
+ %endif
57
+ %if ARCH_X86_64 && regs_used < 5 + UNIX64 * 3
58
+ ; Ensure that we don't clobber any registers containing arguments. For UNIX64 we also preserve r6 (rax)
59
+ ; since it's used as a hidden argument in vararg functions to specify the number of vector registers used.
60
+ %assign regs_used 5 + UNIX64 * 3
61
%endif
62
%endif
63
%endif
64
65
DECLARE_REG 8, rsi, 72
66
DECLARE_REG 9, rbx, 80
67
DECLARE_REG 10, rbp, 88
68
-DECLARE_REG 11, R12, 96
69
-DECLARE_REG 12, R13, 104
70
-DECLARE_REG 13, R14, 112
71
-DECLARE_REG 14, R15, 120
72
+DECLARE_REG 11, R14, 96
73
+DECLARE_REG 12, R15, 104
74
+DECLARE_REG 13, R12, 112
75
+DECLARE_REG 14, R13, 120
76
77
%macro PROLOGUE 2-5+ 0 ; #args, #regs, #xmm_regs, [stack_size,] arg_names...
78
%assign num_args %1
79
80
WIN64_PUSH_XMM
81
%endmacro
82
83
-%macro WIN64_RESTORE_XMM_INTERNAL 1
84
+%macro WIN64_RESTORE_XMM_INTERNAL 0
85
%assign %%pad_size 0
86
%if xmm_regs_used > 8
87
%assign %%i xmm_regs_used
88
%rep xmm_regs_used-8
89
%assign %%i %%i-1
90
- movaps xmm %+ %%i, [%1 + (%%i-8)*16 + stack_size + 32]
91
+ movaps xmm %+ %%i, [rsp + (%%i-8)*16 + stack_size + 32]
92
%endrep
93
%endif
94
%if stack_size_padded > 0
95
%if stack_size > 0 && required_stack_alignment > STACK_ALIGNMENT
96
mov rsp, rstkm
97
%else
98
- add %1, stack_size_padded
99
+ add rsp, stack_size_padded
100
%assign %%pad_size stack_size_padded
101
%endif
102
%endif
103
%if xmm_regs_used > 7
104
- movaps xmm7, [%1 + stack_offset - %%pad_size + 24]
105
+ movaps xmm7, [rsp + stack_offset - %%pad_size + 24]
106
%endif
107
%if xmm_regs_used > 6
108
- movaps xmm6, [%1 + stack_offset - %%pad_size + 8]
109
+ movaps xmm6, [rsp + stack_offset - %%pad_size + 8]
110
%endif
111
%endmacro
112
113
-%macro WIN64_RESTORE_XMM 1
114
- WIN64_RESTORE_XMM_INTERNAL %1
115
+%macro WIN64_RESTORE_XMM 0
116
+ WIN64_RESTORE_XMM_INTERNAL
117
%assign stack_offset (stack_offset-stack_size_padded)
118
+ %assign stack_size_padded 0
119
%assign xmm_regs_used 0
120
%endmacro
121
122
%define has_epilogue regs_used > 7 || xmm_regs_used > 6 || mmsize == 32 || stack_size > 0
123
124
%macro RET 0
125
- WIN64_RESTORE_XMM_INTERNAL rsp
126
+ WIN64_RESTORE_XMM_INTERNAL
127
POP_IF_USED 14, 13, 12, 11, 10, 9, 8, 7
128
-%if mmsize == 32
129
- vzeroupper
130
-%endif
131
+ %if mmsize == 32
132
+ vzeroupper
133
+ %endif
134
AUTO_REP_RET
135
%endmacro
136
137
138
DECLARE_REG 8, R11, 24
139
DECLARE_REG 9, rbx, 32
140
DECLARE_REG 10, rbp, 40
141
-DECLARE_REG 11, R12, 48
142
-DECLARE_REG 12, R13, 56
143
-DECLARE_REG 13, R14, 64
144
-DECLARE_REG 14, R15, 72
145
+DECLARE_REG 11, R14, 48
146
+DECLARE_REG 12, R15, 56
147
+DECLARE_REG 13, R12, 64
148
+DECLARE_REG 14, R13, 72
149
150
%macro PROLOGUE 2-5+ ; #args, #regs, #xmm_regs, [stack_size,] arg_names...
151
%assign num_args %1
152
153
%define has_epilogue regs_used > 9 || mmsize == 32 || stack_size > 0
154
155
%macro RET 0
156
-%if stack_size_padded > 0
157
-%if required_stack_alignment > STACK_ALIGNMENT
158
- mov rsp, rstkm
159
-%else
160
- add rsp, stack_size_padded
161
-%endif
162
-%endif
163
+ %if stack_size_padded > 0
164
+ %if required_stack_alignment > STACK_ALIGNMENT
165
+ mov rsp, rstkm
166
+ %else
167
+ add rsp, stack_size_padded
168
+ %endif
169
+ %endif
170
POP_IF_USED 14, 13, 12, 11, 10, 9
171
-%if mmsize == 32
172
- vzeroupper
173
-%endif
174
+ %if mmsize == 32
175
+ vzeroupper
176
+ %endif
177
AUTO_REP_RET
178
%endmacro
179
180
181
%define has_epilogue regs_used > 3 || mmsize == 32 || stack_size > 0
182
183
%macro RET 0
184
-%if stack_size_padded > 0
185
-%if required_stack_alignment > STACK_ALIGNMENT
186
- mov rsp, rstkm
187
-%else
188
- add rsp, stack_size_padded
189
-%endif
190
-%endif
191
+ %if stack_size_padded > 0
192
+ %if required_stack_alignment > STACK_ALIGNMENT
193
+ mov rsp, rstkm
194
+ %else
195
+ add rsp, stack_size_padded
196
+ %endif
197
+ %endif
198
POP_IF_USED 6, 5, 4, 3
199
-%if mmsize == 32
200
- vzeroupper
201
x265_2.4.tar.gz/source/dynamicHDR10/BasicStructures.h -> x265_2.5.tar.gz/source/dynamicHDR10/BasicStructures.h
Changed
32
1
2
float maxRLuminance = 0.0;
3
float maxGLuminance = 0.0;
4
float maxBLuminance = 0.0;
5
- int order;
6
+ int order = 0;
7
std::vector<unsigned int> percentiles;
8
};
9
10
struct BezierCurveData
11
{
12
- int order;
13
- int sPx;
14
- int sPy;
15
+ int order = 0;
16
+ int sPx = 0;
17
+ int sPy = 0;
18
std::vector<int> coeff;
19
};
20
21
+struct PercentileLuminance{
22
+
23
+ float averageLuminance = 0.0;
24
+ float maxRLuminance = 0.0;
25
+ float maxGLuminance = 0.0;
26
+ float maxBLuminance = 0.0;
27
+ int order = 0;
28
+ std::vector<unsigned int> percentiles;
29
+};
30
+
31
#endif // BASICSTRUCTURES_H
32
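The `= 0` additions to `order`, `sPx`, and `sPy` are C++11 default member initializers. Without them, a default-constructed struct leaves those `int` fields with indeterminate values. A minimal sketch of the effect, using a hypothetical `CurveData` stand-in rather than the real header:

```cpp
#include <vector>
#include <cassert>

// Hypothetical stand-in modelled on the BezierCurveData fields in the diff.
struct CurveData
{
    int order = 0;
    int sPx = 0;
    int sPy = 0;
    std::vector<int> coeff;
};

static CurveData makeDefaultCurve()
{
    CurveData c; // no explicit constructor: fields take their in-class defaults
    return c;
}
```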
x265_2.4.tar.gz/source/dynamicHDR10/CMakeLists.txt -> x265_2.5.tar.gz/source/dynamicHDR10/CMakeLists.txt
Changed
48
1
2
# vim: syntax=cmake
3
-if(ENABLE_DYNAMIC_HDR10)
4
+if(ENABLE_HDR10_PLUS)
5
6
add_library(dynamicHDR10 OBJECT
7
- BasicStructures.cpp BasicStructures.h
8
+ BasicStructures.h
9
json11/json11.cpp json11/json11.h
10
JsonHelper.cpp JsonHelper.h
11
metadataFromJson.cpp metadataFromJson.h
12
13
hdr10plus.h
14
api.cpp )
15
16
-else()
17
cmake_minimum_required (VERSION 2.8.11)
18
project(dynamicHDR10)
19
include(CheckIncludeFiles)
20
21
22
option(ENABLE_SHARED "Build shared library" OFF)
23
24
-if(ENABLE_SHARED)
25
- add_library(dynamicHDR10 SHARED
26
- json11/json11.cpp json11/json11.h
27
- BasicStructures.cpp BasicStructures.h
28
- JsonHelper.cpp JsonHelper.h
29
- metadataFromJson.cpp metadataFromJson.h
30
- SeiMetadataDictionary.cpp SeiMetadataDictionary.h
31
- hdr10plus.h api.cpp )
32
-else()
33
- add_library(dynamicHDR10 STATIC
34
- json11/json11.cpp json11/json11.h
35
- BasicStructures.cpp BasicStructures.h
36
- JsonHelper.cpp JsonHelper.h
37
- metadataFromJson.cpp metadataFromJson.h
38
- SeiMetadataDictionary.cpp SeiMetadataDictionary.h
39
- hdr10plus.h api.cpp )
40
-endif()
41
-
42
-install (TARGETS dynamicHDR10
43
- LIBRARY DESTINATION ${LIB_INSTALL_DIR}
44
- ARCHIVE DESTINATION ${LIB_INSTALL_DIR})
45
install(FILES hdr10plus.h DESTINATION include)
46
endif()
47
\ No newline at end of file
48
x265_2.4.tar.gz/source/dynamicHDR10/json11/json11.cpp -> x265_2.5.tar.gz/source/dynamicHDR10/json11/json11.cpp
Changed
50
1
2
#include <cstdio>
3
#include <limits>
4
5
+#if _MSC_VER
6
+#pragma warning(disable: 4510) //const member cannot be default initialized
7
+#pragma warning(disable: 4512) //assignment operator could not be generated
8
+#pragma warning(disable: 4610) //const member cannot be default initialized
9
+#endif
10
+
11
namespace json11 {
12
13
static const int max_depth = 200;
14
15
char get_next_token() {
16
consume_garbage();
17
if (i == str.size())
18
- return fail("unexpected end of input", 0);
19
+ return fail("unexpected end of input", '0');
20
21
return str[i++];
22
}
23
24
string parse_string() {
25
string out;
26
long last_escaped_codepoint = -1;
27
- while (true) {
28
+ for (;;) {
29
if (i == str.size())
30
return fail("unexpected end of input in string", "");
31
32
33
if (ch == '}')
34
return data;
35
36
- while (1) {
37
+ for (;;) {
38
if (ch != '"')
39
return fail("expected '\"' in object, got " + esc(ch));
40
41
42
if (ch == ']')
43
return data;
44
45
- while (1) {
46
+ for (;;) {
47
i--;
48
data.push_back(parse_json(depth + 1));
49
if (failed)
50
x265_2.4.tar.gz/source/dynamicHDR10/metadataFromJson.cpp -> x265_2.5.tar.gz/source/dynamicHDR10/metadataFromJson.cpp
Changed
10
1
2
{
3
int payloadBytes = 1;
4
5
- for(;payload > 0xFF; payload -= 0xFF, ++payloadBytes);
6
+ for(;payload >= 0xFF; payload -= 0xFF, ++payloadBytes);
7
8
if(payloadBytes > 1)
9
{
10
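The change from `>` to `>=` matters when the payload size is an exact multiple of 255. In the SEI size coding, each 0xFF byte adds 255 and the final byte has to be below 0xFF, so a size of 255 needs two bytes (0xFF, 0x00) rather than one. A sketch of the corrected count as a standalone function; `payloadSizeBytes` is a hypothetical name, not part of the library:

```cpp
#include <cassert>

// Number of bytes needed to code a payload size, using the corrected bound.
static int payloadSizeBytes(int payload)
{
    int payloadBytes = 1; // the final byte, which carries the remainder
    for (; payload >= 0xFF; payload -= 0xFF)
        ++payloadBytes;   // one extra 0xFF byte per full 255
    return payloadBytes;
}
```

With the old `>` bound, a size of 255 would have been counted as a single byte.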
x265_2.4.tar.gz/source/encoder/CMakeLists.txt -> x265_2.5.tar.gz/source/encoder/CMakeLists.txt
Changed
reference.cpp reference.h
encoder.cpp encoder.h
api.cpp
- weightPrediction.cpp)
+ weightPrediction.cpp
+ ../x265-extras.cpp ../x265-extras.h)
x265_2.4.tar.gz/source/encoder/analysis.cpp -> x265_2.5.tar.gz/source/encoder/analysis.cpp
Changed
201
1
2
m_reuseInterDataCTU = NULL;
3
m_reuseRef = NULL;
4
m_bHD = false;
5
+ m_evaluateInter = 0;
6
}
7
8
bool Analysis::create(ThreadLocalData *tld)
9
10
cacheCost = X265_MALLOC(uint64_t, costArrSize);
11
12
int csp = m_param->internalCsp;
13
- uint32_t cuSize = g_maxCUSize;
14
+ uint32_t cuSize = m_param->maxCUSize;
15
16
bool ok = true;
17
- for (uint32_t depth = 0; depth <= g_maxCUDepth; depth++, cuSize >>= 1)
18
+ for (uint32_t depth = 0; depth <= m_param->maxCUDepth; depth++, cuSize >>= 1)
19
{
20
ModeDepth &md = m_modeDepth[depth];
21
22
- md.cuMemPool.create(depth, csp, MAX_PRED_TYPES);
23
+ md.cuMemPool.create(depth, csp, MAX_PRED_TYPES, *m_param);
24
ok &= md.fencYuv.create(cuSize, csp);
25
26
for (int j = 0; j < MAX_PRED_TYPES; j++)
27
{
28
- md.pred[j].cu.initialize(md.cuMemPool, depth, csp, j);
29
+ md.pred[j].cu.initialize(md.cuMemPool, depth, *m_param, j);
30
ok &= md.pred[j].predYuv.create(cuSize, csp);
31
ok &= md.pred[j].reconYuv.create(cuSize, csp);
32
md.pred[j].fencYuv = &md.fencYuv;
33
34
35
void Analysis::destroy()
36
{
37
- for (uint32_t i = 0; i <= g_maxCUDepth; i++)
38
+ for (uint32_t i = 0; i <= m_param->maxCUDepth; i++)
39
{
40
m_modeDepth[i].cuMemPool.destroy();
41
m_modeDepth[i].fencYuv.destroy();
42
43
calculateNormFactor(ctu, qp);
44
45
uint32_t numPartition = ctu.m_numPartitions;
46
+ if (m_param->bCTUInfo && (*m_frame->m_ctuInfo + ctu.m_cuAddr))
47
+ {
48
+ x265_ctu_info_t* ctuTemp = *m_frame->m_ctuInfo + ctu.m_cuAddr;
49
+ if (ctuTemp->ctuPartitions)
50
+ {
51
+ int32_t depthIdx = 0;
52
+ uint32_t maxNum8x8Partitions = 64;
53
+ uint8_t* depthInfoPtr = m_frame->m_addOnDepth[ctu.m_cuAddr];
54
+ uint8_t* contentInfoPtr = m_frame->m_addOnCtuInfo[ctu.m_cuAddr];
55
+ int* prevCtuInfoChangePtr = m_frame->m_addOnPrevChange[ctu.m_cuAddr];
56
+ do
57
+ {
58
+ uint8_t depth = (uint8_t)ctuTemp->ctuPartitions[depthIdx];
59
+ uint8_t content = (uint8_t)(*((int32_t *)ctuTemp->ctuInfo + depthIdx));
60
+ int prevCtuInfoChange = m_frame->m_prevCtuInfoChange[ctu.m_cuAddr * maxNum8x8Partitions + depthIdx];
61
+ memset(depthInfoPtr, depth, sizeof(uint8_t) * numPartition >> 2 * depth);
62
+ memset(contentInfoPtr, content, sizeof(uint8_t) * numPartition >> 2 * depth);
63
+ memset(prevCtuInfoChangePtr, 0, sizeof(int) * numPartition >> 2 * depth);
64
+ for (uint32_t l = 0; l < numPartition >> 2 * depth; l++)
65
+ prevCtuInfoChangePtr[l] = prevCtuInfoChange;
66
+ depthInfoPtr += ctu.m_numPartitions >> 2 * depth;
67
+ contentInfoPtr += ctu.m_numPartitions >> 2 * depth;
68
+ prevCtuInfoChangePtr += ctu.m_numPartitions >> 2 * depth;
69
+ depthIdx++;
70
+ } while (ctuTemp->ctuPartitions[depthIdx] != 0);
71
+
72
+ m_additionalCtuInfo = m_frame->m_addOnCtuInfo[ctu.m_cuAddr];
73
+ m_prevCtuInfoChange = m_frame->m_addOnPrevChange[ctu.m_cuAddr];
74
+ memcpy(ctu.m_cuDepth, m_frame->m_addOnDepth[ctu.m_cuAddr], sizeof(uint8_t) * numPartition);
75
+ //Calculate log2CUSize from depth
76
+ for (uint32_t i = 0; i < cuGeom.numPartitions; i++)
77
+ ctu.m_log2CUSize[i] = (uint8_t)m_param->maxLog2CUSize - ctu.m_cuDepth[i];
78
+ }
79
+ }
80
+
81
if (m_param->analysisMultiPassRefine && m_param->rc.bStatRead)
82
{
83
m_multipassAnalysis = (analysis2PassFrameData*)m_frame->m_analysis2Pass.analysisFramedata;
84
85
}
86
}
87
88
- if (m_param->analysisMode && m_slice->m_sliceType != I_SLICE && m_param->analysisRefineLevel > 1 && m_param->analysisRefineLevel < 10)
89
+ if (m_param->analysisReuseMode && m_slice->m_sliceType != I_SLICE && m_param->analysisReuseLevel > 1 && m_param->analysisReuseLevel < 10)
90
{
91
int numPredDir = m_slice->isInterP() ? 1 : 2;
92
m_reuseInterDataCTU = (analysis_inter_data*)m_frame->m_analysisData.interData;
93
m_reuseRef = &m_reuseInterDataCTU->ref[ctu.m_cuAddr * X265_MAX_PRED_MODE_PER_CTU * numPredDir];
94
m_reuseDepth = &m_reuseInterDataCTU->depth[ctu.m_cuAddr * ctu.m_numPartitions];
95
m_reuseModes = &m_reuseInterDataCTU->modes[ctu.m_cuAddr * ctu.m_numPartitions];
96
- if (m_param->analysisRefineLevel > 4)
97
+ if (m_param->analysisReuseLevel > 4)
98
{
99
m_reusePartSize = &m_reuseInterDataCTU->partSize[ctu.m_cuAddr * ctu.m_numPartitions];
100
m_reuseMergeFlag = &m_reuseInterDataCTU->mergeFlag[ctu.m_cuAddr * ctu.m_numPartitions];
101
}
102
- if (m_param->analysisMode == X265_ANALYSIS_SAVE)
103
+ if (m_param->analysisReuseMode == X265_ANALYSIS_SAVE)
104
for (int i = 0; i < X265_MAX_PRED_MODE_PER_CTU * numPredDir; i++)
105
m_reuseRef[i] = -1;
106
}
107
108
if (m_slice->m_sliceType == I_SLICE)
109
{
110
analysis_intra_data* intraDataCTU = (analysis_intra_data*)m_frame->m_analysisData.intraData;
111
- if (m_param->analysisMode == X265_ANALYSIS_LOAD && m_param->analysisRefineLevel > 1)
112
+ if (m_param->analysisReuseMode == X265_ANALYSIS_LOAD && m_param->analysisReuseLevel > 1)
113
{
114
memcpy(ctu.m_cuDepth, &intraDataCTU->depth[ctu.m_cuAddr * numPartition], sizeof(uint8_t) * numPartition);
115
memcpy(ctu.m_lumaIntraDir, &intraDataCTU->modes[ctu.m_cuAddr * numPartition], sizeof(uint8_t) * numPartition);
116
117
else
118
{
119
if (m_param->bIntraRefresh && m_slice->m_sliceType == P_SLICE &&
120
- ctu.m_cuPelX / g_maxCUSize >= frame.m_encData->m_pir.pirStartCol
121
- && ctu.m_cuPelX / g_maxCUSize < frame.m_encData->m_pir.pirEndCol)
122
+ ctu.m_cuPelX / m_param->maxCUSize >= frame.m_encData->m_pir.pirStartCol
123
+ && ctu.m_cuPelX / m_param->maxCUSize < frame.m_encData->m_pir.pirEndCol)
124
compressIntraCU(ctu, cuGeom, qp);
125
else if (!m_param->rdLevel)
126
{
127
128
/* generate residual for entire CTU at once and copy to reconPic */
129
encodeResidue(ctu, cuGeom);
130
}
131
- else if (m_param->analysisMode == X265_ANALYSIS_LOAD && m_param->analysisRefineLevel == 10)
132
+ else if (m_param->analysisReuseMode == X265_ANALYSIS_LOAD && m_param->analysisReuseLevel == 10)
133
{
134
analysis_inter_data* interDataCTU = (analysis_inter_data*)m_frame->m_analysisData.interData;
135
int posCTU = ctu.m_cuAddr * numPartition;
136
137
}
138
//Calculate log2CUSize from depth
139
for (uint32_t i = 0; i < cuGeom.numPartitions; i++)
140
- ctu.m_log2CUSize[i] = (uint8_t)g_maxLog2CUSize - ctu.m_cuDepth[i];
141
+ ctu.m_log2CUSize[i] = (uint8_t)m_param->maxLog2CUSize - ctu.m_cuDepth[i];
142
143
qprdRefine (ctu, cuGeom, qp, qp);
144
return *m_modeDepth[0].bestMode;
145
146
if (m_param->bEnableRdRefine || m_param->bOptCUDeltaQP)
147
qprdRefine(ctu, cuGeom, qp, qp);
148
149
+ if (m_param->csvLogLevel >= 2)
150
+ collectPUStatistics(ctu, cuGeom);
151
+
152
return *m_modeDepth[0].bestMode;
153
}
154
155
+void Analysis::collectPUStatistics(const CUData& ctu, const CUGeom& cuGeom)
156
+{
157
+ uint8_t depth = 0;
158
+ uint8_t partSize = 0;
159
+ for (uint32_t absPartIdx = 0; absPartIdx < ctu.m_numPartitions; absPartIdx += ctu.m_numPartitions >> (depth * 2))
160
+ {
161
+ depth = ctu.m_cuDepth[absPartIdx];
162
+ partSize = ctu.m_partSize[absPartIdx];
163
+ uint32_t numPU = nbPartsTable[(int)partSize];
164
+ int shift = 2 * (m_param->maxCUDepth + 1 - depth);
165
+ for (uint32_t puIdx = 0; puIdx < numPU; puIdx++)
166
+ {
167
+ PredictionUnit pu(ctu, cuGeom, puIdx);
168
+ int puabsPartIdx = ctu.getPUOffset(puIdx, absPartIdx);
169
+ int mode = 1;
170
+ if (ctu.m_partSize[puabsPartIdx + absPartIdx] == SIZE_Nx2N || ctu.m_partSize[puabsPartIdx + absPartIdx] == SIZE_2NxN)
171
+ mode = 2;
172
+ else if (ctu.m_partSize[puabsPartIdx + absPartIdx] == SIZE_2NxnU || ctu.m_partSize[puabsPartIdx + absPartIdx] == SIZE_2NxnD || ctu.m_partSize[puabsPartIdx + absPartIdx] == SIZE_nLx2N || ctu.m_partSize[puabsPartIdx + absPartIdx] == SIZE_nRx2N)
173
+ mode = 3;
174
+
175
+ if (ctu.m_predMode[puabsPartIdx + absPartIdx] == MODE_SKIP)
176
+ {
177
+ ctu.m_encData->m_frameStats.cntSkipPu[depth] += (uint64_t)(1 << shift);
178
+ ctu.m_encData->m_frameStats.totalPu[depth] += (uint64_t)(1 << shift);
179
+ }
180
+ else if (ctu.m_predMode[puabsPartIdx + absPartIdx] == MODE_INTRA)
181
+ {
182
+ if (ctu.m_partSize[puabsPartIdx + absPartIdx] == SIZE_NxN)
183
+ {
184
+ ctu.m_encData->m_frameStats.cnt4x4++;
185
+ ctu.m_encData->m_frameStats.totalPu[4]++;
186
+ }
187
+ else
188
+ {
189
+ ctu.m_encData->m_frameStats.cntIntraPu[depth] += (uint64_t)(1 << shift);
190
+ ctu.m_encData->m_frameStats.totalPu[depth] += (uint64_t)(1 << shift);
191
+ }
192
+ }
193
+ else if (mode == 3)
194
+ {
195
+ ctu.m_encData->m_frameStats.cntAmp[depth] += (uint64_t)(1 << shift);
196
+ ctu.m_encData->m_frameStats.totalPu[depth] += (uint64_t)(1 << shift);
197
+ break;
198
+ }
199
+ else
200
+ {
201
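Note on the counters in collectPUStatistics(): each per-depth counter is incremented by `1 << shift`, with `shift = 2 * (m_param->maxCUDepth + 1 - depth)`, rather than by one. A minimal sketch of that weighting follows, on the assumption that the intent is to weight a CU by its area, so that one CU at depth d counts as much as the four CUs at depth d + 1 that would replace it. The helper name `puStatWeight` is hypothetical and does not exist in x265.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of x265): the amount added to a per-depth
// counter for one CU at `depth`, following the expression in the diff:
//   shift = 2 * (maxCUDepth + 1 - depth); weight = 1 << shift
// Each step deeper in the quadtree divides the weight by four, which matches
// the 4:1 area ratio between adjacent quadtree levels.
uint64_t puStatWeight(uint32_t maxCUDepth, uint32_t depth)
{
    assert(depth <= maxCUDepth);
    const uint32_t shift = 2 * (maxCUDepth + 1 - depth);
    // The sketch shifts a 64-bit one; the diff shifts an int and then casts.
    return uint64_t(1) << shift;
}
```

With maxCUDepth = 3, the weight is 256 at depth 0 and 4 at depth 3, so the per-depth totals stay comparable across depths.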
x265_2.4.tar.gz/source/encoder/analysis.h -> x265_2.5.tar.gz/source/encoder/analysis.h
Changed

int* m_multipassMvpIdx[2];
int32_t* m_multipassRef[2];
uint8_t* m_multipassModes;
+
+ uint8_t m_evaluateInter;
+ uint8_t* m_additionalCtuInfo;
+ int* m_prevCtuInfoChange;
/* refine RD based on QP for rd-levels 5 and 6 */
void qprdRefine(const CUData& parentCTU, const CUGeom& cuGeom, int32_t qp, int32_t lqp);

void calculateNormFactor(CUData& ctu, int qp);
void normFactor(const pixel* src, uint32_t blockSize, CUData& ctu, int qp, TextType ttype);
+
+ void collectPUStatistics(const CUData& ctu, const CUGeom& cuGeom);
+
/* check whether current mode is the new best */
inline void checkBestMode(Mode& mode, uint32_t depth)
{

else
md.bestMode = &mode;
}
+ int findSameContentRefCount(const CUData& parentCTU, const CUGeom& cuGeom);
};

struct ThreadLocalData

x265_2.4.tar.gz/source/encoder/api.cpp -> x265_2.5.tar.gz/source/encoder/api.cpp
Changed
152
1
2
#include "level.h"
3
#include "nal.h"
4
#include "bitcost.h"
5
+#include "x265-extras.h"
6
7
/* multilib namespace reflectors */
8
#if LINKED_8BIT
9
10
if (x265_check_params(param))
11
goto fail;
12
13
- if (x265_set_globals(param))
14
- goto fail;
15
-
16
encoder = new Encoder;
17
if (!param->rc.bEnableSlowFirstPass)
18
PARAM_NS::x265_param_apply_fastfirstpass(param);
19
20
}
21
22
encoder->create();
23
+ /* Try to open CSV file handle */
24
+ if (encoder->m_param->csvfn)
25
+ {
26
+ encoder->m_param->csvfpt = x265_csvlog_open(*encoder->m_param, encoder->m_param->csvfn, encoder->m_param->csvLogLevel);
27
+ if (!encoder->m_param->csvfpt)
28
+ {
29
+ x265_log(encoder->m_param, X265_LOG_ERROR, "Unable to open CSV log file <%s>, aborting\n", encoder->m_param->csvfn);
30
+ encoder->m_aborted = true;
31
+ }
32
+ }
33
+
34
encoder->m_latestParam = latestParam;
35
memcpy(latestParam, param, sizeof(x265_param));
36
if (encoder->m_aborted)
37
38
if (encoder->m_param->rc.bStatRead && encoder->m_param->bMultiPassOptRPS)
39
{
40
if (!encoder->computeSPSRPSIndex())
41
+ {
42
+ encoder->m_aborted = true;
43
return -1;
44
+ }
45
}
46
encoder->getStreamHeaders(encoder->m_nalList, sbacCoder, bs);
47
*pp_nal = &encoder->m_nalList.m_nal[0];
48
49
return encoder->m_nalList.m_occupancy;
50
}
51
52
+ if (enc)
53
+ {
54
+ Encoder *encoder = static_cast<Encoder*>(enc);
55
+ encoder->m_aborted = true;
56
+ }
57
return -1;
58
}
59
60
61
else if (pi_nal)
62
*pi_nal = 0;
63
64
+ if (numEncoded && encoder->m_param->csvLogLevel)
65
+ x265_csvlog_frame(encoder->m_param->csvfpt, *encoder->m_param, *pic_out, encoder->m_param->csvLogLevel);
66
+
67
+ if (numEncoded < 0)
68
+ encoder->m_aborted = true;
69
+
70
return numEncoded;
71
}
72
73
74
}
75
}
76
77
-void x265_encoder_log(x265_encoder* enc, int, char **)
78
+void x265_encoder_log(x265_encoder* enc, int argc, char **argv)
79
{
80
if (enc)
81
{
82
Encoder *encoder = static_cast<Encoder*>(enc);
83
- x265_log(encoder->m_param, X265_LOG_WARNING, "x265_encoder_log is now deprecated\n");
84
+ x265_stats stats;
85
+ int padx = encoder->m_sps.conformanceWindow.rightOffset;
86
+ int pady = encoder->m_sps.conformanceWindow.bottomOffset;
87
+ encoder->fetchStats(&stats, sizeof(stats));
88
+ const x265_api * api = x265_api_get(0);
89
+ x265_csvlog_encode(encoder->m_param->csvfpt, api->version_str, *encoder->m_param, padx, pady, stats, encoder->m_param->csvLogLevel, argc, argv);
90
}
91
}
92
93
94
encoder->printSummary();
95
encoder->destroy();
96
delete encoder;
97
- ATOMIC_DEC(&g_ctuSizeConfigured);
98
}
99
}
100
101
102
encoder->m_bQueuedIntraRefresh = 1;
103
return 0;
104
}
105
+int x265_encoder_ctu_info(x265_encoder *enc, int poc, x265_ctu_info_t** ctu)
106
+{
107
+ if (!ctu || !enc)
108
+ return -1;
109
+ Encoder* encoder = static_cast<Encoder*>(enc);
110
+ encoder->copyCtuInfo(ctu, poc);
111
+ return 0;
112
+}
113
114
void x265_cleanup(void)
115
{
116
- if (!g_ctuSizeConfigured)
117
- {
118
- BitCost::destroy();
119
- CUData::s_partSet[0] = NULL; /* allow CUData to adjust to new CTU size */
120
- }
121
+ BitCost::destroy();
122
}
123
124
x265_picture *x265_picture_alloc()
125
126
pic->userSEI.payloads = NULL;
127
pic->userSEI.numPayloads = 0;
128
129
- if (param->analysisMode)
130
+ if (param->analysisReuseMode)
131
{
132
- uint32_t widthInCU = (param->sourceWidth + g_maxCUSize - 1) >> g_maxLog2CUSize;
133
- uint32_t heightInCU = (param->sourceHeight + g_maxCUSize - 1) >> g_maxLog2CUSize;
134
+ uint32_t widthInCU = (param->sourceWidth + param->maxCUSize - 1) >> param->maxLog2CUSize;
135
+ uint32_t heightInCU = (param->sourceHeight + param->maxCUSize - 1) >> param->maxLog2CUSize;
136
137
uint32_t numCUsInFrame = widthInCU * heightInCU;
138
pic->analysisData.numCUsInFrame = numCUsInFrame;
139
- pic->analysisData.numPartitions = NUM_4x4_PARTITIONS;
140
+ pic->analysisData.numPartitions = param->num4x4Partitions;
141
}
142
}
143
144
145
146
sizeof(x265_frame_stats),
147
&x265_encoder_intra_refresh,
148
+ &x265_encoder_ctu_info,
149
};
150
151
typedef const x265_api* (*api_get_func)(int bitDepth);
152
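Note on the CTU counts in x265_picture_init(): the expression `(param->sourceWidth + param->maxCUSize - 1) >> param->maxLog2CUSize` is a round-up division by the CTU size, now read from the instance's parameters instead of the removed globals. A minimal sketch of the arithmetic, with a hypothetical helper name `ctusCovering` that is not part of the x265 API:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of CTUs needed to cover `pixels` samples when
// the CTU edge is 2^log2CtuSize. Mirrors the expression in the diff:
//   (sourceWidth + maxCUSize - 1) >> maxLog2CUSize
uint32_t ctusCovering(uint32_t pixels, uint32_t log2CtuSize)
{
    assert(log2CtuSize < 32);
    const uint32_t ctuSize = 1u << log2CtuSize;
    // Adding (ctuSize - 1) before shifting rounds a partial CTU up to one.
    return (pixels + ctuSize - 1) >> log2CtuSize;
}
```

For a 1920x1080 source with 64x64 CTUs this gives 30 columns and 17 rows, so numCUsInFrame would be 510.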
x265_2.4.tar.gz/source/encoder/dpb.cpp -> x265_2.5.tar.gz/source/encoder/dpb.cpp
Changed
34
1
2
}
3
}
4
5
+ if (curFrame->m_ctuInfo != NULL)
6
+ {
7
+ uint32_t widthInCU = (curFrame->m_param->sourceWidth + curFrame->m_param->maxCUSize - 1) >> curFrame->m_param->maxLog2CUSize;
8
+ uint32_t heightInCU = (curFrame->m_param->sourceHeight + curFrame->m_param->maxCUSize - 1) >> curFrame->m_param->maxLog2CUSize;
9
+ uint32_t numCUsInFrame = widthInCU * heightInCU;
10
+ for (uint32_t i = 0; i < numCUsInFrame; i++)
11
+ {
12
+ X265_FREE((*curFrame->m_ctuInfo + i)->ctuInfo);
13
+ (*curFrame->m_ctuInfo + i)->ctuInfo = NULL;
14
+ }
15
+ X265_FREE(*curFrame->m_ctuInfo);
16
+ *(curFrame->m_ctuInfo) = NULL;
17
+ X265_FREE(curFrame->m_ctuInfo);
18
+ curFrame->m_ctuInfo = NULL;
19
+ X265_FREE(curFrame->m_prevCtuInfoChange);
20
+ curFrame->m_prevCtuInfoChange = NULL;
21
+ }
22
curFrame->m_encData = NULL;
23
curFrame->m_reconPic = NULL;
24
}
25
26
}
27
28
// Disable Loopfilter in bound area, because we will do slice-parallelism in future
29
- slice->m_sLFaseFlag = (g_maxSlices > 1) ? false : ((SLFASE_CONSTANT & (1 << (pocCurr % 31))) > 0);
30
+ slice->m_sLFaseFlag = (newFrame->m_param->maxSlices > 1) ? false : ((SLFASE_CONSTANT & (1 << (pocCurr % 31))) > 0);
31
32
/* Increment reference count of all motion-referenced frames to prevent them
33
* from being recycled. These counts are decremented at the end of
34
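Note on the m_ctuInfo cleanup added above: the buffers are released innermost first (each CTU's ctuInfo, then the per-frame array, then the outer handle), and every pointer is set to NULL after it is freed. The sketch below reproduces that order with standard `malloc`/`free` in place of the X265 allocation macros. The `CtuInfo` struct and both function names are hypothetical stand-ins, not the real x265 types.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for the per-CTU record; only the owned buffer matters.
struct CtuInfo
{
    uint8_t* ctuInfo;
};

// Allocates the outer handle, the per-frame array, and one buffer per CTU.
CtuInfo** allocCtuInfo(uint32_t numCtus, uint32_t bytesPerCtu)
{
    assert(numCtus > 0);
    CtuInfo** handle = static_cast<CtuInfo**>(std::malloc(sizeof(CtuInfo*)));
    assert(handle);
    *handle = static_cast<CtuInfo*>(std::calloc(numCtus, sizeof(CtuInfo)));
    assert(*handle);
    for (uint32_t i = 0; i < numCtus; i++)
        (*handle)[i].ctuInfo = static_cast<uint8_t*>(std::calloc(bytesPerCtu, 1));
    return handle;
}

// Releases innermost first and nulls each pointer, in the order the diff uses.
void freeCtuInfo(CtuInfo**& handle, uint32_t numCtus)
{
    if (!handle)
        return;
    for (uint32_t i = 0; i < numCtus; i++)
    {
        std::free((*handle)[i].ctuInfo);
        (*handle)[i].ctuInfo = nullptr;
    }
    std::free(*handle);
    *handle = nullptr;
    std::free(handle);
    handle = nullptr;
}
```

Nulling the handle makes a second call a no-op, which is the same protection the `m_ctuInfo != NULL` check gives in the diff.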
x265_2.4.tar.gz/source/encoder/encoder.cpp -> x265_2.5.tar.gz/source/encoder/encoder.cpp
Changed
201
1
2
m_frameEncoder[i] = NULL;
3
MotionEstimate::initScales();
4
5
-#if ENABLE_DYNAMIC_HDR10
6
+#if ENABLE_HDR10_PLUS
7
m_hdr10plus_api = hdr10plus_api_get();
8
+ numCimInfo = 0;
9
+ cim = NULL;
10
#endif
11
12
m_prevTonemapPayload.payload = NULL;
13
14
if (!p->bEnableWavefront && !p->bDistributeModeAnalysis && !p->bDistributeMotionEstimation && !p->lookaheadSlices)
15
allowPools = false;
16
17
- if (!p->frameNumThreads)
18
- {
19
- // auto-detect frame threads
20
- int cpuCount = ThreadPool::getCpuCount();
21
- if (!p->bEnableWavefront)
22
- p->frameNumThreads = X265_MIN3(cpuCount, (rows + 1) / 2, X265_MAX_FRAME_THREADS);
23
- else if (cpuCount >= 32)
24
- p->frameNumThreads = (p->sourceHeight > 2000) ? 8 : 6; // dual-socket 10-core IvyBridge or higher
25
- else if (cpuCount >= 16)
26
- p->frameNumThreads = 5; // 8 HT cores, or dual socket
27
- else if (cpuCount >= 8)
28
- p->frameNumThreads = 3; // 4 HT cores
29
- else if (cpuCount >= 4)
30
- p->frameNumThreads = 2; // Dual or Quad core
31
- else
32
- p->frameNumThreads = 1;
33
- }
34
m_numPools = 0;
35
if (allowPools)
36
m_threadPool = ThreadPool::allocThreadPools(p, m_numPools, 0);
37
+ else
38
+ {
39
+ if (!p->frameNumThreads)
40
+ {
41
+ // auto-detect frame threads
42
+ int cpuCount = ThreadPool::getCpuCount();
43
+ ThreadPool::getFrameThreadsCount(p, cpuCount);
44
+ }
45
+ }
46
+
47
if (!m_numPools)
48
{
49
// issue warnings if any of these features were requested
50
51
else
52
m_scalingList.setupQuantMatrices(m_sps.chromaFormatIdc);
53
54
- int numRows = (m_param->sourceHeight + g_maxCUSize - 1) / g_maxCUSize;
55
- int numCols = (m_param->sourceWidth + g_maxCUSize - 1) / g_maxCUSize;
56
+ int numRows = (m_param->sourceHeight + m_param->maxCUSize - 1) / m_param->maxCUSize;
57
+ int numCols = (m_param->sourceWidth + m_param->maxCUSize - 1) / m_param->maxCUSize;
58
for (int i = 0; i < m_param->frameNumThreads; i++)
59
{
60
if (!m_frameEncoder[i]->init(this, numRows, numCols))
61
62
63
initRefIdx();
64
65
- if (m_param->analysisMode)
66
+ if (m_param->analysisReuseMode)
67
{
68
- const char* name = m_param->analysisFileName;
69
+ const char* name = m_param->analysisReuseFileName;
70
if (!name)
71
name = defaultAnalysisFileName;
72
- const char* mode = m_param->analysisMode == X265_ANALYSIS_LOAD ? "rb" : "wb";
73
+ const char* mode = m_param->analysisReuseMode == X265_ANALYSIS_LOAD ? "rb" : "wb";
74
m_analysisFile = x265_fopen(name, mode);
75
if (!m_analysisFile)
76
{
77
78
79
if (m_param->analysisMultiPassRefine || m_param->analysisMultiPassDistortion)
80
{
81
- const char* name = m_param->analysisFileName;
82
+ const char* name = m_param->analysisReuseFileName;
83
if (!name)
84
name = defaultAnalysisFileName;
85
if (m_param->rc.bStatWrite)
86
87
88
void Encoder::destroy()
89
{
90
+#if ENABLE_HDR10_PLUS
91
+ m_hdr10plus_api->hdr10plus_clear_movie(cim, numCimInfo);
92
+#endif
93
+
94
if (m_exportedPic)
95
{
96
ATOMIC_DEC(&m_exportedPic->m_countRefEncoders);
97
98
{
99
int bError = 1;
100
fclose(m_analysisFileOut);
101
- const char* name = m_param->analysisFileName;
102
+ const char* name = m_param->analysisReuseFileName;
103
if (!name)
104
name = defaultAnalysisFileName;
105
char* temp = strcatFilename(name, ".temp");
106
107
}
108
if (m_param)
109
{
110
+ if (m_param->csvfpt)
111
+ fclose(m_param->csvfpt);
112
/* release string arguments that were strdup'd */
113
free((char*)m_param->rc.lambdaFileName);
114
free((char*)m_param->rc.statFileName);
115
- free((char*)m_param->analysisFileName);
116
+ free((char*)m_param->analysisReuseFileName);
117
free((char*)m_param->scalingLists);
118
+ free((char*)m_param->csvfn);
119
free((char*)m_param->numaPools);
120
free((char*)m_param->masteringDisplayColorVolume);
121
free((char*)m_param->toneMapFile);
122
123
FrameEncoder *encoder = m_frameEncoder[i];
124
if (encoder->m_rce.isActive && encoder->m_rce.poc != rc->m_curSlice->m_poc)
125
{
126
- int64_t bits = (int64_t) X265_MAX(encoder->m_rce.frameSizeEstimated, encoder->m_rce.frameSizePlanned);
127
+ int64_t bits = m_param->rc.bEnableConstVbv ? (int64_t)encoder->m_rce.frameSizePlanned : (int64_t)X265_MAX(encoder->m_rce.frameSizeEstimated, encoder->m_rce.frameSizePlanned);
128
rc->m_bufferFill -= bits;
129
rc->m_bufferFill = X265_MAX(rc->m_bufferFill, 0);
130
rc->m_bufferFill += encoder->m_rce.bufferRate;
131
132
133
if (m_exportedPic)
134
{
135
+ if (!m_param->bUseAnalysisFile && m_param->analysisReuseMode == X265_ANALYSIS_SAVE)
136
+ freeAnalysis(&m_exportedPic->m_analysisData);
137
ATOMIC_DEC(&m_exportedPic->m_countRefEncoders);
138
m_exportedPic = NULL;
139
m_dpb->recycleUnreferenced();
140
141
{
142
x265_sei_payload toneMap;
143
toneMap.payload = NULL;
144
-#if ENABLE_DYNAMIC_HDR10
145
+#if ENABLE_HDR10_PLUS
146
if (m_bToneMap)
147
{
148
- uint8_t *cim = NULL;
149
- if (m_hdr10plus_api->hdr10plus_json_to_frame_cim(m_param->toneMapFile, pic_in->poc, cim))
150
- {
151
- toneMap.payload = (uint8_t*)x265_malloc(sizeof(uint8_t) * cim[0]);
152
- toneMap.payloadSize = cim[0];
153
+ if (pic_in->poc == 0)
154
+ numCimInfo = m_hdr10plus_api->hdr10plus_json_to_movie_cim(m_param->toneMapFile, cim);
155
+ if (pic_in->poc < numCimInfo)
156
+ {
157
+ int32_t i = 0;
158
+ toneMap.payloadSize = 0;
159
+ while (cim[pic_in->poc][i] == 0xFF)
160
+ toneMap.payloadSize += cim[pic_in->poc][i++];
161
+ toneMap.payloadSize += cim[pic_in->poc][i++];
162
+
163
+ toneMap.payload = (uint8_t*)x265_malloc(sizeof(uint8_t) * toneMap.payloadSize);
164
toneMap.payloadType = USER_DATA_REGISTERED_ITU_T_T35;
165
- memcpy(toneMap.payload, cim, toneMap.payloadSize);
166
+ memcpy(toneMap.payload, cim[pic_in->poc] + i, toneMap.payloadSize);
167
}
168
}
169
#endif
170
171
for (int i = 0; i < numPayloads; i++)
172
{
173
x265_sei_payload input;
174
- if (i == (numPayloads - 1))
175
+ if ((i == (numPayloads - 1)) && toneMapEnable)
176
input = toneMap;
177
else
178
input = pic_in->userSEI.payloads[i];
179
180
181
/* In analysisSave mode, x265_analysis_data is allocated in pic_in and inFrame points to this */
182
/* Load analysis data before lookahead->addPicture, since sliceType has been decided */
183
- if (m_param->analysisMode == X265_ANALYSIS_LOAD)
184
+ if (m_param->analysisReuseMode == X265_ANALYSIS_LOAD)
185
{
186
- x265_picture* inputPic = const_cast<x265_picture*>(pic_in);
187
/* readAnalysisFile reads analysis data for the frame and allocates memory based on slicetype */
188
- readAnalysisFile(&inputPic->analysisData, inFrame->m_poc);
189
- inFrame->m_analysisData.poc = inFrame->m_poc;
190
- inFrame->m_analysisData.sliceType = inputPic->analysisData.sliceType;
191
- inFrame->m_analysisData.bScenecut = inputPic->analysisData.bScenecut;
192
- inFrame->m_analysisData.satdCost = inputPic->analysisData.satdCost;
193
- inFrame->m_analysisData.numCUsInFrame = inputPic->analysisData.numCUsInFrame;
194
- inFrame->m_analysisData.numPartitions = inputPic->analysisData.numPartitions;
195
- inFrame->m_analysisData.wt = inputPic->analysisData.wt;
196
- inFrame->m_analysisData.interData = inputPic->analysisData.interData;
197
- inFrame->m_analysisData.intraData = inputPic->analysisData.intraData;
198
- sliceType = inputPic->analysisData.sliceType;
199
+ readAnalysisFile(&inFrame->m_analysisData, inFrame->m_poc, pic_in);
200
+ sliceType = inFrame->m_analysisData.sliceType;
201
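Note on the tone-map payload size in the HDR10+ hunk above: the new code sums every leading 0xFF byte of `cim[pic_in->poc]`, adds the first byte that is not 0xFF, and then copies the payload starting just past those bytes. A minimal sketch of that size decoding follows. The function name and the explicit buffer length are assumptions added for the sketch; the loop in the diff has no length to check against.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: decodes a size stored as zero or more 0xFF bytes
// followed by one terminating byte below 0xFF. The size is the sum of all of
// those bytes. On success, `headerLen` is the number of bytes consumed, i.e.
// the offset at which the payload itself starts.
// Returns false if the buffer ends before a terminating byte is found.
bool decodeFfCodedSize(const uint8_t* buf, size_t len,
                       uint32_t& size, size_t& headerLen)
{
    assert(buf != nullptr || len == 0);
    uint32_t total = 0;
    size_t i = 0;
    while (i < len && buf[i] == 0xFF)
        total += buf[i++];          // each 0xFF contributes 255
    if (i == len)
        return false;               // ran out of bytes (bounds check added here)
    total += buf[i++];              // terminating byte, 0..254
    size = total;
    headerLen = i;
    return true;
}
```

A header of `FF 0A` therefore decodes to a size of 265 with the payload starting at offset 2, which corresponds to `toneMap.payloadSize` and the `+ i` in the `memcpy` call.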
x265_2.4.tar.gz/source/encoder/encoder.h -> x265_2.5.tar.gz/source/encoder/encoder.h
Changed

#include "x265.h"
#include "nal.h"
#include "framedata.h"
-
-#ifdef ENABLE_DYNAMIC_HDR10
- #include "dynamicHDR10\hdr10plus.h"
+#ifdef ENABLE_HDR10_PLUS
+ #include "dynamicHDR10/hdr10plus.h"
#endif
-
struct x265_encoder {};
namespace X265_NS {
// private namespace

int m_bToneMap; // Enables tone-mapping

-#ifdef ENABLE_DYNAMIC_HDR10
+#ifdef ENABLE_HDR10_PLUS
const hdr10plus_api *m_hdr10plus_api;
+ uint8_t **cim;
+ int numCimInfo;
#endif

x265_sei_payload m_prevTonemapPayload;

Encoder();
~Encoder()
{
-#ifdef ENABLE_DYNAMIC_HDR10
+#ifdef ENABLE_HDR10_PLUS
if (m_prevTonemapPayload.payload != NULL)
X265_FREE(m_prevTonemapPayload.payload);
#endif

int reconfigureParam(x265_param* encParam, x265_param* param);

+ void copyCtuInfo(x265_ctu_info_t** frameCtuInfo, int poc);
+
void getStreamHeaders(NALList& list, Entropy& sbacCoder, Bitstream& bs);

void fetchStats(x265_stats* stats, size_t statsSizeBytes);

void freeAnalysis2Pass(x265_analysis_2Pass* analysis, int sliceType);

- void readAnalysisFile(x265_analysis_data* analysis, int poc);
+ void readAnalysisFile(x265_analysis_data* analysis, int poc, const x265_picture* picIn);

void writeAnalysisFile(x265_analysis_data* pic, FrameData &curEncData);
void readAnalysis2PassFile(x265_analysis_2Pass* analysis2Pass, int poc, int sliceType);

x265_2.4.tar.gz/source/encoder/entropy.cpp -> x265_2.5.tar.gz/source/encoder/entropy.cpp
Changed
64
1
2
// TODO: Enable when pps_loop_filter_across_slices_enabled_flag==1
3
// We didn't support filter across slice board, so disable it now
4
5
- if (g_maxSlices <= 1)
6
+ if (encData.m_param->maxSlices <= 1)
7
{
8
bool isSAOEnabled = slice.m_sps->bUseSAO ? saoParam->bSaoFlag[0] || saoParam->bSaoFlag[1] : false;
9
bool isDBFEnabled = !slice.m_pps->bPicDisableDeblockingFilter;
10
11
if (cuSplitFlag)
12
codeSplitFlag(ctu, absPartIdx, depth);
13
14
- if (depth < ctu.m_cuDepth[absPartIdx] && depth < g_maxCUDepth)
15
+ if (depth < ctu.m_cuDepth[absPartIdx] && depth < ctu.m_encData->m_param->maxCUDepth)
16
{
17
uint32_t qNumParts = cuGeom.numPartitions >> 2;
18
if (depth == slice->m_pps->maxCuDQPDepth && slice->m_pps->bUseDQP)
19
20
case SIZE_nRx2N:
21
bits += bitsCodeBin(0, m_contextState[OFF_PART_SIZE_CTX + 0]);
22
bits += bitsCodeBin(0, m_contextState[OFF_PART_SIZE_CTX + 1]);
23
- if (depth == g_maxCUDepth && !(cu.m_log2CUSize[absPartIdx] == 3))
24
+ if (depth == cu.m_encData->m_param->maxCUDepth && !(cu.m_log2CUSize[absPartIdx] == 3))
25
bits += bitsCodeBin(1, m_contextState[OFF_PART_SIZE_CTX + 2]);
26
if (cu.m_slice->m_sps->maxAMPDepth > depth)
27
{
28
29
uint32_t cuAddr = ctu.getSCUAddr() + absPartIdx;
30
X265_CHECK(realEndAddress == slice->realEndAddress(slice->m_endCUAddr), "real end address expected\n");
31
32
- uint32_t granularityMask = g_maxCUSize - 1;
33
+ uint32_t granularityMask = ctu.m_encData->m_param->maxCUSize - 1;
34
uint32_t cuSize = 1 << ctu.m_log2CUSize[absPartIdx];
35
uint32_t rpelx = ctu.m_cuPelX + g_zscanToPelX[absPartIdx] + cuSize;
36
uint32_t bpely = ctu.m_cuPelY + g_zscanToPelY[absPartIdx] + cuSize;
37
38
{
39
// Encode slice finish
40
uint32_t bTerminateSlice = ctu.m_bLastCuInSlice;
41
- if (cuAddr + (NUM_4x4_PARTITIONS >> (depth << 1)) == realEndAddress)
42
+ if (cuAddr + (slice->m_param->num4x4Partitions >> (depth << 1)) == realEndAddress)
43
bTerminateSlice = 1;
44
45
// The 1-terminating bit is added to all streams, so don't add it here when it's 1.
46
47
48
if (cu.isIntra(absPartIdx))
49
{
50
- if (depth == g_maxCUDepth)
51
+ if (depth == cu.m_encData->m_param->maxCUDepth)
52
encodeBin(partSize == SIZE_2Nx2N ? 1 : 0, m_contextState[OFF_PART_SIZE_CTX]);
53
return;
54
}
55
56
case SIZE_nRx2N:
57
encodeBin(0, m_contextState[OFF_PART_SIZE_CTX + 0]);
58
encodeBin(0, m_contextState[OFF_PART_SIZE_CTX + 1]);
59
- if (depth == g_maxCUDepth && !(cu.m_log2CUSize[absPartIdx] == 3))
60
+ if (depth == cu.m_encData->m_param->maxCUDepth && !(cu.m_log2CUSize[absPartIdx] == 3))
61
encodeBin(1, m_contextState[OFF_PART_SIZE_CTX + 2]);
62
if (cu.m_slice->m_sps->maxAMPDepth > depth)
63
{
64
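Note on `slice->m_param->num4x4Partitions >> (depth << 1)` in the slice-termination check: the shift by twice the depth divides the per-CTU partition count by four for every quadtree level, giving the number of 4x4 partitions covered by one CU at that depth. A minimal sketch, with a hypothetical helper name that is not part of x265:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of 4x4 partitions covered by one CU at `depth`,
// given the number of 4x4 partitions in a whole CTU. Mirrors the expression
// in the diff: num4x4Partitions >> (depth << 1).
uint32_t partitionsAtDepth(uint32_t num4x4Partitions, uint32_t depth)
{
    assert((depth << 1) < 32);
    // Each split quarters the CU area, so shift right by two bits per level.
    return num4x4Partitions >> (depth << 1);
}
```

A 64x64 CTU has 256 such partitions, so a depth-1 (32x32) CU spans 64 of them; adding that span to cuAddr gives the address just past the CU, which the diff compares against realEndAddress.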
x265_2.4.tar.gz/source/encoder/frameencoder.cpp -> x265_2.5.tar.gz/source/encoder/frameencoder.cpp
Changed
201
1
2
range += !!(m_param->searchMethod < 2); /* diamond/hex range check lag */
3
range += NTAPS_LUMA / 2; /* subpel filter half-length */
4
range += 2 + (MotionEstimate::hpelIterationCount(m_param->subpelRefine) + 1) / 2; /* subpel refine steps */
5
- m_refLagRows = /*(m_param->maxSlices > 1 ? 1 : 0) +*/ 1 + ((range + g_maxCUSize - 1) / g_maxCUSize);
6
+ m_refLagRows = /*(m_param->maxSlices > 1 ? 1 : 0) +*/ 1 + ((range + m_param->maxCUSize - 1) / m_param->maxCUSize);
7
8
// NOTE: 2 times of numRows because both Encoder and Filter in same queue
9
if (!WaveFront::init(m_numRows * 2))
10
11
12
while (m_threadActive)
13
{
14
+ if (m_param->bCTUInfo)
15
+ {
16
+ while (!m_frame->m_ctuInfo)
17
+ m_frame->m_copied.wait();
18
+ }
19
compressFrame();
20
m_done.trigger(); /* FrameEncoder::getEncodedPicture() blocks for this event */
21
m_enable.wait();
22
23
bool bUseWeightB = slice->m_sliceType == B_SLICE && slice->m_pps->bUseWeightedBiPred;
24
25
WeightParam* reuseWP = NULL;
26
- if (m_param->analysisMode && (bUseWeightP || bUseWeightB))
27
+ if (m_param->analysisReuseMode && (bUseWeightP || bUseWeightB))
28
reuseWP = (WeightParam*)m_frame->m_analysisData.wt;
29
30
if (bUseWeightP || bUseWeightB)
31
32
m_cuStats.countWeightAnalyze++;
33
ScopedElapsedTime time(m_cuStats.weightAnalyzeTime);
34
#endif
35
- if (m_param->analysisMode == X265_ANALYSIS_LOAD)
36
+ if (m_param->analysisReuseMode == X265_ANALYSIS_LOAD)
37
{
38
for (int list = 0; list < slice->isInterB() + 1; list++)
39
{
40
41
slice->m_refReconPicList[l][ref] = slice->m_refFrameList[l][ref]->m_reconPic;
42
m_mref[l][ref].init(slice->m_refReconPicList[l][ref], w, *m_param);
43
}
44
- if (m_param->analysisMode == X265_ANALYSIS_SAVE && (bUseWeightP || bUseWeightB))
45
+ if (m_param->analysisReuseMode == X265_ANALYSIS_SAVE && (bUseWeightP || bUseWeightB))
46
{
47
for (int i = 0; i < (m_param->internalCsp != X265_CSP_I400 ? 3 : 1); i++)
48
*(reuseWP++) = slice->m_weightPredTable[l][0][i];
49
50
if (writeSei)
51
{
52
SEICreativeIntentMeta sei;
53
- sei.cim = payload->payload;
54
+ sei.m_payload = payload->payload;
55
m_bs.resetBits();
56
sei.setSize(payload->payloadSize);
57
sei.write(m_bs, *slice->m_sps);
58
59
}
60
else if (m_param->decodedPictureHashSEI == 3)
61
{
62
- uint32_t cuHeight = g_maxCUSize;
63
+ uint32_t cuHeight = m_param->maxCUSize;
64
65
m_checksum[0] = 0;
66
67
68
m_frame->m_encData->m_frameStats.percent8x8Inter = (double)totalP / totalCuCount;
69
m_frame->m_encData->m_frameStats.percent8x8Skip = (double)totalSkip / totalCuCount;
70
}
71
- for (uint32_t i = 0; i < m_numRows; i++)
72
+
73
+ if (m_param->csvLogLevel >= 1)
74
{
75
- m_frame->m_encData->m_frameStats.cntIntraNxN += m_rows[i].rowStats.cntIntraNxN;
76
- m_frame->m_encData->m_frameStats.totalCu += m_rows[i].rowStats.totalCu;
77
- m_frame->m_encData->m_frameStats.totalCtu += m_rows[i].rowStats.totalCtu;
78
- m_frame->m_encData->m_frameStats.lumaDistortion += m_rows[i].rowStats.lumaDistortion;
79
- m_frame->m_encData->m_frameStats.chromaDistortion += m_rows[i].rowStats.chromaDistortion;
80
- m_frame->m_encData->m_frameStats.psyEnergy += m_rows[i].rowStats.psyEnergy;
81
- m_frame->m_encData->m_frameStats.ssimEnergy += m_rows[i].rowStats.ssimEnergy;
82
- m_frame->m_encData->m_frameStats.resEnergy += m_rows[i].rowStats.resEnergy;
83
- for (uint32_t depth = 0; depth <= g_maxCUDepth; depth++)
84
+ for (uint32_t i = 0; i < m_numRows; i++)
85
{
86
- m_frame->m_encData->m_frameStats.cntSkipCu[depth] += m_rows[i].rowStats.cntSkipCu[depth];
87
- m_frame->m_encData->m_frameStats.cntMergeCu[depth] += m_rows[i].rowStats.cntMergeCu[depth];
88
- for (int m = 0; m < INTER_MODES; m++)
89
- m_frame->m_encData->m_frameStats.cuInterDistribution[depth][m] += m_rows[i].rowStats.cuInterDistribution[depth][m];
90
+ m_frame->m_encData->m_frameStats.cntIntraNxN += m_rows[i].rowStats.cntIntraNxN;
91
+ m_frame->m_encData->m_frameStats.totalCu += m_rows[i].rowStats.totalCu;
92
+ m_frame->m_encData->m_frameStats.totalCtu += m_rows[i].rowStats.totalCtu;
93
+ m_frame->m_encData->m_frameStats.lumaDistortion += m_rows[i].rowStats.lumaDistortion;
94
+ m_frame->m_encData->m_frameStats.chromaDistortion += m_rows[i].rowStats.chromaDistortion;
95
+ m_frame->m_encData->m_frameStats.psyEnergy += m_rows[i].rowStats.psyEnergy;
96
+ m_frame->m_encData->m_frameStats.ssimEnergy += m_rows[i].rowStats.ssimEnergy;
97
+ m_frame->m_encData->m_frameStats.resEnergy += m_rows[i].rowStats.resEnergy;
98
+ for (uint32_t depth = 0; depth <= m_param->maxCUDepth; depth++)
99
+ {
100
+ m_frame->m_encData->m_frameStats.cntSkipCu[depth] += m_rows[i].rowStats.cntSkipCu[depth];
101
+ m_frame->m_encData->m_frameStats.cntMergeCu[depth] += m_rows[i].rowStats.cntMergeCu[depth];
102
+ for (int m = 0; m < INTER_MODES; m++)
103
+ m_frame->m_encData->m_frameStats.cuInterDistribution[depth][m] += m_rows[i].rowStats.cuInterDistribution[depth][m];
104
+ for (int n = 0; n < INTRA_MODES; n++)
105
+ m_frame->m_encData->m_frameStats.cuIntraDistribution[depth][n] += m_rows[i].rowStats.cuIntraDistribution[depth][n];
106
+ }
107
+ }
108
+ m_frame->m_encData->m_frameStats.percentIntraNxN = (double)(m_frame->m_encData->m_frameStats.cntIntraNxN * 100) / m_frame->m_encData->m_frameStats.totalCu;
109
+
110
+ for (uint32_t depth = 0; depth <= m_param->maxCUDepth; depth++)
111
+ {
112
+ m_frame->m_encData->m_frameStats.percentSkipCu[depth] = (double)(m_frame->m_encData->m_frameStats.cntSkipCu[depth] * 100) / m_frame->m_encData->m_frameStats.totalCu;
113
+ m_frame->m_encData->m_frameStats.percentMergeCu[depth] = (double)(m_frame->m_encData->m_frameStats.cntMergeCu[depth] * 100) / m_frame->m_encData->m_frameStats.totalCu;
114
for (int n = 0; n < INTRA_MODES; n++)
115
- m_frame->m_encData->m_frameStats.cuIntraDistribution[depth][n] += m_rows[i].rowStats.cuIntraDistribution[depth][n];
116
+ m_frame->m_encData->m_frameStats.percentIntraDistribution[depth][n] = (double)(m_frame->m_encData->m_frameStats.cuIntraDistribution[depth][n] * 100) / m_frame->m_encData->m_frameStats.totalCu;
117
+ uint64_t cuInterRectCnt = 0; // sum of Nx2N, 2NxN counts
118
+ cuInterRectCnt += m_frame->m_encData->m_frameStats.cuInterDistribution[depth][1] + m_frame->m_encData->m_frameStats.cuInterDistribution[depth][2];
119
+ m_frame->m_encData->m_frameStats.percentInterDistribution[depth][0] = (double)(m_frame->m_encData->m_frameStats.cuInterDistribution[depth][0] * 100) / m_frame->m_encData->m_frameStats.totalCu;
120
+ m_frame->m_encData->m_frameStats.percentInterDistribution[depth][1] = (double)(cuInterRectCnt * 100) / m_frame->m_encData->m_frameStats.totalCu;
121
+ m_frame->m_encData->m_frameStats.percentInterDistribution[depth][2] = (double)(m_frame->m_encData->m_frameStats.cuInterDistribution[depth][3] * 100) / m_frame->m_encData->m_frameStats.totalCu;
122
}
123
}
124
- m_frame->m_encData->m_frameStats.avgLumaDistortion = (double)(m_frame->m_encData->m_frameStats.lumaDistortion) / m_frame->m_encData->m_frameStats.totalCtu;
125
- m_frame->m_encData->m_frameStats.avgChromaDistortion = (double)(m_frame->m_encData->m_frameStats.chromaDistortion) / m_frame->m_encData->m_frameStats.totalCtu;
126
- m_frame->m_encData->m_frameStats.avgPsyEnergy = (double)(m_frame->m_encData->m_frameStats.psyEnergy) / m_frame->m_encData->m_frameStats.totalCtu;
127
- m_frame->m_encData->m_frameStats.avgSsimEnergy = (double)(m_frame->m_encData->m_frameStats.ssimEnergy) / m_frame->m_encData->m_frameStats.totalCtu;
128
- m_frame->m_encData->m_frameStats.avgResEnergy = (double)(m_frame->m_encData->m_frameStats.resEnergy) / m_frame->m_encData->m_frameStats.totalCtu;
129
- m_frame->m_encData->m_frameStats.percentIntraNxN = (double)(m_frame->m_encData->m_frameStats.cntIntraNxN * 100) / m_frame->m_encData->m_frameStats.totalCu;
130
- for (uint32_t depth = 0; depth <= g_maxCUDepth; depth++)
131
+
132
+ if (m_param->csvLogLevel >= 2)
133
{
134
- m_frame->m_encData->m_frameStats.percentSkipCu[depth] = (double)(m_frame->m_encData->m_frameStats.cntSkipCu[depth] * 100) / m_frame->m_encData->m_frameStats.totalCu;
135
- m_frame->m_encData->m_frameStats.percentMergeCu[depth] = (double)(m_frame->m_encData->m_frameStats.cntMergeCu[depth] * 100) / m_frame->m_encData->m_frameStats.totalCu;
136
- for (int n = 0; n < INTRA_MODES; n++)
137
- m_frame->m_encData->m_frameStats.percentIntraDistribution[depth][n] = (double)(m_frame->m_encData->m_frameStats.cuIntraDistribution[depth][n] * 100) / m_frame->m_encData->m_frameStats.totalCu;
138
- uint64_t cuInterRectCnt = 0; // sum of Nx2N, 2NxN counts
139
- cuInterRectCnt += m_frame->m_encData->m_frameStats.cuInterDistribution[depth][1] + m_frame->m_encData->m_frameStats.cuInterDistribution[depth][2];
140
- m_frame->m_encData->m_frameStats.percentInterDistribution[depth][0] = (double)(m_frame->m_encData->m_frameStats.cuInterDistribution[depth][0] * 100) / m_frame->m_encData->m_frameStats.totalCu;
141
- m_frame->m_encData->m_frameStats.percentInterDistribution[depth][1] = (double)(cuInterRectCnt * 100) / m_frame->m_encData->m_frameStats.totalCu;
142
- m_frame->m_encData->m_frameStats.percentInterDistribution[depth][2] = (double)(m_frame->m_encData->m_frameStats.cuInterDistribution[depth][3] * 100) / m_frame->m_encData->m_frameStats.totalCu;
143
+ m_frame->m_encData->m_frameStats.avgLumaDistortion = (double)(m_frame->m_encData->m_frameStats.lumaDistortion) / m_frame->m_encData->m_frameStats.totalCtu;
144
+ m_frame->m_encData->m_frameStats.avgChromaDistortion = (double)(m_frame->m_encData->m_frameStats.chromaDistortion) / m_frame->m_encData->m_frameStats.totalCtu;
145
+ m_frame->m_encData->m_frameStats.avgPsyEnergy = (double)(m_frame->m_encData->m_frameStats.psyEnergy) / m_frame->m_encData->m_frameStats.totalCtu;
146
+ m_frame->m_encData->m_frameStats.avgSsimEnergy = (double)(m_frame->m_encData->m_frameStats.ssimEnergy) / m_frame->m_encData->m_frameStats.totalCtu;
147
+ m_frame->m_encData->m_frameStats.avgResEnergy = (double)(m_frame->m_encData->m_frameStats.resEnergy) / m_frame->m_encData->m_frameStats.totalCtu;
148
}
149
150
m_bs.resetBits();
151
152
/* Accumulate CU statistics from each worker thread, we could report
153
* per-frame stats here, but currently we do not. */
154
for (int i = 0; i < numTLD; i++)
155
- m_cuStats.accumulate(m_tld[i].analysis.m_stats[m_jpId]);
156
+ m_cuStats.accumulate(m_tld[i].analysis.m_stats[m_jpId], *m_param);
#endif

m_endFrameTime = x265_mdate();

{
Slice* slice = m_frame->m_encData->m_slice;
const uint32_t widthInLCUs = slice->m_sps->numCuInWidth;
- const uint32_t lastCUAddr = (slice->m_endCUAddr + NUM_4x4_PARTITIONS - 1) / NUM_4x4_PARTITIONS;
+ const uint32_t lastCUAddr = (slice->m_endCUAddr + m_param->num4x4Partitions - 1) / m_param->num4x4Partitions;
const uint32_t numSubstreams = m_param->bEnableWavefront ? slice->m_sps->numCuInHeight : 1;

SAOParam* saoParam = slice->m_sps->bUseSAO ? m_frame->m_encData->m_saoParam : NULL;

const uint32_t row = (uint32_t)intRow;
CTURow& curRow = m_rows[row];

- tld.analysis.m_param = m_param;
if (m_param->bEnableWavefront)
{
ScopedLock self(curRow.lock);

uint32_t maxBlockCols = (m_frame->m_fencPic->m_picWidth + (16 - 1)) / 16;
uint32_t maxBlockRows = (m_frame->m_fencPic->m_picHeight + (16 - 1)) / 16;
- uint32_t noOfBlocks = g_maxCUSize / 16;
+ uint32_t noOfBlocks = m_param->maxCUSize / 16;
const uint32_t bFirstRowInSlice = ((row == 0) || (m_rows[row - 1].sliceId != curRow.sliceId)) ? 1 : 0;
const uint32_t bLastRowInSlice = ((row == m_numRows - 1) || (m_rows[row + 1].sliceId != curRow.sliceId)) ? 1 : 0;
const uint32_t sliceId = curRow.sliceId;

// TODO: specially case handle on first and last row

// Initialize restrict on MV range in slices
- tld.analysis.m_sliceMinY = -(int16_t)(rowInSlice * g_maxCUSize * 4) + 3 * 4;
- tld.analysis.m_sliceMaxY = (int16_t)((endRowInSlicePlus1 - 1 - row) * (g_maxCUSize * 4) - 4 * 4);
+ tld.analysis.m_sliceMinY = -(int16_t)(rowInSlice * m_param->maxCUSize * 4) + 3 * 4;
+ tld.analysis.m_sliceMaxY = (int16_t)((endRowInSlicePlus1 - 1 - row) * (m_param->maxCUSize * 4) - 4 * 4);

// Handle single row slice
if (tld.analysis.m_sliceMaxY < tld.analysis.m_sliceMinY)

cuStat.baseQp = curEncData.m_rowStat[row].rowQp;

/* TODO: use defines from slicetype.h for lowres block size */
x265_2.4.tar.gz/source/encoder/framefilter.cpp -> x265_2.5.tar.gz/source/encoder/framefilter.cpp
static uint64_t computeSSD(pixel *fenc, pixel *rec, intptr_t stride, uint32_t width, uint32_t height);
static float calculateSSIM(pixel *pix1, intptr_t stride1, pixel *pix2, intptr_t stride2, uint32_t width, uint32_t height, void *buf, uint32_t& cnt);

-static void integral_init4h(uint32_t *sum, pixel *pix, intptr_t stride)
+namespace X265_NS
{
- int32_t v = pix[0] + pix[1] + pix[2] + pix[3];
- for (int16_t x = 0; x < stride - 4; x++)
+ static void integral_init4h_c(uint32_t *sum, pixel *pix, intptr_t stride)
{
- sum[x] = v + sum[x - stride];
- v += pix[x + 4] - pix[x];
+ int32_t v = pix[0] + pix[1] + pix[2] + pix[3];
+ for (int16_t x = 0; x < stride - 4; x++)
+ {
+ sum[x] = v + sum[x - stride];
+ v += pix[x + 4] - pix[x];
+ }
}
-}

-static void integral_init8h(uint32_t *sum, pixel *pix, intptr_t stride)
-{
- int32_t v = pix[0] + pix[1] + pix[2] + pix[3] + pix[4] + pix[5] + pix[6] + pix[7];
- for (int16_t x = 0; x < stride - 8; x++)
+ static void integral_init8h_c(uint32_t *sum, pixel *pix, intptr_t stride)
{
- sum[x] = v + sum[x - stride];
- v += pix[x + 8] - pix[x];
+ int32_t v = pix[0] + pix[1] + pix[2] + pix[3] + pix[4] + pix[5] + pix[6] + pix[7];
+ for (int16_t x = 0; x < stride - 8; x++)
+ {
+ sum[x] = v + sum[x - stride];
+ v += pix[x + 8] - pix[x];
+ }
}
-}

-static void integral_init12h(uint32_t *sum, pixel *pix, intptr_t stride)
-{
- int32_t v = pix[0] + pix[1] + pix[2] + pix[3] + pix[4] + pix[5] + pix[6] + pix[7] +
- pix[8] + pix[9] + pix[10] + pix[11];
- for (int16_t x = 0; x < stride - 12; x++)
+ static void integral_init12h_c(uint32_t *sum, pixel *pix, intptr_t stride)
{
- sum[x] = v + sum[x - stride];
- v += pix[x + 12] - pix[x];
+ int32_t v = pix[0] + pix[1] + pix[2] + pix[3] + pix[4] + pix[5] + pix[6] + pix[7] +
+ pix[8] + pix[9] + pix[10] + pix[11];
+ for (int16_t x = 0; x < stride - 12; x++)
+ {
+ sum[x] = v + sum[x - stride];
+ v += pix[x + 12] - pix[x];
+ }
}
-}

-static void integral_init16h(uint32_t *sum, pixel *pix, intptr_t stride)
-{
- int32_t v = pix[0] + pix[1] + pix[2] + pix[3] + pix[4] + pix[5] + pix[6] + pix[7] +
- pix[8] + pix[9] + pix[10] + pix[11] + pix[12] + pix[13] + pix[14] + pix[15];
- for (int16_t x = 0; x < stride - 16; x++)
+ static void integral_init16h_c(uint32_t *sum, pixel *pix, intptr_t stride)
{
- sum[x] = v + sum[x - stride];
- v += pix[x + 16] - pix[x];
+ int32_t v = pix[0] + pix[1] + pix[2] + pix[3] + pix[4] + pix[5] + pix[6] + pix[7] +
+ pix[8] + pix[9] + pix[10] + pix[11] + pix[12] + pix[13] + pix[14] + pix[15];
+ for (int16_t x = 0; x < stride - 16; x++)
+ {
+ sum[x] = v + sum[x - stride];
+ v += pix[x + 16] - pix[x];
+ }
}
-}

-static void integral_init24h(uint32_t *sum, pixel *pix, intptr_t stride)
-{
- int32_t v = pix[0] + pix[1] + pix[2] + pix[3] + pix[4] + pix[5] + pix[6] + pix[7] +
- pix[8] + pix[9] + pix[10] + pix[11] + pix[12] + pix[13] + pix[14] + pix[15] +
- pix[16] + pix[17] + pix[18] + pix[19] + pix[20] + pix[21] + pix[22] + pix[23];
- for (int16_t x = 0; x < stride - 24; x++)
+ static void integral_init24h_c(uint32_t *sum, pixel *pix, intptr_t stride)
{
- sum[x] = v + sum[x - stride];
- v += pix[x + 24] - pix[x];
+ int32_t v = pix[0] + pix[1] + pix[2] + pix[3] + pix[4] + pix[5] + pix[6] + pix[7] +
+ pix[8] + pix[9] + pix[10] + pix[11] + pix[12] + pix[13] + pix[14] + pix[15] +
+ pix[16] + pix[17] + pix[18] + pix[19] + pix[20] + pix[21] + pix[22] + pix[23];
+ for (int16_t x = 0; x < stride - 24; x++)
+ {
+ sum[x] = v + sum[x - stride];
+ v += pix[x + 24] - pix[x];
+ }
}
-}

-static void integral_init32h(uint32_t *sum, pixel *pix, intptr_t stride)
-{
- int32_t v = pix[0] + pix[1] + pix[2] + pix[3] + pix[4] + pix[5] + pix[6] + pix[7] +
- pix[8] + pix[9] + pix[10] + pix[11] + pix[12] + pix[13] + pix[14] + pix[15] +
- pix[16] + pix[17] + pix[18] + pix[19] + pix[20] + pix[21] + pix[22] + pix[23] +
- pix[24] + pix[25] + pix[26] + pix[27] + pix[28] + pix[29] + pix[30] + pix[31];
- for (int16_t x = 0; x < stride - 32; x++)
+ static void integral_init32h_c(uint32_t *sum, pixel *pix, intptr_t stride)
{
- sum[x] = v + sum[x - stride];
- v += pix[x + 32] - pix[x];
+ int32_t v = pix[0] + pix[1] + pix[2] + pix[3] + pix[4] + pix[5] + pix[6] + pix[7] +
+ pix[8] + pix[9] + pix[10] + pix[11] + pix[12] + pix[13] + pix[14] + pix[15] +
+ pix[16] + pix[17] + pix[18] + pix[19] + pix[20] + pix[21] + pix[22] + pix[23] +
+ pix[24] + pix[25] + pix[26] + pix[27] + pix[28] + pix[29] + pix[30] + pix[31];
+ for (int16_t x = 0; x < stride - 32; x++)
+ {
+ sum[x] = v + sum[x - stride];
+ v += pix[x + 32] - pix[x];
+ }
}
-}

-static void integral_init4v(uint32_t *sum4, intptr_t stride)
-{
- for (int x = 0; x < stride; x++)
- sum4[x] = sum4[x + 4 * stride] - sum4[x];
-}
+ static void integral_init4v_c(uint32_t *sum4, intptr_t stride)
+ {
+ for (int x = 0; x < stride; x++)
+ sum4[x] = sum4[x + 4 * stride] - sum4[x];
+ }

-static void integral_init8v(uint32_t *sum8, intptr_t stride)
-{
- for (int x = 0; x < stride; x++)
- sum8[x] = sum8[x + 8 * stride] - sum8[x];
-}
+ static void integral_init8v_c(uint32_t *sum8, intptr_t stride)
+ {
+ for (int x = 0; x < stride; x++)
+ sum8[x] = sum8[x + 8 * stride] - sum8[x];
+ }

-static void integral_init12v(uint32_t *sum12, intptr_t stride)
-{
- for (int x = 0; x < stride; x++)
- sum12[x] = sum12[x + 12 * stride] - sum12[x];
-}
+ static void integral_init12v_c(uint32_t *sum12, intptr_t stride)
+ {
+ for (int x = 0; x < stride; x++)
+ sum12[x] = sum12[x + 12 * stride] - sum12[x];
+ }

-static void integral_init16v(uint32_t *sum16, intptr_t stride)
-{
- for (int x = 0; x < stride; x++)
- sum16[x] = sum16[x + 16 * stride] - sum16[x];
-}
+ static void integral_init16v_c(uint32_t *sum16, intptr_t stride)
+ {
+ for (int x = 0; x < stride; x++)
+ sum16[x] = sum16[x + 16 * stride] - sum16[x];
+ }

-static void integral_init24v(uint32_t *sum24, intptr_t stride)
-{
- for (int x = 0; x < stride; x++)
- sum24[x] = sum24[x + 24 * stride] - sum24[x];
-}
+ static void integral_init24v_c(uint32_t *sum24, intptr_t stride)
+ {
+ for (int x = 0; x < stride; x++)
+ sum24[x] = sum24[x + 24 * stride] - sum24[x];
+ }

-static void integral_init32v(uint32_t *sum32, intptr_t stride)
-{
- for (int x = 0; x < stride; x++)
- sum32[x] = sum32[x + 32 * stride] - sum32[x];
+ static void integral_init32v_c(uint32_t *sum32, intptr_t stride)
+ {
+ for (int x = 0; x < stride; x++)
+ sum32[x] = sum32[x + 32 * stride] - sum32[x];
+ }
+
+ void setupSeaIntegralPrimitives_c(EncoderPrimitives &p)
+ {
+ p.integral_initv[INTEGRAL_4] = integral_init4v_c;
+ p.integral_initv[INTEGRAL_8] = integral_init8v_c;
+ p.integral_initv[INTEGRAL_12] = integral_init12v_c;
+ p.integral_initv[INTEGRAL_16] = integral_init16v_c;
+ p.integral_initv[INTEGRAL_24] = integral_init24v_c;
+ p.integral_initv[INTEGRAL_32] = integral_init32v_c;
+ p.integral_inith[INTEGRAL_4] = integral_init4h_c;
+ p.integral_inith[INTEGRAL_8] = integral_init8h_c;
+ p.integral_inith[INTEGRAL_12] = integral_init12h_c;
+ p.integral_inith[INTEGRAL_16] = integral_init16h_c;
+ p.integral_inith[INTEGRAL_24] = integral_init24h_c;
+ p.integral_inith[INTEGRAL_32] = integral_init32h_c;
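For readers unfamiliar with the SEA integral kernels moved into X265_NS above: the horizontal pass stores, for each column, a running N-wide row sum added to the entry directly above it, and the vertical pass then subtracts entries N rows apart, which leaves an N x N box sum. The following is a minimal standalone sketch of the 4-wide case only. It assumes 8-bit pixels and std::vector buffers, and the names integralInit4h, integralInit4v and boxSums4x4 are hypothetical, not x265 API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Horizontal pass for one row (4-wide case): sum[x] becomes the sum of
// pix[x..x+3] plus the entry directly above it. `sum` points at the current
// row; the previous row is at sum - stride, as in the diff.
static void integralInit4h(uint32_t* sum, const uint8_t* pix, std::ptrdiff_t stride)
{
    uint32_t v = pix[0] + pix[1] + pix[2] + pix[3];
    for (std::ptrdiff_t x = 0; x < stride - 4; x++)
    {
        sum[x] = v + sum[x - stride];
        v += pix[x + 4] - pix[x]; // slide the window one pixel to the right
    }
}

// Vertical pass for one row: subtracting entries 4 rows apart leaves the
// 4x4 box sum whose top-left corner is (x, this row).
static void integralInit4v(uint32_t* sum4, std::ptrdiff_t stride)
{
    for (std::ptrdiff_t x = 0; x < stride; x++)
        sum4[x] = sum4[x + 4 * stride] - sum4[x];
}

// Hypothetical driver: entry [y * width + x] of the result holds the 4x4 box
// sum at (x, y) for y <= height - 4 and x < width - 4.
static std::vector<uint32_t> boxSums4x4(const std::vector<uint8_t>& pix, int width, int height)
{
    assert(width > 4 && height >= 4);
    assert(pix.size() == static_cast<std::size_t>(width) * height);
    const std::ptrdiff_t stride = width;
    // One extra all-zero row on top so the first row has something above it.
    std::vector<uint32_t> sum(static_cast<std::size_t>(stride) * (height + 1), 0);
    for (int r = 0; r < height; r++)
        integralInit4h(sum.data() + (r + 1) * stride, pix.data() + r * stride, stride);
    for (int y = 0; y + 4 <= height; y++)
        integralInit4v(sum.data() + y * stride, stride);
    return sum;
}
```

The vertical pass runs top to bottom so that row y + 4 is still an unmodified running sum when row y reads it. The sketch leaves the last few columns of each row at zero because, like the loops in the diff, the horizontal pass stops at stride - 4.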
x265_2.4.tar.gz/source/encoder/framefilter.h -> x265_2.5.tar.gz/source/encoder/framefilter.h
uint32_t getCUWidth(int colNum) const
{
- return (colNum == (int)m_numCols - 1) ? m_lastWidth : g_maxCUSize;
+ return (colNum == (int)m_numCols - 1) ? m_lastWidth : m_param->maxCUSize;
}

void init(Encoder *top, FrameEncoder *frame, int numRows, uint32_t numCols);
x265_2.4.tar.gz/source/encoder/motion.cpp -> x265_2.5.tar.gz/source/encoder/motion.cpp
}
}

+void MotionEstimate::refineMV(ReferencePlanes* ref,
+ const MV& mvmin,
+ const MV& mvmax,
+ const MV& qmvp,
+ MV& outQMv)
+{
+ ALIGN_VAR_16(int, costs[16]);
+ if (ctuAddr >= 0)
+ blockOffset = ref->reconPic->getLumaAddr(ctuAddr, absPartIdx) - ref->reconPic->getLumaAddr(0);
+ intptr_t stride = ref->lumaStride;
+ pixel* fenc = fencPUYuv.m_buf[0];
+ pixel* fref = ref->fpelPlane[0] + blockOffset;
+
+ setMVP(qmvp);
+
+ MV qmvmin = mvmin.toQPel();
+ MV qmvmax = mvmax.toQPel();
+
+ /* The term cost used here means satd/sad values for that particular search.
+ * The costs used in ME integer search only includes the SAD cost of motion
+ * residual and sqrtLambda times MVD bits. The subpel refine steps use SATD
+ * cost of residual and sqrtLambda * MVD bits.
+ */
+
+ // measure SATD cost at clipped QPEL MVP
+ MV pmv = qmvp.clipped(qmvmin, qmvmax);
+ MV bestpre = pmv;
+ int bprecost;
+
+ bprecost = subpelCompare(ref, pmv, sad);
+
+ /* re-measure full pel rounded MVP with SAD as search start point */
+ MV bmv = pmv.roundToFPel();
+ int bcost = bprecost;
+ if (pmv.isSubpel())
+ bcost = sad(fenc, FENC_STRIDE, fref + bmv.x + bmv.y * stride, stride) + mvcost(bmv << 2);
+
+ /* square refine */
+ int dir = 0;
+ COST_MV_X4_DIR(0, -1, 0, 1, -1, 0, 1, 0, costs);
+ if ((bmv.y - 1 >= mvmin.y) & (bmv.y - 1 <= mvmax.y))
+ COPY2_IF_LT(bcost, costs[0], dir, 1);
+ if ((bmv.y + 1 >= mvmin.y) & (bmv.y + 1 <= mvmax.y))
+ COPY2_IF_LT(bcost, costs[1], dir, 2);
+ COPY2_IF_LT(bcost, costs[2], dir, 3);
+ COPY2_IF_LT(bcost, costs[3], dir, 4);
+ COST_MV_X4_DIR(-1, -1, -1, 1, 1, -1, 1, 1, costs);
+ if ((bmv.y - 1 >= mvmin.y) & (bmv.y - 1 <= mvmax.y))
+ COPY2_IF_LT(bcost, costs[0], dir, 5);
+ if ((bmv.y + 1 >= mvmin.y) & (bmv.y + 1 <= mvmax.y))
+ COPY2_IF_LT(bcost, costs[1], dir, 6);
+ if ((bmv.y - 1 >= mvmin.y) & (bmv.y - 1 <= mvmax.y))
+ COPY2_IF_LT(bcost, costs[2], dir, 7);
+ if ((bmv.y + 1 >= mvmin.y) & (bmv.y + 1 <= mvmax.y))
+ COPY2_IF_LT(bcost, costs[3], dir, 8);
+ bmv += square1[dir];
+
+ if (bprecost < bcost)
+ {
+ bmv = bestpre;
+ bcost = bprecost;
+ }
+ else
+ bmv = bmv.toQPel(); // promote search bmv to qpel
+
+ // TO DO: Change SubpelWorkload to fine tune MV
+ // Now it is set to 5 for experiment.
+ // const SubpelWorkload& wl = workload[this->subpelRefine];
+ const SubpelWorkload& wl = workload[5];
+
+ pixelcmp_t hpelcomp;
+
+ if (wl.hpel_satd)
+ {
+ bcost = subpelCompare(ref, bmv, satd) + mvcost(bmv);
+ hpelcomp = satd;
+ }
+ else
+ hpelcomp = sad;
+
+ for (int iter = 0; iter < wl.hpel_iters; iter++)
+ {
+ int bdir = 0;
+ for (int i = 1; i <= wl.hpel_dirs; i++)
+ {
+ MV qmv = bmv + square1[i] * 2;
+
+ // check mv range for slice bound
+ if ((qmv.y < qmvmin.y) | (qmv.y > qmvmax.y))
+ continue;
+
+ int cost = subpelCompare(ref, qmv, hpelcomp) + mvcost(qmv);
+ COPY2_IF_LT(bcost, cost, bdir, i);
+ }
+
+ if (bdir)
+ bmv += square1[bdir] * 2;
+ else
+ break;
+ }
+
+ /* if HPEL search used SAD, remeasure with SATD before QPEL */
+ if (!wl.hpel_satd)
+ bcost = subpelCompare(ref, bmv, satd) + mvcost(bmv);
+
+ for (int iter = 0; iter < wl.qpel_iters; iter++)
+ {
+ int bdir = 0;
+ for (int i = 1; i <= wl.qpel_dirs; i++)
+ {
+ MV qmv = bmv + square1[i];
+
+ // check mv range for slice bound
+ if ((qmv.y < qmvmin.y) | (qmv.y > qmvmax.y))
+ continue;
+
+ int cost = subpelCompare(ref, qmv, satd) + mvcost(qmv);
+ COPY2_IF_LT(bcost, cost, bdir, i);
+ }
+
+ if (bdir)
+ bmv += square1[bdir];
+ else
+ break;
+ }
+
+ // check mv range for slice bound
+ X265_CHECK(((pmv.y >= qmvmin.y) & (pmv.y <= qmvmax.y)), "mv beyond range!");
+
+ x265_emms();
+ outQMv = bmv;
+}
+
int MotionEstimate::motionEstimate(ReferencePlanes *ref,
const MV & mvmin,
const MV & mvmax,

const MV * mvc,
int merange,
MV & outQMv,
+ uint32_t maxSlices,
pixel * srcReferencePlane)
{
ALIGN_VAR_16(int, costs[16]);

const SubpelWorkload& wl = workload[this->subpelRefine];

// check mv range for slice bound
- if ((g_maxSlices > 1) & ((bmv.y < qmvmin.y) | (bmv.y > qmvmax.y)))
+ if ((maxSlices > 1) & ((bmv.y < qmvmin.y) | (bmv.y > qmvmax.y)))
{
bmv.y = x265_min(x265_max(bmv.y, qmvmin.y), qmvmax.y);
bcost = subpelCompare(ref, bmv, satd) + mvcost(bmv);
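The half-pel and quarter-pel loops in the new refineMV() above share one pattern: evaluate the neighbours of the current best vector, skip candidates whose y component falls outside the slice bounds, move to the cheapest strictly better candidate, and stop when no neighbour improves. A simplified sketch of that pattern follows. It is an assumption-laden illustration, not the x265 implementation: Vec2, kSquare and refineSquare are hypothetical names, the offset table is ordered after the COST_MV_X4_DIR arguments in the diff, the step size is fixed at 1, and the cost is an arbitrary caller-supplied function instead of SAD/SATD plus MV bits.

```cpp
#include <cassert>
#include <functional>

struct Vec2 { int x; int y; };

// Index 0 is "no move"; 1..8 are the eight neighbours of a square pattern.
static const Vec2 kSquare[9] = {
    {0, 0}, {0, -1}, {0, 1}, {-1, 0}, {1, 0}, {-1, -1}, {-1, 1}, {1, -1}, {1, 1}
};

// Iteratively move `best` to its cheapest neighbour until no neighbour is
// strictly cheaper or maxIters is reached. Candidates with y outside
// [minY, maxY] are skipped, like the slice-bound check in the diff.
static Vec2 refineSquare(Vec2 best, int minY, int maxY, int maxIters,
                         const std::function<int(Vec2)>& cost)
{
    int bestCost = cost(best);
    for (int iter = 0; iter < maxIters; iter++)
    {
        int bestDir = 0;
        for (int i = 1; i <= 8; i++)
        {
            Vec2 cand = { best.x + kSquare[i].x, best.y + kSquare[i].y };
            if (cand.y < minY || cand.y > maxY)
                continue;
            int c = cost(cand);
            if (c < bestCost)
            {
                bestCost = c;
                bestDir = i;
            }
        }
        if (!bestDir)
            break; // local minimum: no neighbour improved the cost
        best.x += kSquare[bestDir].x;
        best.y += kSquare[bestDir].y;
    }
    return best;
}
```

With a convex cost such as squared distance to a target, the loop walks straight to the target; with a tighter y bound it stops at the closest admissible row.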
x265_2.4.tar.gz/source/encoder/motion.h -> x265_2.5.tar.gz/source/encoder/motion.h
chromaSatd(refYuv.getCrAddr(puPartIdx), refYuv.m_csize, fencPUYuv.m_buf[2], fencPUYuv.m_csize);
}

- int motionEstimate(ReferencePlanes* ref, const MV & mvmin, const MV & mvmax, const MV & qmvp, int numCandidates, const MV * mvc, int merange, MV & outQMv, pixel *srcReferencePlane = 0);
+ void refineMV(ReferencePlanes* ref, const MV& mvmin, const MV& mvmax, const MV& qmvp, MV& outQMv);
+ int motionEstimate(ReferencePlanes* ref, const MV & mvmin, const MV & mvmax, const MV & qmvp, int numCandidates, const MV * mvc, int merange, MV & outQMv, uint32_t maxSlices, pixel *srcReferencePlane = 0);

int subpelCompare(ReferencePlanes* ref, const MV &qmv, pixelcmp_t);
x265_2.4.tar.gz/source/encoder/ratecontrol.cpp -> x265_2.5.tar.gz/source/encoder/ratecontrol.cpp
uint32_t refRowSatdCost = 0, refRowBits = 0, intraCostForPendingCus = 0;
double refQScale = 0;

- if (picType != I_SLICE)
+ if (picType != I_SLICE && !m_param->rc.bEnableConstVbv)
{
FrameData& refEncData = *refFrame->m_encData;
uint32_t endCuAddr = maxCols * (row + 1);

&& refFrame
&& refFrame->m_encData->m_slice->m_sliceType == picType
&& refQScale > 0
- && refRowSatdCost > 0)
+ && refRowBits > 0
+ && !m_param->rc.bEnableConstVbv)
{
if (abs((int32_t)(refRowSatdCost - satdCostForPendingCus)) < (int32_t)satdCostForPendingCus / 2)
{

}
rowSatdCost >>= X265_DEPTH - 8;
updatePredictor(rce->rowPred[0], qScaleVbv, (double)rowSatdCost, encodedBits);
- if (curEncData.m_slice->m_sliceType != I_SLICE)
+ if (curEncData.m_slice->m_sliceType != I_SLICE && !m_param->rc.bEnableConstVbv)
{
Frame* refFrame = curEncData.m_slice->m_refFrameList[0][0];
if (qpVbv < refFrame->m_encData->m_rowStat[row].rowQp)

for (uint32_t i = 0; i < slice->m_sps->numCuInHeight; i++)
avgQpAq += curEncData.m_rowStat[i].sumQpAq;

- avgQpAq /= (slice->m_sps->numCUsInFrame * NUM_4x4_PARTITIONS);
+ avgQpAq /= (slice->m_sps->numCUsInFrame * m_param->num4x4Partitions);
curEncData.m_avgQpAq = avgQpAq;
}
else

{
*filler = updateVbv(actualBits, rce);

+ curFrame->m_rcData->bufferFillFinal = m_bufferFillFinal;
+ for (int i = 0; i < 4; i++)
+ {
+ curFrame->m_rcData->coeff[i] = m_pred[i].coeff;
+ curFrame->m_rcData->count[i] = m_pred[i].count;
+ curFrame->m_rcData->offset[i] = m_pred[i].offset;
+ }
if (m_param->bEmitHRDSEI)
{
const VUI *vui = &curEncData.m_slice->m_sps->vuiParameters;
x265_2.4.tar.gz/source/encoder/reference.cpp -> x265_2.5.tar.gz/source/encoder/reference.cpp
if (wp)
{
- uint32_t numCUinHeight = (reconPic->m_picHeight + g_maxCUSize - 1) / g_maxCUSize;
+ uint32_t numCUinHeight = (reconPic->m_picHeight + p.maxCUSize - 1) / p.maxCUSize;

int marginX = reconPic->m_lumaMarginX;
int marginY = reconPic->m_lumaMarginY;
intptr_t stride = reconPic->m_stride;
- int cuHeight = g_maxCUSize;
+ int cuHeight = p.maxCUSize;

for (int c = 0; c < (p.internalCsp != X265_CSP_I400 && recPic->m_picCsp != X265_CSP_I400 ? numInterpPlanes : 1); c++)
{

int marginY = reconPic->m_lumaMarginY;
intptr_t stride = reconPic->m_stride;
int width = reconPic->m_picWidth;
- int height = (finishedRows - numWeightedRows) * g_maxCUSize;
+ int height = (finishedRows - numWeightedRows) * reconPic->m_param->maxCUSize;
/* the last row may be partial height */
if (finishedRows == maxNumRows - 1)
{
- const int leftRows = (reconPic->m_picHeight & (g_maxCUSize - 1));
+ const int leftRows = (reconPic->m_picHeight & (reconPic->m_param->maxCUSize - 1));

- height += leftRows ? leftRows : g_maxCUSize;
+ height += leftRows ? leftRows : reconPic->m_param->maxCUSize;
}
- int cuHeight = g_maxCUSize;
+ int cuHeight = reconPic->m_param->maxCUSize;

for (int c = 0; c < numInterpPlanes; c++)
{
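Two small pieces of arithmetic recur in the hunks above and in several other files of this diff: the rounded-up division that counts CTU rows, and the mask that yields the height of a partial last row. The sketch below isolates both. The helper names numCtus and lastCtuSize are hypothetical, and the mask form is only valid under the assumption that maxCUSize is a power of two, which the sketch asserts.

```cpp
#include <cassert>
#include <cstdint>

// Number of CTU rows (or columns) needed to cover picSize pixels:
// integer division rounded up.
static uint32_t numCtus(uint32_t picSize, uint32_t maxCUSize)
{
    assert(maxCUSize != 0);
    return (picSize + maxCUSize - 1) / maxCUSize;
}

// Size in pixels of the last CTU row/column: the remainder if the picture
// does not divide evenly, otherwise a full CTU. The mask replaces a modulo
// and is only correct when maxCUSize is a power of two.
static uint32_t lastCtuSize(uint32_t picSize, uint32_t maxCUSize)
{
    assert(maxCUSize != 0 && (maxCUSize & (maxCUSize - 1)) == 0);
    const uint32_t left = picSize & (maxCUSize - 1);
    return left ? left : maxCUSize;
}
```

For a 1080-line picture with 64-pixel CTUs this gives 17 rows, the last of which is 56 lines tall.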
x265_2.4.tar.gz/source/encoder/sao.cpp -> x265_2.5.tar.gz/source/encoder/sao.cpp
m_hChromaShift = CHROMA_H_SHIFT(param->internalCsp);
m_vChromaShift = CHROMA_V_SHIFT(param->internalCsp);

- m_numCuInWidth = (m_param->sourceWidth + g_maxCUSize - 1) / g_maxCUSize;
- m_numCuInHeight = (m_param->sourceHeight + g_maxCUSize - 1) / g_maxCUSize;
+ m_numCuInWidth = (m_param->sourceWidth + m_param->maxCUSize - 1) / m_param->maxCUSize;
+ m_numCuInHeight = (m_param->sourceHeight + m_param->maxCUSize - 1) / m_param->maxCUSize;

const pixel maxY = (1 << X265_DEPTH) - 1;
const pixel rangeExt = maxY >> 1;

for (int i = 0; i < (param->internalCsp != X265_CSP_I400 ? 3 : 1); i++)
{
- CHECKED_MALLOC(m_tmpL1[i], pixel, g_maxCUSize + 1);
- CHECKED_MALLOC(m_tmpL2[i], pixel, g_maxCUSize + 1);
+ CHECKED_MALLOC(m_tmpL1[i], pixel, m_param->maxCUSize + 1);
+ CHECKED_MALLOC(m_tmpL2[i], pixel, m_param->maxCUSize + 1);

// SAO asm code will read 1 pixel before and after, so pad by 2
// NOTE: m_param->sourceWidth+2 enough, to avoid condition check in copySaoAboveRef(), I alloc more up to 63 bytes in here
- CHECKED_MALLOC(m_tmpU[i], pixel, m_numCuInWidth * g_maxCUSize + 2 + 32);
+ CHECKED_MALLOC(m_tmpU[i], pixel, m_numCuInWidth * m_param->maxCUSize + 2 + 32);
m_tmpU[i] += 1;
}

uint32_t picWidth = m_param->sourceWidth;
uint32_t picHeight = m_param->sourceHeight;
const CUData* cu = m_frame->m_encData->getPicCTU(addr);
- int ctuWidth = g_maxCUSize;
- int ctuHeight = g_maxCUSize;
+ int ctuWidth = m_param->maxCUSize;
+ int ctuHeight = m_param->maxCUSize;
uint32_t lpelx = cu->m_cuPelX;
uint32_t tpely = cu->m_cuPelY;
const uint32_t firstRowInSlice = cu->m_bFirstRowInSlice;

{
PicYuv* reconPic = m_frame->m_reconPic;
intptr_t stride = reconPic->m_stride;
- int ctuWidth = g_maxCUSize;
- int ctuHeight = g_maxCUSize;
+ int ctuWidth = m_param->maxCUSize;
+ int ctuHeight = m_param->maxCUSize;

int addr = idxY * m_numCuInWidth + idxX;
pixel* rec = reconPic->getLumaAddr(addr);

{
PicYuv* reconPic = m_frame->m_reconPic;
intptr_t stride = reconPic->m_strideC;
- int ctuWidth = g_maxCUSize;
- int ctuHeight = g_maxCUSize;
+ int ctuWidth = m_param->maxCUSize;
+ int ctuHeight = m_param->maxCUSize;

{
ctuWidth >>= m_hChromaShift;

intptr_t stride = plane ? reconPic->m_strideC : reconPic->m_stride;
uint32_t picWidth = m_param->sourceWidth;
uint32_t picHeight = m_param->sourceHeight;
- int ctuWidth = g_maxCUSize;
- int ctuHeight = g_maxCUSize;
+ int ctuWidth = m_param->maxCUSize;
+ int ctuHeight = m_param->maxCUSize;
uint32_t lpelx = cu->m_cuPelX;
uint32_t tpely = cu->m_cuPelY;
const uint32_t firstRowInSlice = cu->m_bFirstRowInSlice;

// WARNING: *) May read beyond bound on video than ctuWidth or ctuHeight is NOT multiple of cuSize
X265_CHECK((ctuWidth == ctuHeight) || (m_chromaFormat != X265_CSP_I420), "video size check failure\n");
if (plane)
- primitives.chroma[m_chromaFormat].cu[g_maxLog2CUSize - 2].sub_ps(diff, MAX_CU_SIZE, fenc0, rec0, stride, stride);
+ primitives.chroma[m_chromaFormat].cu[m_param->maxLog2CUSize - 2].sub_ps(diff, MAX_CU_SIZE, fenc0, rec0, stride, stride);
else
- primitives.cu[g_maxLog2CUSize - 2].sub_ps(diff, MAX_CU_SIZE, fenc0, rec0, stride, stride);
+ primitives.cu[m_param->maxLog2CUSize - 2].sub_ps(diff, MAX_CU_SIZE, fenc0, rec0, stride, stride);
}
else
{

intptr_t stride = reconPic->m_stride;
uint32_t picWidth = m_param->sourceWidth;
uint32_t picHeight = m_param->sourceHeight;
- int ctuWidth = g_maxCUSize;
- int ctuHeight = g_maxCUSize;
+ int ctuWidth = m_param->maxCUSize;
+ int ctuHeight = m_param->maxCUSize;
uint32_t lpelx = cu->m_cuPelX;
uint32_t tpely = cu->m_cuPelY;
const uint32_t firstRowInSlice = cu->m_bFirstRowInSlice;

}

// Estimate Best Position
- int64_t bestRDCostBO = MAX_INT64;
int32_t bestClassBO = 0;
+ int64_t currentRDCost = costClasses[0];
+ currentRDCost += costClasses[1];
+ currentRDCost += costClasses[2];
+ currentRDCost += costClasses[3];
+ int64_t bestRDCostBO = currentRDCost;

- for (int i = 0; i < MAX_NUM_SAO_CLASS - SAO_NUM_OFFSET + 1; i++)
+ for (int i = 1; i < MAX_NUM_SAO_CLASS - SAO_NUM_OFFSET + 1; i++)
{
- int64_t currentRDCost = 0;
- for (int j = i; j < i + SAO_NUM_OFFSET; j++)
- currentRDCost += costClasses[j];
+ currentRDCost -= costClasses[i - 1];
+ currentRDCost += costClasses[i + 3];

if (currentRDCost < bestRDCostBO)
{
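The last hunk above replaces a nested loop, which re-added four class costs for every candidate band position, with a running sum that drops the class leaving the window and adds the class entering it. A generic sketch of that sliding-window minimum follows. The function name bestWindowStart and the use of std::vector are assumptions for illustration; the diff itself hard-codes a window of four through costClasses[i + 3].

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Return the start index of the `window` consecutive entries with the
// smallest total cost. Ties keep the earliest start, matching the strict
// "<" comparison in the diff.
static std::size_t bestWindowStart(const std::vector<int64_t>& cost, std::size_t window)
{
    assert(window > 0 && cost.size() >= window);

    // Seed the running sum with the first window.
    int64_t current = 0;
    for (std::size_t j = 0; j < window; j++)
        current += cost[j];

    int64_t best = current;
    std::size_t bestStart = 0;

    // Slide the window: one subtraction and one addition per position
    // instead of re-summing `window` entries.
    for (std::size_t i = 1; i + window <= cost.size(); i++)
    {
        current -= cost[i - 1];
        current += cost[i + window - 1];
        if (current < best)
        {
            best = current;
            bestStart = i;
        }
    }
    return bestStart;
}
```

Seeding the best cost with the first window is also what lets the patch drop the MAX_INT64 initial value and start the loop at 1.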
x265_2.4.tar.gz/source/encoder/search.cpp -> x265_2.5.tar.gz/source/encoder/search.cpp
CHECKED_MALLOC(m_rqt[i].coeffRQT[0], coeff_t, sizeL + sizeC * 2);
m_rqt[i].coeffRQT[1] = m_rqt[i].coeffRQT[0] + sizeL;
m_rqt[i].coeffRQT[2] = m_rqt[i].coeffRQT[0] + sizeL + sizeC;
- ok &= m_rqt[i].reconQtYuv.create(g_maxCUSize, param.internalCsp);
- ok &= m_rqt[i].resiQtYuv.create(g_maxCUSize, param.internalCsp);
+ ok &= m_rqt[i].reconQtYuv.create(param.maxCUSize, param.internalCsp);
+ ok &= m_rqt[i].resiQtYuv.create(param.maxCUSize, param.internalCsp);
}
}
else

{
CHECKED_MALLOC(m_rqt[i].coeffRQT[0], coeff_t, sizeL);
m_rqt[i].coeffRQT[1] = m_rqt[i].coeffRQT[2] = NULL;
- ok &= m_rqt[i].reconQtYuv.create(g_maxCUSize, param.internalCsp);
- ok &= m_rqt[i].resiQtYuv.create(g_maxCUSize, param.internalCsp);
+ ok &= m_rqt[i].reconQtYuv.create(param.maxCUSize, param.internalCsp);
+ ok &= m_rqt[i].resiQtYuv.create(param.maxCUSize, param.internalCsp);
}
}

/* the rest of these buffers are indexed per-depth */
- for (uint32_t i = 0; i <= g_maxCUDepth; i++)
+ for (uint32_t i = 0; i <= m_param->maxCUDepth; i++)
{
- int cuSize = g_maxCUSize >> i;
+ int cuSize = param.maxCUSize >> i;
ok &= m_rqt[i].tmpResiYuv.create(cuSize, param.internalCsp);
ok &= m_rqt[i].tmpPredYuv.create(cuSize, param.internalCsp);
ok &= m_rqt[i].bidirPredYuv[0].create(cuSize, param.internalCsp);

m_rqt[i].resiQtYuv.destroy();
}

- for (uint32_t i = 0; i <= g_maxCUDepth; i++)
+ for (uint32_t i = 0; i <= m_param->maxCUDepth; i++)
{
m_rqt[i].tmpResiYuv.destroy();
m_rqt[i].tmpPredYuv.destroy();

int mvpIdx = selectMVP(interMode.cu, pu, amvp, list, ref);
MV mvmin, mvmax, outmv, mvp = amvp[mvpIdx];

- if (!m_param->analysisMode) /* Prevents load/save outputs from diverging if lowresMV is not available */
+ if (!m_param->analysisReuseMode) /* Prevents load/save outputs from diverging if lowresMV is not available */
{
MV lmv = getLowresMV(interMode.cu, pu, list, ref);
if (lmv.notZero())

setSearchRange(interMode.cu, mvp, m_param->searchRange, mvmin, mvmax);

- int satdCost = m_me.motionEstimate(&m_slice->m_mref[list][ref], mvmin, mvmax, mvp, numMvc, mvc, m_param->searchRange, outmv,
+ int satdCost = m_me.motionEstimate(&m_slice->m_mref[list][ref], mvmin, mvmax, mvp, numMvc, mvc, m_param->searchRange, outmv, m_param->maxSlices,
m_param->bSourceReferenceEstimation ? m_slice->m_refFrameList[list][ref]->m_fencPic->getLumaAddr(0) : 0);

/* Get total cost of partition, but only include MV bit cost once */

}
}

+void Search::searchMV(Mode& interMode, const PredictionUnit& pu, int list, int ref, MV& outmv)
+{
+ CUData& cu = interMode.cu;
+ const Slice *slice = m_slice;
+ MV mv = cu.m_mv[list][pu.puAbsPartIdx];
+ cu.clipMv(mv);
+ MV mvmin, mvmax;
+ setSearchRange(cu, mv, m_param->searchRange, mvmin, mvmax);
+ m_me.refineMV(&slice->m_mref[list][ref], mvmin, mvmax, mv, outmv);
+}
+
/* find the best inter prediction for each PU of specified mode */
void Search::predInterSearch(Mode& interMode, const CUGeom& cuGeom, bool bChromaMC, uint32_t refMasks[2])
{

cu.getNeighbourMV(puIdx, pu.puAbsPartIdx, interMode.interNeighbours);

/* Uni-directional prediction */
- if ((m_param->analysisMode == X265_ANALYSIS_LOAD && m_param->analysisRefineLevel > 1)
+ if ((m_param->analysisReuseMode == X265_ANALYSIS_LOAD && m_param->analysisReuseLevel > 1 && m_param->analysisReuseLevel != 10)
|| (m_param->analysisMultiPassRefine && m_param->rc.bStatRead))
{
for (int list = 0; list < numPredDir; list++)

if (m_param->analysisMultiPassRefine && m_param->rc.bStatRead && mvpIdx == bestME[list].mvpIdx)
mvpIn = bestME[list].mv;

- int satdCost = m_me.motionEstimate(&slice->m_mref[list][ref], mvmin, mvmax, mvpIn, numMvc, mvc, m_param->searchRange, outmv,
+ int satdCost = m_me.motionEstimate(&slice->m_mref[list][ref], mvmin, mvmax, mvpIn, numMvc, mvc, m_param->searchRange, outmv, m_param->maxSlices,
m_param->bSourceReferenceEstimation ? m_slice->m_refFrameList[list][ref]->m_fencPic->getLumaAddr(0) : 0);

/* Get total cost of partition, but only include MV bit cost once */

int mvpIdx = selectMVP(cu, pu, amvp, list, ref);
MV mvmin, mvmax, outmv, mvp = amvp[mvpIdx];

- if (!m_param->analysisMode) /* Prevents load/save outputs from diverging when lowresMV is not available */
+ if (!m_param->analysisReuseMode) /* Prevents load/save outputs from diverging when lowresMV is not available */
{
MV lmv = getLowresMV(cu, pu, list, ref);
if (lmv.notZero())

m_me.integral[planes] = interMode.fencYuv->m_integral[list][ref][planes] + puX * pu.width + puY * pu.height * m_slice->m_refFrameList[list][ref]->m_reconPic->m_stride;
}
setSearchRange(cu, mvp, m_param->searchRange, mvmin, mvmax);
- int satdCost = m_me.motionEstimate(&slice->m_mref[list][ref], mvmin, mvmax, mvp, numMvc, mvc, m_param->searchRange, outmv,
+ int satdCost = m_me.motionEstimate(&slice->m_mref[list][ref], mvmin, mvmax, mvp, numMvc, mvc, m_param->searchRange, outmv, m_param->maxSlices,
m_param->bSourceReferenceEstimation ? m_slice->m_refFrameList[list][ref]->m_fencPic->getLumaAddr(0) : 0);

/* Get total cost of partition, but only include MV bit cost once */

cu.clipMv(mvmax);

if (cu.m_encData->m_param->bIntraRefresh && m_slice->m_sliceType == P_SLICE &&
- cu.m_cuPelX / g_maxCUSize < m_frame->m_encData->m_pir.pirStartCol &&
+ cu.m_cuPelX / m_param->maxCUSize < m_frame->m_encData->m_pir.pirStartCol &&
m_slice->m_refFrameList[0][0]->m_encData->m_pir.pirEndCol < m_slice->m_sps->numCuInWidth)
{
int safeX, maxSafeMv;
- safeX = m_slice->m_refFrameList[0][0]->m_encData->m_pir.pirEndCol * g_maxCUSize - 3;
+ safeX = m_slice->m_refFrameList[0][0]->m_encData->m_pir.pirEndCol * m_param->maxCUSize - 3;
maxSafeMv = (safeX - cu.m_cuPelX) * 4;
mvmax.x = X265_MIN(mvmax.x, maxSafeMv);
mvmin.x = X265_MIN(mvmin.x, maxSafeMv);
x265_2.4.tar.gz/source/encoder/search.h -> x265_2.5.tar.gz/source/encoder/search.h
memset(this, 0, sizeof(*this));
}

- void accumulate(CUStats& other)
+ void accumulate(CUStats& other, x265_param& param)
{
- for (uint32_t i = 0; i <= g_maxCUDepth; i++)
+ for (uint32_t i = 0; i <= param.maxCUDepth; i++)
{
intraRDOElapsedTime[i] += other.intraRDOElapsedTime[i];
interRDOElapsedTime[i] += other.interRDOElapsedTime[i];

// estimation inter prediction (non-skip)
void predInterSearch(Mode& interMode, const CUGeom& cuGeom, bool bChromaMC, uint32_t masks[2]);

+ void searchMV(Mode& interMode, const PredictionUnit& pu, int list, int ref, MV& outmv);
// encode residual and compute rd-cost for inter mode
void encodeResAndCalcRdInterCU(Mode& interMode, const CUGeom& cuGeom);
void encodeResAndCalcRdSkipCU(Mode& interMode);
x265_2.4.tar.gz/source/encoder/sei.cpp -> x265_2.5.tar.gz/source/encoder/sei.cpp
}
WRITE_CODE(type, 8, "payload_type");
uint32_t payloadSize;
- if (hrdTypes || m_payloadType == USER_DATA_UNREGISTERED)
+ if (hrdTypes || m_payloadType == USER_DATA_UNREGISTERED || m_payloadType == USER_DATA_REGISTERED_ITU_T_T35)
{
if (hrdTypes)
{
X265_CHECK(0 == (count.getNumberOfWrittenBits() & 7), "payload unaligned\n");
payloadSize = count.getNumberOfWrittenBits() >> 3;
}
- else
+ else if (m_payloadType == USER_DATA_UNREGISTERED)
payloadSize = m_payloadSize + 16;
+ else
+ payloadSize = m_payloadSize;

for (; payloadSize >= 0xff; payloadSize -= 0xff)
WRITE_CODE(0xff, 8, "payload_size");
WRITE_CODE(payloadSize, 8, "payload_size");
}
- else if(m_payloadType != USER_DATA_REGISTERED_ITU_T_T35)
+ else
WRITE_CODE(m_payloadSize, 8, "payload_size");
/* virtual writeSEI method, write to bs */
writeSEI(sps);
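The payload_size loop in the hunk above writes one 0xff byte for every full 255 in the size and then a final byte holding the remainder, so sizes of 255 and more survive an 8-bit field. The sketch below reproduces that byte pattern outside the bitstream writer and adds a matching reader. Both function names are hypothetical, and the reader is an illustration of how the coding is reversed, not code taken from x265.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Encode a size as N bytes of 0xff followed by one byte below 0xff,
// mirroring the loop in the diff.
static std::vector<uint8_t> encodePayloadSize(uint32_t payloadSize)
{
    std::vector<uint8_t> out;
    for (; payloadSize >= 0xff; payloadSize -= 0xff)
        out.push_back(0xff);
    out.push_back(static_cast<uint8_t>(payloadSize));
    return out;
}

// Decode the same pattern: add 255 for every 0xff byte, then add the first
// byte that is not 0xff and stop.
static uint32_t decodePayloadSize(const std::vector<uint8_t>& bytes)
{
    uint32_t size = 0;
    for (std::size_t i = 0; i < bytes.size(); i++)
    {
        size += bytes[i];
        if (bytes[i] != 0xff)
            break;
    }
    return size;
}
```

A size that is an exact multiple of 255 still ends with a terminating zero byte, which is what lets a reader find the end of the field.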
x265_2.4.tar.gz/source/encoder/sei.h -> x265_2.5.tar.gz/source/encoder/sei.h
m_payloadSize = 0;
}

- uint8_t *cim;
+ uint8_t *m_payload;

// daniel.vt@samsung.com :: for the Creative Intent Meta Data Encoding ( seongnam.oh@samsung.com )
void writeSEI(const SPS&)
{
- if (!cim)
+ if (!m_payload)
return;

- int i = 0;
- int payloadSize = m_payloadSize;
- while (cim[i] == 0xFF)
- {
- i++;
- payloadSize += cim[i];
- WRITE_CODE(0xFF, 8, "payload_size");
- }
- WRITE_CODE(payloadSize, 8, "payload_size");
- i++;
- payloadSize += i;
- for (; i < payloadSize; ++i)
- WRITE_CODE(cim[i], 8, "creative_intent_metadata");
+ uint32_t i = 0;
+ for (; i < m_payloadSize; ++i)
+ WRITE_CODE(m_payload[i], 8, "creative_intent_metadata");
}
};
}
x265_2.4.tar.gz/source/encoder/slicetype.cpp -> x265_2.5.tar.gz/source/encoder/slicetype.cpp
Changed
if (m_param->rc.cuTree && !m_param->rc.bStatRead)
/* update row satds based on cutree offsets */
curFrame->m_lowres.satdCost = frameCostRecalculate(frames, p0, p1, b);
- else if (m_param->analysisMode != X265_ANALYSIS_LOAD)
+ else if (m_param->analysisReuseMode != X265_ANALYSIS_LOAD || m_param->scaleFactor)
{
if (m_param->rc.aqMode)
curFrame->m_lowres.satdCost = curFrame->m_lowres.costEstAq[b - p0][p1 - b];

curFrame->m_lowres.lowresCostForRc = curFrame->m_lowres.lowresCosts[b - p0][p1 - b];
uint32_t lowresRow = 0, lowresCol = 0, lowresCuIdx = 0, sum = 0, intraSum = 0;
uint32_t scale = m_param->maxCUSize / (2 * X265_LOWRES_CU_SIZE);
- uint32_t numCuInHeight = (m_param->sourceHeight + g_maxCUSize - 1) / g_maxCUSize;
+ uint32_t numCuInHeight = (m_param->sourceHeight + m_param->maxCUSize - 1) / m_param->maxCUSize;
uint32_t widthInLowresCu = (uint32_t)m_8x8Width, heightInLowresCu = (uint32_t)m_8x8Height;
double *qp_offset = 0;
/* Factor in qpoffsets based on Aq/Cutree in CU costs */

m_isSceneTransition = false; /* Signal end of scene transitioning */
}

+ if (m_param->csvLogLevel >= 2)
+ {
+ int64_t icost = frames[p1]->costEst[0][0];
+ int64_t pcost = frames[p1]->costEst[p1 - p0][0];
+ frames[p1]->ipCostRatio = (double)icost / pcost;
+ }
+
/* A frame is always analysed with bRealScenecut = true first, and then bRealScenecut = false,
the former for I decisions and the latter for P/B decisions. It's possible that the first
analysis detected scenecuts which were later nulled due to scene transitioning, in which

MV *mvs = frames[b]->lowresMvs[list][listDist[list]];
int32_t x = mvs[cuIndex].x;
int32_t y = mvs[cuIndex].y;
- displacement += sqrt(pow(abs(x), 2) + pow(abs(y), 2));
+ // NOTE: the dynamic range of abs(x) and abs(y) is 15-bits
+ displacement += sqrt((double)(abs(x) * abs(x)) + (double)(abs(y) * abs(y)));
}
else
displacement += 0.0;

/* ME will never return a cost larger than the cost @MVP, so we do not
* have to check that ME cost is more than the estimated merge cost */
- fencCost = tld.me.motionEstimate(fref, mvmin, mvmax, mvp, 0, NULL, s_merange, *fencMV);
+ fencCost = tld.me.motionEstimate(fref, mvmin, mvmax, mvp, 0, NULL, s_merange, *fencMV, m_lookahead.m_param->maxSlices);
if (skipCost < 64 && skipCost < fencCost && bBidir)
{
fencCost = skipCost;
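The displacement change in this file replaces `pow()` calls with plain integer multiplications, relying on the 15-bit range stated in the NOTE. The sketch below isolates that term to show why the integer products cannot overflow. `mvMagnitude` is a name made up for this illustration and does not appear in x265.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdlib>

// Illustrative stand-in for the per-CU term added to `displacement`.
// Assumes |x| and |y| fit in 15 bits, as the NOTE in the patch states.
double mvMagnitude(int32_t x, int32_t y)
{
    int32_t ax = std::abs(x);
    int32_t ay = std::abs(y);
    // 32767 * 32767 = 1073676289 < 2^31 - 1, so neither product overflows;
    // the two squares are summed in double, as in the patch.
    return std::sqrt(static_cast<double>(ax * ax) + static_cast<double>(ay * ay));
}
```

Each square is converted to `double` before the addition, which matters because the sum of two 30-bit squares could exceed the `int32_t` range.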
x265_2.4.tar.gz/source/test/ipfilterharness.cpp -> x265_2.5.tar.gz/source/test/ipfilterharness.cpp
Changed
{
pixel_test_buff[0][i] = rand() & PIXEL_MAX;
short_test_buff[0][i] = (rand() % (2 * SMAX)) - SMAX;
-
pixel_test_buff[1][i] = PIXEL_MIN;
- short_test_buff[1][i] = SMIN;
-
+ short_test_buff[1][i] = (int16_t)SMIN;
pixel_test_buff[2][i] = PIXEL_MAX;
short_test_buff[2][i] = SMAX;
}
x265_2.4.tar.gz/source/test/ipfilterharness.h -> x265_2.5.tar.gz/source/test/ipfilterharness.h
Changed
enum { ITERS = 100 };
enum { TEST_CASES = 3 };
enum { SMAX = 1 << 12 };
- enum { SMIN = -1 << 12 };
-
+ enum { SMIN = (unsigned)-1 << 12 };
ALIGN_VAR_32(pixel, pixel_buff[TEST_BUF_SIZE]);
int16_t short_buff[TEST_BUF_SIZE];
int16_t IPF_vec_output_s[TEST_BUF_SIZE];
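The `SMIN` change avoids left-shifting a negative signed value, which is undefined behaviour in C++ before C++20 and triggers compiler warnings. The shift is now done on an unsigned value, and the `.cpp` files narrow the result with `(int16_t)SMIN` at the point of use. The sketch below mirrors the two constants locally to show the value that results. `sminAsShort` is a hypothetical helper, and a 32-bit `unsigned` is assumed.

```cpp
#include <cstdint>

// Local mirror of the patched constants, for illustration only.
enum { SMAX = 1 << 12 };
enum { SMIN = (unsigned)-1 << 12 }; // 0xFFFFF000 with a 32-bit unsigned

// Narrowing keeps the low 16 bits (0xF000). The result is -4096 on
// two's-complement targets; before C++20 this conversion is
// implementation-defined rather than guaranteed by the standard.
int16_t sminAsShort()
{
    return (int16_t)SMIN;
}
```

On such targets the stored test value is unchanged from the one the old `-1 << 12` expression produced in practice, so the buffers keep the same contents.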
x265_2.4.tar.gz/source/test/pixelharness.cpp -> x265_2.5.tar.gz/source/test/pixelharness.cpp
Changed
201
1
2
uchar_test_buff[0][i] = rand() % ((1 << 8) - 1);
3
residual_test_buff[0][i] = (rand() % (2 * RMAX + 1)) - RMAX - 1;// For sse_ss only
4
double_test_buff[0][i] = (double)(short_test_buff[0][i]) / 256.0;
5
-
6
pixel_test_buff[1][i] = PIXEL_MIN;
7
- short_test_buff[1][i] = SMIN;
8
+ short_test_buff[1][i] = (int16_t)SMIN;
9
short_test_buff1[1][i] = PIXEL_MIN;
10
short_test_buff2[1][i] = -16384;
11
int_test_buff[1][i] = SHORT_MIN;
12
13
return true;
14
}
15
16
+bool PixelHarness::check_integral_initv(integralv_t ref, integralv_t opt)
17
+{
18
+ intptr_t srcStep = 64;
19
+ int j = 0;
20
+ uint32_t dst_ref[BUFFSIZE] = { 0 };
21
+ uint32_t dst_opt[BUFFSIZE] = { 0 };
22
+
23
+ for (int i = 0; i < 64; i++)
24
+ {
25
+ dst_ref[i] = pixel_test_buff[0][i];
26
+ dst_opt[i] = pixel_test_buff[0][i];
27
+ }
28
+
29
+ for (int i = 0, k = 0; i < BUFFSIZE; i++)
30
+ {
31
+ if (i % 64 == 0)
32
+ k++;
33
+ dst_ref[i] = dst_ref[i % 64] + k;
34
+ dst_opt[i] = dst_opt[i % 64] + k;
35
+ }
36
+
37
+ int padx = 4;
38
+ int pady = 4;
39
+ uint32_t *dst_ref_ptr = dst_ref + srcStep * pady + padx;
40
+ uint32_t *dst_opt_ptr = dst_opt + srcStep * pady + padx;
41
+ for (int i = 0; i < ITERS; i++)
42
+ {
43
+ ref(dst_ref_ptr, srcStep);
44
+ checked(opt, dst_opt_ptr, srcStep);
45
+
46
+ if (memcmp(dst_ref, dst_opt, sizeof(uint32_t) * BUFFSIZE))
47
+ return false;
48
+
49
+ reportfail()
50
+ j += INCR;
51
+ }
52
+ return true;
53
+}
54
+
55
+bool PixelHarness::check_integral_inith(integralh_t ref, integralh_t opt)
56
+{
57
+ /* Since stride is always a multiple of 8 and data movement in AVX2 is 16 elements at a time for 8 bit pixel, we need
58
+ * to check correctness for two cases: stride multiple of 16 and stride not a multiple of 16; fine for High bit depth
59
+ * where data movement in AVX2 is 8 elements at a time */
60
+ intptr_t srcStep[2] = { 56, 64 };
61
+ int j = 0;
62
+ uint32_t dst_ref[BUFFSIZE] = { 0 };
63
+ uint32_t dst_opt[BUFFSIZE] = { 0 };
64
+
65
+ int padx = 4;
66
+ int pady = 4;
67
+ for (int l = 0; l < 2; l++)
68
+ {
69
+ uint32_t *dst_ref_ptr = dst_ref + srcStep[l] * pady + padx;
70
+ uint32_t *dst_opt_ptr = dst_opt + srcStep[l] * pady + padx;
71
+ for (int k = 0; k < ITERS; k++)
72
+ {
73
+ ref(dst_ref_ptr, pixel_test_buff[0], srcStep[l]);
74
+ checked(opt, dst_opt_ptr, pixel_test_buff[0], srcStep[l]);
75
+
76
+ if (memcmp(dst_ref, dst_opt, sizeof(uint32_t) * BUFFSIZE))
77
+ return false;
78
+
79
+ reportfail()
80
+ j += INCR;
81
+ }
82
+ }
83
+ return true;
84
+}
85
+
86
bool PixelHarness::testPU(int part, const EncoderPrimitives& ref, const EncoderPrimitives& opt)
87
{
88
if (opt.pu[part].satd)
89
90
}
91
}
92
93
+ for (int k = 0; k < NUM_INTEGRAL_SIZE; k++)
94
+ {
95
+ if (opt.integral_initv[k] && !check_integral_initv(ref.integral_initv[k], opt.integral_initv[k]))
96
+ {
97
+ switch (k)
98
+ {
99
+ case 0:
100
+ printf("Integral4v failed!\n");
101
+ break;
102
+ case 1:
103
+ printf("Integral8v failed!\n");
104
+ break;
105
+ case 2:
106
+ printf("Integral12v failed!\n");
107
+ break;
108
+ case 3:
109
+ printf("Integral16v failed!\n");
110
+ break;
111
+ case 4:
112
+ printf("Integral24v failed!\n");
113
+ break;
114
+ case 5:
115
+ printf("Integral32v failed!\n");
116
+ break;
117
+ }
118
+ return false;
119
+ }
120
+ }
121
+
122
+
123
+ for (int k = 0; k < NUM_INTEGRAL_SIZE; k++)
124
+ {
125
+ if (opt.integral_inith[k] && !check_integral_inith(ref.integral_inith[k], opt.integral_inith[k]))
126
+ {
127
+ switch (k)
128
+ {
129
+ case 0:
130
+ printf("Integral4h failed!\n");
131
+ break;
132
+ case 1:
133
+ printf("Integral8h failed!\n");
134
+ break;
135
+ case 2:
136
+ printf("Integral12h failed!\n");
137
+ break;
138
+ case 3:
139
+ printf("Integral16h failed!\n");
140
+ break;
141
+ case 4:
142
+ printf("Integral24h failed!\n");
143
+ break;
144
+ case 5:
145
+ printf("Integral32h failed!\n");
146
+ break;
147
+ }
148
+ return false;
149
+ }
150
+ }
151
return true;
152
}
153
154
155
HEADER0("pelFilterChroma_Horizontal");
156
REPORT_SPEEDUP(opt.pelFilterChroma[1], ref.pelFilterChroma[1], pbuf1, 1, STRIDE, tc, maskP, maskQ);
157
}
158
+
159
+ for (int k = 0; k < NUM_INTEGRAL_SIZE; k++)
160
+ {
161
+ if (opt.integral_initv[k])
162
+ {
163
+ switch (k)
164
+ {
165
+ case 0:
166
+ HEADER0("integral_init4v");
167
+ break;
168
+ case 1:
169
+ HEADER0("integral_init8v");
170
+ break;
171
+ case 2:
172
+ HEADER0("integral_init12v");
173
+ break;
174
+ case 3:
175
+ HEADER0("integral_init16v");
176
+ break;
177
+ case 4:
178
+ HEADER0("integral_init24v");
179
+ break;
180
+ case 5:
181
+ HEADER0("integral_init32v");
182
+ break;
183
+ default:
184
+ break;
185
+ }
186
+ REPORT_SPEEDUP(opt.integral_initv[k], ref.integral_initv[k], (uint32_t*)pbuf1, STRIDE);
187
+ }
188
+ }
189
+
190
+ for (int k = 0; k < NUM_INTEGRAL_SIZE; k++)
191
+ {
192
+ if (opt.integral_inith[k])
193
+ {
194
+ uint32_t dst_buf[BUFFSIZE] = { 0 };
195
+ switch (k)
196
+ {
197
+ case 0:
198
+ HEADER0("integral_init4h");
199
+ break;
200
+ case 1:
201
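The new `check_integral_initv` and `check_integral_inith` tests follow the harness's usual pattern: run the C reference and the optimized primitive on identical buffers, then compare the outputs with `memcmp`. A reduced, self-contained sketch of that pattern follows. The kernel bodies are assumptions modelled on a 4-row vertical integral step, and none of the names below are taken from x265.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

typedef void (*integralv_fn)(uint32_t* sum, intptr_t stride);

// Assumed reference kernel: each entry becomes the difference between
// the entry four rows below it and itself.
void integral4v_ref(uint32_t* sum, intptr_t stride)
{
    for (intptr_t x = 0; x < stride; x++)
        sum[x] = sum[x + 4 * stride] - sum[x];
}

// Stand-in "optimized" kernel: same arithmetic, row pointer hoisted.
void integral4v_opt(uint32_t* sum, intptr_t stride)
{
    const uint32_t* below = sum + 4 * stride;
    for (intptr_t x = 0; x < stride; x++)
        sum[x] = below[x] - sum[x];
}

// Run both kernels on identical buffers and compare every byte.
bool kernelsMatch(integralv_fn ref, integralv_fn opt)
{
    const intptr_t stride = 64;
    const int rows = 8;
    std::vector<uint32_t> a(stride * rows), b;
    for (size_t i = 0; i < a.size(); i++)
        a[i] = static_cast<uint32_t>(i * 2654435761u); // deterministic fill
    b = a;
    ref(a.data(), stride);
    opt(b.data(), stride);
    return std::memcmp(a.data(), b.data(), a.size() * sizeof(uint32_t)) == 0;
}
```

Comparing the whole buffer rather than only the written row also catches a kernel that writes outside its intended range, which is why the tests in the diff compare all `BUFFSIZE` elements.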
x265_2.4.tar.gz/source/test/pixelharness.h -> x265_2.5.tar.gz/source/test/pixelharness.h
Changed
enum { BUFFSIZE = STRIDE * (MAX_HEIGHT + PAD_ROWS) + INCR * ITERS };
enum { TEST_CASES = 3 };
enum { SMAX = 1 << 12 };
- enum { SMIN = -1 << 12 };
+ enum { SMIN = (unsigned)-1 << 12 };
enum { RMAX = PIXEL_MAX - PIXEL_MIN }; //The maximum value obtained by subtracting pixel values (residual max)
enum { RMIN = PIXEL_MIN - PIXEL_MAX }; //The minimum value obtained by subtracting pixel values (residual min)

bool check_pelFilterLumaStrong_H(pelFilterLumaStrong_t ref, pelFilterLumaStrong_t opt);
bool check_pelFilterChroma_V(pelFilterChroma_t ref, pelFilterChroma_t opt);
bool check_pelFilterChroma_H(pelFilterChroma_t ref, pelFilterChroma_t opt);
+ bool check_integral_initv(integralv_t ref, integralv_t opt);
+ bool check_integral_inith(integralh_t ref, integralh_t opt);

public:
x265_2.4.tar.gz/source/test/regression-tests.txt -> x265_2.5.tar.gz/source/test/regression-tests.txt
Changed
52
1
2
BasketballDrive_1920x1080_50.y4m,--preset faster --aq-strength 2 --merange 190 --slices 3
3
BasketballDrive_1920x1080_50.y4m,--preset medium --ctu 16 --max-tu-size 8 --subme 7 --qg-size 16 --cu-lossless --tu-inter-depth 3 --limit-tu 1
4
BasketballDrive_1920x1080_50.y4m,--preset medium --keyint -1 --nr-inter 100 -F4 --no-sao
5
-BasketballDrive_1920x1080_50.y4m,--preset medium --no-cutree --analysis-mode=save --refine-level 2 --bitrate 7000 --limit-modes,--preset medium --no-cutree --analysis-mode=load --refine-level 2 --bitrate 7000 --limit-modes
6
+BasketballDrive_1920x1080_50.y4m,--preset medium --no-cutree --analysis-reuse-mode=save --analysis-reuse-level 2 --bitrate 7000 --limit-modes,--preset medium --no-cutree --analysis-reuse-mode=load --analysis-reuse-level 2 --bitrate 7000 --limit-modes
7
BasketballDrive_1920x1080_50.y4m,--preset slow --nr-intra 100 -F4 --aq-strength 3 --qg-size 16 --limit-refs 1
8
BasketballDrive_1920x1080_50.y4m,--preset slower --lossless --chromaloc 3 --subme 0 --limit-tu 4
9
-BasketballDrive_1920x1080_50.y4m,--preset slower --no-cutree --analysis-mode=save --refine-level 10 --bitrate 7000 --limit-tu 0,--preset slower --no-cutree --analysis-mode=load --refine-level 10 --bitrate 7000 --limit-tu 0
10
+BasketballDrive_1920x1080_50.y4m,--preset slower --no-cutree --analysis-reuse-mode=save --analysis-reuse-level 10 --bitrate 7000 --limit-tu 0,--preset slower --no-cutree --analysis-reuse-mode=load --analysis-reuse-level 10 --bitrate 7000 --limit-tu 0
11
BasketballDrive_1920x1080_50.y4m,--preset veryslow --crf 4 --cu-lossless --pmode --limit-refs 1 --aq-mode 3 --limit-tu 3
12
-BasketballDrive_1920x1080_50.y4m,--preset veryslow --no-cutree --analysis-mode=save --bitrate 7000 --tskip-fast --limit-tu 4,--preset veryslow --no-cutree --analysis-mode=load --bitrate 7000 --tskip-fast --limit-tu 4
13
+BasketballDrive_1920x1080_50.y4m,--preset veryslow --no-cutree --analysis-reuse-mode=save --bitrate 7000 --tskip-fast --limit-tu 4,--preset veryslow --no-cutree --analysis-reuse-mode=load --bitrate 7000 --tskip-fast --limit-tu 4
14
BasketballDrive_1920x1080_50.y4m,--preset veryslow --recon-y4m-exec "ffplay -i pipe:0 -autoexit"
15
Coastguard-4k.y4m,--preset ultrafast --recon-y4m-exec "ffplay -i pipe:0 -autoexit"
16
Coastguard-4k.y4m,--preset superfast --tune grain --overscan=crop
17
Coastguard-4k.y4m,--preset superfast --tune grain --pme --aq-strength 2 --merange 190
18
-Coastguard-4k.y4m,--preset veryfast --no-cutree --analysis-mode=save --refine-level 1 --bitrate 15000,--preset veryfast --no-cutree --analysis-mode=load --refine-level 1 --bitrate 15000
19
+Coastguard-4k.y4m,--preset veryfast --no-cutree --analysis-reuse-mode=save --analysis-reuse-level 1 --bitrate 15000,--preset veryfast --no-cutree --analysis-reuse-mode=load --analysis-reuse-level 1 --bitrate 15000
20
Coastguard-4k.y4m,--preset medium --rdoq-level 1 --tune ssim --no-signhide --me umh --slices 2
21
Coastguard-4k.y4m,--preset slow --tune psnr --cbqpoffs -1 --crqpoffs 1 --limit-refs 1
22
CrowdRun_1920x1080_50_10bit_422.yuv,--preset ultrafast --weightp --tune zerolatency --qg-size 16
23
24
DucksAndLegs_1920x1080_60_10bit_444.yuv,--preset veryfast --weightp --nr-intra 1000 -F4
25
DucksAndLegs_1920x1080_60_10bit_444.yuv,--preset medium --nr-inter 500 -F4 --no-psy-rdoq
26
DucksAndLegs_1920x1080_60_10bit_444.yuv,--preset slower --no-weightp --rdoq-level 0 --limit-refs 3 --tu-inter-depth 4 --limit-tu 3
27
-DucksAndLegs_1920x1080_60_10bit_422.yuv,--preset fast --no-cutree --analysis-mode=save --bitrate 3000 --early-skip --tu-inter-depth 3 --limit-tu 1,--preset fast --no-cutree --analysis-mode=load --bitrate 3000 --early-skip --tu-inter-depth 3 --limit-tu 1
28
+DucksAndLegs_1920x1080_60_10bit_422.yuv,--preset fast --no-cutree --analysis-reuse-mode=save --bitrate 3000 --early-skip --tu-inter-depth 3 --limit-tu 1,--preset fast --no-cutree --analysis-reuse-mode=load --bitrate 3000 --early-skip --tu-inter-depth 3 --limit-tu 1
29
FourPeople_1280x720_60.y4m,--preset superfast --no-wpp --lookahead-slices 2
30
FourPeople_1280x720_60.y4m,--preset veryfast --aq-mode 2 --aq-strength 1.5 --qg-size 8
31
FourPeople_1280x720_60.y4m,--preset medium --qp 38 --no-psy-rd
32
33
KristenAndSara_1280x720_60.y4m,--preset slower --pmode --max-tu-size 8 --limit-refs 0 --limit-modes --limit-tu 1
34
NebutaFestival_2560x1600_60_10bit_crop.yuv,--preset superfast --tune psnr
35
NebutaFestival_2560x1600_60_10bit_crop.yuv,--preset medium --tune grain --limit-refs 2
36
-NebutaFestival_2560x1600_60_10bit_crop.yuv,--preset slow --no-cutree --analysis-mode=save --rd 5 --refine-level 10 --bitrate 9000,--preset slow --no-cutree --analysis-mode=load --rd 5 --refine-level 10 --bitrate 9000
37
-News-4k.y4m,--preset ultrafast --no-cutree --analysis-mode=save --refine-level 2 --bitrate 15000,--preset ultrafast --no-cutree --analysis-mode=load --refine-level 2 --bitrate 15000
38
+NebutaFestival_2560x1600_60_10bit_crop.yuv,--preset slow --no-cutree --analysis-reuse-mode=save --rd 5 --analysis-reuse-level 10 --bitrate 9000,--preset slow --no-cutree --analysis-reuse-mode=load --rd 5 --analysis-reuse-level 10 --bitrate 9000
39
+News-4k.y4m,--preset ultrafast --no-cutree --analysis-reuse-mode=save --analysis-reuse-level 2 --bitrate 15000,--preset ultrafast --no-cutree --analysis-reuse-mode=load --analysis-reuse-level 2 --bitrate 15000
40
News-4k.y4m,--preset superfast --lookahead-slices 6 --aq-mode 0
41
News-4k.y4m,--preset superfast --slices 4 --aq-mode 0
42
News-4k.y4m,--preset medium --tune ssim --no-sao --qg-size 16
43
44
old_town_cross_444_720p50.y4m,--preset superfast --weightp --min-cu 16 --limit-modes
45
old_town_cross_444_720p50.y4m,--preset veryfast --qp 1 --tune ssim
46
old_town_cross_444_720p50.y4m,--preset faster --rd 1 --tune zero-latency
47
-old_town_cross_444_720p50.y4m,--preset fast --no-cutree --analysis-mode=save --refine-level 1 --bitrate 3000 --early-skip,--preset fast --no-cutree --analysis-mode=load --refine-level 1 --bitrate 3000 --early-skip
48
+old_town_cross_444_720p50.y4m,--preset fast --no-cutree --analysis-reuse-mode=save --analysis-reuse-level 1 --bitrate 3000 --early-skip,--preset fast --no-cutree --analysis-reuse-mode=load --analysis-reuse-level 1 --bitrate 3000 --early-skip
49
old_town_cross_444_720p50.y4m,--preset medium --keyint -1 --no-weightp --ref 6
50
old_town_cross_444_720p50.y4m,--preset slow --rdoq-level 1 --early-skip --ref 7 --no-b-pyramid
51
old_town_cross_444_720p50.y4m,--preset slower --crf 4 --cu-lossless
52
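Every changed line in this file applies the same two renames: `--analysis-mode` becomes `--analysis-reuse-mode`, and `--refine-level` becomes `--analysis-reuse-level`. Scripts that still carry the 2.4 spellings could be updated mechanically, as in the sketch below. `migrateAnalysisOptions` is a hypothetical helper written for this illustration and is not part of x265.

```cpp
#include <string>

// Hypothetical migration helper (not part of x265): rewrite the two
// option names renamed in 2.5 inside a command-line string.
std::string migrateAnalysisOptions(std::string cmd)
{
    const char* renames[][2] = {
        { "--analysis-mode", "--analysis-reuse-mode" },
        { "--refine-level",  "--analysis-reuse-level" },
    };
    for (const auto& r : renames)
    {
        const std::string from = r[0], to = r[1];
        // Resume searching after each replacement so new text is not rescanned.
        for (size_t pos = cmd.find(from); pos != std::string::npos;
             pos = cmd.find(from, pos + to.size()))
            cmd.replace(pos, from.size(), to);
    }
    return cmd;
}
```

Neither new name contains its old name as a substring, so running the helper twice leaves the string unchanged.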
x265_2.4.tar.gz/source/x265-extras.cpp -> x265_2.5.tar.gz/source/x265-extras.cpp
Changed
201
1
2
3
#include "x265.h"
4
#include "x265-extras.h"
5
-
6
+#include "param.h"
7
#include "common.h"
8
9
using namespace X265_NS;
10
11
"B count, B ave-QP, B kbps, B-PSNR Y, B-PSNR U, B-PSNR V, B-SSIM (dB), "
12
"MaxCLL, MaxFALL, Version\n";
13
14
-FILE* x265_csvlog_open(const x265_api& api, const x265_param& param, const char* fname, int level)
15
+FILE* x265_csvlog_open(const x265_param& param, const char* fname, int level)
16
{
17
- if (sizeof(x265_stats) != api.sizeof_stats || sizeof(x265_picture) != api.sizeof_picture)
18
- {
19
- fprintf(stderr, "extras [error]: structure size skew, unable to create CSV logfile\n");
20
- return NULL;
21
- }
22
-
23
FILE *csvfp = x265_fopen(fname, "r");
24
if (csvfp)
25
{
26
27
if (level)
28
{
29
fprintf(csvfp, "Encode Order, Type, POC, QP, Bits, Scenecut, ");
30
+ if (level >= 2)
31
+ fprintf(csvfp, "I/P cost ratio, ");
32
if (param.rc.rateControlMode == X265_RC_CRF)
33
fprintf(csvfp, "RateFactor, ");
34
if (param.rc.vbvBufferSize)
35
36
fprintf(csvfp, "Latency, ");
37
fprintf(csvfp, "List 0, List 1");
38
uint32_t size = param.maxCUSize;
39
- for (uint32_t depth = 0; depth <= g_maxCUDepth; depth++)
40
+ for (uint32_t depth = 0; depth <= param.maxCUDepth; depth++)
41
{
42
fprintf(csvfp, ", Intra %dx%d DC, Intra %dx%d Planar, Intra %dx%d Ang", size, size, size, size, size, size);
43
size /= 2;
44
45
size = param.maxCUSize;
46
if (param.bEnableRectInter)
47
{
48
- for (uint32_t depth = 0; depth <= g_maxCUDepth; depth++)
49
+ for (uint32_t depth = 0; depth <= param.maxCUDepth; depth++)
50
{
51
fprintf(csvfp, ", Inter %dx%d, Inter %dx%d (Rect)", size, size, size, size);
52
if (param.bEnableAMP)
53
54
}
55
else
56
{
57
- for (uint32_t depth = 0; depth <= g_maxCUDepth; depth++)
58
+ for (uint32_t depth = 0; depth <= param.maxCUDepth; depth++)
59
{
60
fprintf(csvfp, ", Inter %dx%d", size, size);
61
size /= 2;
62
}
63
}
64
size = param.maxCUSize;
65
- for (uint32_t depth = 0; depth <= g_maxCUDepth; depth++)
66
+ for (uint32_t depth = 0; depth <= param.maxCUDepth; depth++)
67
{
68
fprintf(csvfp, ", Skip %dx%d", size, size);
69
size /= 2;
70
}
71
size = param.maxCUSize;
72
- for (uint32_t depth = 0; depth <= g_maxCUDepth; depth++)
73
+ for (uint32_t depth = 0; depth <= param.maxCUDepth; depth++)
74
{
75
fprintf(csvfp, ", Merge %dx%d", size, size);
76
size /= 2;
77
}
78
- fprintf(csvfp, ", Avg Luma Distortion, Avg Chroma Distortion, Avg psyEnergy, Avg Luma Level, Max Luma Level, Avg Residual Energy");
79
80
- /* detailed performance statistics */
81
if (level >= 2)
82
- fprintf(csvfp, ", DecideWait (ms), Row0Wait (ms), Wall time (ms), Ref Wait Wall (ms), Total CTU time (ms), Stall Time (ms), Total frame time (ms), Avg WPP, Row Blocks");
83
+ {
84
+ fprintf(csvfp, ", Avg Luma Distortion, Avg Chroma Distortion, Avg psyEnergy, Avg Residual Energy,"
85
+ " Min Luma Level, Max Luma Level, Avg Luma Level");
86
+
87
+ if (param.internalCsp != X265_CSP_I400)
88
+ fprintf(csvfp, ", Min Cb Level, Max Cb Level, Avg Cb Level, Min Cr Level, Max Cr Level, Avg Cr Level");
89
+
90
+ /* PU statistics */
91
+ size = param.maxCUSize;
92
+ for (uint32_t i = 0; i< param.maxLog2CUSize - (uint32_t)g_log2Size[param.minCUSize] + 1; i++)
93
+ {
94
+ fprintf(csvfp, ", Intra %dx%d", size, size);
95
+ fprintf(csvfp, ", Skip %dx%d", size, size);
96
+ fprintf(csvfp, ", AMP %d", size);
97
+ fprintf(csvfp, ", Inter %dx%d", size, size);
98
+ fprintf(csvfp, ", Merge %dx%d", size, size);
99
+ fprintf(csvfp, ", Inter %dx%d", size, size / 2);
100
+ fprintf(csvfp, ", Merge %dx%d", size, size / 2);
101
+ fprintf(csvfp, ", Inter %dx%d", size / 2, size);
102
+ fprintf(csvfp, ", Merge %dx%d", size / 2, size);
103
+ size /= 2;
104
+ }
105
+
106
+ if ((uint32_t)g_log2Size[param.minCUSize] == 3)
107
+ fprintf(csvfp, ", 4x4");
108
+
109
+ /* detailed performance statistics */
110
+ fprintf(csvfp, ", DecideWait (ms), Row0Wait (ms), Wall time (ms), Ref Wait Wall (ms), Total CTU time (ms),"
111
+ "Stall Time (ms), Total frame time (ms), Avg WPP, Row Blocks");
112
+ }
113
fprintf(csvfp, "\n");
114
}
115
else
116
117
return;
118
119
const x265_frame_stats* frameStats = &pic.frameData;
120
- fprintf(csvfp, "%d, %c-SLICE, %4d, %2.2lf, %10d, %d,", frameStats->encoderOrder, frameStats->sliceType, frameStats->poc, frameStats->qp, (int)frameStats->bits, frameStats->bScenecut);
121
+ fprintf(csvfp, "%d, %c-SLICE, %4d, %2.2lf, %10d, %d,", frameStats->encoderOrder, frameStats->sliceType, frameStats->poc,
122
+ frameStats->qp, (int)frameStats->bits, frameStats->bScenecut);
123
+ if (level >= 2)
124
+ fprintf(csvfp, "%.2f,", frameStats->ipCostRatio);
125
if (param.rc.rateControlMode == X265_RC_CRF)
126
fprintf(csvfp, "%.3lf,", frameStats->rateFactor);
127
if (param.rc.vbvBufferSize)
128
129
else
130
fputs(" -,", csvfp);
131
}
132
- for (uint32_t depth = 0; depth <= g_maxCUDepth; depth++)
133
- fprintf(csvfp, "%5.2lf%%, %5.2lf%%, %5.2lf%%,", frameStats->cuStats.percentIntraDistribution[depth][0], frameStats->cuStats.percentIntraDistribution[depth][1], frameStats->cuStats.percentIntraDistribution[depth][2]);
134
- fprintf(csvfp, "%5.2lf%%", frameStats->cuStats.percentIntraNxN);
135
- if (param.bEnableRectInter)
136
+
137
+ if (level)
138
{
139
- for (uint32_t depth = 0; depth <= g_maxCUDepth; depth++)
140
+ for (uint32_t depth = 0; depth <= param.maxCUDepth; depth++)
141
+ fprintf(csvfp, "%5.2lf%%, %5.2lf%%, %5.2lf%%,", frameStats->cuStats.percentIntraDistribution[depth][0],
142
+ frameStats->cuStats.percentIntraDistribution[depth][1],
143
+ frameStats->cuStats.percentIntraDistribution[depth][2]);
144
+ fprintf(csvfp, "%5.2lf%%", frameStats->cuStats.percentIntraNxN);
145
+ if (param.bEnableRectInter)
146
{
147
- fprintf(csvfp, ", %5.2lf%%, %5.2lf%%", frameStats->cuStats.percentInterDistribution[depth][0], frameStats->cuStats.percentInterDistribution[depth][1]);
148
- if (param.bEnableAMP)
149
- fprintf(csvfp, ", %5.2lf%%", frameStats->cuStats.percentInterDistribution[depth][2]);
150
+ for (uint32_t depth = 0; depth <= param.maxCUDepth; depth++)
151
+ {
152
+ fprintf(csvfp, ", %5.2lf%%, %5.2lf%%", frameStats->cuStats.percentInterDistribution[depth][0],
153
+ frameStats->cuStats.percentInterDistribution[depth][1]);
154
+ if (param.bEnableAMP)
155
+ fprintf(csvfp, ", %5.2lf%%", frameStats->cuStats.percentInterDistribution[depth][2]);
156
+ }
157
}
158
+ else
159
+ {
160
+ for (uint32_t depth = 0; depth <= param.maxCUDepth; depth++)
161
+ fprintf(csvfp, ", %5.2lf%%", frameStats->cuStats.percentInterDistribution[depth][0]);
162
+ }
163
+ for (uint32_t depth = 0; depth <= param.maxCUDepth; depth++)
164
+ fprintf(csvfp, ", %5.2lf%%", frameStats->cuStats.percentSkipCu[depth]);
165
+ for (uint32_t depth = 0; depth <= param.maxCUDepth; depth++)
166
+ fprintf(csvfp, ", %5.2lf%%", frameStats->cuStats.percentMergeCu[depth]);
167
}
168
- else
169
- {
170
- for (uint32_t depth = 0; depth <= g_maxCUDepth; depth++)
171
- fprintf(csvfp, ", %5.2lf%%", frameStats->cuStats.percentInterDistribution[depth][0]);
172
- }
173
- for (uint32_t depth = 0; depth <= g_maxCUDepth; depth++)
174
- fprintf(csvfp, ", %5.2lf%%", frameStats->cuStats.percentSkipCu[depth]);
175
- for (uint32_t depth = 0; depth <= g_maxCUDepth; depth++)
176
- fprintf(csvfp, ", %5.2lf%%", frameStats->cuStats.percentMergeCu[depth]);
177
- fprintf(csvfp, ", %.2lf, %.2lf, %.2lf, %.2lf, %d, %.2lf", frameStats->avgLumaDistortion, frameStats->avgChromaDistortion, frameStats->avgPsyEnergy, frameStats->avgLumaLevel, frameStats->maxLumaLevel, frameStats->avgResEnergy);
178
179
if (level >= 2)
180
{
181
- fprintf(csvfp, ", %.1lf, %.1lf, %.1lf, %.1lf, %.1lf, %.1lf, %.1lf,", frameStats->decideWaitTime, frameStats->row0WaitTime, frameStats->wallTime, frameStats->refWaitWallTime, frameStats->totalCTUTime, frameStats->stallTime, frameStats->totalFrameTime);
182
+ fprintf(csvfp, ", %.2lf, %.2lf, %.2lf, %.2lf ", frameStats->avgLumaDistortion,
183
+ frameStats->avgChromaDistortion,
184
+ frameStats->avgPsyEnergy,
185
+ frameStats->avgResEnergy);
186
+
187
+ fprintf(csvfp, ", %d, %d, %.2lf", frameStats->minLumaLevel, frameStats->maxLumaLevel, frameStats->avgLumaLevel);
188
+
189
+ if (param.internalCsp != X265_CSP_I400)
190
+ {
191
+ fprintf(csvfp, ", %d, %d, %.2lf", frameStats->minChromaULevel, frameStats->maxChromaULevel, frameStats->avgChromaULevel);
192
+ fprintf(csvfp, ", %d, %d, %.2lf", frameStats->minChromaVLevel, frameStats->maxChromaVLevel, frameStats->avgChromaVLevel);
193
+ }
194
+
195
+ for (uint32_t i = 0; i < param.maxLog2CUSize - (uint32_t)g_log2Size[param.minCUSize] + 1; i++)
196
+ {
197
+ fprintf(csvfp, ", %.2lf%%", frameStats->puStats.percentIntraPu[i]);
198
+ fprintf(csvfp, ", %.2lf%%", frameStats->puStats.percentSkipPu[i]);
199
+ fprintf(csvfp, ",%.2lf%%", frameStats->puStats.percentAmpPu[i]);
200
+ for (uint32_t j = 0; j < 3; j++)
201
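The per-depth header loops in `x265_csvlog_open()` now take their bound from `param.maxCUDepth` instead of the global `g_maxCUDepth`, so each encoder instance emits columns that match its own CU settings. The sketch below reproduces one of those loops outside the library to show the columns it yields. The function name and the `std::string` return type are assumptions for this illustration.

```cpp
#include <cstdint>
#include <string>

// Illustrative mirror of the "Skip" header loop in x265_csvlog_open().
std::string skipColumns(uint32_t maxCUSize, uint32_t maxCUDepth)
{
    std::string header;
    uint32_t size = maxCUSize;
    for (uint32_t depth = 0; depth <= maxCUDepth; depth++)
    {
        const std::string s = std::to_string(size);
        header += ", Skip " + s + "x" + s;
        size /= 2; // each depth halves the CU edge length
    }
    return header;
}
```

With a 64x64 CTU and a maximum depth of 3, this gives four columns, from `Skip 64x64` down to `Skip 8x8`.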
x265_2.4.tar.gz/source/x265-extras.h -> x265_2.5.tar.gz/source/x265-extras.h
Changed
* closed by the caller using fclose(). If level is 0, then no frame logging
* header is written to the file. This function will return NULL if it is unable
* to open the file for write or if it detects a structure size skew */
-LIBAPI FILE* x265_csvlog_open(const x265_api& api, const x265_param& param, const char* fname, int level);
+LIBAPI FILE* x265_csvlog_open(const x265_param& param, const char* fname, int level);

/* Log frame statistics to the CSV file handle. level should have been non-zero
* in the call to x265_csvlog_open() if this function is called. */

/* Log final encode statistics to the CSV file handle. 'argc' and 'argv' are
* intended to be command line arguments passed to the encoder. Encode
* statistics should be queried from the encoder just prior to closing it. */
-LIBAPI void x265_csvlog_encode(FILE* csvfp, const char* version, const x265_param& param, const x265_stats& stats, int level, int argc, char** argv);
+LIBAPI void x265_csvlog_encode(FILE* csvfp, const char* version, const x265_param& param, int padx, int pady, const x265_stats& stats, int level, int argc, char** argv);

/* In-place downshift from a bit-depth greater than 8 to a bit-depth of 8, using
* the residual bits to dither each row. */
x265_2.4.tar.gz/source/x265.cpp -> x265_2.5.tar.gz/source/x265.cpp
Changed
ReconFile* recon;
OutputFile* output;
FILE* qpfile;
- FILE* csvfpt;
- const char* csvfn;
const char* reconPlayCmd;
const x265_api* api;
x265_param* param;
bool bProgress;
bool bForceY4m;
bool bDither;
- int csvLogLevel;
uint32_t seek; // number of frames to skip from the beginning
uint32_t framesToBeEncoded; // number of frames to encode
uint64_t totalbytes;

recon = NULL;
output = NULL;
qpfile = NULL;
- csvfpt = NULL;
- csvfn = NULL;
reconPlayCmd = NULL;
api = NULL;
param = NULL;

startTime = x265_mdate();
prevUpdateTime = 0;
bDither = false;
- csvLogLevel = 0;
}

void destroy();

if (qpfile)
fclose(qpfile);
qpfile = NULL;
- if (csvfpt)
- fclose(csvfpt);
- csvfpt = NULL;
if (output)
output->release();
output = NULL;

if (0) ;
OPT2("frame-skip", "seek") this->seek = (uint32_t)x265_atoi(optarg, bError);
OPT("frames") this->framesToBeEncoded = (uint32_t)x265_atoi(optarg, bError);
- OPT("csv") this->csvfn = optarg;
- OPT("csv-log-level") this->csvLogLevel = x265_atoi(optarg, bError);
OPT("no-progress") this->bProgress = false;
OPT("output") outputfn = optarg;
OPT("input") inputfn = optarg;

* 1 - unable to parse command line
* 2 - unable to open encoder
* 3 - unable to generate stream headers
- * 4 - encoder abort
- * 5 - unable to open csv file */
+ * 4 - encoder abort */

int main(int argc, char **argv)
{

/* get the encoder parameters post-initialization */
api->encoder_parameters(encoder, param);

- if (cliopt.csvfn)
- {
- cliopt.csvfpt = x265_csvlog_open(*api, *param, cliopt.csvfn, cliopt.csvLogLevel);
- if (!cliopt.csvfpt)
- {
- x265_log_file(param, X265_LOG_ERROR, "Unable to open CSV log file <%s>, aborting\n", cliopt.csvfn);
- cliopt.destroy();
- if (cliopt.api)
- cliopt.api->param_free(cliopt.param);
- exit(5);
- }
- }
-
- /* Control-C handler */
+ /* Control-C handler */
if (signal(SIGINT, sigint_handler) == SIG_ERR)
x265_log(param, X265_LOG_ERROR, "Unable to register CTRL+C handler: %s\n", strerror(errno));

x265_picture pic_orig, pic_out;
x265_picture *pic_in = &pic_orig;
- /* Allocate recon picture if analysisMode is enabled */
+ /* Allocate recon picture if analysisReuseMode is enabled */
std::priority_queue<int64_t>* pts_queue = cliopt.output->needPTS() ? new std::priority_queue<int64_t>() : NULL;
- x265_picture *pic_recon = (cliopt.recon || !!param->analysisMode || pts_queue || reconPlay || cliopt.csvLogLevel) ? &pic_out : NULL;
+ x265_picture *pic_recon = (cliopt.recon || !!param->analysisReuseMode || pts_queue || reconPlay || param->csvLogLevel) ? &pic_out : NULL;
uint32_t inFrameCount = 0;
uint32_t outFrameCount = 0;
x265_nal *p_nal;

}

cliopt.printStatus(outFrameCount);
- if (numEncoded && cliopt.csvLogLevel)
- x265_csvlog_frame(cliopt.csvfpt, *param, *pic_recon, cliopt.csvLogLevel);
}

/* Flush the encoder */

}

cliopt.printStatus(outFrameCount);
- if (numEncoded && cliopt.csvLogLevel)
- x265_csvlog_frame(cliopt.csvfpt, *param, *pic_recon, cliopt.csvLogLevel);

if (!numEncoded)
break;

delete reconPlay;

api->encoder_get_stats(encoder, &stats, sizeof(stats));
- if (cliopt.csvfpt && !b_ctrl_c)
- x265_csvlog_encode(cliopt.csvfpt, api->version_str, *param, stats, cliopt.csvLogLevel, argc, argv);
+ if (param->csvfn && !b_ctrl_c)
+ api->encoder_log(encoder, argc, argv);
api->encoder_close(encoder);

int64_t second_largest_pts = 0;
x265_2.4.tar.gz/source/x265.h -> x265_2.5.tar.gz/source/x265.h
Changed
201
1
2
3
#ifndef X265_H
4
#define X265_H
5
-
6
#include <stdint.h>
7
+#include <stdio.h>
8
#include "x265_config.h"
9
-
10
#ifdef __cplusplus
11
extern "C" {
12
#endif
13
14
uint32_t sliceType;
15
uint32_t numCUsInFrame;
16
uint32_t numPartitions;
17
+ uint32_t depthBytes;
18
int bScenecut;
19
void* wt;
20
void* interData;
21
22
} x265_cu_stats;
23
24
25
+/* pu statistics */
26
+typedef struct x265_pu_stats
27
+{
28
+ double percentSkipPu[4]; // Percentage of skip cu in all depths
29
+ double percentIntraPu[4]; // Percentage of intra modes in all depths
30
+ double percentAmpPu[4]; // Percentage of amp modes in all depths
31
+ double percentInterPu[4][3]; // Percentage of inter 2nx2n, 2nxn and nx2n in all depths
32
+ double percentMergePu[4][3]; // Percentage of merge 2nx2n, 2nxn and nx2n in all depth
33
+ double percentNxN;
34
+
35
+ /* All the above values will add up to 100%. */
36
+} x265_pu_stats;
37
+
38
+
39
typedef struct x265_analysis_2Pass
40
{
41
uint32_t poc;
42
43
int list0POC[16];
44
int list1POC[16];
45
uint16_t maxLumaLevel;
46
+ uint16_t minLumaLevel;
47
+
48
+ uint16_t maxChromaULevel;
49
+ uint16_t minChromaULevel;
50
+ double avgChromaULevel;
51
+
52
+
53
+ uint16_t maxChromaVLevel;
54
+ uint16_t minChromaVLevel;
55
+ double avgChromaVLevel;
56
+
57
char sliceType;
58
int bScenecut;
59
+ double ipCostRatio;
60
int frameLatency;
61
x265_cu_stats cuStats;
62
+ x265_pu_stats puStats;
63
double totalFrameTime;
64
} x265_frame_stats;
65
66
+typedef struct x265_ctu_info_t
67
+{
68
+ int32_t ctuAddress;
69
+ int32_t ctuPartitions[64];
70
+ void* ctuInfo;
71
+} x265_ctu_info_t;
72
+
73
+typedef enum
74
+{
75
+ NO_CTU_INFO = 0,
76
+ HAS_CTU_INFO = 1,
77
+ CTU_INFO_CHANGE = 2,
78
+}CTUInfo;
79
+
80
+
81
/* Arbitrary User SEI
82
* Payload size is in bytes and the payload pointer must be non-NULL.
83
* Payload types and syntax can be found in Annex D of the H.265 Specification.
84
85
* to allow the encoder to determine base QP */
86
int forceqp;
87
88
- /* If param.analysisMode is X265_ANALYSIS_OFF this field is ignored on input
89
+ /* If param.analysisReuseMode is X265_ANALYSIS_OFF this field is ignored on input
90
* and output. Else the user must call x265_alloc_analysis_data() to
91
* allocate analysis buffers for every picture passed to the encoder.
92
*
93
- * On input when param.analysisMode is X265_ANALYSIS_LOAD and analysisData
94
+ * On input when param.analysisReuseMode is X265_ANALYSIS_LOAD and analysisData
95
* member pointers are valid, the encoder will use the data stored here to
96
* reduce encoder work.
97
*
98
- * On output when param.analysisMode is X265_ANALYSIS_SAVE and analysisData
99
+ * On output when param.analysisReuseMode is X265_ANALYSIS_SAVE and analysisData
100
* member pointers are valid, the encoder will write output analysis into
101
* this data structure */
102
x265_analysis_data analysisData;
103
104
* X265_LOG_FULL, default is X265_LOG_INFO */
105
int logLevel;
106
107
- /* Filename of CSV log. Now deprecated */
108
+ /* Level of csv logging. 0 is summary, 1 is frame level logging,
109
+ * 2 is frame level logging with performance statistics */
110
+ int csvLogLevel;
111
+
112
+ /* filename of CSV log. If csvLogLevel is non-zero, the encoder will emit
113
+ * per-slice statistics to this log file in encode order. Otherwise the
114
+ * encoder will emit per-stream statistics into the log file when
115
+ * x265_encoder_log is called (presumably at the end of the encode) */
116
const char* csvfn;
117
118
/*== Internal Picture Specification ==*/
119
120
* buffers. if X265_ANALYSIS_LOAD, read analysis information into analysis
121
* buffer and use this analysis information to reduce the amount of work
122
* the encoder must perform. Default X265_ANALYSIS_OFF */
123
- int analysisMode;
124
+ int analysisReuseMode;
125
126
- /* Filename for analysisMode save/load. Default name is "x265_analysis.dat" */
127
- const char* analysisFileName;
128
+ /* Filename for analysisReuseMode save/load. Default name is "x265_analysis.dat" */
129
+ const char* analysisReuseFileName;
130
131
/*== Rate Control ==*/
132
133
134
135
/* sets a hard lower limit on QP */
136
int qpMin;
137
+
138
+ /* Enabled internally when --tune grain is set */
139
+ int bEnableConstVbv;
140
} rc;
141
142
/*== Video Usability Information ==*/
143
144
int bHDROpt;
145
146
/* A value between 1 and 10 (both inclusive) determines the level of
147
- * information stored/reused in save/load analysis-mode. Higher the refine
148
- * level higher the informtion stored/reused. Default is 5 */
149
- int analysisRefineLevel;
150
+ * information stored/reused in save/load analysis-reuse-mode. The higher the
151
+ * refine level, the more information is stored/reused. Default is 5 */
152
+ int analysisReuseLevel;
153
154
/* Limit Sample Adaptive Offset filter computation by early terminating SAO
155
* process based on inter prediction mode, CTU spatial-domain correlations,
156
157
/* Insert tone mapping information only for IDR frames and when the
158
* tone mapping information changes. */
159
int bDhdr10opt;
160
+
161
+ /* Determines how x265 reacts to the content information received through the API */
162
+ int bCTUInfo;
163
+
164
+ /* Use ratecontrol statistics from pic_in, if available*/
165
+ int bUseRcStats;
166
+
167
+ /* Factor by which input video is scaled down for analysis save mode. Default is 0 */
168
+ int scaleFactor;
169
+
170
+ /* Enable intra refinement in load mode*/
171
+ int intraRefine;
172
+
173
+ /* Enable inter refinement in load mode*/
174
+ int interRefine;
175
+
176
+ /* Enable motion vector refinement in load mode*/
177
+ int mvRefine;
178
+
179
+ /* Log of maximum CTU size */
180
+ uint32_t maxLog2CUSize;
181
+
182
+ /* Actual CU depth with respect to config depth */
183
+ uint32_t maxCUDepth;
184
+
185
+ /* CU depth with respect to maximum transform size */
186
+ uint32_t unitSizeDepth;
187
+
188
+ /* Number of 4x4 units in maximum CU size */
189
+ uint32_t num4x4Partitions;
190
+
191
+ /* Specify if analysis mode uses file for data reuse */
192
+ int bUseAnalysisFile;
193
+
194
+ /* File pointer for csv log */
195
+ FILE* csvfpt;
196
} x265_param;
197
+
198
/* x265_param_alloc:
199
* Allocates an x265_param instance. The returned param structure is not
200
* special in any way, but using this method together with x265_param_free()
201
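The renamed analysis-reuse fields and the new csvLogLevel field from the x265.h hunks above could be exercised roughly as follows. This is a minimal sketch, not the library's code: demo_param is a hypothetical stand-in for the handful of x265_param members touched by this diff, so the block compiles without x265.h. The field names, the default reuse level of 5, and the default file name "x265_analysis.dat" come from the comments in the diff; the numeric values of the DEMO_ANALYSIS_* constants and the initial csvLogLevel of 0 are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed values; the real constants are the X265_ANALYSIS_* macros in x265.h. */
enum { DEMO_ANALYSIS_OFF = 0, DEMO_ANALYSIS_SAVE = 1, DEMO_ANALYSIS_LOAD = 2 };

/* Hypothetical stand-in for the x265_param members changed in this revision. */
typedef struct demo_param
{
    int         analysisReuseMode;     /* 2.4 name: analysisMode        */
    const char* analysisReuseFileName; /* 2.4 name: analysisFileName    */
    int         analysisReuseLevel;    /* 2.4 name: analysisRefineLevel */
    int         csvLogLevel;           /* new in 2.5                    */
    const char* csvfn;                 /* CSV log file name, may be NULL */
} demo_param;

/* Fill in the defaults documented in the header comments. */
static void demo_param_default(demo_param* p)
{
    assert(p != NULL);
    p->analysisReuseMode     = DEMO_ANALYSIS_OFF;
    p->analysisReuseFileName = "x265_analysis.dat";
    p->analysisReuseLevel    = 5;
    p->csvLogLevel           = 0;    /* assumption: start at summary level */
    p->csvfn                 = NULL;
}

/* Map a csvLogLevel value to the description given in the header comment. */
static const char* demo_csv_level_name(int level)
{
    switch (level)
    {
    case 0:  return "summary";
    case 1:  return "frame level";
    case 2:  return "frame level with performance statistics";
    default: return "unknown";
    }
}
```

An application moving from 2.4 to 2.5 would make the same three renames on the real x265_param and, if it wants per-frame CSV output, set csvLogLevel to 1 or 2 together with csvfn.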
x265_2.4.tar.gz/source/x265cli.h -> x265_2.5.tar.gz/source/x265cli.h
Changed
94
1
2
{ "scenecut", required_argument, NULL, 0 },
3
{ "no-scenecut", no_argument, NULL, 0 },
4
{ "scenecut-bias", required_argument, NULL, 0 },
5
+ { "ctu-info", required_argument, NULL, 0 },
6
{ "intra-refresh", no_argument, NULL, 0 },
7
{ "rc-lookahead", required_argument, NULL, 0 },
8
{ "lookahead-slices", required_argument, NULL, 0 },
9
10
{ "qpstep", required_argument, NULL, 0 },
11
{ "qpmin", required_argument, NULL, 0 },
12
{ "qpmax", required_argument, NULL, 0 },
13
+ { "const-vbv", no_argument, NULL, 0 },
14
+ { "no-const-vbv", no_argument, NULL, 0 },
15
{ "ratetol", required_argument, NULL, 0 },
16
{ "cplxblur", required_argument, NULL, 0 },
17
{ "qblur", required_argument, NULL, 0 },
18
19
{ "no-slow-firstpass", no_argument, NULL, 0 },
20
{ "multi-pass-opt-rps", no_argument, NULL, 0 },
21
{ "no-multi-pass-opt-rps", no_argument, NULL, 0 },
22
- { "analysis-mode", required_argument, NULL, 0 },
23
- { "analysis-file", required_argument, NULL, 0 },
24
- { "refine-level", required_argument, NULL, 0 },
25
+ { "analysis-reuse-mode", required_argument, NULL, 0 },
26
+ { "analysis-reuse-file", required_argument, NULL, 0 },
27
+ { "analysis-reuse-level", required_argument, NULL, 0 },
28
+ { "scale-factor", required_argument, NULL, 0 },
29
+ { "refine-intra", required_argument, NULL, 0 },
30
+ { "refine-inter", no_argument, NULL, 0 },
31
+ { "no-refine-inter",no_argument, NULL, 0 },
32
{ "strict-cbr", no_argument, NULL, 0 },
33
{ "temporal-layers", no_argument, NULL, 0 },
34
{ "no-temporal-layers", no_argument, NULL, 0 },
35
36
{ "dhdr10-info", required_argument, NULL, 0 },
37
{ "dhdr10-opt", no_argument, NULL, 0},
38
{ "no-dhdr10-opt", no_argument, NULL, 0},
39
+ { "refine-mv", no_argument, NULL, 0 },
40
+ { "no-refine-mv", no_argument, NULL, 0 },
41
{ 0, 0, 0, 0 },
42
{ 0, 0, 0, 0 },
43
{ 0, 0, 0, 0 },
44
45
H1(" 1 - i420 (4:2:0 default)\n");
46
H1(" 2 - i422 (4:2:2)\n");
47
H1(" 3 - i444 (4:4:4)\n");
48
-#if ENABLE_DYNAMIC_HDR10
49
- H0(" --dhdr10-info <filename> JSON file containing the Creative Intent Metadata to be encoded as Dynamic Tone Mapping \n");
50
- H0(" --[no-]dhdr10-opt Insert tone mapping SEI only for IDR frames and when the tone mapping information changes. Default disabled");
51
+#if ENABLE_HDR10_PLUS
52
+ H0(" --dhdr10-info <filename> JSON file containing the Creative Intent Metadata to be encoded as Dynamic Tone Mapping\n");
53
+ H0(" --[no-]dhdr10-opt Insert tone mapping SEI only for IDR frames and when the tone mapping information changes. Default disabled\n");
54
#endif
55
H0("-f/--frames <integer> Maximum number of frames to encode. Default all\n");
56
H0(" --seek <integer> First frame to encode\n");
57
58
H1(" --[no-]tskip-fast Enable fast intra transform skipping. Default %s\n", OPT(param->bEnableTSkipFast));
59
H1(" --nr-intra <integer> An integer value in range of 0 to 2000, which denotes strength of noise reduction in intra CUs. Default 0\n");
60
H1(" --nr-inter <integer> An integer value in range of 0 to 2000, which denotes strength of noise reduction in inter CUs. Default 0\n");
61
+ H0(" --ctu-info <integer> Enable receiving ctu information asynchronously and determine reaction to the CTU information (0, 1, 2, 4, 6) Default 0\n"
62
+ " - 1: force the partitions if CTU information is present\n"
63
+ " - 2: functionality of (1) and reduce qp if CTU information has changed\n"
64
+ " - 4: functionality of (1) and force Inter modes when CTU Information has changed, merge/skip otherwise\n"
65
+ " Enable this option only when planning to invoke the API function x265_encoder_ctu_info to copy ctu-info asynchronously\n");
66
H0("\nCoding tools:\n");
67
H0("-w/--[no-]weightp Enable weighted prediction in P slices. Default %s\n", OPT(param->bEnableWeightedPred));
68
H0(" --[no-]weightb Enable weighted prediction in B slices. Default %s\n", OPT(param->bEnableWeightedBiPred));
69
70
H0(" --[no-]analyze-src-pics Motion estimation uses source frame planes. Default disable\n");
71
H0(" --[no-]slow-firstpass Enable a slow first pass in a multipass rate control mode. Default %s\n", OPT(param->rc.bEnableSlowFirstPass));
72
H0(" --[no-]strict-cbr Enable stricter conditions and tolerance for bitrate deviations in CBR mode. Default %s\n", OPT(param->rc.bStrictCbr));
73
- H0(" --analysis-mode <string|int> save - Dump analysis info into file, load - Load analysis buffers from the file. Default %d\n", param->analysisMode);
74
- H0(" --analysis-file <filename> Specify file name used for either dumping or reading analysis data.\n");
75
- H0(" --refine-level <1..10> Level of analysis refinement indicates amount of info stored/reused in save/load mode, 1:least....10:most. Default %d\n", param->analysisRefineLevel);
76
+ H0(" --analysis-reuse-mode <string|int> save - Dump analysis info into file, load - Load analysis buffers from the file. Default %d\n", param->analysisReuseMode);
77
+ H0(" --analysis-reuse-file <filename> Specify file name used for either dumping or reading analysis data. Default x265_analysis.dat\n");
78
+ H0(" --analysis-reuse-level <1..10> Level of analysis reuse indicates amount of info stored/reused in save/load mode, 1:least..10:most. Default %d\n", param->analysisReuseLevel);
79
+ H0(" --scale-factor <int> Specify factor by which input video is scaled down for analysis save mode. Default %d\n", param->scaleFactor);
80
+ H0(" --refine-intra <int> Enable intra refinement for load mode. Default %d\n", param->intraRefine);
81
+ H0(" --[no-]refine-inter Enable inter refinement for load mode. Default %s\n", OPT(param->interRefine));
82
+ H0(" --[no-]refine-mv Enable mv refinement for load mode. Default %s\n", OPT(param->mvRefine));
83
H0(" --aq-mode <integer> Mode for Adaptive Quantization - 0:none 1:uniform AQ 2:auto variance 3:auto variance with bias to dark scenes. Default %d\n", param->rc.aqMode);
84
H0(" --aq-strength <float> Reduces blocking and blurring in flat and textured areas (0 to 3.0). Default %.2f\n", param->rc.aqStrength);
85
H0(" --[no-]aq-motion Adaptive Quantization based on the relative motion of each CU w.r.t., frame. Default %s\n", OPT(param->bOptCUDeltaQP));
86
87
H1(" --qpstep <integer> The maximum single adjustment in QP allowed to rate control. Default %d\n", param->rc.qpStep);
88
H1(" --qpmin <integer> sets a hard lower limit on QP allowed to ratecontrol. Default %d\n", param->rc.qpMin);
89
H1(" --qpmax <integer> sets a hard upper limit on QP allowed to ratecontrol. Default %d\n", param->rc.qpMax);
90
+ H0(" --[no-]const-vbv Enable consistent vbv. turned on with tune grain. Default %s\n", OPT(param->rc.bEnableConstVbv));
91
H1(" --cbqpoffs <integer> Chroma Cb QP Offset [-12..12]. Default %d\n", param->cbQpOffset);
92
H1(" --crqpoffs <integer> Chroma Cr QP Offset [-12..12]. Default %d\n", param->crQpOffset);
93
H1(" --scaling-list <string> Specify a file containing HM style quant scaling lists or 'default' or 'off'. Default: off\n");
94
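The --ctu-info help text added above lists the accepted values (0, 1, 2, 4, 6) and says that 2 and 4 each include the behaviour of 1. One way to read that is as a small set of flags, which the sketch below decodes. This is an illustration only, not x265's implementation: demo_ctu_reaction and demo_decode_ctu_info are hypothetical names, and treating 6 as the combination of 2 and 4 is an assumption inferred from the value list, since the help text does not describe 6 explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of how the encoder reacts to received CTU information. */
typedef struct demo_ctu_reaction
{
    bool forcePartitions;     /* 1: force partitions when CTU info is present */
    bool reduceQpOnChange;    /* 2: also reduce QP when CTU info has changed  */
    bool forceInterOnChange;  /* 4: also force inter modes when it changed,
                                    merge/skip otherwise                      */
} demo_ctu_reaction;

/* Decode a --ctu-info value. Returns false for values outside {0, 1, 2, 4, 6}. */
static bool demo_decode_ctu_info(int value, demo_ctu_reaction* out)
{
    assert(out != NULL);
    out->forcePartitions    = false;
    out->reduceQpOnChange   = false;
    out->forceInterOnChange = false;

    switch (value)
    {
    case 0: case 1: case 2: case 4: case 6:
        break;
    default:
        return false;
    }

    /* Every non-zero setting includes the behaviour of (1). */
    out->forcePartitions    = (value != 0);
    out->reduceQpOnChange   = (value & 2) != 0;
    out->forceInterOnChange = (value & 4) != 0;  /* assumption: 6 == 2 | 4 */
    return true;
}
```

As the help text notes, the option is only meaningful when the application also calls x265_encoder_ctu_info to supply the CTU information; the sketch covers the value decoding and nothing else.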