Changes of Revision 20
x265.changes
Changed
-------------------------------------------------------------------
+Fri Feb 24 14:03:24 UTC 2017 - ismail@i10z.com
+
+- Update to version 2.3
+ Encoder enhancements
+ * New SSIM-based RD-cost computation for improved visual quality
+ and efficiency; use --ssim-rd to exercise.
+ * Multi-pass encoding can now share analysis information from
+ prior passes.
+ * A dedicated thread pool for lookahead can now be specified
+ with --lookahead-threads.
+ * --dynamic-rd dynamically increases analysis in areas
+ where the bitrate is being capped by VBV; works for both
+ CRF and ABR encodes with VBV settings.
+ * The number of bits used to signal the delta-QP can be
+ optimized with the --opt-cu-delta-qp option.
+ * Experimental feature --aq-motion adds new QP offsets
+ based on relative motion of a block with respect to the
+ movement of the frame.
+ API changes
+ * Reconfigure API now supports signalling new scaling lists.
+ * x265 application’s csv functionality now reports time
+ (in milliseconds) taken to encode each frame.
+ * --strict-cbr enables stricter bitrate adherence by adding
+ filler bits when achieved bitrate is lower than the target.
+ * --hdr can be used to ensure that max-cll and max-fall values
+ are always signaled (even if 0,0).
+ Bug fixes
+ * Fixed scaling lists support for 4:4:4 videos.
+ * Inconsistent output fix for --opt-qp-pps by removing last
+ slice’s QP from cost calculation.
+
+-------------------------------------------------------------------
Sun Jan 1 20:32:07 UTC 2017 - idonmez@suse.com

- Update to version 2.2
- Encode enhancements
+ Encoder enhancements
* Enhancements to TU selection algorithm with early-outs for
improved speed; use --limit-tu to exercise.
* New motion search method SEA (Successive Elimination Algorithm)
x265.spec
Changed
# based on the spec file from https://build.opensuse.org/package/view_file/home:Simmphonie/libx265/

Name: x265
-%define soname 102
+%define soname 110
%define libname lib%{name}
%define libsoname %{libname}-%{soname}
-Version: 2.2
+Version: 2.3
Release: 0
License: GPL-2.0+
Summary: A free h265/HEVC encoder - encoder binary
baselibs.conf
Changed
-libx265-102
+libx265-110
x265_2.2.tar.gz/.hg_archival.txt -> x265_2.3.tar.gz/.hg_archival.txt
Changed
repo: 09fe40627f03a0f9c3e6ac78b22ac93da23f9fdf
-node: be14a7e9755e54f0fd34911c72bdfa66981220bc
+node: 3037c1448549ca920967831482c653e5892fa8ed
branch: stable
-tag: 2.2
+tag: 2.3
x265_2.2.tar.gz/.hgtags -> x265_2.3.tar.gz/.hgtags
Changed
1d3b6e448e01ec40b392ef78b7e55a86249fbe68 1.9
960c9991d0dcf46559c32e070418d3cbb7e8aa2f 2.0
981e3bfef16a997bce6f46ce1b15631a0e234747 2.1
+be14a7e9755e54f0fd34911c72bdfa66981220bc 2.2
x265_2.2.tar.gz/doc/reST/cli.rst -> x265_2.3.tar.gz/doc/reST/cli.rst
Changed
.. option:: --limit-tu <0..4>

Enables early exit from TU depth recursion, for inter coded blocks.
+
Level 1 - decides to recurse to next higher depth based on cost
comparison of full size TU and split TU.

quad-tree begins at the same depth of the coded tree unit, but if the
maximum TU size is smaller than the CU size then transform QT begins
at the depth of the max-tu-size. Default: 32.
+
+.. option:: --dynamic-rd <0..4>
+
+ Increases the RD level at points where quality drops due to VBV rate
+ control enforcement. The number of CUs for which the RD is reconfigured
+ is determined based on the strength. Strength 1 gives the best FPS,
+ strength 4 gives the best SSIM. Strength 0 switches this feature off.
+ Default: 0.
+
+ Effective for RD levels 4 and below.
+
+.. option:: --ssim-rd, --no-ssim-rd
+
+ Enable/Disable SSIM RDO. SSIM is a better perceptual quality assessment
+ method than MSE. SSIM-based RDO uses a residual divisive normalization
+ scheme, which is consistent with the luminance and contrast masking
+ effects of the human visual system. It is used for mode selection during
+ analysis of CTUs and can achieve significant gains in the objective
+ quality metrics SSIM and PSNR. It only affects presets that use
+ RDO-based mode decisions (:option:`--rd` 3 and above).

Temporal / motion search options
================================

Default: 8 for ultrafast, superfast, faster, fast, medium
4 for slow, slower
disabled for veryslow, slower
+
+.. option:: --lookahead-threads <integer>
+
+ Use multiple worker threads dedicated to doing only lookahead instead of sharing
+ the worker threads with frame encoders. A dedicated lookahead thread pool is created with the
+ specified number of worker threads. This can range from 0 up to half the
+ hardware threads available for encoding. Using too many threads for lookahead can starve
+ the frame encoders of resources and harm performance. Default is 0 (disabled): lookahead
+ shares worker threads with the frame encoders.
+
+ **Values:** 0 - disabled (default). Max - half of available hardware threads.
+
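
The documented range for ``--lookahead-threads`` (0 up to half of the hardware threads) can be illustrated with a small sketch. ``clampLookaheadThreads`` is a hypothetical helper, not an x265 function, and clamping is an assumption: the text above does not say how the encoder reacts to an out-of-range request.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not an x265 API): restrict a requested lookahead
// thread count to the documented range of 0 .. half the hardware threads.
// Clamping is an assumption; the real encoder may warn or reject instead.
int clampLookaheadThreads(int requested, int hardwareThreads)
{
    const int maxAllowed = std::max(0, hardwareThreads / 2);
    return std::min(std::max(requested, 0), maxAllowed);
}
```

With 16 hardware threads, a request for 12 lookahead threads would be reduced to 8 under this assumption, leaving the other half for the frame encoders.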
.. option:: --b-adapt <integer>

Set the level of effort in determining B frame placement.

Default 1.0.
**Range of values:** 0.0 to 3.0

+.. option:: --aq-motion, --no-aq-motion
+
+ Adjust the AQ offsets based on the relative motion of each block with
+ respect to the motion of the frame. The greater the relative motion of the block,
+ the more quantization is used. Default disabled. **Experimental Feature**
+
.. option:: --qg-size <64|32|16|8>

Enable adaptive quantization for sub-CTUs. This parameter specifies

* :option:`--subme` = MIN(2, :option:`--subme`)
* :option:`--rd` = MIN(2, :option:`--rd`)

+.. option:: --multi-pass-opt-analysis, --no-multi-pass-opt-analysis
+
+ Enable/Disable multi-pass analysis refinement along with multi-pass rate control. Based on
+ the information stored in pass 1, analysis data is refined in subsequent passes
+ and redundant steps are skipped.
+ In pass 1, analysis information such as the motion vectors, depth, reference and prediction
+ modes of the final best CTU partition is stored for each CTU.
+ Default disabled.
+
+.. option:: --multi-pass-opt-distortion, --no-multi-pass-opt-distortion
+
+ Enable/Disable multi-pass refinement of QP based on distortion data along with multi-pass
+ rate control. In pass 1, the distortion of the best CTU partition is stored. In pass 2, CTUs with high
+ distortion get lower (negative) QP offsets, and vice versa for low-distortion CTUs.
+ This helps to improve the subjective quality.
+ Default disabled.
+
.. option:: --strict-cbr, --no-strict-cbr

Enables stricter conditions to control bitrate deviance from the

where %hu are unsigned 16bit integers and %u are unsigned 32bit
integers. The SEI includes X,Y display primaries for RGB channels
and white point (WP) in units of 0.00002 and max,min luminance (L)
- values in units of 0.0001 candela per meter square. (HDR)
+ values in units of 0.0001 candela per meter square. Applicable for HDR
+ content.

Example for a P3D65 1000-nits monitor, where G(x=0.265, y=0.690),
B(x=0.150, y=0.060), R(x=0.680, y=0.320), WP(x=0.3127, y=0.3290),

emitted. The string format is "%hu,%hu" where %hu are unsigned 16bit
integers. The first value is the max content light level (or 0 if no
maximum is indicated), the second value is the maximum picture
- average light level (or 0). (HDR)
+ average light level (or 0). Applicable for HDR content.

Example for MaxCLL=1000 candela per square meter, MaxFALL=400
candela per square meter:

Note that this string value will need to be escaped or quoted to
protect against shell expansion on many platforms. No default.

+.. option:: --hdr, --no-hdr
+
+ Force signalling of HDR parameters in SEI packets. Enabled
+ automatically when :option:`--master-display` or :option:`--max-cll` is
+ specified. Useful when there is a desire to signal 0 values for max-cll
+ and max-fall. Default disabled.
+
.. option:: --min-luma <integer>

Minimum luma value allowed for input pictures. Any values below min-luma

Maximum of the picture order count. Default 8

-.. option:: --[no-]vui-timing-info
+.. option:: --vui-timing-info, --no-vui-timing-info

Emit VUI timing info in bitstream. Default enabled.

-.. option:: --[no-]vui-hrd-info
+.. option:: --vui-hrd-info, --no-vui-hrd-info

Emit VUI HRD info in bitstream. Default enabled when
:option:`--hrd` is enabled.

-.. option:: --[no-]opt-qp-pps
+.. option:: --opt-qp-pps, --no-opt-qp-pps

Optimize QP in PPS (instead of default value of 26) based on the QP values
observed in last GOP. Default enabled.

-.. option:: --[no-]opt-ref-list-length-pps
+.. option:: --opt-ref-list-length-pps, --no-opt-ref-list-length-pps

Optimize L0 and L1 ref list length in PPS (instead of default value of 0)
based on the lengths observed in the last GOP. Default enabled.

-.. option:: --[no-]multi-pass-opt-rps
+.. option:: --multi-pass-opt-rps, --no-multi-pass-opt-rps

Enable storing commonly used RPS in SPS in multi pass mode. Default disabled.

+.. option:: --opt-cu-delta-qp, --no-opt-cu-delta-qp
+
+ Optimize CU level QPs by pulling up lower QPs to a value close to meanQP, thereby
+ minimizing fluctuations in deltaQP signalling. Default disabled.
+
+ Only effective at RD levels 5 and 6.
+

Debugging options
=================
x265_2.2.tar.gz/doc/reST/releasenotes.rst -> x265_2.3.tar.gz/doc/reST/releasenotes.rst
Changed
Release Notes
*************

+Version 2.3
+===========
+
+Release date - 15th February, 2017.
+
+Encoder enhancements
+--------------------
+1. New SSIM-based RD-cost computation for improved visual quality and efficiency; use :option:`--ssim-rd` to exercise.
+2. Multi-pass encoding can now share analysis information from prior passes (in addition to rate-control information) to improve performance and quality of subsequent passes; to your multi-pass command-lines that use the :option:`--pass` option, add :option:`--multi-pass-opt-distortion` to share distortion information, and :option:`--multi-pass-opt-analysis` to share other analysis information.
+3. A dedicated thread pool for lookahead can now be specified with :option:`--lookahead-threads`.
+4. :option:`--dynamic-rd` dynamically increases analysis in areas where the bitrate is being capped by VBV; works for both CRF and ABR encodes with VBV settings.
+5. The number of bits used to signal the delta-QP can be optimized with the :option:`--opt-cu-delta-qp` option; found to be useful in some scenarios for lower bitrate targets.
+6. Experimental feature :option:`--aq-motion` adds new QP offsets based on relative motion of a block with respect to the movement of the frame.
+
+API changes
+-----------
+1. Reconfigure API now supports signalling new scaling lists.
+2. x265 application's csv functionality now reports time (in milliseconds) taken to encode each frame.
+3. :option:`--strict-cbr` enables stricter bitrate adherence by adding filler bits when achieved bitrate is lower than the target; earlier, it was only reacting when the achieved rate was higher.
+4. :option:`--hdr` can be used to ensure that max-cll and max-fall values are always signaled (even if 0,0).
+
+Bug fixes
+---------
+1. Fixed incorrect HW thread counting on MacOS platform.
+2. Fixed scaling lists support for 4:4:4 videos.
+3. Inconsistent output fix for :option:`--opt-qp-pps` by removing last slice's QP from cost calculation.
+4. VTune profiling (enabled using ENABLE_VTUNE CMake option) now also works with 2017 VTune builds.
+
Version 2.2
===========

--------------------
1. Enhancements to TU selection algorithm with early-outs for improved speed; use :option:`--limit-tu` to exercise.
2. New motion search method SEA (Successive Elimination Algorithm) supported now as :option:`--me` 4
-3. Bit-stream optimizations to improve fields in PPS and SPS for bit-rate savings through :option:`--[no-]opt-qp-pps`, :option:`--[no-]opt-ref-list-length-pps`, and :option:`--[no-]multi-pass-opt-rps`.
+3. Bit-stream optimizations to improve fields in PPS and SPS for bit-rate savings through :option:`--opt-qp-pps`, :option:`--opt-ref-list-length-pps`, and :option:`--multi-pass-opt-rps`.
4. Enabled using VBV constraints when encoding without WPP.
5. All param options dumped in SEI packet in bitstream when info selected.
6. x265 now supports POWERPC-based systems. Several key functions also have optimized ALTIVEC kernels.
x265_2.2.tar.gz/source/CMakeLists.txt -> x265_2.3.tar.gz/source/CMakeLists.txt
Changed
option(NATIVE_BUILD "Target the build CPU" OFF)
option(STATIC_LINK_CRT "Statically link C runtime for release builds" OFF)
mark_as_advanced(FPROFILE_USE FPROFILE_GENERATE NATIVE_BUILD)
-
# X265_BUILD must be incremented each time the public API is changed
-set(X265_BUILD 102)
+set(X265_BUILD 110)
configure_file("${PROJECT_SOURCE_DIR}/x265.def.in"
"${PROJECT_BINARY_DIR}/x265.def")
configure_file("${PROJECT_SOURCE_DIR}/x265_config.h.in"

set(XCODE 1)
endif()
if(APPLE)
- add_definitions(-DMACOS)
+ add_definitions(-DMACOS=1)
endif()

if(${CMAKE_CXX_COMPILER_ID} STREQUAL "Clang")
x265_2.2.tar.gz/source/cmake/FindVtune.cmake -> x265_2.3.tar.gz/source/cmake/FindVtune.cmake
Changed
else()
NAMES amplxe-vars.bat
endif(UNIX)
- HINTS $ENV{VTUNE_AMPLIFIER_XE_2016_DIR} $ENV{VTUNE_AMPLIFIER_XE_2015_DIR}
+ HINTS $ENV{VTUNE_AMPLIFIER_XE_2017_DIR} $ENV{VTUNE_AMPLIFIER_XE_2016_DIR} $ENV{VTUNE_AMPLIFIER_XE_2015_DIR}
DOC "Vtune root directory")

set (VTUNE_INCLUDE_DIR ${VTUNE_DIR}/include)
x265_2.2.tar.gz/source/common/common.h -> x265_2.3.tar.gz/source/common/common.h
Changed

#define INTEGRAL_PLANE_NUM 12 // 12 integral planes for 32x32, 32x24, 32x8, 24x32, 16x16, 16x12, 16x4, 12x16, 8x32, 8x8, 4x16 and 4x4.

+#define NAL_TYPE_OVERHEAD 2
+#define START_CODE_OVERHEAD 3
+#define FILLER_OVERHEAD (NAL_TYPE_OVERHEAD + START_CODE_OVERHEAD + 1)
+
namespace X265_NS {

enum { SAO_NUM_OFFSET = 4 };
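
The three overhead defines added to common.h sum to 6 bytes. The sketch below shows how such a constant could be used when sizing filler data for ``--strict-cbr``. The helper ``fillerPayloadBytes`` and its rounding are assumptions made for illustration; this diff excerpt does not show the encoder's actual filler computation.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the defines added to common.h in this diff.
constexpr int kNalTypeOverhead   = 2;
constexpr int kStartCodeOverhead = 3;
constexpr int kFillerOverhead    = kNalTypeOverhead + kStartCodeOverhead + 1; // 6 bytes

static_assert(kFillerOverhead == 6, "2 + 3 + 1");

// Hypothetical helper: number of filler payload bytes that fit into the gap
// between the target and the achieved frame size, after reserving the fixed
// overhead. Returns 0 when the frame already meets or exceeds the target, or
// when the gap is too small to hold the overhead.
int64_t fillerPayloadBytes(int64_t targetBits, int64_t achievedBits)
{
    const int64_t shortfallBytes = (targetBits - achievedBits) / 8;
    const int64_t payload = shortfallBytes - kFillerOverhead;
    return payload > 0 ? payload : 0;
}
```

For a frame that falls 800 bits short of its target, this yields 100 - 6 = 94 payload bytes.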
x265_2.2.tar.gz/source/common/cudata.cpp -> x265_2.3.tar.gz/source/common/cudata.cpp
Changed
m_mvd[0] = m_mv[1] + m_numPartitions;
m_mvd[1] = m_mvd[0] + m_numPartitions;

+ m_distortion = dataPool.distortionMemBlock + instance * m_numPartitions;
+
uint32_t cuSize = g_maxCUSize >> depth;
m_trCoeff[0] = dataPool.trCoeffMemBlock + instance * (cuSize * cuSize);
m_trCoeff[1] = m_trCoeff[2] = 0;
m_transformSkip[1] = m_transformSkip[2] = m_cbf[1] = m_cbf[2] = 0;
+ m_fAc_den[0] = m_fDc_den[0] = 0;
}
else
{

m_mvd[0] = m_mv[1] + m_numPartitions;
m_mvd[1] = m_mvd[0] + m_numPartitions;

+ m_distortion = dataPool.distortionMemBlock + instance * m_numPartitions;
+
uint32_t cuSize = g_maxCUSize >> depth;
uint32_t sizeL = cuSize * cuSize;
uint32_t sizeC = sizeL >> (m_hChromaShift + m_vChromaShift); // block chroma part
m_trCoeff[0] = dataPool.trCoeffMemBlock + instance * (sizeL + sizeC * 2);
m_trCoeff[1] = m_trCoeff[0] + sizeL;
m_trCoeff[2] = m_trCoeff[0] + sizeL + sizeC;
+ for (int i = 0; i < 3; i++)
+ m_fAc_den[i] = m_fDc_den[i] = 0;
}
}

for (int8_t i = 0; i < NUM_TU_DEPTH; i++)
m_refTuDepth[i] = -1;

+ m_vbvAffected = false;
+
uint32_t widthInCU = m_slice->m_sps->numCuInWidth;
m_cuLeft = (m_cuAddr % widthInCU) ? m_encData->getPicCTU(m_cuAddr - 1) : NULL;
m_cuAbove = (m_cuAddr >= widthInCU) && !m_bFirstRowInSlice ? m_encData->getPicCTU(m_cuAddr - widthInCU) : NULL;
m_cuAboveLeft = (m_cuLeft && m_cuAbove) ? m_encData->getPicCTU(m_cuAddr - widthInCU - 1) : NULL;
m_cuAboveRight = (m_cuAbove && ((m_cuAddr % widthInCU) < (widthInCU - 1))) ? m_encData->getPicCTU(m_cuAddr - widthInCU + 1) : NULL;
+ memset(m_distortion, 0, m_numPartitions * sizeof(sse_t));
}

// initialize Sub partition

m_bFirstRowInSlice = ctu.m_bFirstRowInSlice;
m_bLastRowInSlice = ctu.m_bLastRowInSlice;
m_bLastCuInSlice = ctu.m_bLastCuInSlice;
+ for (int i = 0; i < 3; i++)
+ {
+ m_fAc_den[i] = ctu.m_fAc_den[i];
+ m_fDc_den[i] = ctu.m_fDc_den[i];
+ }

X265_CHECK(m_numPartitions == cuGeom.numPartitions, "initSubCU() size mismatch\n");

/* initialize the remaining CU data in one memset */
memset(m_predMode, 0, (ctu.m_chromaFormat == X265_CSP_I400 ? BytesPerPartition - 12 : BytesPerPartition - 8) * m_numPartitions);
+ memset(m_distortion, 0, m_numPartitions * sizeof(sse_t));
}

/* Copy the results of a sub-part (split) CU to the parent CU */

memcpy(m_mvd[0] + offset, subCU.m_mvd[0], childGeom.numPartitions * sizeof(MV));
memcpy(m_mvd[1] + offset, subCU.m_mvd[1], childGeom.numPartitions * sizeof(MV));

+ memcpy(m_distortion + offset, subCU.m_distortion, childGeom.numPartitions * sizeof(sse_t));
+
uint32_t tmp = 1 << ((g_maxLog2CUSize - childGeom.depth) * 2);
uint32_t tmp2 = subPartIdx * tmp;
memcpy(m_trCoeff[0] + tmp2, subCU.m_trCoeff[0], sizeof(coeff_t)* tmp);

memcpy(m_mv[1], cu.m_mv[1], m_numPartitions * sizeof(MV));
memcpy(m_mvd[0], cu.m_mvd[0], m_numPartitions * sizeof(MV));
memcpy(m_mvd[1], cu.m_mvd[1], m_numPartitions * sizeof(MV));
+ memcpy(m_distortion, cu.m_distortion, m_numPartitions * sizeof(sse_t));

/* force TQBypass to true */
m_partSet(m_tqBypass, true);

memcpy(ctu.m_mvd[0] + m_absIdxInCTU, m_mvd[0], m_numPartitions * sizeof(MV));
memcpy(ctu.m_mvd[1] + m_absIdxInCTU, m_mvd[1], m_numPartitions * sizeof(MV));

+ memcpy(ctu.m_distortion + m_absIdxInCTU, m_distortion, m_numPartitions * sizeof(sse_t));
+
uint32_t tmpY = 1 << ((g_maxLog2CUSize - depth) * 2);
uint32_t tmpY2 = m_absIdxInCTU << (LOG2_UNIT_SIZE * 2);
memcpy(ctu.m_trCoeff[0] + tmpY2, m_trCoeff[0], sizeof(coeff_t)* tmpY);

memcpy(m_mvd[0], ctu.m_mvd[0] + m_absIdxInCTU, m_numPartitions * sizeof(MV));
memcpy(m_mvd[1], ctu.m_mvd[1] + m_absIdxInCTU, m_numPartitions * sizeof(MV));

+ memcpy(m_distortion, ctu.m_distortion + m_absIdxInCTU, m_numPartitions * sizeof(sse_t));
+
/* clear residual coding flags */
m_partSet(m_tuDepth, 0);
m_partSet(m_transformSkip[0], 0);
x265_2.2.tar.gz/source/common/cudata.h -> x265_2.3.tar.gz/source/common/cudata.h
Changed
static cubcast_t s_partSet[NUM_FULL_DEPTH]; // pointer to broadcast set functions per absolute depth
static uint32_t s_numPartInCUSize;

+ bool m_vbvAffected;
+
FrameData* m_encData;
const Slice* m_slice;

uint8_t* m_chromaIntraDir; // array of intra directions (chroma)
enum { BytesPerPartition = 21 }; // combined sizeof() of all per-part data

+ sse_t* m_distortion;
coeff_t* m_trCoeff[3]; // transformed coefficient buffer per plane
int8_t m_refTuDepth[NUM_TU_DEPTH]; // TU depth of CU at depths 0, 1 and 2

const CUData* m_cuAboveRight; // pointer to above-right neighbor CTU
const CUData* m_cuAbove; // pointer to above neighbor CTU
const CUData* m_cuLeft; // pointer to left neighbor CTU
+ double m_meanQP;
+ uint64_t m_fAc_den[3];
+ uint64_t m_fDc_den[3];

CUData();

uint8_t* charMemBlock;
coeff_t* trCoeffMemBlock;
MV* mvMemBlock;
+ sse_t* distortionMemBlock;

- CUDataMemPool() { charMemBlock = NULL; trCoeffMemBlock = NULL; mvMemBlock = NULL; }
+ CUDataMemPool() { charMemBlock = NULL; trCoeffMemBlock = NULL; mvMemBlock = NULL; distortionMemBlock = NULL; }

bool create(uint32_t depth, uint32_t csp, uint32_t numInstances)
{

}
CHECKED_MALLOC(charMemBlock, uint8_t, numPartition * numInstances * CUData::BytesPerPartition);
CHECKED_MALLOC_ZERO(mvMemBlock, MV, numPartition * 4 * numInstances);
+ CHECKED_MALLOC(distortionMemBlock, sse_t, numPartition * numInstances);
return true;
fail:
return false;

X265_FREE(trCoeffMemBlock);
X265_FREE(mvMemBlock);
X265_FREE(charMemBlock);
+ X265_FREE(distortionMemBlock);
}
};
}
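
The pool change above allocates ``numPartition * numInstances`` distortion entries once, and cudata.cpp then hands each CU instance the slice starting at ``instance * m_numPartitions``. A minimal sketch of that layout follows. ``DistortionPoolSketch`` and ``sse_sketch_t`` are hypothetical names, and the 64-bit element type is an assumption standing in for x265's ``sse_t``.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for x265's sse_t; the real width is not shown in this diff.
using sse_sketch_t = uint64_t;

// Hypothetical model of the distortion part of CUDataMemPool: one contiguous
// allocation, sliced into equally sized per-instance regions.
struct DistortionPoolSketch
{
    std::vector<sse_sketch_t> block;
    uint32_t numPartitions;

    DistortionPoolSketch(uint32_t partitions, uint32_t instances)
        : block(static_cast<size_t>(partitions) * instances)
        , numPartitions(partitions)
    {
    }

    // Mirrors "distortionMemBlock + instance * m_numPartitions".
    sse_sketch_t* slice(uint32_t instance)
    {
        return block.data() + static_cast<size_t>(instance) * numPartitions;
    }
};
```

Because every instance owns a disjoint region of the same size, the ``memset`` and ``memcpy`` calls added in cudata.cpp can work on ``m_numPartitions * sizeof(sse_t)`` bytes without touching a neighbouring instance.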
x265_2.2.tar.gz/source/common/frame.cpp -> x265_2.3.tar.gz/source/common/frame.cpp
Changed
m_userSEI.payloads = NULL;
memset(&m_lowres, 0, sizeof(m_lowres));
m_rcData = NULL;
+ m_encodeStartTime = 0;
}

bool Frame::create(x265_param *param, float* quantOffsets)

CHECKED_MALLOC_ZERO(m_rcData, RcStats, 1);

if (m_fencPic->create(param->sourceWidth, param->sourceHeight, param->internalCsp) &&
- m_lowres.create(m_fencPic, param->bframes, !!param->rc.aqMode, param->rc.qgSize))
+ m_lowres.create(m_fencPic, param->bframes, !!param->rc.aqMode || !!param->bAQMotion, param->rc.qgSize))
{
X265_CHECK((m_reconColCount == NULL), "m_reconColCount was initialized");
m_numRows = (m_fencPic->m_picHeight + g_maxCUSize - 1) / g_maxCUSize;
x265_2.2.tar.gz/source/common/frame.h -> x265_2.3.tar.gz/source/common/frame.h
Changed
Frame* m_prev;
x265_param* m_param; // Points to the latest param set for the frame.
x265_analysis_data m_analysisData;
+ x265_analysis_2Pass m_analysis2Pass;
RcStats* m_rcData;
+
+ int64_t m_encodeStartTime;
Frame();

bool create(x265_param *param, float* quantOffsets);
x265_2.2.tar.gz/source/common/framedata.h -> x265_2.3.tar.gz/source/common/framedata.h
Changed
double avgLumaDistortion;
double avgChromaDistortion;
double avgPsyEnergy;
+ double avgSsimEnergy;
double avgResEnergy;
double percentIntraNxN;
double percentSkipCu[NUM_CU_DEPTH];

uint64_t lumaDistortion;
uint64_t chromaDistortion;
uint64_t psyEnergy;
+ int64_t ssimEnergy;
uint64_t resEnergy;
uint64_t cntSkipCu[NUM_CU_DEPTH];
uint64_t cntMergeCu[NUM_CU_DEPTH];

uint8_t* partSize;
uint8_t* mergeFlag;
};
+
+struct analysis2PassFrameData
+{
+ uint8_t* depth;
+ MV* m_mv[2];
+ int* mvpIdx[2];
+ int32_t* ref[2];
+ uint8_t* modes;
+ sse_t* distortion;
+ sse_t* ctuDistortion;
+ double* scaledDistortion;
+ double averageDistortion;
+ double sdDistortion;
+ uint32_t highDistortionCtuCount;
+ uint32_t lowDistortionCtuCount;
+ double* offset;
+ double* threshold;
+};
+
}
#endif // ifndef X265_FRAMEDATA_H
x265_2.2.tar.gz/source/common/lowres.cpp -> x265_2.3.tar.gz/source/common/lowres.cpp
Changed
if (bAQEnabled)
{
CHECKED_MALLOC_ZERO(qpAqOffset, double, cuCountFullRes);
+ CHECKED_MALLOC_ZERO(qpAqMotionOffset, double, cuCountFullRes);
CHECKED_MALLOC_ZERO(invQscaleFactor, int, cuCountFullRes);
CHECKED_MALLOC_ZERO(qpCuTreeOffset, double, cuCountFullRes);
CHECKED_MALLOC_ZERO(blockVariance, uint32_t, cuCountFullRes);

X265_FREE(lowresMvCosts[0][i]);
X265_FREE(lowresMvCosts[1][i]);
}
-
X265_FREE(qpAqOffset);
+ X265_FREE(qpAqMotionOffset);
X265_FREE(invQscaleFactor);
X265_FREE(qpCuTreeOffset);
X265_FREE(propagateCost);
x265_2.2.tar.gz/source/common/lowres.h -> x265_2.3.tar.gz/source/common/lowres.h
Changed
/* rate control / adaptive quant data */
double* qpAqOffset; // AQ QP offset values for each 16x16 CU
double* qpCuTreeOffset; // cuTree QP offset values for each 16x16 CU
+ double* qpAqMotionOffset;
int* invQscaleFactor; // qScale values for qp Aq Offsets
int* invQscaleFactor8x8; // temporary buffer for qg-size 8
uint32_t* blockVariance;
x265_2.2.tar.gz/source/common/param.cpp -> x265_2.3.tar.gz/source/common/param.cpp
Changed
param->bEnableAccessUnitDelimiters = 0;
param->bEmitHRDSEI = 0;
param->bEmitInfoSEI = 1;
+ param->bEmitHDRSEI = 0;

/* CU definitions */
param->maxCUSize = 64;

param->bBPyramid = 1;
param->scenecutThreshold = 40; /* Magic number pulled in from x264 */
param->lookaheadSlices = 8;
+ param->lookaheadThreads = 0;
param->scenecutBias = 5.0;
-
/* Intra Coding Tools */
param->bEnableConstrainedIntra = 0;
param->bEnableStrongIntraSmoothing = 1;

param->bEnableTemporalMvp = 1;
param->bSourceReferenceEstimation = 0;
param->limitTU = 0;
+ param->dynamicRd = 0;

/* Loop Filter */
param->bEnableLoopFilter = 1;

param->psyRd = 2.0;
param->psyRdoq = 0.0;
param->analysisMode = 0;
+ param->analysisMultiPassRefine = 0;
+ param->analysisMultiPassDistortion = 0;
param->analysisFileName = NULL;
param->bIntraInBFrames = 0;
param->bLossless = 0;

param->bEnableTemporalSubLayers = 0;
param->bEnableRdRefine = 0;
param->bMultiPassOptRPS = 0;
+ param->bSsimRd = 0;

/* Rate control options */
param->rc.vbvMaxBitrate = 0;

param->bEmitVUIHRDInfo = 1;
param->bOptQpPPS = 1;
param->bOptRefListLengthPPS = 1;
+ param->bOptCUDeltaQP = 0;
+ param->bAQMotion = 0;

}

OPT("opt-ref-list-length-pps") p->bOptRefListLengthPPS = atobool(value);
OPT("multi-pass-opt-rps") p->bMultiPassOptRPS = atobool(value);
OPT("scenecut-bias") p->scenecutBias = atof(value);
-
+ OPT("lookahead-threads") p->lookaheadThreads = atoi(value);
+ OPT("opt-cu-delta-qp") p->bOptCUDeltaQP = atobool(value);
+ OPT("multi-pass-opt-analysis") p->analysisMultiPassRefine = atobool(value);
+ OPT("multi-pass-opt-distortion") p->analysisMultiPassDistortion = atobool(value);
+ OPT("aq-motion") p->bAQMotion = atobool(value);
+ OPT("dynamic-rd") p->dynamicRd = atof(value);
+ OPT("ssim-rd")
+ {
+ int bval = atobool(value);
+ if (bError || bval)
+ {
+ bError = false;
+ p->psyRd = 0.0;
+ p->bSsimRd = atobool(value);
+ }
+ }
+ OPT("hdr") p->bEmitHDRSEI = atobool(value);
else
return X265_PARAM_BAD_NAME;
}

"RD Level is out of range");
CHECK(param->rdoqLevel < 0 || param->rdoqLevel > 2,
"RDOQ Level is out of range");
+ CHECK(param->dynamicRd < 0 || param->dynamicRd > x265_ADAPT_RD_STRENGTH,
+ "Dynamic RD strength must be between 0 and 4");
CHECK(param->bframes && param->bframes >= param->lookaheadDepth && !param->rc.bStatRead,
"Lookahead depth must be greater than the max consecutive bframe count");
CHECK(param->bframes < 0,

CHECK(param->searchMethod == X265_SEA && (param->sourceWidth > 840 || param->sourceHeight > 480),
"SEA motion search does not support resolutions greater than 480p in 32 bit build");
#endif
+
+ if (param->masteringDisplayColorVolume || param->maxFALL || param->maxCLL)
+ param->bEmitHDRSEI = 1;
+
return check_failed;
}

TOOLOPT(param->bEnableAMP, "amp");
TOOLOPT(param->limitModes, "limit-modes");
TOOLVAL(param->rdLevel, "rd=%d");
+ TOOLVAL(param->dynamicRd, "dynamic-rd=%.2f");
TOOLVAL(param->psyRd, "psy-rd=%.2lf");
TOOLVAL(param->rdoqLevel, "rdoq=%d");
TOOLVAL(param->psyRdoq, "psy-rdoq=%.2lf");

TOOLOPT(param->bEnableFastIntra, "fast-intra");
TOOLOPT(param->bEnableStrongIntraSmoothing, "strong-intra-smoothing");
TOOLVAL(param->lookaheadSlices, "lslices=%d");
+ TOOLVAL(param->lookaheadThreads, "lthreads=%d")
if (param->maxSlices > 1)
TOOLVAL(param->maxSlices, "slices=%d");
if (param->bEnableLoopFilter)

s += sprintf(s, " tu-intra-depth=%d", p->tuQTMaxIntraDepth);
s += sprintf(s, " limit-tu=%d", p->limitTU);
s += sprintf(s, " rdoq-level=%d", p->rdoqLevel);
+ s += sprintf(s, " dynamic-rd=%.2f", p->dynamicRd);
BOOL(p->bEnableSignHiding, "signhide");
BOOL(p->bEnableTransformSkip, "tskip");
s += sprintf(s, " nr-intra=%d", p->noiseReductionIntra);

BOOL(p->bOptRefListLengthPPS, "opt-ref-list-length-pps");
BOOL(p->bMultiPassOptRPS, "multi-pass-opt-rps");
s += sprintf(s, " scenecut-bias=%.2f", p->scenecutBias);
+ BOOL(p->bOptCUDeltaQP, "opt-cu-delta-qp");
+ BOOL(p->bAQMotion, "aq-motion");
+ BOOL(p->bEmitHDRSEI, "hdr");
#undef BOOL
return buf;
}
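
Three side effects in the param.cpp changes are easy to miss. Enabling ``ssim-rd`` also zeroes ``psyRd``. Any of ``masteringDisplayColorVolume``, ``maxFALL`` or ``maxCLL`` forces ``bEmitHDRSEI`` on. ``dynamicRd`` is rejected outside 0..4. The sketch below models only these rules. ``ParamSketch``, ``setSsimRd`` and ``finalizeParams`` are hypothetical names, the upper bound 4 is taken from the error message rather than from the ``x265_ADAPT_RD_STRENGTH`` definition, and the ``bError`` path of the real parser is left out.

```cpp
#include <cassert>

// Hypothetical subset of x265_param; field names follow the diff above and
// the defaults are the ones set in the default-parameter hunk.
struct ParamSketch
{
    double      psyRd = 2.0;
    int         bSsimRd = 0;
    int         bEmitHDRSEI = 0;
    double      dynamicRd = 0;
    const char* masteringDisplayColorVolume = nullptr;
    int         maxCLL = 0;
    int         maxFALL = 0;
};

// Models the OPT("ssim-rd") branch for a successfully parsed value:
// only enabling has an effect, and it also turns psy-rd off.
void setSsimRd(ParamSketch& p, bool enable)
{
    if (enable)
    {
        p.psyRd = 0.0;
        p.bSsimRd = 1;
    }
}

// Models the added range check and the implied HDR SEI flag.
// Returns false when the parameters would be rejected.
bool finalizeParams(ParamSketch& p)
{
    if (p.dynamicRd < 0 || p.dynamicRd > 4)
        return false;
    if (p.masteringDisplayColorVolume || p.maxFALL || p.maxCLL)
        p.bEmitHDRSEI = 1;
    return true;
}
```

One consequence of the parsing branch as written in the diff: passing ``--no-ssim-rd`` leaves ``psyRd`` untouched, so it does not restore a value that an earlier ``--ssim-rd`` set to zero.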
x265_2.2.tar.gz/source/common/quant.cpp -> x265_2.3.tar.gz/source/common/quant.cpp
Changed
}
}

+uint64_t Quant::ssimDistortion(const CUData& cu, const pixel* fenc, uint32_t fStride, const pixel* recon, intptr_t rstride, uint32_t log2TrSize, TextType ttype, uint32_t absPartIdx)
+{
+ static const int ssim_c1 = (int)(.01 * .01 * PIXEL_MAX * PIXEL_MAX * 64 + .5); // 416
+ static const int ssim_c2 = (int)(.03 * .03 * PIXEL_MAX * PIXEL_MAX * 64 * 63 + .5); // 235963
+ int shift = (X265_DEPTH - 8);
+
+ int trSize = 1 << log2TrSize;
+ uint64_t ssDc = 0, ssBlock = 0, ssAc = 0;
+
+ // Calculation of (X(0) - Y(0)) * (X(0) - Y(0)), DC
+ ssDc = 0;
+ for (int y = 0; y < trSize; y += 4)
+ {
+ for (int x = 0; x < trSize; x += 4)
+ {
+ int temp = fenc[y * fStride + x] - recon[y * rstride + x]; // copy of residual coeff
+ ssDc += temp * temp;
+ }
+ }
+
+ // Calculation of (X(k) - Y(k)) * (X(k) - Y(k)), AC
+ ssBlock = 0;
+ for (int y = 0; y < trSize; y++)
+ {
+ for (int x = 0; x < trSize; x++)
+ {
+ int temp = fenc[y * fStride + x] - recon[y * rstride + x]; // copy of residual coeff
+ ssBlock += temp * temp;
+ }
+ }
+
+ ssAc = ssBlock - ssDc;
+
+ // 1. Calculation of fdc'
+ // Calculate numerator of dc normalization factor
+ uint64_t fDc_num = 0;
+
+ // 2. Calculate dc component
+ uint64_t dc_k = 0;
+ for (int block_yy = 0; block_yy < trSize; block_yy += 4)
+ {
+ for (int block_xx = 0; block_xx < trSize; block_xx += 4)
+ {
+ uint32_t temp = fenc[block_yy * fStride + block_xx] >> shift;
+ dc_k += temp * temp;
+ }
+ }
+
+ fDc_num = (2 * dc_k) + (trSize * trSize * ssim_c1); // 16 pixels -> for each 4x4 block
+ fDc_num /= ((trSize >> 2) * (trSize >> 2));
+
+ // 1. Calculation of fac'
+ // Calculate numerator of ac normalization factor
+ uint64_t fAc_num = 0;
+
+ // 2. Calculate ac component
+ uint64_t ac_k = 0;
+ for (int block_yy = 0; block_yy < trSize; block_yy += 1)
+ {
+ for (int block_xx = 0; block_xx < trSize; block_xx += 1)
+ {
+ uint32_t temp = fenc[block_yy * fStride + block_xx] >> shift;
+ ac_k += temp * temp;
+ }
+ }
+ ac_k -= dc_k;
+
+ double s = 1 + 0.005 * cu.m_qp[absPartIdx];
+
+ fAc_num = ac_k + uint64_t(s * ac_k) + ssim_c2;
+ fAc_num /= ((trSize >> 2) * (trSize >> 2));
+
+ // Calculate dc and ac normalization factor
+ uint64_t ssim_distortion = ((ssDc * cu.m_fDc_den[ttype]) / fDc_num) + ((ssAc * cu.m_fAc_den[ttype]) / fAc_num);
+ return ssim_distortion;
+}
+
void Quant::invtransformNxN(const CUData& cu, int16_t* residual, uint32_t resiStride, const coeff_t* coeff,
uint32_t log2TrSize, TextType ttype, bool bIntra, bool useTransformSkip, uint32_t numSig)
{
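
The first half of ``ssimDistortion()`` splits the squared error of a block into a "DC" part, sampled at the top-left pixel of every 4x4 sub-block, and an "AC" part, which is the remainder. The following sketch isolates that step so it can be checked by hand. It is a simplified illustration and not the x265 routine: ``BlockEnergy`` and ``splitSquaredError`` are hypothetical names, the buffers are assumed to be contiguous with stride equal to the block size, and the normalization by ``m_fDc_den`` / ``m_fAc_den`` is omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct BlockEnergy
{
    uint64_t dc; // squared error at the top-left sample of each 4x4 sub-block
    uint64_t ac; // squared error of the whole block minus the dc part
};

// Hypothetical helper: src and recon hold size * size samples in row-major
// order (stride == size), and size is assumed to be a multiple of 4.
BlockEnergy splitSquaredError(const std::vector<int>& src,
                              const std::vector<int>& recon, int size)
{
    uint64_t ssDc = 0, ssBlock = 0;
    for (int y = 0; y < size; y++)
    {
        for (int x = 0; x < size; x++)
        {
            const int64_t diff = src[y * size + x] - recon[y * size + x];
            const uint64_t sq = static_cast<uint64_t>(diff * diff);
            ssBlock += sq;
            if ((y % 4) == 0 && (x % 4) == 0)
                ssDc += sq; // same samples the "y += 4, x += 4" loop visits
        }
    }
    return { ssDc, ssBlock - ssDc };
}
```

For a 4x4 block with a constant error of 2 per sample, the single sampled position contributes 4 to the DC part and the remaining 15 samples contribute 60 to the AC part.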
x265_2.2.tar.gz/source/common/quant.h -> x265_2.3.tar.gz/source/common/quant.h
Changed

void invtransformNxN(const CUData& cu, int16_t* residual, uint32_t resiStride, const coeff_t* coeff,
uint32_t log2TrSize, TextType ttype, bool bIntra, bool useTransformSkip, uint32_t numSig);
+ uint64_t ssimDistortion(const CUData& cu, const pixel* fenc, uint32_t fStride, const pixel* recon, intptr_t rstride,
+ uint32_t log2TrSize, TextType ttype, uint32_t absPartIdx);

/* Pattern decision for context derivation process of significant_coeff_flag */
static uint32_t calcPatternSigCtx(uint64_t sigCoeffGroupFlag64, uint32_t cgPosX, uint32_t cgPosY, uint32_t cgBlkPos, uint32_t trSizeCG)
x265_2.2.tar.gz/source/common/scalinglist.cpp -> x265_2.3.tar.gz/source/common/scalinglist.cpp
Changed
 }

 /** set quantized matrix coefficient for encode */
-void ScalingList::setupQuantMatrices()
+void ScalingList::setupQuantMatrices(int internalCsp)
 {
     for (int size = 0; size < NUM_SIZES; size++)
     {

                 if (m_bEnabled)
                 {
+                    if (internalCsp == X265_CSP_I444)
+                    {
+                        for (int i = 0; i < 64; i++)
+                        {
+                            m_scalingListCoef[BLOCK_32x32][1][i] = m_scalingListCoef[BLOCK_16x16][1][i];
+                            m_scalingListCoef[BLOCK_32x32][2][i] = m_scalingListCoef[BLOCK_16x16][2][i];
+                            m_scalingListCoef[BLOCK_32x32][4][i] = m_scalingListCoef[BLOCK_16x16][4][i];
+                            m_scalingListCoef[BLOCK_32x32][5][i] = m_scalingListCoef[BLOCK_16x16][5][i];
+                        }
+
+                        m_scalingListDC[BLOCK_32x32][1] = m_scalingListDC[BLOCK_16x16][1];
+                        m_scalingListDC[BLOCK_32x32][2] = m_scalingListDC[BLOCK_16x16][2];
+                        m_scalingListDC[BLOCK_32x32][4] = m_scalingListDC[BLOCK_16x16][4];
+                        m_scalingListDC[BLOCK_32x32][5] = m_scalingListDC[BLOCK_16x16][5];
+                    }
                     processScalingListEnc(coeff, quantCoeff, s_quantScales[rem] << 4, width, width, ratio, stride, dc);
                     processScalingListDec(coeff, dequantCoeff, s_invQuantScales[rem], width, width, ratio, stride, dc);
                 }
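The 4:4:4 branch added to `setupQuantMatrices()` reuses the 16x16 lists at indices 1, 2, 4 and 5 for 32x32 blocks. A minimal sketch of that copy on a toy structure follows. `ToyScalingList`, `copyList` and `inheritChroma32x32` are hypothetical names for illustration, not x265 API, and the reading of indices 1, 2, 4, 5 as the Cb/Cr lists assumes the usual HEVC list order (intra Y, Cb, Cr, then inter Y, Cb, Cr).

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Toy stand-ins for the x265 types; sizes and names are illustrative only.
enum BlockSize { BLOCK_4x4, BLOCK_8x8, BLOCK_16x16, BLOCK_32x32, NUM_SIZES };
constexpr int NUM_LISTS = 6;   // assumed order: intra Y, Cb, Cr, inter Y, Cb, Cr
constexpr int NUM_COEFS = 64;  // coefficients stored per list

struct ToyScalingList
{
    std::array<std::array<std::array<int32_t, NUM_COEFS>, NUM_LISTS>, NUM_SIZES> coef{};
    std::array<std::array<int32_t, NUM_LISTS>, NUM_SIZES> dc{};
};

// Copy one list (coefficients and DC value) from one block size to another.
void copyList(ToyScalingList& sl, BlockSize from, BlockSize to, int list)
{
    assert(list >= 0 && list < NUM_LISTS);
    sl.coef[to][list] = sl.coef[from][list];
    sl.dc[to][list] = sl.dc[from][list];
}

// For 4:4:4 input, let the 32x32 chroma lists inherit the 16x16 chroma lists.
// Luma lists (indices 0 and 3) are left untouched.
void inheritChroma32x32(ToyScalingList& sl, bool is444)
{
    if (!is444)
        return;
    for (int list : {1, 2, 4, 5})
        copyList(sl, BLOCK_16x16, BLOCK_32x32, list);
}
```

Calling `inheritChroma32x32(sl, true)` before the quantization tables are built mirrors the order in the hunk, where the copy happens ahead of `processScalingListEnc()`.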
x265_2.2.tar.gz/source/common/scalinglist.h -> x265_2.3.tar.gz/source/common/scalinglist.h
Changed
     bool init();
     void setDefaultScalingList();
     bool parseScalingList(const char* filename);
-    void setupQuantMatrices();
+    void setupQuantMatrices(int internalCsp);

     /* used during SPS coding */
     int checkPredMode(int sizeId, int listId) const;
x265_2.2.tar.gz/source/common/threadpool.cpp -> x265_2.3.tar.gz/source/common/threadpool.cpp
Changed
 #endif

-#if MACOS
+/* TODO FIX: Macro __MACH__ ideally should be part of MacOS definition, but adding to Cmake
+   behaving is not as expected, need to fix this. */
+
+#if MACOS && __MACH__
 #include <sys/param.h>
 #include <sys/sysctl.h>
 #endif

     return bondCount;
 }
-
-ThreadPool* ThreadPool::allocThreadPools(x265_param* p, int& numPools)
+ThreadPool* ThreadPool::allocThreadPools(x265_param* p, int& numPools, bool isThreadsReserved)
 {
     enum { MAX_NODE_NUM = 127 };
     int cpusPerNode[MAX_NODE_NUM + 1];

         x265_log(p, X265_LOG_DEBUG, "Reducing number of thread pools for frame thread count\n");
         numPools = X265_MAX(p->frameNumThreads / 2, 1);
     }
-
+    if (isThreadsReserved)
+        numPools = 1;
     ThreadPool *pools = new ThreadPool[numPools];
     if (pools)
     {
-        int maxProviders = (p->frameNumThreads + numPools - 1) / numPools + 1; /* +1 is Lookahead, always assigned to threadpool 0 */
+        int maxProviders = (p->frameNumThreads + numPools - 1) / numPools + !isThreadsReserved; /* +1 is Lookahead, always assigned to threadpool 0 */
         int node = 0;
         for (int i = 0; i < numPools; i++)
         {
             while (!threadsPerPool[node])
                 node++;
             int numThreads = X265_MIN(MAX_POOL_THREADS, threadsPerPool[node]);
+            int origNumThreads = numThreads;
+            if (p->lookaheadThreads > numThreads / 2)
+            {
+                p->lookaheadThreads = numThreads / 2;
+                x265_log(p, X265_LOG_DEBUG, "Setting lookahead threads to a maximum of half the total number of threads\n");
+            }
+            if (isThreadsReserved)
+            {
+                numThreads = p->lookaheadThreads;
+                maxProviders = 1;
+            }
+
+            else
+                numThreads -= p->lookaheadThreads;
             if (!pools[i].create(numThreads, maxProviders, nodeMaskPerPool[node]))
             {
                 X265_FREE(pools);

             }
             else
                 x265_log(p, X265_LOG_INFO, "Thread pool created using %d threads\n", numThreads);
-            threadsPerPool[node] -= numThreads;
+            threadsPerPool[node] -= origNumThreads;
         }
     }
     else

     return sysconf(_SC_NPROCESSORS_CONF);
 #elif __unix__
     return sysconf(_SC_NPROCESSORS_ONLN);
-#elif MACOS
+#elif MACOS && __MACH__
     int nm[2];
     size_t len = 4;
     uint32_t count;
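The `allocThreadPools()` changes amount to a per-node thread split: the lookahead request is capped at half of the node's threads, the reserved pool gets that many, and the main pool gets the remainder. A small sketch of that arithmetic, assuming nothing beyond what the hunk shows; `PoolSplit` and `splitThreads` are hypothetical names, and the real code applies the two halves in separate calls (`isThreadsReserved` set or not) rather than in one function.

```cpp
#include <algorithm>
#include <cassert>

struct PoolSplit
{
    int lookahead; // threads given to the dedicated lookahead pool
    int workers;   // threads left for the main pool on this node
};

// Cap the lookahead request at half of the node's threads, then split.
PoolSplit splitThreads(int nodeThreads, int requestedLookahead)
{
    assert(nodeThreads >= 0 && requestedLookahead >= 0);
    int lookahead = std::min(requestedLookahead, nodeThreads / 2);
    return PoolSplit{ lookahead, nodeThreads - lookahead };
}
```

For example, a request of 6 lookahead threads on an 8-thread node is reduced to 4, leaving 4 worker threads.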
x265_2.2.tar.gz/source/common/threadpool.h -> x265_2.3.tar.gz/source/common/threadpool.h
Changed
     void setThreadNodeAffinity(void *numaMask);
     int tryAcquireSleepingThread(sleepbitmap_t firstTryBitmap, sleepbitmap_t secondTryBitmap);
     int tryBondPeers(int maxPeers, sleepbitmap_t peerBitmap, BondedTaskGroup& master);
-
-    static ThreadPool* allocThreadPools(x265_param* p, int& numPools);
-
+    static ThreadPool* allocThreadPools(x265_param* p, int& numPools, bool isThreadsReserved);
     static int getCpuCount();
     static int getNumaNodeCount();
 };
x265_2.2.tar.gz/source/encoder/analysis.cpp -> x265_2.3.tar.gz/source/encoder/analysis.cpp
Changed
294
1
2
m_reuseRef = NULL;
3
m_bHD = false;
4
}
5
+
6
bool Analysis::create(ThreadLocalData *tld)
7
{
8
m_tld = tld;
9
10
ctu.setQPSubParts((int8_t)qp, 0, 0);
11
12
m_rqt[0].cur.load(initialContext);
13
+ ctu.m_meanQP = initialContext.m_meanQP;
14
m_modeDepth[0].fencYuv.copyFromPicYuv(*m_frame->m_fencPic, ctu.m_cuAddr, 0);
15
16
+ if (m_param->bSsimRd)
17
+ calculateNormFactor(ctu, qp);
18
+
19
uint32_t numPartition = ctu.m_numPartitions;
20
+ if (m_param->analysisMultiPassRefine && m_param->rc.bStatRead)
21
+ {
22
+ m_multipassAnalysis = (analysis2PassFrameData*)m_frame->m_analysis2Pass.analysisFramedata;
23
+ m_multipassDepth = &m_multipassAnalysis->depth[ctu.m_cuAddr * ctu.m_numPartitions];
24
+ if (m_slice->m_sliceType != I_SLICE)
25
+ {
26
+ int numPredDir = m_slice->isInterP() ? 1 : 2;
27
+ for (int dir = 0; dir < numPredDir; dir++)
28
+ {
29
+ m_multipassMv[dir] = &m_multipassAnalysis->m_mv[dir][ctu.m_cuAddr * ctu.m_numPartitions];
30
+ m_multipassMvpIdx[dir] = &m_multipassAnalysis->mvpIdx[dir][ctu.m_cuAddr * ctu.m_numPartitions];
31
+ m_multipassRef[dir] = &m_multipassAnalysis->ref[dir][ctu.m_cuAddr * ctu.m_numPartitions];
32
+ }
33
+ m_multipassModes = &m_multipassAnalysis->modes[ctu.m_cuAddr * ctu.m_numPartitions];
34
+ }
35
+ }
36
+
37
if (m_param->analysisMode && m_slice->m_sliceType != I_SLICE)
38
{
39
int numPredDir = m_slice->isInterP() ? 1 : 2;
40
41
compressInterCU_rd5_6(ctu, cuGeom, qp);
42
}
43
44
- if (m_param->bEnableRdRefine)
45
+ if (m_param->bEnableRdRefine || m_param->bOptCUDeltaQP)
46
qprdRefine(ctu, cuGeom, qp, qp);
47
48
return *m_modeDepth[0].bestMode;
49
50
int cuIdx = (cuGeom.childOffset - 1) / 3;
51
bestCUCost = origCUCost = cacheCost[cuIdx];
52
53
- for (int dir = 2; dir >= -2; dir -= 4)
54
+ int direction = m_param->bOptCUDeltaQP ? 1 : 2;
55
+
56
+ for (int dir = direction; dir >= -direction; dir -= (direction * 2))
57
{
58
+ if (m_param->bOptCUDeltaQP && ((dir != 1) || ((qp + 3) >= (int32_t)parentCTU.m_meanQP)))
59
+ break;
60
+
61
int threshold = 1;
62
int failure = 0;
63
cuPrevCost = origCUCost;
64
65
int modCUQP = qp + dir;
66
while (modCUQP >= m_param->rc.qpMin && modCUQP <= QP_MAX_SPEC)
67
{
68
+ if (m_param->bOptCUDeltaQP && modCUQP > (int32_t)parentCTU.m_meanQP)
69
+ break;
70
+
71
recodeCU(parentCTU, cuGeom, modCUQP, qp);
72
cuCost = md.bestMode->rdCost;
73
74
75
76
SplitData Analysis::compressInterCU_rd0_4(const CUData& parentCTU, const CUGeom& cuGeom, int32_t qp)
77
{
78
+ if (parentCTU.m_vbvAffected && calculateQpforCuSize(parentCTU, cuGeom, 1))
79
+ return compressInterCU_rd5_6(parentCTU, cuGeom, qp);
80
+
81
uint32_t depth = cuGeom.depth;
82
uint32_t cuAddr = parentCTU.m_cuAddr;
83
ModeDepth& md = m_modeDepth[depth];
84
85
}
86
}
87
}
88
+ if (m_param->analysisMultiPassRefine && m_param->rc.bStatRead && m_multipassAnalysis)
89
+ {
90
+ if (mightNotSplit && depth == m_multipassDepth[cuGeom.absPartIdx])
91
+ {
92
+ if (m_multipassModes[cuGeom.absPartIdx] == MODE_SKIP)
93
+ {
94
+ md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
95
+ md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
96
+ checkMerge2Nx2N_rd0_4(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom);
97
+
98
+ skipRecursion = !!m_param->bEnableRecursionSkip && md.bestMode;
99
+ if (m_param->rdLevel)
100
+ skipModes = m_param->bEnableEarlySkip && md.bestMode;
101
+ }
102
+ }
103
+ }
104
105
/* Step 1. Evaluate Merge/Skip candidates for likely early-outs, if skip mode was not set above */
106
if (mightNotSplit && depth >= minDepth && !md.bestMode) /* TODO: Re-evaluate if analysis load/save still works */
107
108
109
SplitData Analysis::compressInterCU_rd5_6(const CUData& parentCTU, const CUGeom& cuGeom, int32_t qp)
110
{
111
+ if (parentCTU.m_vbvAffected && !calculateQpforCuSize(parentCTU, cuGeom, 1))
112
+ return compressInterCU_rd0_4(parentCTU, cuGeom, qp);
113
+
114
uint32_t depth = cuGeom.depth;
115
ModeDepth& md = m_modeDepth[depth];
116
md.bestMode = NULL;
117
118
}
119
}
120
121
+ if (m_param->analysisMultiPassRefine && m_param->rc.bStatRead && m_multipassAnalysis)
122
+ {
123
+ if (mightNotSplit && depth == m_multipassDepth[cuGeom.absPartIdx])
124
+ {
125
+ if (m_multipassModes[cuGeom.absPartIdx] == MODE_SKIP)
126
+ {
127
+ md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
128
+ md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
129
+ checkMerge2Nx2N_rd0_4(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom);
130
+
131
+ skipModes = !!m_param->bEnableEarlySkip && md.bestMode;
132
+ refMasks[0] = allSplitRefs;
133
+ md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
134
+ checkInter_rd5_6(md.pred[PRED_2Nx2N], cuGeom, SIZE_2Nx2N, refMasks);
135
+ checkBestMode(md.pred[PRED_2Nx2N], cuGeom.depth);
136
+
137
+ if (m_param->bEnableRecursionSkip && depth && m_modeDepth[depth - 1].bestMode)
138
+ skipRecursion = md.bestMode && !md.bestMode->cu.getQtRootCbf(0);
139
+ }
140
+ }
141
+ }
142
+
143
/* Step 1. Evaluate Merge/Skip candidates for likely early-outs */
144
if (mightNotSplit && !md.bestMode)
145
{
146
147
bestME[i].ref = m_reuseRef[refOffset + index++];
148
}
149
}
150
+
151
+ if (m_param->analysisMultiPassRefine && m_param->rc.bStatRead && m_multipassAnalysis)
152
+ {
153
+ uint32_t numPU = interMode.cu.getNumPartInter(0);
154
+ for (uint32_t part = 0; part < numPU; part++)
155
+ {
156
+ MotionData* bestME = interMode.bestME[part];
157
+ for (int32_t i = 0; i < numPredDir; i++)
158
+ {
159
+ bestME[i].ref = m_multipassRef[i][cuGeom.absPartIdx];
160
+ bestME[i].mv = m_multipassMv[i][cuGeom.absPartIdx];
161
+ bestME[i].mvpIdx = m_multipassMvpIdx[i][cuGeom.absPartIdx];
162
+ }
163
+ }
164
+ }
165
predInterSearch(interMode, cuGeom, m_bChromaSa8d && (m_csp != X265_CSP_I400 && m_frame->m_fencPic->m_picCsp != X265_CSP_I400), refMask);
166
167
/* predInterSearch sets interMode.sa8dBits */
168
169
bestME[i].ref = m_reuseRef[refOffset + index++];
170
}
171
}
172
+
173
+ if (m_param->analysisMultiPassRefine && m_param->rc.bStatRead && m_multipassAnalysis)
174
+ {
175
+ uint32_t numPU = interMode.cu.getNumPartInter(0);
176
+ for (uint32_t part = 0; part < numPU; part++)
177
+ {
178
+ MotionData* bestME = interMode.bestME[part];
179
+ for (int32_t i = 0; i < numPredDir; i++)
180
+ {
181
+ bestME[i].ref = m_multipassRef[i][cuGeom.absPartIdx];
182
+ bestME[i].mv = m_multipassMv[i][cuGeom.absPartIdx];
183
+ bestME[i].mvpIdx = m_multipassMvpIdx[i][cuGeom.absPartIdx];
184
+ }
185
+ }
186
+ }
187
+
188
predInterSearch(interMode, cuGeom, m_csp != X265_CSP_I400 && m_frame->m_fencPic->m_picCsp != X265_CSP_I400, refMask);
189
190
/* predInterSearch sets interMode.sa8dBits, but this is ignored */
191
192
return false;
193
}
194
-int Analysis::calculateQpforCuSize(const CUData& ctu, const CUGeom& cuGeom, double baseQp)
+int Analysis::calculateQpforCuSize(const CUData& ctu, const CUGeom& cuGeom, int32_t complexCheck, double baseQp)
 {
     FrameData& curEncData = *m_frame->m_encData;
     double qp = baseQp >= 0 ? baseQp : curEncData.m_cuStat[ctu.m_cuAddr].baseQp;

         loopIncr = 16;
     /* Use cuTree offsets if cuTree enabled and frame is referenced, else use AQ offsets */
     bool isReferenced = IS_REFERENCED(m_frame);
-    double *qpoffs = (isReferenced && m_param->rc.cuTree) ? m_frame->m_lowres.qpCuTreeOffset : m_frame->m_lowres.qpAqOffset;
+    double *qpoffs;
+    if (complexCheck)
+        qpoffs = m_frame->m_lowres.qpAqOffset;
+    else
+        qpoffs = (isReferenced && m_param->rc.cuTree) ? m_frame->m_lowres.qpCuTreeOffset : m_frame->m_lowres.qpAqOffset;
     if (qpoffs)
     {
         uint32_t width = m_frame->m_fencPic->m_picWidth;

         qp_offset /= cnt;
         qp += qp_offset;
+        if (complexCheck)
+        {
+            int32_t offset = (int32_t)(qp_offset * 100 + .5);
+            double threshold = (1 - ((x265_ADAPT_RD_STRENGTH - m_param->dynamicRd) * 0.5));
+            int32_t max_threshold = (int32_t)(threshold * 100 + .5);
+            if (offset < max_threshold)
+                return 1;
+            else
+                return 0;
+        }
     }

     return x265_clip3(m_param->rc.qpMin, m_param->rc.qpMax, (int)(qp + 0.5));
 }
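With `complexCheck` set, `calculateQpforCuSize()` stops being a QP calculation and becomes a yes/no test on the averaged AQ offset, which the rd0_4 and rd5_6 entry points use to switch analysis level for VBV-affected CTUs. The sketch below isolates that test. `needsHigherRd` is a hypothetical name, and the default of 4 for the strength constant is an assumption about `x265_ADAPT_RD_STRENGTH`, whose value does not appear in this diff.

```cpp
#include <cassert>

// Returns true when the averaged AQ offset falls below the dynamic-rd
// threshold, i.e. the case where the hunk returns 1.
// adaptRdStrength = 4 is an assumed value for x265_ADAPT_RD_STRENGTH.
bool needsHigherRd(double avgQpOffset, double dynamicRd, double adaptRdStrength = 4.0)
{
    assert(dynamicRd >= 0 && dynamicRd <= adaptRdStrength);
    // Both sides are scaled by 100 and truncated to int, as in the source.
    int offset = (int)(avgQpOffset * 100 + .5);
    double threshold = 1 - ((adaptRdStrength - dynamicRd) * 0.5);
    int maxThreshold = (int)(threshold * 100 + .5);
    return offset < maxThreshold;
}
```

Under that assumption, a higher dynamic-rd setting raises the threshold, so more blocks qualify for the heavier analysis path.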
+
+void Analysis::normFactor(const pixel* src, uint32_t blockSize, CUData& ctu, int qp, TextType ttype)
+{
+    static const int ssim_c1 = (int)(.01 * .01 * PIXEL_MAX * PIXEL_MAX * 64 + .5); // 416
+    static const int ssim_c2 = (int)(.03 * .03 * PIXEL_MAX * PIXEL_MAX * 64 * 63 + .5); // 235963
+    int shift = (X265_DEPTH - 8);
+
+    double s = 1 + 0.005 * qp;
+
+    // Calculate denominator of normalization factor
+    uint64_t fDc_den = 0, fAc_den = 0;
+
+    // 1. Calculate dc component
+    uint64_t z_o = 0;
+    for (uint32_t block_yy = 0; block_yy < blockSize; block_yy += 4)
+    {
+        for (uint32_t block_xx = 0; block_xx < blockSize; block_xx += 4)
+        {
+            uint32_t temp = src[block_yy * blockSize + block_xx] >> shift;
+            z_o += temp * temp; // 2 * (Z(0)) pow(2)
+        }
+    }
+    fDc_den = (2 * z_o) + (blockSize * blockSize * ssim_c1); // 2 * (Z(0)) pow(2) + N * C1
+    fDc_den /= ((blockSize >> 2) * (blockSize >> 2));
+
+    // 2. Calculate ac component
+    uint64_t z_k = 0;
+    for (uint32_t block_yy = 0; block_yy < blockSize; block_yy += 1)
+    {
+        for (uint32_t block_xx = 0; block_xx < blockSize; block_xx += 1)
+        {
+            uint32_t temp = src[block_yy * blockSize + block_xx] >> shift;
+            z_k += temp * temp;
+        }
+    }
+
+    // Remove the DC part
+    z_k -= z_o;
+
+    fAc_den = z_k + int(s * z_k) + ssim_c2;
+    fAc_den /= ((blockSize >> 2) * (blockSize >> 2));
+
+    ctu.m_fAc_den[ttype] = fAc_den;
+    ctu.m_fDc_den[ttype] = fDc_den;
+}
+
+void Analysis::calculateNormFactor(CUData& ctu, int qp)
+{
+    const pixel* srcY = m_modeDepth[0].fencYuv.m_buf[0];
+    uint32_t blockSize = m_modeDepth[0].fencYuv.m_size;
+
+    normFactor(srcY, blockSize, ctu, qp, TEXT_LUMA);
+
+    if (m_csp != X265_CSP_I400 && m_frame->m_fencPic->m_picCsp != X265_CSP_I400)
+    {
+        const pixel* srcU = m_modeDepth[0].fencYuv.m_buf[1];
+        const pixel* srcV = m_modeDepth[0].fencYuv.m_buf[2];
+        uint32_t blockSizeC = m_modeDepth[0].fencYuv.m_csize;
+
+        normFactor(srcU, blockSizeC, ctu, qp, TEXT_CHROMA_U);
+        normFactor(srcV, blockSizeC, ctu, qp, TEXT_CHROMA_V);
+    }
+}
x265_2.2.tar.gz/source/encoder/analysis.h -> x265_2.3.tar.gz/source/encoder/analysis.h
Changed
     uint32_t m_splitRefIdx[4];
     uint64_t* cacheCost;

+
+    analysis2PassFrameData* m_multipassAnalysis;
+    uint8_t* m_multipassDepth;
+    MV* m_multipassMv[2];
+    int* m_multipassMvpIdx[2];
+    int32_t* m_multipassRef[2];
+    uint8_t* m_multipassModes;
     /* refine RD based on QP for rd-levels 5 and 6 */
     void qprdRefine(const CUData& parentCTU, const CUGeom& cuGeom, int32_t qp, int32_t lqp);

     /* generate residual and recon pixels for an entire CTU recursively (RD0) */
     void encodeResidue(const CUData& parentCTU, const CUGeom& cuGeom);

-    int calculateQpforCuSize(const CUData& ctu, const CUGeom& cuGeom, double baseQP = -1);
+    int calculateQpforCuSize(const CUData& ctu, const CUGeom& cuGeom, int32_t complexCheck = 0, double baseQP = -1);

+    void calculateNormFactor(CUData& ctu, int qp);
+    void normFactor(const pixel* src, uint32_t blockSize, CUData& ctu, int qp, TextType ttype);
     /* check whether current mode is the new best */
     inline void checkBestMode(Mode& mode, uint32_t depth)
     {
x265_2.2.tar.gz/source/encoder/api.cpp -> x265_2.3.tar.gz/source/encoder/api.cpp
Changed
     }
     else
     {
+        if (encoder->m_latestParam->scalingLists && encoder->m_latestParam->scalingLists != encoder->m_param->scalingLists)
+        {
+            if (encoder->m_param->bRepeatHeaders)
+            {
+                if (encoder->m_scalingList.parseScalingList(encoder->m_latestParam->scalingLists))
+                    return -1;
+                encoder->m_scalingList.setupQuantMatrices(encoder->m_param->internalCsp);
+            }
+            else
+            {
+                x265_log(encoder->m_param, X265_LOG_ERROR, "Repeat headers is turned OFF, cannot reconfigure scalinglists\n");
+                return -1;
+            }
+        }
         encoder->m_reconfigure = true;
         encoder->printReconfigureParams();
     }

     {
         pic_in->analysisData.intraData = NULL;
         pic_in->analysisData.interData = NULL;
+        pic_in->analysis2Pass.analysisFramedata = NULL;
     }

     if (pp_nal && numEncoded > 0)
x265_2.2.tar.gz/source/encoder/encoder.cpp -> x265_2.3.tar.gz/source/encoder/encoder.cpp
Changed
739
1
2
m_latestParam = NULL;
3
m_threadPool = NULL;
4
m_analysisFile = NULL;
5
+ m_analysisFileIn = NULL;
6
+ m_analysisFileOut = NULL;
7
m_offsetEmergency = NULL;
8
m_iFrameNum = 0;
9
m_iPPSQpMinus26 = 0;
10
- m_iLastSliceQp = 0;
11
m_rpsInSpsCount = 0;
12
for (int i = 0; i < X265_MAX_FRAME_THREADS; i++)
13
m_frameEncoder[i] = NULL;
14
15
MotionEstimate::initScales();
16
}
17
18
+inline char *strcatFilename(const char *input, const char *suffix)
19
+{
20
+ char *output = X265_MALLOC(char, strlen(input) + strlen(suffix) + 1);
21
+ if (!output)
22
+ {
23
+ x265_log(NULL, X265_LOG_ERROR, "unable to allocate memory for filename\n");
24
+ return NULL;
25
+ }
26
+ strcpy(output, input);
27
+ strcat(output, suffix);
28
+ return output;
29
+}
30
+
31
void Encoder::create()
32
{
33
if (!primitives.pu[0].sad)
34
35
else
36
p->frameNumThreads = 1;
37
}
38
-
39
m_numPools = 0;
40
if (allowPools)
41
- m_threadPool = ThreadPool::allocThreadPools(p, m_numPools);
42
-
43
+ m_threadPool = ThreadPool::allocThreadPools(p, m_numPools, 0);
44
if (!m_numPools)
45
{
46
// issue warnings if any of these features were requested
47
48
m_scalingList.setDefaultScalingList();
49
else if (m_scalingList.parseScalingList(m_param->scalingLists))
50
m_aborted = true;
51
-
52
- m_lookahead = new Lookahead(m_param, m_threadPool);
53
- if (m_numPools)
54
+ int pools = m_numPools;
55
+ ThreadPool* lookAheadThreadPool = 0;
56
+ if (m_param->lookaheadThreads > 0)
57
{
58
- m_lookahead->m_jpId = m_threadPool[0].m_numProviders++;
59
- m_threadPool[0].m_jpTable[m_lookahead->m_jpId] = m_lookahead;
60
+ lookAheadThreadPool = ThreadPool::allocThreadPools(p, pools, 1);
61
}
62
-
63
+ else
64
+ lookAheadThreadPool = m_threadPool;
65
+ m_lookahead = new Lookahead(m_param, lookAheadThreadPool);
66
+ if (pools)
67
+ {
68
+ m_lookahead->m_jpId = lookAheadThreadPool[0].m_numProviders++;
69
+ lookAheadThreadPool[0].m_jpTable[m_lookahead->m_jpId] = m_lookahead;
70
+ }
71
+ if (m_param->lookaheadThreads > 0)
72
+ for (int i = 0; i < pools; i++)
73
+ lookAheadThreadPool[i].start();
74
+ m_lookahead->m_numPools = pools;
75
m_dpb = new DPB(m_param);
76
m_rateControl = new RateControl(*m_param);
77
-
78
initVPS(&m_vps);
79
initSPS(&m_sps);
80
initPPS(&m_pps);
81
82
if (!scalingEnabled)
83
{
84
m_scalingList.setDefaultScalingList();
85
- m_scalingList.setupQuantMatrices();
86
+ m_scalingList.setupQuantMatrices(m_sps.chromaFormatIdc);
87
}
88
else
89
- m_scalingList.setupQuantMatrices();
90
+ m_scalingList.setupQuantMatrices(m_sps.chromaFormatIdc);
91
92
for (int q = 0; q < QP_MAX_MAX - QP_MAX_SPEC; q++)
93
{
94
95
{
96
m_scalingList.m_bEnabled = false;
97
m_scalingList.m_bDataPresent = false;
98
- m_scalingList.setupQuantMatrices();
99
+ m_scalingList.setupQuantMatrices(m_sps.chromaFormatIdc);
100
}
101
}
102
else
103
- m_scalingList.setupQuantMatrices();
104
+ m_scalingList.setupQuantMatrices(m_sps.chromaFormatIdc);
105
106
int numRows = (m_param->sourceHeight + g_maxCUSize - 1) / g_maxCUSize;
107
int numCols = (m_param->sourceWidth + g_maxCUSize - 1) / g_maxCUSize;
108
109
}
110
}
111
112
+ if (m_param->analysisMultiPassRefine || m_param->analysisMultiPassDistortion)
113
+ {
114
+ const char* name = m_param->analysisFileName;
115
+ if (!name)
116
+ name = defaultAnalysisFileName;
117
+ if (m_param->rc.bStatWrite)
118
+ {
119
+ char* temp = strcatFilename(name, ".temp");
120
+ if (!temp)
121
+ m_aborted = true;
122
+ else
123
+ {
124
+ m_analysisFileOut = fopen(temp, "wb");
125
+ X265_FREE(temp);
126
+ }
127
+ if (!m_analysisFileOut)
128
+ {
129
+ x265_log(NULL, X265_LOG_ERROR, "Analysis 2 pass: failed to open file %s\n", temp);
130
+ m_aborted = true;
131
+ }
132
+ }
133
+ if (m_param->rc.bStatRead)
134
+ {
135
+ m_analysisFileIn = fopen(name, "rb");
136
+ if (!m_analysisFileIn)
137
+ {
138
+ x265_log(NULL, X265_LOG_ERROR, "Analysis 2 pass: failed to open file %s\n", name);
139
+ m_aborted = true;
140
+ }
141
+ }
142
+ }
143
+
144
m_bZeroLatency = !m_param->bframes && !m_param->lookaheadDepth && m_param->frameNumThreads == 1;
145
146
m_aborted |= parseLambdaFile(m_param);
147
     if (m_analysisFile)
         fclose(m_analysisFile);

+    if (m_latestParam != NULL && m_latestParam != m_param)
+    {
+        if (m_latestParam->scalingLists != m_param->scalingLists)
+            free((char*)m_latestParam->scalingLists);
+
+        PARAM_NS::x265_param_free(m_latestParam);
+    }
+    if (m_analysisFileIn)
+        fclose(m_analysisFileIn);
+
+    if (m_analysisFileOut)
+    {
+        int bError = 1;
+        fclose(m_analysisFileOut);
+        const char* name = m_param->analysisFileName;
+        if (!name)
+            name = defaultAnalysisFileName;
+        char* temp = strcatFilename(name, ".temp");
+        if (temp)
+        {
+            x265_unlink(name);
+            bError = x265_rename(temp, name);
+        }
+        if (bError)
+        {
+            x265_log(m_param, X265_LOG_ERROR, "failed to rename analysis stats file to \"%s\"\n", name);
+        }
+        X265_FREE(temp);
+    }
     if (m_param)
     {
         /* release string arguments that were strdup'd */

         PARAM_NS::x265_param_free(m_param);
     }
-
-    PARAM_NS::x265_param_free(m_latestParam);
 }
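The multi-pass analysis output is written to `<name>.temp` and only moved over the real file name when the encoder is torn down, so a later pass does not pick up a half-written file. A minimal sketch of that commit step using the C standard library instead of the `x265_unlink`/`x265_rename` wrappers. `commitTempFile` is a hypothetical helper, not part of x265.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Replace `name` with `name + ".temp"`. Returns true on success.
bool commitTempFile(const std::string& name)
{
    assert(!name.empty());
    const std::string temp = name + ".temp";
    // Removing the old target may fail if it does not exist yet; as in the
    // hunk, only the result of the rename decides success.
    std::remove(name.c_str());
    return std::rename(temp.c_str(), name.c_str()) == 0;
}
```

The unlink-then-rename order follows the hunk. On platforms where `rename` replaces an existing target atomically the explicit remove is not needed, but it is harmless.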
191
void Encoder::updateVbvPlan(RateControl* rc)
192
193
if (m_dpb->m_freeList.empty())
194
{
195
inFrame = new Frame;
196
+ inFrame->m_encodeStartTime = x265_mdate();
197
x265_param* p = m_reconfigure ? m_latestParam : m_param;
198
if (inFrame->create(p, pic_in->quantOffsets))
199
{
200
201
else
202
{
203
inFrame = m_dpb->m_freeList.popBack();
204
+ inFrame->m_encodeStartTime = x265_mdate();
205
/* Set lowres scenecut and satdCost here to avoid overwriting ANALYSIS_READ
206
decision by lowres init*/
207
inFrame->m_lowres.bScenecut = false;
208
209
freeAnalysis(&pic_out->analysisData);
210
}
211
}
212
+ if (m_param->rc.bStatWrite && (m_param->analysisMultiPassRefine || m_param->analysisMultiPassDistortion))
213
+ {
214
+ if (pic_out)
215
+ {
216
+ pic_out->analysis2Pass.poc = pic_out->poc;
217
+ pic_out->analysis2Pass.analysisFramedata = outFrame->m_analysis2Pass.analysisFramedata;
218
+ }
219
+ writeAnalysis2PassFile(&outFrame->m_analysis2Pass, *outFrame->m_encData, outFrame->m_lowres.sliceType);
220
+ }
221
+ if (m_param->analysisMultiPassRefine || m_param->analysisMultiPassDistortion)
222
+ freeAnalysis2Pass(&outFrame->m_analysis2Pass, outFrame->m_lowres.sliceType);
223
if (m_param->internalCsp == X265_CSP_I400)
224
{
225
if (slice->m_sliceType == P_SLICE)
226
227
frameEnc = m_lookahead->getDecidedPicture();
228
if (frameEnc && !pass)
229
{
230
+ if (m_param->analysisMultiPassRefine || m_param->analysisMultiPassDistortion)
231
+ {
232
+ allocAnalysis2Pass(&frameEnc->m_analysis2Pass, frameEnc->m_lowres.sliceType);
233
+ frameEnc->m_analysis2Pass.poc = frameEnc->m_poc;
234
+ if (m_param->rc.bStatRead)
235
+ readAnalysis2PassFile(&frameEnc->m_analysis2Pass, frameEnc->m_poc, frameEnc->m_lowres.sliceType);
236
+ }
237
if (curEncoder->m_reconfigure)
238
{
239
/* One round robin cycle of FE reconfigure is complete */
240
241
iLeastCost = m_iBitsCostSum[i];
242
}
243
}
244
-
245
/* If last slice Qp is close to (26 + m_iPPSQpMinus26) or outputs is all I-frame video,
246
we don't need to change m_iPPSQpMinus26. */
247
- if ((abs(m_iLastSliceQp - (26 + m_iPPSQpMinus26)) > 1) && (m_iFrameNum > 1))
248
+ if (m_iFrameNum > 1)
249
m_iPPSQpMinus26 = (iLeastId + 1) - 26;
250
m_iFrameNum = 0;
251
}
252
253
analysis->numPartitions = NUM_4x4_PARTITIONS;
254
allocAnalysis(analysis);
255
}
256
-
257
/* determine references, setup RPS, etc */
258
m_dpb->prepareEncode(frameEnc);
259
260
261
encParam->bEnableRectInter = param->bEnableRectInter;
262
encParam->maxNumMergeCand = param->maxNumMergeCand;
263
encParam->bIntraInBFrames = param->bIntraInBFrames;
264
+ if (param->scalingLists && !encParam->scalingLists)
265
+ encParam->scalingLists = strdup(param->scalingLists);
266
/* To add: Loop Filter/deblocking controls, transform skip, signhide require PPS to be resent */
267
/* To add: SAO, temporal MVP, AMP, TU depths require SPS to be resent, at every CVS boundary */
268
return x265_check_params(encParam);
269
270
frameStats->refWaitWallTime = ELAPSED_MSEC(curEncoder->m_row0WaitTime, curEncoder->m_allRowsAvailableTime);
271
frameStats->totalCTUTime = ELAPSED_MSEC(0, curEncoder->m_totalWorkerElapsedTime);
272
frameStats->stallTime = ELAPSED_MSEC(0, curEncoder->m_totalNoWorkerTime);
273
+ frameStats->totalFrameTime = ELAPSED_MSEC(curFrame->m_encodeStartTime, x265_mdate());
274
if (curEncoder->m_totalActiveWorkerCount)
275
frameStats->avgWPP = (double)curEncoder->m_totalActiveWorkerCount / curEncoder->m_activeWorkerCountSamples;
276
else
277
278
bs.writeByteAlignment();
279
list.serialize(NAL_UNIT_PPS, bs);
280
281
- if (m_param->masteringDisplayColorVolume)
282
- {
283
- SEIMasteringDisplayColorVolume mdsei;
284
- if (mdsei.parse(m_param->masteringDisplayColorVolume))
285
- {
286
- bs.resetBits();
287
- mdsei.write(bs, m_sps);
288
- bs.writeByteAlignment();
289
- list.serialize(NAL_UNIT_PREFIX_SEI, bs);
290
- }
291
- else
292
- x265_log(m_param, X265_LOG_WARNING, "unable to parse mastering display color volume info\n");
293
- }
294
-
295
- if (m_emitCLLSEI)
296
+ if (m_param->bEmitHDRSEI)
297
{
298
SEIContentLightLevel cllsei;
299
cllsei.max_content_light_level = m_param->maxCLL;
300
301
cllsei.write(bs, m_sps);
302
bs.writeByteAlignment();
303
list.serialize(NAL_UNIT_PREFIX_SEI, bs);
304
+
305
+ if (m_param->masteringDisplayColorVolume)
306
+ {
307
+ SEIMasteringDisplayColorVolume mdsei;
308
+ if (mdsei.parse(m_param->masteringDisplayColorVolume))
309
+ {
310
+ bs.resetBits();
311
+ mdsei.write(bs, m_sps);
312
+ bs.writeByteAlignment();
313
+ list.serialize(NAL_UNIT_PREFIX_SEI, bs);
314
+ }
315
+ else
316
+ x265_log(m_param, X265_LOG_WARNING, "unable to parse mastering display color volume info\n");
317
+ }
318
}
319
320
if (m_param->bEmitInfoSEI)
321
322
{
323
bool bIsVbv = m_param->rc.vbvBufferSize > 0 && m_param->rc.vbvMaxBitrate > 0;
324
325
- if (!m_param->bLossless && (m_param->rc.aqMode || bIsVbv))
326
+ if (!m_param->bLossless && (m_param->rc.aqMode || bIsVbv || m_param->bAQMotion))
327
{
328
pps->bUseDQP = true;
329
pps->maxCuDQPDepth = g_log2Size[m_param->maxCUSize] - g_log2Size[m_param->rc.qgSize];
330
331
}
332
333
334
- if (p->scalingLists && p->internalCsp == X265_CSP_I444)
335
- {
336
- x265_log(p, X265_LOG_WARNING, "Scaling lists are not yet supported for 4:4:4 chroma subsampling\n");
337
- p->scalingLists = 0;
338
- }
339
-
340
if (p->interlaceMode)
341
x265_log(p, X265_LOG_WARNING, "Support for interlaced video is experimental\n");
342
343
344
p->rc.cuTree = 0;
345
}
346
347
+ if (p->analysisMode && (p->analysisMultiPassRefine || p->analysisMultiPassDistortion))
348
+ {
349
+ x265_log(p, X265_LOG_WARNING, "Cannot use Analysis load/save option and multi-pass-opt-analysis/multi-pass-opt-distortion together,"
350
+ "Disabling Analysis load/save and multi-pass-opt-analysis/multi-pass-opt-distortion\n");
351
+ p->analysisMode = p->analysisMultiPassRefine = p->analysisMultiPassDistortion = 0;
352
+ }
353
+
354
+ if ((p->analysisMultiPassRefine || p->analysisMultiPassDistortion) && (p->bDistributeModeAnalysis || p->bDistributeMotionEstimation))
355
+ {
356
+ x265_log(p, X265_LOG_WARNING, "multi-pass-opt-analysis/multi-pass-opt-distortion incompatible with pmode/pme, Disabling pmode/pme\n");
357
+ p->bDistributeMotionEstimation = p->bDistributeModeAnalysis = 0;
358
+ }
359
+
360
if (p->rc.bEnableGrain)
361
{
362
x265_log(p, X265_LOG_WARNING, "Rc Grain removes qp fluctuations caused by aq/cutree, Disabling aq,cu-tree\n");
363
364
p->bDistributeModeAnalysis = 0;
365
}
366
367
+ if (!p->rc.bStatWrite && !p->rc.bStatRead && (p->analysisMultiPassRefine || p->analysisMultiPassDistortion))
368
+ {
369
+ x265_log(p, X265_LOG_WARNING, "analysis-multi-pass/distortion is enabled only when rc multi pass is enabled. Disabling multi-pass-opt-analysis and multi-pass-opt-distortion");
370
+ p->analysisMultiPassRefine = 0;
371
+ p->analysisMultiPassDistortion = 0;
372
+ }
373
+ if (p->analysisMultiPassRefine && p->rc.bStatWrite && p->rc.bStatRead)
374
+ {
375
+ x265_log(p, X265_LOG_WARNING, "--multi-pass-opt-analysis doesn't support refining analysis through multiple-passes; it only reuses analysis from the second-to-last pass to the last pass.Disabling reading\n");
376
+ p->rc.bStatRead = 0;
377
+ }
378
+
379
/* some options make no sense if others are disabled */
380
p->bSaoNonDeblocked &= p->bEnableSAO;
381
p->bEnableTSkipFast &= p->bEnableTransformSkip;
382
383
x265_log(p, X265_LOG_WARNING, "--rd-refine disabled, requires RD level > 4 and adaptive quant\n");
384
}
385
386
+ if (p->bOptCUDeltaQP && p->rdLevel < 5)
387
+ {
388
+ p->bOptCUDeltaQP = false;
389
+ x265_log(p, X265_LOG_WARNING, "--opt-cu-delta-qp disabled, requires RD level > 4\n");
390
+ }
391
+
392
if (p->limitTU && p->tuQTMaxInterDepth < 2)
393
{
394
p->limitTU = 0;
395
x265_log(p, X265_LOG_WARNING, "limit-tu disabled, requires tu-inter-depth > 1\n");
396
}
397
bool bIsVbv = m_param->rc.vbvBufferSize > 0 && m_param->rc.vbvMaxBitrate > 0;
398
- if (!m_param->bLossless && (m_param->rc.aqMode || bIsVbv))
399
+ if (!m_param->bLossless && (m_param->rc.aqMode || bIsVbv || m_param->bAQMotion))
400
{
401
if (p->rc.qgSize < X265_MAX(8, p->minCUSize))
402
{
403
404
else
405
m_param->rc.qgSize = p->maxCUSize;
406
407
+ if (m_param->dynamicRd && (!bIsVbv || !p->rc.aqMode || p->rdLevel > 4))
408
+ {
409
+ p->dynamicRd = 0;
410
+ x265_log(p, X265_LOG_WARNING, "Dynamic-rd disabled, requires RD <= 4, VBV and aq-mode enabled\n");
411
+ }
412
+
413
if (p->uhdBluray)
414
{
415
p->bEnableAccessUnitDelimiters = 1;
416
417
}
418
}
419
420
+void Encoder::allocAnalysis2Pass(x265_analysis_2Pass* analysis, int sliceType)
421
+{
422
+ analysis->analysisFramedata = NULL;
423
+ analysis2PassFrameData *analysisFrameData = (analysis2PassFrameData*)analysis->analysisFramedata;
424
+ uint32_t widthInCU = (m_param->sourceWidth + g_maxCUSize - 1) >> g_maxLog2CUSize;
425
+ uint32_t heightInCU = (m_param->sourceHeight + g_maxCUSize - 1) >> g_maxLog2CUSize;
426
+
427
+ uint32_t numCUsInFrame = widthInCU * heightInCU;
428
+ CHECKED_MALLOC_ZERO(analysisFrameData, analysis2PassFrameData, 1);
429
+ CHECKED_MALLOC_ZERO(analysisFrameData->depth, uint8_t, NUM_4x4_PARTITIONS * numCUsInFrame);
430
+ CHECKED_MALLOC_ZERO(analysisFrameData->distortion, sse_t, NUM_4x4_PARTITIONS * numCUsInFrame);
431
+ if (m_param->rc.bStatRead)
432
+ {
433
+ CHECKED_MALLOC_ZERO(analysisFrameData->ctuDistortion, sse_t, numCUsInFrame);
434
+ CHECKED_MALLOC_ZERO(analysisFrameData->scaledDistortion, double, numCUsInFrame);
435
+ CHECKED_MALLOC_ZERO(analysisFrameData->offset, double, numCUsInFrame);
436
+ CHECKED_MALLOC_ZERO(analysisFrameData->threshold, double, numCUsInFrame);
437
+ }
438
+ if (!IS_X265_TYPE_I(sliceType))
439
+ {
440
+ CHECKED_MALLOC_ZERO(analysisFrameData->m_mv[0], MV, NUM_4x4_PARTITIONS * numCUsInFrame);
441
+ CHECKED_MALLOC_ZERO(analysisFrameData->m_mv[1], MV, NUM_4x4_PARTITIONS * numCUsInFrame);
442
+ CHECKED_MALLOC_ZERO(analysisFrameData->mvpIdx[0], int, NUM_4x4_PARTITIONS * numCUsInFrame);
443
+ CHECKED_MALLOC_ZERO(analysisFrameData->mvpIdx[1], int, NUM_4x4_PARTITIONS * numCUsInFrame);
444
+ CHECKED_MALLOC_ZERO(analysisFrameData->ref[0], int32_t, NUM_4x4_PARTITIONS * numCUsInFrame);
445
+ CHECKED_MALLOC_ZERO(analysisFrameData->ref[1], int32_t, NUM_4x4_PARTITIONS * numCUsInFrame);
446
+ CHECKED_MALLOC(analysisFrameData->modes, uint8_t, NUM_4x4_PARTITIONS * numCUsInFrame);
447
+ }
448
+
449
+ analysis->analysisFramedata = analysisFrameData;
450
+
451
+ return;
452
+
453
+fail:
454
+ freeAnalysis2Pass(analysis, sliceType);
455
+ m_aborted = true;
456
+}
457
+
458
+void Encoder::freeAnalysis2Pass(x265_analysis_2Pass* analysis, int sliceType)
459
+{
460
+ if (analysis->analysisFramedata)
461
+ {
462
+ X265_FREE(((analysis2PassFrameData*)analysis->analysisFramedata)->depth);
463
+ X265_FREE(((analysis2PassFrameData*)analysis->analysisFramedata)->distortion);
464
+ if (m_param->rc.bStatRead)
465
+ {
466
+ X265_FREE(((analysis2PassFrameData*)analysis->analysisFramedata)->ctuDistortion);
467
+ X265_FREE(((analysis2PassFrameData*)analysis->analysisFramedata)->scaledDistortion);
468
+ X265_FREE(((analysis2PassFrameData*)analysis->analysisFramedata)->offset);
469
+ X265_FREE(((analysis2PassFrameData*)analysis->analysisFramedata)->threshold);
470
+ }
471
+ if (!IS_X265_TYPE_I(sliceType))
472
+ {
473
+ X265_FREE(((analysis2PassFrameData*)analysis->analysisFramedata)->m_mv[0]);
474
+ X265_FREE(((analysis2PassFrameData*)analysis->analysisFramedata)->m_mv[1]);
475
+ X265_FREE(((analysis2PassFrameData*)analysis->analysisFramedata)->mvpIdx[0]);
476
+ X265_FREE(((analysis2PassFrameData*)analysis->analysisFramedata)->mvpIdx[1]);
477
+ X265_FREE(((analysis2PassFrameData*)analysis->analysisFramedata)->ref[0]);
478
+ X265_FREE(((analysis2PassFrameData*)analysis->analysisFramedata)->ref[1]);
479
+ X265_FREE(((analysis2PassFrameData*)analysis->analysisFramedata)->modes);
480
+ }
481
+ X265_FREE(analysis->analysisFramedata);
482
+ }
483
+}
484
+
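The buffer sizes in allocAnalysis2Pass follow from two counts: the number of CTUs covering the frame, and the number of 4x4 partitions per CTU. The following is a minimal sketch of that arithmetic, not the encoder's code. The helper names (`ctusAcross`, `ctusInFrame`, `partitionsPerCtu`) are hypothetical, and the sketch assumes a power-of-two CTU size passed as its log2.

```cpp
#include <cstdint>

// CTUs needed to cover one picture dimension: a ceiling division by the CTU
// size, written as add-then-shift like the hunk above.
uint32_t ctusAcross(uint32_t pixels, uint32_t log2CtuSize)
{
    const uint32_t ctuSize = 1u << log2CtuSize;
    return (pixels + ctuSize - 1) >> log2CtuSize;
}

// Total CTUs in the frame (the role played by numCUsInFrame above).
uint32_t ctusInFrame(uint32_t width, uint32_t height, uint32_t log2CtuSize)
{
    return ctusAcross(width, log2CtuSize) * ctusAcross(height, log2CtuSize);
}

// 4x4 partitions inside one CTU (assumed to correspond to NUM_4x4_PARTITIONS
// when the CTU size equals the maximum CU size).
uint32_t partitionsPerCtu(uint32_t log2CtuSize)
{
    const uint32_t perSide = 1u << (log2CtuSize - 2); // 4x4 blocks per CTU side
    return perSide * perSide;
}
```

For a 1920x1080 frame with 64x64 CTUs, each per-partition array (depth, distortion, MVs and so on) would need `ctusInFrame(1920, 1080, 6) * partitionsPerCtu(6)` entries.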
485
void Encoder::readAnalysisFile(x265_analysis_data* analysis, int curPoc)
486
{
487
488
489
#undef X265_FREAD
490
}
491
492
+void Encoder::readAnalysis2PassFile(x265_analysis_2Pass* analysis2Pass, int curPoc, int sliceType)
493
+{
494
+
495
+#define X265_FREAD(val, size, readSize, fileOffset)\
496
+ if (fread(val, size, readSize, fileOffset) != readSize)\
497
+ {\
498
+ x265_log(NULL, X265_LOG_ERROR, "Error reading analysis 2 pass data\n"); \
499
+ freeAnalysis2Pass(analysis2Pass, sliceType); \
500
+ m_aborted = true; \
501
+ return; \
502
+}\
503
+
504
+ uint32_t depthBytes = 0;
505
+ uint32_t widthInCU = (m_param->sourceWidth + g_maxCUSize - 1) >> g_maxLog2CUSize;
506
+ uint32_t heightInCU = (m_param->sourceHeight + g_maxCUSize - 1) >> g_maxLog2CUSize;
507
+ uint32_t numCUsInFrame = widthInCU * heightInCU;
508
+
509
+ int poc; uint32_t frameRecordSize;
510
+ X265_FREAD(&frameRecordSize, sizeof(uint32_t), 1, m_analysisFileIn);
511
+ X265_FREAD(&depthBytes, sizeof(uint32_t), 1, m_analysisFileIn);
512
+ X265_FREAD(&poc, sizeof(int), 1, m_analysisFileIn);
513
+
514
+ if (poc != curPoc || feof(m_analysisFileIn))
515
+ {
516
+ x265_log(NULL, X265_LOG_WARNING, "Error reading analysis 2 pass data: Cannot find POC %d\n", curPoc);
517
+ freeAnalysis2Pass(analysis2Pass, sliceType);
518
+ return;
519
+ }
520
+ /* Now arrived at the right frame, read the record */
521
+ analysis2Pass->frameRecordSize = frameRecordSize;
522
+ uint8_t* tempBuf = NULL, *depthBuf = NULL;
523
+ sse_t *tempdistBuf = NULL, *distortionBuf = NULL;
524
+ tempBuf = X265_MALLOC(uint8_t, depthBytes);
525
+ X265_FREAD(tempBuf, sizeof(uint8_t), depthBytes, m_analysisFileIn);
526
+ tempdistBuf = X265_MALLOC(sse_t, depthBytes);
527
+ X265_FREAD(tempdistBuf, sizeof(sse_t), depthBytes, m_analysisFileIn);
528
+ depthBuf = tempBuf;
529
+ distortionBuf = tempdistBuf;
530
+ analysis2PassFrameData* analysisFrameData = (analysis2PassFrameData*)analysis2Pass->analysisFramedata;
531
+ size_t count = 0;
532
+ uint32_t ctuCount = 0;
533
+ double sum = 0, sqrSum = 0;
534
+ for (uint32_t d = 0; d < depthBytes; d++)
535
+ {
536
+ int bytes = NUM_4x4_PARTITIONS >> (depthBuf[d] * 2);
537
+ memset(&analysisFrameData->depth[count], depthBuf[d], bytes);
538
+ analysisFrameData->distortion[count] = distortionBuf[d];
539
+ analysisFrameData->ctuDistortion[ctuCount] += analysisFrameData->distortion[count];
540
+ count += bytes;
541
+ if ((count % (size_t)NUM_4x4_PARTITIONS) == 0)
542
+ {
543
+ analysisFrameData->scaledDistortion[ctuCount] = X265_LOG2(X265_MAX(analysisFrameData->ctuDistortion[ctuCount], 1));
544
+ sum += analysisFrameData->scaledDistortion[ctuCount];
545
+ sqrSum += analysisFrameData->scaledDistortion[ctuCount] * analysisFrameData->scaledDistortion[ctuCount];
546
+ ctuCount++;
547
+ }
548
+ }
549
+ double avg = sum / numCUsInFrame;
550
+ analysisFrameData->sdDistortion = pow(((sqrSum / numCUsInFrame) - (avg * avg)), 0.5);
551
+ analysisFrameData->averageDistortion = avg;
552
+ analysisFrameData->highDistortionCtuCount = analysisFrameData->lowDistortionCtuCount = 0;
553
+ for (uint32_t i = 0; i < numCUsInFrame; ++i)
554
+ {
555
+ analysisFrameData->threshold[i] = analysisFrameData->scaledDistortion[i] / analysisFrameData->averageDistortion;
556
+ analysisFrameData->offset[i] = (analysisFrameData->averageDistortion - analysisFrameData->scaledDistortion[i]) / analysisFrameData->sdDistortion;
557
+ if (analysisFrameData->threshold[i] < 0.9 && analysisFrameData->offset[i] >= 1)
558
+ analysisFrameData->lowDistortionCtuCount++;
559
+ else if (analysisFrameData->threshold[i] > 1.1 && analysisFrameData->offset[i] <= -1)
560
+ analysisFrameData->highDistortionCtuCount++;
561
+ }
562
+ if (!IS_X265_TYPE_I(sliceType))
563
+ {
564
+ MV *tempMVBuf[2], *MVBuf[2];
565
+ int32_t *tempRefBuf[2], *refBuf[2];
566
+ int *tempMvpBuf[2], *mvpBuf[2];
567
+ uint8_t* tempModeBuf = NULL, *modeBuf = NULL;
568
+
569
+ int numDir = sliceType == X265_TYPE_P ? 1 : 2;
570
+ for (int i = 0; i < numDir; i++)
571
+ {
572
+ tempMVBuf[i] = X265_MALLOC(MV, depthBytes);
573
+ X265_FREAD(tempMVBuf[i], sizeof(MV), depthBytes, m_analysisFileIn);
574
+ MVBuf[i] = tempMVBuf[i];
575
+ tempMvpBuf[i] = X265_MALLOC(int, depthBytes);
576
+ X265_FREAD(tempMvpBuf[i], sizeof(int), depthBytes, m_analysisFileIn);
577
+ mvpBuf[i] = tempMvpBuf[i];
578
+ tempRefBuf[i] = X265_MALLOC(int32_t, depthBytes);
579
+ X265_FREAD(tempRefBuf[i], sizeof(int32_t), depthBytes, m_analysisFileIn);
580
+ refBuf[i] = tempRefBuf[i];
581
+ }
582
+ tempModeBuf = X265_MALLOC(uint8_t, depthBytes);
583
+ X265_FREAD(tempModeBuf, sizeof(uint8_t), depthBytes, m_analysisFileIn);
584
+ modeBuf = tempModeBuf;
585
+
586
+ count = 0;
587
+ for (uint32_t d = 0; d < depthBytes; d++)
588
+ {
589
+ size_t bytes = NUM_4x4_PARTITIONS >> (depthBuf[d] * 2);
590
+ for (int i = 0; i < numDir; i++)
591
+ {
592
+ for (size_t j = count, k = 0; k < bytes; j++, k++)
593
+ {
594
+ memcpy(&((analysis2PassFrameData*)analysis2Pass->analysisFramedata)->m_mv[i][j], MVBuf[i] + d, sizeof(MV));
595
+ memcpy(&((analysis2PassFrameData*)analysis2Pass->analysisFramedata)->mvpIdx[i][j], mvpBuf[i] + d, sizeof(int));
596
+ memcpy(&((analysis2PassFrameData*)analysis2Pass->analysisFramedata)->ref[i][j], refBuf[i] + d, sizeof(int32_t));
597
+ }
598
+ }
599
+ memset(&((analysis2PassFrameData *)analysis2Pass->analysisFramedata)->modes[count], modeBuf[d], bytes);
600
+ count += bytes;
601
+ }
602
+
603
+ for (int i = 0; i < numDir; i++)
604
+ {
605
+ X265_FREE(tempMVBuf[i]);
606
+ X265_FREE(tempMvpBuf[i]);
607
+ X265_FREE(tempRefBuf[i]);
608
+ }
609
+ X265_FREE(tempModeBuf);
610
+ }
611
+ X265_FREE(tempBuf);
612
+ X265_FREE(tempdistBuf);
613
+
614
+#undef X265_FREAD
615
+}
616
+
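The distortion statistics gathered in readAnalysis2PassFile can be illustrated in isolation. The sketch below takes per-CTU distortion sums, scales them with log2, and derives the same threshold (ratio to the mean) and offset (distance from the mean in standard deviations) used above to count low- and high-distortion CTUs. The struct and function names are hypothetical. Unlike the hunk, the sketch guards against a zero mean or zero standard deviation, which is an added assumption and not behaviour taken from the source.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

struct CtuDistortionStats
{
    double mean = 0.0;               // average of log2-scaled distortion
    double stddev = 0.0;             // standard deviation of the same
    uint32_t lowCount = 0;           // CTUs well below the average
    uint32_t highCount = 0;          // CTUs well above the average
    std::vector<double> threshold;   // scaled / mean, per CTU
    std::vector<double> offset;      // (mean - scaled) / stddev, per CTU
};

CtuDistortionStats classifyCtuDistortion(const std::vector<uint64_t>& ctuDistortion)
{
    CtuDistortionStats s;
    const std::size_t n = ctuDistortion.size();
    if (n == 0)
        return s;

    std::vector<double> scaled(n);
    double sum = 0.0, sqrSum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
    {
        // Clamp to 1 so that log2 never sees zero.
        scaled[i] = std::log2(static_cast<double>(std::max<uint64_t>(ctuDistortion[i], 1)));
        sum += scaled[i];
        sqrSum += scaled[i] * scaled[i];
    }
    s.mean = sum / n;
    s.stddev = std::sqrt(std::max(sqrSum / n - s.mean * s.mean, 0.0));

    s.threshold.assign(n, 1.0);
    s.offset.assign(n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
    {
        if (s.mean > 0.0)
            s.threshold[i] = scaled[i] / s.mean;
        if (s.stddev > 0.0)
            s.offset[i] = (s.mean - scaled[i]) / s.stddev;

        if (s.threshold[i] < 0.9 && s.offset[i] >= 1.0)
            s.lowCount++;
        else if (s.threshold[i] > 1.1 && s.offset[i] <= -1.0)
            s.highCount++;
    }
    return s;
}
```

In the frameencoder.cpp hunk later in this diff, the per-CTU offset is added to the base QP only when both counts are non-zero, so a frame with uniform distortion is left alone.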
617
void Encoder::writeAnalysisFile(x265_analysis_data* analysis, FrameData &curEncData)
618
{
619
620
621
}
622
#undef X265_FWRITE
623
}
624
+void Encoder::writeAnalysis2PassFile(x265_analysis_2Pass* analysis2Pass, FrameData &curEncData, int slicetype)
625
+{
626
+#define X265_FWRITE(val, size, writeSize, fileOffset)\
627
+ if (fwrite(val, size, writeSize, fileOffset) < writeSize)\
628
+ {\
629
+ x265_log(NULL, X265_LOG_ERROR, "Error writing analysis 2 pass data\n"); \
630
+ freeAnalysis2Pass(analysis2Pass, slicetype); \
631
+ m_aborted = true; \
632
+ return; \
633
+}\
634
+
635
+ uint32_t depthBytes = 0;
636
+ uint32_t widthInCU = (m_param->sourceWidth + g_maxCUSize - 1) >> g_maxLog2CUSize;
637
+ uint32_t heightInCU = (m_param->sourceHeight + g_maxCUSize - 1) >> g_maxLog2CUSize;
638
+ uint32_t numCUsInFrame = widthInCU * heightInCU;
639
+ analysis2PassFrameData* analysisFrameData = (analysis2PassFrameData*)analysis2Pass->analysisFramedata;
640
+
641
+ for (uint32_t cuAddr = 0; cuAddr < numCUsInFrame; cuAddr++)
642
+ {
643
+ uint8_t depth = 0;
644
+
645
+ CUData* ctu = curEncData.getPicCTU(cuAddr);
646
+
647
+ for (uint32_t absPartIdx = 0; absPartIdx < ctu->m_numPartitions; depthBytes++)
648
+ {
649
+ depth = ctu->m_cuDepth[absPartIdx];
650
+ analysisFrameData->depth[depthBytes] = depth;
651
+ analysisFrameData->distortion[depthBytes] = ctu->m_distortion[absPartIdx];
652
+ absPartIdx += ctu->m_numPartitions >> (depth * 2);
653
+ }
654
+ }
655
+
656
+ if (curEncData.m_slice->m_sliceType != I_SLICE)
657
+ {
658
+ depthBytes = 0;
659
+ for (uint32_t cuAddr = 0; cuAddr < numCUsInFrame; cuAddr++)
660
+ {
661
+ uint8_t depth = 0;
662
+ uint8_t predMode = 0;
663
+
664
+ CUData* ctu = curEncData.getPicCTU(cuAddr);
665
+
666
+ for (uint32_t absPartIdx = 0; absPartIdx < ctu->m_numPartitions; depthBytes++)
667
+ {
668
+ depth = ctu->m_cuDepth[absPartIdx];
669
+ analysisFrameData->m_mv[0][depthBytes] = ctu->m_mv[0][absPartIdx];
670
+ analysisFrameData->mvpIdx[0][depthBytes] = ctu->m_mvpIdx[0][absPartIdx];
671
+ analysisFrameData->ref[0][depthBytes] = ctu->m_refIdx[0][absPartIdx];
672
+ predMode = ctu->m_predMode[absPartIdx];
673
+ if (ctu->m_refIdx[1][absPartIdx] != -1)
674
+ {
675
+ analysisFrameData->m_mv[1][depthBytes] = ctu->m_mv[1][absPartIdx];
676
+ analysisFrameData->mvpIdx[1][depthBytes] = ctu->m_mvpIdx[1][absPartIdx];
677
+ analysisFrameData->ref[1][depthBytes] = ctu->m_refIdx[1][absPartIdx];
678
+ predMode = 4; // used as an indicator that the block is coded as bidir
679
+ }
680
+ analysisFrameData->modes[depthBytes] = predMode;
681
+
682
+ absPartIdx += ctu->m_numPartitions >> (depth * 2);
683
+ }
684
+ }
685
+ }
686
+
687
+ /* calculate frameRecordSize */
688
+ analysis2Pass->frameRecordSize = sizeof(analysis2Pass->frameRecordSize) + sizeof(depthBytes) + sizeof(analysis2Pass->poc);
689
+
690
+ analysis2Pass->frameRecordSize += depthBytes * sizeof(uint8_t);
691
+ analysis2Pass->frameRecordSize += depthBytes * sizeof(sse_t);
692
+ if (curEncData.m_slice->m_sliceType != I_SLICE)
693
+ {
694
+ int numDir = (curEncData.m_slice->m_sliceType == P_SLICE) ? 1 : 2;
695
+ analysis2Pass->frameRecordSize += depthBytes * sizeof(MV) * numDir;
696
+ analysis2Pass->frameRecordSize += depthBytes * sizeof(int32_t) * numDir;
697
+ analysis2Pass->frameRecordSize += depthBytes * sizeof(int) * numDir;
698
+ analysis2Pass->frameRecordSize += depthBytes * sizeof(uint8_t);
699
+ }
700
+ X265_FWRITE(&analysis2Pass->frameRecordSize, sizeof(uint32_t), 1, m_analysisFileOut);
701
+ X265_FWRITE(&depthBytes, sizeof(uint32_t), 1, m_analysisFileOut);
702
+ X265_FWRITE(&analysis2Pass->poc, sizeof(uint32_t), 1, m_analysisFileOut);
703
+
704
+ X265_FWRITE(analysisFrameData->depth, sizeof(uint8_t), depthBytes, m_analysisFileOut);
705
+ X265_FWRITE(analysisFrameData->distortion, sizeof(sse_t), depthBytes, m_analysisFileOut);
706
+ if (curEncData.m_slice->m_sliceType != I_SLICE)
707
+ {
708
+ int numDir = curEncData.m_slice->m_sliceType == P_SLICE ? 1 : 2;
709
+ for (int i = 0; i < numDir; i++)
710
+ {
711
+ X265_FWRITE(analysisFrameData->m_mv[i], sizeof(MV), depthBytes, m_analysisFileOut);
712
+ X265_FWRITE(analysisFrameData->mvpIdx[i], sizeof(int), depthBytes, m_analysisFileOut);
713
+ X265_FWRITE(analysisFrameData->ref[i], sizeof(int32_t), depthBytes, m_analysisFileOut);
714
+ }
715
+ X265_FWRITE(analysisFrameData->modes, sizeof(uint8_t), depthBytes, m_analysisFileOut);
716
+ }
717
+#undef X265_FWRITE
718
+}
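The frameRecordSize calculation in writeAnalysis2PassFile is a sum of fixed header fields plus per-entry payloads that depend on slice type. A small sketch of that layout follows. `MotionVector`, `Sse`, `SliceKind` and `frameRecordSize` are hypothetical stand-ins: the sketch assumes a 4-byte motion vector and a 32-bit distortion type, which may not match every build of the encoder.

```cpp
#include <cstdint>

struct MotionVector { int16_t x, y; };   // assumption: stand-in for MV (4 bytes)
using Sse = uint32_t;                    // assumption: stand-in for sse_t

enum class SliceKind { I, P, B };

// Bytes occupied by one frame record holding `entries` coded-CU entries.
uint32_t frameRecordSize(uint32_t entries, SliceKind kind)
{
    // Header: frameRecordSize, depthBytes and poc, each stored as 32 bits.
    uint32_t size = 3 * static_cast<uint32_t>(sizeof(uint32_t));

    // Every slice type stores a depth byte and a distortion value per entry.
    size += entries * static_cast<uint32_t>(sizeof(uint8_t));
    size += entries * static_cast<uint32_t>(sizeof(Sse));

    if (kind != SliceKind::I)
    {
        // One prediction direction for P slices, two for B slices.
        const uint32_t numDir = (kind == SliceKind::P) ? 1u : 2u;
        const uint32_t perDir = static_cast<uint32_t>(sizeof(MotionVector)) // MV
                              + static_cast<uint32_t>(sizeof(int32_t))      // mvpIdx
                              + static_cast<uint32_t>(sizeof(int32_t));     // ref
        size += entries * numDir * perDir;
        size += entries * static_cast<uint32_t>(sizeof(uint8_t));           // modes
    }
    return size;
}
```

A reader can use the stored record size to skip a frame it does not need, which is why the header carries it before the payload.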
719
720
void Encoder::printReconfigureParams()
721
{
722
723
724
x265_log(newParam, X265_LOG_DEBUG, "Reconfigured param options, input Frame: %d\n", m_pocLast + 1);
725
726
- char tmp[40];
727
+ char tmp[60];
728
#define TOOLCMP(COND1, COND2, STR) if (COND1 != COND2) { sprintf(tmp, STR, COND1, COND2); x265_log(newParam, X265_LOG_DEBUG, tmp); }
729
TOOLCMP(oldParam->maxNumReferences, newParam->maxNumReferences, "ref=%d to %d\n");
730
TOOLCMP(oldParam->bEnableFastIntra, newParam->bEnableFastIntra, "fast-intra=%d to %d\n");
731
732
TOOLCMP(oldParam->bEnableRectInter, newParam->bEnableRectInter, "rect=%d to %d\n");
733
TOOLCMP(oldParam->maxNumMergeCand, newParam->maxNumMergeCand, "max-merge=%d to %d\n");
734
TOOLCMP(oldParam->bIntraInBFrames, newParam->bIntraInBFrames, "b-intra=%d to %d\n");
735
+ TOOLCMP(oldParam->scalingLists, newParam->scalingLists, "scalinglists=%s to %s\n");
736
}
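The TOOLCMP macro above formats into a fixed 60-byte buffer with sprintf, and the new scaling-lists entry formats two strings. If a scaling-list value were a long file path (an assumption about how the option may be used), a bounded write would be the safer pattern. The sketch below shows that pattern with std::snprintf; `describeStringChange` is a hypothetical helper, not part of the encoder.

```cpp
#include <cstdio>
#include <string>

// Format "<name>=<old> to <new>\n" into a 60-byte buffer without overflow.
// Output longer than 59 characters is truncated rather than written past
// the end of the buffer.
std::string describeStringChange(const char* name, const char* oldValue, const char* newValue)
{
    char tmp[60];
    std::snprintf(tmp, sizeof(tmp), "%s=%s to %s\n", name, oldValue, newValue);
    return std::string(tmp);
}
```

Because snprintf always NUL-terminates when the buffer size is non-zero, the returned string is never longer than 59 characters.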
737
738
bool Encoder::computeSPSRPSIndex()
739
x265_2.2.tar.gz/source/encoder/encoder.h -> x265_2.3.tar.gz/source/encoder/encoder.h
Changed
45
1
2
#include "scalinglist.h"
3
#include "x265.h"
4
#include "nal.h"
5
+#include "framedata.h"
6
7
struct x265_encoder {};
8
9
10
DPB* m_dpb;
11
Frame* m_exportedPic;
12
FILE* m_analysisFile;
13
+ FILE* m_analysisFileIn;
14
+ FILE* m_analysisFileOut;
15
x265_param* m_param;
16
x265_param* m_latestParam; // Holds latest param during a reconfigure
17
RateControl* m_rateControl;
18
19
Lock m_sliceQpLock;
20
int m_iFrameNum;
21
int m_iPPSQpMinus26;
22
- int m_iLastSliceQp;
23
int64_t m_iBitsCostSum[QP_MAX_MAX + 1];
24
-
25
Lock m_sliceRefIdxLock;
26
RefIdxLastGOP m_refIdxLastGOP;
27
28
29
30
void freeAnalysis(x265_analysis_data* analysis);
31
32
+ void allocAnalysis2Pass(x265_analysis_2Pass* analysis, int sliceType);
33
+
34
+ void freeAnalysis2Pass(x265_analysis_2Pass* analysis, int sliceType);
35
+
36
void readAnalysisFile(x265_analysis_data* analysis, int poc);
37
38
void writeAnalysisFile(x265_analysis_data* pic, FrameData &curEncData);
39
-
40
+ void readAnalysis2PassFile(x265_analysis_2Pass* analysis2Pass, int poc, int sliceType);
41
+ void writeAnalysis2PassFile(x265_analysis_2Pass* analysis2Pass, FrameData &curEncData, int slicetype);
42
void finishFrameStats(Frame* pic, FrameEncoder *curEncoder, x265_frame_stats* frameStats, int inPoc);
43
44
void calcRefreshInterval(Frame* frameEnc);
45
x265_2.2.tar.gz/source/encoder/entropy.cpp -> x265_2.3.tar.gz/source/encoder/entropy.cpp
Changed
9
1
2
markValid();
3
m_fracBits = 0;
4
m_pad = 0;
5
+ m_meanQP = 0;
6
X265_CHECK(sizeof(m_contextState) >= sizeof(m_contextState[0]) * MAX_OFF_CTX_MOD, "context state table is too small\n");
7
}
8
9
x265_2.2.tar.gz/source/encoder/entropy.h -> x265_2.3.tar.gz/source/encoder/entropy.h
Changed
9
1
2
int m_bitsLeft;
3
uint64_t m_fracBits;
4
EstBitsSbac m_estBitsSbac;
5
+ double m_meanQP;
6
7
Entropy();
8
9
x265_2.2.tar.gz/source/encoder/frameencoder.cpp -> x265_2.3.tar.gz/source/encoder/frameencoder.cpp
Changed
148
1
2
m_top->m_iBitsCostSum[i] += codeLength;
3
}
4
m_top->m_iFrameNum++;
5
- m_top->m_iLastSliceQp = slice->m_sliceQp;
6
}
7
-
8
m_initSliceContext.resetEntropy(*slice);
9
10
m_frameFilter.start(m_frame, m_initSliceContext);
11
12
m_frame->m_encData->m_frameStats.lumaDistortion += m_rows[i].rowStats.lumaDistortion;
13
m_frame->m_encData->m_frameStats.chromaDistortion += m_rows[i].rowStats.chromaDistortion;
14
m_frame->m_encData->m_frameStats.psyEnergy += m_rows[i].rowStats.psyEnergy;
15
+ m_frame->m_encData->m_frameStats.ssimEnergy += m_rows[i].rowStats.ssimEnergy;
16
m_frame->m_encData->m_frameStats.resEnergy += m_rows[i].rowStats.resEnergy;
17
for (uint32_t depth = 0; depth <= g_maxCUDepth; depth++)
18
{
19
20
m_frame->m_encData->m_frameStats.avgLumaDistortion = (double)(m_frame->m_encData->m_frameStats.lumaDistortion) / m_frame->m_encData->m_frameStats.totalCtu;
21
m_frame->m_encData->m_frameStats.avgChromaDistortion = (double)(m_frame->m_encData->m_frameStats.chromaDistortion) / m_frame->m_encData->m_frameStats.totalCtu;
22
m_frame->m_encData->m_frameStats.avgPsyEnergy = (double)(m_frame->m_encData->m_frameStats.psyEnergy) / m_frame->m_encData->m_frameStats.totalCtu;
23
+ m_frame->m_encData->m_frameStats.avgSsimEnergy = (double)(m_frame->m_encData->m_frameStats.ssimEnergy) / m_frame->m_encData->m_frameStats.totalCtu;
24
m_frame->m_encData->m_frameStats.avgResEnergy = (double)(m_frame->m_encData->m_frameStats.resEnergy) / m_frame->m_encData->m_frameStats.totalCtu;
25
m_frame->m_encData->m_frameStats.percentIntraNxN = (double)(m_frame->m_encData->m_frameStats.cntIntraNxN * 100) / m_frame->m_encData->m_frameStats.totalCu;
26
for (uint32_t depth = 0; depth <= g_maxCUDepth; depth++)
27
28
}
29
m_accessUnitBits = bytes << 3;
30
31
- m_endCompressTime = x265_mdate();
32
-
33
+ int filler = 0;
34
/* rateControlEnd may also block for earlier frames to call rateControlUpdateStats */
35
- if (m_top->m_rateControl->rateControlEnd(m_frame, m_accessUnitBits, &m_rce) < 0)
36
+ if (m_top->m_rateControl->rateControlEnd(m_frame, m_accessUnitBits, &m_rce, &filler) < 0)
37
m_top->m_aborted = true;
38
39
+ if (filler > 0)
40
+ {
41
+ filler = (filler - FILLER_OVERHEAD * 8) >> 3;
42
+ m_bs.resetBits();
43
+ while (filler > 0)
44
+ {
45
+ m_bs.write(0xff, 8);
46
+ filler--;
47
+ }
48
+ m_bs.writeByteAlignment();
49
+ m_nalList.serialize(NAL_UNIT_FILLER_DATA, m_bs);
50
+ bytes += m_nalList.m_nal[m_nalList.m_numNal - 1].sizeBytes;
51
+ bytes -= 3; //exclude start code prefix
52
+ m_accessUnitBits = bytes << 3;
53
+ }
54
+
55
+ m_endCompressTime = x265_mdate();
56
+
57
/* Decrement referenced frame reference counts, allow them to be recycled */
58
for (int l = 0; l < numPredDir; l++)
59
{
60
61
//m_rows[row - 1].bufferedEntropy.loadContexts(m_initSliceContext);
62
}
63
64
+ // calculate mean QP for consistent deltaQP signalling calculation
65
+ if (m_param->bOptCUDeltaQP)
66
+ {
67
+ ScopedLock self(curRow.lock);
68
+ if (!curRow.avgQPComputed)
69
+ {
70
+ if (m_param->bEnableWavefront || !row)
71
+ {
72
+ double meanQPOff = 0;
73
+ uint32_t loopIncr, count = 0;
74
+ bool isReferenced = IS_REFERENCED(m_frame);
75
+ double *qpoffs = (isReferenced && m_param->rc.cuTree) ? m_frame->m_lowres.qpCuTreeOffset : m_frame->m_lowres.qpAqOffset;
76
+ if (qpoffs)
77
+ {
78
+ if (m_param->rc.qgSize == 8)
79
+ loopIncr = 8;
80
+ else
81
+ loopIncr = 16;
82
+ uint32_t cuYStart = 0, height = m_frame->m_fencPic->m_picHeight;
83
+ if (m_param->bEnableWavefront)
84
+ {
85
+ cuYStart = intRow * m_param->maxCUSize;
86
+ height = cuYStart + m_param->maxCUSize;
87
+ }
88
+
89
+ uint32_t qgSize = m_param->rc.qgSize, width = m_frame->m_fencPic->m_picWidth;
90
+ uint32_t maxOffsetCols = (m_frame->m_fencPic->m_picWidth + (loopIncr - 1)) / loopIncr;
91
+ for (uint32_t cuY = cuYStart; cuY < height && (cuY < m_frame->m_fencPic->m_picHeight); cuY += qgSize)
92
+ {
93
+ for (uint32_t cuX = 0; cuX < width; cuX += qgSize)
94
+ {
95
+ double qp_offset = 0;
96
+ uint32_t cnt = 0;
97
+
98
+ for (uint32_t block_yy = cuY; block_yy < cuY + qgSize && block_yy < m_frame->m_fencPic->m_picHeight; block_yy += loopIncr)
99
+ {
100
+ for (uint32_t block_xx = cuX; block_xx < cuX + qgSize && block_xx < width; block_xx += loopIncr)
101
+ {
102
+ int idx = ((block_yy / loopIncr) * (maxOffsetCols)) + (block_xx / loopIncr);
103
+ qp_offset += qpoffs[idx];
104
+ cnt++;
105
+ }
106
+ }
107
+ qp_offset /= cnt;
108
+ meanQPOff += qp_offset;
109
+ count++;
110
+ }
111
+ }
112
+ meanQPOff /= count;
113
+ }
114
+ rowCoder.m_meanQP = slice->m_sliceQp + meanQPOff;
115
+ }
116
+ else
117
+ {
118
+ rowCoder.m_meanQP = m_rows[0].rowGoOnCoder.m_meanQP;
119
+ }
120
+ curRow.avgQPComputed = 1;
121
+ }
122
+ }
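The mean-QP hunk above averages per-block QP offsets inside each quantization group, then averages those group means over the region. A standalone sketch of that two-level average follows, under simplifying assumptions: the offsets are stored row-major on a grid of `blockSize` x `blockSize` blocks, and the whole picture is processed at once rather than one CTU row at a time. The function name and parameters are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Mean of per-quantization-group average offsets over a width x height region.
double meanQgOffset(const std::vector<double>& offsets,
                    uint32_t width, uint32_t height,
                    uint32_t qgSize, uint32_t blockSize)
{
    // Blocks per row of the offset grid (ceiling division).
    const uint32_t cols = (width + blockSize - 1) / blockSize;
    double total = 0.0;
    uint32_t groups = 0;

    for (uint32_t y = 0; y < height; y += qgSize)
    {
        for (uint32_t x = 0; x < width; x += qgSize)
        {
            double sum = 0.0;
            uint32_t cnt = 0;
            // Visit every offset block inside this group, clipped to the picture.
            for (uint32_t by = y; by < y + qgSize && by < height; by += blockSize)
                for (uint32_t bx = x; bx < x + qgSize && bx < width; bx += blockSize)
                {
                    const std::size_t idx = static_cast<std::size_t>(by / blockSize) * cols + bx / blockSize;
                    sum += offsets[idx];
                    cnt++;
                }
            total += sum / cnt;   // cnt >= 1 because (x, y) lies inside the picture
            groups++;
        }
    }
    return groups ? total / groups : 0.0;
}
```

Adding the result to the slice QP gives a value comparable to rowCoder.m_meanQP in the hunk. Note that groups clipped at the picture edge weigh as much as full groups in this average.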
123
124
// TODO: special-case handling of the first and last rows
125
126
127
rowCoder.copyState(m_initSliceContext);
128
rowCoder.loadContexts(m_rows[row - 1].bufferedEntropy);
129
}
130
+ analysis2PassFrameData* analysisFrameData = (analysis2PassFrameData*)(m_frame->m_analysis2Pass).analysisFramedata;
131
+ if (analysisFrameData && m_param->rc.bStatRead && m_param->analysisMultiPassDistortion && (analysisFrameData->threshold[cuAddr] < 0.9 || analysisFrameData->threshold[cuAddr] > 1.1)
132
+ && analysisFrameData->highDistortionCtuCount && analysisFrameData->lowDistortionCtuCount)
133
+ curEncData.m_cuStat[cuAddr].baseQp += analysisFrameData->offset[cuAddr];
134
+
135
+ if (m_param->dynamicRd && (int32_t)(m_rce.qpaRc - m_rce.qpNoVbv) > 0)
136
+ ctu->m_vbvAffected = true;
137
138
// Does all the CU analysis, returns best top level mode decision
139
Mode& best = tld.analysis.compressCTU(*ctu, *m_frame, m_cuGeoms[m_ctuGeomMap[cuAddr]], rowCoder);
140
141
curRow.rowStats.lumaDistortion += best.lumaDistortion;
142
curRow.rowStats.chromaDistortion += best.chromaDistortion;
143
curRow.rowStats.psyEnergy += best.psyEnergy;
144
+ curRow.rowStats.ssimEnergy += best.ssimEnergy;
145
curRow.rowStats.resEnergy += best.resEnergy;
146
curRow.rowStats.cntIntraNxN += frameLog.cntIntraNxN;
147
curRow.rowStats.totalCu += frameLog.totalCu;
148
x265_2.2.tar.gz/source/encoder/frameencoder.h -> x265_2.3.tar.gz/source/encoder/frameencoder.h
Changed
17
1
2
3
/* count of completed CUs in this row */
4
volatile uint32_t completed;
5
+ volatile uint32_t avgQPComputed;
6
7
/* called at the start of each frame to initialize state */
8
void init(Entropy& initContext, unsigned int sid)
9
10
active = false;
11
busy = false;
12
completed = 0;
13
+ avgQPComputed = 0;
14
sliceId = sid;
15
memset(&rowStats, 0, sizeof(rowStats));
16
rowGoOnCoder.load(initContext);
17
x265_2.2.tar.gz/source/encoder/framefilter.cpp -> x265_2.3.tar.gz/source/encoder/framefilter.cpp
Changed
43
1
2
const uint32_t numCols = m_frame->m_encData->m_slice->m_sps->numCuInWidth;
3
const uint32_t lineStartCUAddr = row * numCols;
4
5
+ /* Generate integral planes for SEA motion search */
6
+ if(m_param->searchMethod == X265_SEA)
7
+ computeMEIntegral(row);
8
// Notify other FrameEncoders that this row of reconstructed pixels is available
9
m_frame->m_reconRowFlag[row].set(1);
10
11
12
}
13
} // end of (m_param->maxSlices == 1)
14
15
- int lastRow = row == (int)m_frame->m_encData->m_slice->m_sps->numCuInHeight - 1;
16
+ if (ATOMIC_INC(&m_frameEncoder->m_completionCount) == 2 * (int)m_frameEncoder->m_numRows)
17
+ {
18
+ m_frameEncoder->m_completionEvent.trigger();
19
+ }
20
+}
21
22
- /* generate integral planes for SEA motion search */
23
- if (m_param->searchMethod == X265_SEA && m_frame->m_encData->m_meIntegral && m_frame->m_lowres.sliceType != X265_TYPE_B)
24
+void FrameFilter::computeMEIntegral(int row)
25
+{
26
+ int lastRow = row == (int)m_frame->m_encData->m_slice->m_sps->numCuInHeight - 1;
27
+ if (m_frame->m_encData->m_meIntegral && m_frame->m_lowres.sliceType != X265_TYPE_B)
28
{
29
/* If WPP, other than first row, integral calculation for current row needs to wait till the
30
* integral for the previous row is computed */
31
32
}
33
m_parallelFilter[row].m_frameFilter->integralCompleted.set(1);
34
}
35
-
36
- if (ATOMIC_INC(&m_frameEncoder->m_completionCount) == 2 * (int)m_frameEncoder->m_numRows)
37
- {
38
- m_frameEncoder->m_completionEvent.trigger();
39
- }
40
}
41
42
static uint64_t computeSSD(pixel *fenc, pixel *rec, intptr_t stride, uint32_t width, uint32_t height)
43
x265_2.2.tar.gz/source/encoder/framefilter.h -> x265_2.3.tar.gz/source/encoder/framefilter.h
Changed
9
1
2
3
void processRow(int row);
4
void processPostRow(int row);
5
+ void computeMEIntegral(int row);
6
};
7
}
8
9
x265_2.2.tar.gz/source/encoder/ratecontrol.cpp -> x265_2.3.tar.gz/source/encoder/ratecontrol.cpp
Changed
127
1
2
m_param->fpsNum, m_param->fpsDenom, k, l);
3
return false;
4
}
5
+ if (m_param->analysisMultiPassRefine)
6
+ {
7
+ p = strstr(opts, "ref=");
8
+ sscanf(p, "ref=%d", &i);
9
+ if (i > m_param->maxNumReferences)
10
+ {
11
+ x265_log(m_param, X265_LOG_ERROR, "maxNumReferences cannot be less than 1st pass (%d vs %d)\n",
12
+ i, m_param->maxNumReferences);
13
+ return false;
14
+ }
15
+ }
16
+ if (m_param->analysisMultiPassRefine || m_param->analysisMultiPassDistortion)
17
+ {
18
+ p = strstr(opts, "ctu=");
19
+ sscanf(p, "ctu=%u", &k);
20
+ if (k != m_param->maxCUSize)
21
+ {
22
+ x265_log(m_param, X265_LOG_ERROR, "maxCUSize mismatch with 1st pass (%u vs %u)\n",
23
+ k, m_param->maxCUSize);
24
+ return false;
25
+ }
26
+ }
27
CMP_OPT_FIRST_PASS("bitdepth", m_param->internalBitDepth);
28
CMP_OPT_FIRST_PASS("weightp", m_param->bEnableWeightedPred);
29
CMP_OPT_FIRST_PASS("bframes", m_param->bframes);
30
31
p->offset += new_offset;
32
}
33
34
-void RateControl::updateVbv(int64_t bits, RateControlEntry* rce)
35
+int RateControl::updateVbv(int64_t bits, RateControlEntry* rce)
36
{
37
int predType = rce->sliceType;
38
+ int filler = 0;
39
+ double bufferBits;
40
predType = rce->sliceType == B_SLICE && rce->keptAsRef ? 3 : predType;
41
if (rce->lastSatd >= m_ncu && rce->encodeOrder >= m_lastPredictorReset)
42
updatePredictor(&m_pred[predType], x265_qp2qScale(rce->qpaRc), (double)rce->lastSatd, (double)bits);
43
if (!m_isVbv)
44
- return;
45
+ return 0;
46
47
m_bufferFillFinal -= bits;
48
49
50
51
m_bufferFillFinal = X265_MAX(m_bufferFillFinal, 0);
52
m_bufferFillFinal += m_bufferRate;
53
- m_bufferFillFinal = X265_MIN(m_bufferFillFinal, m_bufferSize);
54
- double bufferBits = X265_MIN(bits + m_bufferExcess, m_bufferRate);
55
- m_bufferExcess = X265_MAX(m_bufferExcess - bufferBits + bits, 0);
56
- m_bufferFillActual += bufferBits - bits;
57
- m_bufferFillActual = X265_MIN(m_bufferFillActual, m_bufferSize);
58
+
59
+ if (m_bufferFillFinal > m_bufferSize)
60
+ {
61
+ if (m_param->rc.bStrictCbr)
62
+ {
63
+ filler = (int)(m_bufferFillFinal - m_bufferSize);
64
+ filler += FILLER_OVERHEAD * 8;
65
+ m_bufferFillFinal -= filler;
66
+ bufferBits = X265_MIN(bits + filler + m_bufferExcess, m_bufferRate);
67
+ m_bufferExcess = X265_MAX(m_bufferExcess - bufferBits + bits + filler, 0);
68
+ m_bufferFillActual += bufferBits - bits - filler;
69
+ }
70
+ else
71
+ {
72
+ m_bufferFillFinal = X265_MIN(m_bufferFillFinal, m_bufferSize);
73
+ bufferBits = X265_MIN(bits + m_bufferExcess, m_bufferRate);
74
+ m_bufferExcess = X265_MAX(m_bufferExcess - bufferBits + bits, 0);
75
+ m_bufferFillActual += bufferBits - bits;
76
+ m_bufferFillActual = X265_MIN(m_bufferFillActual, m_bufferSize);
77
+ }
78
+ }
79
+ return filler;
80
}
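The strict-CBR path in updateVbv can be read as follows: when the refilled buffer would exceed its size, the surplus is emitted as filler bits instead of being clipped away. Below is a simplified sketch of that idea together with the bits-to-bytes conversion done in the frameencoder.cpp hunk. `VbvState`, `updateVbvFill`, `fillerPayloadBytes` and the value of `kFillerOverheadBytes` are assumptions made for illustration. The sketch also omits the m_bufferExcess and m_bufferFillActual bookkeeping.

```cpp
#include <algorithm>
#include <cstdint>

struct VbvState
{
    double fill;   // current buffer fullness, in bits
    double size;   // buffer capacity, in bits
    double rate;   // bits added to the buffer per frame interval
};

// Assumption: stand-in for FILLER_OVERHEAD (NAL header, start code, trailing byte).
constexpr int kFillerOverheadBytes = 6;

// Remove the frame's bits, refill, and return the filler bits required to
// keep the buffer from overflowing (0 when none are needed).
int updateVbvFill(VbvState& s, int64_t frameBits, bool strictCbr)
{
    s.fill -= static_cast<double>(frameBits);
    s.fill = std::max(s.fill, 0.0);
    s.fill += s.rate;

    int filler = 0;
    if (s.fill > s.size)
    {
        if (strictCbr)
        {
            // Spend the surplus as filler data, including its own overhead.
            filler = static_cast<int>(s.fill - s.size) + kFillerOverheadBytes * 8;
            s.fill -= filler;
        }
        else
        {
            // Without strict CBR the surplus is simply discarded.
            s.fill = s.size;
        }
    }
    return filler;
}

// Number of 0xff payload bytes to write for a given filler bit count.
int fillerPayloadBytes(int fillerBits)
{
    return fillerBits > 0 ? (fillerBits - kFillerOverheadBytes * 8) >> 3 : 0;
}
```

With a 1000-bit buffer at 900 bits, a 200-bit frame and a 400-bit refill, the strict path yields 148 filler bits and leaves the buffer at 952 bits; the non-strict path clips the buffer to 1000 bits and emits no filler.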
81
82
/* After encoding one frame, update rate control state */
83
-int RateControl::rateControlEnd(Frame* curFrame, int64_t bits, RateControlEntry* rce)
84
+int RateControl::rateControlEnd(Frame* curFrame, int64_t bits, RateControlEntry* rce, int *filler)
85
{
86
int orderValue = m_startEndOrder.get();
87
int endOrdinal = (rce->encodeOrder + m_param->frameNumThreads) * 2 - 1;
88
89
int64_t actualBits = bits;
90
Slice *slice = curEncData.m_slice;
91
92
- if (m_param->rc.aqMode || m_isVbv)
93
+ if (m_param->rc.aqMode || m_isVbv || m_param->bAQMotion)
94
{
95
if (m_isVbv && !(m_2pass && m_param->rc.rateControlMode == X265_RC_CRF))
96
{
97
98
rce->qpaRc = curEncData.m_avgQpRc;
99
}
100
101
- if (m_param->rc.aqMode)
102
+ if (m_param->rc.aqMode || m_param->bAQMotion)
103
{
104
double avgQpAq = 0;
105
/* determine actual avg encoded QP, after AQ/cutree adjustments */
106
107
108
if (m_isVbv)
109
{
110
- updateVbv(actualBits, rce);
111
+ *filler = updateVbv(actualBits, rce);
112
113
if (m_param->bEmitHRDSEI)
114
{
115
116
117
rce->hrdTiming->cpbInitialAT = hrd->cbrFlag ? m_prevCpbFinalAT : X265_MAX(m_prevCpbFinalAT, cpbEarliestAT);
118
}
119
-
120
+ int filler_bits = *filler ? (*filler - START_CODE_OVERHEAD * 8) : 0;
121
uint32_t cpbsizeUnscale = hrd->cpbSizeValue << (hrd->cpbSizeScale + CPB_SHIFT);
122
- rce->hrdTiming->cpbFinalAT = m_prevCpbFinalAT = rce->hrdTiming->cpbInitialAT + actualBits / cpbsizeUnscale;
123
+ rce->hrdTiming->cpbFinalAT = m_prevCpbFinalAT = rce->hrdTiming->cpbInitialAT + (actualBits + filler_bits)/ cpbsizeUnscale;
124
rce->hrdTiming->dpbOutputTime = (double)rce->picTimingSEI->m_picDpbOutputDelay * time->numUnitsInTick / time->timeScale + rce->hrdTiming->cpbRemovalTime;
125
}
126
}
127
x265_2.2.tar.gz/source/encoder/ratecontrol.h -> x265_2.3.tar.gz/source/encoder/ratecontrol.h
Changed
19
1
2
// to be called for each curFrame to process RateControl and set QP
3
int rateControlStart(Frame* curFrame, RateControlEntry* rce, Encoder* enc);
4
void rateControlUpdateStats(RateControlEntry* rce);
5
- int rateControlEnd(Frame* curFrame, int64_t bits, RateControlEntry* rce);
6
+ int rateControlEnd(Frame* curFrame, int64_t bits, RateControlEntry* rce, int *filler);
7
int rowVbvRateControl(Frame* curFrame, uint32_t row, RateControlEntry* rce, double& qpVbv);
8
int rateControlSliceType(int frameNum);
9
bool cuTreeReadFor2Pass(Frame* curFrame);
10
11
void accumPQpUpdate();
12
13
int getPredictorType(int lowresSliceType, int sliceType);
14
- void updateVbv(int64_t bits, RateControlEntry* rce);
15
+ int updateVbv(int64_t bits, RateControlEntry* rce);
16
void updatePredictor(Predictor *p, double q, double var, double bits);
17
double clipQscale(Frame* pic, RateControlEntry* rce, double q);
18
void updateVbvPlan(Encoder* enc);
19
x265_2.2.tar.gz/source/encoder/rdcost.h -> x265_2.3.tar.gz/source/encoder/rdcost.h
Changed
34
1
2
uint32_t m_chromaDistWeight[2];
3
uint32_t m_psyRdBase;
4
uint32_t m_psyRd;
5
+ uint32_t m_ssimRd;
6
int m_qp; /* QP used to configure lambda, may be higher than QP_MAX_SPEC but <= QP_MAX_MAX */
7
8
void setPsyRdScale(double scale) { m_psyRdBase = (uint32_t)floor(65536.0 * scale * 0.33); }
9
+ void setSsimRd(int ssimRd) { m_ssimRd = ssimRd; };
10
11
void setQP(const Slice& slice, int qp)
12
{
13
14
return distortion + ((m_lambda * m_psyRd * psycost) >> 24) + ((bits * m_lambda2) >> 8);
15
}
16
17
+ inline uint64_t calcSsimRdCost(uint64_t distortion, uint32_t bits, uint32_t ssimCost) const
18
+ {
19
+#if X265_DEPTH < 10
20
+ X265_CHECK((bits <= (UINT64_MAX / m_lambda2)) && (ssimCost <= UINT64_MAX / m_lambda),
21
+ "calcPsyRdCost wrap detected dist: " X265_LL " bits: %u, lambda: " X265_LL ", lambda2: " X265_LL "\n",
22
+ distortion, bits, m_lambda, m_lambda2);
23
+#else
24
+ X265_CHECK((bits <= (UINT64_MAX / m_lambda2)) && (ssimCost <= UINT64_MAX / m_lambda),
25
+ "calcPsyRdCost wrap detected dist: " X265_LL ", bits: %u, lambda: " X265_LL ", lambda2: " X265_LL "\n",
26
+ distortion, bits, m_lambda, m_lambda2);
27
+#endif
28
+ return distortion + ((m_lambda * ssimCost) >> 14) + ((bits * m_lambda2) >> 8);
29
+ }
30
+
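The return expression of calcSsimRdCost combines three terms in fixed point: the distortion, the SSIM cost weighted by lambda and shifted right by 14, and the bit cost weighted by lambda2 and shifted right by 8. The sketch below restates that formula as a free function so the scaling is easy to check by hand. The function name and the lambda values used with it are hypothetical, and the overflow checks from the hunk are omitted.

```cpp
#include <cstdint>

// distortion + (lambda * ssimCost) / 2^14 + (bits * lambda2) / 2^8,
// evaluated with 64-bit integer arithmetic.
uint64_t ssimRdCost(uint64_t distortion, uint32_t bits, uint32_t ssimCost,
                    uint64_t lambda, uint64_t lambda2)
{
    const uint64_t ssimTerm = (lambda * ssimCost) >> 14;
    const uint64_t rateTerm = (static_cast<uint64_t>(bits) * lambda2) >> 8;
    return distortion + ssimTerm + rateTerm;
}
```

Choosing lambda = 2^14 and lambda2 = 2^8 makes both weights equal to one, so `ssimRdCost(100, 10, 5, 16384, 256)` reduces to 100 + 5 + 10.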
31
inline uint64_t calcRdSADCost(uint32_t sadCost, uint32_t bits) const
32
{
33
X265_CHECK(bits <= (UINT64_MAX - 128) / m_lambda,
34
x265_2.2.tar.gz/source/encoder/search.cpp -> x265_2.3.tar.gz/source/encoder/search.cpp
Changed
506
m_numLayers = g_log2Size[param.maxCUSize] - 2;

m_rdCost.setPsyRdScale(param.psyRd);
+ m_rdCost.setSsimRd(param.bSsimRd);
m_me.init(param.internalCsp);

bool ok = m_quant.init(param.psyRdoq, scalingList, m_entropyCoder);

fullCost.energy = m_rdCost.psyCost(sizeIdx, fenc, mode.fencYuv->m_size, reconQt, reconQtStride);
fullCost.rdcost = m_rdCost.calcPsyRdCost(fullCost.distortion, fullCost.bits, fullCost.energy);
}
+ else if(m_rdCost.m_ssimRd)
+ {
+ fullCost.energy = m_quant.ssimDistortion(cu, fenc, stride, reconQt, reconQtStride, log2TrSize, TEXT_LUMA, absPartIdx);
+ fullCost.rdcost = m_rdCost.calcSsimRdCost(fullCost.distortion, fullCost.bits, fullCost.energy);
+ }
else
fullCost.rdcost = m_rdCost.calcRdCost(fullCost.distortion, fullCost.bits);
}

if (m_rdCost.m_psyRd)
splitCost.rdcost = m_rdCost.calcPsyRdCost(splitCost.distortion, splitCost.bits, splitCost.energy);
+ else if(m_rdCost.m_ssimRd)
+ splitCost.rdcost = m_rdCost.calcSsimRdCost(splitCost.distortion, splitCost.bits, splitCost.energy);
else
splitCost.rdcost = m_rdCost.calcRdCost(splitCost.distortion, splitCost.bits);
}

tmpEnergy = m_rdCost.psyCost(sizeIdx, fenc, fencYuv->m_size, tmpRecon, tmpReconStride);
tmpCost = m_rdCost.calcPsyRdCost(tmpDist, tmpBits, tmpEnergy);
}
+ else if(m_rdCost.m_ssimRd)
+ {
+ tmpEnergy = m_quant.ssimDistortion(cu, fenc, stride, tmpRecon, tmpReconStride, log2TrSize, TEXT_LUMA, absPartIdx);
+ tmpCost = m_rdCost.calcSsimRdCost(tmpDist, tmpBits, tmpEnergy);
+ }
else
tmpCost = m_rdCost.calcRdCost(tmpDist, tmpBits);

if (m_rdCost.m_psyRd)
outCost.energy += m_rdCost.psyCost(sizeIdxC, fenc, stride, reconQt, reconQtStride);
+ else if(m_rdCost.m_ssimRd)
+ outCost.energy += m_quant.ssimDistortion(cu, fenc, stride, reconQt, reconQtStride, log2TrSizeC, ttype, absPartIdxC);

primitives.cu[sizeIdxC].copy_pp(picReconC, picStride, reconQt, reconQtStride);
}

tmpEnergy = m_rdCost.psyCost(sizeIdxC, fenc, stride, reconQt, reconQtStride);
tmpCost = m_rdCost.calcPsyRdCost(tmpDist, tmpBits, tmpEnergy);
}
+ else if(m_rdCost.m_ssimRd)
+ {
+ tmpEnergy = m_quant.ssimDistortion(cu, fenc, stride, reconQt, reconQtStride, log2TrSizeC, ttype, absPartIdxC);
+ tmpCost = m_rdCost.calcSsimRdCost(tmpDist, tmpBits, tmpEnergy);
+ }
else
tmpCost = m_rdCost.calcRdCost(tmpDist, tmpBits);

}
else
intraMode.distortion += intraMode.lumaDistortion;
-
+ cu.m_distortion[0] = intraMode.distortion;
m_entropyCoder.resetBits();
if (m_slice->m_pps->bTransquantBypassEnabled)
m_entropyCoder.codeCUTransquantBypassFlag(cu.m_tqBypass[0]);

m_entropyCoder.store(intraMode.contexts);
intraMode.totalBits = m_entropyCoder.getNumberOfWrittenBits();
intraMode.coeffBits = intraMode.totalBits - intraMode.mvBits - skipFlagBits;
+ const Yuv* fencYuv = intraMode.fencYuv;
if (m_rdCost.m_psyRd)
- {
- const Yuv* fencYuv = intraMode.fencYuv;
intraMode.psyEnergy = m_rdCost.psyCost(cuGeom.log2CUSize - 2, fencYuv->m_buf[0], fencYuv->m_size, intraMode.reconYuv.m_buf[0], intraMode.reconYuv.m_size);
- }
+ else if(m_rdCost.m_ssimRd)
+ intraMode.ssimEnergy = m_quant.ssimDistortion(cu, fencYuv->m_buf[0], fencYuv->m_size, intraMode.reconYuv.m_buf[0], intraMode.reconYuv.m_size, cuGeom.log2CUSize, TEXT_LUMA, 0);
+
intraMode.resEnergy = primitives.cu[cuGeom.log2CUSize - 2].sse_pp(intraMode.fencYuv->m_buf[0], intraMode.fencYuv->m_size, intraMode.predYuv.m_buf[0], intraMode.predYuv.m_size);

updateModeCost(intraMode);

intraMode.totalBits = m_entropyCoder.getNumberOfWrittenBits();
intraMode.coeffBits = intraMode.totalBits - intraMode.mvBits - skipFlagBits;
+ const Yuv* fencYuv = intraMode.fencYuv;
if (m_rdCost.m_psyRd)
- {
- const Yuv* fencYuv = intraMode.fencYuv;
intraMode.psyEnergy = m_rdCost.psyCost(cuGeom.log2CUSize - 2, fencYuv->m_buf[0], fencYuv->m_size, reconYuv->m_buf[0], reconYuv->m_size);
- }
- intraMode.resEnergy = primitives.cu[cuGeom.log2CUSize - 2].sse_pp(intraMode.fencYuv->m_buf[0], intraMode.fencYuv->m_size, intraMode.predYuv.m_buf[0], intraMode.predYuv.m_size);
+ else if(m_rdCost.m_ssimRd)
+ intraMode.ssimEnergy = m_quant.ssimDistortion(cu, fencYuv->m_buf[0], fencYuv->m_size, reconYuv->m_buf[0], reconYuv->m_size, cuGeom.log2CUSize, TEXT_LUMA, 0);
+
+ intraMode.resEnergy = primitives.cu[cuGeom.log2CUSize - 2].sse_pp(fencYuv->m_buf[0], fencYuv->m_size, intraMode.predYuv.m_buf[0], intraMode.predYuv.m_size);
m_entropyCoder.store(intraMode.contexts);
updateModeCost(intraMode);
checkDQP(intraMode, cuGeom);

codeCoeffQTChroma(cu, initTuDepth, absPartIdxC, TEXT_CHROMA_U);
codeCoeffQTChroma(cu, initTuDepth, absPartIdxC, TEXT_CHROMA_V);
uint32_t bits = m_entropyCoder.getNumberOfWrittenBits();
- uint64_t cost = m_rdCost.m_psyRd ? m_rdCost.calcPsyRdCost(outCost.distortion, bits, outCost.energy)
+ uint64_t cost = m_rdCost.m_psyRd ? m_rdCost.calcPsyRdCost(outCost.distortion, bits, outCost.energy) : m_rdCost.m_ssimRd ? m_rdCost.calcSsimRdCost(outCost.distortion, bits, outCost.energy)
: m_rdCost.calcRdCost(outCost.distortion, bits);

if (cost < bestCost)

cu.getNeighbourMV(puIdx, pu.puAbsPartIdx, interMode.interNeighbours);

/* Uni-directional prediction */
- if (m_param->analysisMode == X265_ANALYSIS_LOAD)
+ if (m_param->analysisMode == X265_ANALYSIS_LOAD || (m_param->analysisMultiPassRefine && m_param->rc.bStatRead))
{
for (int list = 0; list < numPredDir; list++)
{

m_me.integral[planes] = interMode.fencYuv->m_integral[list][ref][planes] + puX * pu.width + puY * pu.height * m_slice->m_refFrameList[list][ref]->m_reconPic->m_stride;
}
setSearchRange(cu, mvp, m_param->searchRange, mvmin, mvmax);
- int satdCost = m_me.motionEstimate(&slice->m_mref[list][ref], mvmin, mvmax, mvp, numMvc, mvc, m_param->searchRange, outmv,
+ MV mvpIn = mvp;
+ if (m_param->analysisMultiPassRefine && m_param->rc.bStatRead && mvpIdx == bestME[list].mvpIdx)
+ mvpIn = bestME[list].mv;
+
+ int satdCost = m_me.motionEstimate(&slice->m_mref[list][ref], mvmin, mvmax, mvpIn, numMvc, mvc, m_param->searchRange, outmv,
m_param->bSourceReferenceEstimation ? m_slice->m_refFrameList[list][ref]->m_fencPic->getLumaAddr(0) : 0);

/* Get total cost of partition, but only include MV bit cost once */

uint32_t cost = (satdCost - mvCost) + m_rdCost.getCost(bits);

/* Refine MVP selection, updates: mvpIdx, bits, cost */
- mvp = checkBestMVP(amvp, outmv, mvpIdx, bits, cost);
+ if (!m_param->analysisMultiPassRefine)
+ mvp = checkBestMVP(amvp, outmv, mvpIdx, bits, cost);
+ else
+ {
+ /* It is more accurate to compare with actual mvp that was used in motionestimate than amvp[mvpIdx]. Here
+ the actual mvp is bestME from pass 1 for that mvpIdx */
+ int diffBits = m_me.bitcost(outmv, amvp[!mvpIdx]) - m_me.bitcost(outmv, mvpIn);
+ if (diffBits < 0)
+ {
+ mvpIdx = !mvpIdx;
+ uint32_t origOutBits = bits;
+ bits = origOutBits + diffBits;
+ cost = (cost - m_rdCost.getCost(origOutBits)) + m_rdCost.getCost(bits);
+ }
+ mvp = amvp[mvpIdx];
+ }

if (cost < bestME[list].cost)
{

interMode.chromaDistortion += m_rdCost.scaleChromaDist(2, primitives.chroma[m_csp].cu[part].sse_pp(fencYuv->m_buf[2], fencYuv->m_csize, reconYuv->m_buf[2], reconYuv->m_csize));
interMode.distortion += interMode.chromaDistortion;
}
+ cu.m_distortion[0] = interMode.distortion;
m_entropyCoder.load(m_rqt[depth].cur);
m_entropyCoder.resetBits();
if (m_slice->m_pps->bTransquantBypassEnabled)

interMode.totalBits = interMode.mvBits + skipFlagBits;
if (m_rdCost.m_psyRd)
interMode.psyEnergy = m_rdCost.psyCost(part, fencYuv->m_buf[0], fencYuv->m_size, reconYuv->m_buf[0], reconYuv->m_size);
+ else if(m_rdCost.m_ssimRd)
+ interMode.ssimEnergy = m_quant.ssimDistortion(cu, fencYuv->m_buf[0], fencYuv->m_size, reconYuv->m_buf[0], reconYuv->m_size, cu.m_log2CUSize[0], TEXT_LUMA, 0);
+
interMode.resEnergy = primitives.cu[part].sse_pp(fencYuv->m_buf[0], fencYuv->m_size, predYuv->m_buf[0], predYuv->m_size);
updateModeCost(interMode);
m_entropyCoder.store(interMode.contexts);

m_entropyCoder.codeQtRootCbfZero();
uint32_t cbf0Bits = m_entropyCoder.getNumberOfWrittenBits();

- uint64_t cbf0Cost;
- uint32_t cbf0Energy;
+ uint32_t cbf0Energy; uint64_t cbf0Cost;
if (m_rdCost.m_psyRd)
{
cbf0Energy = m_rdCost.psyCost(log2CUSize - 2, fencYuv->m_buf[0], fencYuv->m_size, predYuv->m_buf[0], predYuv->m_size);
cbf0Cost = m_rdCost.calcPsyRdCost(cbf0Dist, cbf0Bits, cbf0Energy);
}
+ else if(m_rdCost.m_ssimRd)
+ {
+ cbf0Energy = m_quant.ssimDistortion(cu, fencYuv->m_buf[0], fencYuv->m_size, predYuv->m_buf[0], predYuv->m_size, log2CUSize, TEXT_LUMA, 0);
+ cbf0Cost = m_rdCost.calcSsimRdCost(cbf0Dist, cbf0Bits, cbf0Energy);
+ }
else
cbf0Cost = m_rdCost.calcRdCost(cbf0Dist, cbf0Bits);

}
if (m_rdCost.m_psyRd)
interMode.psyEnergy = m_rdCost.psyCost(sizeIdx, fencYuv->m_buf[0], fencYuv->m_size, reconYuv->m_buf[0], reconYuv->m_size);
+ else if(m_rdCost.m_ssimRd)
+ interMode.ssimEnergy = m_quant.ssimDistortion(cu, fencYuv->m_buf[0], fencYuv->m_size, reconYuv->m_buf[0], reconYuv->m_size, cu.m_log2CUSize[0], TEXT_LUMA, 0);
+
interMode.resEnergy = primitives.cu[sizeIdx].sse_pp(fencYuv->m_buf[0], fencYuv->m_size, predYuv->m_buf[0], predYuv->m_size);
interMode.totalBits = bits;
interMode.lumaDistortion = bestLumaDist;
interMode.coeffBits = coeffBits;
interMode.mvBits = mvBits;
+ cu.m_distortion[0] = interMode.distortion;
updateModeCost(interMode);
checkDQP(interMode, cuGeom);
}

}
}

-uint64_t Search::estimateNullCbfCost(sse_t dist, uint32_t psyEnergy, uint32_t tuDepth, TextType compId)
+uint64_t Search::estimateNullCbfCost(sse_t dist, uint32_t energy, uint32_t tuDepth, TextType compId)
{
uint32_t nullBits = m_entropyCoder.estimateCbfBits(0, compId, tuDepth);

if (m_rdCost.m_psyRd)
- return m_rdCost.calcPsyRdCost(dist, nullBits, psyEnergy);
+ return m_rdCost.calcPsyRdCost(dist, nullBits, energy);
+ else if(m_rdCost.m_ssimRd)
+ return m_rdCost.calcSsimRdCost(dist, nullBits, energy);
else
return m_rdCost.calcRdCost(dist, nullBits);
}

if (m_rdCost.m_psyRd)
splitCost.rdcost = m_rdCost.calcPsyRdCost(splitCost.distortion, splitCost.bits, splitCost.energy);
+ else if(m_rdCost.m_ssimRd)
+ splitCost.rdcost = m_rdCost.calcSsimRdCost(splitCost.distortion, splitCost.bits, splitCost.energy);
else
splitCost.rdcost = m_rdCost.calcRdCost(splitCost.distortion, splitCost.bits);

uint32_t numSig[MAX_NUM_COMPONENT][2 /*0 = top (or whole TU for non-4:2:2) sub-TU, 1 = bottom sub-TU*/] = { { 0, 0 }, {0, 0}, {0, 0} };
uint32_t singleBits[MAX_NUM_COMPONENT][2 /*0 = top (or whole TU for non-4:2:2) sub-TU, 1 = bottom sub-TU*/] = { { 0, 0 }, { 0, 0 }, { 0, 0 } };
sse_t singleDist[MAX_NUM_COMPONENT][2 /*0 = top (or whole TU for non-4:2:2) sub-TU, 1 = bottom sub-TU*/] = { { 0, 0 }, { 0, 0 }, { 0, 0 } };
- uint32_t singlePsyEnergy[MAX_NUM_COMPONENT][2 /*0 = top (or whole TU for non-4:2:2) sub-TU, 1 = bottom sub-TU*/] = { { 0, 0 }, { 0, 0 }, { 0, 0 } };
+ uint32_t singleEnergy[MAX_NUM_COMPONENT][2 /*0 = top (or whole TU for non-4:2:2) sub-TU, 1 = bottom sub-TU*/] = { { 0, 0 }, { 0, 0 }, { 0, 0 } };
uint32_t bestTransformMode[MAX_NUM_COMPONENT][2 /*0 = top (or whole TU for non-4:2:2) sub-TU, 1 = bottom sub-TU*/] = { { 0, 0 }, { 0, 0 }, { 0, 0 } };
uint64_t minCost[MAX_NUM_COMPONENT][2 /*0 = top (or whole TU for non-4:2:2) sub-TU, 1 = bottom sub-TU*/] = { { MAX_INT64, MAX_INT64 }, {MAX_INT64, MAX_INT64}, {MAX_INT64, MAX_INT64} };

//Assuming zero residual
sse_t zeroDistY = primitives.cu[partSize].sse_pp(fenc, fencYuv->m_size, mode.predYuv.getLumaAddr(absPartIdx), mode.predYuv.m_size);
- uint32_t zeroPsyEnergyY = 0;
+ uint32_t zeroEnergyY = 0;
if (m_rdCost.m_psyRd)
- zeroPsyEnergyY = m_rdCost.psyCost(partSize, fenc, fencYuv->m_size, mode.predYuv.getLumaAddr(absPartIdx), mode.predYuv.m_size);
+ zeroEnergyY = m_rdCost.psyCost(partSize, fenc, fencYuv->m_size, mode.predYuv.getLumaAddr(absPartIdx), mode.predYuv.m_size);
+ else if(m_rdCost.m_ssimRd)
+ zeroEnergyY = m_quant.ssimDistortion(cu, fenc, fencYuv->m_size, mode.predYuv.getLumaAddr(absPartIdx), mode.predYuv.m_size, log2TrSize, TEXT_LUMA, absPartIdx);

int16_t* curResiY = m_rqt[qtLayer].resiQtYuv.getLumaAddr(absPartIdx);
uint32_t strideResiY = m_rqt[qtLayer].resiQtYuv.m_size;

const sse_t nonZeroDistY = primitives.cu[partSize].sse_pp(fenc, fencYuv->m_size, curReconY, strideReconY);
uint32_t nzCbfBitsY = m_entropyCoder.estimateCbfBits(cbfFlag[TEXT_LUMA][0], TEXT_LUMA, tuDepth);
- uint32_t nonZeroPsyEnergyY = 0; uint64_t singleCostY = 0;
+ uint32_t nonZeroEnergyY = 0; uint64_t singleCostY = 0;
if (m_rdCost.m_psyRd)
{
- nonZeroPsyEnergyY = m_rdCost.psyCost(partSize, fenc, fencYuv->m_size, curReconY, strideReconY);
- singleCostY = m_rdCost.calcPsyRdCost(nonZeroDistY, nzCbfBitsY + singleBits[TEXT_LUMA][0], nonZeroPsyEnergyY);
+ nonZeroEnergyY = m_rdCost.psyCost(partSize, fenc, fencYuv->m_size, curReconY, strideReconY);
+ singleCostY = m_rdCost.calcPsyRdCost(nonZeroDistY, nzCbfBitsY + singleBits[TEXT_LUMA][0], nonZeroEnergyY);
+ }
+ else if(m_rdCost.m_ssimRd)
+ {
+ nonZeroEnergyY = m_quant.ssimDistortion(cu, fenc, fencYuv->m_size, curReconY, strideReconY, log2TrSize, TEXT_LUMA, absPartIdx);
+ singleCostY = m_rdCost.calcSsimRdCost(nonZeroDistY, nzCbfBitsY + singleBits[TEXT_LUMA][0], nonZeroEnergyY);
}
else
singleCostY = m_rdCost.calcRdCost(nonZeroDistY, nzCbfBitsY + singleBits[TEXT_LUMA][0]);

if (cu.m_tqBypass[0])
{
singleDist[TEXT_LUMA][0] = nonZeroDistY;
- singlePsyEnergy[TEXT_LUMA][0] = nonZeroPsyEnergyY;
+ singleEnergy[TEXT_LUMA][0] = nonZeroEnergyY;
}
else
{
// zero-cost calculation for luma. This is an approximation
// Initial cost calculation was also an approximation. First resetting the bit counter and then encoding zero cbf.
// Now encoding the zero cbf without writing into bitstream, keeping m_fracBits unchanged. The same is valid for chroma.
- uint64_t nullCostY = estimateNullCbfCost(zeroDistY, zeroPsyEnergyY, tuDepth, TEXT_LUMA);
+ uint64_t nullCostY = estimateNullCbfCost(zeroDistY, zeroEnergyY, tuDepth, TEXT_LUMA);

if (nullCostY < singleCostY)
{

if (checkTransformSkipY)
minCost[TEXT_LUMA][0] = nullCostY;
singleDist[TEXT_LUMA][0] = zeroDistY;
- singlePsyEnergy[TEXT_LUMA][0] = zeroPsyEnergyY;
+ singleEnergy[TEXT_LUMA][0] = zeroEnergyY;
}
else
{
if (checkTransformSkipY)
minCost[TEXT_LUMA][0] = singleCostY;
singleDist[TEXT_LUMA][0] = nonZeroDistY;
- singlePsyEnergy[TEXT_LUMA][0] = nonZeroPsyEnergyY;
+ singleEnergy[TEXT_LUMA][0] = nonZeroEnergyY;
}
}
}
else
{
if (checkTransformSkipY)
- minCost[TEXT_LUMA][0] = estimateNullCbfCost(zeroDistY, zeroPsyEnergyY, tuDepth, TEXT_LUMA);
+ minCost[TEXT_LUMA][0] = estimateNullCbfCost(zeroDistY, zeroEnergyY, tuDepth, TEXT_LUMA);
primitives.cu[partSize].blockfill_s(curResiY, strideResiY, 0);
singleDist[TEXT_LUMA][0] = zeroDistY;
singleBits[TEXT_LUMA][0] = 0;
- singlePsyEnergy[TEXT_LUMA][0] = zeroPsyEnergyY;
+ singleEnergy[TEXT_LUMA][0] = zeroEnergyY;
}

cu.setCbfSubParts(cbfFlag[TEXT_LUMA][0] << tuDepth, TEXT_LUMA, absPartIdx, depth);

for (uint32_t chromaId = TEXT_CHROMA_U; chromaId <= TEXT_CHROMA_V; chromaId++)
{
sse_t zeroDistC = 0;
- uint32_t zeroPsyEnergyC = 0;
+ uint32_t zeroEnergyC = 0;
coeff_t* coeffCurC = m_rqt[qtLayer].coeffRQT[chromaId] + coeffOffsetC;
TURecurse tuIterator(splitIntoSubTUs ? VERTICAL_SPLIT : DONT_SPLIT, absPartIdxStep, absPartIdx);

int16_t* curResiC = m_rqt[qtLayer].resiQtYuv.getChromaAddr(chromaId, absPartIdxC);
zeroDistC = m_rdCost.scaleChromaDist(chromaId, primitives.cu[log2TrSizeC - 2].sse_pp(fenc, fencYuv->m_csize, mode.predYuv.getChromaAddr(chromaId, absPartIdxC), mode.predYuv.m_csize));

+ // Assuming zero residual
if (m_rdCost.m_psyRd)
- //Assuming zero residual
- zeroPsyEnergyC = m_rdCost.psyCost(partSizeC, fenc, fencYuv->m_csize, mode.predYuv.getChromaAddr(chromaId, absPartIdxC), mode.predYuv.m_csize);
+ zeroEnergyC = m_rdCost.psyCost(partSizeC, fenc, fencYuv->m_csize, mode.predYuv.getChromaAddr(chromaId, absPartIdxC), mode.predYuv.m_csize);
+ else if(m_rdCost.m_ssimRd)
+ zeroEnergyC = m_quant.ssimDistortion(cu, fenc, fencYuv->m_csize, mode.predYuv.getChromaAddr(chromaId, absPartIdxC), mode.predYuv.m_csize, log2TrSizeC, (TextType)chromaId, absPartIdxC);

if (cbfFlag[chromaId][tuIterator.section])
{

primitives.cu[partSizeC].add_ps(curReconC, strideReconC, mode.predYuv.getChromaAddr(chromaId, absPartIdxC), curResiC, mode.predYuv.m_csize, strideResiC);
sse_t nonZeroDistC = m_rdCost.scaleChromaDist(chromaId, primitives.cu[partSizeC].sse_pp(fenc, fencYuv->m_csize, curReconC, strideReconC));
uint32_t nzCbfBitsC = m_entropyCoder.estimateCbfBits(cbfFlag[chromaId][tuIterator.section], (TextType)chromaId, tuDepth);
- uint32_t nonZeroPsyEnergyC = 0; uint64_t singleCostC = 0;
+ uint32_t nonZeroEnergyC = 0; uint64_t singleCostC = 0;
if (m_rdCost.m_psyRd)
{
- nonZeroPsyEnergyC = m_rdCost.psyCost(partSizeC, fenc, fencYuv->m_csize, curReconC, strideReconC);
- singleCostC = m_rdCost.calcPsyRdCost(nonZeroDistC, nzCbfBitsC + singleBits[chromaId][tuIterator.section], nonZeroPsyEnergyC);
+ nonZeroEnergyC = m_rdCost.psyCost(partSizeC, fenc, fencYuv->m_csize, curReconC, strideReconC);
+ singleCostC = m_rdCost.calcPsyRdCost(nonZeroDistC, nzCbfBitsC + singleBits[chromaId][tuIterator.section], nonZeroEnergyC);
+ }
+ else if(m_rdCost.m_ssimRd)
+ {
+ nonZeroEnergyC = m_quant.ssimDistortion(cu, fenc, fencYuv->m_csize, curReconC, strideReconC, log2TrSizeC, (TextType)chromaId, absPartIdxC);
+ singleCostC = m_rdCost.calcSsimRdCost(nonZeroDistC, nzCbfBitsC + singleBits[chromaId][tuIterator.section], nonZeroEnergyC);
}
else
singleCostC = m_rdCost.calcRdCost(nonZeroDistC, nzCbfBitsC + singleBits[chromaId][tuIterator.section]);

if (cu.m_tqBypass[0])
{
singleDist[chromaId][tuIterator.section] = nonZeroDistC;
- singlePsyEnergy[chromaId][tuIterator.section] = nonZeroPsyEnergyC;
+ singleEnergy[chromaId][tuIterator.section] = nonZeroEnergyC;
}
else
{
//zero-cost calculation for chroma. This is an approximation
- uint64_t nullCostC = estimateNullCbfCost(zeroDistC, zeroPsyEnergyC, tuDepth, (TextType)chromaId);
+ uint64_t nullCostC = estimateNullCbfCost(zeroDistC, zeroEnergyC, tuDepth, (TextType)chromaId);

if (nullCostC < singleCostC)
{

if (checkTransformSkipC)
minCost[chromaId][tuIterator.section] = nullCostC;
singleDist[chromaId][tuIterator.section] = zeroDistC;
- singlePsyEnergy[chromaId][tuIterator.section] = zeroPsyEnergyC;
+ singleEnergy[chromaId][tuIterator.section] = zeroEnergyC;
}
else
{
if (checkTransformSkipC)
minCost[chromaId][tuIterator.section] = singleCostC;
singleDist[chromaId][tuIterator.section] = nonZeroDistC;
- singlePsyEnergy[chromaId][tuIterator.section] = nonZeroPsyEnergyC;
+ singleEnergy[chromaId][tuIterator.section] = nonZeroEnergyC;
}
}
}
else
{
if (checkTransformSkipC)
- minCost[chromaId][tuIterator.section] = estimateNullCbfCost(zeroDistC, zeroPsyEnergyC, tuDepthC, (TextType)chromaId);
+ minCost[chromaId][tuIterator.section] = estimateNullCbfCost(zeroDistC, zeroEnergyC, tuDepthC, (TextType)chromaId);
primitives.cu[partSizeC].blockfill_s(curResiC, strideResiC, 0);
singleBits[chromaId][tuIterator.section] = 0;
singleDist[chromaId][tuIterator.section] = zeroDistC;
- singlePsyEnergy[chromaId][tuIterator.section] = zeroPsyEnergyC;
+ singleEnergy[chromaId][tuIterator.section] = zeroEnergyC;
}

cu.setCbfPartRange(cbfFlag[chromaId][tuIterator.section] << tuDepth, (TextType)chromaId, absPartIdxC, tuIterator.absPartIdxStep);

if (checkTransformSkipY)
{
sse_t nonZeroDistY = 0;
- uint32_t nonZeroPsyEnergyY = 0;
+ uint32_t nonZeroEnergyY = 0;
uint64_t singleCostY = MAX_INT64;

m_entropyCoder.load(m_rqt[depth].rqtRoot);

if (m_rdCost.m_psyRd)
{
- nonZeroPsyEnergyY = m_rdCost.psyCost(partSize, fenc, fencYuv->m_size, m_tsRecon, trSize);
- singleCostY = m_rdCost.calcPsyRdCost(nonZeroDistY, skipSingleBitsY, nonZeroPsyEnergyY);
+ nonZeroEnergyY = m_rdCost.psyCost(partSize, fenc, fencYuv->m_size, m_tsRecon, trSize);
+ singleCostY = m_rdCost.calcPsyRdCost(nonZeroDistY, skipSingleBitsY, nonZeroEnergyY);
+ }
+ else if(m_rdCost.m_ssimRd)
+ {
+ nonZeroEnergyY = m_quant.ssimDistortion(cu, fenc, fencYuv->m_size, m_tsRecon, trSize, log2TrSize, TEXT_LUMA, absPartIdx);
+ singleCostY = m_rdCost.calcSsimRdCost(nonZeroDistY, skipSingleBitsY, nonZeroEnergyY);
}
else
singleCostY = m_rdCost.calcRdCost(nonZeroDistY, skipSingleBitsY);

else
{
singleDist[TEXT_LUMA][0] = nonZeroDistY;
- singlePsyEnergy[TEXT_LUMA][0] = nonZeroPsyEnergyY;
+ singleEnergy[TEXT_LUMA][0] = nonZeroEnergyY;
cbfFlag[TEXT_LUMA][0] = !!numSigTSkipY;
bestTransformMode[TEXT_LUMA][0] = 1;
if (m_param->limitTU)

if (codeChroma && checkTransformSkipC)
{
sse_t nonZeroDistC = 0;
- uint32_t nonZeroPsyEnergyC = 0;
+ uint32_t nonZeroEnergyC = 0;
uint64_t singleCostC = MAX_INT64;
uint32_t strideResiC = m_rqt[qtLayer].resiQtYuv.m_csize;
uint32_t coeffOffsetC = coeffOffsetY >> (m_hChromaShift + m_vChromaShift);

nonZeroDistC = m_rdCost.scaleChromaDist(chromaId, primitives.cu[partSizeC].sse_pp(fenc, fencYuv->m_csize, m_tsRecon, trSizeC));
if (m_rdCost.m_psyRd)
{
-
- nonZeroPsyEnergyC = m_rdCost.psyCost(partSizeC, fenc, fencYuv->m_csize, m_tsRecon, trSizeC);
- singleCostC = m_rdCost.calcPsyRdCost(nonZeroDistC, singleBits[chromaId][tuIterator.section], nonZeroPsyEnergyC);
+ nonZeroEnergyC = m_rdCost.psyCost(partSizeC, fenc, fencYuv->m_csize, m_tsRecon, trSizeC);
+ singleCostC = m_rdCost.calcPsyRdCost(nonZeroDistC, singleBits[chromaId][tuIterator.section], nonZeroEnergyC);
+ }
+ else if(m_rdCost.m_ssimRd)
+ {
+ nonZeroEnergyC = m_quant.ssimDistortion(cu, fenc, mode.fencYuv->m_csize, m_tsRecon, trSizeC, log2TrSizeC, (TextType)chromaId, absPartIdxC);
+ singleCostC = m_rdCost.calcSsimRdCost(nonZeroDistC, singleBits[chromaId][tuIterator.section], nonZeroEnergyC);
}
else
singleCostC = m_rdCost.calcRdCost(nonZeroDistC, singleBits[chromaId][tuIterator.section]);

else
{
singleDist[chromaId][tuIterator.section] = nonZeroDistC;
- singlePsyEnergy[chromaId][tuIterator.section] = nonZeroPsyEnergyC;
+ singleEnergy[chromaId][tuIterator.section] = nonZeroEnergyC;
cbfFlag[chromaId][tuIterator.section] = !!numSigTSkipC;
bestTransformMode[chromaId][tuIterator.section] = 1;
uint32_t numCoeffC = 1 << (log2TrSizeC << 1);

fullCost.bits = bSplitPresentFlag ? cbfBits + coeffBits : coeffBits;

fullCost.distortion += singleDist[TEXT_LUMA][0];
- fullCost.energy += singlePsyEnergy[TEXT_LUMA][0];// need to check we need to add chroma also
+ fullCost.energy += singleEnergy[TEXT_LUMA][0];// need to check we need to add chroma also
for (uint32_t subTUIndex = 0; subTUIndex < 2; subTUIndex++)
{
fullCost.distortion += singleDist[TEXT_CHROMA_U][subTUIndex];

if (m_rdCost.m_psyRd)
fullCost.rdcost = m_rdCost.calcPsyRdCost(fullCost.distortion, fullCost.bits, fullCost.energy);
+ else if(m_rdCost.m_ssimRd)
+ fullCost.rdcost = m_rdCost.calcSsimRdCost(fullCost.distortion, fullCost.bits, fullCost.energy);
else
fullCost.rdcost = m_rdCost.calcRdCost(fullCost.distortion, fullCost.bits);
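The multi-pass branch added to the MVP refinement step in search.cpp replaces checkBestMVP with a direct comparison: if signalling the found motion vector against the other AMVP candidate costs fewer bits than against the predictor that was actually used (mvpIn), the index flips and both the bit count and the cost are adjusted by the difference. A rough sketch of that bookkeeping follows. The bitcost() and getCost() bodies are placeholder models chosen only so the example runs, and Refined and refineMvp are hypothetical names, none of them taken from x265.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

struct MV { int x, y; };

// Placeholder bit-cost model (assumption): grows with distance from the predictor.
inline int bitcost(MV mv, MV pred)
{
    return std::abs(mv.x - pred.x) + std::abs(mv.y - pred.y) + 2;
}

// Placeholder for m_rdCost.getCost (assumption): a fixed per-bit weight.
inline uint32_t getCost(uint32_t bits) { return bits * 4; }

struct Refined { int mvpIdx; uint32_t bits; uint32_t cost; MV mvp; };

// Mirrors the added else-branch: compare the other candidate against the
// predictor actually used, and re-price bits and cost if it is cheaper.
inline Refined refineMvp(const MV amvp[2], MV outmv, MV mvpIn,
                         int mvpIdx, uint32_t bits, uint32_t cost)
{
    int diffBits = bitcost(outmv, amvp[!mvpIdx]) - bitcost(outmv, mvpIn);
    if (diffBits < 0)
    {
        mvpIdx = !mvpIdx;
        uint32_t origOutBits = bits;
        bits = origOutBits + diffBits;                       // diffBits is negative here
        cost = (cost - getCost(origOutBits)) + getCost(bits); // swap the bit-cost component
    }
    return { mvpIdx, bits, cost, amvp[mvpIdx] };
}
```

Under these placeholder models, a vector (8,7) searched from predictor (0,0) but lying next to candidate (8,8) switches to index 1 and sheds 14 bits.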
x265_2.2.tar.gz/source/encoder/search.h -> x265_2.3.tar.gz/source/encoder/search.h
Changed
45
uint64_t sa8dCost; // sum of partition sa8d distortion costs (sa8d(fenc, pred) + lambda * bits)
uint32_t sa8dBits; // signal bits used in sa8dCost calculation
uint32_t psyEnergy; // sum of partition psycho-visual energy difference
+ uint32_t ssimEnergy;
sse_t resEnergy; // sum of partition residual energy after motion prediction
sse_t lumaDistortion;
sse_t chromaDistortion;

sa8dCost = 0;
sa8dBits = 0;
psyEnergy = 0;
+ ssimEnergy = 0;
resEnergy = 0;
lumaDistortion = 0;
chromaDistortion = 0;

sa8dCost += subMode.sa8dCost;
sa8dBits += subMode.sa8dBits;
psyEnergy += subMode.psyEnergy;
+ ssimEnergy += subMode.ssimEnergy;
resEnergy += subMode.resEnergy;
lumaDistortion += subMode.lumaDistortion;
chromaDistortion += subMode.chromaDistortion;

Entropy rqtStore[NUM_SUBPART];
} m_cacheTU;

- uint64_t estimateNullCbfCost(sse_t dist, uint32_t psyEnergy, uint32_t tuDepth, TextType compId);
+ uint64_t estimateNullCbfCost(sse_t dist, uint32_t energy, uint32_t tuDepth, TextType compId);
bool splitTU(Mode& mode, const CUGeom& cuGeom, uint32_t absPartIdx, uint32_t tuDepth, ShortYuv& resiYuv, Cost& splitCost, const uint32_t depthRange[2], int32_t splitMore);
void estimateResidualQT(Mode& mode, const CUGeom& cuGeom, uint32_t absPartIdx, uint32_t depth, ShortYuv& resiYuv, Cost& costs, const uint32_t depthRange[2], int32_t splitMore = -1);

// get most probable luma modes for CU part, and bit cost of all non mpm modes
uint32_t getIntraRemModeBits(CUData & cu, uint32_t absPartIdx, uint32_t mpmModes[3], uint64_t& mpms) const;

- void updateModeCost(Mode& m) const { m.rdCost = m_rdCost.m_psyRd ? m_rdCost.calcPsyRdCost(m.distortion, m.totalBits, m.psyEnergy) : m_rdCost.calcRdCost(m.distortion, m.totalBits); }
+ void updateModeCost(Mode& m) const { m.rdCost = m_rdCost.m_psyRd ? m_rdCost.calcPsyRdCost(m.distortion, m.totalBits, m.psyEnergy)
+ : (m_rdCost.m_ssimRd ? m_rdCost.calcSsimRdCost(m.distortion, m.totalBits, m.ssimEnergy)
+ : m_rdCost.calcRdCost(m.distortion, m.totalBits)); }
};
}
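The rewritten updateModeCost, like every else-if added in search.cpp, checks m_psyRd first and only falls through to m_ssimRd when psy-rd is off. That ordering is consistent with the new regression test pairing --ssim-rd with --no-psy-rd. The precedence can be sketched in isolation as below. CostKind and selectCostKind are hypothetical names introduced for this illustration, and the sketch only reports which cost function would be used rather than computing a cost.

```cpp
#include <cassert>

// Hypothetical tag for the three cost paths seen in the diff.
enum class CostKind { Psy, Ssim, Plain };

// Same precedence as the nested conditional in updateModeCost:
// psy-rd wins, SSIM-rd is consulted only when psy-rd is disabled,
// and plain RD cost is the fallback.
inline CostKind selectCostKind(bool psyRd, bool ssimRd)
{
    return psyRd ? CostKind::Psy
                 : (ssimRd ? CostKind::Ssim : CostKind::Plain);
}
```

With both flags set the psy path is taken, so in this sketch the SSIM path is reachable only when psyRd is false.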
x265_2.2.tar.gz/source/encoder/slicetype.cpp -> x265_2.3.tar.gz/source/encoder/slicetype.cpp
Changed
153
m_lastKeyframe = -m_param->keyframeMax;
m_sliceTypeBusy = false;
m_fullQueueSize = X265_MAX(1, m_param->lookaheadDepth);
- m_bAdaptiveQuant = m_param->rc.aqMode || m_param->bEnableWeightedPred || m_param->bEnableWeightedBiPred;
+ m_bAdaptiveQuant = m_param->rc.aqMode || m_param->bEnableWeightedPred || m_param->bEnableWeightedBiPred || m_param->bAQMotion;

/* If we have a thread pool and are using --b-adapt 2, it is generally
* preferable to perform all motion searches for each lowres frame in large

if (wait)
m_outputSignal.wait();
}
+ if (m_pool && m_param->lookaheadThreads > 0)
+ {
+ for (int i = 0; i < m_numPools; i++)
+ m_pool[i].stopWorkers();
+ }
}
-
void Lookahead::destroy()
{
// these two queues will be empty unless the encode was aborted

}

X265_FREE(m_scratch);
-
delete [] m_tld;
+ if (m_param->lookaheadThreads > 0)
+ delete [] m_pool;
}
-
/* The synchronization of slicetypeDecide is managed here. The findJob() method
* polls the occupancy of the input queue. If the queue is
* full, it will run slicetypeDecide() and output a mini-gop of frames to the

uint32_t widthInLowresCu = (uint32_t)m_8x8Width, heightInLowresCu = (uint32_t)m_8x8Height;
double *qp_offset = 0;
/* Factor in qpoffsets based on Aq/Cutree in CU costs */
- if (m_param->rc.aqMode)
+ if (m_param->rc.aqMode || m_param->bAQMotion)
qp_offset = (frames[b]->sliceType == X265_TYPE_B || !m_param->rc.cuTree) ? frames[b]->qpAqOffset : frames[b]->qpCuTreeOffset;

for (uint32_t row = 0; row < numCuInHeight; row++)

CostEstimateGroup estGroup(*this, frames);
int64_t cost = estGroup.singleCost(p0, p1, b);

- if (m_param->rc.aqMode)
+ if (m_param->rc.aqMode || m_param->bAQMotion)
{
if (m_param->rc.cuTree)
return frameCostRecalculate(frames, p0, p1, b);

resetStart = bKeyframe ? 1 : 2;
}
+ if (m_param->bAQMotion)
+ aqMotion(frames, bKeyframe);

if (m_param->rc.cuTree)
cuTree(frames, X265_MIN(numFrames, m_param->keyframeMax), bKeyframe);

return cost;
}
+void Lookahead::aqMotion(Lowres **frames, bool bIntra)
+{
+ if (!bIntra)
+ {
+ int curnonb = 0, lastnonb = 1;
+ int bframes = 0, i = 1;
+ while (frames[lastnonb]->sliceType != X265_TYPE_P)
+ lastnonb++;
+ bframes = lastnonb - 1;
+ if (m_param->bBPyramid && bframes > 1)
+ {
+ int middle = (bframes + 1) / 2;
+ for (i = 1; i < lastnonb; i++)
+ {
+ int p0 = i > middle ? middle : curnonb;
+ int p1 = i < middle ? middle : lastnonb;
+ if (i != middle)
+ calcMotionAdaptiveQuantFrame(frames, p0, p1, i);
+ }
+ calcMotionAdaptiveQuantFrame(frames, curnonb, lastnonb, middle);
+ }
+ else
+ for (i = 1; i < lastnonb; i++)
+ calcMotionAdaptiveQuantFrame(frames, curnonb, lastnonb, i);
+ calcMotionAdaptiveQuantFrame(frames, curnonb, lastnonb, lastnonb);
+ }
+}
+
+void Lookahead::calcMotionAdaptiveQuantFrame(Lowres **frames, int p0, int p1, int b)
+{
+ int listDist[2] = { b - p0 - 1, p1 - b - 1 };
+ int32_t strideInCU = m_8x8Width;
+ double qp_adj = 0, avg_adj = 0, avg_adj_pow2 = 0, sd;
+ for (uint16_t blocky = 0; blocky < m_8x8Height; blocky++)
+ {
+ int cuIndex = blocky * strideInCU;
+ for (uint16_t blockx = 0; blockx < m_8x8Width; blockx++, cuIndex++)
+ {
+ int32_t lists_used = frames[b]->lowresCosts[b - p0][p1 - b][cuIndex] >> LOWRES_COST_SHIFT;
+ double displacement = 0;
+ for (uint16_t list = 0; list < 2; list++)
+ {
+ if ((lists_used >> list) & 1)
+ {
+ MV *mvs = frames[b]->lowresMvs[list][listDist[list]];
+ int32_t x = mvs[cuIndex].x;
+ int32_t y = mvs[cuIndex].y;
+ displacement += sqrt(pow(abs(x), 2) + pow(abs(y), 2));
+ }
+ else
+ displacement += 0.0;
+ }
+ if (lists_used == 3)
+ displacement = displacement / 2;
+ qp_adj = pow(displacement, 0.1);
+ frames[b]->qpAqMotionOffset[cuIndex] = qp_adj;
+ avg_adj += qp_adj;
+ avg_adj_pow2 += qp_adj * qp_adj;
+ }
+ }
+ avg_adj /= m_cuCount;
+ avg_adj_pow2 /= m_cuCount;
+ sd = sqrt((avg_adj_pow2 - (avg_adj * avg_adj)));
+ if (sd > 0)
+ {
+ for (uint16_t blocky = 0; blocky < m_8x8Height; blocky++)
+ {
+ int cuIndex = blocky * strideInCU;
+ for (uint16_t blockx = 0; blockx < m_8x8Width; blockx++, cuIndex++)
+ {
+ qp_adj = frames[b]->qpAqMotionOffset[cuIndex];
+ qp_adj = (qp_adj - avg_adj) / sd;
+ if (qp_adj > 1)
+ {
+ frames[b]->qpAqOffset[cuIndex] += qp_adj;
+ frames[b]->qpCuTreeOffset[cuIndex] += qp_adj;
+ frames[b]->invQscaleFactor[cuIndex] += x265_exp2fix8(qp_adj);
+ }
+ }
+ }
+ }
+}

void Lookahead::cuTree(Lowres **frames, int numframes, bool bIntra)
{
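The new calcMotionAdaptiveQuantFrame works in two passes: it first maps each block's motion-vector magnitude to displacement^0.1 while accumulating the mean and mean of squares, then standardizes each value and applies an offset only to blocks more than one standard deviation above the frame mean. A self-contained sketch of that normalization follows. The function name motionQpOffsets, the flat vector of per-block displacements, and the clamp on a slightly negative variance are assumptions of this sketch. It returns the offsets instead of adding them to qpAqOffset, qpCuTreeOffset and invQscaleFactor as the encoder does.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Given one motion displacement per block, return the extra QP offset per
// block. Blocks at or below one standard deviation above the mean get 0.
inline std::vector<double> motionQpOffsets(const std::vector<double>& displacement)
{
    const std::size_t n = displacement.size();
    std::vector<double> adj(n), offsets(n, 0.0);
    if (n == 0)
        return offsets;

    // Pass 1: compress the dynamic range and gather first and second moments.
    double avg = 0.0, avgPow2 = 0.0;
    for (std::size_t i = 0; i < n; i++)
    {
        adj[i] = std::pow(displacement[i], 0.1);
        avg += adj[i];
        avgPow2 += adj[i] * adj[i];
    }
    avg /= n;
    avgPow2 /= n;

    // Clamp guards against tiny negative values from rounding (sketch-only addition).
    const double sd = std::sqrt(std::max(0.0, avgPow2 - avg * avg));
    if (sd <= 0.0)
        return offsets;

    // Pass 2: standardize and keep only strong positive outliers.
    for (std::size_t i = 0; i < n; i++)
    {
        const double z = (adj[i] - avg) / sd;
        if (z > 1.0)
            offsets[i] = z;
    }
    return offsets;
}
```

For four static blocks and one block with displacement 1024, the adjusted values are 0, 0, 0, 0 and 2, giving a mean of 0.4 and a standard deviation of 0.8, so only the moving block receives an offset of about 2.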
x265_2.2.tar.gz/source/encoder/slicetype.h -> x265_2.3.tar.gz/source/encoder/slicetype.h
Changed
21
bool m_bBatchFrameCosts;
bool m_filled;
bool m_isSceneTransition;
+ int m_numPools;
Lookahead(x265_param *param, ThreadPool *pool);
-
#if DETAILED_CU_STATS
int64_t m_slicetypeDecideElapsedTime;
int64_t m_preLookaheadElapsedTime;

int64_t slicetypePathCost(Lowres **frames, char *path, int64_t threshold);
int64_t vbvFrameCost(Lowres **frames, int p0, int p1, int b);
void vbvLookahead(Lowres **frames, int numFrames, int keyframes);
-
+ void aqMotion(Lowres **frames, bool bintra);
+ void calcMotionAdaptiveQuantFrame(Lowres **frames, int p0, int p1, int b);
/* called by slicetypeAnalyse() to effect cuTree adjustments to adaptive
* quant offsets */
void cuTree(Lowres **frames, int numframes, bool bintra);
x265_2.2.tar.gz/source/test/rate-control-tests.txt -> x265_2.3.tar.gz/source/test/rate-control-tests.txt
Changed
BasketballDrive_1920x1080_50.y4m,--preset ultrafast --bitrate 3000 --vbv-bufsize 3000 --vbv-maxrate 3000 --no-wpp
big_buck_bunny_360p24.y4m,--preset medium --bitrate 400 --vbv-bufsize 600 --vbv-maxrate 600 --no-wpp --aud --hrd --tune fast-decode
sita_1920x1080_30.yuv,--preset superfast --bitrate 3000 --vbv-bufsize 3000 --vbv-maxrate 3000 --aud --strict-cbr --no-wpp
+sintel_trailer_2k_480p24.y4m, --preset slow --crf 24 --vbv-bufsize 150 --vbv-maxrate 150 --dynamic-rd 1.53

RaceHorses_416x240_30_10bit.yuv,--preset medium --crf 26 --vbv-maxrate 1000 --vbv-bufsize 1000 --pass 1,--preset fast --bitrate 1000 --vbv-maxrate 1000 --vbv-bufsize 700 --pass 3 -F4,--preset slow --bitrate 500 --vbv-maxrate 500 --vbv-bufsize 700 --pass 2 -F4
sita_1920x1080_30.yuv, --preset ultrafast --crf 20 --no-cutree --keyint 50 --min-keyint 50 --no-open-gop --pass 1 --vbv-bufsize 7000 --vbv-maxrate 5000, --preset ultrafast --crf 20 --no-cutree --keyint 50 --min-keyint 50 --no-open-gop --pass 2 --vbv-bufsize 7000 --vbv-maxrate 5000 --repeat-headers
sita_1920x1080_30.yuv, --preset medium --crf 20 --no-cutree --keyint 50 --min-keyint 50 --no-open-gop --pass 1 --vbv-bufsize 7000 --vbv-maxrate 5000 --repeat-headers --multi-pass-opt-rps, --preset medium --crf 20 --no-cutree --keyint 50 --min-keyint 50 --no-open-gop --pass 2 --vbv-bufsize 7000 --vbv-maxrate 5000 --repeat-headers --multi-pass-opt-rps
+
+# multi-pass rate control and analysis
+ducks_take_off_1080p50.y4m,--bitrate 6000 --pass 1 --multi-pass-opt-analysis --hash 1 --ssim --psnr, --bitrate 6000 --pass 2 --multi-pass-opt-analysis --hash 1 --ssim --psnr
+big_buck_bunny_360p24.y4m,--preset veryslow --bitrate 600 --pass 1 --multi-pass-opt-analysis --multi-pass-opt-distortion --hash 1 --ssim --psnr, --preset veryslow --bitrate 600 --pass 2 --multi-pass-opt-analysis --multi-pass-opt-distortion --hash 1 --ssim --psnr
+parkrun_ter_720p50.y4m, --bitrate 3500 --pass 1 --multi-pass-opt-distortion --hash 1 --ssim --psnr, --bitrate 3500 --pass 3 --multi-pass-opt-distortion --hash 1 --ssim --psnr, --bitrate 3500 --pass 2 --multi-pass-opt-distortion --hash 1 --ssim --psnr
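
Each entry in rate-control-tests.txt above appears to be one comma-separated line: the input clip first, then one command line per encoder pass. A minimal sketch of splitting such a line follows, assuming no option value ever contains a comma and that comment lines (starting with '#') and blank lines are filtered out beforehand. TestLine, trim and splitTestLine are hypothetical names, and this is not the parser the x265 test harness actually uses.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container: input clip plus one option string per pass.
struct TestLine
{
    std::string input;
    std::vector<std::string> passes;
};

// Strip leading and trailing spaces/tabs.
static std::string trim(const std::string& s)
{
    const char* ws = " \t";
    std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos)
        return "";
    std::size_t e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Split on commas: first field is the clip, the rest are per-pass options.
TestLine splitTestLine(const std::string& line)
{
    TestLine out;
    std::size_t start = 0;
    bool first = true;
    while (true)
    {
        std::size_t comma = line.find(',', start);
        std::size_t len = (comma == std::string::npos) ? std::string::npos
                                                       : comma - start;
        std::string field = trim(line.substr(start, len));
        if (first)
        {
            out.input = field;
            first = false;
        }
        else
            out.passes.push_back(field);
        if (comma == std::string::npos)
            break;
        start = comma + 1;
    }
    return out;
}
```

Applied to the three-pass parkrun_ter_720p50.y4m entry, this would yield three option strings in file order (pass 1, pass 3, pass 2).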
x265_2.2.tar.gz/source/test/regression-tests.txt -> x265_2.3.tar.gz/source/test/regression-tests.txt
Changed
CrowdRun_1920x1080_50_10bit_444.yuv,--preset veryfast --temporal-layers --repeat-headers --limit-refs 2
CrowdRun_1920x1080_50_10bit_444.yuv,--preset medium --dither --keyint -1 --rdoq-level 1 --limit-modes
CrowdRun_1920x1080_50_10bit_444.yuv,--preset veryslow --tskip --tskip-fast --no-scenecut --limit-tu 1
+CrowdRun_1920x1080_50_10bit_444.yuv,--preset veryslow --aq-mode 3 --aq-strength 1.5 --aq-motion --bitrate 5000
+CrowdRun_1920x1080_50_10bit_444.yuv,--preset veryslow --aq-mode 3 --aq-strength 1.5 --no-psy-rd --ssim-rd
DucksAndLegs_1920x1080_60_10bit_422.yuv,--preset superfast --weightp --qg-size 16
DucksAndLegs_1920x1080_60_10bit_422.yuv,--preset medium --tune psnr --bframes 16 --limit-modes
DucksAndLegs_1920x1080_60_10bit_422.yuv,--preset slow --temporal-layers --no-psy-rd --qg-size 32 --limit-refs 0 --cu-lossless

News-4k.y4m,--preset superfast --lookahead-slices 6 --aq-mode 0
News-4k.y4m,--preset superfast --slices 4 --aq-mode 0
News-4k.y4m,--preset medium --tune ssim --no-sao --qg-size 16
+News-4k.y4m,--preset slower --opt-cu-delta-qp
News-4k.y4m,--preset veryslow --no-rskip
News-4k.y4m,--preset veryslow --pme --crf 40
OldTownCross_1920x1080_50_10bit_422.yuv,--preset superfast --weightp

city_4cif_60fps.y4m,--preset superfast --rdpenalty 1 --tu-intra-depth 2
city_4cif_60fps.y4m,--preset medium --crf 4 --cu-lossless --sao-non-deblock
city_4cif_60fps.y4m,--preset slower --scaling-list default
+city_4cif_60fps.y4m,--preset veryslow --opt-cu-delta-qp
city_4cif_60fps.y4m,--preset veryslow --rdpenalty 2 --sao-non-deblock --no-b-intra --limit-refs 0
ducks_take_off_420_720p50.y4m,--preset ultrafast --constrained-intra --rd 1
ducks_take_off_444_720p50.y4m,--preset superfast --weightp --limit-refs 2

CrowdRun_1920x1080_50_10bit_422.yuv,--preset fast --interlace bff

#SEA Implementation Test
-silent_cif_420.y4m,--preset veryslow --me 4
-big_buck_bunny_360p24.y4m,--preset superfast --me 4
+silent_cif_420.y4m,--preset veryslow --me sea
+big_buck_bunny_360p24.y4m,--preset superfast --me sea
# vim: tw=200
x265_2.2.tar.gz/source/test/smoke-tests.txt -> x265_2.3.tar.gz/source/test/smoke-tests.txt
Changed
old_town_cross_444_720p50.y4m,--preset=fast --keyint 20 --min-cu-size 16
old_town_cross_444_720p50.y4m,--preset=slow --sao-non-deblock --pmode --qg-size 32
RaceHorses_416x240_30_10bit.yuv,--preset=veryfast --max-tu-size 8
-RaceHorses_416x240_30_10bit.yuv,--preset=slower --bitrate 500 -F4 --rdoq-level 1
+RaceHorses_416x240_30_10bit.yuv,--preset=slower --bitrate 500 -F4 --rdoq-level 1 --opt-cu-delta-qp
CrowdRun_1920x1080_50_10bit_444.yuv,--preset=ultrafast --constrained-intra --min-keyint 5 --keyint 10
CrowdRun_1920x1080_50_10bit_444.yuv,--preset=medium --max-tu-size 16 --tu-inter-depth 2 --limit-tu 3
DucksAndLegs_1920x1080_60_10bit_422.yuv,--preset=veryfast --min-cu 16
x265_2.2.tar.gz/source/x265-extras.cpp -> x265_2.3.tar.gz/source/x265-extras.cpp
Changed
/* detailed performance statistics */
if (level >= 2)
- fprintf(csvfp, ", DecideWait (ms), Row0Wait (ms), Wall time (ms), Ref Wait Wall (ms), Total CTU time (ms), Stall Time (ms), Avg WPP, Row Blocks");
+ fprintf(csvfp, ", DecideWait (ms), Row0Wait (ms), Wall time (ms), Ref Wait Wall (ms), Total CTU time (ms), Stall Time (ms), Total frame time (ms), Avg WPP, Row Blocks");
fprintf(csvfp, "\n");
}
else

if (level >= 2)
{
- fprintf(csvfp, ", %.1lf, %.1lf, %.1lf, %.1lf, %.1lf, %.1lf,", frameStats->decideWaitTime, frameStats->row0WaitTime, frameStats->wallTime, frameStats->refWaitWallTime, frameStats->totalCTUTime, frameStats->stallTime);
+ fprintf(csvfp, ", %.1lf, %.1lf, %.1lf, %.1lf, %.1lf, %.1lf, %.1lf,", frameStats->decideWaitTime, frameStats->row0WaitTime, frameStats->wallTime, frameStats->refWaitWallTime, frameStats->totalCTUTime, frameStats->stallTime, frameStats->totalFrameTime);
fprintf(csvfp, " %.3lf, %d", frameStats->avgWPP, frameStats->countRowBlocks);
}
fprintf(csvfp, "\n");
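
The x265-extras.cpp change above adds a seventh timing column, "Total frame time (ms)", to both the CSV header and the per-frame row, so the two format strings have to stay in step. A minimal sketch of the row side follows. It uses a hypothetical FrameTimes struct in place of x265_frame_stats and returns a string rather than writing to csvfp; only the field names and the format string are taken from the diff.

```cpp
#include <cstdio>
#include <string>

// Hypothetical subset of x265_frame_stats: just the seven timing fields.
struct FrameTimes
{
    double decideWaitTime;
    double row0WaitTime;
    double wallTime;
    double refWaitWallTime;
    double totalCTUTime;
    double stallTime;
    double totalFrameTime; // the field added in 2.3
};

// Formats the level >= 2 timing columns the same way the "+" line does.
std::string formatTimingColumns(const FrameTimes& t)
{
    char buf[256];
    std::snprintf(buf, sizeof(buf),
                  ", %.1f, %.1f, %.1f, %.1f, %.1f, %.1f, %.1f,",
                  t.decideWaitTime, t.row0WaitTime, t.wallTime,
                  t.refWaitWallTime, t.totalCTUTime, t.stallTime,
                  t.totalFrameTime);
    return std::string(buf);
}
```

Any consumer that indexes the CSV by column position rather than by header name would need to account for the extra field before "Avg WPP".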
x265_2.2.tar.gz/source/x265.h -> x265_2.3.tar.gz/source/x265.h
Changed
/* All the above values will add up to 100%. */
} x265_cu_stats;

+
+typedef struct x265_analysis_2Pass
+{
+ uint32_t poc;
+ uint32_t frameRecordSize;
+ void* analysisFramedata;
+}x265_analysis_2Pass;
+
/* Frame level statistics */
typedef struct x265_frame_stats
{

int bScenecut;
int frameLatency;
x265_cu_stats cuStats;
+ double totalFrameTime;
} x265_frame_stats;

/* Arbitrary User SEI

uint64_t framesize;

int height;
+
+ x265_analysis_2Pass analysis2Pass;
} x265_picture;

typedef enum

#define X265_AQ_AUTO_VARIANCE 2
#define X265_AQ_AUTO_VARIANCE_BIASED 3

+#define x265_ADAPT_RD_STRENGTH 4
+
/* NOTE! For this release only X265_CSP_I420 and X265_CSP_I444 are supported */

/* Supported internal color space types (according to semantics of chroma_format_idc) */

* intra cost of a frame used in scenecut detection. Default 5. */
double scenecutBias;

+ /* Use multiple worker threads dedicated to doing only lookahead instead of sharing
+ * the worker threads with Frame Encoders. A dedicated lookahead threadpool is created with the
+ * specified number of worker threads. This can range from 0 upto half the
+ * hardware threads available for encoding. Using too many threads for lookahead can starve
+ * resources for frame Encoder and can harm performance. Default is 0 - disabled. */
+ int lookaheadThreads;
+
+ /* Optimize CU level QPs to signal consistent deltaQPs in frame for rd level > 4 */
+ int bOptCUDeltaQP;
+
+ /* Refine analysis in multipass ratecontrol based on analysis information stored */
+ int analysisMultiPassRefine;
+
+ /* Refine analysis in multipass ratecontrol based on distortion data stored */
+ int analysisMultiPassDistortion;
+
+ /* Adaptive Quantization based on relative motion */
+ int bAQMotion;
+
+ /* SSIM based RDO, based on residual divisive normalization scheme. Used for mode
+ * selection during analysis of CTUs, can achieve significant gain in terms of
+ * objective quality metrics SSIM and PSNR */
+ int bSsimRd;
+
+ /* Increase RD at points where bitrate drops due to vbv. Default 0 */
+ double dynamicRd;
+
+ /* Enables the emitting of HDR SEI packets which contains HDR-specific params.
+ * Auto-enabled when max-cll, max-fall, or mastering display info is specified.
+ * Default is disabled */
+ int bEmitHDRSEI;
} x265_param;

/* x265_param_alloc:
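
The comments on the new x265_param fields above document ranges: lookaheadThreads runs from 0 up to half the hardware threads, and the CLI help elsewhere in this diff lists --dynamic-rd as <0..4.0>. A minimal sketch of clamping to those documented ranges follows. NewParams and clampToDocumentedRanges are hypothetical names, and the diff does not show how (or whether) x265 itself enforces these limits, so this is illustrative only.

```cpp
#include <algorithm>

// Hypothetical mirror of two of the fields added to x265_param.
struct NewParams
{
    int    lookaheadThreads;
    double dynamicRd;
};

// Clamp both fields to the ranges stated in the header comment and CLI help.
NewParams clampToDocumentedRanges(NewParams p, int hardwareThreads)
{
    // "0 upto half the hardware threads available for encoding"
    int maxLookahead = std::max(0, hardwareThreads / 2);
    p.lookaheadThreads = std::min(std::max(p.lookaheadThreads, 0), maxLookahead);

    // "--dynamic-rd <0..4.0>"
    p.dynamicRd = std::min(std::max(p.dynamicRd, 0.0), 4.0);
    return p;
}
```

With 16 hardware threads, a request for 12 lookahead threads would be reduced to 8 under this sketch.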
x265_2.2.tar.gz/source/x265cli.h -> x265_2.3.tar.gz/source/x265cli.h
Changed
{ "intra-refresh", no_argument, NULL, 0 },
{ "rc-lookahead", required_argument, NULL, 0 },
{ "lookahead-slices", required_argument, NULL, 0 },
+ { "lookahead-threads", required_argument, NULL, 0 },
{ "bframes", required_argument, NULL, 'b' },
{ "bframe-bias", required_argument, NULL, 0 },
{ "b-adapt", required_argument, NULL, 0 },

{ "rd", required_argument, NULL, 0 },
{ "rdoq-level", required_argument, NULL, 0 },
{ "no-rdoq-level", no_argument, NULL, 0 },
+ { "dynamic-rd", required_argument, NULL, 0 },
{ "psy-rd", required_argument, NULL, 0 },
{ "psy-rdoq", required_argument, NULL, 0 },
{ "no-psy-rd", no_argument, NULL, 0 },

{ "no-opt-qp-pps", no_argument, NULL, 0 },
{ "opt-ref-list-length-pps", no_argument, NULL, 0 },
{ "no-opt-ref-list-length-pps", no_argument, NULL, 0 },
+ { "opt-cu-delta-qp", no_argument, NULL, 0 },
+ { "no-opt-cu-delta-qp", no_argument, NULL, 0 },
{ "no-dither", no_argument, NULL, 0 },
{ "dither", no_argument, NULL, 0 },
{ "no-repeat-headers", no_argument, NULL, 0 },

{ "nr-inter", required_argument, NULL, 0 },
{ "stats", required_argument, NULL, 0 },
{ "pass", required_argument, NULL, 0 },
+ { "multi-pass-opt-analysis", no_argument, NULL, 0 },
+ { "no-multi-pass-opt-analysis", no_argument, NULL, 0 },
+ { "multi-pass-opt-distortion", no_argument, NULL, 0 },
+ { "no-multi-pass-opt-distortion", no_argument, NULL, 0 },
{ "slow-firstpass", no_argument, NULL, 0 },
{ "no-slow-firstpass", no_argument, NULL, 0 },
{ "multi-pass-opt-rps", no_argument, NULL, 0 },

{ "analyze-src-pics", no_argument, NULL, 0 },
{ "no-analyze-src-pics", no_argument, NULL, 0 },
{ "slices", required_argument, NULL, 0 },
+ { "aq-motion", no_argument, NULL, 0 },
+ { "no-aq-motion", no_argument, NULL, 0 },
+ { "ssim-rd", no_argument, NULL, 0 },
+ { "no-ssim-rd", no_argument, NULL, 0 },
+ { "hdr", no_argument, NULL, 0 },
+ { "no-hdr", no_argument, NULL, 0 },
{ 0, 0, 0, 0 },
{ 0, 0, 0, 0 },
{ 0, 0, 0, 0 },

H0(" --[no-]psy-rd <0..5.0> Strength of psycho-visual rate distortion optimization, 0 to disable. Default %.1f\n", param->psyRd);
H0(" --[no-]rdoq-level <0|1|2> Level of RDO in quantization 0:none, 1:levels, 2:levels & coding groups. Default %d\n", param->rdoqLevel);
H0(" --[no-]psy-rdoq <0..50.0> Strength of psycho-visual optimization in RDO quantization, 0 to disable. Default %.1f\n", param->psyRdoq);
+ H0(" --dynamic-rd <0..4.0> Strength of dynamic RD, 0 to disable. Default %.2f\n", param->dynamicRd);
+ H0(" --[no-]ssim-rd Enable ssim rate distortion optimization, 0 to disable. Default %s\n", OPT(param->bSsimRd));
H0(" --[no-]rd-refine Enable QP based RD refinement for rd levels 5 and 6. Default %s\n", OPT(param->bEnableRdRefine));
H0(" --[no-]early-skip Enable early SKIP detection. Default %s\n", OPT(param->bEnableEarlySkip));
H0(" --[no-]rskip Enable early exit from recursion. Default %s\n", OPT(param->bEnableRecursionSkip));

H0(" --intra-refresh Use Periodic Intra Refresh instead of IDR frames\n");
H0(" --rc-lookahead <integer> Number of frames for frame-type lookahead (determines encoder latency) Default %d\n", param->lookaheadDepth);
H1(" --lookahead-slices <0..16> Number of slices to use per lookahead cost estimate. Default %d\n", param->lookaheadSlices);
+ H0(" --lookahead-threads <integer> Number of threads to be dedicated to perform lookahead only. Default %d\n", param->lookaheadThreads);
H0(" --bframes <integer> Maximum number of consecutive b-frames (now it only enables B GOP structure) Default %d\n", param->bframes);
H1(" --bframe-bias <integer> Bias towards B frame decisions. Default %d\n", param->bFrameBias);
H0(" --b-adapt <0..2> 0 - none, 1 - fast, 2 - full (trellis) adaptive B frame scheduling. Default %d\n", param->bFrameAdaptive);

" - 1 : First pass, creates stats file\n"
" - 2 : Last pass, does not overwrite stats file\n"
" - 3 : Nth pass, overwrites stats file\n");
+ H0(" --[no-]multi-pass-opt-analysis Refine analysis in 2 pass based on analysis information from pass 1\n");
+ H0(" --[no-]multi-pass-opt-distortion Use distortion of CTU from pass 1 to refine qp in 2 pass\n");
H0(" --stats Filename for stats file in multipass pass rate control. Default x265_2pass.log\n");
H0(" --[no-]analyze-src-pics Motion estimation uses source frame planes. Default disable\n");
H0(" --[no-]slow-firstpass Enable a slow first pass in a multipass rate control mode. Default %s\n", OPT(param->rc.bEnableSlowFirstPass));

H0(" --analysis-file <filename> Specify file name used for either dumping or reading analysis data.\n");
H0(" --aq-mode <integer> Mode for Adaptive Quantization - 0:none 1:uniform AQ 2:auto variance 3:auto variance with bias to dark scenes. Default %d\n", param->rc.aqMode);
H0(" --aq-strength <float> Reduces blocking and blurring in flat and textured areas (0 to 3.0). Default %.2f\n", param->rc.aqStrength);
+ H0(" --[no-]aq-motion Adaptive Quantization based on the relative motion of each CU w.r.t., frame. Default %s\n", OPT(param->bOptCUDeltaQP));
H0(" --qg-size <int> Specifies the size of the quantization group (64, 32, 16, 8). Default %d\n", param->rc.qgSize);
H0(" --[no-]cutree Enable cutree for Adaptive Quantization. Default %s\n", OPT(param->rc.cuTree));
H0(" --[no-]rc-grain Enable ratecontrol mode to handle grains specifically. turned on with tune grain. Default %s\n", OPT(param->rc.bEnableGrain));

H0(" --master-display <string> SMPTE ST 2086 master display color volume info SEI (HDR)\n");
H0(" format: G(x,y)B(x,y)R(x,y)WP(x,y)L(max,min)\n");
H0(" --max-cll <string> Emit content light level info SEI as \"cll,fall\" (HDR)\n");
+ H0(" --[no-]hdr Control dumping of HDR SEI packet. If max-cll or master-display has non-zero values, this is enabled. Default %s\n", OPT(param->bEmitHDRSEI));
H0(" --min-luma <integer> Minimum luma plane value of input source picture\n");
H0(" --max-luma <integer> Maximum luma plane value of input source picture\n");
H0("\nBitstream options:\n");

H0(" --[no-]opt-qp-pps Dynamically optimize QP in PPS (instead of default 26) based on QPs in previous GOP. Default %s\n", OPT(param->bOptQpPPS));
H0(" --[no-]opt-ref-list-length-pps Dynamically set L0 and L1 ref list length in PPS (instead of default 0) based on values in last GOP. Default %s\n", OPT(param->bOptRefListLengthPPS));
H0(" --[no-]multi-pass-opt-rps Enable storing commonly used RPS in SPS in multi pass mode. Default %s\n", OPT(param->bMultiPassOptRPS));
+ H0(" --[no-]opt-cu-delta-qp Optimize to signal consistent CU level delta QPs in frame. Default %s\n", OPT(param->bOptCUDeltaQP));
H1("\nReconstructed video options (debugging):\n");
H1("-r/--recon <filename> Reconstructed raw image YUV or Y4M output file name\n");
H1(" --recon-depth <integer> Bit-depth of reconstructed raw image file. Defaults to input bit depth, or 8 if Y4M\n");
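
The option table above registers each new boolean switch twice, once under its plain name and once with a "no-" prefix (for example "aq-motion" and "no-aq-motion"), both as no_argument. A minimal sketch of one way such a pair could be folded into a single name plus a boolean value follows. parseBoolOption is a hypothetical helper, and this diff does not show how x265 itself resolves the prefix.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: map "no-foo" to ("foo", false) and "foo" to ("foo", true).
std::pair<std::string, bool> parseBoolOption(const std::string& name)
{
    // compare() is safe on short strings: it clamps to the available length.
    if (name.compare(0, 3, "no-") == 0)
        return std::make_pair(name.substr(3), false);
    return std::make_pair(name, true);
}
```

For instance, "no-ssim-rd" maps to ("ssim-rd", false) and "hdr" maps to ("hdr", true). Options that take a value, such as "dynamic-rd", are outside the scope of this sketch.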