Overview

Request 4057 (accepted)

Update to version 2.6

Submit package home:enzokiel:branches:Essentials / x265 to package Essentials / x265

x265.changes Changed
@@ -1,4 +1,53 @@
 -------------------------------------------------------------------
+Fri Dec 01 16:40:13 UTC 2017 - joerg.lorenzen@ki.tng.de
+
+- Update to version 2.6
+  New features
+  * x265 can now refine analysis from a previous HEVC encode (using
+    options --refine-inter, and --refine-intra), or a previous AVC
+    encode (using option --refine-mv-type). The previous encode’s
+    information can be packaged using the x265_analysis_data_t data
+    field available in the x265_picture object.
+  * Basic support for segmented (or chunked) encoding added with
+    --vbv-end that can specify the status of CPB at the end of a
+    segment. String this together with --vbv-init to encode a title
+    as chunks while maintaining VBV compliance!
+  * --force-flush can be used to trigger a premature flush of the
+    encoder. This option is beneficial when input is known to be
+    bursty, and may be at a rate slower than the encoder.
+  * Experimental feature --lowpass-dct that uses truncated DCT for
+    transformation.
+  Encoder enhancements
+  * Slice-parallel mode gets a significant boost in performance,
+    particularly in low-latency mode.
+  * x265 now officially supported on VS2017.
+  * x265 now supports all depths from mono0 to mono16 for Y4M
+    format.
+  API changes
+  * Options that modified PPS dynamically (--opt-qp-pps and
+    --opt-ref-list-length-pps) are now disabled by default to
+    enable users to save bits by not sending headers. If these
+    options are enabled, headers have to be repeated for every GOP.
+  * Rate-control and analysis parameters can dynamically be
+    reconfigured simultaneously via the x265_encoder_reconfig API.
+  * New API functions to extract intermediate information such as
+    slice-type, scenecut information, reference frames, etc. are
+    now available. This information may be beneficial to
+    integrating applications that are attempting to perform
+    content-adaptive encoding. Refer to documentation on
+    x265_get_slicetype_poc_and_scenecut, and
+    x265_get_ref_frame_list for more details and suggested usage.
+  * A new API to pass supplemental CTU information to x265 to
+    influence analysis decisions has been added. Refer to
+    documentation on x265_encoder_ctu_info for more details.
+  Bug fixes
+  * Bug fixes when --slices is used with VBV settings.
+  * Minor memory leak fixed for HDR10+ builds, and default x265
+    when pools option is specified.
+  * HDR10+ bug fix to remove dependence on poc counter to select
+    meta-data information.
+
+-------------------------------------------------------------------
 Thu Jul 27 08:33:52 UTC 2017 - joerg.lorenzen@ki.tng.de
 
 - Update to version 2.5
x265.spec Changed
@@ -1,10 +1,10 @@
 # based on the spec file from https://build.opensuse.org/package/view_file/home:Simmphonie/libx265/
 
 Name:           x265
-%define soname  130
+%define soname  146
 %define libname lib%{name}
 %define libsoname %{libname}-%{soname}
-Version:        2.5
+Version:        2.6
 Release:        0
 License:        GPL-2.0+
 Summary:        A free h265/HEVC encoder - encoder binary
@@ -49,7 +49,7 @@
 streams. 
 
 %prep
-%setup -q -n %{name}_%{version}
+%setup -q -n %{name}_v%{version}
 %patch0 -p1
 %patch1 -p1
 
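The spec change above alters two derived strings: the library package name built from `%{libname}-%{soname}`, and the source directory passed to `%setup -n`. The following is a minimal sketch of how those macros resolve. It assumes a toy `%{key}` substitution loop, not rpm's actual macro engine, and `expandMacros` and `kSpecMacros` are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Toy model of RPM-style %{key} substitution (NOT rpm's real macro engine).
// Replaces each known %{key} with its value and rescans the replacement,
// so macros that expand to other macros are resolved as well.
std::string expandMacros(std::string text, const std::map<std::string, std::string>& macros)
{
    std::size_t pos = 0;
    while ((pos = text.find("%{", pos)) != std::string::npos)
    {
        std::size_t end = text.find('}', pos);
        if (end == std::string::npos)
            break;                                    // unterminated macro: leave as-is
        auto it = macros.find(text.substr(pos + 2, end - pos - 2));
        if (it == macros.end())
        {
            pos = end + 1;                            // unknown macro: skip over it
            continue;
        }
        text.replace(pos, end - pos + 1, it->second); // pos is unchanged, so the value is rescanned
    }
    return text;
}

// Values taken from the 2.6 side of the spec diff.
const std::map<std::string, std::string> kSpecMacros = {
    {"name", "x265"},
    {"version", "2.6"},
    {"soname", "146"},
    {"libname", "lib%{name}"},
    {"libsoname", "%{libname}-%{soname}"},
};
```

Under these assumptions, `expandMacros("%{libsoname}", kSpecMacros)` gives `libx265-146` and `expandMacros("%{name}_v%{version}", kSpecMacros)` gives `x265_v2.6`. The new `%setup` line would therefore look for a source directory with a `v` before the version number.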
x265_2.5.tar.gz/source/x265-extras.cpp Deleted
@@ -1,447 +0,0 @@
-/*****************************************************************************
- * Copyright (C) 2013-2017 MulticoreWare, Inc
- *
- * Authors: Steve Borho <steve@borho.org>
- *          Selvakumar Nithiyaruban <selvakumar@multicorewareinc.com>
- *          Divya Manivannan <divya@multicorewareinc.com>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02111, USA.
- *
- * This program is also available under a commercial proprietary license.
- * For more information, contact us at license @ x265.com.
- *****************************************************************************/
-
-#include "x265.h"
-#include "x265-extras.h"
-#include "param.h"
-#include "common.h"
-
-using namespace X265_NS;
-
-static const char* summaryCSVHeader =
-    "Command, Date/Time, Elapsed Time, FPS, Bitrate, "
-    "Y PSNR, U PSNR, V PSNR, Global PSNR, SSIM, SSIM (dB), "
-    "I count, I ave-QP, I kbps, I-PSNR Y, I-PSNR U, I-PSNR V, I-SSIM (dB), "
-    "P count, P ave-QP, P kbps, P-PSNR Y, P-PSNR U, P-PSNR V, P-SSIM (dB), "
-    "B count, B ave-QP, B kbps, B-PSNR Y, B-PSNR U, B-PSNR V, B-SSIM (dB), "
-    "MaxCLL, MaxFALL, Version\n";
-
-FILE* x265_csvlog_open(const x265_param& param, const char* fname, int level)
-{
-    FILE *csvfp = x265_fopen(fname, "r");
-    if (csvfp)
-    {
-        /* file already exists, re-open for append */
-        fclose(csvfp);
-        return x265_fopen(fname, "ab");
-    }
-    else
-    {
-        /* new CSV file, write header */
-        csvfp = x265_fopen(fname, "wb");
-        if (csvfp)
-        {
-            if (level)
-            {
-                fprintf(csvfp, "Encode Order, Type, POC, QP, Bits, Scenecut, ");
-                if (level >= 2)
-                    fprintf(csvfp, "I/P cost ratio, ");
-                if (param.rc.rateControlMode == X265_RC_CRF)
-                    fprintf(csvfp, "RateFactor, ");
-                if (param.rc.vbvBufferSize)
-                    fprintf(csvfp, "BufferFill, ");
-                if (param.bEnablePsnr)
-                    fprintf(csvfp, "Y PSNR, U PSNR, V PSNR, YUV PSNR, ");
-                if (param.bEnableSsim)
-                    fprintf(csvfp, "SSIM, SSIM(dB), ");
-                fprintf(csvfp, "Latency, ");
-                fprintf(csvfp, "List 0, List 1");
-                uint32_t size = param.maxCUSize;
-                for (uint32_t depth = 0; depth <= param.maxCUDepth; depth++)
-                {
-                    fprintf(csvfp, ", Intra %dx%d DC, Intra %dx%d Planar, Intra %dx%d Ang", size, size, size, size, size, size);
-                    size /= 2;
-                }
-                fprintf(csvfp, ", 4x4");
-                size = param.maxCUSize;
-                if (param.bEnableRectInter)
-                {
-                    for (uint32_t depth = 0; depth <= param.maxCUDepth; depth++)
-                    {
-                        fprintf(csvfp, ", Inter %dx%d, Inter %dx%d (Rect)", size, size, size, size);
-                        if (param.bEnableAMP)
-                            fprintf(csvfp, ", Inter %dx%d (Amp)", size, size);
-                        size /= 2;
-                    }
-                }
-                else
-                {
-                    for (uint32_t depth = 0; depth <= param.maxCUDepth; depth++)
-                    {
-                        fprintf(csvfp, ", Inter %dx%d", size, size);
-                        size /= 2;
-                    }
-                }
-                size = param.maxCUSize;
-                for (uint32_t depth = 0; depth <= param.maxCUDepth; depth++)
-                {
-                    fprintf(csvfp, ", Skip %dx%d", size, size);
-                    size /= 2;
-                }
-                size = param.maxCUSize;
-                for (uint32_t depth = 0; depth <= param.maxCUDepth; depth++)
-                {
-                    fprintf(csvfp, ", Merge %dx%d", size, size);
-                    size /= 2;
-                }
-
-                if (level >= 2)
-                {
-                    fprintf(csvfp, ", Avg Luma Distortion, Avg Chroma Distortion, Avg psyEnergy, Avg Residual Energy,"
-                        " Min Luma Level, Max Luma Level, Avg Luma Level");
-
-                    if (param.internalCsp != X265_CSP_I400)
-                        fprintf(csvfp, ", Min Cb Level, Max Cb Level, Avg Cb Level, Min Cr Level, Max Cr Level, Avg Cr Level");
-
-                    /* PU statistics */
-                    size = param.maxCUSize;
-                    for (uint32_t i = 0; i< param.maxLog2CUSize - (uint32_t)g_log2Size[param.minCUSize] + 1; i++)
-                    {
-                        fprintf(csvfp, ", Intra %dx%d", size, size);
-                        fprintf(csvfp, ", Skip %dx%d", size, size);
-                        fprintf(csvfp, ", AMP %d", size);
-                        fprintf(csvfp, ", Inter %dx%d", size, size);
-                        fprintf(csvfp, ", Merge %dx%d", size, size);
-                        fprintf(csvfp, ", Inter %dx%d", size, size / 2);
-                        fprintf(csvfp, ", Merge %dx%d", size, size / 2);
-                        fprintf(csvfp, ", Inter %dx%d", size / 2, size);
-                        fprintf(csvfp, ", Merge %dx%d", size / 2, size);
-                        size /= 2;
-                    }
-
-                    if ((uint32_t)g_log2Size[param.minCUSize] == 3)
-                        fprintf(csvfp, ", 4x4");
-
-                    /* detailed performance statistics */
-                    fprintf(csvfp, ", DecideWait (ms), Row0Wait (ms), Wall time (ms), Ref Wait Wall (ms), Total CTU time (ms),"
-                    "Stall Time (ms), Total frame time (ms), Avg WPP, Row Blocks");
-                }
-                fprintf(csvfp, "\n");
-            }
-            else
-                fputs(summaryCSVHeader, csvfp);
-        }
-        return csvfp;
-    }
-}
-
-// per frame CSV logging
-void x265_csvlog_frame(FILE* csvfp, const x265_param& param, const x265_picture& pic, int level)
-{
-    if (!csvfp)
-        return;
-
-    const x265_frame_stats* frameStats = &pic.frameData;
-    fprintf(csvfp, "%d, %c-SLICE, %4d, %2.2lf, %10d, %d,", frameStats->encoderOrder, frameStats->sliceType, frameStats->poc, 
-                                                           frameStats->qp, (int)frameStats->bits, frameStats->bScenecut);
-    if (level >= 2)
-        fprintf(csvfp, "%.2f,", frameStats->ipCostRatio);
-    if (param.rc.rateControlMode == X265_RC_CRF)
-        fprintf(csvfp, "%.3lf,", frameStats->rateFactor);
-    if (param.rc.vbvBufferSize)
-        fprintf(csvfp, "%.3lf,", frameStats->bufferFill);
-    if (param.bEnablePsnr)
-        fprintf(csvfp, "%.3lf, %.3lf, %.3lf, %.3lf,", frameStats->psnrY, frameStats->psnrU, frameStats->psnrV, frameStats->psnr);
-    if (param.bEnableSsim)
-        fprintf(csvfp, " %.6f, %6.3f,", frameStats->ssim, x265_ssim2dB(frameStats->ssim));
-    fprintf(csvfp, "%d, ", frameStats->frameLatency);
-    if (frameStats->sliceType == 'I' || frameStats->sliceType == 'i')
-        fputs(" -, -,", csvfp);
-    else
-    {
-        int i = 0;
-        while (frameStats->list0POC[i] != -1)
-            fprintf(csvfp, "%d ", frameStats->list0POC[i++]);
-        fprintf(csvfp, ",");
-        if (frameStats->sliceType != 'P')
-        {
-            i = 0;
-            while (frameStats->list1POC[i] != -1)
-                fprintf(csvfp, "%d ", frameStats->list1POC[i++]);
-            fprintf(csvfp, ",");
-        }
-        else
-            fputs(" -,", csvfp);
-    }
-
-    if (level)
-    {
-        for (uint32_t depth = 0; depth <= param.maxCUDepth; depth++)
-            fprintf(csvfp, "%5.2lf%%, %5.2lf%%, %5.2lf%%,", frameStats->cuStats.percentIntraDistribution[depth][0],
-            frameStats->cuStats.percentIntraDistribution[depth][1],
-            frameStats->cuStats.percentIntraDistribution[depth][2]);
-        fprintf(csvfp, "%5.2lf%%", frameStats->cuStats.percentIntraNxN);
-        if (param.bEnableRectInter)
-        {
-            for (uint32_t depth = 0; depth <= param.maxCUDepth; depth++)
-            {
-                fprintf(csvfp, ", %5.2lf%%, %5.2lf%%", frameStats->cuStats.percentInterDistribution[depth][0],
-                    frameStats->cuStats.percentInterDistribution[depth][1]);
-                if (param.bEnableAMP)
-                    fprintf(csvfp, ", %5.2lf%%", frameStats->cuStats.percentInterDistribution[depth][2]);
-            }
-        }
-        else
-        {
-            for (uint32_t depth = 0; depth <= param.maxCUDepth; depth++)
-                fprintf(csvfp, ", %5.2lf%%", frameStats->cuStats.percentInterDistribution[depth][0]);
-        }
-        for (uint32_t depth = 0; depth <= param.maxCUDepth; depth++)
-            fprintf(csvfp, ", %5.2lf%%", frameStats->cuStats.percentSkipCu[depth]);
-        for (uint32_t depth = 0; depth <= param.maxCUDepth; depth++)
-            fprintf(csvfp, ", %5.2lf%%", frameStats->cuStats.percentMergeCu[depth]);
-    }
-
-    if (level >= 2)
-    {
-        fprintf(csvfp, ", %.2lf, %.2lf, %.2lf, %.2lf ", frameStats->avgLumaDistortion,
-            frameStats->avgChromaDistortion,
-            frameStats->avgPsyEnergy,
-            frameStats->avgResEnergy);
-
-        fprintf(csvfp, ", %d, %d, %.2lf", frameStats->minLumaLevel, frameStats->maxLumaLevel, frameStats->avgLumaLevel);
-
-        if (param.internalCsp != X265_CSP_I400)
-        {
-            fprintf(csvfp, ", %d, %d, %.2lf", frameStats->minChromaULevel, frameStats->maxChromaULevel, frameStats->avgChromaULevel);
-            fprintf(csvfp, ", %d, %d, %.2lf", frameStats->minChromaVLevel, frameStats->maxChromaVLevel, frameStats->avgChromaVLevel);
-        }
-
-        for (uint32_t i = 0; i < param.maxLog2CUSize - (uint32_t)g_log2Size[param.minCUSize] + 1; i++)
-        {
-            fprintf(csvfp, ", %.2lf%%", frameStats->puStats.percentIntraPu[i]);
-            fprintf(csvfp, ", %.2lf%%", frameStats->puStats.percentSkipPu[i]);
-            fprintf(csvfp, ",%.2lf%%", frameStats->puStats.percentAmpPu[i]);
-            for (uint32_t j = 0; j < 3; j++)
-            {
-                fprintf(csvfp, ", %.2lf%%", frameStats->puStats.percentInterPu[i][j]);
-                fprintf(csvfp, ", %.2lf%%", frameStats->puStats.percentMergePu[i][j]);
-            }
-        }
-        if ((uint32_t)g_log2Size[param.minCUSize] == 3)
-            fprintf(csvfp, ",%.2lf%%", frameStats->puStats.percentNxN);
-
-        fprintf(csvfp, ", %.1lf, %.1lf, %.1lf, %.1lf, %.1lf, %.1lf, %.1lf,", frameStats->decideWaitTime, frameStats->row0WaitTime,
-                                                                             frameStats->wallTime, frameStats->refWaitWallTime,
-                                                                             frameStats->totalCTUTime, frameStats->stallTime,
-                                                                             frameStats->totalFrameTime);
-
-        fprintf(csvfp, " %.3lf, %d", frameStats->avgWPP, frameStats->countRowBlocks);
-    }
-    fprintf(csvfp, "\n");
-    fflush(stderr);
-}
-
-void x265_csvlog_encode(FILE* csvfp, const char* version, const x265_param& param, int padx, int pady, const x265_stats& stats, int level, int argc, char** argv)
-{
-    if (!csvfp)
-        return;
-
-    if (level)
-    {
-        // adding summary to a per-frame csv log file, so it needs a summary header
-        fprintf(csvfp, "\nSummary\n");
-        fputs(summaryCSVHeader, csvfp);
-    }
-
-    // CLI arguments or other
-    if (argc)
-    {
-        fputc('"', csvfp);
-        for (int i = 1; i < argc; i++)
-        {
-            fputc(' ', csvfp);
-            fputs(argv[i], csvfp);
-        }
-        fputc('"', csvfp);
-    }
-    else
-    {
-        const x265_param* paramTemp = &param;
-        char *opts = x265_param2string((x265_param*)paramTemp, padx, pady);
-        if (opts)
-        {
-            fputc('"', csvfp);
-            fputs(opts, csvfp);
-            fputc('"', csvfp);
-        }
-    }
-
-    // current date and time
-    time_t now;
-    struct tm* timeinfo;
-    time(&now);
-    timeinfo = localtime(&now);
-    char buffer[200];
-    strftime(buffer, 128, "%c", timeinfo);
-    fprintf(csvfp, ", %s, ", buffer);
-
-    // elapsed time, fps, bitrate
-    fprintf(csvfp, "%.2f, %.2f, %.2f,",
-        stats.elapsedEncodeTime, stats.encodedPictureCount / stats.elapsedEncodeTime, stats.bitrate);
-
-    if (param.bEnablePsnr)
-        fprintf(csvfp, " %.3lf, %.3lf, %.3lf, %.3lf,",
-        stats.globalPsnrY / stats.encodedPictureCount, stats.globalPsnrU / stats.encodedPictureCount,
-        stats.globalPsnrV / stats.encodedPictureCount, stats.globalPsnr);
-    else
-        fprintf(csvfp, " -, -, -, -,");
-    if (param.bEnableSsim)
-        fprintf(csvfp, " %.6f, %6.3f,", stats.globalSsim, x265_ssim2dB(stats.globalSsim));
-    else
-        fprintf(csvfp, " -, -,");
-
-    if (stats.statsI.numPics)
-    {
-        fprintf(csvfp, " %-6u, %2.2lf, %-8.2lf,", stats.statsI.numPics, stats.statsI.avgQp, stats.statsI.bitrate);
-        if (param.bEnablePsnr)
-            fprintf(csvfp, " %.3lf, %.3lf, %.3lf,", stats.statsI.psnrY, stats.statsI.psnrU, stats.statsI.psnrV);
-        else
-            fprintf(csvfp, " -, -, -,");
-        if (param.bEnableSsim)
-            fprintf(csvfp, " %.3lf,", stats.statsI.ssim);
-        else
-            fprintf(csvfp, " -,");
-    }
-    else
-        fprintf(csvfp, " -, -, -, -, -, -, -,");
-
-    if (stats.statsP.numPics)
-    {
-        fprintf(csvfp, " %-6u, %2.2lf, %-8.2lf,", stats.statsP.numPics, stats.statsP.avgQp, stats.statsP.bitrate);
-        if (param.bEnablePsnr)
-            fprintf(csvfp, " %.3lf, %.3lf, %.3lf,", stats.statsP.psnrY, stats.statsP.psnrU, stats.statsP.psnrV);
-        else
-            fprintf(csvfp, " -, -, -,");
-        if (param.bEnableSsim)
-            fprintf(csvfp, " %.3lf,", stats.statsP.ssim);
-        else
-            fprintf(csvfp, " -,");
-    }
-    else
-        fprintf(csvfp, " -, -, -, -, -, -, -,");
-
-    if (stats.statsB.numPics)
-    {
-        fprintf(csvfp, " %-6u, %2.2lf, %-8.2lf,", stats.statsB.numPics, stats.statsB.avgQp, stats.statsB.bitrate);
-        if (param.bEnablePsnr)
-            fprintf(csvfp, " %.3lf, %.3lf, %.3lf,", stats.statsB.psnrY, stats.statsB.psnrU, stats.statsB.psnrV);
-        else
-            fprintf(csvfp, " -, -, -,");
-        if (param.bEnableSsim)
-            fprintf(csvfp, " %.3lf,", stats.statsB.ssim);
-        else
-            fprintf(csvfp, " -,");
-    }
-    else
-        fprintf(csvfp, " -, -, -, -, -, -, -,");
-
-    fprintf(csvfp, " %-6u, %-6u, %s\n", stats.maxCLL, stats.maxFALL, version);
-}
-
-/* The dithering algorithm is based on Sierra-2-4A error diffusion.
- * We convert planes in place (without allocating a new buffer). */
-static void ditherPlane(uint16_t *src, int srcStride, int width, int height, int16_t *errors, int bitDepth)
-{
-    const int lShift = 16 - bitDepth;
-    const int rShift = 16 - bitDepth + 2;
-    const int half = (1 << (16 - bitDepth + 1));
-    const int pixelMax = (1 << bitDepth) - 1;
-
-    memset(errors, 0, (width + 1) * sizeof(int16_t));
-
-    if (bitDepth == 8)
-    {
-        for (int y = 0; y < height; y++, src += srcStride)
-        {
-            uint8_t* dst = (uint8_t *)src;
-            int16_t err = 0;
-            for (int x = 0; x < width; x++)
-            {
-                err = err * 2 + errors[x] + errors[x + 1];
-                int tmpDst = x265_clip3(0, pixelMax, ((src[x] << 2) + err + half) >> rShift);
-                errors[x] = err = (int16_t)(src[x] - (tmpDst << lShift));
-                dst[x] = (uint8_t)tmpDst;
-            }
-        }
-    }
-    else
-    {
-        for (int y = 0; y < height; y++, src += srcStride)
-        {
-            int16_t err = 0;
-            for (int x = 0; x < width; x++)
-            {
-                err = err * 2 + errors[x] + errors[x + 1];
-                int tmpDst = x265_clip3(0, pixelMax, ((src[x] << 2) + err + half) >> rShift);
-                errors[x] = err = (int16_t)(src[x] - (tmpDst << lShift));
-                src[x] = (uint16_t)tmpDst;
-            }
-        }
-    }
-}
-
-void x265_dither_image(const x265_api& api, x265_picture& picIn, int picWidth, int picHeight, int16_t *errorBuf, int bitDepth)
-{
-    if (sizeof(x265_picture) != api.sizeof_picture)
-    {
-        fprintf(stderr, "extras [error]: structure size skew, unable to dither\n");
-        return;
-    }
-
-    if (picIn.bitDepth <= 8)
-    {
-        fprintf(stderr, "extras [error]: dither support enabled only for input bitdepth > 8\n");
-        return;
-    }
-
-    if (picIn.bitDepth == bitDepth)
-    {
-        fprintf(stderr, "extras[error]: dither support enabled only if encoder depth is different from picture depth\n");
-        return;
-    }
-
-    /* This portion of code is from readFrame in x264. */
-    for (int i = 0; i < x265_cli_csps[picIn.colorSpace].planes; i++)
-    {
-        if (picIn.bitDepth < 16)
-        {
-            /* upconvert non 16bit high depth planes to 16bit */
-            uint16_t *plane = (uint16_t*)picIn.planes[i];
-            uint32_t pixelCount = x265_picturePlaneSize(picIn.colorSpace, picWidth, picHeight, i);
-            int lShift = 16 - picIn.bitDepth;
-
-            /* This loop assumes width is equal to stride which
-             * happens to be true for file reader outputs */
-            for (uint32_t j = 0; j < pixelCount; j++)
-                plane[j] = plane[j] << lShift;
-        }
-
-        int height = (int)(picHeight >> x265_cli_csps[picIn.colorSpace].height[i]);
-        int width = (int)(picWidth >> x265_cli_csps[picIn.colorSpace].width[i]);
-
-        ditherPlane(((uint16_t*)picIn.planes[i]), picIn.stride[i] / 2, width, height, errorBuf, bitDepth);
-    }
-}
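The error-diffusion step in the removed `ditherPlane` can be followed on a single row. The sketch below is an illustration only, not the upstream routine. `ditherRowTo8Bit` is a hypothetical helper that assumes the samples were already upconverted to 16 bits and that the error buffer starts zeroed. It writes to a separate output vector instead of converting in place, and uses `std::min`/`std::max` where the original calls `x265_clip3`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical single-row version of the 16-bit -> 8-bit dithering loop.
// The arithmetic mirrors the removed ditherPlane(); the interface does not.
std::vector<uint8_t> ditherRowTo8Bit(const std::vector<uint16_t>& src)
{
    const int bitDepth = 8;
    const int lShift = 16 - bitDepth;          // scales an 8-bit value back to 16 bits
    const int rShift = 16 - bitDepth + 2;      // the extra 2 bits undo the "<< 2" below
    const int half = 1 << (16 - bitDepth + 1); // rounding offset before the shift
    const int pixelMax = (1 << bitDepth) - 1;

    std::vector<int16_t> errors(src.size() + 1, 0); // one extra slot, read as errors[x + 1]
    std::vector<uint8_t> dst(src.size());
    int16_t err = 0;
    for (std::size_t x = 0; x < src.size(); x++)
    {
        // Combine the previous pixel's error (weight 2) with two stored errors.
        err = static_cast<int16_t>(err * 2 + errors[x] + errors[x + 1]);
        int value = ((src[x] << 2) + err + half) >> rShift;
        int tmpDst = std::max(0, std::min(pixelMax, value));
        // Residual between the source sample and the quantized output.
        errors[x] = err = static_cast<int16_t>(src[x] - (tmpDst << lShift));
        dst[x] = static_cast<uint8_t>(tmpDst);
    }
    return dst;
}
```

For four samples of 32896 (halfway between the 8-bit levels 128 and 129), this sketch produces 129, 128, 129, 128: the residual of each sample pushes the next one the other way, so the row averages to the original level.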
x265_2.5.tar.gz/source/x265-extras.h Deleted
@@ -1,66 +0,0 @@
-/*****************************************************************************
- * Copyright (C) 2013-2017 MulticoreWare, Inc
- *
- * Authors: Steve Borho <steve@borho.org>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02111, USA.
- *
- * This program is also available under a commercial proprietary license.
- * For more information, contact us at license @ x265.com.
- *****************************************************************************/
-
-#ifndef X265_EXTRAS_H
-#define X265_EXTRAS_H 1
-
-#include "x265.h"
-
-#include <stdio.h>
-#include <stdint.h>
-
-#ifdef __cplusplus
-extern "C" {
-#endif
-
-#if _WIN32
-#define LIBAPI __declspec(dllexport)
-#else
-#define LIBAPI
-#endif
-
-/* Open a CSV log file. On success it returns a file handle which must be passed
- * to x265_csvlog_frame() and/or x265_csvlog_encode(). The file handle must be
- * closed by the caller using fclose(). If level is 0, then no frame logging
- * header is written to the file. This function will return NULL if it is unable
- * to open the file for write or if it detects a structure size skew */
-LIBAPI FILE* x265_csvlog_open(const x265_param& param, const char* fname, int level);
-
-/* Log frame statistics to the CSV file handle. level should have been non-zero
- * in the call to x265_csvlog_open() if this function is called. */
-LIBAPI void x265_csvlog_frame(FILE* csvfp, const x265_param& param, const x265_picture& pic, int level);
-
-/* Log final encode statistics to the CSV file handle. 'argc' and 'argv' are
- * intended to be command line arguments passed to the encoder. Encode
- * statistics should be queried from the encoder just prior to closing it. */
-LIBAPI void x265_csvlog_encode(FILE* csvfp, const char* version, const x265_param& param, int padx, int pady, const x265_stats& stats, int level, int argc, char** argv);
-
-/* In-place downshift from a bit-depth greater than 8 to a bit-depth of 8, using
- * the residual bits to dither each row. */
-LIBAPI void x265_dither_image(const x265_api& api, x265_picture&, int picWidth, int picHeight, int16_t *errorBuf, int bitDepth);
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif
x265_2.5.tar.gz/.hg_archival.txt -> x265_2.6.tar.gz/.hg_archival.txt Changed
@@ -1,4 +1,4 @@
 repo: 09fe40627f03a0f9c3e6ac78b22ac93da23f9fdf
-node: 64b2d0bf45a52511e57a6b7299160b961ca3d51c
+node: 0e9ea76945c89962cd46cee6537586e2054b2935
 branch: stable
-tag: 2.5
+tag: 2.6
x265_2.5.tar.gz/.hgtags -> x265_2.6.tar.gz/.hgtags Changed
@@ -23,3 +23,4 @@
 be14a7e9755e54f0fd34911c72bdfa66981220bc 2.2
 3037c1448549ca920967831482c653e5892fa8ed 2.3
 e7a4dd48293b7956d4a20df257d23904cc78e376 2.4
+64b2d0bf45a52511e57a6b7299160b961ca3d51c 2.5
x265_2.6.tar.gz/build/msys-cl/make-Makefiles-64bit.sh Added
@@ -0,0 +1,27 @@
+#!/bin/sh
+# This is to generate visual studio builds with required environment variables set in this shell, useful for ffmpeg integration
+# Run this from within an MSYS bash shell
+
+target_processor='amd64'
+path=$(which cl)
+
+if cl; then
+    echo
+else
+    echo "please launch 'visual studio command prompt' and run '..\vcvarsall.bat amd64'"
+    echo "and then launch msys bash shell from there"
+    exit 1
+fi
+
+if [[ $path  == *$target_processor* ]]; then
+    echo
+else
+    echo "64 bit target not set, please launch 'visual studio command prompt' and run '..\vcvarsall.bat amd64 | x86_amd64 | amd64_x86'"
+    exit 1
+fi
+
+cmake -G "NMake Makefiles" -DCMAKE_CXX_FLAGS="-DWIN32 -D_WINDOWS -W4 -GR -EHsc" -DCMAKE_C_FLAGS="-DWIN32 -D_WINDOWS -W4"  ../../source
+if [ -e Makefile ]
+then
+    nmake
+fi
\ No newline at end of file
x265_2.6.tar.gz/build/msys-cl/make-Makefiles.sh Added
@@ -0,0 +1,17 @@
+#!/bin/sh
+# This is to generate visual studio builds with required environment variables set in this shell, useful for ffmpeg integration
+# Run this from within an MSYS bash shell
+
+if cl; then
+    echo 
+else
+    echo "please launch msys from 'visual studio command prompt'"
+    exit 1
+fi
+
+cmake -G "NMake Makefiles" -DCMAKE_CXX_FLAGS="-DWIN32 -D_WINDOWS -W4 -GR -EHsc" -DCMAKE_C_FLAGS="-DWIN32 -D_WINDOWS -W4"  ../../source
+
+if [ -e Makefile ]
+then
+    nmake
+fi
\ No newline at end of file
x265_2.6.tar.gz/build/vc15-x86/build-all.bat Added
@@ -0,0 +1,14 @@
+@echo off
+if "%VS150COMNTOOLS%" == "" (
+  msg "%username%" "Visual Studio 15 not detected"
+  exit 1
+)
+if not exist x265.sln (
+  call make-solutions.bat
+)
+if exist x265.sln (
+  call "%VS150COMNTOOLS%\..\..\VC\vcvarsall.bat"
+  MSBuild /property:Configuration="Release" x265.sln
+  MSBuild /property:Configuration="Debug" x265.sln
+  MSBuild /property:Configuration="RelWithDebInfo" x265.sln
+)
x265_2.6.tar.gz/build/vc15-x86/make-solutions.bat Added
@@ -0,0 +1,6 @@
+@echo off
+::
+:: run this batch file to create a Visual Studio solution file for this project.
+:: See the cmake documentation for other generator targets
+::
+cmake -G "Visual Studio 15" ..\..\source && cmake-gui ..\..\source
x265_2.6.tar.gz/build/vc15-x86_64/build-all.bat Added
@@ -0,0 +1,14 @@
+@echo off
+if "%VS150COMNTOOLS%" == "" (
+  msg "%username%" "Visual Studio 15 not detected"
+  exit 1
+)
+if not exist x265.sln (
+  call make-solutions.bat
+)
+if exist x265.sln (
+  call "%VS150COMNTOOLS%\..\..\VC\vcvarsall.bat"
+  MSBuild /property:Configuration="Release" x265.sln
+  MSBuild /property:Configuration="Debug" x265.sln
+  MSBuild /property:Configuration="RelWithDebInfo" x265.sln
+)
x265_2.6.tar.gz/build/vc15-x86_64/make-solutions.bat Added
@@ -0,0 +1,6 @@
+@echo off
+::
+:: run this batch file to create a Visual Studio solution file for this project.
+:: See the cmake documentation for other generator targets
+::
+cmake -G "Visual Studio 15 Win64" ..\..\source && cmake-gui ..\..\source
x265_2.6.tar.gz/build/vc15-x86_64/multilib.bat Added
@@ -0,0 +1,44 @@
+@echo off
+if "%VS150COMNTOOLS%" == "" (
+  msg "%username%" "Visual Studio 15 not detected"
+  exit 1
+)
+
+call "%VS150COMNTOOLS%\..\..\VC\vcvarsall.bat"
+
+@mkdir 12bit
+@mkdir 10bit
+@mkdir 8bit
+
+@cd 12bit
+cmake -G "Visual Studio 15 Win64" ../../../source -DHIGH_BIT_DEPTH=ON -DEXPORT_C_API=OFF -DENABLE_SHARED=OFF -DENABLE_CLI=OFF -DMAIN12=ON
+if exist x265.sln (
+  MSBuild /property:Configuration="Release" x265.sln
+  copy/y Release\x265-static.lib ..\8bit\x265-static-main12.lib
+)
+
+@cd ..\10bit
+cmake -G "Visual Studio 15 Win64" ../../../source -DHIGH_BIT_DEPTH=ON -DEXPORT_C_API=OFF -DENABLE_SHARED=OFF -DENABLE_CLI=OFF
+if exist x265.sln (
+  MSBuild /property:Configuration="Release" x265.sln
+  copy/y Release\x265-static.lib ..\8bit\x265-static-main10.lib
+)
+
+@cd ..\8bit
+if not exist x265-static-main10.lib (
+  msg "%username%" "10bit build failed"
+  exit 1
+)
+if not exist x265-static-main12.lib (
+  msg "%username%" "12bit build failed"
+  exit 1
+)
+cmake -G "Visual Studio 15 Win64" ../../../source -DEXTRA_LIB="x265-static-main10.lib;x265-static-main12.lib" -DLINKED_10BIT=ON -DLINKED_12BIT=ON
+if exist x265.sln (
+  MSBuild /property:Configuration="Release" x265.sln
+  :: combine static libraries (ignore warnings caused by winxp.cpp hacks)
+  move Release\x265-static.lib x265-static-main.lib
+  LIB.EXE /ignore:4006 /ignore:4221 /OUT:Release\x265-static.lib x265-static-main.lib x265-static-main10.lib x265-static-main12.lib
+)
+
+pause
x265_2.5.tar.gz/doc/reST/api.rst -> x265_2.6.tar.gz/doc/reST/api.rst Changed
@@ -192,12 +192,36 @@
     *      presets is not recommended without a more fine-grained breakdown of
     *      parameters to take this into account. */
    int x265_encoder_reconfig(x265_encoder *, x265_param *);
-**x265_encoder_ctu_info**
-       /* x265_encoder_ctu_info:
-        *    Copy CTU information such as ctu address and ctu partition structure of all
-        *    CTUs in each frame. The function is invoked only if "--ctu-info" is enabled and
-        *    the encoder will wait for this copy to complete if enabled.
-        */
+
+**x265_get_slicetype_poc_and_scenecut()** may be used to fetch slice type, poc and scene cut information mid-encode::
+
+    /* x265_get_slicetype_poc_and_scenecut:
+     *     get the slice type, poc and scene cut information for the current frame,
+     *     returns negative on error, 0 on success.
+     *     This API must be called after the (poc >= lookaheadDepth + bframes + 2) condition check. */
+     int x265_get_slicetype_poc_and_scenecut(x265_encoder *encoder, int *slicetype, int *poc, int* sceneCut);
+
+**x265_get_ref_frame_list()** may be used to fetch the forward and backward reference lists::
+
+    /* x265_get_ref_frame_list:
+     *     returns negative on error, 0 when access units were output.
+     *     This API must be called after the (poc >= lookaheadDepth + bframes + 2) condition check */
+     int x265_get_ref_frame_list(x265_encoder *encoder, x265_picyuv**, x265_picyuv**, int, int);
+
+**x265_encoder_ctu_info** may be used to provide additional CTU-specific information to the encoder::
+
+    /* x265_encoder_ctu_info:
+     *    Copy CTU information such as ctu address and ctu partition structure of all
+     *    CTUs in each frame. The function is invoked only if "--ctu-info" is enabled and
+     *    the encoder will wait for this copy to complete if enabled.*/
+    int x265_encoder_ctu_info(x265_encoder *encoder, int poc, x265_ctu_info_t** ctu);
+
+**x265_set_analysis_data()** may be used to receive analysis information from an external application::
+
+    /* x265_set_analysis_data:
+     *     set the analysis data. The incoming analysis_data structure is assumed to be AVC-sized blocks.
+     *     returns negative on error, 0 when access units were output.*/
+     int x265_set_analysis_data(x265_encoder *encoder, x265_analysis_data *analysis_data, int poc, uint32_t cuBytes);
 
 Pictures
 ========
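Note on the api.rst changes above: both new query functions carry the precondition (poc >= lookaheadDepth + bframes + 2). A minimal sketch of that check follows. The helper name canQueryFrameInfo is hypothetical and not part of the x265 API; its parameters stand in for the corresponding encoder settings.

```cpp
// Hypothetical helper (not part of x265): evaluates the precondition quoted
// in the API comments, poc >= lookaheadDepth + bframes + 2.
inline bool canQueryFrameInfo(int poc, int lookaheadDepth, int bframes)
{
    return poc >= lookaheadDepth + bframes + 2;
}
```

An integrating application could guard calls to x265_get_slicetype_poc_and_scenecut() or x265_get_ref_frame_list() with a check of this kind.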
x265_2.5.tar.gz/doc/reST/cli.rst -> x265_2.6.tar.gz/doc/reST/cli.rst Changed
@@ -399,6 +399,18 @@
 
    Default: 1 slice per frame. **Experimental feature**
 
+.. option:: --copy-pic, --no-copy-pic
+
+   Allow encoder to copy input x265 pictures to internal frame buffers. When disabled,
+   x265 will not make an internal copy of the input picture and will work with the
+   application's buffers. While this allows for deeper integration, it is the responsibility
+   of the application to ensure that (a) the allocated picture has extra space for padding
+   that will be done by the library, and (b) the buffers aren't recycled until the library
+   has completed encoding this frame (which can be figured out by tracking NALs output by x265).
+
+   Default: enabled
+
+
 Input/Output File Options
 =========================
 
@@ -875,17 +887,26 @@
 
    Note that --analysis-reuse-level must be paired with analysis-reuse-mode.
 
-   +--------+-----------------------------------------+
-   | Level  | Description                             |
-   +========+=========================================+
-   | 1      | Lookahead information                   |
-   +--------+-----------------------------------------+
-   | 2 to 4 | Level 1 + intra/inter modes, ref's      |
-   +--------+-----------------------------------------+
-   | 5 to 9 | Level 2 + rect-amp                      |
-   +--------+-----------------------------------------+
-   | 10     | Level 5 + Full CU analysis-info         |
-   +--------+-----------------------------------------+
+    +--------------+------------------------------------------+
+    | Level        | Description                              |
+    +==============+==========================================+
+    | 1            | Lookahead information                    |
+    +--------------+------------------------------------------+
+    | 2 to 4       | Level 1 + intra/inter modes, ref's       |
+    +--------------+------------------------------------------+
+    | 5,6 and 9    | Level 2 + rect-amp                       |
+    +--------------+------------------------------------------+
+    | 7            | Level 5 + AVC size CU refinement         |
+    +--------------+------------------------------------------+
+    | 8            | Level 5 + AVC size Full CU analysis-info |
+    +--------------+------------------------------------------+
+    | 10           | Level 5 + Full CU analysis-info          |
+    +--------------+------------------------------------------+
+
+.. option:: --refine-mv-type <string>
+
+    Reuse MV information received through API call. Currently receives information for AVC size and the accepted
+    string input is "avc". Default is disabled.
 
 .. option:: --scale-factor
 
@@ -893,28 +914,44 @@
        This option should be coupled with analysis-reuse-mode option, --analysis-reuse-level 10.
        The ctu size of load should be double the size of save. Default 0.
 
-.. option:: --refine-intra <0|1|2>
+.. option:: --refine-intra <0..3>
 
    Enables refinement of intra blocks in current encode.
 
-   Level 0 - Forces both mode and depth from the previous encode.
+   Level 0 - Forces both mode and depth from the save encode.
+
+   Level 1 - Evaluates all intra modes at current depth(n) and at depth
+   (n+1) when current block size is one greater than the min-cu-size.
+   Forces modes for larger blocks.
 
-   Level 1 - Evaluates all intra modes for blocks of size one smaller than
-   the min-cu-size of the incoming analysis data from the previous encode,
-   forces modes for blocks of larger size.
+   Level 2 - In addition to the functionality of level 1, at all depths, force
+   (a) only depth when angular mode is chosen by the save encode.
+   (b) depth and mode when other intra modes are chosen by the save encode.
 
-   Level 2 - Evaluates all intra modes for blocks of size one smaller than
-   the min-cu-size of the incoming analysis data from the previous encode.
-   For larger blocks, force only depth when angular mode is chosen by the
-   previous encode, force depth and mode when other intra modes are chosen.
+   Level 3 - Perform analysis of intra modes for depth reused from first encode.
 
    Default 0.
 
-.. option:: --refine-inter-depth
+.. option:: --refine-inter <0..3>
 
-   Enables refinement of inter blocks in current encode. Evaluates all
-   inter modes for blocks of size one smaller than the min-cu-size of the
-   incoming analysis data from the previous encode. Default disabled.
+   Enables refinement of inter blocks in current encode.
+
+   Level 0 - Forces both mode and depth from the save encode.
+
+   Level 1 - Evaluates all inter modes at current depth(n) and at depth
+   (n+1) when current block size is one greater than the min-cu-size.
+   Forces modes for larger blocks.
+
+   Level 2 - In addition to the functionality of level 1, restricts the modes
+   evaluated when specific modes are decided as the best mode by the save encode.
+
+   2nx2n in save encode - disable re-evaluation of rect and amp.
+
+   skip in save encode  - re-evaluates only skip, merge and 2nx2n modes.
+
+   Level 3 - Perform analysis of inter modes while reusing depths from the save encode.
+
+   Default 0.
 
 .. option:: --refine-mv
 
@@ -1405,6 +1442,16 @@
 .. option:: --b-pyramid, --no-b-pyramid
 
    Use B-frames as references, when possible. Default enabled
+
+.. option:: --force-flush <integer>
+
+   Force the encoder to flush frames. Default is 0.
+
+   Values:
+   0 - flush the encoder only when all the input pictures are over.
+   1 - flush all the frames even when the input is not over.
+       slicetype decision may change with this option.
+   2 - flush the slicetype decided frames only.
 
 Quality, rate control and rate distortion options
 =================================================
@@ -1470,6 +1517,24 @@
    Default 0.9
 
    **Range of values:** fractional: 0 - 1.0, or kbits: 2 .. bufsize
+
+.. option:: --vbv-end <float>
+
+   Final buffer emptiness. The portion of the decode buffer that must be
+   available after all the specified frames have been inserted into the
+   decode buffer. Specified as a fractional value between 0 and 1, or in
+   kbits. Default 0 (disabled)
+
+   This enables basic support for chunk-parallel encoding where each segment
+   can specify the starting and ending state of the VBV buffer so that VBV
+   compliance can be maintained when chunks are independently encoded and
+   stitched together.
+
+.. option:: --vbv-end-fr-adj <float>
+
+   Frame from which qp has to be adjusted to achieve final decode buffer
+   emptiness. Specified as a fraction of the total frames. Fractions > 0 are
+   supported only when the total number of frames is known. Default 0.
 
 .. option:: --qp, -q <integer>
 
@@ -1529,7 +1594,7 @@
    Enable adaptive quantization for sub-CTUs. This parameter specifies
    the minimum CU size at which QP can be adjusted, ie. Quantization Group
    size. Allowed range of values are 64, 32, 16, 8 provided this falls within
-   the inclusive range [maxCUSize, minCUSize]. Experimental.
+   the inclusive range [maxCUSize, minCUSize].
    Default: same as maxCUSize
 
 .. option:: --cutree, --no-cutree
@@ -1618,7 +1683,7 @@
    conservative, waiting until there is enough feedback in terms of
    encoded frames to control QP. strict-cbr allows the encoder to be
    more aggressive in hitting the target bitrate even for short segment
-   videos. Experimental.
+   videos.
 
 .. option:: --cbqpoffs <integer>
 
@@ -1878,7 +1943,7 @@
    undefined (not signaled)
 
    1. bt709
-   2. undef
+   2. unknown
    3. **reserved**
    4. bt470m
    5. bt470bg
@@ -1886,13 +1951,16 @@
    7. smpte240m
    8. film
    9. bt2020
+    10. smpte428
+    11. smpte431
+    12. smpte432
 
 .. option:: --transfer <integer|string>
 
    Specify transfer characteristics. Default undefined (not signaled)
 
    1. bt709
-   2. undef
+   2. unknown
    3. **reserved**
    4. bt470m
    5. bt470bg
@@ -1906,8 +1974,8 @@
    13. iec61966-2-1
    14. bt2020-10
    15. bt2020-12
-   16. smpte-st-2084
-   17. smpte-st-428
+   16. smpte2084
+   17. smpte428
    18. arib-std-b67
 
 .. option:: --colormatrix <integer|string>
@@ -1926,6 +1994,10 @@
    8. YCgCo
    9. bt2020nc
    10. bt2020c
+    11. smpte2085
+    12. chroma-derived-nc
+    13. chroma-derived-c
+    14. ictcp
 
 .. option:: --chromaloc <0..5>
 
@@ -1976,15 +2048,15 @@
 .. option:: --hdr, --no-hdr
 
    Force signalling of HDR parameters in SEI packets. Enabled
-   automatically when :option`--master-display` or :option`--max-cll` is
+   automatically when :option:`--master-display` or :option:`--max-cll` is
    specified. Useful when there is a desire to signal 0 values for max-cll
    and max-fall. Default disabled.
 
 .. option:: --hdr-opt, --no-hdr-opt
 
    Add luma and chroma offsets for HDR/WCG content.
-   Input video should be 10 bit 4:2:0. Applicable for HDR content.
-   Default disabled. **Experimental Feature**
+   Input video should be 10 bit 4:2:0. Applicable for HDR content. It is recommended
+   that AQ-mode be enabled along with this feature. Default disabled.
 
 .. option:: --dhdr10-info <filename>
 
@@ -2004,12 +2076,12 @@
 .. option:: --min-luma <integer>
 
    Minimum luma value allowed for input pictures. Any values below min-luma
-   are clipped. Experimental. No default.
+   are clipped.  No default.
 
 .. option:: --max-luma <integer>
 
    Maximum luma value allowed for input pictures. Any values above max-luma
-   are clipped. Experimental. No default.
+   are clipped.  No default.
 
 Bitstream options
 =================
@@ -2091,12 +2163,12 @@
 .. option:: --opt-qp-pps, --no-opt-qp-pps
 
    Optimize QP in PPS (instead of default value of 26) based on the QP values
-   observed in last GOP. Default enabled.
+   observed in last GOP. Default disabled.
 
 .. option:: --opt-ref-list-length-pps, --no-opt-ref-list-length-pps
 
    Optimize L0 and L1 ref list length in PPS (instead of default value of 0)
-   based on the lengths observed in the last GOP. Default enabled.
+   based on the lengths observed in the last GOP. Default disabled.
 
 .. option:: --multi-pass-opt-rps, --no-multi-pass-opt-rps
 
@@ -2109,6 +2181,21 @@
 
    Only effective at RD levels 5 and 6
 
+DCT Approximations
+==================
+
+.. option:: --lowpass-dct
+
+    If enabled, x265 will use low-pass subband dct approximation instead of the
+    standard dct for 16x16 and 32x32 blocks. This approximation is less computationally
+    intensive but it generates truncated coefficient matrices for the transformed block.
+    Empirical analysis shows marginal loss in compression and performance gains up to 10%,
+    particularly at moderate bit-rates.
+
+    This approximation should be considered for platforms with performance and time
+    constraints.
+
+    Default disabled. **Experimental feature**
 
 Debugging options
 =================
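Note on --vbv-end in the cli.rst changes above: the option text says the value is either a fraction between 0 and 1 or a size in kbits. The sketch below shows one plausible way to normalise such a value to a fraction. It is an assumption, not code from this diff: the helper name vbvEndAsFraction is hypothetical, and treating any value above 1 as kbits to be divided by the VBV buffer size is a guess at the intended rule.

```cpp
#include <algorithm>

// Hypothetical helper (not x265 code). Assumption: a value above 1 is taken
// as kbits and converted to a fraction of the VBV buffer size; values in
// [0, 1] are already fractions. The result is clamped to [0, 1].
inline double vbvEndAsFraction(double vbvEnd, double vbvBufSizeKbits)
{
    if (vbvEnd > 1.0 && vbvBufSizeKbits > 0.0)
        vbvEnd /= vbvBufSizeKbits;
    return std::min(1.0, std::max(0.0, vbvEnd));
}
```

Under this assumption, a chunked encode could pass the same normalised value as the end state of one segment and the initial state (--vbv-init) of the next.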
x265_2.5.tar.gz/doc/reST/releasenotes.rst -> x265_2.6.tar.gz/doc/reST/releasenotes.rst Changed
@@ -2,6 +2,37 @@
 Release Notes
 *************
 
+Version 2.6
+===========
+
+Release date - 29th November, 2017.
+
+New features
+------------
+1. x265 can now refine analysis from a previous HEVC encode (using options :option:`--refine-inter`, and :option:`--refine-intra`), or a previous AVC encode (using option :option:`--refine-mv-type`). The previous encode's information can be packaged using the *x265_analysis_data_t*  data field available in the *x265_picture* object.
+2. Basic support for segmented (or chunked) encoding added with :option:`--vbv-end` that can specify the status of CPB at the end of a segment. String this together with :option:`--vbv-init` to encode a title as chunks while maintaining VBV compliance!
+3. :option:`--force-flush` can be used to trigger a premature flush of the encoder. This option is beneficial when input is known to be bursty, and may be at a rate slower than the encoder.
+4. Experimental feature :option:`--lowpass-dct` that uses truncated DCT for transformation.
+
+Encoder enhancements
+--------------------
+1. Slice-parallel mode gets a significant boost in performance, particularly in low-latency mode.
+2. x265 now officially supported on VS2017.
+3. x265 now supports all depths from mono0 to mono16 for Y4M format.
+
+API changes
+-----------
+1. Options that modified PPS dynamically (:option:`--opt-qp-pps` and :option:`--opt-ref-list-length-pps`) are now disabled by default to enable users to save bits by not sending headers. If these options are enabled, headers have to be repeated for every GOP.
+2. Rate-control and analysis parameters can dynamically be reconfigured simultaneously via the *x265_encoder_reconfig* API.
+3. New API functions to extract intermediate information such as slice-type, scenecut information, reference frames, etc. are now available. This information may be beneficial to integrating applications that are attempting to perform content-adaptive encoding. Refer to documentation on *x265_get_slicetype_poc_and_scenecut*, and *x265_get_ref_frame_list* for more details and suggested usage.
+4. A new API to pass supplemental CTU information to x265 to influence analysis decisions has been added. Refer to documentation on *x265_encoder_ctu_info* for more details.
+
+Bug fixes
+---------
+1. Bug fixes when :option:`--slices` is used with VBV settings.
+2. Minor memory leak fixed for HDR10+ builds, and default x265 when pools option is specified.
+3. HDR10+ bug fix to remove dependence on poc counter to select meta-data information.
+
 Version 2.5
 ===========
 
x265_2.5.tar.gz/source/CMakeLists.txt -> x265_2.6.tar.gz/source/CMakeLists.txt Changed
@@ -29,7 +29,7 @@
 option(STATIC_LINK_CRT "Statically link C runtime for release builds" OFF)
 mark_as_advanced(FPROFILE_USE FPROFILE_GENERATE NATIVE_BUILD)
 # X265_BUILD must be incremented each time the public API is changed
-set(X265_BUILD 130)
+set(X265_BUILD 146)
 configure_file("${PROJECT_SOURCE_DIR}/x265.def.in"
                "${PROJECT_BINARY_DIR}/x265.def")
 configure_file("${PROJECT_SOURCE_DIR}/x265_config.h.in"
@@ -184,6 +184,14 @@
 endif()
 # this option is to enable the inclusion of dynamic HDR10 library to the libx265 compilation
 option(ENABLE_HDR10_PLUS "Enable dynamic HDR10 compilation" OFF)
+if(MSVC AND (MSVC_VERSION LESS 1800) AND ENABLE_HDR10_PLUS)
+    message(FATAL_ERROR "MSVC version 12.0 or above required to support hdr10plus")
+endif()
+if(WIN32 AND (MSVC_VERSION GREATER 1800))
+    if(CMAKE_VERSION VERSION_LESS 3.7)
+        message(FATAL_ERROR "cmake version not compatible for VS 2017. Update the cmake to versions 3.7 or above")
+    endif()
+endif()
 if(GCC)
     add_definitions(-Wall -Wextra -Wshadow)
     add_definitions(-D__STDC_LIMIT_MACROS=1)
@@ -539,6 +547,13 @@
 endif()
 install(FILES x265.h "${PROJECT_BINARY_DIR}/x265_config.h" DESTINATION include)
 
+if(WIN32)
+    install(FILES "${PROJECT_BINARY_DIR}/Debug/x265.pdb" DESTINATION ${BIN_INSTALL_DIR} CONFIGURATIONS Debug)
+    install(FILES "${PROJECT_BINARY_DIR}/RelWithDebInfo/x265.pdb" DESTINATION ${BIN_INSTALL_DIR} CONFIGURATIONS RelWithDebInfo)
+    install(FILES "${PROJECT_BINARY_DIR}/Debug/libx265.pdb" DESTINATION ${BIN_INSTALL_DIR} CONFIGURATIONS Debug OPTIONAL NAMELINK_ONLY)
+    install(FILES "${PROJECT_BINARY_DIR}/RelWithDebInfo/libx265.pdb" DESTINATION ${BIN_INSTALL_DIR} CONFIGURATIONS RelWithDebInfo OPTIONAL NAMELINK_ONLY)
+endif()
+
 if(CMAKE_RC_COMPILER)
     # The resource compiler does not need CFLAGS or macro defines. It
     # often breaks them
@@ -639,13 +654,11 @@
             DESTINATION "${LIB_INSTALL_DIR}/pkgconfig")
 endif()
 
-if(NOT WIN32)
-    configure_file("${CMAKE_CURRENT_SOURCE_DIR}/cmake/cmake_uninstall.cmake.in"
-                   "${CMAKE_CURRENT_BINARY_DIR}/cmake/cmake_uninstall.cmake"
-                   IMMEDIATE @ONLY)
-    add_custom_target(uninstall
-                      "${CMAKE_COMMAND}" -P "${CMAKE_CURRENT_BINARY_DIR}/cmake/cmake_uninstall.cmake")
-endif()
+configure_file("${CMAKE_CURRENT_SOURCE_DIR}/cmake/cmake_uninstall.cmake.in"
+               "${CMAKE_CURRENT_BINARY_DIR}/cmake/cmake_uninstall.cmake"
+               IMMEDIATE @ONLY)
+add_custom_target(uninstall
+                  "${CMAKE_COMMAND}" -P "${CMAKE_CURRENT_BINARY_DIR}/cmake/cmake_uninstall.cmake")
 
 # Main CLI application
 set(ENABLE_CLI ON CACHE BOOL "Build standalone CLI application")
x265_2.5.tar.gz/source/cmake/cmake_uninstall.cmake.in -> x265_2.6.tar.gz/source/cmake/cmake_uninstall.cmake.in Changed
@@ -17,3 +17,7 @@
         message(STATUS "File '$ENV{DESTDIR}${file}' does not exist.")
     endif()
 endforeach(file)
+
+if(EXISTS "${CMAKE_CURRENT_BINARY_DIR}/install_manifest.txt")
+    file(REMOVE "${CMAKE_CURRENT_BINARY_DIR}/install_manifest.txt")
+endif()
x265_2.5.tar.gz/source/common/CMakeLists.txt -> x265_2.6.tar.gz/source/common/CMakeLists.txt Changed
@@ -131,7 +131,7 @@
 add_library(common OBJECT
     ${ASM_PRIMITIVES} ${VEC_PRIMITIVES} ${ALTIVEC_PRIMITIVES} ${WINXP}
     primitives.cpp primitives.h
-    pixel.cpp dct.cpp ipfilter.cpp intrapred.cpp loopfilter.cpp
+    pixel.cpp dct.cpp lowpassdct.cpp ipfilter.cpp intrapred.cpp loopfilter.cpp
     constants.cpp constants.h
     cpu.cpp cpu.h version.cpp
     threading.cpp threading.h
x265_2.5.tar.gz/source/common/common.h -> x265_2.6.tar.gz/source/common/common.h Changed
@@ -207,7 +207,6 @@
 
 // arbitrary, but low because SATD scores are 1/4 normal
 #define X265_LOOKAHEAD_QP (12 + QP_BD_OFFSET)
-#define X265_LOOKAHEAD_MAX 250
 
 // Use the same size blocks as x264.  Using larger blocks seems to give artificially
 // high cost estimates (intra and inter both suffer)
x265_2.5.tar.gz/source/common/cudata.cpp -> x265_2.6.tar.gz/source/common/cudata.cpp Changed
@@ -201,6 +201,8 @@
         m_cuDepth            = charBuf; charBuf += m_numPartitions;
         m_predMode           = charBuf; charBuf += m_numPartitions; /* the order up to here is important in initCTU() and initSubCU() */
         m_partSize           = charBuf; charBuf += m_numPartitions;
+        m_skipFlag[0]        = charBuf; charBuf += m_numPartitions;
+        m_skipFlag[1]        = charBuf; charBuf += m_numPartitions;
         m_mergeFlag          = charBuf; charBuf += m_numPartitions;
         m_interDir           = charBuf; charBuf += m_numPartitions;
         m_mvpIdx[0]          = charBuf; charBuf += m_numPartitions;
@@ -239,6 +241,8 @@
         m_cuDepth            = charBuf; charBuf += m_numPartitions;
         m_predMode           = charBuf; charBuf += m_numPartitions; /* the order up to here is important in initCTU() and initSubCU() */
         m_partSize           = charBuf; charBuf += m_numPartitions;
+        m_skipFlag[0]        = charBuf; charBuf += m_numPartitions;
+        m_skipFlag[1]        = charBuf; charBuf += m_numPartitions;
         m_mergeFlag          = charBuf; charBuf += m_numPartitions;
         m_interDir           = charBuf; charBuf += m_numPartitions;
         m_mvpIdx[0]          = charBuf; charBuf += m_numPartitions;
x265_2.5.tar.gz/source/common/cudata.h -> x265_2.6.tar.gz/source/common/cudata.h Changed
@@ -199,13 +199,14 @@
     uint8_t*      m_predMode;         // array of prediction modes
     uint8_t*      m_partSize;         // array of partition sizes
     uint8_t*      m_mergeFlag;        // array of merge flags
+    uint8_t*      m_skipFlag[2];
     uint8_t*      m_interDir;         // array of inter directions
     uint8_t*      m_mvpIdx[2];        // array of motion vector predictor candidates or merge candidate indices [0]
     uint8_t*      m_tuDepth;          // array of transform indices
     uint8_t*      m_transformSkip[3]; // array of transform skipping flags per plane
     uint8_t*      m_cbf[3];           // array of coded block flags (CBF) per plane
     uint8_t*      m_chromaIntraDir;   // array of intra directions (chroma)
-    enum { BytesPerPartition = 21 };  // combined sizeof() of all per-part data
+    enum { BytesPerPartition = 23 };  // combined sizeof() of all per-part data
 
     sse_t*        m_distortion;
     coeff_t*      m_trCoeff[3];       // transformed coefficient buffer per plane
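Note on the cudata.cpp and cudata.h changes above: BytesPerPartition moves from 21 to 23, which matches the two m_skipFlag slices newly carved out of charBuf. The sketch below illustrates that carving pattern on a reduced, hypothetical struct (PartArrays and carve are illustrative names, not x265 code) to show why each added per-partition array adds one byte per partition.

```cpp
#include <stddef.h>
#include <stdint.h>

// Simplified stand-in for the per-partition arrays; not the real CUData layout.
struct PartArrays
{
    uint8_t* partSize;
    uint8_t* skipFlag[2];
    uint8_t* mergeFlag;
    enum { BytesPerPartition = 4 }; // one byte per partition for each array above
};

// Carve consecutive numPartitions-sized slices out of buf, in the same style
// as the charBuf assignments in the diff, and return the bytes consumed.
inline size_t carve(PartArrays& a, uint8_t* buf, size_t numPartitions)
{
    uint8_t* p = buf;
    a.partSize    = p; p += numPartitions;
    a.skipFlag[0] = p; p += numPartitions;
    a.skipFlag[1] = p; p += numPartitions;
    a.mergeFlag   = p; p += numPartitions;
    return static_cast<size_t>(p - buf);
}
```

If the constant and the number of carved slices disagree, the last slices would run past the allocation, which is why the two files have to change together.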
x265_2.5.tar.gz/source/common/frame.cpp -> x265_2.6.tar.gz/source/common/frame.cpp Changed
@@ -77,7 +77,15 @@
         }
     }
 
-    if (m_fencPic->create(param) && m_lowres.create(m_fencPic, param->bframes, !!param->rc.aqMode || !!param->bAQMotion, param->rc.qgSize))
+    if (param->bMVType == AVC_INFO)
+    {
+        m_analysisData.wt = NULL;
+        m_analysisData.intraData = NULL;
+        m_analysisData.interData = NULL;
+        m_analysis2Pass.analysisFramedata = NULL;
+    }
+
+    if (m_fencPic->create(param, !!m_param->bCopyPicToFrame) && m_lowres.create(m_fencPic, param->bframes, !!param->rc.aqMode || !!param->bAQMotion, param->rc.qgSize))
     {
         X265_CHECK((m_reconColCount == NULL), "m_reconColCount was initialized");
         m_numRows = (m_fencPic->m_picHeight + param->maxCUSize - 1)  / param->maxCUSize;
@@ -150,7 +158,8 @@
 
     if (m_fencPic)
     {
-        m_fencPic->destroy();
+        if (m_param->bCopyPicToFrame)
+            m_fencPic->destroy();
         delete m_fencPic;
         m_fencPic = NULL;
     }
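Note on the frame.cpp change above: the picture buffers appear to be destroyed only when bCopyPicToFrame is set, that is, only when the encoder made its own copy rather than working with the application's buffers. A minimal sketch of that ownership rule follows. PicBuffer is a hypothetical type, not the x265 PicYuv class, and it folds the check into destroy() instead of doing it at the call site as the diff does.

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical stand-in for a picture wrapper; not the x265 PicYuv class.
struct PicBuffer
{
    uint8_t* pixels; // plane storage, either allocated here or borrowed
    bool     owned;  // true when this object allocated the storage

    PicBuffer() : pixels(nullptr), owned(false) {}

    // copyInput plays the role of bCopyPicToFrame: allocate private storage
    // when copying, otherwise borrow the caller's buffer.
    bool create(size_t bytes, bool copyInput, uint8_t* appBuffer)
    {
        owned = copyInput;
        pixels = copyInput ? new uint8_t[bytes] : appBuffer;
        return pixels != nullptr;
    }

    // Release the storage only if create() allocated it.
    void destroy()
    {
        if (owned)
            delete[] pixels;
        pixels = nullptr;
        owned = false;
    }
};
```

With copyInput false the application's buffer is left untouched by destroy(), which matches the --no-copy-pic requirement that the application keeps its buffers alive until the frame has been encoded.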
x265_2.5.tar.gz/source/common/frame.h -> x265_2.6.tar.gz/source/common/frame.h Changed
@@ -98,6 +98,7 @@
 
     float*                 m_quantOffsets;       // points to quantOffsets in x265_picture
     x265_sei               m_userSEI;
+    Event                  m_reconEncoded;
 
     /* Frame Parallelism - notification between FrameEncoders of available motion reference rows */
     ThreadSafeInteger*     m_reconRowFlag;       // flag of CTU rows completely reconstructed and extended for motion reference
@@ -112,6 +113,8 @@
     x265_analysis_2Pass    m_analysis2Pass;
     RcStats*               m_rcData;
 
+    Event                  m_copyMVType;
+
     x265_ctu_info_t**      m_ctuInfo;
     Event                  m_copied;
     int*                   m_prevCtuInfoChange;
x265_2.5.tar.gz/source/common/framedata.h -> x265_2.6.tar.gz/source/common/framedata.h Changed
@@ -195,6 +195,7 @@
     uint8_t*    mvpIdx[2];
     int8_t*     refIdx[2];
     MV*         mv[2];
+   int64_t*     sadCost;
 };
 
 struct analysis2PassFrameData
x265_2.6.tar.gz/source/common/lowpassdct.cpp Added
@@ -0,0 +1,127 @@
+/*****************************************************************************
+ * Copyright (C) 2017
+ *
+ * Authors: Humberto Ribeiro Filho <mont3z.claro5@gmail.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02111, USA.
+ *
+ * This program is also available under a commercial proprietary license.
+ * For more information, contact us at license @ x265.com.
+ *****************************************************************************/
+
+#include "common.h"
+#include "primitives.h"
+
+using namespace X265_NS;
+
+/* standard dct transformations */
+static dct_t* s_dct4x4;
+static dct_t* s_dct8x8;
+static dct_t* s_dct16x16;
+
+static void lowPassDct8_c(const int16_t* src, int16_t* dst, intptr_t srcStride)
+{
+    ALIGN_VAR_32(int16_t, coef[4 * 4]);
+    ALIGN_VAR_32(int16_t, avgBlock[4 * 4]);
+    int16_t totalSum = 0;
+    int16_t sum = 0;
+
+    for (int i = 0; i < 4; i++)
+        for (int j =0; j < 4; j++)
+        {
+            // Calculate average of 2x2 cells
+            sum = src[2*i*srcStride + 2*j] + src[2*i*srcStride + 2*j + 1]
+                    + src[(2*i+1)*srcStride + 2*j] + src[(2*i+1)*srcStride + 2*j + 1];
+            avgBlock[i*4 + j] = sum >> 2;
+
+            totalSum += sum; // use to calculate total block average
+        }
+
+    //dct4
+    (*s_dct4x4)(avgBlock, coef, 4);
+    memset(dst, 0, 64 * sizeof(int16_t));
+    for (int i = 0; i < 4; i++)
+    {
+        memcpy(&dst[i * 8], &coef[i * 4], 4 * sizeof(int16_t));
+    }
+
+    // replace first coef with total block average
+    dst[0] = totalSum << 1;
+}
+
+static void lowPassDct16_c(const int16_t* src, int16_t* dst, intptr_t srcStride)
+{
+    ALIGN_VAR_32(int16_t, coef[8 * 8]);
+    ALIGN_VAR_32(int16_t, avgBlock[8 * 8]);
+    int32_t totalSum = 0;
+    int16_t sum = 0;
+    for (int i = 0; i < 8; i++)
+        for (int j =0; j < 8; j++)
+        {
+            sum = src[2*i*srcStride + 2*j] + src[2*i*srcStride + 2*j + 1]
+                    + src[(2*i+1)*srcStride + 2*j] + src[(2*i+1)*srcStride + 2*j + 1];
+            avgBlock[i*8 + j] = sum >> 2;
+
+            totalSum += sum;
+        }
+
+    (*s_dct8x8)(avgBlock, coef, 8);
+    memset(dst, 0, 256 * sizeof(int16_t));
+    for (int i = 0; i < 8; i++)
+    {
+        memcpy(&dst[i * 16], &coef[i * 8], 8 * sizeof(int16_t));
+    }
+    dst[0] = static_cast<int16_t>(totalSum >> 1);
+}
+
+static void lowPassDct32_c(const int16_t* src, int16_t* dst, intptr_t srcStride)
+{
+    ALIGN_VAR_32(int16_t, coef[16 * 16]);
+    ALIGN_VAR_32(int16_t, avgBlock[16 * 16]);
+    int32_t totalSum = 0;
+    int16_t sum = 0;
+    for (int i = 0; i < 16; i++)
+        for (int j =0; j < 16; j++)
+        {
+            sum = src[2*i*srcStride + 2*j] + src[2*i*srcStride + 2*j + 1]
+                    + src[(2*i+1)*srcStride + 2*j] + src[(2*i+1)*srcStride + 2*j + 1];
+            avgBlock[i*16 + j] = sum >> 2;
+
+            totalSum += sum;
+        }
+
+    (*s_dct16x16)(avgBlock, coef, 16);
+    memset(dst, 0, 1024 * sizeof(int16_t));
+    for (int i = 0; i < 16; i++)
+    {
+        memcpy(&dst[i * 32], &coef[i * 16], 16 * sizeof(int16_t));
+    }
+    dst[0] = static_cast<int16_t>(totalSum >> 3);
+}
+
+namespace X265_NS {
+// x265 private namespace
+
+void setupLowPassPrimitives_c(EncoderPrimitives& p)
+{
+    s_dct4x4 = &(p.cu[BLOCK_4x4].standard_dct);
+    s_dct8x8 = &(p.cu[BLOCK_8x8].standard_dct);
+    s_dct16x16 = &(p.cu[BLOCK_16x16].standard_dct);
+
+    p.cu[BLOCK_8x8].lowpass_dct = lowPassDct8_c;
+    p.cu[BLOCK_16x16].lowpass_dct = lowPassDct16_c;
+    p.cu[BLOCK_32x32].lowpass_dct = lowPassDct32_c;
+}
+}
129
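Note on the low-pass transforms above: each one averages 2x2 groups of the input residual into a half-size block, runs the standard DCT on that block, copies the result into the top-left quadrant of a zeroed output, and then overwrites the DC term from the running sum. The following is a minimal sketch of the averaging and embedding steps only, assuming hypothetical helper names `downsample2x2` and `embedLowFrequencies` that do not exist in x265; the DCT call and the DC rescaling are left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of x265): average each 2x2 group of a
// (2n x 2n) residual block into an n x n block. Returns the sum of all
// input samples so a caller could rescale the DC term afterwards.
int32_t downsample2x2(const int16_t* src, std::ptrdiff_t srcStride, int n, int16_t* avg)
{
    int32_t total = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
        {
            const int16_t* p = src + 2 * i * srcStride + 2 * j;
            int32_t sum = p[0] + p[1] + p[srcStride] + p[srcStride + 1];
            avg[i * n + j] = static_cast<int16_t>(sum >> 2);
            total += sum;
        }
    return total;
}

// Hypothetical helper: place an n x n coefficient block in the top-left
// corner of a zeroed (2n x 2n) block, so all high frequencies stay zero.
void embedLowFrequencies(const int16_t* coef, int n, int16_t* dst)
{
    std::memset(dst, 0, sizeof(int16_t) * 4 * n * n);
    for (int i = 0; i < n; i++)
        std::memcpy(dst + i * 2 * n, coef + i * n, sizeof(int16_t) * n);
}
```

The sketch accumulates each 2x2 sum in `int32_t` rather than the `int16_t` used in the diff, purely for headroom; that is a choice of this example, not a statement about the upstream code.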
x265_2.5.tar.gz/source/common/lowres.cpp -> x265_2.6.tar.gz/source/common/lowres.cpp Changed
@@ -160,6 +160,9 @@
 
     for (int i = 0; i < bframes + 2; i++)
         intraMbs[i] = 0;
+    if (origPic->m_param->rc.vbvBufferSize)
+        for (int i = 0; i < X265_LOOKAHEAD_MAX + 1; i++)
+            plannedType[i] = X265_TYPE_AUTO;
 
     /* downscale and generate 4 hpel planes for lookahead */
     primitives.frameInitLowres(origPic->m_picOrg[0],
x265_2.5.tar.gz/source/common/param.cpp -> x265_2.6.tar.gz/source/common/param.cpp Changed
@@ -157,6 +157,7 @@
     param->bEnableConstrainedIntra = 0;
     param->bEnableStrongIntraSmoothing = 1;
     param->bEnableFastIntra = 0;
+    param->bEnableSplitRdSkip = 0;
 
     /* Inter Coding tools */
     param->searchMethod = X265_HEX_SEARCH;
@@ -211,6 +212,8 @@
     param->rc.vbvMaxBitrate = 0;
     param->rc.vbvBufferSize = 0;
     param->rc.vbvBufferInit = 0.9;
+    param->vbvBufferEnd = 0;
+    param->vbvEndFrameAdjust = 0;
     param->rc.rfConstant = 28;
     param->rc.bitrate = 0;
     param->rc.qCompress = 0.6;
@@ -268,8 +271,8 @@
 
     param->bEmitVUITimingInfo   = 1;
     param->bEmitVUIHRDInfo      = 1;
-    param->bOptQpPPS            = 1;
-    param->bOptRefListLengthPPS = 1;
+    param->bOptQpPPS            = 0;
+    param->bOptRefListLengthPPS = 0;
     param->bOptCUDeltaQP        = 0;
     param->bAQMotion = 0;
     param->bHDROpt = 0;
@@ -285,6 +288,13 @@
     param->mvRefine = 0;
     param->bUseAnalysisFile = 1;
     param->csvfpt = NULL;
+    param->forceFlush = 0;
+    param->bDisableLookahead = 0;
+    param->bCopyPicToFrame = 1;
+
+    /* DCT Approximations */
+    param->bLowPassDct = 0;
+    param->bMVType = 0;
 }
 
 int x265_param_default_preset(x265_param* param, const char* preset, const char* tune)
@@ -971,8 +981,29 @@
         OPT("ctu-info") p->bCTUInfo = atoi(value);
         OPT("scale-factor") p->scaleFactor = atoi(value);
         OPT("refine-intra")p->intraRefine = atoi(value);
-        OPT("refine-inter")p->interRefine = atobool(value);
+        OPT("refine-inter")p->interRefine = atoi(value);
         OPT("refine-mv")p->mvRefine = atobool(value);
+        OPT("force-flush")p->forceFlush = atoi(value);
+        OPT("splitrd-skip") p->bEnableSplitRdSkip = atobool(value);
+       OPT("lowpass-dct") p->bLowPassDct = atobool(value);
+        OPT("vbv-end") p->vbvBufferEnd = atof(value);
+        OPT("vbv-end-fr-adj") p->vbvEndFrameAdjust = atof(value);
+        OPT("copy-pic") p->bCopyPicToFrame = atobool(value);
+        OPT("refine-mv-type")
+        {
+            if (strcmp(strdup(value), "avc") == 0)
+            {
+                p->bMVType = AVC_INFO;
+            }
+            else if (strcmp(strdup(value), "off") == 0)
+            {
+                p->bMVType = NO_INFO;
+            }
+            else
+            {
+                bError = true;
+            }
+         }
         else
             return X265_PARAM_BAD_NAME;
     }
@@ -1236,10 +1267,10 @@
           "Video Format must be component,"
           " pal, ntsc, secam, mac or undef");
     CHECK(param->vui.colorPrimaries < 0
-          || param->vui.colorPrimaries > 9
+          || param->vui.colorPrimaries > 12
           || param->vui.colorPrimaries == 3,
           "Color Primaries must be undef, bt709, bt470m,"
-          " bt470bg, smpte170m, smpte240m, film or bt2020");
+          " bt470bg, smpte170m, smpte240m, film, bt2020, smpte-st-428, smpte-rp-431 or smpte-eg-432");
     CHECK(param->vui.transferCharacteristics < 0
           || param->vui.transferCharacteristics > 18
           || param->vui.transferCharacteristics == 3,
@@ -1247,10 +1278,10 @@
           " smpte170m, smpte240m, linear, log100, log316, iec61966-2-4, bt1361e,"
           " iec61966-2-1, bt2020-10, bt2020-12, smpte-st-2084, smpte-st-428 or arib-std-b67");
     CHECK(param->vui.matrixCoeffs < 0
-          || param->vui.matrixCoeffs > 10
+          || param->vui.matrixCoeffs > 14
           || param->vui.matrixCoeffs == 3,
           "Matrix Coefficients must be undef, bt709, fcc, bt470bg, smpte170m,"
-          " smpte240m, GBR, YCgCo, bt2020nc or bt2020c");
+          " smpte240m, GBR, YCgCo, bt2020nc, bt2020c, smpte-st-2085, chroma-nc, chroma-c or ictcp");
     CHECK(param->vui.chromaSampleLocTypeTopField < 0
           || param->vui.chromaSampleLocTypeTopField > 5,
           "Chroma Sample Location Type Top Field must be 0-5");
@@ -1291,6 +1322,12 @@
           "Maximum local bit rate can not be less than zero");
     CHECK(param->rc.vbvBufferInit < 0,
           "Valid initial VBV buffer occupancy must be a fraction 0 - 1, or size in kbits");
+    CHECK(param->vbvBufferEnd < 0,
+        "Valid final VBV buffer emptiness must be a fraction 0 - 1, or size in kbits");
+    CHECK(param->vbvEndFrameAdjust < 0,
+        "Valid vbv-end-fr-adj must be a fraction 0 - 1");
+    CHECK(!param->totalFrames && param->vbvEndFrameAdjust,
+        "vbv-end-fr-adj cannot be enabled when total number of frames is unknown");
     CHECK(param->rc.bitrate < 0,
           "Target bitrate can not be less than zero");
     CHECK(param->rc.qCompress < 0.5 || param->rc.qCompress > 1.0,
@@ -1316,6 +1353,10 @@
         "Supported range for log2MaxPocLsb is 4 to 16");
     CHECK(param->bCTUInfo < 0 || (param->bCTUInfo != 0 && param->bCTUInfo != 1 && param->bCTUInfo != 2 && param->bCTUInfo != 4 && param->bCTUInfo != 6) || param->bCTUInfo > 6,
         "Supported values for bCTUInfo are 0, 1, 2, 4, 6");
+    CHECK(param->interRefine > 3 || param->interRefine < 0,
+        "Invalid refine-inter value, refine-inter levels 0 to 3 supported");
+    CHECK(param->intraRefine > 3 || param->intraRefine < 0,
+        "Invalid refine-intra value, refine-intra levels 0 to 3 supported");
 #if !X86_64
     CHECK(param->searchMethod == X265_SEA && (param->sourceWidth > 840 || param->sourceHeight > 480),
         "SEA motion search does not support resolutions greater than 480p in 32 bit build");
@@ -1410,9 +1451,15 @@
     }
 
     if (param->rc.vbvBufferSize)
-        x265_log(param, X265_LOG_INFO, "VBV/HRD buffer / max-rate / init    : %d / %d / %.3f\n",
-                 param->rc.vbvBufferSize, param->rc.vbvMaxBitrate, param->rc.vbvBufferInit);
-
+    {
+        if (param->vbvBufferEnd)
+            x265_log(param, X265_LOG_INFO, "VBV/HRD buffer / max-rate / init / end / fr-adj: %d / %d / %.3f / %.3f / %.3f\n",
+            param->rc.vbvBufferSize, param->rc.vbvMaxBitrate, param->rc.vbvBufferInit, param->vbvBufferEnd, param->vbvEndFrameAdjust);
+        else
+            x265_log(param, X265_LOG_INFO, "VBV/HRD buffer / max-rate / init    : %d / %d / %.3f\n",
+            param->rc.vbvBufferSize, param->rc.vbvMaxBitrate, param->rc.vbvBufferInit);
+    }
+    
     char buf[80] = { 0 };
     char tmp[40];
 #define TOOLOPT(FLAG, STR) if (FLAG) appendtool(param, buf, sizeof(buf), STR);
@@ -1429,6 +1476,7 @@
     TOOLOPT(param->bEnableRdRefine, "rd-refine");
     TOOLOPT(param->bEnableEarlySkip, "early-skip");
     TOOLOPT(param->bEnableRecursionSkip, "rskip");
+    TOOLOPT(param->bEnableSplitRdSkip, "splitrd-skip");
     TOOLVAL(param->noiseReductionIntra, "nr-intra=%d");
     TOOLVAL(param->noiseReductionInter, "nr-inter=%d");
     TOOLOPT(param->bEnableTSkipFast, "tskip-fast");
@@ -1444,6 +1492,8 @@
     TOOLVAL(param->lookaheadSlices, "lslices=%d");
     TOOLVAL(param->lookaheadThreads, "lthreads=%d")
     TOOLVAL(param->bCTUInfo, "ctu-info=%d");
+    if (param->bMVType == AVC_INFO)
+        TOOLOPT(param->bMVType, "refine-mv-type=avc");
     if (param->maxSlices > 1)
         TOOLVAL(param->maxSlices, "slices=%d");
     if (param->bEnableLoopFilter)
@@ -1558,6 +1608,7 @@
     BOOL(p->bEnableTSkipFast, "tskip-fast");
     BOOL(p->bCULossless, "cu-lossless");
     BOOL(p->bIntraInBFrames, "b-intra");
+    BOOL(p->bEnableSplitRdSkip, "splitrd-skip");
     s += sprintf(s, " rdpenalty=%d", p->rdPenalty);
     s += sprintf(s, " psy-rd=%.2f", p->psyRd);
     s += sprintf(s, " psy-rdoq=%.2f", p->psyRdoq);
@@ -1587,8 +1638,10 @@
         {
             s += sprintf(s, " vbv-maxrate=%d vbv-bufsize=%d vbv-init=%.1f",
                  p->rc.vbvMaxBitrate, p->rc.vbvBufferSize, p->rc.vbvBufferInit);
+            if (p->vbvBufferEnd)
+                s += sprintf(s, " vbv-end=%.1f vbv-end-fr-adj=%.1f", p->vbvBufferEnd, p->vbvEndFrameAdjust);
             if (p->rc.rateControlMode == X265_RC_CRF)
-                s += sprintf(s, " crf-max=%.1f crf-min=%.1f", p->rc.rfConstantMax, p->rc.rfConstantMin);
+                s += sprintf(s, " crf-max=%.1f crf-min=%.1f", p->rc.rfConstantMax, p->rc.rfConstantMin);   
         }
     }
     else if (p->rc.rateControlMode == X265_RC_CQP)
@@ -1665,6 +1718,9 @@
     s += sprintf(s, " refine-mv=%d", p->mvRefine);
     BOOL(p->bLimitSAO, "limit-sao");
     s += sprintf(s, " ctu-info=%d", p->bCTUInfo);
+    BOOL(p->bLowPassDct, "lowpass-dct");
+    s += sprintf(s, " refine-mv-type=%d", p->bMVType);
+    s += sprintf(s, " copy-pic=%d", p->bCopyPicToFrame);
 #undef BOOL
     return buf;
 }
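Note on the `refine-mv-type` branch above: it calls `strdup(value)` only to compare the copy with `strcmp`, and the copy is never freed. A possible alternative, sketched here under the assumption that `value` is a NUL-terminated string, compares the argument directly. `parseRefineMvType` and the `SKETCH_*` constants are hypothetical stand-ins for illustration, not x265 identifiers.

```cpp
#include <cstring>

// Hypothetical stand-ins for the NO_INFO / AVC_INFO constants seen in the diff.
enum MVTypeSketch { SKETCH_NO_INFO = 0, SKETCH_AVC_INFO = 1 };

// Returns true and stores the parsed mode in *out when value is "avc" or
// "off"; returns false for a null pointer or any other string.
bool parseRefineMvType(const char* value, int* out)
{
    if (!value || !out)
        return false;
    if (std::strcmp(value, "avc") == 0)
    {
        *out = SKETCH_AVC_INFO;
        return true;
    }
    if (std::strcmp(value, "off") == 0)
    {
        *out = SKETCH_NO_INFO;
        return true;
    }
    return false;
}
```

Because no temporary copy is made, nothing needs to be released on any path.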
x265_2.5.tar.gz/source/common/piclist.cpp -> x265_2.6.tar.gz/source/common/piclist.cpp Changed
@@ -117,6 +117,15 @@
         return NULL;
 }
 
+Frame* PicList::getCurFrame(void)
+{
+    Frame *curFrame = m_start;
+    if (curFrame != NULL)
+        return curFrame;
+    else
+        return NULL;
+}
+
 void PicList::remove(Frame& curFrame)
 {
 #if _DEBUG
x265_2.5.tar.gz/source/common/piclist.h -> x265_2.6.tar.gz/source/common/piclist.h Changed
@@ -62,6 +62,9 @@
     /** Find frame with specified POC */
     Frame* getPOC(int poc);
 
+    /** Get the current Frame from the list **/
+    Frame* getCurFrame(void);
+
     /** Remove picture from list */
     void remove(Frame& pic);
 
x265_2.5.tar.gz/source/common/picyuv.cpp -> x265_2.6.tar.gz/source/common/picyuv.cpp Changed
@@ -69,7 +69,7 @@
     m_vChromaShift = 0;
 }
 
-bool PicYuv::create(x265_param* param, pixel *pixelbuf)
+bool PicYuv::create(x265_param* param, bool picAlloc, pixel *pixelbuf)
 {
     m_param = param;
     uint32_t picWidth = m_param->sourceWidth;
@@ -93,8 +93,11 @@
         m_picOrg[0] = pixelbuf;
     else
     {
-        CHECKED_MALLOC(m_picBuf[0], pixel, m_stride * (maxHeight + (m_lumaMarginY * 2)));
-        m_picOrg[0] = m_picBuf[0] + m_lumaMarginY * m_stride + m_lumaMarginX;
+        if (picAlloc)
+        {
+            CHECKED_MALLOC(m_picBuf[0], pixel, m_stride * (maxHeight + (m_lumaMarginY * 2)));
+            m_picOrg[0] = m_picBuf[0] + m_lumaMarginY * m_stride + m_lumaMarginX;
+        }
     }
 
     if (picCsp != X265_CSP_I400)
@@ -102,12 +105,14 @@
         m_chromaMarginX = m_lumaMarginX;  // keep 16-byte alignment for chroma CTUs
         m_chromaMarginY = m_lumaMarginY >> m_vChromaShift;
         m_strideC = ((numCuInWidth * m_param->maxCUSize) >> m_hChromaShift) + (m_chromaMarginX * 2);
+        if (picAlloc)
+        {
+            CHECKED_MALLOC(m_picBuf[1], pixel, m_strideC * ((maxHeight >> m_vChromaShift) + (m_chromaMarginY * 2)));
+            CHECKED_MALLOC(m_picBuf[2], pixel, m_strideC * ((maxHeight >> m_vChromaShift) + (m_chromaMarginY * 2)));
 
-        CHECKED_MALLOC(m_picBuf[1], pixel, m_strideC * ((maxHeight >> m_vChromaShift) + (m_chromaMarginY * 2)));
-        CHECKED_MALLOC(m_picBuf[2], pixel, m_strideC * ((maxHeight >> m_vChromaShift) + (m_chromaMarginY * 2)));
-
-        m_picOrg[1] = m_picBuf[1] + m_chromaMarginY * m_strideC + m_chromaMarginX;
-        m_picOrg[2] = m_picBuf[2] + m_chromaMarginY * m_strideC + m_chromaMarginX;
+            m_picOrg[1] = m_picBuf[1] + m_chromaMarginY * m_strideC + m_chromaMarginX;
+            m_picOrg[2] = m_picBuf[2] + m_chromaMarginY * m_strideC + m_chromaMarginX;
+        }
     }
     else
     {
@@ -236,8 +241,10 @@
     uint64_t crSum;
     lumaSum = cbSum = crSum = 0;
 
-    if (pic.bitDepth == 8)
+    if (m_param->bCopyPicToFrame)
     {
+        if (pic.bitDepth == 8)
+        {
 #if (X265_DEPTH > 8)
         {
             pixel *yPixel = m_picOrg[0];
@@ -260,7 +267,7 @@
             }
         }
 #else /* Case for (X265_DEPTH == 8) */
-        // TODO: Does we need this path? may merge into above in future
+            // TODO: Does we need this path? may merge into above in future
         {
             pixel *yPixel = m_picOrg[0];
             uint8_t *yChar = (uint8_t*)pic.planes[0];
@@ -294,47 +301,54 @@
             }
         }
 #endif /* (X265_DEPTH > 8) */
-    }
-    else /* pic.bitDepth > 8 */
-    {
-        /* defensive programming, mask off bits that are supposed to be zero */
-        uint16_t mask = (1 << X265_DEPTH) - 1;
-        int shift = abs(pic.bitDepth - X265_DEPTH);
-        pixel *yPixel = m_picOrg[0];
-
-        uint16_t *yShort = (uint16_t*)pic.planes[0];
-
-        if (pic.bitDepth > X265_DEPTH)
-        {
-            /* shift right and mask pixels to final size */
-            primitives.planecopy_sp(yShort, pic.stride[0] / sizeof(*yShort), yPixel, m_stride, width, height, shift, mask);
-        }
-        else /* Case for (pic.bitDepth <= X265_DEPTH) */
-        {
-            /* shift left and mask pixels to final size */
-            primitives.planecopy_sp_shl(yShort, pic.stride[0] / sizeof(*yShort), yPixel, m_stride, width, height, shift, mask);
         }
-
-        if (param.internalCsp != X265_CSP_I400)
+        else /* pic.bitDepth > 8 */
         {
-            pixel *uPixel = m_picOrg[1];
-            pixel *vPixel = m_picOrg[2];
+            /* defensive programming, mask off bits that are supposed to be zero */
+            uint16_t mask = (1 << X265_DEPTH) - 1;
+            int shift = abs(pic.bitDepth - X265_DEPTH);
+            pixel *yPixel = m_picOrg[0];
 
-            uint16_t *uShort = (uint16_t*)pic.planes[1];
-            uint16_t *vShort = (uint16_t*)pic.planes[2];
+            uint16_t *yShort = (uint16_t*)pic.planes[0];
 
             if (pic.bitDepth > X265_DEPTH)
             {
-                primitives.planecopy_sp(uShort, pic.stride[1] / sizeof(*uShort), uPixel, m_strideC, width >> m_hChromaShift, height >> m_vChromaShift, shift, mask);
-                primitives.planecopy_sp(vShort, pic.stride[2] / sizeof(*vShort), vPixel, m_strideC, width >> m_hChromaShift, height >> m_vChromaShift, shift, mask);
+                /* shift right and mask pixels to final size */
+                primitives.planecopy_sp(yShort, pic.stride[0] / sizeof(*yShort), yPixel, m_stride, width, height, shift, mask);
             }
             else /* Case for (pic.bitDepth <= X265_DEPTH) */
             {
-                primitives.planecopy_sp_shl(uShort, pic.stride[1] / sizeof(*uShort), uPixel, m_strideC, width >> m_hChromaShift, height >> m_vChromaShift, shift, mask);
-                primitives.planecopy_sp_shl(vShort, pic.stride[2] / sizeof(*vShort), vPixel, m_strideC, width >> m_hChromaShift, height >> m_vChromaShift, shift, mask);
+                /* shift left and mask pixels to final size */
+                primitives.planecopy_sp_shl(yShort, pic.stride[0] / sizeof(*yShort), yPixel, m_stride, width, height, shift, mask);
+            }
+
+            if (param.internalCsp != X265_CSP_I400)
+            {
+                pixel *uPixel = m_picOrg[1];
+                pixel *vPixel = m_picOrg[2];
+
+                uint16_t *uShort = (uint16_t*)pic.planes[1];
+                uint16_t *vShort = (uint16_t*)pic.planes[2];
+
+                if (pic.bitDepth > X265_DEPTH)
+                {
+                    primitives.planecopy_sp(uShort, pic.stride[1] / sizeof(*uShort), uPixel, m_strideC, width >> m_hChromaShift, height >> m_vChromaShift, shift, mask);
+                    primitives.planecopy_sp(vShort, pic.stride[2] / sizeof(*vShort), vPixel, m_strideC, width >> m_hChromaShift, height >> m_vChromaShift, shift, mask);
+                }
+                else /* Case for (pic.bitDepth <= X265_DEPTH) */
+                {
+                    primitives.planecopy_sp_shl(uShort, pic.stride[1] / sizeof(*uShort), uPixel, m_strideC, width >> m_hChromaShift, height >> m_vChromaShift, shift, mask);
+                    primitives.planecopy_sp_shl(vShort, pic.stride[2] / sizeof(*vShort), vPixel, m_strideC, width >> m_hChromaShift, height >> m_vChromaShift, shift, mask);
+                }
             }
         }
     }
+    else
+    {
+        m_picOrg[0] = (pixel*)pic.planes[0];
+        m_picOrg[1] = (pixel*)pic.planes[1];
+        m_picOrg[2] = (pixel*)pic.planes[2];
+    }
 
     pixel *Y = m_picOrg[0];
     pixel *U = m_picOrg[1];
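Note on the `bCopyPicToFrame` change above: when the flag is disabled, the new `else` branch points `m_picOrg[]` straight at the caller's `pic.planes[]` and performs no bit-depth conversion. The difference between the two modes can be sketched as follows; `PlaneSketch` and `attachPlane` are hypothetical names, and a single 8-bit plane stands in for the real luma and chroma buffers.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of one picture plane inside the encoder.
struct PlaneSketch
{
    std::vector<uint8_t> owned;   // private storage, used only when copying
    uint8_t* origin = nullptr;    // where the encoder reads pixels from
};

// Either copy the caller's pixels into private storage, or alias the
// caller's buffer directly (the zero-copy mode).
void attachPlane(PlaneSketch& plane, uint8_t* input, std::size_t size, bool copyPicToFrame)
{
    if (copyPicToFrame)
    {
        plane.owned.assign(input, input + size);
        plane.origin = plane.owned.data();
    }
    else
    {
        plane.owned.clear();
        plane.origin = input;     // caller must keep input alive while it is in use
    }
}
```

In the aliasing mode the input buffer has to outlive every read through `origin`, and later changes to it are visible to the encoder; the copying mode isolates the two, which is consistent with `bCopyPicToFrame` defaulting to 1 in param.cpp.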
x265_2.5.tar.gz/source/common/picyuv.h -> x265_2.6.tar.gz/source/common/picyuv.h Changed
@@ -27,6 +27,7 @@
 #include "common.h"
 #include "md5.h"
 #include "x265.h"
+struct x265_picyuv {};
 
 namespace X265_NS {
 // private namespace
@@ -34,7 +35,7 @@
 class ShortYuv;
 struct SPS;
 
-class PicYuv
+class PicYuv : public x265_picyuv
 {
 public:
 
@@ -75,7 +76,7 @@
 
     PicYuv();
 
-    bool  create(x265_param* param, pixel *pixelbuf = NULL);
+    bool  create(x265_param* param, bool picAlloc = true, pixel *pixelbuf = NULL);
     bool  createOffsets(const SPS& sps);
     void  destroy();
     int   getLumaBufLen(uint32_t picWidth, uint32_t picHeight, uint32_t picCsp);
x265_2.5.tar.gz/source/common/primitives.cpp -> x265_2.6.tar.gz/source/common/primitives.cpp Changed
@@ -58,11 +58,13 @@
 void setupLoopFilterPrimitives_c(EncoderPrimitives &p);
 void setupSaoPrimitives_c(EncoderPrimitives &p);
 void setupSeaIntegralPrimitives_c(EncoderPrimitives &p);
+void setupLowPassPrimitives_c(EncoderPrimitives& p);
 
 void setupCPrimitives(EncoderPrimitives &p)
 {
     setupPixelPrimitives_c(p);      // pixel.cpp
     setupDCTPrimitives_c(p);        // dct.cpp
+    setupLowPassPrimitives_c(p);    // lowpassdct.cpp
     setupFilterPrimitives_c(p);     // ipfilter.cpp
     setupIntraPrimitives_c(p);      // intrapred.cpp
     setupLoopFilterPrimitives_c(p); // loopfilter.cpp
@@ -70,6 +72,19 @@
     setupSeaIntegralPrimitives_c(p);  // framefilter.cpp
 }
 
+void enableLowpassDCTPrimitives(EncoderPrimitives &p)
+{
+    // update copies of the standard dct transform
+    p.cu[BLOCK_4x4].standard_dct = p.cu[BLOCK_4x4].dct;
+    p.cu[BLOCK_8x8].standard_dct = p.cu[BLOCK_8x8].dct;
+    p.cu[BLOCK_16x16].standard_dct = p.cu[BLOCK_16x16].dct;
+    p.cu[BLOCK_32x32].standard_dct = p.cu[BLOCK_32x32].dct;
+
+    // replace active dct by lowpass dct for high dct transforms
+    p.cu[BLOCK_16x16].dct = p.cu[BLOCK_16x16].lowpass_dct;
+    p.cu[BLOCK_32x32].dct = p.cu[BLOCK_32x32].lowpass_dct;
+}
+
 void setupAliasPrimitives(EncoderPrimitives &p)
 {
 #if HIGH_BIT_DEPTH
@@ -256,6 +271,11 @@
 #endif
 
         setupAliasPrimitives(primitives);
+
+        if (param->bLowPassDct)
+        {
+            enableLowpassDCTPrimitives(primitives); 
+        }
     }
 
     x265_report_simd(param);
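Note on `enableLowpassDCTPrimitives` above: it first saves the active `dct` pointers into `standard_dct` and only then replaces the 16x16 and 32x32 `dct` entries with their low-pass versions; an 8x8 `lowpass_dct` is registered by `setupLowPassPrimitives_c` but is not activated here. A minimal sketch of the same save-then-swap pattern, using a hypothetical `BlockPrimsSketch` struct and toy integer transforms rather than real DCT primitives:

```cpp
// Toy transform signature; the real dct_t takes source, destination and stride.
typedef int (*transform_t)(int);

// Hypothetical per-block-size primitive table, modelled on the three
// dct-related members added to struct CU in primitives.h.
struct BlockPrimsSketch
{
    transform_t dct;           // active transform
    transform_t standard_dct;  // saved copy of the original transform
    transform_t lowpass_dct;   // approximation, may be null
};

int fullTransform(int x)   { return x * 4; }
int approxTransform(int x) { return (x / 2) * 8; }

// Save the active pointer first, then swap in the approximation if present.
// Doing it in the other order would lose the original transform.
void enableLowpass(BlockPrimsSketch& b)
{
    b.standard_dct = b.dct;
    if (b.lowpass_dct)
        b.dct = b.lowpass_dct;
}
```

The null check on `lowpass_dct` is an addition of this sketch; the diff assigns the pointer unconditionally for the two block sizes it swaps.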
x265_2.5.tar.gz/source/common/primitives.h -> x265_2.6.tar.gz/source/common/primitives.h Changed
@@ -259,8 +259,12 @@
      * primitives will leave 64x64 pointers NULL.  Indexed by LumaCU */
     struct CU
     {
-        dct_t           dct;
-        idct_t          idct;
+        dct_t           dct;    // active dct transformation
+        idct_t          idct;   // active idct transformation
+
+        dct_t           standard_dct;   // original dct function, used by lowpass_dct
+        dct_t           lowpass_dct;    // lowpass dct approximation
+
         calcresidual_t  calcresidual;
         pixel_sub_ps_t  sub_ps;
         pixel_add_ps_t  add_ps;
x265_2.5.tar.gz/source/common/threadpool.cpp -> x265_2.6.tar.gz/source/common/threadpool.cpp Changed
@@ -454,6 +454,7 @@
                     if ((nodeMaskPerPool[node] >> j) & 1)
                         len += sprintf(nodesstr + len, ",%d", j);
                 x265_log(p, X265_LOG_INFO, "Thread pool %d using %d threads on numa nodes %s\n", i, numThreads, nodesstr + 1);
+                delete[] nodesstr;
             }
             else
                 x265_log(p, X265_LOG_INFO, "Thread pool created using %d threads\n", numThreads);
x265_2.5.tar.gz/source/common/wavefront.cpp -> x265_2.6.tar.gz/source/common/wavefront.cpp Changed
@@ -43,11 +43,17 @@
     if (m_externalDependencyBitmap)
         memset((void*)m_externalDependencyBitmap, 0, sizeof(uint32_t) * m_numWords);
 
+    m_row_to_idx = X265_MALLOC(uint32_t, m_numRows);
+    m_idx_to_row = X265_MALLOC(uint32_t, m_numRows);
+
     return m_internalDependencyBitmap && m_externalDependencyBitmap;
 }
 
 WaveFront::~WaveFront()
 {
+    x265_free((void*)m_row_to_idx);
+    x265_free((void*)m_idx_to_row);
+
     x265_free((void*)m_internalDependencyBitmap);
     x265_free((void*)m_externalDependencyBitmap);
 }
x265_2.5.tar.gz/source/common/wavefront.h -> x265_2.6.tar.gz/source/common/wavefront.h Changed
@@ -52,6 +52,10 @@
 
     int m_numRows;
 
+protected:
+    uint32_t *m_row_to_idx;
+    uint32_t *m_idx_to_row;
+
 public:
 
     WaveFront()
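Note on the WaveFront change above: the new `m_row_to_idx` and `m_idx_to_row` arrays are allocated in the initialization routine shown in wavefront.cpp, but its return value still tests only the two dependency bitmaps. A hedged sketch of an allocation routine that also checks the new arrays follows; `RowMapSketch` is a hypothetical type, and `std::malloc`/`std::free` stand in for `X265_MALLOC`/`x265_free`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical holder for the two row-mapping arrays.
struct RowMapSketch
{
    uint32_t* rowToIdx = nullptr;
    uint32_t* idxToRow = nullptr;

    // Allocate both arrays; report failure if either allocation fails.
    bool init(int numRows)
    {
        if (numRows <= 0)
            return false;
        const std::size_t bytes = sizeof(uint32_t) * static_cast<std::size_t>(numRows);
        rowToIdx = static_cast<uint32_t*>(std::malloc(bytes));
        idxToRow = static_cast<uint32_t*>(std::malloc(bytes));
        if (!rowToIdx || !idxToRow)
        {
            destroy();
            return false;
        }
        return true;
    }

    // Release both arrays; safe to call repeatedly because free(nullptr) is a no-op.
    void destroy()
    {
        std::free(rowToIdx);
        std::free(idxToRow);
        rowToIdx = nullptr;
        idxToRow = nullptr;
    }
};
```

The array contents are left uninitialized here because the diff does not show how the mappings are filled.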
x265_2.5.tar.gz/source/dynamicHDR10/BasicStructures.h -> x265_2.6.tar.gz/source/dynamicHDR10/BasicStructures.h Changed
@@ -1,16 +1,13 @@
 /**
- * @file                       BasicStructures.h
- * @brief                      Defines the structure of metadata parameters
- * @author                     Daniel Maximiliano Valenzuela, Seongnam Oh.
- * @create date                03/01/2017
- * @version                    0.0.1
+ * Copyright (C) 2013-2017 MulticoreWare, Inc
  *
- * Copyright @ 2017 Samsung Electronics, DMS Lab, Samsung Research America and Samsung Research Tijuana
+ * Authors: Bhavna Hariharan <bhavna@multicorewareinc.com>
+ *          Kavitha Sampath <kavitha@multicorewareinc.com>
  *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version 2
- * of the License, or (at your option) any later version.
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
  *
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -19,8 +16,10 @@
  *
  * You should have received a copy of the GNU General Public License
  * along with this program; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
- * MA 02110-1301, USA.
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02111, USA.
+ *
+ * This program is also available under a commercial proprietary license.
+ * For more information, contact us at license @ x265.com.
 **/
 
 #ifndef BASICSTRUCTURES_H
x265_2.5.tar.gz/source/dynamicHDR10/JsonHelper.cpp -> x265_2.6.tar.gz/source/dynamicHDR10/JsonHelper.cpp Changed
@@ -1,16 +1,13 @@
 /**
- * @file                       JsonHelper.cpp
- * @brief                      Helper class for JSON parsing
- * @author                     Daniel Maximiliano Valenzuela, Seongnam Oh.
- * @create date                03/01/2017
- * @version                    0.0.1
+ * Copyright (C) 2013-2017 MulticoreWare, Inc
  *
- * Copyright @ 2017 Samsung Electronics, DMS Lab, Samsung Research America and Samsung Research Tijuana
+ * Authors: Bhavna Hariharan <bhavna@multicorewareinc.com>
+ *          Kavitha Sampath <kavitha@multicorewareinc.com>
  *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version 2
- * of the License, or (at your option) any later version.
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
  *
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -19,8 +16,10 @@
  *
  * You should have received a copy of the GNU General Public License
  * along with this program; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
- * MA 02110-1301, USA.
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02111, USA.
+ *
+ * This program is also available under a commercial proprietary license.
+ * For more information, contact us at license @ x265.com.
 **/
 
 #include "JsonHelper.h"
@@ -188,9 +187,15 @@
 
     tfile.close();
 
-    size_t beginning = json_str2.find_first_of("[");
-    int fixchar = json_str2[json_str2.size() - 2] == ']' ? 1 : 0;
-    return Json::parse(json_str2.substr(beginning,json_str2.size() - fixchar),err).array_items();
+    vector<Json> data;
+    if (json_str2.size() != 0)
+    {
+        size_t beginning = json_str2.find_first_of("[");
+        int fixchar = json_str2[json_str2.size() - 2] == ']' ? 1 : 0;
+        return Json::parse(json_str2.substr(beginning, json_str2.size() - fixchar), err).array_items();
+    }
+    else
+        return data;
 }
 
 bool JsonHelper::validatePathExtension(string &path)
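Note on the JsonHelper.cpp guard above: it handles an empty `json_str2`, but a one-character string would still index `size() - 2`, and `find_first_of("[")` can return `npos` when no bracket is present. One way to make the extraction tolerant of such input, sketched with a hypothetical `extractJsonArray` helper that is not part of the dynamicHDR10 sources, is to locate both brackets explicitly. This is a different technique from the `fixchar` trimming used in the diff.

```cpp
#include <string>

// Hypothetical helper: return the text from the first '[' through the last
// ']' inclusive, or an empty string when no well-ordered pair exists.
std::string extractJsonArray(const std::string& text)
{
    const std::string::size_type first = text.find('[');
    const std::string::size_type last = text.rfind(']');
    if (first == std::string::npos || last == std::string::npos || last < first)
        return std::string();
    return text.substr(first, last - first + 1);
}
```

A caller could then treat an empty result the same way the diff treats an empty file, returning an empty vector instead of invoking the parser.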
x265_2.5.tar.gz/source/dynamicHDR10/JsonHelper.h -> x265_2.6.tar.gz/source/dynamicHDR10/JsonHelper.h Changed
@@ -1,16 +1,13 @@
 /**
- * @file                       JsonHelper.h
- * @brief                      Helper class for JSON parsing
- * @author                     Daniel Maximiliano Valenzuela, Seongnam Oh.
- * @create date                03/01/2017
- * @version                    0.0.1
+ * Copyright (C) 2013-2017 MulticoreWare, Inc
  *
- * Copyright @ 2017 Samsung Electronics, DMS Lab, Samsung Research America and Samsung Research Tijuana
+ * Authors: Bhavna Hariharan <bhavna@multicorewareinc.com>
+ *          Kavitha Sampath <kavitha@multicorewareinc.com>
  *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version 2
- * of the License, or (at your option) any later version.
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
  *
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -19,8 +16,10 @@
  *
  * You should have received a copy of the GNU General Public License
  * along with this program; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
- * MA 02110-1301, USA.
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02111, USA.
+ *
+ * This program is also available under a commercial proprietary license.
+ * For more information, contact us at license @ x265.com.
 **/
 
 #ifndef JSON_H
x265_2.5.tar.gz/source/dynamicHDR10/LICENSE.txt -> x265_2.6.tar.gz/source/dynamicHDR10/LICENSE.txt Changed
@@ -1,13 +1,9 @@
-Copyright @ 2017 Samsung Electronics, DMS Lab, Samsung Research America and Samsung Research Tijuana
+Copyright (C) 2013-2017 MulticoreWare, Inc
 
-This software is the confidential and proprietary information of Samsung Electronics, Inc. ("Confidential Information").
-You shall not disclose such Confidential Information and shall use it only in accordance with the terms of the license agreement
-you entered into with Samsung.
-
-This program is free software; you can redistribute it and/or
-modify it under the terms of the GNU General Public License
-as published by the Free Software Foundation; either version 2
-of the License, or (at your option) any later version.
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2 of the License, or
+(at your option) any later version.
 
 This program is distributed in the hope that it will be useful,
 but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -16,5 +12,7 @@
 
 You should have received a copy of the GNU General Public License
 along with this program; if not, write to the Free Software
-Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, 
-MA 02110-1301, USA.
\ No newline at end of file
+Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02111, USA.
+
+This program is also available under a commercial proprietary license.
+For more information, contact us at license @ x265.com.
x265_2.5.tar.gz/source/dynamicHDR10/SeiMetadataDictionary.cpp -> x265_2.6.tar.gz/source/dynamicHDR10/SeiMetadataDictionary.cpp Changed
@@ -1,16 +1,13 @@
 /**
- * @file                       SeiMetadataDictionary.cpp
- * @brief                      Defines the tagname for each metadata value in a JSON dynamic tone mapping file.
- * @author                     Daniel Maximiliano Valenzuela, Seongnam Oh.
- * @create date                03/01/2017
- * @version                    0.0.1
+ * Copyright (C) 2013-2017 MulticoreWare, Inc
  *
- * Copyright @ 2017 Samsung Electronics, DMS Lab, Samsung Research America and Samsung Research Tijuana
+ * Authors: Bhavna Hariharan <bhavna@multicorewareinc.com>
+ *          Kavitha Sampath <kavitha@multicorewareinc.com>
  *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version 2
- * of the License, or (at your option) any later version.
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
  *
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -19,8 +16,10 @@
  *
  * You should have received a copy of the GNU General Public License
  * along with this program; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
- * MA 02110-1301, USA.
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02111, USA.
+ *
+ * This program is also available under a commercial proprietary license.
+ * For more information, contact us at license @ x265.com.
 **/
 
 #include "SeiMetadataDictionary.h"
x265_2.5.tar.gz/source/dynamicHDR10/SeiMetadataDictionary.h -> x265_2.6.tar.gz/source/dynamicHDR10/SeiMetadataDictionary.h Changed
@@ -1,16 +1,13 @@
 /**
- * @file                       SeiMetadataDictionary.h
- * @brief                      Defines the tagname for each metadata value in a JSON dynamic tone mapping file.
- * @author                     Daniel Maximiliano Valenzuela, Seongnam Oh.
- * @create date                03/01/2017
- * @version                    0.0.1
+ * Copyright (C) 2013-2017 MulticoreWare, Inc
  *
- * Copyright @ 2017 Samsung Electronics, DMS Lab, Samsung Research America and Samsung Research Tijuana
+ * Authors: Bhavna Hariharan <bhavna@multicorewareinc.com>
+ *          Kavitha Sampath <kavitha@multicorewareinc.com>
  *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version 2
- * of the License, or (at your option) any later version.
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
  *
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -19,8 +16,10 @@
  *
  * You should have received a copy of the GNU General Public License
  * along with this program; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
- * MA 02110-1301, USA.
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02111, USA.
+ *
+ * This program is also available under a commercial proprietary license.
+ * For more information, contact us at license @ x265.com.
 **/
 
 #ifndef SEIMETADATADICTIONARY_H
x265_2.5.tar.gz/source/dynamicHDR10/api.cpp -> x265_2.6.tar.gz/source/dynamicHDR10/api.cpp Changed
@@ -1,16 +1,13 @@
 /**
- * @file                       api.cpp
- * @brief                      Implementation of hdr10plus API functions.
- * @author                     Daniel Maximiliano Valenzuela, Seongnam Oh.
- * @create date                03/01/2017
- * @version                    0.0.1
+ * Copyright (C) 2013-2017 MulticoreWare, Inc
  *
- * Copyright @ 2017 Samsung Electronics, DMS Lab, Samsung Research America and Samsung Research Tijuana
+ * Authors: Bhavna Hariharan <bhavna@multicorewareinc.com>
+ *          Kavitha Sampath <kavitha@multicorewareinc.com>
  *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version 2
- * of the License, or (at your option) any later version.
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
  *
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -19,8 +16,10 @@
  *
  * You should have received a copy of the GNU General Public License
  * along with this program; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, 
- * MA 02110-1301, USA.
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02111, USA.
+ *
+ * This program is also available under a commercial proprietary license.
+ * For more information, contact us at license @ x265.com.
 **/
 
 #include "hdr10plus.h"
x265_2.5.tar.gz/source/dynamicHDR10/hdr10plus.h -> x265_2.6.tar.gz/source/dynamicHDR10/hdr10plus.h Changed
@@ -1,16 +1,13 @@
 /**
- * @file                       hdr10plus.h
- * @brief                      Definition of hdr10plus functions.
- * @author                     Daniel Maximiliano Valenzuela, Seongnam Oh.
- * @create date                03/01/2017
- * @version                    0.0.1
+ * Copyright (C) 2013-2017 MulticoreWare, Inc
  *
- * Copyright @ 2017 Samsung Electronics, DMS Lab, Samsung Research America and Samsung Research Tijuana
+ * Authors: Bhavna Hariharan <bhavna@multicorewareinc.com>
+ *          Kavitha Sampath <kavitha@multicorewareinc.com>
  *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version 2
- * of the License, or (at your option) any later version.
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
  *
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -19,8 +16,10 @@
  *
  * You should have received a copy of the GNU General Public License
  * along with this program; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
- * MA 02110-1301, USA.
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02111, USA.
+ *
+ * This program is also available under a commercial proprietary license.
+ * For more information, contact us at license @ x265.com.
 **/
 
 #include <stdint.h>
x265_2.5.tar.gz/source/dynamicHDR10/metadataFromJson.cpp -> x265_2.6.tar.gz/source/dynamicHDR10/metadataFromJson.cpp Changed
@@ -1,16 +1,13 @@
 /**
- * @file                       metadataFromJson.cpp
- * @brief                      Reads a JSON file and produces a byte array containing the metadata to be embedded in different ways.
- * @author                     Daniel Maximiliano Valenzuela, Seongnam Oh.
- * @create date                03/01/2017
- * @version                    0.0.1
+ * Copyright (C) 2013-2017 MulticoreWare, Inc
  *
- * Copyright @ 2017 Samsung Electronics, DMS Lab, Samsung Research America and Samsung Research Tijuana
+ * Authors: Bhavna Hariharan <bhavna@multicorewareinc.com>
+ *          Kavitha Sampath <kavitha@multicorewareinc.com>
  *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version 2
- * of the License, or (at your option) any later version.
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
  *
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -19,8 +16,10 @@
  *
  * You should have received a copy of the GNU General Public License
  * along with this program; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
- * MA 02110-1301, USA.
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02111, USA.
+ *
+ * This program is also available under a commercial proprietary license.
+ * For more information, contact us at license @ x265.com.
 **/
 
 #include "metadataFromJson.h"
@@ -33,11 +32,7 @@
 
 #include "BasicStructures.h"
 #include "SeiMetadataDictionary.h"
-
-#define M_PI 3.14159265358979323846
-
 using namespace SeiMetadataDictionary;
-
 class metadataFromJson::DynamicMetaIO
 {
 public:
x265_2.5.tar.gz/source/dynamicHDR10/metadataFromJson.h -> x265_2.6.tar.gz/source/dynamicHDR10/metadataFromJson.h Changed
@@ -1,16 +1,13 @@
 /**
- * @file                       metadataFromJson.h
- * @brief                      Reads a JSON file and produces a byte array containing the metadata to be embedded for a frame.
- * @author                     Daniel Maximiliano Valenzuela, Seongnam Oh.
- * @create date                03/01/2017
- * @version                    0.0.1
+ * Copyright (C) 2013-2017 MulticoreWare, Inc
  *
- * Copyright @ 2017 Samsung Electronics, DMS Lab, Samsung Research America and Samsung Research Tijuana
+ * Authors: Bhavna Hariharan <bhavna@multicorewareinc.com>
+ *          Kavitha Sampath <kavitha@multicorewareinc.com>
  *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version 2
- * of the License, or (at your option) any later version.
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
  *
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -19,9 +16,10 @@
  *
  * You should have received a copy of the GNU General Public License
  * along with this program; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
- * MA 02110-1301, USA.
-
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02111, USA.
+ *
+ * This program is also available under a commercial proprietary license.
+ * For more information, contact us at license @ x265.com.
 **/
 
 #ifndef METADATAFROMJSON_H
x265_2.5.tar.gz/source/encoder/CMakeLists.txt -> x265_2.6.tar.gz/source/encoder/CMakeLists.txt Changed
@@ -43,5 +43,4 @@
     reference.cpp reference.h
     encoder.cpp encoder.h
     api.cpp
-    weightPrediction.cpp
-    ../x265-extras.cpp ../x265-extras.h)
+    weightPrediction.cpp)
x265_2.5.tar.gz/source/encoder/analysis.cpp -> x265_2.6.tar.gz/source/encoder/analysis.cpp Changed
@@ -75,6 +75,10 @@
     m_reuseInterDataCTU = NULL;
     m_reuseRef = NULL;
     m_bHD = false;
+    m_modeFlag[0] = false;
+    m_modeFlag[1] = false;
+    m_checkMergeAndSkipOnly[0] = false;
+    m_checkMergeAndSkipOnly[1] = false;
     m_evaluateInter = 0;
 }
 
@@ -235,6 +239,32 @@
     }
     else
     {
+        bool bCopyAnalysis = ((m_param->analysisReuseMode == X265_ANALYSIS_LOAD && m_param->analysisReuseLevel == 10) || (m_param->bMVType && m_param->analysisReuseLevel >= 7 && ctu.m_numPartitions <= 16));
+        bool BCompressInterCUrd0_4 = (m_param->bMVType && m_param->analysisReuseLevel >= 7 && m_param->rdLevel <= 4);
+        bool BCompressInterCUrd5_6 = (m_param->bMVType && m_param->analysisReuseLevel >= 7 && m_param->rdLevel >= 5 && m_param->rdLevel <= 6);
+        bCopyAnalysis = bCopyAnalysis || BCompressInterCUrd0_4 || BCompressInterCUrd5_6;
+
+        if (bCopyAnalysis)
+        {
+            analysis_inter_data* interDataCTU = (analysis_inter_data*)m_frame->m_analysisData.interData;
+            int posCTU = ctu.m_cuAddr * numPartition;
+            memcpy(ctu.m_cuDepth, &interDataCTU->depth[posCTU], sizeof(uint8_t) * numPartition);
+            memcpy(ctu.m_predMode, &interDataCTU->modes[posCTU], sizeof(uint8_t) * numPartition);
+            memcpy(ctu.m_partSize, &interDataCTU->partSize[posCTU], sizeof(uint8_t) * numPartition);
+            for (int list = 0; list < m_slice->isInterB() + 1; list++)
+                memcpy(ctu.m_skipFlag[list], &m_frame->m_analysisData.modeFlag[list][posCTU], sizeof(uint8_t) * numPartition);
+
+            if ((m_slice->m_sliceType == P_SLICE || m_param->bIntraInBFrames) && !m_param->bMVType)
+            {
+                analysis_intra_data* intraDataCTU = (analysis_intra_data*)m_frame->m_analysisData.intraData;
+                memcpy(ctu.m_lumaIntraDir, &intraDataCTU->modes[posCTU], sizeof(uint8_t) * numPartition);
+                memcpy(ctu.m_chromaIntraDir, &intraDataCTU->chromaModes[posCTU], sizeof(uint8_t) * numPartition);
+            }
+            //Calculate log2CUSize from depth
+            for (uint32_t i = 0; i < cuGeom.numPartitions; i++)
+                ctu.m_log2CUSize[i] = (uint8_t)m_param->maxLog2CUSize - ctu.m_cuDepth[i];
+        }
+
         if (m_param->bIntraRefresh && m_slice->m_sliceType == P_SLICE &&
             ctu.m_cuPelX / m_param->maxCUSize >= frame.m_encData->m_pir.pirStartCol
             && ctu.m_cuPelX / m_param->maxCUSize < frame.m_encData->m_pir.pirEndCol)
@@ -250,14 +280,14 @@
             /* generate residual for entire CTU at once and copy to reconPic */
             encodeResidue(ctu, cuGeom);
         }
-        else if (m_param->analysisReuseMode == X265_ANALYSIS_LOAD && m_param->analysisReuseLevel == 10)
+        else if ((m_param->analysisReuseMode == X265_ANALYSIS_LOAD && m_param->analysisReuseLevel == 10) || ((m_param->bMVType == AVC_INFO) && m_param->analysisReuseLevel >= 7))
         {
             analysis_inter_data* interDataCTU = (analysis_inter_data*)m_frame->m_analysisData.interData;
             int posCTU = ctu.m_cuAddr * numPartition;
             memcpy(ctu.m_cuDepth, &interDataCTU->depth[posCTU], sizeof(uint8_t) * numPartition);
             memcpy(ctu.m_predMode, &interDataCTU->modes[posCTU], sizeof(uint8_t) * numPartition);
             memcpy(ctu.m_partSize, &interDataCTU->partSize[posCTU], sizeof(uint8_t) * numPartition);
-            if (m_slice->m_sliceType == P_SLICE || m_param->bIntraInBFrames)
+            if ((m_slice->m_sliceType == P_SLICE || m_param->bIntraInBFrames) && !(m_param->bMVType == AVC_INFO))
             {
                 analysis_intra_data* intraDataCTU = (analysis_intra_data*)m_frame->m_analysisData.intraData;
                 memcpy(ctu.m_lumaIntraDir, &intraDataCTU->modes[posCTU], sizeof(uint8_t) * numPartition);
@@ -306,11 +336,10 @@
                 mode = 2;
             else if (ctu.m_partSize[puabsPartIdx + absPartIdx] == SIZE_2NxnU || ctu.m_partSize[puabsPartIdx + absPartIdx] == SIZE_2NxnD || ctu.m_partSize[puabsPartIdx + absPartIdx] == SIZE_nLx2N || ctu.m_partSize[puabsPartIdx + absPartIdx] == SIZE_nRx2N)
                  mode = 3;
-
             if (ctu.m_predMode[puabsPartIdx + absPartIdx] == MODE_SKIP)
             {
-                ctu.m_encData->m_frameStats.cntSkipPu[depth] += (uint64_t)(1 << shift);
-                ctu.m_encData->m_frameStats.totalPu[depth] += (uint64_t)(1 << shift);
+                ctu.m_encData->m_frameStats.cntSkipPu[depth] += 1ULL << shift;
+                ctu.m_encData->m_frameStats.totalPu[depth] += 1ULL << shift;
             }
             else if (ctu.m_predMode[puabsPartIdx + absPartIdx] == MODE_INTRA)
             {
@@ -321,14 +350,14 @@
                 }
                 else
                 {
-                    ctu.m_encData->m_frameStats.cntIntraPu[depth] += (uint64_t)(1 << shift);
-                    ctu.m_encData->m_frameStats.totalPu[depth] += (uint64_t)(1 << shift);
+                    ctu.m_encData->m_frameStats.cntIntraPu[depth] += 1ULL << shift;
+                    ctu.m_encData->m_frameStats.totalPu[depth] += 1ULL << shift;
                 }
             }
             else if (mode == 3)
             {
-                ctu.m_encData->m_frameStats.cntAmp[depth] += (uint64_t)(1 << shift);
-                ctu.m_encData->m_frameStats.totalPu[depth] += (uint64_t)(1 << shift);
+                ctu.m_encData->m_frameStats.cntAmp[depth] += 1ULL << shift;
+                ctu.m_encData->m_frameStats.totalPu[depth] += 1ULL << shift;
                 break;
             }
             else
@@ -485,7 +514,7 @@
     md.bestMode->reconYuv.copyToPicYuv(*m_frame->m_reconPic, parentCTU.m_cuAddr, cuGeom.absPartIdx);
 }
 
-void Analysis::compressIntraCU(const CUData& parentCTU, const CUGeom& cuGeom, int32_t qp)
+uint64_t Analysis::compressIntraCU(const CUData& parentCTU, const CUGeom& cuGeom, int32_t qp)
 {
     uint32_t depth = cuGeom.depth;
     ModeDepth& md = m_modeDepth[depth];
@@ -511,7 +540,9 @@
             Mode& mode = md.pred[0];
             md.bestMode = &mode;
             mode.cu.initSubCU(parentCTU, cuGeom, qp);
-            if (m_param->intraRefine != 2 || parentCTU.m_lumaIntraDir[cuGeom.absPartIdx] <= 1)
+            bool reuseModes = !((m_param->intraRefine == 3) ||
+                                (m_param->intraRefine == 2 && parentCTU.m_lumaIntraDir[cuGeom.absPartIdx] > DC_IDX));
+            if (reuseModes)
             {
                 memcpy(mode.cu.m_lumaIntraDir, parentCTU.m_lumaIntraDir + cuGeom.absPartIdx, cuGeom.numPartitions);
                 memcpy(mode.cu.m_chromaIntraDir, parentCTU.m_chromaIntraDir + cuGeom.absPartIdx, cuGeom.numPartitions);
@@ -560,6 +591,8 @@
         invalidateContexts(nextDepth);
         Entropy* nextContext = &m_rqt[depth].cur;
         int32_t nextQP = qp;
+        uint64_t curCost = 0;
+        int skipSplitCheck = 0;
 
         for (uint32_t subPartIdx = 0; subPartIdx < 4; subPartIdx++)
         {
@@ -572,7 +605,17 @@
                 if (m_slice->m_pps->bUseDQP && nextDepth <= m_slice->m_pps->maxCuDQPDepth)
                     nextQP = setLambdaFromQP(parentCTU, calculateQpforCuSize(parentCTU, childGeom));
 
-                compressIntraCU(parentCTU, childGeom, nextQP);
+                if (m_param->bEnableSplitRdSkip)
+                {
+                    curCost += compressIntraCU(parentCTU, childGeom, nextQP);
+                    if (m_modeDepth[depth].bestMode && curCost > m_modeDepth[depth].bestMode->rdCost)
+                    {
+                        skipSplitCheck = 1;
+                        break;
+                    }
+                }
+                else
+                    compressIntraCU(parentCTU, childGeom, nextQP);
 
                 // Save best CU and pred data for this sub CU
                 splitCU->copyPartFrom(nd.bestMode->cu, childGeom, subPartIdx);
@@ -590,14 +633,17 @@
                     memset(parentCTU.m_cuDepth + childGeom.absPartIdx, 0, childGeom.numPartitions);
             }
         }
-        nextContext->store(splitPred->contexts);
-        if (mightNotSplit)
-            addSplitFlagCost(*splitPred, cuGeom.depth);
-        else
-            updateModeCost(*splitPred);
+        if (!skipSplitCheck)
+        {
+            nextContext->store(splitPred->contexts);
+            if (mightNotSplit)
+                addSplitFlagCost(*splitPred, cuGeom.depth);
+            else
+                updateModeCost(*splitPred);
 
-        checkDQPForSplitPred(*splitPred, cuGeom);
-        checkBestMode(*splitPred, depth);
+            checkDQPForSplitPred(*splitPred, cuGeom);
+            checkBestMode(*splitPred, depth);
+        }
     }
 
     if (m_param->bEnableRdRefine && depth <= m_slice->m_pps->maxCuDQPDepth)
@@ -620,6 +666,8 @@
     md.bestMode->cu.copyToPic(depth);
     if (md.bestMode != &md.pred[PRED_SPLIT])
         md.bestMode->reconYuv.copyToPicYuv(*m_frame->m_reconPic, parentCTU.m_cuAddr, cuGeom.absPartIdx);
+
+    return md.bestMode->rdCost;
 }
 
 void Analysis::PMODE::processTasks(int workerThreadId)
@@ -1106,7 +1154,7 @@
     uint32_t depth = cuGeom.depth;
     uint32_t cuAddr = parentCTU.m_cuAddr;
     ModeDepth& md = m_modeDepth[depth];
-    md.bestMode = NULL;
+
 
     if (m_param->searchMethod == X265_SEA)
     {
@@ -1119,609 +1167,669 @@
     }
 
     PicYuv& reconPic = *m_frame->m_reconPic;
+    SplitData splitCUData;
 
-    bool mightSplit = !(cuGeom.flags & CUGeom::LEAF);
-    bool mightNotSplit = !(cuGeom.flags & CUGeom::SPLIT_MANDATORY);
-    uint32_t minDepth = topSkipMinDepth(parentCTU, cuGeom);
-    bool bDecidedDepth = parentCTU.m_cuDepth[cuGeom.absPartIdx] == depth;
-    bool skipModes = false; /* Skip any remaining mode analyses at current depth */
-    bool skipRecursion = false; /* Skip recursion */
-    bool splitIntra = true;
-    bool skipRectAmp = false;
-    bool chooseMerge = false;
-    bool bCtuInfoCheck = false;
-    int sameContentRef = 0;
+    bool bHEVCBlockAnalysis = (m_param->bMVType && cuGeom.numPartitions > 16);
+    bool bRefineAVCAnalysis = (m_param->analysisReuseLevel == 7 && (m_modeFlag[0] || m_modeFlag[1]));
+    bool bNooffloading = !m_param->bMVType;
 
-    if (m_evaluateInter == 1)
+    if (bHEVCBlockAnalysis || bRefineAVCAnalysis || bNooffloading)
     {
-        skipRectAmp = !!md.bestMode;
-        mightSplit &= false;
-        minDepth = depth;
-    }
+        md.bestMode = NULL;
+        bool mightSplit = !(cuGeom.flags & CUGeom::LEAF);
+        bool mightNotSplit = !(cuGeom.flags & CUGeom::SPLIT_MANDATORY);
+        uint32_t minDepth = topSkipMinDepth(parentCTU, cuGeom);
+        bool bDecidedDepth = parentCTU.m_cuDepth[cuGeom.absPartIdx] == depth;
+        bool skipModes = false; /* Skip any remaining mode analyses at current depth */
+        bool skipRecursion = false; /* Skip recursion */
+        bool splitIntra = true;
+        bool skipRectAmp = false;
+        bool chooseMerge = false;
+        bool bCtuInfoCheck = false;
+        int sameContentRef = 0;
 
-    if ((m_limitTU & X265_TU_LIMIT_NEIGH) && cuGeom.log2CUSize >= 4)
-        m_maxTUDepth = loadTUDepth(cuGeom, parentCTU);
+        if (m_evaluateInter)
+        {
+            if (m_param->interRefine == 2)
+            {
+                if (parentCTU.m_predMode[cuGeom.absPartIdx] == MODE_SKIP)
+                    skipModes = true;
+                if (parentCTU.m_partSize[cuGeom.absPartIdx] == SIZE_2Nx2N)
+                    skipRectAmp = true;
+            }
+            mightSplit &= false;
+            minDepth = depth;
+        }
 
-    SplitData splitData[4];
-    splitData[0].initSplitCUData();
-    splitData[1].initSplitCUData();
-    splitData[2].initSplitCUData();
-    splitData[3].initSplitCUData();
+        if ((m_limitTU & X265_TU_LIMIT_NEIGH) && cuGeom.log2CUSize >= 4)
+            m_maxTUDepth = loadTUDepth(cuGeom, parentCTU);
 
-    // avoid uninitialize value in below reference
-    if (m_param->limitModes)
-    {
-        md.pred[PRED_2Nx2N].bestME[0][0].mvCost = 0; // L0
-        md.pred[PRED_2Nx2N].bestME[0][1].mvCost = 0; // L1
-        md.pred[PRED_2Nx2N].sa8dCost = 0;
-    }
+        SplitData splitData[4];
+        splitData[0].initSplitCUData();
+        splitData[1].initSplitCUData();
+        splitData[2].initSplitCUData();
+        splitData[3].initSplitCUData();
 
-    if (m_param->bCTUInfo && depth <= parentCTU.m_cuDepth[cuGeom.absPartIdx])
-    {
-        if (bDecidedDepth && m_additionalCtuInfo[cuGeom.absPartIdx])
-            sameContentRef = findSameContentRefCount(parentCTU, cuGeom);
-        if (depth < parentCTU.m_cuDepth[cuGeom.absPartIdx])
+        // avoid uninitialize value in below reference
+        if (m_param->limitModes)
         {
-            mightNotSplit &= bDecidedDepth;
-            bCtuInfoCheck = skipRecursion = false;
-            skipModes = true;
+            md.pred[PRED_2Nx2N].bestME[0][0].mvCost = 0; // L0
+            md.pred[PRED_2Nx2N].bestME[0][1].mvCost = 0; // L1
+            md.pred[PRED_2Nx2N].sa8dCost = 0;
         }
-        else if (mightNotSplit && bDecidedDepth)
+
+        if (m_param->bCTUInfo && depth <= parentCTU.m_cuDepth[cuGeom.absPartIdx])
         {
-            if (m_additionalCtuInfo[cuGeom.absPartIdx])
+            if (bDecidedDepth && m_additionalCtuInfo[cuGeom.absPartIdx])
+                sameContentRef = findSameContentRefCount(parentCTU, cuGeom);
+            if (depth < parentCTU.m_cuDepth[cuGeom.absPartIdx])
+            {
+                mightNotSplit &= bDecidedDepth;
+                bCtuInfoCheck = skipRecursion = false;
+                skipModes = true;
+            }
+            else if (mightNotSplit && bDecidedDepth)
             {
-                bCtuInfoCheck = skipRecursion = true;
-                md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
-                md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
-                checkMerge2Nx2N_rd0_4(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom);
-                if (!sameContentRef)
+                if (m_additionalCtuInfo[cuGeom.absPartIdx])
                 {
-                    if ((m_param->bCTUInfo & 2) && (m_slice->m_pps->bUseDQP && depth <= m_slice->m_pps->maxCuDQPDepth))
+                    bCtuInfoCheck = skipRecursion = true;
+                    md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
+                    md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
+                    checkMerge2Nx2N_rd0_4(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom);
+                    if (!sameContentRef)
                     {
-                        qp -= int32_t(0.04 * qp);
-                        setLambdaFromQP(parentCTU, qp);
+                        if ((m_param->bCTUInfo & 2) && (m_slice->m_pps->bUseDQP && depth <= m_slice->m_pps->maxCuDQPDepth))
+                        {
+                            qp -= int32_t(0.04 * qp);
+                            setLambdaFromQP(parentCTU, qp);
+                        }
+                        if (m_param->bCTUInfo & 4)
+                            skipModes = false;
+                    }
+                    if (sameContentRef || (!sameContentRef && !(m_param->bCTUInfo & 4)))
+                    {
+                        if (m_param->rdLevel)
+                            skipModes = m_param->bEnableEarlySkip && md.bestMode && md.bestMode->cu.isSkipped(0);
+                        if ((m_param->bCTUInfo & 4) && sameContentRef)
+                            skipModes = md.bestMode && true;
                     }
-                    if (m_param->bCTUInfo & 4)
-                        skipModes = false;
                 }
-                if (sameContentRef || (!sameContentRef && !(m_param->bCTUInfo & 4)))
+                else
                 {
+                    md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
+                    md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
+                    checkMerge2Nx2N_rd0_4(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom);
                     if (m_param->rdLevel)
                         skipModes = m_param->bEnableEarlySkip && md.bestMode && md.bestMode->cu.isSkipped(0);
-                    if ((m_param->bCTUInfo & 4) && sameContentRef)
-                        skipModes = md.bestMode && true;
                 }
+                mightSplit &= !bDecidedDepth;
             }
-            else
-            {
-                md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
-                md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
-                checkMerge2Nx2N_rd0_4(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom);
-                if (m_param->rdLevel)
-                    skipModes = m_param->bEnableEarlySkip && md.bestMode && md.bestMode->cu.isSkipped(0);
-            }
-            mightSplit &= !bDecidedDepth;
         }
-    }
-    if (m_param->analysisReuseMode == X265_ANALYSIS_LOAD && m_param->analysisReuseLevel > 1 && m_param->analysisReuseLevel != 10)
-    {
-        if (mightNotSplit && depth == m_reuseDepth[cuGeom.absPartIdx])
+        if ((m_param->analysisReuseMode == X265_ANALYSIS_LOAD && m_param->analysisReuseLevel > 1 && m_param->analysisReuseLevel != 10))
         {
-            if (m_reuseModes[cuGeom.absPartIdx] == MODE_SKIP)
+            if (mightNotSplit && depth == m_reuseDepth[cuGeom.absPartIdx])
             {
-                md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
-                md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
-                checkMerge2Nx2N_rd0_4(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom);
+                if (m_reuseModes[cuGeom.absPartIdx] == MODE_SKIP)
+                {
364
+                    md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
365
+                    md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
366
+                    checkMerge2Nx2N_rd0_4(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom);
367
 
368
-                skipRecursion = !!m_param->bEnableRecursionSkip && md.bestMode;
369
-                if (m_param->rdLevel)
370
-                    skipModes = m_param->bEnableEarlySkip && md.bestMode;
371
-            }
372
-            if (m_param->analysisReuseLevel > 4 && m_reusePartSize[cuGeom.absPartIdx] == SIZE_2Nx2N)
373
-            {
374
-                if (m_reuseModes[cuGeom.absPartIdx] != MODE_INTRA  && m_reuseModes[cuGeom.absPartIdx] != 4)
375
+                    skipRecursion = !!m_param->bEnableRecursionSkip && md.bestMode;
376
+                    if (m_param->rdLevel)
377
+                        skipModes = m_param->bEnableEarlySkip && md.bestMode;
378
+                }
379
+                if (m_param->analysisReuseLevel > 4 && m_reusePartSize[cuGeom.absPartIdx] == SIZE_2Nx2N)
380
                 {
381
-                    skipRectAmp = true && !!md.bestMode;
382
-                    chooseMerge = !!m_reuseMergeFlag[cuGeom.absPartIdx] && !!md.bestMode;
383
+                    if (m_reuseModes[cuGeom.absPartIdx] != MODE_INTRA  && m_reuseModes[cuGeom.absPartIdx] != 4)
384
+                    {
+                        skipRectAmp = true && !!md.bestMode;
+                        chooseMerge = !!m_reuseMergeFlag[cuGeom.absPartIdx] && !!md.bestMode;
+                    }
                 }
             }
         }
-    }
-    if (m_param->analysisMultiPassRefine && m_param->rc.bStatRead && m_multipassAnalysis)
-    {
-        if (mightNotSplit && depth == m_multipassDepth[cuGeom.absPartIdx])
+        if (m_param->analysisMultiPassRefine && m_param->rc.bStatRead && m_multipassAnalysis)
         {
-            if (m_multipassModes[cuGeom.absPartIdx] == MODE_SKIP)
+            if (mightNotSplit && depth == m_multipassDepth[cuGeom.absPartIdx])
             {
-                md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
-                md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
-                checkMerge2Nx2N_rd0_4(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom);
+                if (m_multipassModes[cuGeom.absPartIdx] == MODE_SKIP)
+                {
+                    md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
+                    md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
+                    checkMerge2Nx2N_rd0_4(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom);
 
-                skipRecursion = !!m_param->bEnableRecursionSkip && md.bestMode;
-                if (m_param->rdLevel)
-                    skipModes = m_param->bEnableEarlySkip && md.bestMode;
+                    skipRecursion = !!m_param->bEnableRecursionSkip && md.bestMode;
+                    if (m_param->rdLevel)
+                        skipModes = m_param->bEnableEarlySkip && md.bestMode;
+                }
             }
         }
-    }
 
-    /* Step 1. Evaluate Merge/Skip candidates for likely early-outs, if skip mode was not set above */
-    if (mightNotSplit && depth >= minDepth && !md.bestMode && !bCtuInfoCheck) /* TODO: Re-evaluate if analysis load/save still works */
-    {
-        /* Compute Merge Cost */
-        md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
-        md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
-        checkMerge2Nx2N_rd0_4(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom);
-        if (m_param->rdLevel)
-            skipModes = m_param->bEnableEarlySkip && md.bestMode && md.bestMode->cu.isSkipped(0); // TODO: sa8d threshold per depth
-    }
-
-    if (md.bestMode && m_param->bEnableRecursionSkip && !bCtuInfoCheck)
-    {
-        skipRecursion = md.bestMode->cu.isSkipped(0);
-        if (mightSplit && depth >= minDepth && !skipRecursion)
+        /* Step 1. Evaluate Merge/Skip candidates for likely early-outs, if skip mode was not set above */
+        if ((mightNotSplit && depth >= minDepth && !md.bestMode && !bCtuInfoCheck) || (m_param->bMVType && (m_modeFlag[0] || m_modeFlag[1]))) /* TODO: Re-evaluate if analysis load/save still works */
         {
-            if (depth)
-                skipRecursion = recursionDepthCheck(parentCTU, cuGeom, *md.bestMode);
-            if (m_bHD && !skipRecursion && m_param->rdLevel == 2 && md.fencYuv.m_size != MAX_CU_SIZE)
-                skipRecursion = complexityCheckCU(*md.bestMode);
+            /* Compute Merge Cost */
+            md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
+            md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
+            checkMerge2Nx2N_rd0_4(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom);
+            if (m_param->rdLevel)
+                skipModes = (m_param->bEnableEarlySkip || m_param->interRefine == 2)
+                && md.bestMode && md.bestMode->cu.isSkipped(0); // TODO: sa8d threshold per depth
         }
-    }
 
-    /* Step 2. Evaluate each of the 4 split sub-blocks in series */
-    if (mightSplit && !skipRecursion)
-    {
-        if (bCtuInfoCheck && m_param->bCTUInfo & 2)
-            qp = int((1 / 0.96) * qp + 0.5);
-        Mode* splitPred = &md.pred[PRED_SPLIT];
-        splitPred->initCosts();
-        CUData* splitCU = &splitPred->cu;
-        splitCU->initSubCU(parentCTU, cuGeom, qp);
+        if (md.bestMode && m_param->bEnableRecursionSkip && !bCtuInfoCheck && !(m_param->bMVType && (m_modeFlag[0] || m_modeFlag[1])))
+        {
+            skipRecursion = md.bestMode->cu.isSkipped(0);
+            if (mightSplit && depth >= minDepth && !skipRecursion)
+            {
+                if (depth)
+                    skipRecursion = recursionDepthCheck(parentCTU, cuGeom, *md.bestMode);
+                if (m_bHD && !skipRecursion && m_param->rdLevel == 2 && md.fencYuv.m_size != MAX_CU_SIZE)
+                    skipRecursion = complexityCheckCU(*md.bestMode);
+            }
+        }
 
-        uint32_t nextDepth = depth + 1;
-        ModeDepth& nd = m_modeDepth[nextDepth];
-        invalidateContexts(nextDepth);
-        Entropy* nextContext = &m_rqt[depth].cur;
-        int nextQP = qp;
-        splitIntra = false;
+        if (m_param->bMVType && md.bestMode && cuGeom.numPartitions <= 16)
+            skipRecursion = true;
 
-        for (uint32_t subPartIdx = 0; subPartIdx < 4; subPartIdx++)
+        /* Step 2. Evaluate each of the 4 split sub-blocks in series */
+        if (mightSplit && !skipRecursion)
         {
-            const CUGeom& childGeom = *(&cuGeom + cuGeom.childOffset + subPartIdx);
-            if (childGeom.flags & CUGeom::PRESENT)
+            if (bCtuInfoCheck && m_param->bCTUInfo & 2)
+                qp = int((1 / 0.96) * qp + 0.5);
+            Mode* splitPred = &md.pred[PRED_SPLIT];
+            splitPred->initCosts();
+            CUData* splitCU = &splitPred->cu;
+            splitCU->initSubCU(parentCTU, cuGeom, qp);
+
+            uint32_t nextDepth = depth + 1;
+            ModeDepth& nd = m_modeDepth[nextDepth];
+            invalidateContexts(nextDepth);
+            Entropy* nextContext = &m_rqt[depth].cur;
+            int nextQP = qp;
+            splitIntra = false;
+
+            for (uint32_t subPartIdx = 0; subPartIdx < 4; subPartIdx++)
             {
-                m_modeDepth[0].fencYuv.copyPartToYuv(nd.fencYuv, childGeom.absPartIdx);
-                m_rqt[nextDepth].cur.load(*nextContext);
+                const CUGeom& childGeom = *(&cuGeom + cuGeom.childOffset + subPartIdx);
+                if (childGeom.flags & CUGeom::PRESENT)
+                {
+                    m_modeDepth[0].fencYuv.copyPartToYuv(nd.fencYuv, childGeom.absPartIdx);
+                    m_rqt[nextDepth].cur.load(*nextContext);
 
-                if (m_slice->m_pps->bUseDQP && nextDepth <= m_slice->m_pps->maxCuDQPDepth)
-                    nextQP = setLambdaFromQP(parentCTU, calculateQpforCuSize(parentCTU, childGeom));
+                    if (m_slice->m_pps->bUseDQP && nextDepth <= m_slice->m_pps->maxCuDQPDepth)
+                        nextQP = setLambdaFromQP(parentCTU, calculateQpforCuSize(parentCTU, childGeom));
 
-                splitData[subPartIdx] = compressInterCU_rd0_4(parentCTU, childGeom, nextQP);
+                    splitData[subPartIdx] = compressInterCU_rd0_4(parentCTU, childGeom, nextQP);
 
-                // Save best CU and pred data for this sub CU
-                splitIntra |= nd.bestMode->cu.isIntra(0);
-                splitCU->copyPartFrom(nd.bestMode->cu, childGeom, subPartIdx);
-                splitPred->addSubCosts(*nd.bestMode);
+                    // Save best CU and pred data for this sub CU
+                    splitIntra |= nd.bestMode->cu.isIntra(0);
+                    splitCU->copyPartFrom(nd.bestMode->cu, childGeom, subPartIdx);
+                    splitPred->addSubCosts(*nd.bestMode);
 
-                if (m_param->rdLevel)
-                    nd.bestMode->reconYuv.copyToPartYuv(splitPred->reconYuv, childGeom.numPartitions * subPartIdx);
+                    if (m_param->rdLevel)
+                        nd.bestMode->reconYuv.copyToPartYuv(splitPred->reconYuv, childGeom.numPartitions * subPartIdx);
+                    else
+                        nd.bestMode->predYuv.copyToPartYuv(splitPred->predYuv, childGeom.numPartitions * subPartIdx);
+                    if (m_param->rdLevel > 1)
+                        nextContext = &nd.bestMode->contexts;
+                }
                 else
-                    nd.bestMode->predYuv.copyToPartYuv(splitPred->predYuv, childGeom.numPartitions * subPartIdx);
-                if (m_param->rdLevel > 1)
-                    nextContext = &nd.bestMode->contexts;
+                    splitCU->setEmptyPart(childGeom, subPartIdx);
             }
+            nextContext->store(splitPred->contexts);
+
+            if (mightNotSplit)
+                addSplitFlagCost(*splitPred, cuGeom.depth);
+            else if (m_param->rdLevel > 1)
+                updateModeCost(*splitPred);
             else
-                splitCU->setEmptyPart(childGeom, subPartIdx);
+                splitPred->sa8dCost = m_rdCost.calcRdSADCost((uint32_t)splitPred->distortion, splitPred->sa8dBits);
         }
-        nextContext->store(splitPred->contexts);
 
-        if (mightNotSplit)
-            addSplitFlagCost(*splitPred, cuGeom.depth);
-        else if (m_param->rdLevel > 1)
-            updateModeCost(*splitPred);
-        else
-            splitPred->sa8dCost = m_rdCost.calcRdSADCost((uint32_t)splitPred->distortion, splitPred->sa8dBits);
-    }
+        /* If analysis mode is simple do not Evaluate other modes */
+        if ((m_param->bMVType && cuGeom.numPartitions <= 16) && (m_slice->m_sliceType == P_SLICE || m_slice->m_sliceType == B_SLICE))
+            mightNotSplit = !(m_checkMergeAndSkipOnly[0] || (m_checkMergeAndSkipOnly[0] && m_checkMergeAndSkipOnly[1]));
 
-    /* Split CUs
-     *   0  1
-     *   2  3 */
-    uint32_t allSplitRefs = splitData[0].splitRefs | splitData[1].splitRefs | splitData[2].splitRefs | splitData[3].splitRefs;
-    /* Step 3. Evaluate ME (2Nx2N, rect, amp) and intra modes at current depth */
-    if (mightNotSplit && (depth >= minDepth || (m_param->bCTUInfo && !md.bestMode)))
-    {
-        if (m_slice->m_pps->bUseDQP && depth <= m_slice->m_pps->maxCuDQPDepth && m_slice->m_pps->maxCuDQPDepth != 0)
-            setLambdaFromQP(parentCTU, qp);
-
-        if (!skipModes)
+        /* Split CUs
+         *   0  1
+         *   2  3 */
+        uint32_t allSplitRefs = splitData[0].splitRefs | splitData[1].splitRefs | splitData[2].splitRefs | splitData[3].splitRefs;
+        /* Step 3. Evaluate ME (2Nx2N, rect, amp) and intra modes at current depth */
+        if (mightNotSplit && (depth >= minDepth || (m_param->bCTUInfo && !md.bestMode)))
         {
-            uint32_t refMasks[2];
-            refMasks[0] = allSplitRefs;
-            md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
-            checkInter_rd0_4(md.pred[PRED_2Nx2N], cuGeom, SIZE_2Nx2N, refMasks);
-
-            if (m_param->limitReferences & X265_REF_LIMIT_CU)
-            {
-                CUData& cu = md.pred[PRED_2Nx2N].cu;
-                uint32_t refMask = cu.getBestRefIdx(0);
-                allSplitRefs = splitData[0].splitRefs = splitData[1].splitRefs = splitData[2].splitRefs = splitData[3].splitRefs = refMask;
-            }
+            if (m_slice->m_pps->bUseDQP && depth <= m_slice->m_pps->maxCuDQPDepth && m_slice->m_pps->maxCuDQPDepth != 0)
+                setLambdaFromQP(parentCTU, qp);
 
-            if (m_slice->m_sliceType == B_SLICE)
+            if (!skipModes)
             {
-                md.pred[PRED_BIDIR].cu.initSubCU(parentCTU, cuGeom, qp);
-                checkBidir2Nx2N(md.pred[PRED_2Nx2N], md.pred[PRED_BIDIR], cuGeom);
-            }
+                uint32_t refMasks[2];
+                refMasks[0] = allSplitRefs;
+                md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
+                checkInter_rd0_4(md.pred[PRED_2Nx2N], cuGeom, SIZE_2Nx2N, refMasks);
 
-            Mode *bestInter = &md.pred[PRED_2Nx2N];
-            if (!skipRectAmp)
-            {
-                if (m_param->bEnableRectInter)
+                if (m_param->limitReferences & X265_REF_LIMIT_CU)
                 {
-                    uint64_t splitCost = splitData[0].sa8dCost + splitData[1].sa8dCost + splitData[2].sa8dCost + splitData[3].sa8dCost;
-                    uint32_t threshold_2NxN, threshold_Nx2N;
-
-                    if (m_slice->m_sliceType == P_SLICE)
-                    {
-                        threshold_2NxN = splitData[0].mvCost[0] + splitData[1].mvCost[0];
-                        threshold_Nx2N = splitData[0].mvCost[0] + splitData[2].mvCost[0];
-                    }
-                    else
-                    {
-                        threshold_2NxN = (splitData[0].mvCost[0] + splitData[1].mvCost[0]
-                                       + splitData[0].mvCost[1] + splitData[1].mvCost[1] + 1) >> 1;
-                        threshold_Nx2N = (splitData[0].mvCost[0] + splitData[2].mvCost[0]
-                                       + splitData[0].mvCost[1] + splitData[2].mvCost[1] + 1) >> 1;
-                    }
-
-                    int try_2NxN_first = threshold_2NxN < threshold_Nx2N;
-                    if (try_2NxN_first && splitCost < md.pred[PRED_2Nx2N].sa8dCost + threshold_2NxN)
-                    {
-                        refMasks[0] = splitData[0].splitRefs | splitData[1].splitRefs; /* top */
-                        refMasks[1] = splitData[2].splitRefs | splitData[3].splitRefs; /* bot */
-                        md.pred[PRED_2NxN].cu.initSubCU(parentCTU, cuGeom, qp);
-                        checkInter_rd0_4(md.pred[PRED_2NxN], cuGeom, SIZE_2NxN, refMasks);
-                        if (md.pred[PRED_2NxN].sa8dCost < bestInter->sa8dCost)
-                            bestInter = &md.pred[PRED_2NxN];
-                    }
-
-                    if (splitCost < md.pred[PRED_2Nx2N].sa8dCost + threshold_Nx2N)
-                    {
-                        refMasks[0] = splitData[0].splitRefs | splitData[2].splitRefs; /* left */
-                        refMasks[1] = splitData[1].splitRefs | splitData[3].splitRefs; /* right */
-                        md.pred[PRED_Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
-                        checkInter_rd0_4(md.pred[PRED_Nx2N], cuGeom, SIZE_Nx2N, refMasks);
-                        if (md.pred[PRED_Nx2N].sa8dCost < bestInter->sa8dCost)
-                            bestInter = &md.pred[PRED_Nx2N];
-                    }
-
-                    if (!try_2NxN_first && splitCost < md.pred[PRED_2Nx2N].sa8dCost + threshold_2NxN)
-                    {
-                        refMasks[0] = splitData[0].splitRefs | splitData[1].splitRefs; /* top */
-                        refMasks[1] = splitData[2].splitRefs | splitData[3].splitRefs; /* bot */
-                        md.pred[PRED_2NxN].cu.initSubCU(parentCTU, cuGeom, qp);
-                        checkInter_rd0_4(md.pred[PRED_2NxN], cuGeom, SIZE_2NxN, refMasks);
-                        if (md.pred[PRED_2NxN].sa8dCost < bestInter->sa8dCost)
-                            bestInter = &md.pred[PRED_2NxN];
-                    }
+                    CUData& cu = md.pred[PRED_2Nx2N].cu;
+                    uint32_t refMask = cu.getBestRefIdx(0);
+                    allSplitRefs = splitData[0].splitRefs = splitData[1].splitRefs = splitData[2].splitRefs = splitData[3].splitRefs = refMask;
                 }
 
-                if (m_slice->m_sps->maxAMPDepth > depth)
+                if (m_slice->m_sliceType == B_SLICE)
                 {
-                    uint64_t splitCost = splitData[0].sa8dCost + splitData[1].sa8dCost + splitData[2].sa8dCost + splitData[3].sa8dCost;
-                    uint32_t threshold_2NxnU, threshold_2NxnD, threshold_nLx2N, threshold_nRx2N;
-
-                    if (m_slice->m_sliceType == P_SLICE)
-                    {
-                        threshold_2NxnU = splitData[0].mvCost[0] + splitData[1].mvCost[0];
-                        threshold_2NxnD = splitData[2].mvCost[0] + splitData[3].mvCost[0];
+                    md.pred[PRED_BIDIR].cu.initSubCU(parentCTU, cuGeom, qp);
+                    checkBidir2Nx2N(md.pred[PRED_2Nx2N], md.pred[PRED_BIDIR], cuGeom);
+                }
 
-                        threshold_nLx2N = splitData[0].mvCost[0] + splitData[2].mvCost[0];
-                        threshold_nRx2N = splitData[1].mvCost[0] + splitData[3].mvCost[0];
-                    }
-                    else
+                Mode *bestInter = &md.pred[PRED_2Nx2N];
+                if (!skipRectAmp)
+                {
+                    if (m_param->bEnableRectInter)
                     {
-                        threshold_2NxnU = (splitData[0].mvCost[0] + splitData[1].mvCost[0]
-                                         + splitData[0].mvCost[1] + splitData[1].mvCost[1] + 1) >> 1;
-                        threshold_2NxnD = (splitData[2].mvCost[0] + splitData[3].mvCost[0]
-                                         + splitData[2].mvCost[1] + splitData[3].mvCost[1] + 1) >> 1;
+                        uint64_t splitCost = splitData[0].sa8dCost + splitData[1].sa8dCost + splitData[2].sa8dCost + splitData[3].sa8dCost;
+                        uint32_t threshold_2NxN, threshold_Nx2N;
 
-                        threshold_nLx2N = (splitData[0].mvCost[0] + splitData[2].mvCost[0]
-                                        + splitData[0].mvCost[1] + splitData[2].mvCost[1] + 1) >> 1;
-                        threshold_nRx2N = (splitData[1].mvCost[0] + splitData[3].mvCost[0]
-                                        + splitData[1].mvCost[1] + splitData[3].mvCost[1] + 1) >> 1;
-                    }
-
-                    bool bHor = false, bVer = false;
-                    if (bestInter->cu.m_partSize[0] == SIZE_2NxN)
-                        bHor = true;
-                    else if (bestInter->cu.m_partSize[0] == SIZE_Nx2N)
-                        bVer = true;
-                    else if (bestInter->cu.m_partSize[0] == SIZE_2Nx2N &&
-                        md.bestMode && md.bestMode->cu.getQtRootCbf(0))
-                    {
-                        bHor = true;
-                        bVer = true;
-                    }
+                        if (m_slice->m_sliceType == P_SLICE)
+                        {
+                            threshold_2NxN = splitData[0].mvCost[0] + splitData[1].mvCost[0];
+                            threshold_Nx2N = splitData[0].mvCost[0] + splitData[2].mvCost[0];
+                        }
+                        else
+                        {
+                            threshold_2NxN = (splitData[0].mvCost[0] + splitData[1].mvCost[0]
+                                + splitData[0].mvCost[1] + splitData[1].mvCost[1] + 1) >> 1;
+                            threshold_Nx2N = (splitData[0].mvCost[0] + splitData[2].mvCost[0]
+                                + splitData[0].mvCost[1] + splitData[2].mvCost[1] + 1) >> 1;
+                        }
 
-                    if (bHor)
-                    {
-                        int try_2NxnD_first = threshold_2NxnD < threshold_2NxnU;
-                        if (try_2NxnD_first && splitCost < md.pred[PRED_2Nx2N].sa8dCost + threshold_2NxnD)
+                        int try_2NxN_first = threshold_2NxN < threshold_Nx2N;
+                        if (try_2NxN_first && splitCost < md.pred[PRED_2Nx2N].sa8dCost + threshold_2NxN)
                         {
-                            refMasks[0] = allSplitRefs;                                    /* 75% top */
-                            refMasks[1] = splitData[2].splitRefs | splitData[3].splitRefs; /* 25% bot */
-                            md.pred[PRED_2NxnD].cu.initSubCU(parentCTU, cuGeom, qp);
-                            checkInter_rd0_4(md.pred[PRED_2NxnD], cuGeom, SIZE_2NxnD, refMasks);
-                            if (md.pred[PRED_2NxnD].sa8dCost < bestInter->sa8dCost)
-                                bestInter = &md.pred[PRED_2NxnD];
+                            refMasks[0] = splitData[0].splitRefs | splitData[1].splitRefs; /* top */
+                            refMasks[1] = splitData[2].splitRefs | splitData[3].splitRefs; /* bot */
+                            md.pred[PRED_2NxN].cu.initSubCU(parentCTU, cuGeom, qp);
+                            checkInter_rd0_4(md.pred[PRED_2NxN], cuGeom, SIZE_2NxN, refMasks);
+                            if (md.pred[PRED_2NxN].sa8dCost < bestInter->sa8dCost)
+                                bestInter = &md.pred[PRED_2NxN];
                         }
 
-                        if (splitCost < md.pred[PRED_2Nx2N].sa8dCost + threshold_2NxnU)
+                        if (splitCost < md.pred[PRED_2Nx2N].sa8dCost + threshold_Nx2N)
                         {
-                            refMasks[0] = splitData[0].splitRefs | splitData[1].splitRefs; /* 25% top */
-                            refMasks[1] = allSplitRefs;                                    /* 75% bot */
-                            md.pred[PRED_2NxnU].cu.initSubCU(parentCTU, cuGeom, qp);
-                            checkInter_rd0_4(md.pred[PRED_2NxnU], cuGeom, SIZE_2NxnU, refMasks);
-                            if (md.pred[PRED_2NxnU].sa8dCost < bestInter->sa8dCost)
-                                bestInter = &md.pred[PRED_2NxnU];
+                            refMasks[0] = splitData[0].splitRefs | splitData[2].splitRefs; /* left */
+                            refMasks[1] = splitData[1].splitRefs | splitData[3].splitRefs; /* right */
+                            md.pred[PRED_Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
+                            checkInter_rd0_4(md.pred[PRED_Nx2N], cuGeom, SIZE_Nx2N, refMasks);
+                            if (md.pred[PRED_Nx2N].sa8dCost < bestInter->sa8dCost)
+                                bestInter = &md.pred[PRED_Nx2N];
                         }
 
-                        if (!try_2NxnD_first && splitCost < md.pred[PRED_2Nx2N].sa8dCost + threshold_2NxnD)
+                        if (!try_2NxN_first && splitCost < md.pred[PRED_2Nx2N].sa8dCost + threshold_2NxN)
                         {
-                            refMasks[0] = allSplitRefs;                                    /* 75% top */
-                            refMasks[1] = splitData[2].splitRefs | splitData[3].splitRefs; /* 25% bot */
-                            md.pred[PRED_2NxnD].cu.initSubCU(parentCTU, cuGeom, qp);
-                            checkInter_rd0_4(md.pred[PRED_2NxnD], cuGeom, SIZE_2NxnD, refMasks);
-                            if (md.pred[PRED_2NxnD].sa8dCost < bestInter->sa8dCost)
-                                bestInter = &md.pred[PRED_2NxnD];
+                            refMasks[0] = splitData[0].splitRefs | splitData[1].splitRefs; /* top */
+                            refMasks[1] = splitData[2].splitRefs | splitData[3].splitRefs; /* bot */
+                            md.pred[PRED_2NxN].cu.initSubCU(parentCTU, cuGeom, qp);
+                            checkInter_rd0_4(md.pred[PRED_2NxN], cuGeom, SIZE_2NxN, refMasks);
+                            if (md.pred[PRED_2NxN].sa8dCost < bestInter->sa8dCost)
+                                bestInter = &md.pred[PRED_2NxN];
                         }
                     }
-                    if (bVer)
+
+                    if (m_slice->m_sps->maxAMPDepth > depth)
                     {
-                        int try_nRx2N_first = threshold_nRx2N < threshold_nLx2N;
-                        if (try_nRx2N_first && splitCost < md.pred[PRED_2Nx2N].sa8dCost + threshold_nRx2N)
+                        uint64_t splitCost = splitData[0].sa8dCost + splitData[1].sa8dCost + splitData[2].sa8dCost + splitData[3].sa8dCost;
+                        uint32_t threshold_2NxnU, threshold_2NxnD, threshold_nLx2N, threshold_nRx2N;
+
+                        if (m_slice->m_sliceType == P_SLICE)
+                        {
+                            threshold_2NxnU = splitData[0].mvCost[0] + splitData[1].mvCost[0];
+                            threshold_2NxnD = splitData[2].mvCost[0] + splitData[3].mvCost[0];
+
+                            threshold_nLx2N = splitData[0].mvCost[0] + splitData[2].mvCost[0];
+                            threshold_nRx2N = splitData[1].mvCost[0] + splitData[3].mvCost[0];
+                        }
+                        else
                         {
-                            refMasks[0] = allSplitRefs;                                    /* 75% left  */
-                            refMasks[1] = splitData[1].splitRefs | splitData[3].splitRefs; /* 25% right */
-                            md.pred[PRED_nRx2N].cu.initSubCU(parentCTU, cuGeom, qp);
-                            checkInter_rd0_4(md.pred[PRED_nRx2N], cuGeom, SIZE_nRx2N, refMasks);
-                            if (md.pred[PRED_nRx2N].sa8dCost < bestInter->sa8dCost)
-                                bestInter = &md.pred[PRED_nRx2N];
+                            threshold_2NxnU = (splitData[0].mvCost[0] + splitData[1].mvCost[0]
+                                + splitData[0].mvCost[1] + splitData[1].mvCost[1] + 1) >> 1;
+                            threshold_2NxnD = (splitData[2].mvCost[0] + splitData[3].mvCost[0]
+                                + splitData[2].mvCost[1] + splitData[3].mvCost[1] + 1) >> 1;
+
+                            threshold_nLx2N = (splitData[0].mvCost[0] + splitData[2].mvCost[0]
+                                + splitData[0].mvCost[1] + splitData[2].mvCost[1] + 1) >> 1;
+                            threshold_nRx2N = (splitData[1].mvCost[0] + splitData[3].mvCost[0]
+                                + splitData[1].mvCost[1] + splitData[3].mvCost[1] + 1) >> 1;
                         }
 
-                        if (splitCost < md.pred[PRED_2Nx2N].sa8dCost + threshold_nLx2N)
+                        bool bHor = false, bVer = false;
+                        if (bestInter->cu.m_partSize[0] == SIZE_2NxN)
+                            bHor = true;
+                        else if (bestInter->cu.m_partSize[0] == SIZE_Nx2N)
+                            bVer = true;
+                        else if (bestInter->cu.m_partSize[0] == SIZE_2Nx2N &&
+                            md.bestMode && md.bestMode->cu.getQtRootCbf(0))
                         {
-                            refMasks[0] = splitData[0].splitRefs | splitData[2].splitRefs; /* 25% left  */
-                            refMasks[1] = allSplitRefs;                                    /* 75% right */
-                            md.pred[PRED_nLx2N].cu.initSubCU(parentCTU, cuGeom, qp);
-                            checkInter_rd0_4(md.pred[PRED_nLx2N], cuGeom, SIZE_nLx2N, refMasks);
-                            if (md.pred[PRED_nLx2N].sa8dCost < bestInter->sa8dCost)
-                                bestInter = &md.pred[PRED_nLx2N];
+                            bHor = true;
+                            bVer = true;
                         }
 
-                        if (!try_nRx2N_first && splitCost < md.pred[PRED_2Nx2N].sa8dCost + threshold_nRx2N)
+                        if (bHor)
                         {
-                            refMasks[0] = allSplitRefs;                                    /* 75% left  */
-                            refMasks[1] = splitData[1].splitRefs | splitData[3].splitRefs; /* 25% right */
-                            md.pred[PRED_nRx2N].cu.initSubCU(parentCTU, cuGeom, qp);
-                            checkInter_rd0_4(md.pred[PRED_nRx2N], cuGeom, SIZE_nRx2N, refMasks);
-                            if (md.pred[PRED_nRx2N].sa8dCost < bestInter->sa8dCost)
-                                bestInter = &md.pred[PRED_nRx2N];
+                            int try_2NxnD_first = threshold_2NxnD < threshold_2NxnU;
+                            if (try_2NxnD_first && splitCost < md.pred[PRED_2Nx2N].sa8dCost + threshold_2NxnD)
+                            {
+                                refMasks[0] = allSplitRefs;                                    /* 75% top */
+                                refMasks[1] = splitData[2].splitRefs | splitData[3].splitRefs; /* 25% bot */
+                                md.pred[PRED_2NxnD].cu.initSubCU(parentCTU, cuGeom, qp);
+                                checkInter_rd0_4(md.pred[PRED_2NxnD], cuGeom, SIZE_2NxnD, refMasks);
+                                if (md.pred[PRED_2NxnD].sa8dCost < bestInter->sa8dCost)
+                                    bestInter = &md.pred[PRED_2NxnD];
+                            }
+
+                            if (splitCost < md.pred[PRED_2Nx2N].sa8dCost + threshold_2NxnU)
+                            {
+                                refMasks[0] = splitData[0].splitRefs | splitData[1].splitRefs; /* 25% top */
+                                refMasks[1] = allSplitRefs;                                    /* 75% bot */
+                                md.pred[PRED_2NxnU].cu.initSubCU(parentCTU, cuGeom, qp);
+                                checkInter_rd0_4(md.pred[PRED_2NxnU], cuGeom, SIZE_2NxnU, refMasks);
+                                if (md.pred[PRED_2NxnU].sa8dCost < bestInter->sa8dCost)
+                                    bestInter = &md.pred[PRED_2NxnU];
+                            }
+
+                            if (!try_2NxnD_first && splitCost < md.pred[PRED_2Nx2N].sa8dCost + threshold_2NxnD)
+                            {
+                                refMasks[0] = allSplitRefs;                                    /* 75% top */
+                                refMasks[1] = splitData[2].splitRefs | splitData[3].splitRefs; /* 25% bot */
+                                md.pred[PRED_2NxnD].cu.initSubCU(parentCTU, cuGeom, qp);
+                                checkInter_rd0_4(md.pred[PRED_2NxnD], cuGeom, SIZE_2NxnD, refMasks);
+                                if (md.pred[PRED_2NxnD].sa8dCost < bestInter->sa8dCost)
+                                    bestInter = &md.pred[PRED_2NxnD];
+                            }
                         }
-                    }
-                }
-            }
-            bool bTryIntra = (m_slice->m_sliceType != B_SLICE || m_param->bIntraInBFrames) && cuGeom.log2CUSize != MAX_LOG2_CU_SIZE && !((m_param->bCTUInfo & 4) && bCtuInfoCheck);
-            if (m_param->rdLevel >= 3)
-            {
-                /* Calculate RD cost of best inter option */
-                if ((!m_bChromaSa8d && (m_csp != X265_CSP_I400)) || (m_frame->m_fencPic->m_picCsp == X265_CSP_I400 && m_csp != X265_CSP_I400)) /* When m_bChromaSa8d is enabled, chroma MC has already been done */
-                {
-                    uint32_t numPU = bestInter->cu.getNumPartInter(0);
-                    for (uint32_t puIdx = 0; puIdx < numPU; puIdx++)
-                    {
-                        PredictionUnit pu(bestInter->cu, cuGeom, puIdx);
-                        motionCompensation(bestInter->cu, pu, bestInter->predYuv, false, true);
-                    }
-                }
+                        if (bVer)
+                        {
+                            int try_nRx2N_first = threshold_nRx2N < threshold_nLx2N;
+                            if (try_nRx2N_first && splitCost < md.pred[PRED_2Nx2N].sa8dCost + threshold_nRx2N)
+                            {
+                                refMasks[0] = allSplitRefs;                                    /* 75% left  */
+                                refMasks[1] = splitData[1].splitRefs | splitData[3].splitRefs; /* 25% right */
+                                md.pred[PRED_nRx2N].cu.initSubCU(parentCTU, cuGeom, qp);
+                                checkInter_rd0_4(md.pred[PRED_nRx2N], cuGeom, SIZE_nRx2N, refMasks);
+                                if (md.pred[PRED_nRx2N].sa8dCost < bestInter->sa8dCost)
+                                    bestInter = &md.pred[PRED_nRx2N];
+                            }
 
-                if (!chooseMerge)
-                {
-                    encodeResAndCalcRdInterCU(*bestInter, cuGeom);
-                    checkBestMode(*bestInter, depth);
+                            if (splitCost < md.pred[PRED_2Nx2N].sa8dCost + threshold_nLx2N)
+                            {
+                                refMasks[0] = splitData[0].splitRefs | splitData[2].splitRefs; /* 25% left  */
+                                refMasks[1] = allSplitRefs;                                    /* 75% right */
+                                md.pred[PRED_nLx2N].cu.initSubCU(parentCTU, cuGeom, qp);
+                                checkInter_rd0_4(md.pred[PRED_nLx2N], cuGeom, SIZE_nLx2N, refMasks);
+                                if (md.pred[PRED_nLx2N].sa8dCost < bestInter->sa8dCost)
+                                    bestInter = &md.pred[PRED_nLx2N];
+                            }
 
-                    /* If BIDIR is available and within 17/16 of best inter option, choose by RDO */
-                    if (m_slice->m_sliceType == B_SLICE && md.pred[PRED_BIDIR].sa8dCost != MAX_INT64 &&
-                        md.pred[PRED_BIDIR].sa8dCost * 16 <= bestInter->sa8dCost * 17)
-                    {
-                        uint32_t numPU = md.pred[PRED_BIDIR].cu.getNumPartInter(0);
-                        if (m_frame->m_fencPic->m_picCsp == X265_CSP_I400 && m_csp != X265_CSP_I400)
-                            for (uint32_t puIdx = 0; puIdx < numPU; puIdx++)
+                            if (!try_nRx2N_first && splitCost < md.pred[PRED_2Nx2N].sa8dCost + threshold_nRx2N)
                             {
-                                PredictionUnit pu(md.pred[PRED_BIDIR].cu, cuGeom, puIdx);
-                                motionCompensation(md.pred[PRED_BIDIR].cu, pu, md.pred[PRED_BIDIR].predYuv, true, true);
+                                refMasks[0] = allSplitRefs;                                    /* 75% left  */
+                                refMasks[1] = splitData[1].splitRefs | splitData[3].splitRefs; /* 25% right */
+                                md.pred[PRED_nRx2N].cu.initSubCU(parentCTU, cuGeom, qp);
+                                checkInter_rd0_4(md.pred[PRED_nRx2N], cuGeom, SIZE_nRx2N, refMasks);
+                                if (md.pred[PRED_nRx2N].sa8dCost < bestInter->sa8dCost)
+                                    bestInter = &md.pred[PRED_nRx2N];
                             }
-                        encodeResAndCalcRdInterCU(md.pred[PRED_BIDIR], cuGeom);
-                        checkBestMode(md.pred[PRED_BIDIR], depth);
+                        }
                     }
                 }
-
-                if ((bTryIntra && md.bestMode->cu.getQtRootCbf(0)) ||
-                    md.bestMode->sa8dCost == MAX_INT64)
+                bool bTryIntra = (m_slice->m_sliceType != B_SLICE || m_param->bIntraInBFrames) && cuGeom.log2CUSize != MAX_LOG2_CU_SIZE && !((m_param->bCTUInfo & 4) && bCtuInfoCheck);
+                if (m_param->rdLevel >= 3)
                 {
-                    if (!m_param->limitReferences || splitIntra)
+                    /* Calculate RD cost of best inter option */
+                    if ((!m_bChromaSa8d && (m_csp != X265_CSP_I400)) || (m_frame->m_fencPic->m_picCsp == X265_CSP_I400 && m_csp != X265_CSP_I400)) /* When m_bChromaSa8d is enabled, chroma MC has already been done */
                     {
-                        ProfileCounter(parentCTU, totalIntraCU[cuGeom.depth]);
-                        md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom, qp);
-                        checkIntraInInter(md.pred[PRED_INTRA], cuGeom);
-                        encodeIntraInInter(md.pred[PRED_INTRA], cuGeom);
-                        checkBestMode(md.pred[PRED_INTRA], depth);
-                    }
-                    else
-                    {
-                        ProfileCounter(parentCTU, skippedIntraCU[cuGeom.depth]);
+                        uint32_t numPU = bestInter->cu.getNumPartInter(0);
+                        for (uint32_t puIdx = 0; puIdx < numPU; puIdx++)
+                        {
+                            PredictionUnit pu(bestInter->cu, cuGeom, puIdx);
+                            motionCompensation(bestInter->cu, pu, bestInter->predYuv, false, true);
+                        }
                     }
-                }
-            }
-            else
-            {
-                /* SA8D choice between merge/skip, inter, bidir, and intra */
-                if (!md.bestMode || bestInter->sa8dCost < md.bestMode->sa8dCost)
-                    md.bestMode = bestInter;
 
-                if (m_slice->m_sliceType == B_SLICE &&
-                    md.pred[PRED_BIDIR].sa8dCost < md.bestMode->sa8dCost)
-                    md.bestMode = &md.pred[PRED_BIDIR];
-
-                if (bTryIntra || md.bestMode->sa8dCost == MAX_INT64)
-                {
-                    if (!m_param->limitReferences || splitIntra)
+                    if (!chooseMerge)
                     {
-                        ProfileCounter(parentCTU, totalIntraCU[cuGeom.depth]);
-                        md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom, qp);
-                        checkIntraInInter(md.pred[PRED_INTRA], cuGeom);
-                        if (md.pred[PRED_INTRA].sa8dCost < md.bestMode->sa8dCost)
-                            md.bestMode = &md.pred[PRED_INTRA];
-                    }
-                    else
-                    {
-                        ProfileCounter(parentCTU, skippedIntraCU[cuGeom.depth]);
+                        encodeResAndCalcRdInterCU(*bestInter, cuGeom);
+                        checkBestMode(*bestInter, depth);
+
+                        /* If BIDIR is available and within 17/16 of best inter option, choose by RDO */
+                        if (m_slice->m_sliceType == B_SLICE && md.pred[PRED_BIDIR].sa8dCost != MAX_INT64 &&
+                            md.pred[PRED_BIDIR].sa8dCost * 16 <= bestInter->sa8dCost * 17)
+                        {
+                            uint32_t numPU = md.pred[PRED_BIDIR].cu.getNumPartInter(0);
+                            if (m_frame->m_fencPic->m_picCsp == X265_CSP_I400 && m_csp != X265_CSP_I400)
+                                for (uint32_t puIdx = 0; puIdx < numPU; puIdx++)
+                                {
+                                    PredictionUnit pu(md.pred[PRED_BIDIR].cu, cuGeom, puIdx);
+                                    motionCompensation(md.pred[PRED_BIDIR].cu, pu, md.pred[PRED_BIDIR].predYuv, true, true);
+                                }
+                            encodeResAndCalcRdInterCU(md.pred[PRED_BIDIR], cuGeom);
+                            checkBestMode(md.pred[PRED_BIDIR], depth);
+                        }
                     }
-                }
 
-                /* finally code the best mode selected by SA8D costs:
-                 * RD level 2 - fully encode the best mode
-                 * RD level 1 - generate recon pixels
-                 * RD level 0 - generate chroma prediction */
-                if (md.bestMode->cu.m_mergeFlag[0] && md.bestMode->cu.m_partSize[0] == SIZE_2Nx2N)
-                {
-                    /* prediction already generated for this CU, and if rd level
-                     * is not 0, it is already fully encoded */
-                }
-                else if (md.bestMode->cu.isInter(0))
-                {
-                    uint32_t numPU = md.bestMode->cu.getNumPartInter(0);
-                    if (m_csp != X265_CSP_I400)
+                    if ((bTryIntra && md.bestMode->cu.getQtRootCbf(0)) ||
+                        md.bestMode->sa8dCost == MAX_INT64)
                     {
-                        for (uint32_t puIdx = 0; puIdx < numPU; puIdx++)
+                        if (!m_param->limitReferences || splitIntra)
                         {
-                            PredictionUnit pu(md.bestMode->cu, cuGeom, puIdx);
-                            motionCompensation(md.bestMode->cu, pu, md.bestMode->predYuv, false, true);
+                            ProfileCounter(parentCTU, totalIntraCU[cuGeom.depth]);
+                            md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom, qp);
+                            checkIntraInInter(md.pred[PRED_INTRA], cuGeom);
+                            encodeIntraInInter(md.pred[PRED_INTRA], cuGeom);
+                            checkBestMode(md.pred[PRED_INTRA], depth);
                         }
-                    }
-                    if (m_param->rdLevel == 2)
-                        encodeResAndCalcRdInterCU(*md.bestMode, cuGeom);
-                    else if (m_param->rdLevel == 1)
-                    {
-                        /* generate recon pixels with no rate distortion considerations */
-                        CUData& cu = md.bestMode->cu;
-
-                        uint32_t tuDepthRange[2];
-                        cu.getInterTUQtDepthRange(tuDepthRange, 0);
-                        m_rqt[cuGeom.depth].tmpResiYuv.subtract(*md.bestMode->fencYuv, md.bestMode->predYuv, cuGeom.log2CUSize, m_frame->m_fencPic->m_picCsp);
-                        residualTransformQuantInter(*md.bestMode, cuGeom, 0, 0, tuDepthRange);
-                        if (cu.getQtRootCbf(0))
-                            md.bestMode->reconYuv.addClip(md.bestMode->predYuv, m_rqt[cuGeom.depth].tmpResiYuv, cu.m_log2CUSize[0], m_frame->m_fencPic->m_picCsp);
                         else
                         {
-                            md.bestMode->reconYuv.copyFromYuv(md.bestMode->predYuv);
-                            if (cu.m_mergeFlag[0] && cu.m_partSize[0] == SIZE_2Nx2N)
-                                cu.setPredModeSubParts(MODE_SKIP);
+                            ProfileCounter(parentCTU, skippedIntraCU[cuGeom.depth]);
                         }
                     }
                 }
                 else
                 {
-                    if (m_param->rdLevel == 2)
-                        encodeIntraInInter(*md.bestMode, cuGeom);
-                    else if (m_param->rdLevel == 1)
-                    {
-                        /* generate recon pixels with no rate distortion considerations */
-                        CUData& cu = md.bestMode->cu;
+                    /* SA8D choice between merge/skip, inter, bidir, and intra */
+                    if (!md.bestMode || bestInter->sa8dCost < md.bestMode->sa8dCost)
+                        md.bestMode = bestInter;
 
-                        uint32_t tuDepthRange[2];
-                        cu.getIntraTUQtDepthRange(tuDepthRange, 0);
+                    if (m_slice->m_sliceType == B_SLICE &&
+                        md.pred[PRED_BIDIR].sa8dCost < md.bestMode->sa8dCost)
+                        md.bestMode = &md.pred[PRED_BIDIR];
 
-                        residualTransformQuantIntra(*md.bestMode, cuGeom, 0, 0, tuDepthRange);
+                    if (bTryIntra || md.bestMode->sa8dCost == MAX_INT64)
+                    {
+                        if (!m_param->limitReferences || splitIntra)
+                        {
+                            ProfileCounter(parentCTU, totalIntraCU[cuGeom.depth]);
+                            md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom, qp);
+                            checkIntraInInter(md.pred[PRED_INTRA], cuGeom);
+                            if (md.pred[PRED_INTRA].sa8dCost < md.bestMode->sa8dCost)
+                                md.bestMode = &md.pred[PRED_INTRA];
+                        }
+                        else
+                        {
+                            ProfileCounter(parentCTU, skippedIntraCU[cuGeom.depth]);
+                        }
+                    }
+
+                    /* finally code the best mode selected by SA8D costs:
+                     * RD level 2 - fully encode the best mode
+                     * RD level 1 - generate recon pixels
+                     * RD level 0 - generate chroma prediction */
+                    if (md.bestMode->cu.m_mergeFlag[0] && md.bestMode->cu.m_partSize[0] == SIZE_2Nx2N)
+                    {
+                        /* prediction already generated for this CU, and if rd level
+                         * is not 0, it is already fully encoded */
+                    }
+                    else if (md.bestMode->cu.isInter(0))
+                    {
+                        uint32_t numPU = md.bestMode->cu.getNumPartInter(0);
                         if (m_csp != X265_CSP_I400)
                         {
-                            getBestIntraModeChroma(*md.bestMode, cuGeom);
-                            residualQTIntraChroma(*md.bestMode, cuGeom, 0, 0);
+                            for (uint32_t puIdx = 0; puIdx < numPU; puIdx++)
+                            {
+                                PredictionUnit pu(md.bestMode->cu, cuGeom, puIdx);
+                                motionCompensation(md.bestMode->cu, pu, md.bestMode->predYuv, false, true);
+                            }
+                        }
+                        if (m_param->rdLevel == 2)
+                            encodeResAndCalcRdInterCU(*md.bestMode, cuGeom);
+                        else if (m_param->rdLevel == 1)
+                        {
+                            /* generate recon pixels with no rate distortion considerations */
+                            CUData& cu = md.bestMode->cu;
+
+                            uint32_t tuDepthRange[2];
+                            cu.getInterTUQtDepthRange(tuDepthRange, 0);
+                            m_rqt[cuGeom.depth].tmpResiYuv.subtract(*md.bestMode->fencYuv, md.bestMode->predYuv, cuGeom.log2CUSize, m_frame->m_fencPic->m_picCsp);
+                            residualTransformQuantInter(*md.bestMode, cuGeom, 0, 0, tuDepthRange);
+                            if (cu.getQtRootCbf(0))
+                                md.bestMode->reconYuv.addClip(md.bestMode->predYuv, m_rqt[cuGeom.depth].tmpResiYuv, cu.m_log2CUSize[0], m_frame->m_fencPic->m_picCsp);
+                            else
+                            {
+                                md.bestMode->reconYuv.copyFromYuv(md.bestMode->predYuv);
+                                if (cu.m_mergeFlag[0] && cu.m_partSize[0] == SIZE_2Nx2N)
+                                    cu.setPredModeSubParts(MODE_SKIP);
+                            }
+                        }
+                    }
+                    else
+                    {
+                        if (m_param->rdLevel == 2)
+                            encodeIntraInInter(*md.bestMode, cuGeom);
+                        else if (m_param->rdLevel == 1)
+                        {
+                            /* generate recon pixels with no rate distortion considerations */
+                            CUData& cu = md.bestMode->cu;
+
+                            uint32_t tuDepthRange[2];
+                            cu.getIntraTUQtDepthRange(tuDepthRange, 0);
+
+                            residualTransformQuantIntra(*md.bestMode, cuGeom, 0, 0, tuDepthRange);
+                            if (m_csp != X265_CSP_I400)
+                            {
+                                getBestIntraModeChroma(*md.bestMode, cuGeom);
+                                residualQTIntraChroma(*md.bestMode, cuGeom, 0, 0);
+                            }
+                            md.bestMode->reconYuv.copyFromPicYuv(reconPic, cu.m_cuAddr, cuGeom.absPartIdx); // TODO:
                         }
-                        md.bestMode->reconYuv.copyFromPicYuv(reconPic, cu.m_cuAddr, cuGeom.absPartIdx); // TODO:
                     }
                 }
-            }
-        } // !earlyskip
+            } // !earlyskip
 
-        if (m_bTryLossless)
-            tryLossless(cuGeom);
+            if (m_bTryLossless)
+                tryLossless(cuGeom);
 
-        if (mightSplit)
-            addSplitFlagCost(*md.bestMode, cuGeom.depth);
-    }
+            if (mightSplit)
+                addSplitFlagCost(*md.bestMode, cuGeom.depth);
+        }
 
-    if (mightSplit && !skipRecursion)
-    {
-        Mode* splitPred = &md.pred[PRED_SPLIT];
-        if (!md.bestMode)
-            md.bestMode = splitPred;
-        else if (m_param->rdLevel > 1)
-            checkBestMode(*splitPred, cuGeom.depth);
-        else if (splitPred->sa8dCost < md.bestMode->sa8dCost)
-            md.bestMode = splitPred;
+        if (mightSplit && !skipRecursion)
+        {
+            Mode* splitPred = &md.pred[PRED_SPLIT];
+            if (!md.bestMode)
+                md.bestMode = splitPred;
+            else if (m_param->rdLevel > 1)
+                checkBestMode(*splitPred, cuGeom.depth);
+            else if (splitPred->sa8dCost < md.bestMode->sa8dCost)
+                md.bestMode = splitPred;
 
-        checkDQPForSplitPred(*md.bestMode, cuGeom);
-    }
+            checkDQPForSplitPred(*md.bestMode, cuGeom);
+        }
 
-    /* determine which motion references the parent CU should search */
-    SplitData splitCUData;
-    splitCUData.initSplitCUData();
+        /* determine which motion references the parent CU should search */
+        splitCUData.initSplitCUData();
 
-    if (m_param->limitReferences & X265_REF_LIMIT_DEPTH)
-    {
-        if (md.bestMode == &md.pred[PRED_SPLIT])
-            splitCUData.splitRefs = allSplitRefs;
-        else 
+        if (m_param->limitReferences & X265_REF_LIMIT_DEPTH)
         {
-            /* use best merge/inter mode, in case of intra use 2Nx2N inter references */
-            CUData& cu = md.bestMode->cu.isIntra(0) ? md.pred[PRED_2Nx2N].cu : md.bestMode->cu;
-            uint32_t numPU = cu.getNumPartInter(0);
-            for (uint32_t puIdx = 0, subPartIdx = 0; puIdx < numPU; puIdx++, subPartIdx += cu.getPUOffset(puIdx, 0))
-                splitCUData.splitRefs |= cu.getBestRefIdx(subPartIdx);
+            if (md.bestMode == &md.pred[PRED_SPLIT])
+                splitCUData.splitRefs = allSplitRefs;
+            else
+            {
+                /* use best merge/inter mode, in case of intra use 2Nx2N inter references */
+                CUData& cu = md.bestMode->cu.isIntra(0) ? md.pred[PRED_2Nx2N].cu : md.bestMode->cu;
+                uint32_t numPU = cu.getNumPartInter(0);
+                for (uint32_t puIdx = 0, subPartIdx = 0; puIdx < numPU; puIdx++, subPartIdx += cu.getPUOffset(puIdx, 0))
+                    splitCUData.splitRefs |= cu.getBestRefIdx(subPartIdx);
+            }
         }
-    }
 
-    if (m_param->limitModes)
-    {
-        splitCUData.mvCost[0] = md.pred[PRED_2Nx2N].bestME[0][0].mvCost; // L0
-        splitCUData.mvCost[1] = md.pred[PRED_2Nx2N].bestME[0][1].mvCost; // L1
-        splitCUData.sa8dCost    = md.pred[PRED_2Nx2N].sa8dCost;
-    }
-    
-    if (mightNotSplit && md.bestMode->cu.isSkipped(0))
-    {
-        FrameData& curEncData = *m_frame->m_encData;
-        FrameData::RCStatCU& cuStat = curEncData.m_cuStat[parentCTU.m_cuAddr];
-        uint64_t temp = cuStat.avgCost[depth] * cuStat.count[depth];
-        cuStat.count[depth] += 1;
-        cuStat.avgCost[depth] = (temp + md.bestMode->rdCost) / cuStat.count[depth];
-    }
+        if (m_param->limitModes)
+        {
+            splitCUData.mvCost[0] = md.pred[PRED_2Nx2N].bestME[0][0].mvCost; // L0
+            splitCUData.mvCost[1] = md.pred[PRED_2Nx2N].bestME[0][1].mvCost; // L1
+            splitCUData.sa8dCost = md.pred[PRED_2Nx2N].sa8dCost;
+        }
 
-    /* Copy best data to encData CTU and recon */
-    md.bestMode->cu.copyToPic(depth);
-    if (m_param->rdLevel)
-        md.bestMode->reconYuv.copyToPicYuv(reconPic, cuAddr, cuGeom.absPartIdx);
+        if (mightNotSplit && md.bestMode->cu.isSkipped(0))
+        {
+            FrameData& curEncData = *m_frame->m_encData;
+            FrameData::RCStatCU& cuStat = curEncData.m_cuStat[parentCTU.m_cuAddr];
+            uint64_t temp = cuStat.avgCost[depth] * cuStat.count[depth];
+            cuStat.count[depth] += 1;
+            cuStat.avgCost[depth] = (temp + md.bestMode->rdCost) / cuStat.count[depth];
+        }
+
+        /* Copy best data to encData CTU and recon */
+        md.bestMode->cu.copyToPic(depth);
+        if (m_param->rdLevel)
+            md.bestMode->reconYuv.copyToPicYuv(reconPic, cuAddr, cuGeom.absPartIdx);
 
-    if ((m_limitTU & X265_TU_LIMIT_NEIGH) && cuGeom.log2CUSize >= 4)
+        if ((m_limitTU & X265_TU_LIMIT_NEIGH) && cuGeom.log2CUSize >= 4)
+        {
+            if (mightNotSplit)
+            {
+                CUData* ctu = md.bestMode->cu.m_encData->getPicCTU(parentCTU.m_cuAddr);
+                int8_t maxTUDepth = -1;
+                for (uint32_t i = 0; i < cuGeom.numPartitions; i++)
+                    maxTUDepth = X265_MAX(maxTUDepth, md.bestMode->cu.m_tuDepth[i]);
+                ctu->m_refTuDepth[cuGeom.geomRecurId] = maxTUDepth;
+            }
+        }
+    }
+    else
     {
-        if (mightNotSplit)
+        if (m_param->bMVType && cuGeom.numPartitions <= 16)
         {
-            CUData* ctu = md.bestMode->cu.m_encData->getPicCTU(parentCTU.m_cuAddr);
-            int8_t maxTUDepth = -1;
-            for (uint32_t i = 0; i < cuGeom.numPartitions; i++)
-                maxTUDepth = X265_MAX(maxTUDepth, md.bestMode->cu.m_tuDepth[i]);
-            ctu->m_refTuDepth[cuGeom.geomRecurId] = maxTUDepth;
+            qprdRefine(parentCTU, cuGeom, qp, qp);
+
+            SplitData splitData[4];
+            splitData[0].initSplitCUData();
+            splitData[1].initSplitCUData();
+            splitData[2].initSplitCUData();
+            splitData[3].initSplitCUData();
+
+            uint32_t allSplitRefs = splitData[0].splitRefs | splitData[1].splitRefs | splitData[2].splitRefs | splitData[3].splitRefs;
+
+            splitCUData.initSplitCUData();
+
+            if (m_param->limitReferences & X265_REF_LIMIT_DEPTH)
+            {
+                if (md.bestMode == &md.pred[PRED_SPLIT])
+                    splitCUData.splitRefs = allSplitRefs;
+                else
+                {
+                    /* use best merge/inter mode, in case of intra use 2Nx2N inter references */
+                    CUData& cu = md.bestMode->cu.isIntra(0) ? md.pred[PRED_2Nx2N].cu : md.bestMode->cu;
+                    uint32_t numPU = cu.getNumPartInter(0);
+                    for (uint32_t puIdx = 0, subPartIdx = 0; puIdx < numPU; puIdx++, subPartIdx += cu.getPUOffset(puIdx, 0))
+                        splitCUData.splitRefs |= cu.getBestRefIdx(subPartIdx);
+                }
+            }
+
+            if (m_param->limitModes)
+            {
+                splitCUData.mvCost[0] = md.pred[PRED_2Nx2N].bestME[0][0].mvCost; // L0
+                splitCUData.mvCost[1] = md.pred[PRED_2Nx2N].bestME[0][1].mvCost; // L1
+                splitCUData.sa8dCost = md.pred[PRED_2Nx2N].sa8dCost;
+            }
         }
     }
 
@@ -1747,484 +1855,544 @@
                     m_modeDepth[depth].fencYuv.m_integral[list][i][planes] = m_frame->m_encData->m_slice->m_refFrameList[list][i]->m_encData->m_meIntegral[planes] + offset;
     }
 
-    bool mightSplit = !(cuGeom.flags & CUGeom::LEAF);
-    bool mightNotSplit = !(cuGeom.flags & CUGeom::SPLIT_MANDATORY);
-    bool bDecidedDepth = parentCTU.m_cuDepth[cuGeom.absPartIdx] == depth;
-    bool skipRecursion = false;
-    bool skipModes = false;
-    bool splitIntra = true;
-    bool skipRectAmp = false;
-    bool bCtuInfoCheck = false;
-    int sameContentRef = 0;
+    SplitData splitCUData;
 
-    if (m_evaluateInter == 1)
-    {
-        skipRectAmp = !!md.bestMode;
-        mightSplit &= false;
-    }
+    bool bHEVCBlockAnalysis = (m_param->bMVType && cuGeom.numPartitions > 16);
+    bool bRefineAVCAnalysis = (m_param->analysisReuseLevel == 7 && (m_modeFlag[0] || m_modeFlag[1]));
+    bool bNooffloading = !m_param->bMVType;
 
-    // avoid uninitialize value in below reference
-    if (m_param->limitModes)
+    if (bHEVCBlockAnalysis || bRefineAVCAnalysis || bNooffloading)
     {
-        md.pred[PRED_2Nx2N].bestME[0][0].mvCost = 0; // L0
-        md.pred[PRED_2Nx2N].bestME[0][1].mvCost = 0; // L1
-        md.pred[PRED_2Nx2N].rdCost = 0;
-    }
+        bool mightSplit = !(cuGeom.flags & CUGeom::LEAF);
+        bool mightNotSplit = !(cuGeom.flags & CUGeom::SPLIT_MANDATORY);
+        bool bDecidedDepth = parentCTU.m_cuDepth[cuGeom.absPartIdx] == depth;
+        bool skipRecursion = false;
+        bool skipModes = false;
+        bool splitIntra = true;
+        bool skipRectAmp = false;
+        bool bCtuInfoCheck = false;
+        int sameContentRef = 0;
 
-    if ((m_limitTU & X265_TU_LIMIT_NEIGH) && cuGeom.log2CUSize >= 4)
-        m_maxTUDepth = loadTUDepth(cuGeom, parentCTU);
+        if (m_evaluateInter)
+        {
+            if (m_param->interRefine == 2)
+            {
+                if (parentCTU.m_predMode[cuGeom.absPartIdx] == MODE_SKIP)
+                    skipModes = true;
+                if (parentCTU.m_partSize[cuGeom.absPartIdx] == SIZE_2Nx2N)
+                    skipRectAmp = true;
+            }
+            mightSplit &= false;
+        }
 
-    SplitData splitData[4];
-    splitData[0].initSplitCUData();
-    splitData[1].initSplitCUData();
-    splitData[2].initSplitCUData();
-    splitData[3].initSplitCUData();
-    uint32_t allSplitRefs = splitData[0].splitRefs | splitData[1].splitRefs | splitData[2].splitRefs | splitData[3].splitRefs;
-    uint32_t refMasks[2];
-    if (m_param->bCTUInfo && depth <= parentCTU.m_cuDepth[cuGeom.absPartIdx])
-    {
-        if (bDecidedDepth && m_additionalCtuInfo[cuGeom.absPartIdx])
-            sameContentRef = findSameContentRefCount(parentCTU, cuGeom);
-        if (depth < parentCTU.m_cuDepth[cuGeom.absPartIdx])
+        // avoid uninitialize value in below reference
+        if (m_param->limitModes)
         {
-            mightNotSplit &= bDecidedDepth;
-            bCtuInfoCheck = skipRecursion = false;
-            skipModes = true;
+            md.pred[PRED_2Nx2N].bestME[0][0].mvCost = 0; // L0
+            md.pred[PRED_2Nx2N].bestME[0][1].mvCost = 0; // L1
+            md.pred[PRED_2Nx2N].rdCost = 0;
         }
-        else if (mightNotSplit && bDecidedDepth)
+
+        if ((m_limitTU & X265_TU_LIMIT_NEIGH) && cuGeom.log2CUSize >= 4)
+            m_maxTUDepth = loadTUDepth(cuGeom, parentCTU);
+
+        SplitData splitData[4];
+        splitData[0].initSplitCUData();
+        splitData[1].initSplitCUData();
+        splitData[2].initSplitCUData();
+        splitData[3].initSplitCUData();
+        uint32_t allSplitRefs = splitData[0].splitRefs | splitData[1].splitRefs | splitData[2].splitRefs | splitData[3].splitRefs;
+        uint32_t refMasks[2];
+        if (m_param->bCTUInfo && depth <= parentCTU.m_cuDepth[cuGeom.absPartIdx])
         {
-            if (m_additionalCtuInfo[cuGeom.absPartIdx])
+            if (bDecidedDepth && m_additionalCtuInfo[cuGeom.absPartIdx])
+                sameContentRef = findSameContentRefCount(parentCTU, cuGeom);
+            if (depth < parentCTU.m_cuDepth[cuGeom.absPartIdx])
             {
-                bCtuInfoCheck = skipRecursion = true;
-                refMasks[0] = allSplitRefs;
-                md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
-                checkInter_rd5_6(md.pred[PRED_2Nx2N], cuGeom, SIZE_2Nx2N, refMasks);
-                checkBestMode(md.pred[PRED_2Nx2N], cuGeom.depth);
-                if (!sameContentRef)
+                mightNotSplit &= bDecidedDepth;
+                bCtuInfoCheck = skipRecursion = false;
+                skipModes = true;
+            }
+            else if (mightNotSplit && bDecidedDepth)
1443
+            {
1444
+                if (m_additionalCtuInfo[cuGeom.absPartIdx])
1445
                 {
1446
-                    if ((m_param->bCTUInfo & 2) && (m_slice->m_pps->bUseDQP && depth <= m_slice->m_pps->maxCuDQPDepth))
1447
+                    bCtuInfoCheck = skipRecursion = true;
1448
+                    refMasks[0] = allSplitRefs;
1449
+                    md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
1450
+                    checkInter_rd5_6(md.pred[PRED_2Nx2N], cuGeom, SIZE_2Nx2N, refMasks);
1451
+                    checkBestMode(md.pred[PRED_2Nx2N], cuGeom.depth);
1452
+                    if (!sameContentRef)
1453
                     {
1454
-                        qp -= int32_t(0.04 * qp);
1455
-                        setLambdaFromQP(parentCTU, qp);
1456
+                        if ((m_param->bCTUInfo & 2) && (m_slice->m_pps->bUseDQP && depth <= m_slice->m_pps->maxCuDQPDepth))
1457
+                        {
1458
+                            qp -= int32_t(0.04 * qp);
1459
+                            setLambdaFromQP(parentCTU, qp);
1460
+                        }
1461
+                        if (m_param->bCTUInfo & 4)
1462
+                            skipModes = false;
1463
+                    }
1464
+                    if (sameContentRef || (!sameContentRef && !(m_param->bCTUInfo & 4)))
1465
+                    {
1466
+                        if (m_param->rdLevel)
1467
+                            skipModes = m_param->bEnableEarlySkip && md.bestMode && md.bestMode->cu.isSkipped(0);
1468
+                        if ((m_param->bCTUInfo & 4) && sameContentRef)
1469
+                            skipModes = md.bestMode && true;
1470
                     }
1471
-                    if (m_param->bCTUInfo & 4)
1472
-                        skipModes = false;
1473
                 }
1474
-                if (sameContentRef || (!sameContentRef && !(m_param->bCTUInfo & 4)))
1475
+                else
1476
                 {
1477
-                    if (m_param->rdLevel)
1478
-                        skipModes = m_param->bEnableEarlySkip && md.bestMode && md.bestMode->cu.isSkipped(0);
1479
-                    if ((m_param->bCTUInfo & 4) && sameContentRef)
1480
-                        skipModes = md.bestMode && true;
1481
+                    md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
1482
+                    md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
1483
+                    checkMerge2Nx2N_rd5_6(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom);
1484
+                    skipModes = !!m_param->bEnableEarlySkip && md.bestMode;
1485
+                    refMasks[0] = allSplitRefs;
1486
+                    md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
1487
+                    checkInter_rd5_6(md.pred[PRED_2Nx2N], cuGeom, SIZE_2Nx2N, refMasks);
1488
+                    checkBestMode(md.pred[PRED_2Nx2N], cuGeom.depth);
1489
                 }
1490
+                mightSplit &= !bDecidedDepth;
1491
             }
1492
-            else
1493
-            {
1494
-                md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
1495
-                md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
1496
-                checkMerge2Nx2N_rd5_6(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom);
1497
-                skipModes = !!m_param->bEnableEarlySkip && md.bestMode;
1498
-                refMasks[0] = allSplitRefs;
1499
-                md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
1500
-                checkInter_rd5_6(md.pred[PRED_2Nx2N], cuGeom, SIZE_2Nx2N, refMasks);
1501
-                checkBestMode(md.pred[PRED_2Nx2N], cuGeom.depth);
1502
-            }
1503
-            mightSplit &= !bDecidedDepth;
1504
         }
1505
-    }
1506
-    if (m_param->analysisReuseMode == X265_ANALYSIS_LOAD && m_param->analysisReuseLevel > 1 && m_param->analysisReuseLevel != 10)
1507
-    {
1508
-        if (mightNotSplit && depth == m_reuseDepth[cuGeom.absPartIdx])
1509
+        if (m_param->analysisReuseMode == X265_ANALYSIS_LOAD && m_param->analysisReuseLevel > 1 && m_param->analysisReuseLevel != 10)
1510
         {
1511
-            if (m_reuseModes[cuGeom.absPartIdx] == MODE_SKIP)
1512
+            if (mightNotSplit && depth == m_reuseDepth[cuGeom.absPartIdx])
1513
             {
1514
-                md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
1515
-                md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
1516
-                checkMerge2Nx2N_rd5_6(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom);
1517
-                skipModes = !!m_param->bEnableEarlySkip && md.bestMode;
1518
-                refMasks[0] = allSplitRefs;
1519
-                md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
1520
-                checkInter_rd5_6(md.pred[PRED_2Nx2N], cuGeom, SIZE_2Nx2N, refMasks);
1521
-                checkBestMode(md.pred[PRED_2Nx2N], cuGeom.depth);
1522
+                if (m_reuseModes[cuGeom.absPartIdx] == MODE_SKIP)
1523
+                {
1524
+                    md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
1525
+                    md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
1526
+                    checkMerge2Nx2N_rd5_6(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom);
1527
+                    skipModes = !!m_param->bEnableEarlySkip && md.bestMode;
1528
+                    refMasks[0] = allSplitRefs;
1529
+                    md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
1530
+                    checkInter_rd5_6(md.pred[PRED_2Nx2N], cuGeom, SIZE_2Nx2N, refMasks);
1531
+                    checkBestMode(md.pred[PRED_2Nx2N], cuGeom.depth);
1532
 
1533
-                if (m_param->bEnableRecursionSkip && depth && m_modeDepth[depth - 1].bestMode)
1534
-                    skipRecursion = md.bestMode && !md.bestMode->cu.getQtRootCbf(0);
1535
+                    if (m_param->bEnableRecursionSkip && depth && m_modeDepth[depth - 1].bestMode)
1536
+                        skipRecursion = md.bestMode && !md.bestMode->cu.getQtRootCbf(0);
1537
+                }
1538
+                if (m_param->analysisReuseLevel > 4 && m_reusePartSize[cuGeom.absPartIdx] == SIZE_2Nx2N)
1539
+                    skipRectAmp = true && !!md.bestMode;
1540
             }
1541
-            if (m_param->analysisReuseLevel > 4 && m_reusePartSize[cuGeom.absPartIdx] == SIZE_2Nx2N)
1542
-                skipRectAmp = true && !!md.bestMode;
1543
         }
1544
-    }
1545
 
1546
-    if (m_param->analysisMultiPassRefine && m_param->rc.bStatRead && m_multipassAnalysis)
1547
-    {
1548
-        if (mightNotSplit && depth == m_multipassDepth[cuGeom.absPartIdx])
1549
+        if (m_param->analysisMultiPassRefine && m_param->rc.bStatRead && m_multipassAnalysis)
1550
         {
1551
-            if (m_multipassModes[cuGeom.absPartIdx] == MODE_SKIP)
1552
+            if (mightNotSplit && depth == m_multipassDepth[cuGeom.absPartIdx])
1553
             {
1554
-                md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
1555
-                md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
1556
-                checkMerge2Nx2N_rd0_4(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom);
1557
+                if (m_multipassModes[cuGeom.absPartIdx] == MODE_SKIP)
1558
+                {
1559
+                    md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
1560
+                    md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
1561
+                    checkMerge2Nx2N_rd0_4(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom);
1562
 
1563
-                skipModes = !!m_param->bEnableEarlySkip && md.bestMode;
1564
-                refMasks[0] = allSplitRefs;
1565
-                md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
1566
-                checkInter_rd5_6(md.pred[PRED_2Nx2N], cuGeom, SIZE_2Nx2N, refMasks);
1567
-                checkBestMode(md.pred[PRED_2Nx2N], cuGeom.depth);
1568
+                    skipModes = !!m_param->bEnableEarlySkip && md.bestMode;
1569
+                    refMasks[0] = allSplitRefs;
1570
+                    md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
1571
+                    checkInter_rd5_6(md.pred[PRED_2Nx2N], cuGeom, SIZE_2Nx2N, refMasks);
1572
+                    checkBestMode(md.pred[PRED_2Nx2N], cuGeom.depth);
1573
 
1574
-                if (m_param->bEnableRecursionSkip && depth && m_modeDepth[depth - 1].bestMode)
1575
-                    skipRecursion = md.bestMode && !md.bestMode->cu.getQtRootCbf(0);
1576
+                    if (m_param->bEnableRecursionSkip && depth && m_modeDepth[depth - 1].bestMode)
1577
+                        skipRecursion = md.bestMode && !md.bestMode->cu.getQtRootCbf(0);
1578
+                }
1579
             }
1580
         }
1581
-    }
1582
 
1583
-    /* Step 1. Evaluate Merge/Skip candidates for likely early-outs */
1584
-    if (mightNotSplit && !md.bestMode && !bCtuInfoCheck)
1585
-    {
1586
-        md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
1587
-        md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
1588
-        checkMerge2Nx2N_rd5_6(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom);
1589
-        skipModes = m_param->bEnableEarlySkip && md.bestMode && !md.bestMode->cu.getQtRootCbf(0);
1590
-        refMasks[0] = allSplitRefs;
1591
-        md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
1592
-        checkInter_rd5_6(md.pred[PRED_2Nx2N], cuGeom, SIZE_2Nx2N, refMasks);
1593
-        checkBestMode(md.pred[PRED_2Nx2N], cuGeom.depth);
1594
-
1595
-        if (m_param->bEnableRecursionSkip && depth && m_modeDepth[depth - 1].bestMode)
1596
-            skipRecursion = md.bestMode && !md.bestMode->cu.getQtRootCbf(0);
1597
-    }
1598
-
1599
-    // estimate split cost
1600
-    /* Step 2. Evaluate each of the 4 split sub-blocks in series */
1601
-    if (mightSplit && !skipRecursion)
1602
-    {
1603
-        if (bCtuInfoCheck && m_param->bCTUInfo & 2)
1604
-            qp = int((1 / 0.96) * qp + 0.5);
1605
-        Mode* splitPred = &md.pred[PRED_SPLIT];
1606
-        splitPred->initCosts();
1607
-        CUData* splitCU = &splitPred->cu;
1608
-        splitCU->initSubCU(parentCTU, cuGeom, qp);
1609
+        /* Step 1. Evaluate Merge/Skip candidates for likely early-outs */
1610
+        if ((mightNotSplit && !md.bestMode && !bCtuInfoCheck) ||
1611
+            (m_param->bMVType && (m_modeFlag[0] || m_modeFlag[1])))
1612
+        {
1613
+            md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
1614
+            md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
1615
+            checkMerge2Nx2N_rd5_6(md.pred[PRED_SKIP], md.pred[PRED_MERGE], cuGeom);
1616
+            skipModes = (m_param->bEnableEarlySkip || m_param->interRefine == 2) &&
1617
+                md.bestMode && !md.bestMode->cu.getQtRootCbf(0);
1618
+            refMasks[0] = allSplitRefs;
1619
+            md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
1620
+            checkInter_rd5_6(md.pred[PRED_2Nx2N], cuGeom, SIZE_2Nx2N, refMasks);
1621
+            checkBestMode(md.pred[PRED_2Nx2N], cuGeom.depth);
1622
 
1623
-        uint32_t nextDepth = depth + 1;
1624
-        ModeDepth& nd = m_modeDepth[nextDepth];
1625
-        invalidateContexts(nextDepth);
1626
-        Entropy* nextContext = &m_rqt[depth].cur;
1627
-        int nextQP = qp;
1628
-        splitIntra = false;
1629
+            if (m_param->bEnableRecursionSkip && depth && m_modeDepth[depth - 1].bestMode)
1630
+                skipRecursion = md.bestMode && !md.bestMode->cu.getQtRootCbf(0);
1631
+        }
1632
 
1633
-        for (uint32_t subPartIdx = 0; subPartIdx < 4; subPartIdx++)
1634
+        if (m_param->bMVType && md.bestMode && cuGeom.numPartitions <= 16)
1635
+            skipRecursion = true;
1636
+
1637
+        // estimate split cost
1638
+        /* Step 2. Evaluate each of the 4 split sub-blocks in series */
1639
+        if (mightSplit && !skipRecursion)
1640
         {
1641
-            const CUGeom& childGeom = *(&cuGeom + cuGeom.childOffset + subPartIdx);
1642
-            if (childGeom.flags & CUGeom::PRESENT)
1643
+            if (bCtuInfoCheck && m_param->bCTUInfo & 2)
1644
+                qp = int((1 / 0.96) * qp + 0.5);
1645
+            Mode* splitPred = &md.pred[PRED_SPLIT];
1646
+            splitPred->initCosts();
1647
+            CUData* splitCU = &splitPred->cu;
1648
+            splitCU->initSubCU(parentCTU, cuGeom, qp);
1649
+
1650
+            uint32_t nextDepth = depth + 1;
1651
+            ModeDepth& nd = m_modeDepth[nextDepth];
1652
+            invalidateContexts(nextDepth);
1653
+            Entropy* nextContext = &m_rqt[depth].cur;
1654
+            int nextQP = qp;
1655
+            splitIntra = false;
1656
+
1657
+            for (uint32_t subPartIdx = 0; subPartIdx < 4; subPartIdx++)
1658
             {
1659
-                m_modeDepth[0].fencYuv.copyPartToYuv(nd.fencYuv, childGeom.absPartIdx);
1660
-                m_rqt[nextDepth].cur.load(*nextContext);
1661
+                const CUGeom& childGeom = *(&cuGeom + cuGeom.childOffset + subPartIdx);
1662
+                if (childGeom.flags & CUGeom::PRESENT)
1663
+                {
1664
+                    m_modeDepth[0].fencYuv.copyPartToYuv(nd.fencYuv, childGeom.absPartIdx);
1665
+                    m_rqt[nextDepth].cur.load(*nextContext);
1666
 
1667
-                if (m_slice->m_pps->bUseDQP && nextDepth <= m_slice->m_pps->maxCuDQPDepth)
1668
-                    nextQP = setLambdaFromQP(parentCTU, calculateQpforCuSize(parentCTU, childGeom));
1669
+                    if (m_slice->m_pps->bUseDQP && nextDepth <= m_slice->m_pps->maxCuDQPDepth)
1670
+                        nextQP = setLambdaFromQP(parentCTU, calculateQpforCuSize(parentCTU, childGeom));
1671
 
1672
-                splitData[subPartIdx] = compressInterCU_rd5_6(parentCTU, childGeom, nextQP);
1673
+                    splitData[subPartIdx] = compressInterCU_rd5_6(parentCTU, childGeom, nextQP);
1674
 
1675
-                // Save best CU and pred data for this sub CU
1676
-                splitIntra |= nd.bestMode->cu.isIntra(0);
1677
-                splitCU->copyPartFrom(nd.bestMode->cu, childGeom, subPartIdx);
1678
-                splitPred->addSubCosts(*nd.bestMode);
1679
-                nd.bestMode->reconYuv.copyToPartYuv(splitPred->reconYuv, childGeom.numPartitions * subPartIdx);
1680
-                nextContext = &nd.bestMode->contexts;
1681
+                    // Save best CU and pred data for this sub CU
1682
+                    splitIntra |= nd.bestMode->cu.isIntra(0);
1683
+                    splitCU->copyPartFrom(nd.bestMode->cu, childGeom, subPartIdx);
1684
+                    splitPred->addSubCosts(*nd.bestMode);
1685
+                    nd.bestMode->reconYuv.copyToPartYuv(splitPred->reconYuv, childGeom.numPartitions * subPartIdx);
1686
+                    nextContext = &nd.bestMode->contexts;
1687
+                }
1688
+                else
1689
+                {
1690
+                    splitCU->setEmptyPart(childGeom, subPartIdx);
1691
+                }
1692
             }
1693
+            nextContext->store(splitPred->contexts);
1694
+            if (mightNotSplit)
1695
+                addSplitFlagCost(*splitPred, cuGeom.depth);
1696
             else
1697
-            {
1698
-                splitCU->setEmptyPart(childGeom, subPartIdx);
1699
-            }
1700
-        }
1701
-        nextContext->store(splitPred->contexts);
1702
-        if (mightNotSplit)
1703
-            addSplitFlagCost(*splitPred, cuGeom.depth);
1704
-        else
1705
-            updateModeCost(*splitPred);
1706
+                updateModeCost(*splitPred);
1707
 
1708
-        checkDQPForSplitPred(*splitPred, cuGeom);
1709
-    }
1710
+            checkDQPForSplitPred(*splitPred, cuGeom);
1711
+        }
1712
 
1713
-    /* Split CUs
1714
-     *   0  1
1715
-     *   2  3 */
1716
-    allSplitRefs = splitData[0].splitRefs | splitData[1].splitRefs | splitData[2].splitRefs | splitData[3].splitRefs;
1717
-    /* Step 3. Evaluate ME (2Nx2N, rect, amp) and intra modes at current depth */
1718
-    if (mightNotSplit)
1719
-    {
1720
-        if (m_slice->m_pps->bUseDQP && depth <= m_slice->m_pps->maxCuDQPDepth && m_slice->m_pps->maxCuDQPDepth != 0)
1721
-            setLambdaFromQP(parentCTU, qp);
1722
+        /* If analysis mode is simple do not Evaluate other modes */
1723
+        if ((m_param->bMVType && cuGeom.numPartitions <= 16) && (m_slice->m_sliceType == P_SLICE || m_slice->m_sliceType == B_SLICE))
1724
+            mightNotSplit = !(m_checkMergeAndSkipOnly[0] || (m_checkMergeAndSkipOnly[0] && m_checkMergeAndSkipOnly[1]));
1725
 
1726
-        if (!skipModes)
1727
+        /* Split CUs
1728
+         *   0  1
1729
+         *   2  3 */
1730
+        allSplitRefs = splitData[0].splitRefs | splitData[1].splitRefs | splitData[2].splitRefs | splitData[3].splitRefs;
1731
+        /* Step 3. Evaluate ME (2Nx2N, rect, amp) and intra modes at current depth */
1732
+        if (mightNotSplit)
1733
         {
1734
-            refMasks[0] = allSplitRefs;
1735
+            if (m_slice->m_pps->bUseDQP && depth <= m_slice->m_pps->maxCuDQPDepth && m_slice->m_pps->maxCuDQPDepth != 0)
1736
+                setLambdaFromQP(parentCTU, qp);
1737
 
1738
-            if (m_param->limitReferences & X265_REF_LIMIT_CU)
1739
+            if (!skipModes)
1740
             {
1741
-                CUData& cu = md.pred[PRED_2Nx2N].cu;
1742
-                uint32_t refMask = cu.getBestRefIdx(0);
1743
-                allSplitRefs = splitData[0].splitRefs = splitData[1].splitRefs = splitData[2].splitRefs = splitData[3].splitRefs = refMask;
1744
-            }
1745
+                refMasks[0] = allSplitRefs;
1746
 
1747
-            if (m_slice->m_sliceType == B_SLICE)
1748
-            {
1749
-                md.pred[PRED_BIDIR].cu.initSubCU(parentCTU, cuGeom, qp);
1750
-                checkBidir2Nx2N(md.pred[PRED_2Nx2N], md.pred[PRED_BIDIR], cuGeom);
1751
-                if (md.pred[PRED_BIDIR].sa8dCost < MAX_INT64)
1752
+                if (m_param->limitReferences & X265_REF_LIMIT_CU)
1753
                 {
1754
-                    uint32_t numPU = md.pred[PRED_BIDIR].cu.getNumPartInter(0);
1755
-                    if (m_frame->m_fencPic->m_picCsp == X265_CSP_I400 && m_csp != X265_CSP_I400)
1756
-                        for (uint32_t puIdx = 0; puIdx < numPU; puIdx++)
1757
-                        {
1758
-                            PredictionUnit pu(md.pred[PRED_BIDIR].cu, cuGeom, puIdx);
1759
-                            motionCompensation(md.pred[PRED_BIDIR].cu, pu, md.pred[PRED_BIDIR].predYuv, true, true);
1760
-                        }
1761
-                    encodeResAndCalcRdInterCU(md.pred[PRED_BIDIR], cuGeom);
1762
-                    checkBestMode(md.pred[PRED_BIDIR], cuGeom.depth);
1763
+                    CUData& cu = md.pred[PRED_2Nx2N].cu;
1764
+                    uint32_t refMask = cu.getBestRefIdx(0);
1765
+                    allSplitRefs = splitData[0].splitRefs = splitData[1].splitRefs = splitData[2].splitRefs = splitData[3].splitRefs = refMask;
1766
                 }
1767
-            }
1768
 
1769
-            if (!skipRectAmp)
1770
-            {
1771
-                if (m_param->bEnableRectInter)
1772
+                if (m_slice->m_sliceType == B_SLICE)
1773
                 {
1774
-                    uint64_t splitCost = splitData[0].sa8dCost + splitData[1].sa8dCost + splitData[2].sa8dCost + splitData[3].sa8dCost;
1775
-                    uint32_t threshold_2NxN, threshold_Nx2N;
1776
-
1777
-                    if (m_slice->m_sliceType == P_SLICE)
1778
-                    {
1779
-                        threshold_2NxN = splitData[0].mvCost[0] + splitData[1].mvCost[0];
1780
-                        threshold_Nx2N = splitData[0].mvCost[0] + splitData[2].mvCost[0];
1781
-                    }
1782
-                    else
1783
-                    {
1784
-                        threshold_2NxN = (splitData[0].mvCost[0] + splitData[1].mvCost[0]
1785
-                                       + splitData[0].mvCost[1] + splitData[1].mvCost[1] + 1) >> 1;
1786
-                        threshold_Nx2N = (splitData[0].mvCost[0] + splitData[2].mvCost[0]
1787
-                                       + splitData[0].mvCost[1] + splitData[2].mvCost[1] + 1) >> 1;
1788
-                    }
1789
-
1790
-                    int try_2NxN_first = threshold_2NxN < threshold_Nx2N;
1791
-                    if (try_2NxN_first && splitCost < md.bestMode->rdCost + threshold_2NxN)
1792
-                    {
1793
-                        refMasks[0] = splitData[0].splitRefs | splitData[1].splitRefs; /* top */
1794
-                        refMasks[1] = splitData[2].splitRefs | splitData[3].splitRefs; /* bot */
1795
-                        md.pred[PRED_2NxN].cu.initSubCU(parentCTU, cuGeom, qp);
1796
-                        checkInter_rd5_6(md.pred[PRED_2NxN], cuGeom, SIZE_2NxN, refMasks);
1797
-                        checkBestMode(md.pred[PRED_2NxN], cuGeom.depth);
1798
-                    }
1799
-
1800
-                    if (splitCost < md.bestMode->rdCost + threshold_Nx2N)
1801
-                    {
1802
-                        refMasks[0] = splitData[0].splitRefs | splitData[2].splitRefs; /* left */
1803
-                        refMasks[1] = splitData[1].splitRefs | splitData[3].splitRefs; /* right */
1804
-                        md.pred[PRED_Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
1805
-                        checkInter_rd5_6(md.pred[PRED_Nx2N], cuGeom, SIZE_Nx2N, refMasks);
1806
-                        checkBestMode(md.pred[PRED_Nx2N], cuGeom.depth);
1807
-                    }
1808
-
1809
-                    if (!try_2NxN_first && splitCost < md.bestMode->rdCost + threshold_2NxN)
1810
+                    md.pred[PRED_BIDIR].cu.initSubCU(parentCTU, cuGeom, qp);
1811
+                    checkBidir2Nx2N(md.pred[PRED_2Nx2N], md.pred[PRED_BIDIR], cuGeom);
1812
+                    if (md.pred[PRED_BIDIR].sa8dCost < MAX_INT64)
1813
                     {
1814
-                        refMasks[0] = splitData[0].splitRefs | splitData[1].splitRefs; /* top */
1815
-                        refMasks[1] = splitData[2].splitRefs | splitData[3].splitRefs; /* bot */
1816
-                        md.pred[PRED_2NxN].cu.initSubCU(parentCTU, cuGeom, qp);
1817
-                        checkInter_rd5_6(md.pred[PRED_2NxN], cuGeom, SIZE_2NxN, refMasks);
1818
-                        checkBestMode(md.pred[PRED_2NxN], cuGeom.depth);
1819
+                        uint32_t numPU = md.pred[PRED_BIDIR].cu.getNumPartInter(0);
1820
+                        if (m_frame->m_fencPic->m_picCsp == X265_CSP_I400 && m_csp != X265_CSP_I400)
1821
+                            for (uint32_t puIdx = 0; puIdx < numPU; puIdx++)
1822
+                            {
1823
+                                PredictionUnit pu(md.pred[PRED_BIDIR].cu, cuGeom, puIdx);
1824
+                                motionCompensation(md.pred[PRED_BIDIR].cu, pu, md.pred[PRED_BIDIR].predYuv, true, true);
1825
+                            }
1826
+                        encodeResAndCalcRdInterCU(md.pred[PRED_BIDIR], cuGeom);
1827
+                        checkBestMode(md.pred[PRED_BIDIR], cuGeom.depth);
1828
                     }
1829
                 }
1830
 
1831
-                // Try AMP (SIZE_2NxnU, SIZE_2NxnD, SIZE_nLx2N, SIZE_nRx2N)
1832
-                if (m_slice->m_sps->maxAMPDepth > depth)
1833
+                if (!skipRectAmp)
1834
                 {
1835
-                    uint64_t splitCost = splitData[0].sa8dCost + splitData[1].sa8dCost + splitData[2].sa8dCost + splitData[3].sa8dCost;
1836
-                    uint32_t threshold_2NxnU, threshold_2NxnD, threshold_nLx2N, threshold_nRx2N;
1837
-
1838
-                    if (m_slice->m_sliceType == P_SLICE)
1839
+                    if (m_param->bEnableRectInter)
1840
                     {
1841
-                        threshold_2NxnU = splitData[0].mvCost[0] + splitData[1].mvCost[0];
1842
-                        threshold_2NxnD = splitData[2].mvCost[0] + splitData[3].mvCost[0];
1843
+                        uint64_t splitCost = splitData[0].sa8dCost + splitData[1].sa8dCost + splitData[2].sa8dCost + splitData[3].sa8dCost;
1844
+                        uint32_t threshold_2NxN, threshold_Nx2N;
1845
 
1846
-                        threshold_nLx2N = splitData[0].mvCost[0] + splitData[2].mvCost[0];
1847
-                        threshold_nRx2N = splitData[1].mvCost[0] + splitData[3].mvCost[0];
1848
-                    }
1849
-                    else
1850
-                    {
1851
-                        threshold_2NxnU = (splitData[0].mvCost[0] + splitData[1].mvCost[0]
1852
-                                        + splitData[0].mvCost[1] + splitData[1].mvCost[1] + 1) >> 1;
1853
-                        threshold_2NxnD = (splitData[2].mvCost[0] + splitData[3].mvCost[0]
1854
-                                        + splitData[2].mvCost[1] + splitData[3].mvCost[1] + 1) >> 1;
1855
-
1856
-                        threshold_nLx2N = (splitData[0].mvCost[0] + splitData[2].mvCost[0]
1857
-                                        + splitData[0].mvCost[1] + splitData[2].mvCost[1] + 1) >> 1;
1858
-                        threshold_nRx2N = (splitData[1].mvCost[0] + splitData[3].mvCost[0]
1859
-                                        + splitData[1].mvCost[1] + splitData[3].mvCost[1] + 1) >> 1;
1860
-                    }
1861
-
1862
-                    bool bHor = false, bVer = false;
1863
-                    if (md.bestMode->cu.m_partSize[0] == SIZE_2NxN)
1864
-                        bHor = true;
1865
-                    else if (md.bestMode->cu.m_partSize[0] == SIZE_Nx2N)
1866
-                        bVer = true;
1867
-                    else if (md.bestMode->cu.m_partSize[0] == SIZE_2Nx2N && !md.bestMode->cu.m_mergeFlag[0])
1868
-                    {
1869
-                        bHor = true;
1870
-                        bVer = true;
1871
-                    }
1872
+                        if (m_slice->m_sliceType == P_SLICE)
1873
+                        {
1874
+                            threshold_2NxN = splitData[0].mvCost[0] + splitData[1].mvCost[0];
1875
+                            threshold_Nx2N = splitData[0].mvCost[0] + splitData[2].mvCost[0];
1876
+                        }
1877
+                        else
1878
+                        {
1879
+                            threshold_2NxN = (splitData[0].mvCost[0] + splitData[1].mvCost[0]
1880
+                                + splitData[0].mvCost[1] + splitData[1].mvCost[1] + 1) >> 1;
1881
+                            threshold_Nx2N = (splitData[0].mvCost[0] + splitData[2].mvCost[0]
1882
+                                + splitData[0].mvCost[1] + splitData[2].mvCost[1] + 1) >> 1;
1883
+                        }
1884
 
1885
-                    if (bHor)
1886
-                    {
1887
-                        int try_2NxnD_first = threshold_2NxnD < threshold_2NxnU;
1888
-                        if (try_2NxnD_first && splitCost < md.bestMode->rdCost + threshold_2NxnD)
1889
+                        int try_2NxN_first = threshold_2NxN < threshold_Nx2N;
1890
+                        if (try_2NxN_first && splitCost < md.bestMode->rdCost + threshold_2NxN)
1891
                         {
1892
-                            refMasks[0] = allSplitRefs;                                    /* 75% top */
1893
-                            refMasks[1] = splitData[2].splitRefs | splitData[3].splitRefs; /* 25% bot */
1894
-                            md.pred[PRED_2NxnD].cu.initSubCU(parentCTU, cuGeom, qp);
1895
-                            checkInter_rd5_6(md.pred[PRED_2NxnD], cuGeom, SIZE_2NxnD, refMasks);
1896
-                            checkBestMode(md.pred[PRED_2NxnD], cuGeom.depth);
1897
+                            refMasks[0] = splitData[0].splitRefs | splitData[1].splitRefs; /* top */
1898
+                            refMasks[1] = splitData[2].splitRefs | splitData[3].splitRefs; /* bot */
1899
+                            md.pred[PRED_2NxN].cu.initSubCU(parentCTU, cuGeom, qp);
1900
+                            checkInter_rd5_6(md.pred[PRED_2NxN], cuGeom, SIZE_2NxN, refMasks);
1901
+                            checkBestMode(md.pred[PRED_2NxN], cuGeom.depth);
1902
                         }
1903
 
1904
-                        if (splitCost < md.bestMode->rdCost + threshold_2NxnU)
1905
+                        if (splitCost < md.bestMode->rdCost + threshold_Nx2N)
1906
                         {
1907
-                            refMasks[0] = splitData[0].splitRefs | splitData[1].splitRefs; /* 25% top */
1908
-                            refMasks[1] = allSplitRefs;                                    /* 75% bot */
1909
-                            md.pred[PRED_2NxnU].cu.initSubCU(parentCTU, cuGeom, qp);
1910
-                            checkInter_rd5_6(md.pred[PRED_2NxnU], cuGeom, SIZE_2NxnU, refMasks);
1911
-                            checkBestMode(md.pred[PRED_2NxnU], cuGeom.depth);
1912
+                            refMasks[0] = splitData[0].splitRefs | splitData[2].splitRefs; /* left */
1913
+                            refMasks[1] = splitData[1].splitRefs | splitData[3].splitRefs; /* right */
1914
+                            md.pred[PRED_Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
1915
+                            checkInter_rd5_6(md.pred[PRED_Nx2N], cuGeom, SIZE_Nx2N, refMasks);
1916
+                            checkBestMode(md.pred[PRED_Nx2N], cuGeom.depth);
1917
                         }
1918
 
1919
-                        if (!try_2NxnD_first && splitCost < md.bestMode->rdCost + threshold_2NxnD)
1920
+                        if (!try_2NxN_first && splitCost < md.bestMode->rdCost + threshold_2NxN)
1921
                         {
1922
-                            refMasks[0] = allSplitRefs;                                    /* 75% top */
1923
-                            refMasks[1] = splitData[2].splitRefs | splitData[3].splitRefs; /* 25% bot */
1924
-                            md.pred[PRED_2NxnD].cu.initSubCU(parentCTU, cuGeom, qp);
1925
-                            checkInter_rd5_6(md.pred[PRED_2NxnD], cuGeom, SIZE_2NxnD, refMasks);
1926
-                            checkBestMode(md.pred[PRED_2NxnD], cuGeom.depth);
1927
+                            refMasks[0] = splitData[0].splitRefs | splitData[1].splitRefs; /* top */
1928
+                            refMasks[1] = splitData[2].splitRefs | splitData[3].splitRefs; /* bot */
1929
+                            md.pred[PRED_2NxN].cu.initSubCU(parentCTU, cuGeom, qp);
1930
+                            checkInter_rd5_6(md.pred[PRED_2NxN], cuGeom, SIZE_2NxN, refMasks);
1931
+                            checkBestMode(md.pred[PRED_2NxN], cuGeom.depth);
1932
                         }
1933
                     }
1934
 
1935
-                    if (bVer)
1936
+                    // Try AMP (SIZE_2NxnU, SIZE_2NxnD, SIZE_nLx2N, SIZE_nRx2N)
1937
+                    if (m_slice->m_sps->maxAMPDepth > depth)
1938
                     {
1939
-                        int try_nRx2N_first = threshold_nRx2N < threshold_nLx2N;
1940
-                        if (try_nRx2N_first && splitCost < md.bestMode->rdCost + threshold_nRx2N)
1941
+                        uint64_t splitCost = splitData[0].sa8dCost + splitData[1].sa8dCost + splitData[2].sa8dCost + splitData[3].sa8dCost;
1942
+                        uint32_t threshold_2NxnU, threshold_2NxnD, threshold_nLx2N, threshold_nRx2N;
1943
+
1944
+                        if (m_slice->m_sliceType == P_SLICE)
1945
+                        {
1946
+                            threshold_2NxnU = splitData[0].mvCost[0] + splitData[1].mvCost[0];
1947
+                            threshold_2NxnD = splitData[2].mvCost[0] + splitData[3].mvCost[0];
1948
+
1949
+                            threshold_nLx2N = splitData[0].mvCost[0] + splitData[2].mvCost[0];
1950
+                            threshold_nRx2N = splitData[1].mvCost[0] + splitData[3].mvCost[0];
1951
+                        }
1952
+                        else
1953
                         {
1954
-                            refMasks[0] = allSplitRefs;                                    /* 75% left  */
1955
-                            refMasks[1] = splitData[1].splitRefs | splitData[3].splitRefs; /* 25% right */
1956
-                            md.pred[PRED_nRx2N].cu.initSubCU(parentCTU, cuGeom, qp);
1957
-                            checkInter_rd5_6(md.pred[PRED_nRx2N], cuGeom, SIZE_nRx2N, refMasks);
1958
-                            checkBestMode(md.pred[PRED_nRx2N], cuGeom.depth);
1959
+                            threshold_2NxnU = (splitData[0].mvCost[0] + splitData[1].mvCost[0]
1960
+                                + splitData[0].mvCost[1] + splitData[1].mvCost[1] + 1) >> 1;
1961
+                            threshold_2NxnD = (splitData[2].mvCost[0] + splitData[3].mvCost[0]
1962
+                                + splitData[2].mvCost[1] + splitData[3].mvCost[1] + 1) >> 1;
1963
+
1964
+                            threshold_nLx2N = (splitData[0].mvCost[0] + splitData[2].mvCost[0]
1965
+                                + splitData[0].mvCost[1] + splitData[2].mvCost[1] + 1) >> 1;
1966
+                            threshold_nRx2N = (splitData[1].mvCost[0] + splitData[3].mvCost[0]
1967
+                                + splitData[1].mvCost[1] + splitData[3].mvCost[1] + 1) >> 1;
1968
                         }
1969
 
1970
-                        if (splitCost < md.bestMode->rdCost + threshold_nLx2N)
1971
+                        bool bHor = false, bVer = false;
1972
+                        if (md.bestMode->cu.m_partSize[0] == SIZE_2NxN)
1973
+                            bHor = true;
1974
+                        else if (md.bestMode->cu.m_partSize[0] == SIZE_Nx2N)
1975
+                            bVer = true;
1976
+                        else if (md.bestMode->cu.m_partSize[0] == SIZE_2Nx2N && !md.bestMode->cu.m_mergeFlag[0])
1977
                         {
1978
-                            refMasks[0] = splitData[0].splitRefs | splitData[2].splitRefs; /* 25% left  */
1979
-                            refMasks[1] = allSplitRefs;                                    /* 75% right */
1980
-                            md.pred[PRED_nLx2N].cu.initSubCU(parentCTU, cuGeom, qp);
1981
-                            checkInter_rd5_6(md.pred[PRED_nLx2N], cuGeom, SIZE_nLx2N, refMasks);
1982
-                            checkBestMode(md.pred[PRED_nLx2N], cuGeom.depth);
1983
+                            bHor = true;
1984
+                            bVer = true;
1985
                         }
1986
 
1987
-                        if (!try_nRx2N_first && splitCost < md.bestMode->rdCost + threshold_nRx2N)
1988
+                        if (bHor)
1989
                         {
1990
-                            refMasks[0] = allSplitRefs;                                    /* 75% left  */
1991
-                            refMasks[1] = splitData[1].splitRefs | splitData[3].splitRefs; /* 25% right */
1992
-                            md.pred[PRED_nRx2N].cu.initSubCU(parentCTU, cuGeom, qp);
1993
-                            checkInter_rd5_6(md.pred[PRED_nRx2N], cuGeom, SIZE_nRx2N, refMasks);
1994
-                            checkBestMode(md.pred[PRED_nRx2N], cuGeom.depth);
1995
+                            int try_2NxnD_first = threshold_2NxnD < threshold_2NxnU;
1996
+                            if (try_2NxnD_first && splitCost < md.bestMode->rdCost + threshold_2NxnD)
1997
+                            {
1998
+                                refMasks[0] = allSplitRefs;                                    /* 75% top */
1999
+                                refMasks[1] = splitData[2].splitRefs | splitData[3].splitRefs; /* 25% bot */
2000
+                                md.pred[PRED_2NxnD].cu.initSubCU(parentCTU, cuGeom, qp);
2001
+                                checkInter_rd5_6(md.pred[PRED_2NxnD], cuGeom, SIZE_2NxnD, refMasks);
2002
+                                checkBestMode(md.pred[PRED_2NxnD], cuGeom.depth);
2003
+                            }
2004
+
2005
+                            if (splitCost < md.bestMode->rdCost + threshold_2NxnU)
2006
+                            {
2007
+                                refMasks[0] = splitData[0].splitRefs | splitData[1].splitRefs; /* 25% top */
2008
+                                refMasks[1] = allSplitRefs;                                    /* 75% bot */
2009
+                                md.pred[PRED_2NxnU].cu.initSubCU(parentCTU, cuGeom, qp);
2010
+                                checkInter_rd5_6(md.pred[PRED_2NxnU], cuGeom, SIZE_2NxnU, refMasks);
2011
+                                checkBestMode(md.pred[PRED_2NxnU], cuGeom.depth);
2012
+                            }
2013
+
2014
+                            if (!try_2NxnD_first && splitCost < md.bestMode->rdCost + threshold_2NxnD)
2015
+                            {
2016
+                                refMasks[0] = allSplitRefs;                                    /* 75% top */
2017
+                                refMasks[1] = splitData[2].splitRefs | splitData[3].splitRefs; /* 25% bot */
2018
+                                md.pred[PRED_2NxnD].cu.initSubCU(parentCTU, cuGeom, qp);
2019
+                                checkInter_rd5_6(md.pred[PRED_2NxnD], cuGeom, SIZE_2NxnD, refMasks);
2020
+                                checkBestMode(md.pred[PRED_2NxnD], cuGeom.depth);
2021
+                            }
2022
+                        }
2023
+
2024
+                        if (bVer)
2025
+                        {
2026
+                            int try_nRx2N_first = threshold_nRx2N < threshold_nLx2N;
2027
+                            if (try_nRx2N_first && splitCost < md.bestMode->rdCost + threshold_nRx2N)
2028
+                            {
2029
+                                refMasks[0] = allSplitRefs;                                    /* 75% left  */
2030
+                                refMasks[1] = splitData[1].splitRefs | splitData[3].splitRefs; /* 25% right */
2031
+                                md.pred[PRED_nRx2N].cu.initSubCU(parentCTU, cuGeom, qp);
2032
+                                checkInter_rd5_6(md.pred[PRED_nRx2N], cuGeom, SIZE_nRx2N, refMasks);
2033
+                                checkBestMode(md.pred[PRED_nRx2N], cuGeom.depth);
2034
+                            }
2035
+
2036
+                            if (splitCost < md.bestMode->rdCost + threshold_nLx2N)
2037
+                            {
2038
+                                refMasks[0] = splitData[0].splitRefs | splitData[2].splitRefs; /* 25% left  */
2039
+                                refMasks[1] = allSplitRefs;                                    /* 75% right */
2040
+                                md.pred[PRED_nLx2N].cu.initSubCU(parentCTU, cuGeom, qp);
2041
+                                checkInter_rd5_6(md.pred[PRED_nLx2N], cuGeom, SIZE_nLx2N, refMasks);
2042
+                                checkBestMode(md.pred[PRED_nLx2N], cuGeom.depth);
2043
+                            }
2044
+
2045
+                            if (!try_nRx2N_first && splitCost < md.bestMode->rdCost + threshold_nRx2N)
2046
+                            {
2047
+                                refMasks[0] = allSplitRefs;                                    /* 75% left  */
2048
+                                refMasks[1] = splitData[1].splitRefs | splitData[3].splitRefs; /* 25% right */
2049
+                                md.pred[PRED_nRx2N].cu.initSubCU(parentCTU, cuGeom, qp);
2050
+                                checkInter_rd5_6(md.pred[PRED_nRx2N], cuGeom, SIZE_nRx2N, refMasks);
2051
+                                checkBestMode(md.pred[PRED_nRx2N], cuGeom.depth);
2052
+                            }
2053
                         }
2054
                     }
2055
                 }
2056
-            }
2057
 
2058
-            if ((m_slice->m_sliceType != B_SLICE || m_param->bIntraInBFrames) && (cuGeom.log2CUSize != MAX_LOG2_CU_SIZE) && !((m_param->bCTUInfo & 4) && bCtuInfoCheck))
2059
-            {
2060
-                if (!m_param->limitReferences || splitIntra)
2061
+                if ((m_slice->m_sliceType != B_SLICE || m_param->bIntraInBFrames) && (cuGeom.log2CUSize != MAX_LOG2_CU_SIZE) && !((m_param->bCTUInfo & 4) && bCtuInfoCheck))
2062
                 {
2063
-                    ProfileCounter(parentCTU, totalIntraCU[cuGeom.depth]);
2064
-                    md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom, qp);
2065
-                    checkIntra(md.pred[PRED_INTRA], cuGeom, SIZE_2Nx2N);
2066
-                    checkBestMode(md.pred[PRED_INTRA], depth);
2067
+                    if (!m_param->limitReferences || splitIntra)
2068
+                    {
2069
+                        ProfileCounter(parentCTU, totalIntraCU[cuGeom.depth]);
2070
+                        md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom, qp);
2071
+                        checkIntra(md.pred[PRED_INTRA], cuGeom, SIZE_2Nx2N);
2072
+                        checkBestMode(md.pred[PRED_INTRA], depth);
2073
 
2074
-                    if (cuGeom.log2CUSize == 3 && m_slice->m_sps->quadtreeTULog2MinSize < 3)
2075
+                        if (cuGeom.log2CUSize == 3 && m_slice->m_sps->quadtreeTULog2MinSize < 3)
2076
+                        {
2077
+                            md.pred[PRED_INTRA_NxN].cu.initSubCU(parentCTU, cuGeom, qp);
2078
+                            checkIntra(md.pred[PRED_INTRA_NxN], cuGeom, SIZE_NxN);
2079
+                            checkBestMode(md.pred[PRED_INTRA_NxN], depth);
2080
+                        }
2081
+                    }
2082
+                    else
2083
                     {
2084
-                        md.pred[PRED_INTRA_NxN].cu.initSubCU(parentCTU, cuGeom, qp);
2085
-                        checkIntra(md.pred[PRED_INTRA_NxN], cuGeom, SIZE_NxN);
2086
-                        checkBestMode(md.pred[PRED_INTRA_NxN], depth);
2087
+                        ProfileCounter(parentCTU, skippedIntraCU[cuGeom.depth]);
2088
                     }
2089
                 }
2090
-                else
2091
+            }
2092
+
2093
+            if ((md.bestMode->cu.isInter(0) && !(md.bestMode->cu.m_mergeFlag[0] && md.bestMode->cu.m_partSize[0] == SIZE_2Nx2N)) && (m_frame->m_fencPic->m_picCsp == X265_CSP_I400 && m_csp != X265_CSP_I400))
2094
+            {
2095
+                uint32_t numPU = md.bestMode->cu.getNumPartInter(0);
2096
+
2097
+                for (uint32_t puIdx = 0; puIdx < numPU; puIdx++)
2098
                 {
2099
-                    ProfileCounter(parentCTU, skippedIntraCU[cuGeom.depth]);
2100
+                    PredictionUnit pu(md.bestMode->cu, cuGeom, puIdx);
2101
+                    motionCompensation(md.bestMode->cu, pu, md.bestMode->predYuv, false, m_csp != X265_CSP_I400);
2102
                 }
2103
+                encodeResAndCalcRdInterCU(*md.bestMode, cuGeom);
2104
             }
2105
+            if (m_bTryLossless)
2106
+                tryLossless(cuGeom);
2107
+
2108
+            if (mightSplit)
2109
+                addSplitFlagCost(*md.bestMode, cuGeom.depth);
2110
         }
2111
 
2112
-        if ((md.bestMode->cu.isInter(0) && !(md.bestMode->cu.m_mergeFlag[0] && md.bestMode->cu.m_partSize[0] == SIZE_2Nx2N)) && (m_frame->m_fencPic->m_picCsp == X265_CSP_I400 && m_csp != X265_CSP_I400))
2113
+        if ((m_limitTU & X265_TU_LIMIT_NEIGH) && cuGeom.log2CUSize >= 4)
2114
         {
2115
-            uint32_t numPU = md.bestMode->cu.getNumPartInter(0);
2116
-
2117
-            for (uint32_t puIdx = 0; puIdx < numPU; puIdx++)
2118
+            if (mightNotSplit)
2119
             {
2120
-                PredictionUnit pu(md.bestMode->cu, cuGeom, puIdx);
2121
-                motionCompensation(md.bestMode->cu, pu, md.bestMode->predYuv, false, m_csp != X265_CSP_I400);
2122
+                CUData* ctu = md.bestMode->cu.m_encData->getPicCTU(parentCTU.m_cuAddr);
2123
+                int8_t maxTUDepth = -1;
2124
+                for (uint32_t i = 0; i < cuGeom.numPartitions; i++)
2125
+                    maxTUDepth = X265_MAX(maxTUDepth, md.bestMode->cu.m_tuDepth[i]);
2126
+                ctu->m_refTuDepth[cuGeom.geomRecurId] = maxTUDepth;
2127
             }
2128
-            encodeResAndCalcRdInterCU(*md.bestMode, cuGeom);
2129
         }
2130
-        if (m_bTryLossless)
2131
-            tryLossless(cuGeom);
2132
 
2133
-        if (mightSplit)
2134
-            addSplitFlagCost(*md.bestMode, cuGeom.depth);
2135
-    }
2136
+        /* compare split RD cost against best cost */
2137
+        if (mightSplit && !skipRecursion)
2138
+            checkBestMode(md.pred[PRED_SPLIT], depth);
2139
 
2140
-    if ((m_limitTU & X265_TU_LIMIT_NEIGH) && cuGeom.log2CUSize >= 4)
2141
-    {
2142
-        if (mightNotSplit)
2143
+        if (m_param->bEnableRdRefine && depth <= m_slice->m_pps->maxCuDQPDepth)
2144
         {
2145
-            CUData* ctu = md.bestMode->cu.m_encData->getPicCTU(parentCTU.m_cuAddr);
2146
-            int8_t maxTUDepth = -1;
2147
-            for (uint32_t i = 0; i < cuGeom.numPartitions; i++)
2148
-                maxTUDepth = X265_MAX(maxTUDepth, md.bestMode->cu.m_tuDepth[i]);
2149
-            ctu->m_refTuDepth[cuGeom.geomRecurId] = maxTUDepth;
2150
+            int cuIdx = (cuGeom.childOffset - 1) / 3;
2151
+            cacheCost[cuIdx] = md.bestMode->rdCost;
2152
         }
2153
-    }
2154
 
2155
-    /* compare split RD cost against best cost */
2156
-    if (mightSplit && !skipRecursion)
2157
-        checkBestMode(md.pred[PRED_SPLIT], depth);
2158
-
2159
-    if (m_param->bEnableRdRefine && depth <= m_slice->m_pps->maxCuDQPDepth)
2160
-    {
2161
-        int cuIdx = (cuGeom.childOffset - 1) / 3;
2162
-        cacheCost[cuIdx] = md.bestMode->rdCost;
2163
-    }
2164
+        /* determine which motion references the parent CU should search */
2165
+        splitCUData.initSplitCUData();
2166
+        if (m_param->limitReferences & X265_REF_LIMIT_DEPTH)
2167
+        {
2168
+            if (md.bestMode == &md.pred[PRED_SPLIT])
2169
+                splitCUData.splitRefs = allSplitRefs;
2170
+            else
2171
+            {
2172
+                /* use best merge/inter mode, in case of intra use 2Nx2N inter references */
2173
+                CUData& cu = md.bestMode->cu.isIntra(0) ? md.pred[PRED_2Nx2N].cu : md.bestMode->cu;
2174
+                uint32_t numPU = cu.getNumPartInter(0);
2175
+                for (uint32_t puIdx = 0, subPartIdx = 0; puIdx < numPU; puIdx++, subPartIdx += cu.getPUOffset(puIdx, 0))
2176
+                    splitCUData.splitRefs |= cu.getBestRefIdx(subPartIdx);
2177
+            }
2178
+        }
2179
 
2180
-       /* determine which motion references the parent CU should search */
2181
-    SplitData splitCUData;
2182
-    splitCUData.initSplitCUData();
2183
-    if (m_param->limitReferences & X265_REF_LIMIT_DEPTH)
2184
-    {
2185
-        if (md.bestMode == &md.pred[PRED_SPLIT])
2186
-            splitCUData.splitRefs = allSplitRefs;
2187
-        else
2188
+        if (m_param->limitModes)
2189
         {
2190
-            /* use best merge/inter mode, in case of intra use 2Nx2N inter references */
2191
-            CUData& cu = md.bestMode->cu.isIntra(0) ? md.pred[PRED_2Nx2N].cu : md.bestMode->cu;
2192
-            uint32_t numPU = cu.getNumPartInter(0);
2193
-            for (uint32_t puIdx = 0, subPartIdx = 0; puIdx < numPU; puIdx++, subPartIdx += cu.getPUOffset(puIdx, 0))
2194
-                splitCUData.splitRefs |= cu.getBestRefIdx(subPartIdx);
2195
+            splitCUData.mvCost[0] = md.pred[PRED_2Nx2N].bestME[0][0].mvCost; // L0
2196
+            splitCUData.mvCost[1] = md.pred[PRED_2Nx2N].bestME[0][1].mvCost; // L1
2197
+            splitCUData.sa8dCost = md.pred[PRED_2Nx2N].rdCost;
2198
         }
2199
-    }
2200
 
2201
-    if (m_param->limitModes)
2202
-    {
2203
-        splitCUData.mvCost[0] = md.pred[PRED_2Nx2N].bestME[0][0].mvCost; // L0
2204
-        splitCUData.mvCost[1] = md.pred[PRED_2Nx2N].bestME[0][1].mvCost; // L1
2205
-        splitCUData.sa8dCost    = md.pred[PRED_2Nx2N].rdCost;
2206
+        /* Copy best data to encData CTU and recon */
2207
+        md.bestMode->cu.copyToPic(depth);
2208
+        md.bestMode->reconYuv.copyToPicYuv(*m_frame->m_reconPic, parentCTU.m_cuAddr, cuGeom.absPartIdx);
2209
     }
2210
+    else
2211
+    {
2212
+        if (m_param->bMVType && cuGeom.numPartitions <= 16)
2213
+        {
2214
+            qprdRefine(parentCTU, cuGeom, qp, qp);
2215
 
2216
-    /* Copy best data to encData CTU and recon */
2217
-    md.bestMode->cu.copyToPic(depth);
2218
-    md.bestMode->reconYuv.copyToPicYuv(*m_frame->m_reconPic, parentCTU.m_cuAddr, cuGeom.absPartIdx);
2219
+            SplitData splitData[4];
2220
+            splitData[0].initSplitCUData();
2221
+            splitData[1].initSplitCUData();
2222
+            splitData[2].initSplitCUData();
2223
+            splitData[3].initSplitCUData();
2224
+
2225
+            uint32_t allSplitRefs = splitData[0].splitRefs | splitData[1].splitRefs | splitData[2].splitRefs | splitData[3].splitRefs;
2226
+
2227
+            splitCUData.initSplitCUData();
2228
+            if (m_param->limitReferences & X265_REF_LIMIT_DEPTH)
2229
+            {
2230
+                if (md.bestMode == &md.pred[PRED_SPLIT])
2231
+                    splitCUData.splitRefs = allSplitRefs;
2232
+                else
2233
+                {
2234
+                    /* use best merge/inter mode, in case of intra use 2Nx2N inter references */
2235
+                    CUData& cu = md.bestMode->cu.isIntra(0) ? md.pred[PRED_2Nx2N].cu : md.bestMode->cu;
2236
+                    uint32_t numPU = cu.getNumPartInter(0);
2237
+                    for (uint32_t puIdx = 0, subPartIdx = 0; puIdx < numPU; puIdx++, subPartIdx += cu.getPUOffset(puIdx, 0))
2238
+                        splitCUData.splitRefs |= cu.getBestRefIdx(subPartIdx);
2239
+                }
2240
+            }
2241
+
2242
+            if (m_param->limitModes)
2243
+            {
2244
+                splitCUData.mvCost[0] = md.pred[PRED_2Nx2N].bestME[0][0].mvCost; // L0
2245
+                splitCUData.mvCost[1] = md.pred[PRED_2Nx2N].bestME[0][1].mvCost; // L1
2246
+                splitCUData.sa8dCost = md.pred[PRED_2Nx2N].rdCost;
2247
+            }
2248
+        }
2249
+    }
2250
 
2251
     return splitCUData;
2252
 }
2253
@@ -2240,8 +2408,7 @@
2254
     bool mightNotSplit = !(cuGeom.flags & CUGeom::SPLIT_MANDATORY);
2255
     bool bDecidedDepth = parentCTU.m_cuDepth[cuGeom.absPartIdx] == depth;
2256
 
2257
-    int split = (m_param->interRefine && cuGeom.log2CUSize == (uint32_t)(g_log2Size[m_param->minCUSize] + 1)
2258
-                && bDecidedDepth && parentCTU.m_predMode[cuGeom.absPartIdx] == MODE_SKIP);
2259
+    int split = (m_param->interRefine && cuGeom.log2CUSize == (uint32_t)(g_log2Size[m_param->minCUSize] + 1) && bDecidedDepth);
2260
 
2261
     if (bDecidedDepth)
2262
     {
2263
@@ -2251,23 +2418,25 @@
2264
         md.bestMode = &mode;
2265
         mode.cu.initSubCU(parentCTU, cuGeom, qp);
2266
         PartSize size = (PartSize)parentCTU.m_partSize[cuGeom.absPartIdx];
2267
-        if (parentCTU.isIntra(cuGeom.absPartIdx))
2268
+        if (parentCTU.isIntra(cuGeom.absPartIdx) && m_param->interRefine < 2)
2269
         {
2270
-            if (m_param->intraRefine != 2 || parentCTU.m_lumaIntraDir[cuGeom.absPartIdx] <= 1)
2271
+            bool reuseModes = !((m_param->intraRefine == 3) ||
2272
+                                (m_param->intraRefine == 2 && parentCTU.m_lumaIntraDir[cuGeom.absPartIdx] > DC_IDX));
2273
+            if (reuseModes)
2274
             {
2275
                 memcpy(mode.cu.m_lumaIntraDir, parentCTU.m_lumaIntraDir + cuGeom.absPartIdx, cuGeom.numPartitions);
2276
                 memcpy(mode.cu.m_chromaIntraDir, parentCTU.m_chromaIntraDir + cuGeom.absPartIdx, cuGeom.numPartitions);
2277
             }
2278
             checkIntra(mode, cuGeom, size);
2279
         }
2280
-        else
2281
+        else if (!parentCTU.isIntra(cuGeom.absPartIdx) && m_param->interRefine < 2)
2282
         {
2283
             mode.cu.copyFromPic(parentCTU, cuGeom, m_csp, false);
2284
             uint32_t numPU = parentCTU.getNumPartInter(cuGeom.absPartIdx);
2285
             for (uint32_t part = 0; part < numPU; part++)
2286
             {
2287
                 PredictionUnit pu(mode.cu, cuGeom, part);
2288
-                if (m_param->analysisReuseLevel == 10)
2289
+                if (m_param->analysisReuseLevel >= 7)
2290
                 {
2291
                     analysis_inter_data* interDataCTU = (analysis_inter_data*)m_frame->m_analysisData.interData;
2292
                     int cuIdx = (mode.cu.m_cuAddr * parentCTU.m_numPartitions) + cuGeom.absPartIdx;
2293
@@ -2328,19 +2497,39 @@
2294
                 checkDQP(mode, cuGeom);
2295
         }
2296
 
2297
-        if (m_bTryLossless)
2298
-            tryLossless(cuGeom);
2299
+        if (m_param->interRefine < 2)
2300
+        {
2301
+            if (m_bTryLossless)
2302
+                tryLossless(cuGeom);
2303
 
2304
-        if (mightSplit)
2305
-            addSplitFlagCost(*md.bestMode, cuGeom.depth);
2306
+            if (mightSplit)
2307
+                addSplitFlagCost(*md.bestMode, cuGeom.depth);
2308
 
2309
-        if (mightSplit && m_param->rdLevel < 5)
2310
-            checkDQPForSplitPred(*md.bestMode, cuGeom);
2311
+            if (mightSplit && m_param->rdLevel < 5)
2312
+                checkDQPForSplitPred(*md.bestMode, cuGeom);
2313
+        }
2314
+
2315
+        if (m_param->bMVType && m_param->analysisReuseLevel == 7)
2316
+        {
2317
+            for (int list = 0; list < m_slice->isInterB() + 1; list++)
2318
+            {
2319
+                m_modeFlag[list] = true;
2320
+                if (parentCTU.m_skipFlag[list][cuGeom.absPartIdx] == 1 && cuGeom.numPartitions <= 16)
2321
+                    m_checkMergeAndSkipOnly[list] = true;
2322
+            }
2323
+            m_param->rdLevel > 4 ? compressInterCU_rd5_6(parentCTU, cuGeom, qp) : compressInterCU_rd0_4(parentCTU, cuGeom, qp);
2324
+            for (int list = 0; list < m_slice->isInterB() + 1; list++)
2325
+            {
2326
+                m_modeFlag[list] = false;
2327
+                m_checkMergeAndSkipOnly[list] = false;
2328
+            }
2329
+        }
2330
 
2331
-        if (m_param->interRefine && parentCTU.m_predMode[cuGeom.absPartIdx] == MODE_SKIP  && !mode.cu.isSkipped(0))
2332
+        if (m_param->interRefine > 1 || (m_param->interRefine && parentCTU.m_predMode[cuGeom.absPartIdx] == MODE_SKIP  && !mode.cu.isSkipped(0)))
2333
         {
2334
             m_evaluateInter = 1;
2335
             m_param->rdLevel > 4 ? compressInterCU_rd5_6(parentCTU, cuGeom, qp) : compressInterCU_rd0_4(parentCTU, cuGeom, qp);
2336
+            m_evaluateInter = 0;
2337
         }
2338
     }
2339
     if (!bDecidedDepth || split)
2340
@@ -2369,7 +2558,7 @@
2341
                 if (m_slice->m_pps->bUseDQP && nextDepth <= m_slice->m_pps->maxCuDQPDepth)
2342
                     nextQP = setLambdaFromQP(parentCTU, calculateQpforCuSize(parentCTU, childGeom));
2343
 
2344
-                int lamdaQP = m_param->analysisReuseLevel == 10 ? nextQP : lqp;
2345
+                int lamdaQP = (m_param->analysisReuseLevel >= 7) ? nextQP : lqp;
2346
 
2347
                 if (split)
2348
                     m_param->rdLevel > 4 ? compressInterCU_rd5_6(parentCTU, childGeom, nextQP) : compressInterCU_rd0_4(parentCTU, childGeom, nextQP);
2349
x265_2.5.tar.gz/source/encoder/analysis.h -> x265_2.6.tar.gz/source/encoder/analysis.h Changed
20
 
1
@@ -110,6 +110,9 @@
2
     bool      m_bChromaSa8d;
3
     bool      m_bHD;
4
 
5
+    bool      m_modeFlag[2];
6
+    bool      m_checkMergeAndSkipOnly[2];
7
+
8
     Analysis();
9
 
10
     bool create(ThreadLocalData* tld);
11
@@ -145,7 +148,7 @@
12
     void qprdRefine(const CUData& parentCTU, const CUGeom& cuGeom, int32_t qp, int32_t lqp);
13
 
14
     /* full analysis for an I-slice CU */
15
-    void compressIntraCU(const CUData& parentCTU, const CUGeom& cuGeom, int32_t qp);
16
+    uint64_t compressIntraCU(const CUData& parentCTU, const CUGeom& cuGeom, int32_t qp);
17
 
18
     /* full analysis for a P or B slice CU */
19
     uint32_t compressInterCU_dist(const CUData& parentCTU, const CUGeom& cuGeom, int32_t qp);
20
x265_2.5.tar.gz/source/encoder/api.cpp -> x265_2.6.tar.gz/source/encoder/api.cpp Changed
600
 
1
@@ -30,7 +30,6 @@
2
 #include "level.h"
3
 #include "nal.h"
4
 #include "bitcost.h"
5
-#include "x265-extras.h"
6
 
7
 /* multilib namespace reflectors */
8
 #if LINKED_8BIT
9
@@ -63,6 +62,14 @@
10
 namespace X265_NS {
11
 #endif
12
 
13
+static const char* summaryCSVHeader =
14
+    "Command, Date/Time, Elapsed Time, FPS, Bitrate, "
15
+    "Y PSNR, U PSNR, V PSNR, Global PSNR, SSIM, SSIM (dB), "
16
+    "I count, I ave-QP, I kbps, I-PSNR Y, I-PSNR U, I-PSNR V, I-SSIM (dB), "
17
+    "P count, P ave-QP, P kbps, P-PSNR Y, P-PSNR U, P-PSNR V, P-SSIM (dB), "
18
+    "B count, B ave-QP, B kbps, B-PSNR Y, B-PSNR U, B-PSNR V, B-SSIM (dB), "
19
+    "MaxCLL, MaxFALL, Version\n";
20
+
21
 x265_encoder *x265_encoder_open(x265_param *p)
22
 {
23
     if (!p)
24
@@ -120,7 +127,7 @@
25
     /* Try to open CSV file handle */
26
     if (encoder->m_param->csvfn)
27
     {
28
-        encoder->m_param->csvfpt = x265_csvlog_open(*encoder->m_param, encoder->m_param->csvfn, encoder->m_param->csvLogLevel);
29
+        encoder->m_param->csvfpt = x265_csvlog_open(encoder->m_param);
30
         if (!encoder->m_param->csvfpt)
31
         {
32
             x265_log(encoder->m_param, X265_LOG_ERROR, "Unable to open CSV log file <%s>, aborting\n", encoder->m_param->csvfn);
33
@@ -188,7 +195,10 @@
34
 
35
     x265_param save;
36
     Encoder* encoder = static_cast<Encoder*>(enc);
37
-    if (encoder->m_reconfigure || encoder->m_reconfigureRc) /* Reconfigure in progress */
38
+    if (encoder->m_latestParam->forceFlush != param_in->forceFlush)
39
+        return encoder->reconfigureParam(encoder->m_latestParam, param_in);
40
+    bool isReconfigureRc = encoder->isReconfigureRc(encoder->m_latestParam, param_in);
41
+    if ((encoder->m_reconfigure && !isReconfigureRc) || (encoder->m_reconfigureRc && isReconfigureRc)) /* Reconfigure in progress */
42
         return 1;
43
     memcpy(&save, encoder->m_latestParam, sizeof(x265_param));
44
     int ret = encoder->reconfigureParam(encoder->m_latestParam, param_in);
45
@@ -205,16 +215,22 @@
46
             if (encoder->m_param->bRepeatHeaders)
47
             {
48
                 if (encoder->m_scalingList.parseScalingList(encoder->m_latestParam->scalingLists))
49
+                {
50
+                    memcpy(encoder->m_latestParam, &save, sizeof(x265_param));
51
                     return -1;
52
+                }
53
                 encoder->m_scalingList.setupQuantMatrices(encoder->m_param->internalCsp);
54
             }
55
             else
56
             {
57
                 x265_log(encoder->m_param, X265_LOG_ERROR, "Repeat headers is turned OFF, cannot reconfigure scalinglists\n");
58
+                memcpy(encoder->m_latestParam, &save, sizeof(x265_param));
59
                 return -1;
60
             }
61
         }
62
-        if (encoder->m_reconfigureRc)
63
+        if (!isReconfigureRc)
64
+            encoder->m_reconfigure = true;
65
+        else if (encoder->m_reconfigureRc)
66
         {
67
             VPS saveVPS;
68
             memcpy(&saveVPS.ptl, &encoder->m_vps.ptl, sizeof(saveVPS.ptl));
69
@@ -225,11 +241,11 @@
70
                 x265_log(encoder->m_param, X265_LOG_WARNING, "Profile/Level/Tier has changed from %d/%d/%s to %d/%d/%s.Cannot reconfigure rate-control.\n",
71
                          saveVPS.ptl.profileIdc, saveVPS.ptl.levelIdc, saveVPS.ptl.tierFlag ? "High" : "Main", encoder->m_vps.ptl.profileIdc,
72
                          encoder->m_vps.ptl.levelIdc, encoder->m_vps.ptl.tierFlag ? "High" : "Main");
73
+                memcpy(encoder->m_latestParam, &save, sizeof(x265_param));
74
+                memcpy(&encoder->m_vps.ptl, &saveVPS.ptl, sizeof(saveVPS.ptl));
75
                 encoder->m_reconfigureRc = false;
76
             }
77
         }
78
-        else
79
-            encoder->m_reconfigure = true;
80
         encoder->printReconfigureParams();
81
     }
82
     return ret;
83
@@ -248,7 +264,9 @@
84
     {
85
         numEncoded = encoder->encode(pic_in, pic_out);
86
     }
87
-    while (numEncoded == 0 && !pic_in && encoder->m_numDelayedPic);
88
+    while ((numEncoded == 0 && !pic_in && encoder->m_numDelayedPic && !encoder->m_latestParam->forceFlush) && !encoder->m_externalFlush);
89
+    if (numEncoded)
90
+        encoder->m_externalFlush = false;
91
 
92
     // do not allow reuse of these buffers for more than one picture. The
93
     // encoder now owns these analysisData buffers.
94
@@ -269,7 +287,7 @@
95
         *pi_nal = 0;
96
 
97
     if (numEncoded && encoder->m_param->csvLogLevel)
98
-        x265_csvlog_frame(encoder->m_param->csvfpt, *encoder->m_param, *pic_out, encoder->m_param->csvLogLevel);
99
+        x265_csvlog_frame(encoder->m_param, pic_out);
100
 
101
     if (numEncoded < 0)
102
         encoder->m_aborted = true;
103
@@ -292,11 +310,8 @@
104
     {
105
         Encoder *encoder = static_cast<Encoder*>(enc);
106
         x265_stats stats;
107
-        int padx = encoder->m_sps.conformanceWindow.rightOffset;
108
-        int pady = encoder->m_sps.conformanceWindow.bottomOffset;
109
         encoder->fetchStats(&stats, sizeof(stats));
110
-        const x265_api * api = x265_api_get(0);
111
-        x265_csvlog_encode(encoder->m_param->csvfpt, api->version_str, *encoder->m_param, padx, pady, stats, encoder->m_param->csvLogLevel, argc, argv);
112
+        x265_csvlog_encode(enc, &stats, argc, argv);
113
     }
114
 }
115
 
116
@@ -331,6 +346,37 @@
117
     return 0;
118
 }
119
 
120
+int x265_get_slicetype_poc_and_scenecut(x265_encoder *enc, int *slicetype, int *poc, int *sceneCut)
121
+{
122
+    if (!enc)
123
+        return -1;
124
+    Encoder *encoder = static_cast<Encoder*>(enc);
125
+    if (!encoder->copySlicetypePocAndSceneCut(slicetype, poc, sceneCut))
126
+        return 0;
127
+    return -1;
128
+}
129
+
130
+int x265_get_ref_frame_list(x265_encoder *enc, x265_picyuv** l0, x265_picyuv** l1, int sliceType, int poc)
131
+{
132
+    if (!enc)
133
+        return -1;
134
+
135
+    Encoder *encoder = static_cast<Encoder*>(enc);
136
+    return encoder->getRefFrameList((PicYuv**)l0, (PicYuv**)l1, sliceType, poc);
137
+}
138
+
139
+int x265_set_analysis_data(x265_encoder *enc, x265_analysis_data *analysis_data, int poc, uint32_t cuBytes)
140
+{
141
+    if (!enc)
142
+        return -1;
143
+
144
+    Encoder *encoder = static_cast<Encoder*>(enc);
145
+    if (!encoder->setAnalysisData(analysis_data, poc, cuBytes))
146
+        return 0;
147
+
148
+    return -1;
149
+}
150
+
151
 void x265_cleanup(void)
152
 {
153
     BitCost::destroy();
154
@@ -352,7 +398,7 @@
155
     pic->userSEI.payloads = NULL;
156
     pic->userSEI.numPayloads = 0;
157
 
158
-    if (param->analysisReuseMode)
159
+    if (param->analysisReuseMode || (param->bMVType == AVC_INFO))
160
     {
161
         uint32_t widthInCU = (param->sourceWidth + param->maxCUSize - 1) >> param->maxLog2CUSize;
162
         uint32_t heightInCU = (param->sourceHeight + param->maxCUSize - 1) >> param->maxLog2CUSize;
163
@@ -404,6 +450,13 @@
164
     sizeof(x265_frame_stats),
165
     &x265_encoder_intra_refresh,
166
     &x265_encoder_ctu_info,
167
+    &x265_get_slicetype_poc_and_scenecut,
168
+    &x265_get_ref_frame_list,
169
+    &x265_csvlog_open,
170
+    &x265_csvlog_frame,
171
+    &x265_csvlog_encode,
172
+    &x265_dither_image,
173
+    &x265_set_analysis_data
174
 };
175
 
176
 typedef const x265_api* (*api_get_func)(int bitDepth);
177
@@ -598,4 +651,422 @@
178
     return &libapi;
179
 }
180
 
181
+FILE* x265_csvlog_open(const x265_param* param)
182
+{
183
+    FILE *csvfp = x265_fopen(param->csvfn, "r");
184
+    if (csvfp)
185
+    {
186
+        /* file already exists, re-open for append */
187
+        fclose(csvfp);
188
+        return x265_fopen(param->csvfn, "ab");
189
+    }
190
+    else
191
+    {
192
+        /* new CSV file, write header */
193
+        csvfp = x265_fopen(param->csvfn, "wb");
194
+        if (csvfp)
195
+        {
196
+            if (param->csvLogLevel)
197
+            {
198
+                fprintf(csvfp, "Encode Order, Type, POC, QP, Bits, Scenecut, ");
199
+                if (param->csvLogLevel >= 2)
200
+                    fprintf(csvfp, "I/P cost ratio, ");
201
+                if (param->rc.rateControlMode == X265_RC_CRF)
202
+                    fprintf(csvfp, "RateFactor, ");
203
+                if (param->rc.vbvBufferSize)
204
+                    fprintf(csvfp, "BufferFill, ");
205
+                if (param->bEnablePsnr)
206
+                    fprintf(csvfp, "Y PSNR, U PSNR, V PSNR, YUV PSNR, ");
207
+                if (param->bEnableSsim)
208
+                    fprintf(csvfp, "SSIM, SSIM(dB), ");
209
+                fprintf(csvfp, "Latency, ");
210
+                fprintf(csvfp, "List 0, List 1");
211
+                uint32_t size = param->maxCUSize;
212
+                for (uint32_t depth = 0; depth <= param->maxCUDepth; depth++)
213
+                {
214
+                    fprintf(csvfp, ", Intra %dx%d DC, Intra %dx%d Planar, Intra %dx%d Ang", size, size, size, size, size, size);
215
+                    size /= 2;
216
+                }
217
+                fprintf(csvfp, ", 4x4");
218
+                size = param->maxCUSize;
219
+                if (param->bEnableRectInter)
220
+                {
221
+                    for (uint32_t depth = 0; depth <= param->maxCUDepth; depth++)
222
+                    {
223
+                        fprintf(csvfp, ", Inter %dx%d, Inter %dx%d (Rect)", size, size, size, size);
224
+                        if (param->bEnableAMP)
225
+                            fprintf(csvfp, ", Inter %dx%d (Amp)", size, size);
226
+                        size /= 2;
227
+                    }
228
+                }
229
+                else
230
+                {
231
+                    for (uint32_t depth = 0; depth <= param->maxCUDepth; depth++)
232
+                    {
233
+                        fprintf(csvfp, ", Inter %dx%d", size, size);
234
+                        size /= 2;
235
+                    }
236
+                }
237
+                size = param->maxCUSize;
238
+                for (uint32_t depth = 0; depth <= param->maxCUDepth; depth++)
239
+                {
240
+                    fprintf(csvfp, ", Skip %dx%d", size, size);
241
+                    size /= 2;
242
+                }
243
+                size = param->maxCUSize;
244
+                for (uint32_t depth = 0; depth <= param->maxCUDepth; depth++)
245
+                {
246
+                    fprintf(csvfp, ", Merge %dx%d", size, size);
247
+                    size /= 2;
248
+                }
249
+
250
+                if (param->csvLogLevel >= 2)
251
+                {
252
+                    fprintf(csvfp, ", Avg Luma Distortion, Avg Chroma Distortion, Avg psyEnergy, Avg Residual Energy,"
253
+                        " Min Luma Level, Max Luma Level, Avg Luma Level");
254
+
255
+                    if (param->internalCsp != X265_CSP_I400)
256
+                        fprintf(csvfp, ", Min Cb Level, Max Cb Level, Avg Cb Level, Min Cr Level, Max Cr Level, Avg Cr Level");
257
+
258
+                    /* PU statistics */
259
+                    size = param->maxCUSize;
260
+                    for (uint32_t i = 0; i< param->maxLog2CUSize - (uint32_t)g_log2Size[param->minCUSize] + 1; i++)
261
+                    {
262
+                        fprintf(csvfp, ", Intra %dx%d", size, size);
263
+                        fprintf(csvfp, ", Skip %dx%d", size, size);
264
+                        fprintf(csvfp, ", AMP %d", size);
265
+                        fprintf(csvfp, ", Inter %dx%d", size, size);
266
+                        fprintf(csvfp, ", Merge %dx%d", size, size);
267
+                        fprintf(csvfp, ", Inter %dx%d", size, size / 2);
268
+                        fprintf(csvfp, ", Merge %dx%d", size, size / 2);
269
+                        fprintf(csvfp, ", Inter %dx%d", size / 2, size);
270
+                        fprintf(csvfp, ", Merge %dx%d", size / 2, size);
271
+                        size /= 2;
272
+                    }
273
+
274
+                    if ((uint32_t)g_log2Size[param->minCUSize] == 3)
275
+                        fprintf(csvfp, ", 4x4");
276
+
277
+                    /* detailed performance statistics */
278
+                    fprintf(csvfp, ", DecideWait (ms), Row0Wait (ms), Wall time (ms), Ref Wait Wall (ms), Total CTU time (ms),"
279
+                        "Stall Time (ms), Total frame time (ms), Avg WPP, Row Blocks");
280
+                }
281
+                fprintf(csvfp, "\n");
282
+            }
283
+            else
284
+                fputs(summaryCSVHeader, csvfp);
285
+        }
286
+        return csvfp;
287
+    }
288
+}
289
+
290
+// per frame CSV logging
291
+void x265_csvlog_frame(const x265_param* param, const x265_picture* pic)
292
+{
293
+    if (!param->csvfpt)
294
+        return;
295
+
296
+    const x265_frame_stats* frameStats = &pic->frameData;
297
+    fprintf(param->csvfpt, "%d, %c-SLICE, %4d, %2.2lf, %10d, %d,", frameStats->encoderOrder, frameStats->sliceType, frameStats->poc,
298
+                                                                   frameStats->qp, (int)frameStats->bits, frameStats->bScenecut);
299
+    if (param->csvLogLevel >= 2)
300
+        fprintf(param->csvfpt, "%.2f,", frameStats->ipCostRatio);
301
+    if (param->rc.rateControlMode == X265_RC_CRF)
302
+        fprintf(param->csvfpt, "%.3lf,", frameStats->rateFactor);
303
+    if (param->rc.vbvBufferSize)
304
+        fprintf(param->csvfpt, "%.3lf,", frameStats->bufferFill);
305
+    if (param->bEnablePsnr)
306
+        fprintf(param->csvfpt, "%.3lf, %.3lf, %.3lf, %.3lf,", frameStats->psnrY, frameStats->psnrU, frameStats->psnrV, frameStats->psnr);
307
+    if (param->bEnableSsim)
308
+        fprintf(param->csvfpt, " %.6f, %6.3f,", frameStats->ssim, x265_ssim2dB(frameStats->ssim));
309
+    fprintf(param->csvfpt, "%d, ", frameStats->frameLatency);
310
+    if (frameStats->sliceType == 'I' || frameStats->sliceType == 'i')
311
+        fputs(" -, -,", param->csvfpt);
312
+    else
313
+    {
314
+        int i = 0;
315
+        while (frameStats->list0POC[i] != -1)
316
+            fprintf(param->csvfpt, "%d ", frameStats->list0POC[i++]);
317
+        fprintf(param->csvfpt, ",");
318
+        if (frameStats->sliceType != 'P')
319
+        {
320
+            i = 0;
321
+            while (frameStats->list1POC[i] != -1)
322
+                fprintf(param->csvfpt, "%d ", frameStats->list1POC[i++]);
323
+            fprintf(param->csvfpt, ",");
324
+        }
325
+        else
326
+            fputs(" -,", param->csvfpt);
327
+    }
328
+
329
+    if (param->csvLogLevel)
330
+    {
331
+        for (uint32_t depth = 0; depth <= param->maxCUDepth; depth++)
332
+            fprintf(param->csvfpt, "%5.2lf%%, %5.2lf%%, %5.2lf%%,", frameStats->cuStats.percentIntraDistribution[depth][0],
333
+                                                                    frameStats->cuStats.percentIntraDistribution[depth][1],
334
+                                                                    frameStats->cuStats.percentIntraDistribution[depth][2]);
335
+        fprintf(param->csvfpt, "%5.2lf%%", frameStats->cuStats.percentIntraNxN);
336
+        if (param->bEnableRectInter)
337
+        {
338
+            for (uint32_t depth = 0; depth <= param->maxCUDepth; depth++)
339
+            {
340
+                fprintf(param->csvfpt, ", %5.2lf%%, %5.2lf%%", frameStats->cuStats.percentInterDistribution[depth][0],
341
+                                                               frameStats->cuStats.percentInterDistribution[depth][1]);
342
+                if (param->bEnableAMP)
343
+                    fprintf(param->csvfpt, ", %5.2lf%%", frameStats->cuStats.percentInterDistribution[depth][2]);
344
+            }
345
+        }
346
+        else
347
+        {
348
+            for (uint32_t depth = 0; depth <= param->maxCUDepth; depth++)
349
+                fprintf(param->csvfpt, ", %5.2lf%%", frameStats->cuStats.percentInterDistribution[depth][0]);
350
+        }
351
+        for (uint32_t depth = 0; depth <= param->maxCUDepth; depth++)
352
+            fprintf(param->csvfpt, ", %5.2lf%%", frameStats->cuStats.percentSkipCu[depth]);
353
+        for (uint32_t depth = 0; depth <= param->maxCUDepth; depth++)
354
+            fprintf(param->csvfpt, ", %5.2lf%%", frameStats->cuStats.percentMergeCu[depth]);
355
+    }
356
+
357
+    if (param->csvLogLevel >= 2)
358
+    {
359
+        fprintf(param->csvfpt, ", %.2lf, %.2lf, %.2lf, %.2lf ", frameStats->avgLumaDistortion,
360
+                                                                frameStats->avgChromaDistortion,
361
+                                                                frameStats->avgPsyEnergy,
362
+                                                                frameStats->avgResEnergy);
363
+
364
+        fprintf(param->csvfpt, ", %d, %d, %.2lf", frameStats->minLumaLevel, frameStats->maxLumaLevel, frameStats->avgLumaLevel);
365
+
366
+        if (param->internalCsp != X265_CSP_I400)
367
+        {
368
+            fprintf(param->csvfpt, ", %d, %d, %.2lf", frameStats->minChromaULevel, frameStats->maxChromaULevel, frameStats->avgChromaULevel);
369
+            fprintf(param->csvfpt, ", %d, %d, %.2lf", frameStats->minChromaVLevel, frameStats->maxChromaVLevel, frameStats->avgChromaVLevel);
370
+        }
371
+
372
+        for (uint32_t i = 0; i < param->maxLog2CUSize - (uint32_t)g_log2Size[param->minCUSize] + 1; i++)
373
+        {
374
+            fprintf(param->csvfpt, ", %.2lf%%", frameStats->puStats.percentIntraPu[i]);
375
+            fprintf(param->csvfpt, ", %.2lf%%", frameStats->puStats.percentSkipPu[i]);
376
+            fprintf(param->csvfpt, ",%.2lf%%", frameStats->puStats.percentAmpPu[i]);
377
+            for (uint32_t j = 0; j < 3; j++)
378
+            {
379
+                fprintf(param->csvfpt, ", %.2lf%%", frameStats->puStats.percentInterPu[i][j]);
380
+                fprintf(param->csvfpt, ", %.2lf%%", frameStats->puStats.percentMergePu[i][j]);
381
+            }
382
+        }
383
+        if ((uint32_t)g_log2Size[param->minCUSize] == 3)
384
+            fprintf(param->csvfpt, ",%.2lf%%", frameStats->puStats.percentNxN);
385
+
386
+        fprintf(param->csvfpt, ", %.1lf, %.1lf, %.1lf, %.1lf, %.1lf, %.1lf, %.1lf,", frameStats->decideWaitTime, frameStats->row0WaitTime,
387
+                                                                                     frameStats->wallTime, frameStats->refWaitWallTime,
388
+                                                                                     frameStats->totalCTUTime, frameStats->stallTime,
389
+                                                                                     frameStats->totalFrameTime);
390
+
391
+        fprintf(param->csvfpt, " %.3lf, %d", frameStats->avgWPP, frameStats->countRowBlocks);
392
+    }
393
+    fprintf(param->csvfpt, "\n");
394
+    fflush(stderr);
395
+}
396
+
397
+void x265_csvlog_encode(x265_encoder *enc, const x265_stats* stats, int argc, char** argv)
398
+{
399
+    if (enc)
400
+    {
401
+        Encoder *encoder = static_cast<Encoder*>(enc);
402
+        int padx = encoder->m_sps.conformanceWindow.rightOffset;
403
+        int pady = encoder->m_sps.conformanceWindow.bottomOffset;
404
+        const x265_api * api = x265_api_get(0);
405
+
406
+        if (!encoder->m_param->csvfpt)
407
+            return;
408
+
409
+        if (encoder->m_param->csvLogLevel)
410
+        {
411
+            // adding summary to a per-frame csv log file, so it needs a summary header
412
+            fprintf(encoder->m_param->csvfpt, "\nSummary\n");
413
+            fputs(summaryCSVHeader, encoder->m_param->csvfpt);
414
+        }
415
+
416
+        // CLI arguments or other
417
+        if (argc)
418
+        {
419
+            fputc('"', encoder->m_param->csvfpt);
420
+            for (int i = 1; i < argc; i++)
421
+            {
422
+                fputc(' ', encoder->m_param->csvfpt);
423
+                fputs(argv[i], encoder->m_param->csvfpt);
424
+            }
425
+            fputc('"', encoder->m_param->csvfpt);
426
+        }
427
+        else
428
+        {
429
+            const x265_param* paramTemp = encoder->m_param;
430
+            char *opts = x265_param2string((x265_param*)paramTemp, padx, pady);
431
+            if (opts)
432
+            {
433
+                fputc('"', encoder->m_param->csvfpt);
434
+                fputs(opts, encoder->m_param->csvfpt);
435
+                fputc('"', encoder->m_param->csvfpt);
436
+            }
437
+        }
438
+
439
+        // current date and time
440
+        time_t now;
441
+        struct tm* timeinfo;
442
+        time(&now);
443
+        timeinfo = localtime(&now);
444
+        char buffer[200];
445
+        strftime(buffer, 128, "%c", timeinfo);
446
+        fprintf(encoder->m_param->csvfpt, ", %s, ", buffer);
447
+
448
+        // elapsed time, fps, bitrate
449
+        fprintf(encoder->m_param->csvfpt, "%.2f, %.2f, %.2f,",
450
+            stats->elapsedEncodeTime, stats->encodedPictureCount / stats->elapsedEncodeTime, stats->bitrate);
451
+
452
+        if (encoder->m_param->bEnablePsnr)
453
+            fprintf(encoder->m_param->csvfpt, " %.3lf, %.3lf, %.3lf, %.3lf,",
454
+            stats->globalPsnrY / stats->encodedPictureCount, stats->globalPsnrU / stats->encodedPictureCount,
455
+            stats->globalPsnrV / stats->encodedPictureCount, stats->globalPsnr);
456
+        else
457
+            fprintf(encoder->m_param->csvfpt, " -, -, -, -,");
458
+        if (encoder->m_param->bEnableSsim)
459
+            fprintf(encoder->m_param->csvfpt, " %.6f, %6.3f,", stats->globalSsim, x265_ssim2dB(stats->globalSsim));
460
+        else
461
+            fprintf(encoder->m_param->csvfpt, " -, -,");
462
+
463
+        if (stats->statsI.numPics)
464
+        {
465
+            fprintf(encoder->m_param->csvfpt, " %-6u, %2.2lf, %-8.2lf,", stats->statsI.numPics, stats->statsI.avgQp, stats->statsI.bitrate);
466
+            if (encoder->m_param->bEnablePsnr)
467
+                fprintf(encoder->m_param->csvfpt, " %.3lf, %.3lf, %.3lf,", stats->statsI.psnrY, stats->statsI.psnrU, stats->statsI.psnrV);
468
+            else
469
+                fprintf(encoder->m_param->csvfpt, " -, -, -,");
470
+            if (encoder->m_param->bEnableSsim)
471
+                fprintf(encoder->m_param->csvfpt, " %.3lf,", stats->statsI.ssim);
472
+            else
473
+                fprintf(encoder->m_param->csvfpt, " -,");
474
+        }
475
+        else
476
+            fprintf(encoder->m_param->csvfpt, " -, -, -, -, -, -, -,");
477
+
478
+        if (stats->statsP.numPics)
479
+        {
480
+            fprintf(encoder->m_param->csvfpt, " %-6u, %2.2lf, %-8.2lf,", stats->statsP.numPics, stats->statsP.avgQp, stats->statsP.bitrate);
481
+            if (encoder->m_param->bEnablePsnr)
482
+                fprintf(encoder->m_param->csvfpt, " %.3lf, %.3lf, %.3lf,", stats->statsP.psnrY, stats->statsP.psnrU, stats->statsP.psnrV);
483
+            else
484
+                fprintf(encoder->m_param->csvfpt, " -, -, -,");
485
+            if (encoder->m_param->bEnableSsim)
486
+                fprintf(encoder->m_param->csvfpt, " %.3lf,", stats->statsP.ssim);
487
+            else
488
+                fprintf(encoder->m_param->csvfpt, " -,");
489
+        }
490
+        else
491
+            fprintf(encoder->m_param->csvfpt, " -, -, -, -, -, -, -,");
492
+
493
+        if (stats->statsB.numPics)
494
+        {
495
+            fprintf(encoder->m_param->csvfpt, " %-6u, %2.2lf, %-8.2lf,", stats->statsB.numPics, stats->statsB.avgQp, stats->statsB.bitrate);
496
+            if (encoder->m_param->bEnablePsnr)
497
+                fprintf(encoder->m_param->csvfpt, " %.3lf, %.3lf, %.3lf,", stats->statsB.psnrY, stats->statsB.psnrU, stats->statsB.psnrV);
498
+            else
499
+                fprintf(encoder->m_param->csvfpt, " -, -, -,");
500
+            if (encoder->m_param->bEnableSsim)
501
+                fprintf(encoder->m_param->csvfpt, " %.3lf,", stats->statsB.ssim);
502
+            else
503
+                fprintf(encoder->m_param->csvfpt, " -,");
504
+        }
505
+        else
506
+            fprintf(encoder->m_param->csvfpt, " -, -, -, -, -, -, -,");
507
+
508
+        fprintf(encoder->m_param->csvfpt, " %-6u, %-6u, %s\n", stats->maxCLL, stats->maxFALL, api->version_str);
509
+    }
510
+}
511
+
512
+/* The dithering algorithm is based on Sierra-2-4A error diffusion.
513
+ * We convert planes in place (without allocating a new buffer). */
514
+static void ditherPlane(uint16_t *src, int srcStride, int width, int height, int16_t *errors, int bitDepth)
515
+{
516
+    const int lShift = 16 - bitDepth;
517
+    const int rShift = 16 - bitDepth + 2;
518
+    const int half = (1 << (16 - bitDepth + 1));
519
+    const int pixelMax = (1 << bitDepth) - 1;
520
+
521
+    memset(errors, 0, (width + 1) * sizeof(int16_t));
522
+
523
+    if (bitDepth == 8)
524
+    {
525
+        for (int y = 0; y < height; y++, src += srcStride)
526
+        {
527
+            uint8_t* dst = (uint8_t *)src;
528
+            int16_t err = 0;
529
+            for (int x = 0; x < width; x++)
530
+            {
531
+                err = err * 2 + errors[x] + errors[x + 1];
532
+                int tmpDst = x265_clip3(0, pixelMax, ((src[x] << 2) + err + half) >> rShift);
533
+                errors[x] = err = (int16_t)(src[x] - (tmpDst << lShift));
534
+                dst[x] = (uint8_t)tmpDst;
535
+            }
536
+        }
537
+    }
538
+    else
539
+    {
540
+        for (int y = 0; y < height; y++, src += srcStride)
541
+        {
542
+            int16_t err = 0;
543
+            for (int x = 0; x < width; x++)
544
+            {
545
+                err = err * 2 + errors[x] + errors[x + 1];
546
+                int tmpDst = x265_clip3(0, pixelMax, ((src[x] << 2) + err + half) >> rShift);
547
+                errors[x] = err = (int16_t)(src[x] - (tmpDst << lShift));
548
+                src[x] = (uint16_t)tmpDst;
549
+            }
550
+        }
551
+    }
552
+}
553
+
554
+void x265_dither_image(x265_picture* picIn, int picWidth, int picHeight, int16_t *errorBuf, int bitDepth)
555
+{
556
+    const x265_api* api = x265_api_get(0);
557
+
558
+    if (sizeof(x265_picture) != api->sizeof_picture)
559
+    {
560
+        fprintf(stderr, "extras [error]: structure size skew, unable to dither\n");
561
+        return;
562
+    }
563
+
564
+    if (picIn->bitDepth <= 8)
565
+    {
566
+        fprintf(stderr, "extras [error]: dither support enabled only for input bitdepth > 8\n");
567
+        return;
568
+    }
569
+
570
+    if (picIn->bitDepth == bitDepth)
571
+    {
572
+        fprintf(stderr, "extras [error]: dither support enabled only if encoder depth is different from picture depth\n");
573
+        return;
574
+    }
575
+
576
+    /* This portion of code is from readFrame in x264. */
577
+    for (int i = 0; i < x265_cli_csps[picIn->colorSpace].planes; i++)
578
+    {
579
+        if (picIn->bitDepth < 16)
580
+        {
581
+            /* upconvert non 16bit high depth planes to 16bit */
582
+            uint16_t *plane = (uint16_t*)picIn->planes[i];
583
+            uint32_t pixelCount = x265_picturePlaneSize(picIn->colorSpace, picWidth, picHeight, i);
584
+            int lShift = 16 - picIn->bitDepth;
585
+
586
+            /* This loop assumes width is equal to stride which
587
+             * happens to be true for file reader outputs */
588
+            for (uint32_t j = 0; j < pixelCount; j++)
589
+                plane[j] = plane[j] << lShift;
590
+        }
591
+
592
+        int height = (int)(picHeight >> x265_cli_csps[picIn->colorSpace].height[i]);
593
+        int width = (int)(picWidth >> x265_cli_csps[picIn->colorSpace].width[i]);
594
+
595
+        ditherPlane(((uint16_t*)picIn->planes[i]), picIn->stride[i] / 2, width, height, errorBuf, bitDepth);
596
+    }
597
+}
598
+
599
 } /* end namespace or extern "C" */
600
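Reviewer note on the ditherPlane() addition above: the error-diffusion arithmetic is easier to follow on a single row. The following is a minimal sketch, not code from the x265 tree. The helper name ditherRowSketch and the use of std::vector are assumptions made for illustration; the shifts, rounding term, and error update are copied from the non-8-bit branch of ditherPlane() in this diff. Each pixel receives twice the error of its left neighbor plus the stored errors from the pixel above and above-right, which matches the Sierra-2-4A weighting named in the source comment.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of x265): reduce one row of 16-bit samples to
// bitDepth bits using the same arithmetic as ditherPlane() in the diff.
// 'errors' must hold src.size() + 1 entries; the caller zeroes it before the
// first row, and it carries the per-column error from one row to the next.
std::vector<uint16_t> ditherRowSketch(const std::vector<uint16_t>& src,
                                      std::vector<int16_t>& errors,
                                      int bitDepth)
{
    const int lShift = 16 - bitDepth;           // scales an output value back to 16 bits
    const int rShift = 16 - bitDepth + 2;       // extra 2 bits: errors are kept scaled by 4
    const int half = 1 << (16 - bitDepth + 1);  // rounding offset for the shift above
    const int pixelMax = (1 << bitDepth) - 1;

    std::vector<uint16_t> dst(src.size());
    int16_t err = 0;
    for (std::size_t x = 0; x < src.size(); x++)
    {
        // 2 * left error + error above + error above-right
        err = (int16_t)(err * 2 + errors[x] + errors[x + 1]);
        int rounded = ((src[x] << 2) + err + half) >> rShift;
        int out = std::min(std::max(rounded, 0), pixelMax);
        // store the new quantization error for this column
        errors[x] = err = (int16_t)(src[x] - (out << lShift));
        dst[x] = (uint16_t)out;
    }
    return dst;
}
```

With a zeroed error buffer and an 8-bit target, a row of 0x8000 samples maps exactly to 128 with no residual error, while a row of 0x8080 samples (128.5 in 8-bit terms) alternates between 129 and 128, so the row average is preserved.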
x265_2.5.tar.gz/source/encoder/encoder.cpp -> x265_2.6.tar.gz/source/encoder/encoder.cpp Changed
748
 
1
@@ -48,6 +48,12 @@
2
 const char g_sliceTypeToChar[] = {'B', 'P', 'I'};
3
 }
4
 
5
+/* Threshold for motion vectors, based on experimental results.
6
+ * TODO: come up with an algorithm for an adaptive threshold */
7
+
8
+#define MVTHRESHOLD 10
9
+#define PU_2Nx2N 1
10
+
11
 static const char* defaultAnalysisFileName = "x265_analysis.dat";
12
 
13
 using namespace X265_NS;
14
@@ -88,8 +94,8 @@
15
 
16
 #if ENABLE_HDR10_PLUS
17
     m_hdr10plus_api = hdr10plus_api_get();
18
-    numCimInfo = 0;
19
-    cim = NULL;
20
+    m_numCimInfo = 0;
21
+    m_cim = NULL;
22
 #endif
23
 
24
     m_prevTonemapPayload.payload = NULL;
25
@@ -386,9 +392,7 @@
26
             }
27
         }
28
     }
29
-
30
-    m_bZeroLatency = !m_param->bframes && !m_param->lookaheadDepth && m_param->frameNumThreads == 1;
31
-
32
+    m_bZeroLatency = !m_param->bframes && !m_param->lookaheadDepth && m_param->frameNumThreads == 1 && m_param->maxSlices == 1;
33
     m_aborted |= parseLambdaFile(m_param);
34
 
35
     m_encodeStartTime = x265_mdate();
36
@@ -396,6 +400,11 @@
37
     m_nalList.m_annexB = !!m_param->bAnnexB;
38
 
39
     m_emitCLLSEI = p->maxCLL || p->maxFALL;
40
+
41
+#if ENABLE_HDR10_PLUS
42
+    if (m_bToneMap)
43
+        m_numCimInfo = m_hdr10plus_api->hdr10plus_json_to_movie_cim(m_param->toneMapFile, m_cim);
44
+#endif
45
 }
46
 
47
 void Encoder::stopJobs()
48
@@ -424,10 +433,257 @@
49
     }
50
 }
51
 
52
+int Encoder::copySlicetypePocAndSceneCut(int *slicetype, int *poc, int *sceneCut)
53
+{
54
+    Frame *FramePtr = m_dpb->m_picList.getCurFrame();
55
+    if (FramePtr != NULL)
56
+    {
57
+        *slicetype = FramePtr->m_lowres.sliceType;
58
+        *poc = FramePtr->m_encData->m_slice->m_poc;
59
+        *sceneCut = FramePtr->m_lowres.bScenecut;
60
+    }
61
+    else
62
+    {
63
+        x265_log(NULL, X265_LOG_WARNING, "Frame is still in lookahead pipeline, this API must be called after (poc >= lookaheadDepth + bframes + 2) condition check\n");
64
+        return -1;
65
+    }
66
+    return 0;
67
+}
68
+
69
+int Encoder::getRefFrameList(PicYuv** l0, PicYuv** l1, int sliceType, int poc)
70
+{
71
+    if (!(IS_X265_TYPE_I(sliceType)))
72
+    {
73
+        Frame *framePtr = m_dpb->m_picList.getPOC(poc);
74
+        if (framePtr != NULL)
75
+        {
76
+            for (int j = 0; j < framePtr->m_encData->m_slice->m_numRefIdx[0]; j++)    // check only for --ref=n number of frames.
77
+            {
78
+                if (framePtr->m_encData->m_slice->m_refFrameList[0][j] && framePtr->m_encData->m_slice->m_refFrameList[0][j]->m_reconPic != NULL)
79
+                {
80
+                    int l0POC = framePtr->m_encData->m_slice->m_refFrameList[0][j]->m_poc;
81
+                    Frame* l0Fp = m_dpb->m_picList.getPOC(l0POC);
82
+                    if (l0Fp->m_reconPic->m_picOrg[0] == NULL)
83
+                        l0Fp->m_reconEncoded.wait(); /* If recon is not ready, the current frame encoder needs to wait. */
84
+                    l0[j] = l0Fp->m_reconPic;
85
+                }
86
+            }
87
+            for (int j = 0; j < framePtr->m_encData->m_slice->m_numRefIdx[1]; j++)    // check only for --ref=n number of frames.
88
+            {
89
+                if (framePtr->m_encData->m_slice->m_refFrameList[1][j] && framePtr->m_encData->m_slice->m_refFrameList[1][j]->m_reconPic != NULL)
90
+                {
91
+                    int l1POC = framePtr->m_encData->m_slice->m_refFrameList[1][j]->m_poc;
92
+                    Frame* l1Fp = m_dpb->m_picList.getPOC(l1POC);
93
+                    if (l1Fp->m_reconPic->m_picOrg[0] == NULL)
94
+                        l1Fp->m_reconEncoded.wait(); /* If recon is not ready, the current frame encoder needs to wait. */
95
+                    l1[j] = l1Fp->m_reconPic;
96
+                }
97
+            }
98
+        }
99
+        else
100
+            x265_log(NULL, X265_LOG_WARNING, "Reference list is not in piclist\n");
101
+    }
102
+    else
103
+    {
104
+        x265_log(NULL, X265_LOG_ERROR, "I frames do not have a reference list\n");
105
+        return -1;
106
+    }
107
+    return 0;
108
+}
109
+
110
+int Encoder::setAnalysisDataAfterZScan(x265_analysis_data *analysis_data, Frame* curFrame)
111
+{
112
+    int mbImageWidth, mbImageHeight;
113
+    mbImageWidth = (curFrame->m_fencPic->m_picWidth + 16 - 1) >> 4; //AVC block sizes
114
+    mbImageHeight = (curFrame->m_fencPic->m_picHeight + 16 - 1) >> 4;
115
+    if (analysis_data->sliceType == X265_TYPE_IDR || analysis_data->sliceType == X265_TYPE_I)
116
+    {
117
+        curFrame->m_analysisData.sliceType = X265_TYPE_I;
118
+        if (m_param->analysisReuseLevel < 7)
119
+            return -1;
120
+        curFrame->m_analysisData.numPartitions = m_param->num4x4Partitions;
121
+        int num16x16inCUWidth = m_param->maxCUSize >> 4;
122
+        uint32_t ctuAddr, offset, cuPos;
123
+        analysis_intra_data * intraData = (analysis_intra_data *)curFrame->m_analysisData.intraData;
124
+        analysis_intra_data * srcIntraData = (analysis_intra_data *)analysis_data->intraData;
125
+        for (int i = 0; i < mbImageHeight; i++)
126
+        {
127
+            for (int j = 0; j < mbImageWidth; j++)
128
+            {
129
+                int mbIndex = j + i * mbImageWidth;
130
+                ctuAddr = (j / num16x16inCUWidth + ((i / num16x16inCUWidth) * (mbImageWidth / num16x16inCUWidth)));
131
+                offset = ((i % num16x16inCUWidth) << 5) + ((j % num16x16inCUWidth) << 4);
132
+                if ((j % 4 >= 2) && m_param->maxCUSize == 64)
133
+                    offset += (2 * 16);
134
+                if ((i % 4 >= 2) && m_param->maxCUSize == 64)
135
+                    offset += (2 * 32);
136
+                cuPos = ctuAddr  * curFrame->m_analysisData.numPartitions + offset;
137
+                memcpy(&(intraData)->depth[cuPos], &(srcIntraData)->depth[mbIndex * 16], 16);
138
+                memcpy(&(intraData)->chromaModes[cuPos], &(srcIntraData)->chromaModes[mbIndex * 16], 16);
139
+                memcpy(&(intraData)->partSizes[cuPos], &(srcIntraData)->partSizes[mbIndex * 16], 16);
140
+                memcpy(&(intraData)->partSizes[cuPos], &(srcIntraData)->partSizes[mbIndex * 16], 16);
141
+            }
142
+        }
143
+        memcpy(&(intraData)->modes, (srcIntraData)->modes, curFrame->m_analysisData.numPartitions * analysis_data->numCUsInFrame);
144
+    }
145
+    else
146
+    {
147
+        uint32_t numDir = analysis_data->sliceType == X265_TYPE_P ? 1 : 2;
148
+        if (m_param->analysisReuseLevel < 7)
149
+            return -1;
150
+        curFrame->m_analysisData.numPartitions = m_param->num4x4Partitions;
151
+        int num16x16inCUWidth = m_param->maxCUSize >> 4;
152
+        uint32_t ctuAddr, offset, cuPos;
153
+        analysis_inter_data * interData = (analysis_inter_data *)curFrame->m_analysisData.interData;
154
+        analysis_inter_data * srcInterData = (analysis_inter_data*)analysis_data->interData;
155
+        for (int i = 0; i < mbImageHeight; i++)
156
+        {
157
+            for (int j = 0; j < mbImageWidth; j++)
158
+            {
159
+                int mbIndex = j + i * mbImageWidth;
160
+                ctuAddr = (j / num16x16inCUWidth + ((i / num16x16inCUWidth) * (mbImageWidth / num16x16inCUWidth)));
161
+                offset = ((i % num16x16inCUWidth) << 5) + ((j % num16x16inCUWidth) << 4);
162
+                if ((j % 4 >= 2) && m_param->maxCUSize == 64)
163
+                    offset += (2 * 16);
164
+                if ((i % 4 >= 2) && m_param->maxCUSize == 64)
165
+                    offset += (2 * 32);
166
+                cuPos = ctuAddr  * curFrame->m_analysisData.numPartitions + offset;
167
+                memcpy(&(interData)->depth[cuPos], &(srcInterData)->depth[mbIndex * 16], 16);
168
+                memcpy(&(interData)->modes[cuPos], &(srcInterData)->modes[mbIndex * 16], 16);
169
+
170
+                memcpy(&(interData)->partSize[cuPos], &(srcInterData)->partSize[mbIndex * 16], 16);
171
+
172
+                int bytes = curFrame->m_analysisData.numPartitions >> ((srcInterData)->depth[mbIndex * 16] * 2);
173
+                int cuCount = 1;
174
+                if (bytes < 16)
175
+                    cuCount = 4;
176
+                for (int cuI = 0; cuI < cuCount; cuI++)
177
+                {
178
+                    int numPU = nbPartsTable[(srcInterData)->partSize[mbIndex * 16 + cuI * bytes]];
179
+                    for (int pu = 0; pu < numPU; pu++)
180
+                    {
181
+                        int cuOffset = cuI * bytes + pu;
182
+                        (interData)->mergeFlag[cuPos + cuOffset] = (srcInterData)->mergeFlag[(mbIndex * 16) + cuOffset];
183
+
184
+                        (interData)->interDir[cuPos + cuOffset] = (srcInterData)->interDir[(mbIndex * 16) + cuOffset];
185
+                        for (uint32_t k = 0; k < numDir; k++)
186
+                        {
187
+                            (interData)->mvpIdx[k][cuPos + cuOffset] = (srcInterData)->mvpIdx[k][(mbIndex * 16) + cuOffset];
188
+                            (interData)->refIdx[k][cuPos + cuOffset] = (srcInterData)->refIdx[k][(mbIndex * 16) + cuOffset];
189
+                            memcpy(&(interData)->mv[k][cuPos + cuOffset], &(srcInterData)->mv[k][(mbIndex * 16) + cuOffset], sizeof(MV));
190
+                            if (m_param->analysisReuseLevel == 7)
191
+                            {
192
+                                int mv_x = ((analysis_inter_data *)curFrame->m_analysisData.interData)->mv[k][(mbIndex * 16) + cuOffset].x;
193
+                                int mv_y = ((analysis_inter_data *)curFrame->m_analysisData.interData)->mv[k][(mbIndex * 16) + cuOffset].y;
194
+                                double mv = sqrt(mv_x*mv_x + mv_y*mv_y);
195
+                                if (numPU == PU_2Nx2N && ((srcInterData)->depth[cuPos + cuOffset] == (m_param->maxCUSize >> 5)) && mv <= MVTHRESHOLD)
196
+                                    memset(&curFrame->m_analysisData.modeFlag[k][cuPos + cuOffset], 1, bytes);
197
+                            }
198
+                        }
199
+                    }
200
+                }
201
+            }
202
+        }
203
+    }
204
+    return 0;
205
+}
206
+
207
+int Encoder::setAnalysisData(x265_analysis_data *analysis_data, int poc, uint32_t cuBytes)
208
+{
209
+    uint32_t widthInCU = (m_param->sourceWidth + m_param->maxCUSize - 1) >> m_param->maxLog2CUSize;
210
+    uint32_t heightInCU = (m_param->sourceHeight + m_param->maxCUSize - 1) >> m_param->maxLog2CUSize;
211
+
212
+    Frame* curFrame = m_dpb->m_picList.getPOC(poc);
213
+    if (curFrame != NULL)
214
+    {
215
+        curFrame->m_analysisData = (*analysis_data);
216
+        curFrame->m_analysisData.numCUsInFrame = widthInCU * heightInCU;
217
+        curFrame->m_analysisData.numPartitions = m_param->num4x4Partitions;
218
+        allocAnalysis(&curFrame->m_analysisData);
219
+        if (m_param->maxCUSize == 16)
220
+        {
221
+            if (analysis_data->sliceType == X265_TYPE_IDR || analysis_data->sliceType == X265_TYPE_I)
222
+            {
223
+                curFrame->m_analysisData.sliceType = X265_TYPE_I;
224
+                if (m_param->analysisReuseLevel < 2)
225
+                    return -1;
226
+
227
+                curFrame->m_analysisData.numPartitions = m_param->num4x4Partitions;
228
+                size_t count = 0;
229
+                analysis_intra_data * currIntraData = (analysis_intra_data *)curFrame->m_analysisData.intraData;
230
+                analysis_intra_data * intraData = (analysis_intra_data *)analysis_data->intraData;
231
+                for (uint32_t d = 0; d < cuBytes; d++)
232
+                {
233
+                    int bytes = curFrame->m_analysisData.numPartitions >> ((intraData)->depth[d] * 2);
234
+                    memset(&(currIntraData)->depth[count], (intraData)->depth[d], bytes);
235
+                    memset(&(currIntraData)->chromaModes[count], (intraData)->chromaModes[d], bytes);
236
+                    memset(&(currIntraData)->partSizes[count], (intraData)->partSizes[d], bytes);
237
+                    memset(&(currIntraData)->partSizes[count], (intraData)->partSizes[d], bytes);
238
+                    count += bytes;
239
+                }
240
+                memcpy(&(currIntraData)->modes, (intraData)->modes, curFrame->m_analysisData.numPartitions * analysis_data->numCUsInFrame);
241
+            }
242
+            else
243
+            {
244
+                uint32_t numDir = analysis_data->sliceType == X265_TYPE_P ? 1 : 2;
245
+                if (m_param->analysisReuseLevel < 2)
246
+                    return -1;
247
+
248
+                curFrame->m_analysisData.numPartitions = m_param->num4x4Partitions;
249
+                size_t count = 0;
250
+                analysis_inter_data * currInterData = (analysis_inter_data *)curFrame->m_analysisData.interData;
251
+                analysis_inter_data * interData = (analysis_inter_data *)analysis_data->interData;
252
+                for (uint32_t d = 0; d < cuBytes; d++)
253
+                {
254
+                    int bytes = curFrame->m_analysisData.numPartitions >> ((interData)->depth[d] * 2);
255
+                    memset(&(currInterData)->depth[count], (interData)->depth[d], bytes);
256
+                    memset(&(currInterData)->modes[count], (interData)->modes[d], bytes);
257
+                    memcpy(&(currInterData)->sadCost[count], &((analysis_inter_data*)analysis_data->interData)->sadCost[d], bytes);
258
+                    if (m_param->analysisReuseLevel > 4)
259
+                    {
260
+                        memset(&(currInterData)->partSize[count], (interData)->partSize[d], bytes);
261
+                        int numPU = nbPartsTable[(currInterData)->partSize[d]];
262
+                        for (int pu = 0; pu < numPU; pu++, d++)
263
+                        {
264
+                            (currInterData)->mergeFlag[count + pu] = (interData)->mergeFlag[d];
265
+                            if (m_param->analysisReuseLevel >= 7)
266
+                            {
267
+                                (currInterData)->interDir[count + pu] = (interData)->interDir[d];
268
+                                for (uint32_t i = 0; i < numDir; i++)
269
+                                {
270
+                                    (currInterData)->mvpIdx[i][count + pu] = (interData)->mvpIdx[i][d];
271
+                                    (currInterData)->refIdx[i][count + pu] = (interData)->refIdx[i][d];
272
+                                    memcpy(&(currInterData)->mv[i][count + pu], &(interData)->mv[i][d], sizeof(MV));
273
+                                    if (m_param->analysisReuseLevel == 7)
274
+                                    {
275
+                                        int mv_x = ((analysis_inter_data *)curFrame->m_analysisData.interData)->mv[i][count + pu].x;
276
+                                        int mv_y = ((analysis_inter_data *)curFrame->m_analysisData.interData)->mv[i][count + pu].y;
277
+                                        double mv = sqrt(mv_x*mv_x + mv_y*mv_y);
278
+                                        if (numPU == PU_2Nx2N && m_param->num4x4Partitions <= 16 && mv <= MVTHRESHOLD)
279
+                                            memset(&curFrame->m_analysisData.modeFlag[i][count + pu], 1, bytes);
280
+                                    }
281
+                                }
282
+                            }
283
+                        }
284
+                    }
285
+                    count += bytes;
286
+                }
287
+            }
288
+        }
289
+        else
290
+            setAnalysisDataAfterZScan(analysis_data, curFrame);
291
+
292
+        curFrame->m_copyMVType.trigger();
293
+        return 0;
294
+    }
295
+    return -1;
296
+}
297
+
298
 void Encoder::destroy()
299
 {
300
 #if ENABLE_HDR10_PLUS
301
-    m_hdr10plus_api->hdr10plus_clear_movie(cim, numCimInfo);
302
+    if (m_bToneMap)
303
+        m_hdr10plus_api->hdr10plus_clear_movie(m_cim, m_numCimInfo);
304
 #endif
305
         
306
     if (m_exportedPic)
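
Review note on `setAnalysisData` in the hunk above: in the `maxCUSize == 16` branch, each compact per-CU entry is expanded into one value per 4x4 partition, with `bytes = numPartitions >> (depth * 2)` giving the number of partitions that entry covers. The following is a minimal sketch of that expansion only. `expandDepths` is a hypothetical helper for illustration and is not part of x265; it assumes a CTU with 16 partitions, as implied by a 16x16 CTU.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of x265: expands one depth value per coded CU
// into one value per 4x4 partition, in the same way as the memset loop in the
// hunk above. numPartitions is the number of 4x4 partitions in one CTU.
std::vector<uint8_t> expandDepths(const std::vector<uint8_t>& cuDepths, uint32_t numPartitions)
{
    std::vector<uint8_t> perPartition;
    for (uint8_t depth : cuDepths)
    {
        // Each extra depth level quarters the CU area, hence the shift by depth * 2.
        uint32_t span = numPartitions >> (depth * 2);
        perPartition.insert(perPartition.end(), span, depth);
    }
    return perPartition;
}
```

With 16 partitions per CTU, a depth-0 entry fills all 16 slots, a depth-1 entry fills 4, and a depth-2 entry fills 1, so a well-formed list of entries always expands to exactly one CTU's worth of partitions.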
307
@@ -603,24 +859,34 @@
308
     }
309
     if (pic_in)
310
     {
311
+        if (m_latestParam->forceFlush == 1)
312
+        {
313
+            m_lookahead->setLookaheadQueue();
314
+            m_latestParam->forceFlush = 0;
315
+        }
316
+        if (m_latestParam->forceFlush == 2)
317
+        {
318
+            m_lookahead->m_filled = false;
319
+            m_latestParam->forceFlush = 0;
320
+        }
321
+
322
         x265_sei_payload toneMap;
323
         toneMap.payload = NULL;
324
 #if ENABLE_HDR10_PLUS
325
         if (m_bToneMap)
326
         {
327
-            if (pic_in->poc == 0)
328
-                numCimInfo = m_hdr10plus_api->hdr10plus_json_to_movie_cim(m_param->toneMapFile, cim);
329
-            if (pic_in->poc < numCimInfo)
330
+            int currentPOC = m_pocLast + 1;
331
+            if (currentPOC < m_numCimInfo)
332
             {
333
                 int32_t i = 0;
334
                 toneMap.payloadSize = 0;
335
-                while (cim[pic_in->poc][i] == 0xFF)
336
-                    toneMap.payloadSize += cim[pic_in->poc][i++];
337
-                toneMap.payloadSize += cim[pic_in->poc][i++];
338
+                while (m_cim[currentPOC][i] == 0xFF)
339
+                    toneMap.payloadSize += m_cim[currentPOC][i++];
340
+                toneMap.payloadSize += m_cim[currentPOC][i];
341
 
342
                 toneMap.payload = (uint8_t*)x265_malloc(sizeof(uint8_t) * toneMap.payloadSize);
343
                 toneMap.payloadType = USER_DATA_REGISTERED_ITU_T_T35;
344
-                memcpy(toneMap.payload, cim[pic_in->poc] + i, toneMap.payloadSize);
345
+                memcpy(toneMap.payload, &m_cim[currentPOC][i+1], toneMap.payloadSize);
346
             }
347
         }
348
 #endif
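
Review note on the tone-map hunk above: the loop reads the payload size as a run of `0xFF` bytes followed by one final byte, then copies the payload from the byte after that prefix. The sketch below shows that decoding in isolation. The entry layout is inferred from the loop in this diff, and `SizedPayload` and `readSizedPayload` are hypothetical names, not functions of x265 or of the HDR10+ library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper types, for illustration only.
struct SizedPayload
{
    size_t size;         // decoded payload size in bytes
    const uint8_t* data; // first payload byte, just past the size prefix
};

// Decodes a size stored as zero or more 0xFF bytes followed by one byte
// below 0xFF, and returns the size together with the payload start.
SizedPayload readSizedPayload(const uint8_t* entry)
{
    size_t i = 0;
    size_t size = 0;
    while (entry[i] == 0xFF) // each 0xFF byte adds 255 and signals that more follows
        size += entry[i++];
    size += entry[i];        // final size byte
    return { size, entry + i + 1 };
}
```

Read this way, the index arithmetic before and after the change appears equivalent (`i++` followed by `cim[...] + i` versus `[i]` followed by `[i+1]`). The behavioural changes in the hunk are the use of `m_pocLast + 1` instead of `pic_in->poc` and the removal of the JSON load at POC 0.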
349
@@ -714,7 +980,12 @@
350
 
351
         if (inFrame->m_userSEI.numPayloads)
352
         {
353
-            inFrame->m_userSEI.payloads = new x265_sei_payload[numPayloads];
354
+            if (!inFrame->m_userSEI.payloads)
355
+            {
356
+                inFrame->m_userSEI.payloads = new x265_sei_payload[numPayloads];
357
+                for (int i = 0; i < numPayloads; i++)
358
+                    inFrame->m_userSEI.payloads[i].payload = NULL;
359
+            }
360
             for (int i = 0; i < numPayloads; i++)
361
             {
362
                 x265_sei_payload input;
363
@@ -724,7 +995,8 @@
364
                     input = pic_in->userSEI.payloads[i];
365
                 int size = inFrame->m_userSEI.payloads[i].payloadSize = input.payloadSize;
366
                 inFrame->m_userSEI.payloads[i].payloadType = input.payloadType;
367
-                inFrame->m_userSEI.payloads[i].payload = new uint8_t[size];
368
+                if (!inFrame->m_userSEI.payloads[i].payload)
369
+                    inFrame->m_userSEI.payloads[i].payload = new uint8_t[size];
370
                 memcpy(inFrame->m_userSEI.payloads[i].payload, input.payload, size);
371
             }
372
             if (toneMap.payload)
373
@@ -768,9 +1040,22 @@
374
         {
375
             /* readAnalysisFile reads analysis data for the frame and allocates memory based on slicetype */
376
             readAnalysisFile(&inFrame->m_analysisData, inFrame->m_poc, pic_in);
377
+            inFrame->m_poc = inFrame->m_analysisData.poc;
378
             sliceType = inFrame->m_analysisData.sliceType;
379
             inFrame->m_lowres.bScenecut = !!inFrame->m_analysisData.bScenecut;
380
             inFrame->m_lowres.satdCost = inFrame->m_analysisData.satdCost;
381
+            if (m_param->bDisableLookahead)
382
+            {
383
+                inFrame->m_lowres.sliceType = sliceType;
384
+                inFrame->m_lowres.bKeyframe = !!inFrame->m_analysisData.lookahead.keyframe;
385
+                inFrame->m_lowres.bLastMiniGopBFrame = !!inFrame->m_analysisData.lookahead.lastMiniGopBFrame;
386
+                int vbvCount = m_param->lookaheadDepth + m_param->bframes + 2;
387
+                for (int index = 0; index < vbvCount; index++)
388
+                {
389
+                    inFrame->m_lowres.plannedSatd[index] = inFrame->m_analysisData.lookahead.plannedSatd[index];
390
+                    inFrame->m_lowres.plannedType[index] = inFrame->m_analysisData.lookahead.plannedType[index];
391
+                }
392
+            }
393
         }
394
         if (m_param->bUseRcStats && pic_in->rcData)
395
         {
396
@@ -804,6 +1089,8 @@
397
         m_lookahead->addPicture(*inFrame, sliceType);
398
         m_numDelayedPic++;
399
     }
400
+    else if (m_latestParam->forceFlush == 2)
401
+        m_lookahead->m_filled = true;
402
     else
403
         m_lookahead->flush();
404
 
405
@@ -831,7 +1118,7 @@
406
             x265_frame_stats* frameData = NULL;
407
 
408
             /* Free up pic_in->analysisData since it has already been used */
409
-            if (m_param->analysisReuseMode == X265_ANALYSIS_LOAD)
410
+            if (m_param->analysisReuseMode == X265_ANALYSIS_LOAD || (m_param->bMVType && slice->m_sliceType != I_SLICE))
411
                 freeAnalysis(&outFrame->m_analysisData);
412
 
413
             if (pic_out)
414
@@ -862,12 +1149,45 @@
415
                     pic_out->analysisData.poc = pic_out->poc;
416
                     pic_out->analysisData.sliceType = pic_out->sliceType;
417
                     pic_out->analysisData.bScenecut = outFrame->m_lowres.bScenecut;
418
-                    pic_out->analysisData.satdCost  = outFrame->m_lowres.satdCost;                    
419
+                    pic_out->analysisData.satdCost  = outFrame->m_lowres.satdCost;
420
                     pic_out->analysisData.numCUsInFrame = outFrame->m_analysisData.numCUsInFrame;
421
                     pic_out->analysisData.numPartitions = outFrame->m_analysisData.numPartitions;
422
                     pic_out->analysisData.wt = outFrame->m_analysisData.wt;
423
                     pic_out->analysisData.interData = outFrame->m_analysisData.interData;
424
                     pic_out->analysisData.intraData = outFrame->m_analysisData.intraData;
425
+                    pic_out->analysisData.modeFlag[0] = outFrame->m_analysisData.modeFlag[0];
426
+                    pic_out->analysisData.modeFlag[1] = outFrame->m_analysisData.modeFlag[1];
427
+                    if (m_param->bDisableLookahead)
428
+                    {
429
+                        int factor = 1;
430
+                        if (m_param->scaleFactor)
431
+                            factor = m_param->scaleFactor * 2;
432
+                        pic_out->analysisData.numCuInHeight = outFrame->m_analysisData.numCuInHeight;
433
+                        pic_out->analysisData.lookahead.dts = outFrame->m_dts;
434
+                        pic_out->analysisData.satdCost *= factor;
435
+                        pic_out->analysisData.lookahead.keyframe = outFrame->m_lowres.bKeyframe;
436
+                        pic_out->analysisData.lookahead.lastMiniGopBFrame = outFrame->m_lowres.bLastMiniGopBFrame;
437
+                        int vbvCount = m_param->lookaheadDepth + m_param->bframes + 2;
438
+                        for (int index = 0; index < vbvCount; index++)
439
+                        {
440
+                            pic_out->analysisData.lookahead.plannedSatd[index] = outFrame->m_lowres.plannedSatd[index] * factor;
441
+                            pic_out->analysisData.lookahead.plannedType[index] = outFrame->m_lowres.plannedType[index];
442
+                        }
443
+                        for (uint32_t index = 0; index < pic_out->analysisData.numCuInHeight; index++)
444
+                        {
445
+                            outFrame->m_analysisData.lookahead.intraSatdForVbv[index] = outFrame->m_encData->m_rowStat[index].intraSatdForVbv * factor;
446
+                            outFrame->m_analysisData.lookahead.satdForVbv[index] = outFrame->m_encData->m_rowStat[index].satdForVbv * factor;
447
+                        }
448
+                        pic_out->analysisData.lookahead.intraSatdForVbv = outFrame->m_analysisData.lookahead.intraSatdForVbv;
449
+                        pic_out->analysisData.lookahead.satdForVbv = outFrame->m_analysisData.lookahead.satdForVbv;
450
+                        for (uint32_t index = 0; index < pic_out->analysisData.numCUsInFrame; index++)
451
+                        {
452
+                            outFrame->m_analysisData.lookahead.intraVbvCost[index] = outFrame->m_encData->m_cuStat[index].intraVbvCost * factor;
453
+                            outFrame->m_analysisData.lookahead.vbvCost[index] = outFrame->m_encData->m_cuStat[index].vbvCost * factor;
454
+                        }
455
+                        pic_out->analysisData.lookahead.intraVbvCost = outFrame->m_analysisData.lookahead.intraVbvCost;
456
+                        pic_out->analysisData.lookahead.vbvCost = outFrame->m_analysisData.lookahead.vbvCost;
457
+                    }
458
                     writeAnalysisFile(&pic_out->analysisData, *outFrame->m_encData);
459
                     if (m_param->bUseAnalysisFile)
460
                         freeAnalysis(&pic_out->analysisData);
461
@@ -1030,7 +1350,20 @@
462
                 slice->m_maxNumMergeCand = m_param->maxNumMergeCand;
463
                 slice->m_endCUAddr = slice->realEndAddress(m_sps.numCUsInFrame * m_param->num4x4Partitions);
464
             }
465
-
466
+            if (m_param->analysisReuseMode == X265_ANALYSIS_LOAD && m_param->bDisableLookahead)
467
+            {
468
+                frameEnc->m_dts = frameEnc->m_analysisData.lookahead.dts;
469
+                for (uint32_t index = 0; index < frameEnc->m_analysisData.numCuInHeight; index++)
470
+                {
471
+                    frameEnc->m_encData->m_rowStat[index].intraSatdForVbv = frameEnc->m_analysisData.lookahead.intraSatdForVbv[index];
472
+                    frameEnc->m_encData->m_rowStat[index].satdForVbv = frameEnc->m_analysisData.lookahead.satdForVbv[index];
473
+                }
474
+                for (uint32_t index = 0; index < frameEnc->m_analysisData.numCUsInFrame; index++)
475
+                {
476
+                    frameEnc->m_encData->m_cuStat[index].intraVbvCost = frameEnc->m_analysisData.lookahead.intraVbvCost[index];
477
+                    frameEnc->m_encData->m_cuStat[index].vbvCost = frameEnc->m_analysisData.lookahead.vbvCost[index];
478
+                }
479
+            }
480
             if (m_param->searchMethod == X265_SEA && frameEnc->m_lowres.sliceType != X265_TYPE_B)
481
             {
482
                 int padX = m_param->maxCUSize + 32;
483
@@ -1083,16 +1416,19 @@
484
             frameEnc->m_encData->m_slice->m_iNumRPSInSPS = m_sps.spsrpsNum;
485
 
486
             curEncoder->m_rce.encodeOrder = frameEnc->m_encodeOrder = m_encodedFrameNum++;
487
-            if (m_bframeDelay)
488
+            if (m_param->analysisReuseMode != X265_ANALYSIS_LOAD || !m_param->bDisableLookahead)
489
             {
490
-                int64_t *prevReorderedPts = m_prevReorderedPts;
491
-                frameEnc->m_dts = m_encodedFrameNum > m_bframeDelay
492
-                    ? prevReorderedPts[(m_encodedFrameNum - m_bframeDelay) % m_bframeDelay]
493
-                    : frameEnc->m_reorderedPts - m_bframeDelayTime;
494
-                prevReorderedPts[m_encodedFrameNum % m_bframeDelay] = frameEnc->m_reorderedPts;
495
+                if (m_bframeDelay)
496
+                {
497
+                    int64_t *prevReorderedPts = m_prevReorderedPts;
498
+                    frameEnc->m_dts = m_encodedFrameNum > m_bframeDelay
499
+                        ? prevReorderedPts[(m_encodedFrameNum - m_bframeDelay) % m_bframeDelay]
500
+                        : frameEnc->m_reorderedPts - m_bframeDelayTime;
501
+                    prevReorderedPts[m_encodedFrameNum % m_bframeDelay] = frameEnc->m_reorderedPts;
502
+                }
503
+                else
504
+                    frameEnc->m_dts = frameEnc->m_reorderedPts;
505
             }
506
-            else
507
-                frameEnc->m_dts = frameEnc->m_reorderedPts;
508
 
509
             /* Allocate analysis data before encode in save mode. This is allocated in frameEnc */
510
             if (m_param->analysisReuseMode == X265_ANALYSIS_SAVE)
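
Review note on the DTS hunk above: when lookahead is not bypassed, the DTS is still derived from a small ring buffer of earlier reordered PTS values. The sketch below reproduces that arithmetic under stated assumptions: `DtsGenerator` is a hypothetical stand-in for the encoder members involved, and it assumes that `m_reorderedPts` carries the input PTS values handed out in encode order, which this hunk does not show.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the encoder state used by the DTS logic.
struct DtsGenerator
{
    int bframeDelay;          // reorder delay in frames
    int64_t bframeDelayTime;  // PTS distance covered by that delay
    int encodedFrameNum = 0;
    std::vector<int64_t> prevReorderedPts;

    DtsGenerator(int delay, int64_t delayTime)
        : bframeDelay(delay),
          bframeDelayTime(delayTime),
          prevReorderedPts(static_cast<size_t>(delay > 0 ? delay : 1), 0) {}

    // reorderedPts: PTS value assigned to the frame now entering the encoder.
    int64_t next(int64_t reorderedPts)
    {
        encodedFrameNum++; // the diff increments the counter before computing the DTS
        if (!bframeDelay)
            return reorderedPts;
        // (n - delay) % delay equals n % delay, so the slot read here is the
        // same slot that is overwritten below: the PTS from `delay` frames ago.
        size_t slot = static_cast<size_t>(encodedFrameNum % bframeDelay);
        int64_t dts = encodedFrameNum > bframeDelay
            ? prevReorderedPts[slot]
            : reorderedPts - bframeDelayTime;
        prevReorderedPts[slot] = reorderedPts;
        return dts;
    }
};
```

For a delay of two frames and PTS values 0, 1, 2, 3, this yields DTS values -2, -1, 0, 1. In the load-with-disabled-lookahead case added by this diff, none of this runs and the DTS is taken from the stored analysis data instead.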
511
@@ -1105,6 +1441,7 @@
512
 
513
                 uint32_t numCUsInFrame   = widthInCU * heightInCU;
514
                 analysis->numCUsInFrame  = numCUsInFrame;
515
+                analysis->numCuInHeight = heightInCU;
516
                 analysis->numPartitions  = m_param->num4x4Partitions;
517
                 allocAnalysis(analysis);
518
             }
519
@@ -1130,48 +1467,62 @@
520
 
521
 int Encoder::reconfigureParam(x265_param* encParam, x265_param* param)
522
 {
523
-    encParam->maxNumReferences = param->maxNumReferences; // never uses more refs than specified in stream headers
524
-    encParam->bEnableFastIntra = param->bEnableFastIntra;
525
-    encParam->bEnableEarlySkip = param->bEnableEarlySkip;
526
-    encParam->bEnableRecursionSkip = param->bEnableRecursionSkip;
527
-    encParam->searchMethod = param->searchMethod;
528
-    /* Scratch buffer prevents me_range from being increased for esa/tesa */
529
-    if (param->searchRange < encParam->searchRange)
530
-        encParam->searchRange = param->searchRange;
531
-    /* We can't switch out of subme=0 during encoding. */
532
-    if (encParam->subpelRefine)
533
-        encParam->subpelRefine = param->subpelRefine;
534
-    encParam->rdoqLevel = param->rdoqLevel;
535
-    encParam->rdLevel = param->rdLevel;
536
-    encParam->bEnableRectInter = param->bEnableRectInter;
537
-    encParam->maxNumMergeCand = param->maxNumMergeCand;
538
-    encParam->bIntraInBFrames = param->bIntraInBFrames;
539
-    if (param->scalingLists && !encParam->scalingLists)
540
-        encParam->scalingLists = strdup(param->scalingLists);
541
-    /* VBV can't be turned ON if it wasn't ON to begin with and can't be turned OFF if it was ON to begin with*/
542
-    if (param->rc.vbvMaxBitrate > 0 && param->rc.vbvBufferSize > 0 &&
543
-        encParam->rc.vbvMaxBitrate > 0 && encParam->rc.vbvBufferSize > 0)
544
-    {
545
-        m_reconfigureRc |= encParam->rc.vbvMaxBitrate != param->rc.vbvMaxBitrate;
546
-        m_reconfigureRc |= encParam->rc.vbvBufferSize != param->rc.vbvBufferSize;
547
-        if (m_reconfigureRc && m_param->bEmitHRDSEI)
548
-            x265_log(m_param, X265_LOG_WARNING, "VBV parameters cannot be changed when HRD is in use.\n");
549
-        else
550
-        {
551
-            encParam->rc.vbvMaxBitrate = param->rc.vbvMaxBitrate;
552
-            encParam->rc.vbvBufferSize = param->rc.vbvBufferSize;
553
+    if (isReconfigureRc(encParam, param))
554
+    {
555
+        /* VBV can't be turned ON if it wasn't ON to begin with and can't be turned OFF if it was ON to begin with*/
556
+        if (param->rc.vbvMaxBitrate > 0 && param->rc.vbvBufferSize > 0 &&
557
+            encParam->rc.vbvMaxBitrate > 0 && encParam->rc.vbvBufferSize > 0)
558
+        {
559
+            m_reconfigureRc |= encParam->rc.vbvMaxBitrate != param->rc.vbvMaxBitrate;
560
+            m_reconfigureRc |= encParam->rc.vbvBufferSize != param->rc.vbvBufferSize;
561
+            if (m_reconfigureRc && m_param->bEmitHRDSEI)
562
+                x265_log(m_param, X265_LOG_WARNING, "VBV parameters cannot be changed when HRD is in use.\n");
563
+            else
564
+            {
565
+                encParam->rc.vbvMaxBitrate = param->rc.vbvMaxBitrate;
566
+                encParam->rc.vbvBufferSize = param->rc.vbvBufferSize;
567
+            }
568
         }
569
+        m_reconfigureRc |= encParam->rc.bitrate != param->rc.bitrate;
570
+        encParam->rc.bitrate = param->rc.bitrate;
571
+        m_reconfigureRc |= encParam->rc.rfConstant != param->rc.rfConstant;
572
+        encParam->rc.rfConstant = param->rc.rfConstant;
573
     }
574
-    m_reconfigureRc |= encParam->rc.bitrate != param->rc.bitrate;
575
-    encParam->rc.bitrate = param->rc.bitrate;
576
-    m_reconfigureRc |= encParam->rc.rfConstant != param->rc.rfConstant;
577
-    encParam->rc.rfConstant = param->rc.rfConstant; 
578
-
579
+    else
580
+    {
581
+        encParam->maxNumReferences = param->maxNumReferences; // never uses more refs than specified in stream headers
582
+        encParam->bEnableFastIntra = param->bEnableFastIntra;
583
+        encParam->bEnableEarlySkip = param->bEnableEarlySkip;
584
+        encParam->bEnableRecursionSkip = param->bEnableRecursionSkip;
585
+        encParam->searchMethod = param->searchMethod;
586
+        /* Scratch buffer prevents me_range from being increased for esa/tesa */
587
+        if (param->searchRange < encParam->searchRange)
588
+            encParam->searchRange = param->searchRange;
589
+        /* We can't switch out of subme=0 during encoding. */
590
+        if (encParam->subpelRefine)
591
+            encParam->subpelRefine = param->subpelRefine;
592
+        encParam->rdoqLevel = param->rdoqLevel;
593
+        encParam->rdLevel = param->rdLevel;
594
+        encParam->bEnableRectInter = param->bEnableRectInter;
595
+        encParam->maxNumMergeCand = param->maxNumMergeCand;
596
+        encParam->bIntraInBFrames = param->bIntraInBFrames;
597
+        if (param->scalingLists && !encParam->scalingLists)
598
+            encParam->scalingLists = strdup(param->scalingLists);
599
+    }
600
+    encParam->forceFlush = param->forceFlush;
601
     /* To add: Loop Filter/deblocking controls, transform skip, signhide require PPS to be resent */
602
     /* To add: SAO, temporal MVP, AMP, TU depths require SPS to be resent, at every CVS boundary */
603
     return x265_check_params(encParam);
604
 }
605
 
606
+bool Encoder::isReconfigureRc(x265_param* latestParam, x265_param* param_in)
607
+{
608
+    return (latestParam->rc.vbvMaxBitrate != param_in->rc.vbvMaxBitrate
609
+        || latestParam->rc.vbvBufferSize != param_in->rc.vbvBufferSize
610
+        || latestParam->rc.bitrate != param_in->rc.bitrate
611
+        || latestParam->rc.rfConstant != param_in->rc.rfConstant);
612
+}
613
+
614
 void Encoder::copyCtuInfo(x265_ctu_info_t** frameCtuInfo, int poc)
615
 {
616
     uint32_t widthInCU = (m_param->sourceWidth + m_param->maxCUSize - 1) >> m_param->maxLog2CUSize;
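
Review note on `reconfigureParam` in the hunk above: after this change a call applies either the rate-control fields (when `isReconfigureRc` reports a difference) or the analysis fields, and the VBV values are copied only when VBV is enabled both before and after. The sketch below models only the rate-control half. `RcSettings`, `rcDiffers` and `applyRc` are hypothetical names, the struct is a trimmed stand-in for the `rc` fields of `x265_param`, and the sticky `m_reconfigureRc` member is simplified to a local flag.

```cpp
#include <cassert>

// Hypothetical, trimmed-down stand-in for the rc fields of x265_param.
struct RcSettings
{
    int vbvMaxBitrate;
    int vbvBufferSize;
    int bitrate;
    double rfConstant;
};

// True when any rate-control field differs, mirroring isReconfigureRc.
bool rcDiffers(const RcSettings& current, const RcSettings& requested)
{
    return current.vbvMaxBitrate != requested.vbvMaxBitrate
        || current.vbvBufferSize != requested.vbvBufferSize
        || current.bitrate != requested.bitrate
        || current.rfConstant != requested.rfConstant;
}

// Applies the requested values and reports whether rate control must be
// reconfigured. VBV values are copied only if VBV is on for both sides, so a
// call can neither switch VBV on nor off. hrdInUse models bEmitHRDSEI.
bool applyRc(RcSettings& current, const RcSettings& requested, bool hrdInUse)
{
    bool changed = false;
    bool vbvOnBoth = requested.vbvMaxBitrate > 0 && requested.vbvBufferSize > 0
                  && current.vbvMaxBitrate > 0 && current.vbvBufferSize > 0;
    if (vbvOnBoth)
    {
        changed |= current.vbvMaxBitrate != requested.vbvMaxBitrate;
        changed |= current.vbvBufferSize != requested.vbvBufferSize;
        if (!(changed && hrdInUse)) // the diff logs a warning instead of copying
        {
            current.vbvMaxBitrate = requested.vbvMaxBitrate;
            current.vbvBufferSize = requested.vbvBufferSize;
        }
    }
    changed |= current.bitrate != requested.bitrate;
    current.bitrate = requested.bitrate;
    changed |= current.rfConstant != requested.rfConstant;
    current.rfConstant = requested.rfConstant;
    return changed;
}
```

One consequence worth checking against the real code: a request that only tries to enable VBV still counts as a rate-control difference, so the analysis fields are skipped for that call even though no rate-control value ends up changing.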
617
@@ -2096,6 +2447,10 @@
618
 void Encoder::configure(x265_param *p)
619
 {
620
     this->m_param = p;
621
+    if (p->bMVType == AVC_INFO)
622
+        this->m_externalFlush = true;
623
+    else 
624
+        this->m_externalFlush = false;
625
     if (p->keyframeMax < 0)
626
     {
627
         /* A negative max GOP size indicates the user wants only one I frame at
628
@@ -2311,6 +2666,11 @@
629
             x265_log(p, X265_LOG_WARNING, "MV refinement requires analysis load, analysis-reuse-level 10, scale factor. Disabling MV refine.\n");
630
             p->mvRefine = 0;
631
         }
632
+        else if (p->interRefine >= 2)
633
+        {
634
+            x265_log(p, X265_LOG_WARNING, "MVs are recomputed when refine-inter >= 2. MV refinement not applicable. Disabling MV refine\n");
635
+            p->mvRefine = 0;
636
+        }
637
     }
638
 
639
     if ((p->analysisMultiPassRefine || p->analysisMultiPassDistortion) && (p->bDistributeModeAnalysis || p->bDistributeMotionEstimation))
640
@@ -2662,6 +3022,13 @@
641
 {
642
     X265_CHECK(analysis->sliceType, "invalid slice type\n");
643
     analysis->interData = analysis->intraData = NULL;
644
+    if (m_param->bDisableLookahead)
645
+    {
646
+        CHECKED_MALLOC_ZERO(analysis->lookahead.intraSatdForVbv, uint32_t, analysis->numCuInHeight);
647
+        CHECKED_MALLOC_ZERO(analysis->lookahead.satdForVbv, uint32_t, analysis->numCuInHeight);
648
+        CHECKED_MALLOC_ZERO(analysis->lookahead.intraVbvCost, uint32_t, analysis->numCUsInFrame);
649
+        CHECKED_MALLOC_ZERO(analysis->lookahead.vbvCost, uint32_t, analysis->numCUsInFrame);
650
+    }
651
     if (analysis->sliceType == X265_TYPE_IDR || analysis->sliceType == X265_TYPE_I)
652
     {
653
         if (m_param->analysisReuseLevel < 2)
654
@@ -2679,7 +3046,8 @@
655
     {
656
         int numDir = analysis->sliceType == X265_TYPE_P ? 1 : 2;
657
         uint32_t numPlanes = m_param->internalCsp == X265_CSP_I400 ? 1 : 3;
658
-        CHECKED_MALLOC_ZERO(analysis->wt, WeightParam, numPlanes * numDir);
659
+        if (!(m_param->bMVType == AVC_INFO))
660
+            CHECKED_MALLOC_ZERO(analysis->wt, WeightParam, numPlanes * numDir);
661
         if (m_param->analysisReuseLevel < 2)
662
             return;
663
 
664
@@ -2693,7 +3061,7 @@
665
             CHECKED_MALLOC(interData->mergeFlag, uint8_t, analysis->numPartitions * analysis->numCUsInFrame);
666
         }
667
 
668
-        if (m_param->analysisReuseLevel == 10)
669
+        if (m_param->analysisReuseLevel >= 7)
670
         {
671
             CHECKED_MALLOC(interData->interDir, uint8_t, analysis->numPartitions * analysis->numCUsInFrame);
672
             for (int dir = 0; dir < numDir; dir++)
673
@@ -2701,6 +3069,7 @@
674
                 CHECKED_MALLOC(interData->mvpIdx[dir], uint8_t, analysis->numPartitions * analysis->numCUsInFrame);
675
                 CHECKED_MALLOC(interData->refIdx[dir], int8_t, analysis->numPartitions * analysis->numCUsInFrame);
676
                 CHECKED_MALLOC(interData->mv[dir], MV, analysis->numPartitions * analysis->numCUsInFrame);
677
+                CHECKED_MALLOC(analysis->modeFlag[dir], uint8_t, analysis->numPartitions * analysis->numCUsInFrame);
678
             }
679
 
680
             /* Allocate intra in inter */
681
@@ -2727,8 +3096,15 @@
682
 
683
 void Encoder::freeAnalysis(x265_analysis_data* analysis)
684
 {
685
+    if (m_param->bDisableLookahead)
686
+    {
687
+        X265_FREE(analysis->lookahead.satdForVbv);
688
+        X265_FREE(analysis->lookahead.intraSatdForVbv);
689
+        X265_FREE(analysis->lookahead.vbvCost);
690
+        X265_FREE(analysis->lookahead.intraVbvCost);
691
+    }
692
     /* Early exit freeing weights alone if level is 1 (when there is no analysis inter/intra) */
693
-    if (analysis->sliceType > X265_TYPE_I && analysis->wt)
694
+    if (analysis->sliceType > X265_TYPE_I && analysis->wt && !(m_param->bMVType == AVC_INFO))
695
         X265_FREE(analysis->wt);
696
     if (m_param->analysisReuseLevel < 2)
697
         return;
698
@@ -2763,15 +3139,21 @@
699
                 X265_FREE(((analysis_inter_data*)analysis->interData)->mergeFlag);
700
                 X265_FREE(((analysis_inter_data*)analysis->interData)->partSize);
701
             }
702
-            if (m_param->analysisReuseLevel == 10)
703
+            if (m_param->analysisReuseLevel >= 7)
704
             {
705
                 X265_FREE(((analysis_inter_data*)analysis->interData)->interDir);
706
+                X265_FREE(((analysis_inter_data*)analysis->interData)->sadCost);
707
                 int numDir = analysis->sliceType == X265_TYPE_P ? 1 : 2;
708
                 for (int dir = 0; dir < numDir; dir++)
709
                 {
710
                     X265_FREE(((analysis_inter_data*)analysis->interData)->mvpIdx[dir]);
711
                     X265_FREE(((analysis_inter_data*)analysis->interData)->refIdx[dir]);
712
                     X265_FREE(((analysis_inter_data*)analysis->interData)->mv[dir]);
713
+                    if (analysis->modeFlag[dir] != NULL)
714
+                    {
715
+                        X265_FREE(analysis->modeFlag[dir]);
716
+                        analysis->modeFlag[dir] = NULL;
717
+                    }
718
                 }
719
             }
720
             else
721
@@ -2907,6 +3289,11 @@
722
     X265_FREAD(&analysis->satdCost, sizeof(int64_t), 1, m_analysisFile, &(picData->satdCost));
723
     X265_FREAD(&analysis->numCUsInFrame, sizeof(int), 1, m_analysisFile, &(picData->numCUsInFrame));
724
     X265_FREAD(&analysis->numPartitions, sizeof(int), 1, m_analysisFile, &(picData->numPartitions));
725
+    if (m_param->bDisableLookahead)
726
+    {
727
+        X265_FREAD(&analysis->numCuInHeight, sizeof(uint32_t), 1, m_analysisFile, &(picData->numCuInHeight));
728
+        X265_FREAD(&analysis->lookahead, sizeof(x265_lookahead_data), 1, m_analysisFile, &(picData->lookahead));
729
+    }
730
     int scaledNumPartition = analysis->numPartitions;
731
     int factor = 1 << m_param->scaleFactor;
732
 
733
@@ -2915,7 +3302,13 @@
734
 
735
     /* Memory is allocated for inter and intra analysis data based on the slicetype */
736
     allocAnalysis(analysis);
737
-
738
+    if (m_param->bDisableLookahead)
739
+    {
740
+        X265_FREAD(analysis->lookahead.intraVbvCost, sizeof(uint32_t), analysis->numCUsInFrame, m_analysisFile, picData->lookahead.intraVbvCost);
741
+        X265_FREAD(analysis->lookahead.vbvCost, sizeof(uint32_t), analysis->numCUsInFrame, m_analysisFile, picData->lookahead.vbvCost);
742
+        X265_FREAD(analysis->lookahead.satdForVbv, sizeof(uint32_t), analysis->numCuInHeight, m_analysisFile, picData->lookahead.satdForVbv);
743
+        X265_FREAD(analysis->lookahead.intraSatdForVbv, sizeof(uint32_t), analysis->numCuInHeight, m_analysisFile, picData->lookahead.intraSatdForVbv);
744
+    }
745
     if (analysis->sliceType == X265_TYPE_IDR || analysis->sliceType == X265_TYPE_I)
746
     {
747
         if (m_param->analysisReuseLevel < 2)
748
x265_2.5.tar.gz/source/encoder/encoder.h -> x265_2.6.tar.gz/source/encoder/encoder.h Changed
39
 
1
@@ -138,6 +138,7 @@
2
     RateControl*       m_rateControl;
3
     Lookahead*         m_lookahead;
4
 
5
+    bool               m_externalFlush;
6
     /* Collect statistics globally */
7
     EncStats           m_analyzeAll;
8
     EncStats           m_analyzeI;
9
@@ -178,8 +179,8 @@
10
 
11
 #ifdef ENABLE_HDR10_PLUS
12
     const hdr10plus_api     *m_hdr10plus_api;
13
-    uint8_t                 **cim;
14
-    int                     numCimInfo;
15
+    uint8_t                 **m_cim;
16
+    int                     m_numCimInfo;
17
 #endif
18
 
19
     x265_sei_payload        m_prevTonemapPayload;
20
@@ -201,8 +202,18 @@
21
 
22
     int reconfigureParam(x265_param* encParam, x265_param* param);
23
 
24
+    bool isReconfigureRc(x265_param* latestParam, x265_param* param_in);
25
+
26
     void copyCtuInfo(x265_ctu_info_t** frameCtuInfo, int poc);
27
 
28
+    int copySlicetypePocAndSceneCut(int *slicetype, int *poc, int *sceneCut);
29
+
30
+    int getRefFrameList(PicYuv** l0, PicYuv** l1, int sliceType, int poc);
31
+
32
+    int setAnalysisDataAfterZScan(x265_analysis_data *analysis_data, Frame* curFrame);
33
+
34
+    int setAnalysisData(x265_analysis_data *analysis_data, int poc, uint32_t cuBytes);
35
+
36
     void getStreamHeaders(NALList& list, Entropy& sbacCoder, Bitstream& bs);
37
 
38
     void fetchStats(x265_stats* stats, size_t statsSizeBytes);
39
x265_2.5.tar.gz/source/encoder/frameencoder.cpp -> x265_2.6.tar.gz/source/encoder/frameencoder.cpp Changed
579
 
1
@@ -88,6 +88,7 @@
2
     delete[] m_outStreams;
3
     delete[] m_backupStreams;
4
     X265_FREE(m_sliceBaseRow);
5
+    X265_FREE(m_sliceMaxBlockRow);
6
     X265_FREE(m_cuGeoms);
7
     X265_FREE(m_ctuGeomMap);
8
     X265_FREE(m_substreamSizes);
9
@@ -118,6 +119,40 @@
10
 
11
     m_sliceBaseRow = X265_MALLOC(uint32_t, m_param->maxSlices + 1);
12
     ok &= !!m_sliceBaseRow;
13
+    m_sliceGroupSize = (uint16_t)(m_numRows + m_param->maxSlices - 1) / m_param->maxSlices;
14
+    uint32_t sliceGroupSizeAccu = (m_numRows << 8) / m_param->maxSlices;    
15
+    uint32_t rowSum = sliceGroupSizeAccu;
16
+    uint32_t sidx = 0;
17
+    for (uint32_t i = 0; i < m_numRows; i++)
18
+    {
19
+        const uint32_t rowRange = (rowSum >> 8);
20
+        if ((i >= rowRange) & (sidx != m_param->maxSlices - 1))
21
+        {
22
+            rowSum += sliceGroupSizeAccu;
23
+            m_sliceBaseRow[++sidx] = i;
24
+        }        
25
+    }
26
+    X265_CHECK(sidx < m_param->maxSlices, "sliceID check failed!");
27
+    m_sliceBaseRow[0] = 0;
28
+    m_sliceBaseRow[m_param->maxSlices] = m_numRows;
29
+
30
+    m_sliceMaxBlockRow = X265_MALLOC(uint32_t, m_param->maxSlices + 1);
31
+    ok &= !!m_sliceMaxBlockRow;
32
+    uint32_t maxBlockRows = (m_param->sourceHeight + (16 - 1)) / 16;
33
+    sliceGroupSizeAccu = (maxBlockRows << 8) / m_param->maxSlices;
34
+    rowSum = sliceGroupSizeAccu;
35
+    sidx = 0;
36
+    for (uint32_t i = 0; i < maxBlockRows; i++)
37
+    {
38
+        const uint32_t rowRange = (rowSum >> 8);
39
+        if ((i >= rowRange) & (sidx != m_param->maxSlices - 1))
40
+        {
41
+            rowSum += sliceGroupSizeAccu;
42
+            m_sliceMaxBlockRow[++sidx] = i;
43
+        }
44
+    }
45
+    m_sliceMaxBlockRow[0] = 0;
46
+    m_sliceMaxBlockRow[m_param->maxSlices] = maxBlockRows;
47
 
48
     /* determine full motion search range */
49
     int range  = m_param->searchRange;       /* fpel search */
50
@@ -300,8 +335,15 @@
51
             while (!m_frame->m_ctuInfo)
52
                 m_frame->m_copied.wait();
53
         }
54
+        if ((m_param->bMVType == AVC_INFO) && !m_param->analysisReuseMode && !(IS_X265_TYPE_I(m_frame->m_lowres.sliceType)))
55
+        {
56
+            while (((m_frame->m_analysisData.interData == NULL && m_frame->m_analysisData.intraData == NULL) || (uint32_t)m_frame->m_poc != m_frame->m_analysisData.poc))
57
+                m_frame->m_copyMVType.wait();
58
+        }
59
         compressFrame();
60
         m_done.trigger(); /* FrameEncoder::getEncodedPicture() blocks for this event */
61
+        if (m_frame != NULL)
62
+            m_frame->m_reconEncoded.trigger();
63
         m_enable.wait();
64
     }
65
 }
66
@@ -341,6 +383,8 @@
67
     m_completionCount = 0;
68
     m_bAllRowsStop = false;
69
     m_vbvResetTriggerRow = -1;
70
+    m_rowSliceTotalBits[0] = 0;
71
+    m_rowSliceTotalBits[1] = 0;
72
 
73
     m_SSDY = m_SSDU = m_SSDV = 0;
74
     m_ssim = 0;
75
@@ -550,28 +594,13 @@
76
 
77
     /* reset entropy coders and compute slice id */
78
     m_entropyCoder.load(m_initSliceContext);
79
-    const uint32_t sliceGroupSize = (m_numRows + m_param->maxSlices - 1) / m_param->maxSlices;
80
-    const uint32_t sliceGroupSizeAccu = (m_numRows << 8) / m_param->maxSlices;
81
-    m_sliceGroupSize = (uint16_t)sliceGroupSize;
82
+   
83
+    for (uint32_t sliceId = 0; sliceId < m_param->maxSlices; sliceId++)   
84
+        for (uint32_t row = m_sliceBaseRow[sliceId]; row < m_sliceBaseRow[sliceId + 1]; row++)
85
+            m_rows[row].init(m_initSliceContext, sliceId);   
86
 
87
-    uint32_t rowSum = sliceGroupSizeAccu;
88
-    uint32_t sidx = 0;
89
-    for (uint32_t i = 0; i < m_numRows; i++)
90
-    {
91
-        const uint32_t rowRange = (rowSum >> 8);
92
-
93
-        if ((i >= rowRange) & (sidx != m_param->maxSlices - 1))
94
-        {
95
-            rowSum += sliceGroupSizeAccu;
96
-            m_sliceBaseRow[++sidx] = i;
97
-        }
98
-
99
-        m_rows[i].init(m_initSliceContext, sidx);
100
-    }
101
-    X265_CHECK(sidx < m_param->maxSlices, "sliceID check failed!");
102
-
103
-    m_sliceBaseRow[0] = 0;
104
-    m_sliceBaseRow[m_param->maxSlices] = m_numRows;
105
+    // reset slice counter for rate control update
106
+    m_sliceCnt = 0;
107
 
108
     uint32_t numSubstreams = m_param->bEnableWavefront ? slice->m_sps->numCuInHeight : m_param->maxSlices;
109
     X265_CHECK(m_param->bEnableWavefront || (m_param->maxSlices == 1), "Multiple slices without WPP unsupport now!");
110
@@ -586,8 +615,10 @@
111
                 m_rows[i].rowGoOnCoder.setBitstream(&m_outStreams[i]);
112
     }
113
     else
114
+    {
115
         for (uint32_t i = 0; i < numSubstreams; i++)
116
             m_outStreams[i].resetBits();
117
+    }
118
 
119
     int prevBPSEI = m_rce.encodeOrder ? m_top->m_lastBPSEI : 0;
120
 
121
@@ -697,9 +728,26 @@
122
      * compressed in a wave-front pattern if WPP is enabled. Row based loop
123
      * filters runs behind the CTU compression and reconstruction */
124
 
125
-    for (uint32_t sliceId = 0; sliceId < m_param->maxSlices; sliceId++)
126
-    {
127
+    for (uint32_t sliceId = 0; sliceId < m_param->maxSlices; sliceId++)    
128
         m_rows[m_sliceBaseRow[sliceId]].active = true;
129
+    
130
+    if (m_param->bEnableWavefront)
131
+    {
132
+        int i = 0;
133
+        for (uint32_t rowInSlice = 0; rowInSlice < m_sliceGroupSize; rowInSlice++)
134
+        {
135
+            for (uint32_t sliceId = 0; sliceId < m_param->maxSlices; sliceId++)
136
+            {
137
+                const uint32_t sliceStartRow = m_sliceBaseRow[sliceId];
138
+                const uint32_t sliceEndRow = m_sliceBaseRow[sliceId + 1] - 1;
139
+                const uint32_t row = sliceStartRow + rowInSlice;
140
+                if (row > sliceEndRow)
141
+                    continue;
142
+                m_row_to_idx[row] = i;
143
+                m_idx_to_row[i] = row;
144
+                i += 1;
145
+            }
146
+        }
147
     }
148
 
149
     if (m_param->bEnableWavefront)
150
@@ -735,11 +783,11 @@
151
                     }
152
                 }
153
 
154
-                enableRowEncoder(row); /* clear external dependency for this row */
155
+                enableRowEncoder(m_row_to_idx[row]); /* clear external dependency for this row */
156
                 if (!rowInSlice)
157
                 {
158
                     m_row0WaitTime = x265_mdate();
159
-                    enqueueRowEncoder(row); /* clear internal dependency, start wavefront */
160
+                    enqueueRowEncoder(m_row_to_idx[row]); /* clear internal dependency, start wavefront */
161
                 }
162
                 tryWakeOne();
163
             } // end of loop rowInSlice
164
@@ -964,9 +1012,8 @@
165
             // complete the slice header by writing WPP row-starts
166
             m_entropyCoder.setBitstream(&m_bs);
167
             if (slice->m_pps->bEntropyCodingSyncEnabled)
168
-            {
169
                 m_entropyCoder.codeSliceHeaderWPPEntryPoints(&m_substreamSizes[prevSliceRow], (nextSliceRow - prevSliceRow - 1), maxStreamSize);
170
-            }
171
+            
172
             m_bs.writeByteAlignment();
173
 
174
             m_nalList.serialize(slice->m_nalUnitType, m_bs);
175
@@ -1196,8 +1243,8 @@
176
     if (ATOMIC_INC(&m_activeWorkerCount) == 1 && m_stallStartTime)
177
         m_totalNoWorkerTime += x265_mdate() - m_stallStartTime;
178
 
179
-    const uint32_t realRow = row >> 1;
180
-    const uint32_t typeNum = row & 1;
181
+    const uint32_t realRow = m_idx_to_row[row >> 1];
182
+    const uint32_t typeNum = m_idx_to_row[row & 1];
183
 
184
     if (!typeNum)
185
         processRowEncoder(realRow, m_tld[threadId]);
186
@@ -1207,7 +1254,7 @@
187
 
188
         // NOTE: Active next row
189
         if (realRow != m_sliceBaseRow[m_rows[realRow].sliceId + 1] - 1)
190
-            enqueueRowFilter(realRow + 1);
191
+            enqueueRowFilter(m_row_to_idx[realRow + 1]);
192
     }
193
 
194
     if (ATOMIC_DEC(&m_activeWorkerCount) == 0)
195
@@ -1252,23 +1299,17 @@
196
     const uint32_t lineStartCUAddr = row * numCols;
197
     bool bIsVbv = m_param->rc.vbvBufferSize > 0 && m_param->rc.vbvMaxBitrate > 0;
198
 
199
+    const uint32_t sliceId = curRow.sliceId;
200
     uint32_t maxBlockCols = (m_frame->m_fencPic->m_picWidth + (16 - 1)) / 16;
201
-    uint32_t maxBlockRows = (m_frame->m_fencPic->m_picHeight + (16 - 1)) / 16;
202
     uint32_t noOfBlocks = m_param->maxCUSize / 16;
203
     const uint32_t bFirstRowInSlice = ((row == 0) || (m_rows[row - 1].sliceId != curRow.sliceId)) ? 1 : 0;
204
     const uint32_t bLastRowInSlice = ((row == m_numRows - 1) || (m_rows[row + 1].sliceId != curRow.sliceId)) ? 1 : 0;
205
-    const uint32_t sliceId = curRow.sliceId;
206
     const uint32_t endRowInSlicePlus1 = m_sliceBaseRow[sliceId + 1];
207
     const uint32_t rowInSlice = row - m_sliceBaseRow[sliceId];
208
 
209
-    if (bFirstRowInSlice && !curRow.completed)
210
-    {
211
-        // Load SBAC coder context from previous row and initialize row state.
212
-        //rowCoder.copyState(m_initSliceContext);
213
-        //rowCoder.loadContexts(m_rows[row - 1].bufferedEntropy);
214
-        rowCoder.load(m_initSliceContext);
215
-        //m_rows[row - 1].bufferedEntropy.loadContexts(m_initSliceContext);
216
-    }
217
+    // Load SBAC coder context from previous row and initialize row state.
218
+    if (bFirstRowInSlice && !curRow.completed)        
219
+        rowCoder.load(m_initSliceContext);     
220
 
221
     // calculate mean QP for consistent deltaQP signalling calculation
222
     if (m_param->bOptCUDeltaQP)
223
@@ -1279,15 +1320,12 @@
224
             if (m_param->bEnableWavefront || !row)
225
             {
226
                 double meanQPOff = 0;
227
-                uint32_t loopIncr, count = 0;
228
                 bool isReferenced = IS_REFERENCED(m_frame);
229
                 double *qpoffs = (isReferenced && m_param->rc.cuTree) ? m_frame->m_lowres.qpCuTreeOffset : m_frame->m_lowres.qpAqOffset;
230
                 if (qpoffs)
231
                 {
232
-                    if (m_param->rc.qgSize == 8)
233
-                        loopIncr = 8;
234
-                    else
235
-                        loopIncr = 16;
236
+                    uint32_t loopIncr = (m_param->rc.qgSize == 8) ? 8 : 16;
237
+
238
                     uint32_t cuYStart = 0, height = m_frame->m_fencPic->m_picHeight;
239
                     if (m_param->bEnableWavefront)
240
                     {
241
@@ -1297,6 +1335,7 @@
242
 
243
                     uint32_t qgSize = m_param->rc.qgSize, width = m_frame->m_fencPic->m_picWidth;
244
                     uint32_t maxOffsetCols = (m_frame->m_fencPic->m_picWidth + (loopIncr - 1)) / loopIncr;
245
+                    uint32_t count = 0;
246
                     for (uint32_t cuY = cuYStart; cuY < height && (cuY < m_frame->m_fencPic->m_picHeight); cuY += qgSize)
247
                     {
248
                         for (uint32_t cuX = 0; cuX < width; cuX += qgSize)
249
@@ -1328,9 +1367,7 @@
250
             }
251
             curRow.avgQPComputed = 1;
252
         }
253
-    }
254
-
255
-    // TODO: specially case handle on first and last row
256
+    }    
257
 
258
     // Initialize restrict on MV range in slices
259
     tld.analysis.m_sliceMinY = -(int16_t)(rowInSlice * m_param->maxCUSize * 4) + 3 * 4;
260
@@ -1359,16 +1396,16 @@
261
                 curRow.bufferedEntropy.copyState(rowCoder);
262
                 curRow.bufferedEntropy.loadContexts(rowCoder);
263
             }
264
-            if (!row && m_vbvResetTriggerRow != intRow)
265
+            if (bFirstRowInSlice && m_vbvResetTriggerRow != intRow)            
266
             {
267
                 curEncData.m_rowStat[row].rowQp = curEncData.m_avgQpRc;
268
                 curEncData.m_rowStat[row].rowQpScale = x265_qp2qScale(curEncData.m_avgQpRc);
269
             }
270
 
271
             FrameData::RCStatCU& cuStat = curEncData.m_cuStat[cuAddr];
272
-            if (m_param->bEnableWavefront && row >= col && row && m_vbvResetTriggerRow != intRow)
273
+            if (m_param->bEnableWavefront && rowInSlice >= col && !bFirstRowInSlice && m_vbvResetTriggerRow != intRow)
274
                 cuStat.baseQp = curEncData.m_cuStat[cuAddr - numCols + 1].baseQp;
275
-            else if (!m_param->bEnableWavefront && row && m_vbvResetTriggerRow != intRow)
276
+            else if (!m_param->bEnableWavefront && !bFirstRowInSlice && m_vbvResetTriggerRow != intRow)
277
                 cuStat.baseQp = curEncData.m_rowStat[row - 1].rowQp;
278
             else
279
                 cuStat.baseQp = curEncData.m_rowStat[row].rowQp;
280
@@ -1376,17 +1413,20 @@
281
             /* TODO: use defines from slicetype.h for lowres block size */
282
             uint32_t block_y = (ctu->m_cuPelY >> m_param->maxLog2CUSize) * noOfBlocks;
283
             uint32_t block_x = (ctu->m_cuPelX >> m_param->maxLog2CUSize) * noOfBlocks;
284
-            
285
-            cuStat.vbvCost = 0;
286
-            cuStat.intraVbvCost = 0;
287
-            for (uint32_t h = 0; h < noOfBlocks && block_y < maxBlockRows; h++, block_y++)
288
+            if (m_param->analysisReuseMode != X265_ANALYSIS_LOAD || !m_param->bDisableLookahead)
289
             {
290
-                uint32_t idx = block_x + (block_y * maxBlockCols);
291
+                cuStat.vbvCost = 0;
292
+                cuStat.intraVbvCost = 0;
293
 
294
-                for (uint32_t w = 0; w < noOfBlocks && (block_x + w) < maxBlockCols; w++, idx++)
295
+                for (uint32_t h = 0; h < noOfBlocks && block_y < m_sliceMaxBlockRow[sliceId + 1]; h++, block_y++)
296
                 {
297
-                    cuStat.vbvCost += m_frame->m_lowres.lowresCostForRc[idx] & LOWRES_COST_MASK;
298
-                    cuStat.intraVbvCost += m_frame->m_lowres.intraCost[idx];
299
+                    uint32_t idx = block_x + (block_y * maxBlockCols);
300
+
301
+                    for (uint32_t w = 0; w < noOfBlocks && (block_x + w) < maxBlockCols; w++, idx++)
302
+                    {
303
+                        cuStat.vbvCost += m_frame->m_lowres.lowresCostForRc[idx] & LOWRES_COST_MASK;
304
+                        cuStat.intraVbvCost += m_frame->m_lowres.intraCost[idx];
305
+                    }
306
                 }
307
             }
308
         }
309
@@ -1426,15 +1466,10 @@
310
         {
311
             // NOTE: in VBV mode, we may reencode anytime, so we can't do Deblock stage-Horizon and SAO
312
             if (!bIsVbv)
313
-            {
314
-                // TODO: Multiple Threading
315
-                // Delay ONE row to avoid Intra Prediction Conflict
316
+            {                
317
+                // Delay one row to avoid intra prediction conflict
318
                 if (m_pool && !bFirstRowInSlice)
319
-                {
320
-                    // Waitting last threading finish
321
-                    m_frameFilter.m_parallelFilter[row - 1].waitForExit();
322
-
323
-                    // Processing new group
324
+                {                    
325
                     int allowCol = col;
326
 
327
                     // avoid race condition on last column
328
@@ -1444,15 +1479,11 @@
329
                                                                   : m_frameFilter.m_parallelFilter[row - 2].m_lastCol.get()), (int)col);
330
                     }
331
                     m_frameFilter.m_parallelFilter[row - 1].m_allowedCol.set(allowCol);
332
-                    m_frameFilter.m_parallelFilter[row - 1].tryBondPeers(*this, 1);
333
                 }
334
 
335
                 // Last Row may start early
336
                 if (m_pool && bLastRowInSlice)
337
                 {
338
-                    // Waiting for the last thread to finish
339
-                    m_frameFilter.m_parallelFilter[row].waitForExit();
340
-
341
                     // Deblocking last row
342
                     int allowCol = col;
343
 
344
@@ -1463,7 +1494,6 @@
345
                                                                   : m_frameFilter.m_parallelFilter[row - 1].m_lastCol.get()), (int)col);
346
                     }
347
                     m_frameFilter.m_parallelFilter[row].m_allowedCol.set(allowCol);
348
-                    m_frameFilter.m_parallelFilter[row].tryBondPeers(*this, 1);
349
                 }
350
             } // end of !bIsVbv
351
         }
352
@@ -1479,7 +1509,7 @@
353
         FrameStats frameLog;
354
         curEncData.m_rowStat[row].sumQpAq += collectCTUStatistics(*ctu, &frameLog);
355
 
356
-        // copy no. of intra, inter Cu cnt per row into frame stats for 2 pass
357
+        // copy number of intra, inter cu per row into frame stats for 2 pass
358
         if (m_param->rc.bStatWrite)
359
         {
360
             curRow.rowStats.mvBits    += best.mvBits;
361
@@ -1492,10 +1522,8 @@
362
                 int shift = 2 * (m_param->maxCUDepth - depth);
363
                 int cuSize = m_param->maxCUSize >> depth;
364
 
365
-                if (cuSize == 8)
366
-                    curRow.rowStats.intra8x8Cnt += (int)(frameLog.cntIntra[depth] + frameLog.cntIntraNxN);
367
-                else
368
-                    curRow.rowStats.intra8x8Cnt += (int)(frameLog.cntIntra[depth] << shift);
369
+                curRow.rowStats.intra8x8Cnt += (cuSize == 8) ? (int)(frameLog.cntIntra[depth] + frameLog.cntIntraNxN) :
370
+                                                               (int)(frameLog.cntIntra[depth] << shift);
371
 
372
                 curRow.rowStats.inter8x8Cnt += (int)(frameLog.cntInter[depth] << shift);
373
                 curRow.rowStats.skip8x8Cnt += (int)((frameLog.cntSkipCu[depth] + frameLog.cntMergeCu[depth]) << shift);
374
@@ -1525,21 +1553,21 @@
375
         if (bIsVbv)
376
         {   
377
             // Update encoded bits, satdCost, baseQP for each CU if tune grain is disabled
378
-            if ((m_param->bEnableWavefront && (!cuAddr || !m_param->rc.bEnableConstVbv)) || !m_param->bEnableWavefront)
379
+            FrameData::RCStatCU& cuStat = curEncData.m_cuStat[cuAddr];    
380
+            if ((m_param->bEnableWavefront && ((cuAddr == m_sliceBaseRow[sliceId] * numCols) || !m_param->rc.bEnableConstVbv)) || !m_param->bEnableWavefront)
381
             {
382
-                curEncData.m_rowStat[row].rowSatd += curEncData.m_cuStat[cuAddr].vbvCost;
383
-                curEncData.m_rowStat[row].rowIntraSatd += curEncData.m_cuStat[cuAddr].intraVbvCost;
384
-                curEncData.m_rowStat[row].encodedBits += curEncData.m_cuStat[cuAddr].totalBits;
385
-                curEncData.m_rowStat[row].sumQpRc += curEncData.m_cuStat[cuAddr].baseQp;
386
+                curEncData.m_rowStat[row].rowSatd += cuStat.vbvCost;
387
+                curEncData.m_rowStat[row].rowIntraSatd += cuStat.intraVbvCost;
388
+                curEncData.m_rowStat[row].encodedBits += cuStat.totalBits;
389
+                curEncData.m_rowStat[row].sumQpRc += cuStat.baseQp;
390
                 curEncData.m_rowStat[row].numEncodedCUs = cuAddr;
391
             }
392
             
393
             // If current block is at row end checkpoint, call vbv ratecontrol.
394
-
395
             if (!m_param->bEnableWavefront && col == numCols - 1)
396
             {
397
                 double qpBase = curEncData.m_cuStat[cuAddr].baseQp;
398
-                int reEncode = m_top->m_rateControl->rowVbvRateControl(m_frame, row, &m_rce, qpBase);
399
+                int reEncode = m_top->m_rateControl->rowVbvRateControl(m_frame, row, &m_rce, qpBase, m_sliceBaseRow, sliceId);
400
                 qpBase = x265_clip3((double)m_param->rc.qpMin, (double)m_param->rc.qpMax, qpBase);
401
                 curEncData.m_rowStat[row].rowQp = qpBase;
402
                 curEncData.m_rowStat[row].rowQpScale = x265_qp2qScale(qpBase);
403
@@ -1564,18 +1592,17 @@
404
                     curEncData.m_rowStat[row].sumQpAq = 0;
405
                 }
406
             }
407
-
408
             // If current block is at row diagonal checkpoint, call vbv ratecontrol.
409
-
410
-            else if (m_param->bEnableWavefront && row == col && row)
411
+            else if (m_param->bEnableWavefront && rowInSlice == col && !bFirstRowInSlice)
412
             {
413
                 if (m_param->rc.bEnableConstVbv)
414
                 {
415
-                    int32_t startCuAddr = numCols * row;
416
-                    int32_t EndCuAddr = startCuAddr + col;
417
-                    for (int32_t r = row; r >= 0; r--)
418
+                    uint32_t startCuAddr = numCols * row;
419
+                    uint32_t EndCuAddr = startCuAddr + col;
420
+
421
+                    for (int32_t r = row; r >= (int32_t)m_sliceBaseRow[sliceId]; r--)
422
                     {
423
-                        for (int32_t c = startCuAddr; c <= EndCuAddr && c <= (int32_t)numCols * (r + 1) - 1; c++)
424
+                        for (uint32_t c = startCuAddr; c <= EndCuAddr && c <= numCols * (r + 1) - 1; c++)
425
                         {
426
                             curEncData.m_rowStat[r].rowSatd += curEncData.m_cuStat[c].vbvCost;
427
                             curEncData.m_rowStat[r].rowIntraSatd += curEncData.m_cuStat[c].intraVbvCost;
428
@@ -1588,10 +1615,10 @@
429
                     }
430
                 }
431
                 double qpBase = curEncData.m_cuStat[cuAddr].baseQp;
432
-                int reEncode = m_top->m_rateControl->rowVbvRateControl(m_frame, row, &m_rce, qpBase);
433
+                int reEncode = m_top->m_rateControl->rowVbvRateControl(m_frame, row, &m_rce, qpBase, m_sliceBaseRow, sliceId);
434
                 qpBase = x265_clip3((double)m_param->rc.qpMin, (double)m_param->rc.qpMax, qpBase);
435
                 curEncData.m_rowStat[row].rowQp = qpBase;
436
-                curEncData.m_rowStat[row].rowQpScale =  x265_qp2qScale(qpBase);
437
+                curEncData.m_rowStat[row].rowQpScale = x265_qp2qScale(qpBase);
438
 
439
                 if (reEncode < 0)
440
                 {
441
@@ -1602,7 +1629,7 @@
442
                     m_vbvResetTriggerRow = row;
443
                     m_bAllRowsStop = true;
444
 
445
-                    for (uint32_t r = m_numRows - 1; r >= row; r--)
446
+                    for (uint32_t r = m_sliceBaseRow[sliceId + 1] - 1; r >= row; r--)
447
                     {
448
                         CTURow& stopRow = m_rows[r];
449
 
450
@@ -1665,7 +1692,7 @@
451
                 m_rows[row + 1].completed + 2 <= curRow.completed)
452
             {
453
                 m_rows[row + 1].active = true;
454
-                enqueueRowEncoder(row + 1);
455
+                enqueueRowEncoder(m_row_to_idx[row + 1]);
456
                 tryWakeOne(); /* wake up a sleeping thread or set the help wanted flag */
457
             }
458
         }
459
@@ -1681,14 +1708,14 @@
460
         }
461
     }
462
 
463
-    /** this row of CTUs has been compressed **/
464
+    /* this row of CTUs has been compressed */
465
     if (m_param->bEnableWavefront && m_param->rc.bEnableConstVbv)
466
     {
467
-        if (row == m_numRows - 1)
468
+        if (bLastRowInSlice)       
469
         {
470
-            for (int32_t r = 0; r < (int32_t)m_numRows; r++)
471
+            for (uint32_t r = m_sliceBaseRow[sliceId]; r < m_sliceBaseRow[sliceId + 1]; r++)
472
             {
473
-                for (int32_t c = curEncData.m_rowStat[r].numEncodedCUs + 1; c < (int32_t)numCols * (r + 1); c++)
474
+                for (uint32_t c = curEncData.m_rowStat[r].numEncodedCUs + 1; c < numCols * (r + 1); c++)
475
                 {
476
                     curEncData.m_rowStat[r].rowSatd += curEncData.m_cuStat[c].vbvCost;
477
                     curEncData.m_rowStat[r].rowIntraSatd += curEncData.m_cuStat[c].intraVbvCost;
478
@@ -1706,26 +1733,42 @@
479
      * after half the frame is encoded, but after this initial period we update
480
      * after refLagRows (the number of rows reference frames must have completed
481
      * before referencees may begin encoding) */
482
-    uint32_t rowCount = 0;
483
     if (m_param->rc.rateControlMode == X265_RC_ABR || bIsVbv)
484
     {
485
+        uint32_t rowCount = 0;
486
+        uint32_t maxRows = m_sliceBaseRow[sliceId + 1] - m_sliceBaseRow[sliceId];
487
+
488
         if (!m_rce.encodeOrder)
489
-            rowCount = m_numRows - 1;
490
+            rowCount = maxRows - 1; 
491
         else if ((uint32_t)m_rce.encodeOrder <= 2 * (m_param->fpsNum / m_param->fpsDenom))
492
-            rowCount = X265_MIN((m_numRows + 1) / 2, m_numRows - 1);
493
+            rowCount = X265_MIN((maxRows + 1) / 2, maxRows - 1);
494
         else
495
-            rowCount = X265_MIN(m_refLagRows, m_numRows - 1);
496
-        if (row == rowCount)
497
+           rowCount = X265_MIN(m_refLagRows / m_param->maxSlices, maxRows - 1);
498
+
499
+        if (rowInSlice == rowCount)
500
         {
501
-            m_rce.rowTotalBits = 0;
502
+            m_rowSliceTotalBits[sliceId] = 0;
503
             if (bIsVbv)
504
-                for (uint32_t i = 0; i < rowCount; i++)
505
-                    m_rce.rowTotalBits += curEncData.m_rowStat[i].encodedBits;
506
+            {                
507
+                for (uint32_t i = m_sliceBaseRow[sliceId]; i < rowCount + m_sliceBaseRow[sliceId]; i++)
508
+                    m_rowSliceTotalBits[sliceId] += curEncData.m_rowStat[i].encodedBits;
509
+            }
510
             else
511
-                for (uint32_t cuAddr = 0; cuAddr < rowCount * numCols; cuAddr++)
512
-                    m_rce.rowTotalBits += curEncData.m_cuStat[cuAddr].totalBits;
513
-
514
-            m_top->m_rateControl->rateControlUpdateStats(&m_rce);
515
+            {
516
+                uint32_t startAddr = m_sliceBaseRow[sliceId] * numCols;
517
+               uint32_t finishAddr = startAddr + rowCount * numCols;
518
+                
519
+               for (uint32_t cuAddr = startAddr; cuAddr < finishAddr; cuAddr++)
520
+                    m_rowSliceTotalBits[sliceId] += curEncData.m_cuStat[cuAddr].totalBits;
521
+            }            
522
+
523
+            if (ATOMIC_INC(&m_sliceCnt) == (int)m_param->maxSlices)
524
+            {
525
+                m_rce.rowTotalBits = 0;
526
+                for (uint32_t i = 0; i < m_param->maxSlices; i++)
527
+                    m_rce.rowTotalBits += m_rowSliceTotalBits[i];
528
+                m_top->m_rateControl->rateControlUpdateStats(&m_rce);
529
+            }
530
         }
531
     }
532
 
533
@@ -1738,13 +1781,10 @@
534
     /* Processing left Deblock block with current threading */
535
     if ((m_param->bEnableLoopFilter | m_param->bEnableSAO) & (rowInSlice >= 2))
536
     {
537
-        /* TODO: Multiple Threading */
538
-
539
         /* Check conditional to start previous row process with current threading */
540
         if (m_frameFilter.m_parallelFilter[row - 2].m_lastDeblocked.get() == (int)numCols)
541
         {
542
             /* stop threading on current row and restart it */
543
-            m_frameFilter.m_parallelFilter[row - 1].waitForExit();
544
             m_frameFilter.m_parallelFilter[row - 1].m_allowedCol.set(numCols);
545
             m_frameFilter.m_parallelFilter[row - 1].processTasks(-1);
546
         }
547
@@ -1755,11 +1795,11 @@
548
     {
549
         if (rowInSlice >= m_filterRowDelay)
550
         {
551
-            enableRowFilter(row - m_filterRowDelay);
552
+            enableRowFilter(m_row_to_idx[row - m_filterRowDelay]);
553
 
554
             /* NOTE: Activate filter if first row (row 0) */
555
             if (rowInSlice == m_filterRowDelay)
556
-                enqueueRowFilter(row - m_filterRowDelay);
557
+                enqueueRowFilter(m_row_to_idx[row - m_filterRowDelay]);
558
             tryWakeOne();
559
         }
560
 
561
@@ -1767,7 +1807,7 @@
562
         {
563
             for (uint32_t i = endRowInSlicePlus1 - m_filterRowDelay; i < endRowInSlicePlus1; i++)
564
             {
565
-                enableRowFilter(i);
566
+                enableRowFilter(m_row_to_idx[i]);
567
             }
568
             tryWakeOne();
569
         }
570
@@ -1775,7 +1815,7 @@
571
         // handle specially case - single row slice
572
         if  (bFirstRowInSlice & bLastRowInSlice)
573
         {
574
-            enqueueRowFilter(row);
575
+            enqueueRowFilter(m_row_to_idx[row]);
576
             tryWakeOne();
577
         }
578
     }
579
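Note on the frameencoder.cpp changes above: the slice boundaries are now computed once at initialisation with an 8.8 fixed-point accumulator, and under WPP each row gets a scheduling index so that the first rows of every slice are queued before any second rows. The following is a minimal standalone sketch of those two loops, not the encoder's code. `sliceBaseRows`, `interleaveRows` and `RowOrder` are hypothetical names, `std::vector` stands in for the `m_sliceBaseRow`, `m_row_to_idx` and `m_idx_to_row` members, and the outer loop bound is the longest slice rather than `m_sliceGroupSize`.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: split numRows CTU rows into maxSlices groups,
// stepping through the rows with an 8.8 fixed-point slice height.
std::vector<uint32_t> sliceBaseRows(uint32_t numRows, uint32_t maxSlices)
{
    std::vector<uint32_t> base(maxSlices + 1, 0);
    const uint32_t step = (numRows << 8) / maxSlices; // rows per slice, 8 fractional bits
    uint32_t rowSum = step;
    uint32_t sidx = 0;
    for (uint32_t i = 0; i < numRows; i++)
    {
        const uint32_t rowRange = rowSum >> 8; // integer row where the next slice starts
        if (i >= rowRange && sidx != maxSlices - 1)
        {
            rowSum += step;
            base[++sidx] = i;
        }
    }
    base[0] = 0;
    base[maxSlices] = numRows; // sentinel: one past the last row
    return base;
}

// Hypothetical container for the two lookup tables.
struct RowOrder
{
    std::vector<uint32_t> rowToIdx;
    std::vector<uint32_t> idxToRow;
};

// Visit row 0 of every slice, then row 1 of every slice, and so on,
// handing out consecutive indices in that order.
RowOrder interleaveRows(const std::vector<uint32_t>& base)
{
    const uint32_t maxSlices = (uint32_t)base.size() - 1;
    const uint32_t numRows = base[maxSlices];

    // Assumption: bound the outer loop by the longest slice.
    uint32_t longest = 0;
    for (uint32_t s = 0; s < maxSlices; s++)
        longest = std::max(longest, base[s + 1] - base[s]);

    RowOrder order{ std::vector<uint32_t>(numRows), std::vector<uint32_t>(numRows) };
    uint32_t idx = 0;
    for (uint32_t rowInSlice = 0; rowInSlice < longest; rowInSlice++)
    {
        for (uint32_t sliceId = 0; sliceId < maxSlices; sliceId++)
        {
            const uint32_t row = base[sliceId] + rowInSlice;
            if (row >= base[sliceId + 1])
                continue; // this slice has no such row
            order.rowToIdx[row] = idx;
            order.idxToRow[idx] = row;
            idx++;
        }
    }
    return order;
}
```

For 17 rows and 4 slices the sketch yields base rows 0, 4, 8, 12 with sentinel 17, so the last slice absorbs the remainder row. The first four scheduling indices then map to rows 0, 4, 8 and 12.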
x265_2.5.tar.gz/source/encoder/frameencoder.h -> x265_2.6.tar.gz/source/encoder/frameencoder.h Changed
22
 
1
@@ -138,6 +138,7 @@
2
     volatile bool            m_bAllRowsStop;
3
     volatile int             m_completionCount;
4
     volatile int             m_vbvResetTriggerRow;
5
+    volatile int             m_sliceCnt;
6
 
7
     uint32_t                 m_numRows;
8
     uint32_t                 m_numCols;
9
@@ -147,8 +148,10 @@
10
 
11
     CTURow*                  m_rows;
12
     uint16_t                 m_sliceAddrBits;
13
-    uint16_t                 m_sliceGroupSize;
14
-    uint32_t*                m_sliceBaseRow;
15
+    uint32_t                 m_sliceGroupSize;
16
+    uint32_t*                m_sliceBaseRow;    
17
+    uint32_t*                m_sliceMaxBlockRow;
18
+    int64_t                  m_rowSliceTotalBits[2];
19
     RateControlEntry         m_rce;
20
     SEIDecodedPictureHash    m_seiReconPictureDigest;
21
 
22
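Note on the new `m_sliceCnt` and `m_rowSliceTotalBits` members: in the frameencoder.cpp hunk, each slice stores its own bit total when it reaches its checkpoint row, and only the slice that brings the counter up to `maxSlices` sums the totals and updates rate control. The sketch below shows that "last arrival aggregates" pattern with `std::atomic`. It assumes `ATOMIC_INC` yields the incremented value, which the comparison against `maxSlices` suggests. `SliceBitCollector` and `report` are hypothetical names. The sketch sizes its storage by slice count, whereas the header above declares a fixed two-element array.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the FrameEncoder members added in this header.
struct SliceBitCollector
{
    std::vector<int64_t> sliceBits;      // plays the role of m_rowSliceTotalBits
    std::atomic<int>     sliceCnt{0};    // plays the role of m_sliceCnt
    int64_t              frameBits = -1; // stays -1 until every slice has reported

    explicit SliceBitCollector(size_t maxSlices) : sliceBits(maxSlices, 0) {}

    // Called once per slice at its checkpoint row.
    // Returns true only for the call that completes the set.
    bool report(size_t sliceId, int64_t bits)
    {
        sliceBits[sliceId] = bits;

        // fetch_add returns the previous value; +1 gives the post-increment count.
        if (sliceCnt.fetch_add(1) + 1 != (int)sliceBits.size())
            return false;

        // Last slice to arrive: every per-slice total is final, so sum them.
        int64_t total = 0;
        for (int64_t b : sliceBits)
            total += b;
        frameBits = total;
        return true;
    }
};
```

The counter has to be reset before each frame, which the diff does by setting `m_sliceCnt = 0` in the frame setup path.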
x265_2.5.tar.gz/source/encoder/framefilter.cpp -> x265_2.6.tar.gz/source/encoder/framefilter.cpp Changed
21
 
1
@@ -582,10 +582,7 @@
2
     CUData* ctu = encData.getPicCTU(m_parallelFilter[row].m_rowAddr);
3
 
4
     /* Processing left block Deblock with current threading */
5
-    {
6
-        /* stop threading on current row */
7
-        m_parallelFilter[row].waitForExit();
8
-
9
+    {        
10
         /* Check to avoid previous row process slower than current row */
11
         X265_CHECK(ctu->m_bFirstRowInSlice || m_parallelFilter[row - 1].m_lastDeblocked.get() == m_numCols, "previous row not finish");
12
 
13
@@ -618,7 +615,6 @@
14
     }
15
 
16
     // this row of CTUs has been encoded
17
-
18
     if (!ctu->m_bFirstRowInSlice)
19
         processPostRow(row - 1);
20
 
21
x265_2.5.tar.gz/source/encoder/framefilter.h -> x265_2.6.tar.gz/source/encoder/framefilter.h Changed
21
 
1
@@ -62,7 +62,7 @@
2
     void*         m_ssimBuf;        /* Temp storage for ssim computation */
3
 
4
 #define MAX_PFILTER_CUS     (4) /* maximum CUs for every thread */
5
-    class ParallelFilter : public BondedTaskGroup, public Deblock
6
+    class ParallelFilter : public Deblock
7
     {
8
     public:
9
         uint32_t            m_rowHeight;
10
@@ -104,10 +104,6 @@
11
         {
12
             return m_rowHeight;
13
         }
14
-
15
-    protected:
16
-
17
-        ParallelFilter operator=(const ParallelFilter&);
18
     };
19
 
20
     ParallelFilter*     m_parallelFilter;
21
x265_2.5.tar.gz/source/encoder/ratecontrol.cpp -> x265_2.6.tar.gz/source/encoder/ratecontrol.cpp Changed
253
 
1
@@ -218,6 +218,7 @@
2
     m_param->rc.vbvBufferSize = x265_clip3(0, 2000000, m_param->rc.vbvBufferSize);
3
     m_param->rc.vbvMaxBitrate = x265_clip3(0, 2000000, m_param->rc.vbvMaxBitrate);
4
     m_param->rc.vbvBufferInit = x265_clip3(0.0, 2000000.0, m_param->rc.vbvBufferInit);
5
+    m_param->vbvBufferEnd = x265_clip3(0.0, 2000000.0, m_param->vbvBufferEnd);
6
     m_singleFrameVbv = 0;
7
     m_rateTolerance = 1.0;
8
 
9
@@ -255,6 +256,11 @@
10
         m_param->rc.vbvMaxBitrate = 0;
11
     }
12
     m_isVbv = m_param->rc.vbvMaxBitrate > 0 && m_param->rc.vbvBufferSize > 0;
13
+    if (m_param->vbvBufferEnd && !m_isVbv)
14
+    {
15
+        x265_log(m_param, X265_LOG_WARNING, "vbv-end requires VBV parameters, ignored\n");
16
+        m_param->vbvBufferEnd = 0;
17
+    }
18
     if (m_param->bEmitHRDSEI && !m_isVbv)
19
     {
20
         x265_log(m_param, X265_LOG_WARNING, "NAL HRD parameters require VBV parameters, ignored\n");
21
@@ -339,6 +345,10 @@
22
 
23
         if (m_param->rc.vbvBufferInit > 1.)
24
             m_param->rc.vbvBufferInit = x265_clip3(0.0, 1.0, m_param->rc.vbvBufferInit / m_param->rc.vbvBufferSize);
25
+        if (m_param->vbvBufferEnd > 1.)
26
+            m_param->vbvBufferEnd = x265_clip3(0.0, 1.0, m_param->vbvBufferEnd / m_param->rc.vbvBufferSize);
27
+        if (m_param->vbvEndFrameAdjust > 1.)
28
+            m_param->vbvEndFrameAdjust = x265_clip3(0.0, 1.0, m_param->vbvEndFrameAdjust);
29
         m_param->rc.vbvBufferInit = x265_clip3(0.0, 1.0, X265_MAX(m_param->rc.vbvBufferInit, m_bufferRate / m_bufferSize));
30
         m_bufferFillFinal = m_bufferSize * m_param->rc.vbvBufferInit;
31
         m_bufferFillActual = m_bufferFillFinal;
32
@@ -732,7 +742,6 @@
33
     m_bitrate = m_param->rc.bitrate * 1000;
34
 }
35
 
36
-
37
 void RateControl::initHRD(SPS& sps)
38
 {
39
     int vbvBufferSize = m_param->rc.vbvBufferSize * 1000;
40
@@ -765,6 +774,7 @@
41
 
42
     #undef MAX_DURATION
43
 }
44
+
45
 bool RateControl::analyseABR2Pass(uint64_t allAvailableBits)
46
 {
47
     double rateFactor, stepMult;
48
@@ -1473,6 +1483,7 @@
49
 
50
     return q;
51
 }
52
+
53
 double RateControl::countExpectedBits(int startPos, int endPos)
54
 {
55
     double expectedBits = 0;
56
@@ -1484,6 +1495,7 @@
57
     }
58
     return expectedBits;
59
 }
60
+
61
 bool RateControl::findUnderflow(double *fills, int *t0, int *t1, int over, int endPos)
62
 {
63
     /* find an interval ending on an overflow or underflow (depending on whether
64
@@ -1531,6 +1543,7 @@
65
     }
66
     return adjusted;
67
 }
68
+
69
 bool RateControl::cuTreeReadFor2Pass(Frame* frame)
70
 {
71
     int index = m_encOrder[frame->m_poc];
72
@@ -1579,24 +1592,24 @@
73
 double RateControl::tuneAbrQScaleFromFeedback(double qScale)
74
 {
75
     double abrBuffer = 2 * m_rateTolerance * m_bitrate;
76
-        /* use framesDone instead of POC as poc count is not serial with bframes enabled */
77
-        double overflow = 1.0;
78
-        double timeDone = (double)(m_framesDone - m_param->frameNumThreads + 1) * m_frameDuration;
79
-        double wantedBits = timeDone * m_bitrate;
80
-        int64_t encodedBits = m_totalBits;
81
-        if (m_param->totalFrames && m_param->totalFrames <= 2 * m_fps)
82
-        {
83
-            abrBuffer = m_param->totalFrames * (m_bitrate / m_fps);
84
-            encodedBits = m_encodedBits;
85
-        }
86
-
87
-        if (wantedBits > 0 && encodedBits > 0 && (!m_partialResidualFrames || 
88
-            m_param->rc.bStrictCbr || m_isGrainEnabled))
89
-        {
90
-            abrBuffer *= X265_MAX(1, sqrt(timeDone));
91
-            overflow = x265_clip3(.5, 2.0, 1.0 + (encodedBits - wantedBits) / abrBuffer);
92
-            qScale *= overflow;
93
-        }
94
+    /* use framesDone instead of POC as poc count is not serial with bframes enabled */
95
+    double overflow = 1.0;
96
+    double timeDone = (double)(m_framesDone - m_param->frameNumThreads + 1) * m_frameDuration;
97
+    double wantedBits = timeDone * m_bitrate;
98
+    int64_t encodedBits = m_totalBits;
99
+    if (m_param->totalFrames && m_param->totalFrames <= 2 * m_fps)
100
+    {
101
+        abrBuffer = m_param->totalFrames * (m_bitrate / m_fps);
102
+        encodedBits = m_encodedBits;
103
+    }
104
+
105
+    if (wantedBits > 0 && encodedBits > 0 && (!m_partialResidualFrames || 
106
+        m_param->rc.bStrictCbr || m_isGrainEnabled))
107
+    {
108
+        abrBuffer *= X265_MAX(1, sqrt(timeDone));
109
+        overflow = x265_clip3(.5, 2.0, 1.0 + (encodedBits - wantedBits) / abrBuffer);
110
+        qScale *= overflow;
111
+    }
112
     return qScale;
113
 }
114
 
115
@@ -2157,29 +2170,51 @@
116
                     curBits = predictSize(&m_pred[predType], frameQ[type], (double)satd);
117
                     bufferFillCur -= curBits;
118
                 }
119
-
120
-                /* Try to get the buffer at least 50% filled, but don't set an impossible goal. */
121
-                double finalDur = 1;
122
-                if (m_param->rc.bStrictCbr)
123
-                {
124
-                    finalDur = x265_clip3(0.4, 1.0, totalDuration);
125
-                }
126
-                targetFill = X265_MIN(m_bufferFill + totalDuration * m_vbvMaxRate * 0.5 , m_bufferSize * (1 - 0.5 * finalDur));
127
-                if (bufferFillCur < targetFill)
128
-                {
129
-                    q *= 1.01;
130
-                    loopTerminate |= 1;
131
-                    continue;
132
-                }
133
-                /* Try to get the buffer not more than 80% filled, but don't set an impossible goal. */
134
-                targetFill = x265_clip3(m_bufferSize * (1 - 0.2 * finalDur), m_bufferSize, m_bufferFill - totalDuration * m_vbvMaxRate * 0.5);
135
-                if (m_isCbr && bufferFillCur > targetFill && !m_isSceneTransition)
136
-                {
137
-                    q /= 1.01;
138
-                    loopTerminate |= 2;
139
-                    continue;
140
+                if (m_param->vbvBufferEnd && rce->encodeOrder >= m_param->vbvEndFrameAdjust * m_param->totalFrames)
141
+                {
142
+                    bool loopBreak = false;
143
+                    double bufferDiff = m_param->vbvBufferEnd - (m_bufferFill / m_bufferSize);
144
+                    targetFill = m_bufferFill + m_bufferSize * (bufferDiff / (m_param->totalFrames - rce->encodeOrder));
145
+                    if (bufferFillCur < targetFill)
146
+                    {
147
+                        q *= 1.01;
148
+                        loopTerminate |= 1;
149
+                        loopBreak = true;
150
+                    }
151
+                    if (bufferFillCur > m_param->vbvBufferEnd * m_bufferSize)
152
+                    {
153
+                        q /= 1.01;
154
+                        loopTerminate |= 2;
155
+                        loopBreak = true;
156
+                    }
157
+                    if (!loopBreak)
158
+                        break;
159
+                }
160
+                else
161
+                {
162
+                    /* Try to get the buffer at least 50% filled, but don't set an impossible goal. */
163
+                    double finalDur = 1;
164
+                    if (m_param->rc.bStrictCbr)
165
+                    {
166
+                        finalDur = x265_clip3(0.4, 1.0, totalDuration);
167
+                    }
168
+                    targetFill = X265_MIN(m_bufferFill + totalDuration * m_vbvMaxRate * 0.5, m_bufferSize * (1 - 0.5 * finalDur));
169
+                    if (bufferFillCur < targetFill)
170
+                    {
171
+                        q *= 1.01;
172
+                        loopTerminate |= 1;
173
+                        continue;
174
+                    }
175
+                    /* Try to get the buffer not more than 80% filled, but don't set an impossible goal. */
176
+                    targetFill = x265_clip3(m_bufferSize * (1 - 0.2 * finalDur), m_bufferSize, m_bufferFill - totalDuration * m_vbvMaxRate * 0.5);
177
+                    if (m_isCbr && bufferFillCur > targetFill && !m_isSceneTransition)
178
+                    {
179
+                        q /= 1.01;
180
+                        loopTerminate |= 2;
181
+                        continue;
182
+                    }
183
+                    break;
184
                 }
185
-                break;
186
             }
187
             q = X265_MAX(q0 / 2, q);
188
         }
189
@@ -2330,17 +2365,18 @@
190
     return totalSatdBits + encodedBitsSoFar;
191
 }
192
 
193
-int RateControl::rowVbvRateControl(Frame* curFrame, uint32_t row, RateControlEntry* rce, double& qpVbv)
194
+int RateControl::rowVbvRateControl(Frame* curFrame, uint32_t row, RateControlEntry* rce, double& qpVbv, uint32_t* m_sliceBaseRow, uint32_t sliceId)
195
 {
196
     FrameData& curEncData = *curFrame->m_encData;
197
     double qScaleVbv = x265_qp2qScale(qpVbv);
198
     uint64_t rowSatdCost = curEncData.m_rowStat[row].rowSatd;
199
     double encodedBits = curEncData.m_rowStat[row].encodedBits;
200
+    uint32_t rowInSlice = row - m_sliceBaseRow[sliceId];
201
 
202
-    if (m_param->bEnableWavefront && row == 1)
203
+    if (m_param->bEnableWavefront && rowInSlice == 1)
204
     {
205
-        rowSatdCost += curEncData.m_rowStat[0].rowSatd;
206
-        encodedBits += curEncData.m_rowStat[0].encodedBits;
207
+        rowSatdCost += curEncData.m_rowStat[row - 1].rowSatd;
208
+        encodedBits += curEncData.m_rowStat[row - 1].encodedBits;
209
     }
210
     rowSatdCost >>= X265_DEPTH - 8;
211
     updatePredictor(rce->rowPred[0], qScaleVbv, (double)rowSatdCost, encodedBits);
212
@@ -2350,8 +2386,8 @@
213
         if (qpVbv < refFrame->m_encData->m_rowStat[row].rowQp)
214
         {
215
             uint64_t intraRowSatdCost = curEncData.m_rowStat[row].rowIntraSatd;
216
-            if (m_param->bEnableWavefront && row == 1)
217
-                intraRowSatdCost += curEncData.m_rowStat[0].rowIntraSatd;
218
+            if (m_param->bEnableWavefront && rowInSlice == 1)
219
+                intraRowSatdCost += curEncData.m_rowStat[row - 1].rowIntraSatd;
220
             intraRowSatdCost >>= X265_DEPTH - 8;
221
             updatePredictor(rce->rowPred[1], qScaleVbv, (double)intraRowSatdCost, encodedBits);
222
         }
223
@@ -2376,7 +2412,7 @@
224
     const SPS& sps = *curEncData.m_slice->m_sps;
225
     double maxFrameError = X265_MAX(0.05, 1.0 / sps.numCuInHeight);
226
 
227
-    if (row < sps.numCuInHeight - 1)
228
+    if (row < m_sliceBaseRow[sliceId + 1] - 1)
229
     {
230
         /* More threads means we have to be more cautious in letting ratecontrol use up extra bits. */
231
         double rcTol = bufferLeftPlanned / m_param->frameNumThreads * m_rateTolerance;
232
@@ -2693,8 +2729,8 @@
233
             m_encodedBitsWindow[pos % s_slidingWindowFrames] = actualBits;
234
         if(rce->sliceType != I_SLICE)
235
         {
236
-        int qp = int (rce->qpaRc + 0.5);
237
-        m_qpToEncodedBits[qp] =  m_qpToEncodedBits[qp] == 0 ? actualBits : (m_qpToEncodedBits[qp] + actualBits) * 0.5;
238
+            int qp = int (rce->qpaRc + 0.5);
239
+            m_qpToEncodedBits[qp] =  m_qpToEncodedBits[qp] == 0 ? actualBits : (m_qpToEncodedBits[qp] + actualBits) * 0.5;
240
         }
241
         curFrame->m_rcData->wantedBitsWindow = m_wantedBitsWindow;
242
         curFrame->m_rcData->cplxrSum = m_cplxrSum;
243
@@ -2779,7 +2815,8 @@
244
             curFrame->m_encData->m_frameStats.percent8x8Skip  * m_ncu) < 0)
245
             goto writeFailure;
246
     }
247
-    else{
248
+    else
249
+    {
250
         RPS* rpsWriter = &curFrame->m_encData->m_slice->m_rps;
251
         int i, num = rpsWriter->numberOfPictures;
252
         char deltaPOC[128];
253
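Reviewer note on the --vbv-end hunk above: the new branch steers the buffer toward the requested end fullness by spreading the remaining gap evenly over the frames still to be encoded. The following is a minimal standalone sketch of that arithmetic, not the encoder's code. The struct and function names are hypothetical, and the fields mirror `vbvBufferEnd`, `vbvEndFrameAdjust` and `totalFrames` from the diff. It assumes `encodeOrder` is always smaller than `totalFrames`, since the divisor would otherwise be zero.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical container for the three parameters the hunk reads from m_param.
struct VbvEndParams
{
    double vbvBufferEnd;       // target fullness at segment end, as a fraction of the buffer
    double vbvEndFrameAdjust;  // fraction of the segment after which steering starts
    int    totalFrames;        // number of frames in the segment
};

// True once the frame is late enough in the segment for end-of-segment steering.
bool vbvEndActive(const VbvEndParams& p, int encodeOrder)
{
    return p.vbvBufferEnd != 0.0 && encodeOrder >= p.vbvEndFrameAdjust * p.totalFrames;
}

// Per-frame fill target: current fill plus an equal share of the remaining gap.
// The caller must ensure encodeOrder < totalFrames (assumption of this sketch).
double vbvEndTargetFill(const VbvEndParams& p, double bufferFill, double bufferSize, int encodeOrder)
{
    double bufferDiff = p.vbvBufferEnd - bufferFill / bufferSize;
    return bufferFill + bufferSize * (bufferDiff / (p.totalFrames - encodeOrder));
}
```

With a 15000 kbit buffer that is half full, an end target of 0.9 and ten frames left out of 100, the per-frame target rises by one tenth of the 0.4 gap, which is 600 kbit.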
x265_2.5.tar.gz/source/encoder/ratecontrol.h -> x265_2.6.tar.gz/source/encoder/ratecontrol.h Changed
10
 
1
@@ -244,7 +244,7 @@
2
     int  rateControlStart(Frame* curFrame, RateControlEntry* rce, Encoder* enc);
3
     void rateControlUpdateStats(RateControlEntry* rce);
4
     int  rateControlEnd(Frame* curFrame, int64_t bits, RateControlEntry* rce, int *filler);
5
-    int  rowVbvRateControl(Frame* curFrame, uint32_t row, RateControlEntry* rce, double& qpVbv);
6
+    int  rowVbvRateControl(Frame* curFrame, uint32_t row, RateControlEntry* rce, double& qpVbv, uint32_t* m_sliceBaseRow, uint32_t sliceId);
7
     int  rateControlSliceType(int frameNum);
8
     bool cuTreeReadFor2Pass(Frame* curFrame);
9
     void hrdFullness(SEIBufferingPeriod* sei);
10
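Reviewer note on the `rowVbvRateControl` signature change above: the extra arguments let the function reason about rows relative to the slice they belong to instead of relative to the frame. The sketch below illustrates the index arithmetic only. It assumes that the base-row table has one entry per slice plus a final entry equal to the total row count, which is consistent with the `m_sliceBaseRow[sliceId + 1] - 1` expression in the diff but is not confirmed by it. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Row index counted from the first CTU row of the slice that contains it.
uint32_t rowInSlice(const std::vector<uint32_t>& sliceBaseRow, uint32_t sliceId, uint32_t row)
{
    return row - sliceBaseRow[sliceId];
}

// True when 'row' is the final CTU row of slice 'sliceId'.
// Assumes sliceBaseRow has numSlices + 1 entries (see lead-in).
bool isLastRowOfSlice(const std::vector<uint32_t>& sliceBaseRow, uint32_t sliceId, uint32_t row)
{
    return row == sliceBaseRow[sliceId + 1] - 1;
}
```

With two slices of four rows each, frame row 5 is row 1 of the second slice, so the wavefront special case that used to fire only for frame row 1 now fires once per slice.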
x265_2.5.tar.gz/source/encoder/search.cpp -> x265_2.6.tar.gz/source/encoder/search.cpp Changed
10
 
1
@@ -2162,7 +2162,7 @@
2
 
3
         /* Uni-directional prediction */
4
         if ((m_param->analysisReuseMode == X265_ANALYSIS_LOAD && m_param->analysisReuseLevel > 1 && m_param->analysisReuseLevel != 10)
5
-            || (m_param->analysisMultiPassRefine && m_param->rc.bStatRead))
6
+            || (m_param->analysisMultiPassRefine && m_param->rc.bStatRead) || (m_param->bMVType == AVC_INFO))
7
         {
8
             for (int list = 0; list < numPredDir; list++)
9
             {
10
x265_2.5.tar.gz/source/encoder/slicetype.cpp -> x265_2.6.tar.gz/source/encoder/slicetype.cpp Changed
255
 
1
@@ -588,6 +588,7 @@
2
     m_filled   = false;
3
     m_outputSignalRequired = false;
4
     m_isActive = true;
5
+    m_inputCount = 0;
6
 
7
     m_8x8Height = ((m_param->sourceHeight / 2) + X265_LOWRES_CU_SIZE - 1) >> X265_LOWRES_CU_BITS;
8
     m_8x8Width = ((m_param->sourceWidth / 2) + X265_LOWRES_CU_SIZE - 1) >> X265_LOWRES_CU_BITS;
9
@@ -741,23 +742,21 @@
10
 /* Called by API thread */
11
 void Lookahead::addPicture(Frame& curFrame, int sliceType)
12
 {
13
-    curFrame.m_lowres.sliceType = sliceType;
14
-
15
-    /* determine if the lookahead is (over) filled enough for frames to begin to
16
-     * be consumed by frame encoders */
17
-    if (!m_filled)
18
+    if (m_param->analysisReuseMode == X265_ANALYSIS_LOAD && m_param->bDisableLookahead)
19
     {
20
-        if (!m_param->bframes & !m_param->lookaheadDepth)
21
-            m_filled = true; /* zero-latency */
22
-        else if (curFrame.m_poc >= m_param->lookaheadDepth + 2 + m_param->bframes)
23
-            m_filled = true; /* full capacity plus mini-gop lag */
24
+        if (!m_filled)
25
+            m_filled = true;
26
+        m_outputLock.acquire();
27
+        m_outputQueue.pushBack(curFrame);
28
+        m_outputLock.release();
29
+        m_inputCount++;
30
+    }
31
+    else
32
+    {
33
+        checkLookaheadQueue(m_inputCount);
34
+        curFrame.m_lowres.sliceType = sliceType;
35
+        addPicture(curFrame);
36
     }
37
-
38
-    m_inputLock.acquire();
39
-    m_inputQueue.pushBack(curFrame);
40
-    if (m_pool && m_inputQueue.size() >= m_fullQueueSize)
41
-        tryWakeOne();
42
-    m_inputLock.release();
43
 }
44
 
45
 void Lookahead::addPicture(Frame& curFrame)
46
@@ -765,6 +764,7 @@
47
     m_inputLock.acquire();
48
     m_inputQueue.pushBack(curFrame);
49
     m_inputLock.release();
50
+    m_inputCount++;
51
 }
52
 
53
 void Lookahead::checkLookaheadQueue(int &frameCnt)
54
@@ -793,6 +793,12 @@
55
     m_filled = true;
56
 }
57
 
58
+void Lookahead::setLookaheadQueue()
59
+{
60
+    m_filled = false;
61
+    m_fullQueueSize = X265_MAX(1, m_param->lookaheadDepth);
62
+}
63
+
64
 void Lookahead::findJob(int /*workerThreadID*/)
65
 {
66
     bool doDecide;
67
@@ -832,7 +838,13 @@
68
         m_outputLock.release();
69
 
70
         if (out)
71
+        {
72
+            m_inputCount--;
73
             return out;
74
+        }
75
+
76
+        if (m_param->analysisReuseMode == X265_ANALYSIS_LOAD && m_param->bDisableLookahead)
77
+            return NULL;
78
 
79
         findJob(-1); /* run slicetypeDecide() if necessary */
80
 
81
@@ -843,7 +855,10 @@
82
         if (wait)
83
             m_outputSignal.wait();
84
 
85
-        return m_outputQueue.popFront();
86
+        out = m_outputQueue.popFront();
87
+        if (out)
88
+            m_inputCount--;
89
+        return out;
90
     }
91
     else
92
         return NULL;
93
@@ -887,68 +902,68 @@
94
     default:
95
         return;
96
     }
97
-
98
-    X265_CHECK(curFrame->m_lowres.costEst[b - p0][p1 - b] > 0, "Slice cost not estimated\n")
99
-
100
-    if (m_param->rc.cuTree && !m_param->rc.bStatRead)
101
-        /* update row satds based on cutree offsets */
102
-        curFrame->m_lowres.satdCost = frameCostRecalculate(frames, p0, p1, b);
103
-    else if (m_param->analysisReuseMode != X265_ANALYSIS_LOAD || m_param->scaleFactor)
104
-    {
105
-        if (m_param->rc.aqMode)
106
-            curFrame->m_lowres.satdCost = curFrame->m_lowres.costEstAq[b - p0][p1 - b];
107
-        else
108
-            curFrame->m_lowres.satdCost = curFrame->m_lowres.costEst[b - p0][p1 - b];
109
-    }
110
-
111
-    if (m_param->rc.vbvBufferSize && m_param->rc.vbvMaxBitrate)
112
+    if (m_param->analysisReuseMode != X265_ANALYSIS_LOAD || !m_param->bDisableLookahead)
113
     {
114
-        /* aggregate lowres row satds to CTU resolution */
115
-        curFrame->m_lowres.lowresCostForRc = curFrame->m_lowres.lowresCosts[b - p0][p1 - b];
116
-        uint32_t lowresRow = 0, lowresCol = 0, lowresCuIdx = 0, sum = 0, intraSum = 0;
117
-        uint32_t scale = m_param->maxCUSize / (2 * X265_LOWRES_CU_SIZE);
118
-        uint32_t numCuInHeight = (m_param->sourceHeight + m_param->maxCUSize - 1) / m_param->maxCUSize;
119
-        uint32_t widthInLowresCu = (uint32_t)m_8x8Width, heightInLowresCu = (uint32_t)m_8x8Height;
120
-        double *qp_offset = 0;
121
-        /* Factor in qpoffsets based on Aq/Cutree in CU costs */
122
-        if (m_param->rc.aqMode || m_param->bAQMotion)
123
-            qp_offset = (frames[b]->sliceType == X265_TYPE_B || !m_param->rc.cuTree) ? frames[b]->qpAqOffset : frames[b]->qpCuTreeOffset;
124
-
125
-        for (uint32_t row = 0; row < numCuInHeight; row++)
126
+        X265_CHECK(curFrame->m_lowres.costEst[b - p0][p1 - b] > 0, "Slice cost not estimated\n")
127
+        if (m_param->rc.cuTree && !m_param->rc.bStatRead)
128
+            /* update row satds based on cutree offsets */
129
+            curFrame->m_lowres.satdCost = frameCostRecalculate(frames, p0, p1, b);
130
+        else if (m_param->analysisReuseMode != X265_ANALYSIS_LOAD || m_param->scaleFactor)
131
+        {
132
+            if (m_param->rc.aqMode)
133
+                curFrame->m_lowres.satdCost = curFrame->m_lowres.costEstAq[b - p0][p1 - b];
134
+            else
135
+                curFrame->m_lowres.satdCost = curFrame->m_lowres.costEst[b - p0][p1 - b];
136
+        }
137
+        if (m_param->rc.vbvBufferSize && m_param->rc.vbvMaxBitrate)
138
         {
139
-            lowresRow = row * scale;
140
-            for (uint32_t cnt = 0; cnt < scale && lowresRow < heightInLowresCu; lowresRow++, cnt++)
141
+            /* aggregate lowres row satds to CTU resolution */
142
+            curFrame->m_lowres.lowresCostForRc = curFrame->m_lowres.lowresCosts[b - p0][p1 - b];
143
+            uint32_t lowresRow = 0, lowresCol = 0, lowresCuIdx = 0, sum = 0, intraSum = 0;
144
+            uint32_t scale = m_param->maxCUSize / (2 * X265_LOWRES_CU_SIZE);
145
+            uint32_t numCuInHeight = (m_param->sourceHeight + m_param->maxCUSize - 1) / m_param->maxCUSize;
146
+            uint32_t widthInLowresCu = (uint32_t)m_8x8Width, heightInLowresCu = (uint32_t)m_8x8Height;
147
+            double *qp_offset = 0;
148
+            /* Factor in qpoffsets based on Aq/Cutree in CU costs */
149
+            if (m_param->rc.aqMode || m_param->bAQMotion)
150
+                qp_offset = (frames[b]->sliceType == X265_TYPE_B || !m_param->rc.cuTree) ? frames[b]->qpAqOffset : frames[b]->qpCuTreeOffset;
151
+
152
+            for (uint32_t row = 0; row < numCuInHeight; row++)
153
             {
154
-                sum = 0; intraSum = 0;
155
-                int diff = 0;
156
-                lowresCuIdx = lowresRow * widthInLowresCu;
157
-                for (lowresCol = 0; lowresCol < widthInLowresCu; lowresCol++, lowresCuIdx++)
158
+                lowresRow = row * scale;
159
+                for (uint32_t cnt = 0; cnt < scale && lowresRow < heightInLowresCu; lowresRow++, cnt++)
160
                 {
161
-                    uint16_t lowresCuCost = curFrame->m_lowres.lowresCostForRc[lowresCuIdx] & LOWRES_COST_MASK;
162
-                    if (qp_offset)
163
+                    sum = 0; intraSum = 0;
164
+                    int diff = 0;
165
+                    lowresCuIdx = lowresRow * widthInLowresCu;
166
+                    for (lowresCol = 0; lowresCol < widthInLowresCu; lowresCol++, lowresCuIdx++)
167
                     {
168
-                        double qpOffset;
169
-                        if (m_param->rc.qgSize == 8)
170
-                            qpOffset = (qp_offset[lowresCol * 2 + lowresRow * widthInLowresCu * 4] +
171
-                                        qp_offset[lowresCol * 2 + lowresRow * widthInLowresCu * 4 + 1] +
172
-                                        qp_offset[lowresCol * 2 + lowresRow * widthInLowresCu * 4 + curFrame->m_lowres.maxBlocksInRowFullRes] +
173
-                                        qp_offset[lowresCol * 2 + lowresRow * widthInLowresCu * 4 + curFrame->m_lowres.maxBlocksInRowFullRes + 1]) / 4;
174
-                        else
175
-                            qpOffset = qp_offset[lowresCuIdx];
176
-                        lowresCuCost = (uint16_t)((lowresCuCost * x265_exp2fix8(qpOffset) + 128) >> 8);
177
-                        int32_t intraCuCost = curFrame->m_lowres.intraCost[lowresCuIdx];
178
-                        curFrame->m_lowres.intraCost[lowresCuIdx] = (intraCuCost * x265_exp2fix8(qpOffset) + 128) >> 8;
179
+                        uint16_t lowresCuCost = curFrame->m_lowres.lowresCostForRc[lowresCuIdx] & LOWRES_COST_MASK;
180
+                        if (qp_offset)
181
+                        {
182
+                            double qpOffset;
183
+                            if (m_param->rc.qgSize == 8)
184
+                                qpOffset = (qp_offset[lowresCol * 2 + lowresRow * widthInLowresCu * 4] +
185
+                                qp_offset[lowresCol * 2 + lowresRow * widthInLowresCu * 4 + 1] +
186
+                                qp_offset[lowresCol * 2 + lowresRow * widthInLowresCu * 4 + curFrame->m_lowres.maxBlocksInRowFullRes] +
187
+                                qp_offset[lowresCol * 2 + lowresRow * widthInLowresCu * 4 + curFrame->m_lowres.maxBlocksInRowFullRes + 1]) / 4;
188
+                            else
189
+                                qpOffset = qp_offset[lowresCuIdx];
190
+                            lowresCuCost = (uint16_t)((lowresCuCost * x265_exp2fix8(qpOffset) + 128) >> 8);
191
+                            int32_t intraCuCost = curFrame->m_lowres.intraCost[lowresCuIdx];
192
+                            curFrame->m_lowres.intraCost[lowresCuIdx] = (intraCuCost * x265_exp2fix8(qpOffset) + 128) >> 8;
193
+                        }
194
+                        if (m_param->bIntraRefresh && slice->m_sliceType == X265_TYPE_P)
195
+                            for (uint32_t x = curFrame->m_encData->m_pir.pirStartCol; x <= curFrame->m_encData->m_pir.pirEndCol; x++)
196
+                                diff += curFrame->m_lowres.intraCost[lowresCuIdx] - lowresCuCost;
197
+                        curFrame->m_lowres.lowresCostForRc[lowresCuIdx] = lowresCuCost;
198
+                        sum += lowresCuCost;
199
+                        intraSum += curFrame->m_lowres.intraCost[lowresCuIdx];
200
                     }
201
-                    if (m_param->bIntraRefresh && slice->m_sliceType == X265_TYPE_P)
202
-                        for (uint32_t x = curFrame->m_encData->m_pir.pirStartCol; x <= curFrame->m_encData->m_pir.pirEndCol; x++)
203
-                            diff += curFrame->m_lowres.intraCost[lowresCuIdx] - lowresCuCost;
204
-                    curFrame->m_lowres.lowresCostForRc[lowresCuIdx] = lowresCuCost;
205
-                    sum += lowresCuCost;
206
-                    intraSum += curFrame->m_lowres.intraCost[lowresCuIdx];
207
+                    curFrame->m_encData->m_rowStat[row].satdForVbv += sum;
208
+                    curFrame->m_encData->m_rowStat[row].satdForVbv += diff;
209
+                    curFrame->m_encData->m_rowStat[row].intraSatdForVbv += intraSum;
210
                 }
211
-                curFrame->m_encData->m_rowStat[row].satdForVbv += sum;
212
-                curFrame->m_encData->m_rowStat[row].satdForVbv += diff;
213
-                curFrame->m_encData->m_rowStat[row].intraSatdForVbv += intraSum;
214
             }
215
         }
216
     }
217
@@ -1036,6 +1051,18 @@
218
          (m_param->lookaheadDepth && m_param->rc.vbvBufferSize)))
219
     {
220
         slicetypeAnalyse(frames, false);
221
+        bool bIsVbv = m_param->rc.vbvBufferSize > 0 && m_param->rc.vbvMaxBitrate > 0;
222
+        if (m_param->analysisReuseMode == X265_ANALYSIS_LOAD && m_param->scaleFactor && bIsVbv)
223
+        {
224
+            int numFrames;
225
+            for (numFrames = 0; numFrames < maxSearch; numFrames++)
226
+            {
227
+                Lowres *fenc = frames[numFrames + 1];
228
+                if (!fenc)
229
+                    break;
230
+            }
231
+            vbvLookahead(frames, numFrames, true);
232
+        }
233
     }
234
 
235
     int bframes, brefs;
236
@@ -1219,6 +1246,18 @@
237
 
238
         frames[j + 1] = NULL;
239
         slicetypeAnalyse(frames, true);
240
+        bool bIsVbv = m_param->rc.vbvBufferSize > 0 && m_param->rc.vbvMaxBitrate > 0;
241
+        if (m_param->analysisReuseMode == X265_ANALYSIS_LOAD && m_param->scaleFactor && bIsVbv)
242
+        {
243
+            int numFrames;
244
+            for (numFrames = 0; numFrames < maxSearch; numFrames++)
245
+            {
246
+                Lowres *fenc = frames[numFrames + 1];
247
+                if (!fenc)
248
+                    break;
249
+            }
250
+            vbvLookahead(frames, numFrames, true);
251
+        }
252
     }
253
     m_outputLock.release();
254
 }
255
x265_2.5.tar.gz/source/encoder/slicetype.h -> x265_2.6.tar.gz/source/encoder/slicetype.h Changed
18
 
1
@@ -120,6 +120,7 @@
2
     int           m_cuCount;
3
     int           m_numCoopSlices;
4
     int           m_numRowsPerSlice;
5
+    int           m_inputCount;
6
     double        m_cuTreeStrength;
7
 
8
     bool          m_isActive;
9
@@ -151,7 +152,7 @@
10
     Frame*  getDecidedPicture();
11
 
12
     void    getEstimatedPictureCost(Frame *pic);
13
-
14
+    void    setLookaheadQueue();
15
 
16
 protected:
17
 
18
x265_2.5.tar.gz/source/input/y4m.cpp -> x265_2.6.tar.gz/source/input/y4m.cpp Changed
41
 
1
@@ -307,23 +307,26 @@
2
                         break;
3
                 }
4
 
5
-                switch (csp)
6
+                if (csp / 100 == ('m'-'0')*1000 + ('o'-'0')*100 + ('n'-'0')*10 + ('o'-'0'))
7
                 {
8
-                case ('m'-'0')*100000 + ('o'-'0')*10000 + ('n'-'0')*1000 + ('o'-'0')*100 + 16:
9
                     colorSpace = X265_CSP_I400;
10
-                    depth = 16;
11
-                    break;
12
-
13
-                case ('m'-'0')*1000 + ('o'-'0')*100 + ('n'-'0')*10 + ('o'-'0'):
14
+                    d = csp % 100;
15
+                }
16
+                else if (csp / 10 == ('m'-'0')*1000 + ('o'-'0')*100 + ('n'-'0')*10 + ('o'-'0'))
17
+                {
18
                     colorSpace = X265_CSP_I400;
19
-                    depth = 8;
20
-                    break;
21
-                   
22
-                default:
23
-                    if (d >= 8 && d <= 16)
24
-                        depth = d;
25
-                    colorSpace = (csp == 444) ? X265_CSP_I444 : (csp == 422) ? X265_CSP_I422 : X265_CSP_I420;
26
+                    d = csp % 10;
27
                 }
28
+                else if (csp == ('m'-'0')*1000 + ('o'-'0')*100 + ('n'-'0')*10 + ('o'-'0'))
29
+                {
30
+                    colorSpace = X265_CSP_I400;
31
+                    d = 8;
32
+                }
33
+                else
34
+                    colorSpace = (csp == 444) ? X265_CSP_I444 : (csp == 422) ? X265_CSP_I422 : X265_CSP_I420;
35
+
36
+                if (d >= 8 && d <= 16)
37
+                    depth = d;
38
                 break;
39
 
40
             default:
41
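Reviewer note on the y4m.cpp hunk above: the `switch` on two fixed constants is replaced by integer division tests so that any one- or two-digit suffix after `mono` can be read as a bit depth. The sketch below reproduces that decoding in isolation. It assumes the colourspace tag is folded into an integer one character at a time as `csp = csp * 10 + (c - '0')`, which matches the constants in the diff but is not shown in this hunk. The function names are hypothetical, and the 8..16 range check that the diff applies afterwards is left out.

```cpp
#include <cassert>
#include <string>

// Integer that the four letters "mono" fold to under the assumed scheme.
const int MONO_TAG = ('m' - '0') * 1000 + ('o' - '0') * 100 + ('n' - '0') * 10 + ('o' - '0');

// Fold a tag such as "mono16" or "420" into a single integer (assumed parser behaviour).
int foldColorspaceTag(const std::string& tag)
{
    int csp = 0;
    for (char c : tag)
        csp = csp * 10 + (c - '0');
    return csp;
}

// Bit depth for mono tags: two-digit suffix, one-digit suffix, or 8 for bare "mono".
// Returns -1 for any tag that is not a mono variant.
int monoDepth(int csp)
{
    if (csp / 100 == MONO_TAG)
        return csp % 100;
    if (csp / 10 == MONO_TAG)
        return csp % 10;
    if (csp == MONO_TAG)
        return 8;
    return -1;
}
```

Checking the two-digit form first matters only for readability here, since the three quotients cannot collide for tags of different lengths.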
x265_2.5.tar.gz/source/test/rate-control-tests.txt -> x265_2.6.tar.gz/source/test/rate-control-tests.txt Changed
57
 
1
@@ -2,7 +2,7 @@
2
 
3
 #These tests should yeild deterministic results
4
 # This test is listed first since it currently reproduces bugs
5
-big_buck_bunny_360p24.y4m,--preset medium --bitrate 1000 --pass 1 -F4,--preset medium --bitrate 1000 --pass 2 -F4
6
+big_buck_bunny_360p24.y4m,--preset medium --bitrate 1000 --pass 1 -F4::--preset medium --bitrate 1000 --pass 2 -F4
7
 fire_1920x1080_30.yuv, --preset slow --bitrate 2000 --tune zero-latency 
8
 
9
 
10
@@ -25,27 +25,27 @@
11
 big_buck_bunny_360p24.y4m,--preset medium --bitrate 400 --vbv-bufsize 600 --vbv-maxrate 600 --no-wpp --aud --hrd --tune fast-decode
12
 sita_1920x1080_30.yuv,--preset superfast --bitrate 3000 --vbv-bufsize 3000 --vbv-maxrate 3000 --aud --strict-cbr --no-wpp
13
 sintel_trailer_2k_480p24.y4m, --preset slow --crf 24 --vbv-bufsize 150 --vbv-maxrate 150 --dynamic-rd 1.53
14
-
15
+BasketballDrive_1920x1080_50.y4m,--preset medium --bitrate 10000 --vbv-bufsize 15000 --vbv-maxrate 11500 --vbv-end 0.9 --vbv-end-fr-adj 0.7
16
 
17
 
18
 # multi-pass rate control tests
19
-sita_1920x1080_30.yuv, --preset ultrafast --crf 20 --no-cutree --no-scenecut --keyint 50 --no-open-gop --pass 1 --vbv-bufsize 7000 --vbv-maxrate 5000, --preset ultrafast --crf 20 --no-cutree --no-scenecut --keyint 50 --no-open-gop --pass 2 --vbv-bufsize 7000 --vbv-maxrate 5000
20
-sita_1920x1080_30.yuv, --preset medium --crf 20 --no-cutree --no-scenecut --keyint 50 --no-open-gop --pass 1 --vbv-bufsize 7000 --vbv-maxrate 5000, --preset medium --crf 20 --no-cutree --no-scenecut --keyint 50 --no-open-gop --pass 2 --vbv-bufsize 7000 --vbv-maxrate 5000
21
-sintel_trailer_2k_480p24.y4m, --preset medium --crf 18 --no-cutree --no-scenecut --no-open-gop --keyint 50 --vbv-bufsize 1200 --vbv-maxrate 1000 --pass 1, --preset medium --crf 18 --no-cutree --no-scenecut --no-open-gop --keyint 50 --vbv-bufsize 1200 --vbv-maxrate 1000 --pass 2
22
-sintel_trailer_2k_480p24.y4m, --preset veryslow --crf 18 --no-cutree --no-scenecut --no-open-gop --keyint 50 --vbv-bufsize 1200 --vbv-maxrate 1000 --pass 1, --preset veryslow --crf 18 --no-cutree --no-scenecut --no-open-gop --keyint 50 --vbv-bufsize 1200 --vbv-maxrate 1000 --pass 2
23
-ten_teaser_3840x2160_50_10bit.yuv, --preset medium --crf 25 --no-cutree --no-open-gop --no-scenecut --keyint 50 --vbv-maxrate 10000 --vbv-bufsize 12000 --pass 1, --preset medium --crf 25 --no-cutree --no-open-gop --no-scenecut --keyint 50 --vbv-maxrate 10000 --vbv-bufsize 12000 --pass 2
24
-big_buck_bunny_360p24.y4m,--preset slow --crf 40 --pass 1 -f 5000,--preset slow --bitrate 200 --pass 2 -f 5000
25
-big_buck_bunny_360p24.y4m,--preset medium --bitrate 700 --pass 1 -F4 --slow-firstpass -f 5000 ,--preset medium --bitrate 700 --vbv-bufsize 900 --vbv-maxrate 700 --pass 2 -F4 -f 5000
26
-112_1920x1080_25.yuv,--preset fast --bitrate 1000 --vbv-maxrate 1000 --vbv-bufsize 1000 --strict-cbr --pass 1 -F4,--preset fast --bitrate 1000 --vbv-maxrate 3000 --vbv-bufsize 3000 --pass 2 -F4
27
-pine_tree_1920x1080_30.yuv,--preset veryfast --crf 12 --pass 1 -F4,--preset faster --bitrate 4000 --pass 2 -F4
28
-SteamLocomotiveTrain_2560x1600_60_10bit_crop.yuv, --tune grain --preset ultrafast --bitrate 5000 --vbv-maxrate 5000 --vbv-bufsize 8000 --strict-cbr -F4 --pass 1, --tune grain --preset ultrafast --bitrate 8000 --vbv-maxrate 8000 --vbv-bufsize 8000 -F4 --pass 2
29
-RaceHorses_416x240_30_10bit.yuv,--preset medium --crf 40 --pass 1, --preset faster --bitrate 200 --pass 2 -F4
30
-CrowdRun_1920x1080_50_10bit_422.yuv,--preset superfast --bitrate 2500 --pass 1 -F4 --slow-firstpass,--preset superfast --bitrate 2500 --pass 2 -F4
31
-RaceHorses_416x240_30_10bit.yuv,--preset medium --crf 26 --vbv-maxrate 1000 --vbv-bufsize 1000 --pass 1,--preset fast --bitrate 1000  --vbv-maxrate 1000 --vbv-bufsize 700 --pass 3 -F4,--preset slow --bitrate 500 --vbv-maxrate 500  --vbv-bufsize 700 --pass 2 -F4
-sita_1920x1080_30.yuv, --preset ultrafast --crf 20 --no-cutree --keyint 50 --min-keyint 50 --no-open-gop --pass 1 --vbv-bufsize 7000 --vbv-maxrate 5000, --preset ultrafast --crf 20 --no-cutree --keyint 50 --min-keyint 50 --no-open-gop --pass 2 --vbv-bufsize 7000 --vbv-maxrate 5000 --repeat-headers
-sita_1920x1080_30.yuv, --preset medium --crf 20 --no-cutree --keyint 50 --min-keyint 50 --no-open-gop --pass 1 --vbv-bufsize 7000 --vbv-maxrate 5000 --repeat-headers --multi-pass-opt-rps, --preset medium --crf 20 --no-cutree --keyint 50 --min-keyint 50 --no-open-gop --pass 2 --vbv-bufsize 7000 --vbv-maxrate 5000 --repeat-headers --multi-pass-opt-rps
+sita_1920x1080_30.yuv, --preset ultrafast --crf 20 --no-cutree --no-scenecut --keyint 50 --no-open-gop --pass 1 --vbv-bufsize 7000 --vbv-maxrate 5000:: --preset ultrafast --crf 20 --no-cutree --no-scenecut --keyint 50 --no-open-gop --pass 2 --vbv-bufsize 7000 --vbv-maxrate 5000
+sita_1920x1080_30.yuv, --preset medium --crf 20 --no-cutree --no-scenecut --keyint 50 --no-open-gop --pass 1 --vbv-bufsize 7000 --vbv-maxrate 5000:: --preset medium --crf 20 --no-cutree --no-scenecut --keyint 50 --no-open-gop --pass 2 --vbv-bufsize 7000 --vbv-maxrate 5000
+sintel_trailer_2k_480p24.y4m, --preset medium --crf 18 --no-cutree --no-scenecut --no-open-gop --keyint 50 --vbv-bufsize 1200 --vbv-maxrate 1000 --pass 1:: --preset medium --crf 18 --no-cutree --no-scenecut --no-open-gop --keyint 50 --vbv-bufsize 1200 --vbv-maxrate 1000 --pass 2
+sintel_trailer_2k_480p24.y4m, --preset veryslow --crf 18 --no-cutree --no-scenecut --no-open-gop --keyint 50 --vbv-bufsize 1200 --vbv-maxrate 1000 --pass 1:: --preset veryslow --crf 18 --no-cutree --no-scenecut --no-open-gop --keyint 50 --vbv-bufsize 1200 --vbv-maxrate 1000 --pass 2
+ten_teaser_3840x2160_50_10bit.yuv, --preset medium --crf 25 --no-cutree --no-open-gop --no-scenecut --keyint 50 --vbv-maxrate 10000 --vbv-bufsize 12000 --pass 1:: --preset medium --crf 25 --no-cutree --no-open-gop --no-scenecut --keyint 50 --vbv-maxrate 10000 --vbv-bufsize 12000 --pass 2
+big_buck_bunny_360p24.y4m,--preset slow --crf 40 --pass 1 -f 5000::--preset slow --bitrate 200 --pass 2 -f 5000
+big_buck_bunny_360p24.y4m,--preset medium --bitrate 700 --pass 1 -F4 --slow-firstpass -f 5000 ::--preset medium --bitrate 700 --vbv-bufsize 900 --vbv-maxrate 700 --pass 2 -F4 -f 5000
+112_1920x1080_25.yuv,--preset fast --bitrate 1000 --vbv-maxrate 1000 --vbv-bufsize 1000 --strict-cbr --pass 1 -F4::--preset fast --bitrate 1000 --vbv-maxrate 3000 --vbv-bufsize 3000 --pass 2 -F4
+pine_tree_1920x1080_30.yuv,--preset veryfast --crf 12 --pass 1 -F4::--preset faster --bitrate 4000 --pass 2 -F4
+SteamLocomotiveTrain_2560x1600_60_10bit_crop.yuv, --tune grain --preset ultrafast --bitrate 5000 --vbv-maxrate 5000 --vbv-bufsize 8000 --strict-cbr -F4 --pass 1:: --tune grain --preset ultrafast --bitrate 8000 --vbv-maxrate 8000 --vbv-bufsize 8000 -F4 --pass 2
+RaceHorses_416x240_30_10bit.yuv,--preset medium --crf 40 --pass 1:: --preset faster --bitrate 200 --pass 2 -F4
+CrowdRun_1920x1080_50_10bit_422.yuv,--preset superfast --bitrate 2500 --pass 1 -F4 --slow-firstpass::--preset superfast --bitrate 2500 --pass 2 -F4
+RaceHorses_416x240_30_10bit.yuv,--preset medium --crf 26 --vbv-maxrate 1000 --vbv-bufsize 1000 --pass 1::--preset fast --bitrate 1000  --vbv-maxrate 1000 --vbv-bufsize 700 --pass 3 -F4::--preset slow --bitrate 500 --vbv-maxrate 500  --vbv-bufsize 700 --pass 2 -F4
+sita_1920x1080_30.yuv, --preset ultrafast --crf 20 --no-cutree --keyint 50 --min-keyint 50 --no-open-gop --pass 1 --vbv-bufsize 7000 --vbv-maxrate 5000:: --preset ultrafast --crf 20 --no-cutree --keyint 50 --min-keyint 50 --no-open-gop --pass 2 --vbv-bufsize 7000 --vbv-maxrate 5000 --repeat-headers
+sita_1920x1080_30.yuv, --preset medium --crf 20 --no-cutree --keyint 50 --min-keyint 50 --no-open-gop --pass 1 --vbv-bufsize 7000 --vbv-maxrate 5000 --repeat-headers --multi-pass-opt-rps:: --preset medium --crf 20 --no-cutree --keyint 50 --min-keyint 50 --no-open-gop --pass 2 --vbv-bufsize 7000 --vbv-maxrate 5000 --repeat-headers --multi-pass-opt-rps
 
 # multi-pass rate control and analysis
-ducks_take_off_1080p50.y4m,--bitrate 6000 --pass 1  --multi-pass-opt-analysis  --hash 1 --ssim --psnr, --bitrate 6000 --pass 2  --multi-pass-opt-analysis  --hash 1 --ssim --psnr
-big_buck_bunny_360p24.y4m,--preset veryslow --bitrate 600 --pass 1  --multi-pass-opt-analysis  --multi-pass-opt-distortion --hash 1 --ssim --psnr, --preset veryslow --bitrate 600 --pass 2  --multi-pass-opt-analysis  --multi-pass-opt-distortion --hash 1 --ssim --psnr
-parkrun_ter_720p50.y4m, --bitrate 3500 --pass 1 --multi-pass-opt-distortion --hash 1 --ssim --psnr, --bitrate 3500 --pass 3 --multi-pass-opt-distortion --hash 1 --ssim --psnr, --bitrate 3500 --pass 2 --multi-pass-opt-distortion --hash 1 --ssim --psnr
+ducks_take_off_1080p50.y4m,--bitrate 6000 --pass 1  --multi-pass-opt-analysis  --hash 1 --ssim --psnr:: --bitrate 6000 --pass 2  --multi-pass-opt-analysis  --hash 1 --ssim --psnr
+big_buck_bunny_360p24.y4m,--preset veryslow --bitrate 600 --pass 1  --multi-pass-opt-analysis  --multi-pass-opt-distortion --hash 1 --ssim --psnr:: --preset veryslow --bitrate 600 --pass 2  --multi-pass-opt-analysis  --multi-pass-opt-distortion --hash 1 --ssim --psnr
+parkrun_ter_720p50.y4m, --bitrate 3500 --pass 1 --multi-pass-opt-distortion --hash 1 --ssim --psnr:: --bitrate 3500 --pass 3 --multi-pass-opt-distortion --hash 1 --ssim --psnr:: --bitrate 3500 --pass 2 --multi-pass-opt-distortion --hash 1 --ssim --psnr
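The added lines in this hunk replace the comma that used to separate the per-pass command lines with `::`. A minimal sketch of how a harness might split such an entry, assuming the first comma ends the file name and `::` separates the passes. `TestEntry`, `trim` and `parseTestLine` are hypothetical names for illustration and are not taken from x265's test scripts:

```cpp
#include <string>
#include <vector>

// Hypothetical container for one line of a test list.
struct TestEntry
{
    std::string input;                 // sequence file name
    std::vector<std::string> passes;   // one command line per pass
};

// Remove leading and trailing spaces.
static std::string trim(const std::string& s)
{
    const std::string::size_type b = s.find_first_not_of(' ');
    if (b == std::string::npos)
        return "";
    const std::string::size_type e = s.find_last_not_of(' ');
    return s.substr(b, e - b + 1);
}

// Split "file,cmd1::cmd2::..." into the file name and the per-pass commands.
TestEntry parseTestLine(const std::string& line)
{
    TestEntry entry;
    const std::string::size_type comma = line.find(',');
    entry.input = trim(line.substr(0, comma));
    if (comma == std::string::npos)
        return entry;                  // no command line present

    std::string::size_type pos = comma + 1;
    for (;;)
    {
        const std::string::size_type sep = line.find("::", pos);
        if (sep == std::string::npos)
        {
            entry.passes.push_back(trim(line.substr(pos)));
            break;
        }
        entry.passes.push_back(trim(line.substr(pos, sep - pos)));
        pos = sep + 2;                 // skip the two-character separator
    }
    return entry;
}
```

A two-character separator also leaves commas free to appear inside an option value, which may be the motivation for the change; the diff itself does not say.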
x265_2.5.tar.gz/source/test/regression-tests.txt -> x265_2.6.tar.gz/source/test/regression-tests.txt Changed
@@ -13,21 +13,22 @@
 
 BasketballDrive_1920x1080_50.y4m,--preset ultrafast --signhide --colormatrix bt709
 BasketballDrive_1920x1080_50.y4m,--preset superfast --psy-rd 1 --ctu 16 --no-wpp --limit-modes
+BasketballDrive_1920x1080_50.y4m,--preset superfast --tune zerolatency --bitrate 9000 --vbv-maxrate 9000 --vbv-bufsize 9000 -F 1 --slices 2
 BasketballDrive_1920x1080_50.y4m,--preset veryfast --tune zerolatency --no-temporal-mvp
 BasketballDrive_1920x1080_50.y4m,--preset faster --aq-strength 2 --merange 190 --slices 3
 BasketballDrive_1920x1080_50.y4m,--preset medium --ctu 16 --max-tu-size 8 --subme 7 --qg-size 16 --cu-lossless --tu-inter-depth 3 --limit-tu 1
 BasketballDrive_1920x1080_50.y4m,--preset medium --keyint -1 --nr-inter 100 -F4 --no-sao
-BasketballDrive_1920x1080_50.y4m,--preset medium --no-cutree --analysis-reuse-mode=save --analysis-reuse-level 2 --bitrate 7000 --limit-modes,--preset medium --no-cutree --analysis-reuse-mode=load --analysis-reuse-level 2 --bitrate 7000 --limit-modes
+BasketballDrive_1920x1080_50.y4m,--preset medium --no-cutree --analysis-reuse-mode=save --analysis-reuse-level 2 --bitrate 7000 --limit-modes::--preset medium --no-cutree --analysis-reuse-mode=load --analysis-reuse-level 2 --bitrate 7000 --limit-modes
 BasketballDrive_1920x1080_50.y4m,--preset slow --nr-intra 100 -F4 --aq-strength 3 --qg-size 16 --limit-refs 1
 BasketballDrive_1920x1080_50.y4m,--preset slower --lossless --chromaloc 3 --subme 0 --limit-tu 4
-BasketballDrive_1920x1080_50.y4m,--preset slower --no-cutree --analysis-reuse-mode=save --analysis-reuse-level 10 --bitrate 7000 --limit-tu 0,--preset slower --no-cutree --analysis-reuse-mode=load --analysis-reuse-level 10 --bitrate 7000 --limit-tu 0
+BasketballDrive_1920x1080_50.y4m,--preset slower --no-cutree --analysis-reuse-mode=save --analysis-reuse-level 10 --bitrate 7000 --limit-tu 0::--preset slower --no-cutree --analysis-reuse-mode=load --analysis-reuse-level 10 --bitrate 7000 --limit-tu 0
 BasketballDrive_1920x1080_50.y4m,--preset veryslow --crf 4 --cu-lossless --pmode --limit-refs 1 --aq-mode 3 --limit-tu 3
-BasketballDrive_1920x1080_50.y4m,--preset veryslow --no-cutree --analysis-reuse-mode=save --bitrate 7000 --tskip-fast --limit-tu 4,--preset veryslow --no-cutree --analysis-reuse-mode=load --bitrate 7000  --tskip-fast --limit-tu 4
+BasketballDrive_1920x1080_50.y4m,--preset veryslow --no-cutree --analysis-reuse-mode=save --bitrate 7000 --tskip-fast --limit-tu 4::--preset veryslow --no-cutree --analysis-reuse-mode=load --bitrate 7000  --tskip-fast --limit-tu 4
 BasketballDrive_1920x1080_50.y4m,--preset veryslow --recon-y4m-exec "ffplay -i pipe:0 -autoexit"
 Coastguard-4k.y4m,--preset ultrafast --recon-y4m-exec "ffplay -i pipe:0 -autoexit"
 Coastguard-4k.y4m,--preset superfast --tune grain --overscan=crop
 Coastguard-4k.y4m,--preset superfast --tune grain --pme --aq-strength 2 --merange 190
-Coastguard-4k.y4m,--preset veryfast --no-cutree --analysis-reuse-mode=save --analysis-reuse-level 1 --bitrate 15000,--preset veryfast --no-cutree --analysis-reuse-mode=load --analysis-reuse-level 1 --bitrate 15000
+Coastguard-4k.y4m,--preset veryfast --no-cutree --analysis-reuse-mode=save --analysis-reuse-level 1 --bitrate 15000::--preset veryfast --no-cutree --analysis-reuse-mode=load --analysis-reuse-level 1 --bitrate 15000
 Coastguard-4k.y4m,--preset medium --rdoq-level 1 --tune ssim --no-signhide --me umh --slices 2
 Coastguard-4k.y4m,--preset slow --tune psnr --cbqpoffs -1 --crqpoffs 1 --limit-refs 1
 CrowdRun_1920x1080_50_10bit_422.yuv,--preset ultrafast --weightp --tune zerolatency --qg-size 16
@@ -51,7 +52,7 @@
 DucksAndLegs_1920x1080_60_10bit_444.yuv,--preset veryfast --weightp --nr-intra 1000 -F4
 DucksAndLegs_1920x1080_60_10bit_444.yuv,--preset medium --nr-inter 500 -F4 --no-psy-rdoq
 DucksAndLegs_1920x1080_60_10bit_444.yuv,--preset slower --no-weightp --rdoq-level 0 --limit-refs 3 --tu-inter-depth 4 --limit-tu 3
-DucksAndLegs_1920x1080_60_10bit_422.yuv,--preset fast --no-cutree --analysis-reuse-mode=save --bitrate 3000 --early-skip --tu-inter-depth 3 --limit-tu 1,--preset fast --no-cutree --analysis-reuse-mode=load --bitrate 3000 --early-skip --tu-inter-depth 3 --limit-tu 1
+DucksAndLegs_1920x1080_60_10bit_422.yuv,--preset fast --no-cutree --analysis-reuse-mode=save --bitrate 3000 --early-skip --tu-inter-depth 3 --limit-tu 1::--preset fast --no-cutree --analysis-reuse-mode=load --bitrate 3000 --early-skip --tu-inter-depth 3 --limit-tu 1
 FourPeople_1280x720_60.y4m,--preset superfast --no-wpp --lookahead-slices 2
 FourPeople_1280x720_60.y4m,--preset veryfast --aq-mode 2 --aq-strength 1.5 --qg-size 8
 FourPeople_1280x720_60.y4m,--preset medium --qp 38 --no-psy-rd
@@ -68,8 +69,8 @@
 KristenAndSara_1280x720_60.y4m,--preset slower --pmode --max-tu-size 8 --limit-refs 0 --limit-modes --limit-tu 1
 NebutaFestival_2560x1600_60_10bit_crop.yuv,--preset superfast --tune psnr
 NebutaFestival_2560x1600_60_10bit_crop.yuv,--preset medium --tune grain --limit-refs 2
-NebutaFestival_2560x1600_60_10bit_crop.yuv,--preset slow --no-cutree --analysis-reuse-mode=save --rd 5 --analysis-reuse-level 10 --bitrate 9000,--preset slow --no-cutree --analysis-reuse-mode=load --rd 5 --analysis-reuse-level 10 --bitrate 9000
-News-4k.y4m,--preset ultrafast --no-cutree --analysis-reuse-mode=save --analysis-reuse-level 2 --bitrate 15000,--preset ultrafast --no-cutree --analysis-reuse-mode=load --analysis-reuse-level 2 --bitrate 15000
+NebutaFestival_2560x1600_60_10bit_crop.yuv,--preset slow --no-cutree --analysis-reuse-mode=save --rd 5 --analysis-reuse-level 10 --bitrate 9000::--preset slow --no-cutree --analysis-reuse-mode=load --rd 5 --analysis-reuse-level 10 --bitrate 9000
+News-4k.y4m,--preset ultrafast --no-cutree --analysis-reuse-mode=save --analysis-reuse-level 2 --bitrate 15000::--preset ultrafast --no-cutree --analysis-reuse-mode=load --analysis-reuse-level 2 --bitrate 15000
 News-4k.y4m,--preset superfast --lookahead-slices 6 --aq-mode 0
 News-4k.y4m,--preset superfast --slices 4 --aq-mode 0 
 News-4k.y4m,--preset medium --tune ssim --no-sao --qg-size 16
@@ -109,6 +110,7 @@
 ducks_take_off_444_720p50.y4m,--preset superfast --weightp --limit-refs 2
 ducks_take_off_420_720p50.y4m,--preset faster --qp 24 --deblock -6 --limit-refs 2
 ducks_take_off_420_720p50.y4m,--preset fast --deblock 6 --bframes 16 --rc-lookahead 40
+ducks_take_off_420_720p50.y4m,--preset fast --tune zerolatency --crf 21 --vbv-maxrate 6000 --vbv-bufsize 6000 -F 1 --slices 2
 ducks_take_off_420_720p50.y4m,--preset medium --tskip --tskip-fast --constrained-intra
 ducks_take_off_444_720p50.y4m,--preset medium --qp 38 --no-scenecut
 ducks_take_off_420_720p50.y4m,--preset slow --scaling-list default --qp 40
@@ -123,7 +125,7 @@
 old_town_cross_444_720p50.y4m,--preset superfast --weightp --min-cu 16 --limit-modes
 old_town_cross_444_720p50.y4m,--preset veryfast --qp 1 --tune ssim
 old_town_cross_444_720p50.y4m,--preset faster --rd 1 --tune zero-latency
-old_town_cross_444_720p50.y4m,--preset fast --no-cutree --analysis-reuse-mode=save --analysis-reuse-level 1 --bitrate 3000 --early-skip,--preset fast --no-cutree --analysis-reuse-mode=load --analysis-reuse-level 1 --bitrate 3000 --early-skip
+old_town_cross_444_720p50.y4m,--preset fast --no-cutree --analysis-reuse-mode=save --analysis-reuse-level 1 --bitrate 3000 --early-skip::--preset fast --no-cutree --analysis-reuse-mode=load --analysis-reuse-level 1 --bitrate 3000 --early-skip
 old_town_cross_444_720p50.y4m,--preset medium --keyint -1 --no-weightp --ref 6
 old_town_cross_444_720p50.y4m,--preset slow --rdoq-level 1 --early-skip --ref 7 --no-b-pyramid
 old_town_cross_444_720p50.y4m,--preset slower --crf 4 --cu-lossless
@@ -159,4 +161,8 @@
 #SEA Implementation Test
 silent_cif_420.y4m,--preset veryslow --me sea
 big_buck_bunny_360p24.y4m,--preset superfast --me sea
+
+#low-pass dct test
+720p50_parkrun_ter.y4m,--preset medium --lowpass-dct
+
 # vim: tw=200
x265_2.5.tar.gz/source/test/testharness.h -> x265_2.6.tar.gz/source/test/testharness.h Changed
@@ -68,6 +68,10 @@
 #include <intrin.h>
 #elif HAVE_RDTSC
 #include <intrin.h>
+#elif (!defined(__APPLE__) && (defined (__GNUC__) && (defined(__x86_64__) || defined(__i386__))))
+#include <x86intrin.h>
+#elif ( !defined(__APPLE__) && defined (__GNUC__) && defined(__ARM_NEON__))
+#include <arm_neon.h>
 #elif defined(__GNUC__)
 /* fallback for older GCC/MinGW */
 static inline uint32_t __rdtsc(void)
x265_2.5.tar.gz/source/x265.cpp -> x265_2.6.tar.gz/source/x265.cpp Changed
@@ -26,7 +26,6 @@
 #endif
 
 #include "x265.h"
-#include "x265-extras.h"
 #include "x265cli.h"
 
 #include "input/input.h"
@@ -639,7 +638,7 @@
         {
             if (pic_in->bitDepth > param->internalBitDepth && cliopt.bDither)
             {
-                x265_dither_image(*api, *pic_in, cliopt.input->getWidth(), cliopt.input->getHeight(), errorBuf, param->internalBitDepth);
+                x265_dither_image(pic_in, cliopt.input->getWidth(), cliopt.input->getHeight(), errorBuf, param->internalBitDepth);
                 pic_in->bitDepth = param->internalBitDepth;
             }
             /* Overwrite PTS */
x265_2.5.tar.gz/source/x265.def.in -> x265_2.6.tar.gz/source/x265.def.in Changed
@@ -23,3 +23,11 @@
 x265_api_get_${X265_BUILD}
 x265_api_query
 x265_encoder_intra_refresh
+x265_encoder_ctu_info
+x265_get_slicetype_poc_and_scenecut
+x265_get_ref_frame_list
+x265_csvlog_open
+x265_csvlog_frame
+x265_csvlog_encode
+x265_dither_image
+x265_set_analysis_data
x265_2.5.tar.gz/source/x265.h -> x265_2.6.tar.gz/source/x265.h Changed
@@ -35,6 +35,10 @@
  *      opaque handler for encoder */
 typedef struct x265_encoder x265_encoder;
 
+/* x265_picyuv:
+ *      opaque handler for PicYuv */
+typedef struct x265_picyuv x265_picyuv;
+
 /* Application developers planning to link against a shared library version of
  * libx265 from a Microsoft Visual Studio or similar development environment
  * will need to define X265_API_IMPORTS before including this header.
@@ -88,6 +92,21 @@
     uint8_t* payload;
 } x265_nal;
 
+#define X265_LOOKAHEAD_MAX 250
+
+typedef struct x265_lookahead_data
+{
+    int64_t   plannedSatd[X265_LOOKAHEAD_MAX + 1];
+    uint32_t  *vbvCost;
+    uint32_t  *intraVbvCost;
+    uint32_t  *satdForVbv;
+    uint32_t  *intraSatdForVbv;
+    int       keyframe;
+    int       lastMiniGopBFrame;
+    int       plannedType[X265_LOOKAHEAD_MAX + 1];
+    int64_t   dts;
+} x265_lookahead_data;
+
 /* Stores all analysis data for a single frame */
 typedef struct x265_analysis_data
 {
@@ -102,6 +121,9 @@
     void*            wt;
     void*            interData;
     void*            intraData;
+    uint32_t         numCuInHeight;
+    x265_lookahead_data lookahead;
+    uint8_t*         modeFlag[2];
 } x265_analysis_data;
 
 /* cu statistics */
@@ -202,6 +224,11 @@
     CTU_INFO_CHANGE = 2,
 }CTUInfo;
 
+typedef enum
+{
+    NO_INFO = 0,
+    AVC_INFO = 1,
+}MVRefineType;
 
 /* Arbitrary User SEI
  * Payload size is in bytes and the payload pointer must be non-NULL. 
@@ -523,15 +550,15 @@
 /* String values accepted by x265_param_parse() (and CLI) for various parameters */
 static const char * const x265_motion_est_names[] = { "dia", "hex", "umh", "star", "sea", "full", 0 };
 static const char * const x265_source_csp_names[] = { "i400", "i420", "i422", "i444", "nv12", "nv16", 0 };
-static const char * const x265_video_format_names[] = { "component", "pal", "ntsc", "secam", "mac", "undef", 0 };
+static const char * const x265_video_format_names[] = { "component", "pal", "ntsc", "secam", "mac", "unknown", 0 };
 static const char * const x265_fullrange_names[] = { "limited", "full", 0 };
-static const char * const x265_colorprim_names[] = { "", "bt709", "undef", "", "bt470m", "bt470bg", "smpte170m", "smpte240m", "film", "bt2020", 0 };
-static const char * const x265_transfer_names[] = { "", "bt709", "undef", "", "bt470m", "bt470bg", "smpte170m", "smpte240m", "linear", "log100",
+static const char * const x265_colorprim_names[] = { "reserved", "bt709", "unknown", "reserved", "bt470m", "bt470bg", "smpte170m", "smpte240m", "film", "bt2020", "smpte428", "smpte431", "smpte432", 0 };
+static const char * const x265_transfer_names[] = { "reserved", "bt709", "unknown", "reserved", "bt470m", "bt470bg", "smpte170m", "smpte240m", "linear", "log100",
                                                     "log316", "iec61966-2-4", "bt1361e", "iec61966-2-1", "bt2020-10", "bt2020-12",
-                                                    "smpte-st-2084", "smpte-st-428", "arib-std-b67", 0 };
-static const char * const x265_colmatrix_names[] = { "GBR", "bt709", "undef", "", "fcc", "bt470bg", "smpte170m", "smpte240m",
-                                                     "YCgCo", "bt2020nc", "bt2020c", 0 };
-static const char * const x265_sar_names[] = { "undef", "1:1", "12:11", "10:11", "16:11", "40:33", "24:11", "20:11",
+                                                    "smpte2084", "smpte428", "arib-std-b67", 0 };
+static const char * const x265_colmatrix_names[] = { "gbr", "bt709", "unknown", "", "fcc", "bt470bg", "smpte170m", "smpte240m",
+                                                     "ycgco", "bt2020nc", "bt2020c", "smpte2085", "chroma-derived-nc", "chroma-derived-c", "ictcp", 0 };
+static const char * const x265_sar_names[] = { "unknown", "1:1", "12:11", "10:11", "16:11", "40:33", "24:11", "20:11",
                                                 "32:11", "80:33", "18:11", "15:11", "64:33", "160:99", "4:3", "3:2", "2:1", 0 };
 static const char * const x265_interlace_names[] = { "prog", "tff", "bff", 0 };
 static const char * const x265_analysis_names[] = { "off", "save", "load", 0 };
@@ -1479,6 +1506,35 @@
 
     /* File pointer for csv log */
     FILE*     csvfpt;
+
+    /* Force flushing the frames from encoder */
+    int       forceFlush;
+
+    /* Enable skipping split RD analysis when sum of split CU rdCost larger than none split CU rdCost for Intra CU */
+    int       bEnableSplitRdSkip;
+
+    /* Disable lookahead */
+    int       bDisableLookahead;
+
+    /* Use low-pass subband dct approximation 
+    *  This DCT approximation is less computational intensive and gives results close to standard DCT */
+    int       bLowPassDct;
+
+    /* Sets the portion of the decode buffer that must be available after all the
+    * specified frames have been inserted into the decode buffer. If it is less
+    * than 1, then the final buffer available is vbv-end * vbvBufferSize.  Otherwise,
+    * it is interpreted as the final buffer available in kbits. Default 0 (disabled) */
+    double    vbvBufferEnd;
+    
+    /* Frame from which qp has to be adjusted to hit final decode buffer emptiness.
+    * Specified as a fraction of the total frames. Default 0 */
+    double    vbvEndFrameAdjust;
+
+    /* Reuse MV information obtained through API */
+    int       bMVType;
+
+    /* Allow the encoder to have a copy of the planes of x265_picture in Frame */
+    int       bCopyPicToFrame;
 } x265_param;
 
 /* x265_param_alloc:
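The comment on `vbvBufferEnd` in this hunk describes two interpretations of the same value. A small sketch of that rule, assuming the buffer size is expressed in kbits as it is for `--vbv-bufsize`. `finalBufferKbits` is a hypothetical helper written for illustration and is not an x265 function:

```cpp
// Resolve a vbv-end style value to kbits, following the header comment:
// values below 1 are a fraction of the buffer size, anything else is
// taken as an absolute amount in kbits.
double finalBufferKbits(double vbvBufferEnd, double vbvBufferSizeKbits)
{
    if (vbvBufferEnd < 1.0)
        return vbvBufferEnd * vbvBufferSizeKbits;
    return vbvBufferEnd;
}
```

With a 12000 kbit buffer, a value of 0.5 would resolve to 6000 kbits, while a value of 9000 would be used as given.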
@@ -1677,10 +1733,47 @@
  *    the encoder will wait for this copy to complete if enabled.
  */
 int x265_encoder_ctu_info(x265_encoder *, int poc, x265_ctu_info_t** ctu);
+
+/* x265_get_slicetype_poc_and_scenecut:
+ *     get the slice type, poc and scene cut information for the current frame,
+ *     returns negative on error, 0 when access unit were output.
+ *     This API must be called after(poc >= lookaheadDepth + bframes + 2) condition check */
+int x265_get_slicetype_poc_and_scenecut(x265_encoder *encoder, int *slicetype, int *poc, int* sceneCut);
+
+/* x265_get_ref_frame_list:
+ *     returns negative on error, 0 when access unit were output.
+ *     This API must be called after(poc >= lookaheadDepth + bframes + 2) condition check */
+int x265_get_ref_frame_list(x265_encoder *encoder, x265_picyuv**, x265_picyuv**, int, int);
+
+/* x265_set_analysis_data:
+ *     set the analysis data. The incoming analysis_data structure is assumed to be AVC-sized blocks.
+ *     returns negative on error, 0 access unit were output. */
+int x265_set_analysis_data(x265_encoder *encoder, x265_analysis_data *analysis_data, int poc, uint32_t cuBytes);
+
 /* x265_cleanup:
  *       release library static allocations, reset configured CTU size */
 void x265_cleanup(void);
 
+/* Open a CSV log file. On success it returns a file handle which must be passed
+ * to x265_csvlog_frame() and/or x265_csvlog_encode(). The file handle must be
+ * closed by the caller using fclose(). If csv-loglevel is 0, then no frame logging
+ * header is written to the file. This function will return NULL if it is unable
+ * to open the file for write or if it detects a structure size skew */
+FILE* x265_csvlog_open(const x265_param *);
+
+/* Log frame statistics to the CSV file handle. csv-loglevel should have been non-zero
+ * in the call to x265_csvlog_open() if this function is called. */
+void x265_csvlog_frame(const x265_param *, const x265_picture *);
+
+/* Log final encode statistics to the CSV file handle. 'argc' and 'argv' are
+ * intended to be command line arguments passed to the encoder. Encode
+ * statistics should be queried from the encoder just prior to closing it. */
+void x265_csvlog_encode(x265_encoder *encoder, const x265_stats *, int argc, char** argv);
+
+/* In-place downshift from a bit-depth greater than 8 to a bit-depth of 8, using
+ * the residual bits to dither each row. */
+void x265_dither_image(x265_picture *, int picWidth, int picHeight, int16_t *errorBuf, int bitDepth);
+
 #define X265_MAJOR_VERSION 1
 
 /* === Multi-lib API ===
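The comment on `x265_dither_image` in this hunk describes a downshift to 8 bits that uses the residual bits to dither each row. The library's implementation is not part of this diff, so the following is only a sketch of the general idea, using plain one-dimensional error diffusion as a stand-in. `ditherRowTo8` is a hypothetical name and its signature does not match the x265 API:

```cpp
#include <cstdint>
#include <vector>

// Downshift one row of samples from srcDepth bits to 8 bits.
// The low bits dropped from each sample are carried into the next
// sample of the row, so the rounding error is spread out instead of lost.
std::vector<uint8_t> ditherRowTo8(const std::vector<uint16_t>& row, int srcDepth)
{
    const int shift = srcDepth - 8;                      // number of bits to drop
    const int half = shift > 0 ? 1 << (shift - 1) : 0;   // rounding offset
    std::vector<uint8_t> out;
    out.reserve(row.size());

    int err = 0;                                         // residual carried along the row
    for (uint16_t sample : row)
    {
        const int value = sample + err;
        int q = (value + half) >> shift;                 // round to the nearest 8-bit level
        if (q < 0) q = 0;
        if (q > 255) q = 255;
        err = value - (q << shift);                      // part the 8-bit level could not represent
        out.push_back(static_cast<uint8_t>(q));
    }
    return out;
}
```

For a flat 10-bit row of value 514, which sits halfway between the 8-bit levels 128 and 129, the output alternates between 129 and 128, so the row average is preserved.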
@@ -1726,6 +1819,13 @@
     int           sizeof_frame_stats;   /* sizeof(x265_frame_stats) */
     int           (*encoder_intra_refresh)(x265_encoder*);
     int           (*encoder_ctu_info)(x265_encoder*, int, x265_ctu_info_t**);
+    int           (*get_slicetype_poc_and_scenecut)(x265_encoder*, int*, int*, int*);
+    int           (*get_ref_frame_list)(x265_encoder*, x265_picyuv**, x265_picyuv**, int, int);
+    FILE*         (*csvlog_open)(const x265_param*);
+    void          (*csvlog_frame)(const x265_param*, const x265_picture*);
+    void          (*csvlog_encode)(x265_encoder*, const x265_stats*, int, char**);
+    void          (*dither_image)(x265_picture*, int, int, int16_t*, int);
+    int           (*set_analysis_data)(x265_encoder *encoder, x265_analysis_data *analysis_data, int poc, uint32_t cuBytes);
     /* add new pointers to the end, or increment X265_MAJOR_VERSION */
 } x265_api;
 
x265_2.5.tar.gz/source/x265cli.h -> x265_2.6.tar.gz/source/x265cli.h Changed
@@ -147,6 +147,8 @@
     { "vbv-maxrate",    required_argument, NULL, 0 },
     { "vbv-bufsize",    required_argument, NULL, 0 },
     { "vbv-init",       required_argument, NULL, 0 },
+    { "vbv-end",        required_argument, NULL, 0 },
+    { "vbv-end-fr-adj", required_argument, NULL, 0 },
     { "bitrate",        required_argument, NULL, 0 },
     { "qp",             required_argument, NULL, 'q' },
     { "aq-mode",        required_argument, NULL, 0 },
@@ -255,8 +257,7 @@
     { "analysis-reuse-level", required_argument, NULL, 0 },
     { "scale-factor",   required_argument, NULL, 0 },
     { "refine-intra",   required_argument, NULL, 0 },
-    { "refine-inter",   no_argument, NULL, 0 },
-    { "no-refine-inter",no_argument, NULL, 0 },
+    { "refine-inter",   required_argument, NULL, 0 },
     { "strict-cbr",           no_argument, NULL, 0 },
     { "temporal-layers",      no_argument, NULL, 0 },
     { "no-temporal-layers",   no_argument, NULL, 0 },
@@ -280,6 +281,13 @@
     { "no-dhdr10-opt",        no_argument, NULL, 0},
     { "refine-mv",            no_argument, NULL, 0 },
     { "no-refine-mv",         no_argument, NULL, 0 },
+    { "force-flush",    required_argument, NULL, 0 },
+    { "splitrd-skip",         no_argument, NULL, 0 },
+    { "no-splitrd-skip",      no_argument, NULL, 0 },
+    { "lowpass-dct",          no_argument, NULL, 0 },
+    { "refine-mv-type", required_argument, NULL, 0 },
+    { "copy-pic",             no_argument, NULL, 0 },
+    { "no-copy-pic",          no_argument, NULL, 0 },
     { 0, 0, 0, 0 },
     { 0, 0, 0, 0 },
     { 0, 0, 0, 0 },
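The table entries in these hunks follow the `struct option` layout, with `required_argument` and `no_argument` as the second field, which suggests the table is consumed by `getopt_long`. A minimal sketch under that assumption, showing how three of the new long options would be read. `DemoOptions` and `parseDemoOptions` are hypothetical names, and the code is not x265's own parser:

```cpp
#include <getopt.h>
#include <cstdlib>
#include <cstring>

// Hypothetical result holder; x265 keeps these values in x265_param.
struct DemoOptions
{
    double vbvEnd;
    int    forceFlush;
    bool   lowpassDct;
};

// Parse a few of the long options from this hunk with getopt_long.
// Returns false on an unknown option or a missing argument.
bool parseDemoOptions(int argc, char** argv, DemoOptions& out)
{
    static const struct option longOpts[] =
    {
        { "vbv-end",     required_argument, NULL, 0 },
        { "force-flush", required_argument, NULL, 0 },
        { "lowpass-dct", no_argument,       NULL, 0 },
        { 0, 0, 0, 0 }
    };

    out = DemoOptions{ 0.0, 0, false };
    optind = 1;   // restart scanning so the function can be called repeatedly
    opterr = 0;   // keep getopt_long from printing its own diagnostics

    for (;;)
    {
        int index = -1;
        const int c = getopt_long(argc, argv, "", longOpts, &index);
        if (c == -1)
            break;            // no more options
        if (c != 0 || index < 0)
            return false;     // '?' signals an unknown option or a missing argument

        const char* name = longOpts[index].name;
        if (!strcmp(name, "vbv-end"))
            out.vbvEnd = atof(optarg);
        else if (!strcmp(name, "force-flush"))
            out.forceFlush = atoi(optarg);
        else if (!strcmp(name, "lowpass-dct"))
            out.lowpassDct = true;
    }
    return true;
}
```

Because `--refine-inter` changes from `no_argument` to `required_argument` in this diff, a command line that passed it as a bare switch would be rejected by a parser of this kind.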
@@ -333,6 +341,7 @@
     H0("   --seek <integer>              First frame to encode\n");
     H1("   --[no-]interlace <bff|tff>    Indicate input pictures are interlace fields in temporal order. Default progressive\n");
     H1("   --dither                      Enable dither if downscaling to 8 bit pixels. Default disabled\n");
+    H0("   --[no-]copy-pic               Copy buffers of input picture in frame. Default %s\n", OPT(param->bCopyPicToFrame));
     H0("\nQuality reporting metrics:\n");
     H0("   --[no-]ssim                   Enable reporting SSIM metric scores. Default %s\n", OPT(param->bEnableSsim));
     H0("   --[no-]psnr                   Enable reporting PSNR metric scores. Default %s\n", OPT(param->bEnablePsnr));
@@ -374,6 +383,7 @@
     H0("   --[no-]early-skip             Enable early SKIP detection. Default %s\n", OPT(param->bEnableEarlySkip));
     H0("   --[no-]rskip                  Enable early exit from recursion. Default %s\n", OPT(param->bEnableRecursionSkip));
     H1("   --[no-]tskip-fast             Enable fast intra transform skipping. Default %s\n", OPT(param->bEnableTSkipFast));
+    H1("   --[no-]splitrd-skip           Enable skipping split RD analysis when sum of split CU rdCost larger than none split CU rdCost for Intra CU. Default %s\n", OPT(param->bEnableSplitRdSkip));
     H1("   --nr-intra <integer>          An integer value in range of 0 to 2000, which denotes strength of noise reduction in intra CUs. Default 0\n");
     H1("   --nr-inter <integer>          An integer value in range of 0 to 2000, which denotes strength of noise reduction in inter CUs. Default 0\n");
     H0("   --ctu-info <integer>          Enable receiving ctu information asynchronously and determine reaction to the CTU information (0, 1, 2, 4, 6) Default 0\n"
@@ -415,7 +425,7 @@
     H0("   --rc-lookahead <integer>      Number of frames for frame-type lookahead (determines encoder latency) Default %d\n", param->lookaheadDepth);
     H1("   --lookahead-slices <0..16>    Number of slices to use per lookahead cost estimate. Default %d\n", param->lookaheadSlices);
     H0("   --lookahead-threads <integer> Number of threads to be dedicated to perform lookahead only. Default %d\n", param->lookaheadThreads);
-    H0("   --bframes <integer>           Maximum number of consecutive b-frames (now it only enables B GOP structure) Default %d\n", param->bframes);
+    H0("-b/--bframes <0..16>             Maximum number of consecutive b-frames. Default %d\n", param->bframes);
     H1("   --bframe-bias <integer>       Bias towards B frame decisions. Default %d\n", param->bFrameBias);
     H0("   --b-adapt <0..2>              0 - none, 1 - fast, 2 - full (trellis) adaptive B frame scheduling. Default %d\n", param->bFrameAdaptive);
     H0("   --[no-]b-pyramid              Use B-frames as references. Default %s\n", OPT(param->bBPyramid));
@@ -423,6 +433,10 @@
     H1("                                 Format of each line: framenumber frametype QP\n");
     H1("                                 QP is optional (none lets x265 choose). Frametypes: I,i,K,P,B,b.\n");
     H1("                                 QPs are restricted by qpmin/qpmax.\n");
+    H1("   --force-flush <integer>       Force the encoder to flush frames. Default %d\n", param->forceFlush);
+    H1("                                 0 - flush the encoder only when all the input pictures are over.\n");
+    H1("                                 1 - flush all the frames even when the input is not over. Slicetype decision may change with this option.\n");
+    H1("                                 2 - flush the slicetype decided frames only.\n");
     H0("\nRate control, Adaptive Quantization:\n");
     H0("   --bitrate <integer>           Target bitrate (kbps) for ABR (implied). Default %d\n", param->rc.bitrate);
     H1("-q/--qp <integer>                QP for P slices in CQP mode (implied). --ipratio and --pbration determine other slice QPs\n");
@@ -435,6 +449,8 @@
     H0("   --vbv-maxrate <integer>       Max local bitrate (kbit/s). Default %d\n", param->rc.vbvMaxBitrate);
     H0("   --vbv-bufsize <integer>       Set size of the VBV buffer (kbit). Default %d\n", param->rc.vbvBufferSize);
     H0("   --vbv-init <float>            Initial VBV buffer occupancy (fraction of bufsize or in kbits). Default %.2f\n", param->rc.vbvBufferInit);
+    H0("   --vbv-end <float>             Final VBV buffer emptiness (fraction of bufsize or in kbits). Default 0 (disabled)\n");
+    H0("   --vbv-end-fr-adj <float>      Frame from which qp has to be adjusted to achieve final decode buffer emptiness. Default 0\n");
     H0("   --pass                        Multi pass rate control.\n"
        "                                   - 1 : First pass, creates stats file\n"
        "                                   - 2 : Last pass, does not overwrite stats file\n"
@@ -448,9 +464,21 @@
     H0("   --analysis-reuse-mode <string|int>  save - Dump analysis info into file, load - Load analysis buffers from the file. Default %d\n", param->analysisReuseMode);
     H0("   --analysis-reuse-file <filename>    Specify file name used for either dumping or reading analysis data. Deault x265_analysis.dat\n");
     H0("   --analysis-reuse-level <1..10>      Level of analysis reuse indicates amount of info stored/reused in save/load mode, 1:least..10:most. Default %d\n", param->analysisReuseLevel);
+    H0("   --refine-mv-type <string>     Reuse MV information received through API call. Supported option is avc. Default disabled - %d\n", param->bMVType);
     H0("   --scale-factor <int>          Specify factor by which input video is scaled down for analysis save mode. Default %d\n", param->scaleFactor);
-    H0("   --refine-intra <int>          Enable intra refinement for load mode. Default %d\n", param->intraRefine);
-    H0("   --[no-]refine-inter           Enable inter refinement for load mode. Default %s\n", OPT(param->interRefine));
+    H0("   --refine-intra <0..3>         Enable intra refinement for encode that uses analysis-reuse-mode=load.\n"
+        "                                    - 0 : Forces both mode and depth from the save encode.\n"
+        "                                    - 1 : Functionality of (0) + evaluate all intra modes at min-cu-size's depth when current depth is one smaller than min-cu-size's depth.\n"
+        "                                    - 2 : Functionality of (1) + irrespective of size evaluate all angular modes when the save encode decides the best mode as angular.\n"
+        "                                    - 3 : Functionality of (1) + irrespective of size evaluate all intra modes.\n"
+        "                                Default:%d\n", param->intraRefine);
+    H0("   --refine-inter <0..3>         Enable inter refinement for encode that uses analysis-reuse-mode=load.\n"
+        "                                    - 0 : Forces both mode and depth from the save encode.\n"
+        "                                    - 1 : Functionality of (0) + evaluate all inter modes at min-cu-size's depth when current depth is one smaller than\n"
+        "                                          min-cu-size's depth. When save encode decides the current block as skip(for all sizes) evaluate skip/merge.\n"
+        "                                    - 2 : Functionality of (1) + irrespective of size restrict the modes evaluated when specific modes are decided as the best mode by the save encode.\n"
+        "                                    - 3 : Functionality of (1) + irrespective of size evaluate all inter modes.\n"
+        "                                Default:%d\n", param->interRefine);
     H0("   --[no-]refine-mv              Enable mv refinement for load mode. Default %s\n", OPT(param->mvRefine));
     H0("   --aq-mode <integer>           Mode for Adaptive Quantization - 0:none 1:uniform AQ 2:auto variance 3:auto variance with bias to dark scenes. Default %d\n", param->rc.aqMode);
     H0("   --aq-strength <float>         Reduces blocking and blurring in flat and textured areas (0 to 3.0). Default %.2f\n", param->rc.aqStrength);
@@ -492,13 +520,13 @@
     H1("   --overscan <string>           Specify whether it is appropriate for decoder to show cropped region: undef, show or crop. Default undef\n");
     H0("   --videoformat <string>        Specify video format from undef, component, pal, ntsc, secam, mac. Default undef\n");
     H0("   --range <string>              Specify black level and range of luma and chroma signals as full or limited Default limited\n");
-    H0("   --colorprim <string>          Specify color primaries from undef, bt709, bt470m, bt470bg, smpte170m,\n");
-    H0("                                 smpte240m, film, bt2020. Default undef\n");
-    H0("   --transfer <string>           Specify transfer characteristics from undef, bt709, bt470m, bt470bg, smpte170m,\n");
+    H0("   --colorprim <string>          Specify color primaries from  bt709, unknown, reserved, bt470m, bt470bg, smpte170m,\n");
+    H0("                                 smpte240m, film, bt2020, smpte428, smpte431, smpte432. Default undef\n");
+    H0("   --transfer <string>           Specify transfer characteristics from bt709, unknown, reserved, bt470m, bt470bg, smpte170m,\n");
     H0("                                 smpte240m, linear, log100, log316, iec61966-2-4, bt1361e, iec61966-2-1,\n");
-    H0("                                 bt2020-10, bt2020-12, smpte-st-2084, smpte-st-428, arib-std-b67. Default undef\n");
+    H0("                                 bt2020-10, bt2020-12, smpte2084, smpte428, arib-std-b67. Default undef\n");
     H1("   --colormatrix <string>        Specify color matrix setting from undef, bt709, fcc, bt470bg, smpte170m,\n");
-    H1("                                 smpte240m, GBR, YCgCo, bt2020nc, bt2020c. Default undef\n");
+    H1("                                 smpte240m, GBR, YCgCo, bt2020nc, bt2020c, smpte2085, chroma-derived-nc, chroma-derived-c, ictcp. Default undef\n");
     H1("   --chromaloc <integer>         Specify chroma sample location (0 to 5). Default of %d\n", param->vui.chromaSampleLocTypeTopField);
     H0("   --master-display <string>     SMPTE ST 2086 master display color volume info SEI (HDR)\n");
     H0("                                    format: G(x,y)B(x,y)R(x,y)WP(x,y)L(max,min)\n");
@@ -525,6 +553,7 @@
     H1("-r/--recon <filename>            Reconstructed raw image YUV or Y4M output file name\n");
     H1("   --recon-depth <integer>       Bit-depth of reconstructed raw image file. Defaults to input bit depth, or 8 if Y4M\n");
     H1("   --recon-y4m-exec <string>     pipe reconstructed frames to Y4M viewer, ex:\"ffplay -i pipe:0 -autoexit\"\n");
+    H0("   --lowpass-dct                 Use low-pass subband dct approximation. Default %s\n", OPT(param->bLowPassDct));
     H1("\nExecutable return codes:\n");
     H1("    0 - encode successful\n");
     H1("    1 - unable to parse command line\n");
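
Reviewer note: the help text for --vbv-init and --vbv-end says the value is a "fraction of bufsize or in kbits". The sketch below shows one way such a value could be normalized to a fraction. It is not taken from the x265 sources: the helper name vbvOccupancyFraction is hypothetical, and the rule that values above 1 are read as kbit is an assumption based only on the help text.

```cpp
#include <algorithm>

// Hypothetical helper, NOT part of x265: turns a --vbv-init / --vbv-end style
// value into a fraction of the VBV buffer.
// Assumption: values <= 1 are already fractions of bufsize; values > 1 are
// absolute sizes in kbit and are divided by the buffer size.
double vbvOccupancyFraction(double value, double bufSizeKbit)
{
    if (bufSizeKbit <= 0.0)
        return 0.0; // no VBV buffer configured, nothing to scale against

    const double fraction = (value > 1.0) ? value / bufSizeKbit : value;

    // Keep the result inside [0, 1] so an oversized kbit value cannot
    // exceed the buffer.
    return std::min(1.0, std::max(0.0, fraction));
}
```

Under these assumptions, 0.9 with a 1000 kbit buffer stays 0.9, while 500 with the same buffer is read as 500 kbit and becomes 0.5.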
Request History
enzokiel created request over 7 years ago

Update to version 2.6


enzokiel accepted request over 7 years ago