We truncated the diff of some files because they were too big.
Overview
obs-backgroundremoval.changes
Changed
-------------------------------------------------------------------
-Tue Feb 18 12:35:51 UTC 2025 - Antonio Larrosa <alarrosa@suse.com>
+Mon Jun 26 17:51:23 UTC 2023 - Kaito Udagawa <umireon@gmail.com>

-- Update to 1.1.13
-  * Add video_tick function to background filter info
-  * Update Onnxruntime version and fix Windows compilerconfig
-- Update to 1.1.12
-  * Critical bugfix in the PSNR calculation for image-similarity
-    skipping in background filter
-- Update to 1.1.11
-  * New! RMBG model from Bria.AI
-    https://huggingface.co/briaai/RMBG-1.4 - remove background from
-    any object! (not just human)
-  * We got rid of the annoying "update available" message in favor
-    of a more discreet message on the plugin settings.
-  * Better handling of local file paths on Windows
-  * more.
-- Update to 1.1.10
-  * This release will fix the Flatpak recipe for Linux after the
-    dependency bump, as well as removing the start menu option from
-    the Windows installer.
-- Update to 1.1.9
-  * In this release we bumped versions of OpenCV and ONNXRuntime,
-    and trying to get rid of the annoying "smart screen" block on
-    Windows. We're also rolling out releases through AUR, Pacstall
-    and Flatpak. 💪 Linux!
-- Update to 1.1.8
-  * In this release we're introducing "simple mode" that hides most
-    of the settings under an "Advanced" checkbox, which should make
-    it far easier for newcomers to start using the filter without
-    "settings shock".
-  * Additionaly we implemented "temporal smoothing" that helps with
-    reducing the flickering of the edges in the binary mask.
-  * We bumped ONNX Runtime to v1.16.3 that increases robustness and
-    speed.
-  * We fixed the bug of the updater popping up the dialog because
-    we changed the repo URL.
-- Update to 1.1.7
-  * Upgrade to ONNXRuntime 1.16 which improves speed and
-    robustness.
-  * Repackaging of Mac OS release to a more consistent with Apple
-    dev tools.
-  * Fix crashes and bugs on Linux
-  * We added a new "website" for the plugin, which will eventually
-    have more installation info
-    https://occ-ai.github.io/obs-backgroundremoval/
-  * Adding a detailed log message with plugin info which helps us
-    debug
+Build only x86_64

-- Update onnxruntime to 1.17.1.tgz
-- Use Source URLs in the spec file
-- Add patch to fix a cmake error:
-  * fix-cmake-error.patch
+-------------------------------------------------------------------
+Mon Jun 26 16:29:21 UTC 2023 - Kaito Udagawa <umireon@gmail.com>
+
+v1.0.3
+
+-------------------------------------------------------------------
+Fri Jun 23 17:28:31 UTC 2023 - Kaito Udagawa <umireon@gmail.com>
+
+v1.0.2

-------------------------------------------------------------------
-Thu Sep 21 13:50:09 UTC 2023 - Kaito Udagawa <umireon@gmail.com>
+Wed Jun 21 16:43:20 UTC 2023 - Kaito Udagawa <umireon@gmail.com>

-- 1.1.6
+v1.0.1
obs-backgroundremoval.spec
Changed
Name: obs-backgroundremoval
-Version: 1.1.13
+Version: 1.0.3
Release: 0
Summary: OBS Plugin for Background Removal
License: GPL-2.0
-URL: https://github.com/locaal-ai/obs-backgroundremoval
-Source: https://github.com/locaal-ai/%{name}/archive/refs/tags/%{version}.tar.gz#/%{name}-%{version}.tar.gz
+URL: https://github.com/royshil/obs-backgroundremoval
+Source: %{name}-%{version}.tar.gz
Source1: %{name}-rpmlintrc
-Source2: opencv-linux-Release-4.8.0-1.tar.gz
-Source3: https://github.com/microsoft/onnxruntime/releases/download/v1.17.1/onnxruntime-linux-x64-gpu-1.17.1.tgz
-Patch0: fix-cmake-error.patch
+Source2: opencv-4.7.0.tar.gz
+Source3: onnxruntime-linux-x64-gpu-1.15.1.tgz
BuildRequires: cmake
-BuildRequires: libcurl-devel
+BuildRequires: gcc-c++
BuildRequires: obs-studio
BuildRequires: cmake(libobs)
-BuildRequires: cmake(Qt6Core)
-BuildRequires: cmake(Qt6Widgets)
-Requires: obs-studio >= 29.0.0
+Requires: obs-studio >= 28.0.0
ExclusiveArch: x86_64

%global __requires_exclude_from ^.*libonnxruntime.*$
-%global __builddir build_x86_64

%description
An OBS plugin for removing background in portrait images (video), making it easy to replace the background when screen recording.

%prep
-%autosetup -p1
+%autosetup

%build
-test -x "$(type -p gcc-13)" && export CC="$_"
-test -x "$(type -p g++-13)" && export CXX="$_"
-%cmake \
-  -DQT_VERSION=6 \
-  -DCMAKE_BUILD_TYPE=RelWithDebInfo \
-  -DCMAKE_POSITION_INDEPENDENT_CODE=ON \
-  -DENABLE_FRONTEND_API=ON \
-  -DENABLE_QT=ON \
-  -DCMAKE_COMPILE_WARNING_AS_ERROR=ON \
-  -DCUSTOM_OPENCV_URL=%{SOURCE2} \
-  -DCUSTOM_OPENCV_HASH=MD5=7a668fbc3ac536812643c6b8c8f96be9 \
+%cmake -DLINUX_PORTABLE=OFF \
+  -DOPENCV_URL=%{SOURCE2} \
+  -DOPENCV_MD5=13e13244cb0cc6ec4f01eacd38d05d17 \
  -DCUSTOM_ONNXRUNTIME_URL=%{SOURCE3} \
-  -DCUSTOM_ONNXRUNTIME_HASH=MD5=da53e83b3ad3ab2cf46fbabd6a648a9d
+  -DCUSTOM_ONNXRUNTIME_MD5=8d2f5ee9f449bdecb10a45715fe74c53
%cmake_build

%install

%files
%license LICENSE
%doc README.md
+%defattr(-,root,root,-)
/usr/lib64/obs-plugins/obs-backgroundremoval.so
/usr/lib64/obs-plugins/obs-backgroundremoval
/usr/share/obs/obs-plugins/obs-backgroundremoval
fix-cmake-error.patch
Deleted
-Index: obs-backgroundremoval-1.1.13/cmake/common/helpers_common.cmake
-===================================================================
---- obs-backgroundremoval-1.1.13.orig/cmake/common/helpers_common.cmake
-+++ obs-backgroundremoval-1.1.13/cmake/common/helpers_common.cmake
-@@ -86,7 +86,6 @@ macro(find_qt)
-       add_library(Qt::${component} INTERFACE IMPORTED)
-       set_target_properties(Qt::${component} PROPERTIES INTERFACE_LINK_LIBRARIES Qt${_QT_VERSION}::${component})
-     endif()
--    set_property(TARGET Qt::${component} PROPERTY INTERFACE_COMPILE_FEATURES "")
-   endforeach()
-
- endmacro()
obs-backgroundremoval-1.1.13.tar.gz -> obs-backgroundremoval-1.0.3.tar.gz
Changed
onnxruntime-linux-x64-gpu-1.17.1.tgz/GIT_COMMIT_ID -> onnxruntime-linux-x64-gpu-1.15.1.tgz/GIT_COMMIT_ID
Changed
-8f5c79cb63f09ef1302e85081093a3fe4da1bc7d
+baeece44ba075009c6bfe95891a8c1b3d4571cb3
onnxruntime-linux-x64-gpu-1.17.1.tgz/README.md -> onnxruntime-linux-x64-gpu-1.15.1.tgz/README.md
Changed
**ONNX Runtime training** can accelerate the model training time on multi-node NVIDIA GPUs for transformer models with a one-line addition for existing PyTorch training scripts. [Learn more →](https://www.onnxruntime.ai/docs/#onnx-runtime-for-training)

+
## Get Started & Resources

* **General Information**: [onnxruntime.ai](https://onnxruntime.ai)

-* **Usage documentation and tutorials**: [onnxruntime.ai/docs](https://onnxruntime.ai/docs)
+* **Usage documention and tutorials**: [onnxruntime.ai/docs](https://onnxruntime.ai/docs)

* **YouTube video tutorials**: [youtube.com/@ONNXRuntime](https://www.youtube.com/@ONNXRuntime)

* [**Upcoming Release Roadmap**](https://github.com/microsoft/onnxruntime/wiki/Upcoming-Release-Roadmap)

-* **Companion sample repositories**:
+* **Companion sample repositories**:
  - ONNX Runtime Inferencing: [microsoft/onnxruntime-inference-examples](https://github.com/microsoft/onnxruntime-inference-examples)
  - ONNX Runtime Training: [microsoft/onnxruntime-training-examples](https://github.com/microsoft/onnxruntime-training-examples)

-## Builtin Pipeline Status

+## Build Pipeline Status
|System|Inference|Training|
|---|---|---|
|Windows|[![Build Status](https://dev.azure.com/onnxruntime/onnxruntime/_apis/build/status/Windows%20CPU%20CI%20Pipeline?label=Windows+CPU)](https://dev.azure.com/onnxruntime/onnxruntime/_build/latest?definitionId=9)<br>[![Build Status](https://dev.azure.com/onnxruntime/onnxruntime/_apis/build/status/Windows%20GPU%20CI%20Pipeline?label=Windows+GPU)](https://dev.azure.com/onnxruntime/onnxruntime/_build/latest?definitionId=10)<br>[![Build Status](https://dev.azure.com/onnxruntime/onnxruntime/_apis/build/status/Windows%20GPU%20TensorRT%20CI%20Pipeline?label=Windows+GPU+TensorRT)](https://dev.azure.com/onnxruntime/onnxruntime/_build/latest?definitionId=47)||
|Android|[![Build Status](https://dev.azure.com/onnxruntime/onnxruntime/_apis/build/status/Android%20CI%20Pipeline?label=Android)](https://dev.azure.com/onnxruntime/onnxruntime/_build/latest?definitionId=53)||
|iOS|[![Build Status](https://dev.azure.com/onnxruntime/onnxruntime/_apis/build/status/iOS%20CI%20Pipeline?label=iOS)](https://dev.azure.com/onnxruntime/onnxruntime/_build/latest?definitionId=134)||
|Web|[![Build Status](https://dev.azure.com/onnxruntime/onnxruntime/_apis/build/status/ONNX%20Runtime%20Web%20CI%20Pipeline?label=Web)](https://dev.azure.com/onnxruntime/onnxruntime/_build/latest?definitionId=161)||
-|Other|[![Build Status](https://dev.azure.com/onnxruntime/onnxruntime/_apis/build/status/onnxruntime-binary-size-checks-ci-pipeline?repoName=microsoft%2Fonnxruntime&label=Binary+Size+Check)](https://dev.azure.com/onnxruntime/onnxruntime/_build/latest?definitionId=187&repoName=microsoft%2Fonnxruntime)||
-
-## Third-party Pipeline Status
+|Other|[![Build Status](https://dev.azure.com/onnxruntime/onnxruntime/_apis/build/status/onnxruntime-binary-size-checks-ci-pipeline?repoName=microsoft%2Fonnxruntime&label=Binary+Size+Check)](https://dev.azure.com/onnxruntime/onnxruntime/_build/latest?definitionId=187&repoName=microsoft%2Fonnxruntime)<br>[![Build Status](https://dev.azure.com/onnxruntime/onnxruntime/_apis/build/status/onnxruntime-python-checks-ci-pipeline?label=Python+Checks)](https://dev.azure.com/onnxruntime/onnxruntime/_build/latest?definitionId=164)||

-|System|Inference|Training|
-|---|---|---|
-|Linux|[![Build Status](https://github.com/Ascend/onnxruntime/actions/workflows/build-and-test.yaml/badge.svg)](https://github.com/Ascend/onnxruntime/actions/workflows/build-and-test.yaml)||

## Data/Telemetry
onnxruntime-linux-x64-gpu-1.17.1.tgz/ThirdPartyNotices.txt -> onnxruntime-linux-x64-gpu-1.15.1.tgz/ThirdPartyNotices.txt
Changed
Except as contained in this notice, the name of a copyright holder shall not
be used in advertising or otherwise to promote the sale, use or other dealings
-in this Software without prior written authorization of the copyright holder.
-
-_____
-
-Intel neural-compressor
-
-https://github.com/intel/neural-compressor
-
-                                 Apache License
-                           Version 2.0, January 2004
-                        http://www.apache.org/licenses/
-
-   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
-
-   1. Definitions.
-
-      "License" shall mean the terms and conditions for use, reproduction,
-      and distribution as defined by Sections 1 through 9 of this document.
-
-      "Licensor" shall mean the copyright owner or entity authorized by
-      the copyright owner that is granting the License.
-
-      "Legal Entity" shall mean the union of the acting entity and all
-      other entities that control, are controlled by, or are under common
-      control with that entity. For the purposes of this definition,
-      "control" means (i) the power, direct or indirect, to cause the
-      direction or management of such entity, whether by contract or
-      otherwise, or (ii) ownership of fifty percent (50%) or more of the
-      outstanding shares, or (iii) beneficial ownership of such entity.
-
-      "You" (or "Your") shall mean an individual or Legal Entity
-      exercising permissions granted by this License.
-
-      "Source" form shall mean the preferred form for making modifications,
-      including but not limited to software source code, documentation
-      source, and configuration files.
-
-      "Object" form shall mean any form resulting from mechanical
-      transformation or translation of a Source form, including but
-      not limited to compiled object code, generated documentation,
-      and conversions to other media types.
-
-      "Work" shall mean the work of authorship, whether in Source or
-      Object form, made available under the License, as indicated by a
-      copyright notice that is included in or attached to the work
-      (an example is provided in the Appendix below).
-
-      "Derivative Works" shall mean any work, whether in Source or Object
-      form, that is based on (or derived from) the Work and for which the
-      editorial revisions, annotations, elaborations, or other modifications
-      represent, as a whole, an original work of authorship. For the purposes
-      of this License, Derivative Works shall not include works that remain
-      separable from, or merely link (or bind by name) to the interfaces of,
-      the Work and Derivative Works thereof.
-
-      "Contribution" shall mean any work of authorship, including
-      the original version of the Work and any modifications or additions
-      to that Work or Derivative Works thereof, that is intentionally
-      submitted to Licensor for inclusion in the Work by the copyright owner
-      or by an individual or Legal Entity authorized to submit on behalf of
-      the copyright owner. For the purposes of this definition, "submitted"
-      means any form of electronic, verbal, or written communication sent
-      to the Licensor or its representatives, including but not limited to
-      communication on electronic mailing lists, source code control systems,
-      and issue tracking systems that are managed by, or on behalf of, the
-      Licensor for the purpose of discussing and improving the Work, but
-      excluding communication that is conspicuously marked or otherwise
-      designated in writing by the copyright owner as "Not a Contribution."
-
-      "Contributor" shall mean Licensor and any individual or Legal Entity
-      on behalf of whom a Contribution has been received by Licensor and
-      subsequently incorporated within the Work.
-
-   2. Grant of Copyright License. Subject to the terms and conditions of
-      this License, each Contributor hereby grants to You a perpetual,
-      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
-      copyright license to reproduce, prepare Derivative Works of,
-      publicly display, publicly perform, sublicense, and distribute the
-      Work and such Derivative Works in Source or Object form.
-
-   3. Grant of Patent License. Subject to the terms and conditions of
-      this License, each Contributor hereby grants to You a perpetual,
-      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
-      (except as stated in this section) patent license to make, have made,
-      use, offer to sell, sell, import, and otherwise transfer the Work,
-      where such license applies only to those patent claims licensable
-      by such Contributor that are necessarily infringed by their
-      Contribution(s) alone or by combination of their Contribution(s)
-      with the Work to which such Contribution(s) was submitted. If You
-      institute patent litigation against any entity (including a
-      cross-claim or counterclaim in a lawsuit) alleging that the Work
-      or a Contribution incorporated within the Work constitutes direct
-      or contributory patent infringement, then any patent licenses
-      granted to You under this License for that Work shall terminate
-      as of the date such litigation is filed.
-
-   4. Redistribution. You may reproduce and distribute copies of the
-      Work or Derivative Works thereof in any medium, with or without
-      modifications, and in Source or Object form, provided that You
-      meet the following conditions:
-
-      (a) You must give any other recipients of the Work or
-          Derivative Works a copy of this License; and
-
-      (b) You must cause any modified files to carry prominent notices
-          stating that You changed the files; and
-
-      (c) You must retain, in the Source form of any Derivative Works
-          that You distribute, all copyright, patent, trademark, and
-          attribution notices from the Source form of the Work,
-          excluding those notices that do not pertain to any part of
-          the Derivative Works; and
-
-      (d) If the Work includes a "NOTICE" text file as part of its
-          distribution, then any Derivative Works that You distribute must
-          include a readable copy of the attribution notices contained
-          within such NOTICE file, excluding those notices that do not
-          pertain to any part of the Derivative Works, in at least one
-          of the following places: within a NOTICE text file distributed
-          as part of the Derivative Works; within the Source form or
-          documentation, if provided along with the Derivative Works; or,
-          within a display generated by the Derivative Works, if and
-          wherever such third-party notices normally appear. The contents
-          of the NOTICE file are for informational purposes only and
-          do not modify the License. You may add Your own attribution
-          notices within Derivative Works that You distribute, alongside
-          or as an addendum to the NOTICE text from the Work, provided
-          that such additional attribution notices cannot be construed
-          as modifying the License.
-
-      You may add Your own copyright statement to Your modifications and
-      may provide additional or different license terms and conditions
-      for use, reproduction, or distribution of Your modifications, or
-      for any such Derivative Works as a whole, provided Your use,
-      reproduction, and distribution of the Work otherwise complies with
-      the conditions stated in this License.
-
-   5. Submission of Contributions. Unless You explicitly state otherwise,
-      any Contribution intentionally submitted for inclusion in the Work
-      by You to the Licensor shall be under the terms and conditions of
-      this License, without any additional terms or conditions.
-      Notwithstanding the above, nothing herein shall supersede or modify
-      the terms of any separate license agreement you may have executed
-      with Licensor regarding such Contributions.
-
-   6. Trademarks. This License does not grant permission to use the trade
-      names, trademarks, service marks, or product names of the Licensor,
-      except as required for reasonable and customary use in describing the
-      origin of the Work and reproducing the content of the NOTICE file.
-
-   7. Disclaimer of Warranty. Unless required by applicable law or
-      agreed to in writing, Licensor provides the Work (and each
-      Contributor provides its Contributions) on an "AS IS" BASIS,
-      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
-      implied, including, without limitation, any warranties or conditions
-      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
-      PARTICULAR PURPOSE. You are solely responsible for determining the
-      appropriateness of using or redistributing the Work and assume any
-      risks associated with Your exercise of permissions under this License.
-
-   8. Limitation of Liability. In no event and under no legal theory,
-      whether in tort (including negligence), contract, or otherwise,
-      unless required by applicable law (such as deliberate and grossly
-      negligent acts) or agreed to in writing, shall any Contributor be
-      liable to You for damages, including any direct, indirect, special,
-      incidental, or consequential damages of any character arising as a
-      result of this License or out of the use or inability to use the
-      Work (including but not limited to damages for loss of goodwill,
-      work stoppage, computer failure or malfunction, or any and all
-      other commercial damages or losses), even if such Contributor
-      has been advised of the possibility of such damages.
-
-   9. Accepting Warranty or Additional Liability. While redistributing
-      the Work or Derivative Works thereof, You may choose to offer,
-      and charge a fee for, acceptance of support, warranty, indemnity,
-      or other liability obligations and/or rights consistent with this
-      License. However, in accepting such obligations, You may act only
-      on Your own behalf and on Your sole responsibility, not on behalf
-      of any other Contributor, and only if You agree to indemnify,
-      defend, and hold each Contributor harmless for any liability
-      incurred by, or claims asserted against, such Contributor by reason
-      of your accepting any such warranty or additional liability.
-
-   END OF TERMS AND CONDITIONS
-
-   ============================================================================
-
-   Copyright 2016-2019 Intel Corporation
-   Copyright 2018 YANDEX LLC
-
-   Licensed under the Apache License, Version 2.0 (the "License");
-   you may not use this file except in compliance with the License.
-   You may obtain a copy of the License at
-
-       http://www.apache.org/licenses/LICENSE-2.0
-
-   Unless required by applicable law or agreed to in writing, software
-   distributed under the License is distributed on an "AS IS" BASIS,
-   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-   See the License for the specific language governing permissions and
-   limitations under the License.
-
-   This distribution includes third party software ("third party programs").
-   This third party software, even if included with the distribution of
-   the Intel software, may be governed by separate license terms, including
-   without limitation, third party license terms, other Intel software license
-   terms, and open source software license terms. These separate license terms
-   govern your use of the third party programs as set forth in the
-   "THIRD-PARTY-PROGRAMS" file.
-
-_____
-
-FlashAttention, https://github.com/Dao-AILab/flash-attention
-
-BSD 3-Clause License
-
-Copyright (c) 2022, the respective contributors, as shown by the AUTHORS file.
-All rights reserved.
-
-Redistribution and use in source and binary forms, with or without
-modification, are permitted provided that the following conditions are met:
-
-* Redistributions of source code must retain the above copyright notice, this
-  list of conditions and the following disclaimer.
-
-* Redistributions in binary form must reproduce the above copyright notice,
-  this list of conditions and the following disclaimer in the documentation
-  and/or other materials provided with the distribution.
-
-* Neither the name of the copyright holder nor the names of its
-  contributors may be used to endorse or promote products derived from
-  this software without specific prior written permission.
-
-THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
-AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
-IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
-DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
-FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
-DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
-SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
-CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
-OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-
-_____
-
-composable_kernel
-
-https://github.com/ROCmSoftwarePlatform/composable_kernel
-
-Copyright (c) 2018- , Advanced Micro Devices, Inc. (Chao Liu, Jing Zhang)
-Copyright (c) 2019- , Advanced Micro Devices, Inc. (Letao Qin, Qianfeng Zhang, Liang Huang, Shaojie Wang)
-Copyright (c) 2022- , Advanced Micro Devices, Inc. (Anthony Chang, Chunyu Lai, Illia Silin, Adam Osewski, Poyen Chen, Jehandad Khan)
-Copyright (c) 2019-2021, Advanced Micro Devices, Inc. (Hanwen Chang)
-Copyright (c) 2019-2020, Advanced Micro Devices, Inc. (Tejash Shah)
-Copyright (c) 2020     , Advanced Micro Devices, Inc. (Xiaoyan Zhou)
-Copyright (c) 2021-2022, Advanced Micro Devices, Inc. (Jianfeng Yan)
-
-SPDX-License-Identifier: MIT
-Copyright (c) 2018-2023, Advanced Micro Devices, Inc. All rights reserved.
-
-Permission is hereby granted, free of charge, to any person obtaining a copy
-of this software and associated documentation files (the "Software"), to deal
-in the Software without restriction, including without limitation the rights
-to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
-copies of the Software, and to permit persons to whom the Software is
-furnished to do so, subject to the following conditions:
-
-The above copyright notice and this permission notice shall be included in all
-copies or substantial portions of the Software.
-
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
-OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-SOFTWARE.
-
-_____
-
-neural-speed
-
-https://github.com/intel/neural-speed
-
-                                 Apache License
-                        http://www.apache.org/licenses/
-
-   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
-
-   1. Definitions.
-
-      "License" shall mean the terms and conditions for use, reproduction,
-      and distribution as defined by Sections 1 through 9 of this document.
-
-      "Licensor" shall mean the copyright owner or entity authorized by
-      the copyright owner that is granting the License.
-
-      "Legal Entity" shall mean the union of the acting entity and all
-      other entities that control, are controlled by, or are under common
-      control with that entity. For the purposes of this definition,
-      "control" means (i) the power, direct or indirect, to cause the
-      direction or management of such entity, whether by contract or
-      otherwise, or (ii) ownership of fifty percent (50%) or more of the
-      outstanding shares, or (iii) beneficial ownership of such entity.
-
-      "You" (or "Your") shall mean an individual or Legal Entity
-      exercising permissions granted by this License.
-
-      "Source" form shall mean the preferred form for making modifications,
-      including but not limited to software source code, documentation
-      source, and configuration files.
-
-      "Object" form shall mean any form resulting from mechanical
-      transformation or translation of a Source form, including but
-      not limited to compiled object code, generated documentation,
-      and conversions to other media types.
-
-      "Work" shall mean the work of authorship, whether in Source or
-      Object form, made available under the License, as indicated by a
-      copyright notice that is included in or attached to the work
-      (an example is provided in the Appendix below).
-
-      "Derivative Works" shall mean any work, whether in Source or Object
-      form, that is based on (or derived from) the Work and for which the
-      editorial revisions, annotations, elaborations, or other modifications
-      represent, as a whole, an original work of authorship. For the purposes
-      of this License, Derivative Works shall not include works that remain
-      separable from, or merely link (or bind by name) to the interfaces of,
-      the Work and Derivative Works thereof.
-
-      "Contribution" shall mean any work of authorship, including
-      the original version of the Work and any modifications or additions
-      to that Work or Derivative Works thereof, that is intentionally
-      submitted to Licensor for inclusion in the Work by the copyright owner
-      or by an individual or Legal Entity authorized to submit on behalf of
-      the copyright owner. For the purposes of this definition, "submitted"
-      means any form of electronic, verbal, or written communication sent
-      to the Licensor or its representatives, including but not limited to
-      communication on electronic mailing lists, source code control systems,
-      and issue tracking systems that are managed by, or on behalf of, the
-      Licensor for the purpose of discussing and improving the Work, but
-      excluding communication that is conspicuously marked or otherwise
-      designated in writing by the copyright owner as "Not a Contribution."
-
-      "Contributor" shall mean Licensor and any individual or Legal Entity
-      on behalf of whom a Contribution has been received by Licensor and
-      subsequently incorporated within the Work.
-
-   2. Grant of Copyright License. Subject to the terms and conditions of
-      this License, each Contributor hereby grants to You a perpetual,
-      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
-      copyright license to reproduce, prepare Derivative Works of,
-      publicly display, publicly perform, sublicense, and distribute the
-      Work and such Derivative Works in Source or Object form.
-
-   3. Grant of Patent License. Subject to the terms and conditions of
-      this License, each Contributor hereby grants to You a perpetual,
-      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
-      (except as stated in this section) patent license to make, have made,
-      use, offer to sell, sell, import, and otherwise transfer the Work,
-      where such license applies only to those patent claims licensable
-      by such Contributor that are necessarily infringed by their
-      Contribution(s) alone or by combination of their Contribution(s)
-      with the Work to which such Contribution(s) was submitted. If You
-      institute patent litigation against any entity (including a
-      cross-claim or counterclaim in a lawsuit) alleging that the Work
-      or a Contribution incorporated within the Work constitutes direct
-      or contributory patent infringement, then any patent licenses
-      granted to You under this License for that Work shall terminate
-      as of the date such litigation is filed.
-
-   4. Redistribution. You may reproduce and distribute copies of the
-      Work or Derivative Works thereof in any medium, with or without
-      modifications, and in Source or Object form, provided that You
-      meet the following conditions:
-
-      (a) You must give any other recipients of the Work or
-          Derivative Works a copy of this License; and
-
-      (b) You must cause any modified files to carry prominent notices
-          stating that You changed the files; and
-
-      (c) You must retain, in the Source form of any Derivative Works
-          that You distribute, all copyright, patent, trademark, and
-          attribution notices from the Source form of the Work,
-          excluding those notices that do not pertain to any part of
-          the Derivative Works; and
-
-      (d) If the Work includes a "NOTICE" text file as part of its
-          distribution, then any Derivative Works that You distribute must
-          include a readable copy of the attribution notices contained
-          within such NOTICE file, excluding those notices that do not
-          pertain to any part of the Derivative Works, in at least one
-          of the following places: within a NOTICE text file distributed
-          as part of the Derivative Works; within the Source form or
-          documentation, if provided along with the Derivative Works; or,
-          within a display generated by the Derivative Works, if and
-          wherever such third-party notices normally appear. The contents
-          of the NOTICE file are for informational purposes only and
-          do not modify the License. You may add Your own attribution
-          notices within Derivative Works that You distribute, alongside
-          or as an addendum to the NOTICE text from the Work, provided
-          that such additional attribution notices cannot be construed
-          as modifying the License.
-
-      You may add Your own copyright statement to Your modifications and
-      may provide additional or different license terms and conditions
-      for use, reproduction, or distribution of Your modifications, or
-      for any such Derivative Works as a whole, provided Your use,
-      reproduction, and distribution of the Work otherwise complies with
-      the conditions stated in this License.
-
-   5. Submission of Contributions. Unless You explicitly state otherwise,
-      any Contribution intentionally submitted for inclusion in the Work
-      by You to the Licensor shall be under the terms and conditions of
-      this License, without any additional terms or conditions.
-      Notwithstanding the above, nothing herein shall supersede or modify
-      the terms of any separate license agreement you may have executed
-      with Licensor regarding such Contributions.
-
-   6. Trademarks. This License does not grant permission to use the trade
-      names, trademarks, service marks, or product names of the Licensor,
-      except as required for reasonable and customary use in describing the
-      origin of the Work and reproducing the content of the NOTICE file.
-
-   7. Disclaimer of Warranty. Unless required by applicable law or
-      agreed to in writing, Licensor provides the Work (and each
433
- Contributor provides its Contributions) on an "AS IS" BASIS,
434
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
435
- implied, including, without limitation, any warranties or conditions
436
- of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
437
- PARTICULAR PURPOSE. You are solely responsible for determining the
438
- appropriateness of using or redistributing the Work and assume any
439
- risks associated with Your exercise of permissions under this License.
440
-
441
- 8. Limitation of Liability. In no event and under no legal theory,
442
- whether in tort (including negligence), contract, or otherwise,
443
- unless required by applicable law (such as deliberate and grossly
444
- negligent acts) or agreed to in writing, shall any Contributor be
445
- liable to You for damages, including any direct, indirect, special,
446
- incidental, or consequential damages of any character arising as a
447
- result of this License or out of the use or inability to use the
448
- Work (including but not limited to damages for loss of goodwill,
449
- work stoppage, computer failure or malfunction, or any and all
450
- other commercial damages or losses), even if such Contributor
451
- has been advised of the possibility of such damages.
452
-
453
- 9. Accepting Warranty or Additional Liability. While redistributing
454
- the Work or Derivative Works thereof, You may choose to offer,
455
- and charge a fee for, acceptance of support, warranty, indemnity,
456
- or other liability obligations and/or rights consistent with this
457
- License. However, in accepting such obligations, You may act only
458
- on Your own behalf and on Your sole responsibility, not on behalf
459
- of any other Contributor, and only if You agree to indemnify,
460
- defend, and hold each Contributor harmless for any liability
461
- incurred by, or claims asserted against, such Contributor by reason
462
- of your accepting any such warranty or additional liability.
463
-
464
- END OF TERMS AND CONDITIONS
465
-
466
- ============================================================================
467
-
468
- Copyright 2016-2019 Intel Corporation
469
- Copyright 2018 YANDEX LLC
470
-
471
- Licensed under the Apache License, Version 2.0 (the "License");
472
- you may not use this file except in compliance with the License.
473
- You may obtain a copy of the License at
474
-
475
- http://www.apache.org/licenses/LICENSE-2.0
476
-
477
- Unless required by applicable law or agreed to in writing, software
478
- distributed under the License is distributed on an "AS IS" BASIS,
479
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
480
- See the License for the specific language governing permissions and
481
- limitations under the License.
482
-
483
- This distribution includes third party software ("third party programs").
484
- This third party software, even if included with the distribution of
485
- the Intel software, may be governed by separate license terms, including
486
- without limitation, third party license terms, other Intel software license
487
- terms, and open source software license terms. These separate license terms
488
- govern your use of the third party programs as set forth in the
489
- "THIRD-PARTY-PROGRAMS" file.
490
+in this Software without prior written authorization of the copyright holder.
491
\ No newline at end of file
492
onnxruntime-linux-x64-gpu-1.17.1.tgz/VERSION_NUMBER -> onnxruntime-linux-x64-gpu-1.15.1.tgz/VERSION_NUMBER
Changed
-1.17.1
+1.15.1
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/onnxruntime_c_api.h -> onnxruntime-linux-x64-gpu-1.15.1.tgz/include/onnxruntime_c_api.h
Changed
/** \mainpage ONNX Runtime
*
- * ONNX Runtime is a high-performance inference and training graph execution engine for deep learning models.
+ * ONNX Runtime is a high-performance inference and training graph execution engine for deeplearning models.
*
* ONNX Runtime's C, C++ APIs offer an easy to use interface to onboard and execute onnx models.
* - \subpage c_cpp_api "Core C, C++ APIs"
- * - \subpage training_c_cpp_api "Training C, C++ APIs for on-device training"
+ * - \subpage training_c_cpp_api "Training C, C++ APIs for learning on the edge"
*
* \page c_cpp_api Core C, C++ APIs
* <h1>C</h1>

*/

#pragma once
-#include <stdbool.h>
-#include <stdint.h>
#include <stdlib.h>
+#include <stdint.h>
#include <string.h>

/** \brief The API version defined in this header
*
* This value is used by some API functions to behave as this version of the header expects.
*/
-#define ORT_API_VERSION 17
+#define ORT_API_VERSION 15

#ifdef __cplusplus
extern "C" {

#define _Check_return_
#define _Outptr_result_maybenull_
#define _In_reads_(X)
-#define _Inout_updates_(X)
-#define _Out_writes_(X)
#define _Inout_updates_all_(X)
#define _Out_writes_bytes_all_(X)
#define _Out_writes_all_(X)

ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT64, // maps to c type uint64_t
ONNX_TENSOR_ELEMENT_DATA_TYPE_COMPLEX64, // complex with float32 real and imaginary components
ONNX_TENSOR_ELEMENT_DATA_TYPE_COMPLEX128, // complex with float64 real and imaginary components
- ONNX_TENSOR_ELEMENT_DATA_TYPE_BFLOAT16, // Non-IEEE floating-point format based on IEEE754 single-precision
- // float 8 types were introduced in onnx 1.14, see https://onnx.ai/onnx/technical/float8.html
- ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT8E4M3FN, // Non-IEEE floating-point format based on IEEE754 single-precision
- ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT8E4M3FNUZ, // Non-IEEE floating-point format based on IEEE754 single-precision
- ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT8E5M2, // Non-IEEE floating-point format based on IEEE754 single-precision
- ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT8E5M2FNUZ // Non-IEEE floating-point format based on IEEE754 single-precision
+ ONNX_TENSOR_ELEMENT_DATA_TYPE_BFLOAT16 // Non-IEEE floating-point format based on IEEE754 single-precision
} ONNXTensorElementDataType;

// Synced with onnx TypeProto oneof

ORT_RUNTIME_CLASS(Op);
ORT_RUNTIME_CLASS(OpAttr);
ORT_RUNTIME_CLASS(Logger);
-ORT_RUNTIME_CLASS(ShapeInferContext);

#ifdef _WIN32
typedef _Return_type_success_(return == 0) OrtStatus* OrtStatusPtr;

user_compute_stream{},
default_memory_arena_cfg{},
tunable_op_enable{false},
- tunable_op_tuning_enable{false},
- tunable_op_max_tuning_duration_ms{} {}
+ tunable_op_tuning_enable{false} {}
#endif

/** \brief CUDA device Id

*/
int tunable_op_tuning_enable;

- /** \brief Max tuning duration time limit for each instance of TunableOp.
- * Defaults to 0 to disable the limit.
- */
- int tunable_op_max_tuning_duration_ms;
-
} OrtCUDAProviderOptions;

/** \brief ROCM Provider Options

user_compute_stream{},
default_memory_arena_cfg{},
tunable_op_enable{false},
- tunable_op_tuning_enable{false},
- tunable_op_max_tuning_duration_ms{} {}
+ tunable_op_tuning_enable{false} {}
#endif

/** \brief ROCM device Id

*/
int tunable_op_tuning_enable;

- /** \brief Max tuning duration time limit for each instance of TunableOp.
- * Defaults to 0 to disable the limit.
- */
- int tunable_op_max_tuning_duration_ms;
-
} OrtROCMProviderOptions;

/** \brief TensorRT Provider Options

* \see OrtApi::SessionOptionsAppendExecutionProvider_MIGraphX
*/
typedef struct OrtMIGraphXProviderOptions {
- int device_id; // hip device id.
- int migraphx_fp16_enable; // MIGraphX FP16 precision. Default 0 = false, nonzero = true
- int migraphx_int8_enable; // MIGraphX INT8 precision. Default 0 = false, nonzero = true
- int migraphx_use_native_calibration_table; // MIGraphx INT8 cal table. Default 0 = false, noznero = true
- const char* migraphx_int8_calibration_table_name; // MIGraphx INT8 calibration table name
+ int device_id; // hip device id.
+ int migraphx_fp16_enable; // enable MIGraphX FP16 precision. Default 0 = false, nonzero = true
+ int migraphx_int8_enable; // enable MIGraphX INT8 precision. Default 0 = false, nonzero = true
} OrtMIGraphXProviderOptions;

/** \brief OpenVINO Provider Options

typedef struct OrtOpenVINOProviderOptions {
#ifdef __cplusplus
OrtOpenVINOProviderOptions() : device_type{},
- enable_npu_fast_compile{},
+ enable_vpu_fast_compile{},
device_id{},
num_of_threads{},
cache_dir{},

* Valid settings are one of: "CPU_FP32", "CPU_FP16", "GPU_FP32", "GPU_FP16"
*/
const char* device_type;
- unsigned char enable_npu_fast_compile; ///< 0 = disabled, nonzero = enabled
+ unsigned char enable_vpu_fast_compile; ///< 0 = disabled, nonzero = enabled
const char* device_id;
size_t num_of_threads; ///< 0 = Use default number of threads
const char* cache_dir; // path is set to empty by default

typedef OrtStatus*(ORT_API_CALL* RegisterCustomOpsFn)(OrtSessionOptions* options, const OrtApiBase* api);

-/** \brief Callback function for RunAsync
- *
- * \param[in] user_data User specific data that passed back to the callback
- * \param[out] outputs On succeed, outputs host inference results, on error, the value will be nullptr
- * \param[out] num_outputs Number of outputs, on error, the value will be zero
- * \param[out] status On error, status will provide details
- */
-typedef void (*RunAsyncCallbackFn)(void* user_data, OrtValue** outputs, size_t num_outputs, OrtStatusPtr status);
-
/** \brief The C API
*
* All C API functions are defined inside this structure as pointers to functions.

/** \brief Create an OrtEnv
*
- * \note Invoking this function will return the same instance of the environment as that returned by a previous call
- * to another env creation function; all arguments to this function will be ignored.
* \param[in] log_severity_level The log severity level.
* \param[in] logid The log identifier.
* \param[out] out Returned newly created OrtEnv. Must be freed with OrtApi::ReleaseEnv

/** \brief Create an OrtEnv
*
- * \note Invoking this function will return the same instance of the environment as that returned by a previous call
- * to another env creation function; all arguments to this function will be ignored. If you want to provide your
- * own logging function, consider setting it using the SetUserLoggingFunction API instead.
* \param[in] logging_function A pointer to a logging function.
* \param[in] logger_param A pointer to arbitrary data passed as the ::OrtLoggingFunction `param` parameter to
- * `logging_function`. This parameter is optional.
+ * `logging_function`.
* \param[in] log_severity_level The log severity level.
* \param[in] logid The log identifier.
* \param[out] out Returned newly created OrtEnv. Must be freed with OrtApi::ReleaseEnv
*
* \snippet{doc} snippets.dox OrtStatus Return Value
*/
- ORT_API2_STATUS(CreateEnvWithCustomLogger, _In_ OrtLoggingFunction logging_function, _In_opt_ void* logger_param,
- _In_ OrtLoggingLevel log_severity_level, _In_ const char* logid, _Outptr_ OrtEnv** out);
+ ORT_API2_STATUS(CreateEnvWithCustomLogger, OrtLoggingFunction logging_function, _In_opt_ void* logger_param,
+ OrtLoggingLevel log_severity_level, _In_ const char* logid, _Outptr_ OrtEnv** out);

/** \brief Enable Telemetry
*

/** \brief Set the optimization level to apply when loading a graph
*
- * Please see https://onnxruntime.ai/docs/performance/model-optimizations/graph-optimizations.html for an in-depth explanation
+ * Please see https://onnxruntime.ai/docs/performance/graph-optimizations.html for an in-depth explanation
* \param[in,out] options The session options object
* \param[in] graph_optimization_level The optimization level
*

* crossing which the current chunk is chunked into 2.
* "initial_growth_chunk_size_bytes": (Possible) Size of the second allocation in the arena.
* Only relevant if arena strategy is `kNextPowerOfTwo`. Use -1 to allow ORT to choose the default.
- * "max_power_of_two_extend_bytes": The maximum enxtend size if arena strategy is `kNextPowerOfTwo`.
- * It is not an allocation limit, it is only a limit for extention when requested byte is less than the limit.
- * When requested bytes is more than the limit, allocator will still return as requested.
- * Use -1 to allow ORT to choose the default 1GB for max_power_of_two_extend_bytes.
* Ultimately, the allocation size is determined by the allocation memory request.
* Further allocation sizes are governed by the arena extend strategy.
*

*
* QNN supported keys:
* "backend_path": file path to QNN backend library.
- * "profiling_level": QNN profiling level, options: "off", "basic", "detailed". Default to off.
+ * "profiling_level": QNN profiling level, options: "basic", "detailed".
* "rpc_control_latency": QNN RPC control latency.
- * "vtcm_mb": QNN VTCM size in MB. default to 0(not set).
- * "htp_performance_mode": QNN performance mode, options: "burst", "balanced", "default", "high_performance",
- * "high_power_saver", "low_balanced", "extreme_power_saver", "low_power_saver", "power_saver", "sustained_high_performance". Default to "default".
- * "qnn_saver_path": File path to the QNN Saver backend library. If specified, QNN Saver will be enabled and will
- * dump QNN API calls to disk for replay/debugging. QNN Saver produces incorrect model inference results and
- * may alter model/EP partitioning. Use only for debugging.
- * "qnn_context_priority": QNN context priority, options: "low", "normal", "normal_high", "high". Default to "normal".
- * "htp_graph_finalization_optimization_mode": Set the optimization mode for graph finalization on the HTP backend. Available options:
- * - "0": Default.
- * - "1": Faster preparation time, less optimal graph.
- * - "2": Longer preparation time, more optimal graph.
- * - "3": Longest preparation time, most likely even more optimal graph. See QNN SDK documentation for specific details.
- * "soc_model": The SoC model number. Refer to the QNN SDK documentation for valid values. Defaults to "0" (unknown).
- * "htp_arch": The minimum HTP architecture the driver will use to select compatible QNN operators. Available options:
- * - "0": Default (none).
- * - "68"
- * - "69"
- * - "73"
- * - "75"
- * "device_id": The ID of the device to use when setting 'htp_arch'. Defaults to "0" (for single device).
*
* SNPE supported keys:
* "runtime": SNPE runtime engine, options: "CPU", "CPU_FLOAT32", "GPU", "GPU_FLOAT32_16_HYBRID", "GPU_FLOAT16",

* "buffer_type": ITensor or user buffers, options: "ITENSOR", user buffer with different types - "TF8", "TF16", "UINT8", "FLOAT".
* "ITENSOR" -- default, ITensor which is float only.
* "TF8" -- quantized model required, "FLOAT" -- for both quantized or non-quantized model
- * "enable_init_cache": enable SNPE init caching feature, set to 1 to enabled it. Disabled by default.
* If SNPE is not available (due to a non Snpe enabled build or its dependencies not being installed), this function will fail.
*
* XNNPACK supported keys:

*/
ORT_API2_STATUS(GetResizedStringTensorElementBuffer, _Inout_ OrtValue* value, _In_ size_t index, _In_ size_t length_in_bytes, _Inout_ char** buffer);

- /** \brief Get Allocator from KernelContext for a specific memoryInfo. Please use C API ReleaseAllocator to release out object
+ /** \brief Get Allocator from KernelContext for a specific memoryInfo.
*
* \param[in] context OrtKernelContext instance
* \param[in] mem_info OrtMemoryInfo instance

* \since Version 1.15.
*/
const char*(ORT_API_CALL* GetBuildInfoString)(void);
-
- /// \name OrtROCMProviderOptions
264
- /// @{
265
-
266
- /** \brief Create an OrtROCMProviderOptions
267
- *
268
- * \paramout out Newly created ::OrtROCMProviderOptions. Must be released with OrtApi::ReleaseROCMProviderOptions
269
- *
270
- * \snippet{doc} snippets.dox OrtStatus Return Value
271
- *
272
- * \since Version 1.16.
273
- */
274
- ORT_API2_STATUS(CreateROCMProviderOptions, _Outptr_ OrtROCMProviderOptions** out);
275
-
276
- /** \brief Set options in a ROCm Execution Provider.
277
- *
278
- * Please refer to https://onnxruntime.ai/docs/execution-providers/ROCm-ExecutionProvider.html
279
- * to know the available keys and values. Key should be in null terminated string format of the member of
280
- * ::OrtROCMProviderOptions and value should be its related range.
281
- *
282
- * For example, key="device_id" and value="0"
283
- *
284
- * \paramin rocm_options
285
- * \paramin provider_options_keys Array of UTF-8 null-terminated string for provider options keys
286
- * \paramin provider_options_values Array of UTF-8 null-terminated string for provider options values
287
- * \paramin num_keys Number of elements in the `provider_option_keys` and `provider_options_values` arrays
288
- *
289
- * \snippet{doc} snippets.dox OrtStatus Return Value
290
- *
291
- * \since Version 1.16.
292
- */
293
- ORT_API2_STATUS(UpdateROCMProviderOptions, _Inout_ OrtROCMProviderOptions* rocm_options,
294
- _In_reads_(num_keys) const char* const* provider_options_keys,
295
- _In_reads_(num_keys) const char* const* provider_options_values,
296
- _In_ size_t num_keys);
297
-
298
- /**
299
- * Get serialized ROCm provider options string.
300
- *
301
- * For example, "device_id=0;arena_extend_strategy=0;......"
302
- *
303
- * \param rocm_options - OrtROCMProviderOptions instance
304
- * \param allocator - a ptr to an instance of OrtAllocator obtained with CreateAllocator() or GetAllocatorWithDefaultOptions()
305
- * the specified allocator will be used to allocate continuous buffers for output strings and lengths.
306
- * \param ptr - is a UTF-8 null terminated string allocated using 'allocator'. The caller is responsible for using the same allocator to free it.
307
- *
308
- * \snippet{doc} snippets.dox OrtStatus Return Value
309
- *
310
- * \since Version 1.16.
311
- */
312
- ORT_API2_STATUS(GetROCMProviderOptionsAsString, _In_ const OrtROCMProviderOptions* rocm_options, _Inout_ OrtAllocator* allocator, _Outptr_ char** ptr);
313
-
314
- /** \brief Release an ::OrtROCMProviderOptions
315
- *
316
- * \note This is an exception in the naming convention of other Release* functions, as the name of the method does not have the V2 suffix, but the type does
317
- *
318
- * \since Version 1.16.
319
- */
320
- void(ORT_API_CALL* ReleaseROCMProviderOptions)(_Frees_ptr_opt_ OrtROCMProviderOptions* input);
321
-
322
- /** \brief Create an allocator with specific type and register it with the ::OrtEnv
323
- * This API enhance CreateAndRegisterAllocator that it can create an allocator with specific type, not just CPU allocator
324
- * Enables sharing the allocator between multiple sessions that use the same env instance.
325
- * Lifetime of the created allocator will be valid for the duration of the environment.
326
- * Returns an error if an allocator with the same ::OrtMemoryInfo is already registered.
327
- * \paramin env OrtEnv instance
328
- * \paramin provider_type ExecutionProvider type
329
- * \paramin mem_info OrtMemoryInfo instance
330
- * \paramin arena_cfg Arena configuration
331
- * \paramin provider_options_keys key of the provider options map
332
- * \paramin provider_options_values value of the provider options map
333
- * \paramin num_keys Length of the provider options map
334
- */
335
- ORT_API2_STATUS(CreateAndRegisterAllocatorV2, _Inout_ OrtEnv* env, _In_ const char* provider_type, _In_ const OrtMemoryInfo* mem_info, _In_ const OrtArenaCfg* arena_cfg,
336
- _In_reads_(num_keys) const char* const* provider_options_keys, _In_reads_(num_keys) const char* const* provider_options_values, _In_ size_t num_keys);
337
-
338
- /** \brief Run the model asynchronously in a thread owned by intra op thread pool
339
- *
340
- * \paramin session
341
- * \paramin run_options If nullptr, will use a default ::OrtRunOptions
342
- * \paramin input_names Array of null terminated UTF8 encoded strings of the input names
343
- * \paramin input Array of ::OrtValue%s of the input values
344
- * \paramin input_len Number of elements in the input_names and inputs arrays
345
- * \paramin output_names Array of null terminated UTF8 encoded strings of the output names
346
- * \paramin output_names_len Number of elements in the output_names and outputs array
347
- * \paramout output OrtValue* array of size output_names_len.
348
- * On calling RunAsync, outputi could either be a null or a pointer to a preallocated OrtValue.
349
- * Later, the output array will be passed to run_async_callback with all null(s) filled with valid
350
- * OrtValue pointer(s) allocated by onnxruntime.
351
- * NOTE: it is customer's duty to finally release the output array and each of its member,
352
- * regardless of whether the member (OrtValue*) is allocated by onnxruntime or preallocated by the customer.
353
- * \paramin run_async_callback Callback function on model run completion
354
- * \paramin user_data User data that pass back to run_async_callback
355
- */
356
- ORT_API2_STATUS(RunAsync, _Inout_ OrtSession* session, _In_opt_ const OrtRunOptions* run_options,
357
- _In_reads_(input_len) const char* const* input_names,
358
- _In_reads_(input_len) const OrtValue* const* input, size_t input_len,
359
- _In_reads_(output_names_len) const char* const* output_names, size_t output_names_len,
360
- _Inout_updates_all_(output_names_len) OrtValue** output,
361
- _In_ RunAsyncCallbackFn run_async_callback, _In_opt_ void* user_data);
362
-
363
- /**
364
- * Update TensorRT EP provider option where its data type is pointer, for example 'user_compute_stream'.
365
- * If the data type of the provider option can be represented by string please use UpdateTensorRTProviderOptions.
366
- *
367
- * Note: It's caller's responsibility to properly manage the lifetime of the instance pointed by this pointer.
368
- *
369
- * \param tensorrt_options - OrtTensorRTProviderOptionsV2 instance
370
- * \param key - Name of the provider option
371
- * \param value - A pointer to the instance that will be assigned to this provider option
372
- *
373
- * \since Version 1.16.
374
- */
375
- ORT_API2_STATUS(UpdateTensorRTProviderOptionsWithValue, _Inout_ OrtTensorRTProviderOptionsV2* tensorrt_options, _In_ const char* key, _In_ void* value);
376
-
377
- /**
378
- * Get TensorRT EP provider option where its data type is pointer.
379
- * If the data type of the provider option can be represented by string please use GetTensorRTProviderOptionsAsString.
380
- *
381
- * \param tensorrt_options - OrtTensorRTProviderOptionsV2 instance
382
- * \param key - Name of the provider option
383
- * \param ptr - A pointer to the instance that is kept by the provider option
384
- *
385
- * \since Version 1.16.
386
- */
387
- ORT_API2_STATUS(GetTensorRTProviderOptionsByName, _In_ const OrtTensorRTProviderOptionsV2* tensorrt_options, _In_ const char* key, _Outptr_ void** ptr);
388
-
389
- /**
390
- * Update CUDA EP provider option where its data type is pointer, for example 'user_compute_stream'.
391
- * If the data type of the provider option can be represented by string please use UpdateCUDAProviderOptions.
392
- *
393
- * Note: It's caller's responsibility to properly manage the lifetime of the instance pointed by this pointer.
394
- *
395
- * \param cuda_options - OrtCUDAProviderOptionsV2 instance
396
- * \param key - Name of the provider option
397
- * \param value - A pointer to the instance that will be assigned to this provider option
398
- *
399
- * \since Version 1.16.
400
- */
401
- ORT_API2_STATUS(UpdateCUDAProviderOptionsWithValue, _Inout_ OrtCUDAProviderOptionsV2* cuda_options, _In_ const char* key, _In_ void* value);
402
-
403
- /**
404
- * Get CUDA EP provider option where its data type is pointer.
405
- * If the data type of the provider option can be represented by string please use GetCUDAProviderOptionsAsString.
406
- *
407
- * \param cuda_options - OrtCUDAProviderOptionsV2 instance
408
- * \param key - Name of the provider option
409
- * \param ptr - A pointer to the instance that is kept by the provider option
410
- *
411
- * \since Version 1.16.
412
- */
413
- ORT_API2_STATUS(GetCUDAProviderOptionsByName, _In_ const OrtCUDAProviderOptionsV2* cuda_options, _In_ const char* key, _Outptr_ void** ptr);
414
-
415
- /**
416
- * Get a EP resource.
417
- * E.g. a cuda stream or a cublas handle
418
- *
419
- * \param context - Kernel context
420
- * \param resouce_version - Version of the resource
421
- * \param resource_id - Type of resource
422
- * \param resource - A pointer to returned resource
423
- *
424
- * \since Version 1.16.
425
- */
426
- ORT_API2_STATUS(KernelContext_GetResource, _In_ const OrtKernelContext* context, _In_ int resouce_version, _In_ int resource_id, _Outptr_ void** resource);
427
-
428
- /** \brief Set user logging function
429
- *
430
- * By default the logger created by the CreateEnv* functions is used to create the session logger as well.
431
- * This function allows a user to override this default session logger with a logger of their own choosing. This way
432
- * the user doesn't have to create a separate environment with a custom logger. This addresses the problem when
433
- * the user already created an env but now wants to use a different logger for a specific session (for debugging or
434
- * other reasons).
435
- *
436
- * \paramin options
437
- * \paramin user_logging_function A pointer to a logging function.
438
- * \paramin user_logging_param A pointer to arbitrary data passed as the ::OrtLoggingFunction `param` parameter to
439
- * `user_logging_function`. This parameter is optional.
440
- *
441
- * \snippet{doc} snippets.dox OrtStatus Return Value
442
- *
443
- * \since Version 1.17.
444
- */
445
- ORT_API2_STATUS(SetUserLoggingFunction, _Inout_ OrtSessionOptions* options,
446
- _In_ OrtLoggingFunction user_logging_function, _In_opt_ void* user_logging_param);
447
-
448
- /**
449
- * Get number of input from OrtShapeInferContext
450
- *
451
- * \paramin context
452
- * \paramout out The number of inputs
453
- *
454
- * \since Version 1.17.
455
- */
456
- ORT_API2_STATUS(ShapeInferContext_GetInputCount, _In_ const OrtShapeInferContext* context, _Out_ size_t* out);
457
-
458
- /**
459
- * Get type and shape info of an input
460
- *
461
- * \paramin context
462
- * \paramin index The index of the input
463
- * \paramout info Type shape info of the input
464
- *
465
- * \since Version 1.17.
466
- */
467
- ORT_API2_STATUS(ShapeInferContext_GetInputTypeShape, _In_ const OrtShapeInferContext* context, _In_ size_t index, _Outptr_ OrtTensorTypeAndShapeInfo** info);
468
-
469
- /**
470
- * Get attribute from OrtShapeInferContext. Note that OrtShapeInferContext is a per-node context, one could only read attribute from current node.
471
- *
472
- * \paramin context
473
- * \paramin attr_name Name of the attribute
474
- * \paramout attr Handle of the attribute fetched
475
- *
476
- * \since Version 1.17.
477
- */
478
- ORT_API2_STATUS(ShapeInferContext_GetAttribute, _In_ const OrtShapeInferContext* context, _In_ const char* attr_name, _Outptr_ const OrtOpAttr** attr);
479
-
480
- /**
481
- * Set type and shape info of an ouput
482
- *
483
- * \paramin context
484
- * \paramin index The index of the ouput
485
- * \paramout info Type shape info of the output
486
- *
487
- * \since Version 1.17.
488
- */
489
- ORT_API2_STATUS(ShapeInferContext_SetOutputTypeShape, _In_ const OrtShapeInferContext* context, _In_ size_t index, _In_ const OrtTensorTypeAndShapeInfo* info);
490
-
491
- /**
492
- * Set symbolic shape to type shape info
493
- *
- * \param[in] info Type shape info
- * \param[in] dim_params Symbolic strings
- * \param[in] dim_params_length Number of strings
- *
- * \since Version 1.17.
- */
- ORT_API2_STATUS(SetSymbolicDimensions, _In_ OrtTensorTypeAndShapeInfo* info, _In_ const char* dim_params[], _In_ size_t dim_params_length);
-
- /**
- * Read contents of an attribute to data
- *
- * \param[in] op_attr
- * \param[in] type Attribute type
- * \param[out] data Memory address to save raw content of the attribute
- * \param[in] len Number of bytes allowed to store in data
- * \param[out] out Number of bytes required to save the data when the call failed, or the real number of bytes saved to data on success
- *
- * \since Version 1.17.
- */
- ORT_API2_STATUS(ReadOpAttr, _In_ const OrtOpAttr* op_attr, _In_ OrtOpAttrType type, _Inout_ void* data, _In_ size_t len, _Out_ size_t* out);
-
- /** \brief Set whether to use deterministic compute.
- *
- * Default is false. If set to true, this will enable deterministic compute for GPU kernels where possible.
- * Note that this most likely will have a performance cost.
- *
- * \param[in] options
- * \param[in] value
- *
- * \since Version 1.17.
- */
- ORT_API2_STATUS(SetDeterministicCompute, _Inout_ OrtSessionOptions* options, bool value);
-
- /**
- * Run fn in parallel
- *
- * \param[in] context
- * \param[in] fn Function accepting usr_data and an integer as iterator
- * \param[in] total The number of times fn is to be invoked
- * \param[in] num_batch Number of batches by which the "total" is to be divided in maximum. When zero, there is no limit
- * \param[in] usr_data User data to be passed back to fn
- *
- * \since Version 1.17.
- */
- ORT_API2_STATUS(KernelContext_ParallelFor, _In_ const OrtKernelContext* context, _In_ void (*fn)(void*, size_t), _In_ size_t total, _In_ size_t num_batch, _In_ void* usr_data);
-
- /** \brief Append OpenVINO execution provider to the session options
- *
- * If OpenVINO is not available (due to a non OpenVINO enabled build, or if OpenVINO is not installed on the system), this function will fail.
- *
- * \param[in] options
- * \param[in] provider_options_keys
- * \param[in] provider_options_values
- * \param[in] num_keys
- *
- * \snippet{doc} snippets.dox OrtStatus Return Value
- */
- ORT_API2_STATUS(SessionOptionsAppendExecutionProvider_OpenVINO_V2,
- _In_ OrtSessionOptions* options,
- _In_reads_(num_keys) const char* const* provider_options_keys,
- _In_reads_(num_keys) const char* const* provider_options_values,
- _In_ size_t num_keys);
};

/*

struct OrtCustomOp {
uint32_t version; // Must be initialized to ORT_API_VERSION

- // This callback creates the kernel, which is a user defined
- // parameter that is passed to the Kernel* callbacks below. It is
- // recommended to use CreateKernelV2 which allows for a safe error
- // propagation by returning an OrtStatusPtr.
+ // This callback creates the kernel, which is a user defined parameter that is passed to the Kernel* callbacks below.
void*(ORT_API_CALL* CreateKernel)(_In_ const struct OrtCustomOp* op, _In_ const OrtApi* api,
_In_ const OrtKernelInfo* info);

ONNXTensorElementDataType(ORT_API_CALL* GetOutputType)(_In_ const struct OrtCustomOp* op, _In_ size_t index);
size_t(ORT_API_CALL* GetOutputTypeCount)(_In_ const struct OrtCustomOp* op);

- // Perform a computation step. It is recommended to use
- // KernelComputeV2 which allows for a safe error propagation by
- // returning an OrtStatusPtr.
+ // Op kernel callbacks
void(ORT_API_CALL* KernelCompute)(_In_ void* op_kernel, _In_ OrtKernelContext* context);
void(ORT_API_CALL* KernelDestroy)(_In_ void* op_kernel);

// and false (zero) otherwise.
// Applicable only for custom ops that have a variadic output.
int(ORT_API_CALL* GetVariadicOutputHomogeneity)(_In_ const struct OrtCustomOp* op);
-
- // Create the kernel state which is passed to each compute call.
- OrtStatusPtr(ORT_API_CALL* CreateKernelV2)(_In_ const struct OrtCustomOp* op, _In_ const OrtApi* api,
- _In_ const OrtKernelInfo* info,
- _Out_ void** kernel);
-
- // Perform the computation step.
- OrtStatusPtr(ORT_API_CALL* KernelComputeV2)(_In_ void* op_kernel, _In_ OrtKernelContext* context);
-
- OrtStatusPtr(ORT_API_CALL* InferOutputShapeFn)(_In_ const struct OrtCustomOp* op, _In_ OrtShapeInferContext*);
-
- // Get start range
- int(ORT_API_CALL* GetStartVersion)(_In_ const struct OrtCustomOp* op);
- int(ORT_API_CALL* GetEndVersion)(_In_ const struct OrtCustomOp* op);
};

/*

ORT_API_STATUS(OrtSessionOptionsAppendExecutionProvider_CUDA, _In_ OrtSessionOptions* options, int device_id);

/*
- * This is the old way to add the ROCm provider to the session, please use
- * SessionOptionsAppendExecutionProvider_ROCM above to access the latest functionality
- * This function always exists, but will only succeed if Onnxruntime was built with
- * HIP support and the ROCm provider shared library exists
- *
- * \param device_id HIP device id, starts from zero.
- */
-ORT_API_STATUS(OrtSessionOptionsAppendExecutionProvider_ROCM, _In_ OrtSessionOptions* options, int device_id);
-
-/*
 * This is the old way to add the MIGraphX provider to the session, please use
 * SessionOptionsAppendExecutionProvider_MIGraphX above to access the latest functionality
 * This function always exists, but will only succeed if Onnxruntime was built with

 */
ORT_API_STATUS(OrtSessionOptionsAppendExecutionProvider_Dnnl, _In_ OrtSessionOptions* options, int use_arena);

-/*
- * This is the old way to add the TensorRT provider to the session, please use SessionOptionsAppendExecutionProvider_TensorRT_V2 above to access the latest functionality
- * This function always exists, but will only succeed if Onnxruntime was built with TensorRT support and the TensorRT provider shared library exists
- *
- * \param device_id CUDA device id, starts from zero.
- */
-ORT_API_STATUS(OrtSessionOptionsAppendExecutionProvider_Tensorrt, _In_ OrtSessionOptions* options, int device_id);
-
#ifdef __cplusplus
}
#endif
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/onnxruntime_cxx_api.h -> onnxruntime-linux-x64-gpu-1.15.1.tgz/include/onnxruntime_cxx_api.h
Changed
#pragma once
#include "onnxruntime_c_api.h"
-#include "onnxruntime_float16.h"
-
#include <cstddef>
#include <cstdio>
#include <array>

std::vector<std::string> GetAvailableProviders();

/** \brief IEEE 754 half-precision floating point data type
- *
- * \details This struct is used for converting float to float16 and back
- * so the user could feed inputs and fetch outputs using this type.
- *
+ * \details It is necessary for type dispatching to make use of C++ API
+ * The type is implicitly convertible to/from uint16_t.
 * The size of the structure should align with uint16_t and one can freely cast
 * uint16_t buffers to/from Ort::Float16_t to feed and retrieve data.
 *
- * \code{.unparsed}
- * // This example demonstrates conversion from float to float16
- * constexpr float values[] = {1.f, 2.f, 3.f, 4.f, 5.f};
- * std::vector<Ort::Float16_t> fp16_values;
- * fp16_values.reserve(std::size(values));
- * std::transform(std::begin(values), std::end(values), std::back_inserter(fp16_values),
- * [](float value) { return Ort::Float16_t(value); });
+ * Generally, you can feed any of your types as float16/bfloat16 data to create a tensor
+ * on top of it, providing it can form a continuous buffer with 16-bit elements with no padding.
+ * And you can also feed an array of uint16_t elements directly. For example,
 *
+ * \code{.unparsed}
+ * uint16_t values[] = { 15360, 16384, 16896, 17408, 17664};
+ * constexpr size_t values_length = sizeof(values) / sizeof(values[0]);
+ * std::vector<int64_t> dims = {values_length}; // one dimensional example
+ * Ort::MemoryInfo info("Cpu", OrtDeviceAllocator, 0, OrtMemTypeDefault);
+ * // Note we are passing bytes count in this api, not number of elements -> sizeof(values)
+ * auto float16_tensor = Ort::Value::CreateTensor(info, values, sizeof(values),
+ * dims.data(), dims.size(), ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16);
 * \endcode
- */
-struct Float16_t : onnxruntime_float16::Float16Impl<Float16_t> {
- private:
- /// <summary>
- /// Constructor from a 16-bit representation of a float16 value
- /// No conversion is done here.
- /// </summary>
- /// <param name="v">16-bit representation</param>
- constexpr explicit Float16_t(uint16_t v) noexcept { val = v; }
-
- public:
- using Base = onnxruntime_float16::Float16Impl<Float16_t>;
-
- /// <summary>
- /// Default constructor
- /// </summary>
- Float16_t() = default;
-
- /// <summary>
- /// Explicit conversion to uint16_t representation of float16.
- /// </summary>
- /// <param name="v">uint16_t bit representation of float16</param>
- /// <returns>new instance of Float16_t</returns>
- constexpr static Float16_t FromBits(uint16_t v) noexcept { return Float16_t(v); }
-
- /// <summary>
- /// __ctor from float. Float is converted into float16 16-bit representation.
- /// </summary>
- /// <param name="v">float value</param>
- explicit Float16_t(float v) noexcept { val = Base::ToUint16Impl(v); }
-
- /// <summary>
- /// Converts float16 to float
- /// </summary>
- /// <returns>float representation of float16 value</returns>
- float ToFloat() const noexcept { return Base::ToFloatImpl(); }
-
- /// <summary>
- /// Checks if the value is negative
- /// </summary>
- /// <returns>true if negative</returns>
- using Base::IsNegative;
-
- /// <summary>
- /// Tests if the value is NaN
- /// </summary>
- /// <returns>true if NaN</returns>
- using Base::IsNaN;
-
- /// <summary>
- /// Tests if the value is finite
- /// </summary>
- /// <returns>true if finite</returns>
- using Base::IsFinite;
-
- /// <summary>
- /// Tests if the value represents positive infinity.
- /// </summary>
- /// <returns>true if positive infinity</returns>
- using Base::IsPositiveInfinity;
-
- /// <summary>
- /// Tests if the value represents negative infinity
- /// </summary>
- /// <returns>true if negative infinity</returns>
- using Base::IsNegativeInfinity;
-
- /// <summary>
- /// Tests if the value is either positive or negative infinity.
- /// </summary>
- /// <returns>True if absolute value is infinity</returns>
- using Base::IsInfinity;
-
- /// <summary>
- /// Tests if the value is NaN or zero. Useful for comparisons.
- /// </summary>
- /// <returns>True if NaN or zero.</returns>
- using Base::IsNaNOrZero;
-
- /// <summary>
- /// Tests if the value is normal (not zero, subnormal, infinite, or NaN).
- /// </summary>
- /// <returns>True if so</returns>
- using Base::IsNormal;
-
- /// <summary>
- /// Tests if the value is subnormal (denormal).
- /// </summary>
- /// <returns>True if so</returns>
- using Base::IsSubnormal;
-
- /// <summary>
- /// Creates an instance that represents absolute value.
- /// </summary>
- /// <returns>Absolute value</returns>
- using Base::Abs;
-
- /// <summary>
- /// Creates a new instance with the sign flipped.
- /// </summary>
- /// <returns>Flipped sign instance</returns>
- using Base::Negate;
-
- /// <summary>
- /// IEEE defines that positive and negative zero are equal, this gives us a quick equality check
- /// for two values by or'ing the private bits together and stripping the sign. They are both zero,
- /// and therefore equivalent, if the resulting value is still zero.
- /// </summary>
- /// <param name="lhs">first value</param>
- /// <param name="rhs">second value</param>
- /// <returns>True if both arguments represent zero</returns>
- using Base::AreZero;
-
- /// <summary>
- /// User defined conversion operator. Converts Float16_t to float.
- /// </summary>
- explicit operator float() const noexcept { return ToFloat(); }
-
- using Base::operator==;
- using Base::operator!=;
- using Base::operator<;
+ *
+ * Here is another example, a little bit more elaborate. Let's assume that you use your own float16 type and you want to use
+ * a templated version of the API above so the type is automatically set based on your type. You will need to supply an extra
+ * template specialization.
+ *
+ * \code{.unparsed}
+ * namespace yours { struct half {}; } // assume this is your type, define this:
+ * namespace Ort {
+ * template<>
+ * struct TypeToTensorType<yours::half> { static constexpr ONNXTensorElementDataType type = ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16; };
+ * } //namespace Ort
+ *
+ * std::vector<yours::half> values;
+ * std::vector<int64_t> dims = {values.size()}; // one dimensional example
+ * Ort::MemoryInfo info("Cpu", OrtDeviceAllocator, 0, OrtMemTypeDefault);
+ * // Here we are passing element count -> values.size()
+ * auto float16_tensor = Ort::Value::CreateTensor<yours::half>(info, values.data(), values.size(), dims.data(), dims.size());
+ *
+ * \endcode
+ */
+struct Float16_t {
+ uint16_t value;
+ constexpr Float16_t() noexcept : value(0) {}
+ constexpr Float16_t(uint16_t v) noexcept : value(v) {}
+ constexpr operator uint16_t() const noexcept { return value; }
+ constexpr bool operator==(const Float16_t& rhs) const noexcept { return value == rhs.value; };
+ constexpr bool operator!=(const Float16_t& rhs) const noexcept { return value != rhs.value; };
};

static_assert(sizeof(Float16_t) == sizeof(uint16_t), "Sizes must match");

/** \brief bfloat16 (Brain Floating Point) data type
- *
- * \details This struct is used for converting float to bfloat16 and back
- * so the user could feed inputs and fetch outputs using this type.
- *
+ * \details It is necessary for type dispatching to make use of C++ API
+ * The type is implicitly convertible to/from uint16_t.
 * The size of the structure should align with uint16_t and one can freely cast
 * uint16_t buffers to/from Ort::BFloat16_t to feed and retrieve data.
 *
- * \code{.unparsed}
- * // This example demonstrates conversion from float to bfloat16
- * constexpr float values[] = {1.f, 2.f, 3.f, 4.f, 5.f};
- * std::vector<Ort::BFloat16_t> bfp16_values;
- * bfp16_values.reserve(std::size(values));
- * std::transform(std::begin(values), std::end(values), std::back_inserter(bfp16_values),
- * [](float value) { return Ort::BFloat16_t(value); });
- *
- * \endcode
+ * See also code examples for Float16_t above.
 */
-struct BFloat16_t : onnxruntime_float16::BFloat16Impl<BFloat16_t> {
- private:
- /// <summary>
- /// Constructor from a uint16_t representation of bfloat16
- /// used in FromBits() to escape overload resolution issue with
- /// constructor from float.
- /// No conversion is done.
- /// </summary>
- /// <param name="v">16-bit bfloat16 value</param>
- constexpr explicit BFloat16_t(uint16_t v) noexcept { val = v; }
-
- public:
- using Base = onnxruntime_float16::BFloat16Impl<BFloat16_t>;
-
- BFloat16_t() = default;
-
- /// <summary>
- /// Explicit conversion to uint16_t representation of bfloat16.
- /// </summary>
- /// <param name="v">uint16_t bit representation of bfloat16</param>
- /// <returns>new instance of BFloat16_t</returns>
- static constexpr BFloat16_t FromBits(uint16_t v) noexcept { return BFloat16_t(v); }
-
- /// <summary>
- /// __ctor from float. Float is converted into bfloat16 16-bit representation.
- /// </summary>
- /// <param name="v">float value</param>
- explicit BFloat16_t(float v) noexcept { val = Base::ToUint16Impl(v); }
-
- /// <summary>
- /// Converts bfloat16 to float
- /// </summary>
- /// <returns>float representation of bfloat16 value</returns>
- float ToFloat() const noexcept { return Base::ToFloatImpl(); }
-
- /// <summary>
- /// Checks if the value is negative
- /// </summary>
- /// <returns>true if negative</returns>
- using Base::IsNegative;
-
- /// <summary>
- /// Tests if the value is NaN
- /// </summary>
- /// <returns>true if NaN</returns>
- using Base::IsNaN;
-
- /// <summary>
- /// Tests if the value is finite
- /// </summary>
- /// <returns>true if finite</returns>
- using Base::IsFinite;
-
- /// <summary>
- /// Tests if the value represents positive infinity.
- /// </summary>
- /// <returns>true if positive infinity</returns>
- using Base::IsPositiveInfinity;
-
- /// <summary>
- /// Tests if the value represents negative infinity
- /// </summary>
- /// <returns>true if negative infinity</returns>
- using Base::IsNegativeInfinity;
-
- /// <summary>
- /// Tests if the value is either positive or negative infinity.
- /// </summary>
- /// <returns>True if absolute value is infinity</returns>
- using Base::IsInfinity;
-
- /// <summary>
- /// Tests if the value is NaN or zero. Useful for comparisons.
- /// </summary>
- /// <returns>True if NaN or zero.</returns>
- using Base::IsNaNOrZero;
-
- /// <summary>
- /// Tests if the value is normal (not zero, subnormal, infinite, or NaN).
- /// </summary>
- /// <returns>True if so</returns>
- using Base::IsNormal;
-
- /// <summary>
- /// Tests if the value is subnormal (denormal).
- /// </summary>
- /// <returns>True if so</returns>
- using Base::IsSubnormal;
-
- /// <summary>
- /// Creates an instance that represents absolute value.
- /// </summary>
- /// <returns>Absolute value</returns>
- using Base::Abs;
-
- /// <summary>
- /// Creates a new instance with the sign flipped.
- /// </summary>
- /// <returns>Flipped sign instance</returns>
- using Base::Negate;
-
- /// <summary>
- /// IEEE defines that positive and negative zero are equal, this gives us a quick equality check
- /// for two values by or'ing the private bits together and stripping the sign. They are both zero,
- /// and therefore equivalent, if the resulting value is still zero.
- /// </summary>
- /// <param name="lhs">first value</param>
- /// <param name="rhs">second value</param>
- /// <returns>True if both arguments represent zero</returns>
- using Base::AreZero;
-
- /// <summary>
- /// User defined conversion operator. Converts BFloat16_t to float.
- /// </summary>
- explicit operator float() const noexcept { return ToFloat(); }
-
- // We do not have an inherited impl for the below operators
- // as the internal class implements them a little differently
- bool operator==(const BFloat16_t& rhs) const noexcept;
- bool operator!=(const BFloat16_t& rhs) const noexcept { return !(*this == rhs); }
- bool operator<(const BFloat16_t& rhs) const noexcept;
+struct BFloat16_t {
+ uint16_t value;
+ constexpr BFloat16_t() noexcept : value(0) {}
+ constexpr BFloat16_t(uint16_t v) noexcept : value(v) {}
+ constexpr operator uint16_t() const noexcept { return value; }
+ constexpr bool operator==(const BFloat16_t& rhs) const noexcept { return value == rhs.value; };
+ constexpr bool operator!=(const BFloat16_t& rhs) const noexcept { return value != rhs.value; };
};

static_assert(sizeof(BFloat16_t) == sizeof(uint16_t), "Sizes must match");

-/** \brief float8e4m3fn (Float8 Floating Point) data type
- * \details It is necessary for type dispatching to make use of C++ API
- * The type is implicitly convertible to/from uint8_t.
- * See https://onnx.ai/onnx/technical/float8.html for further details.
- */
-struct Float8E4M3FN_t {
- uint8_t value;
- constexpr Float8E4M3FN_t() noexcept : value(0) {}
- constexpr Float8E4M3FN_t(uint8_t v) noexcept : value(v) {}
- constexpr operator uint8_t() const noexcept { return value; }
- // nan values are treated like any other value for operator ==, !=
- constexpr bool operator==(const Float8E4M3FN_t& rhs) const noexcept { return value == rhs.value; };
- constexpr bool operator!=(const Float8E4M3FN_t& rhs) const noexcept { return value != rhs.value; };
-};
-
-static_assert(sizeof(Float8E4M3FN_t) == sizeof(uint8_t), "Sizes must match");
-
-/** \brief float8e4m3fnuz (Float8 Floating Point) data type
- * \details It is necessary for type dispatching to make use of C++ API
- * The type is implicitly convertible to/from uint8_t.
- * See https://onnx.ai/onnx/technical/float8.html for further details.
- */
-struct Float8E4M3FNUZ_t {
- uint8_t value;
- constexpr Float8E4M3FNUZ_t() noexcept : value(0) {}
- constexpr Float8E4M3FNUZ_t(uint8_t v) noexcept : value(v) {}
- constexpr operator uint8_t() const noexcept { return value; }
- // nan values are treated like any other value for operator ==, !=
- constexpr bool operator==(const Float8E4M3FNUZ_t& rhs) const noexcept { return value == rhs.value; };
- constexpr bool operator!=(const Float8E4M3FNUZ_t& rhs) const noexcept { return value != rhs.value; };
-};
-
-static_assert(sizeof(Float8E4M3FNUZ_t) == sizeof(uint8_t), "Sizes must match");
-
-/** \brief float8e5m2 (Float8 Floating Point) data type
- * \details It is necessary for type dispatching to make use of C++ API
- * The type is implicitly convertible to/from uint8_t.
- * See https://onnx.ai/onnx/technical/float8.html for further details.
- */
-struct Float8E5M2_t {
- uint8_t value;
- constexpr Float8E5M2_t() noexcept : value(0) {}
- constexpr Float8E5M2_t(uint8_t v) noexcept : value(v) {}
- constexpr operator uint8_t() const noexcept { return value; }
- // nan values are treated like any other value for operator ==, !=
- constexpr bool operator==(const Float8E5M2_t& rhs) const noexcept { return value == rhs.value; };
- constexpr bool operator!=(const Float8E5M2_t& rhs) const noexcept { return value != rhs.value; };
-};
-
-static_assert(sizeof(Float8E5M2_t) == sizeof(uint8_t), "Sizes must match");
-
-/** \brief float8e5m2fnuz (Float8 Floating Point) data type
- * \details It is necessary for type dispatching to make use of C++ API
- * The type is implicitly convertible to/from uint8_t.
- * See https://onnx.ai/onnx/technical/float8.html for further details.
- */
-struct Float8E5M2FNUZ_t {
- uint8_t value;
- constexpr Float8E5M2FNUZ_t() noexcept : value(0) {}
- constexpr Float8E5M2FNUZ_t(uint8_t v) noexcept : value(v) {}
- constexpr operator uint8_t() const noexcept { return value; }
- // nan values are treated like any other value for operator ==, !=
- constexpr bool operator==(const Float8E5M2FNUZ_t& rhs) const noexcept { return value == rhs.value; };
- constexpr bool operator!=(const Float8E5M2FNUZ_t& rhs) const noexcept { return value != rhs.value; };
-};
-
-static_assert(sizeof(Float8E5M2FNUZ_t) == sizeof(uint8_t), "Sizes must match");
-
namespace detail {
// This is used internally by the C++ API. This macro is to make it easy to generate overloaded methods for all of the various OrtRelease* functions for every Ort* type
// This can't be done in the C API since C doesn't have function overloading.

Env& UpdateEnvWithCustomLogLevel(OrtLoggingLevel log_severity_level); ///< Wraps OrtApi::UpdateEnvWithCustomLogLevel

Env& CreateAndRegisterAllocator(const OrtMemoryInfo* mem_info, const OrtArenaCfg* arena_cfg); ///< Wraps OrtApi::CreateAndRegisterAllocator
-
- Env& CreateAndRegisterAllocatorV2(const std::string& provider_type, const OrtMemoryInfo* mem_info, const std::unordered_map<std::string, std::string>& options, const OrtArenaCfg* arena_cfg); ///< Wraps OrtApi::CreateAndRegisterAllocatorV2
};

/** \brief Custom Op Domain

SessionOptionsImpl& SetIntraOpNumThreads(int intra_op_num_threads); ///< Wraps OrtApi::SetIntraOpNumThreads
SessionOptionsImpl& SetInterOpNumThreads(int inter_op_num_threads); ///< Wraps OrtApi::SetInterOpNumThreads
SessionOptionsImpl& SetGraphOptimizationLevel(GraphOptimizationLevel graph_optimization_level); ///< Wraps OrtApi::SetSessionGraphOptimizationLevel
- SessionOptionsImpl& SetDeterministicCompute(bool value); ///< Wraps OrtApi::SetDeterministicCompute

SessionOptionsImpl& EnableCpuMemArena(); ///< Wraps OrtApi::EnableCpuMemArena
SessionOptionsImpl& DisableCpuMemArena(); ///< Wraps OrtApi::DisableCpuMemArena

SessionOptionsImpl& AddInitializer(const char* name, const OrtValue* ort_val); ///< Wraps OrtApi::AddInitializer
SessionOptionsImpl& AddExternalInitializers(const std::vector<std::string>& names, const std::vector<Value>& ort_values); ///< Wraps OrtApi::AddExternalInitializers

- SessionOptionsImpl& AppendExecutionProvider_CUDA(const OrtCUDAProviderOptions& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_CUDA
- SessionOptionsImpl& AppendExecutionProvider_CUDA_V2(const OrtCUDAProviderOptionsV2& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_CUDA_V2
- SessionOptionsImpl& AppendExecutionProvider_ROCM(const OrtROCMProviderOptions& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_ROCM
- SessionOptionsImpl& AppendExecutionProvider_OpenVINO(const OrtOpenVINOProviderOptions& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_OpenVINO
- ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_OpenVINO_V2
- SessionOptionsImpl& AppendExecutionProvider_OpenVINO_V2(const std::unordered_map<std::string, std::string>& provider_options = {});
+ SessionOptionsImpl& AppendExecutionProvider_CUDA(const OrtCUDAProviderOptions& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_CUDA
+ SessionOptionsImpl& AppendExecutionProvider_CUDA_V2(const OrtCUDAProviderOptionsV2& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_CUDA_V2
+ SessionOptionsImpl& AppendExecutionProvider_ROCM(const OrtROCMProviderOptions& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_ROCM
+ SessionOptionsImpl& AppendExecutionProvider_OpenVINO(const OrtOpenVINOProviderOptions& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_OpenVINO
SessionOptionsImpl& AppendExecutionProvider_TensorRT(const OrtTensorRTProviderOptions& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_TensorRT
SessionOptionsImpl& AppendExecutionProvider_TensorRT_V2(const OrtTensorRTProviderOptionsV2& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_TensorRT
SessionOptionsImpl& AppendExecutionProvider_MIGraphX(const OrtMIGraphXProviderOptions& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_MIGraphX

void Run(const RunOptions& run_options, const IoBinding&); ///< Wraps OrtApi::RunWithBinding

- /** \brief Run the model asynchronously in a thread owned by intra op thread pool
- *
- * Wraps OrtApi::RunAsync
- *
- * \param[in] run_options
- * \param[in] input_names Array of null terminated UTF8 encoded strings of the input names
- * \param[in] input_values Array of Value objects of length input_count
- * \param[in] input_count Number of elements in the input_names and inputs arrays
- * \param[in] output_names Array of null terminated UTF8 encoded strings of the output names
- * \param[out] output_values Array of provided Values to be filled with outputs.
- * On calling RunAsync, output_values[i] could either be initialized by a null pointer or a preallocated OrtValue*.
- * Later, on invoking the callback, each output_values[i] of null will be filled with an OrtValue* allocated by onnxruntime.
- * Then, an OrtValue** pointer will be casted from output_values, and passed to the callback.
- * NOTE: it is the customer's duty to finally release output_values and each of its members,
- * regardless of whether the member (Ort::Value) is allocated by onnxruntime or preallocated by the customer.
- * \param[in] output_count Number of elements in the output_names and outputs array
- * \param[in] callback Callback function on model run completion
- * \param[in] user_data User data passed back to the callback
- */
- void RunAsync(const RunOptions& run_options, const char* const* input_names, const Value* input_values, size_t input_count,
- const char* const* output_names, Value* output_values, size_t output_count, RunAsyncCallbackFn callback, void* user_data);
-
/** \brief End profiling and return a copy of the profiling file name.
 *
 * \param allocator to allocate memory for the copy of the string returned

static Value CreateTensor(const OrtMemoryInfo* info, T* p_data, size_t p_data_element_count, const int64_t* shape, size_t shape_len);

/** \brief Creates a tensor with a user supplied buffer. Wraps OrtApi::CreateTensorWithDataAsOrtValue.
- *
 * \param info Memory description of where the p_data buffer resides (CPU vs GPU etc).
 * \param p_data Pointer to the data buffer.
 * \param p_data_byte_count The number of bytes in the data buffer.

static Value CreateTensor(const OrtMemoryInfo* info, void* p_data, size_t p_data_byte_count, const int64_t* shape, size_t shape_len,
ONNXTensorElementDataType type);

- /** \brief Creates an OrtValue with a tensor using a supplied OrtAllocator. Wraps OrtApi::CreateTensorAsOrtValue.
- * This overload will allocate the buffer for the tensor according to the supplied shape and data type.
- * The allocated buffer will be owned by the returned OrtValue and will be freed when the OrtValue is released.
- * The input data would need to be copied into the allocated buffer.
- * This API is not suitable for strings.
- *
+ /** \brief Creates a tensor using a supplied OrtAllocator. Wraps OrtApi::CreateTensorAsOrtValue.
 * \tparam T The numeric datatype. This API is not suitable for strings.
 * \param allocator The allocator to use.
 * \param shape Pointer to the tensor shape dimensions.

template <typename T>
static Value CreateTensor(OrtAllocator* allocator, const int64_t* shape, size_t shape_len);

- /** \brief Creates an OrtValue with a tensor using the supplied OrtAllocator.
- * Wraps OrtApi::CreateTensorAsOrtValue.
- * The allocated buffer will be owned by the returned OrtValue and will be freed when the OrtValue is released.
- * The input data would need to be copied into the allocated buffer.
- * This API is not suitable for strings.
- *
+ /** \brief Creates a tensor using a supplied OrtAllocator. Wraps OrtApi::CreateTensorAsOrtValue.
 * \param allocator The allocator to use.
 * \param shape Pointer to the tensor shape dimensions.
 * \param shape_len The number of tensor shape dimensions.

 */
static Value CreateTensor(OrtAllocator* allocator, const int64_t* shape, size_t shape_len, ONNXTensorElementDataType type);

- /** \brief Creates an OrtValue with a Map Onnx type representation.
- * The API would ref-count the supplied OrtValues and they will be released
- * when the returned OrtValue is released. The caller may release keys and values after the call
- * returns.
- *
- * \param keys an OrtValue containing a tensor with primitive data type keys.
- * \param values an OrtValue that may contain a tensor. Ort currently supports only primitive data type values.
- */
- static Value CreateMap(const Value& keys, const Value& values); ///< Wraps OrtApi::CreateValue
-
- /** \brief Creates an OrtValue with a Sequence Onnx type representation.
- * The API would ref-count the supplied OrtValues and they will be released
- * when the returned OrtValue is released. The caller may release the values after the call
- * returns.
- *
- * \param values a vector of OrtValues that must have the same Onnx value type.
- */
- static Value CreateSequence(const std::vector<Value>& values); ///< Wraps OrtApi::CreateValue
+ static Value CreateMap(Value& keys, Value& values); ///< Wraps OrtApi::CreateValue
+ static Value CreateSequence(std::vector<Value>& values); ///< Wraps OrtApi::CreateValue

- /** \brief Creates an OrtValue wrapping an Opaque type.
- * This is used for experimental support of non-tensor types.
- *
- * \tparam T - the type of the value.
- * \param domain - zero terminated utf-8 string. Domain of the type.
- * \param type_name - zero terminated utf-8 string. Name of the type.
- * \param value - the value to be wrapped.
- */
template <typename T>
- static Value CreateOpaque(const char* domain, const char* type_name, const T& value); ///< Wraps OrtApi::CreateOpaqueValue
+ static Value CreateOpaque(const char* domain, const char* type_name, const T&); ///< Wraps OrtApi::CreateOpaqueValue

#if !defined(DISABLE_SPARSE_TENSORS)
556
/// <summary>
557
558
void* GetGPUComputeStream() const;
559
Logger GetLogger() const;
560
OrtAllocator* GetAllocator(const OrtMemoryInfo& memory_info) const;
561
- OrtKernelContext* GetOrtKernelContext() const { return ctx_; }
562
- void ParallelFor(void (*fn)(void*, size_t), size_t total, size_t num_batch, void* usr_data) const;
563
564
private:
565
OrtKernelContext* ctx_;
566
567
};
568
569
 /// <summary>
-/// Provide access to per-node attributes and input shapes, so one could compute and set output shapes.
+/// This entire structure is deprecated, but we not marking
+/// it as a whole yet since we want to preserve for the next release.
 /// </summary>
-struct ShapeInferContext {
-  struct SymbolicInteger {
-    SymbolicInteger(int64_t i) : i_(i), is_int_(true){};
-    SymbolicInteger(const char* s) : s_(s), is_int_(false){};
-    SymbolicInteger(const SymbolicInteger&) = default;
-    SymbolicInteger(SymbolicInteger&&) = default;
+struct CustomOpApi {
+  CustomOpApi(const OrtApi& api) : api_(api) {}
+
+  /** \deprecated use Ort::Value::GetTensorTypeAndShape()
+   * [[deprecated]]
+   * This interface produces a pointer that must be released. Not exception safe.
+   */
+  [[deprecated("use Ort::Value::GetTensorTypeAndShape()")]] OrtTensorTypeAndShapeInfo* GetTensorTypeAndShape(_In_ const OrtValue* value);
+
+  /** \deprecated use Ort::TensorTypeAndShapeInfo::GetElementCount()
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::TensorTypeAndShapeInfo::GetElementCount()")]] size_t GetTensorShapeElementCount(_In_ const OrtTensorTypeAndShapeInfo* info);
+
+  /** \deprecated use Ort::TensorTypeAndShapeInfo::GetElementType()
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::TensorTypeAndShapeInfo::GetElementType()")]] ONNXTensorElementDataType GetTensorElementType(const OrtTensorTypeAndShapeInfo* info);
+
+  /** \deprecated use Ort::TensorTypeAndShapeInfo::GetDimensionsCount()
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::TensorTypeAndShapeInfo::GetDimensionsCount()")]] size_t GetDimensionsCount(_In_ const OrtTensorTypeAndShapeInfo* info);

-    SymbolicInteger& operator=(const SymbolicInteger&) = default;
-    SymbolicInteger& operator=(SymbolicInteger&&) = default;
+  /** \deprecated use Ort::TensorTypeAndShapeInfo::GetShape()
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::TensorTypeAndShapeInfo::GetShape()")]] void GetDimensions(_In_ const OrtTensorTypeAndShapeInfo* info, _Out_ int64_t* dim_values, size_t dim_values_length);

-    bool operator==(const SymbolicInteger& dim) const {
-      if (is_int_ == dim.is_int_) {
-        if (is_int_) {
-          return i_ == dim.i_;
-        } else {
-          return std::string{s_} == std::string{dim.s_};
-        }
-      }
-      return false;
-    }
+  /** \deprecated
+   * [[deprecated]]
+   * This interface sets dimensions to TensorTypeAndShapeInfo, but has no effect on the OrtValue.
+   */
+  [[deprecated("Do not use")]] void SetDimensions(OrtTensorTypeAndShapeInfo* info, _In_ const int64_t* dim_values, size_t dim_count);
-    bool IsInt() const { return is_int_; }
-    int64_t AsInt() const { return i_; }
-    const char* AsSym() const { return s_; }
+  /** \deprecated use Ort::Value::GetTensorMutableData()
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  template <typename T>
+  [[deprecated("use Ort::Value::GetTensorMutableData()")]] T* GetTensorMutableData(_Inout_ OrtValue* value);

-    static constexpr int INVALID_INT_DIM = -2;
+  /** \deprecated use Ort::Value::GetTensorData()
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  template <typename T>
+  [[deprecated("use Ort::Value::GetTensorData()")]] const T* GetTensorData(_Inout_ const OrtValue* value);

-   private:
-    union {
-      int64_t i_;
-      const char* s_;
-    };
-    bool is_int_;
-  };
+  /** \deprecated use Ort::Value::GetTensorMemoryInfo()
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::Value::GetTensorMemoryInfo()")]] const OrtMemoryInfo* GetTensorMemoryInfo(_In_ const OrtValue* value);
+
+  /** \deprecated use Ort::TensorTypeAndShapeInfo::GetShape()
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::TensorTypeAndShapeInfo::GetShape()")]] std::vector<int64_t> GetTensorShape(const OrtTensorTypeAndShapeInfo* info);
-  using Shape = std::vector<SymbolicInteger>;
+  /** \deprecated use TensorTypeAndShapeInfo instances for automatic ownership.
+   * [[deprecated]]
+   * This interface is not exception safe.
+   */
+  [[deprecated("use TensorTypeAndShapeInfo")]] void ReleaseTensorTypeAndShapeInfo(OrtTensorTypeAndShapeInfo* input);
+
+  /** \deprecated use Ort::KernelContext::GetInputCount
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::KernelContext::GetInputCount")]] size_t KernelContext_GetInputCount(const OrtKernelContext* context);
+
+  /** \deprecated use Ort::KernelContext::GetInput
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::KernelContext::GetInput")]] const OrtValue* KernelContext_GetInput(const OrtKernelContext* context, _In_ size_t index);
+
+  /** \deprecated use Ort::KernelContext::GetOutputCount
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::KernelContext::GetOutputCount")]] size_t KernelContext_GetOutputCount(const OrtKernelContext* context);
+
+  /** \deprecated use Ort::KernelContext::GetOutput
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::KernelContext::GetOutput")]] OrtValue* KernelContext_GetOutput(OrtKernelContext* context, _In_ size_t index, _In_ const int64_t* dim_values, size_t dim_count);
-  ShapeInferContext(const OrtApi* ort_api, OrtShapeInferContext* ctx);
+  /** \deprecated use Ort::KernelContext::GetGPUComputeStream
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::KernelContext::GetGPUComputeStream")]] void* KernelContext_GetGPUComputeStream(const OrtKernelContext* context);

-  const Shape& GetInputShape(size_t indice) const { return input_shapes_.at(indice); }
+  /** \deprecated use Ort::ThrowOnError()
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::ThrowOnError()")]] void ThrowOnError(OrtStatus* result);

-  size_t GetInputCount() const { return input_shapes_.size(); }
+  /** \deprecated use Ort::OpAttr
+   * [[deprecated]]
+   * This interface is not exception safe.
+   */
+  [[deprecated("use Ort::OpAttr")]] OrtOpAttr* CreateOpAttr(_In_ const char* name,
+                                                            _In_ const void* data,
+                                                            _In_ int len,
+                                                            _In_ OrtOpAttrType type);

-  Status SetOutputShape(size_t indice, const Shape& shape);
+  /** \deprecated use Ort::OpAttr
+   * [[deprecated]]
+   * This interface is not exception safe.
+   */
+  [[deprecated("use Ort::OpAttr")]] void ReleaseOpAttr(_Frees_ptr_opt_ OrtOpAttr* op_attr);

-  int64_t GetAttrInt(const char* attr_name);
+  /** \deprecated use Ort::Op
+   * [[deprecated]]
+   * This interface is not exception safe.
+   */
+  [[deprecated("use Ort::Op")]] OrtOp* CreateOp(_In_ const OrtKernelInfo* info,
+                                                _In_z_ const char* op_name,
+                                                _In_z_ const char* domain,
+                                                int version,
+                                                _In_reads_(type_constraint_count) const char** type_constraint_names,
+                                                _In_reads_(type_constraint_count) const ONNXTensorElementDataType* type_constraint_values,
+                                                int type_constraint_count,
+                                                _In_reads_(attr_count) const OrtOpAttr* const* attr_values,
+                                                int attr_count,
+                                                int input_count,
+                                                int output_count);
-  using Ints = std::vector<int64_t>;
-  Ints GetAttrInts(const char* attr_name);
+  /** \deprecated use Ort::Op::Invoke
+   * [[deprecated]]
+   * This interface is redundant
+   */
+  [[deprecated("use Ort::Op::Invoke")]] void InvokeOp(_In_ const OrtKernelContext* context,
+                                                      _In_ const OrtOp* ort_op,
+                                                      _In_ const OrtValue* const* input_values,
+                                                      _In_ int input_count,
+                                                      _Inout_ OrtValue* const* output_values,
+                                                      _In_ int output_count);

-  float GetAttrFloat(const char* attr_name);
+  /** \deprecated use Ort::Op for automatic lifespan management.
+   * [[deprecated]]
+   * This interface is not exception safe.
+   */
+  [[deprecated("use Ort::Op")]] void ReleaseOp(_Frees_ptr_opt_ OrtOp* ort_op);

-  using Floats = std::vector<float>;
-  Floats GetAttrFloats(const char* attr_name);
+  /** \deprecated use Ort::KernelInfo for automatic lifespan management or for
+   * querying attributes
+   * [[deprecated]]
+   * This interface is redundant
+   */
+  template <typename T>  // T is only implemented for std::vector<float>, std::vector<int64_t>, float, int64_t, and string
+  [[deprecated("use Ort::KernelInfo::GetAttribute")]] T KernelInfoGetAttribute(_In_ const OrtKernelInfo* info, _In_ const char* name);

-  std::string GetAttrString(const char* attr_name);
+  /** \deprecated use Ort::KernelInfo::Copy
+   * querying attributes
+   * [[deprecated]]
+   * This interface is not exception safe
+   */
+  [[deprecated("use Ort::KernelInfo::Copy")]] OrtKernelInfo* CopyKernelInfo(_In_ const OrtKernelInfo* info);

-  using Strings = std::vector<std::string>;
-  Strings GetAttrStrings(const char* attr_name);
+  /** \deprecated use Ort::KernelInfo for lifespan management
+   * querying attributes
+   * [[deprecated]]
+   * This interface is not exception safe
+   */
+  [[deprecated("use Ort::KernelInfo")]] void ReleaseKernelInfo(_Frees_ptr_opt_ OrtKernelInfo* info_copy);
  private:
-  const OrtOpAttr* GetAttrHdl(const char* attr_name) const;
-  const OrtApi* ort_api_;
-  OrtShapeInferContext* ctx_;
-  std::vector<Shape> input_shapes_;
+  const OrtApi& api_;
 };
-using ShapeInferFn = Ort::Status (*)(Ort::ShapeInferContext&);
-
-#define MAX_CUSTOM_OP_END_VER (1UL << 31) - 1
-
-template <typename TOp, typename TKernel, bool WithStatus = false>
+template <typename TOp, typename TKernel>
 struct CustomOpBase : OrtCustomOp {
   CustomOpBase() {
     OrtCustomOp::version = ORT_API_VERSION;
+    OrtCustomOp::CreateKernel = [](const OrtCustomOp* this_, const OrtApi* api, const OrtKernelInfo* info) { return static_cast<const TOp*>(this_)->CreateKernel(*api, info); };
     OrtCustomOp::GetName = [](const OrtCustomOp* this_) { return static_cast<const TOp*>(this_)->GetName(); };

     OrtCustomOp::GetExecutionProviderType = [](const OrtCustomOp* this_) { return static_cast<const TOp*>(this_)->GetExecutionProviderType(); };

     OrtCustomOp::GetOutputTypeCount = [](const OrtCustomOp* this_) { return static_cast<const TOp*>(this_)->GetOutputTypeCount(); };
     OrtCustomOp::GetOutputType = [](const OrtCustomOp* this_, size_t index) { return static_cast<const TOp*>(this_)->GetOutputType(index); };

+    OrtCustomOp::KernelCompute = [](void* op_kernel, OrtKernelContext* context) { static_cast<TKernel*>(op_kernel)->Compute(context); };
 #if defined(_MSC_VER) && !defined(__clang__)
 #pragma warning(push)
 #pragma warning(disable : 26409)

     OrtCustomOp::GetVariadicInputHomogeneity = [](const OrtCustomOp* this_) { return static_cast<int>(static_cast<const TOp*>(this_)->GetVariadicInputHomogeneity()); };
     OrtCustomOp::GetVariadicOutputMinArity = [](const OrtCustomOp* this_) { return static_cast<const TOp*>(this_)->GetVariadicOutputMinArity(); };
     OrtCustomOp::GetVariadicOutputHomogeneity = [](const OrtCustomOp* this_) { return static_cast<int>(static_cast<const TOp*>(this_)->GetVariadicOutputHomogeneity()); };
-#ifdef __cpp_if_constexpr
-    if constexpr (WithStatus) {
-#else
-    if (WithStatus) {
-#endif
-      OrtCustomOp::CreateKernelV2 = [](const OrtCustomOp* this_, const OrtApi* api, const OrtKernelInfo* info, void** op_kernel) -> OrtStatusPtr {
-        return static_cast<const TOp*>(this_)->CreateKernelV2(*api, info, op_kernel);
-      };
-      OrtCustomOp::KernelComputeV2 = [](void* op_kernel, OrtKernelContext* context) -> OrtStatusPtr {
-        return static_cast<TKernel*>(op_kernel)->ComputeV2(context);
-      };
-    } else {
-      OrtCustomOp::CreateKernelV2 = nullptr;
-      OrtCustomOp::KernelComputeV2 = nullptr;
-
-      OrtCustomOp::CreateKernel = [](const OrtCustomOp* this_, const OrtApi* api, const OrtKernelInfo* info) { return static_cast<const TOp*>(this_)->CreateKernel(*api, info); };
-      OrtCustomOp::KernelCompute = [](void* op_kernel, OrtKernelContext* context) {
-        static_cast<TKernel*>(op_kernel)->Compute(context);
-      };
-    }
-
-    SetShapeInferFn<TOp>(0);
-
-    OrtCustomOp::GetStartVersion = [](const OrtCustomOp* this_) {
-      return static_cast<const TOp*>(this_)->start_ver_;
-    };
-
-    OrtCustomOp::GetEndVersion = [](const OrtCustomOp* this_) {
-      return static_cast<const TOp*>(this_)->end_ver_;
-    };
   }
   // Default implementation of GetExecutionProviderType that returns nullptr to default to the CPU provider

     return std::vector<std::string>{};
   }

-  template <typename C>
-  decltype(&C::InferOutputShape) SetShapeInferFn(decltype(&C::InferOutputShape)) {
-    OrtCustomOp::InferOutputShapeFn = [](const OrtCustomOp*, OrtShapeInferContext* ort_ctx) -> OrtStatusPtr {
-      ShapeInferContext ctx(&GetApi(), ort_ctx);
-      return C::InferOutputShape(ctx);
-    };
-    return {};
-  }
-
-  template <typename C>
-  void SetShapeInferFn(...) {
-    OrtCustomOp::InferOutputShapeFn = {};
-  }
-
  protected:
   // Helper function that returns a map of session config entries specified by CustomOpBase::GetSessionConfigKeys.
   void GetSessionConfigs(std::unordered_map<std::string, std::string>& out, ConstSessionOptions options) const;
-
-  int start_ver_ = 1;
-  int end_ver_ = MAX_CUSTOM_OP_END_VER;
 };

 }  // namespace Ort
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/onnxruntime_cxx_inline.h -> onnxruntime-linux-x64-gpu-1.15.1.tgz/include/onnxruntime_cxx_inline.h
Changed
 // These are the inline implementations of the C++ header APIs. They're in this separate file as to not clutter
 // the main C++ file with implementation details.

-#include <cstring>
-#include <functional>
-
-#define RETURN_ON_API_FAIL(expression) \
-  {                                    \
-    auto err = (expression);           \
-    if (err) {                         \
-      return Status(err);              \
-    }                                  \
-  }
-
 namespace Ort {

 namespace detail {

   static constexpr ONNXTensorElementDataType type = ONNX_TENSOR_ELEMENT_DATA_TYPE_BOOL;
 };
-template <>
-struct TypeToTensorType<Float8E4M3FN_t> {
-  static constexpr ONNXTensorElementDataType type = ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT8E4M3FN;
-};
-template <>
-struct TypeToTensorType<Float8E4M3FNUZ_t> {
-  static constexpr ONNXTensorElementDataType type = ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT8E4M3FNUZ;
-};
-template <>
-struct TypeToTensorType<Float8E5M2_t> {
-  static constexpr ONNXTensorElementDataType type = ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT8E5M2;
-};
-template <>
-struct TypeToTensorType<Float8E5M2FNUZ_t> {
-  static constexpr ONNXTensorElementDataType type = ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT8E5M2FNUZ;
-};
-
-inline bool BFloat16_t::operator==(const BFloat16_t& rhs) const noexcept {
-  if (IsNaN() || rhs.IsNaN()) {
-    // IEEE defines that NaN is not equal to anything, including itself.
-    return false;
-  }
-  return val == rhs.val;
-}
-
-inline bool BFloat16_t::operator<(const BFloat16_t& rhs) const noexcept {
-  if (IsNaN() || rhs.IsNaN()) {
-    // IEEE defines that NaN is unordered with respect to everything, including itself.
-    return false;
-  }
-
-  const bool left_is_negative = IsNegative();
-  if (left_is_negative != rhs.IsNegative()) {
-    // When the signs of left and right differ, we know that left is less than right if it is
-    // the negative value. The exception to this is if both values are zero, in which case IEEE
-    // says they should be equal, even if the signs differ.
-    return left_is_negative && !AreZero(*this, rhs);
-  }
-  return (val != rhs.val) && ((val < rhs.val) ^ left_is_negative);
-}
-
 inline MemoryAllocation::MemoryAllocation(OrtAllocator* allocator, void* p, size_t size)
     : allocator_(allocator), p_(p), size_(size) {
 }

   return *this;
 }

-inline Env& Env::CreateAndRegisterAllocatorV2(const std::string& provider_type, const OrtMemoryInfo* mem_info, const std::unordered_map<std::string, std::string>& options, const OrtArenaCfg* arena_cfg) {
-  std::vector<const char*> keys, values;
-  auto num_entries = options.size();
-  if (num_entries > 0) {
-    keys.reserve(num_entries);
-    values.reserve(num_entries);
-    for (const auto& entry : options) {
-      keys.push_back(entry.first.c_str());
-      values.push_back(entry.second.c_str());
-    }
-  }
-  ThrowOnError(GetApi().CreateAndRegisterAllocatorV2(p_, provider_type.c_str(), mem_info, arena_cfg, keys.data(), values.data(), num_entries));
-  return *this;
-}
-
 inline CustomOpDomain::CustomOpDomain(const char* domain) {
   ThrowOnError(GetApi().CreateCustomOpDomain(domain, &p_));
 }

 }
 template <typename T>
-inline SessionOptionsImpl<T>& SessionOptionsImpl<T>::SetDeterministicCompute(bool value) {
-  ThrowOnError(GetApi().SetDeterministicCompute(this->p_, value));
-  return *this;
-}
-
-template <typename T>
 inline SessionOptionsImpl<T>& SessionOptionsImpl<T>::SetOptimizedModelFilePath(const ORTCHAR_T* optimized_model_filepath) {
   ThrowOnError(GetApi().SetOptimizedModelFilePath(this->p_, optimized_model_filepath));
   return *this;

 }
 template <typename T>
-inline SessionOptionsImpl<T>& SessionOptionsImpl<T>::AppendExecutionProvider_OpenVINO_V2(const std::unordered_map<std::string, std::string>& provider_options) {
-  auto num_entries = provider_options.size();
-  std::vector<const char*> keys, values;
-  if (num_entries > 0) {
-    keys.reserve(num_entries);
-    values.reserve(num_entries);
-
-    for (const auto& entry : provider_options) {
-      keys.push_back(entry.first.c_str());
-      values.push_back(entry.second.c_str());
-    }
-  }
-
-  ThrowOnError(GetApi().SessionOptionsAppendExecutionProvider_OpenVINO_V2(this->p_,
-                                                                          keys.data(), values.data(), num_entries));
-
-  return *this;
-}
-
-template <typename T>
 inline SessionOptionsImpl<T>& SessionOptionsImpl<T>::RegisterCustomOpsLibrary(const ORTCHAR_T* library_name,
                                                                               const CustomOpConfigs& custom_op_configs) {
   // Add custom op config entries before registering the custom op library. Otherwise, the config entries _may_ be ignored by

 }
 template <typename T>
-inline void SessionImpl<T>::RunAsync(const RunOptions& run_options, const char* const* input_names, const Value* input_values, size_t input_count,
-                                     const char* const* output_names, Value* output_values, size_t output_count, RunAsyncCallbackFn callback, void* user_data) {
-  auto ort_input_values = reinterpret_cast<const OrtValue* const*>(input_values);
-  auto ort_output_values = reinterpret_cast<OrtValue**>(output_values);
-  ThrowOnError(GetApi().RunAsync(this->p_, run_options, input_names,
-                                 ort_input_values, input_count, output_names, output_count,
-                                 ort_output_values, callback, user_data));
-}
-
-template <typename T>
 inline AllocatedStringPtr SessionImpl<T>::EndProfilingAllocated(OrtAllocator* allocator) {
   char* out = nullptr;
   ThrowOnError(GetApi().SessionEndProfiling(this->p_, allocator, &out));

 }
 #endif  // !defined(DISABLE_SPARSE_TENSORS)
-inline Value Value::CreateMap(const Value& keys, const Value& values) {
+inline Value Value::CreateMap(Value& keys, Value& values) {
   OrtValue* out;
-  const OrtValue* inputs[2] = {keys, values};
+  OrtValue* inputs[2] = {keys, values};
   ThrowOnError(GetApi().CreateValue(inputs, 2, ONNX_TYPE_MAP, &out));
   return Value{out};
 }

-inline Value Value::CreateSequence(const std::vector<Value>& values) {
+inline Value Value::CreateSequence(std::vector<Value>& values) {
   OrtValue* out;
-  std::vector<const OrtValue*> values_ort{values.data(), values.data() + values.size()};
+  std::vector<OrtValue*> values_ort{values.data(), values.data() + values.size()};
   ThrowOnError(GetApi().CreateValue(values_ort.data(), values_ort.size(), ONNX_TYPE_SEQUENCE, &out));
   return Value{out};
 }

   return Logger{out};
 }
-inline void KernelContext::ParallelFor(void (*fn)(void*, size_t), size_t total, size_t num_batch, void* usr_data) const {
-  ThrowOnError(GetApi().KernelContext_ParallelFor(ctx_, fn, total, num_batch, usr_data));
-}
-
 inline OpAttr::OpAttr(const char* name, const void* data, int len, OrtOpAttrType type) {
   Ort::ThrowOnError(GetApi().CreateOpAttr(name, data, len, type, &p_));
 }

                        output_values, static_cast<int>(output_count)));
 }
+inline void CustomOpApi::ThrowOnError(OrtStatus* status) {
+  Ort::ThrowOnError(status);
+}
+
+template <>
+inline float CustomOpApi::KernelInfoGetAttribute<float>(_In_ const OrtKernelInfo* info, _In_ const char* name) {
+  float out;
+  Ort::ThrowOnError(api_.KernelInfoGetAttribute_float(info, name, &out));
+  return out;
+}
+
+template <>
+inline int64_t CustomOpApi::KernelInfoGetAttribute<int64_t>(_In_ const OrtKernelInfo* info, _In_ const char* name) {
+  int64_t out;
+  Ort::ThrowOnError(api_.KernelInfoGetAttribute_int64(info, name, &out));
+  return out;
+}
+
+template <>
+inline std::string CustomOpApi::KernelInfoGetAttribute<std::string>(_In_ const OrtKernelInfo* info, _In_ const char* name) {
+  size_t size = 0;
+  std::string out;
+
+  // Feed nullptr for the data buffer to query the true size of the string attribute
+  OrtStatus* status = api_.KernelInfoGetAttribute_string(info, name, nullptr, &size);
+
+  if (status == nullptr) {
+    out.resize(size);
+    Ort::ThrowOnError(api_.KernelInfoGetAttribute_string(info, name, &out[0], &size));
+    out.resize(size - 1);  // remove the terminating character '\0'
+  } else {
+    Ort::ThrowOnError(status);
+  }
+  return out;
+}
+
+template <>
+inline std::vector<float> CustomOpApi::KernelInfoGetAttribute(_In_ const OrtKernelInfo* info, _In_ const char* name) {
+  size_t size = 0;
+  std::vector<float> out;
+
+  // Feed nullptr for the data buffer to query the true size of the attribute
+  OrtStatus* status = api_.KernelInfoGetAttributeArray_float(info, name, nullptr, &size);
+
+  if (status == nullptr) {
+    out.resize(size);
+    Ort::ThrowOnError(api_.KernelInfoGetAttributeArray_float(info, name, out.data(), &size));
+  } else {
+    Ort::ThrowOnError(status);
+  }
+  return out;
+}
+
+template <>
+inline std::vector<int64_t> CustomOpApi::KernelInfoGetAttribute(_In_ const OrtKernelInfo* info, _In_ const char* name) {
+  size_t size = 0;
+  std::vector<int64_t> out;
+
+  // Feed nullptr for the data buffer to query the true size of the attribute
+  OrtStatus* status = api_.KernelInfoGetAttributeArray_int64(info, name, nullptr, &size);
+
+  if (status == nullptr) {
+    out.resize(size);
+    Ort::ThrowOnError(api_.KernelInfoGetAttributeArray_int64(info, name, out.data(), &size));
+  } else {
+    Ort::ThrowOnError(status);
+  }
+  return out;
+}
+inline OrtTensorTypeAndShapeInfo* CustomOpApi::GetTensorTypeAndShape(_In_ const OrtValue* value) {
+  OrtTensorTypeAndShapeInfo* out;
+  Ort::ThrowOnError(api_.GetTensorTypeAndShape(value, &out));
+  return out;
+}
+
+inline size_t CustomOpApi::GetTensorShapeElementCount(_In_ const OrtTensorTypeAndShapeInfo* info) {
+  size_t out;
+  Ort::ThrowOnError(api_.GetTensorShapeElementCount(info, &out));
+  return out;
+}
+
+inline ONNXTensorElementDataType CustomOpApi::GetTensorElementType(const OrtTensorTypeAndShapeInfo* info) {
+  ONNXTensorElementDataType out;
+  Ort::ThrowOnError(api_.GetTensorElementType(info, &out));
+  return out;
+}
+
+inline size_t CustomOpApi::GetDimensionsCount(_In_ const OrtTensorTypeAndShapeInfo* info) {
+  size_t out;
+  Ort::ThrowOnError(api_.GetDimensionsCount(info, &out));
+  return out;
+}
+
+inline void CustomOpApi::GetDimensions(_In_ const OrtTensorTypeAndShapeInfo* info, _Out_ int64_t* dim_values, size_t dim_values_length) {
+  Ort::ThrowOnError(api_.GetDimensions(info, dim_values, dim_values_length));
+}
+
+inline void CustomOpApi::SetDimensions(OrtTensorTypeAndShapeInfo* info, _In_ const int64_t* dim_values, size_t dim_count) {
+  Ort::ThrowOnError(api_.SetDimensions(info, dim_values, dim_count));
+}
+
+template <typename T>
+inline T* CustomOpApi::GetTensorMutableData(_Inout_ OrtValue* value) {
+  T* data;
+  Ort::ThrowOnError(api_.GetTensorMutableData(value, reinterpret_cast<void**>(&data)));
+  return data;
+}
+
+inline const OrtMemoryInfo* CustomOpApi::GetTensorMemoryInfo(_In_ const OrtValue* value) {
+  const OrtMemoryInfo* mem_info;
+  Ort::ThrowOnError(api_.GetTensorMemoryInfo(value, &mem_info));
+  return mem_info;
+}
+
+template <typename T>
+inline const T* CustomOpApi::GetTensorData(_Inout_ const OrtValue* value) {
+  T* data = nullptr;
+  Ort::ThrowOnError(api_.GetTensorMutableData(const_cast<OrtValue*>(value), reinterpret_cast<void**>(&data)));
+  return data;
+}
+
+inline std::vector<int64_t> CustomOpApi::GetTensorShape(const OrtTensorTypeAndShapeInfo* info) {
+  size_t out;
+  Ort::ThrowOnError(api_.GetDimensionsCount(info, &out));
+  std::vector<int64_t> output(out);
+  Ort::ThrowOnError(api_.GetDimensions(info, output.data(), out));
+  return output;
+}
+
+inline void CustomOpApi::ReleaseTensorTypeAndShapeInfo(OrtTensorTypeAndShapeInfo* input) {
+  api_.ReleaseTensorTypeAndShapeInfo(input);
+}
+
+inline size_t CustomOpApi::KernelContext_GetInputCount(const OrtKernelContext* context) {
+  size_t out;
+  Ort::ThrowOnError(api_.KernelContext_GetInputCount(context, &out));
+  return out;
+}
+
+inline const OrtValue* CustomOpApi::KernelContext_GetInput(const OrtKernelContext* context, _In_ size_t index) {
+  const OrtValue* out;
+  Ort::ThrowOnError(api_.KernelContext_GetInput(context, index, &out));
+  return out;
+}
+
+inline size_t CustomOpApi::KernelContext_GetOutputCount(const OrtKernelContext* context) {
+  size_t out;
+  Ort::ThrowOnError(api_.KernelContext_GetOutputCount(context, &out));
+  return out;
+}
+
+inline OrtValue* CustomOpApi::KernelContext_GetOutput(OrtKernelContext* context, _In_ size_t index,
+                                                      _In_ const int64_t* dim_values, size_t dim_count) {
+  OrtValue* out;
+  Ort::ThrowOnError(api_.KernelContext_GetOutput(context, index, dim_values, dim_count, &out));
+  return out;
+}
+
+inline void* CustomOpApi::KernelContext_GetGPUComputeStream(const OrtKernelContext* context) {
+  void* out;
+  Ort::ThrowOnError(api_.KernelContext_GetGPUComputeStream(context, &out));
+  return out;
+}
+
+inline OrtOpAttr* CustomOpApi::CreateOpAttr(_In_ const char* name,
+                                            _In_ const void* data,
+                                            _In_ int len,
+                                            _In_ OrtOpAttrType type) {
+  OrtOpAttr* op_attr{};
+  Ort::ThrowOnError(api_.CreateOpAttr(name, data, len, type, &op_attr));
+  return op_attr;
+}
+
+inline void CustomOpApi::ReleaseOpAttr(_Frees_ptr_opt_ OrtOpAttr* op_attr) {
+  api_.ReleaseOpAttr(op_attr);
+}
+
+inline OrtOp* CustomOpApi::CreateOp(_In_ const OrtKernelInfo* info,
+                                    _In_z_ const char* op_name,
+                                    _In_z_ const char* domain,
+                                    int version,
+                                    _In_reads_(type_constraint_count) const char** type_constraint_names,
+                                    _In_reads_(type_constraint_count) const ONNXTensorElementDataType* type_constraint_values,
+                                    int type_constraint_count,
+                                    _In_reads_(attr_count) const OrtOpAttr* const* attr_values,
+                                    int attr_count,
+                                    int input_count,
+                                    int output_count) {
+  OrtOp* ort_op{};
+  Ort::ThrowOnError(api_.CreateOp(info, op_name, domain, version, type_constraint_names, type_constraint_values,
+                                  type_constraint_count, attr_values, attr_count, input_count, output_count, &ort_op));
+  return ort_op;
+}
+
+inline void CustomOpApi::InvokeOp(_In_ const OrtKernelContext* context,
+                                  _In_ const OrtOp* ort_op,
+                                  _In_ const OrtValue* const* input_values,
+                                  _In_ int input_count,
+                                  _Inout_ OrtValue* const* output_values,
+                                  _In_ int output_count) {
+  Ort::ThrowOnError(api_.InvokeOp(context, ort_op, input_values, input_count, output_values, output_count));
+}
+
+inline void CustomOpApi::ReleaseOp(_Frees_ptr_opt_ OrtOp* ort_op) {
+  api_.ReleaseOp(ort_op);
+}
+
+inline OrtKernelInfo* CustomOpApi::CopyKernelInfo(_In_ const OrtKernelInfo* info) {
+  OrtKernelInfo* info_copy{};
+  Ort::ThrowOnError(api_.CopyKernelInfo(info, &info_copy));
+  return info_copy;
+}
+
+inline void CustomOpApi::ReleaseKernelInfo(_Frees_ptr_opt_ OrtKernelInfo* info_copy) {
+  api_.ReleaseKernelInfo(info_copy);
+}
+
 inline std::string GetVersionString() {
   return OrtGetApiBase()->GetVersionString();
 }

   return available_providers;
 }

-template <typename TOp, typename TKernel, bool WithStatus>
-void CustomOpBase<TOp, TKernel, WithStatus>::GetSessionConfigs(std::unordered_map<std::string, std::string>& out,
-                                                               ConstSessionOptions options) const {
+template <typename TOp, typename TKernel>
+void CustomOpBase<TOp, TKernel>::GetSessionConfigs(std::unordered_map<std::string, std::string>& out,
+                                                   ConstSessionOptions options) const {
   const TOp* derived = static_cast<const TOp*>(this);
   std::vector<std::string> keys = derived->GetSessionConfigKeys();

   }
 }
-inline ShapeInferContext::ShapeInferContext(const OrtApi* ort_api,
-                                            OrtShapeInferContext* ctx) : ort_api_(ort_api), ctx_(ctx) {
-  size_t input_count = 0;
-  Ort::ThrowOnError(ort_api_->ShapeInferContext_GetInputCount(ctx_, &input_count));
-  for (size_t ith_input = 0; ith_input < input_count; ++ith_input) {
-    OrtTensorTypeAndShapeInfo* info{};
-    Ort::ThrowOnError(ort_api_->ShapeInferContext_GetInputTypeShape(ctx, ith_input, &info));
-    TensorTypeAndShapeInfo type_shape_info(info);
-    auto integer_shape = type_shape_info.GetShape();
-    std::vector<const char*> symbolic_shape(integer_shape.size(), {});
-    type_shape_info.GetSymbolicDimensions(&symbolic_shape[0], integer_shape.size());
-    Shape shape;
-    for (size_t ith = 0; ith < integer_shape.size(); ++ith) {
-      if (symbolic_shape[ith] && std::string{symbolic_shape[ith]}.size() > 0) {
-        shape.emplace_back(symbolic_shape[ith]);
-      } else {
-        shape.emplace_back(integer_shape[ith]);
-      }
-    }
-    input_shapes_.push_back(std::move(shape));
-    type_shape_info.release();
-  }
-}
-
-inline Status ShapeInferContext::SetOutputShape(size_t indice, const Shape& shape) {
-  OrtTensorTypeAndShapeInfo* info = {};
-  RETURN_ON_API_FAIL(ort_api_->CreateTensorTypeAndShapeInfo(&info));
-
-  using InfoPtr = std::unique_ptr<OrtTensorTypeAndShapeInfo, std::function<void(OrtTensorTypeAndShapeInfo*)>>;
-
-  InfoPtr info_ptr(info, [this](OrtTensorTypeAndShapeInfo* obj) {
-    ort_api_->ReleaseTensorTypeAndShapeInfo(obj);
-  });
-
-  std::vector<int64_t> integer_dims;
-  std::vector<const char*> symbolic_dims;
-
-  for (const auto dim : shape) {
-    if (dim.IsInt()) {
-      integer_dims.push_back(dim.AsInt());
-      symbolic_dims.push_back("");
-    } else {
-      if (!dim.AsSym() || std::string{dim.AsSym()}.empty()) {
-        ORT_CXX_API_THROW("Symbolic dim must not be an empty string", ORT_INVALID_ARGUMENT);
-      }
-      integer_dims.push_back(SymbolicInteger::INVALID_INT_DIM);
-      symbolic_dims.push_back(dim.AsSym());
-    }
-  }
-
-  RETURN_ON_API_FAIL(ort_api_->SetDimensions(info, integer_dims.data(), integer_dims.size()));
-  RETURN_ON_API_FAIL(ort_api_->SetSymbolicDimensions(info, symbolic_dims.data(), symbolic_dims.size()));
-  RETURN_ON_API_FAIL(ort_api_->ShapeInferContext_SetOutputTypeShape(ctx_, indice, info));
-  return Status{nullptr};
-}
-
-inline int64_t ShapeInferContext::GetAttrInt(const char* attr_name) {
-  const auto* attr = GetAttrHdl(attr_name);
-  int64_t i = {};
-  size_t out = {};
-  Ort::ThrowOnError(ort_api_->ReadOpAttr(attr, ORT_OP_ATTR_INT, &i, sizeof(i), &out));
-  return i;
-}
-
-inline ShapeInferContext::Ints ShapeInferContext::GetAttrInts(const char* attr_name) {
-  const auto* attr = GetAttrHdl(attr_name);
-  int64_t i = {};
-  size_t out = {};
-  // first call to get the bytes needed
-  auto status = ort_api_->ReadOpAttr(attr, ORT_OP_ATTR_INTS, &i, sizeof(i), &out);
-  if (status) {
-    size_t num_i = out / sizeof(int64_t);
-    ShapeInferContext::Ints ints(num_i, 0);
-    Ort::ThrowOnError(ort_api_->ReadOpAttr(attr, ORT_OP_ATTR_INTS, ints.data(), out, &out));
-    return ints;
-  } else {
-    return {i};
496
- }
497
-}
498
-
499
-inline float ShapeInferContext::GetAttrFloat(const char* attr_name) {
500
- const auto* attr = GetAttrHdl(attr_name);
501
- float f = {};
502
- size_t out = {};
503
- Ort::ThrowOnError(ort_api_->ReadOpAttr(attr, ORT_OP_ATTR_FLOAT, &f, sizeof(f), &out));
504
- return f;
505
-}
506
-
507
-inline ShapeInferContext::Floats ShapeInferContext::GetAttrFloats(const char* attr_name) {
508
- const auto* attr = GetAttrHdl(attr_name);
509
- float f = {};
510
- size_t out = {};
511
- // first call to get the bytes needed
512
- auto status = ort_api_->ReadOpAttr(attr, ORT_OP_ATTR_FLOATS, &f, sizeof(f), &out);
513
- if (status) {
514
- size_t num_f = out / sizeof(float);
515
- ShapeInferContext::Floats floats(num_f, 0);
516
- Ort::ThrowOnError(ort_api_->ReadOpAttr(attr, ORT_OP_ATTR_FLOATS, floats.data(), out, &out));
517
- return floats;
518
- } else {
519
- return {f};
520
- }
521
-}
522
-
523
-inline std::string ShapeInferContext::GetAttrString(const char* attr_name) {
524
- const auto* attr = GetAttrHdl(attr_name);
525
- char c = {};
526
- size_t out = {};
527
- // first call to get the bytes needed
528
- auto status = ort_api_->ReadOpAttr(attr, ORT_OP_ATTR_STRING, &c, sizeof(char), &out);
529
- if (status) {
530
- std::vector<char> chars(out, '\0');
531
- Ort::ThrowOnError(ort_api_->ReadOpAttr(attr, ORT_OP_ATTR_STRING, chars.data(), out, &out));
532
- return {chars.data()};
533
- } else {
534
- return {c};
535
- }
536
-}
537
-
538
-inline ShapeInferContext::Strings ShapeInferContext::GetAttrStrings(const char* attr_name) {
539
- const auto* attr = GetAttrHdl(attr_name);
540
- char c = {};
541
- size_t out = {};
542
- // first call to get the bytes needed
543
- auto status = ort_api_->ReadOpAttr(attr, ORT_OP_ATTR_STRINGS, &c, sizeof(char), &out);
544
- if (status) {
545
- std::vector<char> chars(out, '\0');
546
- Ort::ThrowOnError(ort_api_->ReadOpAttr(attr, ORT_OP_ATTR_STRINGS, chars.data(), out, &out));
547
- ShapeInferContext::Strings strings;
548
- char* char_st = chars.data();
549
- char* char_ed = char_st + out;
550
- while (char_st < char_ed) {
551
- strings.emplace_back(char_st);
552
- while (*char_st != '\0') {
553
- char_st++;
554
- }
555
- char_st++;
556
- }
557
- return strings;
558
- } else {
559
- return {std::string{c}};
560
- }
561
-}
562
-
563
-inline const OrtOpAttr* ShapeInferContext::GetAttrHdl(const char* attr_name) const {
564
- const OrtOpAttr* attr_hdl = {};
565
- Ort::ThrowOnError(ort_api_->ShapeInferContext_GetAttribute(ctx_, attr_name, &attr_hdl));
566
- return attr_hdl;
567
-}
568
-
569
} // namespace Ort
570
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/onnxruntime_session_options_config_keys.h -> onnxruntime-linux-x64-gpu-1.15.1.tgz/include/onnxruntime_session_options_config_keys.h
Changed
// GeluApproximation has side effects which may change the inference results. It is disabled by default due to this.
static const char* const kOrtSessionOptionsEnableGeluApproximation = "optimization.enable_gelu_approximation";

-// This setting controls whether to enable AheadOfTime function inlining.
-// AOT function inlining examines the graph and attempts to inline as many locally defined functions in the model
-// as possible with the help of enabled execution providers.
-// This can reduce the number of function calls and improve performance because it is done before
-// Level1 optimizers and constant folding. However, under some circumstances, when the EPs are not available,
-// one can disable the AOT inlining, produce an optimized model and postpone AOT until run time.
-// "0": enable; "1": disable.
-// Its default value is "0".
-static const char* const kOrtSessionOptionsDisableAheadOfTimeFunctionInlining = "session.disable_aot_function_inlining";
-
#ifdef ENABLE_TRAINING
// Specifies a list of op types for memory footprint reduction.
// The value should be a ","-delimited list of pair of
-// <subgraph string: optimization strategy: number of subgraph to apply>.
+// <subgraph string : optimization strategy : number of subgraph to apply>.
// For example, "Gelu+Cast+:1:0,Dropout+:1:1".
// A valid "subgraph string" should be one subgraph representation output by ORT graph transformations.
// "optimization strategy" currently has valid values: 0 - disabled, 1 - recompute.
// "number of subgraph to apply" is used to control how many subgraphs to apply optimization, to avoid "oversaving"
// the memory.
-static const char* const kOrtSessionOptionsMemoryOptimizerEnabler = "optimization.memory_optimizer_config";
+static const char* const kOrtSessionOptionsMemoryOptimizerEnabler = "optimization.enable_memory_optimizer";

-// Specifies the config for detecting subgraphs for memory footprint reduction.
-// The value should be a string contains int separated using commas. The default value is "0:0".
-static const char* const kOrtSessionOptionsMemoryOptimizerProbeConfig = "optimization.enable_memory_probe_recompute_config";
+// Specifies the level for detecting subgraphs for memory footprint reduction.
+// The value should be an integer. The default value is 0.
+static const char* const kOrtSessionOptionsMemoryOptimizerProbeLevel = "optimization.enable_memory_probe_recompute_level";
#endif

// Enable or disable using device allocator for allocating initialized tensor memory. "1": enable; "0": disable. The default is "0".

// May be useful to expose bugs in models.
static const char* const kOrtSessionOptionsConfigStrictShapeTypeInference = "session.strict_shape_type_inference";

-// "1": every model using a more recent opset than the latest released one will fail
-// "0": the model may or may not work if onnxruntime cannot find an implementation, this option
-// is used for development purpose.
-static const char* const kOrtSessionOptionsConfigStrictAllowReleasedOpsetsOnly = "session.allow_released_opsets_only";
-
// The file saves configuration for partitioning node among logic streams
static const char* const kNodePartitionConfigFile = "session.node_partition_config_file";

// 3) after the L1 transformers are applied to the updated graph.
// The model will be saved to filename post_layout_transform_step_<step_number>.onnx.
static const char* const kDebugLayoutTransformation = "session.debug_layout_transformation";
-
-// Graph nodes that are not supported by the execution providers (EPs) explicitly added to the session are
-// assigned (i.e., "fallback") to the CPU EP by default.
-//
-// This option allows the user to disable the fallback of unsupported graph nodes to the CPU EP.
-// If this option is set to "1", session creation will fail if the execution providers other than the CPU EP cannot
-// fully support all of the nodes in the graph.
-//
-// It is invalid to set this option and explicitly add the CPU EP to the session. In this case, session creation
-// will also fail with an error.
-//
-// Option values:
-// - "0": CPU EP fallback is not disabled. DEFAULT
-// - "1": CPU EP fallback is disabled.
-static const char* const kOrtSessionOptionsDisableCPUEPFallback = "session.disable_cpu_ep_fallback";
-
-// Use this config when serializing a large model after optimization to specify an external initializers file
-static const char* const kOrtSessionOptionsOptimizedModelExternalInitializersFileName =
-    "session.optimized_model_external_initializers_file_name";
-
-// Use this config to control the minimum size of the initializer when externalizing it during serialization
-static const char* const kOrtSessionOptionsOptimizedModelExternalInitializersMinSizeInBytes =
-    "session.optimized_model_external_initializers_min_size_in_bytes";
-
-// Enable EP context feature to dump the partitioned graph which includes the EP context into Onnx file.
-// The dumped Onnx model with EP context can be used for future inference to avoid the EP graph partitioning/compile overhead.
-// "0": disable. (default)
-// "1": enable.
-static const char* const kOrtSessionOptionEpContextEnable = "ep.context_enable";
-
-// Specify the file path for the Onnx model which has EP context.
-// Default to original_file_name_ctx.onnx if not specified
-static const char* const kOrtSessionOptionEpContextFilePath = "ep.context_file_path";
-
-// Flag to specify whether to dump the EP context into the Onnx model.
-// "0": dump the EP context into separate file, keep the file name in the Onnx model.
-// "1": dump the EP context into the Onnx model. (default).
-static const char* const kOrtSessionOptionEpContextEmbedMode = "ep.context_embed_mode";
-
-// Gemm fastmath mode provides fp32 gemm acceleration with bfloat16 based matmul.
-// Option values:
-// - "0": Gemm FastMath mode is not enabled. DEFAULT
-// - "1": Gemm FastMath mode is enabled.
-static const char* const kOrtSessionOptionsMlasGemmFastMathArm64Bfloat16 = "mlas.enable_gemm_fastmath_arm64_bfloat16";
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/onnxruntime_training_c_api.h -> onnxruntime-linux-x64-gpu-1.15.1.tgz/include/onnxruntime_training_c_api.h
Changed
 *
 * In order to train a model with onnxruntime, the following training artifacts must be generated:
 * - The training onnx model
- * - The checkpoint file
+ * - The checkpoint directory
 * - The optimizer onnx model
 * - The eval onnx model model (optional)
 *

/// \name Accessing The Training Session State
/// @{

- /** \brief Load a checkpoint state from a file on disk into checkpoint_state.
+ /** \brief Load a checkpoint state from directory on disk into checkpoint_state.
  *
-  * This function will parse a checkpoint file, pull relevant data and load the training
+  * This function will parse a checkpoint directory, pull relevant files and load the training
  * state into the checkpoint_state. This checkpoint state can then be used to create the
  * training session by invoking OrtTrainingApi::CreateTrainingSession. By doing so, the training
  * session will resume training from the given checkpoint state.
  * training state (including model parameters, its gradients, the optimizer states and the properties).
  * As a result, it is required that the checkpoint state outlive the lifetime of the training session.
  *
-  * \param[in] checkpoint_path Path to the checkpoint file
+  * \param[in] checkpoint_path Path to the checkpoint directory
  * \param[out] checkpoint_state Checkpoint state that contains the states of the training session.
  *
  * \snippet{doc} snippets.dox OrtStatus Return Value
 ORT_API2_STATUS(LoadCheckpoint, _In_ const ORTCHAR_T* checkpoint_path,
                 _Outptr_ OrtCheckpointState** checkpoint_state);

- /** \brief Save the given state to a checkpoint file on disk.
+ /** \brief Save the given state to a checkpoint directory on disk.
  *
-  * This function serializes the provided checkpoint state to a file on disk.
+  * This function serializes the provided checkpoint state to a directory on disk.
  * This checkpoint can later be loaded by invoking OrtTrainingApi::LoadCheckpoint to resume
  * training from this snapshot of the state.
  *
  * \param[in] checkpoint_state The checkpoint state to save.
-  * \param[in] checkpoint_path Path to the checkpoint file.
+  * \param[in] checkpoint_path Path to the checkpoint directory.
  * \param[in] include_optimizer_state Flag to indicate whether to save the optimizer state or not.
  *
  * \snippet{doc} snippets.dox OrtStatus Return Value
  * - The training onnx model
  * - The evaluation onnx model (optional)
  * - The optimizer onnx model
-  * - The checkpoint file
+  * - The checkpoint directory
  *
  * These artifacts can be generated using the `onnxruntime-training` python [utility](https://github.com/microsoft/onnxruntime/blob/main/orttraining/orttraining/python/training/onnxblock/README.md).
  *
 ORT_API2_STATUS(CreateTrainingSession, _In_ const OrtEnv* env, _In_ const OrtSessionOptions* options,
                 _Inout_ OrtCheckpointState* checkpoint_state, _In_ const ORTCHAR_T* train_model_path,
                 _In_ const ORTCHAR_T* eval_model_path, _In_ const ORTCHAR_T* optimizer_model_path,
-                _Outptr_result_maybenull_ OrtTrainingSession** out);
-
- /** \brief Create a training session that can be used to begin or resume training.
-  * This api provides a way to load all the training artifacts from buffers instead of files.
-  *
-  * \param[in] env Environment to be used for the training session.
-  * \param[in] options Session options that the user can customize for this training session.
-  * \param[in] checkpoint_state Training states that the training session uses as a starting point for training.
-  * \param[in] train_model_data Buffer containing the model data to be used to perform training
-  * \param[in] train_data_length Length of the buffer containing train_model_data
-  * \param[in] eval_model_data Buffer containing the model data to be used to perform evaluation
-  * \param[in] eval_data_length Length of the buffer containing eval_model_data
-  * \param[in] optim_model_data Buffer containing the model data to be used to perform weight update
-  * \param[in] optim_data_length Length of the buffer containing optim_model_data
-  * \param[out] out Created training session.
-  *
-  */
- ORT_API2_STATUS(CreateTrainingSessionFromBuffer, _In_ const OrtEnv* env,
-                 _In_ const OrtSessionOptions* options, _Inout_ OrtCheckpointState* checkpoint_state,
-                 _In_ const void* train_model_data, size_t train_data_length,
-                 _In_ const void* eval_model_data, size_t eval_data_length,
-                 _In_ const void* optim_model_data, size_t optim_data_length,
-                 _Outptr_result_maybenull_ OrtTrainingSession** out);
+                _Outptr_ OrtTrainingSession** out);

/// @}

/// \name Accessing The Training Session State
/// @{

- /** \brief Adds or updates the given property to/in the checkpoint state.
+ /** \brief Adds the given property to the checkpoint state.
  *
  * Runtime properties such as epoch, training step, best score, and others can be added to the checkpoint
-  * state by the user by calling this function with the corresponding property name and value.
-  * The given property name must be unique to be able to successfully add the property.
+  * state by the user if they desire by calling this function with the appropriate property name and
+  * value. The given property name must be unique to be able to successfully add the property.
  *
  * \param[in] checkpoint_state The checkpoint state which should hold the property.
-  * \param[in] property_name Name of the property being added or updated.
+  * \param[in] property_name Unique name of the property being added.
  * \param[in] property_type Type of the property associated with the given name.
  * \param[in] property_value Property value associated with the given name.
  *
  * exist in the checkpoint state to be able to retrieve it successfully.
  *
  * \param[in] checkpoint_state The checkpoint state that is currently holding the property.
-  * \param[in] property_name Name of the property being retrieved.
+  * \param[in] property_name Unique name of the property being retrieved.
  * \param[in] allocator Allocator used to allocate the memory for the property_value.
  * \param[out] property_type Type of the property associated with the given name.
  * \param[out] property_value Property value associated with the given name.
                _Out_ enum OrtPropertyType* property_type, _Outptr_ void** property_value);

/// @}
-
- /// \name Accessing The Training Session State
- /// @{
-
- /** \brief Load a checkpoint state from a buffer into checkpoint_state.
-  *
-  * This function will parse a checkpoint bytes buffer, pull relevant data and load the training
-  * state into the checkpoint_state. This checkpoint state can then be used to create the
-  * training session by invoking OrtTrainingApi::CreateTrainingSession. By doing so, the training
-  * session will resume training from the given checkpoint state.
-  * \note Note that the training session created with a checkpoint state uses this state to store the entire
-  * training state (including model parameters, its gradients, the optimizer states and the properties).
-  * As a result, it is required that the checkpoint state outlive the lifetime of the training session.
-  *
-  * \param[in] checkpoint_buffer Path to the checkpoint bytes buffer.
-  * \param[in] num_bytes Number of bytes in the checkpoint buffer.
-  * \param[out] checkpoint_state Checkpoint state that contains the states of the training session.
-  *
-  * \snippet{doc} snippets.dox OrtStatus Return Value
-  *
-  */
- ORT_API2_STATUS(LoadCheckpointFromBuffer, _In_ const void* checkpoint_buffer,
-                 _In_ const size_t num_bytes, _Outptr_ OrtCheckpointState** checkpoint_state);
-
- /** \brief Retrieves the type and shape information of the parameter associated with the given parameter name.
-  *
-  * This function retrieves the type and shape of the parameter associated with the given parameter name.
-  * The parameter must exist in the checkpoint state to be able to retrieve its type and shape information successfully.
-  *
-  * \param[in] checkpoint_state The checkpoint state.
-  * \param[in] parameter_name Name of the parameter being retrieved.
-  * \param[out] parameter_type_and_shape The type and shape of the parameter being retrieved.
-  *
-  * \snippet{doc} snippets.dox OrtStatus Return Value
-  *
-  */
- ORT_API2_STATUS(GetParameterTypeAndShape, _In_ const OrtCheckpointState* checkpoint_state,
-                 _In_ const char* parameter_name, _Outptr_ OrtTensorTypeAndShapeInfo** parameter_type_and_shape);
-
- /** \brief Updates the data associated with the model parameter in the checkpoint state for the given parameter name.
-  *
-  * This function updates a model parameter in the checkpoint state with the given parameter data.
-  * The training session must be already created with the checkpoint state that contains the parameter
-  * being updated. The given parameter is copied over to the registered device for the training session.
-  * The parameter must exist in the checkpoint state to be able to update it successfully.
-  *
-  * \param[in] checkpoint_state The checkpoint state.
-  * \param[in] parameter_name Name of the parameter being updated.
-  * \param[in] parameter The parameter data that should replace the existing parameter data.
-  *
-  * \snippet{doc} snippets.dox OrtStatus Return Value
-  *
-  */
- ORT_API2_STATUS(UpdateParameter, _Inout_ OrtCheckpointState* checkpoint_state,
-                 _In_ const char* parameter_name, _In_ OrtValue* parameter);
-
- /** \brief Gets the data associated with the model parameter from the checkpoint state for the given parameter name.
-  *
-  * This function retrieves the model parameter data from the checkpoint state for the given parameter name.
-  * The parameter is copied over and returned as an OrtValue. The training session must be already created
-  * with the checkpoint state that contains the parameter being retrieved.
-  * The parameter must exist in the checkpoint state to be able to retrieve it successfully.
-  *
-  * \param[in] checkpoint_state The checkpoint state.
-  * \param[in] parameter_name Name of the parameter being retrieved.
-  * \param[in] allocator Allocator used to allocate the memory for the parameter.
-  * \param[out] parameter The parameter data that is retrieved from the checkpoint state.
-  *
-  * \snippet{doc} snippets.dox OrtStatus Return Value
-  *
-  */
- ORT_API2_STATUS(GetParameter, _In_ const OrtCheckpointState* checkpoint_state,
-                 _In_ const char* parameter_name, _Inout_ OrtAllocator* allocator,
-                 _Outptr_ OrtValue** parameter);
-
- /// @}
};

typedef struct OrtTrainingApi OrtTrainingApi;
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/onnxruntime_training_cxx_api.h -> onnxruntime-linux-x64-gpu-1.15.1.tgz/include/onnxruntime_training_cxx_api.h
Changed
/// \name Accessing The Training Session State
/// @{

- /** \brief Load a checkpoint state from a file on disk into checkpoint_state.
+ /** \brief Load a checkpoint state from directory on disk into checkpoint_state.
  *
-  * This function will parse a checkpoint file, pull relevant data and load the training
+  * This function will parse a checkpoint directory, pull relevant files and load the training
  * state and return an instance of Ort::CheckpointState. This checkpoint state can then be used to create the
  * training session by instantiating Ort::TrainingSession. By doing so, the training session will resume
  * training from the given checkpoint state.
  *
-  * \param[in] path_to_checkpoint Path to the checkpoint file
+  * \param[in] path_to_checkpoint Path to the checkpoint directory
  * \return Ort::CheckpointState object which holds the state of the training session parameters.
  *
  */
 static CheckpointState LoadCheckpoint(const std::basic_string<ORTCHAR_T>& path_to_checkpoint);

- /** \brief Load a checkpoint state from a buffer.
+ /** \brief Save the given state to a checkpoint directory on disk.
  *
-  * This function will parse a checkpoint buffer, pull relevant data and load the training
-  * state and return an instance of Ort::CheckpointState. This checkpoint state can then be used to create the
-  * training session by instantiating Ort::TrainingSession. By doing so, the training session will resume
-  * training from the given checkpoint state.
-  *
-  * \param[in] buffer Buffer containing the checkpoint data.
-  * \return Ort::CheckpointState object which holds the state of the training session parameters.
-  *
-  */
- static CheckpointState LoadCheckpointFromBuffer(const std::vector<uint8_t>& buffer);
-
- /** \brief Save the given state to a checkpoint file on disk.
-  *
-  * This function serializes the provided checkpoint state to a file on disk.
+  * This function serializes the provided checkpoint state to a directory on disk.
  * This checkpoint can later be loaded by invoking Ort::CheckpointState::LoadCheckpoint to resume
  * training from this snapshot of the state.
  *
  * \param[in] checkpoint_state The checkpoint state to save.
-  * \param[in] path_to_checkpoint Path to the checkpoint file.
+  * \param[in] path_to_checkpoint Path to the checkpoint directory.
  * \param[in] include_optimizer_state Flag to indicate whether to save the optimizer state or not.
  *
  */
                      const std::basic_string<ORTCHAR_T>& path_to_checkpoint,
                      const bool include_optimizer_state = false);

- /** \brief Adds or updates the given property to/in the checkpoint state.
+ /** \brief Adds the given property to the checkpoint state.
  *
  * Runtime properties such as epoch, training step, best score, and others can be added to the checkpoint
-  * state by the user by calling this function with the corresponding property name and value.
-  * The given property name must be unique to be able to successfully add the property.
+  * state by the user if they desire by calling this function with the appropriate property name and
+  * value. The given property name must be unique to be able to successfully add the property.
  *
-  * \param[in] property_name Name of the property being added or updated.
+  * \param[in] property_name Unique name of the property being added.
  * \param[in] property_value Property value associated with the given name.
  *
  */
  * Gets the property value from an existing entry in the checkpoint state. The property must
  * exist in the checkpoint state to be able to retrieve it successfully.
  *
-  * \param[in] property_name Name of the property being retrieved.
+  * \param[in] property_name Unique name of the property being retrieved.
  * \return Property value associated with the given property name.
  *
  */
 Property GetProperty(const std::string& property_name);

- /** \brief Updates the data associated with the model parameter in the checkpoint state for the given parameter name.
-  *
-  * This function updates a model parameter in the checkpoint state with the given parameter data.
-  * The training session must be already created with the checkpoint state that contains the parameter
-  * being updated. The given parameter is copied over to the registered device for the training session.
-  * The parameter must exist in the checkpoint state to be able to update it successfully.
-  *
-  * \param[in] parameter_name Name of the parameter being updated.
-  * \param[in] parameter The parameter data that should replace the existing parameter data.
-  *
-  */
- void UpdateParameter(const std::string& parameter_name, const Value& parameter);
-
- /** \brief Gets the data associated with the model parameter from the checkpoint state for the given parameter name.
-  *
-  * This function retrieves the model parameter data from the checkpoint state for the given parameter name.
-  * The parameter is copied over to the provided OrtValue. The training session must be already created
-  * with the checkpoint state that contains the parameter being retrieved.
-  * The parameter must exist in the checkpoint state to be able to retrieve it successfully.
-  *
-  * \param[in] parameter_name Name of the parameter being retrieved.
-  * \return The parameter data that is retrieved from the checkpoint state.
-  *
-  */
- Value GetParameter(const std::string& parameter_name);
-
/// @}
};

 * - The training onnx model
 * - The evaluation onnx model (optional)
 * - The optimizer onnx model
- * - The checkpoint file
+ * - The checkpoint directory
 *
 * These artifacts can be generated using the `onnxruntime-training` python [utility](https://github.com/microsoft/onnxruntime/blob/main/orttraining/orttraining/python/training/onnxblock/README.md).
 *
                const std::optional<std::basic_string<ORTCHAR_T>>& eval_model_path = std::nullopt,
                const std::optional<std::basic_string<ORTCHAR_T>>& optimizer_model_path = std::nullopt);

- /** \brief Create a training session that can be used to begin or resume training.
-  * This constructor allows the users to load the models from buffers instead of files.
-  *
-  * \param[in] env Env to be used for the training session.
-  * \param[in] session_options SessionOptions that the user can customize for this training session.
-  * \param[in] checkpoint_state Training states that the training session uses as a starting point for training.
-  * \param[in] train_model_data Buffer containing training model data.
-  * \param[in] eval_model_data Buffer containing evaluation model data.
-  * \param[in] optim_model_data Buffer containing optimizer model (used for performing weight/parameter update).
-  *
-  */
- TrainingSession(const Env& env, const SessionOptions& session_options, CheckpointState& checkpoint_state,
-                 const std::vector<uint8_t>& train_model_data, const std::vector<uint8_t>& eval_model_data = {},
-                 const std::vector<uint8_t>& optim_model_data = {});
/// @}

/// \name Implementing The Training Loop
 * \param[in] input_values The user inputs to the training model.
 * \return A std::vector of Ort::Value objects that represents the output of the forward pass of the training model.
 *
+ * \snippet{doc} snippets.dox OrtStatus Return Value
 *
 */
std::vector<Value> TrainStep(const std::vector<Value>& input_values);
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/onnxruntime_training_cxx_inline.h -> onnxruntime-linux-x64-gpu-1.15.1.tgz/include/onnxruntime_training_cxx_inline.h
Changed
  ThrowOnError(GetTrainingApi().TrainingSessionGetEvalModelOutputCount(p_, &eval_model_output_count_));
}

-inline TrainingSession::TrainingSession(const Env& env, const SessionOptions& session_options,
-                                        CheckpointState& checkpoint_state,
-                                        const std::vector<uint8_t>& train_model_data,
-                                        const std::vector<uint8_t>& eval_model_data,
-                                        const std::vector<uint8_t>& optim_model_data) {
-  ThrowOnError(GetTrainingApi().CreateTrainingSessionFromBuffer(
-      env, session_options, checkpoint_state,
-      train_model_data.data(), train_model_data.size(),
-      eval_model_data.data(), eval_model_data.size(),
-      optim_model_data.data(), optim_model_data.size(),
-      &p_));
-
-  ThrowOnError(GetTrainingApi().TrainingSessionGetTrainingModelOutputCount(p_, &training_model_output_count_));
-
-  ThrowOnError(GetTrainingApi().TrainingSessionGetEvalModelOutputCount(p_, &eval_model_output_count_));
-}
-
inline std::vector<Value> TrainingSession::TrainStep(const std::vector<Value>& input_values) {
  std::vector<Value> output_values;
  output_values.reserve(training_model_output_count_);

  RunOptions run_options;
  ThrowOnError(GetTrainingApi().EvalStep(
      p_, run_options, input_values.size(), ort_input_values,
-     eval_model_output_count_, ort_output_values));
+     training_model_output_count_, ort_output_values));

  return output_values;
}

  return CheckpointState(checkpoint_state);
}

-inline CheckpointState CheckpointState::LoadCheckpointFromBuffer(const std::vector<uint8_t>& buffer) {
-  OrtCheckpointState* checkpoint_state;
-  ThrowOnError(GetTrainingApi().LoadCheckpointFromBuffer(buffer.data(), buffer.size(), &checkpoint_state));
-  return CheckpointState(checkpoint_state);
-}
-
inline void CheckpointState::SaveCheckpoint(const CheckpointState& checkpoint_states,
                                            const std::basic_string<ORTCHAR_T>& path_to_checkpoint,
                                            const bool include_optimizer_state) {

    ThrowOnError(GetTrainingApi().AddProperty(p_, property_name.c_str(), OrtPropertyType::OrtFloatProperty, value_p));
  } else if (std::holds_alternative<std::string>(property_value)) {
    std::string value = std::get<std::string>(property_value);
-   auto buffer = std::make_unique<char[]>(value.length() + 1);
-   memcpy(buffer.get(), value.c_str(), value.length());
-   // AddProperty takes a char* and calls PropertyBag::AddProperty which takes a std::string. The data will be
-   // copied at that point so buffer can free the local allocation once the call is made.
-   ThrowOnError(GetTrainingApi().AddProperty(p_, property_name.c_str(), OrtPropertyType::OrtStringProperty,
-                buffer.get()));
+   auto buffer = std::make_unique<char[]>(value.length() + 1).release();
+   memcpy(buffer, value.c_str(), value.length());
+   ThrowOnError(GetTrainingApi().AddProperty(p_, property_name.c_str(), OrtPropertyType::OrtStringProperty, buffer));
  } else {
    ThrowStatus(Status("Unknown property type received.", OrtErrorCode::ORT_INVALID_ARGUMENT));
  }

  return property;
}

-inline void CheckpointState::UpdateParameter(const std::string& parameter_name, const Value& parameter) {
-  ThrowOnError(GetTrainingApi().UpdateParameter(p_, parameter_name.c_str(), parameter));
-}
-
-inline Value CheckpointState::GetParameter(const std::string& parameter_name) {
-  AllocatorWithDefaultOptions allocator;
-  OrtValue* parameter;
-  ThrowOnError(GetTrainingApi().GetParameter(p_, parameter_name.c_str(), allocator, &parameter));
-
-  return Value{parameter};
-}
-
} // namespace Ort
onnxruntime-linux-x64-gpu-1.15.1.tgz/include/tensorrt_provider_factory.h
Added

+// Copyright (c) Microsoft Corporation. All rights reserved.
+// Licensed under the MIT License.
+
+#include "onnxruntime_c_api.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+ORT_API_STATUS(OrtSessionOptionsAppendExecutionProvider_Tensorrt, _In_ OrtSessionOptions* options, int device_id);
+
+#ifdef __cplusplus
+}
+#endif
onnxruntime-linux-x64-gpu-1.17.1.tgz/lib/libonnxruntime.so -> onnxruntime-linux-x64-gpu-1.15.1.tgz/lib/libonnxruntime.so
Changed
-(symlink to libonnxruntime.so.1.17.1)
+(symlink to libonnxruntime.so.1.15.1)
onnxruntime-linux-x64-gpu-1.15.1.tgz/lib/libonnxruntime.so.1.15.1
Added
onnxruntime-linux-x64-gpu-1.17.1.tgz/lib/libonnxruntime_providers_cuda.so -> onnxruntime-linux-x64-gpu-1.15.1.tgz/lib/libonnxruntime_providers_cuda.so
Changed
onnxruntime-linux-x64-gpu-1.17.1.tgz/lib/libonnxruntime_providers_shared.so -> onnxruntime-linux-x64-gpu-1.15.1.tgz/lib/libonnxruntime_providers_shared.so
Changed
onnxruntime-linux-x64-gpu-1.17.1.tgz/lib/libonnxruntime_providers_tensorrt.so -> onnxruntime-linux-x64-gpu-1.15.1.tgz/lib/libonnxruntime_providers_tensorrt.so
Changed
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/core
Deleted
-(directory)
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/core/providers
Deleted
-(directory)
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/core/providers/cuda
Deleted
-(directory)
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/core/providers/cuda/cuda_context.h
Deleted

-// Copyright (c) Microsoft Corporation. All rights reserved.
-// Licensed under the MIT License.
-
-// This header is to expose a context for cuda custom ops.
-// By the context, a custom cuda operator could fetch existing resources,
-// such as cuda stream and cudnn handle, for reusing.
-
-// For concrete usage, pls find page here:
-// https://onnxruntime.ai/docs/reference/operators/add-custom-op.html#custom-ops-for-cuda-and-rocm
-
-#pragma once
-
-#define ORT_CUDA_CTX
-
-#include "cuda_resource.h"
-#include "core/providers/custom_op_context.h"
-#include <cuda.h>
-#include <cuda_runtime.h>
-#include <cublas_v2.h>
-#include <cudnn.h>
-
-namespace Ort {
-
-namespace Custom {
-
-struct CudaContext : public CustomOpContext {
-  cudaStream_t cuda_stream = {};
-  cudnnHandle_t cudnn_handle = {};
-  cublasHandle_t cublas_handle = {};
-  OrtAllocator* deferred_cpu_allocator = {};
-  // below are cuda ep options
-  int16_t device_id = 0;
-  int32_t arena_extend_strategy = 0;
-  int32_t cudnn_conv_algo_search = 0;
-  bool cudnn_conv_use_max_workspace = true;
-  bool cudnn_conv1d_pad_to_nc1d = false;
-  bool enable_skip_layer_norm_strict_mode = false;
-  bool prefer_nhwc = false;
-
-  void Init(const OrtKernelContext& kernel_ctx) {
-    cuda_stream = FetchResource<cudaStream_t>(kernel_ctx, CudaResource::cuda_stream_t);
-    cudnn_handle = FetchResource<cudnnHandle_t>(kernel_ctx, CudaResource::cudnn_handle_t);
-    cublas_handle = FetchResource<cublasHandle_t>(kernel_ctx, CudaResource::cublas_handle_t);
-    deferred_cpu_allocator = FetchResource<OrtAllocator*>(kernel_ctx, CudaResource::deferred_cpu_allocator_t);
-
-    device_id = FetchResource<int16_t>(kernel_ctx, CudaResource::device_id_t);
-    arena_extend_strategy = FetchResource<int32_t>(kernel_ctx, CudaResource::arena_extend_strategy_t);
-    cudnn_conv_algo_search = FetchResource<int32_t>(kernel_ctx, CudaResource::cudnn_conv_algo_search_t);
-    cudnn_conv_use_max_workspace = FetchResource<bool>(kernel_ctx, CudaResource::cudnn_conv_use_max_workspace_t);
-
-    cudnn_conv1d_pad_to_nc1d = FetchResource<bool>(kernel_ctx, CudaResource::cudnn_conv1d_pad_to_nc1d_t);
-    enable_skip_layer_norm_strict_mode = FetchResource<bool>(kernel_ctx, CudaResource::enable_skip_layer_norm_strict_mode_t);
-    prefer_nhwc = FetchResource<bool>(kernel_ctx, CudaResource::prefer_nhwc_t);
-  }
-
-  template <typename T>
-  T FetchResource(const OrtKernelContext& kernel_ctx, CudaResource resource_type) {
-    if (sizeof(T) > sizeof(void*)) {
-      ORT_CXX_API_THROW("void* is not large enough to hold resource type: " + std::to_string(resource_type), OrtErrorCode::ORT_INVALID_ARGUMENT);
-    }
-    const auto& ort_api = Ort::GetApi();
-    void* resource = {};
-    OrtStatus* status = ort_api.KernelContext_GetResource(&kernel_ctx, ORT_CUDA_RESOUCE_VERSION, resource_type, &resource);
-    if (status) {
-      ORT_CXX_API_THROW("Failed to fetch cuda ep resource, resouce type: " + std::to_string(resource_type), OrtErrorCode::ORT_RUNTIME_EXCEPTION);
-    }
-    T t = {};
-    memcpy(&t, &resource, sizeof(T));
-    return t;
-  }
-
-  void* AllocDeferredCpuMem(size_t size) const {
-    if (0 == size) {
-      return {};
-    }
-    const auto& ort_api = Ort::GetApi();
-    void* mem = {};
-    auto status = ort_api.AllocatorAlloc(deferred_cpu_allocator, size, &mem);
-    if (status) {
-      ORT_CXX_API_THROW("failed to allocate deferred cpu memory", OrtErrorCode::ORT_RUNTIME_EXCEPTION);
-    }
-    return mem;
-  }
-
-  void FreeDeferredCpuMem(void* mem) const {
-    if (mem) {
-      const auto& ort_api = Ort::GetApi();
-      auto status = ort_api.AllocatorFree(deferred_cpu_allocator, mem);
-      if (status) {
-        ORT_CXX_API_THROW("failed to free deferred cpu memory", OrtErrorCode::ORT_RUNTIME_EXCEPTION);
-      }
-    }
-  }
-};
-
-}  // namespace Custom
-}  // namespace Ort
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/core/providers/cuda/cuda_resource.h
Deleted

-// Copyright (c) Microsoft Corporation. All rights reserved.
-// Licensed under the MIT License.
-
-#include "core/providers/resource.h"
-
-#define ORT_CUDA_RESOUCE_VERSION 3
-
-enum CudaResource : int {
-  cuda_stream_t = cuda_resource_offset,  // 10000
-  cudnn_handle_t,
-  cublas_handle_t,
-  deferred_cpu_allocator_t,
-  // below are cuda ep options
-  device_id_t,  // 10004
-  arena_extend_strategy_t,
-  cudnn_conv_algo_search_t,
-  cudnn_conv_use_max_workspace_t,
-  cudnn_conv1d_pad_to_nc1d_t,
-  enable_skip_layer_norm_strict_mode_t,
-  prefer_nhwc_t,
-};
\ No newline at end of file
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/core/providers/custom_op_context.h
Deleted

-// Copyright (c) Microsoft Corporation. All rights reserved.
-// Licensed under the MIT License.
-
-#pragma once
-
-// CustomOpContext defines an interface allowing a custom op to access ep-specific resources.
-struct CustomOpContext {
-  CustomOpContext() = default;
-  virtual ~CustomOpContext(){};
-};
\ No newline at end of file
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/core/providers/resource.h
Deleted

-// Copyright (c) Microsoft Corporation. All rights reserved.
-// Licensed under the MIT License.
-
-#pragma once
-
-enum ResourceOffset {
-  cpu_resource_offset = 0,
-  cuda_resource_offset = 10000,
-  dml_resource_offset = 20000,
-  rocm_resource_offset = 30000,
-  // offsets for other ort eps
-  custom_ep_resource_offset = 10000000,
-  // offsets for customized eps
-};
\ No newline at end of file
17
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/onnxruntime_float16.h
Deleted

-// Copyright (c) Microsoft Corporation. All rights reserved.
-// Licensed under the MIT License.
-
-#pragma once
-
-#include <stdint.h>
-#include <cmath>
-#include <cstring>
-#include <limits>
-
-namespace onnxruntime_float16 {
-
-namespace detail {
-
-enum class endian {
-#if defined(_WIN32)
-  little = 0,
-  big = 1,
-  native = little,
-#elif defined(__GNUC__) || defined(__clang__)
-  little = __ORDER_LITTLE_ENDIAN__,
-  big = __ORDER_BIG_ENDIAN__,
-  native = __BYTE_ORDER__,
-#else
-#error onnxruntime_float16::detail::endian is not implemented in this environment.
-#endif
-};
-
-static_assert(
-    endian::native == endian::little || endian::native == endian::big,
-    "Only little-endian or big-endian native byte orders are supported.");
-
-}  // namespace detail
-
-/// <summary>
-/// Shared implementation between public and internal classes. CRTP pattern.
-/// </summary>
-template <class Derived>
-struct Float16Impl {
- protected:
-  /// <summary>
-  /// Converts from float to uint16_t float16 representation
-  /// </summary>
-  /// <param name="v"></param>
-  /// <returns></returns>
-  constexpr static uint16_t ToUint16Impl(float v) noexcept;
-
-  /// <summary>
-  /// Converts float16 to float
-  /// </summary>
-  /// <returns>float representation of float16 value</returns>
-  float ToFloatImpl() const noexcept;
-
-  /// <summary>
-  /// Creates an instance that represents absolute value.
-  /// </summary>
-  /// <returns>Absolute value</returns>
-  uint16_t AbsImpl() const noexcept {
-    return static_cast<uint16_t>(val & ~kSignMask);
-  }
-
-  /// <summary>
-  /// Creates a new instance with the sign flipped.
-  /// </summary>
-  /// <returns>Flipped sign instance</returns>
-  uint16_t NegateImpl() const noexcept {
-    return IsNaN() ? val : static_cast<uint16_t>(val ^ kSignMask);
-  }
-
- public:
-  // uint16_t special values
-  static constexpr uint16_t kSignMask = 0x8000U;
-  static constexpr uint16_t kBiasedExponentMask = 0x7C00U;
-  static constexpr uint16_t kPositiveInfinityBits = 0x7C00U;
-  static constexpr uint16_t kNegativeInfinityBits = 0xFC00U;
-  static constexpr uint16_t kPositiveQNaNBits = 0x7E00U;
-  static constexpr uint16_t kNegativeQNaNBits = 0xFE00U;
-  static constexpr uint16_t kEpsilonBits = 0x4170U;
-  static constexpr uint16_t kMinValueBits = 0xFBFFU;  // Minimum normal number
-  static constexpr uint16_t kMaxValueBits = 0x7BFFU;  // Largest normal number
-  static constexpr uint16_t kOneBits = 0x3C00U;
-  static constexpr uint16_t kMinusOneBits = 0xBC00U;
-
-  uint16_t val{0};
-
-  Float16Impl() = default;
-
-  /// <summary>
-  /// Checks if the value is negative
-  /// </summary>
-  /// <returns>true if negative</returns>
-  bool IsNegative() const noexcept {
-    return static_cast<int16_t>(val) < 0;
-  }
-
-  /// <summary>
-  /// Tests if the value is NaN
-  /// </summary>
-  /// <returns>true if NaN</returns>
-  bool IsNaN() const noexcept {
-    return AbsImpl() > kPositiveInfinityBits;
-  }
-
-  /// <summary>
-  /// Tests if the value is finite
-  /// </summary>
-  /// <returns>true if finite</returns>
-  bool IsFinite() const noexcept {
-    return AbsImpl() < kPositiveInfinityBits;
-  }
-
-  /// <summary>
-  /// Tests if the value represents positive infinity.
-  /// </summary>
-  /// <returns>true if positive infinity</returns>
-  bool IsPositiveInfinity() const noexcept {
-    return val == kPositiveInfinityBits;
-  }
-
-  /// <summary>
-  /// Tests if the value represents negative infinity
-  /// </summary>
-  /// <returns>true if negative infinity</returns>
-  bool IsNegativeInfinity() const noexcept {
-    return val == kNegativeInfinityBits;
-  }
-
-  /// <summary>
-  /// Tests if the value is either positive or negative infinity.
-  /// </summary>
-  /// <returns>True if absolute value is infinity</returns>
-  bool IsInfinity() const noexcept {
-    return AbsImpl() == kPositiveInfinityBits;
-  }
-
-  /// <summary>
-  /// Tests if the value is NaN or zero. Useful for comparisons.
-  /// </summary>
-  /// <returns>True if NaN or zero.</returns>
-  bool IsNaNOrZero() const noexcept {
-    auto abs = AbsImpl();
-    return (abs == 0 || abs > kPositiveInfinityBits);
-  }
-
-  /// <summary>
-  /// Tests if the value is normal (not zero, subnormal, infinite, or NaN).
-  /// </summary>
-  /// <returns>True if so</returns>
-  bool IsNormal() const noexcept {
-    auto abs = AbsImpl();
-    return (abs < kPositiveInfinityBits)           // is finite
-           && (abs != 0)                           // is not zero
-           && ((abs & kBiasedExponentMask) != 0);  // is not subnormal (has a non-zero exponent)
-  }
-
-  /// <summary>
-  /// Tests if the value is subnormal (denormal).
-  /// </summary>
-  /// <returns>True if so</returns>
-  bool IsSubnormal() const noexcept {
-    auto abs = AbsImpl();
-    return (abs < kPositiveInfinityBits)           // is finite
-           && (abs != 0)                           // is not zero
-           && ((abs & kBiasedExponentMask) == 0);  // is subnormal (has a zero exponent)
-  }
-
-  /// <summary>
-  /// Creates an instance that represents absolute value.
-  /// </summary>
-  /// <returns>Absolute value</returns>
-  Derived Abs() const noexcept { return Derived::FromBits(AbsImpl()); }
-
-  /// <summary>
-  /// Creates a new instance with the sign flipped.
-  /// </summary>
-  /// <returns>Flipped sign instance</returns>
-  Derived Negate() const noexcept { return Derived::FromBits(NegateImpl()); }
-
-  /// <summary>
-  /// IEEE defines that positive and negative zero are equal, this gives us a quick equality check
-  /// for two values by or'ing the private bits together and stripping the sign. They are both zero,
-  /// and therefore equivalent, if the resulting value is still zero.
-  /// </summary>
-  /// <param name="lhs">first value</param>
-  /// <param name="rhs">second value</param>
-  /// <returns>True if both arguments represent zero</returns>
-  static bool AreZero(const Float16Impl& lhs, const Float16Impl& rhs) noexcept {
-    return static_cast<uint16_t>((lhs.val | rhs.val) & ~kSignMask) == 0;
-  }
-
-  bool operator==(const Float16Impl& rhs) const noexcept {
-    if (IsNaN() || rhs.IsNaN()) {
-      // IEEE defines that NaN is not equal to anything, including itself.
-      return false;
-    }
-    return val == rhs.val;
-  }
-
-  bool operator!=(const Float16Impl& rhs) const noexcept { return !(*this == rhs); }
-
-  bool operator<(const Float16Impl& rhs) const noexcept {
-    if (IsNaN() || rhs.IsNaN()) {
-      // IEEE defines that NaN is unordered with respect to everything, including itself.
-      return false;
-    }
-
-    const bool left_is_negative = IsNegative();
-    if (left_is_negative != rhs.IsNegative()) {
-      // When the signs of left and right differ, we know that left is less than right if it is
-      // the negative value. The exception to this is if both values are zero, in which case IEEE
-      // says they should be equal, even if the signs differ.
-      return left_is_negative && !AreZero(*this, rhs);
-    }
-    return (val != rhs.val) && ((val < rhs.val) ^ left_is_negative);
-  }
-};
-
-// The following Float16_t conversions are based on the code from
-// Eigen library.
-
-// The conversion routines are Copyright (c) Fabian Giesen, 2016.
-// The original license follows:
-//
-// Copyright (c) Fabian Giesen, 2016
-// All rights reserved.
-// Redistribution and use in source and binary forms, with or without
-// modification, are permitted.
-// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
-// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
-// HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
-// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
-// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
-// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
-// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
-// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-
-namespace detail {
-union float32_bits {
-  unsigned int u;
-  float f;
-};
-}  // namespace detail
-
-template <class Derived>
-inline constexpr uint16_t Float16Impl<Derived>::ToUint16Impl(float v) noexcept {
-  detail::float32_bits f{};
-  f.f = v;
-
-  constexpr detail::float32_bits f32infty = {255 << 23};
-  constexpr detail::float32_bits f16max = {(127 + 16) << 23};
-  constexpr detail::float32_bits denorm_magic = {((127 - 15) + (23 - 10) + 1) << 23};
-  constexpr unsigned int sign_mask = 0x80000000u;
-  uint16_t val = static_cast<uint16_t>(0x0u);
-
-  unsigned int sign = f.u & sign_mask;
-  f.u ^= sign;
-
-  // NOTE all the integer compares in this function can be safely
-  // compiled into signed compares since all operands are below
-  // 0x80000000. Important if you want fast straight SSE2 code
-  // (since there's no unsigned PCMPGTD).
-
-  if (f.u >= f16max.u) {                         // result is Inf or NaN (all exponent bits set)
-    val = (f.u > f32infty.u) ? 0x7e00 : 0x7c00;  // NaN->qNaN and Inf->Inf
-  } else {                                       // (De)normalized number or zero
-    if (f.u < (113 << 23)) {                     // resulting FP16 is subnormal or zero
-      // use a magic value to align our 10 mantissa bits at the bottom of
-      // the float. as long as FP addition is round-to-nearest-even this
-      // just works.
-      f.f += denorm_magic.f;
-
-      // and one integer subtract of the bias later, we have our final float!
-      val = static_cast<uint16_t>(f.u - denorm_magic.u);
-    } else {
-      unsigned int mant_odd = (f.u >> 13) & 1;  // resulting mantissa is odd
-
-      // update exponent, rounding bias part 1
-      // Equivalent to `f.u += ((unsigned int)(15 - 127) << 23) + 0xfff`, but
-      // without arithmetic overflow.
-      f.u += 0xc8000fffU;
-      // rounding bias part 2
-      f.u += mant_odd;
-      // take the bits!
-      val = static_cast<uint16_t>(f.u >> 13);
-    }
-  }
-
-  val |= static_cast<uint16_t>(sign >> 16);
-  return val;
-}
-
-template <class Derived>
-inline float Float16Impl<Derived>::ToFloatImpl() const noexcept {
-  constexpr detail::float32_bits magic = {113 << 23};
-  constexpr unsigned int shifted_exp = 0x7c00 << 13;  // exponent mask after shift
-  detail::float32_bits o{};
-
-  o.u = (val & 0x7fff) << 13;            // exponent/mantissa bits
-  unsigned int exp = shifted_exp & o.u;  // just the exponent
-  o.u += (127 - 15) << 23;               // exponent adjust
-
-  // handle exponent special cases
-  if (exp == shifted_exp) {    // Inf/NaN?
-    o.u += (128 - 16) << 23;   // extra exp adjust
-  } else if (exp == 0) {       // Zero/Denormal?
-    o.u += 1 << 23;            // extra exp adjust
-    o.f -= magic.f;            // re-normalize
-  }
-
-  // Attempt to workaround the Internal Compiler Error on ARM64
-  // for bitwise | operator, including std::bitset
-#if (defined _MSC_VER) && (defined _M_ARM || defined _M_ARM64 || defined _M_ARM64EC)
-  if (IsNegative()) {
-    return -o.f;
-  }
-#else
-  // original code:
-  o.u |= (val & 0x8000U) << 16U;  // sign bit
-#endif
-  return o.f;
-}
-
-/// Shared implementation between public and internal classes. CRTP pattern.
-template <class Derived>
-struct BFloat16Impl {
- protected:
-  /// <summary>
-  /// Converts from float to uint16_t float16 representation
-  /// </summary>
-  /// <param name="v"></param>
-  /// <returns></returns>
-  static uint16_t ToUint16Impl(float v) noexcept;
-
-  /// <summary>
-  /// Converts bfloat16 to float
-  /// </summary>
-  /// <returns>float representation of bfloat16 value</returns>
-  float ToFloatImpl() const noexcept;
-
-  /// <summary>
-  /// Creates an instance that represents absolute value.
-  /// </summary>
-  /// <returns>Absolute value</returns>
-  uint16_t AbsImpl() const noexcept {
-    return static_cast<uint16_t>(val & ~kSignMask);
-  }
-
-  /// <summary>
-  /// Creates a new instance with the sign flipped.
-  /// </summary>
-  /// <returns>Flipped sign instance</returns>
-  uint16_t NegateImpl() const noexcept {
-    return IsNaN() ? val : static_cast<uint16_t>(val ^ kSignMask);
-  }
-
- public:
-  // uint16_t special values
-  static constexpr uint16_t kSignMask = 0x8000U;
-  static constexpr uint16_t kBiasedExponentMask = 0x7F80U;
-  static constexpr uint16_t kPositiveInfinityBits = 0x7F80U;
-  static constexpr uint16_t kNegativeInfinityBits = 0xFF80U;
-  static constexpr uint16_t kPositiveQNaNBits = 0x7FC1U;
-  static constexpr uint16_t kNegativeQNaNBits = 0xFFC1U;
-  static constexpr uint16_t kSignaling_NaNBits = 0x7F80U;
-  static constexpr uint16_t kEpsilonBits = 0x0080U;
-  static constexpr uint16_t kMinValueBits = 0xFF7FU;
-  static constexpr uint16_t kMaxValueBits = 0x7F7FU;
-  static constexpr uint16_t kRoundToNearest = 0x7FFFU;
-  static constexpr uint16_t kOneBits = 0x3F80U;
-  static constexpr uint16_t kMinusOneBits = 0xBF80U;
-
-  uint16_t val{0};
-
-  BFloat16Impl() = default;
-
-  /// <summary>
-  /// Checks if the value is negative
-  /// </summary>
-  /// <returns>true if negative</returns>
-  bool IsNegative() const noexcept {
-    return static_cast<int16_t>(val) < 0;
-  }
-
-  /// <summary>
-  /// Tests if the value is NaN
-  /// </summary>
-  /// <returns>true if NaN</returns>
-  bool IsNaN() const noexcept {
-    return AbsImpl() > kPositiveInfinityBits;
-  }
-
-  /// <summary>
-  /// Tests if the value is finite
-  /// </summary>
-  /// <returns>true if finite</returns>
-  bool IsFinite() const noexcept {
-    return AbsImpl() < kPositiveInfinityBits;
-  }
-
-  /// <summary>
-  /// Tests if the value represents positive infinity.
-  /// </summary>
-  /// <returns>true if positive infinity</returns>
-  bool IsPositiveInfinity() const noexcept {
-    return val == kPositiveInfinityBits;
-  }
-
-  /// <summary>
-  /// Tests if the value represents negative infinity
-  /// </summary>
-  /// <returns>true if negative infinity</returns>
-  bool IsNegativeInfinity() const noexcept {
-    return val == kNegativeInfinityBits;
-  }
-
-  /// <summary>
-  /// Tests if the value is either positive or negative infinity.
-  /// </summary>
-  /// <returns>True if absolute value is infinity</returns>
-  bool IsInfinity() const noexcept {
-    return AbsImpl() == kPositiveInfinityBits;
-  }
-
-  /// <summary>
-  /// Tests if the value is NaN or zero. Useful for comparisons.
-  /// </summary>
-  /// <returns>True if NaN or zero.</returns>
-  bool IsNaNOrZero() const noexcept {
-    auto abs = AbsImpl();
-    return (abs == 0 || abs > kPositiveInfinityBits);
-  }
-
-  /// <summary>
-  /// Tests if the value is normal (not zero, subnormal, infinite, or NaN).
-  /// </summary>
-  /// <returns>True if so</returns>
-  bool IsNormal() const noexcept {
-    auto abs = AbsImpl();
-    return (abs < kPositiveInfinityBits)           // is finite
-           && (abs != 0)                           // is not zero
-           && ((abs & kBiasedExponentMask) != 0);  // is not subnormal (has a non-zero exponent)
-  }
-
-  /// <summary>
-  /// Tests if the value is subnormal (denormal).
-  /// </summary>
-  /// <returns>True if so</returns>
-  bool IsSubnormal() const noexcept {
-    auto abs = AbsImpl();
-    return (abs < kPositiveInfinityBits)           // is finite
-           && (abs != 0)                           // is not zero
-           && ((abs & kBiasedExponentMask) == 0);  // is subnormal (has a zero exponent)
-  }
-
-  /// <summary>
-  /// Creates an instance that represents absolute value.
-  /// </summary>
-  /// <returns>Absolute value</returns>
-  Derived Abs() const noexcept { return Derived::FromBits(AbsImpl()); }
-
-  /// <summary>
-  /// Creates a new instance with the sign flipped.
-  /// </summary>
-  /// <returns>Flipped sign instance</returns>
-  Derived Negate() const noexcept { return Derived::FromBits(NegateImpl()); }
-
-  /// <summary>
-  /// IEEE defines that positive and negative zero are equal, this gives us a quick equality check
-  /// for two values by or'ing the private bits together and stripping the sign. They are both zero,
-  /// and therefore equivalent, if the resulting value is still zero.
-  /// </summary>
-  /// <param name="lhs">first value</param>
-  /// <param name="rhs">second value</param>
-  /// <returns>True if both arguments represent zero</returns>
-  static bool AreZero(const BFloat16Impl& lhs, const BFloat16Impl& rhs) noexcept {
-    // IEEE defines that positive and negative zero are equal, this gives us a quick equality check
-    // for two values by or'ing the private bits together and stripping the sign. They are both zero,
-    // and therefore equivalent, if the resulting value is still zero.
-    return static_cast<uint16_t>((lhs.val | rhs.val) & ~kSignMask) == 0;
-  }
-};
-
-template <class Derived>
-inline uint16_t BFloat16Impl<Derived>::ToUint16Impl(float v) noexcept {
-  uint16_t result;
-  if (std::isnan(v)) {
-    result = kPositiveQNaNBits;
-  } else {
-    auto get_msb_half = [](float fl) {
-      uint16_t result;
-#ifdef __cpp_if_constexpr
-      if constexpr (detail::endian::native == detail::endian::little) {
-#else
-      if (detail::endian::native == detail::endian::little) {
-#endif
-        std::memcpy(&result, reinterpret_cast<char*>(&fl) + sizeof(uint16_t), sizeof(uint16_t));
-      } else {
-        std::memcpy(&result, &fl, sizeof(uint16_t));
-      }
-      return result;
-    };
-
-    uint16_t upper_bits = get_msb_half(v);
-    union {
-      uint32_t U32;
-      float F32;
-    };
-    F32 = v;
-    U32 += (upper_bits & 1) + kRoundToNearest;
-    result = get_msb_half(F32);
-  }
-  return result;
-}
-
-template <class Derived>
-inline float BFloat16Impl<Derived>::ToFloatImpl() const noexcept {
-  if (IsNaN()) {
-    return std::numeric_limits<float>::quiet_NaN();
-  }
-  float result;
-  char* const first = reinterpret_cast<char*>(&result);
-  char* const second = first + sizeof(uint16_t);
-#ifdef __cpp_if_constexpr
-  if constexpr (detail::endian::native == detail::endian::little) {
-#else
-  if (detail::endian::native == detail::endian::little) {
-#endif
-    std::memset(first, 0, sizeof(uint16_t));
-    std::memcpy(second, &val, sizeof(uint16_t));
-  } else {
-    std::memcpy(first, &val, sizeof(uint16_t));
-    std::memset(second, 0, sizeof(uint16_t));
-  }
-  return result;
-}
-
-}  // namespace onnxruntime_float16
onnxruntime-linux-x64-gpu-1.17.1.tgz/lib/libonnxruntime.so.1.17.1
Deleted
opencv-linux-Release-4.8.0-1.tar.gz -> opencv-4.7.0.tar.gz
Changed
Request History
umireon created request almost 2 years ago
Adding a new package
OBS Plugin: Portrait Background Removal / Virtual Green-screen and Low-Light Enhancement
umireon revoked request over 1 year ago
Updating to 1.1.5