We truncated the diff of some files because they were too big.
Overview
obs-backgroundremoval.changes
Changed
-------------------------------------------------------------------
-Tue Feb 18 12:35:51 UTC 2025 - Antonio Larrosa <alarrosa@suse.com>
+Mon Jun 26 17:51:23 UTC 2023 - Kaito Udagawa <umireon@gmail.com>

-- Update to 1.1.13
-  * Add video_tick function to background filter info
-  * Update Onnxruntime version and fix Windows compilerconfig
-- Update to 1.1.12
-  * Critical bugfix in the PSNR calculation for image-similarity
-    skipping in background filter
-- Update to 1.1.11
-  * New! RMBG model from Bria.AI
-    https://huggingface.co/briaai/RMBG-1.4 - remove background from
-    any object! (not just human)
-  * We got rid of the annoying "update available" message in favor
-    of a more discreet message on the plugin settings.
-  * Better handling of local file paths on Windows
-  * more.
-- Update to 1.1.10
-  * This release will fix the Flatpak recipe for Linux after the
-    dependency bump, as well as removing the start menu option from
-    the Windows installer.
-- Update to 1.1.9
-  * In this release we bumped versions of OpenCV and ONNXRuntime,
-    and trying to get rid of the annoying "smart screen" block on
-    Windows. We're also rolling out releases through AUR, Pacstall
-    and Flatpak. 💪 Linux!
-- Update to 1.1.8
-  * In this release we're introducing "simple mode" that hides most
-    of the settings under an "Advanced" checkbox, which should make
-    it far easier for newcomers to start using the filter without
-    "settings shock".
-  * Additionaly we implemented "temporal smoothing" that helps with
-    reducing the flickering of the edges in the binary mask.
-  * We bumped ONNX Runtime to v1.16.3 that increases robustness and
-    speed.
-  * We fixed the bug of the updater popping up the dialog because
-    we changed the repo URL.
-- Update to 1.1.7
-  * Upgrade to ONNXRuntime 1.16 which improves speed and
-    robustness.
-  * Repackaging of Mac OS release to a more consistent with Apple
-    dev tools.
-  * Fix crashes and bugs on Linux
-  * We added a new "website" for the plugin, which will eventually
-    have more installation info
-    https://occ-ai.github.io/obs-backgroundremoval/
-  * Adding a detailed log message with plugin info which helps us
-    debug
+Build only x86_64

-- Update onnxruntime to 1.17.1.tgz
-- Use Source URLs in the spec file
-- Add patch to fix a cmake error:
-  * fix-cmake-error.patch
+-------------------------------------------------------------------
+Mon Jun 26 16:29:21 UTC 2023 - Kaito Udagawa <umireon@gmail.com>
+
+v1.0.3
+
+-------------------------------------------------------------------
+Fri Jun 23 17:28:31 UTC 2023 - Kaito Udagawa <umireon@gmail.com>
+
+v1.0.2

-------------------------------------------------------------------
-Thu Sep 21 13:50:09 UTC 2023 - Kaito Udagawa <umireon@gmail.com>
+Wed Jun 21 16:43:20 UTC 2023 - Kaito Udagawa <umireon@gmail.com>

-- 1.1.6
+v1.0.1
obs-backgroundremoval.spec
Changed
Name: obs-backgroundremoval
-Version: 1.1.13
+Version: 1.0.3
Release: 0
Summary: OBS Plugin for Background Removal
License: GPL-2.0
-URL: https://github.com/locaal-ai/obs-backgroundremoval
-Source: https://github.com/locaal-ai/%{name}/archive/refs/tags/%{version}.tar.gz#/%{name}-%{version}.tar.gz
+URL: https://github.com/royshil/obs-backgroundremoval
+Source: %{name}-%{version}.tar.gz
Source1: %{name}-rpmlintrc
-Source2: opencv-linux-Release-4.8.0-1.tar.gz
-Source3: https://github.com/microsoft/onnxruntime/releases/download/v1.17.1/onnxruntime-linux-x64-gpu-1.17.1.tgz
-Patch0: fix-cmake-error.patch
+Source2: opencv-4.7.0.tar.gz
+Source3: onnxruntime-linux-x64-gpu-1.15.1.tgz
BuildRequires: cmake
-BuildRequires: libcurl-devel
+BuildRequires: gcc-c++
BuildRequires: obs-studio
BuildRequires: cmake(libobs)
-BuildRequires: cmake(Qt6Core)
-BuildRequires: cmake(Qt6Widgets)
-Requires: obs-studio >= 29.0.0
+Requires: obs-studio >= 28.0.0
ExclusiveArch: x86_64

%global __requires_exclude_from ^.*libonnxruntime.*$
-%global __builddir build_x86_64

%description
An OBS plugin for removing background in portrait images (video), making it easy to replace the background when screen recording.

%prep
-%autosetup -p1
+%autosetup

%build
-test -x "$(type -p gcc-13)" && export CC="$_"
-test -x "$(type -p g++-13)" && export CXX="$_"
-%cmake \
-  -DQT_VERSION=6 \
-  -DCMAKE_BUILD_TYPE=RelWithDebInfo \
-  -DCMAKE_POSITION_INDEPENDENT_CODE=ON \
-  -DENABLE_FRONTEND_API=ON \
-  -DENABLE_QT=ON \
-  -DCMAKE_COMPILE_WARNING_AS_ERROR=ON \
-  -DCUSTOM_OPENCV_URL=%{SOURCE2} \
-  -DCUSTOM_OPENCV_HASH=MD5=7a668fbc3ac536812643c6b8c8f96be9 \
+%cmake -DLINUX_PORTABLE=OFF \
+  -DOPENCV_URL=%{SOURCE2} \
+  -DOPENCV_MD5=13e13244cb0cc6ec4f01eacd38d05d17 \
  -DCUSTOM_ONNXRUNTIME_URL=%{SOURCE3} \
-  -DCUSTOM_ONNXRUNTIME_HASH=MD5=da53e83b3ad3ab2cf46fbabd6a648a9d
+  -DCUSTOM_ONNXRUNTIME_MD5=8d2f5ee9f449bdecb10a45715fe74c53
%cmake_build

%install

%files
%license LICENSE
%doc README.md
+%defattr(-,root,root,-)
/usr/lib64/obs-plugins/obs-backgroundremoval.so
/usr/lib64/obs-plugins/obs-backgroundremoval
/usr/share/obs/obs-plugins/obs-backgroundremoval
fix-cmake-error.patch
Deleted
-Index: obs-backgroundremoval-1.1.13/cmake/common/helpers_common.cmake
-===================================================================
---- obs-backgroundremoval-1.1.13.orig/cmake/common/helpers_common.cmake
-+++ obs-backgroundremoval-1.1.13/cmake/common/helpers_common.cmake
-@@ -86,7 +86,6 @@ macro(find_qt)
-       add_library(Qt::${component} INTERFACE IMPORTED)
-       set_target_properties(Qt::${component} PROPERTIES INTERFACE_LINK_LIBRARIES Qt${_QT_VERSION}::${component})
-     endif()
--    set_property(TARGET Qt::${component} PROPERTY INTERFACE_COMPILE_FEATURES "")
-   endforeach()
-
- endmacro()
obs-backgroundremoval-1.1.13.tar.gz -> obs-backgroundremoval-1.0.3.tar.gz
Changed
onnxruntime-linux-x64-gpu-1.17.1.tgz/GIT_COMMIT_ID -> onnxruntime-linux-x64-gpu-1.15.1.tgz/GIT_COMMIT_ID
Changed
-8f5c79cb63f09ef1302e85081093a3fe4da1bc7d
+baeece44ba075009c6bfe95891a8c1b3d4571cb3
onnxruntime-linux-x64-gpu-1.17.1.tgz/README.md -> onnxruntime-linux-x64-gpu-1.15.1.tgz/README.md
Changed
**ONNX Runtime training** can accelerate the model training time on multi-node NVIDIA GPUs for transformer models with a one-line addition for existing PyTorch training scripts. [Learn more →](https://www.onnxruntime.ai/docs/#onnx-runtime-for-training)

+
## Get Started & Resources

* **General Information**: [onnxruntime.ai](https://onnxruntime.ai)

-* **Usage documentation and tutorials**: [onnxruntime.ai/docs](https://onnxruntime.ai/docs)
+* **Usage documention and tutorials**: [onnxruntime.ai/docs](https://onnxruntime.ai/docs)

* **YouTube video tutorials**: [youtube.com/@ONNXRuntime](https://www.youtube.com/@ONNXRuntime)

* [**Upcoming Release Roadmap**](https://github.com/microsoft/onnxruntime/wiki/Upcoming-Release-Roadmap)

-* **Companion sample repositories**:
+* **Companion sample repositories**:
  - ONNX Runtime Inferencing: [microsoft/onnxruntime-inference-examples](https://github.com/microsoft/onnxruntime-inference-examples)
  - ONNX Runtime Training: [microsoft/onnxruntime-training-examples](https://github.com/microsoft/onnxruntime-training-examples)

-## Builtin Pipeline Status

+## Build Pipeline Status
|System|Inference|Training|
|---|---|---|
|Windows|[![Build Status](https://dev.azure.com/onnxruntime/onnxruntime/_apis/build/status/Windows%20CPU%20CI%20Pipeline?label=Windows+CPU)](https://dev.azure.com/onnxruntime/onnxruntime/_build/latest?definitionId=9)<br>[![Build Status](https://dev.azure.com/onnxruntime/onnxruntime/_apis/build/status/Windows%20GPU%20CI%20Pipeline?label=Windows+GPU)](https://dev.azure.com/onnxruntime/onnxruntime/_build/latest?definitionId=10)<br>[![Build Status](https://dev.azure.com/onnxruntime/onnxruntime/_apis/build/status/Windows%20GPU%20TensorRT%20CI%20Pipeline?label=Windows+GPU+TensorRT)](https://dev.azure.com/onnxruntime/onnxruntime/_build/latest?definitionId=47)||
|Android|[![Build Status](https://dev.azure.com/onnxruntime/onnxruntime/_apis/build/status/Android%20CI%20Pipeline?label=Android)](https://dev.azure.com/onnxruntime/onnxruntime/_build/latest?definitionId=53)||
|iOS|[![Build Status](https://dev.azure.com/onnxruntime/onnxruntime/_apis/build/status/iOS%20CI%20Pipeline?label=iOS)](https://dev.azure.com/onnxruntime/onnxruntime/_build/latest?definitionId=134)||
|Web|[![Build Status](https://dev.azure.com/onnxruntime/onnxruntime/_apis/build/status/ONNX%20Runtime%20Web%20CI%20Pipeline?label=Web)](https://dev.azure.com/onnxruntime/onnxruntime/_build/latest?definitionId=161)||
-|Other|[![Build Status](https://dev.azure.com/onnxruntime/onnxruntime/_apis/build/status/onnxruntime-binary-size-checks-ci-pipeline?repoName=microsoft%2Fonnxruntime&label=Binary+Size+Check)](https://dev.azure.com/onnxruntime/onnxruntime/_build/latest?definitionId=187&repoName=microsoft%2Fonnxruntime)||
-
-## Third-party Pipeline Status
+|Other|[![Build Status](https://dev.azure.com/onnxruntime/onnxruntime/_apis/build/status/onnxruntime-binary-size-checks-ci-pipeline?repoName=microsoft%2Fonnxruntime&label=Binary+Size+Check)](https://dev.azure.com/onnxruntime/onnxruntime/_build/latest?definitionId=187&repoName=microsoft%2Fonnxruntime)<br>[![Build Status](https://dev.azure.com/onnxruntime/onnxruntime/_apis/build/status/onnxruntime-python-checks-ci-pipeline?label=Python+Checks)](https://dev.azure.com/onnxruntime/onnxruntime/_build/latest?definitionId=164)||

-|System|Inference|Training|
-|---|---|---|
-|Linux|[![Build Status](https://github.com/Ascend/onnxruntime/actions/workflows/build-and-test.yaml/badge.svg)](https://github.com/Ascend/onnxruntime/actions/workflows/build-and-test.yaml)||

## Data/Telemetry
onnxruntime-linux-x64-gpu-1.17.1.tgz/ThirdPartyNotices.txt -> onnxruntime-linux-x64-gpu-1.15.1.tgz/ThirdPartyNotices.txt
Changed
Except as contained in this notice, the name of a copyright holder shall not
be used in advertising or otherwise to promote the sale, use or other dealings
-in this Software without prior written authorization of the copyright holder.
-
-_____
-
-Intel neural-compressor
-
-https://github.com/intel/neural-compressor
-
-                                 Apache License
-                           Version 2.0, January 2004
-                        http://www.apache.org/licenses/
-
-   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
-
-   1. Definitions.
-
-      "License" shall mean the terms and conditions for use, reproduction,
-      and distribution as defined by Sections 1 through 9 of this document.
-
-      "Licensor" shall mean the copyright owner or entity authorized by
-      the copyright owner that is granting the License.
-
-      "Legal Entity" shall mean the union of the acting entity and all
-      other entities that control, are controlled by, or are under common
-      control with that entity. For the purposes of this definition,
-      "control" means (i) the power, direct or indirect, to cause the
-      direction or management of such entity, whether by contract or
-      otherwise, or (ii) ownership of fifty percent (50%) or more of the
-      outstanding shares, or (iii) beneficial ownership of such entity.
-
-      "You" (or "Your") shall mean an individual or Legal Entity
-      exercising permissions granted by this License.
-
-      "Source" form shall mean the preferred form for making modifications,
-      including but not limited to software source code, documentation
-      source, and configuration files.
-
-      "Object" form shall mean any form resulting from mechanical
-      transformation or translation of a Source form, including but
-      not limited to compiled object code, generated documentation,
-      and conversions to other media types.
-
-      "Work" shall mean the work of authorship, whether in Source or
-      Object form, made available under the License, as indicated by a
-      copyright notice that is included in or attached to the work
-      (an example is provided in the Appendix below).
-
-      "Derivative Works" shall mean any work, whether in Source or Object
-      form, that is based on (or derived from) the Work and for which the
-      editorial revisions, annotations, elaborations, or other modifications
-      represent, as a whole, an original work of authorship. For the purposes
-      of this License, Derivative Works shall not include works that remain
-      separable from, or merely link (or bind by name) to the interfaces of,
-      the Work and Derivative Works thereof.
-
-      "Contribution" shall mean any work of authorship, including
-      the original version of the Work and any modifications or additions
-      to that Work or Derivative Works thereof, that is intentionally
-      submitted to Licensor for inclusion in the Work by the copyright owner
-      or by an individual or Legal Entity authorized to submit on behalf of
-      the copyright owner. For the purposes of this definition, "submitted"
-      means any form of electronic, verbal, or written communication sent
-      to the Licensor or its representatives, including but not limited to
-      communication on electronic mailing lists, source code control systems,
-      and issue tracking systems that are managed by, or on behalf of, the
-      Licensor for the purpose of discussing and improving the Work, but
-      excluding communication that is conspicuously marked or otherwise
-      designated in writing by the copyright owner as "Not a Contribution."
-
-      "Contributor" shall mean Licensor and any individual or Legal Entity
-      on behalf of whom a Contribution has been received by Licensor and
-      subsequently incorporated within the Work.
-
-   2. Grant of Copyright License. Subject to the terms and conditions of
-      this License, each Contributor hereby grants to You a perpetual,
-      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
-      copyright license to reproduce, prepare Derivative Works of,
-      publicly display, publicly perform, sublicense, and distribute the
-      Work and such Derivative Works in Source or Object form.
-
-   3. Grant of Patent License. Subject to the terms and conditions of
-      this License, each Contributor hereby grants to You a perpetual,
-      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
-      (except as stated in this section) patent license to make, have made,
-      use, offer to sell, sell, import, and otherwise transfer the Work,
-      where such license applies only to those patent claims licensable
-      by such Contributor that are necessarily infringed by their
-      Contribution(s) alone or by combination of their Contribution(s)
-      with the Work to which such Contribution(s) was submitted. If You
-      institute patent litigation against any entity (including a
-      cross-claim or counterclaim in a lawsuit) alleging that the Work
-      or a Contribution incorporated within the Work constitutes direct
-      or contributory patent infringement, then any patent licenses
-      granted to You under this License for that Work shall terminate
-      as of the date such litigation is filed.
-
-   4. Redistribution. You may reproduce and distribute copies of the
-      Work or Derivative Works thereof in any medium, with or without
-      modifications, and in Source or Object form, provided that You
-      meet the following conditions:
-
-      (a) You must give any other recipients of the Work or
-          Derivative Works a copy of this License; and
-
-      (b) You must cause any modified files to carry prominent notices
-          stating that You changed the files; and
-
-      (c) You must retain, in the Source form of any Derivative Works
-          that You distribute, all copyright, patent, trademark, and
-          attribution notices from the Source form of the Work,
-          excluding those notices that do not pertain to any part of
-          the Derivative Works; and
-
-      (d) If the Work includes a "NOTICE" text file as part of its
-          distribution, then any Derivative Works that You distribute must
-          include a readable copy of the attribution notices contained
-          within such NOTICE file, excluding those notices that do not
-          pertain to any part of the Derivative Works, in at least one
-          of the following places: within a NOTICE text file distributed
-          as part of the Derivative Works; within the Source form or
-          documentation, if provided along with the Derivative Works; or,
-          within a display generated by the Derivative Works, if and
-          wherever such third-party notices normally appear. The contents
-          of the NOTICE file are for informational purposes only and
-          do not modify the License. You may add Your own attribution
-          notices within Derivative Works that You distribute, alongside
-          or as an addendum to the NOTICE text from the Work, provided
-          that such additional attribution notices cannot be construed
-          as modifying the License.
-
-      You may add Your own copyright statement to Your modifications and
-      may provide additional or different license terms and conditions
-      for use, reproduction, or distribution of Your modifications, or
-      for any such Derivative Works as a whole, provided Your use,
-      reproduction, and distribution of the Work otherwise complies with
-      the conditions stated in this License.
-
-   5. Submission of Contributions. Unless You explicitly state otherwise,
-      any Contribution intentionally submitted for inclusion in the Work
-      by You to the Licensor shall be under the terms and conditions of
-      this License, without any additional terms or conditions.
-      Notwithstanding the above, nothing herein shall supersede or modify
-      the terms of any separate license agreement you may have executed
-      with Licensor regarding such Contributions.
-
-   6. Trademarks. This License does not grant permission to use the trade
-      names, trademarks, service marks, or product names of the Licensor,
-      except as required for reasonable and customary use in describing the
-      origin of the Work and reproducing the content of the NOTICE file.
-
-   7. Disclaimer of Warranty. Unless required by applicable law or
-      agreed to in writing, Licensor provides the Work (and each
-      Contributor provides its Contributions) on an "AS IS" BASIS,
-      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
-      implied, including, without limitation, any warranties or conditions
-      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
-      PARTICULAR PURPOSE. You are solely responsible for determining the
-      appropriateness of using or redistributing the Work and assume any
-      risks associated with Your exercise of permissions under this License.
-
-   8. Limitation of Liability. In no event and under no legal theory,
-      whether in tort (including negligence), contract, or otherwise,
-      unless required by applicable law (such as deliberate and grossly
-      negligent acts) or agreed to in writing, shall any Contributor be
-      liable to You for damages, including any direct, indirect, special,
-      incidental, or consequential damages of any character arising as a
-      result of this License or out of the use or inability to use the
-      Work (including but not limited to damages for loss of goodwill,
-      work stoppage, computer failure or malfunction, or any and all
-      other commercial damages or losses), even if such Contributor
-      has been advised of the possibility of such damages.
-
-   9. Accepting Warranty or Additional Liability. While redistributing
-      the Work or Derivative Works thereof, You may choose to offer,
-      and charge a fee for, acceptance of support, warranty, indemnity,
-      or other liability obligations and/or rights consistent with this
-      License. However, in accepting such obligations, You may act only
-      on Your own behalf and on Your sole responsibility, not on behalf
-      of any other Contributor, and only if You agree to indemnify,
-      defend, and hold each Contributor harmless for any liability
-      incurred by, or claims asserted against, such Contributor by reason
-      of your accepting any such warranty or additional liability.
-
-   END OF TERMS AND CONDITIONS
-
-   ============================================================================
-
-   Copyright 2016-2019 Intel Corporation
-   Copyright 2018 YANDEX LLC
-
-   Licensed under the Apache License, Version 2.0 (the "License");
-   you may not use this file except in compliance with the License.
-   You may obtain a copy of the License at
-
-       http://www.apache.org/licenses/LICENSE-2.0
-
-   Unless required by applicable law or agreed to in writing, software
-   distributed under the License is distributed on an "AS IS" BASIS,
-   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-   See the License for the specific language governing permissions and
-   limitations under the License.
-
-   This distribution includes third party software ("third party programs").
-   This third party software, even if included with the distribution of
-   the Intel software, may be governed by separate license terms, including
-   without limitation, third party license terms, other Intel software license
-   terms, and open source software license terms. These separate license terms
-   govern your use of the third party programs as set forth in the
-   "THIRD-PARTY-PROGRAMS" file.
-
-_____
-
-FlashAttention, https://github.com/Dao-AILab/flash-attention
-
-BSD 3-Clause License
-
-Copyright (c) 2022, the respective contributors, as shown by the AUTHORS file.
-All rights reserved.
-
-Redistribution and use in source and binary forms, with or without
-modification, are permitted provided that the following conditions are met:
-
-* Redistributions of source code must retain the above copyright notice, this
-  list of conditions and the following disclaimer.
-
-* Redistributions in binary form must reproduce the above copyright notice,
-  this list of conditions and the following disclaimer in the documentation
-  and/or other materials provided with the distribution.
-
-* Neither the name of the copyright holder nor the names of its
-  contributors may be used to endorse or promote products derived from
-  this software without specific prior written permission.
-
-THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
-AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
-IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
-DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
-FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
-DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
-SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
-CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
-OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-
-_____
-
-composable_kernel
-
-https://github.com/ROCmSoftwarePlatform/composable_kernel
-
-Copyright (c) 2018- , Advanced Micro Devices, Inc. (Chao Liu, Jing Zhang)
-Copyright (c) 2019- , Advanced Micro Devices, Inc. (Letao Qin, Qianfeng Zhang, Liang Huang, Shaojie Wang)
-Copyright (c) 2022- , Advanced Micro Devices, Inc. (Anthony Chang, Chunyu Lai, Illia Silin, Adam Osewski, Poyen Chen, Jehandad Khan)
-Copyright (c) 2019-2021, Advanced Micro Devices, Inc. (Hanwen Chang)
-Copyright (c) 2019-2020, Advanced Micro Devices, Inc. (Tejash Shah)
-Copyright (c) 2020     , Advanced Micro Devices, Inc. (Xiaoyan Zhou)
-Copyright (c) 2021-2022, Advanced Micro Devices, Inc. (Jianfeng Yan)
-
-SPDX-License-Identifier: MIT
-Copyright (c) 2018-2023, Advanced Micro Devices, Inc. All rights reserved.
-
-Permission is hereby granted, free of charge, to any person obtaining a copy
-of this software and associated documentation files (the "Software"), to deal
-in the Software without restriction, including without limitation the rights
-to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
-copies of the Software, and to permit persons to whom the Software is
-furnished to do so, subject to the following conditions:
-
-The above copyright notice and this permission notice shall be included in all
-copies or substantial portions of the Software.
-
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
-OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-SOFTWARE.
-
-_____
-
-neural-speed
-
-https://github.com/intel/neural-speed
-
-                                 Apache License
-                        http://www.apache.org/licenses/
-
-   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
-
-   1. Definitions.
-
-      "License" shall mean the terms and conditions for use, reproduction,
-      and distribution as defined by Sections 1 through 9 of this document.
-
-      "Licensor" shall mean the copyright owner or entity authorized by
-      the copyright owner that is granting the License.
-
-      "Legal Entity" shall mean the union of the acting entity and all
-      other entities that control, are controlled by, or are under common
-      control with that entity. For the purposes of this definition,
-      "control" means (i) the power, direct or indirect, to cause the
-      direction or management of such entity, whether by contract or
-      otherwise, or (ii) ownership of fifty percent (50%) or more of the
-      outstanding shares, or (iii) beneficial ownership of such entity.
-
-      "You" (or "Your") shall mean an individual or Legal Entity
-      exercising permissions granted by this License.
-
-      "Source" form shall mean the preferred form for making modifications,
-      including but not limited to software source code, documentation
-      source, and configuration files.
-
-      "Object" form shall mean any form resulting from mechanical
-      transformation or translation of a Source form, including but
-      not limited to compiled object code, generated documentation,
-      and conversions to other media types.
-
-      "Work" shall mean the work of authorship, whether in Source or
-      Object form, made available under the License, as indicated by a
-      copyright notice that is included in or attached to the work
-      (an example is provided in the Appendix below).
-
-      "Derivative Works" shall mean any work, whether in Source or Object
-      form, that is based on (or derived from) the Work and for which the
-      editorial revisions, annotations, elaborations, or other modifications
-      represent, as a whole, an original work of authorship. For the purposes
-      of this License, Derivative Works shall not include works that remain
-      separable from, or merely link (or bind by name) to the interfaces of,
-      the Work and Derivative Works thereof.
-
-      "Contribution" shall mean any work of authorship, including
-      the original version of the Work and any modifications or additions
-      to that Work or Derivative Works thereof, that is intentionally
-      submitted to Licensor for inclusion in the Work by the copyright owner
-      or by an individual or Legal Entity authorized to submit on behalf of
-      the copyright owner. For the purposes of this definition, "submitted"
-      means any form of electronic, verbal, or written communication sent
-      to the Licensor or its representatives, including but not limited to
-      communication on electronic mailing lists, source code control systems,
-      and issue tracking systems that are managed by, or on behalf of, the
-      Licensor for the purpose of discussing and improving the Work, but
-      excluding communication that is conspicuously marked or otherwise
-      designated in writing by the copyright owner as "Not a Contribution."
-
-      "Contributor" shall mean Licensor and any individual or Legal Entity
-      on behalf of whom a Contribution has been received by Licensor and
-      subsequently incorporated within the Work.
-
-   2. Grant of Copyright License. Subject to the terms and conditions of
-      this License, each Contributor hereby grants to You a perpetual,
-      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
-      copyright license to reproduce, prepare Derivative Works of,
-      publicly display, publicly perform, sublicense, and distribute the
-      Work and such Derivative Works in Source or Object form.
-
-   3. Grant of Patent License. Subject to the terms and conditions of
-      this License, each Contributor hereby grants to You a perpetual,
-      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
-      (except as stated in this section) patent license to make, have made,
-      use, offer to sell, sell, import, and otherwise transfer the Work,
-      where such license applies only to those patent claims licensable
-      by such Contributor that are necessarily infringed by their
-      Contribution(s) alone or by combination of their Contribution(s)
-      with the Work to which such Contribution(s) was submitted. If You
-      institute patent litigation against any entity (including a
-      cross-claim or counterclaim in a lawsuit) alleging that the Work
-      or a Contribution incorporated within the Work constitutes direct
-      or contributory patent infringement, then any patent licenses
-      granted to You under this License for that Work shall terminate
-      as of the date such litigation is filed.
-
-   4. Redistribution. You may reproduce and distribute copies of the
-      Work or Derivative Works thereof in any medium, with or without
-      modifications, and in Source or Object form, provided that You
-      meet the following conditions:
-
-      (a) You must give any other recipients of the Work or
-          Derivative Works a copy of this License; and
-
-      (b) You must cause any modified files to carry prominent notices
-          stating that You changed the files; and
-
-      (c) You must retain, in the Source form of any Derivative Works
-          that You distribute, all copyright, patent, trademark, and
-          attribution notices from the Source form of the Work,
-          excluding those notices that do not pertain to any part of
-          the Derivative Works; and
-
-      (d) If the Work includes a "NOTICE" text file as part of its
-          distribution, then any Derivative Works that You distribute must
-          include a readable copy of the attribution notices contained
-          within such NOTICE file, excluding those notices that do not
-          pertain to any part of the Derivative Works, in at least one
-          of the following places: within a NOTICE text file distributed
-          as part of the Derivative Works; within the Source form or
-          documentation, if provided along with the Derivative Works; or,
-          within a display generated by the Derivative Works, if and
-          wherever such third-party notices normally appear. The contents
-          of the NOTICE file are for informational purposes only and
-          do not modify the License. You may add Your own attribution
-          notices within Derivative Works that You distribute, alongside
-          or as an addendum to the NOTICE text from the Work, provided
-          that such additional attribution notices cannot be construed
-          as modifying the License.
-
-      You may add Your own copyright statement to Your modifications and
-      may provide additional or different license terms and conditions
-      for use, reproduction, or distribution of Your modifications, or
-      for any such Derivative Works as a whole, provided Your use,
-      reproduction, and distribution of the Work otherwise complies with
-      the conditions stated in this License.
-
-   5. Submission of Contributions. Unless You explicitly state otherwise,
-      any Contribution intentionally submitted for inclusion in the Work
-      by You to the Licensor shall be under the terms and conditions of
-      this License, without any additional terms or conditions.
-      Notwithstanding the above, nothing herein shall supersede or modify
-      the terms of any separate license agreement you may have executed
-      with Licensor regarding such Contributions.
-
-   6. Trademarks. This License does not grant permission to use the trade
-      names, trademarks, service marks, or product names of the Licensor,
-      except as required for reasonable and customary use in describing the
-      origin of the Work and reproducing the content of the NOTICE file.
-
-   7. Disclaimer of Warranty. Unless required by applicable law or
-      agreed to in writing, Licensor provides the Work (and each
433
- Contributor provides its Contributions) on an "AS IS" BASIS,
434
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
435
- implied, including, without limitation, any warranties or conditions
436
- of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
437
- PARTICULAR PURPOSE. You are solely responsible for determining the
438
- appropriateness of using or redistributing the Work and assume any
439
- risks associated with Your exercise of permissions under this License.
440
-
441
- 8. Limitation of Liability. In no event and under no legal theory,
442
- whether in tort (including negligence), contract, or otherwise,
443
- unless required by applicable law (such as deliberate and grossly
444
- negligent acts) or agreed to in writing, shall any Contributor be
445
- liable to You for damages, including any direct, indirect, special,
446
- incidental, or consequential damages of any character arising as a
447
- result of this License or out of the use or inability to use the
448
- Work (including but not limited to damages for loss of goodwill,
449
- work stoppage, computer failure or malfunction, or any and all
450
- other commercial damages or losses), even if such Contributor
451
- has been advised of the possibility of such damages.
452
-
453
- 9. Accepting Warranty or Additional Liability. While redistributing
454
- the Work or Derivative Works thereof, You may choose to offer,
455
- and charge a fee for, acceptance of support, warranty, indemnity,
456
- or other liability obligations and/or rights consistent with this
457
- License. However, in accepting such obligations, You may act only
458
- on Your own behalf and on Your sole responsibility, not on behalf
459
- of any other Contributor, and only if You agree to indemnify,
460
- defend, and hold each Contributor harmless for any liability
461
- incurred by, or claims asserted against, such Contributor by reason
462
- of your accepting any such warranty or additional liability.
463
-
464
- END OF TERMS AND CONDITIONS
465
-
466
- ============================================================================
467
-
468
- Copyright 2016-2019 Intel Corporation
469
- Copyright 2018 YANDEX LLC
470
-
471
- Licensed under the Apache License, Version 2.0 (the "License");
472
- you may not use this file except in compliance with the License.
473
- You may obtain a copy of the License at
474
-
475
- http://www.apache.org/licenses/LICENSE-2.0
476
-
477
- Unless required by applicable law or agreed to in writing, software
478
- distributed under the License is distributed on an "AS IS" BASIS,
479
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
480
- See the License for the specific language governing permissions and
481
- limitations under the License.
482
-
483
- This distribution includes third party software ("third party programs").
484
- This third party software, even if included with the distribution of
485
- the Intel software, may be governed by separate license terms, including
486
- without limitation, third party license terms, other Intel software license
487
- terms, and open source software license terms. These separate license terms
488
- govern your use of the third party programs as set forth in the
489
- "THIRD-PARTY-PROGRAMS" file.
490
+in this Software without prior written authorization of the copyright holder.
491
\ No newline at end of file
492
onnxruntime-linux-x64-gpu-1.17.1.tgz/VERSION_NUMBER -> onnxruntime-linux-x64-gpu-1.15.1.tgz/VERSION_NUMBER
Changed
-1.17.1
+1.15.1
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/onnxruntime_c_api.h -> onnxruntime-linux-x64-gpu-1.15.1.tgz/include/onnxruntime_c_api.h
Changed
/** \mainpage ONNX Runtime
*
- * ONNX Runtime is a high-performance inference and training graph execution engine for deep learning models.
+ * ONNX Runtime is a high-performance inference and training graph execution engine for deeplearning models.
*
* ONNX Runtime's C, C++ APIs offer an easy to use interface to onboard and execute onnx models.
* - \subpage c_cpp_api "Core C, C++ APIs"
- * - \subpage training_c_cpp_api "Training C, C++ APIs for on-device training"
+ * - \subpage training_c_cpp_api "Training C, C++ APIs for learning on the edge"
*
* \page c_cpp_api Core C, C++ APIs
* <h1>C</h1>

*/

#pragma once
-#include <stdbool.h>
-#include <stdint.h>
#include <stdlib.h>
+#include <stdint.h>
#include <string.h>

/** \brief The API version defined in this header
*
* This value is used by some API functions to behave as this version of the header expects.
*/
-#define ORT_API_VERSION 17
+#define ORT_API_VERSION 15

#ifdef __cplusplus
extern "C" {

#define _Check_return_
#define _Outptr_result_maybenull_
#define _In_reads_(X)
-#define _Inout_updates_(X)
-#define _Out_writes_(X)
#define _Inout_updates_all_(X)
#define _Out_writes_bytes_all_(X)
#define _Out_writes_all_(X)

ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT64, // maps to c type uint64_t
ONNX_TENSOR_ELEMENT_DATA_TYPE_COMPLEX64, // complex with float32 real and imaginary components
ONNX_TENSOR_ELEMENT_DATA_TYPE_COMPLEX128, // complex with float64 real and imaginary components
- ONNX_TENSOR_ELEMENT_DATA_TYPE_BFLOAT16, // Non-IEEE floating-point format based on IEEE754 single-precision
- // float 8 types were introduced in onnx 1.14, see https://onnx.ai/onnx/technical/float8.html
- ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT8E4M3FN, // Non-IEEE floating-point format based on IEEE754 single-precision
- ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT8E4M3FNUZ, // Non-IEEE floating-point format based on IEEE754 single-precision
- ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT8E5M2, // Non-IEEE floating-point format based on IEEE754 single-precision
- ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT8E5M2FNUZ // Non-IEEE floating-point format based on IEEE754 single-precision
+ ONNX_TENSOR_ELEMENT_DATA_TYPE_BFLOAT16 // Non-IEEE floating-point format based on IEEE754 single-precision
} ONNXTensorElementDataType;

// Synced with onnx TypeProto oneof

ORT_RUNTIME_CLASS(Op);
ORT_RUNTIME_CLASS(OpAttr);
ORT_RUNTIME_CLASS(Logger);
-ORT_RUNTIME_CLASS(ShapeInferContext);

#ifdef _WIN32
typedef _Return_type_success_(return == 0) OrtStatus* OrtStatusPtr;

user_compute_stream{},
default_memory_arena_cfg{},
tunable_op_enable{false},
- tunable_op_tuning_enable{false},
- tunable_op_max_tuning_duration_ms{} {}
+ tunable_op_tuning_enable{false} {}
#endif

/** \brief CUDA device Id

*/
int tunable_op_tuning_enable;

- /** \brief Max tuning duration time limit for each instance of TunableOp.
- * Defaults to 0 to disable the limit.
- */
- int tunable_op_max_tuning_duration_ms;
-
} OrtCUDAProviderOptions;

/** \brief ROCM Provider Options

user_compute_stream{},
default_memory_arena_cfg{},
tunable_op_enable{false},
- tunable_op_tuning_enable{false},
- tunable_op_max_tuning_duration_ms{} {}
+ tunable_op_tuning_enable{false} {}
#endif

/** \brief ROCM device Id

*/
int tunable_op_tuning_enable;

- /** \brief Max tuning duration time limit for each instance of TunableOp.
- * Defaults to 0 to disable the limit.
- */
- int tunable_op_max_tuning_duration_ms;
-
} OrtROCMProviderOptions;

/** \brief TensorRT Provider Options

* \see OrtApi::SessionOptionsAppendExecutionProvider_MIGraphX
*/
typedef struct OrtMIGraphXProviderOptions {
- int device_id; // hip device id.
- int migraphx_fp16_enable; // MIGraphX FP16 precision. Default 0 = false, nonzero = true
- int migraphx_int8_enable; // MIGraphX INT8 precision. Default 0 = false, nonzero = true
- int migraphx_use_native_calibration_table; // MIGraphx INT8 cal table. Default 0 = false, noznero = true
- const char* migraphx_int8_calibration_table_name; // MIGraphx INT8 calibration table name
+ int device_id; // hip device id.
+ int migraphx_fp16_enable; // enable MIGraphX FP16 precision. Default 0 = false, nonzero = true
+ int migraphx_int8_enable; // enable MIGraphX INT8 precision. Default 0 = false, nonzero = true
} OrtMIGraphXProviderOptions;

/** \brief OpenVINO Provider Options

typedef struct OrtOpenVINOProviderOptions {
#ifdef __cplusplus
OrtOpenVINOProviderOptions() : device_type{},
- enable_npu_fast_compile{},
+ enable_vpu_fast_compile{},
device_id{},
num_of_threads{},
cache_dir{},

* Valid settings are one of: "CPU_FP32", "CPU_FP16", "GPU_FP32", "GPU_FP16"
*/
const char* device_type;
- unsigned char enable_npu_fast_compile; ///< 0 = disabled, nonzero = enabled
+ unsigned char enable_vpu_fast_compile; ///< 0 = disabled, nonzero = enabled
const char* device_id;
size_t num_of_threads; ///< 0 = Use default number of threads
const char* cache_dir; // path is set to empty by default

typedef OrtStatus*(ORT_API_CALL* RegisterCustomOpsFn)(OrtSessionOptions* options, const OrtApiBase* api);

-/** \brief Callback function for RunAsync
- *
- * \param[in] user_data User specific data that passed back to the callback
- * \param[out] outputs On succeed, outputs host inference results, on error, the value will be nullptr
- * \param[out] num_outputs Number of outputs, on error, the value will be zero
- * \param[out] status On error, status will provide details
- */
-typedef void (*RunAsyncCallbackFn)(void* user_data, OrtValue** outputs, size_t num_outputs, OrtStatusPtr status);
-
/** \brief The C API
*
* All C API functions are defined inside this structure as pointers to functions.

/** \brief Create an OrtEnv
*
- * \note Invoking this function will return the same instance of the environment as that returned by a previous call
- * to another env creation function; all arguments to this function will be ignored.
* \param[in] log_severity_level The log severity level.
* \param[in] logid The log identifier.
* \param[out] out Returned newly created OrtEnv. Must be freed with OrtApi::ReleaseEnv

/** \brief Create an OrtEnv
*
- * \note Invoking this function will return the same instance of the environment as that returned by a previous call
- * to another env creation function; all arguments to this function will be ignored. If you want to provide your
- * own logging function, consider setting it using the SetUserLoggingFunction API instead.
* \param[in] logging_function A pointer to a logging function.
* \param[in] logger_param A pointer to arbitrary data passed as the ::OrtLoggingFunction `param` parameter to
- * `logging_function`. This parameter is optional.
+ * `logging_function`.
* \param[in] log_severity_level The log severity level.
* \param[in] logid The log identifier.
* \param[out] out Returned newly created OrtEnv. Must be freed with OrtApi::ReleaseEnv
*
* \snippet{doc} snippets.dox OrtStatus Return Value
*/
- ORT_API2_STATUS(CreateEnvWithCustomLogger, _In_ OrtLoggingFunction logging_function, _In_opt_ void* logger_param,
- _In_ OrtLoggingLevel log_severity_level, _In_ const char* logid, _Outptr_ OrtEnv** out);
+ ORT_API2_STATUS(CreateEnvWithCustomLogger, OrtLoggingFunction logging_function, _In_opt_ void* logger_param,
+ OrtLoggingLevel log_severity_level, _In_ const char* logid, _Outptr_ OrtEnv** out);

/** \brief Enable Telemetry
*

/** \brief Set the optimization level to apply when loading a graph
*
- * Please see https://onnxruntime.ai/docs/performance/model-optimizations/graph-optimizations.html for an in-depth explanation
+ * Please see https://onnxruntime.ai/docs/performance/graph-optimizations.html for an in-depth explanation
* \param[in,out] options The session options object
* \param[in] graph_optimization_level The optimization level
*

* crossing which the current chunk is chunked into 2.
* "initial_growth_chunk_size_bytes": (Possible) Size of the second allocation in the arena.
* Only relevant if arena strategy is `kNextPowerOfTwo`. Use -1 to allow ORT to choose the default.
- * "max_power_of_two_extend_bytes": The maximum enxtend size if arena strategy is `kNextPowerOfTwo`.
- * It is not an allocation limit, it is only a limit for extention when requested byte is less than the limit.
- * When requested bytes is more than the limit, allocator will still return as requested.
- * Use -1 to allow ORT to choose the default 1GB for max_power_of_two_extend_bytes.
* Ultimately, the allocation size is determined by the allocation memory request.
* Further allocation sizes are governed by the arena extend strategy.
*

*
* QNN supported keys:
* "backend_path": file path to QNN backend library.
- * "profiling_level": QNN profiling level, options: "off", "basic", "detailed". Default to off.
+ * "profiling_level": QNN profiling level, options: "basic", "detailed".
* "rpc_control_latency": QNN RPC control latency.
- * "vtcm_mb": QNN VTCM size in MB. default to 0(not set).
- * "htp_performance_mode": QNN performance mode, options: "burst", "balanced", "default", "high_performance",
- * "high_power_saver", "low_balanced", "extreme_power_saver", "low_power_saver", "power_saver", "sustained_high_performance". Default to "default".
- * "qnn_saver_path": File path to the QNN Saver backend library. If specified, QNN Saver will be enabled and will
- * dump QNN API calls to disk for replay/debugging. QNN Saver produces incorrect model inference results and
- * may alter model/EP partitioning. Use only for debugging.
- * "qnn_context_priority": QNN context priority, options: "low", "normal", "normal_high", "high". Default to "normal".
- * "htp_graph_finalization_optimization_mode": Set the optimization mode for graph finalization on the HTP backend. Available options:
- * - "0": Default.
- * - "1": Faster preparation time, less optimal graph.
- * - "2": Longer preparation time, more optimal graph.
- * - "3": Longest preparation time, most likely even more optimal graph. See QNN SDK documentation for specific details.
- * "soc_model": The SoC model number. Refer to the QNN SDK documentation for valid values. Defaults to "0" (unknown).
- * "htp_arch": The minimum HTP architecture the driver will use to select compatible QNN operators. Available options:
- * - "0": Default (none).
- * - "68"
- * - "69"
- * - "73"
- * - "75"
- * "device_id": The ID of the device to use when setting 'htp_arch'. Defaults to "0" (for single device).
*
* SNPE supported keys:
* "runtime": SNPE runtime engine, options: "CPU", "CPU_FLOAT32", "GPU", "GPU_FLOAT32_16_HYBRID", "GPU_FLOAT16",

* "buffer_type": ITensor or user buffers, options: "ITENSOR", user buffer with different types - "TF8", "TF16", "UINT8", "FLOAT".
* "ITENSOR" -- default, ITensor which is float only.
* "TF8" -- quantized model required, "FLOAT" -- for both quantized or non-quantized model
- * "enable_init_cache": enable SNPE init caching feature, set to 1 to enabled it. Disabled by default.
* If SNPE is not available (due to a non Snpe enabled build or its dependencies not being installed), this function will fail.
*
* XNNPACK supported keys:

*/
ORT_API2_STATUS(GetResizedStringTensorElementBuffer, _Inout_ OrtValue* value, _In_ size_t index, _In_ size_t length_in_bytes, _Inout_ char** buffer);

- /** \brief Get Allocator from KernelContext for a specific memoryInfo. Please use C API ReleaseAllocator to release out object
+ /** \brief Get Allocator from KernelContext for a specific memoryInfo.
*
* \param[in] context OrtKernelContext instance
* \param[in] mem_info OrtMemoryInfo instance

* \since Version 1.15.
*/
const char*(ORT_API_CALL* GetBuildInfoString)(void);
-
- /// \name OrtROCMProviderOptions
264
- /// @{
265
-
266
- /** \brief Create an OrtROCMProviderOptions
267
- *
268
- * \paramout out Newly created ::OrtROCMProviderOptions. Must be released with OrtApi::ReleaseROCMProviderOptions
269
- *
270
- * \snippet{doc} snippets.dox OrtStatus Return Value
271
- *
272
- * \since Version 1.16.
273
- */
274
- ORT_API2_STATUS(CreateROCMProviderOptions, _Outptr_ OrtROCMProviderOptions** out);
275
-
276
- /** \brief Set options in a ROCm Execution Provider.
277
- *
278
- * Please refer to https://onnxruntime.ai/docs/execution-providers/ROCm-ExecutionProvider.html
279
- * to know the available keys and values. Key should be in null terminated string format of the member of
280
- * ::OrtROCMProviderOptions and value should be its related range.
281
- *
282
- * For example, key="device_id" and value="0"
283
- *
284
- * \paramin rocm_options
285
- * \paramin provider_options_keys Array of UTF-8 null-terminated string for provider options keys
286
- * \paramin provider_options_values Array of UTF-8 null-terminated string for provider options values
287
- * \paramin num_keys Number of elements in the `provider_option_keys` and `provider_options_values` arrays
288
- *
289
- * \snippet{doc} snippets.dox OrtStatus Return Value
290
- *
291
- * \since Version 1.16.
292
- */
293
- ORT_API2_STATUS(UpdateROCMProviderOptions, _Inout_ OrtROCMProviderOptions* rocm_options,
294
- _In_reads_(num_keys) const char* const* provider_options_keys,
295
- _In_reads_(num_keys) const char* const* provider_options_values,
296
- _In_ size_t num_keys);
297
-
298
- /**
299
- * Get serialized ROCm provider options string.
300
- *
301
- * For example, "device_id=0;arena_extend_strategy=0;......"
302
- *
303
- * \param rocm_options - OrtROCMProviderOptions instance
304
- * \param allocator - a ptr to an instance of OrtAllocator obtained with CreateAllocator() or GetAllocatorWithDefaultOptions()
305
- * the specified allocator will be used to allocate continuous buffers for output strings and lengths.
306
- * \param ptr - is a UTF-8 null terminated string allocated using 'allocator'. The caller is responsible for using the same allocator to free it.
307
- *
308
- * \snippet{doc} snippets.dox OrtStatus Return Value
309
- *
310
- * \since Version 1.16.
311
- */
312
- ORT_API2_STATUS(GetROCMProviderOptionsAsString, _In_ const OrtROCMProviderOptions* rocm_options, _Inout_ OrtAllocator* allocator, _Outptr_ char** ptr);
313
-
314
- /** \brief Release an ::OrtROCMProviderOptions
315
- *
316
- * \note This is an exception in the naming convention of other Release* functions, as the name of the method does not have the V2 suffix, but the type does
317
- *
318
- * \since Version 1.16.
319
- */
320
- void(ORT_API_CALL* ReleaseROCMProviderOptions)(_Frees_ptr_opt_ OrtROCMProviderOptions* input);
321
-
322
- /** \brief Create an allocator with specific type and register it with the ::OrtEnv
323
- * This API enhance CreateAndRegisterAllocator that it can create an allocator with specific type, not just CPU allocator
324
- * Enables sharing the allocator between multiple sessions that use the same env instance.
325
- * Lifetime of the created allocator will be valid for the duration of the environment.
326
- * Returns an error if an allocator with the same ::OrtMemoryInfo is already registered.
327
- * \paramin env OrtEnv instance
328
- * \paramin provider_type ExecutionProvider type
329
- * \paramin mem_info OrtMemoryInfo instance
330
- * \paramin arena_cfg Arena configuration
331
- * \paramin provider_options_keys key of the provider options map
332
- * \paramin provider_options_values value of the provider options map
333
- * \paramin num_keys Length of the provider options map
334
- */
335
- ORT_API2_STATUS(CreateAndRegisterAllocatorV2, _Inout_ OrtEnv* env, _In_ const char* provider_type, _In_ const OrtMemoryInfo* mem_info, _In_ const OrtArenaCfg* arena_cfg,
336
- _In_reads_(num_keys) const char* const* provider_options_keys, _In_reads_(num_keys) const char* const* provider_options_values, _In_ size_t num_keys);
337
-
338
- /** \brief Run the model asynchronously in a thread owned by intra op thread pool
339
- *
340
- * \paramin session
341
- * \paramin run_options If nullptr, will use a default ::OrtRunOptions
342
- * \paramin input_names Array of null terminated UTF8 encoded strings of the input names
343
- * \paramin input Array of ::OrtValue%s of the input values
344
- * \paramin input_len Number of elements in the input_names and inputs arrays
345
- * \paramin output_names Array of null terminated UTF8 encoded strings of the output names
346
- * \paramin output_names_len Number of elements in the output_names and outputs array
347
- * \paramout output OrtValue* array of size output_names_len.
348
- * On calling RunAsync, outputi could either be a null or a pointer to a preallocated OrtValue.
349
- * Later, the output array will be passed to run_async_callback with all null(s) filled with valid
350
- * OrtValue pointer(s) allocated by onnxruntime.
351
- * NOTE: it is customer's duty to finally release the output array and each of its member,
352
- * regardless of whether the member (OrtValue*) is allocated by onnxruntime or preallocated by the customer.
353
- * \paramin run_async_callback Callback function on model run completion
354
- * \paramin user_data User data that pass back to run_async_callback
355
- */
356
- ORT_API2_STATUS(RunAsync, _Inout_ OrtSession* session, _In_opt_ const OrtRunOptions* run_options,
357
- _In_reads_(input_len) const char* const* input_names,
358
- _In_reads_(input_len) const OrtValue* const* input, size_t input_len,
359
- _In_reads_(output_names_len) const char* const* output_names, size_t output_names_len,
360
- _Inout_updates_all_(output_names_len) OrtValue** output,
361
- _In_ RunAsyncCallbackFn run_async_callback, _In_opt_ void* user_data);
362
-
363
- /**
364
- * Update TensorRT EP provider option where its data type is pointer, for example 'user_compute_stream'.
365
- * If the data type of the provider option can be represented by string please use UpdateTensorRTProviderOptions.
366
- *
367
- * Note: It's caller's responsibility to properly manage the lifetime of the instance pointed by this pointer.
368
- *
369
- * \param tensorrt_options - OrtTensorRTProviderOptionsV2 instance
370
- * \param key - Name of the provider option
371
- * \param value - A pointer to the instance that will be assigned to this provider option
372
- *
373
- * \since Version 1.16.
374
- */
375
- ORT_API2_STATUS(UpdateTensorRTProviderOptionsWithValue, _Inout_ OrtTensorRTProviderOptionsV2* tensorrt_options, _In_ const char* key, _In_ void* value);
376
-
377
- /**
378
- * Get TensorRT EP provider option where its data type is pointer.
379
- * If the data type of the provider option can be represented by string please use GetTensorRTProviderOptionsAsString.
380
- *
381
- * \param tensorrt_options - OrtTensorRTProviderOptionsV2 instance
382
- * \param key - Name of the provider option
383
- * \param ptr - A pointer to the instance that is kept by the provider option
384
- *
385
- * \since Version 1.16.
386
- */
387
- ORT_API2_STATUS(GetTensorRTProviderOptionsByName, _In_ const OrtTensorRTProviderOptionsV2* tensorrt_options, _In_ const char* key, _Outptr_ void** ptr);
388
-
389
- /**
390
- * Update CUDA EP provider option where its data type is pointer, for example 'user_compute_stream'.
391
- * If the data type of the provider option can be represented by string please use UpdateCUDAProviderOptions.
392
- *
393
- * Note: It's caller's responsibility to properly manage the lifetime of the instance pointed by this pointer.
394
- *
395
- * \param cuda_options - OrtCUDAProviderOptionsV2 instance
396
- * \param key - Name of the provider option
397
- * \param value - A pointer to the instance that will be assigned to this provider option
398
- *
399
- * \since Version 1.16.
400
- */
401
- ORT_API2_STATUS(UpdateCUDAProviderOptionsWithValue, _Inout_ OrtCUDAProviderOptionsV2* cuda_options, _In_ const char* key, _In_ void* value);
402
-
403
- /**
404
- * Get CUDA EP provider option where its data type is pointer.
405
- * If the data type of the provider option can be represented by string please use GetCUDAProviderOptionsAsString.
406
- *
407
- * \param cuda_options - OrtCUDAProviderOptionsV2 instance
408
- * \param key - Name of the provider option
409
- * \param ptr - A pointer to the instance that is kept by the provider option
410
- *
411
- * \since Version 1.16.
412
- */
413
- ORT_API2_STATUS(GetCUDAProviderOptionsByName, _In_ const OrtCUDAProviderOptionsV2* cuda_options, _In_ const char* key, _Outptr_ void** ptr);
414
-
415
- /**
416
- * Get a EP resource.
417
- * E.g. a cuda stream or a cublas handle
418
- *
419
- * \param context - Kernel context
420
- * \param resouce_version - Version of the resource
421
- * \param resource_id - Type of resource
422
- * \param resource - A pointer to returned resource
423
- *
424
- * \since Version 1.16.
425
- */
426
- ORT_API2_STATUS(KernelContext_GetResource, _In_ const OrtKernelContext* context, _In_ int resouce_version, _In_ int resource_id, _Outptr_ void** resource);
427
-
428
- /** \brief Set user logging function
429
- *
430
- * By default the logger created by the CreateEnv* functions is used to create the session logger as well.
431
- * This function allows a user to override this default session logger with a logger of their own choosing. This way
432
- * the user doesn't have to create a separate environment with a custom logger. This addresses the problem when
433
- * the user already created an env but now wants to use a different logger for a specific session (for debugging or
434
- * other reasons).
435
- *
436
- * \paramin options
437
- * \paramin user_logging_function A pointer to a logging function.
438
- * \paramin user_logging_param A pointer to arbitrary data passed as the ::OrtLoggingFunction `param` parameter to
439
- * `user_logging_function`. This parameter is optional.
440
- *
441
- * \snippet{doc} snippets.dox OrtStatus Return Value
442
- *
443
- * \since Version 1.17.
444
- */
445
- ORT_API2_STATUS(SetUserLoggingFunction, _Inout_ OrtSessionOptions* options,
446
- _In_ OrtLoggingFunction user_logging_function, _In_opt_ void* user_logging_param);
447
-
448
- /**
449
- * Get number of input from OrtShapeInferContext
450
- *
451
- * \paramin context
452
- * \paramout out The number of inputs
453
- *
454
- * \since Version 1.17.
455
- */
456
- ORT_API2_STATUS(ShapeInferContext_GetInputCount, _In_ const OrtShapeInferContext* context, _Out_ size_t* out);
457
-
458
- /**
459
- * Get type and shape info of an input
460
- *
461
- * \paramin context
462
- * \paramin index The index of the input
463
- * \paramout info Type shape info of the input
464
- *
465
- * \since Version 1.17.
466
- */
467
- ORT_API2_STATUS(ShapeInferContext_GetInputTypeShape, _In_ const OrtShapeInferContext* context, _In_ size_t index, _Outptr_ OrtTensorTypeAndShapeInfo** info);
468
-
469
- /**
470
- * Get attribute from OrtShapeInferContext. Note that OrtShapeInferContext is a per-node context, one could only read attribute from current node.
471
- *
472
- * \paramin context
473
- * \paramin attr_name Name of the attribute
474
- * \paramout attr Handle of the attribute fetched
475
- *
476
- * \since Version 1.17.
477
- */
478
- ORT_API2_STATUS(ShapeInferContext_GetAttribute, _In_ const OrtShapeInferContext* context, _In_ const char* attr_name, _Outptr_ const OrtOpAttr** attr);
479
-
480
- /**
481
- * Set type and shape info of an ouput
482
- *
483
- * \paramin context
484
- * \paramin index The index of the ouput
485
- * \paramout info Type shape info of the output
486
- *
487
- * \since Version 1.17.
488
- */
489
- ORT_API2_STATUS(ShapeInferContext_SetOutputTypeShape, _In_ const OrtShapeInferContext* context, _In_ size_t index, _In_ const OrtTensorTypeAndShapeInfo* info);
490
-
491
- /**
492
- * Set symbolic shape to type shape info
493
- *
- * \param[in] info Type shape info
- * \param[in] dim_params Symbolic strings
- * \param[in] dim_params_length Number of strings
- *
- * \since Version 1.17.
- */
- ORT_API2_STATUS(SetSymbolicDimensions, _In_ OrtTensorTypeAndShapeInfo* info, _In_ const char* dim_params[], _In_ size_t dim_params_length);
-
- /**
- * Read contents of an attribute to data
- *
- * \param[in] op_attr
- * \param[in] type Attribute type
- * \param[out] data Memory address to save raw content of the attribute
- * \param[in] len Number of bytes allowed to store in data
- * \param[out] out Number of bytes required to save the data when the call failed, or the real number of bytes saved to data on success
- *
- * \since Version 1.17.
- */
- ORT_API2_STATUS(ReadOpAttr, _In_ const OrtOpAttr* op_attr, _In_ OrtOpAttrType type, _Inout_ void* data, _In_ size_t len, _Out_ size_t* out);
-
- /** \brief Set whether to use deterministic compute.
- *
- * Default is false. If set to true, this will enable deterministic compute for GPU kernels where possible.
- * Note that this most likely will have a performance cost.
- *
- * \param[in] options
- * \param[in] value
- *
- * \since Version 1.17.
- */
- ORT_API2_STATUS(SetDeterministicCompute, _Inout_ OrtSessionOptions* options, bool value);
-
- /**
- * Run fn in parallel
- *
- * \param[in] context
- * \param[in] fn Function accepting usr_data and an integer as iterator
- * \param[in] total The number of times fn is to be invoked
- * \param[in] num_batch Number of batches by which the "total" is to be divided in maximum. When zero, there is no limit
- * \param[in] usr_data User data to be passed back to fn
- *
- * \since Version 1.17.
- */
- ORT_API2_STATUS(KernelContext_ParallelFor, _In_ const OrtKernelContext* context, _In_ void (*fn)(void*, size_t), _In_ size_t total, _In_ size_t num_batch, _In_ void* usr_data);
-
- /** \brief Append OpenVINO execution provider to the session options
- *
- * If OpenVINO is not available (due to a non OpenVINO enabled build, or if OpenVINO is not installed on the system), this function will fail.
- *
- * \param[in] options
- * \param[in] provider_options_keys
- * \param[in] provider_options_values
- * \param[in] num_keys
- *
- * \snippet{doc} snippets.dox OrtStatus Return Value
- */
- ORT_API2_STATUS(SessionOptionsAppendExecutionProvider_OpenVINO_V2,
- _In_ OrtSessionOptions* options,
- _In_reads_(num_keys) const char* const* provider_options_keys,
- _In_reads_(num_keys) const char* const* provider_options_values,
- _In_ size_t num_keys);
};

/*

struct OrtCustomOp {
uint32_t version; // Must be initialized to ORT_API_VERSION

- // This callback creates the kernel, which is a user defined
- // parameter that is passed to the Kernel* callbacks below. It is
- // recommended to use CreateKernelV2 which allows for a safe error
- // propagation by returning an OrtStatusPtr.
+ // This callback creates the kernel, which is a user defined parameter that is passed to the Kernel* callbacks below.
void*(ORT_API_CALL* CreateKernel)(_In_ const struct OrtCustomOp* op, _In_ const OrtApi* api,
_In_ const OrtKernelInfo* info);

ONNXTensorElementDataType(ORT_API_CALL* GetOutputType)(_In_ const struct OrtCustomOp* op, _In_ size_t index);
size_t(ORT_API_CALL* GetOutputTypeCount)(_In_ const struct OrtCustomOp* op);

- // Perform a computation step. It is recommended to use
- // KernelComputeV2 which allows for a safe error propagation by
- // returning an OrtStatusPtr.
+ // Op kernel callbacks
void(ORT_API_CALL* KernelCompute)(_In_ void* op_kernel, _In_ OrtKernelContext* context);
void(ORT_API_CALL* KernelDestroy)(_In_ void* op_kernel);

// and false (zero) otherwise.
// Applicable only for custom ops that have a variadic output.
int(ORT_API_CALL* GetVariadicOutputHomogeneity)(_In_ const struct OrtCustomOp* op);
-
- // Create the kernel state which is passed to each compute call.
- OrtStatusPtr(ORT_API_CALL* CreateKernelV2)(_In_ const struct OrtCustomOp* op, _In_ const OrtApi* api,
- _In_ const OrtKernelInfo* info,
- _Out_ void** kernel);
-
- // Perform the computation step.
- OrtStatusPtr(ORT_API_CALL* KernelComputeV2)(_In_ void* op_kernel, _In_ OrtKernelContext* context);
-
- OrtStatusPtr(ORT_API_CALL* InferOutputShapeFn)(_In_ const struct OrtCustomOp* op, _In_ OrtShapeInferContext*);
-
- // Get start range
- int(ORT_API_CALL* GetStartVersion)(_In_ const struct OrtCustomOp* op);
- int(ORT_API_CALL* GetEndVersion)(_In_ const struct OrtCustomOp* op);
};

/*

ORT_API_STATUS(OrtSessionOptionsAppendExecutionProvider_CUDA, _In_ OrtSessionOptions* options, int device_id);

/*
- * This is the old way to add the ROCm provider to the session, please use
- * SessionOptionsAppendExecutionProvider_ROCM above to access the latest functionality
- * This function always exists, but will only succeed if Onnxruntime was built with
- * HIP support and the ROCm provider shared library exists
- *
- * \param device_id HIP device id, starts from zero.
- */
-ORT_API_STATUS(OrtSessionOptionsAppendExecutionProvider_ROCM, _In_ OrtSessionOptions* options, int device_id);
-
-/*
 * This is the old way to add the MIGraphX provider to the session, please use
 * SessionOptionsAppendExecutionProvider_MIGraphX above to access the latest functionality
 * This function always exists, but will only succeed if Onnxruntime was built with

 */
ORT_API_STATUS(OrtSessionOptionsAppendExecutionProvider_Dnnl, _In_ OrtSessionOptions* options, int use_arena);

-/*
- * This is the old way to add the TensorRT provider to the session, please use SessionOptionsAppendExecutionProvider_TensorRT_V2 above to access the latest functionality
- * This function always exists, but will only succeed if Onnxruntime was built with TensorRT support and the TensorRT provider shared library exists
- *
- * \param device_id CUDA device id, starts from zero.
- */
-ORT_API_STATUS(OrtSessionOptionsAppendExecutionProvider_Tensorrt, _In_ OrtSessionOptions* options, int device_id);
-
#ifdef __cplusplus
}
#endif
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/onnxruntime_cxx_api.h -> onnxruntime-linux-x64-gpu-1.15.1.tgz/include/onnxruntime_cxx_api.h
Changed
#pragma once
#include "onnxruntime_c_api.h"
-#include "onnxruntime_float16.h"
-
#include <cstddef>
#include <cstdio>
#include <array>

std::vector<std::string> GetAvailableProviders();

/** \brief IEEE 754 half-precision floating point data type
- *
- * \details This struct is used for converting float to float16 and back
- * so the user could feed inputs and fetch outputs using this type.
- *
+ * \details It is necessary for type dispatching to make use of C++ API
+ * The type is implicitly convertible to/from uint16_t.
 * The size of the structure should align with uint16_t and one can freely cast
 * uint16_t buffers to/from Ort::Float16_t to feed and retrieve data.
 *
- * \code{.unparsed}
- * // This example demonstrates conversion from float to float16
- * constexpr float values[] = {1.f, 2.f, 3.f, 4.f, 5.f};
- * std::vector<Ort::Float16_t> fp16_values;
- * fp16_values.reserve(std::size(values));
- * std::transform(std::begin(values), std::end(values), std::back_inserter(fp16_values),
- * [](float value) { return Ort::Float16_t(value); });
+ * Generally, you can feed any of your types as float16/bfloat16 data to create a tensor
+ * on top of it, providing it can form a continuous buffer with 16-bit elements with no padding.
+ * And you can also feed an array of uint16_t elements directly. For example,
 *
+ * \code{.unparsed}
+ * uint16_t values[] = { 15360, 16384, 16896, 17408, 17664};
+ * constexpr size_t values_length = sizeof(values) / sizeof(values[0]);
+ * std::vector<int64_t> dims = {values_length}; // one dimensional example
+ * Ort::MemoryInfo info("Cpu", OrtDeviceAllocator, 0, OrtMemTypeDefault);
+ * // Note we are passing bytes count in this api, not number of elements -> sizeof(values)
+ * auto float16_tensor = Ort::Value::CreateTensor(info, values, sizeof(values),
+ * dims.data(), dims.size(), ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16);
 * \endcode
- */
-struct Float16_t : onnxruntime_float16::Float16Impl<Float16_t> {
- private:
- /// <summary>
- /// Constructor from a 16-bit representation of a float16 value
- /// No conversion is done here.
- /// </summary>
- /// <param name="v">16-bit representation</param>
- constexpr explicit Float16_t(uint16_t v) noexcept { val = v; }
-
- public:
- using Base = onnxruntime_float16::Float16Impl<Float16_t>;
-
- /// <summary>
- /// Default constructor
- /// </summary>
- Float16_t() = default;
-
- /// <summary>
- /// Explicit conversion to uint16_t representation of float16.
- /// </summary>
- /// <param name="v">uint16_t bit representation of float16</param>
- /// <returns>new instance of Float16_t</returns>
- constexpr static Float16_t FromBits(uint16_t v) noexcept { return Float16_t(v); }
-
- /// <summary>
- /// __ctor from float. Float is converted into float16 16-bit representation.
- /// </summary>
- /// <param name="v">float value</param>
- explicit Float16_t(float v) noexcept { val = Base::ToUint16Impl(v); }
-
- /// <summary>
- /// Converts float16 to float
- /// </summary>
- /// <returns>float representation of float16 value</returns>
- float ToFloat() const noexcept { return Base::ToFloatImpl(); }
-
- /// <summary>
- /// Checks if the value is negative
- /// </summary>
- /// <returns>true if negative</returns>
- using Base::IsNegative;
-
- /// <summary>
- /// Tests if the value is NaN
- /// </summary>
- /// <returns>true if NaN</returns>
- using Base::IsNaN;
-
- /// <summary>
- /// Tests if the value is finite
- /// </summary>
- /// <returns>true if finite</returns>
- using Base::IsFinite;
-
- /// <summary>
- /// Tests if the value represents positive infinity.
- /// </summary>
- /// <returns>true if positive infinity</returns>
- using Base::IsPositiveInfinity;
-
- /// <summary>
- /// Tests if the value represents negative infinity
- /// </summary>
- /// <returns>true if negative infinity</returns>
- using Base::IsNegativeInfinity;
-
- /// <summary>
- /// Tests if the value is either positive or negative infinity.
- /// </summary>
- /// <returns>True if absolute value is infinity</returns>
- using Base::IsInfinity;
-
- /// <summary>
- /// Tests if the value is NaN or zero. Useful for comparisons.
- /// </summary>
- /// <returns>True if NaN or zero.</returns>
- using Base::IsNaNOrZero;
-
- /// <summary>
- /// Tests if the value is normal (not zero, subnormal, infinite, or NaN).
- /// </summary>
- /// <returns>True if so</returns>
- using Base::IsNormal;
-
- /// <summary>
- /// Tests if the value is subnormal (denormal).
- /// </summary>
- /// <returns>True if so</returns>
- using Base::IsSubnormal;
-
- /// <summary>
- /// Creates an instance that represents absolute value.
- /// </summary>
- /// <returns>Absolute value</returns>
- using Base::Abs;
-
- /// <summary>
- /// Creates a new instance with the sign flipped.
- /// </summary>
- /// <returns>Flipped sign instance</returns>
- using Base::Negate;
-
- /// <summary>
- /// IEEE defines that positive and negative zero are equal, this gives us a quick equality check
- /// for two values by or'ing the private bits together and stripping the sign. They are both zero,
- /// and therefore equivalent, if the resulting value is still zero.
- /// </summary>
- /// <param name="lhs">first value</param>
- /// <param name="rhs">second value</param>
- /// <returns>True if both arguments represent zero</returns>
- using Base::AreZero;
-
- /// <summary>
- /// User defined conversion operator. Converts Float16_t to float.
- /// </summary>
- explicit operator float() const noexcept { return ToFloat(); }
-
- using Base::operator==;
- using Base::operator!=;
- using Base::operator<;
+ *
+ * Here is another example, a little bit more elaborate. Let's assume that you use your own float16 type and you want to use
+ * a templated version of the API above so the type is automatically set based on your type. You will need to supply an extra
+ * template specialization.
+ *
+ * \code{.unparsed}
+ * namespace yours { struct half {}; } // assume this is your type, define this:
+ * namespace Ort {
+ * template<>
+ * struct TypeToTensorType<yours::half> { static constexpr ONNXTensorElementDataType type = ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16; };
+ * } //namespace Ort
+ *
+ * std::vector<yours::half> values;
+ * std::vector<int64_t> dims = {values.size()}; // one dimensional example
+ * Ort::MemoryInfo info("Cpu", OrtDeviceAllocator, 0, OrtMemTypeDefault);
+ * // Here we are passing element count -> values.size()
+ * auto float16_tensor = Ort::Value::CreateTensor<yours::half>(info, values.data(), values.size(), dims.data(), dims.size());
+ *
+ * \endcode
+ */
+struct Float16_t {
+ uint16_t value;
+ constexpr Float16_t() noexcept : value(0) {}
+ constexpr Float16_t(uint16_t v) noexcept : value(v) {}
+ constexpr operator uint16_t() const noexcept { return value; }
+ constexpr bool operator==(const Float16_t& rhs) const noexcept { return value == rhs.value; };
+ constexpr bool operator!=(const Float16_t& rhs) const noexcept { return value != rhs.value; };
};

static_assert(sizeof(Float16_t) == sizeof(uint16_t), "Sizes must match");

/** \brief bfloat16 (Brain Floating Point) data type
- *
- * \details This struct is used for converting float to bfloat16 and back
- * so the user could feed inputs and fetch outputs using this type.
- *
+ * \details It is necessary for type dispatching to make use of C++ API
+ * The type is implicitly convertible to/from uint16_t.
 * The size of the structure should align with uint16_t and one can freely cast
 * uint16_t buffers to/from Ort::BFloat16_t to feed and retrieve data.
 *
- * \code{.unparsed}
- * // This example demonstrates conversion from float to bfloat16
- * constexpr float values[] = {1.f, 2.f, 3.f, 4.f, 5.f};
- * std::vector<Ort::BFloat16_t> bfp16_values;
- * bfp16_values.reserve(std::size(values));
- * std::transform(std::begin(values), std::end(values), std::back_inserter(bfp16_values),
- * [](float value) { return Ort::BFloat16_t(value); });
- *
- * \endcode
+ * See also code examples for Float16_t above.
 */
-struct BFloat16_t : onnxruntime_float16::BFloat16Impl<BFloat16_t> {
- private:
- /// <summary>
- /// Constructor from a uint16_t representation of bfloat16
- /// used in FromBits() to escape overload resolution issue with
- /// constructor from float.
- /// No conversion is done.
- /// </summary>
- /// <param name="v">16-bit bfloat16 value</param>
- constexpr explicit BFloat16_t(uint16_t v) noexcept { val = v; }
-
- public:
- using Base = onnxruntime_float16::BFloat16Impl<BFloat16_t>;
-
- BFloat16_t() = default;
-
- /// <summary>
- /// Explicit conversion to uint16_t representation of bfloat16.
- /// </summary>
- /// <param name="v">uint16_t bit representation of bfloat16</param>
- /// <returns>new instance of BFloat16_t</returns>
- static constexpr BFloat16_t FromBits(uint16_t v) noexcept { return BFloat16_t(v); }
-
- /// <summary>
- /// __ctor from float. Float is converted into bfloat16 16-bit representation.
- /// </summary>
- /// <param name="v">float value</param>
- explicit BFloat16_t(float v) noexcept { val = Base::ToUint16Impl(v); }
-
- /// <summary>
- /// Converts bfloat16 to float
- /// </summary>
- /// <returns>float representation of bfloat16 value</returns>
- float ToFloat() const noexcept { return Base::ToFloatImpl(); }
-
- /// <summary>
- /// Checks if the value is negative
- /// </summary>
- /// <returns>true if negative</returns>
- using Base::IsNegative;
-
- /// <summary>
- /// Tests if the value is NaN
- /// </summary>
- /// <returns>true if NaN</returns>
- using Base::IsNaN;
-
- /// <summary>
- /// Tests if the value is finite
- /// </summary>
- /// <returns>true if finite</returns>
- using Base::IsFinite;
-
- /// <summary>
- /// Tests if the value represents positive infinity.
- /// </summary>
- /// <returns>true if positive infinity</returns>
- using Base::IsPositiveInfinity;
-
- /// <summary>
- /// Tests if the value represents negative infinity
- /// </summary>
- /// <returns>true if negative infinity</returns>
- using Base::IsNegativeInfinity;
-
- /// <summary>
- /// Tests if the value is either positive or negative infinity.
- /// </summary>
- /// <returns>True if absolute value is infinity</returns>
- using Base::IsInfinity;
-
- /// <summary>
- /// Tests if the value is NaN or zero. Useful for comparisons.
- /// </summary>
- /// <returns>True if NaN or zero.</returns>
- using Base::IsNaNOrZero;
-
- /// <summary>
- /// Tests if the value is normal (not zero, subnormal, infinite, or NaN).
- /// </summary>
- /// <returns>True if so</returns>
- using Base::IsNormal;
-
- /// <summary>
- /// Tests if the value is subnormal (denormal).
- /// </summary>
- /// <returns>True if so</returns>
- using Base::IsSubnormal;
-
- /// <summary>
- /// Creates an instance that represents absolute value.
- /// </summary>
- /// <returns>Absolute value</returns>
- using Base::Abs;
-
- /// <summary>
- /// Creates a new instance with the sign flipped.
- /// </summary>
- /// <returns>Flipped sign instance</returns>
- using Base::Negate;
-
- /// <summary>
- /// IEEE defines that positive and negative zero are equal, this gives us a quick equality check
- /// for two values by or'ing the private bits together and stripping the sign. They are both zero,
- /// and therefore equivalent, if the resulting value is still zero.
- /// </summary>
- /// <param name="lhs">first value</param>
- /// <param name="rhs">second value</param>
- /// <returns>True if both arguments represent zero</returns>
- using Base::AreZero;
-
- /// <summary>
- /// User defined conversion operator. Converts BFloat16_t to float.
- /// </summary>
- explicit operator float() const noexcept { return ToFloat(); }
-
- // We do not have an inherited impl for the below operators
- // as the internal class implements them a little differently
- bool operator==(const BFloat16_t& rhs) const noexcept;
- bool operator!=(const BFloat16_t& rhs) const noexcept { return !(*this == rhs); }
- bool operator<(const BFloat16_t& rhs) const noexcept;
+struct BFloat16_t {
+ uint16_t value;
+ constexpr BFloat16_t() noexcept : value(0) {}
+ constexpr BFloat16_t(uint16_t v) noexcept : value(v) {}
+ constexpr operator uint16_t() const noexcept { return value; }
+ constexpr bool operator==(const BFloat16_t& rhs) const noexcept { return value == rhs.value; };
+ constexpr bool operator!=(const BFloat16_t& rhs) const noexcept { return value != rhs.value; };
};

static_assert(sizeof(BFloat16_t) == sizeof(uint16_t), "Sizes must match");

-/** \brief float8e4m3fn (Float8 Floating Point) data type
- * \details It is necessary for type dispatching to make use of C++ API
- * The type is implicitly convertible to/from uint8_t.
- * See https://onnx.ai/onnx/technical/float8.html for further details.
- */
-struct Float8E4M3FN_t {
- uint8_t value;
- constexpr Float8E4M3FN_t() noexcept : value(0) {}
- constexpr Float8E4M3FN_t(uint8_t v) noexcept : value(v) {}
- constexpr operator uint8_t() const noexcept { return value; }
- // nan values are treated like any other value for operator ==, !=
- constexpr bool operator==(const Float8E4M3FN_t& rhs) const noexcept { return value == rhs.value; };
- constexpr bool operator!=(const Float8E4M3FN_t& rhs) const noexcept { return value != rhs.value; };
-};
-
-static_assert(sizeof(Float8E4M3FN_t) == sizeof(uint8_t), "Sizes must match");
-
-/** \brief float8e4m3fnuz (Float8 Floating Point) data type
- * \details It is necessary for type dispatching to make use of C++ API
- * The type is implicitly convertible to/from uint8_t.
- * See https://onnx.ai/onnx/technical/float8.html for further details.
- */
-struct Float8E4M3FNUZ_t {
- uint8_t value;
- constexpr Float8E4M3FNUZ_t() noexcept : value(0) {}
- constexpr Float8E4M3FNUZ_t(uint8_t v) noexcept : value(v) {}
- constexpr operator uint8_t() const noexcept { return value; }
- // nan values are treated like any other value for operator ==, !=
- constexpr bool operator==(const Float8E4M3FNUZ_t& rhs) const noexcept { return value == rhs.value; };
- constexpr bool operator!=(const Float8E4M3FNUZ_t& rhs) const noexcept { return value != rhs.value; };
-};
-
-static_assert(sizeof(Float8E4M3FNUZ_t) == sizeof(uint8_t), "Sizes must match");
-
-/** \brief float8e5m2 (Float8 Floating Point) data type
- * \details It is necessary for type dispatching to make use of C++ API
- * The type is implicitly convertible to/from uint8_t.
- * See https://onnx.ai/onnx/technical/float8.html for further details.
- */
-struct Float8E5M2_t {
- uint8_t value;
- constexpr Float8E5M2_t() noexcept : value(0) {}
- constexpr Float8E5M2_t(uint8_t v) noexcept : value(v) {}
- constexpr operator uint8_t() const noexcept { return value; }
- // nan values are treated like any other value for operator ==, !=
- constexpr bool operator==(const Float8E5M2_t& rhs) const noexcept { return value == rhs.value; };
- constexpr bool operator!=(const Float8E5M2_t& rhs) const noexcept { return value != rhs.value; };
-};
-
-static_assert(sizeof(Float8E5M2_t) == sizeof(uint8_t), "Sizes must match");
-
-/** \brief float8e5m2fnuz (Float8 Floating Point) data type
- * \details It is necessary for type dispatching to make use of C++ API
- * The type is implicitly convertible to/from uint8_t.
- * See https://onnx.ai/onnx/technical/float8.html for further details.
- */
-struct Float8E5M2FNUZ_t {
- uint8_t value;
- constexpr Float8E5M2FNUZ_t() noexcept : value(0) {}
- constexpr Float8E5M2FNUZ_t(uint8_t v) noexcept : value(v) {}
- constexpr operator uint8_t() const noexcept { return value; }
- // nan values are treated like any other value for operator ==, !=
- constexpr bool operator==(const Float8E5M2FNUZ_t& rhs) const noexcept { return value == rhs.value; };
- constexpr bool operator!=(const Float8E5M2FNUZ_t& rhs) const noexcept { return value != rhs.value; };
-};
-
-static_assert(sizeof(Float8E5M2FNUZ_t) == sizeof(uint8_t), "Sizes must match");
-
namespace detail {
// This is used internally by the C++ API. This macro is to make it easy to generate overloaded methods for all of the various OrtRelease* functions for every Ort* type
// This can't be done in the C API since C doesn't have function overloading.

Env& UpdateEnvWithCustomLogLevel(OrtLoggingLevel log_severity_level); ///< Wraps OrtApi::UpdateEnvWithCustomLogLevel

Env& CreateAndRegisterAllocator(const OrtMemoryInfo* mem_info, const OrtArenaCfg* arena_cfg); ///< Wraps OrtApi::CreateAndRegisterAllocator
-
- Env& CreateAndRegisterAllocatorV2(const std::string& provider_type, const OrtMemoryInfo* mem_info, const std::unordered_map<std::string, std::string>& options, const OrtArenaCfg* arena_cfg); ///< Wraps OrtApi::CreateAndRegisterAllocatorV2
};

/** \brief Custom Op Domain

SessionOptionsImpl& SetIntraOpNumThreads(int intra_op_num_threads); ///< Wraps OrtApi::SetIntraOpNumThreads
SessionOptionsImpl& SetInterOpNumThreads(int inter_op_num_threads); ///< Wraps OrtApi::SetInterOpNumThreads
SessionOptionsImpl& SetGraphOptimizationLevel(GraphOptimizationLevel graph_optimization_level); ///< Wraps OrtApi::SetSessionGraphOptimizationLevel
- SessionOptionsImpl& SetDeterministicCompute(bool value); ///< Wraps OrtApi::SetDeterministicCompute

SessionOptionsImpl& EnableCpuMemArena(); ///< Wraps OrtApi::EnableCpuMemArena
SessionOptionsImpl& DisableCpuMemArena(); ///< Wraps OrtApi::DisableCpuMemArena

SessionOptionsImpl& AddInitializer(const char* name, const OrtValue* ort_val); ///< Wraps OrtApi::AddInitializer
SessionOptionsImpl& AddExternalInitializers(const std::vector<std::string>& names, const std::vector<Value>& ort_values); ///< Wraps OrtApi::AddExternalInitializers

- SessionOptionsImpl& AppendExecutionProvider_CUDA(const OrtCUDAProviderOptions& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_CUDA
- SessionOptionsImpl& AppendExecutionProvider_CUDA_V2(const OrtCUDAProviderOptionsV2& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_CUDA_V2
- SessionOptionsImpl& AppendExecutionProvider_ROCM(const OrtROCMProviderOptions& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_ROCM
- SessionOptionsImpl& AppendExecutionProvider_OpenVINO(const OrtOpenVINOProviderOptions& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_OpenVINO
- ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_OpenVINO_V2
- SessionOptionsImpl& AppendExecutionProvider_OpenVINO_V2(const std::unordered_map<std::string, std::string>& provider_options = {});
+ SessionOptionsImpl& AppendExecutionProvider_CUDA(const OrtCUDAProviderOptions& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_CUDA
+ SessionOptionsImpl& AppendExecutionProvider_CUDA_V2(const OrtCUDAProviderOptionsV2& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_CUDA_V2
+ SessionOptionsImpl& AppendExecutionProvider_ROCM(const OrtROCMProviderOptions& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_ROCM
+ SessionOptionsImpl& AppendExecutionProvider_OpenVINO(const OrtOpenVINOProviderOptions& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_OpenVINO
SessionOptionsImpl& AppendExecutionProvider_TensorRT(const OrtTensorRTProviderOptions& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_TensorRT
SessionOptionsImpl& AppendExecutionProvider_TensorRT_V2(const OrtTensorRTProviderOptionsV2& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_TensorRT
SessionOptionsImpl& AppendExecutionProvider_MIGraphX(const OrtMIGraphXProviderOptions& provider_options); ///< Wraps OrtApi::SessionOptionsAppendExecutionProvider_MIGraphX

void Run(const RunOptions& run_options, const IoBinding&); ///< Wraps OrtApi::RunWithBinding

- /** \brief Run the model asynchronously in a thread owned by intra op thread pool
- *
- * Wraps OrtApi::RunAsync
- *
- * \param[in] run_options
- * \param[in] input_names Array of null terminated UTF8 encoded strings of the input names
- * \param[in] input_values Array of Value objects of length input_count
- * \param[in] input_count Number of elements in the input_names and inputs arrays
- * \param[in] output_names Array of null terminated UTF8 encoded strings of the output names
- * \param[out] output_values Array of provided Values to be filled with outputs.
- * On calling RunAsync, output_values[i] could either be initialized by a null pointer or a preallocated OrtValue*.
- * Later, on invoking the callback, each output_values[i] of null will be filled with an OrtValue* allocated by onnxruntime.
- * Then, an OrtValue** pointer will be casted from output_values, and passed to the callback.
- * NOTE: it is the customer's duty to finally release output_values and each of its members,
- * regardless of whether the member (Ort::Value) is allocated by onnxruntime or preallocated by the customer.
- * \param[in] output_count Number of elements in the output_names and outputs array
- * \param[in] callback Callback function on model run completion
- * \param[in] user_data User data passed back to the callback
- */
- void RunAsync(const RunOptions& run_options, const char* const* input_names, const Value* input_values, size_t input_count,
- const char* const* output_names, Value* output_values, size_t output_count, RunAsyncCallbackFn callback, void* user_data);
-
/** \brief End profiling and return a copy of the profiling file name.
 *
 * \param allocator to allocate memory for the copy of the string returned

static Value CreateTensor(const OrtMemoryInfo* info, T* p_data, size_t p_data_element_count, const int64_t* shape, size_t shape_len);

/** \brief Creates a tensor with a user supplied buffer. Wraps OrtApi::CreateTensorWithDataAsOrtValue.
- *
 * \param info Memory description of where the p_data buffer resides (CPU vs GPU etc).
 * \param p_data Pointer to the data buffer.
 * \param p_data_byte_count The number of bytes in the data buffer.

static Value CreateTensor(const OrtMemoryInfo* info, void* p_data, size_t p_data_byte_count, const int64_t* shape, size_t shape_len,
ONNXTensorElementDataType type);

- /** \brief Creates an OrtValue with a tensor using a supplied OrtAllocator. Wraps OrtApi::CreateTensorAsOrtValue.
- * This overload will allocate the buffer for the tensor according to the supplied shape and data type.
- * The allocated buffer will be owned by the returned OrtValue and will be freed when the OrtValue is released.
- * The input data would need to be copied into the allocated buffer.
- * This API is not suitable for strings.
- *
+ /** \brief Creates a tensor using a supplied OrtAllocator. Wraps OrtApi::CreateTensorAsOrtValue.
 * \tparam T The numeric datatype. This API is not suitable for strings.
 * \param allocator The allocator to use.
 * \param shape Pointer to the tensor shape dimensions.

template <typename T>
static Value CreateTensor(OrtAllocator* allocator, const int64_t* shape, size_t shape_len);

- /** \brief Creates an OrtValue with a tensor using the supplied OrtAllocator.
- * Wraps OrtApi::CreateTensorAsOrtValue.
- * The allocated buffer will be owned by the returned OrtValue and will be freed when the OrtValue is released.
- * The input data would need to be copied into the allocated buffer.
- * This API is not suitable for strings.
- *
+ /** \brief Creates a tensor using a supplied OrtAllocator. Wraps OrtApi::CreateTensorAsOrtValue.
 * \param allocator The allocator to use.
 * \param shape Pointer to the tensor shape dimensions.
 * \param shape_len The number of tensor shape dimensions.

 */
static Value CreateTensor(OrtAllocator* allocator, const int64_t* shape, size_t shape_len, ONNXTensorElementDataType type);

- /** \brief Creates an OrtValue with a Map Onnx type representation.
- * The API would ref-count the supplied OrtValues and they will be released
- * when the returned OrtValue is released. The caller may release keys and values after the call
- * returns.
- *
- * \param keys an OrtValue containing a tensor with primitive data type keys.
- * \param values an OrtValue that may contain a tensor. Ort currently supports only primitive data type values.
- */
- static Value CreateMap(const Value& keys, const Value& values); ///< Wraps OrtApi::CreateValue
-
- /** \brief Creates an OrtValue with a Sequence Onnx type representation.
- * The API would ref-count the supplied OrtValues and they will be released
- * when the returned OrtValue is released. The caller may release the values after the call
- * returns.
- *
- * \param values a vector of OrtValues that must have the same Onnx value type.
- */
- static Value CreateSequence(const std::vector<Value>& values); ///< Wraps OrtApi::CreateValue
+ static Value CreateMap(Value& keys, Value& values); ///< Wraps OrtApi::CreateValue
+ static Value CreateSequence(std::vector<Value>& values); ///< Wraps OrtApi::CreateValue

- /** \brief Creates an OrtValue wrapping an Opaque type.
- * This is used for experimental support of non-tensor types.
- *
- * \tparam T - the type of the value.
- * \param domain - zero terminated utf-8 string. Domain of the type.
- * \param type_name - zero terminated utf-8 string. Name of the type.
- * \param value - the value to be wrapped.
- */
template <typename T>
- static Value CreateOpaque(const char* domain, const char* type_name, const T& value); ///< Wraps OrtApi::CreateOpaqueValue
+ static Value CreateOpaque(const char* domain, const char* type_name, const T&); ///< Wraps OrtApi::CreateOpaqueValue

#if !defined(DISABLE_SPARSE_TENSORS)
556
/// <summary>
557
558
void* GetGPUComputeStream() const;
559
Logger GetLogger() const;
560
OrtAllocator* GetAllocator(const OrtMemoryInfo& memory_info) const;
561
- OrtKernelContext* GetOrtKernelContext() const { return ctx_; }
562
- void ParallelFor(void (*fn)(void*, size_t), size_t total, size_t num_batch, void* usr_data) const;
563
564
private:
565
OrtKernelContext* ctx_;
566
567
};
568
569
 /// <summary>
-/// Provide access to per-node attributes and input shapes, so one could compute and set output shapes.
+/// This entire structure is deprecated, but we not marking
+/// it as a whole yet since we want to preserve for the next release.
 /// </summary>
-struct ShapeInferContext {
-  struct SymbolicInteger {
-    SymbolicInteger(int64_t i) : i_(i), is_int_(true){};
-    SymbolicInteger(const char* s) : s_(s), is_int_(false){};
-    SymbolicInteger(const SymbolicInteger&) = default;
-    SymbolicInteger(SymbolicInteger&&) = default;
+struct CustomOpApi {
+  CustomOpApi(const OrtApi& api) : api_(api) {}
+
+  /** \deprecated use Ort::Value::GetTensorTypeAndShape()
+   * [[deprecated]]
+   * This interface produces a pointer that must be released. Not exception safe.
+   */
+  [[deprecated("use Ort::Value::GetTensorTypeAndShape()")]] OrtTensorTypeAndShapeInfo* GetTensorTypeAndShape(_In_ const OrtValue* value);
+
+  /** \deprecated use Ort::TensorTypeAndShapeInfo::GetElementCount()
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::TensorTypeAndShapeInfo::GetElementCount()")]] size_t GetTensorShapeElementCount(_In_ const OrtTensorTypeAndShapeInfo* info);
+
+  /** \deprecated use Ort::TensorTypeAndShapeInfo::GetElementType()
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::TensorTypeAndShapeInfo::GetElementType()")]] ONNXTensorElementDataType GetTensorElementType(const OrtTensorTypeAndShapeInfo* info);
+
+  /** \deprecated use Ort::TensorTypeAndShapeInfo::GetDimensionsCount()
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::TensorTypeAndShapeInfo::GetDimensionsCount()")]] size_t GetDimensionsCount(_In_ const OrtTensorTypeAndShapeInfo* info);

-    SymbolicInteger& operator=(const SymbolicInteger&) = default;
-    SymbolicInteger& operator=(SymbolicInteger&&) = default;
+  /** \deprecated use Ort::TensorTypeAndShapeInfo::GetShape()
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::TensorTypeAndShapeInfo::GetShape()")]] void GetDimensions(_In_ const OrtTensorTypeAndShapeInfo* info, _Out_ int64_t* dim_values, size_t dim_values_length);

-    bool operator==(const SymbolicInteger& dim) const {
-      if (is_int_ == dim.is_int_) {
-        if (is_int_) {
-          return i_ == dim.i_;
-        } else {
-          return std::string{s_} == std::string{dim.s_};
-        }
-      }
-      return false;
-    }
+  /** \deprecated
+   * [[deprecated]]
+   * This interface sets dimensions to TensorTypeAndShapeInfo, but has no effect on the OrtValue.
+   */
+  [[deprecated("Do not use")]] void SetDimensions(OrtTensorTypeAndShapeInfo* info, _In_ const int64_t* dim_values, size_t dim_count);
-    bool IsInt() const { return is_int_; }
-    int64_t AsInt() const { return i_; }
-    const char* AsSym() const { return s_; }
+  /** \deprecated use Ort::Value::GetTensorMutableData()
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  template <typename T>
+  [[deprecated("use Ort::Value::GetTensorMutableData()")]] T* GetTensorMutableData(_Inout_ OrtValue* value);

-    static constexpr int INVALID_INT_DIM = -2;
+  /** \deprecated use Ort::Value::GetTensorData()
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  template <typename T>
+  [[deprecated("use Ort::Value::GetTensorData()")]] const T* GetTensorData(_Inout_ const OrtValue* value);

-   private:
-    union {
-      int64_t i_;
-      const char* s_;
-    };
-    bool is_int_;
-  };
+  /** \deprecated use Ort::Value::GetTensorMemoryInfo()
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::Value::GetTensorMemoryInfo()")]] const OrtMemoryInfo* GetTensorMemoryInfo(_In_ const OrtValue* value);
+
+  /** \deprecated use Ort::TensorTypeAndShapeInfo::GetShape()
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::TensorTypeAndShapeInfo::GetShape()")]] std::vector<int64_t> GetTensorShape(const OrtTensorTypeAndShapeInfo* info);
-  using Shape = std::vector<SymbolicInteger>;
+  /** \deprecated use TensorTypeAndShapeInfo instances for automatic ownership.
+   * [[deprecated]]
+   * This interface is not exception safe.
+   */
+  [[deprecated("use TensorTypeAndShapeInfo")]] void ReleaseTensorTypeAndShapeInfo(OrtTensorTypeAndShapeInfo* input);
+
+  /** \deprecated use Ort::KernelContext::GetInputCount
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::KernelContext::GetInputCount")]] size_t KernelContext_GetInputCount(const OrtKernelContext* context);
+
+  /** \deprecated use Ort::KernelContext::GetInput
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::KernelContext::GetInput")]] const OrtValue* KernelContext_GetInput(const OrtKernelContext* context, _In_ size_t index);
+
+  /** \deprecated use Ort::KernelContext::GetOutputCount
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::KernelContext::GetOutputCount")]] size_t KernelContext_GetOutputCount(const OrtKernelContext* context);
+
+  /** \deprecated use Ort::KernelContext::GetOutput
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::KernelContext::GetOutput")]] OrtValue* KernelContext_GetOutput(OrtKernelContext* context, _In_ size_t index, _In_ const int64_t* dim_values, size_t dim_count);
-  ShapeInferContext(const OrtApi* ort_api, OrtShapeInferContext* ctx);
+  /** \deprecated use Ort::KernelContext::GetGPUComputeStream
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::KernelContext::GetGPUComputeStream")]] void* KernelContext_GetGPUComputeStream(const OrtKernelContext* context);

-  const Shape& GetInputShape(size_t indice) const { return input_shapes_.at(indice); }
+  /** \deprecated use Ort::ThrowOnError()
+   * [[deprecated]]
+   * This interface is redundant.
+   */
+  [[deprecated("use Ort::ThrowOnError()")]] void ThrowOnError(OrtStatus* result);

-  size_t GetInputCount() const { return input_shapes_.size(); }
+  /** \deprecated use Ort::OpAttr
+   * [[deprecated]]
+   * This interface is not exception safe.
+   */
+  [[deprecated("use Ort::OpAttr")]] OrtOpAttr* CreateOpAttr(_In_ const char* name,
+                                                            _In_ const void* data,
+                                                            _In_ int len,
+                                                            _In_ OrtOpAttrType type);

-  Status SetOutputShape(size_t indice, const Shape& shape);
+  /** \deprecated use Ort::OpAttr
+   * [[deprecated]]
+   * This interface is not exception safe.
+   */
+  [[deprecated("use Ort::OpAttr")]] void ReleaseOpAttr(_Frees_ptr_opt_ OrtOpAttr* op_attr);

-  int64_t GetAttrInt(const char* attr_name);
+  /** \deprecated use Ort::Op
+   * [[deprecated]]
+   * This interface is not exception safe.
+   */
+  [[deprecated("use Ort::Op")]] OrtOp* CreateOp(_In_ const OrtKernelInfo* info,
+                                                _In_z_ const char* op_name,
+                                                _In_z_ const char* domain,
+                                                int version,
+                                                _In_reads_(type_constraint_count) const char** type_constraint_names,
+                                                _In_reads_(type_constraint_count) const ONNXTensorElementDataType* type_constraint_values,
+                                                int type_constraint_count,
+                                                _In_reads_(attr_count) const OrtOpAttr* const* attr_values,
+                                                int attr_count,
+                                                int input_count,
+                                                int output_count);
-  using Ints = std::vector<int64_t>;
-  Ints GetAttrInts(const char* attr_name);
+  /** \deprecated use Ort::Op::Invoke
+   * [[deprecated]]
+   * This interface is redundant
+   */
+  [[deprecated("use Ort::Op::Invoke")]] void InvokeOp(_In_ const OrtKernelContext* context,
+                                                      _In_ const OrtOp* ort_op,
+                                                      _In_ const OrtValue* const* input_values,
+                                                      _In_ int input_count,
+                                                      _Inout_ OrtValue* const* output_values,
+                                                      _In_ int output_count);

-  float GetAttrFloat(const char* attr_name);
+  /** \deprecated use Ort::Op for automatic lifespan management.
+   * [[deprecated]]
+   * This interface is not exception safe.
+   */
+  [[deprecated("use Ort::Op")]] void ReleaseOp(_Frees_ptr_opt_ OrtOp* ort_op);

-  using Floats = std::vector<float>;
-  Floats GetAttrFloats(const char* attr_name);
+  /** \deprecated use Ort::KernelInfo for automatic lifespan management or for
+   * querying attributes
+   * [[deprecated]]
+   * This interface is redundant
+   */
+  template <typename T>  // T is only implemented for std::vector<float>, std::vector<int64_t>, float, int64_t, and string
+  [[deprecated("use Ort::KernelInfo::GetAttribute")]] T KernelInfoGetAttribute(_In_ const OrtKernelInfo* info, _In_ const char* name);

-  std::string GetAttrString(const char* attr_name);
+  /** \deprecated use Ort::KernelInfo::Copy
+   * querying attributes
+   * [[deprecated]]
+   * This interface is not exception safe
+   */
+  [[deprecated("use Ort::KernelInfo::Copy")]] OrtKernelInfo* CopyKernelInfo(_In_ const OrtKernelInfo* info);

-  using Strings = std::vector<std::string>;
-  Strings GetAttrStrings(const char* attr_name);
+  /** \deprecated use Ort::KernelInfo for lifespan management
+   * querying attributes
+   * [[deprecated]]
+   * This interface is not exception safe
+   */
+  [[deprecated("use Ort::KernelInfo")]] void ReleaseKernelInfo(_Frees_ptr_opt_ OrtKernelInfo* info_copy);
  private:
-  const OrtOpAttr* GetAttrHdl(const char* attr_name) const;
-  const OrtApi* ort_api_;
-  OrtShapeInferContext* ctx_;
-  std::vector<Shape> input_shapes_;
+  const OrtApi& api_;
 };
-using ShapeInferFn = Ort::Status (*)(Ort::ShapeInferContext&);
-
-#define MAX_CUSTOM_OP_END_VER (1UL << 31) - 1
-
-template <typename TOp, typename TKernel, bool WithStatus = false>
+template <typename TOp, typename TKernel>
 struct CustomOpBase : OrtCustomOp {
   CustomOpBase() {
     OrtCustomOp::version = ORT_API_VERSION;
+    OrtCustomOp::CreateKernel = [](const OrtCustomOp* this_, const OrtApi* api, const OrtKernelInfo* info) { return static_cast<const TOp*>(this_)->CreateKernel(*api, info); };
     OrtCustomOp::GetName = [](const OrtCustomOp* this_) { return static_cast<const TOp*>(this_)->GetName(); };

     OrtCustomOp::GetExecutionProviderType = [](const OrtCustomOp* this_) { return static_cast<const TOp*>(this_)->GetExecutionProviderType(); };

     OrtCustomOp::GetOutputTypeCount = [](const OrtCustomOp* this_) { return static_cast<const TOp*>(this_)->GetOutputTypeCount(); };
     OrtCustomOp::GetOutputType = [](const OrtCustomOp* this_, size_t index) { return static_cast<const TOp*>(this_)->GetOutputType(index); };

+    OrtCustomOp::KernelCompute = [](void* op_kernel, OrtKernelContext* context) { static_cast<TKernel*>(op_kernel)->Compute(context); };
 #if defined(_MSC_VER) && !defined(__clang__)
 #pragma warning(push)
 #pragma warning(disable : 26409)

     OrtCustomOp::GetVariadicInputHomogeneity = [](const OrtCustomOp* this_) { return static_cast<int>(static_cast<const TOp*>(this_)->GetVariadicInputHomogeneity()); };
     OrtCustomOp::GetVariadicOutputMinArity = [](const OrtCustomOp* this_) { return static_cast<const TOp*>(this_)->GetVariadicOutputMinArity(); };
     OrtCustomOp::GetVariadicOutputHomogeneity = [](const OrtCustomOp* this_) { return static_cast<int>(static_cast<const TOp*>(this_)->GetVariadicOutputHomogeneity()); };
-#ifdef __cpp_if_constexpr
-    if constexpr (WithStatus) {
-#else
-    if (WithStatus) {
-#endif
-      OrtCustomOp::CreateKernelV2 = [](const OrtCustomOp* this_, const OrtApi* api, const OrtKernelInfo* info, void** op_kernel) -> OrtStatusPtr {
-        return static_cast<const TOp*>(this_)->CreateKernelV2(*api, info, op_kernel);
-      };
-      OrtCustomOp::KernelComputeV2 = [](void* op_kernel, OrtKernelContext* context) -> OrtStatusPtr {
-        return static_cast<TKernel*>(op_kernel)->ComputeV2(context);
-      };
-    } else {
-      OrtCustomOp::CreateKernelV2 = nullptr;
-      OrtCustomOp::KernelComputeV2 = nullptr;
-
-      OrtCustomOp::CreateKernel = [](const OrtCustomOp* this_, const OrtApi* api, const OrtKernelInfo* info) { return static_cast<const TOp*>(this_)->CreateKernel(*api, info); };
-      OrtCustomOp::KernelCompute = [](void* op_kernel, OrtKernelContext* context) {
-        static_cast<TKernel*>(op_kernel)->Compute(context);
-      };
-    }
-
-    SetShapeInferFn<TOp>(0);
-
-    OrtCustomOp::GetStartVersion = [](const OrtCustomOp* this_) {
-      return static_cast<const TOp*>(this_)->start_ver_;
-    };
-
-    OrtCustomOp::GetEndVersion = [](const OrtCustomOp* this_) {
-      return static_cast<const TOp*>(this_)->end_ver_;
-    };
   }
   // Default implementation of GetExecutionProviderType that returns nullptr to default to the CPU provider

     return std::vector<std::string>{};
   }

-  template <typename C>
-  decltype(&C::InferOutputShape) SetShapeInferFn(decltype(&C::InferOutputShape)) {
-    OrtCustomOp::InferOutputShapeFn = [](const OrtCustomOp*, OrtShapeInferContext* ort_ctx) -> OrtStatusPtr {
-      ShapeInferContext ctx(&GetApi(), ort_ctx);
-      return C::InferOutputShape(ctx);
-    };
-    return {};
-  }
-
-  template <typename C>
-  void SetShapeInferFn(...) {
-    OrtCustomOp::InferOutputShapeFn = {};
-  }
-
  protected:
   // Helper function that returns a map of session config entries specified by CustomOpBase::GetSessionConfigKeys.
   void GetSessionConfigs(std::unordered_map<std::string, std::string>& out, ConstSessionOptions options) const;
-
-  int start_ver_ = 1;
-  int end_ver_ = MAX_CUSTOM_OP_END_VER;
 };

 }  // namespace Ort
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/onnxruntime_cxx_inline.h -> onnxruntime-linux-x64-gpu-1.15.1.tgz/include/onnxruntime_cxx_inline.h
Changed
 // These are the inline implementations of the C++ header APIs. They're in this separate file as to not clutter
 // the main C++ file with implementation details.

-#include <cstring>
-#include <functional>
-
-#define RETURN_ON_API_FAIL(expression) \
-  {                                    \
-    auto err = (expression);           \
-    if (err) {                         \
-      return Status(err);              \
-    }                                  \
-  }
-
 namespace Ort {

 namespace detail {

   static constexpr ONNXTensorElementDataType type = ONNX_TENSOR_ELEMENT_DATA_TYPE_BOOL;
 };
-template <>
-struct TypeToTensorType<Float8E4M3FN_t> {
-  static constexpr ONNXTensorElementDataType type = ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT8E4M3FN;
-};
-template <>
-struct TypeToTensorType<Float8E4M3FNUZ_t> {
-  static constexpr ONNXTensorElementDataType type = ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT8E4M3FNUZ;
-};
-template <>
-struct TypeToTensorType<Float8E5M2_t> {
-  static constexpr ONNXTensorElementDataType type = ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT8E5M2;
-};
-template <>
-struct TypeToTensorType<Float8E5M2FNUZ_t> {
-  static constexpr ONNXTensorElementDataType type = ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT8E5M2FNUZ;
-};
-
-inline bool BFloat16_t::operator==(const BFloat16_t& rhs) const noexcept {
-  if (IsNaN() || rhs.IsNaN()) {
-    // IEEE defines that NaN is not equal to anything, including itself.
-    return false;
-  }
-  return val == rhs.val;
-}
-
-inline bool BFloat16_t::operator<(const BFloat16_t& rhs) const noexcept {
-  if (IsNaN() || rhs.IsNaN()) {
-    // IEEE defines that NaN is unordered with respect to everything, including itself.
-    return false;
-  }
-
-  const bool left_is_negative = IsNegative();
-  if (left_is_negative != rhs.IsNegative()) {
-    // When the signs of left and right differ, we know that left is less than right if it is
-    // the negative value. The exception to this is if both values are zero, in which case IEEE
-    // says they should be equal, even if the signs differ.
-    return left_is_negative && !AreZero(*this, rhs);
-  }
-  return (val != rhs.val) && ((val < rhs.val) ^ left_is_negative);
-}
-
 inline MemoryAllocation::MemoryAllocation(OrtAllocator* allocator, void* p, size_t size)
     : allocator_(allocator), p_(p), size_(size) {
 }

   return *this;
 }

-inline Env& Env::CreateAndRegisterAllocatorV2(const std::string& provider_type, const OrtMemoryInfo* mem_info, const std::unordered_map<std::string, std::string>& options, const OrtArenaCfg* arena_cfg) {
-  std::vector<const char*> keys, values;
-  auto num_entries = options.size();
-  if (num_entries > 0) {
-    keys.reserve(num_entries);
-    values.reserve(num_entries);
-    for (const auto& entry : options) {
-      keys.push_back(entry.first.c_str());
-      values.push_back(entry.second.c_str());
-    }
-  }
-  ThrowOnError(GetApi().CreateAndRegisterAllocatorV2(p_, provider_type.c_str(), mem_info, arena_cfg, keys.data(), values.data(), num_entries));
-  return *this;
-}
-
 inline CustomOpDomain::CustomOpDomain(const char* domain) {
   ThrowOnError(GetApi().CreateCustomOpDomain(domain, &p_));
 }

 }
 template <typename T>
-inline SessionOptionsImpl<T>& SessionOptionsImpl<T>::SetDeterministicCompute(bool value) {
-  ThrowOnError(GetApi().SetDeterministicCompute(this->p_, value));
-  return *this;
-}
-
-template <typename T>
 inline SessionOptionsImpl<T>& SessionOptionsImpl<T>::SetOptimizedModelFilePath(const ORTCHAR_T* optimized_model_filepath) {
   ThrowOnError(GetApi().SetOptimizedModelFilePath(this->p_, optimized_model_filepath));
   return *this;

 }
 template <typename T>
-inline SessionOptionsImpl<T>& SessionOptionsImpl<T>::AppendExecutionProvider_OpenVINO_V2(const std::unordered_map<std::string, std::string>& provider_options) {
-  auto num_entries = provider_options.size();
-  std::vector<const char*> keys, values;
-  if (num_entries > 0) {
-    keys.reserve(num_entries);
-    values.reserve(num_entries);
-
-    for (const auto& entry : provider_options) {
-      keys.push_back(entry.first.c_str());
-      values.push_back(entry.second.c_str());
-    }
-  }
-
-  ThrowOnError(GetApi().SessionOptionsAppendExecutionProvider_OpenVINO_V2(this->p_,
-                                                                          keys.data(), values.data(), num_entries));
-
-  return *this;
-}
-
-template <typename T>
 inline SessionOptionsImpl<T>& SessionOptionsImpl<T>::RegisterCustomOpsLibrary(const ORTCHAR_T* library_name,
                                                                               const CustomOpConfigs& custom_op_configs) {
   // Add custom op config entries before registering the custom op library. Otherwise, the config entries _may_ be ignored by

 }
 template <typename T>
-inline void SessionImpl<T>::RunAsync(const RunOptions& run_options, const char* const* input_names, const Value* input_values, size_t input_count,
-                                     const char* const* output_names, Value* output_values, size_t output_count, RunAsyncCallbackFn callback, void* user_data) {
-  auto ort_input_values = reinterpret_cast<const OrtValue* const*>(input_values);
-  auto ort_output_values = reinterpret_cast<OrtValue**>(output_values);
-  ThrowOnError(GetApi().RunAsync(this->p_, run_options, input_names,
-                                 ort_input_values, input_count, output_names, output_count,
-                                 ort_output_values, callback, user_data));
-}
-
-template <typename T>
 inline AllocatedStringPtr SessionImpl<T>::EndProfilingAllocated(OrtAllocator* allocator) {
   char* out = nullptr;
   ThrowOnError(GetApi().SessionEndProfiling(this->p_, allocator, &out));

 }
 #endif  // !defined(DISABLE_SPARSE_TENSORS)
-inline Value Value::CreateMap(const Value& keys, const Value& values) {
+inline Value Value::CreateMap(Value& keys, Value& values) {
   OrtValue* out;
-  const OrtValue* inputs[2] = {keys, values};
+  OrtValue* inputs[2] = {keys, values};
   ThrowOnError(GetApi().CreateValue(inputs, 2, ONNX_TYPE_MAP, &out));
   return Value{out};
 }

-inline Value Value::CreateSequence(const std::vector<Value>& values) {
+inline Value Value::CreateSequence(std::vector<Value>& values) {
   OrtValue* out;
-  std::vector<const OrtValue*> values_ort{values.data(), values.data() + values.size()};
+  std::vector<OrtValue*> values_ort{values.data(), values.data() + values.size()};
   ThrowOnError(GetApi().CreateValue(values_ort.data(), values_ort.size(), ONNX_TYPE_SEQUENCE, &out));
   return Value{out};
 }

   return Logger{out};
 }
-inline void KernelContext::ParallelFor(void (*fn)(void*, size_t), size_t total, size_t num_batch, void* usr_data) const {
-  ThrowOnError(GetApi().KernelContext_ParallelFor(ctx_, fn, total, num_batch, usr_data));
-}
-
 inline OpAttr::OpAttr(const char* name, const void* data, int len, OrtOpAttrType type) {
   Ort::ThrowOnError(GetApi().CreateOpAttr(name, data, len, type, &p_));
 }

                        output_values, static_cast<int>(output_count)));
 }
+inline void CustomOpApi::ThrowOnError(OrtStatus* status) {
+  Ort::ThrowOnError(status);
+}
+
+template <>
+inline float CustomOpApi::KernelInfoGetAttribute<float>(_In_ const OrtKernelInfo* info, _In_ const char* name) {
+  float out;
+  Ort::ThrowOnError(api_.KernelInfoGetAttribute_float(info, name, &out));
+  return out;
+}
+
+template <>
+inline int64_t CustomOpApi::KernelInfoGetAttribute<int64_t>(_In_ const OrtKernelInfo* info, _In_ const char* name) {
+  int64_t out;
+  Ort::ThrowOnError(api_.KernelInfoGetAttribute_int64(info, name, &out));
+  return out;
+}
+
+template <>
+inline std::string CustomOpApi::KernelInfoGetAttribute<std::string>(_In_ const OrtKernelInfo* info, _In_ const char* name) {
+  size_t size = 0;
+  std::string out;
+
+  // Feed nullptr for the data buffer to query the true size of the string attribute
+  OrtStatus* status = api_.KernelInfoGetAttribute_string(info, name, nullptr, &size);
+
+  if (status == nullptr) {
+    out.resize(size);
+    Ort::ThrowOnError(api_.KernelInfoGetAttribute_string(info, name, &out[0], &size));
+    out.resize(size - 1);  // remove the terminating character '\0'
+  } else {
+    Ort::ThrowOnError(status);
+  }
+  return out;
+}
+
+template <>
+inline std::vector<float> CustomOpApi::KernelInfoGetAttribute(_In_ const OrtKernelInfo* info, _In_ const char* name) {
+  size_t size = 0;
+  std::vector<float> out;
+
+  // Feed nullptr for the data buffer to query the true size of the attribute
+  OrtStatus* status = api_.KernelInfoGetAttributeArray_float(info, name, nullptr, &size);
+
+  if (status == nullptr) {
+    out.resize(size);
+    Ort::ThrowOnError(api_.KernelInfoGetAttributeArray_float(info, name, out.data(), &size));
+  } else {
+    Ort::ThrowOnError(status);
+  }
+  return out;
+}
+
+template <>
+inline std::vector<int64_t> CustomOpApi::KernelInfoGetAttribute(_In_ const OrtKernelInfo* info, _In_ const char* name) {
+  size_t size = 0;
+  std::vector<int64_t> out;
+
+  // Feed nullptr for the data buffer to query the true size of the attribute
+  OrtStatus* status = api_.KernelInfoGetAttributeArray_int64(info, name, nullptr, &size);
+
+  if (status == nullptr) {
+    out.resize(size);
+    Ort::ThrowOnError(api_.KernelInfoGetAttributeArray_int64(info, name, out.data(), &size));
+  } else {
+    Ort::ThrowOnError(status);
+  }
+  return out;
+}
+inline OrtTensorTypeAndShapeInfo* CustomOpApi::GetTensorTypeAndShape(_In_ const OrtValue* value) {
+  OrtTensorTypeAndShapeInfo* out;
+  Ort::ThrowOnError(api_.GetTensorTypeAndShape(value, &out));
+  return out;
+}
+
+inline size_t CustomOpApi::GetTensorShapeElementCount(_In_ const OrtTensorTypeAndShapeInfo* info) {
+  size_t out;
+  Ort::ThrowOnError(api_.GetTensorShapeElementCount(info, &out));
+  return out;
+}
+
+inline ONNXTensorElementDataType CustomOpApi::GetTensorElementType(const OrtTensorTypeAndShapeInfo* info) {
+  ONNXTensorElementDataType out;
+  Ort::ThrowOnError(api_.GetTensorElementType(info, &out));
+  return out;
+}
+
+inline size_t CustomOpApi::GetDimensionsCount(_In_ const OrtTensorTypeAndShapeInfo* info) {
+  size_t out;
+  Ort::ThrowOnError(api_.GetDimensionsCount(info, &out));
+  return out;
+}
+
+inline void CustomOpApi::GetDimensions(_In_ const OrtTensorTypeAndShapeInfo* info, _Out_ int64_t* dim_values, size_t dim_values_length) {
+  Ort::ThrowOnError(api_.GetDimensions(info, dim_values, dim_values_length));
+}
+
+inline void CustomOpApi::SetDimensions(OrtTensorTypeAndShapeInfo* info, _In_ const int64_t* dim_values, size_t dim_count) {
+  Ort::ThrowOnError(api_.SetDimensions(info, dim_values, dim_count));
+}
+
+template <typename T>
+inline T* CustomOpApi::GetTensorMutableData(_Inout_ OrtValue* value) {
+  T* data;
+  Ort::ThrowOnError(api_.GetTensorMutableData(value, reinterpret_cast<void**>(&data)));
+  return data;
+}
+
+inline const OrtMemoryInfo* CustomOpApi::GetTensorMemoryInfo(_In_ const OrtValue* value) {
+  const OrtMemoryInfo* mem_info;
+  Ort::ThrowOnError(api_.GetTensorMemoryInfo(value, &mem_info));
+  return mem_info;
+}
+
+template <typename T>
+inline const T* CustomOpApi::GetTensorData(_Inout_ const OrtValue* value) {
+  T* data = nullptr;
+  Ort::ThrowOnError(api_.GetTensorMutableData(const_cast<OrtValue*>(value), reinterpret_cast<void**>(&data)));
+  return data;
+}
+
+inline std::vector<int64_t> CustomOpApi::GetTensorShape(const OrtTensorTypeAndShapeInfo* info) {
+  size_t out;
+  Ort::ThrowOnError(api_.GetDimensionsCount(info, &out));
+  std::vector<int64_t> output(out);
+  Ort::ThrowOnError(api_.GetDimensions(info, output.data(), out));
+  return output;
+}
+
+inline void CustomOpApi::ReleaseTensorTypeAndShapeInfo(OrtTensorTypeAndShapeInfo* input) {
+  api_.ReleaseTensorTypeAndShapeInfo(input);
+}
+
+inline size_t CustomOpApi::KernelContext_GetInputCount(const OrtKernelContext* context) {
+  size_t out;
+  Ort::ThrowOnError(api_.KernelContext_GetInputCount(context, &out));
+  return out;
+}
+
+inline const OrtValue* CustomOpApi::KernelContext_GetInput(const OrtKernelContext* context, _In_ size_t index) {
+  const OrtValue* out;
+  Ort::ThrowOnError(api_.KernelContext_GetInput(context, index, &out));
+  return out;
+}
+
+inline size_t CustomOpApi::KernelContext_GetOutputCount(const OrtKernelContext* context) {
+  size_t out;
+  Ort::ThrowOnError(api_.KernelContext_GetOutputCount(context, &out));
+  return out;
+}
+
+inline OrtValue* CustomOpApi::KernelContext_GetOutput(OrtKernelContext* context, _In_ size_t index,
+                                                      _In_ const int64_t* dim_values, size_t dim_count) {
+  OrtValue* out;
+  Ort::ThrowOnError(api_.KernelContext_GetOutput(context, index, dim_values, dim_count, &out));
+  return out;
+}
+
+inline void* CustomOpApi::KernelContext_GetGPUComputeStream(const OrtKernelContext* context) {
+  void* out;
+  Ort::ThrowOnError(api_.KernelContext_GetGPUComputeStream(context, &out));
+  return out;
+}
+
+inline OrtOpAttr* CustomOpApi::CreateOpAttr(_In_ const char* name,
+                                            _In_ const void* data,
+                                            _In_ int len,
+                                            _In_ OrtOpAttrType type) {
+  OrtOpAttr* op_attr{};
+  Ort::ThrowOnError(api_.CreateOpAttr(name, data, len, type, &op_attr));
+  return op_attr;
+}
+
+inline void CustomOpApi::ReleaseOpAttr(_Frees_ptr_opt_ OrtOpAttr* op_attr) {
+  api_.ReleaseOpAttr(op_attr);
+}
+
+inline OrtOp* CustomOpApi::CreateOp(_In_ const OrtKernelInfo* info,
+                                    _In_z_ const char* op_name,
+                                    _In_z_ const char* domain,
+                                    int version,
+                                    _In_reads_(type_constraint_count) const char** type_constraint_names,
+                                    _In_reads_(type_constraint_count) const ONNXTensorElementDataType* type_constraint_values,
+                                    int type_constraint_count,
+                                    _In_reads_(attr_count) const OrtOpAttr* const* attr_values,
+                                    int attr_count,
+                                    int input_count,
+                                    int output_count) {
+  OrtOp* ort_op{};
+  Ort::ThrowOnError(api_.CreateOp(info, op_name, domain, version, type_constraint_names, type_constraint_values,
+                                  type_constraint_count, attr_values, attr_count, input_count, output_count, &ort_op));
+  return ort_op;
+}
+
+inline void CustomOpApi::InvokeOp(_In_ const OrtKernelContext* context,
+                                  _In_ const OrtOp* ort_op,
+                                  _In_ const OrtValue* const* input_values,
+                                  _In_ int input_count,
+                                  _Inout_ OrtValue* const* output_values,
+                                  _In_ int output_count) {
+  Ort::ThrowOnError(api_.InvokeOp(context, ort_op, input_values, input_count, output_values, output_count));
+}
+
+inline void CustomOpApi::ReleaseOp(_Frees_ptr_opt_ OrtOp* ort_op) {
+  api_.ReleaseOp(ort_op);
+}
+
+inline OrtKernelInfo* CustomOpApi::CopyKernelInfo(_In_ const OrtKernelInfo* info) {
+  OrtKernelInfo* info_copy{};
+  Ort::ThrowOnError(api_.CopyKernelInfo(info, &info_copy));
+  return info_copy;
+}
+
+inline void CustomOpApi::ReleaseKernelInfo(_Frees_ptr_opt_ OrtKernelInfo* info_copy) {
+  api_.ReleaseKernelInfo(info_copy);
+}
+
 inline std::string GetVersionString() {
   return OrtGetApiBase()->GetVersionString();
 }

   return available_providers;
 }

-template <typename TOp, typename TKernel, bool WithStatus>
-void CustomOpBase<TOp, TKernel, WithStatus>::GetSessionConfigs(std::unordered_map<std::string, std::string>& out,
-                                                               ConstSessionOptions options) const {
+template <typename TOp, typename TKernel>
+void CustomOpBase<TOp, TKernel>::GetSessionConfigs(std::unordered_map<std::string, std::string>& out,
+                                                   ConstSessionOptions options) const {
   const TOp* derived = static_cast<const TOp*>(this);
   std::vector<std::string> keys = derived->GetSessionConfigKeys();

   }
 }
-inline ShapeInferContext::ShapeInferContext(const OrtApi* ort_api,
-                                            OrtShapeInferContext* ctx) : ort_api_(ort_api), ctx_(ctx) {
-  size_t input_count = 0;
-  Ort::ThrowOnError(ort_api_->ShapeInferContext_GetInputCount(ctx_, &input_count));
-  for (size_t ith_input = 0; ith_input < input_count; ++ith_input) {
-    OrtTensorTypeAndShapeInfo* info{};
-    Ort::ThrowOnError(ort_api_->ShapeInferContext_GetInputTypeShape(ctx, ith_input, &info));
-    TensorTypeAndShapeInfo type_shape_info(info);
-    auto integer_shape = type_shape_info.GetShape();
-    std::vector<const char*> symbolic_shape(integer_shape.size(), {});
-    type_shape_info.GetSymbolicDimensions(&symbolic_shape[0], integer_shape.size());
-    Shape shape;
-    for (size_t ith = 0; ith < integer_shape.size(); ++ith) {
-      if (symbolic_shape[ith] && std::string{symbolic_shape[ith]}.size() > 0) {
-        shape.emplace_back(symbolic_shape[ith]);
-      } else {
-        shape.emplace_back(integer_shape[ith]);
-      }
-    }
-    input_shapes_.push_back(std::move(shape));
-    type_shape_info.release();
-  }
-}
-
-inline Status ShapeInferContext::SetOutputShape(size_t indice, const Shape& shape) {
-  OrtTensorTypeAndShapeInfo* info = {};
-  RETURN_ON_API_FAIL(ort_api_->CreateTensorTypeAndShapeInfo(&info));
-
-  using InfoPtr = std::unique_ptr<OrtTensorTypeAndShapeInfo, std::function<void(OrtTensorTypeAndShapeInfo*)>>;
-
-  InfoPtr info_ptr(info, [this](OrtTensorTypeAndShapeInfo* obj) {
-    ort_api_->ReleaseTensorTypeAndShapeInfo(obj);
-  });
-
-  std::vector<int64_t> integer_dims;
-  std::vector<const char*> symbolic_dims;
-
-  for (const auto dim : shape) {
-    if (dim.IsInt()) {
-      integer_dims.push_back(dim.AsInt());
-      symbolic_dims.push_back("");
-    } else {
-      if (!dim.AsSym() || std::string{dim.AsSym()}.empty()) {
-        ORT_CXX_API_THROW("Symbolic dim must not be an empty string", ORT_INVALID_ARGUMENT);
-      }
-      integer_dims.push_back(SymbolicInteger::INVALID_INT_DIM);
-      symbolic_dims.push_back(dim.AsSym());
-    }
-  }
-
-  RETURN_ON_API_FAIL(ort_api_->SetDimensions(info, integer_dims.data(), integer_dims.size()));
-  RETURN_ON_API_FAIL(ort_api_->SetSymbolicDimensions(info, symbolic_dims.data(), symbolic_dims.size()));
-  RETURN_ON_API_FAIL(ort_api_->ShapeInferContext_SetOutputTypeShape(ctx_, indice, info));
-  return Status{nullptr};
-}
-
-inline int64_t ShapeInferContext::GetAttrInt(const char* attr_name) {
-  const auto* attr = GetAttrHdl(attr_name);
-  int64_t i = {};
-  size_t out = {};
-  Ort::ThrowOnError(ort_api_->ReadOpAttr(attr, ORT_OP_ATTR_INT, &i, sizeof(i), &out));
-  return i;
-}
-
-inline ShapeInferContext::Ints ShapeInferContext::GetAttrInts(const char* attr_name) {
-  const auto* attr = GetAttrHdl(attr_name);
-  int64_t i = {};
-  size_t out = {};
-  // first call to get the bytes needed
-  auto status = ort_api_->ReadOpAttr(attr, ORT_OP_ATTR_INTS, &i, sizeof(i), &out);
-  if (status) {
-    size_t num_i = out / sizeof(int64_t);
-    ShapeInferContext::Ints ints(num_i, 0);
-    Ort::ThrowOnError(ort_api_->ReadOpAttr(attr, ORT_OP_ATTR_INTS, ints.data(), out, &out));
-    return ints;
-  } else {
-    return {i};
496
- }
497
-}
498
-
499
-inline float ShapeInferContext::GetAttrFloat(const char* attr_name) {
500
- const auto* attr = GetAttrHdl(attr_name);
501
- float f = {};
502
- size_t out = {};
503
- Ort::ThrowOnError(ort_api_->ReadOpAttr(attr, ORT_OP_ATTR_FLOAT, &f, sizeof(f), &out));
504
- return f;
505
-}
506
-
507
-inline ShapeInferContext::Floats ShapeInferContext::GetAttrFloats(const char* attr_name) {
508
- const auto* attr = GetAttrHdl(attr_name);
509
- float f = {};
510
- size_t out = {};
511
- // first call to get the bytes needed
512
- auto status = ort_api_->ReadOpAttr(attr, ORT_OP_ATTR_FLOATS, &f, sizeof(f), &out);
513
- if (status) {
514
- size_t num_f = out / sizeof(float);
515
- ShapeInferContext::Floats floats(num_f, 0);
516
- Ort::ThrowOnError(ort_api_->ReadOpAttr(attr, ORT_OP_ATTR_FLOATS, floats.data(), out, &out));
517
- return floats;
518
- } else {
519
- return {f};
520
- }
521
-}
522
-
523
-inline std::string ShapeInferContext::GetAttrString(const char* attr_name) {
524
- const auto* attr = GetAttrHdl(attr_name);
525
- char c = {};
526
- size_t out = {};
527
- // first call to get the bytes needed
528
- auto status = ort_api_->ReadOpAttr(attr, ORT_OP_ATTR_STRING, &c, sizeof(char), &out);
529
- if (status) {
530
- std::vector<char> chars(out, '\0');
531
- Ort::ThrowOnError(ort_api_->ReadOpAttr(attr, ORT_OP_ATTR_STRING, chars.data(), out, &out));
532
- return {chars.data()};
533
- } else {
534
- return {c};
535
- }
536
-}
537
-
538
-inline ShapeInferContext::Strings ShapeInferContext::GetAttrStrings(const char* attr_name) {
539
- const auto* attr = GetAttrHdl(attr_name);
540
- char c = {};
541
- size_t out = {};
542
- // first call to get the bytes needed
543
- auto status = ort_api_->ReadOpAttr(attr, ORT_OP_ATTR_STRINGS, &c, sizeof(char), &out);
544
- if (status) {
545
- std::vector<char> chars(out, '\0');
546
- Ort::ThrowOnError(ort_api_->ReadOpAttr(attr, ORT_OP_ATTR_STRINGS, chars.data(), out, &out));
547
- ShapeInferContext::Strings strings;
548
- char* char_st = chars.data();
549
- char* char_ed = char_st + out;
550
- while (char_st < char_ed) {
551
- strings.emplace_back(char_st);
552
- while (*char_st != '\0') {
553
- char_st++;
554
- }
555
- char_st++;
556
- }
557
- return strings;
558
- } else {
559
- return {std::string{c}};
560
- }
561
-}
562
-
563
-inline const OrtOpAttr* ShapeInferContext::GetAttrHdl(const char* attr_name) const {
564
- const OrtOpAttr* attr_hdl = {};
565
- Ort::ThrowOnError(ort_api_->ShapeInferContext_GetAttribute(ctx_, attr_name, &attr_hdl));
566
- return attr_hdl;
567
-}
568
-
569
} // namespace Ort
570
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/onnxruntime_session_options_config_keys.h -> onnxruntime-linux-x64-gpu-1.15.1.tgz/include/onnxruntime_session_options_config_keys.h
Changed
// GeluApproximation has side effects which may change the inference results. It is disabled by default due to this.
static const char* const kOrtSessionOptionsEnableGeluApproximation = "optimization.enable_gelu_approximation";

-// This setting controls whether to enable AheadOfTime function inlining.
-// AOT function inlining examines the graph and attempts to inline as many locally defined functions in the model
-// as possible with the help of enabled execution providers.
-// This can reduce the number of function calls and improve performance because it is done before
-// Level1 optimizers and constant folding. However, under some circumstances, when the EPs are not available,
-// one can disable the AOT inlining, produce an optimized model and postpone AOT until run time.
-// "0": enable; "1": disable.
-// Its default value is "0".
-static const char* const kOrtSessionOptionsDisableAheadOfTimeFunctionInlining = "session.disable_aot_function_inlining";
-
#ifdef ENABLE_TRAINING
// Specifies a list of op types for memory footprint reduction.
// The value should be a ","-delimited list of pair of
-// <subgraph string: optimization strategy: number of subgraph to apply>.
+// <subgraph string : optimization strategy : number of subgraph to apply>.
// For example, "Gelu+Cast+:1:0,Dropout+:1:1".
// A valid "subgraph string" should be one subgraph representation output by ORT graph transformations.
// "optimization strategy" currently has valid values: 0 - disabled, 1 - recompute.
// "number of subgraph to apply" is used to control how many subgraphs to apply optimization, to avoid "oversaving"
// the memory.
-static const char* const kOrtSessionOptionsMemoryOptimizerEnabler = "optimization.memory_optimizer_config";
+static const char* const kOrtSessionOptionsMemoryOptimizerEnabler = "optimization.enable_memory_optimizer";

-// Specifies the config for detecting subgraphs for memory footprint reduction.
-// The value should be a string contains int separated using commas. The default value is "0:0".
-static const char* const kOrtSessionOptionsMemoryOptimizerProbeConfig = "optimization.enable_memory_probe_recompute_config";
+// Specifies the level for detecting subgraphs for memory footprint reduction.
+// The value should be an integer. The default value is 0.
+static const char* const kOrtSessionOptionsMemoryOptimizerProbeLevel = "optimization.enable_memory_probe_recompute_level";
#endif

// Enable or disable using device allocator for allocating initialized tensor memory. "1": enable; "0": disable. The default is "0".

// May be useful to expose bugs in models.
static const char* const kOrtSessionOptionsConfigStrictShapeTypeInference = "session.strict_shape_type_inference";

-// "1": every model using a more recent opset than the latest released one will fail
-// "0": the model may or may not work if onnxruntime cannot find an implementation, this option
-// is used for development purpose.
-static const char* const kOrtSessionOptionsConfigStrictAllowReleasedOpsetsOnly = "session.allow_released_opsets_only";
-
// The file saves configuration for partitioning node among logic streams
static const char* const kNodePartitionConfigFile = "session.node_partition_config_file";

// 3) after the L1 transformers are applied to the updated graph.
// The model will be saved to filename post_layout_transform_step_<step_number>.onnx.
static const char* const kDebugLayoutTransformation = "session.debug_layout_transformation";
-
-// Graph nodes that are not supported by the execution providers (EPs) explicitly added to the session are
-// assigned (i.e., "fallback") to the CPU EP by default.
-//
-// This option allows the user to disable the fallback of unsupported graph nodes to the CPU EP.
-// If this option is set to "1", session creation will fail if the execution providers other than the CPU EP cannot
-// fully support all of the nodes in the graph.
-//
-// It is invalid to set this option and explicitly add the CPU EP to the session. In this case, session creation
-// will also fail with an error.
-//
-// Option values:
-// - "0": CPU EP fallback is not disabled. DEFAULT
-// - "1": CPU EP fallback is disabled.
-static const char* const kOrtSessionOptionsDisableCPUEPFallback = "session.disable_cpu_ep_fallback";
-
-// Use this config when serializing a large model after optimization to specify an external initializers file
-static const char* const kOrtSessionOptionsOptimizedModelExternalInitializersFileName =
-    "session.optimized_model_external_initializers_file_name";
-
-// Use this config to control the minimum size of the initializer when externalizing it during serialization
-static const char* const kOrtSessionOptionsOptimizedModelExternalInitializersMinSizeInBytes =
-    "session.optimized_model_external_initializers_min_size_in_bytes";
-
-// Enable EP context feature to dump the partitioned graph which includes the EP context into Onnx file.
-// The dumped Onnx model with EP context can be used for future inference to avoid the EP graph partitioning/compile overhead.
-// "0": disable. (default)
-// "1": enable.
-static const char* const kOrtSessionOptionEpContextEnable = "ep.context_enable";
-
-// Specify the file path for the Onnx model which has EP context.
-// Default to original_file_name_ctx.onnx if not specified
-static const char* const kOrtSessionOptionEpContextFilePath = "ep.context_file_path";
-
-// Flag to specify whether to dump the EP context into the Onnx model.
-// "0": dump the EP context into separate file, keep the file name in the Onnx model.
-// "1": dump the EP context into the Onnx model. (default).
-static const char* const kOrtSessionOptionEpContextEmbedMode = "ep.context_embed_mode";
-
-// Gemm fastmath mode provides fp32 gemm acceleration with bfloat16 based matmul.
-// Option values:
-// - "0": Gemm FastMath mode is not enabled. DEFAULT
-// - "1": Gemm FastMath mode is enabled.
-static const char* const kOrtSessionOptionsMlasGemmFastMathArm64Bfloat16 = "mlas.enable_gemm_fastmath_arm64_bfloat16";
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/onnxruntime_training_c_api.h -> onnxruntime-linux-x64-gpu-1.15.1.tgz/include/onnxruntime_training_c_api.h
Changed
 *
 * In order to train a model with onnxruntime, the following training artifacts must be generated:
 * - The training onnx model
- * - The checkpoint file
+ * - The checkpoint directory
 * - The optimizer onnx model
 * - The eval onnx model model (optional)
 *

/// \name Accessing The Training Session State
/// @{

- /** \brief Load a checkpoint state from a file on disk into checkpoint_state.
+ /** \brief Load a checkpoint state from directory on disk into checkpoint_state.
  *
-  * This function will parse a checkpoint file, pull relevant data and load the training
+  * This function will parse a checkpoint directory, pull relevant files and load the training
  * state into the checkpoint_state. This checkpoint state can then be used to create the
  * training session by invoking OrtTrainingApi::CreateTrainingSession. By doing so, the training
  * session will resume training from the given checkpoint state.
  * training state (including model parameters, its gradients, the optimizer states and the properties).
  * As a result, it is required that the checkpoint state outlive the lifetime of the training session.
  *
-  * \param[in] checkpoint_path Path to the checkpoint file
+  * \param[in] checkpoint_path Path to the checkpoint directory
  * \param[out] checkpoint_state Checkpoint state that contains the states of the training session.
  *
  * \snippet{doc} snippets.dox OrtStatus Return Value
 ORT_API2_STATUS(LoadCheckpoint, _In_ const ORTCHAR_T* checkpoint_path,
                 _Outptr_ OrtCheckpointState** checkpoint_state);

- /** \brief Save the given state to a checkpoint file on disk.
+ /** \brief Save the given state to a checkpoint directory on disk.
  *
-  * This function serializes the provided checkpoint state to a file on disk.
+  * This function serializes the provided checkpoint state to a directory on disk.
  * This checkpoint can later be loaded by invoking OrtTrainingApi::LoadCheckpoint to resume
  * training from this snapshot of the state.
  *
  * \param[in] checkpoint_state The checkpoint state to save.
-  * \param[in] checkpoint_path Path to the checkpoint file.
+  * \param[in] checkpoint_path Path to the checkpoint directory.
  * \param[in] include_optimizer_state Flag to indicate whether to save the optimizer state or not.
  *
  * \snippet{doc} snippets.dox OrtStatus Return Value
  * - The training onnx model
  * - The evaluation onnx model (optional)
  * - The optimizer onnx model
-  * - The checkpoint file
+  * - The checkpoint directory
  *
  * These artifacts can be generated using the `onnxruntime-training` python [utility](https://github.com/microsoft/onnxruntime/blob/main/orttraining/orttraining/python/training/onnxblock/README.md).
  *
 ORT_API2_STATUS(CreateTrainingSession, _In_ const OrtEnv* env, _In_ const OrtSessionOptions* options,
                 _Inout_ OrtCheckpointState* checkpoint_state, _In_ const ORTCHAR_T* train_model_path,
                 _In_ const ORTCHAR_T* eval_model_path, _In_ const ORTCHAR_T* optimizer_model_path,
-                _Outptr_result_maybenull_ OrtTrainingSession** out);
-
- /** \brief Create a training session that can be used to begin or resume training.
-  * This api provides a way to load all the training artifacts from buffers instead of files.
-  *
-  * \param[in] env Environment to be used for the training session.
-  * \param[in] options Session options that the user can customize for this training session.
-  * \param[in] checkpoint_state Training states that the training session uses as a starting point for training.
-  * \param[in] train_model_data Buffer containing the model data to be used to perform training
-  * \param[in] train_data_length Length of the buffer containing train_model_data
-  * \param[in] eval_model_data Buffer containing the model data to be used to perform evaluation
-  * \param[in] eval_data_length Length of the buffer containing eval_model_data
-  * \param[in] optim_model_data Buffer containing the model data to be used to perform weight update
-  * \param[in] optim_data_length Length of the buffer containing optim_model_data
-  * \param[out] out Created training session.
-  *
-  */
- ORT_API2_STATUS(CreateTrainingSessionFromBuffer, _In_ const OrtEnv* env,
-                 _In_ const OrtSessionOptions* options, _Inout_ OrtCheckpointState* checkpoint_state,
-                 _In_ const void* train_model_data, size_t train_data_length,
-                 _In_ const void* eval_model_data, size_t eval_data_length,
-                 _In_ const void* optim_model_data, size_t optim_data_length,
-                 _Outptr_result_maybenull_ OrtTrainingSession** out);
+                _Outptr_ OrtTrainingSession** out);

/// @}

/// \name Accessing The Training Session State
/// @{

- /** \brief Adds or updates the given property to/in the checkpoint state.
+ /** \brief Adds the given property to the checkpoint state.
  *
  * Runtime properties such as epoch, training step, best score, and others can be added to the checkpoint
-  * state by the user by calling this function with the corresponding property name and value.
-  * The given property name must be unique to be able to successfully add the property.
+  * state by the user if they desire by calling this function with the appropriate property name and
+  * value. The given property name must be unique to be able to successfully add the property.
  *
  * \param[in] checkpoint_state The checkpoint state which should hold the property.
-  * \param[in] property_name Name of the property being added or updated.
+  * \param[in] property_name Unique name of the property being added.
  * \param[in] property_type Type of the property associated with the given name.
  * \param[in] property_value Property value associated with the given name.
  *
  * exist in the checkpoint state to be able to retrieve it successfully.
  *
  * \param[in] checkpoint_state The checkpoint state that is currently holding the property.
-  * \param[in] property_name Name of the property being retrieved.
+  * \param[in] property_name Unique name of the property being retrieved.
  * \param[in] allocator Allocator used to allocate the memory for the property_value.
  * \param[out] property_type Type of the property associated with the given name.
  * \param[out] property_value Property value associated with the given name.
                _Out_ enum OrtPropertyType* property_type, _Outptr_ void** property_value);

/// @}
-
- /// \name Accessing The Training Session State
- /// @{
-
- /** \brief Load a checkpoint state from a buffer into checkpoint_state.
-  *
-  * This function will parse a checkpoint bytes buffer, pull relevant data and load the training
-  * state into the checkpoint_state. This checkpoint state can then be used to create the
-  * training session by invoking OrtTrainingApi::CreateTrainingSession. By doing so, the training
-  * session will resume training from the given checkpoint state.
-  * \note Note that the training session created with a checkpoint state uses this state to store the entire
-  * training state (including model parameters, its gradients, the optimizer states and the properties).
-  * As a result, it is required that the checkpoint state outlive the lifetime of the training session.
-  *
-  * \param[in] checkpoint_buffer Path to the checkpoint bytes buffer.
-  * \param[in] num_bytes Number of bytes in the checkpoint buffer.
-  * \param[out] checkpoint_state Checkpoint state that contains the states of the training session.
-  *
-  * \snippet{doc} snippets.dox OrtStatus Return Value
-  *
-  */
- ORT_API2_STATUS(LoadCheckpointFromBuffer, _In_ const void* checkpoint_buffer,
-                 _In_ const size_t num_bytes, _Outptr_ OrtCheckpointState** checkpoint_state);
-
- /** \brief Retrieves the type and shape information of the parameter associated with the given parameter name.
-  *
-  * This function retrieves the type and shape of the parameter associated with the given parameter name.
-  * The parameter must exist in the checkpoint state to be able to retrieve its type and shape information successfully.
-  *
-  * \param[in] checkpoint_state The checkpoint state.
-  * \param[in] parameter_name Name of the parameter being retrieved.
-  * \param[out] parameter_type_and_shape The type and shape of the parameter being retrieved.
-  *
-  * \snippet{doc} snippets.dox OrtStatus Return Value
-  *
-  */
- ORT_API2_STATUS(GetParameterTypeAndShape, _In_ const OrtCheckpointState* checkpoint_state,
-                 _In_ const char* parameter_name, _Outptr_ OrtTensorTypeAndShapeInfo** parameter_type_and_shape);
-
- /** \brief Updates the data associated with the model parameter in the checkpoint state for the given parameter name.
-  *
-  * This function updates a model parameter in the checkpoint state with the given parameter data.
-  * The training session must be already created with the checkpoint state that contains the parameter
-  * being updated. The given parameter is copied over to the registered device for the training session.
-  * The parameter must exist in the checkpoint state to be able to update it successfully.
-  *
-  * \param[in] checkpoint_state The checkpoint state.
-  * \param[in] parameter_name Name of the parameter being updated.
-  * \param[in] parameter The parameter data that should replace the existing parameter data.
-  *
-  * \snippet{doc} snippets.dox OrtStatus Return Value
-  *
-  */
- ORT_API2_STATUS(UpdateParameter, _Inout_ OrtCheckpointState* checkpoint_state,
-                 _In_ const char* parameter_name, _In_ OrtValue* parameter);
-
- /** \brief Gets the data associated with the model parameter from the checkpoint state for the given parameter name.
-  *
-  * This function retrieves the model parameter data from the checkpoint state for the given parameter name.
-  * The parameter is copied over and returned as an OrtValue. The training session must be already created
-  * with the checkpoint state that contains the parameter being retrieved.
-  * The parameter must exist in the checkpoint state to be able to retrieve it successfully.
-  *
-  * \param[in] checkpoint_state The checkpoint state.
-  * \param[in] parameter_name Name of the parameter being retrieved.
-  * \param[in] allocator Allocator used to allocate the memory for the parameter.
-  * \param[out] parameter The parameter data that is retrieved from the checkpoint state.
-  *
-  * \snippet{doc} snippets.dox OrtStatus Return Value
-  *
-  */
- ORT_API2_STATUS(GetParameter, _In_ const OrtCheckpointState* checkpoint_state,
-                 _In_ const char* parameter_name, _Inout_ OrtAllocator* allocator,
-                 _Outptr_ OrtValue** parameter);
-
- /// @}
};

typedef struct OrtTrainingApi OrtTrainingApi;
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/onnxruntime_training_cxx_api.h -> onnxruntime-linux-x64-gpu-1.15.1.tgz/include/onnxruntime_training_cxx_api.h
Changed
/// \name Accessing The Training Session State
/// @{

- /** \brief Load a checkpoint state from a file on disk into checkpoint_state.
+ /** \brief Load a checkpoint state from directory on disk into checkpoint_state.
  *
-  * This function will parse a checkpoint file, pull relevant data and load the training
+  * This function will parse a checkpoint directory, pull relevant files and load the training
  * state and return an instance of Ort::CheckpointState. This checkpoint state can then be used to create the
  * training session by instantiating Ort::TrainingSession. By doing so, the training session will resume
  * training from the given checkpoint state.
  *
-  * \param[in] path_to_checkpoint Path to the checkpoint file
+  * \param[in] path_to_checkpoint Path to the checkpoint directory
  * \return Ort::CheckpointState object which holds the state of the training session parameters.
  *
  */
 static CheckpointState LoadCheckpoint(const std::basic_string<ORTCHAR_T>& path_to_checkpoint);

- /** \brief Load a checkpoint state from a buffer.
+ /** \brief Save the given state to a checkpoint directory on disk.
  *
-  * This function will parse a checkpoint buffer, pull relevant data and load the training
-  * state and return an instance of Ort::CheckpointState. This checkpoint state can then be used to create the
-  * training session by instantiating Ort::TrainingSession. By doing so, the training session will resume
-  * training from the given checkpoint state.
-  *
-  * \param[in] buffer Buffer containing the checkpoint data.
-  * \return Ort::CheckpointState object which holds the state of the training session parameters.
-  *
-  */
- static CheckpointState LoadCheckpointFromBuffer(const std::vector<uint8_t>& buffer);
-
- /** \brief Save the given state to a checkpoint file on disk.
-  *
-  * This function serializes the provided checkpoint state to a file on disk.
+  * This function serializes the provided checkpoint state to a directory on disk.
  * This checkpoint can later be loaded by invoking Ort::CheckpointState::LoadCheckpoint to resume
  * training from this snapshot of the state.
  *
  * \param[in] checkpoint_state The checkpoint state to save.
-  * \param[in] path_to_checkpoint Path to the checkpoint file.
+  * \param[in] path_to_checkpoint Path to the checkpoint directory.
  * \param[in] include_optimizer_state Flag to indicate whether to save the optimizer state or not.
  *
  */
                      const std::basic_string<ORTCHAR_T>& path_to_checkpoint,
                      const bool include_optimizer_state = false);

- /** \brief Adds or updates the given property to/in the checkpoint state.
+ /** \brief Adds the given property to the checkpoint state.
  *
  * Runtime properties such as epoch, training step, best score, and others can be added to the checkpoint
-  * state by the user by calling this function with the corresponding property name and value.
-  * The given property name must be unique to be able to successfully add the property.
+  * state by the user if they desire by calling this function with the appropriate property name and
+  * value. The given property name must be unique to be able to successfully add the property.
  *
-  * \param[in] property_name Name of the property being added or updated.
+  * \param[in] property_name Unique name of the property being added.
  * \param[in] property_value Property value associated with the given name.
  *
  */
  * Gets the property value from an existing entry in the checkpoint state. The property must
  * exist in the checkpoint state to be able to retrieve it successfully.
  *
-  * \param[in] property_name Name of the property being retrieved.
+  * \param[in] property_name Unique name of the property being retrieved.
  * \return Property value associated with the given property name.
  *
  */
 Property GetProperty(const std::string& property_name);

- /** \brief Updates the data associated with the model parameter in the checkpoint state for the given parameter name.
-  *
-  * This function updates a model parameter in the checkpoint state with the given parameter data.
-  * The training session must be already created with the checkpoint state that contains the parameter
-  * being updated. The given parameter is copied over to the registered device for the training session.
-  * The parameter must exist in the checkpoint state to be able to update it successfully.
-  *
-  * \param[in] parameter_name Name of the parameter being updated.
-  * \param[in] parameter The parameter data that should replace the existing parameter data.
-  *
-  */
- void UpdateParameter(const std::string& parameter_name, const Value& parameter);
-
- /** \brief Gets the data associated with the model parameter from the checkpoint state for the given parameter name.
-  *
-  * This function retrieves the model parameter data from the checkpoint state for the given parameter name.
-  * The parameter is copied over to the provided OrtValue. The training session must be already created
-  * with the checkpoint state that contains the parameter being retrieved.
-  * The parameter must exist in the checkpoint state to be able to retrieve it successfully.
-  *
-  * \param[in] parameter_name Name of the parameter being retrieved.
-  * \return The parameter data that is retrieved from the checkpoint state.
-  *
-  */
- Value GetParameter(const std::string& parameter_name);
-
/// @}
};

 * - The training onnx model
 * - The evaluation onnx model (optional)
 * - The optimizer onnx model
- * - The checkpoint file
+ * - The checkpoint directory
 *
 * These artifacts can be generated using the `onnxruntime-training` python [utility](https://github.com/microsoft/onnxruntime/blob/main/orttraining/orttraining/python/training/onnxblock/README.md).
 *
                const std::optional<std::basic_string<ORTCHAR_T>>& eval_model_path = std::nullopt,
                const std::optional<std::basic_string<ORTCHAR_T>>& optimizer_model_path = std::nullopt);

- /** \brief Create a training session that can be used to begin or resume training.
-  * This constructor allows the users to load the models from buffers instead of files.
-  *
-  * \param[in] env Env to be used for the training session.
-  * \param[in] session_options SessionOptions that the user can customize for this training session.
-  * \param[in] checkpoint_state Training states that the training session uses as a starting point for training.
-  * \param[in] train_model_data Buffer containing training model data.
-  * \param[in] eval_model_data Buffer containing evaluation model data.
-  * \param[in] optim_model_data Buffer containing optimizer model (used for performing weight/parameter update).
-  *
-  */
- TrainingSession(const Env& env, const SessionOptions& session_options, CheckpointState& checkpoint_state,
-                 const std::vector<uint8_t>& train_model_data, const std::vector<uint8_t>& eval_model_data = {},
-                 const std::vector<uint8_t>& optim_model_data = {});
/// @}

/// \name Implementing The Training Loop
 * \param[in] input_values The user inputs to the training model.
 * \return A std::vector of Ort::Value objects that represents the output of the forward pass of the training model.
 *
+ * \snippet{doc} snippets.dox OrtStatus Return Value
 *
 */
std::vector<Value> TrainStep(const std::vector<Value>& input_values);
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/onnxruntime_training_cxx_inline.h -> onnxruntime-linux-x64-gpu-1.15.1.tgz/include/onnxruntime_training_cxx_inline.h
Changed
  ThrowOnError(GetTrainingApi().TrainingSessionGetEvalModelOutputCount(p_, &eval_model_output_count_));
}

-inline TrainingSession::TrainingSession(const Env& env, const SessionOptions& session_options,
-                                        CheckpointState& checkpoint_state,
-                                        const std::vector<uint8_t>& train_model_data,
-                                        const std::vector<uint8_t>& eval_model_data,
-                                        const std::vector<uint8_t>& optim_model_data) {
-  ThrowOnError(GetTrainingApi().CreateTrainingSessionFromBuffer(
-      env, session_options, checkpoint_state,
-      train_model_data.data(), train_model_data.size(),
-      eval_model_data.data(), eval_model_data.size(),
-      optim_model_data.data(), optim_model_data.size(),
-      &p_));
-
-  ThrowOnError(GetTrainingApi().TrainingSessionGetTrainingModelOutputCount(p_, &training_model_output_count_));
-
-  ThrowOnError(GetTrainingApi().TrainingSessionGetEvalModelOutputCount(p_, &eval_model_output_count_));
-}
-
inline std::vector<Value> TrainingSession::TrainStep(const std::vector<Value>& input_values) {
  std::vector<Value> output_values;
  output_values.reserve(training_model_output_count_);

  RunOptions run_options;
  ThrowOnError(GetTrainingApi().EvalStep(
      p_, run_options, input_values.size(), ort_input_values,
-     eval_model_output_count_, ort_output_values));
+     training_model_output_count_, ort_output_values));

  return output_values;
}

  return CheckpointState(checkpoint_state);
}

-inline CheckpointState CheckpointState::LoadCheckpointFromBuffer(const std::vector<uint8_t>& buffer) {
-  OrtCheckpointState* checkpoint_state;
-  ThrowOnError(GetTrainingApi().LoadCheckpointFromBuffer(buffer.data(), buffer.size(), &checkpoint_state));
-  return CheckpointState(checkpoint_state);
-}
-
inline void CheckpointState::SaveCheckpoint(const CheckpointState& checkpoint_states,
                                            const std::basic_string<ORTCHAR_T>& path_to_checkpoint,
                                            const bool include_optimizer_state) {

    ThrowOnError(GetTrainingApi().AddProperty(p_, property_name.c_str(), OrtPropertyType::OrtFloatProperty, value_p));
  } else if (std::holds_alternative<std::string>(property_value)) {
    std::string value = std::get<std::string>(property_value);
-   auto buffer = std::make_unique<char[]>(value.length() + 1);
-   memcpy(buffer.get(), value.c_str(), value.length());
-   // AddProperty takes a char* and calls PropertyBag::AddProperty which takes a std::string. The data will be
-   // copied at that point so buffer can free the local allocation once the call is made.
-   ThrowOnError(GetTrainingApi().AddProperty(p_, property_name.c_str(), OrtPropertyType::OrtStringProperty,
-                buffer.get()));
+   auto buffer = std::make_unique<char[]>(value.length() + 1).release();
+   memcpy(buffer, value.c_str(), value.length());
+   ThrowOnError(GetTrainingApi().AddProperty(p_, property_name.c_str(), OrtPropertyType::OrtStringProperty, buffer));
  } else {
    ThrowStatus(Status("Unknown property type received.", OrtErrorCode::ORT_INVALID_ARGUMENT));
  }

  return property;
}

-inline void CheckpointState::UpdateParameter(const std::string& parameter_name, const Value& parameter) {
-  ThrowOnError(GetTrainingApi().UpdateParameter(p_, parameter_name.c_str(), parameter));
-}
-
-inline Value CheckpointState::GetParameter(const std::string& parameter_name) {
-  AllocatorWithDefaultOptions allocator;
-  OrtValue* parameter;
-  ThrowOnError(GetTrainingApi().GetParameter(p_, parameter_name.c_str(), allocator, &parameter));
-
-  return Value{parameter};
-}
-
} // namespace Ort
onnxruntime-linux-x64-gpu-1.15.1.tgz/include/tensorrt_provider_factory.h
Added

+// Copyright (c) Microsoft Corporation. All rights reserved.
+// Licensed under the MIT License.
+
+#include "onnxruntime_c_api.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+ORT_API_STATUS(OrtSessionOptionsAppendExecutionProvider_Tensorrt, _In_ OrtSessionOptions* options, int device_id);
+
+#ifdef __cplusplus
+}
+#endif
onnxruntime-linux-x64-gpu-1.17.1.tgz/lib/libonnxruntime.so -> onnxruntime-linux-x64-gpu-1.15.1.tgz/lib/libonnxruntime.so
Changed
-(symlink to libonnxruntime.so.1.17.1)
+(symlink to libonnxruntime.so.1.15.1)
onnxruntime-linux-x64-gpu-1.15.1.tgz/lib/libonnxruntime.so.1.15.1
Added
onnxruntime-linux-x64-gpu-1.17.1.tgz/lib/libonnxruntime_providers_cuda.so -> onnxruntime-linux-x64-gpu-1.15.1.tgz/lib/libonnxruntime_providers_cuda.so
Changed
onnxruntime-linux-x64-gpu-1.17.1.tgz/lib/libonnxruntime_providers_shared.so -> onnxruntime-linux-x64-gpu-1.15.1.tgz/lib/libonnxruntime_providers_shared.so
Changed
onnxruntime-linux-x64-gpu-1.17.1.tgz/lib/libonnxruntime_providers_tensorrt.so -> onnxruntime-linux-x64-gpu-1.15.1.tgz/lib/libonnxruntime_providers_tensorrt.so
Changed
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/core
Deleted
-(directory)
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/core/providers
Deleted
-(directory)
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/core/providers/cuda
Deleted
-(directory)
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/core/providers/cuda/cuda_context.h
Deleted

-// Copyright (c) Microsoft Corporation. All rights reserved.
-// Licensed under the MIT License.
-
-// This header is to expose a context for cuda custom ops.
-// By the context, a custom cuda operator could fetch existing resources,
-// such as cuda stream and cudnn handle, for reusing.
-
-// For concrete usage, pls find page here:
-// https://onnxruntime.ai/docs/reference/operators/add-custom-op.html#custom-ops-for-cuda-and-rocm
-
-#pragma once
-
-#define ORT_CUDA_CTX
-
-#include "cuda_resource.h"
-#include "core/providers/custom_op_context.h"
-#include <cuda.h>
-#include <cuda_runtime.h>
-#include <cublas_v2.h>
-#include <cudnn.h>
-
-namespace Ort {
-
-namespace Custom {
-
-struct CudaContext : public CustomOpContext {
-  cudaStream_t cuda_stream = {};
-  cudnnHandle_t cudnn_handle = {};
-  cublasHandle_t cublas_handle = {};
-  OrtAllocator* deferred_cpu_allocator = {};
-  // below are cuda ep options
-  int16_t device_id = 0;
-  int32_t arena_extend_strategy = 0;
-  int32_t cudnn_conv_algo_search = 0;
-  bool cudnn_conv_use_max_workspace = true;
-  bool cudnn_conv1d_pad_to_nc1d = false;
-  bool enable_skip_layer_norm_strict_mode = false;
-  bool prefer_nhwc = false;
-
-  void Init(const OrtKernelContext& kernel_ctx) {
-    cuda_stream = FetchResource<cudaStream_t>(kernel_ctx, CudaResource::cuda_stream_t);
-    cudnn_handle = FetchResource<cudnnHandle_t>(kernel_ctx, CudaResource::cudnn_handle_t);
-    cublas_handle = FetchResource<cublasHandle_t>(kernel_ctx, CudaResource::cublas_handle_t);
-    deferred_cpu_allocator = FetchResource<OrtAllocator*>(kernel_ctx, CudaResource::deferred_cpu_allocator_t);
-
-    device_id = FetchResource<int16_t>(kernel_ctx, CudaResource::device_id_t);
-    arena_extend_strategy = FetchResource<int32_t>(kernel_ctx, CudaResource::arena_extend_strategy_t);
-    cudnn_conv_algo_search = FetchResource<int32_t>(kernel_ctx, CudaResource::cudnn_conv_algo_search_t);
-    cudnn_conv_use_max_workspace = FetchResource<bool>(kernel_ctx, CudaResource::cudnn_conv_use_max_workspace_t);
-
-    cudnn_conv1d_pad_to_nc1d = FetchResource<bool>(kernel_ctx, CudaResource::cudnn_conv1d_pad_to_nc1d_t);
-    enable_skip_layer_norm_strict_mode = FetchResource<bool>(kernel_ctx, CudaResource::enable_skip_layer_norm_strict_mode_t);
-    prefer_nhwc = FetchResource<bool>(kernel_ctx, CudaResource::prefer_nhwc_t);
-  }
-
-  template <typename T>
-  T FetchResource(const OrtKernelContext& kernel_ctx, CudaResource resource_type) {
-    if (sizeof(T) > sizeof(void*)) {
-      ORT_CXX_API_THROW("void* is not large enough to hold resource type: " + std::to_string(resource_type), OrtErrorCode::ORT_INVALID_ARGUMENT);
-    }
-    const auto& ort_api = Ort::GetApi();
-    void* resource = {};
-    OrtStatus* status = ort_api.KernelContext_GetResource(&kernel_ctx, ORT_CUDA_RESOUCE_VERSION, resource_type, &resource);
-    if (status) {
-      ORT_CXX_API_THROW("Failed to fetch cuda ep resource, resouce type: " + std::to_string(resource_type), OrtErrorCode::ORT_RUNTIME_EXCEPTION);
-    }
-    T t = {};
-    memcpy(&t, &resource, sizeof(T));
-    return t;
-  }
-
-  void* AllocDeferredCpuMem(size_t size) const {
-    if (0 == size) {
-      return {};
-    }
-    const auto& ort_api = Ort::GetApi();
-    void* mem = {};
-    auto status = ort_api.AllocatorAlloc(deferred_cpu_allocator, size, &mem);
-    if (status) {
-      ORT_CXX_API_THROW("failed to allocate deferred cpu memory", OrtErrorCode::ORT_RUNTIME_EXCEPTION);
-    }
-    return mem;
-  }
-
-  void FreeDeferredCpuMem(void* mem) const {
-    if (mem) {
-      const auto& ort_api = Ort::GetApi();
-      auto status = ort_api.AllocatorFree(deferred_cpu_allocator, mem);
-      if (status) {
-        ORT_CXX_API_THROW("failed to free deferred cpu memory", OrtErrorCode::ORT_RUNTIME_EXCEPTION);
-      }
-    }
-  }
-};
-
-}  // namespace Custom
-}  // namespace Ort
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/core/providers/cuda/cuda_resource.h
Deleted

-// Copyright (c) Microsoft Corporation. All rights reserved.
-// Licensed under the MIT License.
-
-#include "core/providers/resource.h"
-
-#define ORT_CUDA_RESOUCE_VERSION 3
-
-enum CudaResource : int {
-  cuda_stream_t = cuda_resource_offset,  // 10000
-  cudnn_handle_t,
-  cublas_handle_t,
-  deferred_cpu_allocator_t,
-  // below are cuda ep options
-  device_id_t,  // 10004
-  arena_extend_strategy_t,
-  cudnn_conv_algo_search_t,
-  cudnn_conv_use_max_workspace_t,
-  cudnn_conv1d_pad_to_nc1d_t,
-  enable_skip_layer_norm_strict_mode_t,
-  prefer_nhwc_t,
-};
\ No newline at end of file
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/core/providers/custom_op_context.h
Deleted

-// Copyright (c) Microsoft Corporation. All rights reserved.
-// Licensed under the MIT License.
-
-#pragma once
-
-// CustomOpContext defines an interface allowing a custom op to access ep-specific resources.
-struct CustomOpContext {
-  CustomOpContext() = default;
-  virtual ~CustomOpContext(){};
-};
\ No newline at end of file
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/core/providers/resource.h
Deleted

-// Copyright (c) Microsoft Corporation. All rights reserved.
-// Licensed under the MIT License.
-
-#pragma once
-
-enum ResourceOffset {
-  cpu_resource_offset = 0,
-  cuda_resource_offset = 10000,
-  dml_resource_offset = 20000,
-  rocm_resource_offset = 30000,
-  // offsets for other ort eps
-  custom_ep_resource_offset = 10000000,
-  // offsets for customized eps
-};
\ No newline at end of file
17
onnxruntime-linux-x64-gpu-1.17.1.tgz/include/onnxruntime_float16.h
Deleted

-// Copyright (c) Microsoft Corporation. All rights reserved.
-// Licensed under the MIT License.
-
-#pragma once
-
-#include <stdint.h>
-#include <cmath>
-#include <cstring>
-#include <limits>
-
-namespace onnxruntime_float16 {
-
-namespace detail {
-
-enum class endian {
-#if defined(_WIN32)
-  little = 0,
-  big = 1,
-  native = little,
-#elif defined(__GNUC__) || defined(__clang__)
-  little = __ORDER_LITTLE_ENDIAN__,
-  big = __ORDER_BIG_ENDIAN__,
-  native = __BYTE_ORDER__,
-#else
-#error onnxruntime_float16::detail::endian is not implemented in this environment.
-#endif
-};
-
-static_assert(
-    endian::native == endian::little || endian::native == endian::big,
-    "Only little-endian or big-endian native byte orders are supported.");
-
-}  // namespace detail
-
-/// <summary>
-/// Shared implementation between public and internal classes. CRTP pattern.
-/// </summary>
-template <class Derived>
-struct Float16Impl {
- protected:
-  /// <summary>
-  /// Converts from float to uint16_t float16 representation
-  /// </summary>
-  /// <param name="v"></param>
-  /// <returns></returns>
-  constexpr static uint16_t ToUint16Impl(float v) noexcept;
-
-  /// <summary>
-  /// Converts float16 to float
-  /// </summary>
-  /// <returns>float representation of float16 value</returns>
-  float ToFloatImpl() const noexcept;
-
-  /// <summary>
-  /// Creates an instance that represents absolute value.
-  /// </summary>
-  /// <returns>Absolute value</returns>
-  uint16_t AbsImpl() const noexcept {
-    return static_cast<uint16_t>(val & ~kSignMask);
-  }
-
-  /// <summary>
-  /// Creates a new instance with the sign flipped.
-  /// </summary>
-  /// <returns>Flipped sign instance</returns>
-  uint16_t NegateImpl() const noexcept {
-    return IsNaN() ? val : static_cast<uint16_t>(val ^ kSignMask);
-  }
-
- public:
-  // uint16_t special values
-  static constexpr uint16_t kSignMask = 0x8000U;
-  static constexpr uint16_t kBiasedExponentMask = 0x7C00U;
-  static constexpr uint16_t kPositiveInfinityBits = 0x7C00U;
-  static constexpr uint16_t kNegativeInfinityBits = 0xFC00U;
-  static constexpr uint16_t kPositiveQNaNBits = 0x7E00U;
-  static constexpr uint16_t kNegativeQNaNBits = 0xFE00U;
-  static constexpr uint16_t kEpsilonBits = 0x4170U;
-  static constexpr uint16_t kMinValueBits = 0xFBFFU;  // Minimum normal number
-  static constexpr uint16_t kMaxValueBits = 0x7BFFU;  // Largest normal number
-  static constexpr uint16_t kOneBits = 0x3C00U;
-  static constexpr uint16_t kMinusOneBits = 0xBC00U;
-
-  uint16_t val{0};
-
-  Float16Impl() = default;
-
-  /// <summary>
-  /// Checks if the value is negative
-  /// </summary>
-  /// <returns>true if negative</returns>
-  bool IsNegative() const noexcept {
-    return static_cast<int16_t>(val) < 0;
-  }
-
-  /// <summary>
-  /// Tests if the value is NaN
-  /// </summary>
-  /// <returns>true if NaN</returns>
-  bool IsNaN() const noexcept {
-    return AbsImpl() > kPositiveInfinityBits;
-  }
-
-  /// <summary>
-  /// Tests if the value is finite
-  /// </summary>
-  /// <returns>true if finite</returns>
-  bool IsFinite() const noexcept {
-    return AbsImpl() < kPositiveInfinityBits;
-  }
-
-  /// <summary>
-  /// Tests if the value represents positive infinity.
-  /// </summary>
-  /// <returns>true if positive infinity</returns>
-  bool IsPositiveInfinity() const noexcept {
-    return val == kPositiveInfinityBits;
-  }
-
-  /// <summary>
-  /// Tests if the value represents negative infinity
-  /// </summary>
-  /// <returns>true if negative infinity</returns>
-  bool IsNegativeInfinity() const noexcept {
-    return val == kNegativeInfinityBits;
-  }
-
-  /// <summary>
-  /// Tests if the value is either positive or negative infinity.
-  /// </summary>
-  /// <returns>True if absolute value is infinity</returns>
-  bool IsInfinity() const noexcept {
-    return AbsImpl() == kPositiveInfinityBits;
-  }
-
-  /// <summary>
-  /// Tests if the value is NaN or zero. Useful for comparisons.
-  /// </summary>
-  /// <returns>True if NaN or zero.</returns>
-  bool IsNaNOrZero() const noexcept {
-    auto abs = AbsImpl();
-    return (abs == 0 || abs > kPositiveInfinityBits);
-  }
-
-  /// <summary>
-  /// Tests if the value is normal (not zero, subnormal, infinite, or NaN).
-  /// </summary>
-  /// <returns>True if so</returns>
-  bool IsNormal() const noexcept {
-    auto abs = AbsImpl();
-    return (abs < kPositiveInfinityBits)           // is finite
-           && (abs != 0)                           // is not zero
-           && ((abs & kBiasedExponentMask) != 0);  // is not subnormal (has a non-zero exponent)
-  }
-
-  /// <summary>
-  /// Tests if the value is subnormal (denormal).
-  /// </summary>
-  /// <returns>True if so</returns>
-  bool IsSubnormal() const noexcept {
-    auto abs = AbsImpl();
-    return (abs < kPositiveInfinityBits)           // is finite
-           && (abs != 0)                           // is not zero
-           && ((abs & kBiasedExponentMask) == 0);  // is subnormal (has a zero exponent)
-  }
-
-  /// <summary>
-  /// Creates an instance that represents absolute value.
-  /// </summary>
-  /// <returns>Absolute value</returns>
-  Derived Abs() const noexcept { return Derived::FromBits(AbsImpl()); }
-
-  /// <summary>
-  /// Creates a new instance with the sign flipped.
-  /// </summary>
-  /// <returns>Flipped sign instance</returns>
-  Derived Negate() const noexcept { return Derived::FromBits(NegateImpl()); }
-
-  /// <summary>
-  /// IEEE defines that positive and negative zero are equal, this gives us a quick equality check
-  /// for two values by or'ing the private bits together and stripping the sign. They are both zero,
-  /// and therefore equivalent, if the resulting value is still zero.
-  /// </summary>
-  /// <param name="lhs">first value</param>
-  /// <param name="rhs">second value</param>
-  /// <returns>True if both arguments represent zero</returns>
-  static bool AreZero(const Float16Impl& lhs, const Float16Impl& rhs) noexcept {
-    return static_cast<uint16_t>((lhs.val | rhs.val) & ~kSignMask) == 0;
-  }
-
-  bool operator==(const Float16Impl& rhs) const noexcept {
-    if (IsNaN() || rhs.IsNaN()) {
-      // IEEE defines that NaN is not equal to anything, including itself.
-      return false;
-    }
-    return val == rhs.val;
-  }
-
-  bool operator!=(const Float16Impl& rhs) const noexcept { return !(*this == rhs); }
-
-  bool operator<(const Float16Impl& rhs) const noexcept {
-    if (IsNaN() || rhs.IsNaN()) {
-      // IEEE defines that NaN is unordered with respect to everything, including itself.
-      return false;
-    }
-
-    const bool left_is_negative = IsNegative();
-    if (left_is_negative != rhs.IsNegative()) {
-      // When the signs of left and right differ, we know that left is less than right if it is
-      // the negative value. The exception to this is if both values are zero, in which case IEEE
-      // says they should be equal, even if the signs differ.
-      return left_is_negative && !AreZero(*this, rhs);
-    }
-    return (val != rhs.val) && ((val < rhs.val) ^ left_is_negative);
-  }
-};
-
-// The following Float16_t conversions are based on the code from
-// Eigen library.
-
-// The conversion routines are Copyright (c) Fabian Giesen, 2016.
-// The original license follows:
-//
-// Copyright (c) Fabian Giesen, 2016
-// All rights reserved.
-// Redistribution and use in source and binary forms, with or without
-// modification, are permitted.
-// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
-// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
-// HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
-// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
-// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
-// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
-// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
-// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-
-namespace detail {
-union float32_bits {
-  unsigned int u;
-  float f;
-};
-}  // namespace detail
-
-template <class Derived>
-inline constexpr uint16_t Float16Impl<Derived>::ToUint16Impl(float v) noexcept {
-  detail::float32_bits f{};
-  f.f = v;
-
-  constexpr detail::float32_bits f32infty = {255 << 23};
-  constexpr detail::float32_bits f16max = {(127 + 16) << 23};
-  constexpr detail::float32_bits denorm_magic = {((127 - 15) + (23 - 10) + 1) << 23};
-  constexpr unsigned int sign_mask = 0x80000000u;
-  uint16_t val = static_cast<uint16_t>(0x0u);
-
-  unsigned int sign = f.u & sign_mask;
-  f.u ^= sign;
-
-  // NOTE all the integer compares in this function can be safely
-  // compiled into signed compares since all operands are below
-  // 0x80000000. Important if you want fast straight SSE2 code
-  // (since there's no unsigned PCMPGTD).
-
-  if (f.u >= f16max.u) {                         // result is Inf or NaN (all exponent bits set)
-    val = (f.u > f32infty.u) ? 0x7e00 : 0x7c00;  // NaN->qNaN and Inf->Inf
-  } else {                                       // (De)normalized number or zero
-    if (f.u < (113 << 23)) {                     // resulting FP16 is subnormal or zero
-      // use a magic value to align our 10 mantissa bits at the bottom of
-      // the float. as long as FP addition is round-to-nearest-even this
-      // just works.
-      f.f += denorm_magic.f;
-
-      // and one integer subtract of the bias later, we have our final float!
-      val = static_cast<uint16_t>(f.u - denorm_magic.u);
-    } else {
-      unsigned int mant_odd = (f.u >> 13) & 1;  // resulting mantissa is odd
-
-      // update exponent, rounding bias part 1
-      // Equivalent to `f.u += ((unsigned int)(15 - 127) << 23) + 0xfff`, but
-      // without arithmetic overflow.
-      f.u += 0xc8000fffU;
-      // rounding bias part 2
-      f.u += mant_odd;
-      // take the bits!
-      val = static_cast<uint16_t>(f.u >> 13);
-    }
-  }
-
-  val |= static_cast<uint16_t>(sign >> 16);
-  return val;
-}
-
-template <class Derived>
-inline float Float16Impl<Derived>::ToFloatImpl() const noexcept {
-  constexpr detail::float32_bits magic = {113 << 23};
-  constexpr unsigned int shifted_exp = 0x7c00 << 13;  // exponent mask after shift
-  detail::float32_bits o{};
-
-  o.u = (val & 0x7fff) << 13;            // exponent/mantissa bits
-  unsigned int exp = shifted_exp & o.u;  // just the exponent
-  o.u += (127 - 15) << 23;               // exponent adjust
-
-  // handle exponent special cases
-  if (exp == shifted_exp) {    // Inf/NaN?
-    o.u += (128 - 16) << 23;   // extra exp adjust
-  } else if (exp == 0) {       // Zero/Denormal?
-    o.u += 1 << 23;            // extra exp adjust
-    o.f -= magic.f;            // re-normalize
-  }
-
-  // Attempt to workaround the Internal Compiler Error on ARM64
-  // for bitwise | operator, including std::bitset
-#if (defined _MSC_VER) && (defined _M_ARM || defined _M_ARM64 || defined _M_ARM64EC)
-  if (IsNegative()) {
-    return -o.f;
-  }
-#else
-  // original code:
-  o.u |= (val & 0x8000U) << 16U;  // sign bit
-#endif
-  return o.f;
-}
-
-/// Shared implementation between public and internal classes. CRTP pattern.
-template <class Derived>
-struct BFloat16Impl {
- protected:
-  /// <summary>
-  /// Converts from float to uint16_t float16 representation
-  /// </summary>
-  /// <param name="v"></param>
-  /// <returns></returns>
-  static uint16_t ToUint16Impl(float v) noexcept;
-
-  /// <summary>
-  /// Converts bfloat16 to float
-  /// </summary>
-  /// <returns>float representation of bfloat16 value</returns>
-  float ToFloatImpl() const noexcept;
-
-  /// <summary>
-  /// Creates an instance that represents absolute value.
-  /// </summary>
-  /// <returns>Absolute value</returns>
-  uint16_t AbsImpl() const noexcept {
-    return static_cast<uint16_t>(val & ~kSignMask);
-  }
-
-  /// <summary>
-  /// Creates a new instance with the sign flipped.
-  /// </summary>
-  /// <returns>Flipped sign instance</returns>
-  uint16_t NegateImpl() const noexcept {
-    return IsNaN() ? val : static_cast<uint16_t>(val ^ kSignMask);
-  }
-
- public:
-  // uint16_t special values
-  static constexpr uint16_t kSignMask = 0x8000U;
-  static constexpr uint16_t kBiasedExponentMask = 0x7F80U;
-  static constexpr uint16_t kPositiveInfinityBits = 0x7F80U;
-  static constexpr uint16_t kNegativeInfinityBits = 0xFF80U;
-  static constexpr uint16_t kPositiveQNaNBits = 0x7FC1U;
-  static constexpr uint16_t kNegativeQNaNBits = 0xFFC1U;
-  static constexpr uint16_t kSignaling_NaNBits = 0x7F80U;
-  static constexpr uint16_t kEpsilonBits = 0x0080U;
-  static constexpr uint16_t kMinValueBits = 0xFF7FU;
-  static constexpr uint16_t kMaxValueBits = 0x7F7FU;
-  static constexpr uint16_t kRoundToNearest = 0x7FFFU;
-  static constexpr uint16_t kOneBits = 0x3F80U;
-  static constexpr uint16_t kMinusOneBits = 0xBF80U;
-
-  uint16_t val{0};
-
-  BFloat16Impl() = default;
-
-  /// <summary>
-  /// Checks if the value is negative
-  /// </summary>
-  /// <returns>true if negative</returns>
-  bool IsNegative() const noexcept {
-    return static_cast<int16_t>(val) < 0;
-  }
-
-  /// <summary>
-  /// Tests if the value is NaN
-  /// </summary>
-  /// <returns>true if NaN</returns>
-  bool IsNaN() const noexcept {
-    return AbsImpl() > kPositiveInfinityBits;
-  }
-
-  /// <summary>
-  /// Tests if the value is finite
-  /// </summary>
-  /// <returns>true if finite</returns>
-  bool IsFinite() const noexcept {
-    return AbsImpl() < kPositiveInfinityBits;
-  }
-
-  /// <summary>
-  /// Tests if the value represents positive infinity.
-  /// </summary>
-  /// <returns>true if positive infinity</returns>
-  bool IsPositiveInfinity() const noexcept {
-    return val == kPositiveInfinityBits;
-  }
-
-  /// <summary>
-  /// Tests if the value represents negative infinity
-  /// </summary>
-  /// <returns>true if negative infinity</returns>
-  bool IsNegativeInfinity() const noexcept {
-    return val == kNegativeInfinityBits;
-  }
-
-  /// <summary>
-  /// Tests if the value is either positive or negative infinity.
-  /// </summary>
-  /// <returns>True if absolute value is infinity</returns>
-  bool IsInfinity() const noexcept {
-    return AbsImpl() == kPositiveInfinityBits;
-  }
-
-  /// <summary>
-  /// Tests if the value is NaN or zero. Useful for comparisons.
-  /// </summary>
-  /// <returns>True if NaN or zero.</returns>
-  bool IsNaNOrZero() const noexcept {
-    auto abs = AbsImpl();
-    return (abs == 0 || abs > kPositiveInfinityBits);
-  }
-
-  /// <summary>
-  /// Tests if the value is normal (not zero, subnormal, infinite, or NaN).
-  /// </summary>
-  /// <returns>True if so</returns>
-  bool IsNormal() const noexcept {
-    auto abs = AbsImpl();
-    return (abs < kPositiveInfinityBits)           // is finite
-           && (abs != 0)                           // is not zero
-           && ((abs & kBiasedExponentMask) != 0);  // is not subnormal (has a non-zero exponent)
-  }
-
-  /// <summary>
-  /// Tests if the value is subnormal (denormal).
-  /// </summary>
-  /// <returns>True if so</returns>
-  bool IsSubnormal() const noexcept {
-    auto abs = AbsImpl();
-    return (abs < kPositiveInfinityBits)           // is finite
-           && (abs != 0)                           // is not zero
-           && ((abs & kBiasedExponentMask) == 0);  // is subnormal (has a zero exponent)
-  }
-
-  /// <summary>
-  /// Creates an instance that represents absolute value.
-  /// </summary>
-  /// <returns>Absolute value</returns>
-  Derived Abs() const noexcept { return Derived::FromBits(AbsImpl()); }
-
-  /// <summary>
-  /// Creates a new instance with the sign flipped.
-  /// </summary>
-  /// <returns>Flipped sign instance</returns>
-  Derived Negate() const noexcept { return Derived::FromBits(NegateImpl()); }
-
-  /// <summary>
-  /// IEEE defines that positive and negative zero are equal, this gives us a quick equality check
-  /// for two values by or'ing the private bits together and stripping the sign. They are both zero,
-  /// and therefore equivalent, if the resulting value is still zero.
-  /// </summary>
-  /// <param name="lhs">first value</param>
-  /// <param name="rhs">second value</param>
-  /// <returns>True if both arguments represent zero</returns>
-  static bool AreZero(const BFloat16Impl& lhs, const BFloat16Impl& rhs) noexcept {
-    // IEEE defines that positive and negative zero are equal, this gives us a quick equality check
-    // for two values by or'ing the private bits together and stripping the sign. They are both zero,
-    // and therefore equivalent, if the resulting value is still zero.
-    return static_cast<uint16_t>((lhs.val | rhs.val) & ~kSignMask) == 0;
-  }
-};
-
-template <class Derived>
-inline uint16_t BFloat16Impl<Derived>::ToUint16Impl(float v) noexcept {
-  uint16_t result;
-  if (std::isnan(v)) {
-    result = kPositiveQNaNBits;
-  } else {
-    auto get_msb_half = [](float fl) {
-      uint16_t result;
-#ifdef __cpp_if_constexpr
-      if constexpr (detail::endian::native == detail::endian::little) {
-#else
-      if (detail::endian::native == detail::endian::little) {
-#endif
-        std::memcpy(&result, reinterpret_cast<char*>(&fl) + sizeof(uint16_t), sizeof(uint16_t));
-      } else {
-        std::memcpy(&result, &fl, sizeof(uint16_t));
-      }
-      return result;
-    };
-
-    uint16_t upper_bits = get_msb_half(v);
-    union {
-      uint32_t U32;
-      float F32;
-    };
-    F32 = v;
-    U32 += (upper_bits & 1) + kRoundToNearest;
-    result = get_msb_half(F32);
-  }
-  return result;
-}
-
-template <class Derived>
-inline float BFloat16Impl<Derived>::ToFloatImpl() const noexcept {
-  if (IsNaN()) {
-    return std::numeric_limits<float>::quiet_NaN();
-  }
-  float result;
-  char* const first = reinterpret_cast<char*>(&result);
-  char* const second = first + sizeof(uint16_t);
-#ifdef __cpp_if_constexpr
-  if constexpr (detail::endian::native == detail::endian::little) {
-#else
-  if (detail::endian::native == detail::endian::little) {
-#endif
-    std::memset(first, 0, sizeof(uint16_t));
-    std::memcpy(second, &val, sizeof(uint16_t));
-  } else {
-    std::memcpy(first, &val, sizeof(uint16_t));
-    std::memset(second, 0, sizeof(uint16_t));
-  }
-  return result;
-}
-
-}  // namespace onnxruntime_float16
onnxruntime-linux-x64-gpu-1.17.1.tgz/lib/libonnxruntime.so.1.17.1
Deleted
opencv-linux-Release-4.8.0-1.tar.gz -> opencv-4.7.0.tar.gz
Changed
Request History
umireon created request almost 2 years ago
Adding a new package
OBS Plugin: Portrait Background Removal / Virtual Green-screen and Low-Light Enhancement
umireon revoked request over 1 year ago
Updating to 1.1.5