OML4PY 2.1 Quickstart

In the world of AI vector search, everything is constantly changing and Oracle is no exception. In a previous post we walked through installing OML4PY 2.0 in a VirtualBox 23ai Free appliance in order to export the preconfigured ONNX LLMs using Python and OML4PY. OML4PY 2.1 is available for download now, and the installation is a bit different. In this post I will go through this installation using the current VirtualBox appliance.

  1. Install Environment: VirtualBox
  2. Import VM Appliance 23AI Free
  3. Fix Linux Yum Installer
  4. Upgrade Linux Package Dependencies
  5. Upgrade Python to 3.12.3 or higher
  6. Add Python Components
  7. Download and Install OML4PY 2.1
  8. Using Alternate Python Package Releases
  9. OML4PY 2.1 Show Preconfigured Models
  10. Exporting Augmented ONNX Models
  11. Export Augmented Models: Syntax
  12. Augmented ONNX Models: Download Links
  13. ONNX Models: Loading To Oracle 23ai
  14. Testing the text embedding models
  15. Afterword: Successful ONNX Export Details

Install Environment: VirtualBox

I have not had any luck with the new VirtualBox 7.1 release and the Oracle VirtualBox appliances. The main problem has been getting a connection in SQL Developer using Host Only networking. Because of this issue I am still running the older VirtualBox 7.0 environment, which works successfully for me: Oracle VirtualBox 7.0.22 r165102. You can access older VirtualBox builds here. Author's Note (9/26/2025): VirtualBox 7.2.0 r170228 has fixed the issue with accessing the guest from the host using Host Only networking, so there is no problem with updating VirtualBox to the latest version.

Set up host networking in VirtualBox to allow accessing guests from the host machine.

Import VM Appliance 23AI Free

After installing VirtualBox, download the latest Oracle 23ai appliance from the Oracle Database 23ai Free VirtualBox appliance page. The latest download is 23.7 as of February 2025.

Import the appliance, start it up and upgrade Guest Additions. Configure Host Only networking to access the guest from SQL Developer on the host machine. Share a directory from the host machine to exchange files with the Linux guest and mount it at startup.

Test the connection to the database from the host and verify that the shared file directory is visible from the guest.
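A quick way to run this test from the host is a short python-oracledb script. This is only a sketch: the host-only IP address, the system password, and the FREEPDB1 service name are assumptions from my setup, so adjust them for yours.

import oracledb

# Assumed values: the guest's host-only IP, the appliance password, and the
# 23ai Free default pluggable database service -- change these as needed.
conn = oracledb.connect(user="system", password="oracle",
                        dsn="192.168.56.101:1521/FREEPDB1")
with conn.cursor() as cur:
    cur.execute("select banner_full from v$version")
    print(cur.fetchone()[0])
conn.close()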

Note: Running the ONNX exports with the default 23ai appliance resulted in many of the larger model exports terminating with a killed status. Increasing the virtual RAM to 16 GB and the CPU count to 8 allowed the exports to complete without being killed.

Fix Linux Yum Installer

The yum configuration points to the Phoenix yum server in the OCI cloud, which is not reachable from the virtual machine. This has to be fixed in the VM before we can install Linux packages.

Start the virtual machine and open a terminal. Switch your terminal login to root (the password is ‘oracle’) and go to /etc/dnf/vars to fix the ociregion file that points to the Phoenix OCI region. Clearing the value in this file makes the Linux machine use the default public Oracle yum server for installing packages.

Open this file in an editor:

[oracle@localhost ~]$ su
Password: 
[root@localhost oracle]# cd /etc/dnf/vars
[root@localhost vars]# ls
ocidomain  ociregion
[root@localhost vars]# gedit ociregion

In the editor you will see that the file content is a single entry:

-us-phoenix-1

Just delete this entry, save the file and exit the editor. Now the yum installer will be able to find the Linux packages.

Upgrade Linux Package Dependencies

Next, install the Linux packages needed to support the Python 3.12 and OML4PY installations. This command downloads the packages and asks if you want to install them; reply y and wait for the install to complete.

Note: I have added gcc-c++ to the package dependencies because it is needed for the Python pip installer.

[root@localhost vars]# cd /
[root@localhost /]# yum install perl-Env libffi-devel openssl openssl-devel tk-devel xz-devel zlib-devel bzip2-devel readline-devel libuuid-devel ncurses-devel gcc-c++

The Linux VM is now ready to upgrade the Python version.

Upgrade Python to 3.12.3 or higher

Exit root and create a directory for the Python installation in /home/oracle.

[oracle@localhost ~]$ mkdir /home/oracle/python
[oracle@localhost ~]$ cd /home/oracle/python

Get the Python source package and extract it. For OML4PY 2.1 we need at least Python 3.12.3; the currently available version is 3.12.6.

[oracle@localhost python]$ wget https://www.python.org/ftp/python/3.12.6/Python-3.12.6.tgz
...getting archive
[oracle@localhost python]$ tar xvf Python-3.12.6.tgz
...extracting files
[oracle@localhost python]$ ls
Python-3.12.6  Python-3.12.6.tgz

The Python installation files are now in /home/oracle/python/Python-3.12.6.

Export a PREFIX variable assigned to the path and change to that directory.

Configure the Python installation; this takes a while…

[oracle@localhost python]$ pwd
/home/oracle/python
[oracle@localhost python]$ export PREFIX=`pwd`/Python-3.12.6
[oracle@localhost python]$ cd $PREFIX
[oracle@localhost Python-3.12.6]$ ./configure --prefix=$PREFIX --enable-shared --enable-optimizations
...checking for lib extensions...
...configure: creating Makefile...

Run make; this will also take a while…

[oracle@localhost Python-3.12.6]$ make clean; make
...

After this completes, run make altinstall; this installs Python 3.12 alongside the system Python.

[oracle@localhost Python-3.12.6]$ make altinstall
...
Installing collected packages: pip
  WARNING: The script pip3.12 is installed in '/home/oracle/python/Python-3.12.6/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed pip-24.2

Now add this directory to the path and create the symbolic link for running Python.

[oracle@localhost python]$ cd /home/oracle/python
[oracle@localhost python]$ export PREFIX=/home/oracle/python/Python-3.12.6
[oracle@localhost Python-3.12.6]$ echo $PREFIX
/home/oracle/python/Python-3.12.6
[oracle@localhost Python-3.12.6]$ export PYTHONHOME=$PREFIX
[oracle@localhost Python-3.12.6]$ export PATH=$PYTHONHOME/bin:$PATH
[oracle@localhost Python-3.12.6]$ export LD_LIBRARY_PATH=$PYTHONHOME/lib:$LD_LIBRARY_PATH
[oracle@localhost Python-3.12.6]$ echo $PYTHONHOME
/home/oracle/python/Python-3.12.6
[oracle@localhost Python-3.12.6]$ cd $PYTHONHOME/bin
[oracle@localhost bin]$ ln -s python3.12 python3

This can also be accomplished with a shell script. Create the following script in /home/oracle/python:

linkPython3Env.sh

#! /usr/bin/bash
# to execute the commands use `source ./linkPython3Env.sh`
echo "creating symbolic link for python3.12 environment"
cd /home/oracle/python
export PREFIX=/home/oracle/python/Python-3.12.6
cd $PREFIX
echo $PREFIX
export PYTHONHOME=$PREFIX
export PATH=$PYTHONHOME/bin:$PATH
export LD_LIBRARY_PATH=$PYTHONHOME/lib:$LD_LIBRARY_PATH
echo $PYTHONHOME
cd $PYTHONHOME/bin
ln -s python3.12 python3

Make the shell script executable and run it with the source command so that the exports take effect in the current shell:

chmod +x ./linkPython3Env.sh
source ./linkPython3Env.sh

Check the Python version. Note: use Ctrl-D to exit the Python interpreter.

[oracle@localhost bin]$ cd /home/oracle/python
[oracle@localhost python]$ python3
Python 3.12.6 (main, Apr 20 2025, 13:54:06) [GCC 8.5.0 20210514 (Red Hat 8.5.0-26.0.1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 

If you close this terminal session and start a new terminal, these variables won’t be set and the python3 command will run Python 3.6.

[oracle@localhost python]$ python3
Python 3.6.8 (default, Dec  4 2024, 01:35:34) 
[GCC 8.5.0 20210514 (Red Hat 8.5.0-22.0.1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 

For a new session, export these variables again and python3 will run the upgraded Python version:

[oracle@localhost python]$ export PREFIX=`pwd`/Python-3.12.6
[oracle@localhost python]$ export PYTHONHOME=$PREFIX
[oracle@localhost python]$ export PATH=$PYTHONHOME/bin:$PATH
[oracle@localhost python]$ export LD_LIBRARY_PATH=$PYTHONHOME/lib:$LD_LIBRARY_PATH
[oracle@localhost python]$ python3
Python 3.12.6 (main, Apr 20 2025, 13:54:06) [GCC 8.5.0 20210514 (Red Hat 8.5.0-26.0.1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 

We can create a shell script to do this after we install OML4PY 2.1.

Add Python Components

Upgrade the pip installer to the latest version:

[oracle@localhost python]$ pip install --upgrade pip
Requirement already satisfied: pip in ./Python-3.12.6/lib/python3.12/site-packages (24.2)
...
Successfully installed pip-25.1.1

The Python requirements are a bit different for the OML4PY 2.1 client. The OML4PY documentation has also been updated to show that these pip installs can be executed together with a requirements file.

Create a file called oml4py-requirements.txt in /home/oracle/python and add all of the Python packages necessary for installing OML4PY 2.1:

oml4py-requirements.txt

--extra-index-url https://download.pytorch.org/whl/cpu
pandas==2.2.2
setuptools==70.0.0
scipy==1.14.0
matplotlib==3.8.4
oracledb==2.4.1
scikit-learn==1.5.1
numpy==2.0.1
onnxruntime==1.20.0
onnxruntime-extensions==0.12.0
onnx==1.17.0
torch==2.6.0
transformers==4.49.0
sentencepiece==0.2.0

Then do the pip install specifying this requirements file. The necessary components will be downloaded and installed. Installing these components is much faster than the prerequisites install for OML4PY 2.0.

[oracle@localhost python]$ ls
oml4py-requirements.txt  Python-3.12.6  Python-3.12.6.tgz
[oracle@localhost python]$ pip3.12 install -r oml4py-requirements.txt
...

Download and Install OML4PY 2.1

Go to the downloads page for Oracle Machine Learning and download the OML4Py 2.1.0 (Database 23ai) client for Linux 64. Extract the zip file on the host machine and put the resulting client folder in the directory that is shared with the Linux guest. Switch to the guest machine and confirm that the files are available.

[oracle@localhost client]$ pwd
/home/oracle/ext-data/oml-install/oml4py-2.1/client
[oracle@localhost client]$ ls
client.pl  oml-2.1-cp312-cp312-linux_x86_64.whl  OML4PInstallShared.pm  oml4py.ver

In the Linux guest, create /home/oracle/oml4py and copy the files there.

[oracle@localhost client]$ cd /home/oracle
[oracle@localhost ~]$ mkdir ./oml4py
[oracle@localhost ~]$ cd ./oml4py
[oracle@localhost oml4py]$ cp -r /home/oracle/ext-data/oml-install/oml4py-2.1/client ./
[oracle@localhost oml4py]$ ls ./client
client.pl  oml-2.1-cp312-cp312-linux_x86_64.whl  OML4PInstallShared.pm  oml4py.ver

Run the client.pl Perl installer for OML4PY 2.1. If you reinstall the Python package requirements, just rerun this installer and it will update the OML4PY installation.

[oracle@localhost oml4py]$ pwd
/home/oracle/oml4py
[oracle@localhost oml4py]$ perl -Iclient /home/oracle/oml4py/client/client.pl -i --ask

Oracle Machine Learning for Python 2.1 Client.

Copyright (c) 2018, 2025 Oracle and/or its affiliates. All rights reserved.
Checking platform .................. Pass
Checking Python .................... Pass
Checking dependencies .............. Pass
Checking OML4P version ............. Pass
Current configuration
  Python Version ................... 3.12.6
  PYTHONHOME ....................... /home/oracle/python/Python-3.12.6
  Existing OML4P module version .... None
  Operation ........................ Install/Upgrade

Proceed? [yes]y

Processing ./client/oml-2.1-cp312-cp312-linux_x86_64.whl
Installing collected packages: oml
Successfully installed oml-2.1
Done

Validate the OML4PY installation by starting Python and importing the ONNXPipeline classes:

[oracle@localhost oml4py]$ python3
Python 3.12.6 (main, Apr 19 2025, 12:04:40) [GCC 8.5.0 20210514 (Red Hat 8.5.0-26.0.1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from oml.utils import ONNXPipeline, ONNXPipelineConfig

Using Alternate Python Package Releases

The first time I went through this tutorial, there were a lot of warnings while exporting augmented models. These were due to a missing package, and the messages suggested the solution:

Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`

Installing this Python package eliminated the warning while exporting augmented models:

[oracle@localhost python]$ pip install hf_xet
...
Successfully installed hf_xet-1.0.3

Testing the install again showed that this can simply be added to the requirements file:

oml4py-requirements-alt.txt

--extra-index-url https://download.pytorch.org/whl/cpu
pandas==2.2.2
setuptools==70.0.0
scipy==1.14.0
matplotlib==3.8.4
oracledb==2.4.1
scikit-learn==1.5.1
numpy==2.0.1
onnxruntime==1.20.0
onnxruntime-extensions==0.12.0
onnx==1.17.0
torch==2.6.0
transformers==4.49.0
sentencepiece==0.2.0
hf_xet==1.0.3

Install the Python packages with these alternate requirements:

[oracle@localhost python]$ pip3.12 install -r oml4py-requirements-alt.txt

Then reinstall OML4PY with the perl script:

[oracle@localhost oml4py]$ pwd
/home/oracle/oml4py
[oracle@localhost oml4py]$ perl -Iclient /home/oracle/oml4py/client/client.pl -i --ask

Testing all of the preconfigured exports still turned up some issues, so I looked up the latest release of each Python package and used those for the requirements:

oml4py-requirements-latest.txt

--extra-index-url https://download.pytorch.org/whl/cpu
pandas==2.2.3
setuptools==70.0.0
scipy==1.15.3
matplotlib==3.10.3
oracledb==2.4.1
scikit-learn==1.6.1
numpy==2.2.0
onnxruntime==1.22.0
onnxruntime-extensions==0.14.0
onnx==1.18.0
torch==2.7.0
transformers==4.52.3
sentencepiece==0.2.0
hf_xet==1.1.2

Reinstalling Python packages with these requirements, and reinstalling OML4PY:

[oracle@localhost python]$ pip3.12 install -r oml4py-requirements-latest.txt
...
[oracle@localhost oml4py]$ pwd
/home/oracle/oml4py
[oracle@localhost oml4py]$ perl -Iclient /home/oracle/oml4py/client/client.pl -i --ask

Using oml4py-requirements-latest.txt to install OML4PY 2.1 eliminated most of the warnings encountered while exporting augmented models, and all of the preconfigured models produced an export that loaded successfully into Oracle 23.7.

All of the augmented models that I have made available were converted using oml4py-requirements-latest.txt. These models converted with the fewest warnings, and it makes sense to me to use the latest release of each package.

NumPy 2.x may still have compatibility issues with oml.utils, so as a further test I downgraded numpy to 1.x (1.26.4) and reinstalled OML4PY 2.1 with the flag to skip dependency checks (--no-deps). This did not work with the latest releases for all requirements.

Testing again with numpy downgraded and the documented requirements worked. The exports for preconfigured models had some errors, but produced valid files that could be imported to Oracle. The resulting models still generated float32 vectors in all cases. The vectors generated by these models were the same as those generated by the models that were augmented with all package dependencies set to the latest release (requirements-latest.txt). It appears that we have a measure of flexibility in which Python package versions can be used with OML4PY.

oml4py-requirements-numpy1x.txt

--extra-index-url https://download.pytorch.org/whl/cpu
pandas==2.2.2
setuptools==70.0.0
scipy==1.14.0
matplotlib==3.8.4
oracledb==2.4.1
scikit-learn==1.5.1
numpy==1.26.4
onnxruntime==1.20.0
onnxruntime-extensions==0.12.0
onnx==1.17.0
torch==2.6.0
transformers==4.49.0
sentencepiece==0.2.0

Reinstall the Python packages with this requirements file, then reinstall OML4PY using the --no-deps option to skip dependency checking:

[oracle@localhost python]$ pip3.12 install -r oml4py-requirements-numpy1x.txt
...

[oracle@localhost oml4py]$ pwd
/home/oracle/oml4py
[oracle@localhost oml4py]$ perl -Iclient /home/oracle/oml4py/client/client.pl -i --ask --no-deps

Whichever set of requirements you use, the rest of the process is the same to export augmented models and load them to the database.
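To record which package set produced a given export, the installed versions can be printed with the standard library. This is a small sketch; the distribution name oml for the OML4PY client is an assumption based on the wheel name.

from importlib.metadata import version

# Print the installed version of each package relevant to the exports.
# "oml" is assumed to be the distribution name of the OML4PY client wheel.
for pkg in ("numpy", "onnx", "onnxruntime", "onnxruntime-extensions",
            "torch", "transformers", "oml"):
    print(pkg, version(pkg))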

OML4PY 2.1 Show Preconfigured Models

Start Python and show the preconfigured models available:

[oracle@localhost oml4py]$ python3
Python 3.12.6 (main, Apr 19 2025, 12:04:40) [GCC 8.5.0 20210514 (Red Hat 8.5.0-26.0.1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from oml.utils import ONNXPipelineConfig
>>> ONNXPipelineConfig.show_preconfigured()

[
'sentence-transformers/all-mpnet-base-v2', 
'sentence-transformers/all-MiniLM-L6-v2', 
'sentence-transformers/multi-qa-MiniLM-L6-cos-v1', 
'sentence-transformers/distiluse-base-multilingual-cased-v2', 
'sentence-transformers/all-MiniLM-L12-v2', 
'BAAI/bge-small-en-v1.5', 
'BAAI/bge-base-en-v1.5', 
'taylorAI/bge-micro-v2', 
'intfloat/e5-small-v2', 
'intfloat/e5-base-v2', 
'thenlper/gte-base', 
'thenlper/gte-small', 
'TaylorAI/gte-tiny', 
'sentence-transformers/paraphrase-multilingual-mpnet-base-v2', 
'intfloat/multilingual-e5-base', 
'intfloat/multilingual-e5-small', 
'sentence-transformers/stsb-xlm-r-multilingual', 
'Snowflake/snowflake-arctic-embed-xs', 
'Snowflake/snowflake-arctic-embed-s', 
'Snowflake/snowflake-arctic-embed-m', 
'mixedbread-ai/mxbai-embed-large-v1', 
'openai/clip-vit-large-patch14', 
'google/vit-base-patch16-224', 
'microsoft/resnet-18', 
'microsoft/resnet-50', 
'WinKawaks/vit-tiny-patch16-224', 
'Falconsai/nsfw_image_detection', 
'WinKawaks/vit-small-patch16-224', 
'nateraw/vit-age-classifier', 
'rizvandwiki/gender-classification', 
'AdamCodd/vit-base-nsfw-detector', 
'trpakov/vit-face-expression', 
'BAAI/bge-reranker-base'
]

Add the argument include_properties=True to show the preconfigured models with their properties.

from oml.utils import ONNXPipelineConfig
ONNXPipelineConfig.show_preconfigured(include_properties=True)

See preconfigured-models-properties.json for the full output of this command.
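Writing that file yourself takes only a few lines. This is a sketch that assumes show_preconfigured returns the list the REPL echoes above, rather than only printing it.

import json
from oml.utils import ONNXPipelineConfig

# Assumes show_preconfigured returns the list of property dicts; json.dump
# converts the Python True/False values to JSON true/false.
props = ONNXPipelineConfig.show_preconfigured(include_properties=True)
with open("preconfigured-models-properties.json", "w") as f:
    json.dump(props, f, indent=2)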

Converted to JSON, the properties can be easily queried:

preconfigured-model-properties.sql

microsoft/resnet-18                                          IMAGE_CONVNEXT                                                                  
microsoft/resnet-50                                          IMAGE_CONVNEXT                                                                  
AdamCodd/vit-base-nsfw-detector                              IMAGE_VIT                                                                       
Falconsai/nsfw_image_detection                               IMAGE_VIT                                                                       
WinKawaks/vit-small-patch16-224                              IMAGE_VIT                                                                       
WinKawaks/vit-tiny-patch16-224                               IMAGE_VIT                                                                       
google/vit-base-patch16-224                                  IMAGE_VIT                                                                       
nateraw/vit-age-classifier                                   IMAGE_VIT                                                                       
rizvandwiki/gender-classification                            IMAGE_VIT                                                                       
trpakov/vit-face-expression                                  IMAGE_VIT                                                                       
openai/clip-vit-large-patch14                                MULTIMODAL_CLIP [quantize=True][metrics: COSINE]                                
BAAI/bge-base-en-v1.5                                        TEXT [quantize=True][metrics: COSINE]                                           
BAAI/bge-small-en-v1.5                                       TEXT [metrics: COSINE]                                                          
Snowflake/snowflake-arctic-embed-m                           TEXT [quantize=True][metrics: COSINE]                                           
Snowflake/snowflake-arctic-embed-s                           TEXT [metrics: COSINE]                                                          
Snowflake/snowflake-arctic-embed-xs                          TEXT [metrics: COSINE]                                                          
TaylorAI/gte-tiny                                            TEXT [metrics: COSINE]                                                          
intfloat/e5-base-v2                                          TEXT [quantize=True][metrics: COSINE]                                           
intfloat/e5-small-v2                                         TEXT [metrics: COSINE]                                                          
intfloat/multilingual-e5-base                                TEXT [quantize=True][metrics: COSINE]                                           
intfloat/multilingual-e5-small                               TEXT [quantize=True][metrics: COSINE]                                           
mixedbread-ai/mxbai-embed-large-v1                           TEXT [quantize=True][metrics: COSINE]                                           
sentence-transformers/all-MiniLM-L12-v2                      TEXT [metrics: COSINE,DOT,EUCLIDEAN]                                            
sentence-transformers/all-MiniLM-L6-v2                       TEXT [metrics: COSINE,DOT,EUCLIDEAN]                                            
sentence-transformers/all-mpnet-base-v2                      TEXT [quantize=True][metrics: COSINE,DOT,EUCLIDEAN]                             
sentence-transformers/distiluse-base-multilingual-cased-v2   TEXT [quantize=True][metrics: COSINE]                                           
sentence-transformers/multi-qa-MiniLM-L6-cos-v1              TEXT [metrics: COSINE,DOT,EUCLIDEAN]                                            
sentence-transformers/paraphrase-multilingual-mpnet-base-v2  TEXT [quantize=True][metrics: COSINE]                                           
sentence-transformers/stsb-xlm-r-multilingual                TEXT [quantize=True][metrics: COSINE]                                           
taylorAI/bge-micro-v2                                        TEXT [metrics: COSINE]                                                          
thenlper/gte-base                                            TEXT [metrics: COSINE]                                                          
thenlper/gte-small                                           TEXT [metrics: COSINE]                                                          
BAAI/bge-reranker-base                                       TEXT [quantize=True][mining_function=REGRESSION][metrics: COSINE]   

Use the model_name argument to show the properties for a single model:

from oml.utils import ONNXPipelineConfig
m = 'mixedbread-ai/mxbai-embed-large-v1'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[
{'do_lower_case': True, 
'post_processors': [{'name': 'Pooling', 'type': 'mean'}, {'name': 'Normalize'}], 
'distance_metrics': ['COSINE'], 
'languages': ['us'], 
'max_seq_length': 512, 
'checksum': '91df8b84fdb1197c0e8db0782160339794930accc8f154ad80a498a7b562b435', 
'quantize_model': True, 
'model_type': 'TEXT'}
]

In addition to the preconfigured models, there are now templates that can be used to help with config settings and to convert models that are not preconfigured.

from oml.utils import ONNXPipelineConfig
ONNXPipelineConfig.show_templates()

['image_convnext', 'image_vit', 'multimodal_clip', 'text']

Exporting Augmented ONNX Models

At this point we are ready to export augmented models that can be loaded directly into the Oracle 23ai database.

Create an exports directory for the models we will be exporting, navigate to this directory and start Python.

[oracle@localhost oml4py]$ mkdir ./exports
[oracle@localhost oml4py]$ cd ./exports
[oracle@localhost exports]$ python3
Python 3.12.6 (main, Apr 20 2025, 13:54:06) [GCC 8.5.0 20210514 (Red Hat 8.5.0-26.0.1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

Create a shell script to support changing to the Python 3.12 installation for new terminal sessions. Save this script as /home/oracle/oml4py/setPython3Env.sh:

setPython3Env.sh

#! /usr/bin/bash
# to execute the commands use `source ./setPython3Env.sh`
echo "setting python3.12 environment and navigating to oml4py/exports"
cd /home/oracle/python
export PREFIX=/home/oracle/python/Python-3.12.6
cd $PREFIX
export PYTHONHOME=$PREFIX
export PATH=$PYTHONHOME/bin:$PATH
export LD_LIBRARY_PATH=$PYTHONHOME/lib:$LD_LIBRARY_PATH
cd /home/oracle/oml4py/exports
pwd

Make the shell script executable. Execute with the source command to enable Python 3.12 in a new terminal session:

[oracle@localhost oml4py]$ ls
client  exports  setPython3Env.sh
[oracle@localhost oml4py]$ chmod +x ./setPython3Env.sh
[oracle@localhost oml4py]$ source ./setPython3Env.sh
setting python3.12 environment and navigating to oml4py/exports
/home/oracle/oml4py/exports

Export Augmented Models: Syntax

Exporting an augmented text model now uses the ONNXPipeline class in OML4PY 2.1. The syntax is very simple: from oml.utils, import the ONNXPipeline and optionally ONNXPipelineConfig classes, set up the model name and any config settings, and execute the pipeline to augment the model. Note: I have added the force_download setting to each execution. Without this parameter you will not see the download progress meters or the file sizes when reconverting a model.

For all of the examples I will be exporting to a file that can then be imported to any database. The export2file arguments are the filename without the .onnx suffix and the output directory (here the current directory).

#pipeline without config settings (ONNXPipelineConfig import is not needed)
from oml.utils import ONNXPipeline
m = 'model-path/model-name'
f = 'export-file-name'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

We can change the configuration for the pipeline by importing ONNXPipelineConfig and generating the config from a template:

#export models using templates:
from oml.utils import ONNXPipeline, ONNXPipelineConfig
m = 'model-path/model-name'
f = 'export-file-name'

#config based on text template:
c = ONNXPipelineConfig.from_template("text", max_seq_length=256, distance_metrics=["COSINE"], quantize_model=True)

#config based on image template
c = ONNXPipelineConfig.from_template("image_vit", max_seq_length=512, distance_metrics=["COSINE"], quantize_model=True)

#config based on clip multimodal template
c = ONNXPipelineConfig.from_template("multimodal_clip", max_seq_length=512, distance_metrics=["COSINE"], quantize_model=True)

pipeline = ONNXPipeline(model_name=m, config=c, settings={"force_download":True})
pipeline.export2file(f, output_dir=".")

Classification and Reranking models add the appropriate MiningFunction:

from oml.utils import ONNXPipeline, ONNXPipelineConfig, MiningFunction
m = 'model-path/model-name'
f = 'export-file-name'

#classification model
p = ONNXPipeline(model_name=m, config=None, function=MiningFunction.CLASSIFICATION, settings={"force_download":True})

#reranking model
p = ONNXPipeline(model_name=m, config=None, function=MiningFunction.REGRESSION, settings={"force_download":True})

p.export2file(f, output_dir=".")

I have provided these augmented models for download if you just want to try loading them into Oracle 23.7 to work with generating vectors and vector similarity searches.

all-MiniLM-L6-v2.onnx    86.4 MB
all-MiniLM-L12-v2.onnx    127.2 MB
all-mpnet-base-v2.onnx    104.7 MB
bge-base-en-v1.5.onnx    104.7 MB
bge-micro-v2.onnx    33.2 MB
bge-reranker-base.onnx    275.5 MB
bge-small-en-v1.5.onnx    127.2 MB
clip-vit-large-patch14_img.onnx    292.2 MB
clip-vit-large-patch14_txt.onnx    120 MB
distiluse-base-multilingual-cased-v2.onnx    130.1 MB
e5-base-v2.onnx    104.7 MB
e5-small-v2.onnx    127.2 MB
gender-classification.onnx    329.8 MB
gte-base.onnx    208.3 MB
gte-small.onnx    63.9 MB
gte-tiny.onnx    43.4 MB
multi-qa-MiniLM-L6-cos-v1.onnx    86.4 MB
multilingual-e5-base.onnx  270.1 MB
multilingual-e5-small.onnx    117.4 MB
mxbai-embed-large-v1.onnx    320.3 MB
mxbai-embed-xsmall-v1.onnx    23.3 MB (Not on preconfigured list, exported with TEXT template)
nsfw_image_detection.onnx    329.8 MB
paraphrase-multilingual-mpnet-base-v2.onnx    270.1 MB
resnet-18.onnx    42.6 MB
resnet-50.onnx    89.6 MB
snowflake-arctic-embed-m.onnx    104.7 MB
snowflake-arctic-embed-s.onnx    127.2 MB
snowflake-arctic-embed-xs.onnx    86.4 MB
stsb-xlm-r-multilingual.onnx    270.1 MB
vit-age-classifier.onnx    329.8 MB
vit-base-nsfw-detector.onnx    331 MB
vit-base-patch16-224.onnx    329.8 MB
vit-face-expression.onnx    329.8 MB
vit-small-patch16-224.onnx    83.5 MB
vit-tiny-patch16-224.onnx    21.5 MB

ONNX Models: Loading To Oracle 23ai

Create an external directory for the database if you are using the VirtualBox appliance and copy the augmented models there. On my setup, this directory is called ML_MODELS_DIR. I have also placed this on an external data share with the host machine to allow me to easily load different sets of models.

Loading a model into the database is straightforward with dbms_vector.load_onnx_model:

begin
    dbms_vector.load_onnx_model(
        'ML_MODELS_DIR', 
        'stsb-xlm-r-multilingual.onnx', 
        'stsb_xlm_r_multilingual');
    dbms_output.put_line(
        'stsb-xlm-r-multilingual.onnx loaded successfully as stsb_xlm_r_multilingual'
    );
end;
/
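The same call can be driven from Python with python-oracledb. This is a sketch, assuming the dev_vector user created below and the connection details used earlier in this post:

import oracledb

conn = oracledb.connect(user="dev_vector", password="oracle",
                        dsn="192.168.56.101:1521/FREEPDB1")
with conn.cursor() as cur:
    # Same three positional arguments as the PL/SQL block above:
    # directory name, ONNX file name, and mining model name.
    cur.callproc("dbms_vector.load_onnx_model",
                 ["ML_MODELS_DIR",
                  "stsb-xlm-r-multilingual.onnx",
                  "stsb_xlm_r_multilingual"])
conn.close()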

I create a dedicated user, dev_vector, for loading models, so that I can manage the space used by LLMs in my development systems.

--create.user.dev_vector.sql

create user dev_vector identified by oracle;

grant connect, db_developer_role to dev_vector;
grant create mining model to dev_vector;

alter user dev_vector quota unlimited on users;

--Oracle 23ai VirtualBox Linux Appliance with external data path
create or replace directory 
    ml_models_dir as '/home/oracle/ext-data/ora-db-directories/shared/ml-models';

grant read on directory ml_models_dir to dev_vector;
grant write on directory ml_models_dir to dev_vector;

In the source code for this article I put individual scripts for loading each model into the load-models directory.

There are also a few scripts to manage loading all of the models to compare their behavior. If you load all of these models at once, the VirtualBox appliance begins to run out of tablespace. I separated the text and image models so that each group can be loaded together and compared.

drop_loaded_models.sql           --Drops all models currently loaded to the database
load-all-text-onnx-models.sql    --Loads all text embedding models
load-all-image-onnx-models.sql   --Loads all image and clip embedding models
list-loaded-models.sql           --Show all models currently loaded

Testing the text embedding models

To test out the text embedding models, create the test table recipes and load some test records.

create-table-recipes.sql

CREATE TABLE if not exists recipes (
    id NUMBER generated always as identity primary key
    , name VARCHAR2(100) not null unique
    , doc VARCHAR2(4000)
    , embedding VECTOR(*,*)
    , embedding_model varchar2(50)
)
/

--the rest of the script loads some test data

After setting up the recipes table, load all of the text embedding models and test some similarity searches with each model.

compare-model-searches.sql

--For each loaded model: 
--    Dynamically update embedding vectors in the recipes table using the model

            update recipes g
            set 
                embedding = vector_embedding(##model_name## using g.doc as data), 
                embedding_model = '##model_name##'    

--    Perform a similarity search with the model to find recipes most similar to 'yummy dessert'

            select rownum as ranking, name, doc
            from
                (
                select name, doc
                from recipes g
                order by 
                    vector_distance(
                        g.embedding
                        , vector_embedding(##model_name## using :search_expression as data)
                        , cosine)
                fetch first 3 rows only
                )
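For reference, here is the same comparison loop sketched in Python with python-oracledb, assuming the dev_vector connection and that only the text embedding models are loaded. The model name is spliced into the statement because vector_embedding takes the model as an identifier, not a bind variable:

import oracledb

conn = oracledb.connect(user="dev_vector", password="oracle",
                        dsn="192.168.56.101:1521/FREEPDB1")
search = "yummy dessert"
with conn.cursor() as cur:
    cur.execute("select model_name from user_mining_models")
    models = [row[0] for row in cur.fetchall()]
    for m in models:
        # Recompute the embeddings with this model...
        cur.execute(f"update recipes g set "
                    f"embedding = vector_embedding({m} using g.doc as data), "
                    f"embedding_model = '{m}'")
        # ...then run the top-3 similarity search.
        cur.execute(f"""select rownum as ranking, name from (
                            select name from recipes g
                            order by vector_distance(
                                g.embedding,
                                vector_embedding({m} using :expr as data),
                                cosine)
                            fetch first 3 rows only)""",
                    expr=search)
        print("-" * 50)
        print(f"Mining Model: {m}")
        for ranking, name in cur.fetchall():
            print(f"{ranking}-{name}")
conn.close()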

The results of these searches show that the imported models are working correctly.

--------------------------------------------------
Mining Model: ALL_MINILM_L12_V2
Search Expression: yummy dessert
1-Strawberry Pie
2-Chocolate Cake
3-Banana, Mango and Blueberry Smoothie
--------------------------------------------------
Mining Model: ALL_MINILM_L6_V2
Search Expression: yummy dessert
1-Strawberry Pie
2-Raspberry Tarts
3-Chocolate Cake
--------------------------------------------------
Mining Model: ALL_MPNET_BASE_V2
Search Expression: yummy dessert
1-Chocolate Cake
2-Strawberry Pie
3-Banana, Mango and Blueberry Smoothie
--------------------------------------------------
Mining Model: BGE_BASE_EN_V1_5
Search Expression: yummy dessert
1-Chocolate Cake
2-Strawberry Pie
3-Raspberry Tarts
--------------------------------------------------
Mining Model: BGE_MICRO_V2
Search Expression: yummy dessert
1-Chocolate Cake
2-Strawberry Pie
3-Raspberry Tarts
--------------------------------------------------
Mining Model: BGE_SMALL_EN_V1_5
Search Expression: yummy dessert
1-Strawberry Pie
2-Banana, Mango and Blueberry Smoothie
3-Grilled Cheese Sandwiches
--------------------------------------------------
Mining Model: DISTILUSE_BASE_MULTILINGUAL_CASED_V2
Search Expression: yummy dessert
1-Chocolate Cake
2-Strawberry Pie
3-Banana, Mango and Blueberry Smoothie
--------------------------------------------------
Mining Model: E5_BASE_V2
Search Expression: yummy dessert
1-Chocolate Cake
2-Strawberry Pie
3-Banana, Mango and Blueberry Smoothie
--------------------------------------------------
Mining Model: E5_SMALL_V2
Search Expression: yummy dessert
1-Chocolate Cake
2-Strawberry Pie
3-Banana, Mango and Blueberry Smoothie
--------------------------------------------------
Mining Model: GTE_BASE
Search Expression: yummy dessert
1-Chocolate Cake
2-Strawberry Pie
3-Oatmeal Cookies
--------------------------------------------------
Mining Model: GTE_SMALL
Search Expression: yummy dessert
1-Chocolate Cake
2-Oatmeal Cookies
3-Strawberry Pie
--------------------------------------------------
Mining Model: GTE_TINY
Search Expression: yummy dessert
1-Chocolate Cake
2-Strawberry Pie
3-Oatmeal Cookies
--------------------------------------------------
Mining Model: MULTILINGUAL_E5_BASE
Search Expression: yummy dessert
1-Chocolate Cake
2-Strawberry Pie
3-Banana, Mango and Blueberry Smoothie
--------------------------------------------------
Mining Model: MULTILINGUAL_E5_SMALL
Search Expression: yummy dessert
1-Chocolate Cake
2-Banana, Mango and Blueberry Smoothie
3-Grilled Cheese Sandwiches
--------------------------------------------------
Mining Model: MULTI_QA_MINILM_L6_COS_V1
Search Expression: yummy dessert
1-Strawberry Pie
2-Chocolate Cake
3-Banana, Mango and Blueberry Smoothie
--------------------------------------------------
Mining Model: MXBAI_EMBED_LARGE_V1
Search Expression: yummy dessert
1-Chocolate Cake
2-Strawberry Pie
3-Oatmeal Cookies
--------------------------------------------------
Mining Model: MXBAI_EMBED_XSMALL_V1
Search Expression: yummy dessert
1-Strawberry Pie
2-Raspberry Tarts
3-Chocolate Cake
--------------------------------------------------
Mining Model: PARAPHRASE_MULTILINGUAL_MPNET_BASE_V2
Search Expression: yummy dessert
1-Chocolate Cake
2-Strawberry Pie
3-Raspberry Tarts
--------------------------------------------------
Mining Model: SNOWFLAKE_ARCTIC_EMBED_M
Search Expression: yummy dessert
1-Chocolate Cake
2-Banana, Mango and Blueberry Smoothie
3-Curried Tofu
--------------------------------------------------
Mining Model: SNOWFLAKE_ARCTIC_EMBED_S
Search Expression: yummy dessert
1-Chocolate Cake
2-Strawberry Pie
3-Banana, Mango and Blueberry Smoothie
--------------------------------------------------
Mining Model: SNOWFLAKE_ARCTIC_EMBED_XS
Search Expression: yummy dessert
1-Strawberry Pie
2-Curried Tofu
3-Chocolate Cake
--------------------------------------------------
Mining Model: STSB_XLM_R_MULTILINGUAL
Search Expression: yummy dessert
1-Chocolate Cake
2-Raspberry Tarts
3-Strawberry Pie

For reference, I have included the export sessions for each model as an afterword showing each export and the preconfigured model properties.

Even though I was able to successfully export all of the preconfigured models, and another model from Mixedbread using a template, all of the exports produce float32 vectors, even with the quantize_model property set to true.
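One way to confirm the vector format is 23ai's vector_dimension_format SQL function; a sketch, again assuming the dev_vector connection:

import oracledb

conn = oracledb.connect(user="dev_vector", password="oracle",
                        dsn="192.168.56.101:1521/FREEPDB1")
with conn.cursor() as cur:
    # Report the storage format and dimension count of a stored vector;
    # every export in this post came back as FLOAT32 here.
    cur.execute("""select embedding_model,
                          vector_dimension_format(embedding),
                          vector_dimension_count(embedding)
                   from recipes
                   fetch first 1 rows only""")
    print(cur.fetchone())
conn.close()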

My main goal was to see what it took to get all of these models augmented and loaded to Oracle 23ai. I will write a follow-up when I get one of the models from Mixedbread to produce binary vectors, or other quantized models to produce int8 vectors.

–Anthony Harper

Afterword: Successful ONNX Export Details

For each model, I will show the Python script for exporting the augmented model ready to load into the Oracle 23ai database, and the results of executing the script. The settings for each preconfigured model are also shown.

A quick search for each preconfigured model path/name returns a link at Hugging Face with more details about the model. I have included these links if you want to learn more about each model.

sentence-transformers/all-MiniLM-L6-v2

from oml.utils import ONNXPipeline
m = 'sentence-transformers/all-MiniLM-L6-v2'
f = 'all-MiniLM-L6-v2'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|██████████████████████| 350/350 [00:00<00:00, 3.79MB/s]
tokenizer_config.json: 100%|██████████████████████| 350/350 [00:00<00:00, 4.21MB/s]
vocab.txt: 100%|██████████████████████████████████| 232k/232k [00:00<00:00, 12.1MB/s]
special_tokens_map.json: 100%|████████████████████| 112/112 [00:00<00:00, 1.70MB/s]
tokenizer_config.json: 100%|██████████████████████| 350/350 [00:00<00:00, 5.12MB/s]
tokenizer.json: 100%|█████████████████████████████| 466k/466k [00:00<00:00, 26.3MB/s]
config.json: 100%|████████████████████████████████| 612/612 [00:00<00:00, 3.65MB/s]
config.json: 100%|████████████████████████████████| 612/612 [00:00<00:00, 4.47MB/s]
model.safetensors: 100%|██████████████████████████| 90.9M/90.9M [00:00<00:00, 225MB/s]

from oml.utils import ONNXPipelineConfig
m = 'sentence-transformers/all-MiniLM-L6-v2'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': True, 
'post_processors': [{'name': 'Pooling', 'type': 'mean'}, {'name': 'Normalize'}], 
'distance_metrics': ['COSINE', 'DOT', 'EUCLIDEAN'], 
'languages': ['us'], 
'max_seq_length': 256, 
'checksum': 'f931f6dc592102f9e7693fe763fe35ce539e51da11e8055d50cf0ee88f5f4ce0', 'model_type': 'TEXT'
}]

sentence-transformers/all-MiniLM-L12-v2

from oml.utils import ONNXPipeline
m = 'sentence-transformers/all-MiniLM-L12-v2'
f = 'all-MiniLM-L12-v2'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|██████████████████████| 352/352 [00:00<00:00, 1.49MB/s]
tokenizer_config.json: 100%|██████████████████████| 352/352 [00:00<00:00, 1.15MB/s]
vocab.txt: 100%|██████████████████████████████████| 232k/232k [00:00<00:00, 10.3MB/s]
special_tokens_map.json: 100%|████████████████████| 112/112 [00:00<00:00, 1.56MB/s]
tokenizer_config.json: 100%|██████████████████████| 352/352 [00:00<00:00, 4.73MB/s]
tokenizer.json: 100%|█████████████████████████████| 466k/466k [00:00<00:00, 36.5MB/s]
config.json: 100%|████████████████████████████████| 615/615 [00:00<00:00, 7.23MB/s]
config.json: 100%|████████████████████████████████| 615/615 [00:00<00:00, 8.60MB/s]
model.safetensors: 100%|██████████████████████████| 133M/133M [00:00<00:00, 232MB/s]

from oml.utils import ONNXPipelineConfig
m = 'sentence-transformers/all-MiniLM-L12-v2'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': True, 
'post_processors': [{'name': 'Pooling', 'type': 'mean'}, {'name': 'Normalize'}], 
'distance_metrics': ['COSINE', 'DOT', 'EUCLIDEAN'], 
'languages': ['us'], 
'max_seq_length': 256, 
'checksum': '6c4dea8e58882ee6827c5855b28db372fda45a12b2bb31d995b07582b1bee9e0', 
'model_type': 'TEXT'
}]

sentence-transformers/all-mpnet-base-v2

from oml.utils import ONNXPipeline
m = 'sentence-transformers/all-mpnet-base-v2'
f = 'all-mpnet-base-v2'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|███████████████████████| 363/363 [00:00<00:00, 4.91MB/s]
tokenizer_config.json: 100%|███████████████████████| 363/363 [00:00<00:00, 1.54MB/s]
vocab.txt: 100%|███████████████████████████████████| 232k/232k [00:00<00:00, 10.5MB/s]
special_tokens_map.json: 100%|█████████████████████| 239/239 [00:00<00:00, 485kB/s]
tokenizer_config.json: 100%|███████████████████████| 363/363 [00:00<00:00, 2.58MB/s]
tokenizer.json: 100%|██████████████████████████████| 466k/466k [00:00<00:00, 26.6MB/s]
config.json: 100%|█████████████████████████████████| 571/571 [00:00<00:00, 8.74MB/s]
config.json: 100%|█████████████████████████████████| 571/571 [00:00<00:00, 3.04MB/s]
model.safetensors: 100%|███████████████████████████| 438M/438M [00:00<00:00, 1.04GB/s]
UserWarning:Batch inference not supported in quantized models. Setting batch size to 1.

from oml.utils import ONNXPipelineConfig
m = 'sentence-transformers/all-mpnet-base-v2'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': True, 
'post_processors': [{'name': 'Pooling', 'type': 'mean'}, {'name': 'Normalize'}], 
'distance_metrics': ['COSINE', 'DOT', 'EUCLIDEAN'], 
'languages': ['us'], 
'max_seq_length': 384, 
'checksum': 'be647852cec0a7658375495b76962bf9ec9412e279901c86695290f8a48b9c36', 
'quantize_model': True, 
'model_type': 'TEXT'
}]

sentence-transformers/multi-qa-MiniLM-L6-cos-v1

from oml.utils import ONNXPipeline
m = 'sentence-transformers/multi-qa-MiniLM-L6-cos-v1'
f = 'multi-qa-MiniLM-L6-cos-v1'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|███████████████████████| 383/383 [00:00<00:00, 4.41MB/s]
tokenizer_config.json: 100%|███████████████████████| 383/383 [00:00<00:00, 4.05MB/s]
vocab.txt: 100%|███████████████████████████████████| 232k/232k [00:00<00:00, 15.7MB/s]
special_tokens_map.json: 100%|█████████████████████| 112/112 [00:00<00:00, 537kB/s]
tokenizer_config.json: 100%|███████████████████████| 383/383 [00:00<00:00, 4.01MB/s]
tokenizer.json: 100%|██████████████████████████████| 466k/466k [00:00<00:00, 32.3MB/s]
config.json: 100%|█████████████████████████████████| 612/612 [00:00<00:00, 8.85MB/s]
config.json: 100%|█████████████████████████████████| 612/612 [00:00<00:00, 6.20MB/s]
model.safetensors: 100%|███████████████████████████| 90.9M/90.9M [00:00<00:00, 189MB/s]

from oml.utils import ONNXPipelineConfig
m = 'sentence-transformers/multi-qa-MiniLM-L6-cos-v1'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': True, 
'post_processors': [{'name': 'Pooling', 'type': 'mean'}, {'name': 'Normalize'}], 
'distance_metrics': ['COSINE', 'DOT', 'EUCLIDEAN'], 
'languages': ['us', 'd', 'f', 'esm'], 
'max_seq_length': 512, 
'checksum': '1d657f78c41356a0f6d2bc5d069f8a9ae5a036bc3eccadeafcc235008a3f669b', 
'model_type': 'TEXT'
}]

sentence-transformers/distiluse-base-multilingual-cased-v2

from oml.utils import ONNXPipeline
m = 'sentence-transformers/distiluse-base-multilingual-cased-v2'
f = 'distiluse-base-multilingual-cased-v2'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|██████████████████████| 531/531 [00:00<00:00, 2.22MB/s]
config.json: 100%|████████████████████████████████| 610/610 [00:00<00:00, 757kB/s]
tokenizer_config.json: 100%|██████████████████████| 531/531 [00:00<00:00, 7.93MB/s]
vocab.txt: 100%|██████████████████████████████████| 996k/996k [00:00<00:00, 21.2MB/s]
special_tokens_map.json: 100%|████████████████████| 112/112 [00:00<00:00, 61.4kB/s]
tokenizer_config.json: 100%|██████████████████████| 531/531 [00:00<00:00, 7.33MB/s]
tokenizer.json: 100%|█████████████████████████████| 1.96M/1.96M [00:00<00:00, 5.79MB/s]
config.json: 100%|████████████████████████████████| 610/610 [00:00<00:00, 2.29MB/s]
config.json: 100%|████████████████████████████████| 610/610 [00:00<00:00, 3.09MB/s]
model.safetensors: 100%|██████████████████████████| 539M/539M [00:01<00:00, 514MB/s]
UserWarning:Batch inference not supported in quantized models. Setting batch size to 1.

from oml.utils import ONNXPipelineConfig
m = 'sentence-transformers/distiluse-base-multilingual-cased-v2'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': False, 
'post_processors': [
     {'name': 'Pooling', 'type': 'mean'}, 
     {'name': 'Dense', 'in_features': 768, 'out_features': 512, 'bias': True, 'activation_function': 'Tanh'}], 
'distance_metrics': ['COSINE'], 
'languages': [
     'ar', 'bg', 'ca', 'cs', 'dk', 'd', 'us', 'el', 'et', 'fa', 'sf', 'f', 'frc', 'gu', 'iw', 
     'hi', 'hr', 'hu', 'hy', 'in', 'i', 'ja', 'ko', 'lt', 'lv', 'mk', 'mr', 'ms', 'n', 'nl', 
     'pl', 'pt', 'ptb', 'ro', 'ru', 'sk', 'sl', 'sq', 'lsr', 's', 'th', 'tr', 'uk', 'ur', 
     'vn', 'zhs', 'zht'], 
'max_seq_length': 128, 
'checksum': 'eeaf6f21f79c42a9b38d56c5dc440440efd4c1afac0b7c37fca0fba114af373d', 
'quantize_model': True, 
'model_type': 'TEXT'
}]

BAAI/bge-small-en-v1.5

from oml.utils import ONNXPipeline
m = 'BAAI/bge-small-en-v1.5'
f = 'bge-small-en-v1.5'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|███████████████████████| 366/366 [00:00<00:00, 1.28MB/s]
tokenizer_config.json: 100%|███████████████████████| 366/366 [00:00<00:00, 4.95MB/s]
vocab.txt: 100%|███████████████████████████████████| 232k/232k [00:00<00:00, 11.2MB/s]
special_tokens_map.json: 100%|█████████████████████| 125/125 [00:00<00:00, 1.67MB/s]
tokenizer_config.json: 100%|███████████████████████| 366/366 [00:00<00:00, 5.08MB/s]
tokenizer.json: 100%|██████████████████████████████| 711k/711k [00:00<00:00, 4.82MB/s]
config.json: 100%|█████████████████████████████████| 743/743 [00:00<00:00, 1.28MB/s]
config.json: 100%|█████████████████████████████████| 743/743 [00:00<00:00, 10.6MB/s]
model.safetensors: 100%|███████████████████████████| 133M/133M [00:01<00:00, 67.4MB/s]

from oml.utils import ONNXPipelineConfig
m = 'BAAI/bge-small-en-v1.5'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': True, 
'post_processors': [
     {'name': 'Pooling', 'type': 'mean'}, 
     {'name': 'Normalize'}], 'distance_metrics': ['COSINE'], 
'languages': ['us'], 
'max_seq_length': 512, 
'checksum': 'e5dd8407c6d42b88b356e6497ce5ffd5455d625a9afd564ea1cba8f7c3e27175', 
'model_type': 'TEXT'
}]

BAAI/bge-base-en-v1.5

from oml.utils import ONNXPipeline
m = 'BAAI/bge-base-en-v1.5'
f = 'bge-base-en-v1.5'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|████████████████████| 366/366 [00:00<00:00, 1.72MB/s]
tokenizer_config.json: 100%|████████████████████| 366/366 [00:00<00:00, 3.99MB/s]
vocab.txt: 100%|████████████████████████████████| 232k/232k [00:00<00:00, 12.7MB/s]
special_tokens_map.json: 100%|██████████████████| 125/125 [00:00<00:00, 1.52MB/s]
tokenizer_config.json: 100%|████████████████████| 366/366 [00:00<00:00, 4.67MB/s]
tokenizer.json: 100%|███████████████████████████| 711k/711k [00:00<00:00, 31.5MB/s]
config.json: 100%|██████████████████████████████| 777/777 [00:00<00:00, 10.5MB/s]
config.json: 100%|██████████████████████████████| 777/777 [00:00<00:00, 10.4MB/s]
model.safetensors: 100%|████████████████████████| 438M/438M [00:03<00:00, 111MB/s]
UserWarning:Batch inference not supported in quantized models. Setting batch size to 1.

from oml.utils import ONNXPipelineConfig
m = 'BAAI/bge-base-en-v1.5'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': True, 
'post_processors': [{'name': 'Pooling', 'type': 'mean'}, {'name': 'Normalize'}], 
'distance_metrics': ['COSINE'], 
'languages': ['us'], 
'max_seq_length': 512, 
'checksum': '393be8692afc7e5803cc83175daf08d3abd195977e9137d74d5ad7a590cb0b92', 
'quantize_model': True, 
'model_type': 'TEXT'
}]

taylorAI/bge-micro-v2

from oml.utils import ONNXPipeline
m = 'taylorAI/bge-micro-v2'
f = 'bge-micro-v2'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|████████████████████████| 1.56k/1.56k [00:00<00:00, 18.7MB/s]
tokenizer_config.json: 100%|████████████████████████| 1.56k/1.56k [00:00<00:00, 9.15MB/s]
vocab.txt: 100%|████████████████████████████████████| 232k/232k [00:00<00:00, 10.7MB/s]
added_tokens.json: 100%|████████████████████████████| 82.0/82.0 [00:00<00:00, 1.15MB/s]
special_tokens_map.json: 100%|██████████████████████| 228/228 [00:00<00:00, 3.44MB/s]
tokenizer_config.json: 100%|████████████████████████| 1.56k/1.56k [00:00<00:00, 20.6MB/s]
tokenizer.json: 100%|███████████████████████████████| 712k/712k [00:00<00:00, 10.4MB/s]
config.json: 100%|██████████████████████████████████| 745/745 [00:00<00:00, 9.70MB/s]
config.json: 100%|██████████████████████████████████| 745/745 [00:00<00:00, 238kB/s]
model.safetensors: 100%|████████████████████████████| 34.8M/34.8M [00:00<00:00, 85.6MB/s]

from oml.utils import ONNXPipelineConfig
m = 'taylorAI/bge-micro-v2'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': True, 
'post_processors': [{'name': 'Pooling', 'type': 'mean'}, {'name': 'Normalize'}], 
'distance_metrics': ['COSINE'], 
'languages': ['us'], 
'max_seq_length': 512, 
'use_float16': True, 
'checksum': 'd82dcd72c469903355db29278b9ad17d26a4df28afc8a39ecc0607eef198c677', 
'model_type': 'TEXT'
}]

TaylorAI/gte-tiny

from oml.utils import ONNXPipeline
m = 'TaylorAI/gte-tiny'
f = 'gte-tiny'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|█████████████████████| 1.54k/1.54k [00:00<00:00, 16.5MB/s]
tokenizer_config.json: 100%|█████████████████████| 1.54k/1.54k [00:00<00:00, 18.9MB/s]
vocab.txt: 100%|█████████████████████████████████| 232k/232k [00:00<00:00, 10.6MB/s]
added_tokens.json: 100%|█████████████████████████| 82.0/82.0 [00:00<00:00, 529kB/s]
special_tokens_map.json: 100%|███████████████████| 228/228 [00:00<00:00, 1.80MB/s]
tokenizer_config.json: 100%|█████████████████████| 1.54k/1.54k [00:00<00:00, 20.2MB/s]
tokenizer.json: 100%|████████████████████████████| 712k/712k [00:00<00:00, 38.6MB/s]
config.json: 100%|███████████████████████████████| 669/669 [00:00<00:00, 1.53MB/s]
config.json: 100%|███████████████████████████████| 669/669 [00:00<00:00, 4.73MB/s]
model.safetensors: 100%|█████████████████████████| 45.5M/45.5M [00:00<00:00, 67.7MB/s]

from oml.utils import ONNXPipelineConfig
m = 'TaylorAI/gte-tiny'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': True, 
'post_processors': [{'name': 'Pooling', 'type': 'mean'}, {'name': 'Normalize'}], 
'distance_metrics': ['COSINE'], 
'languages': ['us'], 
'max_seq_length': 512, 
'use_float16': True, 
'checksum': '8e476c879c79f982db0f65d429b7db7c8edd2f5097a5ba18a16c505f5744c28e', 
'model_type': 'TEXT'
}]

thenlper/gte-base

from oml.utils import ONNXPipeline
m = 'thenlper/gte-base'
f = 'gte-base'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|██████████████████████| 314/314 [00:00<00:00, 1.07MB/s]
tokenizer_config.json: 100%|██████████████████████| 314/314 [00:00<00:00, 2.08MB/s]
vocab.txt: 100%|██████████████████████████████████| 232k/232k [00:00<00:00, 10.8MB/s]
special_tokens_map.json: 100%|████████████████████| 125/125 [00:00<00:00, 609kB/s]
tokenizer_config.json: 100%|██████████████████████| 314/314 [00:00<00:00, 4.51MB/s]
tokenizer.json: 100%|█████████████████████████████| 712k/712k [00:00<00:00, 37.9MB/s]
config.json: 100%|████████████████████████████████| 618/618 [00:00<00:00, 7.14MB/s]
config.json: 100%|████████████████████████████████| 618/618 [00:00<00:00, 8.67MB/s]
model.safetensors: 100%|██████████████████████████| 219M/219M [00:01<00:00, 113MB/s]

from oml.utils import ONNXPipelineConfig
m = 'thenlper/gte-base'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': True, 
'post_processors': [{'name': 'Pooling', 'type': 'mean'}, {'name': 'Normalize'}], 
'distance_metrics': ['COSINE'], 
'languages': ['us'], 
'max_seq_length': 512, 
'use_float16': True, 
'checksum': '7d786e661409a48ae396c0b04b1f6a0a40b3cd390779327a49da24702aac5778', 
'model_type': 'TEXT'
}]

thenlper/gte-small

from oml.utils import ONNXPipeline
m = 'thenlper/gte-small'
f = 'gte-small'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|██████████████████████| 394/394 [00:00<00:00, 5.49MB/s]
tokenizer_config.json: 100%|██████████████████████| 394/394 [00:00<00:00, 2.97MB/s]
vocab.txt: 100%|██████████████████████████████████| 232k/232k [00:00<00:00, 8.78MB/s]
special_tokens_map.json: 100%|████████████████████| 125/125 [00:00<00:00, 1.86MB/s]
tokenizer_config.json: 100%|██████████████████████| 394/394 [00:00<00:00, 5.58MB/s]
tokenizer.json: 100%|█████████████████████████████| 712k/712k [00:00<00:00, 37.2MB/s]
config.json: 100%|████████████████████████████████| 583/583 [00:00<00:00, 1.47MB/s]
config.json: 100%|████████████████████████████████| 583/583 [00:00<00:00, 5.53MB/s]
model.safetensors: 100%|██████████████████████████| 66.7M/66.7M [00:00<00:00, 102MB/s]

from oml.utils import ONNXPipelineConfig
m = 'thenlper/gte-small'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': True, 
'post_processors': [{'name': 'Pooling', 'type': 'mean'}, {'name': 'Normalize'}], 
'distance_metrics': ['COSINE'], 
'languages': ['us'], 
'max_seq_length': 512, 
'use_float16': True, 
'checksum': 'c6a341533f7b45b5613520a289999628cb94337d2d48975e51a9c2cf984ecf71', 
'model_type': 'TEXT'
}]

intfloat/e5-small-v2

from oml.utils import ONNXPipeline
m = 'intfloat/e5-small-v2'
f = 'e5-small-v2'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|██████████████████████| 314/314 [00:00<00:00, 2.51MB/s]
tokenizer_config.json: 100%|██████████████████████| 314/314 [00:00<00:00, 1.90MB/s]
vocab.txt: 100%|██████████████████████████████████| 232k/232k [00:00<00:00, 10.6MB/s]
special_tokens_map.json: 100%|████████████████████| 125/125 [00:00<00:00, 2.10MB/s]
tokenizer_config.json: 100%|██████████████████████| 314/314 [00:00<00:00, 1.49MB/s]
tokenizer.json: 100%|█████████████████████████████| 711k/711k [00:00<00:00, 38.5MB/s]
config.json: 100%|████████████████████████████████| 615/615 [00:00<00:00, 7.03MB/s]
config.json: 100%|████████████████████████████████| 615/615 [00:00<00:00, 8.16MB/s]
model.safetensors: 100%|██████████████████████████| 133M/133M [00:01<00:00, 97.9MB/s]

from oml.utils import ONNXPipelineConfig
m = 'intfloat/e5-small-v2'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': True, 
'post_processors': [{'name': 'Pooling', 'type': 'mean'}, {'name': 'Normalize'}], 
'distance_metrics': ['COSINE'], 
'languages': ['us'], 
'max_seq_length': 512, 
'checksum': '6e959c0d6f7559f98d832ea9d75afc03569aa89c7b967830d25011f699604648', 
'model_type': 'TEXT'
}]

intfloat/e5-base-v2

from oml.utils import ONNXPipeline
m = 'intfloat/e5-base-v2'
f = 'e5-base-v2'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")


tokenizer_config.json: 100%|███████████████| 314/314 [00:00<00:00, 1.01MB/s]
tokenizer_config.json: 100%|███████████████| 314/314 [00:00<00:00, 1.47MB/s]
vocab.txt: 100%|███████████████████████████| 232k/232k [00:00<00:00, 12.8MB/s]
special_tokens_map.json: 100%|█████████████| 125/125 [00:00<00:00, 1.70MB/s]
tokenizer_config.json: 100%|███████████████| 314/314 [00:00<00:00, 3.97MB/s]
tokenizer.json: 100%|██████████████████████| 711k/711k [00:00<00:00, 40.6MB/s]
config.json: 100%|█████████████████████████| 650/650 [00:00<00:00, 3.33MB/s]
config.json: 100%|█████████████████████████| 650/650 [00:00<00:00, 3.96MB/s]
model.safetensors: 100%|███████████████████| 438M/438M [00:04<00:00, 105MB/s]

from oml.utils import ONNXPipelineConfig
m = 'intfloat/e5-base-v2'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': True, 
'post_processors': [{'name': 'Pooling', 'type': 'mean'}, {'name': 'Normalize'}], 
'distance_metrics': ['COSINE'], 
'languages': ['us'], 
'max_seq_length': 512, 
'checksum': '86d1ce3d71ee3e9da1c2177858d590c7bc360f385248144b89c8d1441c227fc6', 
'quantize_model': True, 
'model_type': 'TEXT'
}]

intfloat/multilingual-e5-small

from oml.utils import ONNXPipeline
m = 'intfloat/multilingual-e5-small'
f = 'multilingual-e5-small'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|██████████████████████| 443/443 [00:00<00:00, 4.85MB/s]
tokenizer_config.json: 100%|██████████████████████| 443/443 [00:00<00:00, 5.77MB/s]
sentencepiece.bpe.model: 100%|████████████████████| 5.07M/5.07M [00:00<00:00, 56.0MB/s]
special_tokens_map.json: 100%|████████████████████| 167/167 [00:00<00:00, 738kB/s]
tokenizer_config.json: 100%|██████████████████████| 443/443 [00:00<00:00, 2.33MB/s]
tokenizer.json: 100%|█████████████████████████████| 17.1M/17.1M [00:00<00:00, 98.7MB/s]
config.json: 100%|████████████████████████████████| 655/655 [00:00<00:00, 1.97MB/s]
config.json: 100%|████████████████████████████████| 655/655 [00:00<00:00, 7.80MB/s]
model.safetensors: 100%|██████████████████████████| 471M/471M [00:04<00:00, 114MB/s]

from oml.utils import ONNXPipelineConfig
m = 'intfloat/multilingual-e5-small'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': True, 
'post_processors': [{'name': 'Pooling', 'type': 'mean'}, {'name': 'Normalize'}], 
'distance_metrics': ['COSINE'], 
'languages': ['us', 'am', 'ar', 'as', 'az', 'be', 'bg', 'ca', 'cs', 'dk', 'd', 'el', 'gb', 'e', 'et', 'eu', 'fa', 'sf', 'f', 'ga', 'gu', 'iw', 'hi', 'hr', 'hu', 'hy', 'in', 'is', 'i', 'ja', 'km', 'kn', 'ko', 'lo', 'lt', 'lv', 'mk', 'ml', 'mr', 'ms', 'ne', 'nl', 'n', 'or', 'pl', 'pt', 'ro', 'ru', 'si', 'sk', 'sl', 'sq', 's', 'sw', 'ta', 'te', 'th', 'tr', 'uk', 'ur', 'vn'], 
'max_seq_length': 512, 
'checksum': 'bb40d156841decd031cc816d03791e7a0dfc4e4eedc93ca5013b46dad3f892db', 
'quantize_model': True, 
'model_type': 'TEXT'
}]

Snowflake/snowflake-arctic-embed-xs

from oml.utils import ONNXPipeline
m = 'Snowflake/snowflake-arctic-embed-xs'
f = 'snowflake-arctic-embed-xs'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|█████████████████████| 1.43k/1.43k [00:00<00:00, 7.52MB/s]
tokenizer_config.json: 100%|█████████████████████| 1.43k/1.43k [00:00<00:00, 7.29MB/s]
vocab.txt: 100%|█████████████████████████████████| 232k/232k [00:00<00:00, 8.55MB/s]
special_tokens_map.json: 100%|███████████████████| 695/695 [00:00<00:00, 2.65MB/s]
tokenizer_config.json: 100%|█████████████████████| 1.43k/1.43k [00:00<00:00, 9.56MB/s]
tokenizer.json: 100%|████████████████████████████| 712k/712k [00:00<00:00, 6.59MB/s]
config.json: 100%|███████████████████████████████| 737/737 [00:00<00:00, 2.96MB/s]
config.json: 100%|███████████████████████████████| 737/737 [00:00<00:00, 9.60MB/s]
model.safetensors: 100%|█████████████████████████| 90.3M/90.3M [00:01<00:00, 53.1MB/s]

from oml.utils import ONNXPipelineConfig
m = 'Snowflake/snowflake-arctic-embed-xs'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': True, 
'post_processors': [{'name': 'Pooling', 'type': 'mean'}, {'name': 'Normalize'}], 
'distance_metrics': ['COSINE'], 
'languages': ['us'], 
'max_seq_length': 512, 
'checksum': 'dc1cf555778eaf2bc8c8cbd52929e58e930a50539d96bd2aef05c4667c9afef1', 
'model_type': 'TEXT'
}]

Snowflake/snowflake-arctic-embed-s

from oml.utils import ONNXPipeline
m = 'Snowflake/snowflake-arctic-embed-s'
f = 'snowflake-arctic-embed-s'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|█████████████████| 1.43k/1.43k [00:00<00:00, 17.1MB/s]
tokenizer_config.json: 100%|█████████████████| 1.43k/1.43k [00:00<00:00, 18.8MB/s]
vocab.txt: 100%|█████████████████████████████| 232k/232k [00:00<00:00, 10.9MB/s]
special_tokens_map.json: 100%|███████████████| 695/695 [00:00<00:00, 7.92MB/s]
tokenizer_config.json: 100%|█████████████████| 1.43k/1.43k [00:00<00:00, 17.5MB/s]
tokenizer.json: 100%|████████████████████████| 712k/712k [00:00<00:00, 35.0MB/s]
config.json: 100%|███████████████████████████| 703/703 [00:00<00:00, 8.06MB/s]
config.json: 100%|███████████████████████████| 703/703 [00:00<00:00, 1.91MB/s]
model.safetensors: 100%|█████████████████████| 133M/133M [00:02<00:00, 64.3MB/s]

from oml.utils import ONNXPipelineConfig
m = 'Snowflake/snowflake-arctic-embed-s'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': True, 
'post_processors': [{'name': 'Pooling', 'type': 'mean'}, {'name': 'Normalize'}], 
'distance_metrics': ['COSINE'], 
'languages': ['us'], 
'max_seq_length': 512, 
'checksum': 'a4e1c3e0397361c6add42a542052e40e32f679a44d33829659d31592e078a3f1', 
'model_type': 'TEXT'
}]

Snowflake/snowflake-arctic-embed-m

from oml.utils import ONNXPipeline
m = 'Snowflake/snowflake-arctic-embed-m'
f = 'snowflake-arctic-embed-m'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|█████████████| 1.38k/1.38k [00:00<00:00, 2.70MB/s]
tokenizer_config.json: 100%|█████████████| 1.38k/1.38k [00:00<00:00, 8.57MB/s]
vocab.txt: 100%|█████████████████████████| 232k/232k [00:00<00:00, 9.87MB/s]
special_tokens_map.json: 100%|███████████| 695/695 [00:00<00:00, 2.51MB/s]
tokenizer_config.json: 100%|█████████████| 1.38k/1.38k [00:00<00:00, 16.6MB/s]
tokenizer.json: 100%|████████████████████| 712k/712k [00:00<00:00, 35.1MB/s]
config.json: 100%|███████████████████████| 738/738 [00:00<00:00, 5.31MB/s]
config.json: 100%|███████████████████████| 738/738 [00:00<00:00, 9.21MB/s]
model.safetensors: 100%|█████████████████| 436M/436M [00:05<00:00, 82.3MB/s]
UserWarning:Batch inference not supported in quantized models. Setting batch size to 1.

from oml.utils import ONNXPipelineConfig
m = 'Snowflake/snowflake-arctic-embed-m'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': True, 
'post_processors': [{'name': 'Pooling', 'type': 'mean'}, {'name': 'Normalize'}], 
'distance_metrics': ['COSINE'], 
'languages': ['us'], 
'max_seq_length': 512, 
'checksum': 'cc17acb7d96a63f74b7e8ecca3d2a2c21fd98df606bdd67346ba88eb81e74eef', 
'quantize_model': True, 
'model_type': 'TEXT'
}]

google/vit-base-patch16-224

from oml.utils import ONNXPipeline
m = 'google/vit-base-patch16-224'
f = 'vit-base-patch16-224'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

config.json: 100%|████████████████████████████| 69.7k/69.7k [00:00<00:00, 9.59MB/s]
config.json: 100%|████████████████████████████| 69.7k/69.7k [00:00<00:00, 5.34MB/s]
model.safetensors: 100%|██████████████████████| 346M/346M [00:00<00:00, 370MB/s]
preprocessor_config.json: 100%|███████████████| 160/160 [00:00<00:00, 1.44MB/s]
config.json: 100%|████████████████████████████| 69.7k/69.7k [00:00<00:00, 69.9MB/s]
preprocessor_config.json: 100%|███████████████| 160/160 [00:00<00:00, 619kB/s]

TracerWarning:Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!

from oml.utils import ONNXPipelineConfig
m = 'google/vit-base-patch16-224'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'checksum': 'beaffddb81f18163b991541becd355e54db6d27b57b7df4d440997a74f5786ff', 
'model_type': 'IMAGE_VIT', 
'pre_processors_img': [
    {'name': 'DecodeImage', 'do_convert_rgb': True}, 
    {'name': 'Resize', 'enable': True, 'size': {'height': 224, 'width': 224}, 'resample': 'bilinear'},
    {'name': 'Rescale', 'enable': True, 'rescale_factor': 0.00392156862}, 
    {'name': 'Normalize', 'enable': True, 'image_mean': 'IMAGENET_STANDARD_MEAN', 'image_std': 'IMAGENET_STANDARD_STD'}, 
    {'name': 'OrderChannels', 'order': 'CHW'}], 
'post_processors_img': []
}]

WinKawaks/vit-tiny-patch16-224

from oml.utils import ONNXPipeline
m = 'WinKawaks/vit-tiny-patch16-224'
f = 'vit-tiny-patch16-224'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

config.json: 100%|██████████████████████████████| 69.7k/69.7k [00:00<00:00, 9.10MB/s]
config.json: 100%|██████████████████████████████| 69.7k/69.7k [00:00<00:00, 42.1MB/s]
model.safetensors: 100%|████████████████████████| 22.9M/22.9M [00:00<00:00, 63.5MB/s]
preprocessor_config.json: 100%|█████████████████| 160/160 [00:00<00:00, 880kB/s]
config.json: 100%|██████████████████████████████| 69.7k/69.7k [00:00<00:00, 24.6MB/s]
preprocessor_config.json: 100%|█████████████████| 160/160 [00:00<00:00, 2.16MB/s]

from oml.utils import ONNXPipelineConfig
m = 'WinKawaks/vit-tiny-patch16-224'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'checksum': '2213c4e82776c1e77564df09e8e7d1eedb8978ddfa8f5f001204cbf378800205', 
'model_type': 'IMAGE_VIT', 
'pre_processors_img': [
     {'name': 'DecodeImage', 'do_convert_rgb': True}, 
     {'name': 'Resize', 'enable': True, 'size': {'height': 224, 'width': 224}, 'resample': 'bilinear'},
     {'name': 'Rescale', 'enable': True, 'rescale_factor': 0.00392156862}, 
     {'name': 'Normalize', 'enable': True, 'image_mean': 'IMAGENET_STANDARD_MEAN', 'image_std': 'IMAGENET_STANDARD_STD'}, 
     {'name': 'OrderChannels', 'order': 'CHW'}], 
'post_processors_img': []
}]

WinKawaks/vit-small-patch16-224

from oml.utils import ONNXPipeline
m = 'WinKawaks/vit-small-patch16-224'
f = 'vit-small-patch16-224'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

config.json: 100%|█████████████████████████| 69.7k/69.7k [00:00<00:00, 10.2MB/s]
config.json: 100%|█████████████████████████| 69.7k/69.7k [00:00<00:00, 9.97MB/s]
model.safetensors: 100%|███████████████████| 88.2M/88.2M [00:00<00:00, 103MB/s]
preprocessor_config.json: 100%|████████████| 160/160 [00:00<00:00, 612kB/s]
config.json: 100%|█████████████████████████| 69.7k/69.7k [00:00<00:00, 18.4MB/s]
preprocessor_config.json: 100%|████████████| 160/160 [00:00<00:00, 743kB/s]

from oml.utils import ONNXPipelineConfig
m = 'WinKawaks/vit-small-patch16-224'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'checksum': 'd2e7b93b9a8826968fa3dba9891d11dc4f3449015bee6706536c1e73f959f709', 
'model_type': 'IMAGE_VIT', 
'pre_processors_img': [
     {'name': 'DecodeImage', 'do_convert_rgb': True}, 
     {'name': 'Resize', 'enable': True, 'size': {'height': 224, 'width': 224}, 'resample': 'bilinear'},
     {'name': 'Rescale', 'enable': True, 'rescale_factor': 0.00392156862}, 
     {'name': 'Normalize', 'enable': True, 'image_mean': 'IMAGENET_STANDARD_MEAN', 'image_std': 'IMAGENET_STANDARD_STD'}, 
     {'name': 'OrderChannels', 'order': 'CHW'}], 
'post_processors_img': []
}]

Falconsai/nsfw_image_detection

from oml.utils import ONNXPipeline
m = 'Falconsai/nsfw_image_detection'
f = 'nsfw_image_detection'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

Try setting the quantize_model property to true for models of this size.
config.json: 100%|████████████████████████████████| 724/724 [00:00<00:00, 4.00MB/s]
config.json: 100%|████████████████████████████████| 724/724 [00:00<00:00, 7.48MB/s]
model.safetensors: 100%|██████████████████████████| 343M/343M [00:03<00:00, 104MB/s]
preprocessor_config.json: 100%|███████████████████| 325/325 [00:00<00:00, 3.03MB/s]

from oml.utils import ONNXPipelineConfig
m = 'Falconsai/nsfw_image_detection'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'checksum': '98427273b394524ba714032174e919ab8e79789c7aef2b14245242a101af088d', 
'model_type': 'IMAGE_VIT', 
'pre_processors_img': [
     {'name': 'DecodeImage', 'do_convert_rgb': True}, 
     {'name': 'Resize', 'enable': True, 'size': {'height': 224, 'width': 224}, 'resample': 'bilinear'},
     {'name': 'Rescale', 'enable': True, 'rescale_factor': 0.00392156862}, 
     {'name': 'Normalize', 'enable': True, 'image_mean': 'IMAGENET_STANDARD_MEAN', 'image_std': 'IMAGENET_STANDARD_STD'}, 
     {'name': 'OrderChannels', 'order': 'CHW'}], 
'post_processors_img': []
}]
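
The exporter's note above suggests enabling quantization for a model of this size. A minimal sketch of how that could look, assuming from_template accepts an "image" template analogous to the "text" template used later for mxbai-embed-xsmall-v1 (verify the template name against the OML4Py documentation):

from oml.utils import ONNXPipeline, ONNXPipelineConfig
m = 'Falconsai/nsfw_image_detection'
f = 'nsfw_image_detection'
# "image" is an assumed template name; only "text" is demonstrated in this post
c = ONNXPipelineConfig.from_template("image", quantize_model=True)
p = ONNXPipeline(model_name=m, config=c, settings={"force_download":True})
p.export2file(f, output_dir=".")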

AdamCodd/vit-base-nsfw-detector

from oml.utils import ONNXPipeline
m = 'AdamCodd/vit-base-nsfw-detector'
f = 'vit-base-nsfw-detector'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

config.json: 100%|████████████████████████████████| 715/715 [00:00<00:00, 4.54MB/s]
config.json: 100%|████████████████████████████████| 715/715 [00:00<00:00, 8.62MB/s]
model.safetensors: 100%|██████████████████████████| 344M/344M [00:03<00:00, 109MB/s]
preprocessor_config.json: 100%|███████████████████| 232/232 [00:00<00:00, 2.98MB/s]

from oml.utils import ONNXPipelineConfig
m = 'AdamCodd/vit-base-nsfw-detector'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'checksum': 'c8441ca4fc7341c43ed6911f6dcbe5ca38f93f2225abc19a3efb71ea907fe7d2', 
'model_type': 'IMAGE_VIT', 
'pre_processors_img': [
     {'name': 'DecodeImage', 'do_convert_rgb': True}, 
     {'name': 'Resize', 'enable': True, 'size': {'height': 224, 'width': 224}, 'resample': 'bilinear'},
     {'name': 'Rescale', 'enable': True, 'rescale_factor': 0.00392156862}, 
     {'name': 'Normalize', 'enable': True, 'image_mean': 'IMAGENET_STANDARD_MEAN', 'image_std': 'IMAGENET_STANDARD_STD'}, 
     {'name': 'OrderChannels', 'order': 'CHW'}], 
'post_processors_img': []
}]

microsoft/resnet-18

from oml.utils import ONNXPipeline
m = 'microsoft/resnet-18'
f = 'resnet-18'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

config.json: 100%|███████████████████████████████| 69.5k/69.5k [00:00<00:00, 14.4MB/s]
config.json: 100%|███████████████████████████████| 69.5k/69.5k [00:00<00:00, 59.3MB/s]
model.safetensors: 100%|█████████████████████████| 46.8M/46.8M [00:00<00:00, 89.3MB/s]
preprocessor_config.json: 100%|██████████████████| 266/266 [00:00<00:00, 1.90MB/s]

from oml.utils import ONNXPipelineConfig
m = 'microsoft/resnet-18'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'checksum': 'fb749de60cdcc676f5049aa5c00fab19ea1d312b9ad8c2961737aa1a13d56520', 
'model_type': 'IMAGE_CONVNEXT', 
'pre_processors_img': [
     {'name': 'DecodeImage', 'do_convert_rgb': True}, 
     {'name': 'Resize', 'enable': True, 'size': {'height': 384, 'width': 384}, 'resample': 'bilinear'},
     {'name': 'Rescale', 'enable': True, 'rescale_factor': 0.00392156862}, 
     {'name': 'Normalize', 'enable': True, 'image_mean': 'IMAGENET_STANDARD_MEAN', 'image_std': 'IMAGENET_STANDARD_STD'}, 
     {'name': 'OrderChannels', 'order': 'CHW'}], 
'post_processors_img': []
}]

microsoft/resnet-50

from oml.utils import ONNXPipeline
m = 'microsoft/resnet-50'
f = 'resnet-50'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

config.json: 100%|███████████████████████████████| 69.6k/69.6k [00:00<00:00, 10.3MB/s]
config.json: 100%|███████████████████████████████| 69.6k/69.6k [00:00<00:00, 4.79MB/s]
model.safetensors: 100%|█████████████████████████| 102M/102M [00:01<00:00, 99.5MB/s]
preprocessor_config.json: 100%|██████████████████| 266/266 [00:00<00:00, 1.61MB/s]

from oml.utils import ONNXPipelineConfig
m = 'microsoft/resnet-50'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'checksum': 'fca7567354d0cb0b7258a44810150f7c1abf8e955cff9320b043c0554ba81f8e', 
'model_type': 'IMAGE_CONVNEXT', 
'pre_processors_img': [
    {'name': 'DecodeImage', 'do_convert_rgb': True}, 
    {'name': 'Resize', 'enable': True, 'size': {'height': 384, 'width': 384}, 'resample': 'bilinear'},
    {'name': 'Rescale', 'enable': True, 'rescale_factor': 0.00392156862}, 
    {'name': 'Normalize', 'enable': True, 'image_mean': 'IMAGENET_STANDARD_MEAN', 'image_std': 'IMAGENET_STANDARD_STD'}, 
    {'name': 'OrderChannels', 'order': 'CHW'}], 
'post_processors_img': []
}]

trpakov/vit-face-expression

from oml.utils import ONNXPipeline
m = 'trpakov/vit-face-expression'
f = 'vit-face-expression'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

config.json: 100%|█████████████████████████████| 915/915 [00:00<00:00, 6.35MB/s]
config.json: 100%|█████████████████████████████| 915/915 [00:00<00:00, 12.8MB/s]
model.safetensors: 100%|███████████████████████| 343M/343M [00:00<00:00, 354MB/s]
preprocessor_config.json: 100%|████████████████| 228/228 [00:00<00:00, 2.66MB/s]

from oml.utils import ONNXPipelineConfig
m = 'trpakov/vit-face-expression'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'checksum': '8960f3a5dad4d109a9478c175a5a08e45286c0d2ec5604f55c61d2814a2f179a', 
'model_type': 'IMAGE_VIT', 
'pre_processors_img': [
    {'name': 'DecodeImage', 'do_convert_rgb': True}, 
    {'name': 'Resize', 'enable': True, 'size': {'height': 224, 'width': 224}, 'resample': 'bilinear'},
    {'name': 'Rescale', 'enable': True, 'rescale_factor': 0.00392156862}, 
    {'name': 'Normalize', 'enable': True, 'image_mean': 'IMAGENET_STANDARD_MEAN', 'image_std': 'IMAGENET_STANDARD_STD'}, 
    {'name': 'OrderChannels', 'order': 'CHW'}], 
'post_processors_img': []
}]

nateraw/vit-age-classifier

# for a classifier, specify the CLASSIFICATION mining function
from oml.utils import ONNXPipeline, MiningFunction
m = 'nateraw/vit-age-classifier'
f = 'vit-age-classifier'
p = ONNXPipeline(model_name=m, config=None, function=MiningFunction.CLASSIFICATION, settings={"force_download":True})
p.export2file(f, output_dir=".")

config.json: 100%|█████████████████████████████████████| 850/850 [00:00<00:00, 9.69MB/s]
config.json: 100%|█████████████████████████████████████| 850/850 [00:00<00:00, 10.3MB/s]
model.safetensors: 100%|███████████████████████████████| 343M/343M [00:03<00:00, 110MB/s]
preprocessor_config.json: 100%|████████████████████████| 197/197 [00:00<00:00, 2.02MB/s]
config.json: 100%|█████████████████████████████████████| 850/850 [00:00<00:00, 9.56MB/s]
preprocessor_config.json: 100%|████████████████████████| 197/197 [00:00<00:00, 1.25MB/s]

from oml.utils import ONNXPipelineConfig
m = 'nateraw/vit-age-classifier'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'checksum': '80c7610807d8dee9d38f0baea16924303550d59d2d155f49c4d2294849a394d2', 
'model_type': 'IMAGE_VIT', 
'pre_processors_img': [
    {'name': 'DecodeImage', 'do_convert_rgb': True}, 
    {'name': 'Resize', 'enable': True, 'size': {'height': 224, 'width': 224}, 'resample': 'bilinear'},
    {'name': 'Rescale', 'enable': True, 'rescale_factor': 0.00392156862}, 
    {'name': 'Normalize', 'enable': True, 'image_mean': 'IMAGENET_STANDARD_MEAN', 'image_std': 'IMAGENET_STANDARD_STD'}, 
    {'name': 'OrderChannels', 'order': 'CHW'}], 
'post_processors_img': []
}]

rizvandwiki/gender-classification

from oml.utils import ONNXPipeline, MiningFunction
m = 'rizvandwiki/gender-classification'
f = 'gender-classification'
p = ONNXPipeline(model_name=m, config=None, function=MiningFunction.CLASSIFICATION, settings={"force_download":True})
p.export2file(f, output_dir=".")

config.json: 100%|██████████████████████████████| 727/727 [00:00<00:00, 6.00MB/s]
config.json: 100%|██████████████████████████████| 727/727 [00:00<00:00, 6.79MB/s]
model.safetensors: 100%|████████████████████████| 343M/343M [00:00<00:00, 708MB/s]
preprocessor_config.json: 100%|█████████████████| 325/325 [00:00<00:00, 4.12MB/s]

from oml.utils import ONNXPipelineConfig
m = 'rizvandwiki/gender-classification'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'checksum': '834c9fd8abeda91f762e779e2ab80a5f153e73ff84384d827d191055a1ede797', 
'model_type': 'IMAGE_VIT', 
'pre_processors_img': [
    {'name': 'DecodeImage', 'do_convert_rgb': True}, 
    {'name': 'Resize', 'enable': True, 'size': {'height': 224, 'width': 224}, 'resample': 'bilinear'},
    {'name': 'Rescale', 'enable': True, 'rescale_factor': 0.00392156862}, 
    {'name': 'Normalize', 'enable': True, 'image_mean': 'IMAGENET_STANDARD_MEAN', 'image_std': 'IMAGENET_STANDARD_STD'}, 
    {'name': 'OrderChannels', 'order': 'CHW'}], 
'post_processors_img': []
}]

mixedbread-ai/mxbai-embed-large-v1

from oml.utils import ONNXPipeline
m = 'mixedbread-ai/mxbai-embed-large-v1'
f = 'mxbai-embed-large-v1'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|████████████████| 1.24k/1.24k [00:00<00:00, 8.38MB/s]
tokenizer_config.json: 100%|████████████████| 1.24k/1.24k [00:00<00:00, 8.05MB/s]
vocab.txt: 100%|████████████████████████████| 232k/232k [00:00<00:00, 10.4MB/s]
special_tokens_map.json: 100%|██████████████| 695/695 [00:00<00:00, 2.31MB/s]
tokenizer_config.json: 100%|████████████████| 1.24k/1.24k [00:00<00:00, 4.89MB/s]
tokenizer.json: 100%|███████████████████████| 711k/711k [00:00<00:00, 27.0MB/s]
config.json: 100%|██████████████████████████| 677/677 [00:00<00:00, 4.47MB/s]
config.json: 100%|██████████████████████████| 677/677 [00:00<00:00, 5.63MB/s]
model.safetensors: 100%|████████████████████| 670M/670M [00:01<00:00, 394MB/s]

from oml.utils import ONNXPipelineConfig
m = 'mixedbread-ai/mxbai-embed-large-v1'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': True, 
'post_processors': [
    {'name': 'Pooling', 'type': 'mean'}, 
    {'name': 'Normalize'}], 
'distance_metrics': ['COSINE'], 
'languages': ['us'], 
'max_seq_length': 512, 
'checksum': '91df8b84fdb1197c0e8db0782160339794930accc8f154ad80a498a7b562b435', 
'quantize_model': True, 'model_type': 'TEXT'
}]

mixedbread-ai/mxbai-embed-xsmall-v1

This model is not in the preconfigured list, so convert it using the text template.

from oml.utils import ONNXPipeline, ONNXPipelineConfig
m = 'mixedbread-ai/mxbai-embed-xsmall-v1'
f = 'mxbai-embed-xsmall-v1'
c = ONNXPipelineConfig.from_template("text", max_seq_length=384, quantize_model=True, distance_metrics=["HAMMING","JACCARD","COSINE"])
p = ONNXPipeline(model_name=m, config=c, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|████████████████████| 1.43k/1.43k [00:00<00:00, 18.0MB/s]
tokenizer_config.json: 100%|████████████████████| 1.43k/1.43k [00:00<00:00, 8.27MB/s]
vocab.txt: 100%|████████████████████████████████| 232k/232k [00:00<00:00, 10.9MB/s]
special_tokens_map.json: 100%|██████████████████| 695/695 [00:00<00:00, 9.88MB/s]
tokenizer_config.json: 100%|████████████████████| 1.43k/1.43k [00:00<00:00, 16.1MB/s]
tokenizer.json: 100%|███████████████████████████| 712k/712k [00:00<00:00, 43.6MB/s]
config.json: 100%|██████████████████████████████| 675/675 [00:00<00:00, 10.7MB/s]
config.json: 100%|██████████████████████████████| 675/675 [00:00<00:00, 8.85MB/s]
model.safetensors: 100%|████████████████████████| 48.2M/48.2M [00:01<00:00, 40.9MB/s]
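
Since this model is not preconfigured, it is worth confirming that the export actually landed on disk. A quick sanity check, assuming export2file writes the file as <prefix>.onnx in the output directory:

import os
fn = './mxbai-embed-xsmall-v1.onnx'  # assumed output name: <prefix>.onnx
if os.path.exists(fn):
    print(fn, round(os.path.getsize(fn) / 1048576, 1), 'MB')
else:
    print(fn, 'not found')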

openai/clip-vit-large-patch14

This is a multimodal CLIP model: the export produces both an image embedding model and a text embedding model.

from oml.utils import ONNXPipeline
m = 'openai/clip-vit-large-patch14'
f = 'clip-vit-large-patch14'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|█████████████████████| 905/905 [00:00<00:00, 9.96MB/s]
tokenizer_config.json: 100%|█████████████████████| 905/905 [00:00<00:00, 9.73MB/s]
vocab.json: 100%|████████████████████████████████| 961k/961k [00:00<00:00, 21.6MB/s]
merges.txt: 100%|████████████████████████████████| 525k/525k [00:00<00:00, 16.4MB/s]
special_tokens_map.json: 100%|███████████████████| 389/389 [00:00<00:00, 1.78MB/s]
tokenizer_config.json: 100%|█████████████████████| 905/905 [00:00<00:00, 5.29MB/s]
tokenizer.json: 100%|████████████████████████████| 2.22M/2.22M [00:00<00:00, 27.2MB/s]
config.json: 100%|███████████████████████████████| 4.52k/4.52k [00:00<00:00, 38.8MB/s]
config.json: 100%|███████████████████████████████| 4.52k/4.52k [00:00<00:00, 19.7MB/s]
model.safetensors: 100%|█████████████████████████| 1.71G/1.71G [00:17<00:00, 98.1MB/s]
config.json: 100%|███████████████████████████████| 4.52k/4.52k [00:00<00:00, 25.7MB/s]
config.json: 100%|███████████████████████████████| 4.52k/4.52k [00:00<00:00, 40.3MB/s]
model.safetensors: 100%|█████████████████████████| 1.71G/1.71G [00:20<00:00, 85.5MB/s]
preprocessor_config.json: 100%|██████████████████| 316/316 [00:00<00:00, 3.81MB/s]

TracerWarning:Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
UserWarning:Exporting aten::index operator of advanced indexing in opset 17 is achieved by combination of multiple ONNX operators, including Reshape, Transpose, Concat, and Gather. If indices include negative values, the exported graph will produce incorrect results.

config.json: 100%|█████████████████████████| 4.52k/4.52k [00:00<00:00, 46.7MB/s]
config.json: 100%|█████████████████████████| 4.52k/4.52k [00:00<00:00, 42.8MB/s]
model.safetensors: 100%|███████████████████| 1.71G/1.71G [00:23<00:00, 71.9MB/s]
preprocessor_config.json: 100%|████████████| 316/316 [00:00<00:00, 513kB/s]

TracerWarning:Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!

from oml.utils import ONNXPipelineConfig
m = 'openai/clip-vit-large-patch14'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'max_seq_length': 77, 
'do_lower_case': True, 
'post_processors': [{'name': 'Normalize'}], 
'distance_metrics': ['COSINE'], 
'languages': ['us'], 
'checksum': '010cf7792646b40a3d8ed9bb39d2f223944170d32fbeb36207f9fbf4eeed935c', 
'quantize_model': True, 
'model_type': 'MULTIMODAL_CLIP', 
'pre_processors_img': [
     {'name': 'DecodeImage', 'do_convert_rgb': True}, 
     {'name': 'Resize', 'enable': True, 'size': {'shortest_edge': 224}, 'resample': 'bicubic'}, 
     {'name': 'CenterCrop', 'enable': True, 'crop_size': {'height': 224, 'width': 224}}, 
     {'name': 'Rescale', 'enable': True, 'rescale_factor': 0.00392156862}, 
     {'name': 'Normalize', 'enable': True, 'image_mean': 'OPENAI_CLIP_MEAN', 'image_std': 'OPENAI_CLIP_STD'}, 
     {'name': 'OrderChannels', 'order': 'CHW'}], 
'post_processors_img': []
}]
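
Because the CLIP export produces separate image and text embedding models, it can be useful to list everything the pipeline wrote. A minimal sketch; the exact output file names depend on the pipeline version:

import glob
# list the ONNX files written by the multimodal export
for path in sorted(glob.glob('./clip-vit-large-patch14*.onnx')):
    print(path)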

sentence-transformers/paraphrase-multilingual-mpnet-base-v2

from oml.utils import ONNXPipeline
m = 'sentence-transformers/paraphrase-multilingual-mpnet-base-v2'
f = 'paraphrase-multilingual-mpnet-base-v2'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|█████████████████████| 402/402 [00:00<00:00, 1.90MB/s]
config.json: 100%|███████████████████████████████| 723/723 [00:00<00:00, 3.81MB/s]
tokenizer_config.json: 100%|█████████████████████| 402/402 [00:00<00:00, 1.04MB/s]
sentencepiece.bpe.model: 100%|███████████████████| 5.07M/5.07M [00:00<00:00, 19.5MB/s]
special_tokens_map.json: 100%|███████████████████| 239/239 [00:00<00:00, 1.50MB/s]
tokenizer_config.json: 100%|█████████████████████| 402/402 [00:00<00:00, 2.39MB/s]
tokenizer.json: 100%|████████████████████████████| 9.08M/9.08M [00:00<00:00, 26.4MB/s]
config.json: 100%|███████████████████████████████| 723/723 [00:00<00:00, 127kB/s]
config.json: 100%|███████████████████████████████| 723/723 [00:00<00:00, 9.81MB/s]
model.safetensors: 100%|█████████████████████████| 1.11G/1.11G [00:02<00:00, 459MB/s]

from oml.utils import ONNXPipelineConfig
m = 'sentence-transformers/paraphrase-multilingual-mpnet-base-v2'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': True, 
'post_processors': [{'name': 'Pooling', 'type': 'mean'}, {'name': 'Normalize'}], 
'distance_metrics': ['COSINE'], 
'languages': ['us', 'ar', 'bg', 'ca', 'cs', 'dk', 'd', 'el', 'gb', 'e', 'et', 'fa', 'sf', 'f', 'gu', 'iw', 'hi', 'hr', 'hu', 'hy', 'in', 'i', 'ja', 'ko', 'lt', 'lv', 'mk', 'mr', 'ms', 'nl', 'pl', 'pt', 'ro', 'ru', 'sk', 'sl', 'sq', 's', 'th', 'tr', 'uk', 'ur', 'vn'], 
'max_seq_length': 512, 
'checksum': 'c51c3acc79acf777c62cb42d3dd5bc02fc57afde87b5a1aefb21b0688534518a', 
'quantize_model': True, 
'model_type': 'TEXT'
}]

sentence-transformers/stsb-xlm-r-multilingual

from oml.utils import ONNXPipeline
m = 'sentence-transformers/stsb-xlm-r-multilingual'
f = 'stsb-xlm-r-multilingual'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|████████████| 505/505 [00:00<00:00, 6.05MB/s]
config.json: 100%|█████████████████████████████████| 709/709 [00:00<00:00, 3.94MB/s]
tokenizer_config.json: 100%|███████████████████████| 505/505 [00:00<00:00, 3.65MB/s]
sentencepiece.bpe.model: 100%|█████████████████████| 5.07M/5.07M [00:00<00:00, 29.5MB/s]
special_tokens_map.json: 100%|█████████████████████| 150/150 [00:00<00:00, 1.91MB/s]
tokenizer_config.json: 100%|███████████████████████| 505/505 [00:00<00:00, 6.62MB/s]
tokenizer.json: 100%|██████████████████████████████| 9.10M/9.10M [00:00<00:00, 30.0MB/s]
config.json: 100%|█████████████████████████████████| 709/709 [00:00<00:00, 4.60MB/s]
config.json: 100%|█████████████████████████████████| 709/709 [00:00<00:00, 7.53MB/s]
model.safetensors: 100%|███████████████████████████| 1.11G/1.11G [00:01<00:00, 586MB/s]

from oml.utils import ONNXPipelineConfig
m = 'sentence-transformers/stsb-xlm-r-multilingual'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': True, 
'post_processors': [{'name': 'Pooling', 'type': 'mean'}, {'name': 'Normalize'}], 
'distance_metrics': ['COSINE'], 
'languages': ['us'], 
'max_seq_length': 512, 
'checksum': 'd78f6e5ae352ddccd06fda98daebec1bbb7c2755026222e8612157a4e448f906', 
'quantize_model': True, 
'model_type': 'TEXT'
}]

intfloat/multilingual-e5-base

from oml.utils import ONNXPipeline
m = 'intfloat/multilingual-e5-base'
f = 'multilingual-e5-base'
p = ONNXPipeline(model_name=m, config=None, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|███████████| 418/418 [00:00<00:00, 4.45MB/s]
tokenizer_config.json: 100%|███████████| 418/418 [00:00<00:00, 4.99MB/s]
sentencepiece.bpe.model: 100%|█████████| 5.07M/5.07M [00:00<00:00, 52.7MB/s]
special_tokens_map.json: 100%|█████████| 280/280 [00:00<00:00, 3.79MB/s]
tokenizer_config.json: 100%|███████████| 418/418 [00:00<00:00, 1.38MB/s]
tokenizer.json: 100%|██████████████████| 17.1M/17.1M [00:00<00:00, 92.0MB/s]
config.json: 100%|█████████████████████| 694/694 [00:00<00:00, 8.85MB/s]
config.json: 100%|████████████████████████████████████| 694/694 [00:00<00:00, 10.1MB/s]
model.safetensors: 100%|██████████████████████████████| 1.11G/1.11G [00:09<00:00, 113MB/s]

from oml.utils import ONNXPipelineConfig
m = 'intfloat/multilingual-e5-base'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': True, 
'post_processors': [{'name': 'Pooling', 'type': 'mean'}, {'name': 'Normalize'}], 
'distance_metrics': ['COSINE'], 
'languages': ['us', 'am', 'ar', 'as', 'az', 'be', 'bg', 'ca', 'cs', 'dk', 'd', 'el', 'gb', 'e', 'et', 'eu', 'fa', 'sf', 'f', 'ga', 'gu', 'iw', 'hi', 'hr', 'hu', 'hy', 'in', 'is', 'i', 'ja', 'km', 'kn', 'ko', 'lo', 'lt', 'lv', 'mk', 'ml', 'mr', 'ms', 'ne', 'nl', 'n', 'or', 'pl', 'pt', 'ro', 'ru', 'si', 'sk', 'sl', 'sq', 's', 'sw', 'ta', 'te', 'th', 'tr', 'uk', 'ur', 'vn'], 
'max_seq_length': 512, 
'checksum': 'c2bf85d8f7d9dff379e6dfaa7a864d296a918b122ebf7409a4abee61b99528b5', 
'quantize_model': True, 
'model_type': 'TEXT'
}]

BAAI/bge-reranker-base

from oml.utils import ONNXPipeline, MiningFunction
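# for a reranker scoring model, specify the REGRESSION mining function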
m = 'BAAI/bge-reranker-base'
f = 'bge-reranker-base'
p = ONNXPipeline(model_name=m, config=None, function=MiningFunction.REGRESSION, settings={"force_download":True})
p.export2file(f, output_dir=".")

tokenizer_config.json: 100%|████████████████████| 443/443 [00:00<00:00, 2.20MB/s]
tokenizer_config.json: 100%|████████████████████| 443/443 [00:00<00:00, 4.17MB/s]
sentencepiece.bpe.model: 100%|██████████████████| 5.07M/5.07M [00:00<00:00, 50.3MB/s]
special_tokens_map.json: 100%|██████████████████| 279/279 [00:00<00:00, 1.05MB/s]
tokenizer_config.json: 100%|████████████████████| 443/443 [00:00<00:00, 5.18MB/s]
tokenizer.json: 100%|███████████████████████████| 17.1M/17.1M [00:00<00:00, 90.2MB/s]
config.json: 100%|██████████████████████████████| 799/799 [00:00<00:00, 2.39MB/s]
config.json: 100%|██████████████████████████████| 799/799 [00:00<00:00, 9.71MB/s]
model.safetensors: 100%|████████████████████████| 1.11G/1.11G [00:09<00:00, 111MB/s]

from oml.utils import ONNXPipelineConfig
m = 'BAAI/bge-reranker-base'
ONNXPipelineConfig.show_preconfigured(include_properties=True, model_name=m)

[{
'do_lower_case': True, 
'post_processors': [], 
'distance_metrics': ['COSINE'], 
'languages': ['us', 'zhs'], 
'max_seq_length': 512, 
'checksum': 'c9b7db292f323ef4158921dd1881963fbe2b7ae19db1de406b5391e16ed53c56', 
'quantize_model': True, 
'function': 'REGRESSION', 
'model_type': 'TEXT'
}]
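
All of the preconfigured exports above follow the same three-line pattern, so they can easily be scripted as a batch. A minimal sketch using a few of the models from this post:

from oml.utils import ONNXPipeline

# batch-export several preconfigured models with the default configuration
models = {
    'thenlper/gte-small': 'gte-small',
    'intfloat/e5-small-v2': 'e5-small-v2',
    'Snowflake/snowflake-arctic-embed-s': 'snowflake-arctic-embed-s',
}
for model_name, file_prefix in models.items():
    p = ONNXPipeline(model_name=model_name, config=None, settings={"force_download":True})
    p.export2file(file_prefix, output_dir=".")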


Source code for this article can be found at https://github.com/gaiansentience/practicalplsql

Discussion and Comments


  1. iudithd5bf8e4d8d

    Hi Anthony,
    Excellent work, chapeau for your diligence :)

    Regarding the models producing only FLOAT32 vectors: supposing that you will ultimately succeed in “convincing” a model to produce other formats, like for example INT8, I wonder whether this “reduction” in the format will mean being “less accurate”, or, alternatively, whether the model will compensate for the “coarser” values of each dimension by adding more dimensions to the produced vectors. That is, whether their embedding algorithm will kind of go for “width” instead of “depth”.

    For us mere mortals, the inner workings of the models are a completely sealed black box, so gaining knowledge about how these models work, whether a model can or cannot be customized to produce specific vector formats, and so on, is surely beyond our reach.

    At one stage or another, maybe Oracle will come up with some kind of “classification” for the many and various existing models; otherwise, for “common people” like us, it could be extremely difficult to decide how to choose one model over another, and it is extremely demanding to expect a “normal developer” to test each model for his specific use case just to figure out which one gives the best results. This is a kind of “where positive science ends and witchery starts” …

    Cheers & Best Regards,
    Iudith


    1. Anthony Harper

      Hi Iudith,

      That’s a good question. I have not seen anything suggesting an increased number of less precise dimensions as an option to maintain accuracy while decreasing overall vector embedding size. The documentation I have read on ‘quantizing’ vectors to reduce storage requirements indicates that this is just a reduction in the precision of the dimension values, without a corresponding increase in the number of dimensions. The result will always be a reduction in overall vector size, with a small decrease in accuracy due to the coarser grain of the dimensions.

      There is an excellent article on quantizing vectors on the Mixed Bread site: https://www.mixedbread.com/blog/binary-mrl. Here they discuss two approaches to reducing vector size: MRL (Matryoshka Representation Learning) and vector quantization. The MRL approach puts significant dimensions first, allowing the vector to be truncated without sacrificing accuracy. The quantizing approach reduces the storage for each dimension by downsizing it to a smaller datatype, with binary being the most extreme case. Essentially, the MRL approach sacrifices width after prioritizing the most meaningful dimensions, while the quantizing approach maintains the total dimension count by reducing precision. I have not seen any indication that a quantized vector could use more, less precise, dimensions to represent the same embedding with a smaller storage requirement.
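
      As a minimal illustration (my own sketch, not from the Mixed Bread documentation), the simplest binary quantizer just keeps the sign of each dimension:

      import numpy as np
      # binary quantization: positive dimensions become 1, the rest become 0
      v = np.array([0.12, -0.48, 0.33, -0.05], dtype=np.float32)
      b = (v > 0).astype(np.uint8)
      print(b)  # [1 0 1 0]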

      It is a bit daunting that all of the preconfigured models are currently producing float32 vectors. I was really excited that the Mixed Bread model was added to the preconfigured models because it is designed for effective binary quantization… I communicated with Brendan Tierney (oralytics.com) about this and he recommended downgrading the NumPy package because oml_utils was not yet compatible with NumPy 2.x, even though this was the documented release dependency. That experiment failed, and I do think that using the latest release of each Python library makes the most sense as long as it produces fewer errors. Hopefully this situation will be resolved and we can obtain a set of quantized models with different dimension formats to experiment with.

      The Mixed Bread model documentation for creating binary embeddings in Python indicates that an embedding is created, and the resulting vector is then post-processed through a binary quantizer. In my next post in this series I will take this approach in the database, first generating the float32 vector and then quantizing it with a macro. This would be unsafe for any model that always produced positive dimension values, because all dimensions would end up being 1… effectively losing all information in the vector. In my testing so far, the Mixed Bread model produces a good range of negative and positive dimensions, which would allow the resulting quantized binary vector to retain the information with some reduction in accuracy.

      As an Oracle developer, my intention is to explore the toolset we are getting for working with vectors in the database. I don’t really see that a ‘normal developer’ would be able to make a case for using one model over another outside of performance considerations based on indexing and vector storage requirements for different dimension formats. As pragmatic developers, I feel that we should ultimately be able to load any model recommended for use by the AI data scientists into the database and expose the model functionality as needed in SQL and PL/SQL with confidence.

      Regards,
      Anthony Harper


