Binary Vectors: Semantic Searches

declare
    l_search_text varchar2(100);
    l_search_float  vector(*, float32);
    l_search_binary vector(*, binary);
    l_results sys.odcivarchar2list;
begin
    l_search_text := 'A wickedly delectable, delicious and decadent dessert.';

    select
        vector_embedding(
            ALL_MINILM_L12_V2 
            using l_search_text as data)
        , to_binary_vector(
            vector_embedding(
                MXBAI_EMBED_LARGE_V1 
                using l_search_text as data)
            )
        into l_search_float, l_search_binary;

    with fast_binary_search as (
        select 
            i.item_name
            , v.embedding3 as embedding_float
        from 
            menu_items i
            join menu_vectors v on i.item_id = v.item_id
        order by 
            vector_distance(
                v.embedding1_binary
                , l_search_binary
                , hamming)
        fetch first 25 rows only   
    ), precise_float_search as (
        select item_name
        from fast_binary_search b
        order by 
            vector_distance(
                b.embedding_float
                , l_search_float
                , cosine)
        fetch first 5 rows only
    )
    select item_name
    bulk collect into l_results
    from precise_float_search;
    
    for v in values of l_results loop
        dbms_output.put_line(v);
    end loop;

end;
/

Turtle Cheesecake
Chocolate Soufflé
Chocolate Mousse
Peach Cobbler
Molten Lava Cake

In the previous post (Binary Vectors: Quantization) we created a scalar macro to convert vectors to binary vectors. This gives us a tool for working with binary vectors in the database while we cannot yet load models that generate binary vectors directly. With binary vectors loaded into the tables used for semantic searches, we open up the possibility of highly optimized reranking searches that combine multiple models in a single search.

In this post we will explore persisting these converted binary vectors and using them for semantic searches. After some basic searches, we will look at the concept of a reranking search that uses multiple models in the same semantic search.
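
Conceptually, a reranking search is a two-stage pipeline: a cheap binary-distance pass over the whole table produces a shortlist, and a precise float-distance pass reranks only that shortlist. The shape of that pipeline can be sketched in plain Python with toy stand-in distance functions (hypothetical helpers for illustration, not the database API):

```python
def rerank_search(items, binary_dist, float_dist, shortlist=25, top_k=5):
    """Two-stage reranking: cheap approximate pass, then precise pass."""
    # Stage 1: rank everything with the fast, coarse metric (like hamming
    # distance over binary vectors) and keep only a shortlist
    candidates = sorted(items, key=binary_dist)[:shortlist]
    # Stage 2: rerank just the shortlist with the expensive, precise metric
    # (like cosine distance over float vectors)
    return sorted(candidates, key=float_dist)[:top_k]

# Toy stand-ins: the "binary" distance is a coarse, bucketed version of
# the exact distance to a target value of 42.
result = rerank_search(
    range(100),
    binary_dist=lambda x: abs(x - 42) // 8,
    float_dist=lambda x: abs(x - 42),
)
print(result)  # -> [42, 41, 43, 40, 44]
```

The coarse metric only needs to be good enough that the true best matches land in the shortlist; the precise metric then fixes the final ordering, which is exactly what the opening example does with hamming and cosine.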

To accomplish this we will quickly review the process of loading LLMs to the database to generate vector embeddings. Then we will set up test data, generate embeddings and convert these to binary vectors with the macro. This will give us the ability to compare semantic searches with binary vectors to the same searches using float vectors.

  1. Getting Started
  2. Load Large Language Models
  3. Set Up Sample Data and Vector Tables
  4. Create The Scalar Macro TO_BINARY_VECTOR
  5. Set up Menu Vectors Test Data
  6. Persisting the Binary Vectors
  7. Semantic Search with Binary Vectors
  8. Reranking Search With Binary Vectors
  9. Mixed Model Reranking Searches
  10. Conclusion

Getting Started

Scripts for this post are located in vector-search/binary-vector-search. These scripts require a minimum version of Oracle 23ai Release 7 to support binary vectors. All scripts have been tested in OCI Autonomous Database and the Oracle 23ai Free database VirtualBox appliance. At the time of writing, both the OCI Autonomous Database and the VirtualBox appliance are at Oracle 23ai Release 9.

The source code for all articles here on practicalplsql.org is available in the practicalplsql public repository on GitHub in the scripts/articles directory, organized by concept and article. You can use create.user.practicalplsql.sql to create the practicalplsql schema. Sample data for articles can be found in the scripts/examples directory, organized by example name. This article uses the examples in scripts/examples/sample_foods as a starting point for test data.

The code for this article depends on loading some of the example models from Hugging Face into the database after exporting them in ONNX format. The Quickstart article on OML4PY 2.1 has instructions on setting up and using Python with OML4PY 2.1 to generate properly prepared LLM mining models, examples for loading models to the database, and links to downloadable, ready-to-use models in ONNX format stored in my Google Drive. All scripts needed to load the models used in this article are included in the source code for this article.

Load Large Language Models

Rendering a picture of a sunset in black and white would essentially remove the concept of a sunset from the picture. The same is true for scalar binary quantization: if a model generates vectors whose values are all positive, the resulting binary vectors would be all ones in every dimension. The LLMs from the MixedBread developers on Hugging Face were designed for binary quantization with minimal accuracy loss. In this post I will be working with the MixedBread models mxbai-embed-large-v1 and mxbai-embed-xsmall-v1, which produce vectors intended for binary quantization. Both of these LLMs are among the sample models that can be prepared using OML4PY 2.1. I have provided ready-to-load models in ONNX format here: mxbai-embed-large-v1, mxbai-embed-xsmall-v1.
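
The quantization itself is simple sign-based thresholding: a dimension becomes 1 if its value is positive and 0 otherwise, with the bits packed eight to a byte. A minimal sketch in plain Python (illustrative only; the SQL macro later in this post does the equivalent work inside the database, and the exact bit ordering here is an assumption for demonstration) shows both the packing and why an all-positive vector degenerates:

```python
def binary_quantize(vec):
    """Sign-based scalar quantization: positive dimension -> 1, else 0,
    packed 8 bits per byte (first dimension in the most significant bit)."""
    bits = [1 if x > 0 else 0 for x in vec]
    bits += [0] * (-len(bits) % 8)  # pad to a whole number of bytes
    return bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[b:b + 8]))
        for b in range(0, len(bits), 8)
    )

# A mixed-sign vector keeps information through quantization...
print(binary_quantize([0.12, -0.5, 0.3, -0.01, 0.9, 0.0, -0.2, 0.7]).hex())  # -> a9
# ...but an all-positive vector collapses to all ones, losing everything.
print(binary_quantize([0.1] * 8).hex())  # -> ff
```

This is why models need to be trained with quantization in mind: the sign of each dimension has to carry meaning for the bits to be useful.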

Once the models are in a file system that is mapped to a database directory, they can be easily loaded into the database. We can create a procedure to load them and show what kind of embeddings they generate. For now we will set this procedure up to use a local database with a directory mapping; I am using the Oracle 23ai Free (Release 9) VirtualBox appliance. Note: an alternate form of the procedure could be created to load models into an OCI Autonomous Database from an object storage bucket.

create.procedure.load_onnx_model.local.sql

create or replace procedure load_onnx_model(f in varchar2, m in varchar2)
is
    l_info varchar2(100);
begin
    dbms_vector.load_onnx_model('ML_MODELS_DIR', f, m);
    dbms_output.put_line('Loaded ONNX Model ' || m);
    
    select a.vector_info into l_info
    from user_mining_model_attributes a
    where
        a.model_name = upper(m)
        and a.attribute_type = 'VECTOR';     
        
    dbms_output.put_line('Model generates ' || l_info);  
    
exception
    when others then
        dbms_output.put_line(sqlerrm);
end load_onnx_model;
/

Run the script to load the MixedBread models.

load.models.mxbai.sql

set serveroutput on;
begin
    load_onnx_model('mxbai-embed-large-v1.onnx', 'mxbai_embed_large_v1');    
    load_onnx_model('mxbai-embed-xsmall-v1.onnx', 'mxbai_embed_xsmall_v1');
end;
/

Loaded ONNX Model mxbai_embed_large_v1
Model generates VECTOR(1024,FLOAT32)

Loaded ONNX Model mxbai_embed_xsmall_v1
Model generates VECTOR(384,FLOAT32)

We can also load the popular all-MiniLM models for comparison. The source models on Hugging Face are sentence-transformers/all-MiniLM-L6-v2 and sentence-transformers/all-MiniLM-L12-v2. You can download models exported to ONNX format with OML4PY 2.1 here: all-MiniLM-L6.onnx and all-MiniLM-L12-v2.onnx. After making these models available in a directory exposed to the database, run the script to load them.

load.models.all_miniLM.sql

set serveroutput on;
begin
    load_onnx_model('all-MiniLM-L6-v2.onnx', 'all_MiniLM_L6_v2');    
    load_onnx_model('all-MiniLM-L12-v2.onnx', 'all_MiniLM_L12_v2');
end;
/

Loaded ONNX Model all_MiniLM_L6_v2
Model generates VECTOR(384,FLOAT32)

Loaded ONNX Model all_MiniLM_L12_v2
Model generates VECTOR(384,FLOAT32)

Set Up Sample Data and Vector Tables

We will need test data so that we can generate some vectors and work with converting them to binary format. Run the script in examples/sample-foods to create the recipes, menu_categories and menu_items tables and load them with test data.

--sample-foods/create.examples.all.sql
prompt creating all sample foods examples
@create.example.recipes.sql
@create.example.menus.sql
prompt created all sample foods examples

Table RECIPES created.
Created 15 simple recipes

Table MENU_CATEGORIES created.
Table MENU_ITEMS created.

Created 14 menu categories
Created 352 menu items
Menu Category Appetizers:  14 menu items created.
Menu Category Beverages:  10 menu items created.
Menu Category Breakfast:  13 menu items created.
Menu Category Desserts:  136 menu items created.
Menu Category Gluten-Free:  6 menu items created.
Menu Category Main Courses:  8 menu items created.
Menu Category Pastas:  13 menu items created.
Menu Category Pizzas:  11 menu items created.
Menu Category Salads:  56 menu items created.
Menu Category Sandwiches:  18 menu items created.
Menu Category Seafood:  8 menu items created.
Menu Category Soups:  11 menu items created.
Menu Category Vegan:  24 menu items created.
Menu Category Vegetarian:  24 menu items created.

Using this data as a starting point, we can create some tables to store vectors based on the descriptive text in the sample data.

For basic testing, create a copy of the recipes table with added columns for vector embeddings. Also create a column named embedding_model to keep track of what model was used to generate the stored vectors.

create.table.recipe_vectors.sql
--requires loaded recipes table from examples/sample_foods recipes example

drop table if exists recipe_vectors purge
/

create table recipe_vectors as 
select name, doc
from recipes
/

alter table recipe_vectors add(
    embedding vector(*,*)
    , embedding_model varchar2(50)
    , embedding_binary vector(*, binary)    
)
/

describe recipe_vectors;             

Table RECIPE_VECTORS created.
Table RECIPE_VECTORS altered.

Name             Null?    Type                   
---------------- -------- ---------------------- 
NAME             NOT NULL VARCHAR2(100 CHAR)     
DOC                       VARCHAR2(4000 CHAR)    
EMBEDDING                 VECTOR(*,*,DENSE)      
EMBEDDING_MODEL           VARCHAR2(50)           
EMBEDDING_BINARY          VECTOR(*,BINARY,DENSE) 

Generate float32 vectors for the rows in the recipe_vectors table using the vector_embedding function with the xsmall model from MixedBread.

generate.vectors.recipe_vectors.mxbai_xsmall.sql

set serveroutput on;
declare
    l_start timestamp := localtimestamp;
    l_model varchar2(100);
begin

    update recipe_vectors
    set 
        embedding = vector_embedding(MXBAI_EMBED_XSMALL_V1 using doc as data),
        embedding_model = 'MXBAI_EMBED_XSMALL_V1';
        
    commit;
    
    select  
        embedding_model || ' ('
        || vector_dimension_count(embedding) || ' dimensions, '
        || vector_dimension_format(embedding) || ') '
    into l_model
    from recipe_vectors
    where rownum = 1;
    
    dbms_output.put_line('Created embeddings using ' || l_model);
    dbms_output.put_line('Total Duration: ' || to_char(localtimestamp - l_start));
    
end;
/

update recipe_vectors.embeddings using MXBAI_EMBED_XSMALL_V1
Created embeddings using MXBAI_EMBED_XSMALL_V1 (384 dimensions, FLOAT32) 
Total Duration: +000000000 00:00:00.121922000

The basic syntax for a semantic search uses the vector_distance function with a given metric and a search vector. The search vector can be generated using the vector_embedding function with a model that matches the stored vectors being compared.

--semantic_search.basic_syntax.sql
--current embeddings loaded should use embedding model MXBAI_EMBED_XSMALL_V1

select name
from recipe_vectors
order by 
    vector_distance(
        embedding
        , vector_embedding(MXBAI_EMBED_XSMALL_V1 using 'tasty dessert' as data)
        , cosine)
fetch first 3 rows only
/

NAME                
--------------------
Strawberry Pie
Chocolate Cake
Raspberry Tarts

Notice that the model argument to vector_embedding is an identifier created when the model was loaded, and the metric is an identifier as well. This makes it necessary to write specialized queries to use different models or metrics.

Placing the call to vector_embedding that converts the search term to a vector inside the order by clause can cause the function to be called for every row. It is better to convert the search term to a vector first, so the conversion is done only once per search.

Instead of writing many similar queries, we can create a procedure that dynamically builds the necessary query from a given model, metric and search term. The procedure can also convert the search text to a vector and pass it as a bind variable to the actual search query. This gives us a flexible tool to run a variety of vector searches easily.

create.procedure.search_recipe_vectors.sql

create or replace procedure search_recipe_vectors(
    p_search in varchar2, 
    p_rows   in number default 3,
    p_model  in varchar2 default null,
    p_metric in varchar2 default 'cosine') 
is
    l_search_vector vector;
    type t_row is record (ranking number, name recipes.name%type);
    type t_rows is table of t_row;
    l_rows t_rows;
    l_model varchar2(100);
    l_start timestamp := localtimestamp;
    l_search_vector_sql varchar2(4000);
    l_sql varchar2(4000);
    cv sys_refcursor;
    c_dividing_line constant varchar2(80) := lpad('-', 80, '-');
begin
    --validate the vectors currently loaded
    select embedding_model 
    into l_model 
    from recipe_vectors
    fetch first row only;
    
    if p_model is not null and l_model <> p_model then
        raise_application_error(-20100, l_model || ' embeddings found, expecting ' || p_model);
    end if;
    
    --generate the vector for the search text
    l_search_vector_sql := 
    'select vector_embedding(' || l_model || ' using :search_term as data)';
    
    execute immediate l_search_vector_sql 
    into l_search_vector 
    using p_search;
    
    l_sql := 
    'select 
        rownum as ranking, name
    from
        (
        select name
        from recipe_vectors g
        order by 
            vector_distance(
                g.embedding
                , :search_vector
                , ' || p_metric || ')
        fetch first :p_rows rows only
        )';
    
    dbms_output.put_line(c_dividing_line);
    dbms_output.put_line('Run Test Vector Search For [' || p_search || ']');
    
    open cv for l_sql 
    using l_search_vector, p_rows;
    fetch cv bulk collect into l_rows;
    close cv;
    
    dbms_output.put_line('Search completed using ' || l_model);
    dbms_output.put_line('Vector Distance Metric ' || p_metric);
    dbms_output.put_line('Search duration: ' || to_char(localtimestamp - l_start));
    dbms_output.put_line('Search text = [' || p_search || ']');
    dbms_output.put_line('Top k=' || p_rows || ' results:');
    
    for v in values of l_rows loop
        dbms_output.put_line(v.ranking || ') ' || v.name);
    end loop;
    
    dbms_output.put_line(c_dividing_line);
    
end search_recipe_vectors;
/

Use this procedure to run some test vector searches with the small MixedBread model, using float32 vector embeddings, a variety of metrics and a couple of different search terms.

search.vectors.recipe_vectors.mxbai_xsmall.sql
--generate the embeddings using the mxbai_xsmall model
@generate.vectors.recipe_vectors.mxbai_xsmall.sql

declare
    type t_list is table of varchar2(100);
    l_model  varchar2(100) := 'MXBAI_EMBED_XSMALL_V1';
    l_metrics t_list := t_list('cosine', 'euclidean', 'manhattan');
    l_searches t_list := t_list('healthy dinner', 'yummy dessert');
begin
    for s in values of l_searches loop
        for m in values of l_metrics loop
            search_recipe_vectors(s, 3, l_model, m);
        end loop;
    end loop;
end;
/

Reviewing the results on a search for ‘healthy dinner’ in the recipe_vectors table shows that all searches returned the same results. The timings for searching with different metrics showed some minor differences.

Run Test Vector Search For [healthy dinner]
Search completed using MXBAI_EMBED_XSMALL_V1
Top k=3 results:
1) Shepherd's Pie
2) Grilled Cheese Sandwiches
3) Curried Tofu

Vector Distance Metric cosine
Search duration: +000000000 00:00:00.004073000

Vector Distance Metric euclidean
Search duration: +000000000 00:00:00.002777000

Vector Distance Metric manhattan
Search duration: +000000000 00:00:00.003189000

Searching for a ‘yummy dessert’ came up with a different 3rd place option using the manhattan distance metric with all searches executed in approximately the same time.

Run Test Vector Search For [yummy dessert]
Search completed using MXBAI_EMBED_XSMALL_V1
Top k=3 results:
1) Strawberry Pie
2) Chocolate Cake
3) Raspberry Tarts [cosine and euclidean]
3) Banana, Mango and Blueberry Smoothie [manhattan]

Vector Distance Metric cosine
Search duration: +000000000 00:00:00.003149000

Vector Distance Metric euclidean
Search duration: +000000000 00:00:00.002980000

Vector Distance Metric manhattan
Search duration: +000000000 00:00:00.002708000

There are scripts in the repository to test all of the loaded models with these two searches using three different metrics: cosine, euclidean and manhattan. You can run them to get a feel for the results using different models and metrics. Each script updates the embeddings in the recipe_vectors table and then runs the same test searches with the appropriate model. Run the mxbai_xsmall script again last to leave the table with 384 dimension float32 vectors from the small mixed bread model. This will be the starting point for looking at binary vector conversions.

search.vectors.recipe_vectors.mxbai_xsmall.sql
search.vectors.recipe_vectors.mxbai_large.sql
search.vectors.recipe_vectors.all_miniLM_L6.sql
search.vectors.recipe_vectors.all_miniLM_L12.sql

Create The Scalar Macro TO_BINARY_VECTOR

In the previous post we designed a scalar macro function to convert vectors to binary vectors using SQL JSON functionality. For review, here is the final code for the macro; make sure it is compiled into the same schema that the models have been loaded into.

create.function.to_binary_vector.sql

create or replace function to_binary_vector(
    p_input_vector in vector
    ) return varchar2
    sql_macro (scalar)
is
begin

    return q'~
    
        select 
            to_vector(
                json_serialize(
                    json_arrayagg(pb.uint_byte order by pb.byte#) 
                    returning clob)
                , *, binary)
        from 
            (
            select
                p.byte#
                , bin_to_num(p.b#1, p.b#2, p.b#3, p.b#4, p.b#5, p.b#6, p.b#7, p.b#8) as uint_byte
            from 
                (
                select 
                    case sign(jt.dimval) when 1 then 1 else 0 end as dim_bitval
                    , mod(jt.dim#, 8) as bit#
                    , ceil(jt.dim#/8) as byte#
                from 
                    json_table (
                        json(vector_serialize(p_input_vector returning clob))
                        , '$[*]'
                        columns(
                            dim# for ordinality
                            , dimval number path '$'
                            )
                        ) jt
                ) q
            pivot(
                max(q.dim_bitval) for bit# in (
                    1 as b#1, 2 as b#2, 3 as b#3, 4 as b#4
                    , 5 as b#5, 6 as b#6, 7 as b#7, 0 as b#8)
                ) p
            ) pb 
    
    ~';
end to_binary_vector;
/

We can use the macro with any table that has a vector column. Try the test with the recipe_vectors table.

test.function.to_binary_vector.recipe_vectors.sql

select
    v_model
    , v_fmt, v_dims
    , vector_dimension_format(v_binary) as v_bin_fmt
    , vector_dimension_count(v_binary) as v_bin_dims
from
    (
    select 
        r.embedding_model as v_model
        , vector_dimension_format(r.embedding) as v_fmt
        , vector_dimension_count(r.embedding) as v_dims
        , to_binary_vector(r.embedding) as v_binary
    from recipe_vectors r
    )
fetch first row only
/

V_MODEL                V_FMT         V_DIMS V_BIN_FMT V_BIN_DIMS
---------------------- --------- ---------- --------- ----------
MXBAI_EMBED_XSMALL_V1  FLOAT32          384 BINARY           384

The macro is working as expected, converting the stored float32 vectors to binary dimension format.

Set up Menu Vectors Test Data

To get more data to work with, let's use the menu_items table, which gives us a dataset of a few hundred vectors. If you created the full sample_foods example earlier, the menu_items table has been created and loaded with about 350 rows of test data.

Create the menu_vectors table with several vector columns to allow comparison of the float32 vectors created with each of the loaded models and the converted binary vectors. We will keep only the vectors in this table, using a join to menu_items on item_id to access the item_name and item_description columns.

create.table.menu_vectors.sql

create table menu_vectors as 
select item_id
from menu_items
/

alter table menu_vectors add(
    embedding vector(*,*)
    , embedding_model varchar2(50)
    , embedding_binary vector(*, binary)   
    , embedding1 vector(*,*)
    , embedding1_model varchar2(50)
    , embedding1_binary vector(*, binary)     
    , embedding2 vector(*,*)
    , embedding2_model varchar2(50)
    , embedding2_binary vector(*, binary)     
    , embedding3 vector(*,*)
    , embedding3_model varchar2(50)
    , embedding3_binary vector(*, binary)       
)
/

Run the script to generate vector embeddings using the item_description from the menu_items table and each of the four loaded models. The MXBAI_EMBED_LARGE_V1 model takes the longest time to generate embeddings.

generate.vectors.menu_vectors.sql

Updated 352 embeddings using MXBAI_EMBED_XSMALL_V1
Total duration: +000000000 00:00:04.851623000

Updated 352 embeddings using MXBAI_EMBED_LARGE_V1
Total duration: +000000000 00:02:02.573723000

Updated 352 embeddings using ALL_MINILM_L6_V2
Total duration: +000000000 00:00:07.123483000

Updated 352 embeddings using ALL_MINILM_L12_V2
Total duration: +000000000 00:00:13.843263000

We can use the scalar macro to convert any of these vectors to binary format dynamically in a query. Let's try the embeddings generated by the MixedBread xsmall model, which have 384 float32 dimensions; these are stored in the menu_vectors.embedding column.

test.function.to_binary_vector.menu_vectors.sql

select 
    count(*) items
    , any_value(llm) as llm
    , any_value(v_dim_cnt) || ' '
        || any_value(v_dim_fmt) 
        || ' dimensions' as source_vector
    , any_value(v_binary_dim_cnt) || ' '
        || any_value(v_binary_dim_fmt) 
        || ' dimensions' as converted_vector    
from 
    (
    select
        item_id, llm, v_dim_cnt, v_dim_fmt
        , vector_dimension_count(v_binary) as v_binary_dim_cnt
        , vector_dimension_format(v_binary) as v_binary_dim_fmt
    from
        (
        select 
            item_id, embedding_model as llm
            , vector_dimension_count(embedding) as v_dim_cnt
            , vector_dimension_format(embedding) as v_dim_fmt
            , to_binary_vector(embedding) as v_binary
        from menu_vectors
        )
    )
/

     ITEMS LLM                    SOURCE_VECTOR            CONVERTED_VECTOR        
---------- ---------------------- ------------------------ ------------------------
       352 MXBAI_EMBED_XSMALL_V1  384 FLOAT32 dimensions   384 BINARY dimensions   

It takes about twice as long to convert the 1024-dimension float32 vectors generated by the MixedBread large model.

select 
    count(*) items
    , any_value(llm) as llm
    , any_value(v_dim_cnt) || ' '
        || any_value(v_dim_fmt) 
        || ' dimensions' as source_vector
    , any_value(v_binary_dim_cnt) || ' '
        || any_value(v_binary_dim_fmt) 
        || ' dimensions' as converted_vector   
from 
    (
    select
        item_id, llm, v_dim_cnt, v_dim_fmt
        , vector_dimension_count(v_binary) as v_binary_dim_cnt
        , vector_dimension_format(v_binary) as v_binary_dim_fmt
    from
        (
        select 
            item_id, embedding1_model as llm
            , vector_dimension_count(embedding1) as v_dim_cnt
            , vector_dimension_format(embedding1) as v_dim_fmt
            , to_binary_vector(embedding1) as v_binary
        from menu_vectors
        )
    )
/

     ITEMS LLM                    SOURCE_VECTOR            CONVERTED_VECTOR        
---------- ---------------------- ------------------------ ------------------------
       352 MXBAI_EMBED_LARGE_V1   1024 FLOAT32 dimensions  1024 BINARY dimensions  

Persisting the Binary Vectors

With the source vectors in the table, we can use the macro to update a column with the converted binary format vectors instead of dynamically doing the quantization in a query.

quantize_vectors.recipe_vectors.sql

begin
    update recipe_vectors r
    set r.embedding_binary = to_binary_vector(r.embedding);

    dbms_output.put_line('Converted ' || sql%rowcount || ' vectors to binary format');
    commit;
end;
/

Converted 15 vectors to binary format
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.026

While the MixedBread models are intended to be quantized to binary format, the other models we have loaded are not necessarily optimized for binary quantization. We can still use the macro to convert their vectors in a single update statement. Later we can compare binary search results from the MiniLM models with those from the MixedBread models.

quantize_vectors.menu_vectors.sql

begin
    update menu_vectors m
    set 
        --quantize the mixed bread embeddings
        m.embedding_binary = to_binary_vector(m.embedding)
        , m.embedding1_binary = to_binary_vector(m.embedding1)
        --quantize the all_minilm embeddings for comparison
        , m.embedding2_binary = to_binary_vector(m.embedding2)
        , m.embedding3_binary = to_binary_vector(m.embedding3);

    dbms_output.put_line('Converted ' || (sql%rowcount * 4) || ' vectors to binary format');
    commit;
end;
/

use the macro to quantize all of the vectors in the menu_vectors table
Converted 1408 vectors to binary format
Elapsed: 00:00:00.753

The full SQL JSON based binary conversion of over 1,400 vectors did not take long at all. Now our test tables are set up with plenty of vectors to work with.
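
Beyond conversion speed, the storage payoff of persisting binary vectors is worth noting: each dimension shrinks from 32 bits to 1 bit, a fixed 32x reduction in raw dimension data. A quick back-of-the-envelope check (plain Python; the sizes are raw dimension bytes and ignore any per-vector storage overhead):

```python
def vector_bytes(dims, bits_per_dim):
    """Raw storage for one dense vector, ignoring per-vector overhead."""
    return dims * bits_per_dim // 8

for dims in (384, 1024):
    f32 = vector_bytes(dims, 32)  # FLOAT32 dimensions
    b1 = vector_bytes(dims, 1)    # BINARY dimensions
    print(f"{dims} dims: {f32} bytes float32 -> {b1} bytes binary ({f32 // b1}x)")
```

So the 1024-dimension MixedBread large embeddings drop from 4096 bytes to 128 bytes per vector, which also means far fewer bytes to scan during a distance calculation.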

Semantic Search with Binary Vectors

Create a procedure to run various semantic searches against the recipe_vectors table with parameters for search term, distance metric and a switch to use binary embeddings or regular embeddings. If a binary search is requested, we will use the macro to convert the search term vector to a binary vector and we will use the embedding_binary column for the vector_distance function.

create.procedure.search_compare_recipe_vectors.sql

create or replace procedure search_compare_recipe_vectors(
    p_search_text in varchar2, 
    p_rows in number default 3, 
    p_metric in varchar2 default 'cosine', 
    p_use_binary in boolean default false
)
is
    l_model varchar2(100);
    l_format varchar2(100);
    l_dim_count number;
    l_vector_column varchar2(100);
    l_model_sql varchar2(1000);
    l_search_v vector;
    l_search_v_sql varchar2(4000);
    l_sql varchar2(4000);
    cv sys_refcursor;
    l_start timestamp;
    l_results sys.odcivarchar2list;
begin
    --get the model, format, dimensions currently loaded to recipe_vectors
    l_vector_column := 'embedding' || case when p_use_binary then '_binary' end;
    l_model_sql := '
    select 
        embedding_model
        , vector_dimension_format(##VECTOR_COLUMN##)
        , vector_dimension_count(##VECTOR_COLUMN##)
    from recipe_vectors 
    where rownum = 1';
    l_model_sql := replace(l_model_sql, '##VECTOR_COLUMN##', l_vector_column);
    execute immediate l_model_sql
    into l_model, l_format, l_dim_count;
    
    dbms_output.put_line(
        'Semantic search for [' || p_search_text || '] top k=' || p_rows);
    dbms_output.put_line(
        'Model ' || l_model || ', metric ' || p_metric); 
    dbms_output.put_line(
        'Use Column ' || l_vector_column 
        || ', dimensions ' || l_dim_count || ', format ' || l_format);
    
    --convert the search string to a vector using the correct model and format
    --wrap the call to vector_embeddings with the macro for a binary search
    if p_use_binary then
        l_search_v_sql := '
            select 
                to_binary_vector(
                    vector_embedding(##MODEL## using :search_text as data))
            ';
    else
        l_search_v_sql := '
            select 
                vector_embedding(##MODEL## using :search_text as data)
            ';        
    end if;
    l_search_v_sql := replace(l_search_v_sql, '##MODEL##', l_model);
    
    l_start := localtimestamp;
    execute immediate l_search_v_sql 
    into l_search_v using p_search_text;
    dbms_output.put_line('Time to get search vector: ' || to_char(localtimestamp - l_start));
    
    --build the semantic search query
    l_sql := '
        select name 
        from recipe_vectors
        order by vector_distance(##VECTOR_COLUMN##, :l_search_vector, ##METRIC##)
        fetch first :results rows only
        ';
    l_sql := replace(l_sql, '##METRIC##', p_metric);
    l_sql := replace(l_sql, '##VECTOR_COLUMN##', l_vector_column);
    
    l_start := localtimestamp;
    open cv for l_sql using l_search_v, p_rows;
    fetch cv bulk collect into l_results;
    close cv;
    dbms_output.put_line('Time to run search: ' || to_char(localtimestamp - l_start));
    for v in values of l_results loop
        dbms_output.put_line(v);
    end loop;
    
    dbms_output.put_line(lpad('-', 50, '-'));
    
end search_compare_recipe_vectors;
/

We can use this procedure to compare searches with different metrics for binary vectors and float vectors. For binary searches, we have to use either the hamming or jaccard distance metrics.

--search.compare_binary.recipe_vectors.sql
declare
    l_search_text varchar2(100);
begin
    l_search_text := 'a deliciously decadent dessert sounds good';
    search_compare_recipe_vectors(l_search_text, 3, 'cosine', p_use_binary => false);
    search_compare_recipe_vectors(l_search_text, 3, 'euclidean', p_use_binary => false);
    search_compare_recipe_vectors(l_search_text, 3, 'manhattan', p_use_binary => false);
    search_compare_recipe_vectors(l_search_text, 3, 'hamming', p_use_binary => true);
    search_compare_recipe_vectors(l_search_text, 3, 'jaccard', p_use_binary => true);    
end;
/

All of the searches produced the same results for binary and non-binary searches based on vectors from the MixedBread xsmall model.

Semantic search for [a deliciously decadent dessert sounds good] top k=3
Strawberry Pie
Raspberry Tarts
Chocolate Cake
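
The hamming metric used for the binary searches simply counts the bit positions where two binary vectors differ, which reduces to an XOR and a population count per byte; that is why binary comparisons are so much cheaper than float arithmetic. A small sketch in plain Python (illustrative only, operating on packed bytes rather than the database VECTOR type):

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Count of differing bits between two packed binary vectors."""
    assert len(a) == len(b), "vectors must have the same dimension count"
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# 0b10101010 vs 0b10101001 differ in exactly two bit positions
print(hamming_distance(b"\xaa", b"\xa9"))  # -> 2
# identical vectors are at distance 0
print(hamming_distance(b"\xff\x00", b"\xff\x00"))  # -> 0
```

Jaccard, the other metric allowed for binary vectors, instead compares the set bits of the two vectors as sets, so the two metrics can rank candidates differently.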

Create a similar procedure to test semantic searches against the various vectors stored in the menu_vectors table produced by the different models.

--create.procedure.search_compare_menu_vectors.sql

create or replace procedure search_compare_menu_vectors(
    p_search_text in varchar2, 
    p_rows in number default 3, 
    p_metric in varchar2 default 'cosine', 
    p_vector_col in varchar2 default 'embedding',
    p_use_binary in boolean default false
)
is
    l_model varchar2(100);
    l_format varchar2(100);
    l_dim_count number;
    l_vector_column varchar2(100);
    l_search_v vector;
    l_model_sql varchar2(1000);
    l_search_v_sql varchar2(4000);
    l_sql varchar2(4000);
    cv sys_refcursor;
    l_start timestamp;
    l_results sys.odcivarchar2list;
begin
    l_vector_column := p_vector_col || case when p_use_binary then '_binary' end;
    --get the model and format currently loaded to menu_vectors. embedding column specified
    l_model_sql := '
        select 
            ##COLUMN_BASE##_model
            , vector_dimension_format(##VECTOR_COLUMN##)
            , vector_dimension_count(##VECTOR_COLUMN##)
        from menu_vectors
        where rownum = 1';
    l_model_sql := replace(l_model_sql, '##COLUMN_BASE##', p_vector_col);
    l_model_sql := replace(l_model_sql, '##VECTOR_COLUMN##', l_vector_column);
    execute immediate l_model_sql
    into l_model, l_format, l_dim_count;
    
    dbms_output.put_line('Semantic search for [' || p_search_text 
        || '] top k=' || p_rows);
    dbms_output.put_line('Model ' || l_model || ', metric ' || p_metric);
    dbms_output.put_line('Use Column ' || l_vector_column
        || ', dimensions ' || l_dim_count
        || ', format ' || l_format);
    
    --convert the search string to a vector using the correct model and format
    if p_use_binary then
        l_search_v_sql := '
            select 
                to_binary_vector(
                    vector_embedding(##MODEL## using :search_text as data))
            ';
    else
        l_search_v_sql := '
            select 
                vector_embedding(##MODEL## using :search_text as data)
            ';        
    end if;
    l_search_v_sql := replace(l_search_v_sql, '##MODEL##', l_model);
    
    l_start := localtimestamp;
    execute immediate l_search_v_sql 
    into l_search_v using p_search_text;
    dbms_output.put_line('Time to get search vector: ' || to_char(localtimestamp - l_start));
    
    --build the semantic search query
    l_sql := '
        select mi.item_name 
        from 
            menu_items mi
            join menu_vectors v on mi.item_id = v.item_id
        order by vector_distance(##VECTOR_COLUMN##, :l_search_vector, ##METRIC##)
        fetch first :results rows only
        ';
    l_sql := replace(l_sql, '##METRIC##', p_metric);
    l_sql := replace(l_sql, '##VECTOR_COLUMN##', l_vector_column);
    
    --reset the timer so the reported search time excludes vector generation
    l_start := localtimestamp;
    open cv for l_sql using l_search_v, p_rows;
    fetch cv bulk collect into l_results;
    close cv;
    dbms_output.put_line('Time to run search: ' || to_char(localtimestamp - l_start));
    for v in values of l_results loop
        dbms_output.put_line(v);
    end loop;
        
    dbms_output.put_line(lpad('-', 50, '-'));
    
end search_compare_menu_vectors;
/

Now we can run a variety of searches to compare different models easily.

--search.compare_binary.menu_vectors.sql

declare
    type t_list is table of varchar2(100);
    l_searches t_list;      
    l_columns t_list;
    l_metrics t_list;
begin
    l_searches := t_list(
        'a decadent dessert would be tasty'
        , 'what kinds of healthy lunch options are there?'
        );
        
    l_columns := t_list('embedding', 'embedding1', 'embedding2', 'embedding3');
    l_metrics := t_list('cosine','euclidean','manhattan');
    
    <<searches>>
    for s in values of l_searches loop
    
        <<models>>
        for c in values of l_columns loop
        
            <<float_metrics>>
            for m in values of l_metrics loop
                search_compare_menu_vectors(s, 3, m, c, p_use_binary => false);
            end loop float_metrics;
            
            search_compare_menu_vectors(s, 3, 'hamming', c, p_use_binary => true);
            search_compare_menu_vectors(s, 3, 'jaccard', c, p_use_binary => true);  
            
        end loop models;
    
    end loop searches;
    
end;
/

For the small mixed bread model, all of the dessert searches took about the same amount of time. The returned results are nearly identical in most cases, indicating that this binary search would be a good candidate for a fast initial search.

Semantic search for [a decadent dessert would be tasty] top k=3
Model MXBAI_EMBED_XSMALL_V1

Use Column embedding, dimensions 384, format FLOAT32
Cosine and Euclidean float searches produced the same results:
Turtle Cheesecake, Molten Lava Cake, Crème Brûlée

cosine search: +000000000 00:00:00.055934000
euclidean search: +000000000 00:00:00.007555000

manhattan search: +000000000 00:00:00.008049000
Turtle Cheesecake, Crème Brûlée, Molten Lava Cake

Use Column embedding_binary, dimensions 384, format BINARY

hamming search: +000000000 00:00:00.013352000
Turtle Cheesecake, Molten Lava Cake, Crème Brûlée

jaccard search: +000000000 00:00:00.007535000
Turtle Cheesecake, Molten Lava Cake, Chocolate Marble Cheesecake

Based on the performance of the cosine float search, the large mixed bread model is clearly not suited to cosine searches. All the other searches showed consistent timing between the binary and float variants. The results were also very consistent, although churro fans may lean toward the binary searches here.

Semantic search for [a decadent dessert would be tasty] top k=3
Model MXBAI_EMBED_LARGE_V1

Use Column embedding1, dimensions 1024, format FLOAT32
All float searches returned the same results:
Chocolate Soufflé, Turtle Cheesecake, Molten Lava Cake

cosine search: +000000000 00:00:04.004011000
euclidean search: +000000000 00:00:00.953499000
manhattan search: +000000000 00:00:00.941320000

Use Column embedding1_binary, dimensions 1024, format BINARY

Both binary searches returned the same results:
Turtle Cheesecake, Chocolate Soufflé, Churros
hamming search: +000000000 00:00:01.073347000
jaccard search: +000000000 00:00:00.941120000

With MiniLM L6, the cosine search also turned out to be the slowest performer. Surprisingly, the vectors converted to binary format returned results that are close to the float vector searches for this model. The vector space defined by this model may be satisfactory for binary conversions, at least for preliminary reranking searches.

Semantic search for [a decadent dessert would be tasty] top k=3
Model ALL_MINILM_L6_V2

Use Column embedding2, dimensions 384, format FLOAT32

cosine search: +000000000 00:00:00.362923000
Turtle Cheesecake, Molten Lava Cake, Crème Brûlée

euclidean search: +000000000 00:00:00.009617000
Turtle Cheesecake, Molten Lava Cake, Crème Brûlée

manhattan search: +000000000 00:00:00.009927000
Crème Brûlée, Turtle Cheesecake, Molten Lava Cake

Use Column embedding2_binary, dimensions 384, format BINARY

hamming search: +000000000 00:00:00.023431000
Molten Lava Cake, Hot Chocolate, Crème Brûlée

jaccard search: +000000000 00:00:00.009421000
Molten Lava Cake, Hot Chocolate, Coconut Cream Pie

The larger MiniLM L12 model also produced reasonable results with the converted binary searches for desserts.

Semantic search for [a decadent dessert would be tasty] top k=3
Model ALL_MINILM_L12_V2

Use Column embedding3, dimensions 384, format FLOAT32
Top 3 results are the same for float searches, with slight variance in ordering.

cosine search: +000000000 00:00:00.467303000
Turtle Cheesecake, Chocolate Marble Cheesecake, Clafoutis

euclidean search: +000000000 00:00:00.018389000
Turtle Cheesecake, Chocolate Marble Cheesecake, Clafoutis

manhattan search: +000000000 00:00:00.018366000
Turtle Cheesecake, Clafoutis, Chocolate Marble Cheesecake

Use Column embedding3_binary, dimensions 384, format BINARY
Both binary searches returned the same results
Clafoutis, Turtle Cheesecake, Qatayef

hamming search: +000000000 00:00:00.025727000
jaccard search: +000000000 00:00:00.017236000

The results for the lunch searches showed the same general results as the dessert searches. Here are the outcomes for the small mixed bread model:

Semantic search for [what kinds of healthy lunch options are there?] top k=3
Model MXBAI_EMBED_XSMALL_V1

Use Column embedding, dimensions 384, format FLOAT32
All float searches returned the same results:
Quinoa Salad, Egg Salad Sandwich, Cobb Salad

cosine search: +000000000 00:00:00.110066000
euclidean search: +000000000 00:00:00.008086000
manhattan search: +000000000 00:00:00.007773000

Use Column embedding_binary, dimensions 384, format BINARY

hamming search: +000000000 00:00:00.010658000
Quinoa Salad, Tofu Scramble, Apple Walnut Salad

jaccard search: +000000000 00:00:00.007808000
Quinoa Salad, Tofu Lasagna, Vegan Lasagna

Looking at the lunch searches, the large mixed bread model shows almost the same results for float searches as it does for binary searches. The timing on a binary hamming search with this model makes it less suitable for an initial search.

Semantic search for [what kinds of healthy lunch options are there?] top k=3
Model MXBAI_EMBED_LARGE_V1

Use Column embedding1, dimensions 1024, format FLOAT32
All float searches produced the same results:
Quinoa Salad, Egg Salad Sandwich, Lentil Salad

cosine search: +000000000 00:00:01.261661000
euclidean search: +000000000 00:00:00.958576000
manhattan search: +000000000 00:00:00.940072000

Use Column embedding1_binary, dimensions 1024, format BINARY
Both binary searches produced the same results:
Quinoa Salad, Egg Salad Sandwich, Asparagus Pea Salad

hamming search: +000000000 00:00:05.450606000
jaccard search: +000000000 00:00:00.867010000

A jaccard search on a binary vector offers fast performance with about the same search accuracy. A binary column could be a good supplement to a table of float embeddings at minimal storage cost. If we store multiple vector columns for a variety of searches, we also open up the possibility of reranking searches.
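To put that storage cost in rough numbers (a sketch; actual on-disk sizes include some internal overhead): a float32 vector uses 4 bytes per dimension, while the binary format packs 8 dimensions into each byte, a 32x reduction. For the 1024-dimension mxbai large vectors that works out to roughly 4096 bytes versus 128 bytes per row:

```sql
--rough per-row storage for the float32 and binary vector columns
--(approximate: excludes Oracle's internal vector storage overhead)
select
    vector_dimension_count(embedding1)       as dimensions
    , vector_dimension_count(embedding1) * 4 as approx_float32_bytes
    , vector_dimension_count(embedding1) / 8 as approx_binary_bytes
from menu_vectors
where rownum = 1;
```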

Reranking Search With Binary Vectors

One approach to a reranking search obtains a set of preliminary results using one model, then runs another search over those results using a different model. This can offset the slower performance of larger models: a fast initial search over quantized vectors from a smaller model produces a candidate set, which the larger model then refines into the final results.

We can run a reranking search in a single query using inline views to nest the preliminary search as the source of the final search. Setting aside the need to pass the search text as a bind variable, the general syntax looks like this.

--reranking_search.basic_syntax.sql

select rownum as float_rank, binary_rank, item_name
from
    (
    select binary_rank, item_name
    from
        (
        select rownum as binary_rank, item_name, embedding
        from
            (
            select i.item_name, v.embedding
            from 
                menu_items i
                join menu_vectors v on i.item_id = v.item_id
            order by 
                vector_distance(
                    v.embedding_binary
                    , to_binary_vector(
                        vector_embedding(
                            MXBAI_EMBED_XSMALL_V1 
                            using 'A really sweet treat would be good.' as data))
                    , hamming)
            fetch first 25 rows only
            )
        )
    order by
        vector_distance(
            embedding 
            , vector_embedding(
                MXBAI_EMBED_XSMALL_V1 
                using 'A really sweet treat would be good.' as data)
            , manhattan)
    fetch first 5 rows only
    )
/

FLOAT_RANK BINARY_RANK ITEM_NAME                     
---------- ----------- ------------------------------
         1           1 Rocky Road                    
         2           8 Snickerdoodle                 
         3          22 Oatmeal Raisin Cookie         
         4           3 Fruit Salad                   
         5           4 Sugar Cookie    

Using a PL/SQL block, we can pass the search vectors to the reranking search query. We can also compare the performance and results to a search that only uses the float32 vectors.

--reranking_search.binary_and_float.same_model.mxbai_xsmall.sql

declare
    l_start timestamp;
    l_search_text varchar2(100);
    v_float vector;
    v_binary vector;
    cursor c is
    select rownum as float_rank, binary_rank,item_name
    from
        (
        select binary_rank, item_name
        from
            (
            select rownum as binary_rank, item_name, embedding
            from
                (
                select i.item_name, i.item_description, v.embedding
                from 
                    menu_items i
                    join menu_vectors v on i.item_id = v.item_id
                order by 
                    vector_distance(
                        v.embedding_binary
                        , v_binary
                        , jaccard)
                fetch first 20 rows only
                )
            )
        order by
            vector_distance(
                embedding
                , v_float
                , manhattan)
        fetch first 5 rows only);
begin
    l_search_text := 'A really sweet and decadent dessert.';
    l_start := localtimestamp;
    --Generate search vectors
    select vector_embedding(MXBAI_EMBED_XSMALL_V1 using l_search_text as data)
    into v_float;
    
    select to_binary_vector(v_float)
    into v_binary;
    
    dbms_output.put_line(
        'Time to generate search vectors: ' || to_char(localtimestamp - l_start));
    
    l_start := localtimestamp;
    
    for r in c loop   

        dbms_output.put_line(
            'Rank (Float ' || r.float_rank 
            || ', Binary ' || rpad(r.binary_rank,2,' ') 
            || ') ' || r.item_name);

    end loop;
    
    dbms_output.put_line(
        'Nested Reranking Search Duration: ' || to_char(localtimestamp - l_start));
        
    dbms_output.put_line(lpad('-',50,'-'));
    dbms_output.put_line('Compare to search using only mxbai xsmall:');
    search_compare_menu_vectors(l_search_text, 5, 'manhattan', 'embedding', p_use_binary => false);           

end;
/

The mxbai xsmall model is already pretty fast; even so, the reranking approach, using binary vectors for the initial pass, took about half the time for the same search.

The results of both searches are identical.

Rank (Float 1, Binary 1 ) Turtle Cheesecake
Rank (Float 2, Binary 2 ) Molten Lava Cake
Rank (Float 3, Binary 7 ) Crème Brûlée
Rank (Float 4, Binary 6 ) Hot Chocolate
Rank (Float 5, Binary 3 ) Clafoutis
Nested Reranking Search Duration: +000000000 00:00:00.004549000

Time to run search: +000000000 00:00:00.008192000
Turtle Cheesecake
Molten Lava Cake
Crème Brûlée
Hot Chocolate
Clafoutis

The difference is more dramatic with the mxbai large model.

--reranking_search.binary_and_float.same_model.mxbai_large.sql

prompt run the initial search by binary vectors for top 20 hits
prompt use the mxbai large model with hamming metric
prompt then rerank the binary search results with search on float32 vectors
prompt use the mxbai large model float vectors with cosine metric

declare
    l_start timestamp;
    l_search_text varchar2(100);
    v_float vector;
    v_binary vector;
    cursor c is
    select rownum as float_rank, binary_rank,item_name
    from
        (
        select binary_rank, item_name
        from
            (
            select rownum as binary_rank, item_name, embedding
            from
                (
                select i.item_name, i.item_description, v.embedding1 as embedding
                from 
                    menu_items i
                    join menu_vectors v on i.item_id = v.item_id
                order by 
                    vector_distance(
                        v.embedding1_binary
                        , v_binary
                        , jaccard)
                fetch first 20 rows only
                )
            )
        order by
            vector_distance(
                embedding
                , v_float
                , cosine)
        fetch first 5 rows only);
begin
    l_search_text := 'A healthy and nutritious dinner.';
    l_start := localtimestamp;
    --Generate search vectors
    select vector_embedding(MXBAI_EMBED_LARGE_V1 using l_search_text as data)
    into v_float;
    
    select to_binary_vector(v_float)
    into v_binary;
    dbms_output.put_line(
        'Time to generate search vectors: ' || to_char(localtimestamp - l_start));
    
    l_start := localtimestamp;
    
    for r in c loop   

        dbms_output.put_line(
            'Rank (Float ' || r.float_rank 
            || ', Binary ' || rpad(r.binary_rank,2,' ') 
            || ') ' || r.item_name);

    end loop;
    
    dbms_output.put_line(
        'Nested Reranking Search Duration: ' || to_char(localtimestamp - l_start));
        
    dbms_output.put_line(lpad('-',50,'-'));
    dbms_output.put_line('Compare to search using only mxbai large:');
    search_compare_menu_vectors(l_search_text, 5, 'cosine', 'embedding1', p_use_binary => false);        

end;
/

Using the larger model with 1024 dimensions, the binary reranking search took 0.00648 seconds while the simple search using only the float32 vectors took 0.9784 seconds, roughly two orders of magnitude difference in search performance.

The results are almost the same for the two searches.

Rank (Float 1, Binary 6 ) Lentil Salad
Rank (Float 2, Binary 3 ) Tofu and Kale Power Bowl
Rank (Float 3, Binary 5 ) Quinoa Salad
Rank (Float 4, Binary 8 ) Tofu and Mango Salad
Rank (Float 5, Binary 7 ) Farro Vegetable Salad
Nested Reranking Search Duration: +000000000 00:00:00.006480000


Time to run search: +000000000 00:00:00.978401000
Lentil Salad
Tofu and Kale Power Bowl
Tofu Buddha Bowl
Quinoa Salad
Tofu and Mango Salad

Both mixed bread models show clear performance gains from the reranking approach when the initial search runs against the binary vector space derived from the same model's float vectors.

Mixed Model Reranking Searches

We don’t have to use vectors from the same model for both phases of the reranking search if we have stored vectors from different models in the same table. We just need to make sure that each search vector matches the model of the vectors being searched. This allows us to leverage the speed of a small quantized model to optimize a search with a large, precise model.

Set up a reranking search with binary vectors converted from vectors generated by mxbai xsmall, and float32 vectors generated by the miniLM_12 model. The inner search will get the top 20 results using the small binary vectors with a hamming distance search. The final search will rank these results using the float32 vectors and cosine distance.

After executing the reranking search, run a top 5 search using just the miniLM_12 model float vectors and cosine distance for comparison.

--reranking_search.binary_and_float.mixed_models.sql

prompt run the initial search by binary vectors for top 20 hits
prompt use the mxbai_xsmall binary vectors in menu_vectors.embedding_binary
prompt then rerank the binary search results with search on float32 vectors
prompt use the miniLM_12 float vectors in menu_vectors.embedding3

declare
    l_start timestamp;
    l_search_text varchar2(100);
    v_float vector;
    v_binary vector;
    cursor c is
    select rownum as float_rank, binary_rank,item_name
    from
        (
        select binary_rank, item_name
        from
            (
            select rownum as binary_rank, item_name, embedding
            from
                (
                select i.item_name, i.item_description, v.embedding3 as embedding
                from 
                    menu_items i
                    join menu_vectors v on i.item_id = v.item_id
                order by 
                    vector_distance(
                        v.embedding_binary
                        , v_binary
                        , hamming)
                fetch first 20 rows only
                )
            )
        order by
            vector_distance(
                embedding
                , v_float
                , cosine)
        fetch first 5 rows only);
begin
    l_search_text := 'A healthy and nutritious dinner.';
    l_start := localtimestamp;
    dbms_output.put_line('Generate final search vector using ALL_MINILM_L12_V2');
    
    select vector_embedding(ALL_MINILM_L12_V2 using l_search_text as data)
    into v_float;
    
    dbms_output.put_line('Generate binary search vector using MXBAI_EMBED_XSMALL_V1');
    select 
        to_binary_vector(
                vector_embedding(MXBAI_EMBED_XSMALL_V1 using l_search_text as data))
    into v_binary;
    dbms_output.put_line(
        'Time to generate search vectors: ' || to_char(localtimestamp - l_start));
    
    l_start := localtimestamp;
    
    for r in c loop   

        dbms_output.put_line(
            'Rank (Float ' || r.float_rank 
            || ', Binary ' || rpad(r.binary_rank,2,' ') 
            || ') ' || r.item_name);

    end loop;
    
    dbms_output.put_line(
        'Nested Reranking Search Duration: ' || to_char(localtimestamp - l_start));
    
    dbms_output.put_line(lpad('-',50,'-'));
    dbms_output.put_line('Compare to search using only MiniLM12:');
    search_compare_menu_vectors(l_search_text, 5, 'cosine', 'embedding3', p_use_binary => false);
    
end;
/

With mixed model reranking searches, we still get search performance that is well over an order of magnitude better (about 40x here), with about the same search results returned by the final query.

Rank (Float 1, Binary 19) Tofu and Lentil Shepherd's Pie
Rank (Float 2, Binary 4 ) Quinoa Salad
Rank (Float 3, Binary 1 ) Tofu and Kale Power Bowl
Rank (Float 4, Binary 6 ) Tofu Buddha Bowl
Rank (Float 5, Binary 7 ) Seitan and Spinach Lasagna
Nested Reranking Search Duration: +000000000 00:00:00.002750000

Same search without mixed model reranking: +000000000 00:00:00.110360000
Tofu and Lentil Shepherd's Pie
Tofu and Peanut Noodles
Chicken Caesar Wrap
Quinoa Salad
Tofu and Kale Power Bowl

Conclusion

Binary vectors may have a critical role to play in optimizing vector search performance, especially when they drive the initial results phase of reranking searches. The additional storage required for the binary vectors is minimal and is more than offset by reranking gains measurable in orders of magnitude.

It could also be reasonable to store only binary vectors and run initial searches against them to gather a candidate set of results quickly. The final phase of the reranking query could then generate vectors for these candidates dynamically using a larger, more precise model. This approach would all but eliminate the overhead of storing float vectors, at the cost of calculating ten or twenty vectors at run time.
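A minimal sketch of that binary-only variant, assuming (as in the examples above) that menu_items.item_description holds the text that was originally embedded, that only the embedding_binary column is persisted, and that :search_binary and :search_float are bind variables holding search vectors from the small and large models respectively; the choice of MXBAI_EMBED_LARGE_V1 for the rerank is illustrative:

```sql
--initial search against the stored binary vectors only
with candidates as (
    select i.item_name, i.item_description
    from
        menu_items i
        join menu_vectors v on i.item_id = v.item_id
    order by
        vector_distance(v.embedding_binary, :search_binary, hamming)
    fetch first 20 rows only
)
--rerank by embedding the candidate descriptions at run time
select item_name
from candidates
order by
    vector_distance(
        vector_embedding(
            MXBAI_EMBED_LARGE_V1
            using item_description as data)
        , :search_float
        , cosine)
fetch first 5 rows only;
```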

As we saw in this post, once we have the option to convert vectors to binary format, there are few limits to the explorations we can make to understand the usefulness and limitations of this compact vector datatype.

–Anthony Harper
