set serveroutput on
declare
l_goal varchar2(100) := 'dynamic unpivot';
l_problem varchar2(100) := 'custom code for every query';
l_solution varchar2(100) := 'json_table and json_dataguide';
begin
dbms_output.put_line('For ' || l_goal);
dbms_output.put_line('Use ' || l_solution || ' to eliminate ' || l_problem);
end;
As Oracle developers, we are sometimes confronted with tasks that require tedious, repetitive custom SQL coding. The SQL language is powerful enough that we can often find a better way to do the same thing with less risk of error, but sometimes the tedium cannot be avoided.
Consider a data source that shows quarterly sales totals for several years, with each quarterly total in its own column and each year in a single row. This is a common presentation format that is easy to understand, but the structure makes the data hard to work with for further analysis.
sales_data-quarterly.sql
Note: run sales_data.report_sources.sql to create pivoted data sources
select
"Year", "Qtr1", "Qtr2", "Qtr3", "Qtr4"
from quarterly_sales_v
order by "Year";
Year Qtr1 Qtr2 Qtr3 Qtr4
2021 200184 198036 145044 153057
2022 140424 158398 150220 177484
2023 172737 163434 161465 152014
If we need to verify that the yearly totals are correct, we can add up the quarterly columns for each year.
select
"Year",
(
"Qtr1" + "Qtr2" + "Qtr3" + "Qtr4"
) as "Volume"
from quarterly_sales_v
order by "Year";
Year Volume
2021 696321
2022 626526
2023 649650
To work with this data, it would be easier to unpivot it into a set of rows with columns for year, quarter and amount.
select "Year", "Quarter", "Volume"
from
quarterly_sales_v
unpivot (
"Volume" for "Quarter" in (
"Qtr1", "Qtr2", "Qtr3", "Qtr4")
)
order by "Year", "Quarter";
Year Quarter Volume
2021 Qtr1 200184
2021 Qtr2 198036
2021 Qtr3 145044
...
2023 Qtr2 163434
2023 Qtr3 161465
2023 Qtr4 152014
12 rows selected.
After unpivoting, a query with the sum of volume grouped by year is simple.
with unpivot_sales as (
select "Year", "Quarter", "Volume"
from
quarterly_sales_v
unpivot (
"Volume" for "Quarter" in (
"Qtr1", "Qtr2", "Qtr3", "Qtr4")
)
)
select
"Year",
sum("Volume") as "Volume"
from unpivot_sales
group by "Year"
order by "Year";
Year Volume
2021 696321
2022 626526
2023 649650
Adding more columns to unpivot just adds more work. Instead of starting from a pivot of quarterly sales, let's try starting with a pivot of monthly sales.
sales_data-monthly.sql
select
"Year",
"January", "February", "March", "April", "May", "June",
"July", "August", "September", "October", "November", "December"
from monthly_sales_v
order by "Year";
Year January February March April May June July August September October November December
2021 53231 75425 71528 71715 62200 64121 42125 65347 37572 50038 55317 47702
2022 39111 52382 48931 45432 56751 56215 38223 70713 41284 63527 51175 62782
2023 35370 70456 66911 48697 57753 56984 52034 71215 38216 50581 42483 58950
Adding up all of the monthly columns also works to validate the yearly totals.
select
"Year",
(
"January" + "February" + "March" + "April" + "May" + "June" +
+ "July" + "August" + "September" + "October" + "November" + "December"
) as "Volume"
from monthly_sales_v
order by "Year";
Using unpivot simplifies the summation, but twelve columns in the unpivot expression doesn't make this task any easier.
with unpivoted_sales as (
select "Year", "Month", "Volume"
from
monthly_sales_v
unpivot (
"Volume" for "Month" in (
"January", "February", "March", "April", "May", "June",
"July", "August", "September", "October", "November", "December"
)
)
)
select
"Year",
sum("Volume") as "Volume"
from unpivoted_sales
group by "Year"
order by "Year";
What if the task is to validate many different reports that split this same data into an increasing number of columns? Let's look at some examples with more columns to unpivot.
The regional breakdown is similar to the quarterly report.
sales_data-regional.sql
select
"Year", "East", "West", "North", "South", "Central"
from regional_sales_v
order by "Year";
Year East West North South Central
2021 139621 162371 168171 118172 107986
2022 125800 127568 105420 155688 112050
2023 152208 123094 108329 122119 143900
Not too bad, you say: just unpivot and validate.
with unpivot_sales as (
select "Year", "Region", "Volume"
from
regional_sales_v
unpivot (
"Volume" for "Region" in (
"East", "West", "North", "South", "Central")
)
)
select
"Year",
sum("Volume") as "Volume"
from unpivot_sales
group by "Year"
order by "Year";
Regions by quarter has so many columns that we have to run the report separately for each region.
sales_data-regional_quarterly.sql
select
"Year", "WestQtr1", "WestQtr2", "WestQtr3", "WestQtr4"
from regional_quarterly_sales_v
order by "Year";
Year WestQtr1 WestQtr2 WestQtr3 WestQtr4
2021 52364 45971 36192 27844
2022 20720 44459 32197 30192
2023 22997 37030 30066 33001
This is definitely not something we want to validate by adding up all of the columns: it becomes hard to tell whether we have included every column correctly, and the risk of cut-and-paste errors keeps growing.
select
"Year",
(
"EastQtr1" + "WestQtr1" + "NorthQtr1" + "SouthQtr1" + "CentralQtr1"
+ "EastQtr2" + "WestQtr2" + "NorthQtr2" + "SouthQtr2" + "CentralQtr2"
+ "EastQtr3" + "WestQtr3" + "NorthQtr3" + "SouthQtr3" + "CentralQtr3"
+ "EastQtr4" + "WestQtr4" + "NorthQtr4" + "SouthQtr4" + "CentralQtr4"
) as "Volume"
from regional_quarterly_sales_v
order by "Year";
Using an unpivot approach is equally difficult.
with unpivoted_sales as (
select "Year", "Volume", "RegionQuarter"
from
regional_quarterly_sales_v
unpivot (
"Volume" for "RegionQuarter" in (
"EastQtr1", "WestQtr1", "NorthQtr1", "SouthQtr1", "CentralQtr1",
"EastQtr2", "WestQtr2", "NorthQtr2", "SouthQtr2", "CentralQtr2",
"EastQtr3", "WestQtr3", "NorthQtr3", "SouthQtr3", "CentralQtr3",
"EastQtr4", "WestQtr4", "NorthQtr4", "SouthQtr4", "CentralQtr4"
)
)
)
select
"Year",
sum("Volume") as "Volume"
from unpivoted_sales
group by "Year"
order by "Year";
What if the same data is also pivoted to include a column for every month for each of the 5 sales regions?
sales_data-regional_monthly.sql
select
"Year",
"EastJanuary", "EastFebruary", "EastMarch",
"EastApril", "EastMay", "EastJune"
from regional_monthly_sales_v
order by "Year";
Year EastJanuary EastFebruary EastMarch EastApril EastMay EastJune
2021 529 13488 19528 15659 16739 13319
2022 13220 2556 18191 5850 14574 6741
2023 12905 14277 7257 5711 18163 14741
Adding 60 columns no longer seems manageable, and the risk of errors is basically unacceptable.
select
"Year",
(
"EastJanuary" + "WestJanuary" + "NorthJanuary" + "SouthJanuary" + "CentralJanuary" +
"EastFebruary" + "WestFebruary" + "NorthFebruary" + "SouthFebruary" + "CentralFebruary" +
...
"EastNovember" + "WestNovember" + "NorthNovember" + "SouthNovember" + "CentralNovember" +
"EastDecember" + "WestDecember" + "NorthDecember" + "SouthDecember" + "CentralDecember"
) as "Volume"
from regional_monthly_sales_v
order by "Year";
Writing this unpivot is equally tedious and is an open invitation to mistakes.
with unpivoted_sales as (
select "Year", "Volume", "RegionMonth"
from
regional_monthly_sales_v
unpivot (
"Volume" for "RegionMonth" in (
"EastJanuary", "WestJanuary", "NorthJanuary", "SouthJanuary", "CentralJanuary",
"EastFebruary", "WestFebruary", "NorthFebruary", "SouthFebruary", "CentralFebruary",
...
"EastNovember", "WestNovember", "NorthNovember", "SouthNovember", "CentralNovember",
"EastDecember", "WestDecember", "NorthDecember", "SouthDecember", "CentralDecember"
)
)
)
select
"Year",
sum("Volume") as "Volume"
from unpivoted_sales
group by "Year"
order by "Year";
The data may be useful in a spreadsheet, but it cuts into our lunch hour when we are asked at the last minute to confirm that all of these reports add up to the same yearly totals. Wouldn't it be nice if there were a way to dynamically generate unpivot queries?
In this post I will explore a way to generate and run unpivot queries dynamically in PL/SQL. We will build a utility package that converts the source data to json and leverages JSON_DATAGUIDE to create JSON_TABLE and UNPIVOT SQL expressions dynamically.
Designing and building this utility will provide an excellent opportunity to walk through the development process step by step. By the end of the article we should be able to compare all of these reports in a single query with a minimum of custom code.
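To set expectations, here is a sketch of the kind of call we are working toward. It assumes the dynamic_json.unpivot_json_array pipelined function built later in this article and the quarterly_sales_v sample view; the exact call shape may differ slightly from the final sales-data examples.
-- sketch only: convert the pivoted view to a json array, then unpivot it dynamically
-- (on 21c and later the clob may need to be wrapped with json() to match the parameter type)
with sales_json as (
  select json_arrayagg(json_object(*) returning clob) as jdoc
  from quarterly_sales_v
)
select u."row#id" as "Year", u."column#key" as "Quarter", u."column#value" as "Volume"
from
  sales_json s,
  dynamic_json.unpivot_json_array(s.jdoc, row_identifier => 'Year') u;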
- Getting Started
- JSON_TABLE and UNPIVOT queries
- Test JSON Documents
- Design: Test Document Dataguides
- Convert Dataguide JSON To Relational Data
- Convert Dataguide To Associative Array
- Building The Dynamic JSON_TABLE Expression
- Building The Dynamic Unpivot Expression
- Unpivot Function Returning Associative Array
- Pipelined Function To Stream Unpivot Results
- Enhancements To Generalize Functionality
- Finishing Touches
- Dynamic Json Unpivot: Final Package Specification
- Dynamic Json Unpivot: Final Package Body
- Dynamic Json Unpivot: Unit Testing
- Dynamically Unpivoting Sales Data
- One Step Further: A Dynamic Unpivot Macro
- Next Steps
Getting Started
The source code for all articles here on practicalplsql.org is available in the practicalplsql public repository on github in the scripts/articles directory organized by subject and article. You can use create.user.practicalplsql.sql to create the practicalplsql schema. Sample data for articles can be found in the scripts/examples directory organized by example name. Scripts for this post are located in dynamic-json-unpivot. I have included the filename of the script in each code sample for easy reference. If you want to clean up the sample schema objects, use the drop_all_user_objects.sql script.
All scripts have been tested in SQL Developer on Oracle 19c, 21c and 23ai. Note: the anonymous blocks used to design the code all compiled in Oracle XE 21c, but the packaged code causes an internal error when compiling. The same packages compiled in OCI 21c, so the failure is likely due to the patch level of XE 21c. As of this writing, OCI has decommissioned 21c autonomous databases, so the target database versions are really 19c and 23ai.
All scripts should be executed with serveroutput on; this setting is omitted from the samples for clarity. Some queries and code blocks build on previous examples and omit the repeated code for brevity, but the source code has no omissions. Finally, when several scripts produce the same correct output, the output is shown only once to avoid repetition.
JSON_TABLE and UNPIVOT queries
We can query tables or generate rows on the fly to create a relational data source.
intro.relational_data.sql
with base as (
select
level * 10 as "c1_Number",
'details for ' || level * 10 as "c2_String",
date '2024-06-01' + level as "c3_Date"
from dual connect by level <= 3
)
select rownum as id, b.* from base b;
ID c1_Number c2_String c3_Date
1 10 details for 10 2024-06-02 00:00:00
2 20 details for 20 2024-06-03 00:00:00
3 30 details for 30 2024-06-04 00:00:00
JSON SQL methods can be used to project rows of relational data as a simple array of json objects. The default behavior of JSON_OBJECT is to convert table rows using the column names of the data source. This allows generation of json objects with a simple json_object(*) sql expression.
These rows can then be aggregated into a json array with JSON_ARRAYAGG. Going from relational data to json data can be done without specific references to the actual columns in the relational data source, so this conversion can easily be done with a minimal amount of code.
intro.relational_to_json.sql
with base as (...
), json_base as (
select json_arrayagg(json_object(*) order by "c1_Number") as "jDoc"
from base
)
select json_serialize("jDoc" pretty) as "jDoc"
from json_base;
[
{"c1_Number" : 10, "c2_String" : "details for 10", "c3_Date" : "2024-06-02T00:00:00"},
{"c1_Number" : 20, "c2_String" : "details for 20", "c3_Date" : "2024-06-03T00:00:00"},
{"c1_Number" : 30, "c2_String" : "details for 30", "c3_Date" : "2024-06-04T00:00:00"}
]
Converting this json document to relational data can be done with a JSON_TABLE expression that defines the document path to generate rows from, and a COLUMNS clause defining the mapping between json keys and relational columns to be projected. We can use dot notation and item methods to define each column quickly, but each column needs to be specified individually. If there are a large number of json keys to convert to columns, the COLUMNS clause requires a lot of literal code.
intro.json_to_relational.sql
with base as (...
), json_base as (...
), json_to_relational as (
select j.id, j."c1_Number", j."c2_String", j."c3_Date"
from
json_base b,
json_table(b."jDoc", '$[*]'
columns(
id for ordinality,
"c1_Number" number path '$.c1_Number',
"c2_String" varchar2(50) path '$.c2_String',
"c3_Date" date path '$.c3_Date'
)
) j
)
select * from json_to_relational;
ID c1_Number c2_String c3_Date
1 10 details for 10 2024-06-02 00:00:00
2 20 details for 20 2024-06-03 00:00:00
3 30 details for 30 2024-06-04 00:00:00
We can unpivot relational data, but the code is not very dynamic. The columns to be unpivoted need to be the same datatype. The UNPIVOT expression has to specify each column to be unpivoted.
If the relational source selects all columns from a table, we can identify the columns and their datatypes by querying the data dictionary. When the data source is a query, it is much harder to determine the columns and their datatypes. This information is needed to coerce the datatypes and create an unpivot query.
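For the table or view case, a sketch of that dictionary lookup might look like the following (assuming the object is owned by the current schema; the view name here is just an example):
-- sketch only: list the columns and datatypes of a table or view owned by the current user,
-- which could be used to generate the datatype coercion and the unpivot column list
select column_name, data_type, data_length
from user_tab_columns
where table_name = 'QUARTERLY_SALES_V'
order by column_id;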
intro.unpivot_relational.sql
with base as (...
), base_to_common_datatype as (
select
rownum as id,
to_char("c1_Number") as "c1_Number",
"c2_String",
to_char("c3_Date") as "c3_Date"
from base
), unpivot_base as (
select id, "cName", "cValue"
from
base_to_common_datatype
unpivot (
"cValue" for "cName" in (
"c1_Number", "c2_String", "c3_Date"
)
)
)
select * from unpivot_base;
ID cName cValue
1 c1_Number 10
1 c2_String details for 10
1 c3_Date 2024-06-02 00:00:00
2 c1_Number 20
2 c2_String details for 20
2 c3_Date 2024-06-03 00:00:00
3 c1_Number 30
3 c2_String details for 30
3 c3_Date 2024-06-04 00:00:00
If we convert a json document to relational data using JSON_TABLE, we can take a shortcut and coerce the datatypes in the COLUMNS clause. This makes the unpivot simpler. All column definitions and the unpivot will still require specific references to column names.
intro.unpivot_json_table.sql
with base as (...
), json_base as (...
), json_to_relational as (
select j.id, j."c1_Number", j."c2_String", j."c3_Date"
from
json_base b,
json_table(b."jDoc", '$[*]'
columns(
id for ordinality,
"c1_Number" varchar2(50) path '$.c1_Number',
"c2_String" varchar2(50) path '$.c2_String',
"c3_Date" varchar2(50) path '$.c3_Date'
)
) j
), unpivot_base as (
select id, "cName", "cValue"
from
json_to_relational
unpivot (
"cValue" for "cName" in (
"c1_Number", "c2_String", "c3_Date"
)
)
)
select * from unpivot_base;
ID cName cValue
1 c1_Number 10
1 c2_String details for 10
1 c3_Date 2024-06-02T00:00:00
2 c1_Number 20
2 c2_String details for 20
2 c3_Date 2024-06-03T00:00:00
3 c1_Number 30
3 c2_String details for 30
3 c3_Date 2024-06-04T00:00:00
If the data is already a json document, we can access the json keys by calling the JSON SQL aggregate method JSON_DATAGUIDE. This will return the schema of the json document, with the json paths and their datatypes.
For this simple json document, the primary array is typed as an array, each row from the data source is an object, and all of the other elements are simple scalar values typed as string or number. The date value is typed as a string because there is no date datatype in json. Because we want to coerce all of the datatypes for the unpivot anyway, this won't be a problem.
For these scalar elements, the json path is the original column name. As we will see, the results of the JSON_DATAGUIDE function provide sufficient information to create a JSON_TABLE expression and an UNPIVOT query dynamically!
intro.json_dataguide.sql
with base as (...
), json_base as (...
)
select json_serialize(json_dataguide("jDoc") pretty) as "jDataguide"
from json_base;
[
{"o:path" : "$", "type" : "array", "o:length" : 256},
{"o:path" : "$.c3_Date", "type" : "string", "o:length" : 32},
{"o:path" : "$.c1_Number", "type" : "number", "o:length" : 2},
{"o:path" : "$.c2_String", "type" : "string", "o:length" : 16}
]
In a PL/SQL function, we can accept a json document as the input parameter, query its dataguide to get the column information needed to build a JSON_TABLE expression and an UNPIVOT query based on it, and then return the unpivoted values in a standard shape as pipelined rows.
The design process to get to a dynamic unpivot function can be broken down into separate parts based on each functionality needed. Each part can be prototyped using an anonymous block and local functions so that we can see how the pieces fit together. When all the parts are worked out, we will be ready to put them together as a package that can be compiled and tested.
Test JSON Documents
Using the same json document throughout will keep the design process simple. A json object in an array may be missing keys for null values that were never created (for example, when json_object uses absent on null), so our test objects should not all have the same keys present.
The json array could also be a nested attribute of a json object, so we should also use a second test document that has this more complex structure. We will look closer at the differences this creates for the dataguide and for the necessary JSON_TABLE expressions when converting to relational data.
To keep the design scripts uncluttered, let's put this test json function into a package and add a print procedure to view it directly. This can easily be modified to return the JSON datatype instead of a CLOB on Oracle 21c or later. Since many of us are currently preparing to migrate 19c codebases to 23ai, I will use conditional compilation to keep the code compatible with Oracle 19c, 21c and 23ai for the rest of this article.
design-1.1-test_jdoc-package.sql
create or replace package dynamic_json#test
authid current_user
as
subtype json_format is varchar2(10);
c_array_simple constant json_format := 'simple';
c_array_nested constant json_format := 'nested';
function test_jdoc(
format_option in json_format default c_array_simple
) return $if dbms_db_version.version < 21 $then clob $else json $end ;
end dynamic_json#test;
/
create or replace package body dynamic_json#test
as
function test_jdoc(
format_option in json_format default c_array_simple
) return $if dbms_db_version.version < 21 $then clob $else json $end
is
l_json_text clob;
begin
case format_option
when c_array_simple then
l_json_text := to_clob(
q'~
[
{"name":"square","side":5,"color":"blue"},
{"name":"rectangle","length":5,"width":3},
{"name":"box","length":5,"width":3,"height":2},
{"name":"hexagon","side":3,"color":"red"},
{"name":"circle","radius":3},
]
~');
when c_array_nested then
l_json_text := to_clob(
q'~
{"my_shapes":
[
{"name":"square","side":5,"color":"blue"},
{"name":"rectangle","length":5,"width":3},
{"name":"box","length":5,"width":3,"height":2},
{"name":"hexagon","side":3,"color":"red"},
{"name":"circle","radius":3},
]
}
~');
else
raise_application_error(-20100, format_option
|| ' is an undefined test document type'
);
end case;
return
$if dbms_db_version.version < 21 $then
l_json_text
$else
json(l_json_text)
$end
;
end test_jdoc;
end dynamic_json#test;
/
These sample documents can be checked in a query to be sure they are properly constructed json. When we are finished with the design process, we can unit test with this query as the source for our pipelined function.
design-1.2-test_jdoc-query.sql
with json_document as (
select dynamic_json#test.test_jdoc('simple') as jdoc
from dual
)
select json_serialize(b.jdoc returning clob pretty) as jdoc
from json_document b;
with json_document as (
select dynamic_json#test.test_jdoc('nested') as jdoc
from dual
)
select json_serialize(b.jdoc returning clob pretty) as jdoc
from json_document b;
Design: Test Document Dataguides
We can use the test function to see what the dataguides will look like for each of the test json documents.
design-2.1-dataguide-test_jdoc.sql
with json_document as (
select dynamic_json#test.test_jdoc('simple') as jdoc
from dual
)
select
json_serialize(json_dataguide(b.jdoc) returning clob pretty) as jdoc_dataguide
from json_document b;
with json_document as (
select dynamic_json#test.test_jdoc('nested') as jdoc
from dual
)
select
json_serialize(json_dataguide(b.jdoc) returning clob pretty) as jdoc_dataguide
from json_document b;
dataguide for simple test document:
[
{"o:path" : "$", "type" : "array", "o:length" : 1},
{"o:path" : "$.name", "type" : "string", "o:length" : 16},
{"o:path" : "$.side", "type" : "number", "o:length" : 2},
{"o:path" : "$.color", "type" : "string", "o:length" : 4},
{"o:path" : "$.width", "type" : "number", "o:length" : 2},
{"o:path" : "$.height", "type" : "number", "o:length" : 2},
{"o:path" : "$.length", "type" : "number", "o:length" : 2},
{"o:path" : "$.radius", "type" : "number", "o:length" : 2}
]
dataguide for nested test document:
[
{"o:path" : "$", "type" : "object", "o:length" : 1},
{"o:path" : "$.my_shapes", "type" : "array", "o:length" : 1},
{"o:path" : "$.my_shapes.name", "type" : "string", "o:length" : 16},
{"o:path" : "$.my_shapes.side", "type" : "number", "o:length" : 2},
{"o:path" : "$.my_shapes.color", "type" : "string", "o:length" : 4},
{"o:path" : "$.my_shapes.width", "type" : "number", "o:length" : 2},
{"o:path" : "$.my_shapes.height", "type" : "number", "o:length" : 2},
{"o:path" : "$.my_shapes.length", "type" : "number", "o:length" : 2},
{"o:path" : "$.my_shapes.radius", "type" : "number", "o:length" : 2}
]
The dataguide for the nested json array is almost identical, but the column paths include the root object with the array attribute my_shapes.
Taking the dataguide of these document dataguides shows that all flat dataguides share a standard structure. The json attribute values of this meta dataguide are escaped because they include double quotes and colons.
design-2.2-any_dataguide.sql
with json_document as (
select dynamic_json#test.test_jdoc('simple') as jdoc
from dual
), document_dataguide as (
select json_dataguide(jdoc) as jdg
from json_document
), dataguide_dataguide as (
select json_dataguide(jdg) as jdg
from document_dataguide
)
select json_serialize(jdg returning clob pretty) as any_dataguide
from dataguide_dataguide;
[
{"o:path" : "$", "type" : "array", "o:length" : 512},
{"o:path" : "$.type", "type" : "string", "o:length" : 8},
{"o:path" : "$.\"o:path\"", "type" : "string", "o:length" : 8},
{"o:path" : "$.\"o:length\"", "type" : "number", "o:length" : 2}
]
Convert Dataguide JSON To Relational Data
Because the dataguide has a standard structure, converting any dataguide to relational data will always use the same JSON_TABLE expression. The paths in the simple document are the relative paths we need to convert the source document to relational data. The paths are also very close to column names that we could use when converting to relational data.
design-3.1-dataguide_columns-query.sql
with dataguide as (
select
json_dataguide(
dynamic_json#test.test_jdoc('simple')
) as jdoc_dataguide
from dual
), dataguide_relational as (
select
dbms_assert.enquote_name(
ltrim(j.dg_path, '$.')
, capitalize => false
) as column_name,
j.column_type,
j.dg_path as column_path
from
dataguide g,
json_table(g.jdoc_dataguide, '$[*]'
columns(
dg_path path '$."o:path".string()',
column_type path '$.type.string()'
)
) j
)
select column_name, column_type, column_path
from dataguide_relational
where column_type not in ('object', 'array');
If we strip out the array key for the nested document from the column paths using the replace function, we will end up with the same values for the column definitions.
design-3.1-dataguide_columns-query.sql
prompt adjust column name and relative path for the nested format by removing the array path
with dataguide as (
select
json_dataguide(
dynamic_json#test.test_jdoc('nested')
) as jdoc_dataguide
from dual
), dataguide_relational as (
select
dbms_assert.enquote_name(
ltrim(replace(j.dg_path, '.my_shapes'), '$.'),
capitalize => false
) as column_name,
j.column_type,
replace(j.dg_path, '.my_shapes') as column_path
from
dataguide g,
json_table(g.jdoc_dataguide, '$[*]'
columns(
dg_path path '$."o:path".string()',
column_type path '$.type.string()'
)
) j
)
select column_name, column_type, column_path
from dataguide_relational
where column_type not in ('object', 'array');
COLUMN_NAME COLUMN_TYPE COLUMN_PATH
"name" string $.name
"side" number $.side
"color" string $.color
"width" number $.width
"height" number $.height
"length" number $.length
"radius" number $.radius
In the dataguide for the nested document, the paths include the array attribute they are nested under. Removing it creates simple column names and relative paths. The array path will need to be a parameter to our function so that we can do this replacement. We will also need it as a parameter to specify the path for JSON_TABLE to generate rows from.
Convert Dataguide To Associative Array
Using an anonymous block with nested functions allows us to design the final functions we will need without actually compiling them.
Because the flat dataguide always has the same structure, we can select it into an associative array after using JSON_TABLE to convert it to relational data. An associative array will be easier to use when building dynamic sql statements based on these column definitions. A function to print out the associative array of column definitions will also be useful.
The dataguide elements with a type of object and array are not scalar elements in the document, so we will exclude these in the WHERE clause.
We have to adjust the path arguments for the nested json document so that they create well formed column names and the column paths are all relative paths for use in JSON_TABLE. Notice that using inline conditional compilation to conditionally define a variable’s datatype makes it clear why the $end should not have a semicolon.
The collection type column_definitions is an associative array of a record type column_definition. We can use this to keep the column definitions available to format both the JSON_TABLE expression and the UNPIVOT expression.
In 23ai, SQL supports booleans, so we can simply call dbms_assert to properly quote identifiers. Since we want the code to be compatible with 19c, we have to use different syntax to achieve the same goal. If we were writing a plain SQL query, conditional compilation could not be used and we would have to write the query specifically for lower versions that don't support booleans in SQL. Because we are writing the query inside PL/SQL, we can customize it with conditional compilation and the compiler will use the syntax appropriate to the version of Oracle.
Note: each part of the design process will include a debug procedure that uses dbms_output to show the results. These procedures are very simple and have been omitted from the code samples for clarity. Look at the full script in git to see the details of these procedures.
design-3.2-dataguide-columns-collection.sql
type column_definition is record (
column_name varchar2(64),
column_type varchar2(20),
column_path varchar2(100));
type column_definitions is table of column_definition index by pls_integer;
function dataguide_columns(
jdoc in $if dbms_db_version.version >= 21 $then json $else clob $end ,
array_path in varchar2 default null
) return column_definitions
is
l_key_columns column_definitions;
begin
with dataguide as (
select json_dataguide(jdoc) as jdoc_dataguide
from dual
), dataguide_relational as (
select
$if dbms_db_version.version >= 23 $then
dbms_assert.enquote_name(
ltrim(replace(j.dg_path, array_path), '$.'),
capitalize => false
) as column_name,
$else
'"' || ltrim(replace(j.dg_path, array_path), '$.') || '"' as column_name,
$end
j.column_type,
replace(j.dg_path, array_path) as column_path
from
dataguide g,
json_table(g.jdoc_dataguide, '$[*]'
columns(
dg_path path '$."o:path".string()',
column_type path '$.type.string()'
)
) j
)
select column_name, column_type, column_path
bulk collect into l_key_columns
from dataguide_relational
where column_type not in ('object', 'array');
return l_key_columns;
end dataguide_columns;
column_name => "name", column_type => string, column_path => $.name
column_name => "side", column_type => number, column_path => $.side
column_name => "color", column_type => string, column_path => $.color
column_name => "width", column_type => number, column_path => $.width
column_name => "height", column_type => number, column_path => $.height
column_name => "length", column_type => number, column_path => $.length
column_name => "radius", column_type => number, column_path => $.radius
Both sample documents return the same column information with relative paths because we passed the array path as a parameter and used it to clean up both the column names and paths.
Building The Dynamic JSON_TABLE Expression
Now that we can pull the column information from the dataguide, we have enough information to build the COLUMNS clause for JSON_TABLE. This dynamic query will be internal to our primary function, so we will put the json document into it as a bind variable. It is also intended to be used for unpivoting, so all of the column expressions can be typed as varchar2(4000) in the JSON_TABLE expression. Using a constant to represent this will allow us some flexibility later if we want to change to another datatype or size.
Create some formatting constants to keep the code readable and to allow easy changes later on. Use c_max_column_name_length to pad the column name and keep the resulting sql well formatted.
To make the structure of the sql expression clearer, we can create it as a template with markers for the sections that will be replaced (##ARRAY_PATH##, ##JSON_TABLE_COLUMNS##). This results in more readable code that cleanly separates the building of the replacement pieces from the template of the sql statement.
design-4.1-json_table_expression.sql
c_lf constant varchar2(1) := chr(10);
c_indent constant varchar2(100) := lpad(' ', 12, ' ');
c_max_column_name_length constant number := 10;
c_unpivot_to_datatype constant varchar2(20) := 'varchar2(4000)';
function json_table_expression(
key_columns in column_definitions,
array_path in varchar2 default null
) return varchar2
is
l_sql varchar2(32000);
l_columns_clause varchar2(4000);
begin
--build the columns expression
for i in indices of key_columns loop
l_columns_clause := l_columns_clause || c_indent
|| ', ' || rpad(key_columns(i).column_name, c_max_column_name_length + 1, ' ')
|| c_unpivot_to_datatype
|| ' path ''' || key_columns(i).column_path || ''''
|| case when i <> key_columns.count then c_lf end;
end loop;
l_sql := q'+
select j.*
from
json_table(:jdoc, '$##ARRAY_PATH##[*]'
columns (
"row#id" for ordinality
##JSON_TABLE_COLUMNS##
)
) j
+';
l_sql := replace(l_sql, '##JSON_TABLE_COLUMNS##', l_columns_clause);
l_sql := replace(l_sql, '##ARRAY_PATH##', rtrim(array_path,'.'));
return l_sql;
end json_table_expression;
JSON_TABLE Expression generated from document dataguide. no array path
select j.*
from
json_table(:jdoc, '$[*]'
columns (
"row#id" for ordinality
, "name" varchar2(4000) path '$.name'
, "side" varchar2(4000) path '$.side'
, "color" varchar2(4000) path '$.color'
, "width" varchar2(4000) path '$.width'
, "height" varchar2(4000) path '$.height'
, "length" varchar2(4000) path '$.length'
, "radius" varchar2(4000) path '$.radius'
)
) j
JSON_TABLE Expression generated from document dataguide. array_path => .my_shapes
select j.*
from
json_table(:jdoc, '$.my_shapes[*]'
columns (
"row#id" for ordinality
, "name" varchar2(4000) path '$.name'
, "side" varchar2(4000) path '$.side'
, "color" varchar2(4000) path '$.color'
, "width" varchar2(4000) path '$.width'
, "height" varchar2(4000) path '$.height'
, "length" varchar2(4000) path '$.length'
, "radius" varchar2(4000) path '$.radius'
)
) j
The output here is a well-formed JSON_TABLE statement that converts a json bind variable to relational data. Test this query by commenting out the bind variable line and replacing it with a call to the function that returns the test document.
design-4.2-json_table_expression-test.sql
--json_table(:jdoc, '$[*]'
json_table(dynamic_json#test.test_jdoc(), '$[*]'
--json_table(:jdoc, '$.my_shapes[*]'
json_table(dynamic_json#test.test_jdoc('nested'), '$.my_shapes[*]'
row#id name side color width height length radius
1 square 5 blue <null> <null> <null> <null>
2 rectangle <null> <null> 3 <null> 5 <null>
3 box <null> <null> 3 2 5 <null>
4 hexagon 3 red <null> <null> <null> <null>
5 circle <null> <null> <null> <null> <null> 3
The JSON_TABLE document path has been set appropriately for each json document. Both queries return the same results.
Building The Dynamic Unpivot Expression
Now that the dynamic JSON_TABLE expression is working correctly, we can create another function to build a dynamic UNPIVOT query from the JSON_TABLE query used as a CTE.
design-5.1-unpivot-query.sql
function unpivot_expression(
key_columns in column_definitions,
json_table_query in varchar2
) return varchar2
is
l_sql varchar2(32000);
l_unpivot_columns varchar2(4000);
begin
--build the unpivot expression using the columns collection
for i in indices of key_columns loop
l_unpivot_columns := l_unpivot_columns || c_indent
|| case when i > 1 then ', ' end
|| key_columns(i).column_name
|| case when i <> key_columns.count then c_lf end;
end loop;
l_sql := q'+
with json_to_relational as (
##JSON_TABLE_EXPRESSION##
)
select "row#id", "column#key", "column#value"
from
json_to_relational
unpivot (
"column#value" for "column#key" in (
##UNPIVOT_COLUMNS##
)
)
order by "row#id", "column#key"
+';
l_sql := replace(l_sql, '##JSON_TABLE_EXPRESSION##', json_table_query);
l_sql := replace(l_sql, '##UNPIVOT_COLUMNS##', l_unpivot_columns);
return l_sql;
end unpivot_expression;
UNPIVOT Query generated from document dataguide. no array path
with json_to_relational as (
select j.*
from
json_table(:jdoc, '$[*]'
columns (
"row#id" for ordinality
, "name" varchar2(4000) path '$.name'
, "side" varchar2(4000) path '$.side'
, "color" varchar2(4000) path '$.color'
, "width" varchar2(4000) path '$.width'
, "height" varchar2(4000) path '$.height'
, "length" varchar2(4000) path '$.length'
, "radius" varchar2(4000) path '$.radius'
)
) j
)
select "row#id", "column#key", "column#value"
from
json_to_relational
unpivot (
"column#value" for "column#key" in (
"name"
, "side"
, "color"
, "width"
, "height"
, "length"
, "radius"
)
)
order by "row#id", "column#key"
UNPIVOT Query generated from document dataguide. array_path => .my_shapes
with json_to_relational as (
select j.*
from
json_table(:jdoc, '$.my_shapes[*]'
columns (
"row#id" for ordinality
, "name" varchar2(4000) path '$.name'
, "side" varchar2(4000) path '$.side'
, "color" varchar2(4000) path '$.color'
, "width" varchar2(4000) path '$.width'
, "height" varchar2(4000) path '$.height'
, "length" varchar2(4000) path '$.length'
, "radius" varchar2(4000) path '$.radius'
)
) j
)
select "row#id", "column#key", "column#value"
from
json_to_relational
unpivot (
"column#value" for "column#key" in (
"name"
, "side"
, "color"
, "width"
, "height"
, "length"
, "radius"
)
)
order by "row#id", "column#key"
The unpivot query is the same for both test document formats; the only difference is in the JSON_TABLE expression.
Testing the resulting unpivot queries with the appropriate test documents returns the correct results in both cases.
design-5.2-unpivot-query-test.sql
row#id column#key column#value
1 color blue
1 name square
1 side 5
2 length 5
2 name rectangle
2 width 3
3 height 2
3 length 5
3 name box
3 width 3
4 color red
4 name hexagon
4 side 3
5 name circle
5 radius 3
Unpivot Function Returning Associative Array
Because our final function is intended to be pipelined for use in SQL, it will return a collection type declared in a package specification or as a user defined type. We can test the code for this in the anonymous block with a local function that returns a collection of record types.
For the initial design we can use an associative array indexed by pls_integer that will have to be changed to a nested table for the pipelined function.
design-5.3-unpivot-query-collection.sql
type column_value is record(
row#id number,
column#key varchar2(64),
column#value varchar2(4000));
type column_values is table of column_value index by pls_integer;
function unpivot_collection(
jdoc in json,
array_path in varchar2 default null
) return column_values
is
l_key_columns column_definitions;
l_json_table_query varchar2(4000);
l_sql varchar2(4000);
l_values column_values;
begin
l_key_columns := dataguide_columns(jdoc, array_path);
l_json_table_query := json_table_expression(l_key_columns, array_path);
l_sql := unpivot_expression(l_key_columns, l_json_table_query);
execute immediate l_sql bulk collect into l_values using jdoc;
return l_values;
end unpivot_collection;
row#id => 1, column#key => color, column#value => blue
row#id => 1, column#key => name, column#value => square
row#id => 1, column#key => side, column#value => 5
row#id => 2, column#key => length, column#value => 5
row#id => 2, column#key => name, column#value => rectangle
row#id => 2, column#key => width, column#value => 3
row#id => 3, column#key => height, column#value => 2
row#id => 3, column#key => length, column#value => 5
row#id => 3, column#key => name, column#value => box
row#id => 3, column#key => width, column#value => 3
row#id => 4, column#key => color, column#value => red
row#id => 4, column#key => name, column#value => hexagon
row#id => 4, column#key => side, column#value => 3
row#id => 5, column#key => name, column#value => circle
row#id => 5, column#key => radius, column#value => 3
Pipelined Function To Stream Unpivot Results
The unpivot_collection function is very similar to a pipelined function. The primary difference is that we have to change the column_values type to a nested table (remove the index by clause) instead of an associative array (index by pls_integer).
Put all of the constants, types and procedure or function signatures into the specification. Add a function signature for the pipelined function at the end.
design-6.1-dynamic_json_unpivot-package.sql
create or replace package dynamic_json#alpha
authid current_user
as
c_lf constant varchar2(1) := chr(10);
c_indent constant varchar2(100) := lpad(' ', 12, ' ');
c_max_column_name_length constant number := 10;
c_unpivot_to_datatype constant varchar2(20) := 'varchar2(4000)';
type column_value is record(
row#id number,
column#key varchar2(64),
column#value varchar2(4000));
type column_values is table of column_value;
type column_definition is record (
column_name varchar2(64),
column_type varchar2(20),
column_path varchar2(100));
type column_definitions is table of column_definition index by pls_integer;
function dataguide_columns(
jdoc in $if dbms_db_version.version >= 21 $then json $else clob $end ,
array_path in varchar2 default null
) return column_definitions;
function json_table_expression(
key_columns in column_definitions,
array_path in varchar2 default null
) return varchar2;
function unpivot_expression(
key_columns in column_definitions,
json_table_query in varchar2
) return varchar2;
function unpivot_collection(
jdoc in $if dbms_db_version.version >= 21 $then json $else clob $end ,
array_path in varchar2 default null
) return column_values;
function unpivot_json_array(
jdoc in $if dbms_db_version.version >= 21 $then json $else clob $end ,
array_path in varchar2 default null
) return column_values pipelined;
end dynamic_json#alpha;
/
The package body is identical to the anonymous block we were developing, except for the constants and types that we moved to the package specification.
The pipelined function implementation is very simple: build the correct sql statement and use it to populate the output collection, then pipe each element of the collection back to the sql engine.
create or replace package body dynamic_json#alpha
as
function dataguide_columns...
function json_table_expression...
function unpivot_expression...
function unpivot_collection...
function unpivot_json_array(
jdoc in $if dbms_db_version.version >= 21 $then json $else clob $end ,
array_path in varchar2 default null
) return column_values pipelined
is
l_key_columns column_definitions;
l_json_table_query varchar2(4000);
l_sql varchar2(4000);
l_values column_values;
begin
l_key_columns := dataguide_columns(jdoc, array_path);
l_json_table_query := json_table_expression(l_key_columns, array_path);
l_sql := unpivot_expression(l_key_columns, l_json_table_query);
execute immediate l_sql bulk collect into l_values using jdoc;
for i in indices of l_values loop
pipe row (l_values(i));
end loop;
return;
end unpivot_json_array;
end dynamic_json#alpha;
/
We can test the pipelined function directly from SQL using the test document function and the appropriate arguments. Both test documents return the same results. It only took a few helper functions to get there and we are dynamically converting simple json arrays to unpivoted relational data!
design-6.2-dynamic_json_unpivot-test.sql
select *
from dynamic_json#alpha.unpivot_json_array(dynamic_json#test.test_jdoc());
select *
from dynamic_json#alpha.unpivot_json_array(dynamic_json#test.test_jdoc('nested'), '.my_shapes');
ROW#ID COLUMN#KEY COLUMN#VALUE
1 color blue
1 name square
1 side 5
2 length 5
2 name rectangle
2 width 3
3 height 2
3 length 5
3 name box
3 width 3
4 color red
4 name hexagon
4 side 3
5 name circle
5 radius 3
Enhancements To Generalize Functionality
For a pipelined function that unpivots a simple json array into rows with column names and values, we need to identify these column values with the source row that they are unpivoted from. Currently, our prototype generates a row identifier using the json array index ordinality. If the json source data contains an attribute that is a valid row identifier, we should use that instead of the ordinality value.
We can add an optional parameter to our functions and procedures to specify this row identifier. Most of the time we will simply pass the identifier along to other functions; the only places we actually need to act on it are when building the columns collection and when creating the JSON_TABLE expression.
The columns collection can filter out the row identifier, so that it doesn’t become part of the unpivot expression. The JSON_TABLE expression function will decide if the row#id column should be based on a row identifier or the array ordinality.
First we need to update our test document package to return json that contains a valid identifier in each json object. Add constants for the new test formats to the package specification.
design-7.1-test_jdoc-package.sql
c_array_simple_id constant json_format := 'simple_id';
c_array_nested_id constant json_format := 'nested_id';
In the package body, add the new test documents with the shape_id attribute to the case statement that gets the correct document.
design-7.1-test_jdoc-package.sql
case format_option
when c_array_simple then...
when c_array_nested then...
when c_array_simple_id then
l_json_text := to_clob(
q'~
[
{"shape_id":101,"name":"square","side":5,"color":"blue"},
{"shape_id":102,"name":"rectangle","length":5,"width":3},
{"shape_id":103,"name":"box","length":5,"width":3,"height":2},
{"shape_id":104,"name":"hexagon","side":3,"color":"red"},
{"shape_id":105,"name":"circle","radius":3},
]
~');
when c_array_nested_id then
l_json_text := to_clob(
q'~
{"my_shapes":
[
{"shape_id":101,"name":"square","side":5,"color":"blue"},
{"shape_id":102,"name":"rectangle","length":5,"width":3},
{"shape_id":103,"name":"box","length":5,"width":3,"height":2},
{"shape_id":104,"name":"hexagon","side":3,"color":"red"},
{"shape_id":105,"name":"circle","radius":3},
]
}
~');
else
raise_application_error(-20100, format_option || ' is an undefined test document type');
end case;
Now we can update the package to handle this case with a row_identifier parameter. Add this parameter to the function signatures and pass it along whenever a subfunction is called.
In the dataguide_columns function the implicit cursor should exclude this column from the resulting columns collection.
design-8.1-enhancements.sql
function dataguide_columns(
jdoc...
row_identifier in varchar2 default null,
array_path...
) return column_definitions
is
l_key_columns column_definitions;
begin
...
where
column_type not in ('object', 'array')
and column_name <> dbms_assert.enquote_name(nvl(row_identifier,'~#~#~'), capitalize => false);
The json_table_expression function can now define the row#id column conditionally. If row_identifier is not present, we can default to using the ordinality index. Create the correct expression for this column and add a new replacement marker to the template sql.
design-8.1-enhancements.sql
function json_table_expression(
key_columns...
row_identifier in varchar2 default null,
array_path...
) return varchar2
is
l_sql varchar2(32000);
l_columns_clause varchar2(4000);
l_row_identifier varchar2(1000) := '"row#id" for ordinality';
begin
...
--if row identifier has been passed, substitute it for ordinality value using local path
if row_identifier is not null then
l_row_identifier := '"row#id" number path ''$.' || row_identifier || '''';
end if;
l_sql := q'+
select j.*
from
json_table(:jdoc, '$##ARRAY_PATH##[*]'
columns (
##ROW_IDENTIFIER##
##JSON_TABLE_COLUMNS##
)
) j
+';
l_sql := replace(l_sql, '##ROW_IDENTIFIER##', l_row_identifier);
The unpivot_collection and unpivot_json_array functions just need to pass the row_identifier parameter when they call dataguide_columns or json_table_expression.
design-8.1-enhancements.sql
function unpivot_collection(
jdoc...
row_identifier in varchar2 default null,
...
begin
l_key_columns := dataguide_columns(jdoc, row_identifier, array_path);
l_json_table_query := json_table_expression(l_key_columns, row_identifier, array_path);
...
function unpivot_json_array(
jdoc...
row_identifier in varchar2 default null,
...
begin
l_key_columns := dataguide_columns(jdoc, row_identifier, array_path);
l_json_table_query := json_table_expression(l_key_columns, row_identifier, array_path);
...
Debugging all of the functionality with the new test documents shows that everything is still working correctly, and the shape_id is used instead of the ordinality index.
declare
l_json $if dbms_db_version.version >= 21 $then json $else clob $end ;
begin
l_json := dynamic_json#test.test_jdoc(dynamic_json#test.c_array_simple_id);
dynamic_json#beta.debug_json_table_expression(l_json, 'shape_id');
end;
/
select j.*
from
json_table(:jdoc, '$[*]'
columns (
"row#id" number path '$.shape_id'
, "name" varchar2(4000) path '$.name'
, "side" varchar2(4000) path '$.side'
, "color" varchar2(4000) path '$.color'
, "width" varchar2(4000) path '$.width'
, "height" varchar2(4000) path '$.height'
, "length" varchar2(4000) path '$.length'
, "radius" varchar2(4000) path '$.radius'
)
) j
Testing the pipelined functions shows that the shape_id value is being used correctly as the row identifier.
design-8.2-enhancements-test.sql
select *
from dynamic_json#beta.unpivot_json_array(dynamic_json#test.test_jdoc('simple'));
select *
from dynamic_json#beta.unpivot_json_array(
dynamic_json#test.test_jdoc('simple_id'),
row_identifier => 'shape_id');
Our shape ids start at 101, so it's easy to tell the difference between the results that used an identifier and the results that used ordinality.
simple json with no row identifier
ROW#ID COLUMN#KEY COLUMN#VALUE
1 color blue
1 name square
1 side 5
2 length 5
2 name rectangle
2 width 3
3 height 2
3 length 5
3 name box
3 width 3
4 color red
4 name hexagon
4 side 3
5 name circle
5 radius 3
simple json with shape_id as row identifier
ROW#ID COLUMN#KEY COLUMN#VALUE
101 color blue
101 name square
101 side 5
102 length 5
102 name rectangle
102 width 3
103 height 2
103 length 5
103 name box
103 width 3
104 color red
104 name hexagon
104 side 3
105 name circle
105 radius 3
Finishing Touches
The primary functionality we want to expose from this utility is the pipelined function. The column_values nested table type, column_value record type and unpivot_json_array pipelined function are the minimum elements that need to be exposed in the package specification.
Because the pipelined function could potentially return a very large set of rows, fetching the entire resultset at once into a collection could easily exceed available memory resources. Converting the actual fetch from the dynamic query to use bulk collect with a limit clause will help manage the memory footprint.
For this change we need to add a ref cursor variable and open the cursor before fetching batches of rows into the return collection. Exit the loop when a fetch returns no more rows and the collection count is 0.
Instead of hard coding the limit value, use a global variable in the package body for flexibility.
function unpivot_json_array(
jdoc in json_document_type,
row_identifier in varchar2 default null,
array_path in varchar2 default null
) return column_values pipelined
is
l_key_columns column_definitions;
l_json_table_query varchar2(4000);
l_sql varchar2(4000);
l_values column_values;
c sys_refcursor;
begin
l_key_columns := dataguide_columns(jdoc, row_identifier, array_path);
l_json_table_query := json_table_expression(l_key_columns, row_identifier, array_path);
l_sql := unpivot_expression(l_key_columns, l_json_table_query, row_identifier);
open c for l_sql using jdoc;
loop
fetch c bulk collect into l_values limit g_bulk_limit_rows;
exit when l_values.count = 0;
for i in indices of l_values loop
pipe row (l_values(i));
end loop;
end loop;
close c;
return;
end unpivot_json_array;
A procedure to set the limit value for the session allows tuning the utility to specific workloads.
procedure set_bulk_limit_rows(
bulk_limit_rows in positiven default 100
)
is
begin
g_bulk_limit_rows := bulk_limit_rows;
end set_bulk_limit_rows;
During development, we used varchar2(4000) as a default common datatype for JSON_TABLE to support the unpivot operation. While this works for a broad range of cases, we may later want the flexibility of unpivoting a set of numbers or even CLOB values. Converting the constant g_unpivot_to_datatype to a global variable with a procedure to set it will support this.
Because our output record always uses a varchar, this setting will not influence the datatype of the column#value that is returned.
A group of constants in the specification documents which datatypes are currently supported for the unpivot operation. Raising an exception for unsupported values in the set procedure ensures that the generated sql is well formed and helps document these options.
subtype unpivot_datatype is varchar2(20);
c_varchar2_100 constant unpivot_datatype := 'varchar2(100)';
c_varchar2_1000 constant unpivot_datatype := 'varchar2(1000)';
c_varchar2_4000 constant unpivot_datatype := 'varchar2(4000)';
c_clob constant unpivot_datatype := 'clob';
c_number constant unpivot_datatype := 'number';
procedure set_unpivot_to_datatype(
unpivot_to_datatype in unpivot_datatype default c_varchar2_4000)
is
begin
if unpivot_to_datatype not in (
c_varchar2_100, c_varchar2_1000, c_varchar2_4000, c_clob, c_number)
then
raise_application_error(-20100, unpivot_to_datatype
|| ' is not supported for dynamic unpivot'
);
end if;
g_unpivot_to_datatype := unpivot_to_datatype;
end set_unpivot_to_datatype;
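A short usage sketch for these two setters (session scoped, using the final package name dynamic_json introduced below):
begin
  -- fetch 500 rows per bulk collect instead of the default 100
  dynamic_json.set_bulk_limit_rows(500);
  -- unpivot numeric source values as numbers rather than varchar2(4000)
  dynamic_json.set_unpivot_to_datatype(dynamic_json.c_number);
end;
/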
Because unit testing a utility package like this with known inputs contributes significantly to confidence in code stability, the test_jdoc function should definitely be included. Additional test documents can easily be added to support unpivoting to numeric common datatypes and to demonstrate unsupported json structures. A unit testing framework like utPLSQL could use these test documents to create a suite of unit tests for the utility package.
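For example, a minimal utPLSQL sketch might look like this (the package name ut_dynamic_json is hypothetical; the expected count of 15 matches the rows shown earlier for the simple test document):
create or replace package ut_dynamic_json
as
  --%suite(dynamic_json unpivot)

  --%test(simple array unpivots to 15 key/value rows)
  procedure simple_array_row_count;
end ut_dynamic_json;
/
create or replace package body ut_dynamic_json
as
  procedure simple_array_row_count
  is
    l_count number;
  begin
    select count(*)
    into l_count
    from dynamic_json.unpivot_json_array(dynamic_json.test_jdoc('simple'));
    ut.expect(l_count).to_equal(15);
  end simple_array_row_count;
end ut_dynamic_json;
/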
We can combine all of the debug procedures with a single wrapper with flags to switch each type of debug output and expose this in the specification.
procedure debug_unpivot_json_array(
jdoc in json_document_type,
row_identifier in varchar2 default null,
array_path in varchar2 default null,
dataguide_columns in boolean default true,
json_table_expression in boolean default true,
unpivot_expression in boolean default true
)
is
begin
if dataguide_columns then
debug_dataguide_columns(jdoc, row_identifier, array_path);
end if;
if json_table_expression then
debug_json_table_expression(jdoc, row_identifier, array_path);
end if;
if unpivot_expression then
debug_unpivot_expression(jdoc, row_identifier, array_path);
end if;
end debug_unpivot_json_array;
Creating a subtype for the json input document will clarify what we are trying to accomplish with conditional compilation based on database versions supporting the json constructor instead of using clobs for json documents.
subtype json_document_type is $if dbms_db_version.version >= 21 $then json $else clob $end ;
Dynamic Json Unpivot: Final Package Specification
With these changes in place, the final package specification has been pared down to the minimal necessary code to use the utility. Everything we need to use, test and debug the utility is contained in a single package. Developers can review the specification and understand how this utility will work as well as what sort of json structures the utility will support.
design-9.1-dynamic_json-spec.sql
create or replace package dynamic_json
authid current_user
as
subtype json_document_type is $if dbms_db_version.version >= 21 $then json $else clob $end ;
subtype unpivot_datatype is varchar2(20);
c_varchar2_100 constant unpivot_datatype := 'varchar2(100)';
c_varchar2_1000 constant unpivot_datatype := 'varchar2(1000)';
c_varchar2_4000 constant unpivot_datatype := 'varchar2(4000)';
c_clob constant unpivot_datatype := 'clob';
c_number constant unpivot_datatype := 'number';
subtype json_format is varchar2(10);
c_array_simple constant json_format := 'simple';
c_array_nested constant json_format := 'nested';
c_array_simple_id constant json_format := 'simple_id';
c_array_nested_id constant json_format := 'nested_id';
type column_value is record(
row#id number,
column#key varchar2(64),
column#value varchar2(4000));
type column_values is table of column_value;
procedure set_bulk_limit_rows(
bulk_limit_rows in positiven default 100
);
procedure set_unpivot_to_datatype(
unpivot_to_datatype in unpivot_datatype default c_varchar2_4000
);
function unpivot_json_array(
jdoc in json_document_type,
row_identifier in varchar2 default null,
array_path in varchar2 default null
) return column_values pipelined;
procedure debug_unpivot_json_array(
jdoc in json_document_type,
row_identifier in varchar2 default null,
array_path in varchar2 default null,
dataguide_columns in boolean default true,
json_table_expression in boolean default true,
unpivot_expression in boolean default true
);
function test_jdoc(
format_option in json_format default c_array_simple
) return json_document_type;
end dynamic_json;
/
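With the specification in place and the package body from the next section compiled, a quick sanity check can reuse the built-in test documents (a sketch mirroring the earlier alpha-package tests):
select *
from dynamic_json.unpivot_json_array(dynamic_json.test_jdoc());
select *
from dynamic_json.unpivot_json_array(
  dynamic_json.test_jdoc('nested_id'),
  row_identifier => 'shape_id',
  array_path     => '.my_shapes');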
Dynamic Json Unpivot: Final Package Body
The final package body contains more functions and procedures than the specification exposes to callers. This encapsulation of helper modules is the real advantage of PL/SQL packages. While everything could have been accomplished in a single function, the separation into different sub functions makes the code much easier to understand and maintain.
design-9.2-dynamic_json-body.sql
create or replace package body dynamic_json
as
c_lf constant varchar2(1) := chr(10);
c_indent constant varchar2(100) := lpad(' ', 12, ' ');
c_max_column_name_length constant number := 64;
g_unpivot_to_datatype unpivot_datatype := c_varchar2_4000;
g_bulk_limit_rows positiven := 100;
type column_definition is record (
column_name varchar2(64),
column_type varchar2(20),
column_path varchar2(100));
type column_definitions is table of column_definition index by pls_integer;
procedure set_bulk_limit_rows(
bulk_limit_rows in positiven default 100
)
is
begin
g_bulk_limit_rows := bulk_limit_rows;
end set_bulk_limit_rows;
procedure set_unpivot_to_datatype(
unpivot_to_datatype in unpivot_datatype default c_varchar2_4000
)
is
begin
if unpivot_to_datatype not in (
c_varchar2_100, c_varchar2_1000, c_varchar2_4000, c_clob, c_number)
then
raise_application_error(-20100, unpivot_to_datatype
|| ' is not supported for dynamic unpivot'
);
end if;
g_unpivot_to_datatype := unpivot_to_datatype;
end set_unpivot_to_datatype;
function dataguide_columns(
jdoc in json_document_type,
row_identifier in varchar2 default null,
array_path in varchar2 default null
) return column_definitions
is
l_key_columns column_definitions;
begin
with dataguide as (
select json_dataguide(jdoc) as jdoc_dataguide
from dual
), dataguide_relational as (
select
$if dbms_db_version.version >= 23 $then
dbms_assert.enquote_name(
ltrim(replace(j.dg_path, array_path), '$.'),
capitalize => false
) as column_name,
$else
'"' || ltrim(replace(j.dg_path, array_path), '$.') || '"' as column_name,
$end
j.column_type,
replace(j.dg_path, array_path) as column_path
from
dataguide g,
json_table(g.jdoc_dataguide, '$[*]'
columns(
dg_path path '$."o:path".string()',
column_type path '$.type.string()'
)
) j
)
select column_name, column_type, column_path
bulk collect into l_key_columns
from dataguide_relational
where
column_type not in ('object', 'array')
$if dbms_db_version.version >= 23 $then
and column_name <> dbms_assert.enquote_name(nvl(row_identifier,'~#~#~'), capitalize => false)
$else
and column_name <> '"' || nvl(row_identifier,'~#~#~') || '"'
$end
;
return l_key_columns;
end dataguide_columns;
function json_table_expression(
key_columns in column_definitions,
row_identifier in varchar2 default null,
array_path in varchar2 default null
) return varchar2
is
l_sql varchar2(32000);
l_columns_clause varchar2(4000);
l_row_identifier varchar2(1000) := '"row#id" for ordinality';
begin
--build the columns expression
for i in
$if dbms_db_version.version >= 21 $then
indices of key_columns
$else
1..key_columns.count
$end
loop
l_columns_clause := l_columns_clause || c_indent
|| ', ' || key_columns(i).column_name
|| ' ' || g_unpivot_to_datatype
|| ' path ''' || key_columns(i).column_path || ''''
|| case when i <> key_columns.count then c_lf end;
end loop;
--if row identifier has been passed, substitute it for ordinality value using local path
if row_identifier is not null then
l_row_identifier := '"row#id" number path ''$.' || row_identifier || '''';
end if;
l_sql := q'+
select j.*
from
json_table(:jdoc, '$##ARRAY_PATH##[*]'
columns (
##ROW_IDENTIFIER##
##JSON_TABLE_COLUMNS##
)
) j
+';
l_sql := replace(l_sql, '##JSON_TABLE_COLUMNS##', l_columns_clause);
l_sql := replace(l_sql, '##ARRAY_PATH##', rtrim(array_path,'.'));
l_sql := replace(l_sql, '##ROW_IDENTIFIER##', l_row_identifier);
return l_sql;
end json_table_expression;
function unpivot_expression(
key_columns in column_definitions,
json_table_query in varchar2,
row_identifier in varchar2 default null
) return varchar2
is
l_sql varchar2(32000);
l_unpivot_columns varchar2(4000);
begin
--build the unpivot expression
for i in
$if dbms_db_version.version >= 21 $then
indices of key_columns
$else
1..key_columns.count
$end
loop
l_unpivot_columns := l_unpivot_columns || c_indent
|| case when i > 1 then ', ' end
|| key_columns(i).column_name
|| case when i <> key_columns.count then c_lf end;
end loop;
l_sql := q'+
with json_to_relational as (
##JSON_TABLE_EXPRESSION##
)
select "row#id", "column#key", "column#value"
from
json_to_relational
unpivot (
"column#value" for "column#key" in (
##UNPIVOT_COLUMNS##
)
)
order by "row#id", "column#key"
+';
l_sql := replace(l_sql, '##JSON_TABLE_EXPRESSION##', json_table_query);
l_sql := replace(l_sql, '##UNPIVOT_COLUMNS##', l_unpivot_columns);
return l_sql;
end unpivot_expression;
function unpivot_json_array(
jdoc in json_document_type,
row_identifier in varchar2 default null,
array_path in varchar2 default null
) return column_values
pipelined
is
l_key_columns column_definitions;
l_json_table_query varchar2(4000);
l_sql varchar2(4000);
l_values column_values;
c sys_refcursor;
begin
l_key_columns := dataguide_columns(jdoc, row_identifier, array_path);
l_json_table_query := json_table_expression(l_key_columns, row_identifier, array_path);
l_sql := unpivot_expression(l_key_columns, l_json_table_query, row_identifier);
open c for l_sql using jdoc;
loop
fetch c bulk collect into l_values limit g_bulk_limit_rows;
exit when l_values.count = 0;
for i in
$if dbms_db_version.version >= 21 $then
indices of l_values
$else
1..l_values.count
$end
loop
pipe row (l_values(i));
end loop;
end loop;
close c;
return;
end unpivot_json_array;
procedure debug_test_output_heading(p_heading in varchar2)
is
c_separator constant varchar2(100) := lpad('-', 100, '-');
begin
dbms_output.put_line(c_separator);
dbms_output.put_line('--' || p_heading);
dbms_output.put_line(c_separator);
end debug_test_output_heading;
procedure debug_dataguide_columns(
jdoc in json_document_type,
row_identifier in varchar2 default null,
array_path in varchar2 default null
)
is
l_key_columns column_definitions;
begin
l_key_columns := dataguide_columns(jdoc, row_identifier, array_path);
debug_test_output_heading('Column definitions from document dataguide.'
|| case
when array_path is null then ' no array path'
else ' array_path => ' || array_path
end
);
for i in
$if dbms_db_version.version >= 21 $then
indices of l_key_columns
$else
1..l_key_columns.count
$end
loop
dbms_output.put_line(
'column_name => ' || l_key_columns(i).column_name
|| ', column_type => ' || l_key_columns(i).column_type
|| ', column_path => ' || l_key_columns(i).column_path);
end loop;
end debug_dataguide_columns;
procedure debug_json_table_expression(
jdoc in json_document_type,
row_identifier in varchar2 default null,
array_path in varchar2 default null
)
is
l_key_columns column_definitions;
l_sql varchar2(4000);
begin
l_key_columns := dataguide_columns(jdoc, row_identifier, array_path);
l_sql := json_table_expression(l_key_columns, row_identifier, array_path);
debug_test_output_heading('JSON_TABLE Expression generated from document dataguide.'
|| case
when array_path is null then ' no array path'
else ' array_path => ' || array_path
end
);
dbms_output.put_line(l_sql);
end debug_json_table_expression;
procedure debug_unpivot_expression(
jdoc in json_document_type,
row_identifier in varchar2 default null,
array_path in varchar2 default null
)
is
l_key_columns column_definitions;
l_json_table_query varchar2(4000);
l_sql varchar2(4000);
begin
l_key_columns := dataguide_columns(jdoc, row_identifier, array_path);
l_json_table_query := json_table_expression(l_key_columns, row_identifier, array_path);
l_sql := unpivot_expression(l_key_columns, l_json_table_query);
debug_test_output_heading('UNPIVOT Query generated from document dataguide.'
|| case
when array_path is null then ' no array path'
else ' array_path => ' || array_path
end
);
dbms_output.put_line(l_sql);
end debug_unpivot_expression;
procedure debug_unpivot_json_array(
jdoc in json_document_type,
row_identifier in varchar2 default null,
array_path in varchar2 default null,
dataguide_columns in boolean default true,
json_table_expression in boolean default true,
unpivot_expression in boolean default true
)
is
begin
if dataguide_columns then
debug_dataguide_columns(jdoc, row_identifier, array_path);
end if;
if json_table_expression then
debug_json_table_expression(jdoc, row_identifier, array_path);
end if;
if unpivot_expression then
debug_unpivot_expression(jdoc, row_identifier, array_path);
end if;
end debug_unpivot_json_array;
function test_jdoc(
format_option in json_format default c_array_simple
) return json_document_type
is
l_json_text clob;
begin
case format_option
when c_array_simple then
l_json_text := to_clob(
q'~
[
{"name":"square","side":5,"color":"blue"},
{"name":"rectangle","length":5,"width":3},
{"name":"box","length":5,"width":3,"height":2},
{"name":"hexagon","side":3,"color":"red"},
{"name":"circle","radius":3},
]
~');
when c_array_nested then
l_json_text := to_clob(
q'~
{"my_shapes":
[
{"name":"square","side":5,"color":"blue"},
{"name":"rectangle","length":5,"width":3},
{"name":"box","length":5,"width":3,"height":2},
{"name":"hexagon","side":3,"color":"red"},
{"name":"circle","radius":3},
]
}
~');
when c_array_simple_id then
l_json_text := to_clob(
q'~
[
{"shape_id":101,"name":"square","side":5,"color":"blue"},
{"shape_id":102,"name":"rectangle","length":5,"width":3},
{"shape_id":103,"name":"box","length":5,"width":3,"height":2},
{"shape_id":104,"name":"hexagon","side":3,"color":"red"},
{"shape_id":105,"name":"circle","radius":3},
]
~');
when c_array_nested_id then
l_json_text := to_clob(
q'~
{"my_shapes":
[
{"shape_id":101,"name":"square","side":5,"color":"blue"},
{"shape_id":102,"name":"rectangle","length":5,"width":3},
{"shape_id":103,"name":"box","length":5,"width":3,"height":2},
{"shape_id":104,"name":"hexagon","side":3,"color":"red"},
{"shape_id":105,"name":"circle","radius":3},
]
}
~');
else
raise_application_error(-20100, format_option
|| ' is an undefined test document type');
end case;
return $if dbms_db_version.version < 21 $then l_json_text $else json(l_json_text) $end ;
end test_jdoc;
end dynamic_json;
/
Dynamic Json Unpivot: Unit Testing
Using the set of test documents, we can unit test the final package to be sure that everything is working as designed.
design-9.3-dynamic_json-test.sql
select *
from dynamic_json.unpivot_json_array(dynamic_json.test_jdoc('simple'));
select *
from dynamic_json.unpivot_json_array(dynamic_json.test_jdoc('simple_id'),
row_identifier => 'shape_id');
select *
from dynamic_json.unpivot_json_array(dynamic_json.test_jdoc('nested'),
array_path => '.my_shapes');
select *
from dynamic_json.unpivot_json_array(dynamic_json.test_jdoc('nested_id'),
row_identifier => 'shape_id',
array_path => '.my_shapes');
All four test queries still return the correct unpivoted results.
ROW#ID COLUMN#KEY COLUMN#VALUE
101 color blue
101 name square
101 side 5
102 length 5
102 name rectangle
102 width 3
103 height 2
103 length 5
103 name box
103 width 3
104 color red
104 name hexagon
104 side 3
105 name circle
105 radius 3
Looking at user objects gives us a test with number, string and date columns, all unpivoted to the common varchar2(4000) datatype. Testing in 23ai, we can use the json constructor shortcuts to create the objects and aggregate the array. The json constructor returns json in binary format, and the package has been conditionally compiled to expect this.
design-9.4-dynamic_json-test.sql
with base as (
select
object_id,
object_name,
object_type,
created,
namespace
from user_objects
), to_json as (
select json [ json { b.* } ] as jdoc
from base b
)
select u.*
from
to_json j,
dynamic_json.unpivot_json_array(j.jdoc, row_identifier => 'OBJECT_ID') u;
ROW#ID COLUMN#KEY COLUMN#VALUE
100457 CREATED 2024-07-04T19:01:22
100457 NAMESPACE 1
100457 OBJECT_NAME DYNAMIC_JSON#TEST
100457 OBJECT_TYPE PACKAGE
100458 CREATED 2024-07-04T19:01:22
100458 NAMESPACE 2
100458 OBJECT_NAME DYNAMIC_JSON#TEST
100458 OBJECT_TYPE PACKAGE BODY
100459 CREATED 2024-07-04T19:22:51
100459 NAMESPACE 1
100459 OBJECT_NAME DYNAMIC_JSON#ALPHA
100459 OBJECT_TYPE PACKAGE
100460 CREATED 2024-07-04T19:22:51
100460 NAMESPACE 2
100460 OBJECT_NAME DYNAMIC_JSON#ALPHA
100460 OBJECT_TYPE PACKAGE BODY
100461 CREATED 2024-07-04T19:30:24
100461 NAMESPACE 1
100461 OBJECT_NAME DYNAMIC_JSON#BETA
100461 OBJECT_TYPE PACKAGE
100462 CREATED 2024-07-04T19:30:24
100462 NAMESPACE 2
100462 OBJECT_NAME DYNAMIC_JSON#BETA
100462 OBJECT_TYPE PACKAGE BODY
100463 CREATED 2024-07-04T19:32:40
100463 NAMESPACE 1
100463 OBJECT_NAME DYNAMIC_JSON
100463 OBJECT_TYPE PACKAGE
100464 CREATED 2024-07-04T19:36:20
100464 NAMESPACE 2
100464 OBJECT_NAME DYNAMIC_JSON
100464 OBJECT_TYPE PACKAGE BODY
In 19c or earlier, we need to generate the json as a clob instead. The package is conditionally compiled to handle the json as clob data in these versions.
design-9.4-dynamic_json-test-19c.sql
json_arrayagg(
json_object(b.* returning clob)
returning clob) as jdoc
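As a reference, here is a sketch of how the complete 19c form of the test query could look, with the clob-returning aggregate substituted for the 23ai constructor shorthand; the rest of the query is unchanged. Treat it as an untested sketch rather than one of the project scripts.
with base as (
select
object_id,
object_name,
object_type,
created,
namespace
from user_objects
), to_json as (
select
json_arrayagg(
json_object(b.* returning clob)
returning clob) as jdoc
from base b
)
select u.*
from
to_json j,
dynamic_json.unpivot_json_array(j.jdoc, row_identifier => 'OBJECT_ID') u;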
Dynamically Unpivoting Sales Data
In the introduction we looked at unpivoting data sources for quarterly or monthly sales. Now that our utility is working, the dynamic unpivot can be applied to these situations easily. All we have to do is convert the data source to an array of json objects and pass it to our function. We can also alias the outputs and adjust the final value datatypes so that it doesn't look like we used a generic solution to get to lunch on time.
Let's revisit the sales data validation task that we started with and see how the new utility makes it easier. First, confirm that the quarterly sales data is being unpivoted correctly.
design-9.6-dynamic_json-test.sql
with quarterly_json as (
select json{ 'jSales' : json_arrayagg( json{ s.* } ) } as jdoc
from quarterly_sales_v s
), quarterly_unpivot as (
select
u.row#id as "Year",
u.column#key as "Quarter",
to_number(u.column#value) as "Volume"
from
quarterly_json j,
dynamic_json.unpivot_json_array(j.jdoc, 'Year', '.jSales') u
)
select "Year", "Quarter", "Volume"
from quarterly_unpivot
order by "Year", "Quarter";
Year Quarter Volume
2021 Qtr1 200184
2021 Qtr2 198036
2021 Qtr3 145044
2021 Qtr4 153057
2022 Qtr1 140424
2022 Qtr2 158398
2022 Qtr3 150220
2022 Qtr4 177484
2023 Qtr1 172737
2023 Qtr2 163434
2023 Qtr3 161465
2023 Qtr4 152014
12 rows selected.
It looks like all the columns are being unpivoted properly. The next step is to verify that these values add up to the yearly totals. The column#key is not part of the grouping query, so we can leave it out.
with quarterly_json as (
select json{ 'jSales' : json_arrayagg( json{ s.* } ) } as jdoc
from quarterly_sales_v s
), quarterly_unpivot as (
select
u.row#id as "Year",
to_number(u.column#value) as "Volume"
from
quarterly_json j,
dynamic_json.unpivot_json_array(j.jdoc, 'Year', '.jSales') u
)
select "Year", sum("Volume") as "Volume"
from quarterly_unpivot
group by "Year"
order by "Year";
select "Year", "Volume"
from annual_sales_v
order by "Year";
Validate yearly totals for Quarterly Sales using dynamic unpivot
Year Volume
2021 696321
2022 626526
2023 649650
compare to the yearly totals view
Year Volume
2021 696321
2022 626526
2023 649650
Perfect, the quarterly sales data has the correct yearly values. Because we are unpivoting programmatically, if one report works they should all work.
Instead of running each validation separately, let's put them all together to get the task finished. If we add a field to each unpivot result to show the report it came from, we can combine all of the unpivot queries and then group by report and year at the end to see if everything agrees.
with quarterly_json as (
select json{ 'jSales' : json_arrayagg( json{ s.* } ) } as jdoc
from quarterly_sales_v s
), quarterly_unpivot as (
select
'Quarterly Sales' as "Report",
u.row#id as "Year",
to_number(u.column#value) as "Volume"
from
quarterly_json j,
dynamic_json.unpivot_json_array(j.jdoc, 'Year', '.jSales') u
...regional_unpivot...
...regional_quarterly_unpivot...
...monthly_unpivot...
...regional_monthly_unpivot...
), all_unpivots as (
select "Report", "Year", "Volume" from quarterly_unpivot
union all
select "Report", "Year", "Volume" from regional_unpivot
union all
select "Report", "Year", "Volume" from regional_quarterly_unpivot
union all
select "Report", "Year", "Volume" from monthly_unpivot
union all
select "Report", "Year", "Volume" from regional_monthly_unpivot
union all
select 'Annual Sales View' as "Report", "Year", "Volume" from annual_sales_v
)
select "Report", "Year", sum("Volume") as "Volume"
from all_unpivots
group by "Report", "Year"
order by "Year", "Report";
Running this generates an exception!
ORA-06502: PL/SQL: value or conversion error: character string buffer too small
ORA-06512: at "CLOUD_PRACTICALPLSQL.DYNAMIC_JSON", line 176
It looks like the problem is in the variables used by the pipelined function to hold the generated sql statements. Looking at the variable declarations in unpivot_json_array, we can see that the json_table expression and the final query cannot exceed 4000 characters. If we are unpivoting a large number of columns, the resulting sql statement can be very large; thinking about the data here, the regional monthly unpivot is the likely suspect.
l_json_table_query varchar2(4000);
l_sql varchar2(4000);
Let's fix this and future-proof the code by declaring a subtype at the top of the package body and using it everywhere we build sql statements or fragments such as the column definitions. This is a good reminder to create subtypes for variables that share a common purpose. If we had started with the subtype, we could simply change its size to 32000, or even switch to a clob, without hunting through the code for all the places where we build sql.
subtype sql_text is varchar2(32000);
l_json_table_query sql_text;
l_sql sql_text;
Note: I have made this correction in the source code, so you won’t see this error. To reproduce the error just change the sql_text subtype to 4000 characters and recompile before running the big validation query.
This is also a good opportunity to run the debug procedure for each of these json documents and see in the script output just how large the generated queries turn out to be; a quick sketch of such a call follows.
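For example, a minimal sketch of the debug call for the regional monthly document, assuming the 23ai json constructor syntax and the same 'Year' identifier and '.jSales' array path used in the quarterly example (the regional monthly json construction is elided above, so this assumes it follows the same pattern):
set serveroutput on
declare
l_jdoc dynamic_json.json_document_type;
begin
--build the regional monthly sales document the same way as the quarterly one
select json{ 'jSales' : json_arrayagg( json{ s.* } ) }
into l_jdoc
from regional_monthly_sales_v s;
--print the dataguide columns, json_table expression and unpivot query
dynamic_json.debug_unpivot_json_array(
jdoc => l_jdoc,
row_identifier => 'Year',
array_path => '.jSales');
end;
/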
After this quick fix, the validation query runs without exceptions and the results are what we were looking for.
validate all the pivot reports together
Report Year Volume
Annual Sales View 2021 696321
Monthly Sales 2021 696321
Quarterly Sales 2021 696321
Regional Monthly Sales 2021 696321
Regional Quarterly Sales 2021 696321
Regional Sales 2021 696321
...
18 rows selected.
You can tell your manager that everything balances perfectly and that you are going to lunch now. Except for the fact that it still took about a half page of sql code to write the validation query. Was all that typing really necessary? Could we be even more concise in our approach?
One Step Further: A Dynamic Unpivot Macro
The only part of our final query that changed for each unpivot was the data source of each expression. Recognizing this pattern is enough to create a macro to conceal the fact that we are converting everything to json in order to accomplish the dynamic unpivot.
select u.row#id as "Year", u.column#key as "Month", to_number(u.column#value) as "Volume"
from
(
select json_arrayagg( json{ * } ) as jdoc
from monthly_sales_v
) j,
dynamic_json.unpivot_json_array(j.jdoc, 'Year') u;
Now we can capture this pattern in a macro function. To pass string parameters to a macro, we can use the dbms_tf.columns_t type from the polymorphic table function infrastructure. Columns passed with this type always arrive as quoted identifiers enclosed in double quotes. Since there are always three columns in the output of the pipelined function, we can require meaningful aliases as the first three elements of this columns collection.
The first column element will always be our row identifier; after trimming the double quotes from this column name, it can be passed to the pipelined function as the row_identifier parameter. There is no need for an array path because we are generating the json directly from relational data.
We can pass a numeric parameter to the macro to support converting the column#value to a numeric datatype in the resulting sql statement.
Because the json document is created in the sql generated by the macro, we can use conditional compilation to generate the json as a clob or as the native json datatype.
design-9.8-pipelined-macro.sql
create or replace function unpivot_results(
source_data in dbms_tf.table_t,
alias_list in dbms_tf.columns_t,
value_as_number in number default 1
) return varchar2
sql_macro $if dbms_db_version.version >= 21 $then (table) $end
is
l_sql varchar2(4000);
l_json_data varchar2(100);
l_value_expression varchar2(100);
begin
$if dbms_db_version.version >= 21 $then
l_json_data := 'json_arrayagg( json{ * } )';
$else
l_json_data := 'json_arrayagg( json_object( * returning clob ) returning clob )';
$end
l_sql := q'[
select
u.row#id as ##ROW_IDENTIFIER##,
u.column#key as ##COLUMN_KEY##,
##VALUE_EXPRESSION## as ##COLUMN_VALUE##
from
(
select ##JSON_DATA## as jdoc
from source_data
) j,
dynamic_json.unpivot_json_array(j.jdoc, '##ROW_IDENTIFIER_PARAMETER##') u
]';
l_sql := replace(l_sql, '##JSON_DATA##', l_json_data);
l_sql := replace(l_sql, '##ROW_IDENTIFIER_PARAMETER##', trim(both '"' from alias_list(1)));
l_sql := replace(l_sql, '##ROW_IDENTIFIER##', alias_list(1));
l_sql := replace(l_sql, '##COLUMN_KEY##', alias_list(2));
l_sql := replace(l_sql, '##COLUMN_VALUE##', alias_list(3));
l_value_expression := case when value_as_number = 1 then 'to_number(u.column#value)' else 'u.column#value' end;
l_sql := replace(l_sql, '##VALUE_EXPRESSION##', l_value_expression);
--enable this to check the macro output
--dbms_output.put_line(l_sql);
return l_sql;
end unpivot_results;
/
Now we can write the final validation query in a form that is remarkably concise and self-documenting.
design-9.8-pipelined-macro-test.sql
select "Report", "Year", sum("Volume") as "Volume"
from
(
select 'Quarterly Sales' as "Report", "Year", "Volume"
from unpivot_results(quarterly_sales_v, columns("Year", "Quarter", "Volume"))
union all
select 'Regional Sales' as "Report", "Year", "Volume"
from unpivot_results(regional_sales_v, columns("Year", "Region", "Volume"))
union all
select 'Regional Quarterly Sales' as "Report", "Year", "Volume"
from unpivot_results(regional_quarterly_sales_v, columns("Year", "RegionQuarter", "Volume"))
union all
select 'Monthly Sales' as "Report", "Year", "Volume"
from unpivot_results(monthly_sales_v, columns("Year", "Month", "Volume"))
union all
select 'Regional Monthly Sales' as "Report", "Year", "Volume"
from unpivot_results(regional_monthly_sales_v, columns("Year", "RegionMonth", "Volume"))
union all
select 'Annual Sales' as "Report", "Year", "Volume"
from annual_sales_v
)
group by "Report", "Year"
order by "Year", "Report";
Report Year Volume
Annual Sales 2021 696321
Monthly Sales 2021 696321
Quarterly Sales 2021 696321
Regional Monthly Sales 2021 696321
Regional Quarterly Sales 2021 696321
Regional Sales 2021 696321
Annual Sales 2022 626526
Monthly Sales 2022 626526
Quarterly Sales 2022 626526
Regional Monthly Sales 2022 626526
Regional Quarterly Sales 2022 626526
Regional Sales 2022 626526
Annual Sales 2023 649650
Monthly Sales 2023 649650
Quarterly Sales 2023 649650
Regional Monthly Sales 2023 649650
Regional Quarterly Sales 2023 649650
Regional Sales 2023 649650
The results are still right on target and we just might be a bit excited for the next 'tedious' data validation that our manager needs immediately!
Next Steps
There may be many other situations that can benefit from a dynamic unpivot utility, and the current code is still an initial draft that is limited in the use cases it supports. While the code may work perfectly for the design use cases, there is nothing like real-world testing to see how well it will hold up.
I actually decided to add the larger sales data test cases the day after the initial publication of this article and gleefully discovered that I had undersized the variables for building sql. The fix improves the overall quality of the code, and a test case of this magnitude should probably be added to the unit tests. The next thing to do with this mini project is to create a test suite with utPLSQL and build even more test cases.
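As a starting point, a minimal utPLSQL suite for the simple test document might look something like the sketch below; the package name and the expected count of fifteen column values are my own assumptions based on the test output shown earlier, not part of the project source.
create or replace package dynamic_json_ut
as
--%suite(dynamic_json unpivot)

--%test(simple array unpivots to fifteen column values)
procedure simple_array_count;
end dynamic_json_ut;
/
create or replace package body dynamic_json_ut
as
procedure simple_array_count
is
l_count number;
begin
--count the column values produced for the simple test document
select count(*)
into l_count
from dynamic_json.unpivot_json_array(dynamic_json.test_jdoc('simple'));
ut.expect(l_count).to_equal(15);
end simple_array_count;
end dynamic_json_ut;
/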
As with any utility code, the only real limitations are our ability to imagine different applications of the functionality we have defined and how well the test cases we create push the boundaries of the solution.
In closing the post on Table Data Comparison: Row Differences, I suggested that this concept may be useful for isolating the columns with different values instead of manually creating unpivot queries. The next post in the series on comparison macros, Table Data Comparison: Column Differences, will begin finding these exact column differences by dynamically unpivoting the columns and applying a row comparison macro.
–Anthony Harper