set serveroutput on
declare
l_goal varchar2(100) := 'dynamic unpivot';
l_problem varchar2(100) := 'custom code for every query';
l_solution varchar2(100) := 'json_table and json_dataguide';
begin
dbms_output.put_line('For ' || l_goal);
dbms_output.put_line('Use ' || l_solution || ' to eliminate ' || l_problem);
end;
As Oracle developers, we are sometimes confronted with tasks that require tedious, repetitive custom SQL coding. The SQL language is powerful enough that we can often find a better way to do the same thing with less risk of error, but sometimes the tedium cannot be avoided.
Consider a data source that shows quarterly sales totals for several years, with each quarterly total in its own column and each year in a single row. This is a common presentation format that is easy to understand, but the structure makes the data hard to work with for further analysis.
sales_data-quarterly.sql
Note: run sales_data.report_sources.sql to create pivoted data sources
select
"Year", "Qtr1", "Qtr2", "Qtr3", "Qtr4"
from quarterly_sales_v
order by "Year";
Year Qtr1 Qtr2 Qtr3 Qtr4
2021 200184 198036 145044 153057
2022 140424 158398 150220 177484
2023 172737 163434 161465 152014
If we need to verify that the yearly totals are correct, we can add up the quarterly columns for each year.
select
"Year",
(
"Qtr1" + "Qtr2" + "Qtr3" + "Qtr4"
) as "Volume"
from quarterly_sales_v
order by "Year";
Year Volume
2021 696321
2022 626526
2023 649650
To work with this data, it would be easier to unpivot it into a set of rows with columns for year, quarter and amount.
select "Year", "Quarter", "Volume"
from
quarterly_sales_v
unpivot (
"Volume" for "Quarter" in (
"Qtr1", "Qtr2", "Qtr3", "Qtr4")
)
order by "Year", "Quarter";
Year Quarter Volume
2021 Qtr1 200184
2021 Qtr2 198036
2021 Qtr3 145044
...
2023 Qtr2 163434
2023 Qtr3 161465
2023 Qtr4 152014
12 rows selected.
After unpivoting, a query with the sum of volume grouped by year is simple.
with unpivot_sales as (
select "Year", "Quarter", "Volume"
from
quarterly_sales_v
unpivot (
"Volume" for "Quarter" in (
"Qtr1", "Qtr2", "Qtr3", "Qtr4")
)
)
select
"Year",
sum("Volume") as "Volume"
from unpivot_sales
group by "Year"
order by "Year";
Year Volume
2021 696321
2022 626526
2023 649650
Adding more columns to unpivot just adds more work. Instead of starting from a pivot of quarterly sales, let's try starting with a pivot of monthly sales.
sales_data-monthly.sql
select
"Year",
"January", "February", "March", "April", "May", "June",
"July", "August", "September", "October", "November", "December"
from monthly_sales_v
order by "Year";
Year January February March April May June July August September October November December
2021 53231 75425 71528 71715 62200 64121 42125 65347 37572 50038 55317 47702
2022 39111 52382 48931 45432 56751 56215 38223 70713 41284 63527 51175 62782
2023 35370 70456 66911 48697 57753 56984 52034 71215 38216 50581 42483 58950
Adding up all of the monthly columns also works to validate the yearly totals.
select
"Year",
(
"January" + "February" + "March" + "April" + "May" + "June" +
+ "July" + "August" + "September" + "October" + "November" + "December"
) as "Volume"
from monthly_sales_v
order by "Year";
Using unpivot simplifies the summation, but twelve columns in the unpivot expression doesn't make this task any easier.
with unpivoted_sales as (
select "Year", "Month", "Volume"
from
monthly_sales_v
unpivot (
"Volume" for "Month" in (
"January", "February", "March", "April", "May", "June",
"July", "August", "September", "October", "November", "December"
)
)
)
select
"Year",
sum("Volume") as "Volume"
from unpivoted_sales
group by "Year"
order by "Year";
What if the task is to validate many different reports that split this same data into an increasing number of columns? Let's look at some examples with more columns to unpivot.
The regional breakdown is similar to the quarterly report.
sales_data-regional.sql
select
"Year", "East", "West", "North", "South", "Central"
from regional_sales_v
order by "Year";
Year East West North South Central
2021 139621 162371 168171 118172 107986
2022 125800 127568 105420 155688 112050
2023 152208 123094 108329 122119 143900
Not too bad, you say: just unpivot and validate.
with unpivot_sales as (
select "Year", "Region", "Volume"
from
regional_sales_v
unpivot (
"Volume" for "Region" in (
"East", "West", "North", "South", "Central")
)
)
select
"Year",
sum("Volume") as "Volume"
from unpivot_sales
group by "Year"
order by "Year";
Regions by quarter has so many columns that we have to run the report separately for each region.
sales_data-regional_quarterly.sql
select
"Year", "WestQtr1", "WestQtr2", "WestQtr3", "WestQtr4"
from regional_quarterly_sales_v
order by "Year";
Year WestQtr1 WestQtr2 WestQtr3 WestQtr4
2021 52364 45971 36192 27844
2022 20720 44459 32197 30192
2023 22997 37030 30066 33001
This is definitely not something we want to validate by adding up all of the columns: it becomes hard to tell whether we have included every column correctly, and the risk of cut-and-paste errors keeps growing.
select
"Year",
(
"EastQtr1" + "WestQtr1" + "NorthQtr1" + "SouthQtr1" + "CentralQtr1"
+ "EastQtr2" + "WestQtr2" + "NorthQtr2" + "SouthQtr2" + "CentralQtr2"
+ "EastQtr3" + "WestQtr3" + "NorthQtr3" + "SouthQtr3" + "CentralQtr3"
+ "EastQtr4" + "WestQtr4" + "NorthQtr4" + "SouthQtr4" + "CentralQtr4"
) as "Volume"
from regional_quarterly_sales_v
order by "Year";
Using an unpivot approach is equally difficult.
with unpivoted_sales as (
select "Year", "Volume", "RegionQuarter"
from
regional_quarterly_sales_v
unpivot (
"Volume" for "RegionQuarter" in (
"EastQtr1", "WestQtr1", "NorthQtr1", "SouthQtr1", "CentralQtr1",
"EastQtr2", "WestQtr2", "NorthQtr2", "SouthQtr2", "CentralQtr2",
"EastQtr3", "WestQtr3", "NorthQtr3", "SouthQtr3", "CentralQtr3",
"EastQtr4", "WestQtr4", "NorthQtr4", "SouthQtr4", "CentralQtr4"
)
)
)
select
"Year",
sum("Volume") as "Volume"
from unpivoted_sales
group by "Year"
order by "Year";
What if the same data is also pivoted to include a column for every month for each of the 5 sales regions?
sales_data-regional_monthly.sql
select
"Year",
"EastJanuary", "EastFebruary", "EastMarch",
"EastApril", "EastMay", "EastJune"
from regional_monthly_sales_v
order by "Year";
Year EastJanuary EastFebruary EastMarch EastApril EastMay EastJune
2021 529 13488 19528 15659 16739 13319
2022 13220 2556 18191 5850 14574 6741
2023 12905 14277 7257 5711 18163 14741
Adding 60 columns no longer seems manageable, and the risk of errors is basically unacceptable.
select
"Year",
(
"EastJanuary" + "WestJanuary" + "NorthJanuary" + "SouthJanuary" + "CentralJanuary" +
"EastFebruary" + "WestFebruary" + "NorthFebruary" + "SouthFebruary" + "CentralFebruary" +
...
"EastNovember" + "WestNovember" + "NorthNovember" + "SouthNovember" + "CentralNovember" +
"EastDecember" + "WestDecember" + "NorthDecember" + "SouthDecember" + "CentralDecember"
) as "Volume"
from regional_monthly_sales_v
order by "Year";
Writing this unpivot is equally tedious and is an open invitation to mistakes.
with unpivoted_sales as (
select "Year", "Volume", "RegionMonth"
from
regional_monthly_sales_v
unpivot (
"Volume" for "RegionMonth" in (
"EastJanuary", "WestJanuary", "NorthJanuary", "SouthJanuary", "CentralJanuary",
"EastFebruary", "WestFebruary", "NorthFebruary", "SouthFebruary", "CentralFebruary",
...
"EastNovember", "WestNovember", "NorthNovember", "SouthNovember", "CentralNovember",
"EastDecember", "WestDecember", "NorthDecember", "SouthDecember", "CentralDecember"
)
)
)
select
"Year",
sum("Volume") as "Volume"
from unpivoted_sales
group by "Year"
order by "Year";
The data may be useful in a spreadsheet, but it cuts into our lunch hour when we are asked at the last minute to confirm that all of these reports add up to the same yearly totals. Wouldn't it be nice if there were a way to dynamically generate unpivot queries?
In this post I will explore a way to generate and run unpivot queries dynamically in PL/SQL. We will build a utility package that converts the source data to json and leverages JSON_DATAGUIDE to create JSON_TABLE and UNPIVOT SQL expressions dynamically.
Designing and building this utility will provide an excellent opportunity to walk through the development process step by step. By the end of the article we should be able to compare all of these reports in a single query with a minimum of custom code.
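To set expectations, here is a sketch of the kind of call we are working toward. It assumes the dynamic_json.unpivot_json_array pipelined function built later in this article and the quarterly_sales_v sample view; the exact call shape may differ slightly from the final sales-data examples.
-- sketch only: convert the pivoted view to a json array, then unpivot it dynamically
-- (on 21c and later the clob may need to be wrapped with json() to match the parameter type)
with sales_json as (
  select json_arrayagg(json_object(*) returning clob) as jdoc
  from quarterly_sales_v
)
select u."row#id" as "Year", u."column#key" as "Quarter", u."column#value" as "Volume"
from
  sales_json s,
  dynamic_json.unpivot_json_array(s.jdoc, row_identifier => 'Year') u;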
- Getting Started
- JSON_TABLE and UNPIVOT queries
- Test JSON Documents
- Design: Test Document Dataguides
- Convert Dataguide JSON To Relational Data
- Convert Dataguide To Associative Array
- Building The Dynamic JSON_TABLE Expression
- Building The Dynamic Unpivot Expression
- Unpivot Function Returning Associative Array
- Pipelined Function To Stream Unpivot Results
- Enhancements To Generalize Functionality
- Finishing Touches
- Dynamic Json Unpivot: Final Package Specification
- Dynamic Json Unpivot: Final Package Body
- Dynamic Json Unpivot: Unit Testing
- Dynamically Unpivoting Sales Data
- One Step Further: A Dynamic Unpivot Macro
- Next Steps
Getting Started
The source code for all articles here on practicalplsql.org is available in the practicalplsql public repository on github in the scripts/articles directory organized by subject and article. You can use create.user.practicalplsql.sql to create the practicalplsql schema. Sample data for articles can be found in the scripts/examples directory organized by example name. Scripts for this post are located in dynamic-json-unpivot. I have included the filename of the script in each code sample for easy reference. If you want to clean up the sample schema objects, use the drop_all_user_objects.sql script.
All scripts have been tested in SQL Developer on Oracle 19c, 21c and 23ai. Note: the anonymous blocks used to design the code all compiled in Oracle XE 21c, but the packaged code causes an internal error when compiling. The same packages compiled in OCI 21c, so the failure is likely due to the patch level of XE 21c. As of this writing, OCI has decommissioned 21c autonomous databases, so the target database versions are really 19c and 23ai.
All scripts should be executed with serveroutput on; this setting is omitted from the samples for clarity. Some queries and code blocks build on previous examples and omit the repeated code for brevity, but the source code has no omissions. Finally, when several scripts produce the same correct output, the output is shown only once to avoid repetition.
JSON_TABLE and UNPIVOT queries
We can query tables or generate rows on the fly to create a relational data source.
intro.relational_data.sql
with base as (
select
level * 10 as "c1_Number",
'details for ' || level * 10 as "c2_String",
date '2024-06-01' + level as "c3_Date"
from dual connect by level <= 3
)
select rownum as id, b.* from base b;
ID c1_Number c2_String c3_Date
1 10 details for 10 2024-06-02 00:00:00
2 20 details for 20 2024-06-03 00:00:00
3 30 details for 30 2024-06-04 00:00:00
JSON SQL methods can be used to project rows of relational data as a simple array of json objects. The default behavior of JSON_OBJECT is to convert table rows using the column names of the data source. This allows generation of json objects with a simple json_object(*) sql expression.
These rows can then be aggregated into a json array with JSON_ARRAYAGG. Going from relational data to json data can be done without specific references to the actual columns in the relational data source, so this conversion can easily be done with a minimal amount of code.
intro.relational_to_json.sql
with base as (...
), json_base as (
select json_arrayagg(json_object(*) order by "c1_Number") as "jDoc"
from base
)
select json_serialize("jDoc" pretty) as "jDoc"
from json_base;
[
{"c1_Number" : 10, "c2_String" : "details for 10", "c3_Date" : "2024-06-02T00:00:00"},
{"c1_Number" : 20, "c2_String" : "details for 20", "c3_Date" : "2024-06-03T00:00:00"},
{"c1_Number" : 30, "c2_String" : "details for 30", "c3_Date" : "2024-06-04T00:00:00"}
]
Converting this json document to relational data can be done with a JSON_TABLE expression that defines the document path to generate rows from, and a COLUMNS clause defining the mapping between json keys and relational columns to be projected. We can use dot notation and item methods to define each column quickly, but each column needs to be specified individually. If there are a large number of json keys to convert to columns, the COLUMNS clause requires a lot of literal code.
intro.json_to_relational.sql
with base as (...
), json_base as (...
), json_to_relational as (
select j.id, j."c1_Number", j."c2_String", j."c3_Date"
from
json_base b,
json_table(b."jDoc", '$[*]'
columns(
id for ordinality,
"c1_Number" number path '$.c1_Number',
"c2_String" varchar2(50) path '$.c2_String',
"c3_Date" date path '$.c3_Date'
)
) j
)
select * from json_to_relational;
ID c1_Number c2_String c3_Date
1 10 details for 10 2024-06-02 00:00:00
2 20 details for 20 2024-06-03 00:00:00
3 30 details for 30 2024-06-04 00:00:00
We can unpivot relational data, but the code is not very dynamic. The columns to be unpivoted need to be the same datatype. The UNPIVOT expression has to specify each column to be unpivoted.
If the relational source selects all columns from a table, we can identify the columns and their datatypes by querying the data dictionary. When the data source is a query, it is much harder to determine the columns and their datatypes. This information is needed to coerce the datatypes and create an unpivot query.
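For the table or view case, a sketch of that dictionary lookup might look like the following (assuming the object is owned by the current schema; the view name here is just an example):
-- sketch only: list the columns and datatypes of a table or view owned by the current user,
-- which could be used to generate the datatype coercion and the unpivot column list
select column_name, data_type, data_length
from user_tab_columns
where table_name = 'QUARTERLY_SALES_V'
order by column_id;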
intro.unpivot_relational.sql
with base as (...
), base_to_common_datatype as (
select
rownum as id,
to_char("c1_Number") as "c1_Number",
"c2_String",
to_char("c3_Date") as "c3_Date"
from base
), unpivot_base as (
select id, "cName", "cValue"
from
base_to_common_datatype
unpivot (
"cValue" for "cName" in (
"c1_Number", "c2_String", "c3_Date"
)
)
)
select * from unpivot_base;
ID cName cValue
1 c1_Number 10
1 c2_String details for 10
1 c3_Date 2024-06-02 00:00:00
2 c1_Number 20
2 c2_String details for 20
2 c3_Date 2024-06-03 00:00:00
3 c1_Number 30
3 c2_String details for 30
3 c3_Date 2024-06-04 00:00:00
If we convert a json document to relational data using JSON_TABLE, we can take a shortcut and coerce the datatypes in the COLUMNS clause. This makes the unpivot simpler. All column definitions and the unpivot will still require specific references to column names.
intro.unpivot_json_table.sql
with base as (...
), json_base as (...
), json_to_relational as (
select j.id, j."c1_Number", j."c2_String", j."c3_Date"
from
json_base b,
json_table(b."jDoc", '$[*]'
columns(
id for ordinality,
"c1_Number" varchar2(50) path '$.c1_Number',
"c2_String" varchar2(50) path '$.c2_String',
"c3_Date" varchar2(50) path '$.c3_Date'
)
) j
), unpivot_base as (
select id, "cName", "cValue"
from
json_to_relational
unpivot (
"cValue" for "cName" in (
"c1_Number", "c2_String", "c3_Date"
)
)
)
select * from unpivot_base;
ID cName cValue
1 c1_Number 10
1 c2_String details for 10
1 c3_Date 2024-06-02T00:00:00
2 c1_Number 20
2 c2_String details for 20
2 c3_Date 2024-06-03T00:00:00
3 c1_Number 30
3 c2_String details for 30
3 c3_Date 2024-06-04T00:00:00
If the data is already a json document, we can access the json keys by calling the JSON SQL aggregate method JSON_DATAGUIDE. This will return the schema of the json document, with the json paths and their datatypes.
For this simple json document, the primary array is typed as an array, each row from the data source is an object, and all of the other elements are simple scalar values typed as string or number. The date value is typed as a string because there is no date datatype in json. Because we want to coerce all of the datatypes for the unpivot anyway, this won't be a problem.
For these scalar elements, the json path is the original column name. As we will see, the results of the JSON_DATAGUIDE function provide sufficient information to create a JSON_TABLE expression and an UNPIVOT query dynamically!
intro.json_dataguide.sql
with base as (...
), json_base as (...
)
select json_serialize(json_dataguide("jDoc") pretty) as "jDataguide"
from json_base;
[
{"o:path" : "$", "type" : "array", "o:length" : 256},
{"o:path" : "$.c3_Date", "type" : "string", "o:length" : 32},
{"o:path" : "$.c1_Number", "type" : "number", "o:length" : 2},
{"o:path" : "$.c2_String", "type" : "string", "o:length" : 16}
]
In a PL/SQL function, we can accept a json document as the input parameter, query its dataguide to get the column information needed to build a JSON_TABLE expression and an UNPIVOT query based on it, and then return the unpivoted values in a standard shape as pipelined rows.
The design process to get to a dynamic unpivot function can be broken down into separate parts based on each functionality needed. Each part can be prototyped using an anonymous block and local functions so that we can see how the pieces fit together. When all the parts are worked out, we will be ready to put them together as a package that can be compiled and tested.
Test JSON Documents
Using the same json document throughout will keep the design process simple. A json object in an array may be missing keys for null values that were never created (for example, when json_object uses absent on null), so our test objects should not all have the same keys present.
The json array could also be a nested attribute of a json object, so we should also use a second test document that has this more complex structure. We will look closer at the differences this creates for the dataguide and for the necessary JSON_TABLE expressions when converting to relational data.
To keep the design scripts uncluttered, let's put this test json function into a package and add a print procedure to view it directly. This can easily be modified to return the JSON datatype instead of a CLOB on Oracle 21c or later. Since many of us are currently preparing to migrate 19c codebases to 23ai, I will use conditional compilation to keep the code compatible with Oracle 19c, 21c and 23ai for the rest of this article.
design-1.1-test_jdoc-package.sql
create or replace package dynamic_json#test
authid current_user
as
subtype json_format is varchar2(10);
c_array_simple constant json_format := 'simple';
c_array_nested constant json_format := 'nested';
function test_jdoc(
format_option in json_format default c_array_simple
) return $if dbms_db_version.version < 21 $then clob $else json $end ;
end dynamic_json#test;
/
create or replace package body dynamic_json#test
as
function test_jdoc(
format_option in json_format default c_array_simple
) return $if dbms_db_version.version < 21 $then clob $else json $end
is
l_json_text clob;
begin
case format_option
when c_array_simple then
l_json_text := to_clob(
q'~
[
{"name":"square","side":5,"color":"blue"},
{"name":"rectangle","length":5,"width":3},
{"name":"box","length":5,"width":3,"height":2},
{"name":"hexagon","side":3,"color":"red"},
{"name":"circle","radius":3},
]
~');
when c_array_nested then
l_json_text := to_clob(
q'~
{"my_shapes":
[
{"name":"square","side":5,"color":"blue"},
{"name":"rectangle","length":5,"width":3},
{"name":"box","length":5,"width":3,"height":2},
{"name":"hexagon","side":3,"color":"red"},
{"name":"circle","radius":3},
]
}
~');
else
raise_application_error(-20100, format_option
|| ' is an undefined test document type'
);
end case;
return
$if dbms_db_version.version < 21 $then
l_json_text
$else
json(l_json_text)
$end
;
end test_jdoc;
end dynamic_json#test;
/
These sample documents can be checked in a query to be sure they are properly constructed json. When we are finished with the design process, we can unit test with this query as the source for our pipelined function.
design-1.2-test_jdoc-query.sql
with json_document as (
select dynamic_json#test.test_jdoc('simple') as jdoc
from dual
)
select json_serialize(b.jdoc returning clob pretty) as jdoc
from json_document b;
with json_document as (
select dynamic_json#test.test_jdoc('nested') as jdoc
from dual
)
select json_serialize(b.jdoc returning clob pretty) as jdoc
from json_document b;
Design: Test Document Dataguides
We can use the test function to see what the dataguides will look like for each of the test json documents.
design-2.1-dataguide-test_jdoc.sql
with json_document as (
select dynamic_json#test.test_jdoc('simple') as jdoc
from dual
)
select
json_serialize(json_dataguide(b.jdoc) returning clob pretty) as jdoc_dataguide
from json_document b;
with json_document as (
select dynamic_json#test.test_jdoc('nested') as jdoc
from dual
)
select
json_serialize(json_dataguide(b.jdoc) returning clob pretty) as jdoc_dataguide
from json_document b;
dataguide for simple test document:
[
{"o:path" : "$", "type" : "array", "o:length" : 1},
{"o:path" : "$.name", "type" : "string", "o:length" : 16},
{"o:path" : "$.side", "type" : "number", "o:length" : 2},
{"o:path" : "$.color", "type" : "string", "o:length" : 4},
{"o:path" : "$.width", "type" : "number", "o:length" : 2},
{"o:path" : "$.height", "type" : "number", "o:length" : 2},
{"o:path" : "$.length", "type" : "number", "o:length" : 2},
{"o:path" : "$.radius", "type" : "number", "o:length" : 2}
]
dataguide for nested test document:
[
{"o:path" : "$", "type" : "object", "o:length" : 1},
{"o:path" : "$.my_shapes", "type" : "array", "o:length" : 1},
{"o:path" : "$.my_shapes.name", "type" : "string", "o:length" : 16},
{"o:path" : "$.my_shapes.side", "type" : "number", "o:length" : 2},
{"o:path" : "$.my_shapes.color", "type" : "string", "o:length" : 4},
{"o:path" : "$.my_shapes.width", "type" : "number", "o:length" : 2},
{"o:path" : "$.my_shapes.height", "type" : "number", "o:length" : 2},
{"o:path" : "$.my_shapes.length", "type" : "number", "o:length" : 2},
{"o:path" : "$.my_shapes.radius", "type" : "number", "o:length" : 2}
]
The dataguide for the nested json array is almost identical, but the column paths include the root object with the array attribute my_shapes.
Taking the dataguide of these document dataguides shows that all flat dataguides share a standard structure. The json attribute values of this meta dataguide are escaped because they include double quotes and colons.
design-2.2-any_dataguide.sql
with json_document as (
select dynamic_json#test.test_jdoc('simple') as jdoc
from dual
), document_dataguide as (
select json_dataguide(jdoc) as jdg
from json_document
), dataguide_dataguide as (
select json_dataguide(jdg) as jdg
from document_dataguide
)
select json_serialize(jdg returning clob pretty) as any_dataguide
from dataguide_dataguide;
[
{"o:path" : "$", "type" : "array", "o:length" : 512},
{"o:path" : "$.type", "type" : "string", "o:length" : 8},
{"o:path" : "$.\"o:path\"", "type" : "string", "o:length" : 8},
{"o:path" : "$.\"o:length\"", "type" : "number", "o:length" : 2}
]
Convert Dataguide JSON To Relational Data
Because the dataguide has a standard structure, converting any dataguide to relational data will always use the same JSON_TABLE expression. The paths in the simple document are the relative paths we need to convert the source document to relational data. The paths are also very close to column names that we could use when converting to relational data.
design-3.1-dataguide_columns-query.sql
with dataguide as (
select
json_dataguide(
dynamic_json#test.test_jdoc('simple')
) as jdoc_dataguide
from dual
), dataguide_relational as (
select
dbms_assert.enquote_name(
ltrim(j.dg_path, '$.')
, capitalize => false
) as column_name,
j.column_type,
j.dg_path as column_path
from
dataguide g,
json_table(g.jdoc_dataguide, '$[*]'
columns(
dg_path path '$."o:path".string()',
column_type path '$.type.string()'
)
) j
)
select column_name, column_type, column_path
from dataguide_relational
where column_type not in ('object', 'array');
If we strip out the array key for the nested document from the column paths using the replace function, we will end up with the same values for the column definitions.
design-3.1-dataguide_columns-query.sql
prompt adjust column name and relative path for the nested format by removing the array path
with dataguide as (
select
json_dataguide(
dynamic_json#test.test_jdoc('nested')
) as jdoc_dataguide
from dual
), dataguide_relational as (
select
dbms_assert.enquote_name(
ltrim(replace(j.dg_path, '.my_shapes'), '$.'),
capitalize => false
) as column_name,
j.column_type,
replace(j.dg_path, '.my_shapes') as column_path
from
dataguide g,
json_table(g.jdoc_dataguide, '$[*]'
columns(
dg_path path '$."o:path".string()',
column_type path '$.type.string()'
)
) j
)
select column_name, column_type, column_path
from dataguide_relational
where column_type not in ('object', 'array');
COLUMN_NAME COLUMN_TYPE COLUMN_PATH
"name" string $.name
"side" number $.side
"color" string $.color
"width" number $.width
"height" number $.height
"length" number $.length
"radius" number $.radius
In the dataguide for the nested document, the paths include the array attribute they are nested under. Removing it creates simple column names and relative paths. The array path will need to be a parameter to our function so that we can do this replacement. We will also need it as a parameter to specify the path for JSON_TABLE to generate rows from.
Convert Dataguide To Associative Array
Using an anonymous block with nested functions allows us to design the final functions we will need without actually compiling them.
Because the flat dataguide always has the same structure, we can select it into an associative array after using JSON_TABLE to convert it to relational data. An associative array will be easier to use when building dynamic sql statements based on these column definitions. A function to print out the associative array of column definitions will also be useful.
The dataguide elements with a type of object and array are not scalar elements in the document, so we will exclude these in the WHERE clause.
We have to adjust the path arguments for the nested json document so that they create well formed column names and the column paths are all relative paths for use in JSON_TABLE. Notice that using inline conditional compilation to conditionally define a variable’s datatype makes it clear why the $end should not have a semicolon.
The collection type column_definitions is an associative array of a record type column_definition. We can use this to keep the column definitions available to format both the JSON_TABLE expression and the UNPIVOT expression.
In 23ai, SQL supports booleans, so we can simply call dbms_assert to properly quote identifiers. Since we want the code to be compatible with 19c, we have to use different syntax to achieve the same goal. If we were writing a plain SQL query, conditional compilation could not be used and we would have to write the query specifically for lower versions that don't support booleans in SQL. Because we are writing the query inside PL/SQL, we can customize it with conditional compilation and the compiler will use the syntax appropriate to the version of Oracle.
Note: each part of the design process will include a debug procedure that uses dbms_output to show the results. These procedures are very simple and have been omitted from the code samples for clarity. Look at the full script in git to see the details of these procedures.
design-3.2-dataguide-columns-collection.sql
type column_definition is record (
column_name varchar2(64),
column_type varchar2(20),
column_path varchar2(100));
type column_definitions is table of column_definition index by pls_integer;
function dataguide_columns(
jdoc in $if dbms_db_version.version >= 21 $then json $else clob $end ,
array_path in varchar2 default null
) return column_definitions
is
l_key_columns column_definitions;
begin
with dataguide as (
select json_dataguide(jdoc) as jdoc_dataguide
from dual
), dataguide_relational as (
select
$if dbms_db_version.version >= 23 $then
dbms_assert.enquote_name(
ltrim(replace(j.dg_path, array_path), '$.'),
capitalize => false
) as column_name,
$else
'"' || ltrim(replace(j.dg_path, array_path), '$.') || '"' as column_name,
$end
j.column_type,
replace(j.dg_path, array_path) as column_path
from
dataguide g,
json_table(g.jdoc_dataguide, '$[*]'
columns(
dg_path path '$."o:path".string()',
column_type path '$.type.string()'
)
) j
)
select column_name, column_type, column_path
bulk collect into l_key_columns
from dataguide_relational
where column_type not in ('object', 'array');
return l_key_columns;
end dataguide_columns;
column_name => "name", column_type => string, column_path => $.name
column_name => "side", column_type => number, column_path => $.side
column_name => "color", column_type => string, column_path => $.color
column_name => "width", column_type => number, column_path => $.width
column_name => "height", column_type => number, column_path => $.height
column_name => "length", column_type => number, column_path => $.length
column_name => "radius", column_type => number, column_path => $.radius
Both sample documents return the same column information with relative paths because we passed the array path as a parameter and used it to clean up both the column names and paths.
Building The Dynamic JSON_TABLE Expression
Now that we can pull the column information from the dataguide, we have enough information to build the COLUMNS clause for JSON_TABLE. This dynamic query will be internal to our primary function, so we will put the json document into it as a bind variable. It is also intended to be used for unpivoting, so all of the column expressions can be typed as varchar2(4000) in the JSON_TABLE expression. Using a constant to represent this will allow us some flexibility later if we want to change to another datatype or size.
Create some formatting constants to keep the code readable and to allow easy changes later on. Use c_max_column_name_length to pad the column name and keep the resulting sql well formatted.
To make the structure of the sql expression clearer, we can create it as a template with markers for the sections that will be replaced (##ARRAY_PATH##, ##JSON_TABLE_COLUMNS##). This results in more readable code that cleanly separates the building of the replacement pieces from the template of the sql statement.
design-4.1-json_table_expression.sql
c_lf constant varchar2(1) := chr(10);
c_indent constant varchar2(100) := lpad(' ', 12, ' ');
c_max_column_name_length constant number := 10;
c_unpivot_to_datatype constant varchar2(20) := 'varchar2(4000)';
function json_table_expression(
key_columns in column_definitions,
array_path in varchar2 default null
) return varchar2
is
l_sql varchar2(32000);
l_columns_clause varchar2(4000);
begin
--build the columns expression
for i in indices of key_columns loop
l_columns_clause := l_columns_clause || c_indent
|| ', ' || rpad(key_columns(i).column_name, c_max_column_name_length + 1, ' ')
|| c_unpivot_to_datatype
|| ' path ''' || key_columns(i).column_path || ''''
|| case when i <> key_columns.count then c_lf end;
end loop;
l_sql := q'+
select j.*
from
json_table(:jdoc, '$##ARRAY_PATH##[*]'
columns (
"row#id" for ordinality
##JSON_TABLE_COLUMNS##
)
) j
+';
l_sql := replace(l_sql, '##JSON_TABLE_COLUMNS##', l_columns_clause);
l_sql := replace(l_sql, '##ARRAY_PATH##', rtrim(array_path,'.'));
return l_sql;
end json_table_expression;
JSON_TABLE Expression generated from document dataguide. no array path
select j.*
from
json_table(:jdoc, '$[*]'
columns (
"row#id" for ordinality
, "name" varchar2(4000) path '$.name'
, "side" varchar2(4000) path '$.side'
, "color" varchar2(4000) path '$.color'
, "width" varchar2(4000) path '$.width'
, "height" varchar2(4000) path '$.height'
, "length" varchar2(4000) path '$.length'
, "radius" varchar2(4000) path '$.radius'
)
) j
JSON_TABLE Expression generated from document dataguide. array_path => .my_shapes
select j.*
from
json_table(:jdoc, '$.my_shapes[*]'
columns (
"row#id" for ordinality
, "name" varchar2(4000) path '$.name'
, "side" varchar2(4000) path '$.side'
, "color" varchar2(4000) path '$.color'
, "width" varchar2(4000) path '$.width'
, "height" varchar2(4000) path '$.height'
, "length" varchar2(4000) path '$.length'
, "radius" varchar2(4000) path '$.radius'
)
) j
The output here is a well-formed JSON_TABLE statement that converts a json bind variable to relational data. Test this query by commenting out the bind variable line and replacing it with a call to the function that returns the test document.
design-4.2-json_table_expression-test.sql
--json_table(:jdoc, '$[*]'
json_table(dynamic_json#test.test_jdoc(), '$[*]'
--json_table(:jdoc, '$.my_shapes[*]'
json_table(dynamic_json#test.test_jdoc('nested'), '$.my_shapes[*]'
row#id name side color width height length radius
1 square 5 blue <null> <null> <null> <null>
2 rectangle <null> <null> 3 <null> 5 <null>
3 box <null> <null> 3 2 5 <null>
4 hexagon 3 red <null> <null> <null> <null>
5 circle <null> <null> <null> <null> <null> 3
The JSON_TABLE document path has been set appropriately for each json document. Both queries return the same results.
Building The Dynamic Unpivot Expression
Now that the dynamic JSON_TABLE expression is working correctly, we can create another function to build a dynamic UNPIVOT query from the JSON_TABLE query used as a CTE.
design-5.1-unpivot-query.sql
function unpivot_expression(
key_columns in column_definitions,
json_table_query in varchar2
) return varchar2
is
l_sql varchar2(32000);
l_unpivot_columns varchar2(4000);
begin
--build the unpivot expression using the columns collection
for i in indices of key_columns loop
l_unpivot_columns := l_unpivot_columns || c_indent
|| case when i > 1 then ', ' end
|| key_columns(i).column_name
|| case when i <> key_columns.count then c_lf end;
end loop;
l_sql := q'+
with json_to_relational as (
##JSON_TABLE_EXPRESSION##
)
select "row#id", "column#key", "column#value"
from
json_to_relational
unpivot (
"column#value" for "column#key" in (
##UNPIVOT_COLUMNS##
)
)
order by "row#id", "column#key"
+';
l_sql := replace(l_sql, '##JSON_TABLE_EXPRESSION##', json_table_query);
l_sql := replace(l_sql, '##UNPIVOT_COLUMNS##', l_unpivot_columns);
return l_sql;
end unpivot_expression;
UNPIVOT Query generated from document dataguide. no array path
with json_to_relational as (
select j.*
from
json_table(:jdoc, '$[*]'
columns (
"row#id" for ordinality
, "name" varchar2(4000) path '$.name'
, "side" varchar2(4000) path '$.side'
, "color" varchar2(4000) path '$.color'
, "width" varchar2(4000) path '$.width'
, "height" varchar2(4000) path '$.height'
, "length" varchar2(4000) path '$.length'
, "radius" varchar2(4000) path '$.radius'
)
) j
)
select "row#id", "column#key", "column#value"
from
json_to_relational
unpivot (
"column#value" for "column#key" in (
"name"
, "side"
, "color"
, "width"
, "height"
, "length"
, "radius"
)
)
order by "row#id", "column#key"
UNPIVOT Query generated from document dataguide. array_path => .my_shapes
with json_to_relational as (
select j.*
from
json_table(:jdoc, '$.my_shapes[*]'
columns (
"row#id" for ordinality
, "name" varchar2(4000) path '$.name'
, "side" varchar2(4000) path '$.side'
, "color" varchar2(4000) path '$.color'
, "width" varchar2(4000) path '$.width'
, "height" varchar2(4000) path '$.height'
, "length" varchar2(4000) path '$.length'
, "radius" varchar2(4000) path '$.radius'
)
) j
)
select "row#id", "column#key", "column#value"
from
json_to_relational
unpivot (
"column#value" for "column#key" in (
"name"
, "side"
, "color"
, "width"
, "height"
, "length"
, "radius"
)
)
order by "row#id", "column#key"
The unpivot query is the same for both test document formats; the only difference is in the JSON_TABLE expression.
Testing the resulting unpivot queries with the appropriate test documents returns the correct results in both cases.
design-5.2-unpivot-query-test.sql
row#id column#key column#value
1 color blue
1 name square
1 side 5
2 length 5
2 name rectangle
2 width 3
3 height 2
3 length 5
3 name box
3 width 3
4 color red
4 name hexagon
4 side 3
5 name circle
5 radius 3
Unpivot Function Returning Associative Array
Because our final function is intended to be pipelined for use in SQL, it will return a collection type declared in a package specification or as a user defined type. We can test the code for this in the anonymous block with a local function that returns a collection of record types.
For the initial design we can use an associative array indexed by pls_integer that will have to be changed to a nested table for the pipelined function.
design-5.3-unpivot-query-collection.sql
type column_value is record(
row#id number,
column#key varchar2(64),
column#value varchar2(4000));
type column_values is table of column_value index by pls_integer;
function unpivot_collection(
jdoc in json,
array_path in varchar2 default null
) return column_values
is
l_key_columns column_definitions;
l_json_table_query varchar2(4000);
l_sql varchar2(4000);
l_values column_values;
begin
l_key_columns := dataguide_columns(jdoc, array_path);
l_json_table_query := json_table_expression(l_key_columns, array_path);
l_sql := unpivot_expression(l_key_columns, l_json_table_query);
execute immediate l_sql bulk collect into l_values using jdoc;
return l_values;
end unpivot_collection;
row#id => 1, column#key => color, column#value => blue
row#id => 1, column#key => name, column#value => square
row#id => 1, column#key => side, column#value => 5
row#id => 2, column#key => length, column#value => 5
row#id => 2, column#key => name, column#value => rectangle
row#id => 2, column#key => width, column#value => 3
row#id => 3, column#key => height, column#value => 2
row#id => 3, column#key => length, column#value => 5
row#id => 3, column#key => name, column#value => box
row#id => 3, column#key => width, column#value => 3
row#id => 4, column#key => color, column#value => red
row#id => 4, column#key => name, column#value => hexagon
row#id => 4, column#key => side, column#value => 3
row#id => 5, column#key => name, column#value => circle
row#id => 5, column#key => radius, column#value => 3
Pipelined Function To Stream Unpivot Results
The unpivot_collection function is very similar to a pipelined function. The primary difference is that we have to change the column_values type to a nested table (remove the index by clause) instead of an associative array (index by pls_integer).
Put all of the constants, types and procedure or function signatures into the specification. Add a function signature for the pipelined function at the end.
design-6.1-dynamic_json_unpivot-package.sql
create or replace package dynamic_json#alpha
authid current_user
as
c_lf constant varchar2(1) := chr(10);
c_indent constant varchar2(100) := lpad(' ', 12, ' ');
c_max_column_name_length constant number := 10;
c_unpivot_to_datatype constant varchar2(20) := 'varchar2(4000)';
type column_value is record(
row#id number,
column#key varchar2(64),
column#value varchar2(4000));
type column_values is table of column_value;
type column_definition is record (
column_name varchar2(64),
column_type varchar2(20),
column_path varchar2(100));
type column_definitions is table of column_definition index by pls_integer;
function dataguide_columns(
jdoc in $if dbms_db_version.version >= 21 $then json $else clob $end ,
array_path in varchar2 default null
) return column_definitions;
function json_table_expression(
key_columns in column_definitions,
array_path in varchar2 default null
) return varchar2;
function unpivot_expression(
key_columns in column_definitions,
json_table_query in varchar2
) return varchar2;
function unpivot_collection(
jdoc in $if dbms_db_version.version >= 21 $then json $else clob $end ,
array_path in varchar2 default null
) return column_values;
function unpivot_json_array(
jdoc in $if dbms_db_version.version >= 21 $then json $else clob $end ,
array_path in varchar2 default null
) return column_values pipelined;
end dynamic_json#alpha;
/
The package body is identical to the anonymous block we were developing, except for the constants and types that we moved to the package specification.
The pipelined function implementation is very simple: build the correct sql statement and use it to populate the output collection, then pipe each element of the collection back to the sql engine.
create or replace package body dynamic_json#alpha
as
function dataguide_columns...
function json_table_expression...
function unpivot_expression...
function unpivot_collection...
function unpivot_json_array(
jdoc in $if dbms_db_version.version >= 21 $then json $else clob $end ,
array_path in varchar2 default null
) return column_values pipelined
is
l_key_columns column_definitions;
l_json_table_query varchar2(4000);
l_sql varchar2(4000);
l_values column_values;
begin
l_key_columns := dataguide_columns(jdoc, array_path);
l_json_table_query := json_table_expression(l_key_columns, array_path);
l_sql := unpivot_expression(l_key_columns, l_json_table_query);
execute immediate l_sql bulk collect into l_values using jdoc;
for i in indices of l_values loop
pipe row (l_values(i));
end loop;
return;
end unpivot_json_array;
end dynamic_json#alpha;
/
We can test the pipelined function directly from SQL using the test document function and the appropriate arguments. Both test documents return the same results. It only took a few helper functions to get there and we are dynamically converting simple json arrays to unpivoted relational data!
design-6.2-dynamic_json_unpivot-test.sql
select *
from dynamic_json#alpha.unpivot_json_array(dynamic_json#test.test_jdoc());
select *
from dynamic_json#alpha.unpivot_json_array(dynamic_json#test.test_jdoc('nested'), '.my_shapes');
ROW#ID COLUMN#KEY COLUMN#VALUE
1 color blue
1 name square
1 side 5
2 length 5
2 name rectangle
2 width 3
3 height 2
3 length 5
3 name box
3 width 3
4 color red
4 name hexagon
4 side 3
5 name circle
5 radius 3
Enhancements To Generalize Functionality
For a pipelined function that unpivots a simple json array into rows with column names and values, we need to identify these column values with the source row that they are unpivoted from. Currently, our prototype generates a row identifier using the json array index ordinality. If the json source data contains an attribute that is a valid row identifier, we should use that instead of the ordinality value.
We can add an optional parameter to our functions and procedures to specify this row identifier. Most of the time we will simply pass the identifier along to other functions; the only places we actually need to act on it are when building the columns collection and when creating the JSON_TABLE expression.
The columns collection can filter out the row identifier, so that it doesn’t become part of the unpivot expression. The JSON_TABLE expression function will decide if the row#id column should be based on a row identifier or the array ordinality.
First we need to update our test document package to return json that contains a valid identifier in each json object. Add constants for the new test formats to the package specification.
design-7.1-test_jdoc-package.sql
c_array_simple_id constant json_format := 'simple_id';
c_array_nested_id constant json_format := 'nested_id';
In the package body, add the new test documents with the shape_id attribute to the case statement that gets the correct document.
design-7.1-test_jdoc-package.sql
case format_option
when c_array_simple then...
when c_array_nested then...
when c_array_simple_id then
l_json_text := to_clob(
q'~
[
{"shape_id":101,"name":"square","side":5,"color":"blue"},
{"shape_id":102,"name":"rectangle","length":5,"width":3},
{"shape_id":103,"name":"box","length":5,"width":3,"height":2},
{"shape_id":104,"name":"hexagon","side":3,"color":"red"},
{"shape_id":105,"name":"circle","radius":3},
]
~');
when c_array_nested_id then
l_json_text := to_clob(
q'~
{"my_shapes":
[
{"shape_id":101,"name":"square","side":5,"color":"blue"},
{"shape_id":102,"name":"rectangle","length":5,"width":3},
{"shape_id":103,"name":"box","length":5,"width":3,"height":2},
{"shape_id":104,"name":"hexagon","side":3,"color":"red"},
{"shape_id":105,"name":"circle","radius":3},
]
}
~');
else
raise_application_error(-20100, format_option || ' is an undefined test document type');
end case;
Now we can update the package to handle this case with a row_identifier parameter. Add this parameter to the function signatures and pass it along whenever a subfunction is called.
In the dataguide_columns function the implicit cursor should exclude this column from the resulting columns collection.
design-8.1-enhancements.sql
function dataguide_columns(
jdoc...
row_identifier in varchar2 default null,
array_path...
) return column_definitions
is
l_key_columns column_definitions;
begin
...
where
column_type not in ('object', 'array')
and column_name <> dbms_assert.enquote_name(nvl(row_identifier,'~#~#~'), capitalize => false);
The json_table_expression function can now define the row#id column conditionally. If row_identifier is not present, we can default to using the ordinality index. Create the correct expression for this column and add a new replacement marker to the template sql.
design-8.1-enhancements.sql
function json_table_expression(
key_columns...
row_identifier in varchar2 default null,
array_path...
) return varchar2
is
l_sql varchar2(32000);
l_columns_clause varchar2(4000);
l_row_identifier varchar2(1000) := '"row#id" for ordinality';
begin
...
--if row identifier has been passed, substitute it for ordinality value using local path
if row_identifier is not null then
l_row_identifier := '"row#id" number path ''$.' || row_identifier || '''';
end if;
l_sql := q'+
select j.*
from
json_table(:jdoc, '$##ARRAY_PATH##[*]'
columns (
##ROW_IDENTIFIER##
##JSON_TABLE_COLUMNS##
)
) j
+';
l_sql := replace(l_sql, '##ROW_IDENTIFIER##', l_row_identifier);
The unpivot_collection and unpivot_json_array functions just need to pass the row_identifier parameter when they call dataguide_columns or json_table_expression.
design-8.1-enhancements.sql
function unpivot_collection(
jdoc...
row_identifier in varchar2 default null,
...
begin
l_key_columns := dataguide_columns(jdoc, row_identifier, array_path);
l_json_table_query := json_table_expression(l_key_columns, row_identifier, array_path);
...
function unpivot_json_array(
jdoc...
row_identifier in varchar2 default null,
...
begin
l_key_columns := dataguide_columns(jdoc, row_identifier, array_path);
l_json_table_query := json_table_expression(l_key_columns, row_identifier, array_path);
...
Debugging all of the functionality with the new test documents shows that everything is still working correctly, and the shape_id is used instead of the ordinality index.
declare
l_json $if dbms_db_version.version >= 21 $then json $else clob $end ;
begin
l_json := dynamic_json#test.test_jdoc(dynamic_json#test.c_array_simple_id);
dynamic_json#beta.debug_json_table_expression(l_json, 'shape_id');
end;
/
select j.*
from
json_table(:jdoc, '$[*]'
columns (
"row#id" number path '$.shape_id'
, "name" varchar2(4000) path '$.name'
, "side" varchar2(4000) path '$.side'
, "color" varchar2(4000) path '$.color'
, "width" varchar2(4000) path '$.width'
, "height" varchar2(4000) path '$.height'
, "length" varchar2(4000) path '$.length'
, "radius" varchar2(4000) path '$.radius'
)
) j
Testing the pipelined functions shows that the shape_id value is being used correctly as the row identifier.
design-8.2-enhancements-test.sql
select *
from dynamic_json#beta.unpivot_json_array(dynamic_json#test.test_jdoc('simple'));
select *
from dynamic_json#beta.unpivot_json_array(
dynamic_json#test.test_jdoc('simple_id'),
row_identifier => 'shape_id');
Our shape ids start at 101, so it's easy to tell the difference between the results that used an identifier and the results that used ordinality.
simple json with no row identifier
ROW#ID COLUMN#KEY COLUMN#VALUE
1 color blue
1 name square
1 side 5
2 length 5
2 name rectangle
2 width 3
3 height 2
3 length 5
3 name box
3 width 3
4 color red
4 name hexagon
4 side 3
5 name circle
5 radius 3
simple json with shape_id as row identifier
ROW#ID COLUMN#KEY COLUMN#VALUE
101 color blue
101 name square
101 side 5
102 length 5
102 name rectangle
102 width 3
103 height 2
103 length 5
103 name box
103 width 3
104 color red
104 name hexagon
104 side 3
105 name circle
105 radius 3
Finishing Touches
The primary functionality we want to expose from this utility is the pipelined function. The column_values nested table type, column_value record type and unpivot_json_array pipelined function are the minimum elements that need to be exposed in the package specification.
Because the pipelined function could potentially return a very large set of rows, fetching the entire resultset at once into a collection could easily exceed available memory resources. Converting the actual fetch from the dynamic query to use bulk collect with a limit clause will help manage the memory footprint.
For this change we need to add a ref cursor variable and open the cursor before fetching batches of rows into the return collection. Exit the loop when a fetch returns no more rows and the collection count is 0.
Instead of hard coding the limit value, use a global variable in the package body for flexibility.
function unpivot_json_array(
jdoc in json_document_type,
row_identifier in varchar2 default null,
array_path in varchar2 default null
) return column_values pipelined
is
l_key_columns column_definitions;
l_json_table_query varchar2(4000);
l_sql varchar2(4000);
l_values column_values;
c sys_refcursor;
begin
l_key_columns := dataguide_columns(jdoc, row_identifier, array_path);
l_json_table_query := json_table_expression(l_key_columns, row_identifier, array_path);
l_sql := unpivot_expression(l_key_columns, l_json_table_query, row_identifier);
open c for l_sql using jdoc;
loop
fetch c bulk collect into l_values limit g_bulk_limit_rows;
exit when l_values.count = 0;
for i in indices of l_values loop
pipe row (l_values(i));
end loop;
end loop;
close c;
return;
end unpivot_json_array;
A procedure to set the limit value for the session allows tuning the utility to specific workloads.
procedure set_bulk_limit_rows(
bulk_limit_rows in positiven default 100
)
is
begin
g_bulk_limit_rows := bulk_limit_rows;
end set_bulk_limit_rows;
During development, we used varchar2(4000) as a default common datatype for JSON_TABLE to support the unpivot operation. While this works for a broad range of cases, we may later want the flexibility of unpivoting a set of numbers or even CLOB values. Converting the constant g_unpivot_to_datatype to a global variable with a procedure to set it will support this.
Because our output record always uses a varchar, this setting will not influence the datatype of the column#value that is returned.
A group of constants in the specification documents which datatypes are currently supported for the unpivot operation. Raising an exception for unsupported values in the set procedure ensures that the generated sql is well formed and helps document these options.
subtype unpivot_datatype is varchar2(20);
c_varchar2_100 constant unpivot_datatype := 'varchar2(100)';
c_varchar2_1000 constant unpivot_datatype := 'varchar2(1000)';
c_varchar2_4000 constant unpivot_datatype := 'varchar2(4000)';
c_clob constant unpivot_datatype := 'clob';
c_number constant unpivot_datatype := 'number';
procedure set_unpivot_to_datatype(
unpivot_to_datatype in unpivot_datatype default c_varchar2_4000)
is
begin
if unpivot_to_datatype not in (
c_varchar2_100, c_varchar2_1000, c_varchar2_4000, c_clob, c_number)
then
raise_application_error(-20100, unpivot_to_datatype
|| ' is not supported for dynamic unpivot'
);
end if;
g_unpivot_to_datatype := unpivot_to_datatype;
end set_unpivot_to_datatype;
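A short usage sketch for these two setters (session scoped, using the final package name dynamic_json introduced below):
begin
  -- fetch 500 rows per bulk collect instead of the default 100
  dynamic_json.set_bulk_limit_rows(500);
  -- unpivot numeric source values as numbers rather than varchar2(4000)
  dynamic_json.set_unpivot_to_datatype(dynamic_json.c_number);
end;
/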
Because unit testing a utility package like this with known inputs contributes significantly to confidence in code stability, the test_jdoc function should definitely be included. Additional test documents can easily be added to support unpivoting to numeric common datatypes and to demonstrate unsupported json structures. A unit testing framework like utPLSQL could use these test documents to create a suite of unit tests for the utility package.
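For example, a minimal utPLSQL sketch might look like this (the package name ut_dynamic_json is hypothetical; the expected count of 15 matches the rows shown earlier for the simple test document):
create or replace package ut_dynamic_json
as
  --%suite(dynamic_json unpivot)

  --%test(simple array unpivots to 15 key/value rows)
  procedure simple_array_row_count;
end ut_dynamic_json;
/
create or replace package body ut_dynamic_json
as
  procedure simple_array_row_count
  is
    l_count number;
  begin
    select count(*)
    into l_count
    from dynamic_json.unpivot_json_array(dynamic_json.test_jdoc('simple'));
    ut.expect(l_count).to_equal(15);
  end simple_array_row_count;
end ut_dynamic_json;
/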
We can combine all of the debug procedures with a single wrapper with flags to switch each type of debug output and expose this in the specification.
procedure debug_unpivot_json_array(
jdoc in json_document_type,
row_identifier in varchar2 default null,
array_path in varchar2 default null,
dataguide_columns in boolean default true,
json_table_expression in boolean default true,
unpivot_expression in boolean default true
)
is
begin
if dataguide_columns then
debug_dataguide_columns(jdoc, row_identifier, array_path);
end if;
if json_table_expression then
debug_json_table_expression(jdoc, row_identifier, array_path);
end if;
if unpivot_expression then
debug_unpivot_expression(jdoc, row_identifier, array_path);
end if;
end debug_unpivot_json_array;
Creating a subtype for the json input document will clarify what we are trying to accomplish with conditional compilation based on database versions supporting the json constructor instead of using clobs for json documents.
subtype json_document_type is $if dbms_db_version.version >= 21 $then json $else clob $end ;
Dynamic Json Unpivot: Final Package Specification
With these changes in place, the final package specification has been pared down to the minimal necessary code to use the utility. Everything we need to use, test and debug the utility is contained in a single package. Developers can review the specification and understand how this utility will work as well as what sort of json structures the utility will support.
design-9.1-dynamic_json-spec.sql
create or replace package dynamic_json
authid current_user
as
subtype json_document_type is $if dbms_db_version.version >= 21 $then json $else clob $end ;
subtype unpivot_datatype is varchar2(20);
c_varchar2_100 constant unpivot_datatype := 'varchar2(100)';
c_varchar2_1000 constant unpivot_datatype := 'varchar2(1000)';
c_varchar2_4000 constant unpivot_datatype := 'varchar2(4000)';
c_clob constant unpivot_datatype := 'clob';
c_number constant unpivot_datatype := 'number';
subtype json_format is varchar2(10);
c_array_simple constant json_format := 'simple';
c_array_nested constant json_format := 'nested';
c_array_simple_id constant json_format := 'simple_id';
c_array_nested_id constant json_format := 'nested_id';
type column_value is record(
row#id number,
column#key varchar2(64),
column#value varchar2(4000));
type column_values is table of column_value;
procedure set_bulk_limit_rows(
bulk_limit_rows in positiven default 100
);
procedure set_unpivot_to_datatype(
unpivot_to_datatype in unpivot_datatype default c_varchar2_4000
);
function unpivot_json_array(
jdoc in json_document_type,
row_identifier in varchar2 default null,
array_path in varchar2 default null
) return column_values pipelined;
procedure debug_unpivot_json_array(
jdoc in json_document_type,
row_identifier in varchar2 default null,
array_path in varchar2 default null,
dataguide_columns in boolean default true,
json_table_expression in boolean default true,
unpivot_expression in boolean default true
);
function test_jdoc(
format_option in json_format default c_array_simple
) return json_document_type;
end dynamic_json;
/
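With the specification in place and the package body from the next section compiled, a quick sanity check can reuse the built-in test documents (a sketch mirroring the earlier alpha-package tests):
select *
from dynamic_json.unpivot_json_array(dynamic_json.test_jdoc());
select *
from dynamic_json.unpivot_json_array(
  dynamic_json.test_jdoc('nested_id'),
  row_identifier => 'shape_id',
  array_path     => '.my_shapes');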
Dynamic Json Unpivot: Final Package Body
The final package body contains more functions and procedures than the specification exposes to callers. This encapsulation of helper modules is the real advantage of PL/SQL packages. While everything could have been accomplished in a single function, the separation into different sub functions makes the code much easier to understand and maintain.
design-9.2-dynamic_json-body.sql
create or replace package body dynamic_json
as
c_lf constant varchar2(1) := chr(10);
c_indent constant varchar2(100) := lpad(' ', 12, ' ');
c_max_column_name_length constant number := 64;
g_unpivot_to_datatype unpivot_datatype := c_varchar2_4000;
g_bulk_limit_rows positiven := 100;
type column_definition is record (
column_name varchar2(64),
column_type varchar2(20),
column_path varchar2(100));
type column_definitions is table of column_definition index by pls_integer;
procedure set_bulk_limit_rows(
bulk_limit_rows in positiven default 100
)
is
begin
g_bulk_limit_rows := bulk_limit_rows;
end set_bulk_limit_rows;
procedure set_unpivot_to_datatype(
unpivot_to_datatype in unpivot_datatype default c_varchar2_4000
)
is
begin
if unpivot_to_datatype not in (
c_varchar2_100, c_varchar2_1000, c_varchar2_4000, c_clob, c_number)
then
raise_application_error(-20100, unpivot_to_datatype
|| ' is not supported for dynamic unpivot'
);
end if;
g_unpivot_to_datatype := unpivot_to_datatype;
end set_unpivot_to_datatype;
function dataguide_columns(
jdoc in json_document_type,
row_identifier in varchar2 default null,
array_path in varchar2 default null
) return column_definitions
is
l_key_columns column_definitions;
begin
with dataguide as (
select json_dataguide(jdoc) as jdoc_dataguide
from dual
), dataguide_relational as (
select
$if dbms_db_version.version >= 23 $then
dbms_assert.enquote_name(
ltrim(replace(j.dg_path, array_path), '$.'),
capitalize => false
) as column_name,
$else
'"' || ltrim(replace(j.dg_path, array_path), '$.') || '"' as column_name,
$end
j.column_type,
replace(j.dg_path, array_path) as column_path
from
dataguide g,
json_table(g.jdoc_dataguide, '$[*]'
columns(
dg_path path '$."o:path".string()',
column_type path '$.type.string()'
)
) j
)
select column_name, column_type, column_path
bulk collect into l_key_columns
from dataguide_relational
where
column_type not in ('object', 'array')
$if dbms_db_version.version >= 23 $then
and column_name <> dbms_assert.enquote_name(nvl(row_identifier,'~#~#~'), capitalize => false)
$else
and column_name <> '"' || nvl(row_identifier,'~#~#~') || '"'
$end
;
return l_key_columns;
end dataguide_columns;
function json_table_expression(
key_columns in column_definitions,
row_identifier in varchar2 default null,
array_path in varchar2 default null
) return varchar2
is
l_sql varchar2(32000);
l_columns_clause varchar2(4000);
l_row_identifier varchar2(1000) := '"row#id" for ordinality';
begin
--build the columns expression
for i in
$if dbms_db_version.version >= 21 $then
indices of key_columns
$else
1..key_columns.count
$end
loop
l_columns_clause := l_columns_clause || c_indent
|| ', ' || key_columns(i).column_name
|| ' ' || g_unpivot_to_datatype
|| ' path ''' || key_columns(i).column_path || ''''
|| case when i <> key_columns.count then c_lf end;
end loop;
--if row identifier has been passed, substitute it for ordinality value using local path
if row_identifier is not null then
l_row_identifier := '"row#id" number path ''$.' || row_identifier || '''';
end if;
l_sql := q'+
select j.*
from
json_table(:jdoc, '$##ARRAY_PATH##[*]'
columns (
##ROW_IDENTIFIER##
##JSON_TABLE_COLUMNS##
)
) j
+';
l_sql := replace(l_sql, '##JSON_TABLE_COLUMNS##', l_columns_clause);
l_sql := replace(l_sql, '##ARRAY_PATH##', rtrim(array_path,'.'));
l_sql := replace(l_sql, '##ROW_IDENTIFIER##', l_row_identifier);
return l_sql;
end json_table_expression;
function unpivot_expression(
key_columns in column_definitions,
json_table_query in varchar2,
row_identifier in varchar2 default null
) return varchar2
is
l_sql varchar2(32000);
l_unpivot_columns varchar2(4000);
begin
--build the unpivot expression
for i in
$if dbms_db_version.version >= 21 $then
indices of key_columns
$else
1..key_columns.count
$end
loop
l_unpivot_columns := l_unpivot_columns || c_indent
|| case when i > 1 then ', ' end
|| key_columns(i).column_name
|| case when i <> key_columns.count then c_lf end;
end loop;
l_sql := q'+
with json_to_relational as (
##JSON_TABLE_EXPRESSION##
)
select "row#id", "column#key", "column#value"
from
json_to_relational
unpivot (
"column#value" for "column#key" in (
##UNPIVOT_COLUMNS##
)
)
order by "row#id", "column#key"
+';
l_sql := replace(l_sql, '##JSON_TABLE_EXPRESSION##', json_table_query);
l_sql := replace(l_sql, '##UNPIVOT_COLUMNS##', l_unpivot_columns);
return l_sql;
end unpivot_expression;
function unpivot_json_array(
jdoc in json_document_type,
row_identifier in varchar2 default null,
array_path in varchar2 default null
) return column_values
pipelined
is
l_key_columns column_definitions;
l_json_table_query varchar2(4000);
l_sql varchar2(4000);
l_values column_values;
c sys_refcursor;
begin
l_key_columns := dataguide_columns(jdoc, row_identifier, array_path);
l_json_table_query := json_table_expression(l_key_columns, row_identifier, array_path);
l_sql := unpivot_expression(l_key_columns, l_json_table_query, row_identifier);
open c for l_sql using jdoc;
loop
fetch c bulk collect into l_values limit g_bulk_limit_rows;
exit when l_values.count = 0;
for i in
$if dbms_db_version.version >= 21 $then
indices of l_values
$else
1..l_values.count
$end
loop
pipe row (l_values(i));
end loop;
end loop;
close c;
return;
end unpivot_json_array;
procedure debug_test_output_heading(p_heading in varchar2)
is
c_separator constant varchar2(100) := lpad('-', 100, '-');
begin
dbms_output.put_line(c_separator);
dbms_output.put_line('--' || p_heading);
dbms_output.put_line(c_separator);
end debug_test_output_heading;
procedure debug_dataguide_columns(
jdoc in json_document_type,
row_identifier in varchar2 default null,
array_path in varchar2 default null
)
is
l_key_columns column_definitions;
begin
l_key_columns := dataguide_columns(jdoc, row_identifier, array_path);
debug_test_output_heading('Column definitions from document dataguide.'
|| case
when array_path is null then ' no array path'
else ' array_path => ' || array_path
end
);
for i in
$if dbms_db_version.version >= 21 $then
indices of l_key_columns
$else
1..l_key_columns.count
$end
loop
dbms_output.put_line(
'column_name => ' || l_key_columns(i).column_name
|| ', column_type => ' || l_key_columns(i).column_type
|| ', column_path => ' || l_key_columns(i).column_path);
end loop;
end debug_dataguide_columns;
procedure debug_json_table_expression(
jdoc in json_document_type,
row_identifier in varchar2 default null,
array_path in varchar2 default null
)
is
l_key_columns column_definitions;
l_sql varchar2(4000);
begin
l_key_columns := dataguide_columns(jdoc, row_identifier, array_path);
l_sql := json_table_expression(l_key_columns, row_identifier, array_path);
debug_test_output_heading('JSON_TABLE Expression generated from document dataguide.'
|| case
when array_path is null then ' no array path'
else ' array_path => ' || array_path
end
);
dbms_output.put_line(l_sql);
end debug_json_table_expression;
procedure debug_unpivot_expression(
jdoc in json_document_type,
row_identifier in varchar2 default null,
array_path in varchar2 default null
)
is
l_key_columns column_definitions;
l_json_table_query varchar2(4000);
l_sql varchar2(4000);
begin
l_key_columns := dataguide_columns(jdoc, row_identifier, array_path);
l_json_table_query := json_table_expression(l_key_columns, row_identifier, array_path);
l_sql := unpivot_expression(l_key_columns, l_json_table_query);
debug_test_output_heading('UNPIVOT Query generated from document dataguide.'
|| case
when array_path is null then ' no array path'
else ' array_path => ' || array_path
end
);
dbms_output.put_line(l_sql);
end debug_unpivot_expression;
procedure debug_unpivot_json_array(
jdoc in json_document_type,
row_identifier in varchar2 default null,
array_path in varchar2 default null,
dataguide_columns in boolean default true,
json_table_expression in boolean default true,
unpivot_expression in boolean default true
)
is
begin
if dataguide_columns then
debug_dataguide_columns(jdoc, row_identifier, array_path);
end if;
if json_table_expression then
debug_json_table_expression(jdoc, row_identifier, array_path);
end if;
if unpivot_expression then
debug_unpivot_expression(jdoc, row_identifier, array_path);
end if;
end debug_unpivot_json_array;
function test_jdoc(
format_option in json_format default c_array_simple
) return json_document_type
is
l_json_text clob;
begin
case format_option
when c_array_simple then
l_json_text := to_clob(
q'~
[
{"name":"square","side":5,"color":"blue"},
{"name":"rectangle","length":5,"width":3},
{"name":"box","length":5,"width":3,"height":2},
{"name":"hexagon","side":3,"color":"red"},
{"name":"circle","radius":3},
]
~');
when c_array_nested then
l_json_text := to_clob(
q'~
{"my_shapes":
[
{"name":"square","side":5,"color":"blue"},
{"name":"rectangle","length":5,"width":3},
{"name":"box","length":5,"width":3,"height":2},
{"name":"hexagon","side":3,"color":"red"},
{"name":"circle","radius":3},
]
}
~');
when c_array_simple_id then
l_json_text := to_clob(
q'~
[
{"shape_id":101,"name":"square","side":5,"color":"blue"},
{"shape_id":102,"name":"rectangle","length":5,"width":3},
{"shape_id":103,"name":"box","length":5,"width":3,"height":2},
{"shape_id":104,"name":"hexagon","side":3,"color":"red"},
{"shape_id":105,"name":"circle","radius":3},
]
~');
when c_array_nested_id then
l_json_text := to_clob(
q'~
{"my_shapes":
[
{"shape_id":101,"name":"square","side":5,"color":"blue"},
{"shape_id":102,"name":"rectangle","length":5,"width":3},
{"shape_id":103,"name":"box","length":5,"width":3,"height":2},
{"shape_id":104,"name":"hexagon","side":3,"color":"red"},
{"shape_id":105,"name":"circle","radius":3},
]
}
~');
else
raise_application_error(-20100, format_option
|| ' is an undefined test document type');
end case;
return $if dbms_db_version.version < 21 $then l_json_text $else json(l_json_text) $end ;
end test_jdoc;
end dynamic_json;
/
Dynamic Json Unpivot: Unit Testing
Using the set of test documents, we can unit test the final package to be sure that everything is working as designed.
design-9.3-dynamic_json-test.sql
select *
from dynamic_json.unpivot_json_array(dynamic_json.test_jdoc('simple'));
select *
from dynamic_json.unpivot_json_array(dynamic_json.test_jdoc('simple_id'),
row_identifier => 'shape_id');
select *
from dynamic_json.unpivot_json_array(dynamic_json.test_jdoc('nested'),
array_path => '.my_shapes');
select *
from dynamic_json.unpivot_json_array(dynamic_json.test_jdoc('nested_id'),
row_identifier => 'shape_id',
array_path => '.my_shapes');
All four test queries still return the correct unpivoted results.
ROW#ID COLUMN#KEY COLUMN#VALUE
101 color blue
101 name square
101 side 5
102 length 5
102 name rectangle
102 width 3
103 height 2
103 length 5
103 name box
103 width 3
104 color red
104 name hexagon
104 side 3
105 name circle
105 radius 3
Looking at user objects gives us a test with number, string and date columns, all unpivoted to the common varchar2(4000) datatype. Testing in 23ai, we can use the json constructor shortcuts to create the objects and aggregate the array. The json constructor returns json in binary format, and the package has been conditionally compiled to expect this.
design-9.4-dynamic_json-test.sql
with base as (
select
object_id,
object_name,
object_type,
created,
namespace
from user_objects
), to_json as (
select json [ json { b.* } ] as jdoc
from base b
)
select u.*
from
to_json j,
dynamic_json.unpivot_json_array(j.jdoc, row_identifier => 'OBJECT_ID') u;
ROW#ID COLUMN#KEY COLUMN#VALUE
100457 CREATED 2024-07-04T19:01:22
100457 NAMESPACE 1
100457 OBJECT_NAME DYNAMIC_JSON#TEST
100457 OBJECT_TYPE PACKAGE
100458 CREATED 2024-07-04T19:01:22
100458 NAMESPACE 2
100458 OBJECT_NAME DYNAMIC_JSON#TEST
100458 OBJECT_TYPE PACKAGE BODY
100459 CREATED 2024-07-04T19:22:51
100459 NAMESPACE 1
100459 OBJECT_NAME DYNAMIC_JSON#ALPHA
100459 OBJECT_TYPE PACKAGE
100460 CREATED 2024-07-04T19:22:51
100460 NAMESPACE 2
100460 OBJECT_NAME DYNAMIC_JSON#ALPHA
100460 OBJECT_TYPE PACKAGE BODY
100461 CREATED 2024-07-04T19:30:24
100461 NAMESPACE 1
100461 OBJECT_NAME DYNAMIC_JSON#BETA
100461 OBJECT_TYPE PACKAGE
100462 CREATED 2024-07-04T19:30:24
100462 NAMESPACE 2
100462 OBJECT_NAME DYNAMIC_JSON#BETA
100462 OBJECT_TYPE PACKAGE BODY
100463 CREATED 2024-07-04T19:32:40
100463 NAMESPACE 1
100463 OBJECT_NAME DYNAMIC_JSON
100463 OBJECT_TYPE PACKAGE
100464 CREATED 2024-07-04T19:36:20
100464 NAMESPACE 2
100464 OBJECT_NAME DYNAMIC_JSON
100464 OBJECT_TYPE PACKAGE BODY
In 19c or earlier, we need to generate the json as a clob instead. The package is conditionally compiled to handle the json as clob data in these versions.
design-9.4-dynamic_json-test-19c.sql
json_arrayagg(
json_object(b.* returning clob)
returning clob) as jdoc
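As a reference, here is a sketch of how the complete 19c form of the test query could look, with the clob-returning aggregate substituted for the 23ai constructor shorthand; the rest of the query is unchanged. Treat it as an untested sketch rather than one of the project scripts.
with base as (
select
object_id,
object_name,
object_type,
created,
namespace
from user_objects
), to_json as (
select
json_arrayagg(
json_object(b.* returning clob)
returning clob) as jdoc
from base b
)
select u.*
from
to_json j,
dynamic_json.unpivot_json_array(j.jdoc, row_identifier => 'OBJECT_ID') u;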
Dynamically Unpivoting Sales Data
In the introduction we looked at unpivoting data sources for quarterly or monthly sales. Now that our utility is working, the dynamic unpivot can be applied to these situations easily. All we have to do is convert the data source to an array of json objects and pass it to our function. We can also alias the outputs and adjust the final value datatypes so that it doesn't look like we used a generic solution to get to lunch on time.
Let's revisit the sales data validation task that we started with and see how the new utility makes it easier. First, confirm that the quarterly sales data is being unpivoted correctly.
design-9.6-dynamic_json-test.sql
with quarterly_json as (
select json{ 'jSales' : json_arrayagg( json{ s.* } ) } as jdoc
from quarterly_sales_v s
), quarterly_unpivot as (
select
u.row#id as "Year",
u.column#key as "Quarter",
to_number(u.column#value) as "Volume"
from
quarterly_json j,
dynamic_json.unpivot_json_array(j.jdoc, 'Year', '.jSales') u
)
select "Year", "Quarter", "Volume"
from quarterly_unpivot
order by "Year", "Quarter";
Year Quarter Volume
2021 Qtr1 200184
2021 Qtr2 198036
2021 Qtr3 145044
2021 Qtr4 153057
2022 Qtr1 140424
2022 Qtr2 158398
2022 Qtr3 150220
2022 Qtr4 177484
2023 Qtr1 172737
2023 Qtr2 163434
2023 Qtr3 161465
2023 Qtr4 152014
12 rows selected.
It looks like all the columns are being unpivoted properly. The next step is to verify that these values add up to the yearly totals. The column#key is not part of the grouping query, so we can leave it out.
with quarterly_json as (
select json{ 'jSales' : json_arrayagg( json{ s.* } ) } as jdoc
from quarterly_sales_v s
), quarterly_unpivot as (
select
u.row#id as "Year",
to_number(u.column#value) as "Volume"
from
quarterly_json j,
dynamic_json.unpivot_json_array(j.jdoc, 'Year', '.jSales') u
)
select "Year", sum("Volume") as "Volume"
from quarterly_unpivot
group by "Year"
order by "Year";
select "Year", "Volume"
from annual_sales_v
order by "Year";
Validate yearly totals for Quarterly Sales using dynamic unpivot
Year Volume
2021 696321
2022 626526
2023 649650
compare to the yearly totals view
Year Volume
2021 696321
2022 626526
2023 649650
Perfect, the quarterly sales data has the correct yearly values. Because we are unpivoting programmatically, if one report works they should all work.
Instead of running each validation separately, let's put them all together to get the task finished. If we add a field to each unpivot result to show the report it came from, we can combine all of the unpivot queries and then group by report and year at the end to see if everything agrees.
with quarterly_json as (
select json{ 'jSales' : json_arrayagg( json{ s.* } ) } as jdoc
from quarterly_sales_v s
), quarterly_unpivot as (
select
'Quarterly Sales' as "Report",
u.row#id as "Year",
to_number(u.column#value) as "Volume"
from
quarterly_json j,
dynamic_json.unpivot_json_array(j.jdoc, 'Year', '.jSales') u
...regional_unpivot...
...regional_quarterly_unpivot...
...monthly_unpivot...
...regional_monthly_unpivot...
), all_unpivots as (
select "Report", "Year", "Volume" from quarterly_unpivot
union all
select "Report", "Year", "Volume" from regional_unpivot
union all
select "Report", "Year", "Volume" from regional_quarterly_unpivot
union all
select "Report", "Year", "Volume" from monthly_unpivot
union all
select "Report", "Year", "Volume" from regional_monthly_unpivot
union all
select 'Annual Sales View' as "Report", "Year", "Volume" from annual_sales_v
)
select "Report", "Year", sum("Volume") as "Volume"
from all_unpivots
group by "Report", "Year"
order by "Year", "Report";
Running this generates an exception!
ORA-06502: PL/SQL: value or conversion error: character string buffer too small
ORA-06512: at "CLOUD_PRACTICALPLSQL.DYNAMIC_JSON", line 176
It looks like the problem is in the variables used by the pipelined function to hold the generated sql statements. Looking at the variable declarations in unpivot_json_array, we can see that the json_table expression and the final query cannot exceed 4000 characters. If we are unpivoting a large number of columns, the resulting sql statement can be very large; thinking about the data here, the regional monthly unpivot is the likely suspect.
l_json_table_query varchar2(4000);
l_sql varchar2(4000);
Let's fix this and future-proof the code by declaring a subtype at the top of the package body and using it everywhere we build sql statements or fragments such as the column definitions. This is a good reminder to create subtypes for variables that share a common purpose. If we had started with the subtype, we could simply change its size to 32000, or even switch to a clob, without hunting through the code for all the places where we build sql.
subtype sql_text is varchar2(32000);
l_json_table_query sql_text;
l_sql sql_text;
Note: I have made this correction in the source code, so you won’t see this error. To reproduce the error just change the sql_text subtype to 4000 characters and recompile before running the big validation query.
This is also a good opportunity to run the debug procedure for each of these json documents and see in the script output just how large the generated queries turn out to be; a quick sketch of such a call follows.
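For example, a minimal sketch of the debug call for the regional monthly document, assuming the 23ai json constructor syntax and the same 'Year' identifier and '.jSales' array path used in the quarterly example (the regional monthly json construction is elided above, so this assumes it follows the same pattern):
set serveroutput on
declare
l_jdoc dynamic_json.json_document_type;
begin
--build the regional monthly sales document the same way as the quarterly one
select json{ 'jSales' : json_arrayagg( json{ s.* } ) }
into l_jdoc
from regional_monthly_sales_v s;
--print the dataguide columns, json_table expression and unpivot query
dynamic_json.debug_unpivot_json_array(
jdoc => l_jdoc,
row_identifier => 'Year',
array_path => '.jSales');
end;
/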
After this quick fix, the validation query runs without exceptions and the results are what we were looking for.
validate all the pivot reports together
Report Year Volume
Annual Sales View 2021 696321
Monthly Sales 2021 696321
Quarterly Sales 2021 696321
Regional Monthly Sales 2021 696321
Regional Quarterly Sales 2021 696321
Regional Sales 2021 696321
...
18 rows selected.
You can tell your manager that everything balances perfectly and that you are going to lunch now. Except for the fact that it still took about a half page of sql code to write the validation query. Was all that typing really necessary? Could we be even more concise in our approach?
One Step Further: A Dynamic Unpivot Macro
The only part of our final query that changed for each unpivot was the data source of each expression. Recognizing this pattern is enough to create a macro to conceal the fact that we are converting everything to json in order to accomplish the dynamic unpivot.
select u.row#id as "Year", u.column#key as "Month", to_number(u.column#value) as "Volume"
from
(
select json_arrayagg( json{ * } ) as jdoc
from monthly_sales_v
) j,
dynamic_json.unpivot_json_array(j.jdoc, 'Year') u;
Now we can capture this pattern in a macro function. To pass string parameters to a macro, we can use the dbms_tf.columns_t type from the polymorphic table function infrastructure. Columns passed with this type always arrive as quoted identifiers enclosed in double quotes. Since there are always three columns in the output of the pipelined function, we can require meaningful aliases as the first three elements of this columns collection.
The first column element will always be our row identifier; after trimming the double quotes from this column name, it can be passed to the pipelined function as the row_identifier parameter. There is no need for an array path because we are generating the json directly from relational data.
We can pass a numeric parameter to the macro to support converting the column#value to a numeric datatype in the resulting sql statement.
Because the json document is created in the sql generated by the macro, we can use conditional compilation to generate the json as a clob or as the native json datatype.
design-9.8-pipelined-macro.sql
create or replace function unpivot_results(
source_data in dbms_tf.table_t,
alias_list in dbms_tf.columns_t,
value_as_number in number default 1
) return varchar2
sql_macro $if dbms_db_version.version >= 21 $then (table) $end
is
l_sql varchar2(4000);
l_json_data varchar2(100);
l_value_expression varchar2(100);
begin
$if dbms_db_version.version >= 21 $then
l_json_data := 'json_arrayagg( json{ * } )';
$else
l_json_data := 'json_arrayagg( json_object( * returning clob ) returning clob )';
$end
l_sql := q'[
select
u.row#id as ##ROW_IDENTIFIER##,
u.column#key as ##COLUMN_KEY##,
##VALUE_EXPRESSION## as ##COLUMN_VALUE##
from
(
select ##JSON_DATA## as jdoc
from source_data
) j,
dynamic_json.unpivot_json_array(j.jdoc, '##ROW_IDENTIFIER_PARAMETER##') u
]';
l_sql := replace(l_sql, '##JSON_DATA##', l_json_data);
l_sql := replace(l_sql, '##ROW_IDENTIFIER_PARAMETER##', trim(both '"' from alias_list(1)));
l_sql := replace(l_sql, '##ROW_IDENTIFIER##', alias_list(1));
l_sql := replace(l_sql, '##COLUMN_KEY##', alias_list(2));
l_sql := replace(l_sql, '##COLUMN_VALUE##', alias_list(3));
l_value_expression := case when value_as_number = 1 then 'to_number(u.column#value)' else 'u.column#value' end;
l_sql := replace(l_sql, '##VALUE_EXPRESSION##', l_value_expression);
--enable this to check the macro output
--dbms_output.put_line(l_sql);
return l_sql;
end unpivot_results;
/
Now we can write the final validation query in a form that is remarkably concise and self-documenting.
design-9.8-pipelined-macro-test.sql
select "Report", "Year", sum("Volume") as "Volume"
from
(
select 'Quarterly Sales' as "Report", "Year", "Volume"
from unpivot_results(quarterly_sales_v, columns("Year", "Quarter", "Volume"))
union all
select 'Regional Sales' as "Report", "Year", "Volume"
from unpivot_results(regional_sales_v, columns("Year", "Region", "Volume"))
union all
select 'Regional Quarterly Sales' as "Report", "Year", "Volume"
from unpivot_results(regional_quarterly_sales_v, columns("Year", "RegionQuarter", "Volume"))
union all
select 'Monthly Sales' as "Report", "Year", "Volume"
from unpivot_results(monthly_sales_v, columns("Year", "Month", "Volume"))
union all
select 'Regional Monthly Sales' as "Report", "Year", "Volume"
from unpivot_results(regional_monthly_sales_v, columns("Year", "RegionMonth", "Volume"))
union all
select 'Annual Sales' as "Report", "Year", "Volume"
from annual_sales_v
)
group by "Report", "Year"
order by "Year", "Report";
Report Year Volume
Annual Sales 2021 696321
Monthly Sales 2021 696321
Quarterly Sales 2021 696321
Regional Monthly Sales 2021 696321
Regional Quarterly Sales 2021 696321
Regional Sales 2021 696321
Annual Sales 2022 626526
Monthly Sales 2022 626526
Quarterly Sales 2022 626526
Regional Monthly Sales 2022 626526
Regional Quarterly Sales 2022 626526
Regional Sales 2022 626526
Annual Sales 2023 649650
Monthly Sales 2023 649650
Quarterly Sales 2023 649650
Regional Monthly Sales 2023 649650
Regional Quarterly Sales 2023 649650
Regional Sales 2023 649650
The results are still right on target and we just might be a bit excited for the next 'tedious' data validation that our manager needs immediately!
Next Steps
There may be many other situations that can benefit from a dynamic unpivot utility, and the current code is still an initial draft that is limited in the use cases it supports. While the code may work perfectly for the design use cases, there is nothing like real-world testing to see how well it will hold up.
I actually decided to add the larger sales data test cases the day after the initial publication of this article and gleefully discovered that I had undersized the variables for building sql. The fix improves the overall quality of the code, and a test case of this magnitude should probably be added to the unit tests. The next thing to do with this mini project is to create a test suite with utPLSQL and build even more test cases.
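As a starting point, a minimal utPLSQL suite for the simple test document might look something like the sketch below; the package name and the expected count of fifteen column values are my own assumptions based on the test output shown earlier, not part of the project source.
create or replace package dynamic_json_ut
as
--%suite(dynamic_json unpivot)

--%test(simple array unpivots to fifteen column values)
procedure simple_array_count;
end dynamic_json_ut;
/
create or replace package body dynamic_json_ut
as
procedure simple_array_count
is
l_count number;
begin
--count the column values produced for the simple test document
select count(*)
into l_count
from dynamic_json.unpivot_json_array(dynamic_json.test_jdoc('simple'));
ut.expect(l_count).to_equal(15);
end simple_array_count;
end dynamic_json_ut;
/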
As with any utility code, the only real limitations are our ability to imagine different applications of the functionality we have defined and how well the test cases we create push the boundaries of the solution.
In closing the post on Table Data Comparison: Row Differences, I suggested that this concept may be useful for isolating the columns with different values instead of manually creating unpivot queries. The next post in the series on comparison macros, Table Data Comparison: Column Differences, will begin finding these exact column differences by dynamically unpivoting the columns and applying a row comparison macro.
–Anthony Harper