Year-to-date alternatives

David Aldridge challenged me to write about functions that return lists of dates. So here goes

In blogs past I wrote about the joys of calculating year-to-date, YTD, on sparse data sets. To my mind sparsity is a good reason to not include ytd values in slightly aggregated summaries.

But that still leaves us a few methods to calculate YTD on the fly.

  • David already mentioned transformation tables
  • but dismissed the use of BETWEEN
  • in lists of dates can also be used
  • as can functions that return tables
I did a few tests on a 600,000,000 row partitioned (816 partitions) table running on Oracle 9.2.0.7 with a degree of parallel set to 2 (see Doug Burns) and bitmap indexes on all of the dimension keys. And to make things slightly more different the year starts on 1-APR-2006. So it's apples and oranges time if you want to compare things!

The tests

My denormalised time dimension table PS_DN_DT is fairly normal as things go except that it contains various additional columns to contain the date of the beginning of the year (it changes each year) and offsets for same-date-last-year. I also have defined a few data types including DAY_LIST

create or replace TYPE "DAY_LIST" AS
TABLE OF DATE;

And my own package of time series functions. One of these returns a table of date.

function ydt_day (p_day in date) return day_list
as
l_data day_list;
l_first_day date;
begin
select first_day into l_first_day from ps_dn_dt where actual_dt= p_day;
select cast (multiset (select actual_dt from ps_dn_dt
where actual_dt between l_first_day and p_day) as day_list)
into l_data from dual;
return l_data;
end;

Now I will look at three queries based on the same range of dates, 1-APR-2006 to 16-AUG-2006, one branch 'xxxx' and one product 'M111111' and compare three of the ways to calculate YDT - I'll leave out David's table - even though we have something very similar in place in this database. See attached file. Note I ran each query twice

The results

On the face of it there is little real difference in the clock time for any of these queries. There are differences in the query plans but probably the most significant effect is the bitmap indexes allowing the query to zero in on the rows that match.

To tell you the truth - I use the transformation table method in real-life; the table function was a "just for fun to prove I still could code" thing ;-)