Window Function for PostgreSQL Desgin Overview [v01]

this document is obsolete. newer version is here: http://umitanuki.net/pgsql/wfv02/design.html

Abstract

Window Function in SQL is an OLAP functionality that provides ranking, cumulative computation, and partitioning aggregation. Many commercial RDMBS such like Oracle, MS SQL Server and DB2 have implemented part of this specification, while open source RDMBS including PostgreSQL, MySQL and Firebird doesn't yet. To implement this functionality on PostgreSQL not only helps many users move from those RDBMS to PostgreSQL but encourages OLAP applications such as BI (Business Inteligence) to analyze large data set.

The first proposal: http://archives.postgresql.org/pgsql-hackers/2008-06/msg00380.php

patch v01: http://umitanuki.net/pgsql/wfv01/window_function_v01.diff

patch v01 applied source tgz: http://umitanuki.net/pgsql/wfv01/window_function_v01.tgz

sample SQL: http://umitanuki.net/pgsql/wfv01/sample.sql

Below is a description of how it is designed in the patch v01.

Roeadmap

How it works

SELECT * FROM empsalary;
  depname  | empno | salary | enroll_date 
-----------+-------+--------+-------------
 develop   |    10 |   5200 | 2008-04-01
 sales     |     1 |   5000 | 2006-10-01
 personnel |     5 |   3500 | 2007-12-10
 sales     |     4 |   4900 | 2007-08-08
 personnel |     2 |   3900 | 2006-12-23
 develop   |     7 |   4200 | 2008-01-01
 develop   |     9 |   4500 | 2008-01-01
 sales     |     3 |   4800 | 2007-08-01
 develop   |     8 |   6000 | 2008-01-01
(9 rows)
 SELECT 
   sum(salary) OVER (PARTITION BY depname) AS dep_sum
   ,sum(salary) OVER (PARTITION BY extract(YEAR FROM enroll_date)) AS year_sum
   ,*
 FROM 
   empsalary;
 dep_sum | year_sum |  depname  | empno | salary | enroll_date 
---------+----------+-----------+-------+--------+-------------
    7400 |     8900 | personnel |     2 |   3900 | 2006-12-23
   14700 |     8900 | sales     |     1 |   5000 | 2006-10-01
   14700 |    13200 | sales     |     4 |   4900 | 2007-08-08
    7400 |    13200 | personnel |     5 |   3500 | 2007-12-10
   14700 |    13200 | sales     |     3 |   4800 | 2007-08-01
   19900 |    19900 | develop   |     7 |   4200 | 2008-01-01
   19900 |    19900 | develop   |    10 |   5200 | 2008-04-01
   19900 |    19900 | develop   |     8 |   6000 | 2008-01-01
   19900 |    19900 | develop   |     9 |   4500 | 2008-01-01
(9 rows)

You may see dep_sum is the result of SUM() for each depname, and year_sum is the result of SUM() for each enrolling year, without rows aggregated.

Restriction

Added file

Added nodes

primnode

parser

planner

executor

Plan

EXPLAIN
SELECT 
  sum(salary) OVER (PARTITION BY depname) AS dep_sum
  ,sum(salary) OVER (PARTITION BY extract(YEAR FROM enroll_date)) AS year_sum
  ,*
FROM 
  empsalary;
                                       QUERY PLAN                                        
-----------------------------------------------------------------------------------------
 Window  (cost=127.23..129.83 rows=1040 width=48)
   ->  Sort  (cost=127.23..129.83 rows=1040 width=48)
         Sort Key: (date_part('year'::text, (enroll_date)::timestamp without time zone))
         ->  Window  (cost=72.52..75.12 rows=1040 width=48)
               ->  Sort  (cost=72.52..75.12 rows=1040 width=48)
                     Sort Key: depname
                     ->  Seq Scan on empsalary  (cost=0.00..20.40 rows=1040 width=48)

This plan is quite ugly, because for each window a Window node is added with a Sort node. Probably all of window and sort process is packed into a Window node. For this current plan, Sort node uses Tuplesort as you expect then Window node uses Tuplestore to store each Partition tuples. This is supposed to be the worst plan. We are able to get it better somehow.

Things to be discussed

Reference

this document is as of 2008/07/05, written by Hitoshi Harada (umi.tanuki@gmail.com)