Monday, December 1, 2014

Quick-and-Dirty Median in SQL

It's happened a few times where I want a median in postgres for something, but don't feel strongly enough about it to define a function temporarily or permanently.

So, here's the quick-and-dirty SQL for getting a median that I write in those times:

select some_column from some_table order by some_column offset (select count(*) from some_table) / 2 limit 1;

the problem with that way is that you have to specify where clauses in two places, but it is quick-and-dirty. E.g.:

select some_column from some_table where created_at > '2014-01-01' order by some_column offset (select count(*) from some_table where created_at > '2014-01-01') / 2 limit 1;

Change * to id for better performance.

The subsequent min, avg, max are just selecting min(some_column), avg(some_column), and max(some_column).

There are more accurate ways to do it, but that is ok for a reality check.

No comments: