-- Title : [PGS 9.x] PostgreSQL Tips & Tricks - 3
-- Reference : postgres.cz
-- Key word :
********************************************************************************/
to_string, to_array functions
Author: Pavel Stěhule. Functions string_to_array and array_to_string doesn't handle NULL value well. So I designed new functions with better NULL handling.
CREATE OR REPLACE FUNCTION to_string(anyarray, sep text, nullstr text DEFAULT '') RETURNS text AS $$ SELECT array_to_string(ARRAY(SELECT coalesce(v::text, $3) FROM unnest($1) g(v)), $2) $$ LANGUAGE sql; postgres=# select to_string(array[1,2,3,4,null,5],','); to_string ------------ 1,2,3,4,,5 (1 row) postgres=# select to_string(array[1,2,3,4,null,5],',','<NULL>'); to_string ------------------ 1,2,3,4,<NULL>,5 (1 row) CREATE OR REPLACE FUNCTION to_array(text, sep text, nullstr text DEFAULT '') RETURNS text[] AS $$ SELECT ARRAY(SELECT CASE WHEN v = $3 THEN NULL::text ELSE v END FROM unnest(string_to_array($1,$2)) g(v)) $$ LANGUAGE sql; postgres=# select to_array('1,2,3,4,,5',','); to_array ------------------ {1,2,3,4,NULL,5} (1 row) postgres=# select to_array('1,2,3,4,<NULL>,5',',','<NULL>'); to_array ------------------ {1,2,3,4,NULL,5} (1 row) postgres=# select to_array('1,2,3,,5',',')::int[]; to_array ---------------- {1,2,3,NULL,5} (1 row)
This issue is fixed in modern versions, that allow third parameter for NULL substitution.
postgres=# SELECT array_to_string(ARRAY[10,10,NULL,10], ',',''); array_to_string ───────────────── 10,10,,10 (1 row)
Cast between "unix timestamp" and timestamp
Author:??
CREATE OR REPLACE FUNCTION convert_timestamp_to_xtime(dt timestamp) RETURNS integer AS $$ SELECT EXTRACT(EPOCH FROM $1)::integer $$ LANGUAGE sql; CREATE OR REPLACE FUNCTION convert_xtime_to_timestamp(xdt integer) RETURNS timestamp AS $$ SELECT to_timestamp($1)::timestamp $$ LANGUAGE sql;
Note: You can use casting to abstime type, but: "The types abstime and reltime are lower precision types which are used internally. You are discouraged from using these types in applications; these internal types might disappear in a future release."
Domain for time zone
David E. Wheleer sent a nice trick. PostgreSQL missing a data type for time zone and doesn't contain any function for time zone verification. But we can use operator AT:
CREATE OR REPLACE FUNCTION is_timezone( tz TEXT ) RETURNS BOOLEAN as $$ BEGIN PERFORM now() AT TIME ZONE tz; RETURN TRUE; EXCEPTION WHEN invalid_parameter_value THEN RETURN FALSE; END; $$ language plpgsql STABLE; CREATE DOMAIN timezone AS CITEXT CHECK ( is_timezone( value ) );
Non-interactive entering a password for psql
psql missing any switch for password. People often ask, how can use psql inside scripts, when is necessary enter any password. One possibility is use a system variable PGPASSWORD.
postgres=# CREATE USER tom; CREATE ROLE postgres=# ALTER USER tom PASSWORD 'tiger'; ALTER ROLE ... postgres=> \q [pavel@nemesis ~]$ PGPASSWORD=tiger psql postgres -U tom Welcome to psql 8.3.8, the PostgreSQL interactive terminal. Type: \copyright for distribution terms \h for help with SQL commands \? for help with psql commands \g or terminate with semicolon to execute query \q to quit postgres=> ...
The sense of this method is hiding a password in process list. Warning: On Linux (at least) this is still visible through /proc/<pid>/environ
Another possibility is a .pgpass file in your home directory.
Faster table's list
If your database has a high number of tables - thousands and more, you will like next trick:
alter function pg_table_is_visible(oid) cost 10;
by Tom Lane for 8.3 and 8.4. For 8.5 and newer it is default.
Array of arrays
PostgreSQL supports n-dimensional arrays, but doesn't support array of arrays, so we cannot use cumulative aggregate functions over arrays. Here is workaround:
postgres=# create type intarr as (f int[]); CREATE TYPE postgres=# select * from t; a ───────────── {10,20,30} {1,2,3,4,5} (2 rows) postgres=# select array_agg(a) from t; ERROR: could not find array type for data type integer[] postgres=# select array_agg(distinct row(a)::intarr) from t; array_agg ────────────────────────────────────────── {"(\"{1,2,3,4,5}\")","(\"{10,20,30}\")"} (1 row) postgres=# select (unnest(array_agg(distinct row(a)::intarr))::intarr).f from t; f ───────────── {1,2,3,4,5} {10,20,30} (2 rows)
Sent by Sam Mason.
MySQL function group_concat in PostgreSQL
In MySQL we can find very useful aggregate function group_concat. For this function we can define separator and ordering. This isn't possible in PostgreSQL. But we can use workaround via array functions:
postgres=# SELECT * FROM x; cat | town -----+--------- 1 | Benešov 1 | Tábor 1 | Písek 2 | Praha 2 | Hradec 3 | Cheb 3 | Aš (7 rows) postgres=# SELECT cat, array_to_string(ARRAY(SELECT unnest(array_agg(town)) ORDER BY 1),',') FROM x GROUP BY cat; kat | array_to_string -----+--------------------- 1 | Benešov,Písek,Tábor 3 | Aš,Cheb 2 | Hradec,Praha (3 rows)
Since PostgreSQL 9.0 this is natively available as string_agg()
MySQL function field in PostgreSQL
Source: http://stackoverflow.com/questions/1309624/simulating-mysqls-order-by-field-in-postgresql
When we can explicitly to specify some order, we should to use function field in MySQL.
select * from pet order by field(species, 'cat', 'dog', 'bird') desc; +----------+--------+---------+------+------------+------------+ | name | owner | species | sex | birthday | death | +----------+--------+---------+------+------------+------------+ | Fluffy | Harold | cat | f | 1993-02-04 | NULL | | Claws | Gwen | cat | m | 1994-03-17 | NULL | | Buffy | Harold | dog | f | 1989-05-13 | NULL | | Fang | Benny | dog | m | 1990-08-27 | NULL | | Bowser | Diane | dog | m | 1989-08-31 | 1995-07-29 | | Chirpy | Gwen | bird | f | 1998-09-11 | NULL | | Whistler | Gwen | bird | NULL | 1997-12-09 | NULL | +----------+--------+---------+------+------------+------------+
This functionality is missing in PostgreSQL - we could to use CASE statement or we could to write own function. It's possible - we have variadic functions:
select * from pet order by species, case species when 'cat' then 1 when 'dog' then 2 when 'bird' then 3 else 0 end desc;
Source code field function:
CREATE OR REPLACE FUNCTION field(varchar, VARIADIC text[]) RETURNS int AS $$ SELECT i FROM generate_subscripts($2,1) g(i) WHERE $1 = $2[i] UNION ALL SELECT 0 LIMIT 1 1 $$ LANGUAGE sql;Complement of function field is function elt:
CREATE OR REPLACE FUNCTION elt(int, VARIADIC text[]) RETURNS text AS $$ SELECT $2[$1]; $$ LANGUAGE sql;
The first from groups (with windows functions)
This exercise should be solved variously. One way is based on windows functions:
postgres=# SELECT * FROM staff; ┌───────────┬───────────┬────────────┬──────────┐ │ name │ surname │ department │ salary │ ├───────────┴───────────┴────────────┴──────────┤ │ Pavel │ Stehule │ 1 │ 10000.00 │ │ Zdenek │ Stehule │ 1 │ 9000.00 │ │ Vladimira │ Stehulova │ 2 │ 9000.00 │ └───────────────────────────────────────────────┘ (3 rows) postgres=# SELECT * FROM (SELECT *, max(salary) OVER (PARTITION BY department) FROM staff) x WHERE max = salary; ┌───────────┬───────────┬────────────┬──────────┬──────────┐ │ name │ surname │ department │ salary │ max │ ├───────────┴───────────┴────────────┴──────────┴──────────┤ │ Pavel │ Stehule │ 1 │ 10000.00 │ 10000.00 │ │ Vladimira │ Stehulova │ 2 │ 9000.00 │ 9000.00 │ └──────────────────────────────────────────────────────────┘ (2 rows)
Bigint to ip
By Jasen Betts:
create function bigint_to_inet(bigint) returns inet as $$ select (($1>>24&255)||'.'||($1>>16&255)||'.'||($1>>8&255)||'.'||($1>>0&255))::inet $$ language sql;
Marking files as orig (for difforig command) based on patch
PostgreSQL hackers like commands cporig and difforig. cporig creates copy file with .orig extension. difforig search files with extension .orig and does cumulative diff i.e. patch. Sometimes we need mark modified files from patch. This process can be simplified with unix commands:
cat mnnotation.diff | egrep -o -e "^\*\*\*.*(sql|out|h|c|sgml|y|lex|Makefile)" | replace "*** " "" | xargs cporig
Safe string to number conversion
Sometimes we do not want to raise an exception when the converted value is not a number. We can prevent the exception by testing if the input value is a number or not:
CREATE OR REPLACE FUNCTION read_int(varchar) RETURNS int AS $$ SELECT CASE WHEN $1 ~ e'^\\d+$' THEN $1::int END; $$ LANGUAGE SQL IMMUTABLE STRICT;
Dropping milliseconds from timestamp
Usually we don't need timestamp value in maximum precision. For mostly people only seconds are significant. Timestamp type allows to define precision - and we could to use this feature:
postgres=# select current_timestamp; now ------------------------------ 2009-05-23 20:42:21.57899+02 (1 row) Time: 196,784 ms postgres=# select current_timestamp::timestamp(2); now ------------------------ 2009-05-23 20:42:27.74 (1 row) Time: 51,861 ms postgres=# select current_timestamp::timestamp(0); now --------------------- 2009-05-23 20:42:31 (1 row) Time: 0,729 ms
Attention on IS NULL and IS NOT NULL operators for composite types
One may think that !(x IS NULL) = x IS NOT NULL is true in all cases. But there is an exception - composite types. When one field of a composite value is NULL and another field is NOT NULL, then result of both operators is false. IS NULL is true, only when all fields are NULL. IS NOT NULL is true, only when all fields are NOT NULL. For any case in between, then both operators return false.
CREATE OR REPLACE FUNCTION test_isnull() RETURNS TABLE (a int, b int, isnull bool, isnotnull bool) AS $$ DECLARE r foo; BEGIN isnull := r IS NULL; isnotnull := r IS NOT NULL; RETURN NEXT; a := NULL; b := 10; r := ROW(a, b); isnull := r IS NULL; isnotnull := r IS NOT NULL; RETURN NEXT; a := 10; b := 10; r := ROW(a, b); isnull := r IS NULL; isnotnull := r IS NOT NULL; RETURN NEXT; RETURN; END; $$ LANGUAGE plpgsql; SELECT * FROM test_isnull(); a | b | isnull | isnotnull ----+----+--------+----------- | | t | f | 10 | f | f 10 | 10 | f | t (3 rows)
Faster execution of PL/pgSQL function with explicit casting
PL/pgSQL execute right part of assignment statement independently on left part. Result of evaluation is checked and converted to destination type every time. This conversion should be little bit slow. So we could improve assignment statement via explicit casting to ensure type equality of left and right parts assignment statement. Use this technique carefully! And only when is it necessary.
create or replace function test1() returns int as $$ declare s int := 0; begin for i in 1..100000 loop s := 4e3; end loop; -- numeric constant return s; end; $$ language plpgsql immutable; create or replace function test2() returns int as $$ declare s int := 0; begin for i in 1..100000 loop s := 4e3::int; end loop; -- int return s; end; $$ language plpgsql immutable; postgres=# select test1(); test1 ------- 4000 (1 row) Time: 176,623 ms postgres=# select test2(); test2 ------- 4000 (1 row) Time: 47,673 ms
Same problem is implicit conversion in return statement:
create or replace function test1() returns int as $$begin return 4e1; end; $$ language plpgsql; create or replace function test2() returns int as $$begin return 4e1::int; end; $$ language plpgsql; postgres=# select count(test1()) from generate_series(1,100000); count --------nebo v konverzi návratové hodnoty 100000 (1 row) Time: 682,005 ms postgres=# select count(test2()) from generate_series(1,100000); count -------- 100000 (1 row) Time: 528,099 ms
Protection from parallel run of some PL function
Author: Pavel Stěhule
Theoretically we should to detect an run of any function - but these information are not accessible from user interface (without C coding). We should to look to list of processed queries or we should to use advisory locks. Test based on select from pg_stat_activity view is useful only for function directly called by user. Functions called from other functions are invisible.
create or replace function long_run() returns void as $$ begin if exists(select procpid from pg_stat_activity where procpid <> pg_backend_pid() and current_query like '%long_run(%') then raise notice 'procedure is running'; return; end if; perform pg_sleep(10); end$$ language plpgsql volatile; CREATE FUNCTION
Next possibility is using advisory locks http://www.postgresql.org/docs/8.3/interactive/functions-admin.html#FUNCTIONS-ADVISORY-LOCKS.
create or replace function long_run() returns void as $$ declare foid oid:= 'long_run'::regproc::oid; begin if pg_try_advisory_lock(foid::bigint) then perform pg_sleep(10); perform pg_advisory_unlock(foid::bigint); else raise notice 'procedure is running'; end if; return; end; $$ language plpgsql;
Using scientific format
PostgreSQL doesn't support EEEE format in to_char function still (there is a patch for 8.5). We should to use external plperl function:
create or replace function floating_format(double precision, int) returns varchar as $$ sprintf("%.$_[1]e", $_[0]) $$ language plperl; create or replace function floating_format(double precision) returns varchar as $$ sprintf("%e", $_[0]) $$ language plperl; postgres=# select floating_format(10.223); floating_format ----------------- 1.022300e+01 (1 row) postgres=# select floating_format(10.223,2); floating_format ----------------- 1.02e+01 (1 row) postgres=# select floating_format(10.223,4); floating_format ----------------- 1.0223e+01 (1 row)
Terminal's configuration
I using two interfaces for access to PostgreSQL - emacs and psql. These application should be configured for more comfortable work:
- emacs (.emacs)
(setq-default truncate-lines t) (ansi-color-for-comint-mode-on) (setq lazy-lock-defer-on-scrolling t) (setq inhibit-startup-echo-area-message t) (setq inhibit-startup-message t) (show-paren-mode t) (setq show-paren-style 'mixed) (fset 'yes-or-no-p 'y-or-n-p) (global-set-key "\M-g" 'goto-line) (setq sql-database "postgres") ;; name of most often used database ;; don't use pager, empty title (setq sql-postgres-options '("-P" "pager=off" "-P" "title= ")) ;; interesting idea is setting of start of every new command on new line ;; (setq sql-postgres-options '("-P" "pager=off" "-P" "title= " "-v" "PROMPT1=%/.%n%#\n" "-v" "PROMPT2=" "-v" "PROMPT3="))
- psql
export PAGER="less -RSX" psql ... -P pager=always
Moving tables from one schema to second schema
We have to use plpgsql function:create or replace function mvtable(src_schema varchar, dest_schema varchar, mask varchar) returns void as $$ declare r record; begin for r in select * from information_schema.tables where table_schema = src_schema and table_name like mask loop execute 'alter table ' || quote_ident(r.table_schema) || '.' || quote_ident(r.table_name) || ' set schema ' || quote_ident(dest_schema); end loop; return; end; $$ language plpgsql;
Encoding number to any numeral system with CTE
Autor: Pavel Stehule
postgres=# create function to_base(num int, base int = 10) returns varchar as $$ with recursive z as ( select $1 as a, null::int as b, 0 as i union all select a/$2, a % $2, i+1 from z where a > 0 ) select array_to_string(array(select substring('0123456789abcdef' from b + 1 for 1) from z where i > 0 order by i desc),''); $$ language sql immutable; postgres=# select to_base(255); to_base --------- 255 (1 row) postgres=# select to_base(255,16); to_base --------- ff (1 row) postgres=# select to_base(255,8); to_base --------- 377 (1 row) postgres=# select to_base(255,2); to_base ---------- 11111111 (1 row)
Get default values from table definition
One typical query is getting the default values for a record. Since these definitions are stored in the system catalogs (tables), they are accessible.
CREATE OR REPLACE FUNCTION eval(varchar) RETURNS varchar AS $$ DECLARE result varchar; BEGIN EXECUTE 'SELECT ' || $1 INTO result; RETURN result; END;$$ LANGUAGE plpgsql STRICT; CREATE OR REPLACE FUNCTION defaults(text, OUT attname name, OUT type varchar, OUT default_val varchar) RETURNS SETOF RECORD AS $$ SELECT a.attname, pg_catalog.format_type(a.atttypid, a.atttypmod), (SELECT eval(pg_catalog.pg_get_expr(d.adbin, d.adrelid)) FROM pg_catalog.pg_attrdef d WHERE d.adrelid = a.attrelid AND d.adnum = a.attnum AND a.atthasdef) FROM pg_catalog.pg_attribute a WHERE a.attrelid = $1::regclass::oid AND a.attnum > 0 AND NOT a.attisdropped ORDER BY a.attnum $$ LANGUAGE SQL STRICT;
Second solution, but not always usable is using RETURNING clause.
Function to_bin
PostgreSQL doesn't provide to_bin function. Implementation (based on casting to bit type) is simple and fast:
create or replace function to_bin(int) returns varchar as $$ declare r varchar; begin execute 'select ' || $1 || '::bit(' || case when $1 = 0 then 1 else trunc(log(2, $1)) + 1 end || ')' into r; return r; end; $$ language plpgsql immutable strict; create or replace function bin(varchar) returns varchar as $$ declare r varchar; begin execute 'select b''' || $1 || '''::int' into r; return r; end; $$ language plpgsql immutable strict; postgres=# select to_bin(5), bin('101'); to_bin | bin --------+----- 101 | 5 (1 row)
SQL code for hierarchical break-down of domains
based on Depesc's ideas from http://www.depesz.com/index.php/2008/12/01/getting-list-of-most-common-domains/#comment-27066
postgres=# create or replace function domain_list(varchar[]) returns setof varchar as $$ select array_to_string($1,’.') union all select domain_list($1[2:array_upper($1,1)]) where array_upper($1,1) > 1 $$ language sql immutable strict; CREATE FUNCTION postgres=# select domain_list(array['a','b','c','d']); domain_list ————- a.b.c.d b.c.d c.d d (4 rows) postgres=# create or replace function domain_list(varchar) returns setof varchar as $$ select domain_list(string_to_array($1, ‘.’)) $$ language sql immutable strict; CREATE FUNCTION postgres=# select domain_list(’a.b.c.d’); domain_list ————- a.b.c.d b.c.d c.d d
Solution for migration to 8.3 (problem with tsearch2)
PostgreSQL documentation has a section with some advice that should to help with migration of older application (based on TSearch2) to PostgreSQL 8.3 and higher (with integrated fulltext). This works well when you migrate databases individually, but it fails when you use pg_dump. As documented, you should do some pre steps - import TSearch compatible module into template1. But complete dump generates databases from template0 (you can't to change it). So you cannot load this dump, and you have to edit dump (advice - dump structure and data to two different files).
so, when you find bug:
ERROR: type "tsvector" already exists
You have to add import of tsearch2.sql to every creation database part in dump.
DROP DATABASE brigady; CREATE DATABASE brigady WITH TEMPLATE = template0 ENCODING = 'UTF8'; -- -- PostgreSQL database dump complete -- \connect brigady -- new line \i /usr/local/pgsql/share/contrib/tsearch2.sql SET search_path = public, pg_catalog; ...
Expect SQL functions to be really fast
Much faster than PL/pgSQL - especially when they are wrappers around integrated features (in this example - integrated qsort). PL/pgSQL is usually fast to interpret - I know only about two slow operations - array update and string update. It's based on the internal architecture - every change generate a new object - it isn't significant for small objects (less than 1000), but for large objects this operations needs significant time. However, we can go a different way, using integrated functionality (this way is usually the fastest in any embedded language).
PostgreSQL use internal qsort -so, I wrote qsort in PL/pgSQL too (results will be comparable):
CREATE OR REPLACE FUNCTION quicksort(l integer, r integer, a int[]) RETURNS int[] AS $$ DECLARE akt int[] = a; i integer := l; j integer := r; x integer = akt[(l+r) / 2]; w integer; BEGIN LOOP WHILE akt[i] < x LOOP i := i + 1; END LOOP; WHILE x < akt[j] loop j := j - 1; END LOOP; IF i <= j THEN w := akt[i]; akt[i] := akt[j]; akt[j] := w; i := i + 1; j := j - 1; END IF; EXIT WHEN i > j; END LOOP; IF l < j THEN akt := quicksort(l,j,akt); END IF; IF i < r then akt := quicksort(i,r,akt); END IF; RETURN akt; END; $$ LANGUAGE plpgsql STRICT IMMUTABLE; postgres=# SELECT array_upper(quicksort(1,10000,array_agg(a)),1) FROM test; array_upper ------------- 10000 (1 row) Time: 5918,531 ms
Then I wrote same function (note. some 8.4 features are used) in SQL:
CREATE OR REPLACE FUNCTION sort(anyarray) RETURNS anyarray AS $$ SELECT array(SELECT * FROM unnest($1) ORDER BY 1); $$ language sql; postgres=# SELECT array_upper(sort(array_agg(a)),1) FROM test; array_upper ------------- 10000 (1 row) Time: 35,980 ms