Partitioning: fire and forget

About the partitioning you can find a lot information, in particular here you can read about the theory and continue the author develops the idea and provides a solution to the quick add section. I recommend you to review.
After studying the theory, almost all comes the idea of automating the process of creating partitions. Above was one of the options, the second full version I've seen the creators of the distinguished, I think, not only me Zabbix.
After a short adaptation, I decided to implement it... unfortunately, it turned out a few flaws: when you create a new section, the first entry in this section were lost; with a large number of sections of the insert even one record takes too much time (caused by 2 factors: each time was calculated table, where should I put the entry; the use of multiple rules instead of 1 trigger with all the terms). Nevertheless, the guys did a great job and I take this opportunity to send them rays of respect.


In the result, I present to Your court their decision. To start example how to launch the partitioning:
Date range:

the
select create_partition('partitions.test1.cdate', 'date', 'day,YYYY_MM_DD', now()::text);
select create_partition('partitions.test2.cdate', 'date', 'week,YYYY_IW', now()::text);
select create_partition('partitions.test3.cdate', 'date', 'month,YYYY_MM', now()::text);

These commands will launch the partitioning by days, weeks and months for tables test1, test2, test3

Let us examine the options:
the first field 'partitions.test1.cdate' values, separated by a dot:
partitions — the name of the schema where shall be section
test1 — the table is a parent in the current scope
cdate — the field on which the table is delimited by
the second field, 'date' specifies the type of partitions
the third field 'day,YYYY_MM_DD' specifies the settings for the partition
day — create a section every day
YYYY_MM_DD is the suffix for the table
and finally, the fourth now()::text example of data for section

Another example:
the
select create_partition('partitions.test4.id', 'digits', '10,2', 1::int);

Here the number will be converted to 10-bit by adding zeros to the left (if needed) and all numbers to the left of the eighth to take the name of the section:
1000000000 will be in the partitions section.test4_10
100000000 in section of partitions.test4_01
10000000 in section of partitions.test4_00
10000000000 in section of partitions.test4_100

Briefly describe how it works:
when you run select create_partition('partitions.test1.cdate', 'date', 'day,YYYY_MM_DD', now()::text); the first thing is called "plug-in" partition_date. The purpose of this function — for the given parameters return the name of the section and getting into it. Then in the main function creates a new table that inherits from the current with the desired conditions, copies all the indexes of the ancestor.
Then a query from the system tables pull the data about the tables and their descendants check the condition of these data be re-created is the trigger that distributes data across partitions.
Now for any insert will fire the trigger and determine the appropriate table section, if no such table is called create_partition with the same settings when creating partitions.
For the first example, after years of filling the trigger will look like this:
the
CREATE OR REPLACE FUNCTION trig_partition_test1_cdate() RETURNS TRIGGER AS $BODY2$
declare
child text;
begin
IF ((NEW.cdate >= '2012-09-01 00:00:00-07'::timestamp with time zone) AND (NEW.cdate < '2012-10-01 00:00:00-07'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_09 VALUES (NEW.*);
ELSIF ((NEW.cdate >= '2012-08-01 00:00:00-07'::timestamp with time zone) AND (NEW.cdate < '2012-09-01 00:00:00-07'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_08 VALUES (NEW.*);
ELSIF ((NEW.cdate >= '2012-07-01 00:00:00-07'::timestamp with time zone) AND (NEW.cdate < '2012-08-01 00:00:00-07'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_07 VALUES (NEW.*);
ELSIF ((NEW.cdate >= '2012-06-01 00:00:00-07'::timestamp with time zone) AND (NEW.cdate < '2012-07-01 00:00:00-07'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_06 VALUES (NEW.*);
ELSIF ((NEW.cdate >= '2012-05-01 00:00:00-07'::timestamp with time zone) AND (NEW.cdate < '2012-06-01 00:00:00-07'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_05 VALUES (NEW.*);

ELSIF ((NEW.cdate >= '2012-03-01 00:00:00-08'::timestamp with time zone) AND (NEW.cdate < '2012-04-01 00:00:00-07'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_03 VALUES (NEW.*);
ELSIF ((NEW.cdate >= '2012-02-01 00:00:00-08'::timestamp with time zone) AND (NEW.cdate < '2012-03-01 00:00:00-08'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_02 VALUES (NEW.*);
ELSIF ((NEW.cdate >= '2012-01-01 00:00:00-08'::timestamp with time zone) AND (NEW.cdate < '2012-02-01 00:00:00-08'::timestamp with time zone)) THEN INSERT INTO public.test1_2012_01 VALUES (NEW.*);
ELSE
EXECUTE 'SELECT create_partition("public.test1.cdate"::text, "date"::text, "month,YYYY_MM"::text , "'||NEW.cdate||"'::text)' INTO child;
EXECUTE 'INSERT INTO '|| child || ' SELECT $1.*' USING NEW;
END IF;
RETURN NULL;
end;
$BODY$ LANGUAGE 'plpgsql';


As you can see, it turns almost classical example from the documentation :)

As you may have guessed, this solution is easy to expand, for example, partitioning on the first letter of the string or hash string. I propose to write this "plugin" yourself and put it in the comments.

I hope some of my decision would be helpful.

helper scripts found on the forum sql.ru
CREATE OR REPLACE FUNCTION to_timestamp(timestamp without time zone) RETURNS timestamp with time zone AS $_$
select $1::timestamp with time zone;
$_$ LANGUAGE sql STABLE STRICT;

CREATE OR REPLACE FUNCTION to_timestamp(timestamp with time zone) RETURNS timestamp with time zone AS $_$
select $1::timestamp with time zone;
$_$ LANGUAGE sql STABLE STRICT;

CREATE OR REPLACE FUNCTION to_numeric(text) RETURNS numeric AS $_$
select regexp_replace($1, '\D+', ", 'g')::numeric;
$_$ LANGUAGE sql STABLE STRICT;

CREATE OR REPLACE FUNCTION copy_constraints(srcoid oid, oid dstoid)
RETURNS integer AS
$BODY$
declare
i int4 := 0;
constrs record;
srctable text;
dsttable text;
begin
srctable = srcoid::regclass;
dsttable = dstoid::regclass;
for constrs in
select conname as name, pg_get_constraintdef(oid) as definition
from pg_constraint where conrelid = srcoid loop
begin
execute 'alter table' || dsttable
|| ' add constraint '
|| replace(replace(constrs.name, srctable, dsttable),'.','_')
|| '' || constrs.definition;
i = i + 1;
exception
when duplicate_table then
end;
end loop;
return i;
exception when undefined_table then
return null;
end;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;

CREATE OR REPLACE FUNCTION copy_constraints(src text, dst text)
RETURNS integer AS
$BODY$
begin
return copy_constraints(src::regclass::oid, dst::regclass::oid);
end;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;

CREATE OR REPLACE FUNCTION copy_indexes(srcoid oid, oid dstoid)
RETURNS integer AS
$BODY$
declare
i int4 := 0;
record indexes;
srctable text;
dsttable text;
script text;
begin
srctable = srcoid::regclass;
dsttable = dstoid::regclass;
for indexes in
select c.relname as name, pg_get_indexdef(idx.indexrelid) as definition
from pg_index idx, pg_class c where idx.indrelid = srcoid and c.oid = idx.indexrelid loop
script = replace (indexes.definition of ' INDEX '
|| indexes.name ' INDEX '
|| replace(replace(indexes.name, srctable, dsttable),'.','_'));
script = replace (script, ' ON '|| srctable, ' ON ' || dsttable);
begin
execute script;
i = i + 1;
exception
when duplicate_table then
end;
end loop;
return i;
exception when undefined_table then
return null;
end;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;

CREATE OR REPLACE FUNCTION copy_indexes(src text, dst text)
RETURNS integer AS
$BODY$
begin
return copy_indexes(src::regclass::oid, dst::regclass::oid);
end;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;

CREATE OR REPLACE FUNCTION copy_triggers(srcoid oid, oid dstoid)
RETURNS integer AS
$BODY$
declare
i int4 := 0;
triggers record;
srctable text;
dsttable text;
script text = ";
begin
srctable = srcoid::regclass;
dsttable = dstoid::regclass;
for triggers in
select tgname as name, pg_get_triggerdef(oid) as definition
from pg_trigger where tgrelid = srcoid loop
script =
replace (triggers.definition of ' TRIGGER '
|| triggers.name, ' TRIGGER '
|| replace(replace(triggers.name, srctable, dsttable),'.','_'));
script = replace (script, ' ON '|| srctable, ' ON ' || dsttable);
begin
execute script;
i = i + 1;
exception
when duplicate_table then
end;
end loop;
return i;
exception when undefined_table then
return null;
end;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;

CREATE OR REPLACE FUNCTION copy_triggers(src text, dst text)
RETURNS integer AS
$BODY$
begin
return copy_triggers(src::regclass::oid, dst::regclass::oid);
end;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;


Basic scripts
CREATE OR REPLACE FUNCTION "create_partition" (in tbl varchar, in method varchar, in params varchar, in sample text) RETURNS text AS $BODY$
declare
scheme varchar := split_part(tbl, '.', 1);
parent varchar := split_part(tbl, '.', 2);
field varchar := split_part(tbl, '.', 3);
child varchar;
script text;
trig text;
part text[];
begin
execute 'select partition_'||method||'("'||params||"', "'||field||"', "'||sample||"')' into part;
-- RAISE EXCEPTION 'part %', part;
child = scheme || '.' || parent || '_' || (part[1]::text);
execute 'create table IF NOT EXISTS' || child || '
(
constraint partition_' || (part[1]) || ' check '
|| (part[2]) || '
)
inherits (' || parent || ')';
perform copy_constraints(parent, child);
perform copy_indexes(parent, child);
-- execute 'GRANT SELECT ON' || child || 'TO some_other_user';
-- execute 'GRANT ALL ON' || child || 'TO user';
script = (select string_agg(c, chr(10)||' ELS') from (
select to_numeric(replace(t.table_name, parent||'_',")) as n, 'IF '||replace(left(right(cc.check_clause, -1), -1), c.column_name, 'NEW.'||c.column_name)||' THEN INSERT INTO '||t.table_schema||'.'||t.table_name||' VALUES (NEW.*);' as c
from information_schema.TABLE_CONSTRAINTS t
join information_schema.CONSTRAINT_COLUMN_USAGE c
ON t.constraint_name = c.constraint_name
join information_schema.check_constraints cc

where constraint_type IN ('CHECK')
and t.table_name like parent||'\_%'
group by t.table_schema, t.table_name, c.column_name, cc.check_clause
order by n desc) t);

trig = 'trig_partition_'||parent||'_'||field;
execute
'CREATE OR REPLACE FUNCTION '||trig||'() RETURNS TRIGGER AS $BODY2$
declare
child text;
begin
'||script||'
ELSE
EXECUTE "SELECT create_partition(""'||tbl||""'::text, ""'||method||""'::text, ""'||params||""'::text , """||NEW.'||field||'||"""::text)" INTO child;
EXECUTE 'INSERT INTO' | | child || " SELECT $1.*" USING NEW;
END IF;
RETURN NULL;
end;
$BODY2$
LANGUAGE "plpgsql" VOLATILE;';

execute 'DROP TRIGGER IF EXISTS t_'||trig||' ON "'||parent||'" CASCADE';
execute 'CREATE TRIGGER t_'||trig||' BEFORE INSERT ON "'||parent||'" FOR EACH ROW EXECUTE PROCEDURE '||trig||'();';
return child;
end;
$BODY$ LANGUAGE 'plpgsql';

CREATE OR REPLACE FUNCTION "partition_date" (in params text, in field varchar, in sample timestamp with time zone) RETURNS text[] AS $BODY$
declare
period varchar:= split_part(params, ',', 1);
fmt varchar := split_part(params, ',', 2);
clock timestamp with time zone := to_timestamp(sample);
delta varchar := '1 '||period;
suffix varchar;
check_beg varchar;
check_end varchar;
condition varchar;
begin
-- RAISE EXCEPTION 'period %, fmt % clock %', period, fmt, clock;
check_beg = date_trunc(period, clock);
check_end = date_trunc(period, clock + delta::interval);
suffix = to_char (clock, fmt);

condition =
'(
'|| field || '>= '|| quote_literal (check_beg) || ' and
'|| field || '< ' || quote_literal (check_end) || '
)';

return ARRAY[suffix condition];
end;
$BODY$ LANGUAGE 'plpgsql' IMMUTABLE;

CREATE OR REPLACE FUNCTION "partition_digits" (in params text, in field varchar, in sample numeric) RETURNS text[] AS $BODY$
declare
len int := split_part(params, ',', 1)::int;
pref int := split_part(params, ',', 2)::int;
norm text := to_char(sample::numeric, 'FM000000000000000000000');
suffix text := regexp_replace(left(norm, -1*(len-pref)), '0*(?=\d{'||pref::text||'})', ");
check_beg varchar;
check_end varchar;
condition varchar;
begin
check_beg = (trunc(norm::numeric, -1*(len-pref)))::numeric::text;
check_end = (check_beg::numeric+10^(len-pref))::numeric::text;
condition =
'(
'|| field || '>= '|| check_beg || ' and
'|| field || '< ' || check_end || '
)';
return ARRAY[suffix condition];
end;
$BODY$ LANGUAGE 'plpgsql' IMMUTABLE;


Tests
Partition by date
the
DROP TABLE IF EXISTS "public"."test1" CASCADE;
CREATE TABLE "public"."test1" (
"id" serial,
"cdate" timestamp with time zone,
"text" text,
CONSTRAINT "test11_pkey" PRIMARY KEY (id)
) WITH OIDS;
CREATE INDEX test_idx_cdate ON test1 USING btree (cdate);

-- select create_partition('public.test1.cdate', 'date', 'day,YYYY_MM_DD', now()::text);
-- select create_partition('public.test1.cdate', 'date', 'week,YYYY_IW', now()::text);
select create_partition('public.test1.cdate', 'date', 'month,YYYY_MM', now()::text);
-- insert into test1 (cdate, text) SELECT cdate, md5(random()::text) as text FROM generate_series('2012-01-01'::date, now(), '1 day'::interval) cdate;



Section range of numbers:
the
DROP TABLE IF EXISTS "public"."test2" CASCADE;
CREATE TABLE "public"."test2" (
"id" bigserial,
"text" text,
CONSTRAINT "test2_pkey" PRIMARY KEY (id)
) WITH OIDS;

select create_partition('public.test2.id', 'digits', '10,2', 1::int);
insert into test2 values(10000000, 'test2');
-- insert into test2 (id, text) SELECT t.id, md5(random()::text) as text FROM generate_series(0, 100000000000, 1000000000) as t(id) limit 200;


Article based on information from habrahabr.ru

Комментарии

Популярные сообщения из этого блога

Integration of PostgreSQL with MS SQL Server for those who want faster and deeper

Parse URL in Zend Framework 2

2 years Kartavykh reviews — the story of an Amateur show Old-Hard