Loading compressed data into an empty table is not trivial. Informix can only build a compression dictionary from data that is already in the table, and an empty table has none. To solve this problem, we can use the database scheduler to monitor a specific table and build a compression dictionary once the table contains enough rows, so that the remaining rows inserted into the table are compressed.
To keep the operation simple, we use a very simple database scheduler task. The task monitors a specific table in the background, waits for any fragment to reach a specific number of rows, and then builds a compression dictionary on that fragment. When every fragment in the table has a dictionary, or a timeout value is exceeded, the database scheduler task terminates.
First we create two configuration values for the task in the sysadmin database. We do this by inserting rows into the ph_threshold table in sysadmin.
INSERT INTO ph_threshold
    (id,name,task_name,value,value_type,description)
VALUES
    (0,"COMPRESSION TABLE TIMEOUT", "compress_table","900",
     "NUMERIC",
     "The timeout values in seconds for this task." );

INSERT INTO ph_threshold
    (id,name,task_name,value,value_type,description)
VALUES
    (0,"COMPRESSION TABLE ROW COUNT", "compress_table","2000",
     "NUMERIC",
     "The number of rows in a fragment before a compression dictionary will be created." );
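To confirm that both values were stored, or to tune them later, the rows can be read back and updated in place. The following is a minimal sketch, assuming it is run while connected to the sysadmin database; the new row count of 5000 is only an illustrative value.

```sql
-- List the configuration values registered for the compress_table task.
SELECT name, value, value_type
  FROM ph_threshold
 WHERE task_name = "compress_table";

-- Example only: raise the per-fragment row threshold to 5000 rows.
UPDATE ph_threshold
   SET value = "5000"
 WHERE name = "COMPRESSION TABLE ROW COUNT"
   AND task_name = "compress_table";
```

Keep in mind that the compress_table stored procedure shown later in this article clamps what it reads: a row count below 1000 is raised to 1000, and a timeout above 3600 seconds is lowered to 3600.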
Next we create the task to execute. This task is a little different from a typical task: it is never scheduled to run and is in fact disabled. It is executed only when an end user manually invokes it with a form of the exectask() function. The reason for making this a database scheduler task is that tasks can run in the background, asynchronously to the running program. The insert statement below defines the task, but not the stored procedure, compress_table, that the task executes.
INSERT INTO ph_task
(
    tk_name,
    tk_type,
    tk_group,
    tk_description,
    tk_execute,
    tk_start_time,
    tk_stop_time,
    tk_frequency,
    tk_delete,
    tk_enable
)
VALUES
(
    "compress_table",
    "TASK",
    "TABLES",
    "Task to be kicked off when loading a table to ensure data is compressed",
    "compress_table",
    NULL,
    NULL,
    INTERVAL ( 1 ) DAY TO DAY,
    INTERVAL ( 30 ) DAY TO DAY,
    'f'
);
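A quick way to check that the task was registered and is disabled is to read the row back. This is a small sketch, assuming a connection to the sysadmin database; tk_id is the task identifier column that ph_task generates, and it does not appear in the insert above.

```sql
-- The task should be listed with tk_enable = 'f', because it is
-- only ever started on demand and never by the scheduler itself.
SELECT tk_id, tk_name, tk_execute, tk_enable
  FROM ph_task
 WHERE tk_name = "compress_table";
```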
TASK STORED PROCEDURE
The last component is the stored procedure executed by the task. The procedure takes three arguments: the first two are the traditional task arguments (the task ID and the task sequence number), and the last is the name of the table, given as database:table, on which auto compression will be enabled.
CREATE FUNCTION compress_table(task_id INTEGER, task_seq INTEGER, tabname LVARCHAR )
RETURNING INTEGER

DEFINE timeout        INTEGER;
DEFINE fragments_left INTEGER;
DEFINE row_count      INTEGER;
DEFINE fragid         INTEGER;
DEFINE rc             INTEGER;
DEFINE cnt            INTEGER;
DEFINE created_at     DATETIME YEAR TO SECOND;

-- Get the config thresholds
SELECT MAX(value::integer) INTO timeout FROM sysadmin:ph_threshold
    WHERE name = "COMPRESSION TABLE TIMEOUT";
IF timeout IS NULL THEN
    LET timeout = 900;
ELIF timeout < 0 THEN
    LET timeout = 10;
ELIF timeout > 3600 THEN
    LET timeout = 3600;
END IF

SELECT MAX(value::integer) INTO row_count FROM sysadmin:ph_threshold
    WHERE name = "COMPRESSION TABLE ROW COUNT";
IF row_count IS NULL OR row_count < 1000 THEN
    LET row_count = 1000;
END IF

BEGIN
    -- On any error, clean up and record a YELLOW alert
    ON EXCEPTION
        DROP TABLE IF EXISTS pt_list;
        INSERT INTO ph_alert
            (ID, alert_task_id,alert_task_seq,alert_type,
             alert_color, alert_object_type,
             alert_object_name, alert_message,alert_action)
        VALUES
            (0,task_id, task_seq, "INFO",
             "YELLOW", "SERVER","compress_table",
             "Failed to build compression dictionaries on "
             ||TRIM(tabname), NULL);
    END EXCEPTION

    IF tabname IS NULL THEN
        RETURN -1;
    END IF

    LET fragments_left = 99;
    LET cnt = 0;

    -- Collect the table's data fragments that do not have a dictionary yet
    SELECT P.lockid
        FROM sysmaster:systabnames T, sysmaster:sysptnhdr P
        WHERE TRIM(t.dbsname)||":"||TRIM(T.tabname) = LOWER(tabname)
        AND P.lockid = T.partnum
        AND P.nkeys = 0
        AND bitand( P.flags, '0x08000000' ) = 0
        INTO TEMP pt_list WITH NO LOG;

    CREATE INDEX ix_temp_pt_list ON pt_list(lockid);

    -- Poll once per second until the timeout expires or no fragments remain
    WHILE ( timeout > 0 AND fragments_left > 0 )

        -- Build a dictionary on each fragment that has passed the row threshold
        FOREACH SELECT P.partnum
            INTO fragid
            FROM pt_list L, sysmaster:sysptnhdr P
            WHERE l.lockid = P.partnum
            AND P.nrows > row_count
            AND bitand( P.flags, '0x08000000' ) = 0

            LET rc = admin('fragment create_dictionary', fragid);
            IF rc >= 0 THEN
                DELETE FROM pt_list WHERE lockid = fragid;
                LET cnt = cnt + 1;
            END IF
        END FOREACH

        -- Count the fragments still waiting for a dictionary
        SELECT NVL( count(*) , 0 )
            INTO fragments_left
            FROM pt_list L, sysmaster:sysptnhdr P
            WHERE l.lockid = p.partnum
            AND P.nkeys = 0
            AND bitand( P.flags, '0x08000000' ) = 0;

        LET rc = yieldn(1);
        LET timeout = timeout - 1;
    END WHILE
END

DROP TABLE IF EXISTS pt_list;

-- Record how many dictionaries were built
INSERT INTO ph_alert
    (ID, alert_task_id,alert_task_seq,alert_type,
     alert_color, alert_object_type,
     alert_object_name, alert_message,alert_action)
VALUES
    (0,task_id, task_seq, "INFO",
     "GREEN", "SERVER","compress_table",
     "Built "||cnt||" compression dictionaries on "
     ||TRIM(tabname), NULL);

RETURN 0;

END FUNCTION;
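Because the task runs in the background, its only feedback is the row it writes to ph_alert. The query below is a small sketch for reading those rows back; it uses only columns that appear in the procedure's own INSERT statements and assumes that larger id values belong to more recent alerts.

```sql
-- GREEN rows report how many dictionaries were built;
-- YELLOW rows mean the exception handler fired.
SELECT id, alert_color, alert_message
  FROM sysadmin:ph_alert
 WHERE alert_object_name = "compress_table"
 ORDER BY id DESC;
```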
EXAMPLE
Below is an example of how to execute the auto compression task. We create an empty table, then start the compress_table task. The exectask_async() function takes the name of a database scheduler task and an optional argument. We wait one second to ensure the task has fully started, then start some load activity. Once the table reaches its threshold, the dictionary is created automatically, and all rows inserted after that point are compressed. It is worth noting that a few thousand rows in our table will not be compressed, because they were inserted before the dictionary was created.
create table t1 (c1 serial, c2 char(500));
execute function sysadmin:exectask_async("compress_table","stores_demo:t1");
execute function sysadmin:yieldn(1);
--Load activity
insert into t1 select 0, tabname from systables,syscolumns;
insert into t1 select 0, tabname from systables,syscolumns;
insert into t1 select 0, tabname from systables,syscolumns;
insert into t1 select 0, tabname from systables,syscolumns;
insert into t1 select 0, tabname from systables,syscolumns;
insert into t1 select 0, tabname from systables,syscolumns;
insert into t1 select 0, tabname from systables,syscolumns;
insert into t1 select 0, tabname from systables,syscolumns;
insert into t1 select 0, tabname from systables,syscolumns;
insert into t1 select 0, tabname from systables,syscolumns;
insert into t1 select 0, tabname from systables,syscolumns;
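After the load, one way to see whether a dictionary was created is to reuse the same sysmaster lookup that the stored procedure performs. This is a sketch under the procedure's own assumption that a non-zero result for the 0x08000000 flag bit marks a fragment that already has a dictionary; the column alias dict_flag is introduced here purely for illustration.

```sql
-- One row per data fragment of stores_demo:t1.
-- dict_flag = 0 means the procedure would still be waiting on that fragment.
SELECT T.tabname, P.partnum, P.nrows,
       bitand( P.flags, '0x08000000' ) AS dict_flag
  FROM sysmaster:systabnames T, sysmaster:sysptnhdr P
 WHERE TRIM(T.dbsname)||":"||TRIM(T.tabname) = "stores_demo:t1"
   AND P.lockid = T.partnum
   AND P.nkeys = 0;
```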