Foreign key creation performance

The Agile development model IBM has moved to for Informix development means that there’s always a few extra features in each fix pack. There are two rather nice enhancement in 11.70.FC8 when building a foreign key which alleviate some pain points when doing administration on large tables. The first is a no validate option which means that no checking of the data is done when creating the foreign key. This is appropriate in situations like migrating platforms when you know that referential integrity is already assured and you want to avoid the overhead of re-checking it all again on the new system. I want to look instead at the other enhancement: improved performance when creating a foreign key.

Nearly all of the time creating a foreign key constraint is spent checking the data; modifying the database schema to add the constraint is a very quick operation.

Prior to 11.70.FC8 creating a foreign key on a large table was a long process and with non-partitioned tables it was a single-threaded operation which did not benefit from the use of indices. Typically you’d see a thread working away like this:

tid      name     rstcb            flags    curstk   status
283      sqlexec  26e1fc3c78       --BPR--  22096    running-

Or with a fragmented table and PDQ turned on you might see multiple threads:

tid      name     rstcb            flags    curstk   status
324      sqlexec  26e1fc3c78       --BP---  19744    sleeping secs: 1     -
325      scan_1.0 26e1fc0a98       --B-R--  3248     running-
326      scan_1.1 26e1fc44c8       --B-R--  3248     running-
327      scan_1.2 26e1fc4d18       --B-R--  3248     running-
328      scan_1.3 26e1fc5568       --B-R--  3248     running-
329      scan_1.4 26e1fc5db8       --B-R--  3248     running-
330      scan_1.5 26e1fc6608       --B-R--  3248     running-
331      scan_1.6 26e1fc97e8       --B-R--  3248     running-
332      scan_1.7 26e1fcb0d8       --B-R--  3248     running-

Even with parallel operations foreign key builds could still take a long time.

With 11.70.FC8 foreign key builds are a lot faster and here is a test case which you can try. I’m going to create a 10 million row table, which is large enough to demonstrate the problem, and a referenced table with up to 32767 unique values and see how long it takes to create a foreign key.

Courtesy of the Informix FAQ I am going to create these procedures to generate random numbers.

CREATE PROCEDURE sp_setseed(n INTEGER)
  DEFINE GLOBAL seed DECIMAL(10) DEFAULT 1;

  LET seed = n;

END PROCEDURE;

CREATE PROCEDURE sp_random() RETURNING INTEGER;
  DEFINE GLOBAL seed DECIMAL(10) DEFAULT 1;
  DEFINE d DECIMAL(20,0);

  LET d = (seed * 1103515245) + 12345;
  -- MOD function does not handle 20-digit values
  LET seed = d - 4294967296 * TRUNC(d / 4294967296);

  RETURN MOD(TRUNC(seed / 65536), 32768);

END PROCEDURE;

These will generate a random number between 0 and 32767 and I’ll use these to create some data to populate my tables with.

drop procedure if exists populate;
create procedure populate ()

  define p_wid int;
  define i int;

  drop table if exists referenced;
  create table referenced (
    widget_id int not null
  );

  drop table if exists referencing;
  create table referencing (
    id_column bigserial not null,
    widget_id int not null
  );

  FOR i = 1 TO 10000000 STEP 1

    let p_wid = sp_random();
    insert into referencing (widget_id) values (p_wid);

  END FOR;

  create index ireferencing_x1 on referencing (widget_id);
  insert into referenced (widget_id) select distinct widget_id from referencing;

  create unique index ireferenced_u1 on referenced (widget_id);
  alter table referenced add constraint primary key (widget_id) constraint referenced_pk;

end procedure;

execute procedure populate();

That sets up the test. It just remains to create the foreign key, which I’ll time:

alter table referencing add constraint foreign key (widget_id) references referenced (widget_id) constraint referencing_fk;

And so for the results:

Informix version Fragmented table PDQ priority Parallel execution Foreign key build time (s)
11.70.FC7 No 0 No 40
11.70.FC7 No 100 No 40
11.70.FC7 Yes, 8-way partition by expression, evenly distributed 0 No 40
11.70.FC7 Yes, 8-way partition by expression, evenly distributed 100 Yes 40
11.70.FC7 Yes, 8-way round-robin 0 No 41
11.70.FC7 Yes, 8-way round-robin 100 Yes 67
11.70.FC8 No 0 No 0.3
11.70.FC8 No 100 No 0.3
11.70.FC8 Yes, 8-way partition by expression, evenly distributed 0 No 0.5
11.70.FC8 Yes, 8-way partition by expression, evenly distributed 100 Yes 40
11.70.FC8 Yes, 8-way round-robin 0 No 0.9
11.70.FC8 Yes, 8-way round-robin 100 Yes 68

And for comparison here is the time taken for some related SQL queries (without PDQ):

Query Duration (s)
select distinct widget_id from referencing; 9
select widget_id, count(*) from referencing group by widget_id; 7
select widget_id from referencing where widget_id not in (select widget_id from referenced); 37

The purpose of this table is to show the relative times of the foreign key build and not to benchmark any hardware or Informix itself so I’m not going to divulge what hardware I’m used for the tests. The server I used did have multiple processors and CPU VPs configured so that PDQ queries could scale out. The number of ready threads was very low or none during the operation.

I ran all the tests a few times and there was a small amount of variation so I’ve only used a low precision in the results.

So what conclusions can we draw from this?

  • The stand-out result is that the verification of foreign keys is much faster in FC8 but only for serial operations.
  • Serial foreign key builds in FC8 are significantly faster than any SQL query I could write that checked the referential integrity.
  • In FC7 and earlier parallel operations may be slower.
  • In FC8 there is not the same speed improvement for parallel operations and they are still much slower.

Another interesting finding is the large difference in performance between the expression-based fragmentation and round-robin fragmentation when using parallel operations. I had assumed that this kind of operation would be agnostic to the type of fragmentation used. I checked this a few times but examining the threads when it is running shows something different is definitely happening.

Round-robin fragmentation looks like the below. Notice that most of the threads are active:

tid      name     rstcb            flags    curstk   status
123535   sqlexec  15f2599e8        --BP---  29472    sleeping secs: 1     -
123536   scan_1.0 15f25e590        --B-R--  2400     running-
123537   scan_1.1 15f262068        Y-B----  496      cond wait  await_MC1 -
123538   scan_1.2 15f25c3f0        Y-B----  496      cond wait  await_MC1 -
123539   scan_1.3 15f261800        --B----  2400     running-
123540   scan_1.4 15f2628d0        --B-R--  2400     running-
123541   scan_1.5 15f263138        --B-R--  2400     running-
123542   scan_1.6 15f25edf8        --B-R--  2400     running-
123543   scan_1.7 15f25f660        --B----  2400     running-

Expression fragmentation seems to run with just one active thread for the majority of the time:

tid      name     rstcb            flags    curstk   status
123804   sqlexec  15f258918        --BP---  29472    sleeping secs: 1     -
123805   scan_1.0 15f25e590        --B-R--  2400     running-
123806   scan_1.1 15f25c3f0        Y-B----  496      cond wait  await_MC1 -
123807   scan_1.2 15f261800        Y-B----  496      cond wait  await_MC1 -
123808   scan_1.3 15f2628d0        Y-B----  496      cond wait  await_MC1 -
123809   scan_1.4 15f263138        Y-B----  496      cond wait  await_MC1 -
123810   scan_1.5 15f25edf8        Y-B----  496      cond wait  await_MC1 -
123811   scan_1.6 15f25f660        Y-B----  496      cond wait  await_MC1 -
123812   scan_1.7 15f2639a0        Y-B----  496      cond wait  await_MC1 -
123813   scan_1.8 15f260f98        Y-B----  496      cond wait  await_MC1 -

There is an extra scan thread (scan_1.8) because I have an (empty) remainder fragment in addition to my 8-way fragmentation.

This would go some way to explaining why round-robin is faster.

Finally, if I could wander into the realms of speculation for a moment, it does look like IBM has implemented the parallel foreign key constraint checking slightly differently for each fragmentation method, which might explain why the improved performance is only seen with serial operations.

To sum up, this is a wonderful enhancement but I’ll be careful to switch off parallel processing when building foreign keys. It greatly speeds up index builds on fragmented tables, which is the type of operation you might do at the same time so I can foresee scripts with parallel processing being enabled and disabled several times to ensure best speed.

Advertisements