Designing Tables with Integer Keys

While the “natural” key for a table is often a relatively stable string, better database designs leave the natural string keys as table columns/attributes and use integer “tag” values containing no real-world data for primary keys and foreign key relationships.  These integer key values offer better maintainability and performance.  When using integer key tags, a unique key constraint and index should always be created on the natural key columns.
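As a minimal sketch of this pattern (table names, datatypes, and constraint names are illustrative, not from any particular schema), the natural customer number becomes a uniquely constrained attribute while an integer tag carries the relationships:

create table customer (
  cust_id    number(38)   not null,   -- integer tag, typically sequence-generated
  cust_nbr   varchar2(30) not null,   -- the natural "customer number" string
  constraint customer_pk  primary key (cust_id),
  constraint customer_uk1 unique (cust_nbr)
);

create table orders (
  order_id   number(38)   not null,
  cust_id    number(38)   not null,   -- relationship carried by the tag, not the string
  constraint orders_pk  primary key (order_id),
  constraint orders_fk1 foreign key (cust_id) references customer (cust_id)
);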

First, “relatively stable” often isn’t as stable as believed during application/database design, particularly if business operations decides to implement some type of “coding” in the value, like a leading “R” for residential customers.  During the next business reorganization, company merger, or software upgrade, or when your company purchases a portfolio of customers, you may need to update that customer number or on-board additional customers from another system.  When merging data from two or more systems there are often “collisions,” where the same customer number is in use in both systems or the number needs to change to meet the “encoding” standard business operations favors.  If the customer number used in the customer table’s primary key changes, then all the relationships into other tables like order, address, etc. are “broken” and require updating to re-establish the relationship, perhaps requiring thousands or millions of updates in the related tables and their indexes.  If the primary key for the customer table is an integer “tag” and the character-based value is a column/attribute, then the actual customer number only needs to be updated once, in the customer table; none of the relationships to other tables are broken, so no repairs are needed.
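Continuing the sketch above, recoding a customer number is then a one-row change (the cust_id value here is hypothetical):

-- One-row fix in the parent table; rows in orders are untouched
-- because they reference cust_id, not cust_nbr.
update customer
   set cust_nbr = 'R' || cust_nbr
 where cust_id = 1001;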

The integer tag key also provides better data integrity.  The preceding paragraph demonstrated the orphan record problem caused by updating the natural key in the parent table.  When data is duplicated in multiple tables, there are many ways integrity can be lost: the update that fails on one table while succeeding on the other nine, or the developer who misses one of the tables requiring the update during the last application upgrade, are just a couple that come to mind.  The best practice is to store “key” information only once and use non-changing integer tags to reference it where needed, with appropriate unique and foreign key constraints.

The tag key also protects sensitive information.  It’s not uncommon for business tables containing information about people to use a social security number as the identifying key.  When the SSN is an attribute in a single table, it is fairly easy to protect with role-based security, data masking, or Virtual Private Database (VPD).  If the SSN is a key value scattered through several related tables, then it needs to be protected in each table.  Keeping a value “hidden” while making it available for follow-up or drill-down queries is also a challenge that can be avoided by using tag keys.

Second, comparing integer values is faster than comparing character values.  While today’s hardware and software enable computers to quickly search and compare all kinds of data, comparison operations ultimately reduce that data to numeric values.  Searches on large data sets require hundreds to thousands of comparisons to find or filter down to a particular result set.  An integer comparison may be a single operation, while a character string comparison usually requires at least one operation per character.

Third, using the integer key reduces the size of rows in related and child tables.  Each row in the order table contains the integer customer id instead of the longer customer number string.  An order_item table would use the shorter integer order id instead of the longer string order number.  The deeper the design nests one-to-many relationships, the greater the size reduction.

Fourth, child tables and their integer indexes are smaller, with each table or index block containing more records.  Even if you are not concerned with disk storage, consider that the blocks loaded into the database cache cover a greater number of table/index entries and are more likely to be “hit” during query operations.  Active blocks also remain in the cache longer, so the table/index block may very well still be sitting in memory for the next query.  Smaller databases also need fewer backup resources and can be cloned or restored more quickly.

NOTE: The vsize and dump functions can be used to determine storage size and see the storage values byte by byte.

create table customer (cust_nbr varchar2(30), cust_id number(38));
begin
  for i in 1..10 loop
    insert into customer values ( to_char(power(10,i)), power(10,i) );
  end loop;
  commit;
end;
/
column cust_nbr format a12
column cust_id format 999999999999
column cust_nbr_dmp format a48
column cust_id_dmp format a20
select cust_nbr, vsize(cust_nbr), dump(cust_nbr) as cust_nbr_dmp, 
       cust_id,  vsize(cust_id),  dump(cust_id)  as cust_id_dmp
  from customer;

NOTE: Use the integer data type for keys of fewer than 2 billion values, or number(38) for hundreds of quadrillions (999,999,999,999,999,999).  Changing number(16) to number(38) doesn’t require any changes to currently stored values; it just increases the size future values may use.
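For example, widening the key column from the sketch above is a data dictionary change only; no existing rows are rewritten:

alter table customer modify (cust_id number(38));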



Using the Right Tool – When You Need a Command Line

I use four primary tools with Oracle databases, depending on the task: Oracle Enterprise Manager (OEM), TOAD, Oracle SQL Developer, and Oracle SQL Command Line (SQLcl) or SQL*Plus.  Yes, that old command line interface is still useful on occasion and needs to be in your toolbox.  I love SQL Developer for PL/SQL coding and ad hoc database queries, TOAD is tops for many DBA tasks, OEM is unbeatable for enterprise or multi-database tasks, and querying the OEM repository provides some very useful information too.  But there are still times when nothing gets the job done better than SQL*Plus, particularly when running DDL scripts to deploy database changes or generating DDL needed for many routine DBA tasks.  Anything you can do in SQL*Plus you should be able to do in SQLcl, except for some reporting functions that really are not used very often anyway.  I’m using SQLcl to run most of my SQL scripts these days.  SQLcl is a simple download-and-extract install and adds functionality beyond the old SQL*Plus; however, SQLcl requires a JDK to run where SQL*Plus does not, so I’m not ready to completely walk away from SQL*Plus just yet.

Production Problems

Occasionally server problems create a situation where no database connection is available except via SQL*Plus.  If you cannot connect to the database, it’s very hard to troubleshoot what’s going on inside it, so every DBA needs some basic SQL and SQL*Plus skills.  The good news is that because you can run SQL script files from within a SQL*Plus session, you can prepare some basic “check” or troubleshooting SQL scripts and deploy them onto your database server before problems occur.  Because these situations are very infrequent, this is one area where I have not had a chance to see whether SQLcl will also run when there are server-side issues with memory, semaphores, process load, etc.


When you deploy/promote database changes into test or production environments, there are advantages to using command line SQL*Plus/SQLcl rather than GUI interfaces, chief among them being able to reliably test and duplicate your efforts.  Placing all the DDL statements into SQL script files means being able to run the same statements in the same order in each deployment environment.  Using a simple spool file generates a log for each script that can be reviewed for errors and timing.  You can even prefix the SQL script filenames with the deployment step number.  This kind of organization works well in team environments too, because once the setup is understood, anyone on the team can execute the scripts in any given deployment.  By using a shell script wrapper around the SQL script files, you can easily generate logs and determine whether any problems were encountered.  My favorite shell wrapper takes the script name and database SID as input parameters, sets the Oracle environment for the database based on the SID, runs the SQL script, then scans the generated log and displays any error messages.  In large projects with multiple development and test databases and frequent database refreshes/clones, the deployment log quickly shows which scripts/updates have been applied to a particular environment.
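A minimal sketch of such a wrapper follows; the file name, log naming convention, and use of oraenv are assumptions to adapt for your environment:

#!/bin/bash
# run_sql.sh -- run one SQL script against one database and scan the log.
# Usage: run_sql.sh <script.sql> <ORACLE_SID>
SCRIPT=$1
export ORACLE_SID=$2
export ORAENV_ASK=NO
. oraenv                          # sets ORACLE_HOME and PATH from the oratab entry

LOG="${SCRIPT%.sql}_${ORACLE_SID}_$(date +%y%m%d%H%M).log"
sqlplus -s "/ as sysdba" <<EOF > "$LOG" 2>&1
@$SCRIPT
exit
EOF

# Display any Oracle or SQL*Plus errors captured in the log
if grep -E 'ORA-|SP2-' "$LOG"; then
  echo "Errors found -- review $LOG"
else
  echo "Clean run -- $LOG"
fi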

Health Checks

As you support databases running different applications or using different Oracle features, you may find that some databases need custom checks, or a situation may call for a health check.  For example, after getting burned a couple of times by performance problems following a database bounce because in-memory parameter values did not match spfile parameter values, I started running a script to compare in-memory values to spfile values prior to database shutdown.  Some of the frequently refreshed non-prod environments were prone to configuration issues (like running on a pfile instead of an spfile, or having the spfile or control files on non-ASM storage), so a script that queries the database and quickly checks all the known trouble spots helps prevent surprises like not having an spfile in the backup.
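As a starting-point sketch, a couple of the checks just mentioned can be written as simple queries (extend the list with your own known trouble spots):

-- Is the instance running on an spfile?  A null value means a pfile was used.
select nvl(value, '*** RUNNING ON PFILE ***') as spfile_in_use
  from v$parameter
 where name = 'spfile';

-- Are the control files on ASM?  Names not starting with '+' are suspect.
select name,
       case when name like '+%' then 'ASM' else '*** NON-ASM ***' end as location
  from v$controlfile;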

Tablespace and Partition Management

Tablespace and partition management are two areas where consistency needs to be maintained for both administrative ease and performance.  Scripting the DDL generation to manage date range partitioning by adding tablespaces, splitting partitions, and making the appropriate quota grants greatly aids in getting a consistent result from annual maintenance efforts.
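A minimal sketch of the generation approach, letting a query emit the DDL rather than hand-writing it (the orders table, pmax/pYYYY partition names, and orders_YYYY tablespace convention are hypothetical):

-- Emit a SPLIT PARTITION statement that carves the current year out of the
-- MAXVALUE partition at next January 1st.
select 'alter table orders split partition pmax at (to_date('''
       || to_char(trunc(add_months(sysdate, 12), 'YYYY'), 'YYYY-MM-DD')
       || ''',''YYYY-MM-DD'')) into (partition p' || to_char(sysdate, 'YYYY')
       || ' tablespace orders_' || to_char(sysdate, 'YYYY')
       || ', partition pmax)' as ddl
  from dual;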

More Information

For SQLcl:

For SQL*Plus:


PFILE and SPFILE Management

The PFILE, or “init.ora” file, is a text-based database parameter file that has been around as long as the Oracle database.

The SPFILE is a binary server parameter file that replaces the PFILE from version 9i forward.  Beginning with version 10g there are significant advantages to using the SPFILE, including automatic SPFILE backups by RMAN and the saving of memory management pool sizes at database shutdown so those pools can be brought back to the same sizes when the database is restarted.

In 11gR2, the standard is to use a PFILE that contains one line stating that the SPFILE is in ASM.  If other parameters are listed in the PFILE, the PFILE settings will be ignored and Oracle will use the parameter values stored in the SPFILE.
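For example, the entire init<SID>.ora under $ORACLE_HOME/dbs might then contain only this single line (using the same placeholders as the commands below):

spfile='+<ASMPATH>/<SID>/spfile<SID>.ora'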

Comparing PFILE/SPFILE parameter values to values in memory.

The following query can be run to determine parameter values that have been changed in memory but not in the database parameter file.  This is a good query to run anytime a planned database shutdown is scheduled, to avoid problems with changed parameter values on restart.

select sp.sid,, m.isdefault as mem_default, 
       decode(sp.isspecified,'TRUE', 'FALSE','TRUE') as sp_default,
       sp.value             as sp_value,
       m.value              as mem_value,
       sp.display_value     as sp_display_value,
       m.display_value      as mem_display_value,
       sp.update_comment    as sp_update_comment,
       m.update_comment     as mem_update_comment,
       m.issys_modifiable, m.isinstance_modifiable, m.isadjusted
  from v$spparameter  sp,
       v$parameter    m
 where =
   and sp.value <> m.value
 order by, sp.sid;

Creating the pfile/spfile from database parameter memory values.

You can create a PFILE or an SPFILE from memory values using the following commands.

create spfile = '+<ASMPATH>/<SID>/spfile<SID>.ora' from memory;
create pfile = '<ORACLE_HOME>/dbs/pfile<SID>.ora' from memory;

Moving from PFILE to SPFILE – Using the same parameters for each instance.

  • On the first node/instance, create the spfile from database parameter values in memory.
SQL> create spfile = '+<ASMPATH>/<SID>/spfile<SID>.ora' from memory;
  • Edit the pfile on the node to include only a single line specifying the spfile location in ASM.
  • Restart the database instance on the first node.
srvctl stop instance -d <db_name> -i <instance_name>
srvctl start instance -d <db_name> -i <instance_name>
  • On each additional node in the cluster, edit the PFILE located on that node as above and restart the instance on that node to use the single SPFILE stored in ASM for the database.

Moving from PFILE to SPFILE – Using different parameters for each instance.

Repeat this process on each node in the cluster:

  • Create the spfile from database parameter values in memory.
SQL> create spfile = '+<ASMPATH>/<SID>/spfile<SID>_<instance>.ora' from memory;
  • Edit the pfile to include only a single line specifying the spfile location in ASM.
  • Restart the database instance.
srvctl stop instance -d <db_name> -i <instance_name>
srvctl start instance -d <db_name> -i <instance_name>

Note: Before moving your SPFILE to ASM, the database control files should be moved to ASM.  MOS note “468458.1 – How to Move Controlfiles to ASM” can be used to relocate your control files from an RMAN backup to ASM.

SQL*Plus Spooling Wrapper

--** start_spool
--** starts a spool file named with the passed prefix and the database SID
--** and timestamp suffixes.
column database_name noprint new_value database
column database_time noprint new_value now
select sys.database_name, to_char(sysdate,'yymmddhh24mi') as database_time from dual;
spool &1._&database._&now.
set timing on
set echo on
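A deployment script might then use the wrapper like this (the step name is illustrative):

SQL> @start_spool 01_create_tables
     ... run the deployment DDL for this step ...
SQL> set echo off
SQL> spool off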