Berkeley DB Module

Will Quan

   Cisco Systems

Edited by

Will Quan

   Copyright © 2007 Cisco Systems

   Table of Contents

   1. Admin Guide

        1. Overview
        2. Dependencies

              2.1. Kamailio Modules
              2.2. External Libraries or Applications

        3. Exported Parameters

              3.1. auto_reload (integer)
              3.2. log_enable (integer)
              3.3. journal_roll_interval (integer seconds)

        4. Exported Functions
        5. Exported MI Functions

              5.1. bdb_reload

        6. Installation and Running
        7. Database Schema and Metadata
        8. METADATA_COLUMNS (required)
        9. METADATA_KEYS (required)
        10. METADATA_READONLY (optional)
        11. METADATA_LOGFLAGS (optional)
        12. DB Maintaince Script : kamdbctl
        13. DB Recovery : kambdb_recover
        14. Known Limitations

   List of Examples

   1.1. Set auto_reload parameter
   1.2. Set log_enable parameter
   1.3. Set journal_roll_interval parameter
   1.5. contents of version table
   1.9. kamdbctl
   1.10. kambdb_recover usage

Chapter 1. Admin Guide

   Table of Contents

   1. Overview
   2. Dependencies

        2.1. Kamailio Modules
        2.2. External Libraries or Applications

   3. Exported Parameters

        3.1. auto_reload (integer)
        3.2. log_enable (integer)
        3.3. journal_roll_interval (integer seconds)

   4. Exported Functions
   5. Exported MI Functions

        5.1. bdb_reload

   6. Installation and Running
   7. Database Schema and Metadata
   8. METADATA_COLUMNS (required)
   9. METADATA_KEYS (required)
   10. METADATA_READONLY (optional)
   11. METADATA_LOGFLAGS (optional)
   12. DB Maintaince Script : kamdbctl
   13. DB Recovery : kambdb_recover
   14. Known Limitations

1. Overview

   This is a module which integrates the Berkeley DB into SIP-router. It
   implements the DB API defined in SIP-router.

2. Dependencies

   2.1. Kamailio Modules
   2.2. External Libraries or Applications

2.1. Kamailio Modules

   The following modules must be loaded before this module:
     * No dependencies on other Kamailio modules.

2.2. External Libraries or Applications

   The following libraries or applications must be installed before
   running Kamailio with this module loaded:
     * Berkeley Berkeley DB - an embedded database. Version >= 4.6.

3. Exported Parameters

   3.1. auto_reload (integer)
   3.2. log_enable (integer)
   3.3. journal_roll_interval (integer seconds)

3.1. auto_reload (integer)

   The auto-reload will close and reopen a Berkeley DB when the files
   inode has changed. The operation occurs only duing a query. Other
   operations such as insert or delete, do not invoke auto_reload.

   Default value is 0 (1 - on / 0 - off).

   Example 1.1. Set auto_reload parameter
modparam("db_berkeley", "auto_reload", 1)

3.2. log_enable (integer)

   The log_enable boolean controls when to create journal files. The
   following operations can be journaled: INSERT, UPDATE, DELETE. Other
   operations such as SELECT, do not. This journaling are required if you
   need to recover from a corrupt DB file. That is, kambdb_recover
   requires these to rebuild the db file. If you find this log feature
   useful, you may also be interested in the METADATA_LOGFLAGS bitfield
   that each table has. It will allow you to control which operations to
   journal, and the destination (like syslog, stdout, local-file). Refer
   to bdblib_log() and documentation on METADATA.

   Default value is 0 (1 - on / 0 - off).

   Example 1.2. Set log_enable parameter
modparam("db_berkeley", "log_enable", 1)

3.3. journal_roll_interval (integer seconds)

   The journal_roll_interval will close and open a new log file. The roll
   operation occurs only at the end of writing a log, so it is not
   guaranteed to to roll 'on time'.

   Default value is 0 (off).

   Example 1.3. Set journal_roll_interval parameter
modparam("db_berkeley", "journal_roll_interval", 3600)

4. Exported Functions

   No function exported to be used from configuration file.

5. Exported MI Functions

   5.1. bdb_reload

5.1. bdb_reload

   Causes db_berkeley module to re-read the contents of specified table
   (or dbenv). The db_berkeley DB actually loads each table on demand, as
   opposed to loading all at mod_init time. The bdb_reload operation is
   implemented as a close followed by a reopen. Note- bdb_reload will fail
   if a table has not been accessed before (because the close will fail).

   Name: bdb_reload

   Parameters: tablename (or db_path); to reload a particular table
   provide the tablename as the arguement (eg subscriber); to reload all
   tables provide the db_path to the db files. The path can be found in
   kamctlrc DB_PATH variable.

6. Installation and Running

   First download, compile and install the Berkeley DB. This is outside
   the scope of this document. Documentation for this procedure is
   available on the Internet.

   Next, prepare to compile SIP-router with the db_berkeley module. In the
   directory /modules/db_berkeley, modify the Makefile to point to your
   distribution of Berkeley DB. You may also define 'BDB_EXTRA_DEBUG' to
   compile in extra debug logs. However, it is not a recommended
   deployment to production servers.

   Because the module dependes on an external library, the db_berkeley
   module is not compiled and installed by default. You can use one of the
   next options.
     * edit the "Makefile" and remove "db_berkeley" from
       "excluded_modules" list. Then follow the standard procedure to
       install Kamailio: "make all; make install".
     * from command line use: 'make all include_modules="db_berkeley";
       make install include_modules="db_berkeley"'.

   Installation of SIP-router is performed by simply running make install
   as root user of the main directory. This will install the binaries in
   /usr/local/sbin/. If this was successful, SIP-router control engine
   files should now be installed as /usr/local/sbin/kamdbctl.

   Decide where (on the filesystem) you want to install the Berkeley DB
   files. For instance, '/usr/local/etc/kamailio/db_berkeley' directory.
   Make note of this directory as we need to add this path to the kamctlrc
   file. Note: SIP-router will not startup without these DB files.

   Edit kamctlrc - There are two parameters in this file that should be
   configured before kamdbctl script can work properly: DBENGINE and
   DB_PATH. Edit file: '/usr/local/etc/sip-router/kamctlrc'
                ## database type: MYSQL, PGSQL, DB_BERKELEY, or DBTEXT, by defau
lt none is loaded
                # DBENGINE=DB_BERKELEY

                ## database path used by dbtext or db_berkeley
                # DB_PATH="/usr/local/etc/kamailio/db_berkeley"

   (Optional) Pre creation step- Customize your meta-data. The DB files
   are initially seeded with necessary meta-data. This is a good time to
   review the meta-data section details, before making modifications to
   your tables dbschema. By default, the files are installed in
   '/usr/local/share/sip-router/db_berkeley/sip-router' By default these
   tables are created Read/Write and without any journalling as shown.
   These settings can be modified on a per table basis. Note: If you plan
   to use kambdb_recover, you must change the LOGFLAGS.

   Execute kamdbctl - There are three (3) groups of tables you may need
   depending on your situation.
                kamdbctl create                 (required)
                kamdbctl presence               (optional)
                kamdbctl extra                  (optional)

   Modify the SIP-router configuration file to use db_berkeley module. The
   database URL for modules must be the path to the directory where the
   Berkeley DB table-files are located, prefixed by "berkeley://", e.g.,

   A couple other IMPORTANT things to consider are the 'db_mode' and the
   'use_domain' modparams. The description of these parameters are found
   in usrloc documentation.

   Note on db_mode- The db_berkeley module will only journal the moment
   usrloc writes back to the DB. The safest mode is mode 3 , since the
   db_berkeley journal files will always be up-to-date. The main point is
   the db_mode vs. recovery by journal file interaction. Writing journal
   entries is 'best effort'. So if the hard drive becomes full, the
   attempt to write a journal entry may fail.

   Note on use_domain- The db_berkeley module will attempt natural joins
   when performing a query. This is basically a lexigraphical string
   compare using the keys provided. In most places in the db_berkeley
   dbschema (unless you customize), the domainname is identified as a
   natural key. Consider an example where use_domain = 0. In table
   subscriber, the db will be keying on 'username|NULL' because the
   default value will be used when that key column is not provided. This
   effectivly means that later queries must consistently use the username
   (w.o domain) in order to find a result to that particular subscriber
   query. The main point is 'use_domain' can not be changed once the
   db_berkeley is setup.

7. Database Schema and Metadata

   All Berkeley DB tables are created via the kamdbctl script. This
   section provides details as to the content and format of the DB file
   upon creation.

   Since the Berkeley DB stores key value pairs, the database is seeded
   with a few meta-data rows . The keys to these rows must begin with
   'METADATA'. Here is an example of table meta-data, taken from the table

   Note on reserved character- The '|' pipe character is used as a record
   delimiter within the Berkeley DB implementation and must not be present
   in any DB field.

   Example 1.4. METADATA_COLUMNS
table_name(str) table_version(int)

   In the above example, the row METADATA_COLUMNS defines the column names
   and type, and the row METADATA_KEY defines which column(s) form the
   key. Here the value of 0 indicates that column 0 is the key(ie
   table_name). With respect to column types, the db_berkeley modules only
   has the following types: string, str, int, double, and datetime. The
   default type is string, and is used when one of the others is not
   specified. The columns of the meta-data are delimited by whitespace.

   The actual column data is stored as a string value, and delimited by
   the '|' pipe character. Since the code tokenizes on this delimiter, it
   is important that this character not appear in any valid data field.
   The following is the output of the ' dump version'
   command. It shows contents of table 'version' in plain text.

   Example 1.5. contents of version table
 table_name(str) table_version(int)

8. METADATA_COLUMNS (required)

   The METADATA_COLUMNS row contains the column names and types. Each is
   space delimited. Here is an example of the data taken from table
   subscriber :

   Example 1.6. METADATA_COLUMNS
username(str) domain(str) password(str) ha1(str) ha1b(str) first_name(str) last_
name(str) email_address(str) datetime_created(datetime) timezone(str) rpid(str)

   Related (hardcoded) limitations:
     * maximum of 32 columns per table.
     * maximum tablename size is 64.
     * maximum data length is 2048

   Currently supporting these five types: str, datetime, int, double,

9. METADATA_KEYS (required)

   The METADATA_KEYS row indicates the indexes of the key columns, with
   respect to the order specified in METADATA_COLUMNS. Here is an example
   taken from table subscriber that brings up a good point:

   Example 1.7. METADATA_KEYS
 0 1

   The point is that both the username and domain name are require as the
   key to this record. Thus, usrloc modparam use_domain = 1 must be set
   for this to work.

10. METADATA_READONLY (optional)

   The METADATA_READONLY row contains a boolean 0 or 1. By default, its
   value is 0. On startup the DB will open initially as read-write (loads
   metadata) and then if this is set=1, it will close and reopen as read
   only (ro). I found this useful because readonly has impacts on the
   internal db locking etc.

11. METADATA_LOGFLAGS (optional)

   The METADATA_LOGFLAGS row contains a bitfield that customizes the
   journaling on a per table basis. If not present the default value is
   taken as 0. Here are the masks so far (taken from bdb_lib.h):

#define JLOG_NONE 0
#define JLOG_INSERT 1
#define JLOG_DELETE 2
#define JLOG_UPDATE 4
#define JLOG_STDOUT 8
#define JLOG_SYSLOG 16

   This means that if you want to journal INSERTS to local file and syslog
   the value should be set to 1+16=17. Or if you do not want to journal at
   all, set this to 0.

12. DB Maintaince Script : kamdbctl

   Use the kamdbctl script for maintaining SIP-router Berkeley DB tables.
   This script assumes you have DBENGINE and DB_PATH setup correctly in
   kamctlrc. Note Unsupported commands are- backup, restore, migrate,
   copy, serweb.

   Example 1.9. kamdbctl
usage: kamdbctl create
       kamdbctl presence
       kamdbctl extra
       kamdbctl drop
       kamdbctl reinit
       kamdbctl bdb list         (lists the underlying db files in DB_PATH)
       kamdbctl bdb cat       db (prints the contents of db file to STDOUT in pl
       kamdbctl bdb swap      db (installs by db -> db.old; -> db)
       kamdbctl bdb append    db datafile (appends data to a new instance of db;
 output DB_PATH/
       kamdbctl bdb newappend db datafile (appends data to a new instance of db;
 output DB_PATH/

13. DB Recovery : kambdb_recover

   The db_berkeley module uses the Concurrent Data Store (CDS)
   architecture. As such, no transaction or journaling is provided by the
   DB natively. The application kambdb_recover is specifically written to
   recover data from journal files that SIP-router creates. The
   kambdb_recover application requires an additional text file that
   contains the table schema.

   The schema is loaded with the '-s' option and is required for all
   operations. Provide the path to the db_berkeley plain-text schema
   files. By default, these install to

   The '-h' home option is the DB_PATH path. Unlike the Berkeley
   utilities, this application does not look for the DB_PATH environment
   variable, so you have to specify it. If not specified, it will assume
   the current working directory. The last argument is the operation.
   There are fundamentally only two operations- create and recover.

   The following illustrates the four operations available to the

   Example 1.10. kambdb_recover usage
usage: ./kambdb_recover -s schemadir [-h home] [-c tablename]
        This will create a brand new DB file with metadata.

usage: ./kambdb_recover -s schemadir [-h home] [-C all]
        This will create all the core tables, each with metadata.

usage: ./kambdb_recover -s schemadir [-h home] [-r journal-file]
        This will rebuild a DB and populate it with operation from journal-file.
        The table name is embedded in the journal-file name by convention.

usage: ./kambdb_recover -s schemadir [-h home] [-R lastN]
        This will iterate over all core tables enumerated. If journal files exis
t in 'home',
        a new DB file will be created and populated with the data found in the l
ast N files.
        The files are 'replayed' in chronological order (oldest to newest). This
        allows the administrator to rebuild the db with a subset of all possible
        operations if needed. For example, you may only be interested in
        the last hours data in table location.

   Important note- A corrupted DB file must be moved out of the way before
   kambdb_recover is executed.

14. Known Limitations

   The Berkeley DB does not nativly support an autoincrement (or sequence)
   mechanism. Consequently, this version does not support surragate keys
   in dbschema. These are the id columns in the tables.