Overall RMS ISAM File Structure
		-------------------------------


Prolog - The prolog contains important file-wide information and is
		always in VBN 1.  The most important information it
		contains is the size of the data buckets, the VBN where 
		the area descriptors begin, the global buffer count, and
		the prolog version number.  Currently allowable prologs
		are 1, 2 and 3.

Key Descriptors - There is a key descriptor for each key defined in the
		file.  The first key descriptor is in VBN 1, and overlays
		the prolog.  Key descriptors provide information about
		each key in the file such as where the key appears in
		the record, number of segments, length of each segment,
		etc.  Things like the root VBN, root level, null
		character, compression flags are also there, along with
		a pointer to the next key descriptor.  If there is more
		than one key, then the second key descriptor begins in
		VBN 2.

Area Descriptors - These descriptors begin in first VBN after the last
		VBN to contain a key descriptor, and contain information
		about the areas of the file.  

Index and Data Buckets - The index and data buckets appear after the area
		descriptors.  RMS ISAM files have their index and data
		buckets in a B-tree arrangement.  The root (or top)
		index bucket contains a high key value and a pointer for
		each bucket below it in the structure.  Buckets at that
		level contain similar keys and pointers to buckets at the
		next lower level.  At the bottom level (level 0, or the
		data level) appear the data records.  Records at the primary 
		index data level contain the actual data bytes of the 
		records in the file.  Records at the secondary index data
		level (SIDRs) contain secondary key values and pointers
		to primary index data records with the corresponding
		alternate key value.  Bucket levels are numbered from
		0 (at the data, or bottom level) upwards to the root level.


				Prolog Structure
				----------------


+---------------------------------------------------------------------------+
/                               unused (11 bytes)                           /
+------------------+                                                        +
!   PLG$B_DBKTSIZ  !                                                        !
+------------------+--------------------------------------------------------+
!                                    unused                                 !
+--------------------------------------------------------+------------------+
/                               unused (85 bytes)        !    PLG$B_FLAGS   !
+------------------+------------------+                  +------------------+
!    PLG$B_AMAX    !    PLG$B_AVBN    !                                     /
+------------------+------------------+-------------------------------------+
!                unused               !              PLG$W_DVBN             !
+-------------------------------------+-------------------------------------+
!                                   PLG$L_MRN                               !
+---------------------------------------------------------------------------+
!                                   PLG$L_EOF                               !
+-------------------------------------+-------------------------------------+
!               PLG$W_GBC             !             PLG$W_VER_NO            !
+-------------------------------------+-------------------------------------+


* Note that the prolog structure overlays the key descriptor for the 
	primary key

* PLG$B_FLAGS, PLG$L_MRN, and PLG$L_EOF are only used in relative files

* PLG$B_AVBN - VBN of first area descriptor

* PLG$B_AMAX - maximum number of areas

* PLG$W_DVBN - first data bucket VBN

* PLG$W_VER_NO - prolog version number

* PLG$W_GBC - default global buffer count


				Key  Descriptor
				---------------


+---------------------------------------------------------------------------+
!                                  KEY$L_IDXFL                              !
+------------------+------------------+-------------------------------------+
!    KEY$B_LANUM   !    KEY$B_IANUM   !             KEY$W_NOFF              !
+------------------+------------------+------------------+------------------+
!  KEY$B_DATBKTSZ  !  KEY$B_IDXBKTSZ  !   KEY$B_ROOTLEV  !    KEY$B_DANUM   !
+------------------+------------------+------------------+------------------+
!                                 KEY$L_ROOTVBN                             !
+------------------+------------------+------------------+------------------+
!  KEY$B_NULLCHAR  !  KEY$B_SEGMENTS  !  KEY$B_DATATYPE  !    KEY$B_FLAGS   !
+------------------+------------------+------------------+------------------+
!            KEY$W_MINRECSZ           !   KEY$B_KEYREF   !    KEY$B_KEYSZ   !
+-------------------------------------+------------------+------------------+
!             KEY$W_DATFILL           !             KEY$W_IDXFILL           !
+-------------------------------------+-------------------------------------+
!            KEY$W_POSITION1          !            KEY$W_POSITION           !
+-------------------------------------+-------------------------------------+
!            KEY$W_POSITION3          !            KEY$W_POSITION2          !
+-------------------------------------+-------------------------------------+
!            KEY$W_POSITION5          !            KEY$W_POSITION4          !
+-------------------------------------+-------------------------------------+
!            KEY$W_POSITION7          !            KEY$W_POSITION6          !
+------------------+------------------+------------------+------------------+
!    KEY$B_SIZE3   !    KEY$B_SIZE2   !    KEY$B_SIZE1   !    KEY$B_SIZE    !
+------------------+------------------+------------------+------------------+
!    KEY$B_SIZE7   !    KEY$B_SIZE6   !    KEY$B_SIZE5   !    KEY$B_SIZE4   !
+------------------+------------------+------------------+------------------+
/                            KEY$T_KEYNAM (32 bytes)                        /
+                                                                           +
!                                                                           !
+---------------------------------------------------------------------------+
!                                  KEY$L_LDVBN                              !
+------------------+------------------+------------------+------------------+
!    KEY$B_TYPE3   !    KEY$B_TYPE2   !    KEY$B_TYPE1   !    KEY$B_TYPE    !
+------------------+------------------+------------------+------------------+
!    KEY$B_TYPE7   !    KEY$B_TYPE6   !    KEY$B_TYPE5   !    KEY$B_TYPE4   !
+------------------+------------------+------------------+------------------+


				Key Descriptor (continued)
				--------------


* KEY$L_IDXFL - VBN for next key descriptor

* KEY$W_NOFF - Offset to next key descriptor

* KEY$B_IANUM - index area number

* KEY$B_LANUM - level 1 index area number

* KEY$B_DANUM - data level area number

* KEY$B_ROOTLEV - Root level: height of index tree

* KEY$B_IDXBKTSZ - index bucket size

* KEY$B_DATBKTSZ - data bucket size

* KEY$L_ROOTVBN - VBN of root bucket

* KEY$B_FLAGS - duplicates (bit 0), change key (1), null key (2), 
	index compression (3), index uninitialized (4), key compression (6), 
	record compression (7)

* KEY$B_DATATYPE - data type for key

* KEY$B_SEGMENTS - number of segments in key

* KEY$B_NULLCHAR - null character if specified

* KEY$B_KEYSZ - key size

* KEY$B_KEYREF - key of reference

* KEY$W_MINRECSIZ - minimum record size

* KEY$W_xxxFILL - index and data fill values

* KEY$W_POSITIONx, KEY$B_SIZEx - beginning positions and sizes
	of up to 8 key segments

* KEY$T_KEYNAM - key name (ASCII counted string)

* KEY$L_LDVBN - first data bucket VBN


				Area  Descriptor
				----------------


+------------------+------------------+------------------+------------------+
!  AREA$B_ARBKTSZ  !   AREA$B_AREAID  !   AREA$B_FLAGS   !      unused      !
+------------------+------------------+------------------+------------------+
!    AREA$B_AOP    !    AREA$B_ALN    !             AREA$W_VOLUME           !
+------------------+------------------+-------------------------------------+
!                                 AREA$L_AVAIL                              !
+---------------------------------------------------------------------------+
!                                  AREA$L_CVBN                              !
+---------------------------------------------------------------------------+
!                                 AREA$L_CNBLK                              !
+---------------------------------------------------------------------------+
!                                  AREA$L_USED                              !
+---------------------------------------------------------------------------+
!                                 AREA$L_NXTVBN                             !
+---------------------------------------------------------------------------+
!                                  AREA$L_NXT                               !
+---------------------------------------------------------------------------+
!                                 AREA$L_NXBLK                              !
+-------------------------------------+-------------------------------------+
!                unused               !              AREA$W_DEQ             !
+-------------------------------------+-------------------------------------+
!                                  AREA$L_LOC                               !
+---------------------------------------------------------------------------+
!                                  AREA$W_RFI                               !
+-------------------------------------+                                     +
    <----  AREA$L_TOTAL_ALLOC         !                                     !
+-------------------------------------+-------------------------------------+
!                                     !          AREA$L_TOTAL_ALLOC <----     
+                                     +-------------------------------------+
!                                    unused                                 !
+-------------------------------------+                                     +
!             AREA$W_CHECK            !                                     !
+-------------------------------------+-------------------------------------+


				Area Descriptor (continued)
				---------------


* AREA$B_FLAGS - not used

* AREA$B_AREAID - area number

* AREA$B_ARBKTSZ - bucket size for area

* AREA$W_VOLUME - relative volume number

* AREA$B_ALN - extend allocation alignment

* AREA$B_AOP - alignment options: absolute alignment/hard (bit 0),
	locate on cylinder (1), contiguous best try (5), contiguous (7)

* AREA$L_AVAIL - reclaimed bucket chain

* AREA$L_CVBN - starting VBN for current extent

* AREA$L_CNBLK - number of blocks in current extent

* AREA$L_USED - number of blocks used

* AREA$L_NXTVBN - next VBN to use

* AREA$L_NXT - starting VBN for next extent

* AREA$L_NXBLK - number of blocks in next extent

* AREA$W_DEQ - default extend quantity

* AREA$L_LOC - start LBN on volume

* AREA$W_RFI - related file ID (6 bytes)

* AREA$L_TOTAL_ALLOC - total block allocation

* AREA$W_CHECK - checksum


			Prologue 3 Data Bucket Structure
			--------------------------------


	             (Note that picture runs right to left)


+-------------------------------------+------------------+------------------+
!            BKT$W_ADRSAMPLE          !  BKT$B_INDEXNO   !  BKT$B_CHECKCHAR !
+-------------------------------------+------------------+------------------+
!            BKT$W_NXTRECID           !            BKT$W_FREESPACE          !
+-------------------------------------+-------------------------------------+
!                                 BKT$L_NXTBKT                              !
+-------------------------------------+------------------+------------------+
               <----   data records   !    BKT$B_BKTCB   !    BKT$B_LEVEL   !
                                      +------------------+------------------+


* BKT$B_CHECKCHAR - This first byte of the bucket should be identical to
	the last byte of the bucket.  Both are incremented every time the 
	bucket is modified.  If the bucket check bytes are out of phase,
	RMS will complain about a bucket format check error: what this 
	usually indicates is that something has interrupted the writing
	of all blocks in a bucket.

* BKT$B_INDEXNO - The index number: 0 for primary; 1, 2, ... for alternates.

* BKT$W_ADRSAMPLE - The low order word of the bucket VBN.

* BKT$W_FREESPACE - The first byte of unused space in the bucket.

* BKT$W_NEXTRECID - Next available record id.

* BKT$L_NXTBKT - Horizontal link to next bucket.

* BKT$B_LEVEL - Level in the index structure.

* BKT$B_BKTCB - Control byte.  Can indicate, among other things, that this
	is the last bucket in the structure at this level.


		Prologue 3 Data Record Structure (with key compression)
		-------------------------------------------------------


			(Note that picture runs right to left)


    +---...----------------------------------------------------------------+
    |  key +   | cnt | len | record  |         RRV         | record | ctrl |
    |    data  |     |     | length  |    VBN       |  id  |   id   | byte |
    +---...----------------------------------------------------------------+

size:		byte   byte   word         6 bytes            word    byte


* The first key in each bucket is always fully expanded (but may be
	repeating character truncated, however).

* The high order 6 bits of the record control byte indicate that 
	the record is deleted (bit 2), or is an RRV (bit 3).  
	ANALYZE/RMS will display the state and position of these 
	bits.  The low order two bits are practically meaningless: 
	a typical non-deleted record that is not an RRV will have a 
	control byte of hex 02.

* Data records are assigned a record id to uniquely identify them
	within the data bucket.  These ids are assigned in the order
	of insertion, and may have nothing to do with the physical
	order of records within the data bucket.  RRVs are in
	essence "forwarding addresses" of records that are useful
	only after the record has been displaced by a bucket split.
	If a record has never been moved by a bucket split, then
	its RRV points to itself.

* Prolog 3 compression features imply something that may not be 
	obvious about record lengths: even with fixed-length records, 
	if there is data or key compression, then there must be a 
	record length, since the length can vary based on the amount 
	of compression.

* "len" and "cnt" are key compression fields. "len" is the length 
	of the 	key (not including the "cnt" byte).  "cnt" is the 
	count of front bytes, based on the previous key.  Repeating 
	characters at the end are truncated.  There is an example 
	given below.

* Prolog 3 data records ALWAYS have the key at the front (even if 
	there is no key compression).  If the key field is in the 
	middle of the record, it is still extracted and placed at 
	the front for performance reasons (of course, it is inserted 
	at the proper point before the record is returned to the user.


	Example of key compression using 6-byte string keys (see 
		explanation of "len" and "cnt" given above):

			(Example runs right to left)


	    Second key in bucket has 	   First key in bucket has
	      value "ABCDFF"              value "ABCDEF"

            key  cnt len                       key       cnt len
  ...data... 46  04  01...      ...data... 464544434241   00  06 ...


	Note here that with the second key fully expanded based on the 
	preceding key, we only come up with 5 characters because there has 
	been rear end truncation of repeating characters.  We manufacture 
	enough bytes of the last character (D, or hex 46) and append them to 
	make the key 6 bytes long.


				Data Record Compression
				-----------------------

	The compression algorithm is different for data records -- it is 
	strictly a repeating character truncation process.  The data portion 
	of the record begins immediately after the key.  Note that RMS will 
	not do repeating character truncation unless there are at least 5 
	repeating characters, to make sure that the extra overhead will not 
	negate the savings.  

	There are two fields associated with data record compression: a word 
	field which points to the compressed character, and a byte field that 
	tells how many repetitions of the character were truncated.  The 
	general format of the record is: pointer word, data segment, 
	truncation count byte; pointer word, data segment, truncation count 
	byte, etc.  An example of data record compression is given below.  
	Each set of {word, data, byte} is termed a compression segment.

	A record with a data portion that looks like:

		"ABBBBBBCDDDDDDDDDDD"		(A, 7 Bs, C, 11 Ds)

	will compress to two segments:

		(Example runs right to left)


 count      data      pointer        count    data   pointer   
  byte 		       word           byte             word    overhead
		
  0A        4443       0002            06     4241     0002     ...


				Index Bucket Structure
				----------------------

	Index buckets look identical, except that the "index #" byte is not 
	necessarily 0, and neither is the "index level" byte.  They of course 
	reflect the index number and the level in the index structure.
	
	Index levels are numbered with the root level being the highest level, 
	and data levels always being level 0.  Note that there is a data level
	for alternate index structures as well that consists of a key and an 
	RRV pointer.  The RRV pointer points to the primary data record with 
	that secondary key value.

	RMS saves index bucket space for all prologs by minimizing the size of 
	the field needed to represent a bucket's VBN pointer.  For prolog 3, 
	all VBN pointers in a particular index bucket are the same size, 
	maximized to the size necessary to represent the largest pointer in 
	the bucket.  Bits 3 and 4 of the bucket control byte indicate the 
	pointer size for the bucket: 00 means two-byte pointers, 01 means 
	three-byte pointers, and 10 means four-byte pointers.

	Note that if there is no index compression, RMS will do a binary search 
	through index buckets for the requested key value.  This of course 
	includes binary and integer keys.  This is why prolog 3 keeps all VBN 
	pointers in a given index bucket the same size.
	
	Index compression is done exactly like key compression.

	Prolog 3 index records are split into two parts, the key and the VBN 
	pointer.  The keys are at the beginning of the index bucket, and the 
	VBN pointers are at the end of the index bucket.  (A little silly,
	but it's too late now.)


			Secondary Index Data Records (SIDRs)
			------------------------------------


	"Data level" records of alternate indexes are called "SIDRs".  A SIDR 
	consists of a size word, followed by a key value and one or more RRVs 
	with  control fields.  The control field indicates whether or not the 
	record is deleted.  If data key compression is enabled for this index, 
	then the key will be compressed, otherwise not.  The following 
	illustrates the layout of a SIDR record.

	(Examples run right to left)

	With key compression:

    +--...------------------------------------------------------------------+
    |  ...  |  RRV2  | ctl |  RRV1  | ctl |    key    | cnt | len |  size   |
    +--...------------------------------------------------------------------+


	Without key compression:

    +--...-----------------------------------------------------+  
    |  ...   |  RRV2  | ctl |  RRV1   | ctl |   key   |  size  | 
    +--...-----------------------------------------------------+


      o  Record Operations (assumes no bucket splits)


          -  $PUT


             1.  Initialization/validation (if sequential  access,  is

                 key  value  of  new  record greater than that of last

                 record, etc.)


             2.  Position to point  of  insert  (involves  positioning

                 through  the  index structure by key, and leaves data

                 bucket locked)


             3.  Adjust "high set" appropriately


             4.  Build record  overhead  fields  in  bucket;  move  in

                 record itself


             5.  Lock new record


             6.  Update primary index (if necessary)


             7.  Unlock bucket


             8.  Insert alternate keys (if any) (extracted  from  user

                 buffer)


          -  $DELETE (assumes previous $GET/$FIND)


              +  V4 $DELETE


                 1.  Initialization/validation  (is  there  a  current

                     record, etc.)


                 2.  Position by RFA to record (leaves bucket locked)


                 3.  SAVE RECORD IN INTERNAL BUFFER


                 4.  Delete the RRV (if any)


                 5.  Delete the primary record itself


                 6.  Unlock bucket


                 7.  Delete all alternate keys, plucking  values  from

                     saved record


              +  V3 $DELETE


                 1.  Initialization/validation


                 2.  Position by RFA to record (leaves bucket locked)


                 3.  Delete RRV (if any)


                 4.  Delete all alternate keys -- NOTE BUCKET IS STILL

                     LOCKED at this point (*)


                 5.  Delete primary data record


                 6.  Unlock bucket


          -  $UPDATE (assumes previous $GET/$FIND)


             1.  Initialization/validation


             2.  Position by RFA to record (leaves bucket locked)


             3.  If alternate keys will change, then:


                 1.  Save old record


                 2.  Unlock data bucket


                 3.  Insert new SIDR entries


                 4.  Reposition by RFA to record (leaves bucket locked

                     again)


             4.  Is new record size less than or equal to old size?


                  +  YES (smaller or same as old record)


                     1.  Adjust high set appropriately


                     2.  Insert record


                  +  NO (larger than old record)


                     1.  Save record ID


                     2.  Perform "pseudo-$DELETE"


                     3.  Perform "pseudo-$PUT" (stuffing saved  record

                         ID) (*)


             5.  Unlock bucket


             6.  Delete old SIDR entries (if  any)  using  old  record

                 buffer


      o  Bucket Splits (or How to Complicate Matters by a  Few  Orders

         of Magnitude) (assumes old bucket is already locked)


         1.  Lock area


         2.  Allocate and lock new bucket


         3.  Unlock area


         4.  Format new bucket


         5.  Set new  bucket's  next  pointer  to  old  bucket's  next

             pointer


         6.  Set old bucket's next pointer to the new bucket


         7.  Move data into new bucket


         8.  Unlock new bucket


         9.  Scan old bucket for records past  the  split  point  that

             have RRVs, and keep in a table.


        10.  Update free space in old bucket and unlock it


        11.  Update RRVs in table to point to new location of records.

             This  involves  multiple  positionings  by RRV -- one for

             each RRV to be updated.


             Note that SIDRs are not updated!  SIDR entries may  point

             to  an  RRV,  which  in  turn  points to the real record.

             Because of the RRV updating process however,  this  level

             of indirection never goes beyond one.


      o  Performance Issues


          -  Bucket Size versus Record Size


              +  Larger data buckets yield fewer index buckets,  which

                 results  in fewer DIOs, but longer search times (CPU)

                 at the data level


              +  Smaller data buckets yield more index buckets,  which

                 results in more DIOs, but shorter search times at the

                 data level


              +  Keep in mind:  Prolog 3 performs binary  searches  in

                 index  buckets  IFF  there  is  no index compression.

                 Index bucket search times are greatly reduced, so the

                 major  consideration  for  CPU  usage  is  data level

                 searches.


              +  What are you willing to trade?   If  you  don't  have

                 memory  to  burn, then the trade is more significant.

                 If you DO have lots of memory, you can have the  best

                 of both worlds:


          -  Index Caching and Global Buffers.  If you can use  global

             buffers   to  cache  the  entire  index  structure,  then

             EVERYBODY WINS!  If you cache the entire index  structure

             locally (multibuffer count), then the process wins at the

             expense of other processes (using more  memory).   Really

             better only in the nonshared case.


             Note that this argument for caching  lots  of  the  index

             structure  falls  apart  for  sequential  access, where a

             small number of buffers is plenty (2).


          -  Compression Considerations.  Certain data record  formats

             do NOT lend themselves to compression.  Consider the case

             of a file created at the beginning of a year.   The  data

             records  in  this file consist of twelve blank subfields,

             with  data  inserted  into  one   subfield   each   month

             throughout the year.  OUCH!