Multics Technical Bulletin                                MTB-548
DM: Record Manager Design

To:       Distribution

From:     Lindsey Leroy Spratt and Matthew Clement Pierret

Date:     04/27/82

Subject:  Data Managment: Record Manager Design


     This document describes various facets of the implementation
of the  record manager.  The record  manager uses three utilities
to  do  its  work,  data_mgmt_util_  and  rm_search_,  which  are
described  here,  and  collection_manager_.   The data_mgmt_util_
utility understands the format of records; the rm_search_ utility
positions in a record collection; collection_manager_ handles all
of the storage for a collection of records.

Comments should be sent to the author:

via Multics Mail:
   Spratt.Multics on either MIT Multics or System M.

via US Mail:
   Lindsey Spratt
   Honeywell Information Systems, inc.
   4 Cambridge Center
   Cambridge, Massachusetts 02142

via telephone:
   (HVN) 261-9321, or
   (617) 492-9321


Multics  project  internal  working  documentation.   Not  to  be
reproduced or distributed outside the Multics project without the
consent of the author or the author's management.



                 1 Abstract . . . . . . . . . . . . . .     i
                 2 Record Manager Utilities . . . . . .     1
                 3 Record Storage . . . . . . . . . . .     1
                 4 Record Layout  . . . . . . . . . . .     2
                    4.1 The field_table structure . . .     2
                 5 Allocation of Records  . . . . . . .     4
                    5.1 Allocation of Records to
                     Control Intervals  . . . . . . . .     4
                       5.1.1 Overlength Records . . . .     4
                    5.2 Allocation of a Record within a
                     Control Interval . . . . . . . . .     5
                 6 Rewriting Records  . . . . . . . . .     5

Appendix      1 Search Algorithm  . . . . . . . . . . .     1

Multics Technical Bulletin                                MTB-548
DM: Record Manager Design


     The     record     manager     uses     three     utilities,
collection_manager_,  data_mgmt_util_,   and  vector_util_.   The
storage  management utility,  collection_manager_, deals directly
with  the file  manager and talks  in terms of  "elements" of the
record  collection.   These  elements  are  undifferentiated  bit
strings;  they are  allocated in "control  intervals".  A control
interval is 1024 words long, one  Multics page.  At some point in
the  future this  size will  become settable  on a  per-page file
basis,  approximately in  the range  of 512  to 4096  words.  The
collection_manager_ is documented in  MTBs 551, "Data Management:
Collection  Manager  Functional Specification.",  and  552, "Data
Management:  Collection Manager Design.".

     The  record   data  is  presented  to   the  caller  of  the
record_manager_ as a typed_vector  structure (see MTB-541 "Toward
a  unification  of data  manipulation on  Multics."  for  a brief
discussion  of data  vectors).  The  vector_util_ module provides
tools for creating and manipulating vectors of data.

     The internal  representation of records is  as a continuous,
peculiarly  formatted,   bit  string.   This   representation  is
sometimes referred to as the "string" representation, in contrast
to  the  "vector"  representation  used to  communicate  with the
caller of  the record_manager_.  Converting  between the "vector"
representation and the  "string" representation, and manipulating
and interpreting the contents  of the "string" representation are
all  functions performed  by data_mgmt_util_ (which  is also used
for this purpose by the index_manager_).


     Records  are   stored  as  elements   in  collecitons  using
collection_manager_.   The  elements  are  a  storage abstraction
managed as  undifferentiated bit strings.   The control intervals
in a record collection all  contain nothing but the elements used
to store the records, i.e.,  there is no control information kept
inside of a record collection.

     When  a record  is stored  in the  record collection,  it is
broken up into  as many datums as necessary,  where each datum is
contained within a control interval.   All of the datums together
make  a  single "element"  of  the record  collection.  A  set of
collection_manager_  primitives  exists  for  allocating records,
freeing them,  rewriting and retrieving.  The  record is known to
these  primitives as  only a string  of bits,  an "element".  The
structure of the  record as a set of fields  is known and used by
the data_mgmt_util_ utility.

MTB-548                                Multics Technical Bulletin
                                        DM: Record Manager Design


     The record is  layed out within the element.   The layout of
the  contents  is  described   in  the  "field_table"  structure,
described below.   The layout of  the record is  divided into the
fixed portion of the record  and the variable portion.  The fixed
portion is  composed of all  of the fixed-length  fields, and the
length data for the variable  fields.  This minimizes the expense
of accessing the fixed  fields, as intermingling fixed-length and
varying-length  data requires  calculating each  field's position
for each record  and separating the fixed and  varying field data
eliminates  the need  for calculating  the location  of the fixed

     The  variable  portion of  the record  consists only  of the
contents  of  the  variable   length  fields.   The  field  table
describes the beginning location for  the fixed length fields and
the location of the length  value for the variable length fields.
It also describes the type of  each field, and the location where
the first variable field's contents start.

4.1 The field_table structure

     The field_table structure is stored in the record_manager_'s
per-record-collection  header  information.   The  field_table is
declared in dm_field_table.incl.pl1.

Multics Technical Bulletin                                MTB-548
DM: Record Manager Design

dcl 1 field_table             aligned based (field_table_ptr),
      2 version               fixed bin (17) unal,
      2 number_of_fields      fixed bin (17) unal,
      2 maximum_field_name_length
                              fixed bin (17) unal,
      2 location_of_first_varying_field
                              fixed bin (17) unal,
      2 field                 (ft_number_of_fields refer
        3 name                char (ft_maximum_field_name_length
                              refer (field_table
                              .maximum_field_name_length)) var,
        3 flags               unal,
          4 descriptor_is_varying
                              bit (1) unal,
          4 length_is_in_characters
                              bit (1),
          4 must_be_zero      bit (7) unal,
        3 location            fixed bin (35),
        3 descriptor          bit (36) aligned,
        3 length_in_bits      fixed bin (35),
      2 varying_field_map     (ft_number_of_fields refer (field_table
        3 field_id            fixed bin (17) unal,
        3 varying_field_index fixed bin (17) unal;

     The field_table.field array has one  entry for each field in
the record, varying and non-varying.  The varying and non-varying
fields may be intermingled in this array.

     The  field_table.field.location variable,  for a non-varying
field, identifies the starting offset in the record_string of the
data  for the  field.  field_table.field.type  identifes the data
type of  the field which  is sufficient to  know the size  of the
value  for most  types.  For  some types,  e.g.  char_nonvar, the
field_table.field.size value  is used to determine  the length of
the value.

     For  a varying  field, field_table.field.location identifies
the  location  of  the  length   value  for  the  field  (in  the
record_string).  The actual value of  the field is found (for the
first               varying               field)               at
field_table.location_of_first_varying_field in the record_string.
The      second      varying      field      is      found     at
field_table.location_of_first_varying_field  +  (length  of first
varying field).

     The length and contents of  varying fields are separated out
in  this  fashion to  help speed  references to  specific varying
fields,  by  allowing quicker  references to  the lengths  of the

MTB-548                                Multics Technical Bulletin
                                        DM: Record Manager Design

preceding  varying  fields  than  that  allowed  by  the  obvious
alternative of storing:

length1 + contents1 + length2 + contents2 + length3 + contents3 + ...

where:  to pick up contents3  one positions to length1, reads it,
adds  it  in  and  positions to  length2,  reads  it,  skips over
contents2, reads length3, reads contents3.  Also, this allocation
technique allows  storing the fields  in the same  order as their
definition by the caller at index-creation time.

     The varying_field_map array speeds the location of any given
field, which is of a varying type.

     The varying_field_map (ordinality_of_varying_field).field_id
is used  to determine the index  into field_table.field array for
the Nth varying field.

     The                                        varying_field_map
(field_array_index).varying_field_index is used  to determine the
ordinality  among  varying fields  of a  varying field  given its
field_id, its position in the field_table.field array.


     The technique for allocating  records has two parts; finding
an  appropriate  control interval  and allocating  records within
control intervals.

5.1 Allocation of Records to Control Intervals

     The basic  scheme for determining which  control interval in
which  to allocate  a record  follows:  The  tree is  searched to
determine the correct position for  the new record.  This results
in a  control interval id and  a slot being selected  for the new
record.  If  there is sufficient  room for the new  record in the
control interval selected, a single element containing the record
is allocated in this control interval at the selected slot.


     The   above   scenario  is   somewhat  complicated   by  the
possibility that a record will be  too long to fit in one control
interval.  To store an overlength record it is necessary to break
it into  multiple elements.  The  number of elements  required to
store  a  record is  equal to  the least  greater integer  of the
length of the record divided by the maximum allowed element size.

Multics Technical Bulletin                                MTB-548
DM: Record Manager Design

(i.e.,    number_of_elements    =     ceil    (record_length    /
maximum_allowed_element_size)  If  record_length  is  not  evenly
divisible  by the  maximum_allowed_element_size, then  one of the
elements of the record  is less than maximum_allowed_element_size
in length.  Otherwise, all of the elements are maximum sized.  If
there  is  a  less-than-maximum  sized element,  it  is allocated
first, and in the same fashion  as an element which is the entire
record as discussed above.  For the maximum-sized elements, which
necessarily  occupy  entire  control  intervals, a  set  of empty
control intervals is used.

5.2 Allocation of a Record within a Control Interval

     Once  the control  interval has  been selected  in which the
record  is  to be  inserted,  it may  be  necessary to  alter the
construction of the control interval  to make room for the record
contents.  The need to make room for the record's contents arises
when  there is  insufficient room  in the  "unused" space  of the
control  interval  for  its  contents,  but  enough  room  if the
"unused" space  and the scatterred free  space are combined.  The
scattered  free space  comes from deleting  (or rewriting larger)
records  and  simply leaving  the  storage their  contents occupy
behind to be recovered later.  This is solved by "compacting" the
control interval.

     Scatterred free space compaction is accomplished by making a
"logical" copy of  the control interval by inserting  each of the
records in the control interval  in a temp control interval, then
physically  copying  this  "compacted"  version  of  the  control
interval back into the original control interval.  This increases
the  unused  space  of  the control  interval  because  there was
scattered free space in the original copy.


     When  a  record  has  been  allocated,  it  is  assigned  an
identifier  corresponding   to  its  location.    The  identifier
contains  the  control interval  number  in which  the  record is
stored and  an index in  an array of offsets  within that control
interval.   The  physical location  of the  record inside  of the
control interval may change  as storage requirements dictate, but
once allocated, a record holds  on to the array slot.  Fulfilling
this  requirement  entails some  non-obvious behavior  in certain
cases of rewriting records.

     In the normal case, the record is stored in a single control
interval.  Rewriting a record with  a value not greater in length
than the old entails overwriting the old record.  Excess space is

MTB-548                                Multics Technical Bulletin
                                        DM: Record Manager Design

left  to be  reclaimed later.   Rewriting a  record with  a value
greater in length  than the old may require  allocating new space
in  the control  interval and freeing  the old.  If  there is not
enough room in  the free pool to allocate the  new value, but the
amount of free space plus the amount of scatterred space plus the
length of  the old value is  enough to allocate the  new value, a
scatterred free space compaction is  performed, and the new value
is stored in  the new free pool.  If even  a compaction would not
yield enough free  space in which to allocate  the new value, the
value is allocated in another control interval in the same manner
as  allocating  a  new  record.   There  is  one  exception:  the
identifier  of the  record does  not change  to indicate  the new
location of the record.  Instead, a "pointer" to the new value is
stored in the old control interval.  The indirection now required
to get to the record is sacrificed in order to gain permanency of
record identifiers.

     This  scheme is  curther complicated  by records  which span
control intervals.  If a record  value is larger than the maximum
size  for  an element,  the trailing  maximum sized  portions are
allocted  in  control intervals,  just as  when allocating  a new
record.   If  the  record  already spans  control  intervals, the
control  intervals  already  in  use  are  reused,  with unneeded
control  intervals  freed (if  the  record has  shrunk)  or extra
control intervals allocated (if the record has grown).

                       1  SEARCH ALGORITHM