Multics Technical Bulletin                                MTB-508
DM Architecture


To:       Distribution

From:     Lindsey L. Spratt

Date:     05/14/81

Subject:  Data Management: An Architectural Overview


1 ABSTRACT

    This document presents some of the basic concepts of the new
Data  Management  architecture for  Multics.   Each of  the major
design areas is introduced, and a  discussion of some of the more
important interactions between them is presented.


Comments should be sent to the author:

via Multics Mail:
  Spratt.Multics on either MIT Multics or System M.

via US Mail:
  Lindsey Spratt
  Honeywell Information Systems, inc.
  4 Cambridge Center
  Cambridge, Massachusetts 02142

via telephone:
  (HVN) 261-9321, or
  (617) 492-9321


_________________________________________________________________


Multics  project  internal  working  documentation.   Not  to  be
reproduced or distributed outside the Multics project without the
consent of the author or the author's management.


                            Page i. 


                           CONTENTS


                                                        Page

                1 Abstract . . . . . . . . . . . . . .     i
                2 Introduction . . . . . . . . . . . .     1
                   2.1 Note on the Organization of
                    Data Management Documents  . . . .     1
                3 The Data Management Architecture . .     3
                   3.1 Some Basic Definitions  . . . .     3
                      3.1.1 The Layer  . . . . . . . .     3
                      3.1.2 The Request  . . . . . . .     4
                      3.1.3 The Object of Discourse  .     4
                   3.2 Distributed System Architecture
                    (DSA) and Multics Data Management      4
                4 Example Use of DSR System  . . . . .     5
                   4.1 Definition of the Database  . .     5
                   4.2 Accessing . . . . . . . . . . .     7
                5 Objects of Discourse in the DSR
                 System  . . . . . . . . . . . . . . .     7
                   5.1 Description of the Relationship
                    Between the Objects of Discourse .     9
                6 External Access Layer  . . . . . . .    11
                   6.1 Services  . . . . . . . . . . .    11
                   6.2 Naming  . . . . . . . . . . . .    11
                   6.3 The Interface . . . . . . . . .    12
                7 Container Access Layer . . . . . . .    13
                   7.1 Services  . . . . . . . . . . .    13
                   7.2 The Interface . . . . . . . . .    13
                   7.3 Administrative Services . . . .    14
                   7.4 Naming  . . . . . . . . . . . .    14
                   7.5 Internal Structure  . . . . . .    14
                8 Page Access Layer  . . . . . . . . .    15
                   8.1 Services  . . . . . . . . . . .    15
                   8.2 Naming  . . . . . . . . . . . .    15
                9 Media Access Layer . . . . . . . . .    15
                   9.1 Services  . . . . . . . . . . .    15
                10 Ancillary Services System . . . . .    16
                   10.1 Services . . . . . . . . . . .    16
                      10.1.1 Transaction Management  .    16
                      10.1.2 Concurrent Access
                       Synchronization . . . . . . . .    17
                      10.1.3 Recovery Management . . .    18


                              iii Multics Technical Bulletin                                MTB-508
DM Architecture


2 INTRODUCTION

    The work being done is not simply an attempt to "make vfile_
better" or to  provide MRDS with a better file  manager; it is to
make  data  management  on  Multics  better  and  more respected.
Possibly  the  single-most important  objective  is to  make data
management  a   part  of  the  operating   system,  not  just  an
application on  top of it.   This means a  large concentration of
effort in hardcore-related issues.

    Part  and  parcel  to  making  data  management  "better" is
providing  Multics  with ways  to  cost effectively  and reliably
offer a coherent set of  data management services.  Some of these
are:

1)   Reliability and recovery against system failure.

2)   Handling of very large sets of data.

3)   Integrity of data.

4)   Concurrent access synchronization.

5)   Speed of access.

6)   Transaction management.

7)   Security    of   data,    with   both    discretionary   and
    non-discretionary access control.

    As  lagniappe  (something  extra),  we  want  to  centralize
redundant  code  as much  as  possible and  make  useful routines
available as external entrypoints,  both to save programming time
and to facilitate the writing of user utilities that can make use
of these features.

    An exhaustive  list of the  goals of data  management can be
found in MTB-481.


2.1 Note on the Organization of Data Management Documents

    There  is  a  set  of  documents  describing  the  new  data
management  of  which  this  document  is  intended  as  a  broad
overview.   There is  a level of  documents immediately following
this  one which  provide overviews of  specific areas  of the new
Data Management architecture.  Each  of these documents, in turn,
has the potential  for a series of documents  of a greater degree
of detail deriving from it.


                            Page 1. 
MTB-508                                Multics Technical Bulletin
                                                 DM Architecture


    The documents are as follows:

1)   MTB-508:   Architectural  Overview  -  A  discussion  of the
    entire Data Management Architecture.

2)   MTB-509:   External  Access  Layer -  Describes  the various
    collections  of  data  management  services  which  the Data
    Management architecture provides to  its users, and presents
    the elements of the implementation.

3)   MTB-510:   Container  Access Layer  -  Describes the  set of
    services  which   this  layer  provides   and  presents  the
    implementation of these services.

4)   MTB-511:  Page Access Layer -  Describes the set of services
    which this layer provides and presents the implementation.

5)   MTB-512:   Transaction  Management Overview  -  Defines what
    transactions  are, their  uses, their  interactions, and the
    techniques for managing them.

6)   MTB-513:   Recovery  Management  Overview  -  Describes  the
    possible  failure situations  for which recovery  of data is
    necessary, and also describes the techniques used to achieve
    this recovery.

7)   MTB-514:   Concurrency Management  Overview -  Describes how
    concurrent operations are synchronized.

8)   MTB-515:  Security - Describes how access security interacts
    with the architecture.

9)   MTB-516:   Distributed  Databases -  Describes how  the data
    management   architecture   will   deal   with   distributed
    environments.

Additional documents  will be produced  to discuss in  detail the
specification  and implementation  of various  parts of  the Data
Management Architecture.

    There are some other documents which are not part of the set
of  documents  describing the  Data Management  Architecture, but
relate closely to it.  They are:

  MTB-481 -   "A  Framework of  Goals toward  more Complete Data
              Management Capabilities on Multics."
  MTB-483 -   "Problems with Data Management on Multics."
  MTB-487 -   "Medium Range Goals in Data Management."


                            Page 2. 
Multics Technical Bulletin                                MTB-508
DM Architecture


3 THE DATA MANAGEMENT ARCHITECTURE

    The Data  Management architecture is composed  of two parts,
the  Data Storage  and Retrieval  system (DSR)  and the Ancillary
Services system  (AS).  These systems  provide logically distinct
sets  of services.   The DSR  system has  only the  capability of
storing  and retrieving  data, while the  AS system  knows what a
transaction is,  how to keep  track of locks, and  how to recover
from  various interruptions.   DSR and  AS may  make use  of each
other, particularly DSR  shall use AS, but it  is possible to use
one of  them without the other  as well.  If one  doesn't want to
have transactions and  all of the other things the  AS can add to
the  DSR,  it is  reasonable  to use  the  DSR system  by itself.
Similarly, one may use  the locking, journalling, and transaction
definition  facilities  of the  AS system  without using  the DSR
system,  if one  isn't interested  in doing  any data  storage or
retrieval via the DSR system.

    The  DSR  system  is   organized  in  several  layers.   The
outermost  layer is  the one which  all users  of Data Management
(DM) work through, or make their  requests of; it is the External
Access Layer.   The complete list of  (software) layers, in order
of depth, is:

  1) External Access Layer
  2) Container Access Layer
  3) Page Access Layer
  4) Media Access Layer


3.1 Some Basic Definitions

    The terms "layer", "request",  and "object of discourse" all
have fairly  specific meanings within the  discussion of the Data
Management Architecture.   The explanations of  these terms which
follow  are the  intended meanings whenever  one of  the terms is
used anywhere in the Data Management documents.


3.1.1 THE LAYER

    Within the  proposed architecture the term  "layer" has some
strong implications.   All of a layer's  services are employed by
invoking that layer's interface  requests.  All processing within
a  layer depends  on the  services of  the next  "lower" layer to
implement  data  storage and  retrieval "action"  requests.  This
processing  within  a layer  may employ  certain portions  of the
requests of the AS, depending on the particular implementation of
services being used.  It is  important to emphasize that the same
implementation of DSR may use the AS from several layers, this is


                            Page 3. 
MTB-508                                Multics Technical Bulletin
                                                 DM Architecture


why the  DSR system and  the AS system are  presented as separate
systems; the DSR system is a  user of the AS system, primarily in
an advisory fashion.


3.1.2 THE REQUEST

    Each layer  has two types  of requests which  it implements;
"administrative" and  "action".  The administrative  requests are
those which  allow one to specify  the method, or implementation,
which the layer is to use in carrying out its duties.  Typically,
the  implementation  of  these  requests  does  not  require  the
services of any  other layer of the DSR  system, specifically the
next lower layer.  The  action requests, already mentioned above,
implement the  data storage and retrieval  services of the layer.
The implementation of these requests necessarily does include the
use  of  the action  requests  of the  lower  layer.  In  fact, a
layer's action requests are only invoked by the implementation of
the next higher layer.   Administrative requests are only invoked
from outside the DSR system.


3.1.3 THE OBJECT OF DISCOURSE

    At the interface for any  given layer there exists a limited
set of objects which can be named when using the requests of that
layer.  These are the objects of discourse for that layer.  There
is a set  of objects for the administrative  requests and one for
the action requests.

    These two sets  of objects of discourse, the  objects of the
administrative discourse and the objects of the action discourse,
may be identical.  It is not  necessary that they be so, however.
An example object of discourse  is the "relation", when using the
mdbm_util_ interface of the External Access Layer It is an object
of  both  the  action   discourse  and  also  the  administrative
discourse to the External Access Layer.


3.2 Distributed System Architecture (DSA) and Multics Data
Management

    The basic outline  of this proposal is drawn  on the current
work for  DSA Data Storage  and Retrieval.  The  Data Storage and
Retrieval  portion  of  Multics  Data Management  which  is being
proposed will be compatible  with DSA DS&R sub-architecture.  The
Ancillary Services  portion of Multics DM  needs to be compatible
with the  Process and Resource Control  subarchitecture of DSA to
allow for  transaction management, recovery,  and synchronization
within  a DSA  environment.  Both architectures  (Multics DM DS&R


                            Page 4. 
Multics Technical Bulletin                                MTB-508
DM Architecture


and  DSA  DS&R)  employ  the layering  formalism,  with  a simple
correspondence.

    The  Internal  Access  Layer  of   the  DSA  DS&R  has  been
incorporated into  the External Access Layer  of Multics DM DS&R.
For the  other layers of DSA,  they have the same  names as their
corresponding layers in Multics DM.


4 EXAMPLE USE OF DSR SYSTEM

    To illustrate how the objects discussed in DSR interact with
each  other,  and  how  they  correspond  with  and  are  used by
traditional database  objects, consider the  following relational
database,  office, composed  of three  relations:  old_employees,
current_employees,   and   managers.    Both   old_employees  and
current_employees  relations  have the  attributes name,  job and
address; managers relation has  attributes mgr_name and emp_name.
The old_employees relation will rarely be accessed.

    The system  was previously defined,  in terms of  page files
and media, as such:

    Page File 1 is on Media File 1 (Tape A).
    Page File 2 is on Media File 2 (Disk A)
               and on Media File 3 (Disk B).


4.1 Definition of the Database

    When the database is  created, the Database Administrator or
the database management system describes the database as follows:

1)   The database is defined to be the external file "office".

2)   The containers which make up office are defined and created.

3)   Each container is determined to use a certain page file.

4)   Each container is described as  to how storage management is
    to be done.

These four steps yield the following set of containers which make
up the "office" external file:


                            Page 5. 
MTB-508                                Multics Technical Bulletin
                                                 DM Architecture


office: external file
old_employee_tuples: container, style3, Page File 1
ACTIVE_DATA: container, style4, Page File 2
old_employee_index: container, style1, Container ACTIVE_DATA
employee_tuples: container, style3, Container ACTIVE_DATA
employee_index: container, style1, Container ACTIVE_DATA
manager_tuples: container, style3, Container ACTIVE_DATA
manager_hash: container, style2, Container ACTIVE_DATA

For  this  example,  old_employee_tuples  contains  data  for the
old_employees  relation,  and  old_employee_index  contains  that
relation's  index.  Likewise  for the other  two relations.  With
each container is  specified a page file that  the container will
reside  on and  a style of  storage management.  Page  File 1 was
chosen for old_employee_tuples because the old_employees relation
is  seldom  referenced, and  Page  File 1  is  on a  tape device.
However, old_employee_index is on Page File 2 (disk) because disk
access  is  better for  an  index.  For  this  example it  is not
important what style1, style2 and style3 actually correspondd to,
except that the style of  storage management is specifiable.  The
containers given make  up the external file office.   Figure 1 is
the  resulting configuration  of external  file, containers, page
files and media files.

    External File                 Page Files      Media Files
. . . . . . . . . . . . . . . . .
.  ---------------------------  .
. |      office.xf           |  .
.  ---------------------------  .
.                               .
.       Containers              .
. . . . . . . . . . . . . . . . . . . . . . . .
.  --------                     .  ---------  .   ---------
. |  old_t |                    . | page    | .  | media   |
.  --------                     . |  file 1 | .  |  file 1 |
.                               .  ---------  .   ---------
. . . . . . . . . . . . . . . . . . . . . . . .
.  ---------------------------  .  ---------  .   ---------
. | -------                   | . |         | .  |         |
. || old_i |                  | . |         | .  | media   |
. | -------    ACTIVE_DATA    | . |         | .  |  file 2 |
. |                           | . |         | .   ---------
. | -----------   ----------- | . | page    | .   ---------
. || current_i | | current_t || . |  file 2 | .  | media   |
. | -----------   ----------- | . |         | .  |  file 3 |
. | -------    -------        | . |         | .  |         |
. || mgr_i |  | mgr_t |       | . |         | .  |         |
. | -------    -------        | . |         | .  |         |
.  ---------------------------  .  ---------  .   ---------
. . . . . . . . . . . . . . . . . . . . . . . .


                            Page 6. 
Multics Technical Bulletin                                MTB-508
DM Architecture


         Figure 1.  Organization of Example Database.


    Next the insides of each container must be defined.  In this
case,  there are  three kinds of  containers.  One  is that which
holds the complete tuple and is  managed by a record manager.  An
example of  this is the  old_employee_tuples.  The other  is that
which holds  an index for a  relation and is managed  by an index
manager.   An example  of this  is the  old_employees_index.  The
tuples are containers themselves, each of which contains elements
that  correspond  to  fields  in   a  tuple.   So,  for  example,
old_employees_tuples    container    contains    employee_tuple_N
containers, where  N is the  identifier of the  individual tuple,
which contains name, address and job container elements.

---------------------------------------------------------
|                 old_employee_tuples                     |
|  ----------------------   ----------------------   ...  |
| |   employee_tuple_1   | |   employee_tuple_2   |  ...  |
| |----------------------| |----------------------|  ...  |
| | name  | addr  | job  | | name |  addr |  job  |  ...  |
|  ----------------------   ----------------------        |
---------------------------------------------------------


         Figure 2.  Tuple Implemented as a Container.


4.2 Accessing

    Each  time  a  tuple is  stored,  current_employee_tuples is
checked to  see if there  is room in  it to fit  another element,
Page  File  2  is searched  for  free  space to  store  the tuple
container element  in, and Media  Files 2 and 3  are searched for
actual pages on disk to store the data.


5 OBJECTS OF DISCOURSE IN THE DSR SYSTEM

    Each layer in the Data  Storage and Retrieval system has two
objects  of  discourse,  a  file-like  object  and  a record-like
object.   A  file-like  object  comprises  a  set  of record-like
objects.  In tabular  form, the names of each  kind of object for
each layer are:


                            Page 7. 
MTB-508                                Multics Technical Bulletin
                                                 DM Architecture


   Layer Name             File-like         Record-like
                          Object            Object
  ----------------------------------------------------------
 | External Access      | external file   | record          |
 | Layer                |                 |                 |
 |----------------------|-----------------|-----------------|
 | Container Access     | container       | container       |
 | Layer                |                 | element         |
 |----------------------|-----------------|-----------------|
 | Page Access Layer    | unstructured    | extent (address |
 |                      | file            | and length)     |
 |----------------------|-----------------|-----------------|
 | Media Access Layer   | media           | physical record |
  ----------------------------------------------------------


            Figure 3.  Major Objects of the Layers.


In Data Structure Diagram form:


                            Page 8. 
Multics Technical Bulletin                                MTB-508
DM Architecture


----------------       ----------------
| external file  |     | record         |
----------------  -->  ----------------
 |                     |
 |                     |
 |                     |
 |                     |
 |                     |
 v                     v
----------------       ----------------
| container      |     | container      |
|                |     | element        |
----------------  -->  ----------------
 ^     |       ^        |
 |     |       |        |
 |      -------         |
 |                      |
 |                      |
                        v
----------------       ----------------
| unstructured   |     | extent         |
| file           |     |                |
----------------  -->  ----------------
 |                      |
 |                      |
 |                      |
 |                      |
 |                      |
 v                      v
----------------       -----------------
| media          |     | physical record |
----------------  -->  -----------------


    Figure 4.  Data Structure Diagram of the Major Objects.


The  boxes identify  the kind of  object and  the arrows indicate
that a single  instance of the kind of object  at the root of the
arrow identifies a set of instances  of the kind of object at the
head of  the arrow.  There can  be any number of  elements in the
identified set.


                            Page 9. 
MTB-508                                Multics Technical Bulletin
                                                 DM Architecture


5.1 Description of the Relationship Between the Objects of
Discourse

    There are two "directions"  of relationships to investigate;
the relationships between objects  in the same layer's discourse,
and  the relationships  between objects  in different discourses.
Within  the  same discourse  there  are only  two major  kinds of
objects,  the file-like  object and  the record-like  object.  An
instance of a  file-like object identifies a set  of instances of
record-like objects.

    The multi-layer  relationships are not obvious  from a layer
by layer presentation  and at the same time  are potentially more
interesting.   A few  are presented,  therefore, before  the more
detailed layer-by-layer is given.

    An  external file  can be  mapped on  to any  number of page
files.   A page  file can  be mapped  on to  any number  of media
files, but page files may not share media files.  Media files may
"compete" for the same physical space on disks.

    An external file is composed of any number of records.  When
talking  across  the  External  Access  Layer,  using  the action
requests of this  layer, these are the only  objects which can be
identified and manipulated.

Instances of external files, are:
- relational databases  (see the "Example use  of the DSR system"
 section),
- indexed sequential files, and
- unstructured files.

Examples of records are:
- tuples,
- file records (with keys as names), and
- bytes (or even single bits), respectively.

    An external file is built from containers.  A container must
be in  one and only  one external file.   A record is  built from
container  elements.   A  record  may  be  composed  of container
elements  from   more  than  one  container.    This  allows  the
construction  of  a record  which is  structured in  a strikingly
different fashion than the layout of the containers.

    A container is composed of any number of container elements.
A container is similar in some  respects to a pl/1 area.  In this
model, a  container element is  a piece of  allocated storage (or
freed  storage),  and  the   pointer  is  the  container  element
identifier.   A  container  which   is  not  contained  in  other


                           Page 10. 
Multics Technical Bulletin                                MTB-508
DM Architecture


containers is an "exterior" container.   Only one of these may be
in a page file.

    A container is part of a page file.  It is defined by one or
more offsets and extents within  a page file.  Container elements
are also defined by a set of one or more offsets and extents, but
these  must be  within (a subset  of) the offsets  and extents of
their parent container.  A container is parceled out as container
elements.  A container element within  a container may be defined
to be a container in its own right.


6 EXTERNAL ACCESS LAYER

    This  layer  provides "logical"  access  to data.   What are
generally thought of as database managers are implemented in this
layer  and,  hence,  the  services  of  database  management  are
available as "action" requests of this layer.


6.1 Services

    There are  several types of sets  of services; these "types"
are mutually exclusive.  The term "access method" will be used to
mean a  type of service  set.  It is not  necessarily possible to
view  a set  of data  through more  than one  access method.  For
instance, data organized for use  with a sequential access method
cannot be  used with the  relational access method.   Some of the
access methods are:

1)   stream access (unstructured).

2)   sequential access to record-like objects.

3)   direct access  to record-like objects  (keyed lookups).  The
    record-like objects  referred to here may  either be records
    organized  by  an index  of some  sort, or  they may  be the
    objects of which an index is composed ("keys").

4)   relational access (e.g., MRDS).

5)   network access (e.g., IDS).

6)   hierarchical access (For manipulating "tree" structured data
    simply).


                           Page 11. 
MTB-508                                Multics Technical Bulletin
                                                 DM Architecture


6.2 Naming

    An  external  file  is  named  by  a  pathname,  which  is a
character  string  with ">"  used  to delimit  components  of the
"path".   The pathname  of an  external file  is unique  within a
system, or node of a network.


6.3 The Interface

    There  are  many interfaces  to  the External  Access Layer.
Each  different "External  File Manager"  in the  External Access
Layer  defines  the terms  in which  applications outside  of the
External  Access  Layer  are  to "speak"  with  it.   Examples of
existing  systems  which  can  be implemented  as  "External File
Managers" are:  MRDS,  a relational manager; read_mail/send_mail,
the  "extended  mail facility";  continuum; and  vfile_'s several
interfaces, unstructured, sequential, blocked, and indexed.

    To  provide the  above services, several  classes of modules
within the External Access Layer are used:

1)   Index   managers.   For   instance,  a   b-tree  manager,  a
    hash-table manager, an array manager.

2)   Record manager(s).  For instance, a variable size manager, a
    fixed  size  manager,  a  variable size  within  fixed size,
    pre-allocated blocks  manager(a la "blocked"  vfiles).  Each
    of the foregoing would associate identifiers (which could be
    permanent) with records.  This last is a major service.

3)   Relation manager utility.  This  is currently implemented by
    mdbm_util_,  under  MRDS.    This  utility  understands  the
    structure of  tuples, as stored  in the record  manager, and
    how these  tuples are indexed  (by one or more  of the index
    managers).   It  translates   primitive  operations  against
    tuples,  such  as  "delete",   "get",  "put",  "find",  into
    operations of the index managers and the record manager.

4)   Relational database manager (MRDS).  This system is built on
    top  of   relation  manager  utility   and  understands  the
    existence   of  multiple   relations,  searches   and  other
    operations which involve  them, translating these operations
    into terms mdbm_util_  understands.  The relational database
    manager  may  directly invoke  index  or record  managers to
    manage some of its control or definitional data.

The  relational database  manager relies totally  on the relation
manager utility  and the index  and record managers  for all DS&R
activity; and the relation manager  utility relies totally on the


                           Page 12. 
Multics Technical Bulletin                                MTB-508
DM Architecture


index and record managers for all of its DS&R activity.  Both the
index  and  record managers  use  the Container  Access  Layer to
manage their internal objects.

    For instance,  a b-tree manager uses  "nodes" as its storage
objects.  If it  needs a new node or wants  to delete an old one,
it  uses  the allocate  or  free verbs,  respectively.   A record
manager would use the container manager in similar circumstances,
as well  as when it needed  to expand or contract  a record.  The
put  and get  interfaces of the  Container Access  Layer would be
used whenever "putting" and "getting" were to be done, since only
the  Container  Access  Layer  knows  how  storage  ids  map into
something the lower layers can understand.


7 CONTAINER ACCESS LAYER

    This layer provides  logical storage management (allocation,
freeing, and garbage-collection).


7.1 Services

1)   Reading  and  writing  of record-like  objects  addressed by
    unique "container element identifiers".

2)   Allocation and freeing of arbitrary sized pieces of storage,
    referred to as "container element".

3)   Creation and destruction of containers.


7.2 The Interface

    The following  are the operations  of the interface  for the
Container Access Layer:

1)   Create a container  in a specified page file.   This (or the
    following) operation is required  before any other Container
    Access  Layer  operation  can  be  used.   The  name  of the
    container to be created and the  page file in which it is to
    be placed must be supplied.

2)   Create a  container in a  specified container.  The  name of
    the container to be created and the container in which it is
    to be placed must be provided.

3)   Destroy a container.  The name  (or local identifier) of the
    container to be destroyed must be provided as input.


                           Page 13. 
MTB-508                                Multics Technical Bulletin
                                                 DM Architecture


4)   Allocate a container element.  The  name of the container in
    which  the element  is to be  allocated and the  size of the
    element needed must be provided  as input.  Returns the name
    of the element allocated.

5)   Free a container element.  The name of the container and the
    name  of the  container element  must be  provided as input.
    Or, only the local identifier  of the container element need
    be provided, as it implies the container.

6)   Get the "contents" of a  container element.  The name of the
    container  and  the name  of the  container element  must be
    provided as  input (with the same  exception about the local
    identifier as above).  This  is done whenever it's necessary
    to look at  a container element.  It is  always necessary to
    go through  the Container Access  Layer to look  at or alter
    data.

7)   Put the "contents" of a  container element.  The name of the
    container  and  the name  of the  container element  must be
    provided  as  input (with  the local  identifier exception).
    This is  done whenever it's  necessary to alter  a container
    element.


7.3 Administrative Services

    Some of these services are:

1)   Current state of a container, including space allocated.

2)   Set the type of container management.


7.4 Naming

    The exterior container is named  by a character string name,
unique  within  its  external   file.   Interior  containers  are
identified  as  container   elements  within  their  continaining
container.


7.5 Internal Structure

    Several  different   techniques  of  managing   storage  are
available.  Two  examples are:  a circular  free-list with roving
pointer; and, a multiple bucket first-fit.  Containers are either
implicitly or explicitly "typed" at creation time in terms of the
style of storage management which is to be used on them.


                           Page 14. 
Multics Technical Bulletin                                MTB-508
DM Architecture


8 PAGE ACCESS LAYER

     This layer is an interface  between the higher layer in the
architecture and the low-level Multics functions which manage the
physical  representation  of  the   data.   The  actual  physical
representation  of the  data is  hidden by  means of  this layer.
This layer presents the following  view of data storage to higher
layers:

1)   Data  is stored  in named  entities called  page files.  The
    name of a  page file within the system  is unique (analogous
    to path names in the Multics Storage System).

2)   A  page   file  is  a   contiguous  stream  of   bits,  each
    addressable, with no practical limitation in size.

3)   A  section of  a page file  is referenced  by specifying the
    page  file  name, the  bit  offset of  the beginning  of the
    section, and the length of the section in bits.


8.1 Services

1)   Maintain  page  files  as  lower-level  Multics  structures.
    Create, destroy, and administer these structures as needed.

2)   Translate  page file  names and  sections within  page files
    (bit  offsets and  lengths) into  the appropriate parameters
    identifying    the    corresponding    lower-level   Multics
    structures.   This can  be thought of  as converting logical
    addresses into physical addresses.


8.2 Naming

    Page files have pathnames which are unique per system.


9 MEDIA ACCESS LAYER


9.1 Services

    Moves   data   between   physical   storage   locations   in
peripherals,  such   as  a  disk,  and   main  memory  locations.
Basically,  the services  the disk Device  Interface Module (DIM)
provides.


                           Page 15. 
MTB-508                                Multics Technical Bulletin
                                                 DM Architecture


10 ANCILLARY SERVICES SYSTEM

    This system is logically outside of the DSR architecture but
provides  services that  are used by  several layers  of the DSR.
The reason the system is outside  of DSR is that its services are
logically distinct  from the storing and  retrieving of data done
by the DSR, and because these  services can be used without using
the DSR.  Also, these services  do not correspond directly to any
one  layer;  many  services  can be  used  by  any  layer.  These
services also  are not absolutely necessary  for data storage and
retrieval, but enhance the overall data management abilities.


10.1 Services

    The major services provided by the Ancillary Services system
are:

1)   Transaction definition management.

2)   Concurrent access synchronization.

3)   Recovery management.


10.1.1 TRANSACTION MANAGEMENT

    A  transaction is  a sequence of  data management operations
which  form  an  atomic  unit   using  the  concepts  of  commit,
checkpoint and rollback.  Transactions  are run by processes, but
are  not  dependent  upon  the  existence  of  any  one  process.
Transactions  control  the  state  of   data,  in  terms  of  the
volatility and availability of data.  Volatile data is data which
has no existence beyond that  of some particular transaction, the
transaction which  "owns" it.  Permanent data  does not depend on
the existence of any transaction  for its existence.  Public data
is  that which  can be seen  by any transaction,  private data is
only available to one transaction.   The generally usable data of
a  database  is permanent  and  public.  During  the course  of a
transaction, the  transaction will make  some of the  public data
private and will create some volatile data.

    Committing  a  transaction makes  all  of its  volatile data
permanent and all of its private data public.

    Checkpointing a  transaction transforms all  of its volatile
data  into  permanent data.   Should  all of  the processes  of a
transaction  die, the  transaction can  be resumed  from the last
checkpoint.


                           Page 16. 
Multics Technical Bulletin                                MTB-508
DM Architecture


    Rolling back a transaction destroys  all of the data it made
volatile since  a given checkpoint (perhaps  the beginning of the
transaction)  and   makes  public  all  of   the  data  that  the
transaction  made  private  since that  checkpoint.   A committed
transaction  cannot  be  rolled  back.   Aborting  a  transaction
requires rolling it back to the beginning of the transaction.


10.1.2 CONCURRENT ACCESS SYNCHRONIZATION

    If  transactions were  run one  at a  time, each transaction
would see  the consistent state  left behind by  its predecessor.
If  several  transactions   run  concurrently,  without  properly
synchronizing  their  actions,  their  interference  may  produce
incorrect  results  and  may  cause   the  data  base  to  become
inconsistent.   It is  the purpose  of concurrency  management to
regulate concurrent access to the  data base in order to preserve
its  consistency  and  prevent  all  transactions  from producing
incorrect results.

    A system  in which concurrent  transactions are synchronized
in  such a  way that  their result  is equivalent  to SOME serial
execution of  these transactions is commonly  regarded as correct
with  respect  to concurrency.   This  property is  known  as the
"serialization" property and can  be obtained if all transactions
behave according to the following protocol:

1)   Never modify  any data already  read or modified  by another
    transaction still in progress.

2)   Never read any data  already modified by another transaction
    still in progress.

    Read  locks   and  write  locks  are   available  to  assist
transactions  in  enforcing  these  rules.  A  read  lock  can be
granted to many readers while a write lock can be granted only to
one writer.

    During its execution, a transaction  requests a read lock on
an object before reading it and  a write lock on an object before
modifying it.  These locks are  held by the transaction until the
transaction commits or aborts, at  which point they are released.
It  is  therefore  impossible  for   a  transaction  to  see  the
modifications made by another  transaction still in progress.  As
a  result,  if  a  transaction has  to  be  undone,  partially or
completely,  by  the recovery  mechanism,  the rollback  will not
affect any other transaction.

    This  locking  protocol  does  not  prevent  deadlocks  from
occurring.   The  lock  manager   is  responsible  for  detecting


                           Page 17. 
MTB-508                                Multics Technical Bulletin
                                                 DM Architecture


deadlock  situations  when they  occur;  one of  the transactions
involved  in the  deadlock will  be interrupted  and rolled back.
The  rollback will  undo what  the transaction  has done  and, in
particular,  unlock  the  locks  set  by  the  transaction.   The
transaction  is  rolled  back  to  the  nearest  checkpoint  that
eliminates the conflict.


10.1.3 RECOVERY MANAGEMENT

    Recovery management guarantees that  data cannot be lost and
cannot be left inconsistent after program or system failure.

    The two important concepts in recovery are the "transaction"
and the "checkpoint".  The transaction  is the basic unit of data
base consistency; the  checkpoint is the basic unit  of data base
recovery.

    A transaction may either  run successfully to completion, in
which case it  is said to be committed, or  abort (or be aborted)
if it has no chance to reach completion.  A committed transaction
leaves the data base in a  consistent state, and its effects must
survive any system failure.  A transaction to be aborted may have
set the data base in an  inconsistent state, and its effects must
be completely undone.

    A  transaction  may be  broken  up into  several consecutive
phases, each of which being  ended by a checkpoint operation.  At
a  checkpoint,  the  data  base may  be  inconsistent,  since the
transaction  is  not  completed yet.   However,  the intermediate
state  of  the data  base  is well  understood by  the particular
transaction.   In the  event that the  transaction is interrupted
during the execution of a given phase, restoring the state of the
data  base  as it  was  at the  last  checkpoint would  allow the
transaction to restart from the last checkpoint instead of having
to  be  aborted, saving  all  the work  done in  the checkpointed
phases.  The recovery mechanism  must therefore guarantee that it
can undo the effect of an  uncommitted transaction up to its last
checkpoint.

    The method  chosen to implement  these recovery requirements
is  to log,  for each  modification made  to the  data base  by a
transaction, the  value of the  data before modification  and its
value after modification.

    The  values  before  modification  are logged  in  a journal
called  the  Before  Image Journal,  which  is used  to  undo, or
"rollback", the effects of an  interrupted transaction, up to the
last  checkpoint if  transaction does not  have to  be aborted or
completely if the transaction has to be aborted.


                           Page 18. 
Multics Technical Bulletin                                MTB-508
DM Architecture


    The values after modification are logged in a journal called
the  After Image  Journal, which  is used  to redo  all committed
transactions in the event of  a catastrophic system failure.  The
data base has to be reconstructed  from its last saved version to
which all  modifications recorded in the  After Image Journal are
applied.   After  all  committed  transactions  are  redone,  the
checkpointed  phases  of uncommitted  transactions are  redone to
allow  these  transactions to  resume  execution from  their last
checkpoint.


                           Page 19.