SECURITY = protection of data against unauthorized disclosure, alternation or destruction.

TYPES of DB SECURITY:
Discretionary Security
    (owner-creator of object grants and revokes rights at his/her discretion)

Mandatory Security
    - Enterprise requires security matches (clearances) for access to certain
        data by certain users.
    - used in military database systems, industry where trade secrets are involved


Discretionary security mechanisms
---------------------------------

 -user-views
 -authentication and authorization
 -dynamic modification of user requests


SQL Security

1 View mechanism (can hide sensitive data from unauthorized users)

2 authorization subsystem (user can grant & revoke rights to others)

 - initally DBA has SYSADM authority (can do everything and noone else can do anything).

 - DBA uses GRANT verb to authorize others.
       Then those others can further authorize still others using GRANT.

 - GRANT... TO PUBLIC means all users
 - GRANT ALL ... means grant all rights
 - GRANT WITH GRANT OPTION ...U1  gives user U1 authority and the right to
                                  grant authorization to others.

 - rights include SELECT, UPDATE, DELETE, INSERT, ALTER, INDEX.
 - all authorizations can be revoked by granter using REVOKE ...
 - Revocations cascade if applied at the top of a hierarchy.

Data encryption can provide security (e.g., password file, etc.)

INTEGRITY = accuracy or validity of data

* Usually implemented using predicates (WHERE clauses).
* Most DBMSs have minimal integrity system



"Mandatory" Security

In 1982 summer a project was initiated by National Research Council
and Air Force Studies Board to study different issues in Multilevel Data
Management Security, at Woods Hole Massachusets.  In their report they
have considered the following four generic approaches to retrofit existing 
DBMSs in order to make them secure.  

  1. Physical separation (e.g., different discs per data classification)
  2. Temporal separation (e.g., periods processing)
  3. Logical separation (e.g., security kernel mediates shared resources)
  4. Cryptographic separation (e.g., different encryption keys for each data classification)

 
     Based on these four approaches ten different architectures were
obtained and out of these ten, three were found to be of particular
interest.  These alternatives are given below.
 (Notations:-  Users: U, Data: D, Reference Monitor Filter: F, and the
  boxes bounded by "+" signify a trusted component.)

Case 1. Physical (Temporal) Separation

         ______      ______      ______      ______
        | D Hi |----| DBMS |----| F Hi |----| U Hi |  Hi
	|______|    |______|    |______|    |______|
	-------------------------------------------------
         ______      ______      ______      ______
        | D Lo |----| DBMS |----| F Lo |----| U Lo |  Low
	|______|    |______|    |______|    |______|
       

Case 2. Logical Separation

         ______      ______      ______      ______
        | D Hi |----| DBMS |----| F Hi |----| U Hi |  Hi
        |      |    |______|    |______|    |______|
        |++++++|  ,-------------------------------------
        |      |  :  ______      ______      ______
        | D Lo |----| DBMS |----| F Lo |----| U Lo |  Low
        |______|  : |______|    |______|    |______|
        ----------' 
     This case is same as case 1 except for the shared physical media
     (e.g., disc) to store the date and it requires logical separation
     by a trusted disc or file manager in a trusted OS kernel.  The
     kernel also encapsulates the other security objects (e.g., DBMS,
     filter, user interface).


Case 3. Logical Separation Plus

         ______      ______     ++++++++     ______
        | D Hi |----| DBMS |----+ F Hi +----| U Hi |  Hi
        |      |    |______|    ++++++++    |______|
        |++++++|  ,----------.    +  +   ,--------------
        |      |  :  ______  :  ++++++++ :   ______
        | D Lo |----| DBMS |----+ F Lo +----| U Lo |  Low
        |______|  : |______| :  ++++++++ :  |______|
        ----------'          `-----------'
        This case has the same configuration as Case 2, but with an
        interconnection between the Hi and Lo filters.  This allows the
	Hi user to access the Lo data, but it complicates the design of
	the filters (which now must be trusted).


Case 4. Logical Separation Plus

         ______      ______     ++++++++     ______
        | D Hi |----| DBMS |----+      +----| U Hi |  Hi
        |      |    |______|    + F Hi +    |______|
        |++++++|  ,----------.  +  &   + ,--------------
        |      |  :  ______  :  + F Lo + :   ______
        | D Lo |----| DBMS |----+      +----| U Lo |  Low
        |______|  : |______| :  ++++++++ :  |______|
        ----------'          `-----------'
        This configuration is the same as Case 3 except that we collapse
	separate filters into a single trusted object to simplify its design.
    

Case 5. Physical Separation Plus
         ______      ______     ++++++++     ______
        | D Hi |----| DBMS |----+      +----| U Hi |  Hi
        |______|    |______|    + F Hi +    |______|
        ---------------------.  +  &   + ,--------------
         ______      ______  :  + F Lo + :   ______
        | D Lo |----| DBMS |----+      +----| U Lo |  Low
        |______|    |______| :  ++++++++ :  |______|
                             `-----------'
        Case 5 combines the benifits of Cases 1 and 4.  It shows good
	security because of physical separation, and it allows Hi user 
	access to both Hi and Lo data (data access performance is more
	complex for the Hi user, who must integrate separate queries
	against Hi and Lo databases).


Case 6. Physical Separation Plus
         ______      ______     ++++++++     ______
        | D Hi |----| DBMS |----+ F Hi +----| U Hi |  Hi
        |______|    |______|    ++++++++    |______|
        ---------------------.    +  +   ,--------------
         ______      ______  :  ++++++++ :   ______
        | D Lo |----| DBMS |----+ F Lo +----| U Lo |  Low
        |______|    |______| :  ++++++++ :  |______|
                             `-----------'
This case is a combination of features from Case 3 and 1, with benefits similar to Case 5.


Case 7. Fully Integrated 

         ______                 ++++++++     ______
        |      |                +  F   +----| U Hi |  Hi
        |  D   |     ______     +  Hi  +    |______|
        |  Hi  |----| DBMS |----+  &   + ,--------------
        |  &   |    |______|    +  Lo  + :   ______
        |  Lo  |                +      +----| U Lo |  Low
        |______|                ++++++++ :  |______|
        ---------------------------------'

        Case 7 is a fully integrated database, with a fully integrated
	tursted filter and an untrusted DBMS.


Case 8. Partially Integrated, Physically Separate
         ______
        | D Hi |     ______      ______      ______
        |  Lo  |----| DBMS |----| F Hi |----| U Hi |  Hi
	|______|    |______|    |______|    |______|
	-------------------------------------------------
         ______      ______      ______      ______
        | D Lo |----| DBMS |----| F Lo |----| U Lo |  Low
	|______|    |______|    |______|    |______|
       
        In this case the Hi user has a direct access to an integrated,
	physically separate Hi and Lo database.  But Lo data
	synchronization in both the containers is a problem. 
        

Case 9. Partially Integrated Plus
         ______
        | D Hi |     ______     ++++++++     ______
        |  Lo  |----| DBMS |----+  F   +----| U Hi |  Hi
	|______|    |______|    +  Hi  +    |______|
	---------------------.  +  &   +  ,--------------
         ______      ______  :  +  L   +  :  ______
        | D Lo |----| DBMS |----+      +----| U Lo |  Low
	|______|    |______| :  ++++++++  : |______|
                             `------------' 
        This configuration offers the same benefits as Case 8 and resolves
	the data synchronization dilemma. 


Case 10. Partially Integrated, Logically Separate.

         ______      ______     ++++++++     ______
        | D Hi |----| DBMS |----+  F   +----| U Hi |  Hi
        | & Lo |    |______|    +  Hi  +    |______|
        |++++++|  ,----------.  +  &   + ,--------------
        |      |  :  ______  :  +  Lo  + :   ______
        | D Lo |----| DBMS |----+      +----| U Lo |  Low
        |______|  : |______| :  ++++++++ :  |______|
        ----------'          `-----------'

	Case 10 is functionally the same as Case 9, but a security kernel
	is used to supply logical separation of the databases. System
	performance and security depend on kernel performance and security.


These ten models generated can be gouped into the following four generic
security architectures.

  Generic Architecture      Model Cases    Key DBMS Technology
  ------------------------------------------------------------
  Physical                  1,5,6,8,9      Distributed Nets
  Temporal                  1              ----
  Logical                   2,3,4,10       OS Kernel
  Cryptographic             1,7            Cryptographic Integrity Lock

Out of these ten models cases 5,9,7,4,and 10 are of particular interest.
These 5 are identified by the key technology involved in providing the 
security.  Physical protection can be achieved via a "Distributed DBMS"
in a local net.  Logical protection is achieved by using trusted filters
in combination with trusted OS software based on secutity-kernel
techonology, i.e., a "Kernelized DBMS".  Although Case 7 is the simplest
design and offers the best performance, it is very onsecure because the
untrusted DBMS can modify data and classification labels needed by the
trusted filter.  The key problem is the integrity of the database, and
the key solution technology is cryptography i.e., "Cryptographic Integrity 
Lock" otherwise known as "Spray Paint Architecture".
  

The advantages and disadvantages of the three main archtectures:

 Database    Problem              Benefit                    Architecture
 ------------------------------------------------------------------------
 Separated   Complex Hi Query     Database Isolation         Kernelized
  Hi, Lo      parsing             Trojan Horse Resistance
             Performance          Direct Lo Query 
	      overhead

 Separated   Database             Database Isolation         Distributed
  Hi-Lo,Lo    Synchronization     Trojan Horse Resistance   
				  Direct Hi and Lo Query

 Integrated  Trojan Horse         High Performance          Spray Paint
  Hi-Lo       Vulnerability       Least Complex             (Integrity Lock)
 

Kernalized Architecture Examples
--------------------------------
  - Research Prototype SEAVIEW (SRI) : element-level labelling 
  - Commercial Systems ORACLE : relation-level labelling


========================================================================  

				SECTION B
				---------
In the previous section we have seen different retrofit architectures
for Multilevel Secure Databases.  This section  describes a Concurrency
Control technique and Query Processing in Kernelized architecture.  As
we know in this kind of architecture all writes are local and all low
level data item reads are local but remote (with respect to container).
While discussing a concurrency control mechanism our goal is to reduce
transaction response time while increasing system through-put.  Query
processing mechanism also focuses on reducing transaction response time.

CC in kernelized architecture has two basic problems: security and
data availability.  When a high level process wants to read low level
data item, it can not set a read lock on the data item since this process
would establish a covert channel.  This could be done by a trozan horse 
trying to send bits of 1's and 0's by locking and then unlocking low 
level data which is being watched by a low level subject.  Trying to
eliminate this problem causes the hign user to wait for the low level
data which creates the problem of data availability.  A better solution
is to use multiversioning and hence high users can read older versions
of low data although it would be stale.  Some people suggest to restrict
the number versions to two, one for the subjects at the same level of
data item and one for subjects of higher levels.

Let us consider 3 classification levels U (Unclassified) < C (Confidential)
< S (Secret).  We use the following notations:  Ti(S) means a transaction
Ti which has entered at level S and x(U) means the data item x at level U.
In our method we suggest to use a POSTROLL in the TCB and a ROLL at each
level.  The POSTROLL contains one bit for each container and a '1' at a 
particular position of a vector shows that that particular transaction
needs to access atleast one data item from that container.  When a
transaction enters the system it declares its data needs and POSTs a
Vector in POSTROLL which would eliminate indirect conflicts between
containers.  Then it posts the RV at the same level of the transaction. 
Since we are using restricted *-policy (i.e., users are not allowed to
wirte up) the transaction would not POST any RV on the ROLLs that are in 
the higher contaiers.  If a transaction, Tk(C) say, wants to read a lower
level data item, x(U) say, it keeps a pointer to the private workspace
of the transaction Ti(U) iff Ti(U) is the most recent transaction in the 
ROLL at U level that wants to write x(U) and the serialization order of
Ti(U) is less than that of Tk(C).  When Ti(U) finishes writing X(U) and
about to commit Tk(C) reads the value of X(U).  If Ti(U) aborts then
Tk(C) moves its pointer to the private workspace of the transaction which
satisfy the above condition.  If at a given time there is no such Ti(U)
which satisfies the condition, then Tk(C) reads the value of x(U) from
the database.

Next we discuss some issues in Query Processing in Kernelized 
Architecture.  Here we introduce a security policy (like simple
security policy and *-policy) which we call Primary Key-Foreign
Key policy.  

P-F Key Policy: A subject should never be allowed to reference up.

Let us consider two relations R and S where relation R has the
primary key and relation S has the foreign key.  If the two 
relations are at two different levels then relation R must be at
a lower level than relation S, otherwise it would violate the P-F Key
policy.  Now we consider inserts and deletes into both the relations.

  Insert into relation S: 
	  We must check relation R to make sure that the value at the
	  foreign key field of the tuple being inserted exists in R.
	  Since R is at a lower level there is no problem in doing so.
  
  Delete from relation S:
         No problem.

  Insert into relarion R:
	 No problem.

  Delte form relation R:
	 We can not check the relation S for referential integrity
	 constrains since S is at a higher level.  Because in doing so
	 if we find out that it would violate the referential integrity
	 constraint and hence reject the deletion, it would confirm
	 the user at lower level that the value exists at the higher
	 level.  So we would let the deletion occur.  Then we check for
	 the matching foreign key values in relation S and if any found
	 we would delete those tuples from S.
 

SECTION 2:
A system which enforces Bell-La Padula security policies (i.e., simple
securiy policy and star policy) is called BL-Secure.  This is the minimum
requirement for a secure system.  In addition to this some systems are
also free from a certain class of covert channels, (i.e., those caused
by contention for data).  These kind of systems are called DC-Secure.  A 
system which is free from all kind of covert channels, is called Covert
Channel Secure or CC-Secure.  The following figure illustrates these
three classifications.

		 ,-------------------------,
		 |        BL-Secure        | 
                 |  ,-------------------,  |
		 |  |     DC-Secure     |  | 
                 |  |  ,-------------,  |  |
		 |  |  |  CC-Secure  |  |  |
                 |  |  `-------------'  |  |
                 |  `-------------------'  |
		 `-------------------------'

There are two main kinds of covert channels: storage channels and timing
channels.  A storage channel signals information by indicating the presence 
of a stored object such as a file etc. where as a timing channel does so
by modulating an observable delay.  Besides these there are other kinds of
channels such as inference channels, aggregations channels etc.  In covert
channel the information flows from one process to another.  Where as in 
other kinds of channels only one subject gets the information to which
he does not have access to.  An example of inference channel is given
below.  
Consider an relation EMPLOYEE with an attribute NAME  and another attribute
SALARY  and both are leveled as unclassified.  But retrieved together it
becomes secret so that no unclassified user can know the salary of any
particular employee.  Now if an unclassified user knows an employee to
be the president of the company and the fact that the president gets the
highest salary he can find out the salary of the president by retrieving
both name and salary separately.

Woods hole report discusses the near term solutions and long term
requirements for multilevel secure databases.  There are ten different 
architectures out of which three architectures are most significant. 
They are 1) replicated or distributed, 2) Kernelized, and 3) Integrity Lock
architectures.  Further details of these archituctures have been discussed 
in previous notes.  

In kernelized architecture data at different classification levels are
logically separated from each other.  The classification of data could
be at relation level (example: trusted ORACLE), or at attribute level 
(example: ?) or at row level (example: ASD by TRW) or at element level
(example: SeaView by SRI). 
 
 One of the long term requirements for secure databases as given in
 Woods hole report is the dynamic classification of data.  As per a
 paper by some database researchers at Honeywell, the classification
 of data could be either simple (example: name attribute of EMPLOYEE
 relation is unclassified), or content based (example: name of employee
 is secret if the dept. he works in is D15), or context based (example:
 name and salary taken together is secret where as when individually taken 
 both name and salary are unclassified).

 In the above situation, the use of rules can control the access of
 users to the database more easily.  But as constraints start growing,
 the number of rules grow exponentially.  To make the point more clear
 let us consider the above example.  Now if an unclassified user has
 already retrieved the name and social_security_number of employee then
 attribute which might establish the link between the names and salaries)
 should be rejected.  To find this out the history of retrival and the
 rule which must be triggerd according to the history have to be stored.
 Again this exponential growth of rules makes the search time an
 exponential function which ultimately increases the response time.  

 Fragmentation of the relation solves this problem.  In the above
 example salary could be taken off the original relation and kept
 separately as a relation itself with the same classification level
 (unclassified in this case).  Then another relation with name 
 and salary together could be created which could be accessed by the
 higher class users only.

 Open Question:  Is there any classification constraint which could
       not be handled just by fragmentation (horizontal or vertical)
       of data?





These decomposition and recovery algorithms have the following advantages 
over both Lunt's and Jajodia's algorithms.
      
    1) The tuples that belong to entirely one level (i.e., all the
       attributes at the same level) are stored in a relation.
       Hence we can just retrieve them without requiring any join.
  
    2) Unlike Jajodia's algorithm the full primary key need not
       be stored.  
   
    3) The recovery algorithm results in lossless join which is the
       main problem in Lunt's algorithm as pointed out by Jajodia.

    4) The use of DVs help in retrieving the tuples much faster. 
       Sometimes we can do a selection before doing a join in order
       to delete tuples that should not be retrived for a given
       class.

    5) Also it serves all the purposes of decomposition as mentioned
       in Lunt's paper.

   
The Decomposition Algorithm:

  1. Group the Multilevel Relation (MLR) R by the primary key class and
     name each group R'(x) where x is the class of the PK attribute A1.

  2. For each R'(x), do the followings:

     2.1 Create a Domain Vector Table DVT(x).A1 for the PK attribute A1.
     
     2.2 Create the single level relation R(x) = {t | all the attributes
	 of t have the same classification level}.
  
     2.3 Construct a Domain Vector DV_R(x).A1 for the attribute A1.
   
     2.4 Create R''(x) = R'(x) - R(x).  
         R''(x) contains tuples that have PK attribute classification x 
	 and atleast one of the other attribute classification is greater
	 than x.

  3. For each R''(x), where R''(x) is not empty, create the following
     relations.  (Note that for system high R'' will be empty.)

     3.1 Construct the primary key relation R1(x) by projecting the PK
	 attribute A1.

     3.2 Construct a Domain Vector DV_R1(x).A1 for the attribute A1.

     3.3 For each attribute Ai (i not equal to 1) present in R''(x), and
	 for y >= x where y is the classification of attribute Ai, construct
	 attribute relation Ri(x,y)(A1,Ai).  
                 
     Destroy R'(x) and R''(x).


The Recovery Algorithm

    Let y be the clearance of the subject that has submitted the query.

    If y is not system high then do steps 1 to 4.

      1) AND the domain vectors DV_R(y).A1 and DV_R1(y).A1. 
         (This helps in finding the polyinstantiated elements.)

      2) XOR the result with DV_R1(y).A1.
         (This gives the tuples that are not polyinstantiated but have 
         at least one attribute at higher level and hence are not in the 
         relation R(y).  This drops polyinstantiated elements which the 
         low level user is not supposed to know.)

      3) For each 1 bit present in the above resulted vector, using the 
         DV table, find the domain values of the PK attribute. 

      4) Then select those values, gotten from step 3, in relation R1(y)
	 and perform a left outer join among R1(y) R2(y,y), ...,Rn(y,y).
	 Let us denote this result relation as R'_y.

    else denote R'_y = R(y).

      5) Construct relation R_y = R'_y union R(y) 

    For each x


[ Back || Class List || Perrizo's Home || NDSU Home || Next ]


perrizo@plains.nodak.edu