1. Relation vs Table
| INFORMAL TERMS | FORMAL TERMS |
| Table | Relation |
| Column | Attribute |
| Data type | Domain |
| Row | Tuple |
| Table definition | Relation schema |
2. Keys
- A subset of the attributes of a relation schema R is a superkey if it uniquely identifies any tuple in r(R).
- A superkey K is called a candidate key if no proper subset of K is a superkey. That is, if you take any of the attributes out of K , then it is not enough to uniquely identify tuples.
- The primary key is chosen from the candidate keys and the primary key is one of the candidate keys.
3. Constrains
- Domain constraints: every value in a tuple must be from the domain of its attribute.
- Key constraints: a bunch of keys (superkey, candidate key and primary key).
- Entity integrity constraints: no primary key value can be NULL.
- Referential integrity constraints: a foreign key in one table must have a matching value in the primary key of another table.
4. SQL
4. Database Design – Four Phases
- Requirements Collection and Analysis
- Conceptual Design
Entity-Relationship Model - Logical Design
From Entity-Relationship Model to Relation Schemas - Physical Design
5. Entity-Relationship (ER) Model





5. Functional Dependency
Let R be a relation schema.
- A FD on R is an expression X → Y with attribute sets X,Y ⊆ R.
- A relation r(R) satisfies X → Y on R if, for any two tuples
t1,t2 ∈ r(R), whenever the tuples t1 and t2 coincide on values of X, they also coincide on values of Y .
t1[X] = t2[X]
⇓
t1[Y] = t2[Y]
6. Implied Functional Dependencies


7. Minimal Cover




8. Finding Keys


9. Normalisation
9.1 BCNF
A relation schema R is in BCNF if whenever a non-trivial FD X → A holds in R, then X is a superkey. When a relation schema is in BCNF, all data redundancy based on functional dependency are removed.
Algorithm for a BCNF-decomposition
Input: a relation schema R′ and a set Σ of FDs on R′.
Output: a set S of relation schemas in BCNF, each having a set of FDs
- Start with S = {R′};
- Do the following for each R ∈ S iteratively until no changes on S:
- Find a (non-trivial) FD X → Y on R that violates BCNF, if any;
- Replace R in S by two relation schemas XY and (R − Y ) and project the FDs to these two relation schemas.
9.1.2 3NF
A relation schema R is in 3NF if whenever a non-trivial FD X → A holds in R, then X is a superkey or A is a prime attribute.
Algorithm for a dependency-preserving and lossless 3NF-decomposition
Input: a relation schema R and a set Σ of FDs on R.
Output: a set S of relation schemas in 3NF, each having a set of FDs
- Compute a minimal cover Σ′ for Σ and start with S = φ
- Group FDs in Σ′ by their left-hand-side attribue sets
- For each distinct left-hand-side Xi of FDs in Σ′ that includes Xi → A1,Xi → A2,…,Xi → Ak:
- Add Ri =Xi ∪{A1}∪{A2}···∪{Ak} to S
- Remove all redundant ones from S (i.e., remove Ri if Ri ⊆ Rj )
- If S does not contain a superkey of R, add a key of R as R0 into S.
- Project the FDs in Σ′ onto each relation schema in S
10. Relational Algebra

11. Database Security
Confidentiality
– E.g. enforced by access control mechanisms Integrity
– E.g. enforced by access control mechanisms and integrity constraints specified on schemas
Availability
– E.g. enforced by recovery and concurrency control mechanisms
Some further services
Encryption: to protect data when being transmitted across systems and when being stored on secondary storage
Query authentication: to ensure a query result is correct by using signature mechanisms and data structures
12. Access Control Mechanisms
- Discretionary access control (DAC)
- Grant privileges
- Revoke privileges
- Delegate privileges
- Mandatory access control (MAC)
- It is based the Bell-LaPadula model (originally developed for U.S. Department of Defense multilevel security policy).
- Subjects (e.g. users) are assigned security clearances;
- Objects (e.g. rows, tables, views) are assigned security classes.
- TS ≥ S ≥ C ≥ U (TS: Top secret, S: Secret, C: Confidential, U: Unclassified)
- Two MAC rules are enforced by the model:
- Subject X can read object Y only if clearance(X ) ≥ class(Y ). ,→ Read down
- Subject X can write object Y only if clearance(X ) ≤ class(Y ). ,→ Write up
- It is based the Bell-LaPadula model (originally developed for U.S. Department of Defense multilevel security policy).
- Role-based access control (RBAC)
13. Transactions
A transaction is a sequence of database operations grouped together for execution as a logic unit in a DBMS.

14. Locking
Two main types of locks:
- Read lock for reading an object by a transaction
- Write lock for writing an object by a transaction
Lock compatibility:
| Lock type | read-lock | write-lock |
| read-lock | Yes | No |
| write-lock | No | No |
14.1 Two-Phase Locking (2PL) Protocol
Locks are handled in two phases:
- Expanding: locks are acquired and no locks are released.
- Shrinking: locks are released and no locks are acquired.
Cons:
- 2PL may be subject to deadlocks, i.e., the mutual blocking of two or more transactions.
- 2PL can radically limit interleaving among transactions in some cases …
Pros:
2PL makes interleaving safe, i.e., guarantee the serializability property for transactions.
- Serializability means that a resulting database state is equal to a database state of running transactions serially.
- Serializability is the major correctness criterion for concurrent transactions.

15. NoSQL
Main Categories of NoSQL Solutions
- Key-value data stores
– Dynamo: Amazon’s Highly Available Key-value Store, DeCandia & et al, SOSP 2007. - Column-oriented data stores
– Bigtable: a distributed storage system for structured data, Chang & et al, OSDI 2006. - Document-oriented data stores
– MongoDB: http://mongodb.org, including source code and documentation. - Graph databases
– Neo4j: http://www.neo4j.org, including source code and documentation.
Other non-relational databases (not usually considered as NoSQL databases) include native XML databases, object databases, etc.
