Database Security: Security and Privacy Concerns

/* The following article is extracted from the "Information Security Newsletter" published by the JUCC IS Task Force. */ 
In universities, some data have to be shared with students, staff, guests whereas others have to remain protected. At the same time, the number of data leak and unauthorized exposure to sensitive data increases dramatically. Therefore, concern for database security surges. This paper will expose the database management system; the encountered risks and finally focus on the security. 
Database Management System
A database is a complex structure that stores data for one or more applications usage. It also contains the relationship between the different data items. Sensitive data requiring specific security can also be stored inside.
In order to construct and maintain the database, a database management system (DBMS) has been developed1. It is an interface between end users/application programs and the database.
Figure 1 provides a simplified block diagram of DBMS architecture. A Data Definition Language (DDL) is used by administrators to define the database logical structure and procedural properties which are stored in Database description tables. The latter are used by the DBMS to manage the physical database. In order to support end user and application developers’ queries, a Data Manipulation Language (DML) and query languages have been designed. A transaction manager and a file manager provide a key link between the DBMS and the database interface. The DBMS is supported by two other tables: Concurrent access tables which prevent simultaneous and conflicting command from being executed, and Authorization tables that ensure that the user query execution is authorized.
Unlike operating system security that controls access right only for an entire file, DBMS can specify the access right of a particular record. Moreover, access right are not limited to read or write but concern many commands as INSERT, DELETE, SELECT.
Database is comprised of tables. Each table consists of Columns also known as Attributes or Fields that hold the data type and Rows also known as Tuples or Records that contain the value.
Comparison of the main DBMS: MySQL, Microsoft SQL Server, IBM DB2, Oracle3
MySQL has no CPU number limit, and available on almost all OS. It is free except in case of Support subscription which costs between $600 and $6000. This one is highly scalable.
Microsoft SQL server (Express) has a free version and a commercial one from $600 to $10,000. It supports for Windows only. The express version is limited to one CPU and a single user thread.
IBM DB2 is free for noncommercial use and $6,000 per CPU with unlimited users. The user interface is easy but support is limited. A more efficient support can be provided for additional cost.
Oracle can be download free for non-profit, noncommercial use and $15,000 for commercial use. It is available on almost all OS. Support and Service are efficient but user interface is weak and the installation and maintenance process is complex.
100 top universities hacked by SQL injection6 
In October 2012, Team GhostShell, an hacktivist group, stole 120,000 records of personal data thanks to SQL injections. These records belonged to top level universities as Michigan University, Harvard, Princeton, Cambridge and Imperial College of London … Among the 120,000 records, tens of thousands student and staff names were revealed as well as more than 35,000 e-mail addresses.
These attacks are part of a project called “Project WestWind” that aims at exposing vulnerabilities in education networks. They also declared that many infiltrated systems were already infected by malware.
Sydney University Data Breach9
On February 2nd 2015, the Online Recruitment System for Economic Experiments (ORSEE) of Sydney security was illegally accessed. Only 8 days later, the Security team succeeded in disabling the system. Personal information of 5,000 students was stored in this application. The main issue was that ORSEE was an open-source therefore, anyone can read the code. The hacker noticed SQL injection vulnerability in the code. This issue remained unpatched from 2004 and the vulnerability was detected in 2013 but they didn’t fix it immediately. 
SQL injection detection tool: Snort 11
Snort is a tool which can analyze real-time traffic and packet logging. This open-source software is a network-based intrusion detection system (NIDS). It can also be used as a prevention system. There are three main modes: Sniffer mode in which network packets are read and displayed on the console, Packet logger mode which logs packets to the disk, and Intrusion detection mode. The latter monitors and analyses network traffic in comparison to a rule set previously designed by the user. Depending on the result a specific action will occur.
This system has been downloaded 4 million times and almost 500,000 users are registered.
In order to configure it properly, IT security experience is essential.  

Ideally, all values of one column are unique and form an identifier/key. In relational database structure, this unique identifier will be present in all tables and will gather together the different tables4. This unique identifier is the primary key of the table. To link two tables, the primary key attributes of the first table must appear in second one and are referred to as a foreign key. Unlike primary key values, a foreign key value can appear several times in a table.
A view is a virtual table consisting of selected rows and columns from one or more tables. It doesn’t actually exist in the database, but is created at the time of the request. From a security point of view, a view can limit access to a relational database by hiding parts of the database from certain users.
In order to query and manipulate data in relational database, a basic syntax language has been created: Structure Query Language (SQL).
Database Vulnerability: SQL Injection
Databases are increasingly exposed to security threats.
SQL injection
According to the Imperva Web Application Attack Report, SQL injection attacks (SQLi) were the prevalent attacks in 20145. This attack takes advantages of web applications coding weaknesses. In fact, if SQL statements are created dynamically using user input, hackers can use the same input form to inject SQL commands. This attack is really easy to realize and the consequences are severe. They can have access to data registered inside as well as modify or delete them. All websites including forms are likely to be attacked by SQL injection.
For example, if a website is protected by a login and a password, an evil person can enter in the username form a string like ‘OR 1=1 --, then he will gain the access to the protected website without having a valid password or username.
Before the injection, the SQL code was close to: 
1=1 is always true and the comment mark - - will terminate prematurely the text string and the following text will be ignored by the server at execution time. Thus the password will not be checked. 
The main goal of this attack is to extract a large amount of data. The solution against SQLi is input validation.
SQLi attacks avenues
SQLi can be categorized according to the type and avenue of attack7.
User input: attackers use the user input form to inject SQL commands.
Server variables: attackers will directly change the values placed in HTTP and network headers.
Second-order injection: the input that will be the cause of the attack will not come from the hacker but directly from the system itself. This can be done if the attacker can rely on data already present in the database to trigger his SQL attack.
Cookies: in case of SQL query based on cookie’s content, cookies alteration could modify the structure and function of the query.
Physical user input: attack is constructed outside the realm of web requests through RFID tags, barcodes…
​In Band
An in band attack uses the same communication channel for injecting SQL code and retrieving results. Different kinds of attacks belong to this category. Tautology attacks inject code in conditional statement to turn them into true statements. Another technique is to add an end of line comment ‘- -‘ at the end of an injection to inhibit the execution of the following code. Piggy-backing is a technique that allows query to pass through security by adding queries beyond the legitimate one.
Unlike in band attacks, out-of-band attacks use different channels to inject and retrieve data. This technique is useful in case of limitations on information retrieval but the outgoing connectivity from the database is weak.
In the case of inferential attack, there is no data transfer8. The attacker sends particular requests and observes the result in order to retrieve the database information.
He will send illegal or logically incorrect queries to gather information about the type and the structure of the database. This attack is based on the error message returned by the application server.
If the system is sufficiently protected not to reveal error message, the attacker can also send blind SQL injection which consists of sending true or false questions to the server. Considering no error message is returned, a delay has to be added to check the answer.​

If the answer is false the page will upload quickly whereas in the right case a delay of ten seconds will precede the new page loading. This technique is time consuming because each entry has to be checked symbol by symbol. For instance, assuming a binary scheme based on the Ascii code is used for each request, 7 requests will be needed to guess one character.

Illustration of the inference problem:
The University database presents personal information. Individually, each column is available to anyone but the relationship between Name and Grades is restricted to the Administrator. A solution is to build two views define in SQL as follows:

A user, who knows the structure of the University table and that the view tables maintain the same row order as the University table, is then able to merge the two views to construct the table shown in Figure 2.

Moreover, a student’s course is an easy discoverable attribute and a user could partially infer the students’ name from the V2. This violates the access control policy that the relationship of attributes Name and Grades must not be disclosed.


The main goals of database security are to prevent unauthorized access to data, to prevent unauthorized alteration or modification of data, and to insure data availability.
Database access control

Limiting data access is primary in database security. This requires three mechanisms: authentication, authorization and access control. Authentication usually results in a username and a password to get access to the system. Then some privileges are given to authorize students to access specific resources. When the system has authenticated each user, the access control system has the capability to control access to portions of the database. The DBMS propose different kinds of administrative policies depending on people authorized to concede and abrogate table access rights thanks to the two SQL commands GRANT and REVOKE10.
Centralized administration: restricted to few privileged users.
Ownership-based administration: restricted to the creator of the table. 
Decentralized administration: as owner-ship-based administration but furthermore, the owner can also allow people to grant and revoke access rights.
Moreover, Role Based Access Control (RBAC) associates privileges to a role. Each user assigned to a role is given the associated privileges. The SQL code to authorize an action to a role category is:

In addition to this access control, access rights that vary from one person to another. In fact, access right may concern the entire database, individual tables or selected rows and columns. It may also be determined on the content, i.e. information upon a certain value is not available. During the database development stage, a low-privilege account should be created first and then, if necessary, some permission could be added.
SQLi countermeasures
Many approaches to fix SQLi attacks issue have been proposed. They can be divided into three categories: Defensive coding, Detection and Run-time prevention12.
Defensive coding: Insecure coding practices favor SQLi attacks success. Insufficient input validation is a common vulnerability, checking its type, size or content and performing pattern matching to distinguish normal from abnormal input can reduce the number of forcing errors attacks in the DBMS. Using prepared statement to pass the input value parameter separately will prevent attackers from modifying query structure. Special characters as -- ; */ /* xp_ ‘ should be rejected. In fact, they can modify the database structure. Moreover, acceptable answers should be white-listed, instead of using black listing to filter unwanted answer. Eventually, SQL DoM (Domain Object model) provides a safe access to databases via encapsulation technique. Instead of using string concatenation, the query-building process will use a type-checked API. Input filtering and type checking of input will be applied within the API.​​
Detection: In case of an attack, it is important to be able to detect it. The signature based technique attempts to match specific attack patterns. In fact, a database of attack input string (signature) is created. Then the input is compared with the different signatures at runtime. Detection can also apply on abnormal behavior. This technique has to be preceded by a training phase to allow the system to define normal behavior. Therefore, a test suite generates a high range of SQLi attacks in order to detect vulnerabilities. Finally, database auditing can investigate suspicious activities by monitoring and gathering information on database users’ activities.
Runtime prevention techniques examine queries structure before and after user input. It accepts only legal queries stored in a master file. This file needs to be regularly updated. Others are rejected and reported. Developers can define a threshold to authorize queries that only partially matched. The choice of an incorrect threshold could compromise the database security. Moreover, a monitoring system can be deployed between the application server and the database server in order to intercept the queries.
Inference detection
Inference can be detected during the database design or at query time13.
The first one requires database structure modifications or an enhancement of access control. Even if this technique is easier to implement, it generally leads to a reduction of availability thanks to unnecessary access control.
The second one rejects or modifies suspicious inferential queries in order to avoid security violations.
Even if design-time is easier to implement, query-time insure more data availability.
As most valuable resources are generally stocked in database, this one is protected by multiple layers of security (firewalls, authentication mechanism, access control systems). For very sensitive data, encryption can be added; therefore, it is the last line of defense in database security. Encryption can be applied to the entire database or dedicated to a specific part from the record level until the individual field.
However, there are two disadvantages to database encryption: 
Firstly, to get access to encrypted data, users need​ to detain the key. As databases are designed to offer wide access, key providing is a complex task. 
Secondly, database encryption reduces availability and thus research of record is more difficult to perform.
To ensure high database security, encryption of the entire database and restricted key access could sound as a good idea. However, this solution is inflexible. In fact, to access to a single data, user will have to download the entire database, decrypt it and finally look for the desired data. 
Increasing amounts of sensitive data are being retained in database. Moreover, most of these data are available from the Internet. Consequently, the amount of database threats grows as well. Database security has become a major issue and has to be considered from design, by using defensive coding and followed up until query-time. Moreover, the security of the database is not enough, all the surrounding entity should be taken into account as database access applications and communication channel.​ 
  1. ​“Database Management System” January 2015 Web. 29 June 2015
  2. “Computer Security Principles and Practice” W. Stallings and L. Brown (2nd Edition) pdf 2011 pdf 30 June 2015
  3. “Comparison of relational database management systems” 13 July 2015 Web. 15 july 2015
  4. “Database keys” June 2015 Web. 29 June 2015
  5. “Web Application Attack Report #5” Imperva October 2014 pdf 9 July 2015
  6. “Ghost shell leaks 120000 records from top 100 universities” 2 October 2015 Web. 14 July 2015
  7. “A Classification of SQL injection Attacks and Countermeasures” by W. G.J. Halfond and al. 2006 pdf 29 June 2015
  8. “Inference Attacks and Control on Database Structures” by M. Turkanovic and al. 2000 pdf 29 June 2015
  9. “16 year old claims to be behind Sydney University data breach” 13 February 2015 Web. 14 July 2015
  10. “Database Access Control Policies” by A. Patil and Prof. B. B. Meshran 2012 pdf 30 June 2015
  11. “Snort” 14 July 2015 Web. 15 July 2015
  12. “A Survey of SQL Injection Countermeasures” by Dr R.P.Mahapatra and S. Khan 2012 pdf 29 June 2015
  13. “Inference attack” Web. 29 June 2015​