Post

Databases Overview

The foundational knowledge stems from the NYU’s “CSCI-UA 479 Data Management and Analysis” course by Joe Versoza.

Databases Overview

I also refered Oracle article to understand database basic.

What is Database?

A database is like a well-organized collection of information stored on a computer. It’s managed by a computer program called a database management system (DBMS). When you hear about a database system, it means the data, the DBMS, and the applications connected to them. Usually, data in today’s common databases is organized like a table with rows and columns, making it easy to work with. You can do things like find, change, or organize the data easily. Most databases use something called structured query language (SQL) for talking to and getting information from the database.

What is a database management system (DBMS)?

A database usually needs a special computer program called a database management system (DBMS). This program acts like a bridge between the database and the people or other programs that use it. It lets users get, change, and organize information in the database. The DBMS also helps in looking after and controlling the database, allowing things like watching how well it’s working, making it work better, and keeping copies of the data safe in case something goes wrong.

The computer asking for the "**service**” is known as the “client,” and the "service" is given by a server. The server can be:

  1. a single, centralized server,
  2. a group of servers in the same physical location, or
  3. several distributed servers in different physical locations.

The server must be “on” in order for data to be accessed.

Types of databases

Relational databases organize data in a collection of tables (relations). Within a relational database, data is structured into tables with columns and rows. This technology stands out for offering an efficient and flexible means of accessing organized information.

Each table row is assigned a primary key, serving to uniquely identify that specific row. The relationship between rows in one table and those in other tables is established by adding a column containing the primary key of the related table. This specific column is termed a foreign key.

NoSQL databases, also known as nonrelational databases, enable the storage and manipulation of unstructured and semistructured data. This is in contrast to relational databases, which dictate the composition of all inserted data. The rise in popularity of NoSQL databases coincided with the increasing prevalence and complexity of web applications.

Transaction

A transaction in a Database Management System (DBMS) represents a single logical operation or unit of work. This unit of work can involve multiple create, read, update, and delete operations (CRUD). For instance, within a single transaction, one might read a value from a specific record in one table and subsequently update another record in a different table using the value obtained from the initial read operation.

To ensure the database’s consistency and proper recovery from crashes or failures, maintaining fault tolerance, and preserving isolation between concurrent accesses, transactions must adhere to the ACID properties, which require them to be atomic, consistent, isolated, and durable.

  • Atomicity: Each transaction (series of operations in a transactins) is all or nothing
  • Consistency: Every transaction ensures that the resulting database state is valid (goes from one valid state to another)
  • Isolation: A failed transaction should have no effect on other transactions (even if the transactions are concurrent)
  • Durability: Once a transaction / operation is done, the result will remain persistent even through crash, power loss, etc.

What is Structured Query Language (SQL)?

SQL, utilized by almost all relational databases, serves as a programming language for querying, manipulating, and defining data, as well as facilitating access control.

It is a declarative programming language where your program specifies what you want, emphasizing the desired outcome instead of detailing the algorithm or how to achieve it.

1
2
3
4
5
# we're describing _how_ in this Python code
oldies = []
for user in users:
	if user.birthday > '1990-01-01':
		oldies.append(user)
1
2
3
4
--we're explaining _what_ in this SQL query
SELECT first, last 
	FROM user 
	WHERE birthday > '1990-01-01';