Backend systems revolve around data.

User accounts, payments, orders, logs, analytics, permissions — almost everything a backend service does eventually becomes data stored somewhere.

Because of that, interacting with databases becomes one of the most frequent operations backend engineers perform.

This article explains databases from the perspective of a modern backend developer, focusing on the concepts you actually use in real systems and interviews.

We will cover:

Why databases exist
Disk vs memory storage
What a DBMS does
Relational vs non-relational databases
PostgreSQL fundamentals
Schema design
Table relationships
Migrations
Indexing
Triggers
Writing backend queries

The goal is not to turn you into a database researcher — but to help you understand the 80% of concepts backend engineers actually use daily.

Why Do Databases Exist?

The core reason databases exist is persistence.

Persistence means:

Data continues to exist even after the program that created it stops running.

Imagine a simple todo app.

You add tasks like:

Buy groceries
Finish assignment
Call mom

You close the app and open it again later.

If the tasks are still there, the app uses persistent storage.

Without persistence:

every restart loses data
users lose progress
applications become unusable

Databases solve this problem by storing data durably and reliably.

What Is a Database?

At a high level:

A database is a structured system that stores data and allows operations to manipulate it.

These operations are known as CRUD operations.

CRUD stands for:

Create
Read
Update
Delete

Examples of simple databases include:

phone contact lists
browser local storage
spreadsheets
structured files

However backend systems require more than simple storage.

They require structured access, concurrency control, and consistency guarantees.

That is where Database Management Systems (DBMS) come in.

What Is a DBMS?

A Database Management System (DBMS) is software responsible for managing data.

Examples include:

PostgreSQL
MySQL
SQL Server
MongoDB

A DBMS is responsible for several key tasks.

Data Organization

Data must be stored in a format that allows efficient operations like searching, inserting, and updating.

Data Access

The DBMS provides mechanisms to perform CRUD operations efficiently.

For example:

SELECT * FROM users;

Data Integrity

Integrity means ensuring stored data remains correct and valid.

Example:

If a column represents price, it should only accept numbers.

price INTEGER

Trying to insert text should fail.

INSERT INTO orders(price)
VALUES ('hello');

The database rejects invalid data.

Security

Databases also enforce security through:

authentication
access control
roles
permissions

This protects sensitive data from unauthorized access.

Why Not Store Data in Text Files?

Before modern database systems, developers often stored data in files.

But this approach introduces several major problems.

Parsing Overhead

Suppose customer data is stored in a text file:

Alice,alice@email.com
Bob,bob@email.com

To find a specific user, the backend must:

read the file
split lines
parse fields
compare values

This becomes slow and error-prone as data grows.

No Structure

Text files allow anything.

Example:

price: hello

There is no mechanism enforcing valid structure.

Databases enforce schemas and constraints.

Concurrency Problems

Consider two users modifying the same data simultaneously.

Initial value:

balance = 40

User A adds 20
User B subtracts 20

Possible results:

Correct result should be 40.

Databases implement concurrency control mechanisms to solve this.

Disk Storage vs Memory Storage

Most databases store data on disk, not RAM.

This choice is based on trade-offs.

Storage	Speed	Capacity	Cost
RAM	Very fast	Limited	Expensive
Disk (SSD/HDD)	Slower	Very large	Cheap

Typical systems have:

RAM

8GB
16GB
32GB

Disk storage

512GB
1TB
2TB

Databases choose disk storage for durability and capacity.

Where Does Caching Fit?

Caching systems like Redis store data in RAM.

Typical backend architecture:

Client → Backend → Cache → Database

Cache provides:

faster reads
temporary storage

The database remains the source of truth.

Types of Databases

Two major database categories dominate backend systems.

Relational Databases

Relational databases organize data into tables.

Example:

Users table

id	name	email
1	Alice	alice@email.com
2	Bob	bob@email.com

Orders table

id	user_id	product
1	1	Laptop
2	2	Phone

The relationship:

users.id → orders.user_id

Relational databases enforce:

strict schemas
structured relationships
data integrity

Examples:

PostgreSQL
MySQL
SQL Server

Non-Relational Databases (NoSQL)

NoSQL databases allow flexible schemas.

Example MongoDB document:

{
  "name": "Alice",
  "email": "alice@email.com",
  "skills": ["backend", "devops"]
}

Another document in the same collection:

{
  "name": "Bob",
  "age": 22
}

Structure can vary.

Examples include:

MongoDB
Cassandra
DynamoDB

Choosing Between SQL and NoSQL

Relational databases are ideal when:

relationships between data are important
data integrity matters
complex queries are required

Example: CRM systems.

NoSQL databases are ideal when:

schema is flexible
data structure changes frequently
rapid prototyping is required

Example: content management systems.

Why PostgreSQL Is Often the Best Choice

PostgreSQL is one of the most popular databases in modern backend systems.

Reasons include:

Open Source

Completely free and community maintained.

SQL Standard Compliance

Queries follow standard SQL, making migrations easier.

Extensibility

Postgres supports many extensions and advanced features.

Reliability

Widely used in production environments.

Strong JSON Support

Postgres supports JSON and JSONB.

Example:

CREATE TABLE articles (
  id UUID,
  content JSONB
);

This allows flexible document storage inside relational tables.

Important PostgreSQL Data Types

Integer Types

SMALLINT
INTEGER
BIGINT

Choose based on required numeric range.

Decimal vs Floating Point

Use decimal for precise values such as money.

price DECIMAL(10,2)

Floating point numbers can introduce precision issues.

String Types

CHAR

Fixed length strings.

Rarely used today.

VARCHAR

Variable length strings.

VARCHAR(255)

TEXT

Unlimited string storage.

PostgreSQL documentation recommends using TEXT in most cases.

Boolean

BOOLEAN

Values:

TRUE
FALSE

Date and Time

DATE
TIME
TIMESTAMP
TIMESTAMP WITH TIME ZONE

UUID

Universally unique identifier.

Example:

550e8400-e29b-41d4-a716-446655440000

Often used as primary keys.

JSON vs JSONB

JSON
JSONB

JSONB stores data in optimized binary form and is usually preferred.

Database Schema Design Example

Consider a project management platform.

Main entities:

users
projects
tasks

Primary Keys

Each table needs a unique identifier.

Example:

id UUID PRIMARY KEY DEFAULT gen_random_uuid()

Primary keys guarantee:

uniqueness
non-null values

One-to-One Relationships

Example:

User → Profile

Tables:

users
user_profiles

Profiles store additional user information.

This avoids cluttering the main users table.

One-to-Many Relationships

Example:

Project → Tasks

One project can have multiple tasks.

Implementation:

project_id REFERENCES projects(id)

Many-to-Many Relationships

Example:

Users ↔ Projects

Solution: linking table

project_members

Columns:

project_id
user_id

Composite primary key:

PRIMARY KEY (project_id, user_id)

Database Migrations

Production databases should never be modified manually.

Instead we use migrations.

Example folder structure:

db/
  migrations/
    001_create_users.sql
    002_create_projects.sql

Migration tools include:

dbmate
Flyway
golang-migrate

Up Migrations

Apply schema changes.

Example:

CREATE TABLE users (...);

Down Migrations

Rollback schema changes.

Example:

DROP TABLE users;

Seeding Test Data

During development we insert sample data.

Example:

INSERT INTO users(email, name)
VALUES
('alice@email.com','Alice'),
('bob@email.com','Bob');

This is called database seeding.

Writing Backend Queries

Example API:

GET /users

SQL query:

SELECT u.*, to_jsonb(up.*) AS profile
FROM users u
LEFT JOIN user_profiles up
ON u.id = up.user_id
ORDER BY u.created_at DESC;

This returns users along with profile data.

Parameterized Queries

Never construct SQL queries using string concatenation.

Bad example:

query = "SELECT * FROM users WHERE id = " + userInput

This enables SQL injection attacks.

Correct approach:

SELECT * FROM users
WHERE id = $1;

The parameter value is passed separately.

Database Indexing

As tables grow larger, query performance becomes critical.

Without indexes, databases must scan every row.

This is called a sequential scan.

Example query:

SELECT * FROM users
WHERE email = 'alice@email.com';

Without an index, the database checks every row.

With millions of rows this becomes extremely slow.

What an Index Does

An index acts like the index section of a book.

Instead of scanning every row, the database maintains a lookup structure mapping values to their storage locations.

Creating an index:

CREATE INDEX idx_users_email
ON users(email);

Now when searching by email, the database can jump directly to the correct rows.

When to Create Indexes

Indexes are useful when columns appear frequently in:

WHERE clauses

Example:

SELECT * FROM users
WHERE email = 'alice@email.com';

JOIN conditions

Example:

SELECT *
FROM tasks t
JOIN projects p
ON t.project_id = p.id;

ORDER BY sorting

Example:

SELECT *
FROM users
ORDER BY created_at DESC;

Creating an index for this query:

CREATE INDEX idx_users_created_at
ON users(created_at DESC);

Index Trade-Offs

Indexes improve read performance, but they introduce overhead.

Every time data changes:

INSERT
UPDATE
DELETE

The index must also be updated.

Therefore indexes should only be created for frequently queried columns.

Database Triggers

Triggers allow automatic actions when database events occur.

Example: automatically updating updated_at.

Trigger function:

CREATE FUNCTION update_timestamp()
RETURNS TRIGGER AS $$
BEGIN
  NEW.updated_at = NOW();
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

Trigger definition:

CREATE TRIGGER update_user_timestamp
BEFORE UPDATE ON users
FOR EACH ROW
EXECUTE FUNCTION update_timestamp();

Now every update automatically refreshes the timestamp.

What Backend Engineers Actually Do With Databases

Most backend database work involves:

Designing schemas
Writing migrations
Creating indexes
Writing SQL queries
Maintaining data integrity

Typical backend flow:

API request
     ↓
Validate input
     ↓
Construct SQL query
     ↓
Execute query
     ↓
Return JSON response

Backend Database Interview Questions

What is persistence?

Persistence means storing data so it survives application restarts.

What is a primary key?

A unique identifier for each row in a table.

What is a foreign key?

A field referencing another table's primary key.

What is indexing?

Indexing creates a lookup structure that allows databases to locate rows quickly without scanning entire tables.

What are parameterized queries?

Queries where values are passed separately to prevent SQL injection attacks.

This is part of series Backend First Principles. Next: Caching.

Command Palette

Why Do Databases Exist?

What Is a Database?

What Is a DBMS?

Data Organization

Data Access

Data Integrity

Security

Why Not Store Data in Text Files?

Parsing Overhead

No Structure

Concurrency Problems

Disk Storage vs Memory Storage

Where Does Caching Fit?

Types of Databases

Relational Databases

Non-Relational Databases (NoSQL)

Choosing Between SQL and NoSQL

Why PostgreSQL Is Often the Best Choice

Open Source

SQL Standard Compliance

Extensibility

Reliability

Strong JSON Support

Important PostgreSQL Data Types

Integer Types

Decimal vs Floating Point

String Types

CHAR

VARCHAR

TEXT

Boolean

Date and Time

UUID

JSON vs JSONB

Database Schema Design Example

Primary Keys

One-to-One Relationships

One-to-Many Relationships

Many-to-Many Relationships

Database Migrations

Up Migrations

Down Migrations

Seeding Test Data

Writing Backend Queries

Parameterized Queries

Database Indexing

What an Index Does

When to Create Indexes

WHERE clauses

JOIN conditions

ORDER BY sorting

Index Trade-Offs

Database Triggers

What Backend Engineers Actually Do With Databases

Backend Database Interview Questions

What is persistence?

What is a primary key?

What is a foreign key?

What is indexing?

What are parameterized queries?

Comments

Backend First Principles

REST API Design: Stop Guessing, Start Following Patterns

More from this blog