Skip to main content

Command Palette

Search for a command to run...

Understanding Databases Like a Backend Engineer (PostgreSQL Edition)

Updated
11 min read

Backend systems revolve around data.

User accounts, payments, orders, logs, analytics, permissions — almost everything a backend service does eventually becomes data stored somewhere.

Because of that, interacting with databases becomes one of the most frequent operations backend engineers perform.

This article explains databases from the perspective of a modern backend developer, focusing on the concepts you actually use in real systems and interviews.

We will cover:

  • Why databases exist

  • Disk vs memory storage

  • What a DBMS does

  • Relational vs non-relational databases

  • PostgreSQL fundamentals

  • Schema design

  • Table relationships

  • Migrations

  • Indexing

  • Triggers

  • Writing backend queries

The goal is not to turn you into a database researcher — but to help you understand the 80% of concepts backend engineers actually use daily.


Why Do Databases Exist?

The core reason databases exist is persistence.

Persistence means:

Data continues to exist even after the program that created it stops running.

Imagine a simple todo app.

You add tasks like:

  • Buy groceries

  • Finish assignment

  • Call mom

You close the app and open it again later.

If the tasks are still there, the app uses persistent storage.

Without persistence:

  • every restart loses data

  • users lose progress

  • applications become unusable

Databases solve this problem by storing data durably and reliably.


What Is a Database?

At a high level:

A database is a structured system that stores data and allows operations to manipulate it.

These operations are known as CRUD operations.

CRUD stands for:

  • Create

  • Read

  • Update

  • Delete

Examples of simple databases include:

  • phone contact lists

  • browser local storage

  • spreadsheets

  • structured files

However backend systems require more than simple storage.

They require structured access, concurrency control, and consistency guarantees.

That is where Database Management Systems (DBMS) come in.


What Is a DBMS?

A Database Management System (DBMS) is software responsible for managing data.

Examples include:

  • PostgreSQL

  • MySQL

  • SQL Server

  • MongoDB

A DBMS is responsible for several key tasks.


Data Organization

Data must be stored in a format that allows efficient operations like searching, inserting, and updating.


Data Access

The DBMS provides mechanisms to perform CRUD operations efficiently.

For example:

SELECT * FROM users;

Data Integrity

Integrity means ensuring stored data remains correct and valid.

Example:

If a column represents price, it should only accept numbers.

price INTEGER

Trying to insert text should fail.

INSERT INTO orders(price)
VALUES ('hello');

The database rejects invalid data.


Security

Databases also enforce security through:

  • authentication

  • access control

  • roles

  • permissions

This protects sensitive data from unauthorized access.


Why Not Store Data in Text Files?

Before modern database systems, developers often stored data in files.

But this approach introduces several major problems.


Parsing Overhead

Suppose customer data is stored in a text file:

Alice,alice@email.com
Bob,bob@email.com

To find a specific user, the backend must:

  1. read the file

  2. split lines

  3. parse fields

  4. compare values

This becomes slow and error-prone as data grows.


No Structure

Text files allow anything.

Example:

price: hello

There is no mechanism enforcing valid structure.

Databases enforce schemas and constraints.


Concurrency Problems

Consider two users modifying the same data simultaneously.

Initial value:

balance = 40

User A adds 20
User B subtracts 20

Possible results:

  • 60

  • 20

Correct result should be 40.

Databases implement concurrency control mechanisms to solve this.


Disk Storage vs Memory Storage

Most databases store data on disk, not RAM.

This choice is based on trade-offs.

Storage Speed Capacity Cost
RAM Very fast Limited Expensive
Disk (SSD/HDD) Slower Very large Cheap

Typical systems have:

RAM

  • 8GB

  • 16GB

  • 32GB

Disk storage

  • 512GB

  • 1TB

  • 2TB

Databases choose disk storage for durability and capacity.


Where Does Caching Fit?

Caching systems like Redis store data in RAM.

Typical backend architecture:

Client → Backend → Cache → Database

Cache provides:

  • faster reads

  • temporary storage

The database remains the source of truth.


Types of Databases

Two major database categories dominate backend systems.


Relational Databases

Relational databases organize data into tables.

Example:

Users table

id name email
1 Alice alice@email.com
2 Bob bob@email.com

Orders table

id user_id product
1 1 Laptop
2 2 Phone

The relationship:

users.id → orders.user_id

Relational databases enforce:

  • strict schemas

  • structured relationships

  • data integrity

Examples:

  • PostgreSQL

  • MySQL

  • SQL Server


Non-Relational Databases (NoSQL)

NoSQL databases allow flexible schemas.

Example MongoDB document:

{
  "name": "Alice",
  "email": "alice@email.com",
  "skills": ["backend", "devops"]
}

Another document in the same collection:

{
  "name": "Bob",
  "age": 22
}

Structure can vary.

Examples include:

  • MongoDB

  • Cassandra

  • DynamoDB


Choosing Between SQL and NoSQL

Relational databases are ideal when:

  • relationships between data are important

  • data integrity matters

  • complex queries are required

Example: CRM systems.

NoSQL databases are ideal when:

  • schema is flexible

  • data structure changes frequently

  • rapid prototyping is required

Example: content management systems.


Why PostgreSQL Is Often the Best Choice

PostgreSQL is one of the most popular databases in modern backend systems.

Reasons include:

Open Source

Completely free and community maintained.


SQL Standard Compliance

Queries follow standard SQL, making migrations easier.


Extensibility

Postgres supports many extensions and advanced features.


Reliability

Widely used in production environments.


Strong JSON Support

Postgres supports JSON and JSONB.

Example:

CREATE TABLE articles (
  id UUID,
  content JSONB
);

This allows flexible document storage inside relational tables.


Important PostgreSQL Data Types


Integer Types

SMALLINT
INTEGER
BIGINT

Choose based on required numeric range.


Decimal vs Floating Point

Use decimal for precise values such as money.

price DECIMAL(10,2)

Floating point numbers can introduce precision issues.


String Types

CHAR

Fixed length strings.

Rarely used today.


VARCHAR

Variable length strings.

VARCHAR(255)

TEXT

Unlimited string storage.

PostgreSQL documentation recommends using TEXT in most cases.


Boolean

BOOLEAN

Values:

TRUE
FALSE

Date and Time

DATE
TIME
TIMESTAMP
TIMESTAMP WITH TIME ZONE

UUID

Universally unique identifier.

Example:

550e8400-e29b-41d4-a716-446655440000

Often used as primary keys.


JSON vs JSONB

JSON
JSONB

JSONB stores data in optimized binary form and is usually preferred.


Database Schema Design Example

Consider a project management platform.

Main entities:

  • users

  • projects

  • tasks


Primary Keys

Each table needs a unique identifier.

Example:

id UUID PRIMARY KEY DEFAULT gen_random_uuid()

Primary keys guarantee:

  • uniqueness

  • non-null values


One-to-One Relationships

Example:

User → Profile

Tables:

users
user_profiles

Profiles store additional user information.

This avoids cluttering the main users table.


One-to-Many Relationships

Example:

Project → Tasks

One project can have multiple tasks.

Implementation:

project_id REFERENCES projects(id)

Many-to-Many Relationships

Example:

Users ↔ Projects

Solution: linking table

project_members

Columns:

project_id
user_id

Composite primary key:

PRIMARY KEY (project_id, user_id)

Database Migrations

Production databases should never be modified manually.

Instead we use migrations.

Example folder structure:

db/
  migrations/
    001_create_users.sql
    002_create_projects.sql

Migration tools include:

  • dbmate

  • Flyway

  • golang-migrate


Up Migrations

Apply schema changes.

Example:

CREATE TABLE users (...);

Down Migrations

Rollback schema changes.

Example:

DROP TABLE users;

Seeding Test Data

During development we insert sample data.

Example:

INSERT INTO users(email, name)
VALUES
('alice@email.com','Alice'),
('bob@email.com','Bob');

This is called database seeding.


Writing Backend Queries

Example API:

GET /users

SQL query:

SELECT u.*, to_jsonb(up.*) AS profile
FROM users u
LEFT JOIN user_profiles up
ON u.id = up.user_id
ORDER BY u.created_at DESC;

This returns users along with profile data.


Parameterized Queries

Never construct SQL queries using string concatenation.

Bad example:

query = "SELECT * FROM users WHERE id = " + userInput

This enables SQL injection attacks.

Correct approach:

SELECT * FROM users
WHERE id = $1;

The parameter value is passed separately.


Database Indexing

As tables grow larger, query performance becomes critical.

Without indexes, databases must scan every row.

This is called a sequential scan.

Example query:

SELECT * FROM users
WHERE email = 'alice@email.com';

Without an index, the database checks every row.

With millions of rows this becomes extremely slow.


What an Index Does

An index acts like the index section of a book.

Instead of scanning every row, the database maintains a lookup structure mapping values to their storage locations.

Creating an index:

CREATE INDEX idx_users_email
ON users(email);

Now when searching by email, the database can jump directly to the correct rows.


When to Create Indexes

Indexes are useful when columns appear frequently in:

WHERE clauses

Example:

SELECT * FROM users
WHERE email = 'alice@email.com';

JOIN conditions

Example:

SELECT *
FROM tasks t
JOIN projects p
ON t.project_id = p.id;

ORDER BY sorting

Example:

SELECT *
FROM users
ORDER BY created_at DESC;

Creating an index for this query:

CREATE INDEX idx_users_created_at
ON users(created_at DESC);

Index Trade-Offs

Indexes improve read performance, but they introduce overhead.

Every time data changes:

  • INSERT

  • UPDATE

  • DELETE

The index must also be updated.

Therefore indexes should only be created for frequently queried columns.


Database Triggers

Triggers allow automatic actions when database events occur.

Example: automatically updating updated_at.

Trigger function:

CREATE FUNCTION update_timestamp()
RETURNS TRIGGER AS $$
BEGIN
  NEW.updated_at = NOW();
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

Trigger definition:

CREATE TRIGGER update_user_timestamp
BEFORE UPDATE ON users
FOR EACH ROW
EXECUTE FUNCTION update_timestamp();

Now every update automatically refreshes the timestamp.


What Backend Engineers Actually Do With Databases

Most backend database work involves:

  1. Designing schemas

  2. Writing migrations

  3. Creating indexes

  4. Writing SQL queries

  5. Maintaining data integrity

Typical backend flow:

API request
     ↓
Validate input
     ↓
Construct SQL query
     ↓
Execute query
     ↓
Return JSON response

Backend Database Interview Questions

What is persistence?

Persistence means storing data so it survives application restarts.


What is a primary key?

A unique identifier for each row in a table.


What is a foreign key?

A field referencing another table's primary key.


What is indexing?

Indexing creates a lookup structure that allows databases to locate rows quickly without scanning entire tables.


What are parameterized queries?

Queries where values are passed separately to prevent SQL injection attacks.


This is part of series Backend First Principles. Next: Caching.

Backend First Principles

Part 11 of 17

This series documents my learning journey through the "Backend from First Principles" playlist. Instead of jumping directly into frameworks, the focus is on understanding the core concepts that power backend systems. Throughout this series, I explore how backend systems actually work — from the request-response lifecycle, HTTP fundamentals, routing, serialization, authentication, and validation to more advanced topics like caching, task queues, observability, security, and scaling. The goal of this series is to build a strong conceptual foundation for backend engineering that applies across languages and frameworks. By learning backend development from first principles, we gain a deeper understanding of how modern web systems are designed, built, and scaled.

Up next

REST API Design: Stop Guessing, Start Following Patterns

Most backend engineers write APIs. Few design them. The difference shows up every time someone integrates your work and either breezes through it — or spends half a day decoding inconsistencies. This