Understanding Databases Like a Backend Engineer (PostgreSQL Edition)
Backend systems revolve around data.
User accounts, payments, orders, logs, analytics, permissions — almost everything a backend service does eventually becomes data stored somewhere.
Because of that, interacting with databases becomes one of the most frequent operations backend engineers perform.
This article explains databases from the perspective of a modern backend developer, focusing on the concepts you actually use in real systems and interviews.
We will cover:
Why databases exist
Disk vs memory storage
What a DBMS does
Relational vs non-relational databases
PostgreSQL fundamentals
Schema design
Table relationships
Migrations
Indexing
Triggers
Writing backend queries
The goal is not to turn you into a database researcher — but to help you understand the 80% of concepts backend engineers actually use daily.
Why Do Databases Exist?
The core reason databases exist is persistence.
Persistence means:
Data continues to exist even after the program that created it stops running.
Imagine a simple todo app.
You add tasks like:
Buy groceries
Finish assignment
Call mom
You close the app and open it again later.
If the tasks are still there, the app uses persistent storage.
Without persistence:
every restart loses data
users lose progress
applications become unusable
Databases solve this problem by storing data durably and reliably.
What Is a Database?
At a high level:
A database is a structured system that stores data and allows operations to manipulate it.
These operations are known as CRUD operations.
CRUD stands for:
Create
Read
Update
Delete
Examples of simple databases include:
phone contact lists
browser local storage
spreadsheets
structured files
However backend systems require more than simple storage.
They require structured access, concurrency control, and consistency guarantees.
That is where Database Management Systems (DBMS) come in.
What Is a DBMS?
A Database Management System (DBMS) is software responsible for managing data.
Examples include:
PostgreSQL
MySQL
SQL Server
MongoDB
A DBMS is responsible for several key tasks.
Data Organization
Data must be stored in a format that allows efficient operations like searching, inserting, and updating.
Data Access
The DBMS provides mechanisms to perform CRUD operations efficiently.
For example:
SELECT * FROM users;
Data Integrity
Integrity means ensuring stored data remains correct and valid.
Example:
If a column represents price, it should only accept numbers.
price INTEGER
Trying to insert text should fail.
INSERT INTO orders(price)
VALUES ('hello');
The database rejects invalid data.
Security
Databases also enforce security through:
authentication
access control
roles
permissions
This protects sensitive data from unauthorized access.
Why Not Store Data in Text Files?
Before modern database systems, developers often stored data in files.
But this approach introduces several major problems.
Parsing Overhead
Suppose customer data is stored in a text file:
Alice,alice@email.com
Bob,bob@email.com
To find a specific user, the backend must:
read the file
split lines
parse fields
compare values
This becomes slow and error-prone as data grows.
No Structure
Text files allow anything.
Example:
price: hello
There is no mechanism enforcing valid structure.
Databases enforce schemas and constraints.
Concurrency Problems
Consider two users modifying the same data simultaneously.
Initial value:
balance = 40
User A adds 20
User B subtracts 20
Possible results:
60
20
Correct result should be 40.
Databases implement concurrency control mechanisms to solve this.
Disk Storage vs Memory Storage
Most databases store data on disk, not RAM.
This choice is based on trade-offs.
| Storage | Speed | Capacity | Cost |
|---|---|---|---|
| RAM | Very fast | Limited | Expensive |
| Disk (SSD/HDD) | Slower | Very large | Cheap |
Typical systems have:
RAM
8GB
16GB
32GB
Disk storage
512GB
1TB
2TB
Databases choose disk storage for durability and capacity.
Where Does Caching Fit?
Caching systems like Redis store data in RAM.
Typical backend architecture:
Client → Backend → Cache → Database
Cache provides:
faster reads
temporary storage
The database remains the source of truth.
Types of Databases
Two major database categories dominate backend systems.
Relational Databases
Relational databases organize data into tables.
Example:
Users table
| id | name | |
|---|---|---|
| 1 | Alice | alice@email.com |
| 2 | Bob | bob@email.com |
Orders table
| id | user_id | product |
|---|---|---|
| 1 | 1 | Laptop |
| 2 | 2 | Phone |
The relationship:
users.id → orders.user_id
Relational databases enforce:
strict schemas
structured relationships
data integrity
Examples:
PostgreSQL
MySQL
SQL Server
Non-Relational Databases (NoSQL)
NoSQL databases allow flexible schemas.
Example MongoDB document:
{
"name": "Alice",
"email": "alice@email.com",
"skills": ["backend", "devops"]
}
Another document in the same collection:
{
"name": "Bob",
"age": 22
}
Structure can vary.
Examples include:
MongoDB
Cassandra
DynamoDB
Choosing Between SQL and NoSQL
Relational databases are ideal when:
relationships between data are important
data integrity matters
complex queries are required
Example: CRM systems.
NoSQL databases are ideal when:
schema is flexible
data structure changes frequently
rapid prototyping is required
Example: content management systems.
Why PostgreSQL Is Often the Best Choice
PostgreSQL is one of the most popular databases in modern backend systems.
Reasons include:
Open Source
Completely free and community maintained.
SQL Standard Compliance
Queries follow standard SQL, making migrations easier.
Extensibility
Postgres supports many extensions and advanced features.
Reliability
Widely used in production environments.
Strong JSON Support
Postgres supports JSON and JSONB.
Example:
CREATE TABLE articles (
id UUID,
content JSONB
);
This allows flexible document storage inside relational tables.
Important PostgreSQL Data Types
Integer Types
SMALLINT
INTEGER
BIGINT
Choose based on required numeric range.
Decimal vs Floating Point
Use decimal for precise values such as money.
price DECIMAL(10,2)
Floating point numbers can introduce precision issues.
String Types
CHAR
Fixed length strings.
Rarely used today.
VARCHAR
Variable length strings.
VARCHAR(255)
TEXT
Unlimited string storage.
PostgreSQL documentation recommends using TEXT in most cases.
Boolean
BOOLEAN
Values:
TRUE
FALSE
Date and Time
DATE
TIME
TIMESTAMP
TIMESTAMP WITH TIME ZONE
UUID
Universally unique identifier.
Example:
550e8400-e29b-41d4-a716-446655440000
Often used as primary keys.
JSON vs JSONB
JSON
JSONB
JSONB stores data in optimized binary form and is usually preferred.
Database Schema Design Example
Consider a project management platform.
Main entities:
users
projects
tasks
Primary Keys
Each table needs a unique identifier.
Example:
id UUID PRIMARY KEY DEFAULT gen_random_uuid()
Primary keys guarantee:
uniqueness
non-null values
One-to-One Relationships
Example:
User → Profile
Tables:
users
user_profiles
Profiles store additional user information.
This avoids cluttering the main users table.
One-to-Many Relationships
Example:
Project → Tasks
One project can have multiple tasks.
Implementation:
project_id REFERENCES projects(id)
Many-to-Many Relationships
Example:
Users ↔ Projects
Solution: linking table
project_members
Columns:
project_id
user_id
Composite primary key:
PRIMARY KEY (project_id, user_id)
Database Migrations
Production databases should never be modified manually.
Instead we use migrations.
Example folder structure:
db/
migrations/
001_create_users.sql
002_create_projects.sql
Migration tools include:
dbmate
Flyway
golang-migrate
Up Migrations
Apply schema changes.
Example:
CREATE TABLE users (...);
Down Migrations
Rollback schema changes.
Example:
DROP TABLE users;
Seeding Test Data
During development we insert sample data.
Example:
INSERT INTO users(email, name)
VALUES
('alice@email.com','Alice'),
('bob@email.com','Bob');
This is called database seeding.
Writing Backend Queries
Example API:
GET /users
SQL query:
SELECT u.*, to_jsonb(up.*) AS profile
FROM users u
LEFT JOIN user_profiles up
ON u.id = up.user_id
ORDER BY u.created_at DESC;
This returns users along with profile data.
Parameterized Queries
Never construct SQL queries using string concatenation.
Bad example:
query = "SELECT * FROM users WHERE id = " + userInput
This enables SQL injection attacks.
Correct approach:
SELECT * FROM users
WHERE id = $1;
The parameter value is passed separately.
Database Indexing
As tables grow larger, query performance becomes critical.
Without indexes, databases must scan every row.
This is called a sequential scan.
Example query:
SELECT * FROM users
WHERE email = 'alice@email.com';
Without an index, the database checks every row.
With millions of rows this becomes extremely slow.
What an Index Does
An index acts like the index section of a book.
Instead of scanning every row, the database maintains a lookup structure mapping values to their storage locations.
Creating an index:
CREATE INDEX idx_users_email
ON users(email);
Now when searching by email, the database can jump directly to the correct rows.
When to Create Indexes
Indexes are useful when columns appear frequently in:
WHERE clauses
Example:
SELECT * FROM users
WHERE email = 'alice@email.com';
JOIN conditions
Example:
SELECT *
FROM tasks t
JOIN projects p
ON t.project_id = p.id;
ORDER BY sorting
Example:
SELECT *
FROM users
ORDER BY created_at DESC;
Creating an index for this query:
CREATE INDEX idx_users_created_at
ON users(created_at DESC);
Index Trade-Offs
Indexes improve read performance, but they introduce overhead.
Every time data changes:
INSERT
UPDATE
DELETE
The index must also be updated.
Therefore indexes should only be created for frequently queried columns.
Database Triggers
Triggers allow automatic actions when database events occur.
Example: automatically updating updated_at.
Trigger function:
CREATE FUNCTION update_timestamp()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = NOW();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
Trigger definition:
CREATE TRIGGER update_user_timestamp
BEFORE UPDATE ON users
FOR EACH ROW
EXECUTE FUNCTION update_timestamp();
Now every update automatically refreshes the timestamp.
What Backend Engineers Actually Do With Databases
Most backend database work involves:
Designing schemas
Writing migrations
Creating indexes
Writing SQL queries
Maintaining data integrity
Typical backend flow:
API request
↓
Validate input
↓
Construct SQL query
↓
Execute query
↓
Return JSON response
Backend Database Interview Questions
What is persistence?
Persistence means storing data so it survives application restarts.
What is a primary key?
A unique identifier for each row in a table.
What is a foreign key?
A field referencing another table's primary key.
What is indexing?
Indexing creates a lookup structure that allows databases to locate rows quickly without scanning entire tables.
What are parameterized queries?
Queries where values are passed separately to prevent SQL injection attacks.
This is part of series Backend First Principles. Next: Caching.

