API Validations & Transformations: The Gatekeepers of Your Backend


Every piece of data your server blindly trusts is a vulnerability waiting to happen. Here's the system that keeps your backend honest.


Why This Matters More Than You Think

Imagine a hospital that lets anyone walk in, grab a file, and write in a patient's medical record — no ID check, no signature, nothing.

That's what your API looks like without validations.

Every client — browsers, mobile apps, Postman, scripts, bots — can send your server anything. Your job as a backend engineer is to decide what actually gets through, and in what shape.

That's what validations and transformations are about.


First, Let's Understand the Architecture

Before we talk about where validations happen, you need to understand the typical layers of a backend application. Most production backends are organized into three layers:

Repository Layer — The bottom. Talks directly to the database. Executes queries, inserts, updates, deletes. Knows nothing about HTTP.

Service Layer — The middle. Contains your business logic. Calls repository methods, sends emails, triggers webhooks, processes payments. This is where the meaning of an API lives.

Controller Layer — The top. Handles HTTP. Decides what status code to return, what format the response takes, what data to pass down to the service layer. It's the gatekeeper between the outside world and your application internals.

Here's how a typical request flows through these layers:

HTTP Request
     │
Controller Layer   ←── validations & transformations happen HERE
     │
Service Layer      ←── business logic, emails, notifications
     │
Repository Layer   ←── database queries
     │
Database

Validations and transformations live at the entry point of the controller layer — before any business logic runs, before any database is touched.
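To make the layering concrete, here is a minimal, framework-free sketch of the three layers. Everything in it is an assumption made for illustration: the in-memory `rows` array stands in for a real database, and `bookRepository`, `bookService`, and `createBookController` are hypothetical names, not part of any library.

```typescript
// Hypothetical in-memory "database" standing in for a real one.
type Book = { id: number; name: string };
const rows: Book[] = [];

// Repository layer: only knows how to store rows.
const bookRepository = {
  insert(name: string): Book {
    const book: Book = { id: rows.length + 1, name: name };
    rows.push(book);
    return book;
  },
};

// Service layer: business logic, no HTTP concepts.
const bookService = {
  create(name: string): Book {
    return bookRepository.insert(name);
  },
};

// Controller layer: validates first, then maps the outcome to an HTTP-style response.
type HttpResponse = { status: number; body: unknown };

function createBookController(requestBody: unknown): HttpResponse {
  const name = (requestBody as { name?: unknown } | null)?.name;
  if (typeof name !== "string" || name.trim().length === 0) {
    // Rejected at the door: the service and repository never run.
    return { status: 400, body: { error: "name must be a non-empty string" } };
  }
  return { status: 201, body: bookService.create(name.trim()) };
}
```

Calling `createBookController({ name: 0 })` returns a 400 and leaves `rows` empty, while `createBookController({ name: " Dune " })` returns a 201 and stores the trimmed name.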


The Core Idea: Don't Trust Incoming Data

Multiple clients hit your server. Different browsers, mobile apps, third-party integrations, CLI tools. Each sends data in slightly different ways. Some send garbage accidentally. Some send garbage on purpose.

Your validation pipeline is a contract: "This is the exact shape and type of data this API accepts. Anything else gets rejected immediately."

Why at the entry point? Because the further bad data travels into your system, the more damage it can do — and the harder it is to debug.


What Happens Without Validations (A Horror Story)

Say you have an API that creates a new book. It expects a name field as a string. Now a client sends this:

{
  "name": 0
}

No validation. The request flows through your controller, into your service, and down to your repository. Your repository executes an INSERT against your PostgreSQL table:

-- PostgreSQL table definition
CREATE TABLE books (
  id SERIAL PRIMARY KEY,
  name TEXT NOT NULL
);

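The column is TEXT, but you're inserting an integer. Depending on your driver and how the query is built, one of two things happens: the value is silently coerced and stored as the string "0" (garbage in your table), or the query fails with a database error and your server returns a 500 Internal Server Error. Take the failure case.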

The client gets a cryptic message that something blew up on the server. They have no idea what they did wrong. Your logs show a database exception buried inside a chain of function calls.

A 400 Bad Request with the message "name must be a string" would have been infinitely better — and it would have been caught at the door, before anything else ran.

That's the entire argument for validations.


The Three Types of Validation

1. Type Validation

The most fundamental check: does the value match the expected data type?

// Schema definition (using Zod — a popular validation library for TypeScript/Node.js)
import { z } from "zod";

const schema = z.object({
  stringField: z.string(),
  numberField: z.number(),
  arrayField:  z.array(z.string()),
  boolField:   z.boolean(),
});

If a client sends numberField: "hello", this fails immediately with a clear error: "Expected number, received string".

Type validation also works on nested structures:

const bookSchema = z.object({
  title:  z.string(),
  price:  z.number(),
  tags:   z.array(z.string()),   // each element must also be a string
  author: z.object({
    name:  z.string(),
    email: z.string().email(),
  }),
});
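To show roughly what a schema library does for nested structures, here is a hand-rolled sketch that walks a book-like object and records a path for every mismatch. It is not how Zod is implemented; `checkBook`, `describeType`, and `TypeIssue` are hypothetical names, and only three of the fields above are checked to keep it short.

```typescript
type TypeIssue = { path: string; message: string };

// Reports a readable type name, distinguishing null and arrays from plain objects.
function describeType(value: unknown): string {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value;
}

function checkBook(input: unknown): TypeIssue[] {
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    return [{ path: "", message: "Expected object, received " + describeType(input) }];
  }
  const book = input as Record<string, unknown>;
  const found: TypeIssue[] = [];

  const title = book["title"];
  if (typeof title !== "string") {
    found.push({ path: "title", message: "Expected string, received " + describeType(title) });
  }

  const price = book["price"];
  if (typeof price !== "number") {
    found.push({ path: "price", message: "Expected number, received " + describeType(price) });
  }

  const tags = book["tags"];
  if (!Array.isArray(tags)) {
    found.push({ path: "tags", message: "Expected array, received " + describeType(tags) });
  } else {
    // Every element is checked, and its index becomes part of the path.
    tags.forEach((tag: unknown, index: number) => {
      if (typeof tag !== "string") {
        found.push({ path: "tags." + index, message: "Expected string, received " + describeType(tag) });
      }
    });
  }
  return found;
}
```

The path on each issue is what lets an error response point at `tags.1` instead of just saying the request was invalid.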

2. Syntactic Validation

Beyond types, some fields must match a specific structure or pattern. Even if something is a string, it might not be a valid email or phone number.

const contactSchema = z.object({
  email: z.string().email(),          // must match email structure
  phone: z.string().regex(
    /^\+?[1-9]\d{7,14}$/,            // optional +, then digits
    "Invalid phone number format"
  ),
  date: z.string().datetime(),        // must be an ISO 8601 datetime, e.g. 2024-03-11T10:30:00Z
});

Common syntactic validations you'll build:

  • Email — must have local@domain.tld structure

  • Phone — country code + digit count varies by region

  • Dates — must follow a valid format like YYYY-MM-DD or ISO 8601

  • URLs — must start with http:// or https://, have valid host

  • Postal codes — format differs by country

These are called syntactic because you're checking the shape of the string, not its meaning.
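As an illustration of a format that differs by country, the sketch below checks postal codes against a small lookup table. The patterns are deliberately simplified assumptions (real postal rules have more edge cases), and `postalPatterns` and `isPostalCodeShapeValid` are hypothetical names.

```typescript
// Simplified, illustrative patterns keyed by country code.
const postalPatterns: Record<string, RegExp> = {
  US: /^\d{5}(-\d{4})?$/,          // 12345 or 12345-6789
  IN: /^[1-9]\d{5}$/,              // six digits, first digit not zero
  CA: /^[A-Z]\d[A-Z] ?\d[A-Z]\d$/, // A1A 1A1, space optional
};

function isPostalCodeShapeValid(country: string, code: string): boolean {
  const pattern = postalPatterns[country];
  if (pattern === undefined) return false; // unknown country: reject rather than guess
  return pattern.test(code.trim().toUpperCase());
}
```

A code that passes this check has the right shape, nothing more. Whether that postal code actually exists is a separate question that a pattern cannot answer.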


3. Semantic Validation

This is where it gets interesting. Semantic validation checks whether the data makes logical sense — not just whether it's the right type or format.

const profileSchema = z.object({
  dateOfBirth: z.string().datetime().refine(
    (val) => new Date(val) < new Date(),
    "Date of birth cannot be in the future"
  ),
  age: z.number().min(1).max(120),
});

A date string "2099-01-01T00:00:00Z" is syntactically valid: it is a perfectly formatted ISO 8601 datetime. But as a date of birth, it is semantically nonsense. Semantic validation catches this.

More examples:

// Check-out date must be after check-in date
z.object({
  checkIn:  z.string().datetime(),
  checkOut: z.string().datetime(),
}).refine(
  (data) => new Date(data.checkOut) > new Date(data.checkIn),
  "Check-out must be after check-in"
);

// If a discount is applied, a promo code must be present
z.object({
  applyDiscount: z.boolean(),
  promoCode:     z.string().optional(),
}).refine(
  (data) => !data.applyDiscount || !!data.promoCode,
  "Promo code required when discount is applied"
);

Semantic validation enforces business rules, not just data types.
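Semantic rules that depend on "now" are easier to test when the current time is passed in rather than read inside the check. The sketch below is one way to do that for a date of birth; `checkDateOfBirth` is a hypothetical helper, and the 120-year ceiling simply mirrors the age limit used in the schema above.

```typescript
// Returns an error message, or null when the value is acceptable.
function checkDateOfBirth(isoDate: string, now: Date): string | null {
  const parsed = new Date(isoDate);
  if (isNaN(parsed.getTime())) return "Date of birth is not a valid date";
  if (parsed.getTime() > now.getTime()) return "Date of birth cannot be in the future";

  // Whole years elapsed, reduced by one if this year's birthday has not happened yet.
  let age = now.getUTCFullYear() - parsed.getUTCFullYear();
  const beforeBirthday =
    now.getUTCMonth() < parsed.getUTCMonth() ||
    (now.getUTCMonth() === parsed.getUTCMonth() && now.getUTCDate() < parsed.getUTCDate());
  if (beforeBirthday) age -= 1;

  if (age > 120) return "Date of birth implies an age over 120";
  return null;
}
```

In production you would call it with `new Date()` as the second argument; in tests you pass a fixed date so the result never changes as the calendar moves on.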


Complex / Cross-Field Validation

Some validations span multiple fields. The classic example: password confirmation.

const resetPasswordSchema = z.object({
  password:        z.string().min(8),
  confirmPassword: z.string(),
  married:         z.boolean(),
  partner:         z.string().optional(),
})
.refine(
  (data) => data.password === data.confirmPassword,
  {
    message: "Passwords don't match",
    path: ["confirmPassword"],
  }
)
.refine(
  (data) => !data.married || !!data.partner,
  {
    message: "Partner name is required when married is true",
    path: ["partner"],
  }
);

This is called complex validation — where the validity of one field depends on the value of another.


Transformations: Reshaping Data Before It Enters Your System

Validation confirms the data is acceptable. Transformation changes it into exactly the format your service layer needs.

The most common real-world example: query parameters.

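Query parameters arrive as text. A URL carries no type information, so even if the user sends ?page=2&limit=20, your server receives strings (Express can also hand you arrays or nested objects when a key repeats or uses bracket syntax, but the leaves are still strings):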

req.query.page  // "2"   ← string
req.query.limit // "20"  ← string

But your service layer expects numbers. So you transform:

const paginationSchema = z.object({
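  // .int() rejects values like "2.5"; non-numeric input becomes NaN, which z.number() rejects
  page:  z.string().transform(Number).pipe(z.number().int().min(1).max(500)),
  limit: z.string().transform(Number).pipe(z.number().int().min(1).max(10000)),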
});

// Input:  { page: "2", limit: "20" }
// Output: { page: 2,   limit: 20   }

.transform() runs after the string type check and converts the value to a number; .pipe() then hands that number to a second schema, which enforces the range before anything is passed downstream.
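The same parse-then-validate idea can be sketched without a library, which makes the edge cases visible. In this sketch the defaults (page 1, limit 20) and the limit ceiling of 100 are assumptions chosen for illustration, and `toBoundedInt` and `parsePagination` are hypothetical helpers.

```typescript
type Pagination = { page: number; limit: number };
type PaginationResult =
  | { ok: true; value: Pagination }
  | { ok: false; errors: string[] };

// Converts one raw query value to an integer within [1, max], or null if it is unusable.
function toBoundedInt(raw: string | undefined, fallback: number, max: number): number | null {
  if (raw === undefined) return fallback; // parameter omitted: use the default
  if (!/^\d+$/.test(raw)) return null;    // rejects "", "abc", "2.5", "-1"
  const n = Number(raw);
  return n >= 1 && n <= max ? n : null;
}

function parsePagination(query: Record<string, string | undefined>): PaginationResult {
  const page = toBoundedInt(query["page"], 1, 500);
  const limit = toBoundedInt(query["limit"], 20, 100);
  const errors: string[] = [];
  if (page === null) errors.push("page must be an integer between 1 and 500");
  if (limit === null) errors.push("limit must be an integer between 1 and 100");
  if (page === null || limit === null) return { ok: false, errors: errors };
  return { ok: true, value: { page: page, limit: limit } };
}
```

Checking the string against `/^\d+$/` before calling `Number` matters because `Number("")` is `0` and `Number("2.5")` is `2.5`: both convert without complaint, yet neither is a usable page number.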

More transformation examples:

const userSchema = z.object({
  // Always lowercase emails before storing — avoid case-sensitivity issues
  email: z.string().email().transform(val => val.toLowerCase()),

  // Strip whitespace from names
  name: z.string().trim().min(2).max(100),

  // Normalize phone: ensure it starts with +
  phone: z.string().transform(val =>
    val.startsWith("+") ? val : `+${val}`
  ),

  // Parse a date string into a Date object for the service layer
  birthDate: z.string().datetime().transform(val => new Date(val)),
});

The validation + transformation pipeline runs as a single unit. By the time data reaches your service layer, it's already in the exact shape and type your business logic expects. No surprises.
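The order of steps inside that pipeline matters. The name rule above is written as trim first, length check second; the sketch below spells out why in plain TypeScript. `normalizeName` is a hypothetical helper, and collapsing inner runs of whitespace is an extra normalization added here for illustration, not something the schema above does.

```typescript
type NameResult = { ok: true; value: string } | { ok: false; error: string };

// Normalizes a display name, then validates the normalized value.
function normalizeName(raw: string): NameResult {
  const value = raw.trim().replace(/\s+/g, " "); // trim ends, collapse inner whitespace runs
  if (value.length < 2) return { ok: false, error: "name must be at least 2 characters" };
  if (value.length > 100) return { ok: false, error: "name must be at most 100 characters" };
  return { ok: true, value: value };
}
```

If the length check ran on the raw input instead, `"  a  "` would count as five characters and pass, and a one-letter name would reach the service layer.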


Where This Lives in Real Code

Here's a complete Express.js controller showing a validation + transformation middleware in practice:

import express, { Request, Response, NextFunction } from "express";
import { z, ZodSchema } from "zod";

// Let TypeScript know about the property the middleware attaches
declare global {
  namespace Express {
    interface Request {
      validated?: any;
    }
  }
}

const app = express();
app.use(express.json());   // parse JSON bodies so req.body is populated

// Generic middleware factory: reuse with any schema
function validate(schema: ZodSchema) {
  return (req: Request, res: Response, next: NextFunction) => {
    const result = schema.safeParse({
      body:  req.body,
      query: req.query,
      params: req.params,
    });

    if (!result.success) {
      res.status(400).json({
        error:   "Validation failed",
        // flatten() groups messages by top-level key (body, query, params);
        // use result.error.issues if you need the full path to each field
        details: result.error.flatten(),
      });
      return;
    }

    // Attach the validated + transformed data
    req.validated = result.data;
    next();
  };
}

// Schema for creating a book
const createBookSchema = z.object({
  body: z.object({
    title:  z.string().trim().min(1).max(200),
    author: z.string().trim().min(2),
    price:  z.number().positive(),
  }),
});

// Route with validation middleware
app.post(
  "/api/books",
  validate(createBookSchema),   // ← runs first, rejects bad data
  async (req, res) => {
    // bookService is your service-layer module (not shown here)
    const { title, author, price } = req.validated.body;
    const book = await bookService.create({ title, author, price });
    res.status(201).json(book);
  }
);

If validation fails, it returns 400 with a descriptive error. The controller logic never runs. The service layer never runs. The database is never touched.


The Frontend vs Backend Validation Debate

This one causes real confusion, so let's be completely clear:

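|                  | Frontend Validation          | Backend Validation                   |
|------------------|------------------------------|--------------------------------------|
| Purpose          | User experience              | Security + data integrity            |
| When it runs     | Before the API call          | After the request reaches the server |
| Can be bypassed? | Yes (Postman, curl, scripts) | No, the server always validates      |
| Required?        | Recommended for UX           | Mandatory. Always.                   |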

Frontend validation exists to give users immediate feedback — highlight the email field before they even click Submit. It's a UX optimization.

Backend validation exists to protect your system from everything — malformed requests, attackers, other developers' bugs, API clients with no UI at all.

Never, ever rely on frontend validation for security. A user can open DevTools, intercept the request, and send whatever they want. Your backend validation is your last line of defense.

The correct mental model:

User fills form
      │
Frontend validates ──── fail? ──→ Show inline error (good UX)
      │
    pass
      │
API call is made
      │
Backend validates ──── fail? ──→ Return 400 with clear error (security)
      │
    pass
      │
Business logic runs

Both layers, always. They serve different purposes.


What Good Validation Error Responses Look Like

A great validation error tells the client exactly what's wrong and how to fix it. Here's a well-structured error response:

{
  "error": "Validation failed",
  "details": {
    "fieldErrors": {
      "email":  ["Invalid email format"],
      "age":    ["Number must be less than or equal to 120"],
      "phone":  ["Expected string, received number"]
    },
    "formErrors": [
      "Passwords don't match"
    ]
  }
}

And a minimal success response after validation + transformation pass:

{
  "id": 42,
  "title": "Clean Code",
  "author": "Robert C. Martin",
  "price": 29.99,
  "createdAt": "2024-03-11T10:30:00Z"
}

Clear errors help developers integrate with your API. Cryptic 500s do not.
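One way to produce a body in that shape is to group a flat list of issues by their top-level field. The sketch below is an approximation written for illustration, not Zod's own implementation; `ValidationIssue` and `toErrorBody` are hypothetical names.

```typescript
type ValidationIssue = { path: string[]; message: string };
type ErrorBody = {
  error: string;
  details: { fieldErrors: Record<string, string[]>; formErrors: string[] };
};

function toErrorBody(issues: ValidationIssue[]): ErrorBody {
  const fieldErrors: Record<string, string[]> = {};
  const formErrors: string[] = [];
  for (const issue of issues) {
    if (issue.path.length === 0) {
      formErrors.push(issue.message); // not tied to a single field
      continue;
    }
    const field = issue.path[0];
    if (fieldErrors[field] === undefined) fieldErrors[field] = [];
    fieldErrors[field].push(issue.message);
  }
  return {
    error: "Validation failed",
    details: { fieldErrors: fieldErrors, formErrors: formErrors },
  };
}
```

Issues with an empty path land in `formErrors`, which is where a rule about the request as a whole belongs; anything with a path is listed under the field it points at, so a client can attach each message to the right input.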


Interview Questions (With Real Answers)


Q1: What is the difference between validation and transformation?

Validation checks whether incoming data meets defined constraints — correct type, correct format, logical consistency. If validation fails, the request is rejected with a 400. Transformation modifies the data after it's deemed valid — parsing strings into numbers, normalizing casing, prepending missing characters. Both typically run in the same pipeline at the controller entry point.


Q2: Why should validations happen at the controller layer and not the service layer?

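The controller layer is the boundary between the outside world and your application. Validating there means your service layer can safely assume it's always receiving clean, well-typed data. If you push input validation down into the service layer instead, bad data travels deeper into your system before it's caught, and a layer that should know nothing about HTTP ends up responsible for shaping 400 responses. Business rules that need application state (for example, whether an email is already registered) still belong in the service layer; the controller's job is to guarantee the shape and type of what arrives there.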


Q3: Why can't we rely on the database to enforce data constraints?

You can define type constraints at the database level (e.g., NOT NULL, TEXT, INTEGER), and the database will reject bad data. But the error that comes back from the database is a raw database error — not a user-friendly HTTP 400 with a readable message. Your user gets a 500, no guidance, and a terrible experience. Validate early, reject gracefully.


Q4: Why are query parameters always strings on the server side?

The query string is part of the URL, and a URL is plain text with no type information. When a browser or client builds a URL like ?page=2&limit=20, the server receives the characters "2" and "20", not the numbers 2 and 20. It's your server's responsibility to parse and cast them in the transformation step, then validate the resulting values.
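A hand-written parser makes the point tangible. This is a simplified sketch for illustration only (`parseQueryString` is a hypothetical helper; the last duplicate key wins, and malformed percent-encoding would throw), so in practice you would rely on your framework's parser.

```typescript
// Splits a raw query string into key/value pairs. Every value stays a string.
function parseQueryString(raw: string): Record<string, string> {
  const result: Record<string, string> = {};
  const trimmed = raw.charAt(0) === "?" ? raw.slice(1) : raw;
  if (trimmed === "") return result;
  for (const pair of trimmed.split("&")) {
    const eq = pair.indexOf("=");
    const key = eq === -1 ? pair : pair.slice(0, eq);
    const value = eq === -1 ? "" : pair.slice(eq + 1);
    // "+" encodes a space in form-style query strings.
    result[decodeURIComponent(key)] = decodeURIComponent(value.replace(/\+/g, " "));
  }
  return result;
}
```

With `const q = parseQueryString("?page=2&limit=20")`, the expression `q["page"] + 1` evaluates to the string `"21"`, while `Number(q["page"]) + 1` evaluates to `3`. That silent string concatenation is the kind of bug the transformation step exists to prevent.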


Q5: Is frontend validation a replacement for backend validation?

Never. Frontend validation is a UX layer — it gives users immediate feedback and reduces unnecessary API calls. It can be bypassed entirely with tools like Postman, curl, or browser DevTools. Backend validation is mandatory for security and data integrity. Always implement both, but for different reasons.


Q6: What HTTP status code should a failed validation return?

400 Bad Request. The client sent invalid data — that's a client-side error. The response body should include which fields failed and why. 422 Unprocessable Entity is also semantically correct for semantic validation failures (valid format, but logically nonsensical data) and you'll see it used in some REST APIs.
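If you want the choice between 400 and 422 to be an explicit, team-wide decision rather than something each handler decides on its own, a small mapping function is one option. The failure categories and the `use422ForSemantic` flag below are assumptions made for this sketch, not an established convention.

```typescript
// "malformed":  wrong type or shape, e.g. name: 0
// "semantic":   well-formed but breaks a rule, e.g. check-out before check-in
// "unexpected": a bug or outage on the server side
type FailureKind = "malformed" | "semantic" | "unexpected";

function statusFor(kind: FailureKind, use422ForSemantic: boolean): number {
  if (kind === "malformed") return 400;
  if (kind === "semantic") return use422ForSemantic ? 422 : 400;
  return 500;
}
```

Whichever option you pick, apply it consistently across endpoints so clients can rely on the status code when deciding how to react.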


Q7: How do you validate optional fields that become required based on another field's value?

This is cross-field or conditional validation. In Zod:

z.object({
  married: z.boolean(),
  partner: z.string().optional(),
}).refine(
  (data) => !data.married || !!data.partner,
  { message: "Partner name required when married is true", path: ["partner"] }
);

The .refine() method lets you write arbitrary cross-field logic.


Key Takeaways

  • Validations and transformations run at the entry point of the controller layer — before any business logic

  • Type validation checks data types. Syntactic validation checks structure (email, phone, date formats). Semantic validation checks logical meaning (dates in the future, impossible ages)

  • Transformation reshapes valid data into the format your service layer expects — casting strings to numbers, normalizing casing, parsing dates

  • Run both in a single pipeline so your data contract is defined in one place

  • Frontend validation is for UX. Backend validation is for security and data integrity. You always need both

  • Always return 400 Bad Request with field-level error details on validation failure


This is part of a Backend Engineering from First Principles series. Next: REST API Demystified.

Backend First Principles

Part 13 of 17

This series documents my learning journey through the "Backend from First Principles" playlist. Instead of jumping directly into frameworks, the focus is on understanding the core concepts that power backend systems. Throughout this series, I explore how backend systems actually work — from the request-response lifecycle, HTTP fundamentals, routing, serialization, authentication, and validation to more advanced topics like caching, task queues, observability, security, and scaling. The goal of this series is to build a strong conceptual foundation for backend engineering that applies across languages and frameworks. By learning backend development from first principles, we gain a deeper understanding of how modern web systems are designed, built, and scaled.
