I do not like designing tables from nouns first.
I prefer starting from the queries the application must answer.
That sounds backwards, but it prevents a common mistake: a clean-looking schema that becomes awkward the moment the product needs filtering, reporting, or permissions.
Start With Access Patterns
Before creating tables, write the important questions:
Find a user's published posts.
List orders by account and status.
Show the latest 20 events for a device.
Check whether a user can access a project.
Then design tables that can answer those questions clearly.
Example:
CREATE TABLE posts (
id BIGINT PRIMARY KEY,
author_id BIGINT NOT NULL,
title VARCHAR(200) NOT NULL,
status VARCHAR(30) NOT NULL,
published_at TIMESTAMP NULL,
created_at TIMESTAMP NOT NULL
);
For this query:
SELECT id, title, published_at
FROM posts
WHERE author_id = ?
AND status = 'published'
ORDER BY published_at DESC
LIMIT 20;
The index should match the access pattern:
CREATE INDEX idx_posts_author_status_published
ON posts (author_id, status, published_at DESC);
Use Constraints as Guardrails
If invalid data should never exist, do not rely only on app code.
ALTER TABLE posts
ADD CONSTRAINT chk_posts_status
CHECK (status IN ('draft', 'published', 'archived'));
Use:
NOT NULL- foreign keys where appropriate
- unique constraints
- check constraints
- sensible default values
The database is allowed to protect itself.
Normalize Until It Hurts, Then Denormalize Carefully
Normalization prevents duplicated inconsistent data.
But production systems sometimes denormalize for read performance.
The important word is carefully.
| Choice | Good for | Risk |
|---|---|---|
| normalized | correctness, updates | more joins |
| denormalized | faster reads | stale duplicated data |
If you denormalize, decide how the duplicated value gets updated.
No answer means future bug.
Avoid Mystery Columns
Columns like this age badly:
extra TEXT
data JSON
flag1 BOOLEAN
type VARCHAR(255)
JSON columns are useful, but if every important field ends up inside data, you lose constraints, discoverability, and indexing clarity.
My rule:
If the application filters, sorts, joins, or validates a field often, consider making it a real column.
Timestamps Are Not Optional
Most production tables need:
created_at TIMESTAMP NOT NULL,
updated_at TIMESTAMP NOT NULL
Many also need:
deleted_at TIMESTAMP NULL
Soft delete is not free, though. Every query must remember to exclude deleted rows.
WHERE deleted_at IS NULL
If you add soft delete, make it a deliberate pattern.
Final Thought
Good table design is not about making a beautiful ER diagram. It is about making future queries, constraints, and debugging less painful.
What schema decision looked clean at first but became painful later?
Top comments (0)