Mark Yu

Posted on May 24, 2024 • Edited on Jun 16

Database Table Design Starts With the Queries You Need

#database #backend #sql #architecture

I do not like designing tables from nouns first.

I prefer starting from the queries the application must answer.

That sounds backwards, but it prevents a common mistake: a clean-looking schema that becomes awkward the moment the product needs filtering, reporting, or permissions.

Start With Access Patterns

Before creating tables, write the important questions:

Find a user's published posts.
List orders by account and status.
Show the latest 20 events for a device.
Check whether a user can access a project.

Then design tables that can answer those questions clearly.

Example:

CREATE TABLE posts (
  id BIGINT PRIMARY KEY,
  author_id BIGINT NOT NULL,
  title VARCHAR(200) NOT NULL,
  status VARCHAR(30) NOT NULL,
  published_at TIMESTAMP NULL,
  created_at TIMESTAMP NOT NULL
);

For this query:

SELECT id, title, published_at
FROM posts
WHERE author_id = ?
  AND status = 'published'
ORDER BY published_at DESC
LIMIT 20;

The index should match the access pattern:

CREATE INDEX idx_posts_author_status_published
ON posts (author_id, status, published_at DESC);

Use Constraints as Guardrails

If invalid data should never exist, do not rely only on app code.

ALTER TABLE posts
ADD CONSTRAINT chk_posts_status
CHECK (status IN ('draft', 'published', 'archived'));

Use:

NOT NULL
foreign keys where appropriate
unique constraints
check constraints
sensible default values

The database is allowed to protect itself.

Normalize Until It Hurts, Then Denormalize Carefully

Normalization prevents duplicated inconsistent data.

But production systems sometimes denormalize for read performance.

The important word is carefully.

Choice	Good for	Risk
normalized	correctness, updates	more joins
denormalized	faster reads	stale duplicated data

If you denormalize, decide how the duplicated value gets updated.

No answer means future bug.

Avoid Mystery Columns

Columns like this age badly:

extra TEXT
data JSON
flag1 BOOLEAN
type VARCHAR(255)

JSON columns are useful, but if every important field ends up inside data, you lose constraints, discoverability, and indexing clarity.

My rule:

If the application filters, sorts, joins, or validates a field often, consider making it a real column.

Timestamps Are Not Optional

Most production tables need:

created_at TIMESTAMP NOT NULL,
updated_at TIMESTAMP NOT NULL

Many also need:

deleted_at TIMESTAMP NULL

Soft delete is not free, though. Every query must remember to exclude deleted rows.

WHERE deleted_at IS NULL

If you add soft delete, make it a deliberate pattern.

Final Thought

Good table design is not about making a beautiful ER diagram. It is about making future queries, constraints, and debugging less painful.

What schema decision looked clean at first but became painful later?

DEV Community