Mastering Complex Database Queries- Best Practices

What "Complex" Queries Actually Means

When developers talk about complex queries, they usually mean one of three things:

Most "complex" queries are just poorly written simple queries. Before you blame the database engine, ask yourself if your query is doing more work than necessary. 90% of slow queries are slow because of bad design, not hardware limitations.

Read the Execution Plan First

You cannot optimize what you can't measure. Every major database system provides a way to inspect how your query actually executes:

The execution plan tells you where the database spends time. Look for these red flags:

What the Numbers Actually Mean

Cost is relative to your database. A cost of 1000 might be fine on a beefy server and terrible on a shared hosting environment. Focus on relative differences between query versions, not absolute cost numbers.

Row estimates vs actual rows matter more than cost. When estimates are way off, your statistics are stale or your data distribution changed significantly.

Indexing: The Obvious Fix Nobody Does Right

Add an index. Everyone knows that. Here's what they don't tell you:

Composite Indexes Have Order

A composite index on (status, created_at) works for queries filtering by status first, then sorting by date. It does not help if you filter by created_at first.

Put the most selective column first. This isn't a hard rule—query patterns matter more than theory. Test both orderings with your actual queries.

Index Bloat Kills Write Performance

Every index you add slows down INSERT, UPDATE, and DELETE operations. A table with 15 indexes isn't optimized—it's over-indexed. Audit your indexes quarterly. Remove anything that hasn't been used in query plans for 6+ months.

Partial Indexes Save Space and Time

If you only query active users, index only active users:

CREATE INDEX idx_active_users_email 
ON users(email) 
WHERE status = 'active';

This index is smaller, fits in memory easier, and updates faster than a full table index.

Query Writing Patterns That Actually Matter

Avoid SELECT *

Nobody needs every column. SELECT * forces the database to read metadata for every column, prevents index-only scans, and breaks your application when schema changes. List the columns you actually need. That's not premature optimization—it's basic competence.

OR Conditions Break Index Usage

This query:

WHERE status = 'active' OR status = 'pending'

Should be:

WHERE status IN ('active', 'pending')

The OR version often forces a full scan. The IN version can use an index. Same result, completely different performance characteristics.

Functions on Indexed Columns Disable Indexes

This kills performance:

WHERE LOWER(email) = 'user@example.com'

Either store a lowercase version and index that, or use a case-insensitive collation. Functions on columns mean the database can't use the B-tree structure—it has to evaluate every row.

UNION vs UNION ALL

UNION removes duplicates. That requires sorting and comparing all rows. If you know there are no duplicates (or don't care), use UNION ALL. It simply appends rows. No deduplication overhead.

Join Strategies and When to Use Them

Different join types have different performance profiles. Your database picks one based on statistics, but you can force a specific type if needed.

Join Type Best For Warning
Hash Join Large tables, no indexes, equality conditions Requires memory; spills to disk if table doesn't fit
Nested Loop Small outer table, indexed inner table Can be slow if inner table isn't indexed
Merge Join Pre-sorted data, range conditions Needs sort operation if data isn't pre-sorted

Most of the time, let the query planner decide. Only override when you've measured and confirmed a better strategy exists.

Subqueries vs Joins: Pick the Right One

Correlated subqueries run once per row. That's usually terrible:

SELECT name, (SELECT COUNT(*) FROM orders WHERE orders.user_id = users.id) 
FROM users

This runs the subquery for every user row. A JOIN or window function does the same work in a single pass:

SELECT u.name, COUNT(o.id) 
FROM users u 
LEFT JOIN orders o ON o.user_id = u.id 
GROUP BY u.id, u.name

Non-correlated subqueries are fine—the database evaluates them once and treats them like a temporary table.

Common Mistakes That Make Simple Queries Slow

Getting Started: How to Fix a Slow Query

Here's the actual process:

  1. Find it. Enable slow query logging. Set threshold to 1 second initially. Review logs daily.
  2. Reproduce it. Run the query with EXPLAIN ANALYZE. Save the output.
  3. Identify the bottleneck. Look for scans, sorts, and high row estimates.
  4. Add an index. Test on a copy of production data first. Measure the improvement.
  5. Deploy and monitor. The fix worked in staging. It might not in production if data distribution differs.
  6. Document it. Why was this slow? What did you fix? Someone will need this context later.

Quick Wins Checklist

Tools Worth Using

Tool Database What It Does
pg_stat_statements PostgreSQL Tracks query performance over time
pt-query-digest MySQL Analyzes slow query logs
Query Store SQL Server Captures execution plans and performance
SQL Tuning Advisor Oracle Automated recommendations

Manual optimization beats automated tools most of the time. Tools suggest indexes. You understand your access patterns. Combine both.

When to Rewrite vs When to Add Hardware

Hardware is cheap. Developer time isn't. But sometimes throwing hardware at the problem is the right call:

Optimization has diminishing returns. If you've applied the basics and still have problems, consider whether the query belongs in a queue job rather than synchronous execution.