Semi-Join with EXISTS
intermediatejoins
2 min read
What is a Semi-Join?
A semi-join returns rows from the left table where at least one match exists in the right table. Unlike a regular join, it does not duplicate left rows when multiple matches exist.
PostgreSQL does not have a SEMI JOIN keyword, but the EXISTS subquery achieves the same result.
Syntax
SELECT a.*
FROM table_a a
WHERE EXISTS (
SELECT 1 FROM table_b b WHERE b.a_id = a.id
);When to Use
- Finding students who are enrolled in at least one course
- Checking for the existence of related data without duplicating rows
- When you only need columns from the left table
Key Points
- No Duplicates — Unlike INNER JOIN, a semi-join returns each left row at most once.
- EXISTS — The subquery only needs to find one match; it stops early.
- Performance — Often faster than DISTINCT + INNER JOIN for large datasets.
- Alternative: IN —
WHERE id IN (SELECT a_id FROM b)is similar but EXISTS handles NULLs better. - Correlated — The EXISTS subquery references the outer query, making it a correlated subquery.
Guided Practice
Solve the challenge below. Use hints when stuck and check your answer for instant feedback.
Practice challengeGuided learning mode
Semi-Join with EXISTS Challenge
Write a query that find students who have enrollments.
Expected result
Each student who appears in enrollments, listed once (no duplicates).
Hidden checks
- Returned rows and values
- Output columns and result shape
- Final database state after the query runs
Lesson guidance
What is a Semi-Join?
Initializing database...Each run starts from fresh sample data.
More Examples
Products that have been ordered
Find products with at least one order.
Initializing database...Each run starts from fresh sample data.
Frequently Asked Questions
How is a semi-join different from an inner join?
An inner join can produce duplicate left rows if there are multiple matches. A semi-join returns each left row at most once.
Is EXISTS faster than IN?
They are usually optimized the same way in PostgreSQL. EXISTS is preferred when the subquery might return NULLs or when you want to short-circuit after the first match.