1.5 Sets
Uniqueness, fast membership, union, and intersection
Sets are one of the most important data structures in Python and one of the most useful for coding interviews.
A set stores unique values only.
This makes it ideal for problems involving:
- duplicate detection
- fast membership checks
- finding common elements
- removing duplicates
- visited-node tracking in graphs
- set algebra (union / intersection)
A large number of interview questions become dramatically simpler once you recognize that a set is the right tool.
Examples:
- check if duplicates exist
- find common values between two lists
- keep track of visited items
- compute differences between datasets
This chapter focuses on understanding why sets are powerful, not just memorizing methods.
1.5.1 Mental model: what a set really is
A set is an unordered collection of unique hashable values.
Example
nums = {1, 2, 3}
Important properties
- unordered → no guaranteed position
- unique → duplicates are removed automatically
- mutable → you can add / remove items
Uniqueness
This is the defining feature.
nums = {1, 2, 2, 3}
Result
{1, 2, 3}
The duplicate 2 is automatically removed.
This is extremely important for interviews.
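The properties above can be checked directly. The following is a minimal sketch with illustrative variable names (`empty`, `points`, `error_message` are not from the text). It also shows two details that tend to come up alongside the definition: `{}` creates a dict rather than an empty set, and unhashable values such as lists cannot be stored in a set.

```python
# Duplicates collapse when the set is built.
nums = {1, 2, 2, 3}
print(len(nums))  # 3

# {} is an empty dict; use set() for an empty set.
empty = set()
print(type({}).__name__, type(empty).__name__)  # dict set

# Elements must be hashable: a tuple works, a list does not.
points = {(0, 0), (1, 2)}
try:
    bad = {[0, 0]}
except TypeError as exc:
    # The exact wording of the message varies between Python versions.
    error_message = str(exc)
    print("TypeError:", error_message)
```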
1.5.2 Why sets are powerful
The two biggest reasons:
Uniqueness
Automatic deduplication.
Fast membership
Average lookup complexity:
O(1)
Interviewers frequently ask for this complexity. The worst case is O(n), which happens when many values collide in the same bucket.
Membership checks
3 in nums
This is usually much faster than:
3 in [1, 2, 3]
because list membership is O(n).
A strong verbal answer should explicitly mention hashing.
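As a small illustration of when the difference matters, the sketch below converts a list to a set once and then performs repeated membership checks against it. The names `allowed_list`, `requests`, and `granted` are made up for this example.

```python
allowed_list = ["read", "write", "admin"]
allowed = set(allowed_list)  # one O(n) conversion up front

requests = ["read", "delete", "admin"]

# Each `in` check against the set is O(1) on average,
# instead of scanning allowed_list every time.
granted = [r for r in requests if r in allowed]
print(granted)  # ['read', 'admin']
```

The conversion only pays off when there are several lookups; for a single check, scanning the list directly is just as good.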
1.5.3 How set lookup works conceptually
Sets use hashing, just like dictionaries.
Mental model
value -> hash -> bucket -> exists?
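Because the hash points directly at the bucket to inspect, lookup takes constant time on average, regardless of how many elements the set holds.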
A strong interview answer should mention hash table semantics.
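The mental model above can be sketched as a toy class. This is a simplified illustration, not how CPython implements `set` (the real implementation uses open addressing and resizes its table); `ToyHashSet` and its methods are hypothetical names introduced only for this example.

```python
class ToyHashSet:
    """Toy hash set with a fixed number of buckets (illustration only)."""

    def __init__(self, bucket_count: int = 8) -> None:
        self._buckets: list[list[object]] = [[] for _ in range(bucket_count)]

    def _bucket_for(self, value: object) -> list[object]:
        # value -> hash -> bucket
        return self._buckets[hash(value) % len(self._buckets)]

    def add(self, value: object) -> None:
        bucket = self._bucket_for(value)
        if value not in bucket:  # keep values unique
            bucket.append(value)

    def __contains__(self, value: object) -> bool:
        # Only one small bucket is scanned, not the whole collection.
        return value in self._bucket_for(value)


toy = ToyHashSet()
for n in (1, 2, 2, 3):
    toy.add(n)

print(3 in toy)   # True
print(10 in toy)  # False
```

Since this toy version never grows its bucket list, buckets get longer as more values are added; a real hash table resizes to keep lookups close to constant time.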
1.5.4 Common set operations
add()
nums = {1, 2}
nums.add(3)
Result
{1, 2, 3}
remove()
nums.remove(2)
Removes the element.
If it does not exist, raises:
KeyError
discard()
Safer version.
nums.discard(5)
Does not raise an error if missing.
This is often useful in production code.
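The difference between the two removal methods can be seen side by side. A minimal sketch; `missing_raised` is an illustrative flag, not part of any API.

```python
nums = {1, 2, 3}

nums.remove(2)    # present: removed
nums.discard(5)   # absent: silently ignored

try:
    nums.remove(5)  # absent: raises KeyError
except KeyError:
    missing_raised = True
else:
    missing_raised = False

print(sorted(nums))    # [1, 3]
print(missing_raised)  # True
```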
1.5.5 Remove duplicates from a list
This is one of the most common set interview patterns.
nums = [1, 2, 2, 3, 1]
unique = list(set(nums))
Result
[1, 2, 3]
Important note
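Order is not guaranteed: a set does not track the positions of the original list, so the elements may come back in a different order.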
This is a very common interview follow-up.
Preserve order while removing duplicates
Interviewers often ask this as a follow-up.
def remove_duplicates(nums):
    seen = set()
    result = []
    for num in nums:
        if num not in seen:
            seen.add(num)
            result.append(num)
    return result
This preserves original order.
Strong interview pattern.
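A shorter alternative worth knowing relies on the fact that dict keys are unique and keep insertion order (guaranteed since Python 3.7). This swaps the explicit seen-set loop for `dict.fromkeys`; the function name `dedupe_with_dict` is hypothetical.

```python
def dedupe_with_dict(items):
    # dict.fromkeys builds a dict whose keys are the items;
    # repeated keys are dropped and first-seen order is kept.
    return list(dict.fromkeys(items))


print(dedupe_with_dict([1, 2, 2, 3, 1]))  # [1, 2, 3]
print(dedupe_with_dict(["b", "a", "b"]))  # ['b', 'a']
```

Both versions require hashable items. The explicit loop is usually the better one to write in an interview because it shows the underlying idea.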
1.5.6 Union
One of the key topics for sets.
Union means:
all unique elements from both sets
Method syntax
a = {1, 2, 3}
b = {3, 4, 5}
a.union(b)
Result
{1, 2, 3, 4, 5}
Operator syntax
a | b
Same result.
This operator form is very common.
Interview meaning
A strong verbal explanation:
union combines both sets while keeping only unique values
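A few details of union are easy to miss, so here is a brief sketch. The names `combined`, `from_list`, and `a_copy` are illustrative. It shows that `|` returns a new set, that the method form accepts any iterable, and that `|=` updates a set in place.

```python
a = {1, 2, 3}
b = {3, 4, 5}

combined = a | b             # new set; a and b are unchanged
from_list = a.union([5, 6])  # the method accepts any iterable

a_copy = set(a)
a_copy |= b                  # in-place update, same as a_copy.update(b)

print(sorted(combined))    # [1, 2, 3, 4, 5]
print(sorted(from_list))   # [1, 2, 3, 5, 6]
print(a_copy == combined)  # True
```

The operator form requires both operands to be sets: `a | [5, 6]` raises a TypeError, while `a.union([5, 6])` works.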
1.5.7 Intersection
Intersection means:
only elements present in both sets
Method syntax
a.intersection(b)
Result
{3}
Operator syntax
a & b
Same result.
Very commonly used in interviews.
Real-world example
backend_skills = {"python", "sql", "docker"}
candidate_skills = {"python", "aws", "docker"}
backend_skills & candidate_skills
Result
{"python", "docker"}
This is a very intuitive interview example.
1.5.8 Difference and symmetric difference
These are essential set operations.
Difference
Difference means:
values in the first set that are not in the second (so a - b and b - a generally differ)
Method syntax
a.difference(b)
Result
{1, 2}
Operator syntax
a - b
Very common in practical code.
Symmetric difference
a ^ b
Result
{1, 2, 4, 5}
Values present in exactly one set.
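To make the direction of difference concrete, the sketch below reuses the skills sets from the intersection example. The result names `missing`, `extra`, and `mismatch` are illustrative.

```python
backend_skills = {"python", "sql", "docker"}
candidate_skills = {"python", "aws", "docker"}

missing = backend_skills - candidate_skills   # required but not held
extra = candidate_skills - backend_skills     # held but not required
mismatch = backend_skills ^ candidate_skills  # in exactly one of the sets

print(sorted(missing))   # ['sql']
print(sorted(extra))     # ['aws']
print(sorted(mismatch))  # ['aws', 'sql']
```

Symmetric difference is the union of the two one-directional differences, which is why `mismatch` equals `missing | extra`.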
1.5.9 Important interview patterns
Sets are heavily used in:
Duplicate detection
len(nums) != len(set(nums))
This is True when nums contains at least one duplicate, because the set drops the repeated values.
Visited tracking
visited = set()
Used in DFS / BFS.
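A minimal BFS sketch showing where the visited set fits. It assumes the graph is given as an adjacency dict mapping each node to a list of neighbors; the function name `bfs_order` and the sample graph are hypothetical.

```python
from collections import deque


def bfs_order(graph: dict[str, list[str]], start: str) -> list[str]:
    visited = {start}        # nodes already queued or processed
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph.get(node, []):
            if neighbor not in visited:  # O(1) average check
                visited.add(neighbor)
                queue.append(neighbor)
    return order


# The edge d -> a forms a cycle; the visited set stops the loop.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["a"]}
print(bfs_order(graph, "a"))  # ['a', 'b', 'c', 'd']
```

Without the visited set, the cycle in this graph would make the traversal run forever.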
Common elements
common = set(a) & set(b)
Classic interview pattern.
1.5.10 Common interview problems
Sets appear in:
- duplicates
- first duplicate
- two sum variants
- graph traversal
- common elements
- unique substring problems
This chapter is foundational.
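As one example from the list above, here is a sketch of a two sum variant that only asks whether a pair exists. It is one possible formulation of the problem, not the only one, and `has_pair_with_sum` is a hypothetical name.

```python
def has_pair_with_sum(nums: list[int], target: int) -> bool:
    seen = set()  # values encountered so far
    for num in nums:
        # If the complement was seen earlier, the pair exists.
        if target - num in seen:
            return True
        seen.add(num)
    return False


print(has_pair_with_sum([2, 7, 11, 15], 9))  # True
print(has_pair_with_sum([1, 2, 3], 7))       # False
```

The set turns the naive O(n^2) pairwise comparison into a single O(n) pass on average.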
Verbal interview questions
Answer these out loud:
- Why is set membership usually O(1)?
- Why use a set over a list?
- Explain union
- Explain intersection
- Why does converting list → set lose ordering?
Coding drills
Drill 1: duplicates
def has_duplicates(nums: list[int]) -> bool:
    ...
Drill 2: common elements
def common(a: list[int], b: list[int]) -> set[int]:
    ...
Use intersection.
Drill 3: preserve order dedupe
def dedupe_preserve_order(nums: list[int]) -> list[int]:
    ...
Must-answer interview questions
- list vs tuple
- set vs list for membership checks
- dict lookup complexity
- when ordering matters
Additional coding drills
- word frequency counter
- deduplicate list while preserving order
- flatten nested lists
- group objects by field