🐍 Python Course

1.5 Sets

Uniqueness, fast membership, union, and intersection

Sets are one of the most important data structures in Python and one of the most useful for coding interviews.

A set stores unique values only.

This makes it ideal for problems involving:

  • duplicate detection
  • fast membership checks
  • finding common elements
  • removing duplicates
  • visited-node tracking in graphs
  • set algebra (union / intersection)

A large number of interview questions become dramatically simpler once you recognize that a set is the right tool.

Examples:

  • check if duplicates exist
  • find common values between two lists
  • keep track of visited items
  • compute differences between datasets

This chapter focuses on understanding why sets are powerful, not just memorizing methods.


1.5.1 Mental model: what a set really is

A set is an unordered collection of unique hashable values.

Example

nums = {1, 2, 3}

Important properties

  • unordered → no guaranteed position
  • unique → duplicates are removed automatically
  • mutable → you can add / remove items

Uniqueness

This is the defining feature.

nums = {1, 2, 2, 3}

Result

{1, 2, 3}

The duplicate 2 is automatically removed.

Interviewers rely on this behavior often, for example when asking you to deduplicate values or detect repeats.
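A short sketch of these properties in practice. The variable names are illustrative only. Note that {} creates an empty dict, so an empty set has to be written as set().

```python
# Duplicates collapse as soon as the set is built.
nums = {1, 2, 2, 3}
size = len(nums)  # 3 elements, not 4

# Elements must be hashable: tuples work, lists do not.
points = {(0, 0), (1, 2)}
try:
    bad = {[0, 0]}          # raises TypeError (unhashable type: 'list')
    list_allowed = True
except TypeError:
    list_allowed = False

# {} is an empty dict, not an empty set.
empty_set = set()
empty_dict = {}
```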


1.5.2 Why sets are powerful

The two biggest reasons:

Uniqueness

Automatic deduplication.

Fast membership

Average lookup complexity:

O(1)

Interviewers ask about this complexity very often. The worst case is O(n), when many values collide in the same bucket, but this is rare in practice.

Membership checks

3 in nums

For large collections, this is usually much faster than:

3 in [1, 2, 3]

because list membership scans the elements one by one, which is O(n).

A strong verbal answer should explicitly mention hashing.
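A minimal sketch of how this plays out when many lookups are needed. The function name and the sample data are hypothetical; the point is that the list is converted to a set once, and every later check is O(1) on average.

```python
def count_allowed(requests: list[str], allowed_users: list[str]) -> int:
    """Count how many requests come from an allowed user."""
    allowed = set(allowed_users)  # built once, O(len(allowed_users))
    # Each `user in allowed` check is O(1) on average.
    return sum(1 for user in requests if user in allowed)


hits = count_allowed(["ann", "bob", "ann", "eve"], ["ann", "bob"])
```

Checking against the original list instead would repeat an O(n) scan for every request.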


1.5.3 How set lookup works conceptually

Sets use hashing, just like dictionaries.

Mental model

value -> hash -> bucket -> exists?

This is why lookup is usually constant time.

A strong interview answer should mention hash table semantics.
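The mental model above can be sketched as a toy class. This is an illustration only, under the assumption of a fixed number of list-based buckets; the built-in set uses a different, heavily optimized layout. ToySet and its methods are hypothetical names.

```python
class ToySet:
    """Toy hash set: value -> hash -> bucket -> exists?"""

    def __init__(self, n_buckets: int = 8):
        # Assumption for the sketch: a fixed number of buckets, no resizing.
        self._buckets = [[] for _ in range(n_buckets)]

    def _bucket_for(self, value):
        # hash() maps the value to an int; modulo picks one bucket.
        return self._buckets[hash(value) % len(self._buckets)]

    def add(self, value):
        bucket = self._bucket_for(value)
        if value not in bucket:  # uniqueness: store each value once
            bucket.append(value)

    def __contains__(self, value):
        # Only one small bucket is searched, not the whole collection.
        return value in self._bucket_for(value)


s = ToySet()
s.add("a")
s.add("a")
stored = sum(len(bucket) for bucket in s._buckets)
```

Lookup stays fast as long as values spread evenly across buckets, which is why the O(1) figure is an average rather than a guarantee.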


1.5.4 Common set operations

add()

nums = {1, 2}
nums.add(3)

Result

{1, 2, 3}

remove()

nums.remove(2)

Removes the element.

If it does not exist, raises:

KeyError

discard()

Safer version.

nums.discard(5)

Does not raise an error if missing.

This is useful in production code when a missing element should not be treated as an error.
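A small side-by-side sketch of remove() and discard(). The values are arbitrary.

```python
nums = {1, 2, 3}

nums.remove(2)        # present, so it is removed
try:
    nums.remove(99)   # absent, so KeyError is raised
    raised = False
except KeyError:
    raised = True

nums.discard(99)      # absent, silently ignored
nums.discard(1)       # present, removed just like remove()
```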


1.5.5 Remove duplicates from a list

This is one of the most common set interview patterns.

nums = [1, 2, 2, 3, 1]
unique = list(set(nums))

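Result (one possible ordering)

[1, 2, 3]

Important note

Order is not guaranteed: a set does not record insertion order, so list(set(nums)) may return the elements in any order.

This is a very common interview follow-up.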

Preserve order while removing duplicates

The usual approach is to track seen values in a set while building the result list.

def remove_duplicates(nums):
    seen = set()
    result = []

    for num in nums:
        if num not in seen:
            seen.add(num)
            result.append(num)

    return result

This preserves the original order and runs in O(n) time, because each check against seen is O(1) on average.

Strong interview pattern.
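A compact alternative to the explicit loop, assuming Python 3.7 or later, where dict keys keep insertion order. The function name is hypothetical. It is worth knowing, although interviewers may still want to see the loop written by hand.

```python
def dedupe_with_dict(items):
    """Order-preserving dedupe using dict keys (insertion-ordered since Python 3.7)."""
    # dict.fromkeys keeps the first occurrence of each key, in order.
    return list(dict.fromkeys(items))


result = dedupe_with_dict([3, 1, 3, 2, 1])
```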


1.5.6 Union

One of the key topics for sets.

Union means:

all unique elements from both sets

Method syntax

a = {1, 2, 3}
b = {3, 4, 5}

a.union(b)

Result

{1, 2, 3, 4, 5}

Operator syntax

a | b

Same result.

This operator form is very common.

Interview meaning

A strong verbal explanation:

union combines both sets while keeping only unique values
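A short sketch of the differences between the method and the operator. The method accepts any iterable, while the operator needs a set on both sides.

```python
a = {1, 2, 3}
b = {3, 4, 5}

combined = a | b                # new set; a and b are unchanged
from_list = a.union([3, 4, 5])  # the method accepts any iterable

try:
    a | [3, 4, 5]               # the operator rejects a list operand
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False

a |= b                          # in-place union: a now holds all five values
```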


1.5.7 Intersection

Intersection means:

only elements present in both sets

Method syntax

a.intersection(b)

Result (with a = {1, 2, 3} and b = {3, 4, 5} as defined earlier)

{3}

Operator syntax

a & b

Same result.

Very commonly used in interviews.

Real-world example

backend_skills = {"python", "sql", "docker"}
candidate_skills = {"python", "aws", "docker"}

backend_skills & candidate_skills

Result

{"python", "docker"}

This is a very intuitive interview example.
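The same idea extended to several candidates, as a sketch. The candidate names and skill sets are made up for illustration.

```python
required = {"python", "sql", "docker"}
candidates = {
    "ana": {"python", "aws", "docker"},
    "ben": {"java", "sql"},
    "cy": {"go", "rust"},
}

# Number of required skills each candidate has.
overlap = {name: len(required & skills) for name, skills in candidates.items()}

# Candidates whose intersection with the required skills is empty.
no_match = [name for name, skills in candidates.items() if required.isdisjoint(skills)]
```

isdisjoint() answers "is the intersection empty?" without building the intersection set.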


1.5.8 Difference and symmetric difference

These are essential set operations.

Difference

Difference means:

values in the first set that are not in the second

Method syntax

a.difference(b)

Result (with a = {1, 2, 3} and b = {3, 4, 5} as defined earlier)

{1, 2}

Operator syntax

a - b

Very common in practical code.

Symmetric difference

a ^ b

Result

{1, 2, 4, 5}

Values present in exactly one set.
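A quick sketch showing that difference depends on operand order, while symmetric difference does not. It reuses the a and b values from the union example.

```python
a = {1, 2, 3}
b = {3, 4, 5}

only_a = a - b   # values in a that are not in b
only_b = b - a   # values in b that are not in a
either = a ^ b   # values in exactly one of the two sets

# Symmetric difference is the union of the two one-sided differences.
same = either == ((a - b) | (b - a))
```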


1.5.9 Important interview patterns

Sets are heavily used in:

Duplicate detection

len(nums) != len(set(nums))

Visited tracking

visited = set()

Used in DFS / BFS.
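A minimal BFS sketch showing where the visited set fits. It assumes the graph is an adjacency list stored in a dict; the function name and the sample graph are hypothetical.

```python
from collections import deque


def reachable(graph: dict[str, list[str]], start: str) -> set[str]:
    """Return every node reachable from start, using BFS."""
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in visited:  # O(1) average check prevents revisits
                visited.add(neighbor)
                queue.append(neighbor)
    return visited


graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": [], "e": ["a"]}
found = reachable(graph, "a")
```

Without the visited set, the cycle between "a" and "b" would make the loop run forever.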

Common elements

common = set(a) & set(b)

Classic interview pattern.


1.5.10 Common interview problems

Sets appear in:

  • duplicates
  • first duplicate
  • two sum variants
  • graph traversal
  • common elements
  • unique substring problems

This chapter is foundational.
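Two of these problems share the same "seen set" shape. The sketches below are one possible approach, not the only one, and the function names are hypothetical.

```python
def first_duplicate(nums):
    """Return the first value that repeats, or None if all values are unique."""
    seen = set()
    for num in nums:
        if num in seen:
            return num
        seen.add(num)
    return None


def has_pair_with_sum(nums, target):
    """Two-sum variant: is there a pair of elements that adds up to target?"""
    seen = set()
    for num in nums:
        # The complement must have appeared earlier in the list.
        if target - num in seen:
            return True
        seen.add(num)
    return False
```

Both run in O(n) time and O(n) extra space, compared with O(n²) for the nested-loop versions.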


Verbal interview questions

Answer these out loud:

  • Why is set membership usually O(1)?
  • Why use a set over a list?
  • Explain union
  • Explain intersection
  • Why does converting list → set lose ordering?

Coding drills

Drill 1: duplicates

def has_duplicates(nums: list[int]) -> bool:
    ...

Drill 2: common elements

def common(a: list[int], b: list[int]) -> set[int]:
    ...

Use intersection.

Drill 3: preserve order dedupe

def dedupe_preserve_order(nums: list[int]) -> list[int]:
    ...

Must-answer interview questions

  • list vs tuple
  • set vs list for membership checks
  • dict lookup complexity
  • when ordering matters

Additional coding drills

  • word frequency counter
  • deduplicate list while preserving order
  • flatten nested lists
  • group objects by field