Introduction

Knowing which data structure to use in a given scenario is crucial for writing efficient code, so it is important to understand which ones Python offers and when to use each.

Lists & Tuples

Lists and tuples are both used to store collections of items in Python, but they have some key differences:

# Lists are mutable and ordered
my_list = [1, 2, 3]
 
# Tuples are immutable and ordered
my_tuple = (1, 2, 3)

Use tuples for fixed collections of items and lists for collections that may change.

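Keep in mind that tuples are generally a bit more memory efficient than lists, and creating them is usually slightly faster, because their size is fixed once they are built.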
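
The difference between the two can be checked directly. The following is a minimal sketch, assuming CPython; exact byte counts vary by version and platform, so no specific sizes are shown. The distances dictionary is only an illustrative name.

```python
import sys

my_list = [1, 2, 3]
my_tuple = (1, 2, 3)

# On CPython a tuple is typically smaller than a list with the same items
# (exact sizes depend on the interpreter version and platform)
print(sys.getsizeof(my_list), sys.getsizeof(my_tuple))

# Lists can be modified in place
my_list.append(4)

# Tuples cannot: item assignment raises TypeError
try:
    my_tuple[0] = 10
except TypeError as error:
    print(f"Cannot modify a tuple: {error}")

# Because tuples are immutable (and hashable when their items are),
# they can be used as dictionary keys
distances = {(0, 0): 0.0, (3, 4): 5.0}
print(distances[(3, 4)])  # 5.0
```

A list could not be used as a key in the same dictionary, since mutable lists are not hashable.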

TIP

Use tuples for data that shouldn’t change, and lists for collections that need to be modified.

Sets

Sets are unordered collections of unique items in Python. They are useful when you want to store a collection of items without duplicates and perform operations like union, intersection, and difference.

# Creating sets
set_a = {1, 2, 3, 4}
set_b = {3, 4, 5, 6}
 
# Union
union_set = set_a | set_b  # {1, 2, 3, 4, 5, 6}
 
# Intersection
intersection_set = set_a & set_b  # {3, 4}
 
# Difference
difference_set = set_a - set_b  # {1, 2}
 
# Symmetric Difference
symmetric_difference_set = set_a ^ set_b  # {1, 2, 5, 6}

You can also check if a set is a subset or superset of another set using the issubset() and issuperset() methods, or the equivalent <= and >= operators:

# Subset
is_subset = set_a.issubset(union_set)  # True
 
# Or using the <= operator
is_subset = set_a <= union_set  # True
 
# Superset
is_superset = union_set.issuperset(set_b)  # True
 
# Or using the >= operator
is_superset = union_set >= set_b  # True

TIP

Use sets when you need to eliminate duplicates or perform mathematical set operations.
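
As a small illustration of removing duplicates, here is a sketch with made-up email addresses (the data and variable names are hypothetical):

```python
emails = ["ana@example.com", "bob@example.com", "ana@example.com"]

# Converting to a set drops duplicates (the original order is not preserved)
unique_emails = set(emails)
print(len(unique_emails))  # 2

# sorted() gives a deterministic order when one is needed
print(sorted(unique_emails))  # ['ana@example.com', 'bob@example.com']

# Membership tests on a set are hash-based, so they do not scan every item
print("bob@example.com" in unique_emails)  # True
```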

Comprehensions

Comprehensions are a concise way to create lists, dictionaries, or sets in Python. They allow you to generate new collections by applying an expression to each item in an iterable.

# Traditional way
squares = []
for x in range(10):
    squares.append(x**2)
 
# List comprehension
squares = [x**2 for x in range(10)]

You can also do the same thing for dictionaries:

# Traditional way
squares_dict = {}
for x in range(10):
    squares_dict[x] = x**2
 
# Dictionary comprehension
squares_dict = {x: x**2 for x in range(10)}

They can also be used with conditional statements to filter elements:

# List comprehension with condition
even_squares = [x**2 for x in range(10) if x % 2 == 0]
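
The same syntax works for sets, and conditions can be combined with dictionary comprehensions too. A short sketch, using a hypothetical list of words:

```python
words = ["apple", "banana", "avocado", "blueberry", "cherry"]

# Set comprehension: curly braces with a single expression
first_letters = {word[0] for word in words}
print(sorted(first_letters))  # ['a', 'b', 'c']

# Dictionary comprehension with a condition
long_word_lengths = {word: len(word) for word in words if len(word) > 6}
print(long_word_lengths)  # {'avocado': 7, 'blueberry': 9}
```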

They’re not only more concise but usually also faster than the equivalent loop, since the interpreter avoids looking up and calling append() on every iteration. This makes them both efficient and part of the Pythonic way of coding.
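
One way to check the speed claim on your own machine is the standard library's timeit module. This is a rough sketch, not a rigorous benchmark; the function names are hypothetical and the timings depend on the machine and Python version, so no expected numbers are shown.

```python
import timeit


def squares_with_loop(n):
    result = []
    for x in range(n):
        result.append(x**2)
    return result


def squares_with_comprehension(n):
    return [x**2 for x in range(n)]


# Both approaches build the same list
assert squares_with_loop(1000) == squares_with_comprehension(1000)

# Time 1,000 calls of each function
loop_time = timeit.timeit(lambda: squares_with_loop(1000), number=1000)
comprehension_time = timeit.timeit(lambda: squares_with_comprehension(1000), number=1000)
print(f"loop: {loop_time:.3f}s, comprehension: {comprehension_time:.3f}s")
```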

TIP

Use comprehensions when transforming or filtering data. They’re more readable and performant than traditional loops for simple transformations.

Other

defaultdict

defaultdict is a subclass of the built-in dict class in Python. In simple terms, when you access a key that doesn’t exist in a defaultdict using square brackets, it automatically creates that key with a default value produced by the factory function you passed in, instead of raising a KeyError.

from collections import defaultdict
 
count_dict = defaultdict(int)
items = ['apple', 'banana', 'orange', 'apple', 'banana', 'apple']
for item in items:
    count_dict[item] += 1
print(count_dict)  # defaultdict(<class 'int'>, {'apple': 3, 'banana': 2, 'orange': 1})

This is cleaner than using a regular dictionary, where you would have to check whether the key exists before incrementing its value.
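
For comparison, here is a sketch of the same counting logic with a plain dict, first with an explicit key check and then with dict.get():

```python
items = ['apple', 'banana', 'orange', 'apple', 'banana', 'apple']

# Regular dict with an explicit key check
count_dict = {}
for item in items:
    if item not in count_dict:
        count_dict[item] = 0
    count_dict[item] += 1
print(count_dict)  # {'apple': 3, 'banana': 2, 'orange': 1}

# dict.get() with a default is shorter, but the default is repeated on every update
count_with_get = {}
for item in items:
    count_with_get[item] = count_with_get.get(item, 0) + 1
print(count_with_get == count_dict)  # True
```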

TIP

Use defaultdict when building dictionaries where you’re accumulating values (counting, grouping, or collecting items). It eliminates the need for key existence checks.
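
Grouping works the same way with list as the default factory. A minimal sketch with a hypothetical list of words:

```python
from collections import defaultdict

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

# list as the default factory: a missing key starts as an empty list
words_by_letter = defaultdict(list)
for word in words:
    words_by_letter[word[0]].append(word)

print(dict(words_by_letter))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}

# Note: reading a missing key with [] also inserts it
print('z' in words_by_letter)  # False
words_by_letter['z']
print('z' in words_by_letter)  # True
```

Because a lookup with square brackets inserts the key, use the in operator or .get() when you only want to check for a key without creating it.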

dataclasses

Dataclasses, introduced in Python 3.7, simplify the creation of classes that are primarily used to store data. They are not a data structure per se, but serve a similar purpose.

from dataclasses import dataclass
 
@dataclass
class Point:
    x: int
    y: int
 
some_point = Point(1, 2)
print(some_point.x)  # Output: 1

The @dataclass decorator automatically generates special methods such as __init__(), __repr__(), and __eq__() from the annotated class attributes, reducing boilerplate code and making it easier to create classes that represent data.
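
A short sketch of what the generated methods give you. FrozenPoint is a hypothetical variant added here to show the frozen option:

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass
class Point:
    x: int
    y: int


# The generated __repr__ gives a readable representation
print(Point(1, 2))  # Point(x=1, y=2)

# The generated __eq__ compares instances field by field
print(Point(1, 2) == Point(1, 2))  # True


# frozen=True makes instances immutable
@dataclass(frozen=True)
class FrozenPoint:
    x: int
    y: int


origin = FrozenPoint(0, 0)
try:
    origin.x = 5
except FrozenInstanceError:
    print("FrozenPoint instances cannot be modified")
```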

TIP

Use them as a lightweight, built-in alternative to libraries like Pydantic when you only need simple data structures and don’t need runtime validation.
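
One difference worth keeping in mind is that dataclasses do not check type hints at runtime. The sketch below shows this, along with one possible way to add a manual check through __post_init__; ValidatedPoint is a hypothetical class for illustration, not a replacement for a validation library.

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int


# Type hints are not enforced at runtime: this does not raise
not_really_a_point = Point("one", "two")
print(not_really_a_point)  # Point(x='one', y='two')


# Hypothetical variant with manual validation in __post_init__
@dataclass
class ValidatedPoint:
    x: int
    y: int

    def __post_init__(self):
        for name in ("x", "y"):
            if not isinstance(getattr(self, name), int):
                raise TypeError(f"{name} must be an int")


try:
    ValidatedPoint("one", "two")
except TypeError as error:
    print(error)  # x must be an int
```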