Like it!

Join us on Facebook!

Like it!

Writing a custom iterator in modern C++

An experimental Forward Iterator written from scratch to boost up hand-made containers.

An iterator is an object that points to an element inside a container. Like a pointer, an iterator can be used to access the element it points to and can be moved through the content of the container. Each container in the C++ Standard Library provides its own iterator, as well as some methods to retrieve it. Using iterators is quite easy: obtain an instance from a container, move it around where needed and then get the pointed element.

Concretely, an iterator is a simple class that provides a bunch of operators: increment ++, dereference * and few others which make it very similar to a pointer and the arithmetic operations you can perform on it. In fact, iterators are a generalization of pointers, which are often used as a foundation when writing the iterator itself.

Iterators are one of the building blocks of the Standard Library containers, but they are also useful when you want to provide the ability to iterate over elements of a custom container that you wrote yourself: this is what I want to investigate in the present article. Adding iterators to your containers will make them compatible with the range-based for loops and the C++ Algorithms library: a collection of functions for searching, sorting, counting and manipulating containers, based on iterators.

A dummy container for our experiments

Before digging deeper, let's define a silly custom container that we want to spice up with iterators:

class Integers
{
private:
    int m_data[200];
};

The Integers class is a wrapper around a raw array of ints: we want to be able to access elements of that private array through an iterator, as well as to loop over it or pass it to any of the Standard Library algorithms. Let's start by making some design decisions.

Choose the nature of our iterator

The first step is to choose the type of iterator we want to implement. Modern C++ defines six types:

# Name Description
1 Input Iterator Can scan the container forward only once, can't change the value it points to (read-only);
2 Output Iterator Can scan the container forward only once, can't read the value it points to (write-only);
3 Forward Iterator Can scan the container forward multiple times, can read and write the value it points to;
4 Bidirectional Iterator Same as previous one but can scan the container back and forth;
5 Random Access Iterator Same as previous one but can access the container also non-sequentially (i.e. by jumping around);
6 Contiguous Iterator Same as previous one, with the addition that logically adjacent elements are also physically adjacent in memory.

The six categories are hierarchical: a Bidirectional Iterator is also a Forward Iterator and a Random Access Iterator is both a Bidirectional and a Forward Iterator and so on. Normally, all iterators are Input Iterators (1) which makes them read-only, also known as constant iterators. Iterators that both support read and write operations are also Output Iterators (2) and are called mutable iterators.

Iterators hierarchy
1. Iterators hierarchy in C++. All iterators are Input Iterators, Output Iterators or both.

Input and Output iterators are often used for low-level components such as input and output streams (the so-called single-pass algorithms) and thus have limitations. We want to do more with our custom container, so we will skip those two and jump straight to the mutable Forward Iterator.

Prepare the custom iterator

An iterator is usually declared inside the class it belongs to, for example:

class Integers
{
public:
    struct Iterator { /* ... */ };

    // ...
};

The first thing to do is to assign the iterator some properties. Until C++17 this is done by tagging it with the tag dispatch mechanism, while C++20 uses concepts: in this article I will follow the traditional approach.

C++ expects some properties from an iterator:

  • iterator_category — one of the six iterator categories we have seen above. The full list is available here. The std::forward_iterator_tag tag is what we need;
  • difference_type — a signed integer type that can be used to identify distance between iterator steps. Our iterator is basically a wrapper around a pointer and leverages pointer arithmetic, so the default std::ptrdiff_t is a good choice;
  • value_type — the type the iterator iterates over. int in our case;
  • pointer — defines a pointer to the type iterated over. int* in our case;
  • reference — defines a reference to the type iterated over. int& in our case;

Translated into code:

#include <iterator> // For std::forward_iterator_tag
#include <cstddef>  // For std::ptrdiff_t

struct Iterator 
{
    using iterator_category = std::forward_iterator_tag;
    using difference_type   = std::ptrdiff_t;
    using value_type        = int;
    using pointer           = int*;  // or also value_type*
    using reference         = int&;  // or also value_type&
};

Why are tags useful?

Some of the tags above might seem useless at first. In fact, you will notice how they will never get mentioned during the definition of our iterator. Tags are used to select the most efficient algorithm if your container is passed to one of the Standard Library functions from the <algorithm> library. Wrong tags mean sub-optimal performance! The iterator category is also used to set algorithm requirements, for example: std::fill wants a Forward Iterator, while std::reverse wants a Bidirectional Iterator. Passing the wrong iterator will result in a compilation error.

Define the iterator constructors

All iterators must be constructible, copy-constructible, copy-assignable, destructible and swappable. Let's translate those requirements into code for our iterator:

struct Iterator 
{
    // Iterator tags here...

    Iterator(pointer ptr) : m_ptr(ptr) {}

private:

    pointer m_ptr;
};

Easy! We just need a custom constructor to initialize the private member variable m_ptr, which points to an element of the Integers container. The custom constructor satisfies the constructible requirement, while all others are covered by the implicitly-declared constructors and operators kindly provided by the compiler.

Implement operators

We are building a mutable Forward Iterator, which inherits properties from both Input and Output Iterators. The resulting iterator must support the following operations:

  • *iterator and iterator->x — dereferenceable, to get the value it points to;
  • ++iterator and iterator++ — incrementable, to move it one step forward, both prefix and postfix versions. The latter must return something dereferenceable;
  • iterator_a == iterator_b and iterator_a != iterator_b — comparable with another iterator;

This is done by implementing some custom operators in the Iterator class, like this:

struct Iterator 
{
    // Iterator tags here...

    // Iterator constructors here...

    reference operator*() const { return *m_ptr; }
    pointer operator->() { return m_ptr; }

    // Prefix increment
    Iterator& operator++() { m_ptr++; return *this; }  

    // Postfix increment
    Iterator operator++(int) { Iterator tmp = *this; ++(*this); return tmp; }

    friend bool operator== (const Iterator& a, const Iterator& b) { return a.m_ptr == b.m_ptr; };
    friend bool operator!= (const Iterator& a, const Iterator& b) { return a.m_ptr != b.m_ptr; };     

private:

    pointer m_ptr;
};

As you can see every operator involves the usage of the private pointer m_ptr. Also, notice the friend declaration for the two comparison operators: this is handy way to define the operators as non-member functions, yet being able to access private parts of the Iterator class (rationale here).

Prepare the container

Our iterator is good to go. The last step is to give our custom container the ability to create Iterator objects. This is done by adding two public methods begin() and end() that return instances of the Iterator class, representing the first and the last element respectively:

class Integers
{
public:

    // Iterator definition here ...

    Iterator begin() { return Iterator(&m_data[0]); }
    Iterator end()   { return Iterator(&m_data[200]); } // 200 is out of bounds
};

The end() method returns an iterator that refers to an invalid memory address, past the end of our raw array. Such iterator is just a placeholder used to determine when the boundary has been reached: it should never be accessed directly.

Time to test our iterator

Both the custom container and its iterator are now ready. Let's test them with the range-based for loop:

Integers integers;
for (auto i : integers)
    std::cout << i << "\n";

This code will magically print the value of each integer in the container. It works because the range-based for loop is just syntactic sugar created by the compiler for the following:

for (auto it = integers.begin(), end = integers.end(); it != end; ++it) { 
    const auto i = *it; 
    std::cout << i << "\n";
}

In words: two iterators it and end are created. The first one points to the beginning of the container, the other one points to the end. Then, on each loop, the it iterator is incremented until it's equal to end, that is until the end of the container has been reached. The actual value is obtained by dereferencing it in a local variable before being printed.

Notice how the compiler makes use of all the operators and functions we have previously implemented: the begin() and end() methods in the custom container, the ability to compare the two iterators with the != operator, the ability to increment it with the prefix syntax and finally the ability to dereference it to grab the actual value it points to.

Let's now try a function from the Algorithm library, std::fill for example:

Integers integers;

std::fill(integers.begin(), integers.end(), 3);

The function assigns all elements in the container the value 3. It works because std::fill is usually implemented like this:

template <typename ForwardIterator, typename T>
void fill(ForwardIterator first, ForwardIterator last, const T& value)
{
    for (; first != last; ++first)
        *first = value;
}

Note that our iterator doesn't work with all functions from the Algorithm library. For example, we can't pass it to std::reverse as it requires a Bidirectional Iterator. The hard part has been done so far, so extending the iterator is now just a matter of adding more operators to the class and choose the best tags to describe it.

Final notes

Iterators behave like pointers: can't I just use them?

Our custom container is a wrapper around an old-school array, which can be navigated with pointer arithmetic. Indeed we could get rid of the whole Iterator class and just return a pointer to the first and last array element from the Integers::begin() and Integers::end() methods respectively. Range-based for loops and functions from the Algorithm library would still work fine. However, real-world containers are often based on more complex data structures than plain arrays — think of linked lists or maps to name a few, where pointers and their operations aren't just enough. Iterators abstract away all that complexity behind a handy object that behaves like a pointer and let you access a complex data structure with familiar operations.

My container already wraps a Standard Library container

In our example, the Integers class could have been a wrapper around a std::array. In this case you don't need to implement any custom iterator at all and just return the iterator that belongs to the Standard Library container in use. For example:

class Integers
{
    using IntegersType = std::array<int, 32>;
    // ... 

    IntegersType::iterator begin() { return m_data.begin(); }
    IntegersType::iterator end()   { return m_data.end() }    

private:
    IntegersType m_data;
};

The code above works because all containers in the C++ Standard Library do what we have done with our Integer container: they all implement their iterators as class members. The IntegersType alias is used to simplify type names and is not mandatory. Also, returning auto as iterator type seems just fine in C++17.

Make our iterator immutable

By default, Iterator can alter the element it points to. If you want to make it immutable, the common trick is to add another iterator type to the custom container class — let's call it ConstantIterator. This new iterator type is almost identical to the original one, except for its dereference operator which now returns a constant reference:

const reference operator*() const { return *m_ptr; }
// ^---- notice the 'const' here

The same thing applies to the -> operator. Finally, the custom container must be able to return such new iterator type. This is done by adding two additional public methods cbegin() and cend() (where the leading c stands for constant) that return instances of the ConstantIterator class:

ConstantIterator cbegin() const { return ConstantIterator(&m_data[0]); }
ConstantIterator cend()   const { return ConstantIterator(&m_data[200]); }

Many Standard Library containers provide both the begin()/end() and the cbegin()/cend() pairs. The same pattern is applied for each iterator type. For example, std::array also has rbegin()/rend(), where r stands for reverse iterator (yes, you can loop a Standard Library arrays in reverse, too).

How are iterators defined in C++20?

C++20 introduces concepts, a smart way to put constraints on the types a template function or class can take in. While iterator categories and properties remain the same, what changes is how you enforce them: with tags until C++17, with concepts since C++20. For example, instead of the std::forward_iterator_tag tag you would mark your iterator with the std::forward_iterator concept. The same thing applies to all iterator properties. For example, a Forward Iterator must be std::incrementable. This new mechanism helps in getting better iterator definitions and makes errors from the compiler much more readable. I will upgrade the article as soon as the concept implementation will become more widespread.

Full source code

#include <iterator>
#include <cstddef>

class Integers
{
public:
    struct Iterator 
    {
        using iterator_category = std::forward_iterator_tag;
        using difference_type   = std::ptrdiff_t;
        using value_type        = int;
        using pointer           = int*;
        using reference         = int&;

        Iterator(pointer ptr) : m_ptr(ptr) {}

        reference operator*() const { return *m_ptr; }
        pointer operator->() { return m_ptr; }
        Iterator& operator++() { m_ptr++; return *this; }  
        Iterator operator++(int) { Iterator tmp = *this; ++(*this); return tmp; }
        friend bool operator== (const Iterator& a, const Iterator& b) { return a.m_ptr == b.m_ptr; };
        friend bool operator!= (const Iterator& a, const Iterator& b) { return a.m_ptr != b.m_ptr; };  

    private:
        pointer m_ptr;
    };

    Iterator begin() { return Iterator(&m_data[0]); }
    Iterator end()   { return Iterator(&m_data[200]); }

private:
    int m_data[200];
};

Sources

Learncpp.com — 6.17 — Introduction to iterators
Learncpp.com — 16.3 — STL iterators overview
Chris Riesbeck — Defining C++ Iterators
Cplusplus.com — Iterator
Cppreference — Range-based for loop
Cppreference — Iterator library
Cppreference — Iterator tags
StackOverflow — What is the reason behind cbegin/cend?

comments
Duane on December 21, 2020 at 19:23
https://www.boost.org/doc/libs/1_75_0/libs/iterator/doc/iterator_facade.html
Or
https://www.boost.org/doc/libs/1_75_0/libs/iterator/doc/html/iterator/generic/adaptor.html
darkelfe14728 on December 27, 2020 at 20:33
Great article.
May i suggest an exemple with disjoint elements (like std::vector>) ?
Gangaraju on January 03, 2021 at 19:30
Thanks for the detailed explanation of implementing an iterator. Minor correction required: In "Choose the nature of our iterator" output iterator is introduced as one which can be used to write but cannot be read. Whereas immediately following that output iterator is mentioned as one that can be used to read and write.
Indi on January 03, 2021 at 23:49
The first part of the section “Make our iterator immutable” is incorrect.

You don’t turn `Iterator` into `ConstIterator` by simply sticking `const` in front of `reference` in the return type of `operator*` (“notice the const here”). If `reference` is `int&`, then `const reference` is… `int&`. Because that’s how references work. If `pointer` is `int*`, then `const pointer` is `int* const`… which is not the same as `int const*`. (East const 4 lyfe.)

The correct way to turn `Iterator` into `ConstIterator` is to change the typedefs: `pointer` should be `int const*` and `reference` should be `int const&`. And don’t add `const` to the return types of `operator*` or `operator->`. (You’ll probably also want to add implicit conversions from `Iterator` to `ConstIterator`, but not vice versa.)