Published: 12.05.2021 | Edited: 12.05.2021 | Tags: 100daystooffload

Notes on circular queue data structure

While researching the method to store the data into the electrically erasable programmable read-only memory (EEPROM) I have found out the problem is not as straightforward as I have previously thought. Consider the following requirements:

  1. Store the last N measured variables in order
  2. Preserve the data after the prower is removed

The requirements seems clear on the surface, but they have some unintended consequences to them. Consider the first requirement - storing a given number of most recent values. Either the stack or the queue would serve, if their capacity was sufficient, meaning the data would be taken out sooner than their capacity would be reached. With the sufficiently large queue, we could took all the data out and discard all but last N values. With the stack it would be even easier - just pop out the number of required entries.

What about limited storage?

The problem happens when the storage capacity is expected to fill up before the data are taken out. Such limitation forces us to start discarding data once that happens. When the either queue or the stack is full, no new data can be pushed into the structure, failing our first requirement.

What is worse, once the number of stored entries matches the capacity of the storage, meaning that the queue or the stack is full, we would have no way of telling how recent the data are, or in other words, how many entries were discarded.

Enter circular queue

To overcome the problem of discarding the most recent data, a different kind of data structure has to be used, a one called circular queue (sometimes called also circular FIFO or a rotating buffer). A circular queue has the property of discarding not the most recent data, but the least recent data instead, providing us with the exact solution meeting our first requirement. Note that there appears to be no such data structure called a circular stack, or at least not a standardized one.

To meet the second requirement, we would have to use a non-volatile memory. A non-volatile memory stores its contents after the power is removed. The most readily accessible non-volatile memory is EEPROM, being available as a stand-alone chips, but almost all microcontrollers, even low-end ones also have some EEPROM integrated among it's peripherals. Let's ignore other types of non-volatile storage, notably FRAM for the sake of this article, as other types of memory have different price and marker availability and what is more important, they come with different set of limitations.

Limitations of EEPROM

The main limitation of an EEPROM is its rather low number of rewrite cycles, ranging from 100k to 1M rewrites per-bit usually. Yes, while EEPROMs usually allow some kind of page access to streamline reading and writing blocks of data, they allow to manipulate the data down to a single bit, which is considered an advantage, but due to practical reasons we ignore this property here either.

So how to overcome the limited number of write cycles on EEPROM? The technique is called wear leveling. Wear leveling means adjusting the write frequency for every part of the memory to be virtually the same. In other words, writing to he same place again only after all the other places were re-written in the meantime.

EEPROM and the circular buffer

Wear leveling provides some nice symbiosis for EEPROM and the circular buffer data structure, as circular buffer does not need any data shifting and it basically has the wear leveling built in. With the new data, just move to the next entry and overwrite it, looping back to the first position from the last one (thus circular). Easy, right?

Well not so simple, we still need to know which entry is the last written. This is also called a head. The head position can be stored in RAM of course. But with the power outage, the head would be lost. Even though our data are stored in non-volatile memory, without the way of telling which entry was the last, we would fail the ordering part of our first requirement. Note that the circular buffer also stores the information about it's last entry, called a tail, but we omit the tail for the simplicity as well.

So how we go around this limitation? There also is quite a lot of useful information about circular buffers written in the link below:

https://betterembsw.blogspot.com/2015/07/avoiding-eeprom-wearout.html

The natural way to preserve the head position is to write it in EEPROM instead of RAM, but where exactly? No matter where do we choose to store it in EEPROM, it will be worn out at much faster rate that the other entries of the circular buffer, as this one needs updating every time an entry is pushed. Searching around the Internet, the solutions to this do not seem to agree. The most notable solution is to use a second circular buffer for storing the head, meaning our storage capacity is now reduced, in an extreme case even halved, but the wear leveling is achieved and both our requirements optimally met. Could it be done better?

Discarding the second circular buffer

One particular comment on the AVRfreaks forum shows hints about the possibility of discarding the troublesome second circular buffer while still providing wear leveling for the entire storage.

The idea is to rewrite not just the oldest entry with the most recent value, but actually also provide a divider entry, with a known empty value below it, effectively placing the divider between the head and the tail. This way, to find the head, we have just to find that divider in the circular buffer.

The details are more complex than this because of the initialization and such, so go read the discussion if interested, but I wanted to point this neat little idea out. Sometimes the most brilliant solutions are not the most apparent. I must admit I really like this one, although I am not sure if there are no gotchas in this approach. Hopefully I will soon find out.

This is a 63th post of #100daystooffload.