Data can be stored in many different vessels. In the computing world, stored data is fetched to the CPU, where the CPU will do some work using that data, and maybe write some data back. Generally, the closer the data is to the CPU, the faster the data will be fetched. Let’s talk about a few places where that data might be coming from.
The difference between a cache and memory is that, when the requested data is not found, a cache will look for that data in another location, whereas memory will not; memory immediately reports that the data was not found. This is because memory serves as the final source in a chain of storage media.
Register
Registers are the smallest and fastest form of storage. They are part of the CPU and provide the intermediate storage the CPU needs while it works. It’s important to remember that registers do not store data long term: the contents of a register are lost when power to the system is cut.
Registers live within a core. This means that each core in a CPU has its own set of registers.
Registers operate at CPU clock speed. The amount of data a single register can store ranges from 8 bits to 64 bits, depending on the CPU architecture. Similarly, the number of registers within a CPU also depends on the architecture.
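Those register widths map onto the fixed-width integer sizes most languages expose. As a rough illustration (Python’s own integers are arbitrary-precision and don’t live in a single register, so this only mirrors the widths, not the hardware), Python’s `struct` module packs values into 8-, 16-, 32-, and 64-bit representations:

```python
import struct

# Common register widths and the largest unsigned value each can hold.
# struct format codes: B = 8-bit, H = 16-bit, I = 32-bit, Q = 64-bit unsigned.
for fmt, bits in (("B", 8), ("H", 16), ("I", 32), ("Q", 64)):
    max_val = 2 ** bits - 1
    packed = struct.pack("<" + fmt, max_val)  # little-endian byte layout
    print(f"{bits:>2}-bit value: up to {max_val}, occupies {len(packed)} byte(s)")
```

A value larger than the register width has to be split across multiple registers or handled in memory, which is one reason 64-bit architectures handle large numbers faster than 32-bit ones.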
Some of the ways a CPU uses a register are:
- Memory addresses
- Data for intermediate steps
- Arithmetic operations
- Program instructions
- Stack pointer
CPU Caches
L1 cache lives in the individual cores. It can store 16KB to 128KB of data, and this data can be fetched at nearly clock speed.
L2 cache does not live inside the core; however, it is the nearest memory to the core. L2 cache is also specific to a core, though it may have a portion that is shared among a small cluster of cores. L2 cache can hold between 128KB and 2MB of data.
L3 cache is the slowest and largest of the three caches. It can store from 2MB to 128MB of data. Unlike the previous two, L3 cache is shared across all cores.
L2 and L3 caches are part of the CPU, though not part of the individual cores.
Being caches, the CPU stores frequently accessed data and instructions in them. The CPU first tries the L1 cache. If the data is not there, it looks in the L2 cache, and so on, until it finds it.
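The lookup order can be sketched as a chain of lookups, where a miss at one level falls through to the next. This is a toy model with invented names (`l1`, `fetch`), not how real hardware is addressed:

```python
# Toy model: each level is a dict; a miss falls through to the next level.
l1, l2, l3, ram = {}, {}, {}, {"x": 42}  # "x" starts out only in main memory

def fetch(key):
    for name, level in (("L1", l1), ("L2", l2), ("L3", l3), ("RAM", ram)):
        if key in level:
            l1[key] = level[key]  # promote into L1 so the next access is fast
            return level[key], name
    raise KeyError(key)

print(fetch("x"))  # miss in L1/L2/L3, found in RAM, then cached in L1
print(fetch("x"))  # now found in L1
```

Real caches also have fixed capacities and eviction policies, which this sketch skips, but the fall-through-and-promote pattern is the core idea.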
RAM
Random Access Memory (RAM, sometimes called main memory) is the next fastest form of storage for a CPU. It lives outside the CPU and connects to it via the motherboard. RAM is a volatile type of memory, used to keep data the CPU is currently using, or has recently used, close at hand.
RAM is about 80x slower than L1 cache. However, its capacity is limited only by the number of RAM modules a motherboard can support. Consumer-grade modules hold from 4GB to 64GB of memory each; however, modules can grow up to 256GB. These behemoths are used by data centers. Larger chips are currently in development by the likes of Samsung.
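That latency gap is why access patterns matter: data touched in order tends to be served from cache, while scattered accesses keep falling through to RAM. The sketch below times summing the same list sequentially versus in shuffled order. In a compiled language the difference is dramatic; Python’s interpreter overhead mutes it, so treat the timings as illustrative only:

```python
import random
import time

N = 1_000_000
data = list(range(N))
seq = list(range(N))      # in-order indices: cache-friendly access pattern
rnd = seq[:]
random.shuffle(rnd)       # shuffled indices: cache-hostile access pattern

def total(indices):
    s = 0
    for i in indices:
        s += data[i]
    return s

t0 = time.perf_counter(); s_seq = total(seq); t_seq = time.perf_counter() - t0
t0 = time.perf_counter(); s_rnd = total(rnd); t_rnd = time.perf_counter() - t0
assert s_seq == s_rnd     # same work either way, only the order differs
print(f"sequential: {t_seq:.3f}s, shuffled: {t_rnd:.3f}s")
```

Both loops do identical arithmetic; any gap between the two timings comes from where the data was sitting when it was asked for.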
Scaling a RAM chip is difficult for several reasons. There are two approaches that can be taken:
- Increase the physical size of the chip, and/or
- Decrease the size of individual memory cells so more fit inside the chip
Both approaches above have their limitations. One drawback of the first approach is that, if we increase the size of the chip, signals take longer to travel across it, which hurts latency. Another important factor is that a larger chip requires more wafer (semiconductor material, usually silicon) to build, which is very expensive.
The second approach has its own issues. First, we are nearing the smallest size a memory cell can shrink to, after which we will not be able to reduce it any further. Second, as the density of memory cells increases, so does the likelihood of defects and signal noise.
Both approaches also increase the demand for power, which in turn increases the heat emitted. And more memory means more cells to search, leading to diminishing performance.
RAM acts as the main memory for a CPU. It holds the instructions and data currently in use by the CPU. When RAM is full, the CPU can use some of the storage on an SSD or HDD as though it were part of RAM (this is called virtual memory). This hinders the performance of the applications involved, but it allows them to keep working. Another method is to compress the data in RAM, which most OSes do. If memory is not freed and demand continues to grow, applications may start to freeze until the OS decides to kill a few of them, based on priority, with an out-of-memory (OOM) error.
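The spill-over to disk can be modeled roughly: when a fixed-size “RAM” fills up, the oldest page is evicted to a slower “swap” store, and a later access brings it back. All names here (`store`, `load`, the three-page limit) are invented for illustration; real OS paging policies are far more sophisticated:

```python
from collections import OrderedDict

RAM_CAPACITY = 3            # toy limit: only three pages fit in "RAM"
ram = OrderedDict()         # insertion order stands in for page age
swap = {}                   # slower backing store on "disk"

def store(page, value):
    if len(ram) >= RAM_CAPACITY:
        old_page, old_value = ram.popitem(last=False)  # evict the oldest page
        swap[old_page] = old_value                     # ...out to swap
    ram[page] = value

def load(page):
    if page in ram:
        return ram[page]            # fast path: already in memory
    value = swap.pop(page)          # slow path: swap the page back in
    store(page, value)              # may evict something else to make room
    return value

for i in range(5):
    store(i, i * 100)
print(sorted(ram), sorted(swap))    # pages 0 and 1 were pushed out to swap
print(load(0))                      # "page fault": 0 comes back from swap
```

Every `load` that hits the slow path pays the disk’s latency instead of RAM’s, which is exactly why heavy swapping makes applications feel sluggish.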
SSD
Solid state drives (SSDs) are considered external memory, used for long-term storage. The CPU interacts with these drives in order to load data and/or instructions into main memory. Compared to main memory, SSDs are slower but are capable of storing much more data.
HDD
Hard disk drives are similar to SSDs in their relationship with the CPU; however, they differ in speed and cost of storage. HDDs are slower and, as a result, cost less than SSDs for the same amount of storage.
HDDs have a physical disk that spins and a physical arm that reads and writes data on the disk. As a result, these drives are limited by how fast the disk can spin. The physical components can also be their downfall, as they can break or simply become less accurate with wear over time.
The main usage for HDDs is cheaper storage for less frequently accessed data. They can serve as backups or for archiving.
Tape
The slowest, and by far the cheapest, method of storing data is physical tape. Magnetic tape is wound around a reel and stored in a cartridge. A tape reader is required to unwind the tape and read the data. Because of this method of access, data is fetched sequentially as opposed to randomly. Random access is supported, though it is very slow, as the reader has to rewind and fast-forward constantly to find the target address. A single tape can store up to 18TB of data uncompressed, or 45TB compressed.
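The cost of sequential versus random access can be modeled by tracking the reader’s position along the tape, where each read costs the distance the reel has to wind. This is a toy model (the `TapeReader` class is invented for illustration; real drives also buffer, compress, and error-correct):

```python
class TapeReader:
    """Toy tape drive: the cost of each read is the distance the reel winds."""
    def __init__(self, blocks):
        self.blocks = blocks
        self.position = 0
        self.distance_wound = 0

    def read(self, address):
        self.distance_wound += abs(address - self.position)  # wind to target
        self.position = address
        return self.blocks[address]

tape = TapeReader(list(range(1000)))
for addr in range(1000):           # sequential scan: each step winds 1 block
    tape.read(addr)
print(tape.distance_wound)         # 999 blocks of winding in total

tape = TapeReader(list(range(1000)))
for addr in (900, 10, 850, 5):     # random access: constant rewinding
    tape.read(addr)
print(tape.distance_wound)         # 900 + 890 + 840 + 845 = 3475 blocks
```

Four random reads cost more winding than a full sequential scan of a thousand blocks, which is why tape workloads are organized to stream data in order.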

The magnetic layer on the tape is made up of tiny particles called magnetic domains. These particles are isolated from one another in order to prevent interference. Each magnetic domain is oriented to have either a north pole or a south pole. A magnetic write head iterates through the magnetic domains, applying a magnetic field to flip the poles so they align with the binary data being written.
Reading the data requires a magnetic read head. The reader senses the orientation of the magnetic domains, which it picks up as an electric current. The direction of the current is decoded into a 0 or a 1. Other steps are also involved, such as error checking, but we will skip those on purpose for now.
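The write/read cycle above boils down to mapping bits to pole orientations and back. A minimal sketch of that round trip (a deliberate simplification; real tape uses denser encoding schemes plus the error checking we skipped):

```python
def write_bits(bits):
    """Write head: orient each domain north for a 1, south for a 0."""
    return ["N" if b == "1" else "S" for b in bits]

def read_domains(domains):
    """Read head: decode each pole orientation back into a bit."""
    return "".join("1" if d == "N" else "0" for d in domains)

data = "1011001"
tape_section = write_bits(data)
print(tape_section)                        # ['N', 'S', 'N', 'N', 'S', 'S', 'N']
assert read_domains(tape_section) == data  # the round trip is lossless
```

As long as every domain holds its orientation, the read head recovers exactly the bits that were written; degradation of those orientations over decades is what eventually limits a tape’s shelf life.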
The usage for tapes is long-term cold storage. Given the rise in data volume over the years, cloud providers have turned to tape for storage. A benefit of this medium being offline is that it is not susceptible to cyber attacks, which makes it a great medium for disaster recovery. Another added benefit is that tapes consume no energy while offline. Lastly, tape is a great medium for storing media that isn’t used frequently, and it is very popular in the entertainment industry.
Definitions
CPU clock speed is a measure of how many basic units of operation (fetch, decode, or execute an instruction) a CPU can perform in a second. This frequency is derived from a crystal oscillator. Clock speed is not equivalent to instructions executed per second: different CPUs can execute different numbers of instructions per clock cycle. Clock speed is specific to a single core; a multicore system can execute instructions in parallel, with each core running at the specified clock speed.
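The distinction between clock speed and instruction throughput comes down to some back-of-the-envelope arithmetic. The numbers below are hypothetical, not measurements of any specific CPU:

```python
# Two hypothetical CPUs at the same clock but different instructions
# per cycle (IPC): B is superscalar, retiring several instructions per tick.
clock_hz = 3_000_000_000    # 3 GHz: 3 billion cycles per second, per core
cores = 8

cpu_a_ipc = 1               # 1 instruction per clock cycle
cpu_b_ipc = 4               # 4 instructions per clock cycle

for name, ipc in (("A", cpu_a_ipc), ("B", cpu_b_ipc)):
    per_core = clock_hz * ipc       # instructions per second on one core
    total = per_core * cores        # every core runs at the same clock
    print(f"CPU {name}: {per_core:.1e} instr/s per core, {total:.1e} total")
```

Same clock, same core count, yet CPU B does four times the work per second, which is why comparing processors by GHz alone is misleading.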