Pointers are usually one of the first topics any programmer learns in a practical Computer Science course. The problem is that with languages such as Java, Python, Ruby, etc. most developers tend to forget about them. Though these languages provide much safety and aid in both the speed of development and maintainability of large software projects, the amount of machine detail hidden from the developers can sometimes prove harmful (i.e. optimizing when your webserver can only handle 10,000 requests/sec, but you need 100,000 requests/sec). This is an obvious mistake. As a result, I am going to write a refresher on how pointers work and maybe even teach something new along the way.
What is a Pointer?
The most logical place to begin is to describe what a pointer is. Fundamentally, a pointer is nothing more than a memory address. That is, it is an abstraction describing a location on your RAM chip (you know, the real hardware) to the start of some data. I do admit that this is not 100% true (see the virtual memory section below), but it is a reasonably accurate way to understand pointers (even with modern CPUs).
Graphically, it looks something like this:
This diagram represents a full, physical RAM chip. Similarly, you will notice that each unit is uniquely identifiable by an integer. I have described these in base 10 (i.e. how we typically count) for sake of example, but most diagrams display these values in hexadecimal (i.e. the subtext I added) since memory addresses are typically large and it is common to think of numbers in base 16 due to the clear relationship to base 2 (i.e. binary).
Another important point from this diagram is that RAM is typically byte-addressable. What this means is that any single byte in RAM has its own unique address. This is actually a significant point; especially if you are performing bitwise operations. In particular, if you only need 5 bits to represent your data you must allocate a whole byte for that data even though 3 bits will be unused. If you had a word-addressable system (assume word is 4-bytes), then you would need to allocate 4 whole bytes for that same data. It seems wasteful, but this signifies the importance of addressability.
Let’s recap what we’ve (re-)learned so far. A pointer is:
- A memory address
- An abstraction (or level of indirection) to some data on hardware
- A unique integer identifying an addressable type (i.e. byte) on the chip
- Typically byte-addressable
A short note about dereferencing. Dereferencing a pointer is simply taking the pointer (i.e. the memory address or reference to some data) and returning the actual value stored in that address. More concretely, you can think of it as the CPU asking the RAM chip what data actually exists in the specified location.
An example using “int” (integer type). Now that we have some notion of what a pointer is, let’s go through an example on how integers are actually represented. Our first assumption will be that we are considering a 32-bit integer (the common representation of int in most systems). As a result, our data type has a size of 4 bytes. Additionally, we will assume we have a 32-bit address space even though 64-bit is no different except addresses are longer (i.e. they have 64-bits). We will assume the value stored in our integer is 322424845 (hex: 0x1337d00d). Finally, if you know what endianess is, we’re ignoring it for now. If you don’t, you should Google it if you’re curious or else wait for me to write about it in a future article.
Suppose the pointer (i.e. the address) to our int is 0xdeadbeef (decimal 3735928559. read: this is partly why we use hexadecimal). That means the start of our integer (4 bytes) is at this location in memory. When we dereference this location (denoted in C or C++ as *((int*)0xdeadbeef)), the value we receive is then 0x1337d00d.
The reason this works is because we have defined (byte convention) that int is 4 contiguous bytes. As a result, we can specify the starting address for an integer and our language knows to copy 4 bytes starting at the specified location while the CPU understands how to interpret these bytes as a 32-bit integer. Similarly, we can access the second byte directly (i.e. address 0xdeadbef0) if we wanted to, but if we interpreted this as an int we would get an extra byte at the end which we don’t necessarily control (so the value is likely garbage to us). However, other types (i.e. short == 2 bytes) may be able to interpret something meaningful for you depending on your application.
What this shows is that, from a hardware perspective, bytes are simply on the RAM chip and can always be accessed. However, to be useful, we define conventions on where certain types start and how they should be interpreted.
Why would I need a pointer?
Pointers are incredibly powerful provided you understand what they represent and how to use them. In general, pointers are very useful when sharing state across your program. If you can share a single copy of an object everywhere, you save both CPU time in copying those objects and reduce the memory footprint of your program since you won’t have many copies of the same object. Likewise, you won’t need to worry about keeping those copies in sync with each other.
That really just scratches the surface on why pointers are useful. In any case, I will not try to enumerate the many ways and reasons to use a pointer here. Instead, if you have a question or comment, please leave it in the comments below and I’ll answer it there.
Stack vs. Heap (and using pointers)
In the context of pointers, you will often hear the terms “stack” and “heap.” These terms refer to how memory is managed in a CPU and do not actually refer to the underlying data structures used to implement them. They can be thought of as “automatic” memory (i.e. the CPU allocates and deallocates this space for us) for the stack and “managed” memory (i.e. the programmer allocates and deallocates space) for the heap. The important thing to realize here is that you can have pointers to both. Since the stack and heap are both in RAM, there is no fundamental difference between the two except how the space is managed.
Now, I won’t go through the entire memory model of a CPU (perhaps we could save that for a future post), but I will show a quick diagram of what this looks like. It is common for people to say the stack “grows down” and the heap “grows up."
This diagrams shows how the stack addresses start high and go low (i.e. 0xFFFFFFFF to 0xFFFFFFFD in this example) and the opposite is true for the heap. If you recall, we have a finite number of RAM chips on disk (we can assume these combine together to make a single logical RAM chip). As a result, everything that is stored "in memory” is on that big RAM chip. Consequently, the stack and the heap are resident on the same RAM chip and must live together. So why do we start them at opposite ends?
As we said before, the distinction between our memory types is how they are allocated. The heap is allocated in an ad-hoc manner and its lifetime is only determined by the programmer. The stack, on the other hand, has a very well-defined order of allocation and deallocation directly correlated with its lifetime. Without getting into the details, suffice to say that because programmers directly control the heap, you are likely to end up with holes (i.e. fragmentation) in the lower memory while this will not be true in the higher memory. By starting the allocation of stack and heap on opposite ends, you can allow heap fragmentation to occur while not interfering with your stack.
So what happens when they meet each other in the middle? Well, then you get a stack overflow or a heap underrun. This is very bad and your program will likely crash or (worse) behave unexpectedly without terminating. Since your stack and heap are largely unaware of each other, what happens is that one happily writes over the memory of the other and everything becomes corrupted. This is not usually a problem for most programmers, however (most apps don’t request 8GB+ memory to run). If it is, you should usually allocate a large chunk of memory “up-front” and manage that accordingly. This is a more advanced technique, however, and you probably wouldn’t be needing this refresher if you fell into that category!
Memory allocators are used to allocate memory on the heap. You typically request a “chunk” of bytes from the allocator and it returns you a pointer to the starting address of that block of memory. Examples of allocators are “malloc” in C or “new” in C++. When you are done using this memory, you should clean it up with “free” (if using malloc) or “delete” (if using new).
How memory allocators work is a topic which I can discuss at a later time. Again, another fascinating topic which is simply too deep for this– already– long article.
Virtual Memory (high-level idea)
As I said in the previous section, I won’t go through the memory model of a CPU or virtual memory in great detail, but I will acknowledge its existence in all modern commodity CPUs. In short, virtual memory is how we can keep programs isolated from each other (i.e. you cannot just read someone else’s program’s memory) and, similarly, allow a single process access to the entire RAM chip (along with paging, that is).
Inside a program, each memory address is actually “virtual.” That means if I have processes p1 and p2, a pointer “pointing” (or “referencing”) to 0xdeadbeef will point to different physical locations (i.e. on the RAM chip) for each process. The most intuitive way to think about this is a simple map. For instance,
|Process||Virt Addr -> Phys. Addr|
|p1||0xdeadbeef -> 0xcafebabe|
|p2||0xdeadbeef -> 1337bea7|
How this memory translation happens is actually quite an interesting topic, but it is also much more involved than we have time for here. As a result, the details of this will be left for a future article.
Phew, so we made it through. First, let me reiterate that the goal of this article is to understand the high-level of pointers rather than every little implementation detail. As a result, we went through a lot of material. What’s more, we touched on many topics which have not been fully developed (i.e. Virtual Memory, Allocators, Memory Layout, etc.). These are all incredibly important to fully understanding pointers, but are not necessarily required to become proficient at using them. As I said, I plan on writing some information about these topics sometime in the near future.
A final note is that I find computer architecture and understanding how the software interacts with the hardware to be incredibly useful for real software engineering. While I love the theory in computer science, you are still bound to real physical machines in the end when you’re implementing your algorithms. Consequently, unlike equivalent pseudocode, not all implementations of the same algorithm are created equal. Just keep that in mind as you continue producing new software and journey to become a better developer.comments powered by Disqus