Stack and Heap Allocation in Cpp
In cpp, memory you need to worry about sizes of data types and variables for arrays, it is way easier then people think, but does require some basic knowledge of stack and heap allocation and the difference. In most (probably all) programming languages, there exists a stack and a heap.
Comparison of the stack and the heap
The Stack
The stack is a predefined block of memory where variables store there data.
Allocating memory on the stack is super fast, since a block of memory is already reserved for this purpose.
Stack allocated variables look like regular variables, e.g.
// primitive data types
int a = 3;
float b = 3.0;
double c = 3.0;
char c = 'a';
// stack allocated arrays
int[5] arr = { 1, 2, 3, 4, 5 };
// classes and objects on the stack
obj o;
// with constructor
obj o(a);
Copying stack allocated variables you pass all the contents of a variable to the other variable.
Copying looks like
int b = 10;
int a = b;
You can also pass the location of a variable to a variable that points to it,
to get the memory address you use &
, and to show a variable points to an
existing memory address you use *
e.g.
int b = 10;
int* a = &b;
When the variable falls out of scope, the variable is removed (popped) from the stack.
Stack allocated variables should be the norm, and use them whenever you can.
The Heap
The heap is a place where you can allocate memory at runtime when the sizes of data is not known.
Allocating memory on the heap is a more expensive process, since you need to allocate a block of memory for the variable to be stored.
Heap allocated variables are pointers that point to a location in memory, and
is distinguished by the new
keyword.
e.g.
// primitives
int* a = new 10;
float b = new 3.0;
double c = new 2.0;
// heap allocated arrays
int* arr = new {1, 2, 3, 4, 5};
int* arr = new int[5];
// classes and objects on the heap
obj* o = new obj();
// with constructor values
obj* o = new obj(a);
Copying heap allocated variable copies the location of the memory on the heap, but does not copy the values (similar to copying objects in Java).
e.g.
int* b = new 10;
int* a = b;
Note that when a heap allocated variable falls out of scope, it does not get removed from the heap. Instead we need to manually delete the memory from the heap.
// for normal pointers
// first make pointer null
b = null_ptr;
// second free memory
delete b;
// for arrays
delete [] arr;
Now when the variable falls out of scope, it no longer exists within the program.
Note that heap allocation is useful for when you do not know the size of something at runtime, say we need to create a dynamically sized array. Here dynamic means that the size is decided at runtime.
when you call the new keyword, the program needs to know the how much memory you
need to allocate on the heap. Therefore the new keyword is different for arrays.
When you call new
what it is doing is allocating memory that is the size of
your data type. The reason that it is good for arrays is that when you don't
know the size of your array at compile time it allows you to create a block in
the heap for that array.
When you create an array it allocates the size of your array * the size of your
data type, hence why array you need to include how big an array is when you call
new
for an array.
Dangling Pointers and Memory Leaks
Dangling pointers
Before we saw that we can pass a reference to another variable to a special variable that points to a location in memory. A dangling pointer occurs when a we create a variable that points to the location of another variable but that variable falls out of scope. It looks like this
int main() {
int* ptr;
dangling_ptr(ptr);
}
void dangling_ptr(int* ptr) {
int num = 0;
ptr = #
// num falls out of scope, not the pointer points to something that doesn't
// exist!
}
So overall lesson, be very careful when creating pointers.
Memory Leaks
In cpp memory leaks are a big thing that often cause things like the blue screen or death or crippling crashes.
A memory leak occurs when you forget to free your heap allocated memory and the memory still exists even after the variable falls out of scope.
This is easier seen with examples of common occurences in competitive programming.
// loop with memory leak
int main() {
for (;;) {
int* arr;
int size;
std::cin >> size;
arr = new int[size];
for (int i = 0; i < size; i++) {
arr[i] = i;
}
// memory leak occurs here since arr is about to fall out of scope,
// but the memory was not freed, therefore the allocated values still exist!
}
}
// loop memory safe
int main() {
for (;;) {
int* arr;
int size;
std::cin >> size;
arr = new int[size];
for (int i = 0; i < size; i++) {
arr[i] = i;
}
// we free memory so no more memory leaks
delete [] arr;
}
}
as a rule of thumb, there are ways to program without calling new
or delete
,
and unless you need fine grain control on your pointers, using them should
generally be avoided.
RAII Pattern
RAII stands for Resource acquisition is initialisation, and is a common pattern cpp programmers use to avoid memory leaks. This involves wrapping our pointer around an object and deleting it while it is no longer used. In java when we define classes we have constructors, but in cpp we also have destructors, that define what happens when the object falls out of scope. Also in cpp people might not have seen that we can overide operators such as +, -, /, * and also [] which makes it easier to define data structures.
In RAII the new keyword is called during in the constructor, and the delete keyword is called in the destructor.
What RAII let's us do, is automagically manage memory. It is a pattern that enables memory safe programming.
#include <iostream>
// convention for classes note, I like to prefix the members of classes with
// m_ so for example an array member is m_arr
class dynamic_int_array {
public:
// constructor
dynamic_int_array(int size): m_size(size), m_arr(new int[size]) {}
// make class be referenced like a normal array
int& operator[](int index) {
return m_arr[index];
}
// destructor
dynamic_int_array() {
delete [] m_arr;
}
private:
int* m_arr;
int m_size;
};
int main() {
for (;;) {
int size;
std::cin >> size;
dynamic_int_array arr(size);
for (int i = 0; i < size; i++) {
arr[i] = i;
}
for (int i = 0; i < size; i++) {
std::cout << arr[i] << std::endl;
}
}
return 0;
}
Here we see that when the array falls out of scope, it the memory is managed automagically by the class.
The STL
The standard template library is a collection of memory safe objects that do what we did in the previous chapter for us. The STL stands for standard template library and it is called that because template code (generic programming) is a little funky in cpp, and it exists in a special library that is compiled with your program, rather then being compiled to a binary first and then linked like most other cpp libraries are.
std::string
std::string, is imported using #include <string>
.
In C a string is defined as a char*
, or as we learned a dynamic char array.
std::string is just a wrapper around this allowing us to create memory safe
cstring (synonym for char*
) like things.
std::vector
std::vector is our dynamically allocated array, imported using
#include <vector>
.
It works exactly like a mix between the arraylist class in java, and the class we defined previously.
For example:
#include <iostream>
#include <vector>
int main() {
for (;;) {
int size;
std::cin >> size;
vector<int> arr(size);
for (int i = 0; i < size; i++) {
arr[i] = i;
}
for (int i = 0; i < size; i++) {
std::cout << arr[i] << std::endl;
}
}
return 0;
}
Smart Pointers
Smart pointers are classes for objects that follow raii, in order to automagically manage memory like we did previously.
They are imported using #include <memory>
.
They consist of two main objects, the unique_ptr
, and the shared_ptr
.
The unique_ptr
is a object that handles memory management using the raii
pattern and is pretty much just that. Really doesn't have much special. If we
remember the heap allocation of objects in the first chapter, a unique ptr does
the exact same thing without having to worry about new
or delete
.
They look like this
// constructor
unique_ptr<obj> = make_unique<obj>();
// constructor with values
unique_ptr<obj> = make_unique<obj>(a);
The unique ptr is called a unique pointer because it is for use when you only have one reference to the heap (one thing pointing to the heap).
The shared_ptr
is almost exactly the same as a unique_ptr
, but also tracks
the amount of copies and references to other pointers there are.
It is initialised the exact same way but using shared_ptr
instead.
Please look at Microsoft Pointer Reference for more info.