Memory management in Linux

Junaid Mujawar
Jun 26, 2022

Linux memory management is a fairly complicated area. For the last few days I have been debugging Linux kernel soft lockup issues, which gave me a chance to revisit memory management in Linux, and I found it very interesting. Memory management is a vast topic, and a single blog post cannot do it justice.

Overview of Linux Memory Management

Linux distinguishes between physical addresses and virtual addresses. Both the kernel and user space programs always use virtual addresses for memory access. On 32-bit x86 systems with the default configuration, each process has a 4GB address space with a 1GB (kernel) / 3GB (user space) split, as shown in the above diagram: the kernel address space is 3GB to 4GB, and the user space is 0 to 3GB.

Linux manages memory in pages, which are typically 4 KB in size.

Modern processors have hardware (the memory management unit, or MMU) that helps translate virtual addresses to physical addresses.
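To make the page arithmetic concrete, here is a minimal Python sketch that asks the OS for its page size and splits a virtual address into a page number and an offset within that page. The helper names `page_size` and `split_address` are hypothetical, and the real translation is done by the hardware and the kernel's page tables, not by user code.

```python
import os

def page_size():
    """Return the system page size in bytes (commonly 4096 on Linux)."""
    return os.sysconf("SC_PAGE_SIZE")

def split_address(vaddr, size=4096):
    """Split a virtual address into (page number, offset within the page)."""
    return divmod(vaddr, size)
```

With 4 KB pages, the low 12 bits of an address are the offset and the remaining high bits select the page.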

Memory Zones

The Linux kernel doesn’t consider all of your physical RAM to be one great big undifferentiated pool of memory. Instead, it divides it up into a number of different memory regions, which it calls ‘zones’:

  • ZONE_DMA: the low 16 MB of memory. At this point it exists for historical reasons; once upon what is now a long time ago, there was hardware that could only do DMA into this area of physical memory.
  • ZONE_DMA32: it exists only in 64-bit Linux; it is the low 4 GB of memory, more or less. It exists because the transition to large-memory 64-bit machines has created a class of hardware that can only do DMA to the low 4 GB of memory.
  • ZONE_NORMAL: it is different on 32-bit and 64-bit machines. On 64-bit machines, it is all RAM from 4 GB or so on upwards. On 32-bit machines, it is all RAM from 16 MB to 896 MB, for complex and somewhat historical reasons.
  • ZONE_HIGHMEM: it exists only on 32-bit Linux; it is all RAM above 896 MB, including RAM above 4 GB on sufficiently large machines.

On a running system, the zone information can be read from /proc/zoneinfo.
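As a rough sketch of reading the zone list programmatically, the Python below extracts the node and zone names from text in the format of /proc/zoneinfo. It assumes the zone header lines look like `Node 0, zone   Normal`; the function names `zone_names` and `read_zone_names` are hypothetical helpers, not part of any library.

```python
import re

def zone_names(zoneinfo_text):
    """Return (node, zone) pairs found in /proc/zoneinfo-style text."""
    # Each zone section starts with a header such as "Node 0, zone   DMA32".
    pattern = re.compile(r"^Node\s+(\d+),\s+zone\s+(\S+)", re.MULTILINE)
    return [(int(node), zone) for node, zone in pattern.findall(zoneinfo_text)]

def read_zone_names(path="/proc/zoneinfo"):
    """Read the zone list from the running kernel (Linux only)."""
    with open(path) as f:
        return zone_names(f.read())
```

On a 64-bit machine you would expect to see zones such as DMA, DMA32, and Normal; the exact set depends on the kernel and hardware.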

Kernel Space and User Space

An application doesn’t need kernel privileges to access stack memory, nor memory blocks that the application dynamically allocated (although malloc may have to go through the kernel to obtain more memory).

Upon startup of an application, the OS sets up memory regions for it, including one for the stack and another for the code. When setting up these regions, the kernel specifies that only this specific application may access them; it does the same for space allocated through malloc.

As a rule of thumb, the kernel only arbitrates what applications may do in parallel, so that they don’t clash, either by queuing requests (I/O, print, …) or by partitioning the resource (memory, …). For memory, once a region is reserved for your application, you may access it freely, but if you try to access an address that the kernel didn’t map to your application, you’ll get a memory error signal (such as SIGSEGV) and the application will typically crash.
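One way to see which address ranges the kernel has mapped for a process is the /proc/<pid>/maps file. The sketch below, under the assumption that each line has the usual `start-end perms offset dev inode [name]` layout, parses a line and checks whether an address falls inside any mapped region. `parse_maps_line` and `is_mapped` are hypothetical helper names.

```python
def parse_maps_line(line):
    """Parse one line of /proc/<pid>/maps into a small dict."""
    fields = line.split(maxsplit=5)
    start, end = (int(part, 16) for part in fields[0].split("-"))
    return {
        "start": start,                 # first address of the region
        "end": end,                     # one past the last address
        "perms": fields[1],             # e.g. "rw-p" or "r-xp"
        "name": fields[5].strip() if len(fields) > 5 else "",
    }

def is_mapped(addr, regions):
    """True if addr lies inside one of the parsed regions."""
    return any(r["start"] <= addr < r["end"] for r in regions)
```

For the current process, the regions could be loaded with `[parse_maps_line(l) for l in open("/proc/self/maps")]`; an access outside every region is what triggers the memory error described above.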

Kernel memory management

As previously stated, the kernel manages memory per page, which is typically 4 KB in size. The kernel provides a low-level API to allocate pages, with higher-level allocators built on top of it. The core functions are:

  • struct page *alloc_pages(gfp_t gfp_mask, unsigned int order): allocate 2^order contiguous pages and return a pointer to the first page’s page structure.
  • void __free_pages(struct page *page, unsigned int order): free the pages allocated by alloc_pages(), given the page struct.
  • unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order): works the same as alloc_pages(), except that it returns the virtual address of the first page.
  • void free_pages(unsigned long addr, unsigned int order): free 2^order pages starting from the provided virtual address.
  • void *kmalloc(size_t size, gfp_t flags): this takes the number of bytes, instead of the number of pages. It is a higher-level interface than the page allocation functions above.
  • void kfree(void *addr): free the memory allocated by kmalloc().
  • void *vmalloc(unsigned long size): similar to kmalloc(), except that the memory it allocates is only guaranteed to be contiguous in virtual address space, not physically.
  • void vfree(void *addr): free the memory allocated by vmalloc().
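The `order` argument above is a power-of-two exponent, which is easy to get wrong. The following Python sketch only illustrates the arithmetic in user space; it is not kernel code, and `pages_for_order` and `order_for_size` are hypothetical helper names. It assumes 4096-byte pages unless told otherwise.

```python
def pages_for_order(order):
    """Number of pages covered by an allocation of the given order."""
    return 1 << order

def order_for_size(nbytes, page_size=4096):
    """Smallest order whose 2**order pages can hold nbytes."""
    if nbytes <= 0:
        raise ValueError("nbytes must be positive")
    pages = -(-nbytes // page_size)      # ceiling division
    return (pages - 1).bit_length()      # round the page count up to a power of two
```

For example, a 12 KB request needs three pages, so it is rounded up to order 2 (four pages), which is one reason byte-oriented interfaces such as kmalloc() are more convenient for small objects.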

User space memory management

The user space memory allocation is very different from kernel space.

malloc provides access to a process’s heap. The heap is a construct managed by the C standard library (commonly libc) that lets callers obtain exclusive use of blocks of memory within it.

Each allocation on the heap is called a heap cell. This typically consists of a header that holds information on the size of the cell as well as a pointer to the next heap cell. This makes a heap effectively a linked list.

When one starts a process, the heap contains a single cell that contains all the heap space assigned on startup. This cell exists on the heap’s free list.

When one calls malloc, the requested memory is carved out of the large heap cell and returned to the caller. The remainder is formed into a new heap cell that holds all the rest of the memory.

When one frees memory, the heap cell is added to the end of the heap’s free list. Subsequent mallocs walk the free list looking for a cell of suitable size.

As can be expected, the heap can become fragmented, so the heap manager may from time to time try to merge adjacent free heap cells.

When there is no memory left on the free list for a desired allocation, malloc calls brk or sbrk to request more memory pages from the kernel.
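The free-list behaviour described above can be sketched with a toy model. This is a simplified illustration in Python, not how any real libc implements malloc: it tracks offsets instead of real memory, ignores cell headers, and the class name `ToyHeap` and its methods are hypothetical.

```python
class ToyHeap:
    """Toy first-fit heap over an abstract range of `size` bytes."""

    def __init__(self, size):
        # At startup the free list holds one cell covering the whole heap.
        self.free_list = [(0, size)]   # (offset, length) pairs
        self.allocated = {}            # offset -> length

    def malloc(self, nbytes):
        # Walk the free list looking for the first cell that is big enough.
        for i, (offset, length) in enumerate(self.free_list):
            if length >= nbytes:
                if length == nbytes:
                    del self.free_list[i]
                else:
                    # Split: hand out the front, keep the rest as a new cell.
                    self.free_list[i] = (offset + nbytes, length - nbytes)
                self.allocated[offset] = nbytes
                return offset
        return None  # a real allocator would ask the kernel for more memory here

    def free(self, offset):
        # Freed cells are appended to the end of the free list.
        length = self.allocated.pop(offset)
        self.free_list.append((offset, length))

    def coalesce(self):
        # Merge free cells that are adjacent in address order.
        merged = []
        for offset, length in sorted(self.free_list):
            if merged and merged[-1][0] + merged[-1][1] == offset:
                merged[-1] = (merged[-1][0], merged[-1][1] + length)
            else:
                merged.append((offset, length))
        self.free_list = merged
```

Allocating two blocks, freeing both, and then calling `coalesce()` brings the free list back to a single cell, which is the defragmentation step mentioned above.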

Now there are a few modifications to optimize heap operations.

  • For large memory allocations (above an implementation-specific threshold; glibc, for example, defaults to 128 KiB), the heap manager may go straight to the OS and allocate whole memory pages.
  • The heap may specify a minimum size of allocation to prevent large amounts of fragmentation.
  • The heap may also divide itself into bins, one for small allocations and one for larger allocations, to make larger allocations quicker.
  • There are also clever mechanisms for optimizing multi-threaded heap allocation.
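To illustrate the first point, going straight to the OS for whole pages, here is a small Python sketch that rounds a request up to a multiple of the page size and obtains an anonymous mapping with the standard `mmap` module. The function names `round_up_to_pages` and `map_anonymous` are hypothetical, and this only mirrors the idea; a C allocator would call mmap directly.

```python
import mmap

def round_up_to_pages(nbytes, page_size=mmap.PAGESIZE):
    """Round a byte count up to a whole number of pages."""
    return -(-nbytes // page_size) * page_size

def map_anonymous(nbytes):
    """Get page-aligned memory from the OS, not backed by any file."""
    # A file descriptor of -1 requests an anonymous mapping.
    return mmap.mmap(-1, round_up_to_pages(nbytes))
```

Even a one-byte request consumes a full page this way, which is why allocators reserve this path for large allocations.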

Table of Contents for Physical and Virtual Memory

  • Physical vs Virtual Memory in Linux
  • Commands for Memory Management in Linux
      1. /proc/meminfo
      2. The top command
      3. free command
      4. vmstat command
  • Conclusion

Physical vs Virtual Memory in Linux

Before we get into the nitty-gritty, it’s important to know that there are two types of memory in Linux.

  • Physical Memory
  • Virtual Memory

Physical memory is the actual memory present in the machine. Virtual memory is a layer of memory addresses that map to physical addresses.

The virtual address space is usually larger than physical memory.

The Linux kernel uses virtual memory to let programs reserve memory.

While executing a program, the processor fetches instructions using virtual addresses. Before each memory access, it converts the virtual address into a physical address, using the mapping information stored in the page tables.
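The lookup described above can be modelled in a few lines. This is a deliberately simplified sketch in Python: real page tables are multi-level structures walked by hardware, whereas here a plain dict maps virtual page numbers to physical frame numbers. The name `translate` and the 4096-byte default page size are assumptions for illustration.

```python
def translate(vaddr, page_table, page_size=4096):
    """Translate a virtual address using a {virtual page: physical frame} dict."""
    page, offset = divmod(vaddr, page_size)
    if page not in page_table:
        # On real hardware this is where a page fault would be raised.
        raise LookupError(f"page fault: virtual page {page} is not mapped")
    # The offset within the page is unchanged by translation.
    return page_table[page] * page_size + offset
```

Only the page number is remapped; two processes can therefore use the same virtual address and still end up at different physical locations.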

Commands for Memory Management in Linux

Let’s go over some of the commands for managing memory in Linux.

1. /proc/meminfo

The /proc/meminfo file reports detailed information about the system's memory. To view this file, use the cat command:

$ cat /proc/meminfo

This command outputs a lot of parameters related to memory. To get the total physical memory from the /proc/meminfo file, use:

$ grep MemTotal /proc/meminfo

To get the total size of the kernel's vmalloc virtual address area from the /proc/meminfo file, use:

$ grep VmallocTotal /proc/meminfo
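If you want these values in a script, the same file is easy to parse. The sketch below is a minimal Python example that assumes each line has the form `Name:   value [kB]`; the function names `parse_meminfo` and `read_meminfo` are hypothetical.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {name: integer value} dict."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue                      # skip lines without a "Name:" prefix
        fields = rest.split()
        if fields:
            # Most values are in kB; a few (such as HugePages_Total) are plain counts.
            info[key.strip()] = int(fields[0])
    return info

def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

For instance, `read_meminfo()["MemTotal"]` gives the same number that the grep command above prints.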

2. The top command

The top command lets you monitor processes and system resource usage on Linux. It gives a dynamic, real-time view of the system, so the values in its output keep changing while it runs.

$ top

The upper portion shows the current usage statistics of your system resources. The lower portion contains the information about the running processes. You can move up and down the list using the up/down arrow keys and use q to quit.

3. free command

The free command displays the amount of free and used memory in the system. It’s a simple and compact command that tells you how much free RAM you have, as well as the total amount of physical and swap memory on your system.

$ free

Values for each field are in kibibytes (KiB). A kibibyte (1024 bytes) is not the same as a kilobyte (1000 bytes). To get the output in a more human-readable format, use:

$ free -h
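The unit conversion behind a human-readable display is simple arithmetic on powers of 1024. The following Python sketch shows the idea; it is not the exact formatting or rounding that free uses, and `kib_to_human` is a hypothetical helper name.

```python
def kib_to_human(kib):
    """Format a value given in KiB using binary units (Ki, Mi, Gi, Ti)."""
    units = ["Ki", "Mi", "Gi", "Ti"]
    value = float(kib)
    for unit in units:
        # Stop at the first unit where the number is below 1024,
        # or at the largest unit we know about.
        if value < 1024 or unit == units[-1]:
            return f"{value:.1f}{unit}"
        value /= 1024
```

Each step up the unit list divides by 1024, not 1000, which is the kibibyte versus kilobyte distinction mentioned above.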

4. vmstat command

vmstat is a performance monitoring tool in Linux that reports virtual memory statistics for your system. It gives useful information about processes, memory, block I/O, paging, disk, and CPU scheduling.

$ vmstat

How Linux Operating System Memory Management Works

The memory management subsystem is one of the most important parts of the operating system. Since the early days of computing, there has been a need for more memory than exists physically in a system. Strategies have been developed to overcome this limitation and the most successful of these is virtual memory. Virtual memory makes the system appear to have more memory than it actually has by sharing it between competing processes as they need it.

Virtual memory does more than just make your computer’s memory go further. The memory management subsystem provides:

  • Large Address Spaces: The operating system makes the system appear as if it has a larger amount of memory than it actually has. The virtual memory can be many times larger than the physical memory in the system.
  • Protection: Each process in the system has its own virtual address space. These virtual address spaces are completely separate from each other, so a process running one application cannot affect another. Also, the hardware virtual memory mechanisms allow areas of memory to be protected against writing. This protects code and data from being overwritten by rogue applications.
  • Memory Mapping: Memory mapping is used to map image and data files into a process's address space. In memory mapping, the contents of a file are linked directly into the virtual address space of a process.
  • Fair Physical Memory Allocation: The memory management subsystem allows each running process in the system a fair share of the physical memory of the system.
  • Shared Virtual Memory: Although virtual memory allows processes to have separate (virtual) address spaces, there are times when you need processes to share memory. For example, there could be several processes in the system running the bash command shell. Rather than have several copies of bash, one in each process's virtual address space, it is better to have only one copy in physical memory and have all of the processes running bash share it. Dynamic libraries are another common example of executing code shared between several processes. Shared memory can also be used as an Inter Process Communication (IPC) mechanism, with two or more processes exchanging information via memory common to all of them. Linux supports the Unix System V shared memory IPC.
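Memory mapping, as described in the list above, can be tried from user space with Python's standard `mmap` module. This is a minimal sketch under the assumption that the file already exists and is at least as long as the data written; `write_through_mapping` is a hypothetical helper name.

```python
import mmap

def write_through_mapping(path, data):
    """Map a file into this process's address space and modify it in place."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; it must not be empty.
        with mmap.mmap(f.fileno(), 0) as mapping:
            mapping[:len(data)] = data   # plain memory writes, no write() call
            mapping.flush()              # push the changed pages back to the file
```

After the call, reading the file normally shows the new bytes, because the mapping and the file refer to the same underlying pages.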

Conclusion

This post covered how Linux manages memory, along with a few commands that you can use to inspect it. These commands give key insights into your system's memory. You can experiment with each of them to get a general idea of how memory is being used by the processes on your system.
