
Linux Kernel Parameters and Tuning

Types of options/parameters:

Sysctl Parameters



sysctl -a                               # show all kernel parameters
sysctl net.ipv4.tcp_fastopen            # check specific parameter
sudo sysctl -w net.ipv4.tcp_fastopen=3  # temporarily change param, lost after reboot

Persistently set kernel parameters:

/etc/sysctl.conf
net.ipv4.tcp_fastopen = 3
vm.nr_hugepages = 1024

Apply changes in config file:



sudo sysctl -p

Separate file for custom settings:



echo "net.core.somaxconn = 1024" | sudo tee /etc/sysctl.d/99-custom.conf

Apply changes:



sudo sysctl --system
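
After applying settings it can help to confirm that the running kernel actually matches a config file. The following is a minimal sketch, assuming the file holds plain `key = value` lines; the helper names `sysctl_key_to_path` and `check_sysctl_file` are made up for this example and are not part of any tool.

```shell
# Hypothetical helper: map a sysctl key to its /proc/sys path.
# Assumes no key component contains a literal dot.
sysctl_key_to_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Hypothetical helper: print every key in FILE whose running value
# differs from the file, or that does not exist on this kernel.
check_sysctl_file() {
    grep -v '^[[:space:]]*[#;]' "$1" | grep '=' |
    while IFS='=' read -r key want; do
        key=$(printf '%s' "$key" | tr -d '[:space:]')
        want=$(printf '%s' "${want%%#*}" | xargs)   # drop trailing comment, normalize spaces
        path=$(sysctl_key_to_path "$key")
        if [ ! -r "$path" ]; then
            printf 'MISSING  %s\n' "$key"
        else
            have=$(xargs < "$path")                 # normalize tabs and spaces
            [ "$have" = "$want" ] ||
                printf 'DIFFERS  %s: running=%s file=%s\n' "$key" "$have" "$want"
        fi
    done
}
```

Running `check_sysctl_file /etc/sysctl.d/99-custom.conf` prints nothing when every setting in the file is live.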

Network



fs.file-max = 2097152  # File Descriptor Limits - files, sockets, and pipes
                       # system wide, not per process like ulimit settings

net.core.rmem_max = 268435456      # Maximum receive buffer size a socket may request (bytes)
net.core.wmem_max = 268435456      # Maximum send buffer size a socket may request (bytes)
net.core.rmem_default = 67108864   # Default receive buffer size for new sockets (bytes)
net.core.wmem_default = 67108864   # Default send buffer size for new sockets (bytes)

                                         # increase for bandwidth, decrease to save memory
net.ipv4.tcp_rmem = 4096 87380 6291456   # Defines the minimum, default, and maximum buffer sizes for TCP receive buffers.
net.ipv4.tcp_wmem = 4096 65536 16777216  # Defines the minimum, default, and maximum buffer sizes for TCP send buffers.


net.core.somaxconn = 1024              # max connections queued for acceptance by a listening socket
                                       # ( good for web or app server ex. 65535 )
net.ipv4.tcp_max_syn_backlog = 8192    # max syn backlog ( half open connections waiting )
                                       # ( good for web or app server ex. 65535 )

net.ipv4.tcp_fin_timeout=30         # seconds an orphaned connection stays in FIN-WAIT-2, reduce to free resources faster
net.ipv4.tcp_keepalive_time=600     # idle seconds before the first keepalive probe is sent

net.ipv4.tcp_fastopen = 3       # Enable TCP Fast Open for client (1) and server (2) roles, saves a round trip on repeat connections
net.ipv4.tcp_low_latency = 1    # Historical low-latency mode for TCP; has no effect on kernels 4.14 and later
net.ipv4.tcp_timestamps = 0     # Disable TCP timestamps (saves header bytes, but also disables PAWS and timestamp-based RTT measurement)

net.ipv4.ip_local_port_range = 1024 65000 #  Widens the ephemeral port range so more outbound connections can be open at once.

net.ipv4.ip_forward=1           # enable IP forwarding ( when setting up a router )
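
The "increase for bandwidth" note on the TCP buffers can be made concrete with the bandwidth-delay product, a common rule of thumb for how many bytes must be in flight to keep a link full. This is a rough sizing sketch, not a tuning recommendation; `bdp_bytes` is a hypothetical helper name.

```shell
# bdp_bytes MBIT_PER_SEC RTT_MS
# bytes in flight = (Mbit/s * 1,000,000 / 8) * (ms / 1000) = Mbit/s * 125 * ms
bdp_bytes() {
    echo $(( $1 * 125 * $2 ))
}

bdp_bytes 1000 50     # 1 Gbit/s link, 50 ms round trip   -> 6250000
bdp_bytes 10000 100   # 10 Gbit/s link, 100 ms round trip -> 125000000
```

The tcp_rmem maximum of 6291456 shown above is just large enough for the first case; the second would need a much larger maximum.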

Security



net.ipv4.tcp_syncookies = 1             # Enable SYN Flood Protection, Helps mitigate DDoS attacks.
net.ipv4.conf.all.rp_filter = 1         # Enable Reverse Path Filtering (Prevents IP Spoofing)
net.ipv4.conf.all.accept_redirects = 0  # Disable ICMP Redirects
net.ipv4.conf.all.send_redirects = 0    # Disable ICMP Redirects

kernel.randomize_va_space=2       # ASLR - randomize the address space layout to make attacks that rely on known memory addresses harder ( 0: disable, 1: conservative, 2: full )

CPU / Process / other



kernel.sched_min_granularity_ns=15000000     # min time slice a task runs before it can be preempted ( moved to debugfs in kernel 5.13+ )
kernel.pid_max=65535                         # max value a PID can have
kernel.sched_wakeup_granularity_ns=25000000  # a waking task preempts the running one only if it is owed more than this much CPU time ( moved to debugfs in kernel 5.13+ )

fs.inotify.max_user_watches=524288   # max files that inotify can watch, per user
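
Because the scheduler tunables have moved between kernel versions, a script may need to look in more than one place. A small sketch, assuming the two locations known to have been used (the sysctl tree on older kernels, debugfs on 5.13 and later); `find_sched_tunable` and its optional path prefix are hypothetical and exist only for this example. Some newer kernels have dropped these tunables altogether, in which case nothing is found.

```shell
# find_sched_tunable NAME [ROOT]
# NAME is the short name, e.g. min_granularity_ns.
# ROOT is a path prefix, empty on a real system (used here for testing only).
# Reading debugfs normally requires root.
find_sched_tunable() {
    name=$1
    root=${2:-}
    for path in \
        "$root/proc/sys/kernel/sched_$name" \
        "$root/sys/kernel/debug/sched/$name"
    do
        if [ -e "$path" ]; then
            printf '%s\n' "$path"
            return 0
        fi
    done
    return 1   # not present on this kernel, or not visible to this user
}
```

Run as root, `find_sched_tunable wakeup_granularity_ns` prints the path to write to, or returns non-zero if the kernel no longer has the tunable.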

Memory



kernel.shmmax=2147483648        # max size for single shared memory segment

kernel.sem=250 32000 32 128     # parameters for semaphore arrays - SEMMSL SEMMNS SEMOPM SEMMNI ( quote the value when using sysctl -w )
                                # SEMMSL: Maximum number of semaphores per semaphore set.
                                # SEMMNS: Total number of semaphores system-wide.
                                # SEMOPM: Maximum number of operations per semop system call.
                                # SEMMNI: Maximum number of semaphore sets.

vm.dirty_background_ratio=20    # percentage of available memory that can hold dirty pages before background writeback starts
vm.dirty_ratio=30               # percentage of available memory that can hold dirty pages before writing processes are forced to flush synchronously
vm.max_map_count=262144         # max number of memory map areas that a process can have
vm.overcommit_memory=1          # memory overcommit behavior - 0: heuristic, 1: always, 2: never
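
To get a feel for what a dirty-page percentage means on a given machine, it can be turned into an approximate size. This sketch uses MemAvailable from /proc/meminfo as a stand-in for the "available memory" the kernel bases the ratio on, which is an assumption: the kernel's own figure is computed differently, so treat the result as an estimate. `dirty_limit_kb` is a hypothetical helper name.

```shell
# dirty_limit_kb PERCENT [MEMINFO_FILE]
# Prints roughly how many kB of dirty pages PERCENT allows.
dirty_limit_kb() {
    awk -v pct="$1" '/^MemAvailable:/ { printf "%d\n", $2 * pct / 100 }' \
        "${2:-/proc/meminfo}"
}
```

For example, `dirty_limit_kb 20` estimates the background writeback threshold for vm.dirty_background_ratio=20.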

Swappiness

Minimize swapping ( this does not turn swap off, use swapoff -a for that ):



sudo sysctl -w vm.swappiness=0

Check swappiness:



cat /proc/sys/vm/swappiness

Swappiness values:

0 Avoid swapping as much as possible.
100 Aggressively swap data to disk.
60 Default on many systems ( including my Ubuntu desktop ), a balanced setting for general-purpose servers and desktops with mixed workloads.
10 Use a value near this for large memory-intensive apps ( ex. Redis ).
80-100 For systems with a lot of swap but little RAM.

Increase Swappiness (e.g., 80-100):
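
A sketch of how a higher value could be set and persisted, following the same sysctl.d pattern used earlier on this page. `swappiness_conf_line` is a hypothetical helper and `99-swappiness.conf` is an assumed file name; newer kernels accept values above 100, but this example sticks to the 0-100 range discussed here.

```shell
# swappiness_conf_line VALUE
# Validates VALUE (0-100) and prints the matching sysctl config line.
swappiness_conf_line() {
    case $1 in
        ''|*[!0-9]*)
            echo "swappiness must be a whole number" >&2
            return 1 ;;
    esac
    if [ "$1" -gt 100 ]; then
        echo "swappiness above 100 is outside the range covered here" >&2
        return 1
    fi
    printf 'vm.swappiness = %s\n' "$1"
}

# Temporary change (lost after reboot):
#   sudo sysctl -w vm.swappiness=80
# Persistent change:
#   swappiness_conf_line 80 | sudo tee /etc/sysctl.d/99-swappiness.conf
#   sudo sysctl --system
```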

Kernel Boot Parameters

Example config:

/etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash console=tty1"

Apply Config:



sudo update-grub
sudo reboot
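
After the reboot, the parameters the kernel actually received are visible in /proc/cmdline. A minimal sketch for checking one of them; `has_boot_param` is a hypothetical helper, and it splits on whitespace, so quoted values containing spaces are not handled.

```shell
# has_boot_param NAME [CMDLINE_FILE]
# Succeeds if NAME appears as a bare flag or as NAME=value.
has_boot_param() {
    tr -s ' \t' '\n\n' < "${2:-/proc/cmdline}" | (
        while IFS= read -r word; do
            case $word in
                "$1"|"$1"=*) exit 0 ;;   # exits the subshell only
            esac
        done
        exit 1
    )
}
```

For example, `has_boot_param quiet && echo "quiet is active"`.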

Some boot parameters that could be used:

quiet Suppresses most boot messages to show a cleaner boot process.
splash Displays a graphical boot splash screen.
console=tty1 Specify console for kernel messages
debug Enable kernel debug messages ( raises the console log level )
noht Disable hyper threading ( intel only ) - OLDER
nosmt Disable Simultaneous Multithreading ( multiple arch like Intel, ARM, RISC-V ) - NEWER
nomodeset Disables kernel mode setting, useful for avoiding display issues during boot (particularly useful for graphics-related issues).
noapic Disables the Advanced Programmable Interrupt Controller (APIC), used for interrupt management. This may help resolve boot issues related to multi-processor systems.
acpi=off Disables the ACPI (Advanced Configuration and Power Interface). This may be helpful if you’re facing issues related to power management or hardware compatibility.
nolapic Disables Local APIC support, which can be useful for troubleshooting on some systems with multi-core processors.
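selinux=0 Disable SELinux
apparmor=0 Disable AppArmor
init=/bin/bash Start a bash shell instead of the normal init ( emergency recovery )
fastboot Skip filesystem checks at boot ( handled by the init system, not the kernel itself )
transparent_hugepage=always Enable Transparent Huge Pages (THP) system-wide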

Disable Hyper-Threading (HT)

Disabling SMT stops two hardware threads from sharing one physical core, which removes contention for that core's execution units and caches.

Temporarily at run time:



echo off | sudo tee /sys/devices/system/cpu/smt/control

Check:



cat /sys/devices/system/cpu/smt/active   # 0 disabled, 1 enabled
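
The two sysfs files answer slightly different questions: `control` is the requested policy, `active` is whether sibling threads are currently online. A small sketch that reports both; `smt_status` is a hypothetical helper, and the optional directory argument exists only so it can be pointed at a test directory.

```shell
# smt_status [SMT_DIR]
# control is one of: on, off, forceoff, notsupported, notimplemented
# active is 1 when sibling threads are online, otherwise 0
smt_status() {
    dir=${1:-/sys/devices/system/cpu/smt}
    if [ ! -r "$dir/control" ]; then
        echo "SMT control interface not found" >&2
        return 1
    fi
    printf 'control=%s active=%s\n' "$(cat "$dir/control")" "$(cat "$dir/active")"
}
```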

Huge Pages ( Static and Transparent )

Transparent Huge Pages (THP)

Configure at boot with kernel parameter:



transparent_hugepage=always

Configure in runtime:



echo always | sudo tee /sys/kernel/mm/transparent_hugepage/enabled

always THP is enabled and used when possible.
madvise THP is used only when explicitly requested by applications using madvise().
never THP is disabled.
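
Reading the `enabled` file back shows all three modes with the active one in square brackets, for example `always [madvise] never`. A minimal sketch for extracting just the active mode; `thp_active_mode` is a hypothetical helper name.

```shell
# thp_active_mode [FILE]
# Prints the bracketed (active) value from a THP sysfs file.
thp_active_mode() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' \
        "${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
}
```

The same function works on the `defrag` file in that directory, since it uses the same bracket convention.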

Defragmentation with THP

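Runtime setting ( sysfs, there is no sysctl for this ):



echo defer | sudo tee /sys/kernel/mm/transparent_hugepage/defrag   # always, defer, defer+madvise, madvise, never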

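Kernel parameter ( at boot ), which selects the THP mode only ( always, madvise, or never ); the defrag mode has no boot parameter and must be set at runtime:



GRUB_CMDLINE_LINUX="transparent_hugepage=madvise"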

Huge Pages ( Static Huge Pages )

Set with sysctl ( see the sysctl section above ):



sudo sysctl -w vm.nr_hugepages=1024   # reserve this many huge pages
sudo sysctl -w vm.nr_hugepages=1      # resize the pool to one page, applied immediately
sudo sysctl -w vm.nr_hugepages=0      # release the pool immediately
grep -i huge /proc/meminfo              # check



cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages     # 2 MB
cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages  # 1 GB
echo 1024 | sudo tee /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages  # 2 MB
echo 4 | sudo tee /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages  # 1 GB
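
Memory reserved for static huge pages is taken out of general use, so it is worth checking how much a pool holds. A sketch that multiplies the page count by the default huge page size from /proc/meminfo; `hugepage_pool_kb` is a hypothetical helper, and it only covers the default size, not additional pools of other sizes.

```shell
# hugepage_pool_kb [MEMINFO_FILE]
# Prints the size of the default-size huge page pool in kB.
hugepage_pool_kb() {
    awk '/^HugePages_Total:/ { n = $2 }
         /^Hugepagesize:/    { sz = $2 }
         END                 { print n * sz }' "${1:-/proc/meminfo}"
}
```

With 1024 pages of 2048 kB this prints 2097152, i.e. 2 GB set aside.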

Reserve the static huge page pool at boot with kernel parameters like these:



default_hugepagesz=2M    # default size when app doesn't specify
hugepagesz=2M            # specific size to allocate
hugepages=1024           # number of pages to allocate

Allocate different sizes:



default_hugepagesz=2M hugepagesz=1G hugepages=4 hugepagesz=2M hugepages=1024
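
Before adding hugepagesz=1G it is worth checking that the CPU supports 1 GB pages. On x86_64 this is indicated by the `pdpe1gb` flag in /proc/cpuinfo; other architectures report support differently, so this sketch is x86-only. `supports_1g_pages` is a hypothetical helper name.

```shell
# supports_1g_pages [CPUINFO_FILE]
# Succeeds if the x86 pdpe1gb CPU flag is present.
supports_1g_pages() {
    grep -qw pdpe1gb "${1:-/proc/cpuinfo}"
}
```

For example, `supports_1g_pages && echo "1 GB huge pages available"`.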