Linux processes

What are processes and how to manage them

Linux processes overview

Table of contents

  1. What is a process
  2. Process types and lifecycle
  3. Process monitoring
  4. File access monitoring

What is a process

  • Programs and processes are core concepts in Linux and in computing at large.
  • A program is a blob of binary data consisting of a series of instructions for the CPU as well as other resources (images, audio files).
  • Creating a process means creating a running instance of a program by doing the following :
    • Copy the program instructions from disk into RAM.
    • Allocate additional RAM for variable storage.
    • Set up some in-memory flags for the system to monitor and manage the process execution.
    • Have the CPU execute the program's instructions.
  • Once the program execution has completed, the system destroys the process and frees the allocated RAM.
  • As a result :
    • The system can support several processes running concurrently, whether they are instances of the same program or not.
    • Processes can run on behalf of users or on behalf of the system itself, in which case they are called daemons.
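
The difference between a program and its processes can be illustrated with a minimal Python sketch. It starts the same program (here the Python interpreter itself, chosen only for portability) twice and shows that each running instance is a distinct process. The half-second sleep is an arbitrary value picked for the illustration.

```python
import subprocess
import sys

# The same program (the Python interpreter) is started twice; each running
# instance is a separate process with its own PID and its own memory.
program = [sys.executable, "-c", "import time; time.sleep(0.5)"]
first = subprocess.Popen(program)
second = subprocess.Popen(program)

print("first instance PID :", first.pid)
print("second instance PID:", second.pid)

# poll() returns None while a process is still running.
print("first still running :", first.poll() is None)
print("second still running:", second.poll() is None)

# Once execution completes, the system destroys each process; wait() collects
# its exit status (0 means success).
exit_codes = (first.wait(), second.wait())
print("exit codes:", exit_codes)
```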

Process types and lifecycle

  • Linux is a multi-user, multitasking system : multiple users commonly run multiple processes at the same time on the same system.
  • Multiple types of processes exist :

Interactive processes

  • Initiated by a user.
  • Initialized and controlled through a terminal session.
  • Can run in the foreground (the terminal only accepts input for this process until it finishes or is suspended).
  • Can run in the background (other processes can be started from the terminal from which the process was started).
  • Switching processes between foreground and background is done using job control (Ctrl+Z, bg, fg, jobs).

Automatic processes

  • Initiated by a user.
  • Not attached to a terminal but pushed into a spooler (FIFO queue).
  • Pulled from the spooler and executed either :
    • At a scheduled time (at).
    • When system resources become available (batch).

Daemons

  • Started automatically by a service manager like systemd.
  • Remain idle in the background until their service is needed.

List of process attributes

attribute  variable  description
PID        $$        Process ID
PPID       $PPID     Parent process ID
RUID       $UID      Real user ID (user who initiated the process)
EUID       $EUID     Effective user ID (user whose permissions the process inherits, usually the same as RUID)
RGID       N/A       Real group owner ID (RUID's primary group)
EGID       N/A       Effective group owner ID (EUID's primary group)

Note : the above variables are shell variables (set by the shell itself, for instance Bash) as opposed to environment variables.
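
Outside of a shell, the same attributes can be read through the corresponding system calls. The following minimal sketch uses Python's standard os module to print the six attributes of the table for the current process. It only illustrates where the values come from and is not how the shell itself fills its variables.

```python
import os

# Each entry maps an attribute of the table to the call that returns it.
attributes = {
    "PID": os.getpid(),    # process ID
    "PPID": os.getppid(),  # parent process ID
    "RUID": os.getuid(),   # real user ID
    "EUID": os.geteuid(),  # effective user ID
    "RGID": os.getgid(),   # real group ID
    "EGID": os.getegid(),  # effective group ID
}

for name, value in attributes.items():
    print(f"{name:<5}{value}")
```

For an ordinary program, RUID and EUID are equal; they typically differ only for set-user-ID executables.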

Process lifecycle

  1. Creation :

    • fork : the parent process makes a copy of itself in a separate address space :
      • Open file descriptors (I/O devices), environment and priority are inherited from the parent.
      • The child process gets a new PID of its own, and its PPID is the parent's PID.
    • exec : the memory space of the child process is overwritten with the data of the new program to execute :
      • The PID does not change across an exec call.
      • Multiple consecutive exec calls can happen without a fork.
      • Example : init forks and its child execs getty, which execs login, which execs bash; bash then forks to run commands.
    • fork and exec are system calls : they have to be invoked from the code of the processes themselves.
    • A daemonized process keeps running once its parent process terminates : its PPID changes from its parent process's PID to init's PID.
  2. Termination :

    • Exit : the process exits normally once all the instructions are processed and returns an exit status (success / failure).
    • Killed : the process receives a SIGKILL signal that causes it to be immediately terminated by the kernel.
    • Other : the process can also stop unexpectedly for other reasons (power outage, etc.).
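
The creation and termination steps above can be sketched in Python, whose os module exposes thin wrappers around the fork, exec and wait system calls. The helper name run_program is hypothetical, and the exit status 127 for a failed exec follows shell convention rather than any requirement of the kernel.

```python
import os
import sys


def run_program(argv):
    """Fork, exec argv in the child, and return the child's exit status."""
    # Flush buffered output before forking so that parent and child output
    # do not appear out of order.
    sys.stdout.flush()
    pid = os.fork()
    if pid == 0:
        # Child : a copy of the parent, running under a new PID.
        try:
            # Replace the child's memory image with the new program.
            # execvp only comes back (by raising) if the exec failed.
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)  # shell convention for "command not found"
    # Parent : fork returned the child's PID; wait for it to terminate.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)  # normal exit
    return -os.WTERMSIG(status)        # terminated by a signal


# The child execs a second Python interpreter that exits with status 3.
print("exit status:", run_program([sys.executable, "-c", "raise SystemExit(3)"]))
```

This is essentially what a shell does for every external command : fork a child, exec the command in it, then wait for its exit status.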

Note : init is usually a symbolic link to the system's service manager (usually systemd), which always runs with PID 1.

Common POSIX signals

signal   value  description
SIGTERM  15     Terminate the process in an orderly way (the process can catch or ignore it)
SIGINT   2      Interrupt the process (the process can elect to ignore it)
SIGKILL  9      Kill the process (cannot be caught or ignored, the process is immediately terminated)
SIGHUP   1      Hang up (terminal closed); by convention, many daemons reread their configuration file

Notes :

  • The kill command is used to send signals to processes, for instance kill -s SIGTERM <process-id>.
  • Pressing Ctrl+C in a terminal is the equivalent of sending SIGINT to the process running in the foreground.
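
Sending a signal from a program works the same way as with the kill command. The sketch below, which assumes a Linux system, starts a sleeping child process, sends it a signal and reports how it terminated. The helper name terminate_with is hypothetical, and the child is a Python interpreter only so that the example does not depend on any external command.

```python
import signal
import subprocess
import sys


def terminate_with(sig):
    """Start a sleeping child, send it sig, and return its return code."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    # Same effect as : kill -s <signal> <process-id>
    child.send_signal(sig)
    # subprocess reports "terminated by signal N" as the return code -N.
    return child.wait(timeout=10)


for sig in (signal.SIGTERM, signal.SIGKILL):
    print(f"{sig.name} ({int(sig)}) -> return code {terminate_with(sig)}")
```

The child here does not install a SIGTERM handler, so both signals end it; a process that does handle SIGTERM could clean up first, whereas SIGKILL never reaches the process's own code.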

Process monitoring

  • ps is the base tool used for process visualization.
    • With no options, it selects all processes that meet both of the following conditions :
      1. The process effective user ID (euid=EUID) is the same as the current user ID.
      2. The process is associated with the same terminal from which ps was invoked.
    • With no options, it displays the following information :
      1. Process ID (pid=PID)
      2. Terminal associated with the process (tname=TTY)
      3. Cumulative CPU time in [DD-]hh:mm:ss format (time=TIME)
      4. Executable file name (ucmd=CMD)
  • ps can be passed options as :
    • UNIX options (preceded by a dash, can be grouped, example ps -ef; note that ps -aux is not the same as ps aux)
    • BSD options (no dash, grouped, example ps aux)
    • GNU long options (two dashes, not grouped, example ps --tty "$(tty)")
  • top can also be used to display a dynamic real-time view of a running system, however it is less flexible than ps.
  • Both commands have many options : their man pages (man ps, man top) are worth studying in order to manage processes effectively.
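
The two selection conditions that ps applies when run with no options can be approximated by reading the /proc pseudo-filesystem directly. The following Python sketch assumes a Linux system with /proc mounted; the function names are hypothetical, and it is an illustration of the selection rule rather than a reimplementation of ps.

```python
import os


def read_proc(pid):
    """Return (euid, tty_nr, command) for a PID, read from /proc."""
    with open(f"/proc/{pid}/stat", errors="replace") as f:
        stat = f.read()
    # The command name sits between parentheses and may itself contain
    # spaces or parentheses, so split on the first "(" and the last ")".
    command = stat[stat.index("(") + 1 : stat.rindex(")")]
    fields = stat[stat.rindex(")") + 2 :].split()
    tty_nr = int(fields[4])  # fields : state, ppid, pgrp, session, tty_nr
    euid = -1
    with open(f"/proc/{pid}/status", errors="replace") as f:
        for line in f:
            if line.startswith("Uid:"):
                euid = int(line.split()[2])  # real, effective, saved, fs
                break
    return euid, tty_nr, command


def default_ps_selection():
    """List (pid, command) pairs sharing our effective UID and terminal."""
    my_euid, my_tty, _ = read_proc(os.getpid())
    selected = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            euid, tty_nr, command = read_proc(int(entry))
        except (FileNotFoundError, ProcessLookupError, PermissionError):
            continue  # the process exited or is not readable
        if euid == my_euid and tty_nr == my_tty:
            selected.append((int(entry), command))
    return sorted(selected)


for pid, command in default_ps_selection():
    print(f"{pid:>7} {command}")
```

When the script runs without a controlling terminal, tty_nr is 0 and the sketch matches every process of the same user that also has no terminal, which is one of the places where it is cruder than the real tool.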

Notes :

  • htop is a modern, practical and recommended alternative to top.
  • Login shells (where the user provides credentials at startup) are preceded by - in ps -f output (for instance -bash).

File access monitoring

  • lsof is an advanced and comprehensive tool used to monitor file access by processes on a Linux system.

    • With no options, lsof lists all open files accessed by all active processes.
    • By default, the device cache file is disabled.
    • Some of the most commonly opened files are :
      • Directories.
      • Regular files.
      • Block device files.
      • Character device files.
      • Network files (network sockets, NFS files, UNIX domain sockets).
  • When passing options to lsof :

    • Each option can return a different list of files as a result.

    • lsof returns the union of all the results for the different options.

    • Passing the -a option returns the intersection of all the results instead.

    • Arguments can be negated using ^ when passed to options.

    • The -c option accepts POSIX extended regex patterns as arguments when the pattern is enclosed in slashes (for instance -c /^ba/).

    • The most common options are :

      option  displays
      -u      Files opened by processes belonging to a specific user
      -p      Files opened by processes with a specific PID
      -c      Files opened by processes running a specific command
      -i      Network files
  • Details on output :

    • FD: file descriptor.

      • See the man page for the full list.

      • Most frequent values are :

        value         description
        (any number)  File descriptor number
        cwd           Current working directory
        rtd           Root directory
        pd            Parent directory
        txt           Program text (code and data)
        mem           Memory-mapped file
      • The file access mode flag is appended to FD in the output :

        • Space / empty if access mode is unknown.
        • r for read access.
        • w for write access.
        • u for read and write access.
    • TYPE: inode type associated with the file.

      • See the man page for the full list.

      • Most frequent values are :

        value  description
        DIR    Directory
        REG    Regular file
        CHR    Character device file
        BLK    Block device file
        IPv4   IPv4 socket
        unix   UNIX domain socket
        FIFO   FIFO special file
        P...   /proc files (process information pseudo-filesystem)
    • DEVICE: device numbers (major,minor), reported for the following file types :

      • Block device files.
      • Character device files.
      • Regular files.
      • Directories.
      • NFS files.
    • NODE: node identifier for the file, depending on its type :

      • Inode number for local and NFS files.
      • Protocol for network sockets.
      • STR for streams.
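
The FD, TYPE, DEVICE and NODE columns described above can be approximated for a single process by reading /proc/<pid>/fd and /proc/<pid>/fdinfo. The following Python sketch assumes a Linux system with /proc mounted. The function names are hypothetical, sockets are reported with a generic sock label because their domain is not resolved here, and entries such as cwd, txt or mem are not covered.

```python
import os
import stat
import tempfile

# Access mode bits of the open flags, mapped to the letter appended to FD.
ACCESS_MODES = {os.O_RDONLY: "r", os.O_WRONLY: "w", os.O_RDWR: "u"}


def inode_type(mode):
    """Map a st_mode value to a TYPE label."""
    checks = [
        (stat.S_ISDIR, "DIR"), (stat.S_ISREG, "REG"), (stat.S_ISCHR, "CHR"),
        (stat.S_ISBLK, "BLK"), (stat.S_ISFIFO, "FIFO"), (stat.S_ISSOCK, "sock"),
    ]
    return next((label for test, label in checks if test(mode)), "?")


def open_files(pid="self"):
    """Return (FD, TYPE, DEVICE, NODE, NAME) rows for numbered descriptors."""
    rows = []
    fd_dir = f"/proc/{pid}/fd"
    for fd in sorted(os.listdir(fd_dir), key=int):
        try:
            name = os.readlink(f"{fd_dir}/{fd}")
            info = os.stat(f"{fd_dir}/{fd}")
            with open(f"/proc/{pid}/fdinfo/{fd}") as f:
                # The "flags:" line holds the open flags in octal.
                flags = next(
                    int(line.split()[1], 8) for line in f if line.startswith("flags:")
                )
        except OSError:
            continue  # descriptor closed while we were looking at it
        mode = ACCESS_MODES.get(flags & os.O_ACCMODE, " ")
        # Device files report their own device number, other files the
        # device number of the filesystem that holds them.
        is_device = stat.S_ISCHR(info.st_mode) or stat.S_ISBLK(info.st_mode)
        dev = info.st_rdev if is_device else info.st_dev
        device = f"{os.major(dev)},{os.minor(dev)}"
        rows.append((fd + mode, inode_type(info.st_mode), device, info.st_ino, name))
    return rows


# Open a scratch file for writing so that at least one "w" row shows up.
with tempfile.TemporaryDirectory() as directory:
    with open(os.path.join(directory, "scratch.txt"), "w") as scratch:
        for row in open_files():
            print(*row, sep="\t")
```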