Linux processes overview

What are processes and how to manage them


What is a process

✔️ Programs and processes are core concepts for Linux and for computers at large

✔️ A program is a blob of binary data consisting of a series of instructions for the CPU as well as other resources (images, audio files, etc.)

✔️ Creating a process means creating a running instance of a program, which involves the following steps :

  • Copy the program instructions from disk into RAM
  • Allocate additional RAM for variable storage
  • Set up in-memory flags that let the system monitor and manage the process execution
  • Have the CPU execute the program's instructions

✔️ Once the program execution is completed, the system destroys the process and frees the allocated RAM.

✔️ As a result :

  • The system can support several processes running concurrently, whether they are instances of the same program or not
  • Processes can run on behalf of users or on behalf of the system itself (daemons)
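
As a rough illustration of several instances of the same program running side by side, the sketch below starts the Python interpreter twice and reads back the PID each instance reports. The helper name spawn_two_instances is made up for this example, and it assumes a Python 3 interpreter is available through sys.executable.

```python
import subprocess
import sys


def spawn_two_instances():
    """Start two instances of the same program and return (pid, exit status) for each."""
    code = "import os; print(os.getpid())"
    # Both children are started before either one is waited on.
    children = [
        subprocess.Popen([sys.executable, "-c", code],
                         stdout=subprocess.PIPE, text=True)
        for _ in range(2)
    ]
    results = []
    for child in children:
        output, _ = child.communicate()  # wait for the process to finish
        results.append((int(output.strip()), child.returncode))
    return results


if __name__ == "__main__":
    for pid, status in spawn_two_instances():
        print(f"instance with PID {pid} exited with status {status}")
```

Each instance gets its own PID and its own memory, even though both were created from the same program file.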

Processes overview

✔️ On Linux, it is common for multiple users to run multiple commands at the same time on the same system.

✔️ Multiple types of processes exist :

  • Interactive processes :

    • Initiated by a user
    • Initialized and controlled through a terminal session
    • Can run in the foreground (the terminal only accepts input intended for this process)
    • Can run in the background (other commands can be run from the terminal that started the process)
    • Switching processes between foreground and background is done using job control
  • Automatic processes :

    • Initiated by a user
    • Not attached to a terminal but pushed into a spooler (FIFO queue)
    • Pulled from the spooler and executed either :
      • at scheduled time (at)
      • when system resources become available (batch)
  • Daemons :

    • Initiated by the system at startup
    • Remain idle in the background until their service is needed
  • List of process attributes :

    PID      # Process ID
    PPID     # Parent process ID
    RUID     # Real user ID (the user who initiated the process)
    EUID     # Effective user ID (the user whose permissions the process inherits, usually the same as RUID)
    RGID     # Real group owner ID (RUID's primary group)
    EGID     # Effective group owner ID (EUID's primary group)
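
A minimal sketch of how a running program can read these attributes about itself, using only functions from Python's standard os module. The helper name process_attributes is an assumption made for this example.

```python
import os


def process_attributes():
    """Return the identity attributes of the current process."""
    return {
        "PID": os.getpid(),    # process ID
        "PPID": os.getppid(),  # parent process ID
        "RUID": os.getuid(),   # real user ID
        "EUID": os.geteuid(),  # effective user ID
        "RGID": os.getgid(),   # real group ID
        "EGID": os.getegid(),  # effective group ID
    }


if __name__ == "__main__":
    for name, value in process_attributes().items():
        print(f"{name:<5}{value}")
```

For an ordinary program started from a shell, RUID and EUID are normally equal; they differ for set-user-ID executables.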

✔️ Process lifecycle :

  1. Creation :

    • fork : the parent process creates a copy of itself in a separate address space
      • I/O devices, environment and priority remain the same
      • the child is assigned a new PID
    • exec : the memory space of the child process is overwritten with the data of the program to execute
      • multiple consecutive exec calls can happen without a fork
      • example : init fork, getty exec, login exec, bash exec, bash fork
    • fork and exec are system calls, so a program has to invoke them explicitly in its own code
    • a daemonized process keeps running after its parent terminates : it is then re-parented, and its PPID changes from its parent process's PID to init's PID (1)
  2. Exit :

    • exit : the process exits normally after all its instructions have been executed and returns an exit status (success/failure)
    • killed : the process receives a SIGKILL signal and is immediately terminated by the kernel
    • the process can also stop unexpectedly for other reasons (power outage, etc.)
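
The creation and exit steps above can be sketched with Python's thin wrappers around the fork, exec and wait system calls. This is only an illustration: the helper name run_child and the exit code 127 used when exec fails are choices made for this example.

```python
import os
import sys


def run_child(argv):
    """Fork, exec `argv` in the child, wait for it; return (child PID, exit status)."""
    pid = os.fork()
    if pid == 0:
        # Child: replace the copied memory image with the requested program.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; never fall back into the parent's code.
            os._exit(127)
    # Parent: block until the child terminates and collect its status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return pid, os.WEXITSTATUS(status)
    return pid, -os.WTERMSIG(status)  # negative value: terminated by a signal


if __name__ == "__main__":
    child_pid, exit_status = run_child([sys.executable, "-c", "raise SystemExit(3)"])
    print(f"parent {os.getpid()} ran child {child_pid}, exit status {exit_status}")
```

The parent and the child see different return values from fork (the child's PID and 0 respectively), which is how the same code decides which branch to take.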

✔️ Common POSIX signals and their numeric value :

SIGTERM (15)    # terminate the process in an orderly way (the process can catch it and clean up)
SIGINT  (2)     # interrupt the process (the process can catch or ignore it)
SIGKILL (9)     # kill the process immediately (cannot be caught, blocked or ignored)
SIGHUP  (1)     # hang up; by convention, many daemons reread their configuration file on receipt
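
As a hedged sketch of how these signals look from a program, the snippet below starts a sleeping child process, sends it a signal and reports how it ended. The helper name send_and_wait is invented for this example; the numeric values shown are the ones used on Linux, and the child is assumed to start with the default signal dispositions.

```python
import signal
import subprocess
import sys


def send_and_wait(sig):
    """Start a sleeping child process, send it `sig`, return its return code."""
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    child.send_signal(sig)
    # subprocess reports "terminated by signal N" as the return code -N.
    return child.wait(timeout=30)


if __name__ == "__main__":
    for sig in (signal.SIGTERM, signal.SIGKILL):
        print(f"{sig.name} ({int(sig)}) -> return code {send_and_wait(sig)}")
```

The child here installs no handler, so SIGTERM ends it just as SIGKILL does; a program that handles SIGTERM could instead clean up before exiting, which is not possible with SIGKILL.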

Process monitoring :

✔️ ps is the base tool used for visualizing processes.

  • With no options, it selects all processes for which both of the following conditions are met :
    • The process effective user ID (euid=EUID) is the same as the current user ID
    • The process is associated with the same terminal from which ps was invoked
  • With no options, it displays the following information :
    • process ID (pid=PID)
    • terminal associated with the process (tname=TTY)
    • cumulative CPU time in [DD-]hh:mm:ss format (time=TIME)
    • executable file name (ucmd=CMD)
  • ps can be passed options as :
    • UNIX options (preceded by a dash, may be grouped)
    • BSD options (no dash, may be grouped)
    • GNU long options (preceded by two dashes, not grouped)
  • top can also be used to display a dynamic real-time view of a running system, however it is less flexible than ps
  • Options for those 2 commands should be investigated thoroughly so as to manage processes in the most effective way possible.
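
On Linux, ps gets its data from the /proc pseudo-filesystem. The sketch below imitates only the first default selection rule (same effective user ID) by reading /proc/<pid>/status directly; it ignores the terminal rule, and the helper name processes_of_current_user is made up for this example. It is roughly comparable to running ps -u "$(id -u)" -o pid,ucmd.

```python
import os


def processes_of_current_user():
    """Return sorted (pid, command name) pairs for processes whose EUID matches ours."""
    my_euid = os.geteuid()
    found = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # only numeric directories describe processes
        try:
            with open(f"/proc/{entry}/status") as status:
                fields = dict(
                    line.rstrip("\n").split(":\t", 1)
                    for line in status
                    if ":\t" in line
                )
        except OSError:
            continue  # the process exited while we were reading
        uids = fields.get("Uid", "").split()  # real, effective, saved, filesystem
        if len(uids) >= 2 and int(uids[1]) == my_euid:
            found.append((int(entry), fields.get("Name", "?")))
    return sorted(found)


if __name__ == "__main__":
    print(f"{'PID':>7} CMD")
    for pid, name in processes_of_current_user():
        print(f"{pid:>7} {name}")
```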

File access monitoring :

✔️ lsof is an advanced and comprehensive tool used for monitoring access to files by processes on a Linux system.

  • by default, lsof lists all open files that are accessed by all active processes.
  • by default, the device cache file is disabled.
  • some of the most commonly opened files are :
    • directories
    • regular files
    • block special files
    • character special files
    • network files (internet sockets, NFS files, UNIX domain sockets)

✔️ Passing options to lsof :

  • when several selection options are passed, the union of the file lists matching each option is returned.

  • pass the -a option to return the intersection of the matching file lists instead.

  • arguments can be negated using ^ when passed to options.

  • the -c option matches commands that begin with its argument; an argument enclosed in slashes is interpreted as a regular expression.

  • the most commonly used options are :

    option   displays
    -u       files opened by processes belonging to a specific user
    -p       files opened by processes with a specific PID
    -c       files opened by processes running a specific command
    -i       network files
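
The difference between the default union and the -a intersection can be pictured with a small model. Everything in the sketch below is hypothetical: the records, the user and command names, and the select helper are invented for illustration and do not call lsof.

```python
# Hypothetical open-file records, invented purely for illustration.
OPEN_FILES = [
    {"command": "bash", "user": "alice", "file": "/home/alice"},
    {"command": "vim", "user": "alice", "file": "/home/alice/notes.txt"},
    {"command": "bash", "user": "bob", "file": "/home/bob"},
]


def select(records, user=None, command=None, and_mode=False):
    """Model of lsof selection: OR the criteria by default, AND them when and_mode is set."""
    tests = []
    if user is not None:
        tests.append(lambda record: record["user"] == user)
    if command is not None:
        # Like -c, match commands that begin with the given string.
        tests.append(lambda record: record["command"].startswith(command))
    if not tests:
        return [record["file"] for record in records]  # no criteria: list everything
    combine = all if and_mode else any
    return [r["file"] for r in records if combine(test(r) for test in tests)]


if __name__ == "__main__":
    # Comparable in spirit to: lsof -u alice -c bash
    print(select(OPEN_FILES, user="alice", command="bash"))
    # Comparable in spirit to: lsof -a -u alice -c bash
    print(select(OPEN_FILES, user="alice", command="bash", and_mode=True))
```

With these made-up records, the first call returns all three files, while the second returns only the file held open by alice's bash.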

✔️ Details on output :

  • FD (file descriptor) most frequent values are (see the man page for the full list) :

    value          description
    (any number)   file descriptor number
    cwd            current working directory
    rtd            root directory
    pd             parent directory
    txt            program text (code and data)
    mem            memory-mapped file
  • the file access mode flag is appended to FD in the output :

    • r for read access.
    • w for write access.
    • u for read and write access.
    • space if access mode is unknown.
  • TYPE (type of node associated with the file) most frequent values are (see the man page for the full list) :

    value   description
    DIR     directory
    REG     regular file
    CHR     character special file
    BLK     block special file
    IPv4    IPv4 socket
    unix    UNIX domain socket
    FIFO    FIFO special file
    P...    /proc files (process information pseudo-filesystem)
  • DEVICE contains the device numbers for block special files, character special files, regular files, directories and NFS files.

  • NODE contains the file inode number (for local and NFS files), the protocol (for network sockets) or STR for a stream.
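
To make the FD and TYPE columns more concrete, here is a minimal sketch that lists the numeric file descriptors of a process from /proc/<pid>/fd and /proc/<pid>/fdinfo on Linux. It is not how lsof is implemented in detail: the helper names open_files and node_type are invented, only numbered descriptors are covered (no cwd, txt or mem entries), and sockets are labelled with a generic "sock" instead of IPv4 or unix.

```python
import os
import stat

# Access mode letters appended to the descriptor number, as in lsof's FD column.
ACCESS_LETTERS = {os.O_RDONLY: "r", os.O_WRONLY: "w", os.O_RDWR: "u"}

TYPE_CHECKS = [
    (stat.S_ISDIR, "DIR"), (stat.S_ISREG, "REG"), (stat.S_ISCHR, "CHR"),
    (stat.S_ISBLK, "BLK"), (stat.S_ISFIFO, "FIFO"), (stat.S_ISSOCK, "sock"),
]


def node_type(mode):
    """Map a stat mode to a TYPE-like label."""
    for predicate, label in TYPE_CHECKS:
        if predicate(mode):
            return label
    return "unknown"


def open_files(pid="self"):
    """Return (FD with access letter, TYPE, NAME) rows for the numeric descriptors of `pid`."""
    fd_dir = f"/proc/{pid}/fd"
    rows = []
    for fd in sorted(os.listdir(fd_dir), key=int):
        try:
            target = os.readlink(f"{fd_dir}/{fd}")
            mode = os.stat(f"{fd_dir}/{fd}").st_mode
            with open(f"/proc/{pid}/fdinfo/{fd}") as info:
                # The "flags:" line holds the open flags in octal.
                flags = next(int(line.split()[1], 8)
                             for line in info if line.startswith("flags:"))
        except (OSError, StopIteration):
            continue  # descriptor was closed in the meantime
        letter = ACCESS_LETTERS.get(flags & os.O_ACCMODE, " ")
        rows.append((fd + letter, node_type(mode), target))
    return rows


if __name__ == "__main__":
    for fd, kind, name in open_files():
        print(f"{fd:<6}{kind:<8}{name}")
```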

Notes :

  • the first process started by the system is init, with PID 1 (on systemd-based distributions, systemd takes that role)
  • login shells (where the user had to provide their ID and password at startup) are preceded by - in ps -f output
  • Ctrl+C is the equivalent of sending SIGINT to the process running in the foreground.
  • xkill closes the connection between an X client (GUI window) and the system's X server (caution: applications that lose their connection to the X server won't automatically abort)