Shell beginner's guide
What is a shell and how to use it

- Overview
- Shell commands
- Words and shell expansion
- Standard streams, piping and redirections
- Common shell utilities
- Running commands in the background
-
Shell core principles
- THE "STEERING WHEEL" OF THE SYSTEM : it exposes all the operating system's features to the user.
- THE AIM IS TO BE LAZY : automate tasks and have the computer perform them on your behalf.
- NO ABSTRACTIONS OF ANY KIND : for instance, the shell does not know what an "object" is.
- NO DATA TYPES : shell variables are not typed at all (no "string", "number" etc ...).
- REGEX IS SUPPORTED : the POSIX regex implementation unleashes the full power of the shell.
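The "no data types" principle can be illustrated with a minimal sketch. The variable names below are hypothetical; the point is only that the same value is treated as text or as a number depending on the context it is used in.

```shell
# The same variable is plain text until a context asks for arithmetic
count=5

# String context: the value is concatenated with itself
repeated="${count}${count}"

# Arithmetic context: the value is read as an integer
incremented=$((count + 1))

# Mixing "text" and "number" needs no conversion
label="item"
combined="${label}${count}"

echo "$repeated $incremented $combined"
```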
-
Directories relevant to the shell
directory   contents
/etc        Config files for the system
/var/log    Log files for various system programs
/bin        Commonly used programs
/usr/bin    Another location for programs on the system
-
Typing a command in the shell :
- Linux is case sensitive.
- Prefixing a character with a backslash (`\(`) escapes it: the shell no longer interprets it as a special character.
- Long-hand command line options begin with two dashes (`--option`) and short-hand options begin with a single dash (`-o`).
- With a single dash, several options can be combined by placing all their letters together after the dash (`-lAh`).
- Options that require an argument generally have to be given separately, each followed by its argument.
- "The Linux command line does not have an undo feature. Perform destructive actions carefully."
- You can stop a running shell command with `Ctrl+c`.
- "Whenever you get in trouble you can generally press Ctrl+c to get yourself out of trouble."
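The rules above can be tried out safely in a scratch directory. This is a minimal sketch: the directory and file names are hypothetical, and `mktemp`, `ls`, `head` and `touch` are assumed to be available as on most Linux systems.

```shell
# Scratch directory so the example does not touch real files (hypothetical names)
demo_dir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$demo_dir/notes.txt"
touch "$demo_dir/.hidden"

# Three short options combined after a single dash:
# long format (-l), almost all entries (-A), human-readable sizes (-h)
ls -lAh "$demo_dir"

# An option that takes an argument is given separately, followed by its argument
first_lines=$(head -n 2 "$demo_dir/notes.txt")

# A backslash escapes the next character: here the space no longer separates two words
touch "$demo_dir"/two\ words.txt
if [ -f "$demo_dir/two words.txt" ]; then escaped_file="created"; else escaped_file="missing"; fi

# Clean up the scratch directory
rm -rf "$demo_dir"
```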
-
Shell wildcards
- Wildcards are a set of building blocks used in patterns that the shell will translate to a set of files or directories.
- Referring to a file or directory on the command line means referring to a path, and that path can use wildcards so it is turned into a set of files or directories.
- The wildcards are translated by the shell, not by the command itself (`ls` in the example below).
- The shell replaces each wildcard pattern with every matching path and passes the resulting set of paths to the command.
# list all files and directories starting with a b
ls -la b*
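To see that the shell, not the command, does the translation, `echo` can be used in place of `ls`: it simply prints whatever arguments it receives. A minimal sketch, with hypothetical file names created in a scratch directory; the `?` and `[ ]` patterns are standard shell wildcards shown here for comparison with `*`.

```shell
# Scratch directory with hypothetical file names
demo_dir=$(mktemp -d)
touch "$demo_dir/bar.txt" "$demo_dir/baz.log" "$demo_dir/foo.txt"

# echo knows nothing about wildcards: it prints the paths the shell substituted for b*
expanded=$(cd "$demo_dir" && echo b*)

# ? matches exactly one character, [ ] matches one character from a set
one_char=$(cd "$demo_dir" && echo ba?.txt)
from_set=$(cd "$demo_dir" && echo [bf]*.txt)

# By default, a pattern that matches nothing is passed to the command unchanged
unmatched=$(cd "$demo_dir" && echo z*)

# Quoting the pattern stops the shell from translating it
quoted=$(cd "$demo_dir" && echo "b*")

rm -rf "$demo_dir"
```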
-
Executing a command in the shell
- When a command is executed, the shell looks for the invoked program (or executable) through a preset series of directories.
- That series of directories is stored in the environment variable `$PATH`; no other directories are considered during the search (a command name containing a slash is run as a path, without any search).
- `$PATH` directories are searched sequentially, and the shell executes the first executable it finds that matches the command.
- `$PATH` is an environment variable and as such can be modified by individual users to fit their needs (usually through `.bashrc`).
- For instance, users can manage different installations of the same program, create program wrappers, access custom scripts from anywhere, etc.
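The lookup described above can be observed with a throwaway script. This is a minimal sketch: the script name `greet` and the scratch directory are hypothetical, and `PATH` is only changed for the current shell session.

```shell
# Scratch directory holding a hypothetical script named "greet"
demo_bin=$(mktemp -d)
printf '#!/bin/sh\necho "hello from greet"\n' > "$demo_bin/greet"
chmod +x "$demo_bin/greet"

# Prepend the directory to PATH so it is searched first
old_path=$PATH
PATH="$demo_bin:$PATH"

# command -v prints the path the shell would execute for a given command name
resolved=$(command -v greet)
greeting=$(greet)

# Print each PATH directory on its own line, in search order
printf '%s\n' "$PATH" | tr ':' '\n'

# Restore the original PATH and clean up
PATH=$old_path
rm -rf "$demo_bin"
```

Putting the line that modifies `PATH` in `.bashrc` would make the change apply to every new interactive bash session.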
-
Words and shell expansion
- When the shell receives a command (either from the command line or from a script) it breaks it up into words.
- A word is a non-zero-length sequence of characters delimited by unquoted white space.
- After this happens, the shell performs seven operations on the words.
- These seven operations can change how the words are interpreted and are collectively known as shell expansion.
- Enclosing one or several words in double quotes results in :
  - The delimited sequence of words is considered a single word.
  - The only expansions still performed by the shell are variable substitution, command substitution and arithmetic expansion (word splitting and filename expansion are skipped).
# word splitting happens, command creates two directories
mkdir some directory
# word splitting prevented, command creates one directory
mkdir "some directory"
Note : a double-quoted variable keeps its line feeds when displayed; unquoted, word splitting turns them into spaces.
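The effect of quoting on word splitting can be checked without creating any directory. A minimal sketch: `count_words` is a hypothetical helper that reports how many words it received, and the single-quote case is added only for contrast.

```shell
# Hypothetical helper: prints the number of words (arguments) it was given
count_words() { echo "$#"; }

unquoted_count=$(count_words some directory)
quoted_count=$(count_words "some directory")

# Variable substitution still happens inside double quotes, but not inside single quotes
name="world"
double_quoted=$(echo "hello $name")
single_quoted=$(echo 'hello $name')

# Line feeds stored in a variable survive only when the variable is quoted
lines=$(printf 'first\nsecond')
kept=$(echo "$lines" | wc -l)
collapsed=$(echo $lines | wc -l)

echo "unquoted: $unquoted_count word(s), quoted: $quoted_count word(s)"
```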
-
Standard streams, piping and redirections
- Every process started from the command line has three data streams automatically attached to it :
  - standard input (data fed into the program).
  - standard output (data printed by the program, defaults to the terminal).
  - standard error (for error messages, also defaults to the terminal).
- Seen from inside the process : it reads data from its standard input and writes data to its standard output and standard error.
- Using those different streams to read and write data is made possible by devices that are mapped to file descriptors :
  number   stream name   device         file descriptor
  0        STDIN         /dev/stdin     /proc/<processID>/fd/0
  1        STDOUT        /dev/stdout    /proc/<processID>/fd/1
  2        STDERR        /dev/stderr    /proc/<processID>/fd/2
- For convenience, "self" can be used instead of the process ID : `/proc/self/fd/1`.
- This mechanism allows communication between processes and files through the use of piping and redirection operators.
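The mapping between streams and devices can be observed directly. This is a minimal sketch, assuming a Linux system where `/dev/stdout` and `/dev/stderr` exist; `report` is a hypothetical function, and the `2>` and `2>&1` operators are standard shell redirections used here only to separate the two streams.

```shell
# Hypothetical function writing one line to each output stream through its device file
report() {
    echo "regular output" > /dev/stdout
    echo "error output" > /dev/stderr
}

# Keep standard output only (standard error is discarded)
stdout_only=$(report 2> /dev/null)

# Keep standard error only (standard output is discarded)
stderr_only=$(report 2>&1 > /dev/null)

# On Linux, list the file descriptors of the current process
if [ -d /proc/self/fd ]; then
    ls -l /proc/self/fd
fi
```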
- Any process on the left side of a piping operator `|` or of an output redirection operator `>` has to provide an output.
- That output will then be written to whatever is on the right side of the operator :
  - A file in the case of the redirection operator `>` (for `<`, the direction is reversed : the file on the right is fed to the standard input of the process on the left).
  - The standard input of another process in the case of the piping operator `|`.
  - The `>>` operator writes to the file in append mode instead of replace mode (`<<` does not append : it introduces a here-document that is fed to standard input).
- Prefixing a redirection operator with a number redirects the corresponding stream : `1>`.
- Prefixing a stream number with an ampersand allows redirecting to that stream : `&1`.
- Examples :
# pipe the first command into the second : the second reads the first one's output from its stdin
# ($$ expands to the PID of the current shell on both sides of the pipe, not to the PID of each command)
echo -e "first command pid : $$\n" | echo -e "second command pid : $$\nstdin is [$(cat /dev/stdin)]"
# redirect file contents to a process, then pipe that process' output to another process
wc -c < "$some_file" | echo "total number of bytes in $some_file is $(cat /dev/stdin)"
# append stdout to a file, then redirect stderr to stdout (2>&1 has to be placed at the end)
cat "$some_file" "$other_file" 1>> "$both_files_contents" 2>&1
Note : command arguments should be favored over reading stdin from inside the process or script whenever possible.
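The operators described in this section can be exercised end to end in a scratch directory. A minimal sketch: all file names are hypothetical, and the `|| true` guards are only there so that the deliberately failing `ls` calls do not stop a script running in strict mode.

```shell
# Scratch directory with hypothetical file names
work_dir=$(mktemp -d)
out_file="$work_dir/out.txt"
err_file="$work_dir/err.txt"
both_file="$work_dir/both.txt"

# > replaces the file contents, >> appends to them
echo "first line" > "$out_file"
echo "second line" >> "$out_file"

# < feeds the file to the standard input of the process on the left
line_count=$(wc -l < "$out_file")

# | connects the standard output of one process to the standard input of the next
upper=$(echo "hello" | tr 'a-z' 'A-Z')

# 2> redirects standard error only (listing a missing path fails on purpose)
ls "$work_dir/missing" 2> "$err_file" || true
err_captured=$(cat "$err_file")

# Send stdout to a file, then point stderr at the same place (2>&1 comes last)
ls "$out_file" "$work_dir/missing" > "$both_file" 2>&1 || true
both_count=$(wc -l < "$both_file")

rm -rf "$work_dir"
```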
-
Common shell utilities
- Type `man <utility>` to display help and available options.
- `head` prints the first x lines of a file :
# extract first 10 lines from 2 files, sort results, number lines and add separator
head -qn 10 "$some_file" "$other_file" | sort -fdi | nl -s ' --> ' -w 10
# extract first 10 lines from 2 files, display column 2 only on lines containing separator
head -qn 10 "$some_file" "$other_file" | cut -s -d '.' -f 2
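The two pipelines above depend on files that are not shown, so here is a minimal self-contained sketch of the same utilities. The sample data and variable names are hypothetical.

```shell
# Hypothetical sample data: three lines with a "." separator, one without
demo_file=$(mktemp)
printf 'cherry.red\napple.green\nbanana.yellow\nno separator here\n' > "$demo_file"

# First two lines of the file
first_two=$(head -n 2 "$demo_file")

# First three lines, sorted alphabetically
sorted=$(head -n 3 "$demo_file" | sort)

# Second "." separated field; -s skips lines that do not contain the separator
colours=$(cut -s -d '.' -f 2 "$demo_file")

# Number the lines: -w sets the number width, -s the string placed after the number
numbered=$(head -n 2 "$demo_file" | nl -s ' --> ' -w 3)

rm -f "$demo_file"
echo "$numbered"
```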
- `sed` is a powerful stream editor for filtering and transforming text :
# run a regexp match and replace on a file ($some_file), then print a newline
# sed -E (-r is the GNU synonym) enables extended regexps; \s is a GNU extension, see https://remram44.github.io/regex-cheatsheet/regex.html
sed -E 's/\s([a-z]+)\s([0-9]{1})$/ eats : \2 \1 /g' "$some_file" && echo
# same result by piping cat's stdout into sed's stdin (the trailing dash names stdin explicitly and is optional)
cat "$some_file" | sed -E 's/\s([a-z]+)\s([0-9]{1})$/ eats : \2 \1 /g' - && echo
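Since the contents of `$some_file` are not shown, the following minimal sketch applies a similar substitution to hypothetical sample lines. It uses the POSIX class `[[:space:]]` instead of `\s` so that it does not depend on GNU extensions.

```shell
# Hypothetical sample data: "<animal> <food> <digit>"
demo_file=$(mktemp)
printf 'cat fish 3\ndog bone 7\n' > "$demo_file"

# Capture the last word and the final digit, then swap them in the replacement
swapped=$(sed -E 's/[[:space:]]([a-z]+)[[:space:]]([0-9])$/ eats : \2 \1/' "$demo_file")

# Same substitution on data arriving through a pipe (sed reads stdin when no file is given)
piped=$(printf 'cat fish 3\n' | sed -E 's/[[:space:]]([a-z]+)[[:space:]]([0-9])$/ eats : \2 \1/')

rm -f "$demo_file"
echo "$swapped"
```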
- `grep` searches a given set of data and prints every line matching a given pattern :
# grep -E activates POSIX extended regexp for search patterns
grep -E -n '\s[oap]{1}[a-z]+\s[0-9]{1,2}$' "$some_file"
# number lines (-n), print byte offsets (-b) and the file name (-H), ignore binaries (-I)
# print output context for each match (-C1 : one leading and one trailing line)
grep -E --color=always -nbHIC1 '\s[oap]{1}[a-z]+\s[0-9]{1,2}$' "$some_file"
# same on 2 files, opening the regexp to get matches from both
grep -E --color=always -nbHIC1 '\s[oapw]{1}[a-z]+\s[0-9]{1,2}$' "$some_file" "$other_file"
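To make the pattern above concrete, here is a minimal sketch run against hypothetical sample data. It uses `[[:space:]]` in place of `\s` for portability; the pattern otherwise has the same shape (a word starting with o, a or p, followed by a one or two digit number at the end of the line).

```shell
# Hypothetical sample data: "<name> <fruit> <quantity>"
demo_file=$(mktemp)
printf 'alice apple 12\nbob banana 5\ncarol orange 101\ndave pear 7\n' > "$demo_file"

# -n prefixes each matching line with its line number
matches=$(grep -E -n '[[:space:]][oap][a-z]+[[:space:]][0-9]{1,2}$' "$demo_file")

# -c counts matching lines instead of printing them
match_count=$(grep -E -c '[[:space:]][oap][a-z]+[[:space:]][0-9]{1,2}$' "$demo_file")

rm -f "$demo_file"
echo "$matches"
```

"bob banana 5" is rejected because the word starts with b, and "carol orange 101" because the number has three digits.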
- `xargs` builds and executes command lines from standard input :
# xargs reads items from the standard input, delimited by blanks (which can be protected with double or single quotes or a backslash) or newlines
# it then executes the command (default is /bin/echo) one or more times with any initial arguments followed by the items read from standard input
# blank lines on the standard input are ignored. xargs can be a nice alternative to for loops in bash scripts if seeking performance
# -t prints each command before running it, -I '%1' runs the command once per input line with %1 replaced by that line
# the output file is kept outside "$some_directory" so that it is not read back as input
ls "$some_directory" | xargs -t -I '%1' cat "$some_directory/%1" >> "$all_file_contents"
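The two ways xargs groups its input can be compared on hypothetical data. A minimal sketch, assuming file names without spaces or newlines (parsing `ls` output is fragile otherwise).

```shell
# Scratch directory with two hypothetical files
work_dir=$(mktemp -d)
printf 'alpha\n' > "$work_dir/a.txt"
printf 'beta\n' > "$work_dir/b.txt"

# -n 2 : run the command with at most two items per invocation
grouped=$(printf 'one two three\n' | xargs -n 2 echo)

# -I '{}' : run the command once per input line, replacing {} with that line
combined=$(ls "$work_dir" | xargs -I '{}' cat "$work_dir/{}")

rm -rf "$work_dir"
echo "$grouped"
```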
-
Running commands in the background
- Adding an ampersand `&` at the end of a command makes the shell run the command in the background and create a job.
- Jobs can be moved between foreground and background : `Ctrl + z` suspends the running foreground process and turns it into a stopped background job (`bg` resumes it in the background).
- Conversely, `fg` can be used to bring background processes to the foreground.
# start background command
sleep 60 && echo \ && echo "dear $USER, job has completed" &
# output: job id and process id
[1] 2864
# view jobs
jobs
# output: running jobs
[1]+ Running sleep 60 && echo \ && echo "dear $USER, job has completed" &
# bring job 1 to foreground
fg 1
# process is foreground again
sleep 60 && echo \ && echo "dear $USER, job has completed"
# command returns ...
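In a script, where `fg` and `Ctrl + z` are not available, a background command is usually followed up with `wait`. A minimal sketch: the result file is hypothetical, and `$!` and `wait` are standard shell features that the examples above do not use.

```shell
# Hypothetical file used by the background job to report its result
result_file=$(mktemp)

# Start a command in the background; the shell does not wait for it
( sleep 1 && echo "job has completed" > "$result_file" ) &

# $! holds the process ID of the most recent background command
job_pid=$!
echo "started background job with PID $job_pid"

# wait blocks until the given process ends and returns its exit status
wait "$job_pid"
job_status=$?

job_output=$(cat "$result_file")
rm -f "$result_file"
echo "$job_output (exit status $job_status)"
```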