Shell beginners guide
What is a shell and how to use it
✔️ Shell core principles
- THE "STEERING WHEEL" OF THE SYSTEM : it exposes all the operating system's features to the user.
- THE AIM IS TO BE LAZY : automate tasks and have the computer perform them on your behalf.
- NO ABSTRACTIONS OF ANY KIND : for instance, the shell does not know what an "object" is.
- NO DATA TYPES : shell variables are not typed at all (no "string", "number" etc ...).
- REGEX IS SUPPORTED : POSIX regular expressions, implemented by core utilities such as grep and sed, unleash the full power of the shell.
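A minimal sketch of the "no data types" principle (the variable name is arbitrary) :
# the same variable can hold what looks like a number, then a string : the shell stores both as plain text
$ x=42
$ x="forty two"
$ echo "$x"
forty two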
✔️ Directories relevant to the shell
- `/etc` stores config files for the system.
- `/var/log` stores log files for various system programs (permissions may be restricted).
- `/bin` stores several commonly used programs (some of which we will learn about in the rest of this tutorial).
- `/usr/bin` another location for programs on the system.
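A quick way to peek at these locations (output will differ from one system to another) :
# list a few config files, then count the programs available in /usr/bin
$ ls /etc | head -n 5
$ ls /usr/bin | wc -l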
✔️ Typing a command in the shell
- Linux is case sensitive.
- Prefixing a character with a backslash `\` escapes it, which means that it nullifies its interpretation by the shell as a special character.
- Long hand command line options begin with two dashes `--option` and short hand options begin with a single dash `-o`.
- When using a single dash, several options can be invoked by placing all the letters together after the dash `-lAh`.
- Options requiring arguments generally have to be placed separately along with their corresponding argument.
- The Linux command line does not have an undo feature. Perform destructive actions carefully.
- You can stop/exit from a running shell command with `<Ctrl> + c`.
- "Whenever you get in trouble you can generally press `<Ctrl> + c` to get yourself out of trouble."
✔️ Shell wildcards
- Wildcards are a set of building blocks that are used in patterns that the shell will translate to a set of files or directories.
- Referring to a file or directory on the command line means referring to a path, and that path can use wildcards so it is turned into a set of files or directories.
# list all files and directories starting with a b
ls -la b*
- The wildcards are translated by the shell, not by the command itself (`ls` in that case).
- The shell replaces the wildcard pattern with every matching path and passes the resulting set of paths to the command.
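Since echo simply prints the arguments it receives, it is a convenient way to watch the expansion happen (assuming files starting with b exist, as in the previous example) :
# echo receives the already expanded list of matching names, not the pattern itself
$ echo b*
# if nothing matches, bash passes the pattern through unchanged by default
$ echo no_such_prefix*
no_such_prefix*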
✔️ Executing a command in the shell
- When a command is executed, the shell looks for the invoked program (or executable) through a preset series of directories
- That series of directories is stored in the environment variable `$PATH`, and no other directories will be considered during the search.
- `$PATH` directories are searched sequentially, and the shell will execute the first executable it finds that matches the command.
- `$PATH` is an environment variable and as such can be modified by individual users to fit their needs (usually through `.bashrc`).
- For example, users can manage different installations of the same program, create program wrappers, access custom scripts from anywhere, etc ...
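A minimal sketch of how `$PATH` can be inspected and extended (the `$HOME/bin` directory is just an example) :
# show the directories searched, in order
$ echo $PATH
# show which executable would actually be run for a given command name
$ which ls
# prepend a personal scripts directory to the search (add this line to ~/.bashrc to make it permanent)
$ export PATH="$HOME/bin:$PATH"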
✔️ Words and shell expansion
- When the shell receives a command (either from the command line or from a script) it breaks it up into words.
- A word is a non-zero-length sequence of characters delimited by white spaces.
- After this happens, the shell performs seven operations on the words.
- These seven operations can change how the words are interpreted and are collectively known as shell expansion.
- Enclosing one or several words in double quotes results in :
  - The delimited sequence of words being considered a single word
  - Variable substitution being the only operation still performed by the shell
Note : inside double quotes, variable substitution displays line feeds correctly (the line feeds contained in the variable's value are preserved).
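A small demonstration of both points, using an arbitrary variable :
$ greeting="hello
world"
# unquoted : the line feed is treated as a word delimiter, so echo receives two separate words
$ echo $greeting
hello world
# double quoted : echo receives a single word and the line feed is displayed correctly
$ echo "$greeting"
hello
world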
✔️ Standard streams, piping and redirections
- Every process started from the command line has three data streams automatically attached to it :
- standard input (data fed into the program)
- standard output (data printed by the program, defaults to the terminal)
- standard error (for error messages, also defaults to the terminal)
- From inside the process the perspective is reversed : the process reads data from its standard input and writes data to its standard output and standard error.
- Using those different streams to read and write data is made possible by devices that are mapped to file descriptors :

| file descriptor | stream name | device |
| --- | --- | --- |
| 0 | STDIN | `/dev/stdin`, `/proc/<processID>/fd/0` |
| 1 | STDOUT | `/dev/stdout`, `/proc/<processID>/fd/1` |
| 2 | STDERR | `/dev/stderr`, `/proc/<processID>/fd/2` |
- For convenience, `self` can be used instead of the process ID : `/proc/self/fd/1`
- This mechanism allows communication between processes and files through the use of piping and redirection operators.
- Any process that is on the left side of a piping or redirection operator has to provide an output.
- That output will then be written to whatever is on the right side of the operator :
  - A file in the case of the redirection operator `>` (with `<` the direction is reversed : the file on the right is read into the standard input of the process on the left)
  - A process input in the case of the piping operator `|` (the output will be written to its standard input)
  - The `>>` operator writes to the file in append mode instead of replace mode (`<<` is different : it introduces a here-document that feeds inline text to the process's standard input)
- Prefixing a redirection operator with a stream number redirects that specific stream's output : `1>` redirects standard output, `2>` standard error.
- Prefixing a stream number with an ampersand allows redirecting to another stream instead of a file : `2>&1` sends standard error to wherever standard output currently points.
- Example :
# pipe first process to second process, write first process pid to stdout and have second process read it
$ echo -e "first command pid : $$\n" > /dev/stdout | echo -e "second command pid : $$\nstdin is [$(cat /dev/stdin)]"
# redirect file contents to process, then pipe process output to another process
$ wc -c < somefile | echo "total number of chars is $(cat /dev/stdin)"
# redirect stderr to stdout (2>&1 has to be placed after the stdout redirection) and redirect stdout to a file in append mode
$ cat datafile doesntexist 1>> blabla 2>&1
Note : command arguments should be favored over reading stdin from inside the process or script whenever possible
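To tie the file descriptor table and the redirection operators together, a small sketch (when run interactively, both commands are essentially equivalent and end up on the terminal's error stream) :
# write a message to the current process's standard error through its device file
$ echo "this goes to stderr" > /proc/self/fd/2
# same result using a stream redirection : send stdout to wherever stderr points
$ echo "this goes to stderr" 1>&2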
✔️ Common shell utilities
- `head` prints the first x lines from a file
# extract first 10 lines of datafile and fishyeah, sort, number lines and add separator
$ head -qn 10 datafile fishyeah | sort -fdi | nl -s ' --> ' -w 10
# extract first 10 lines of datafile and fishyeah, display column 2 only on lines containing separator
$ head -qn 10 datafile fishyeah | cut -s -d '.' -f 2
- `sed` is a stream editor for filtering and transforming text
# execute a regexp match and replace on streamed file datafile, print newline
# sed -r is POSIX extended, see https://remram44.github.io/regex-cheatsheet/regex.html
$ sed -r 's/\s([a-z]+)\s([0-9]{1})$/ eats : \2 \1 /g' datafile && echo
# does the same by piping cat stdout to sed stdin (the trailing dash explicitly tells sed to read from its standard input)
$ cat datafile | sed -r 's/\s([a-z]+)\s([0-9]{1})$/ eats : \2 \1 /g' - && echo
- `grep` searches a given set of data and prints every line matching a given pattern
# grep -E activates POSIX extended regexp for search patterns
$ grep -E -n '\s[oap]{1}[a-z]+\s[0-9]{1,2}$' datafile
# use absolute path, number lines, print byte offset and file name, ignore binaries, print output context for each match (leading and trailing line)
$ grep -E --color=always -nbHIC1 '\s[oap]{1}[a-z]+\s[0-9]{1,2}$' ~/.test_the_shell/datafile
# same on 2 files, opening the regexp to get matches from both
$ grep -E --color=always -nbHIC1 '\s[oapw]{1}[a-z]+\s[0-9]{1,2}$' datafile fishyeah
- `xargs` builds and executes command lines from standard input
# xargs reads items from the standard input, delimited by blanks (which can be protected with double or single quotes or a backslash) or newlines
# it then executes the command (default is /bin/echo) one or more times with any initial-arguments followed by items read from standard input
# Blank lines on the standard input are ignored. xargs can be a nice alternative to for loops in bash scripts if seeking performance
$ ls ./dirtextfiles | xargs -tn1 -I '%1' cat ./dirtextfiles/%1 >> ./dirtextfiles/.hiddentest3.txt
✔️ Running commands in the background
- Adding an ampersand `&` at the end of a command will make the shell run the command in the background and create a job.
- Jobs can be moved between the foreground and background : `Ctrl + z` suspends the running foreground process and turns it into a stopped background job (it can then be resumed in the background with `bg`).
- Conversely, `fg` can be used to bring background jobs to the foreground.
# start background command
$ sleep 60 && echo \ && echo "dear $USER, job has completed" &
[1] 2864
# view jobs
$ jobs
[1]+ Running sleep 60 && echo \ && echo "dear $USER, job has completed" &
# bring job 1 to foreground
$ fg 1
sleep 60 && echo \ && echo "dear $USER, job has completed"
# command returns ...
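To complete the picture, a sketch of suspending a foreground command and resuming it in the background (job numbers and output will vary) :
# start a long running command in the foreground, then press Ctrl + z to suspend it
$ sleep 120
^Z
[1]+  Stopped                 sleep 120
# resume job 1 in the background ; fg 1 would bring it back to the foreground instead
$ bg 1
[1]+ sleep 120 &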