FFMPEG overview
Use FFMPEG to process multimedia

- Containers are file formats designed to pack muxed audio/video data streams together so that media players can read them.
| file format | container type |
| --- | --- |
| .mp3 | audio only |
| .adts | audio only |
| .opus | audio only |
| .mp4 | audio/video |
| .avi | audio/video |
| .webm | audio/video |
| .hls | streaming |
| .dash | streaming |
Note : constraints exist on which data streams can be muxed together in a given container format
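To see which streams are actually muxed in a given container, ffprobe (shipped with FFMPEG) can list them - a minimal sketch, assuming a hypothetical input.mp4 :
ffmpeg \
ffprobe \
-v error \ # only print errors
-show_entries stream=index,codec_type,codec_name \ # print each stream's index, type and codec
-of csv=p=0 \ # compact csv output
input.mp4 # input file path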
- Codecs are encoders/decoders that transform individual media frames to and from the data streams that are then muxed together in a container.
| codec name | codec type |
| --- | --- |
| copy | audio |
| aac | audio |
| libopus | audio |
| libfdk_aac | audio |
| copy | video |
| gif | video |
| libx264 | video |
| libvpx-vp9 | video |
Note : media frames are also characterized by their raw format (e.g. rgb, yuv420, pcm), on which filtering can be performed.
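For instance, a re-encode decodes frames to their raw format before handing them to the encoder, while copy skips decoding entirely - a minimal sketch with hypothetical file names :
ffmpeg \
-i input.mp4 \ # input
-c:v libx264 \ # re-encode video frames
-pix_fmt yuv420p \ # raw frame format fed to the encoder
-c:a copy \ # copy audio stream bits without decoding
output.mp4 # output file path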
- Bitstream filters perform bit-level operations on data packets without decoding them into frames (example: aac_adtstoasc).
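A typical use of aac_adtstoasc, assuming a hypothetical MPEG-TS input whose AAC audio is repackaged for the mp4 container without re-encoding :
ffmpeg \
-i input.ts \ # input (MPEG-TS)
-c copy \ # copy all streams without re-encoding
-bsf:a aac_adtstoasc \ # convert AAC packets from ADTS framing to MP4 packaging
output.mp4 # output file path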
- The typical FFMPEG command has a general syntax that looks like ffmpeg $1 $2 -i $3 $4 $5, where :
# an input / output can be either a file path or an url
ffmpeg \
$1 \ # global command options
$2 \ # input options
-i $3 \ # input
$4 \ # output options
$5 \ # output
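A concrete instantiation of this layout (hypothetical file names) with one option in each slot :
ffmpeg \
-hide_banner \ # global option : do not print the build banner
-ss 00:00:30.000 \ # input option : seek to 30s before reading
-i input.mp4 \ # input
-c copy \ # output option : copy all streams
clip.mp4 # output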
ffmpeg \
-i "$input1" -i "$input2" \ # inputs (files/urls)
-map "$input1:$stream" \ # select input stream 1 (-map $input1:v to select all video streams)
-map "$input2:$stream" \ # select input stream 2 (-map $input2:a to select all video streams)
-c:a:0 "$encoder_audio" -strict experimental \ # output first audio stream encoder (-c:a selects all output audio streams)
-c:v:0 "$encoder_video" -strict experimental \ # output first video stream encoder (-c:v selects all output video streams)
-f "$container_format" \ # output container format
"$output_file" # output file path
Note : audio and video streams are 0-indexed in a container
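For example, selecting only the second audio stream (index 1) of a hypothetical input :
ffmpeg \
-i input.mkv \ # input
-map 0:a:1 \ # second audio stream of the first input (0-indexed)
-c:a copy \ # copy its bits
output.m4a # output file path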
ffmpeg \
-ss "$timestamp" \ # start time timestamp (00:00:00.000 format)
-t "$duration" \ # snippet duration (in seconds)
-i "$input" \ # input file
-c:a:0 copy \ # copy first audio stream bits
-f mp3 \ # output container format
"$output_file" # output file path
ffmpeg \
-user_agent "$agent" \ # user agent for input 1
-headers "$header" \ # additional HTTP header for input 1
-i "$input1" \ # input 1 (url)
-user_agent "$agent" \ # user agent for input 2
-headers "$header" \ # additional HTTP header for input 2
-i "$input2" \ # input 2 (url)
-f "$container_format" \ # output container format
"$output_file" # output file path
tracklist=$(find -P /home/user/music/mp3 -maxdepth 1 -regextype 'posix-extended' -regex '^.+\.mp3$' | tr \\n \| | sed 's/.$//') && \
ffmpeg \
-i "concat:$tracklist" \ # input 1 (uses the concat protocol)
-i /path/to/pepeds.gif \ # input 2
-map 0:0 \ # select input 1 audio stream
-map 1:0 \ # select input 2 video stream
-shortest \ # stop transcoding at the end of the shortest stream
-c:a:0 aac \ # output audio stream encoder
-b:a 320k \ # set output audio stream bitrate
-ar:a:0 44100 \ # set output audio stream sample rate (relevant to the loudness normalization filter)
-filter:a:0 "loudnorm" \ # use filter : normalize output audio stream loudness
-c:v:0 libx264 \ # output video stream encoder
-b:v:0 750k \ # set output video stream bitrate
-s 498x357 \ # set video frame size (width x height)
-r 20 \ # set video frame rate (Hz)
-filter:v:0 "loop=loop=-1:size=32767:start=0" \ # use filter : repeat the video frames from 0 to 32767, to no end
-f "$container_format" \ # output container format
"$output_file" # output file path
ffmpeg \
-i "$input" \ # input
-filter_complex \ # filtergraph definition
'[0:a:0] acopy [audio];
[0:v:0] scale=320:180 [scaled]' \
-map '[scaled]' \ # map filtergraph video output
-map '[audio]' \ # map filtergraph audio output
-r 20 \ # set video frame rate (Hz)
-f "$container_format" \ # output container format
"$output_file" # output file path
ffmpeg \
-i "$input1" \ # input 1
-i "$input2" \ # input 2
-filter_complex \ # filtergraph definition
'[0:a:0] [1:a:0] concat=v=0:a=1 [audio];
[0:v:0] scale=320:180 [scaled0];
[1:v:0] scale=320:180 [scaled1];
[scaled0] [scaled1] concat=v=1:a=0 [scaled]' \ # use the scale and concat filters
-map '[scaled]' \ # map filtergraph video output
-map '[audio]' \ # map filtergraph audio output
-r 20 \ # set video frame rate (Hz)
-f "$container_format" \ # output container format
"$output_file" # output file path
ffmpeg \
-i "$input" \ # input
-c:a aac \ # output audio streams encoder
-b:a "$bitrate" \ # output audio streams bitrate
-c:v libx264 \ # output video streams encoder
-s "$size" \ # set video frame size (width x height)
-r "$rate" \ # set video frame rate (Hz)
-bufsize "$size" \ # bitrate control buffer size in bits/s, 2x maxrate (global)
-maxrate 1M \ # max bitrate in bits/s, approx 1-10 Mbps (global)
-keyint_min "$interval" \ # min interval between IDR frames (2) (global)
-g "$size" \ # group of pictures size - max distance between key frames (global)
-sc_threshold "$threshold" \ # threshold for scene change detection (global)
-crf "$factor" \ # set the quality for constant quality mode (1) (libx264)
-preset "$preset" \ # encoding speed preset (libx264)
-f hls \ # use apple http live streaming muxer/container
-hls_time "$length" \ # segments length (usually 2 to 12 seconds)
-hls_playlist_type vod \ # vod (static playlist) / event (append segments on the fly)
"$output_file" # output path for m3u8 playlist and segments
ffmpeg \
-re \ # read and process inputs in real time (VITAL FOR LIVE STREAMING)
-i "$input" \ # input
-f hls \ # use apple http live streaming muxer/container
-hls_time "$length" \ # segments length (usually 2 to 12 seconds)
-hls_segment_type mpegts \ # segments file format (use MPEG-2 Transport Stream by default)
-hls_segment_filename "$segment_file" \ # segments output path (3) (4)
-hls_list_size "$size" \ # max number of segments present in the playlist at a given time (defaults to 5)
-hls_flags "$flag1+$flag2" \ # flags as options for playlist creation (5)
-hls_delete_threshold "$threshold" \ # number of unreferenced segments to keep on disk before deletion by hls_flags delete_segments (set to 5)
-hls_start_number_source generic \ # sets the #EXT-X-MEDIA-SEQUENCE tag in the playlist (defaults to start_number)
-start_number 0 \ # initial segment index in the playlist when using hls_start_number_source generic
-master_pl_publish_rate "$period" \ # updates the master playlist after "$period" new segments are added
-hls_allow_cache 1 \ # allows the client to cache media segments (smoother playback)
-hls_enc 0 \ # disables segments encryption
"$output_file" # output path for the m3u8 playlist
ffmpeg \
-itsoffset "$offset" \ # shift input 1 timestamps by "$offset" seconds
-i "$input1" \ # input 1
-i "$input2" \ # input 2
-map 0:a \ # select all audio streams from input 1
-map 1:v \ # select all video streams from input 2
-filter:a "atempo=$rate" \ # use filter : adjusts audio streams speed by "$rate"
-c:v copy \ # copy all video streams
-y \ # no-confirm overwrite output file
-f "$container_format" \ # output container format
"$output_file" # output file path
- (1) Some details on -crf : for libx264 the scale goes from 0 (lossless) to 51 (worst quality), with a default of 23 - lower values mean higher quality at the cost of a higher bitrate
- (2) For hls playlists, GOP size must be equal to keyint_min and match the segment duration : -g 100 -keyint_min 100 -hls_time 4 will stream at 100/4 = 25 fps
- (3) The segment file path specified here will not be included in the manifest files, only the segment file name will
- (4) Segment names must be unique, so the specified file name can be expanded to include the segment's index in the playlist by using a pattern such as segment%03d.ts
- (5) Useful hls flags are (see the sketch after this list) :
  - delete_segments : auto delete segments that are no longer in the playlist after (segment duration + playlist duration) seconds - use by default
  - temp_file : write each segment to segment.tmp first and rename it to segment.ts once processing is complete - use by default
  - omit_endlist : do not append the EXT-X-ENDLIST tag once inputs are exhausted and the playlist finishes
  - split_by_time : allow segments to start on frames other than keyframes (jeopardizes videojs as of now)
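A minimal sketch combining two of these flags (hypothetical input and segment length) :
ffmpeg \
-re \ # read input in real time
-i "$input" \ # input
-f hls \ # use apple http live streaming muxer/container
-hls_time 4 \ # 4-second segments
-hls_flags delete_segments+temp_file \ # combine flags with +
live.m3u8 # output path for the m3u8 playlist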
- Remember that filters operate on streams and not on containers
- When using the concat demuxer, keep in mind that all the input files' streams (audio or video) MUST HAVE THE SAME TIMEBASE, which usually is 1 / frame rate for video streams and 1 / sample rate for audio streams. Thus, THE AUDIO SAMPLE RATE AND VIDEO FRAME RATE HAVE TO BE THE SAME FOR ALL INPUT FILES. Also, -r "$framerate" has to be passed as an input option to the concat demuxer so it ignores the timestamps stored in the source files and recalculates new timestamps assuming a constant framerate.
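A minimal sketch of the concat demuxer (hypothetical list file and inputs, all assumed to share the same sample rate and frame rate) :
# mylist.txt contains one 'file' directive per part :
# file '/path/to/part1.mp4'
# file '/path/to/part2.mp4'
ffmpeg \
-f concat \ # input option : use the concat demuxer
-safe 0 \ # allow absolute paths in the list file
-r 25 \ # input option : assume a constant 25 fps, ignoring stored timestamps
-i mylist.txt \ # input list file
-c copy \ # copy streams without re-encoding
output.mp4 # output file path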