Arrays, input, Here Documents
Arrays
BASH supports both indexed and associative one-dimensional arrays. Indexed array can be declared
with declare -a array_name
, or first assignment does it automatically (note: indexed arrays only):
arr=(my very first array)
arr=('Otakaari 1' Espoo 02150 [6]='PL 11000')
arr[5]=AALTO
To access array elements (the curly braces are required, unlike normal variable expansion):
# elements one by one
echo ${arr[0]} ${arr[1]}
# array values at once
${arr[@]}
# indexes at once
${!arr[@]}
# number of elements in the array
${#arr[@]}
# length of the element number 2
${#arr[2]}
# to append elements to the end of the array
arr+=(value)
# assign a command output to array
arr=($(command))
# emptying array
arr=()
# to destroy, delete an array
unset arr
# to unset a single array element
unset arr[6]
# sorting array
IFS=$'\n' sorted=($(sort <<<"${arr[*]}"))
# array element inside arithmetic expanssion requires no ${}
((arr[$i]++))
# split a string like 'one two three etc' or 'one,two,three,etc' to an array
# note that IFS=', ' means that separator is either space or comma, not a sequence of them
IFS=', ' read -r -a arr <<< "$string"
# spliting a word to an array letter by letter
word=qwerty; arr=($(echo $word | grep -o .))
Loops through the indexed array:
for i in ${!arr[@]}; do
echo arr[$i] is ${arr[$i]}
done
Negative index counts back from the end of the array, [-1] referencing to the last element.
Quick ways to print array with no loop:
# with keys, as is
declare -p arr
# indexes -- values
echo ${!arr[@]} -- ${arr[@]}
# array elements values one per line
printf "%s\n" "${arr[@]}"
Passing an array to a function as an argument could be the use case when you want to make it local:
f() {
local arr=(${!1}) # pass $1 argument as a refence
# do something to array elements
echo ${arr[@]}
}
# invoke the function, huom that no changes have been done to the original arr[@]
arr=(....)
f arr[@]
BASH associative arrays (this type of array supported in BASH since version 4.2) needs to be
declared first (!) declare -A asarr
.
Both indexed arrays and associative can be declared as an array of integers, if all elements
values are integers declare -ia array
or declare -iA
. This way element values are
treated as integers always.
asarr=([university]='Aalto University' [city]=Espoo ['street address']='Otakaari 1')
asarr[post_index]=02150
Addressing is similar to indexed arrays:
for i in "${!asarr[@]}"; do
echo asarr[$i] is ${asarr[$i]}
done
Even though key can have spaces in it, quoting can be omitted.
# use case: your command returns list of lines like: 'string1 string2'
# adding them to an assoative array like: [string1]=string2
declare -A arr
for i in $(command); do
arr+=(["${i/ */}"]="${i/* /}")
done
Variable expanssions come out in the new light:
# this will return two elements of the array starting from number 1
${arr[@]:1:2}
# all elements without last one
${arr[@]:0:${#arr[@]}-1}
# parts replacement will be applied to all array elements
declare -A emails=([Vesa]=vesa@aalto.fi [Kimmo]=kimmo@helsinki.fi [Anna]=anna@math.tut.fi)
echo ${emails[@]/@*/@gmail.com}
# returns: vesa@gmail.com anna@gmail.com kimmo@gmail.com
For a sake of demo: let us count unique users and their occurances (yes, one can do it with ‘uniq -c’ :)
# declare assoative array of integers
declare -iA arr
for i in $(w -h | cut -c1-8); do # get list of currenly logged users into loop
for u in ${!arr[@]}; do # check that they are unique
if [[ $i == $u ]]; then
((arr[$i]++))
continue 2
fi
done
arr[$i]=1 # if new, add a new array element
done
for j in ${!arr[@]}; do # printing out
echo ${arr[$j]} $j
done
Another working demo: script that automates backups or just makes a sync of data to a remote server. Same can be adapted to copy locally, to a usb drive or alike.
# array of directories to be backuped, to skip one, just comment with #
declare -A dirs
dirs[wlocal]=/l/$USER
dirs[xpproject]=/m/phys/extra/project/xp
dirs[homebin]=$HOME/bin
cmd='/usr/bin/rsync' # rsync
args="-auvW --delete --progress $@" # accept extra args, like '-n' for the dryrun first
serv='user@server:backups' # copying to ~/backups that must exist
# array key is used for the remote dir name
for d in ${!dirs[@]}; do
echo "Syncing ${dirs[$d]}..."
$cmd $args ${dirs[$d]}/ $serv/$d
done
Exercise 2.5
Exercise
make a script/function that produces an array of random numbers, make sure that numbers are unique. Print the array nicely using
printf
for formating.one version should use BASH functionality only (Tip:
$RANDOM
)the other one can use
shuf
(*) Pick up the
ipvalid
function that we have developed earlier, implement IP matching regular expression as^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})$
and work with the ${BASH_REMATCH[*]} array to make sure that all numbers are in the range 0-255
Working with the input
User input can be given to a script in three ways:
as command arguments, like
./script.sh arg1 arg2 ...
interactively from keyboard with
read
commandas standard input, like
command | ./script
Nothing stops from using a combination of them or all of the approaches in one script. Let us go through the last two first and then get back to command line arguments.
read
can do both: read from keyboard or from STDIN
# the command prints the prompt, waits for the response, and then assigns it
# to variable(s)
read -p 'Your names: ' firstn lastn
# read into array, each word as a new array element ('arr' declared automatically)
read -a arr -p 'Your names: '
Given input must be checked (!) with a pattern, especially if script creates directories, removes files, sends emails based on the input.
# request a new directory name till correct one is given (interrupt with Ctrl-C)
regexp='^[a-zA-Z0-9/_-]+$'
until [[ "$newdir" =~ $regexp ]]; do
read -p 'New directory: ' newdir
done
read
selected options
-a <ARRAY>
read the data word-wise into the specified array <ARRAY> instead of normal variables
-N <NCHARS>
reads <NCHARS> characters of input, ignoring any delimiter, then quits
-p <PROMPT>
the prompt string <PROMPT> is output (without a trailing automatic newline) before the read is performed
-r
raw input - disables interpretion of backslash escapes and line-continuation in the read data
-s
secure input - don’t echo input if on a terminal (passwords!)
-t <TIMEOUT>
wait for data <TIMEOUT> seconds, then quit (exit code 1)
read
is capable of reading STDIN, case like command | ./script
, with while read var
it goes
through the input line by line:
# IFS= is empty and echo argument in quotes to make sure we keep the format
# otherwise all spaces and new lines shrinked to one and leading/trailing whitespace trimmed
while IFS= read -r line; do
echo "line is $line" # do something useful with $line
done
# To check current $IFS
cat -A <<<"$IFS"
Though in general, whatever comes from STDIN can be proceeded as:
# to check that STDIN is not empty
if [[ -p /dev/stdin ]]; then
# passing STDIN to a pipeline (/dev/stdin can be omitted)
cat /dev/stdin | cut -d' ' -f 2,3 | sort
fi
Other STDIN tricks that one can use in the scripts:
# to read STDIN to a variable, both commands do the same
var=$(</dev/stdin)
var=$(cat)
In the simplest cases like ./script arg1 arg2 ...
, you check $# and then assign
$1, $2, … the way your script requires.
# here we require exactly two arguments
if (($#==2)); then
var1=$1 var2=$2
# ... do something useful
else
echo 'Wrong amount of arguments'
echo "Usage: ${0##*/} arg1 arg2"
exit 1
fi
To work with all input arguments at once we have $@:
# $# is a number of arguments on the command line, must be non-zero
if (($#)); then
for i; do
echo "$i"
# ... do something useful with each element of $@
# note that 'for ...' uses $@ by default if no other list given with 'in ...'
done
else
echo 'No command line arguments given'
fi
As a use case, our tarit.sh script. The script can accept STDIN and arguments, so we check both:
# Usage: tarit.sh [dirname1 [dirname2 [dirname3 ...]]]
# or command | tarit.sh
# by default no directories to archive. i.e. current
args=''
# checking for STDIN, if any, assigning STDIN to $args
[[ -p /dev/stdin ]] && args=$(</dev/stdin)
# if arguments are given, appending the $args with $@
(($#)) && args+=" $@"
# no arguments, no stdin, then it is a current dir
[[ -z "$args" ]] && args="$(pwd)"
# by now we should have a directory list in $args to archive
for d in $args; do
# checking that directory exists, if so, archive it
if [[ -d "$d" ]]; then
echo Archiving $d ...
tar caf ${d##*/}.$(date +%Y-%m-%d).tar.gz "$d"
else
echo " $d does not exist, skipping."
fi
done
Often, the above mentioned ways are more than enough for simple scripts.
But what if options and arguments are like
./script [-f filename] [-z] [-b] [arg1 [arg2 [...]]]
or more complex?
(common notaion: options in the square brackets are optional). What if you write
a production ready script that will be used by many other as well?
It is were getopt
offers a more efficient way of handling script’s input options.
In the simplest case getopt
command (do not get confused with getopts
built-in BASH
function of similar kind) requires two parameters to work:
first is a list of valid input options – sequence of letters and colons. If letter
followed by a colon, the option requires an argument, if folowed by two colons, argument
is optional. For example, the string getopt "sdf:"
says that the options -s, -d and -f
are valid and -f requires an argument, like -f filename.
The second argument required by getopt is a list of input parameters (options + arguments)
to check, i.e. just $@
.
Let us use cx script as a demo:
# common usage function with the exit at the end
usage() {
echo "Usage: $sname [options] file [file [file...]]"
echo ' -a, gives access to all, like a+x, by default +x'
echo ' -d <directory/path/bin>, path to the bin directory'
echo " can be used in 'cx' to copy a new script there"
echo ' -a, gives access to all, like a+x, by default +x'
echo ' -d <directory/path/bin>, path to the bin directory'
echo " can be used in 'cx' to copy a new script there"
echo ' -v, verbose mode for chmod'
echo ' -h, this help message'
exit 1
}
# whole trick is in this part: getopt validates the input parameters,
# structures them by dividing options and arguments with --,
# and returns them to a variable
# then they are reassigned back to $@ with 'set --'
opts=$(getopt "avhd:" "$@") || usage
set -- $opts
# defining variables' default values
ALL=''
CMD='/usr/bin/chmod'
sname=${0##*/} # the name this script was called by
# by now we have a well structured $@ which we can trust.
# to go through options one by one we start an endless 'while' loop
# with the nested 'case'. 'shift' makes another trick, every time
# it is invoked it is equal to 'unset $1', thus $@ arguments are
# "shifted down", $2 becomes $1, $3 becomes $2, etc
# 'getopt' adds -- to $@ which separates valid options and the rest
# that did not qualify, when it comes to '--' we 'break' the loop
while true; do
case ${1} in
-h) usage ;; # output help message and exit
-a) ALL=a ;; # if -a is given we set ALL
-v) CMD+=' -v' ;; # if verbose mode required
-d) shift # shift to take next item as a directory path for -d
BINDIR="$1"
if [[ -z "$BINDIR" || ! -d "$BINDIR" ]]; then
echo "ERROR: the directory does not exist"
usage
fi
;;
--) shift; break ;; # remove --
esac
shift
done
# script body
case "$sname" in
cx*) $CMD ${ALL}+rx "$@" && \
[[ -n "$BINDIR" ]] && cp -p $@ $BINDIR ;;
cw*) $CMD ${ALL}+w "$@" ;;
cr*) $CMD ${ALL}+r "$@" ;;
c-w*) $CMD ${ALL}-w "$@" ;;
*) echo "ERROR: no idea what $sname is supposed to do"; exit 1 ;;
esac
getopt
can do way more, go for man getopt
for details, as an example:
# here is getopt sets name with '-n' used while reporting errors: our script name
# accepts long options like '--filename myfile' along with '-f myfile'
getopt -n $(basename $0) -o "hac::f:" --long "help,filename:,compress::" -- "$@"
Exercise 2.6
Exercise
Using the latest tarit.sh (see lecture notes) version as an example, expand above cx script to accept STDIN, like
command | cx [options]
, wherecommand
produces a list of files. Examplefind . -t file -name '*.sh' | cx -a -d /path/to/bin
.Using cx demo as an example, expand the latest version of our tarit.sh (see lecture notes) to make it accepting the following options and arguments:
tarit.sh -h -y -d <directory/with/backups> [dirname1 [dirname2 [dirname3 ...]]]
. By default, with no args, it still should make an archive of the current directory.-h
returns usage info,-d <directory/path/with/backups>
is a directory the tar archives will go to, your script has to check that directory exists, the script must also check whether a newly created archive already exist and if so, skip creating the archive with the corresponding warning message.(*)
-y
should force overwriting already existing archive.(*)
-s
should make script silent, so that no errors or other messages would come from any inline command.
Here Document, placeholders
A ‘here document’ and ‘here string’ take the line(s) following and send them to standard input. It’s a way to send larger blocks to stdin.
# instead of 'echo $STRING | command ...'
command <<<$STRING
# instead of 'cat file | command ...'
command <<SomeMagicStopWord
The benefit is that one can use $var, $() etc in the text
The text ends with the Stop Word on a new line, the word can be any
SomeMagicStopWord
Often used for messaging, be it an email or dumping bunch of text to file.:
# NAME, SURNAME, EMAIL, DAYS are set earlier
mail -s 'Account expiration' $EMAIL<<END-OF-EMAIL
Dear $NAME $SURNAME,
your account is about to expire in $DAYS days.
$(date)
Best Regards,
Aalto ITS
END-OF-EMAIL
Or just outputting to a file (same can be done with echo commands):
cat <<EOF >filename
... text
EOF
One trick that is particularly useful is using this to make a long comment:
: <<\COMMENTS
here come text that is seen nowhere
there is no need to comment every single line with #
COMMENTS
Hint <<\LimtiString
to turn off substitutions and place text as is with $ marks etc
In case you have a template file which contains variables as placeholders, replacing them:
# 'template' file like:
The name is $NAME, the email is $EMAIL
# command to substitute the placeholders and redirect to 'output' file
# the original 'template' file remains as is
NAME=Jussi EMAIL=jussi@gmail.com
cat template | while IFS= read -r line; do eval echo $line; done > output
# resulting file: The name is Jussi, the email is jussi@gmail.com