Intro
Welcome back. As a reminder, in the previous part you learned about the installation steps, the initialization of the data file, reading data from the data file, and adding new data. In this part, we'll explore the completions generation flow, how to get user input, and how to deal with edge cases.
C-4 Generate completions
Later, in the integrating shells chapter, we'll see that we trigger this flow each time we type the z command in the terminal followed by a partial path as an argument and press the TAB key. This produces a list of possible completions from which the user can select. In the integrating shells chapter, we'll implement this by utilizing the completion functions of both zsh and bash. Those functions expect to receive a list of paths to display as completions to the user, and that's what this chapter is about.
In this chapter, we'll review how to construct this list. We enter this flow each time a user hits the TAB key on the z command with a partial path. We internally invoke the _z function with the "--complete" parameter and the given partial path to attempt to construct a completion list from our $datafile.
For example, if a user types z foo in the terminal and presses TAB, internally, we invoke _z --complete foo. In the _z function scope, $1 will be "--complete", and $2 will be "foo". We then search $2 against our $datafile and return a list of possible matches.
Don't worry too much about this at the moment. You must only understand that this flow generates a completions list each time a user hits the TAB key on our program path argument.
# z.sh:105
# _z { ... }
# tab completion
elif [ "$1" = "--complete" -a -s "$datafile" ]; then { .. }
-s - Conditional expression. True if file exists and has a size greater than zero.
Goal: Conditionally, enter the completions flow if the completions argument is present and we have a data file from which to extract completions.
Walkthrough: Check if the first argument passed to the script equals the string "--complete"; if it does, and $datafile
exists with a size greater than zero, then evaluate the block; if not, skip it.
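As a rough illustration of the condition above, the following sketch wraps the same test in a hypothetical helper, can_complete, and tries it against an empty and a non-empty temporary file. The helper name, the temporary files, and the sample row are assumptions made for this example; none of them are part of z.sh.

```bash
# Hypothetical helper: succeeds only for "--complete" plus a non-empty data file.
can_complete() {
    [ "$1" = "--complete" -a -s "$2" ]
}

empty_file=$(mktemp)                          # exists, but has size zero
full_file=$(mktemp)
echo "/tmp|1|1700000000" > "$full_file"       # one made-up sample row

can_complete --complete "$full_file"  && echo "non-empty file: enter the flow"
can_complete --complete "$empty_file" || echo "empty file: skip the flow"
can_complete --add "$full_file"       || echo "other argument: skip the flow"

rm -f "$empty_file" "$full_file"
```

Only the first call satisfies both sides of the -a operator, so only that call would enter the completions block.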
# z.sh:106
# _z { ... } / elif [ "$1" = "--complete" -a -s "$datafile" ]; then { .. }
_z_dirs | \awk -v q="$2" -F"|" '
BEGIN {
q = substr(q, 3)
if( q == tolower(q) ) imatch = 1
gsub(/ /, ".*", q)
}
{
if( imatch ) {
if( tolower($1) ~ q ) print $1
} else if( $1 ~ q ) print $1
}
' 2>/dev/null
Consult the awk man pages for each command you want to understand better.
AWK CONTEXT:
substr(s, m[, n]) - the substring of s that starts at position m (counting from 1) and is at most n characters long; if n is omitted, the substring runs to the end of s.
tolower(str) - return str in lower case.
gsub(r, t[, s]) - substitute t for all occurrences of the regular expression r in the string s; if s is omitted, $0 is used. Returns the number of replacements.
~ - regular expression match operator; the left operand is the string to test, and the right operand is treated as the regular expression.
Goal: Perform a search on our data file rows to print all matching path fields that contain our given query.
Walkthrough: Invoke the _z_dirs
function to get all data records, pipe its stdout to the awk program, and set a variable named q
to be the script's second argument, which should be a user input for a completion.
For example, if the user wants to complete the word "hello", they type the command z hello in the terminal and press <TAB>. Then $2 will equal the string "z hello": the command line typed so far, command name included, which is passed down from the compctl -U -K <function> or the bash equivalent.
We will discuss this in great detail in the last chapter about integrating with shells. For now, just remember that the caller is a completion invoker that triggers when you press TAB (completion) in your shell, either bash or zsh. Doing so passes the user's desired completion string as $2 and expects the provided function to generate a list of all possible completions.
We find completions through the awk program and return a list of completions on stdout, as the completion-invoking function expects. We also set the input field separator to the pipe character.
In the BEGIN block of the awk program, we replace q, which initially holds $2, with substr(q, 3): the original string from its third character onward, which drops the first two characters. We do this to get rid of the "z " prefix (the program name and the space after it) that the completion command placed at the start of $2. We then check whether the resulting q is all lowercase. If it is, we set the imatch variable to the truthy value 1, which makes the later match case-insensitive.
We then substitute every space character in the string q with the .* regex pattern, so that q can later be used in the awk body with the regular expression match operator ~. The dot matches any single character, and the star means zero or more repetitions, so .* accepts any sequence of characters, including an empty string. We do this in case the query contains spaces; then we want the match operator to find all entries that contain the query's words in order, regardless of what sits between them.
In the main block of the awk program, which runs once per record, we check whether imatch is set; if it is, we enter the if block. Inside the if block, we check whether the lowercased $1 path field of the record matches the regular expression q using the match operator ~; if it does, we print the first field of the record. Otherwise, control reaches the else if block, where we check whether the first field, this time without lowercasing, matches the regular expression q; if it does, we print the first field.
At the end of the awk execution, discard the stderr by redirecting it to the null device. This will supply the completion functions with a list of entries as stdout to suggest as completions and will fail silently in case of an error.
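To see the smart-case behavior in action, here is a minimal sketch that runs the same awk program against a few made-up rows. The sample_rows function is a hypothetical stand-in for _z_dirs, complete_paths is a hypothetical wrapper, and the sketch assumes the caller passes the typed command line, such as "z proj", in place of $2.

```bash
# Hypothetical stand-in for _z_dirs: prints made-up data file rows.
sample_rows() {
    printf '%s\n' \
        '/home/user/Projects/foo|10|1700000000' \
        '/home/user/projects/bar|5|1700000100' \
        '/var/log|1|1700000200'
}

# Same awk program as above; $1 plays the role of the script's $2.
complete_paths() {
    sample_rows | awk -v q="$1" -F"|" '
        BEGIN {
            q = substr(q, 3)                   # keep everything from the 3rd character on
            if( q == tolower(q) ) imatch = 1   # all-lowercase query: ignore case
            gsub(/ /, ".*", q)                 # each space becomes "match anything"
        }
        {
            if( imatch ) {
                if( tolower($1) ~ q ) print $1
            } else if( $1 ~ q ) print $1
        }
    ' 2>/dev/null
}

complete_paths "z proj"       # lowercase query: both project paths match
complete_paths "z Proj"       # mixed case: only /home/user/Projects/foo matches
complete_paths "z user foo"   # becomes user.*foo: only the foo path matches
```

An all-lowercase query matches regardless of case, while a query with any uppercase letter is matched exactly as typed.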
Don't worry if this section is a bit out of context. We will get back to it once we discuss the completion logic near the end of this breakdown in the last chapter. Just remember that this whole section's purpose is to be invoked on a partial or whole word completion and supply a list of matching paths from our
$datafile
to the completions function that called this script with the '--complete' argument. It will make much more sense later.
C-5 Get input
# z.sh:119
# _z { ... } / else { .. }
# else is the case when "$1" not equal "--add" or "--complete" arguments.
else
# list/go
local echo fnd last list opt typ
while [ "$1" ]; do case "$1" in
--) while [ "$1" ]; do shift; fnd="$fnd${fnd:+ }$1";done;;
-*) opt=${1:1}; while [ "$opt" ]; do case ${opt:0:1} in
c) fnd="^$PWD $fnd";;
e) echo=1;;
h) echo "${_Z_CMD:-z} [-cehlrtx] args" >&2; return;;
l) list=1;;
r) typ="rank";;
t) typ="recent";;
x) \sed -i -e "\:^${PWD}|.*:d" "$datafile";;
esac; opt=${opt:1}; done;;
*) fnd="$fnd${fnd:+ }$1";;
esac; last=$1; [ "$#" -gt 0 ] && shift; done
${parameter:+word} - If parameter is null or unset, nothing is substituted, otherwise the expansion of word is substituted.
${parameter:offset} and ${parameter:offset:length} - This is referred to as Substring Expansion. It expands to up to length characters of the value of parameter starting at the character specified by offset. If length is omitted, it expands to the substring of the value of parameter starting at the character specified by offset and extending to the end of the value. length and offset are arithmetic expressions.
Goal: Extract the arguments and options the user passes to the script and assign each to a local variable. Those variables let us control the script's execution flow and provide the desired output to the user. To extract the options and arguments, we loop over the positional parameters supplied by the user, take each parameter, and determine what type it is: a double dash that introduces a list of arguments, a single dash followed by one or more characters representing options, or, if neither of those cases matched, a plain argument handled by the default case. Once we know what type of parameter we were passed, we process it to fill in our local variables.
Walkthrough: We start by defining local empty variables, which we'll use when extracting passed arguments and options. Next, figure out the outer while loop and see how to process positional parameters.
The following code example highlights the outer while loop and its cases so we can first focus and understand it before we dive deeper into each case.
while [ "$1" ]; do case "$1" in
--) ...some inner logic;;
-*) ...some inner logic;;
*) ...some inner logic;;
esac; last=$1; [ "$#" -gt 0 ] && shift; done
We loop over the supplied positional parameters one by one. As long as we have a value stored in the first positional parameter, $1
, we enter the case "$1" in
clause and check if its value matches any of our cases. In total, we have three cases:
- --) Match a double dash.
- -*) Match a dash immediately followed by any characters.
- *) A default case capturing everything else.
After we match for a specific parameter and perform its related inner logic, we'll get to the esac
case termination. We then assign the $last
variable as the last $1
parameter we iterated upon. This will help us later determine if we are dealing with a suggested completion path that we should navigate to. More on this later.
We also check whether the number of remaining parameters is greater than zero; if it is, we shift. Then we either reiterate the while loop, if a parameter still exists and is now stored in $1, or hit done and terminate the while loop if none are left.
Once we hit done
, we can work only with our local variables for the rest of the script's execution to fulfill the user's intentions.
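The following sketch isolates the outer loop so we can watch which case each parameter lands in. The classify_params function is hypothetical and only prints a label per parameter; unlike the real code, it does not consume the parameters that follow a double dash.

```bash
# Hypothetical helper: prints which outer case each parameter would hit.
classify_params() {
    while [ "$1" ]; do
        case "$1" in
            --) printf '%s -> double dash\n' "$1";;
            -*) printf '%s -> option(s)\n' "$1";;
            *)  printf '%s -> argument\n' "$1";;
        esac
        [ "$#" -gt 0 ] && shift
    done
}

classify_params -l foo --
```

Note that the order of the patterns matters: the double dash would also match -*, so it has to be tested first.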
Positional arguments
Okay, so we understand the outer while
loop. Let's dive deeper and figure out what each of the three cases actually does and how it populates our local variables. We'll start with the first one, the double-dash case.
--) while [ "$1" ]; do shift; fnd="$fnd${fnd:+ }$1";done;;
Goal: A double dash is used in most Bash built-in commands and many other commands to signify the end of command options, after which only positional ("non-option") arguments are accepted. For example, the z script has an -x option that we'll explore shortly, but suppose we want z to navigate to a path that includes '-x' in its name. If we invoke z -x, the '-x' is parsed as an option and changes the execution flow, in this case removing the current directory from the datafile. If we instead pass the double dash followed by '-x', as in z -- -x, the '-x' is treated as an argument, and z tries to find a matching path in the datafile containing '-x'. We implement the first case, which matches a double dash, to support this conventional behavior.
Walkthrough: As soon as we encounter a double-dash parameter, we immediately nest another while loop with the same condition as the outer one, checking that $1 is non-empty. Inside it, we shift, since we no longer need the double-dash string currently stored in $1; the argument that was $2 is now $1, and we append it to our fnd variable. We then reiterate until we have dealt with all parameters.
If more than one argument was passed, the rest sit in positional parameters $2 ... $n. Since we shifted earlier, our current $1 is the first argument supplied. Our while-loop condition says that we iterate as long as $1 holds a value, so we iterate and shift again; this time the old $2 becomes the new $1 and represents the second passed argument. We want to assign that to the fnd variable as well, so we need a way to collect multiple arguments in the fnd string, separated by a space character.
The assignment fnd="$fnd${fnd:+ }$1" handles both situations. If fnd is still empty, both $fnd and ${fnd:+ } expand to nothing, so fnd is assigned the value stored in $1 without any spaces. If fnd already has a value, ${fnd:+ } expands to a single space, so the result is the current fnd, a space, and then the value stored in $1.
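If more parameters exist, we continue iterating until all positional parameters are consumed; when $1 no longer contains a value, we exit both while loops, since they have the same termination condition. At this point, the fnd variable is a string holding the passed arguments separated by space characters. Because the final shift leaves $1 empty before the last assignment, fnd also ends with one trailing space, which is harmless here since spaces are later replaced with .* for the regEx search.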
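To make the double-dash loop concrete, here is a small sketch that runs the same case on its own. The collect_after_double_dash function is a hypothetical wrapper; it prints fnd between brackets so that any leading or trailing space stays visible.

```bash
# Hypothetical wrapper around the "--" case, printing fnd in brackets.
collect_after_double_dash() {
    local fnd=
    case "$1" in
        --) while [ "$1" ]; do shift; fnd="$fnd${fnd:+ }$1"; done;;
    esac
    printf '[%s]\n' "$fnd"
}

collect_after_double_dash -- -x        # prints [-x ]
collect_after_double_dash -- -x foo    # prints [-x foo ]
```

The trailing space comes from the last iteration: the loop shifts the final argument away and then appends a separator plus the now-empty $1.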
Command options
Next let’s explore the ‘options’, a case with single dash followed by any characters:
-*) opt=${1:1}; while [ "$opt" ]; do case ${opt:0:1} in
# c) ...;;
# e) ...;;
# h) ...;;
# l) ...;;
# r) ...;;
# t) ...;;
# x) ...;;
esac; opt=${opt:1}; done;;
*
- POSIX Pattern matching multiple characters, pattern that shall match any string, including the null string. Read more.
Goal: Capture all options passed to the script and support multiple options in a single parameter. To do this, we must remove the leading dash that got us into this flow through the outer case clause and save the trimmed string in a variable. Then we loop over this variable, which may contain multiple characters; on each iteration, we extract one character to match against an option case and then remove it from the variable so the next iteration handles the rest. Finally, once the variable is empty, we know we have processed all passed options and are done.
Walkthrough: A single-dash-star case means a dash character (-
) followed by any characters representing possible passed options. The star (*
) is a glob pattern star. When we encounter this case, we first assign the opt
variable as the substring of the passed option/s ($1
) by doing opt=${1:1}
. This omits the dash character and keeps only the following characters.
We follow with a while loop that tests for our opt
variable; as long as this variable contains a value, we iterate. To match our single-character cases, we first need to ensure that the word
component in our case
clause is a single character per iteration. To do this, we substring on opt
with ${opt:0:1}
, taking from opt
a length of 1
from position 0
; this enforces a single character as a word
component that we can match against in the case
clause.
After the case
clause, we reassign our opt
variable by doing opt=${opt:1}
, which is a substring on itself that omits the first character, the character we just worked on, effectively forwarding it by one character; performing this reassignment after each iteration ensures that the while loop will operate on the next character and eventually will consume all the remaining characters until reassigned to an empty value that will terminate the while loop.
For example, suppose we pass multiple options to our script by running the command z -cel. After opt=${1:1} removes the dash at the first position, the opt variable equals cel. The first iteration of the while loop extracts c with ${opt:0:1} and operates on the c case; then opt=${opt:1} reassigns opt to el. The second iteration operates on e and reassigns opt to l, and the third operates on l. Lastly, opt is reassigned to an empty value, terminating the while loop.
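The character-by-character consumption can be sketched on its own. The split_options function below is hypothetical; instead of matching each character against an option case, it simply prints it.

```bash
# Hypothetical helper: lists the single-character options packed into one parameter.
split_options() {
    local opt=${1:1}                  # drop the leading dash
    while [ "$opt" ]; do
        echo "option: ${opt:0:1}"     # the first remaining character
        opt=${opt:1}                  # drop the character we just handled
    done
}

split_options -cel
```

Running it with -cel prints one line each for c, e, and l, in that order, and the loop stops once opt is empty.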
So now that we understand how to consume options and match them to a case let's explore what each case actually does, starting with the c
option.
The ‘-c’ option
c) fnd="^$PWD $fnd";;
PWD - Shell variable holding the absolute pathname of the current working directory.
^ - The caret has a special meaning in regEx patterns: it matches the start of a line. For example, take a three-line paragraph where every line includes the string "text" but only the second line starts with it, and two regular expressions, one with a caret and one without: '/^text/' and '/text/'. The regEx with the caret matches the second line only and ignores all the other occurrences; the one without the caret matches any occurrence of "text" regardless of its position in the line.
Goal: Assign the $fnd
variable with a regEx string to restrict the lookup that we'll later perform on $fnd
to start from the user's current working directory path and match only subdirectories of that path that may possibly include a query passed from the user.
Walkthrough: When a user passes the -c
option, we assign the local variable $fnd
to a string that will be used as a regEx later; its value will be the combination of a caret (^
) immediately followed by the current working directory variable $PWD
, a space character, and the current value of $fnd
in case a query was passed as a parameter and was previously assigned as $fnd
by other cases such as the --
or *
cases.
Later, when we evaluate it as a regEx to match against our datafile entries, we want to match only the path entries that begin with the current working directory and the possibly included passed query from the user.
For example, suppose we have a datafile path entry such as /some/deep/nested/path/specific/directory and our current location is /some/deep/nested. If we execute the z script from the current location with the -c option followed by the query directory, as in z -c directory, the $fnd variable will be assigned the value ^/some/deep/nested directory and will later be matched as a regEx against the full paths in our datafile.
Some of you may wonder how we can match deeply nested subdirectories if we only provide a starting path and a query string containing a word from the nested directory tree.
We achieve this by manipulating the $fnd variable again right before we perform the actual regEx search so that it will match the full path. Later, we replace any space character with .* so that everything between our current path and the passed query will be included and matched.
Taking our previous example, the actual regEx string right before we try matching will be ^/some/deep/nested.*directory, and this is how we'll get a match on the entire path from our example above.
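As a rough sketch of that later step, the snippet below builds the same kind of $fnd value and matches it against a path. The matches_under_cwd function is hypothetical, its first argument stands in for $PWD, and performing the space replacement with awk's gsub is an assumption borrowed from the completion code earlier in this part.

```bash
# Hypothetical helper: $1 stands in for $PWD, $2 is the query, $3 a datafile path.
matches_under_cwd() {
    local fnd="^$1 $2"
    printf '%s\n' "$3" | awk -v q="$fnd" '
        BEGIN { gsub(/ /, ".*", q) }   # "^/cwd query" becomes "^/cwd.*query"
        $0 ~ q { print "match: " $0 }
    '
}

matches_under_cwd /some/deep/nested directory /some/deep/nested/path/specific/directory
matches_under_cwd /some/deep/nested directory /other/place/directory
```

Only the first call prints a match; the second path contains the query but does not start with the given working directory, so the caret rules it out.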
The ‘-e’ option
e) echo=1;;
Goal: Print the best match instead of navigating to it.
Walkthrough: When a user passes the -e
option, we assign the variable echo with 1
to print the resulting path and prevent navigation when we later read from this variable.
For example, if a user inputs z -e foo
, we will print the best-matching foo path and prevent navigation to it.
The ‘-h’ option
h) echo "${_Z_CMD:-z} [-cehlrtx] args" >&2; return;;
Goal: Print the help menu, which describes all possible arguments a user can pass to the script.
Walkthrough: When a user passes the -h
option, we use the built-in echo
command to print a string indicating the script name or the default ‘z,’ followed by all possible options in brackets and an ‘args’ string indicating a path argument.
We then redirect stdout (the default file descriptor 1) to file descriptor 2, which is stderr, meaning all output from this command will be sent to stderr.
Finally, we return from the function with the exit status of the last command, ending the script's execution.
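A quick way to convince ourselves that the usage line really goes to stderr is the sketch below. The show_usage function is a hypothetical stand-alone copy of the -h case.

```bash
# Hypothetical stand-alone version of the -h case.
show_usage() {
    echo "${_Z_CMD:-z} [-cehlrtx] args" >&2
    return
}

show_usage 2>/dev/null    # nothing appears: the message went to stderr
show_usage                # the usage line shows up, delivered through stderr
```

If the _Z_CMD variable is set, its value replaces the default z in the printed line.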
The ‘-l’ option
l) list=1;;
Goal: List matches instead of navigating to the best matching path.
Walkthrough: When a user passes the -l
option, we assign 1
to the list
variable so that later, when we read this variable, we can list all possible paths and prevent navigation.
The ‘-r’ option
r) typ="rank";;
Goal: Navigate to the highest-ranking entry from our data file.
Walkthrough: When a user passes the -r
option, we assign the typ
variable with the string “rank” so that we can later navigate to the highest rank entry in our data file.
The ‘-t’ option
t) typ="recent";;
Goal: Navigate to the most recent entry from our data file.
Walkthrough: When a user passes the -t
option, we assign the typ
variable with the string “recent” so that we can later navigate to the most recent entry in our data file.
The ‘-x’ option
x) \sed -i -e "\:^${PWD}|.*:d" "$datafile";;
sed - stream editor for filtering and transforming text.
sed -i[SUFFIX], --in-place[=SUFFIX] - edit files in place (makes backup if SUFFIX supplied).
sed -e script, --expression=script - add the script to the commands to be executed.
\:regexp: - Match lines matching the regular expression regexp. The : may be any character.
d - Delete pattern space. Start next cycle.
Goal: Remove the current directory from the data file using the sed utility.
Walkthrough: When a user passes the -x
option, we run the sed
utility over our data file.
We provide the -i option to sed, stating that we want it to perform the operation in place, meaning the data file itself is edited instead of the result being printed to stdout.
We also provide the -e option, which expects a script as its value; sed runs that script on each line in the data file.
In this case, the script is an address followed by a command. The address is a regEx, delimited by colons, that matches a line from the data file that starts with our current working directory, followed immediately by a pipe (|) character and then the rest of the line.
If a line from our data file matches the regEx, the 'd' command deletes that whole line.
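Here is a minimal sketch of the same sed expression at work on a throwaway file. The remove_entry function and the sample rows are hypothetical, the first argument stands in for $PWD, and the sketch deliberately leaves out -i so that sed prints the surviving lines instead of rewriting the file.

```bash
# Hypothetical helper: $1 stands in for $PWD, $2 is a data file.
# Without -i, sed prints the surviving lines instead of editing the file.
remove_entry() {
    sed -e "\:^${1}|.*:d" "$2"
}

datafile=$(mktemp)
printf '%s\n' \
    '/home/user/projects|10|1700000000' \
    '/home/user/projects/app|4|1700000100' \
    '/tmp|1|1700000200' > "$datafile"

remove_entry /home/user/projects "$datafile"   # prints the last two rows only
rm -f "$datafile"
```

The pipe character right after the directory is what keeps the /home/user/projects/app row alive: only the exact directory is removed, not its subdirectories.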
Default case
As stated in the bash official documentation: It’s a common idiom to use ‘*’ as the final pattern to define the default case, since that pattern will always match.
*) fnd="$fnd${fnd:+ }$1";;
Goal: Capture all positional arguments that are not options and not specifically passed with the double-dash case.
Walkthrough: The final case of our outer while loop is the star case (*
), representing a default case. We treat it similarly to our first case, the positional argument with the double dash (--
) case excluding the nested while loop.
At the end of this case, the $fnd
variable will hold the first passed positional argument supplied to our script.
If the user supplied more than one argument, the outer while loop will shift
, re-iterate, and match the default star case again, concatenating the previously stored value in $fnd
with the new value while separating them with a space character.
When we use the $fnd variable for our regEx search later, we will replace every space character with .* before performing the search. This will allow us to restrict a search to a specific path.
For example, suppose we have two long paths in our data file such as:
/some/1/long/path/foo
/some/2/long/path/foo
If we run z with z 2 foo, the $fnd variable will be assigned the value 2 foo through our default case. Before performing the regEx search, we will replace the separating space with .* to get 2.*foo, which will match only the second path from the above example paths.
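The example above can be replayed with a short sketch. The build_query function is hypothetical: it joins its arguments the way the default case does and then uses sed as a stand-in for the space replacement that happens later in the script; grep plays the role of the regEx search.

```bash
# Hypothetical helper: joins its arguments like the default case does,
# then prints the regEx that results from replacing spaces with ".*".
build_query() {
    local fnd=
    while [ "$1" ]; do
        fnd="$fnd${fnd:+ }$1"
        shift
    done
    printf '%s\n' "$fnd" | sed 's/ /.*/g'
}

query=$(build_query 2 foo)      # 2.*foo
printf '%s\n' /some/1/long/path/foo /some/2/long/path/foo | grep -e "$query"
```

Only /some/2/long/path/foo is printed, since the first path contains no 2 for the pattern to start from.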
C-6 Recognize edge-cases
# z.sh:135
# _z { ... } / else { .. }
[ "$fnd" -a "$fnd" != "^$PWD " ] || list=1
# if we hit enter on a completion just go there
case "$last" in
# completions will always start with /
/*) [ -z "$list" -a -d "$last" ] && builtin cd "$last" && return;;
esac
# no file yet
[ -f "$datafile" ] || return
This chapter is about catching early pre-known conditions that can break our main logic if we don't deal with them explicitly, either by early termination or variable adjustment.
We have, in total, three edge cases which appear when the following conditions are met:
- No positional parameter was supplied.
- Parameter supplied as the product of a completion function.
- We have no data file defined.
Next, we'll review each and see how they protect our main logic from breaking once those conditions emerge.
No positional parameters
# z.sh:135
# _z { ... } / else { .. }
[ "$fnd" -a "$fnd" != "^$PWD " ] || list=1
Goal: Test whether or not a user passed a positional parameter. If they didn't, we will print all entries from the data file. This scenario can happen in two cases:
- The user invoked z with no arguments.
- The user invoked z with only the -c option and no arguments.
In the first case, $fnd will be empty, while in the second, it will equal "^$PWD " (with a trailing space).
If we encounter either of those cases, we want to prevent navigation and display a list of results. For the second case, those results should be filtered by the current working directory.
Walkthrough: Since we passed more than 3 arguments to the test ([...]
), -a
will be treated as a binary operator and interpreted by the test as a logical AND (&&
) checking if both left and right sides are true.
The left side will check if the local variable $fnd
was assigned to a value and is not empty. The right side will check if the $fnd
variable is NOT equal to the string representing the regex of a path starting with our current working directory and ending with a space character.
Note that the regEx is not evaluated here; we only check whether the variable equals that string. As previously mentioned, when the regEx is evaluated, if it has a space character like in the second case from above, the space will be replaced with .* and the regEx will match all paths starting with the current working directory.
If both conditions are true, we will continue with the rest of the script body. If not, we evaluate the right side of the logical OR (||
), which will assign the local variable list
the value of 1
so that we can operate on it later to retrieve our paths from the data file, filter entries depending on the value of the $fnd
variable, print the results, and prevent navigation.
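The three possible outcomes of this test can be checked with a small sketch. The decide_list function is hypothetical; its first argument plays the role of $fnd and its second the role of $PWD.

```bash
# Hypothetical helper: $1 plays the role of $fnd, $2 the role of $PWD.
# Prints the resulting value of list (empty when navigation stays possible).
decide_list() {
    local fnd=$1 cwd=$2 list=
    [ "$fnd" -a "$fnd" != "^$cwd " ] || list=1
    printf 'list=%s\n' "$list"
}

decide_list ""             /home/user   # list=1  (no arguments at all)
decide_list "^/home/user " /home/user   # list=1  (only -c was passed)
decide_list "foo"          /home/user   # list=   (a query was passed)
```

Only a real query leaves list empty, which is what later allows navigation to happen.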
Completion navigation
# z.sh:138
# _z { ... } / else { .. }
# if we hit enter on a completion just go there
case "$last" in
# completions will always start with /
/*) [ -z "$list" -a -d "$last" ] && builtin cd "$last" && return;;
esac
builtin [shell-builtin [args]] - Run a shell builtin, passing it args, and return its exit status. This is useful when defining a shell function with the same name as a shell builtin, retaining the functionality of the builtin within the function.
cd [directory] - Change the current working directory to directory.
Goal: When the user selects a suggested completion, navigate to the selected suggestion immediately and terminate any further script execution.
Walkthrough: When a user selects a suggestion from the generated completions list, the entire suggestion is passed as the last parameter by the completion built-in function. Because every suggestion is an absolute path taken from the data file, it starts with the "/" character. Don't worry about how this happens; we will explore the completion functions in depth in chapter 10.
Since we know that any last parameter that starts with "/" is a completion selection provided by the built-in completion function, we can test against those conditions by using the case
command on the $last
variable with the /*
pattern.
Once we are dealing with a completion and have entered the case clause, we must ensure that nothing prevents us from performing the navigation. We do this by testing that the variable $list has zero length with the -z unary conditional expression. We also validate that the provided path is an actual directory on the file system by using the -d unary conditional expression on the $last variable.
If both the left and right sides of the binary conditional expression -a
are evaluated as true, we can treat the entire test […]
as true and commit to navigating the user.
If not, and the test returns false, we exit the case clause and never get to the right side of the AND (&&
) operator.
To navigate in a shell environment, we use the cd command. To make sure we run the shell's own cd rather than a shell function of the same name, we call it through the builtin command, passing "cd" as the first argument, followed by the path we want to navigate to.
Lastly, we terminate the script execution with the return
command since we have achieved our goal and have nothing else to do.
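The shortcut can be tried in isolation with the sketch below. The try_completion function is a hypothetical wrapper whose arguments play the roles of $last and $list; each call runs in a subshell so that the demonstration does not change the directory of the shell running it.

```bash
# Hypothetical wrapper: $1 plays the role of $last, $2 the role of $list.
# Returns 0 after navigating, 1 when the shortcut does not apply.
try_completion() {
    local last=$1 list=$2
    case "$last" in
        /*) [ -z "$list" -a -d "$last" ] && builtin cd "$last" && return 0;;
    esac
    return 1
}

target=$(mktemp -d)
( try_completion "$target" ""  && echo "navigated to the completion" )
( try_completion "$target" "1" || echo "list is set: no navigation" )
( try_completion "docs" ""     || echo "not an absolute path: no navigation" )
rmdir "$target"
```

Navigation happens only in the first call, where the path is absolute, exists as a directory, and list is empty.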
No data file
# z.sh:144
# _z { ... } / else { .. }
# no file yet
[ -f "$datafile" ] || return
Goal: Validate that we have an existing data file or terminate the script execution otherwise.
Walkthrough: Test that $datafile exists and is a regular file using the -f unary conditional expression inside a test […] command. If the test is true, we skip the right side of the OR (||); if not, we terminate execution with the return command.
Part conclusion
Congratulations on making it to the end of this part. In this part, you learned about completions generation flow, how to deal with user input, and how to deal with edge cases. In the next part, we'll figure out how the actual navigation logic happens, its execution logic, and how the z command is exposed as an executable. Whenever you are ready, hit the following part link. See you there!