BourneShellObscureErrorRoots

By admin
Suppose that you’re writing Bourne shell code that involves using some commands in a subshell to capture some information into a shell variable, ‘AVAR=$(….)’, but you accidentally write it with a space after the ‘=’. Then you will get something like this:


$ AVAR= $(… | wc -l)
sh: 107: command not found

So, why is this an error at all, and why do we get this weird and obscure error message? In the traditional Unix and Bourne shell way, this arises from a series of decisions that were each sensible in isolation.

To start with, we can set shell variables and their grown-up friends, environment variables, with ‘AVAR=value’ (note the lack of spaces). You can erase the value of a shell variable (but not unset it) by leaving the value out, ‘AVAR=’. Let’s illustrate:

$ export FRED=value
$ printenv | fgrep FRED
FRED=value
$ FRED=
$ printenv | fgrep FRED
FRED=
$ unset FRED
$ printenv | fgrep FRED
$ # ie, no output from printenv
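The difference between an empty variable and an unset one can also be checked from inside the shell itself. The following is a minimal sketch using the standard ‘${VAR+word}’ parameter expansion; the names state_empty and state_unset are made up for this illustration.

```shell
# ${FRED+x} expands to "x" if FRED is set (even to an empty value),
# and to nothing if FRED is unset.
FRED=value
FRED=
if [ -n "${FRED+x}" ]; then state_empty="set"; else state_empty="unset"; fi

unset FRED
if [ -n "${FRED+x}" ]; then state_unset="set"; else state_unset="unset"; fi

echo "after 'FRED=':      $state_empty"
echo "after 'unset FRED': $state_unset"
```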

Long ago, the Bourne shell recognized that you might want to only temporarily set the value of an environment variable for a single command. It was decided that this was a common enough thing that there should be a special syntax for it:

$ PATH=/special/bin:$PATH FRED=value acommand

This runs ‘acommand’ with $PATH changed and $FRED set to a value, without changing (or setting) either of them for anything else. We have now armed one side of our obscure error, because if we write ‘AVAR= ….’ (with the space), the Bourne shell will assume that we’re temporarily erasing the value of $AVAR (or setting it to a blank value) for a single command.
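As a small sketch of the temporary assignment, the following runs a child ‘sh’ as the single command and then checks the variable afterward. It deliberately uses an external command; the variable names inside and outside are invented for the example.

```shell
# Start with FRED unset in the current shell.
unset FRED

# Set FRED only in the environment of this one command (a child sh).
inside=$(FRED=value sh -c 'printf "%s" "$FRED"')

# Back in the current shell, FRED is still unset;
# ${FRED-unset} expands to the word "unset" when FRED is not set.
outside=${FRED-unset}

echo "inside the command: $inside"
echo "afterwards:         $outside"
```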

The second part is that the Bourne shell allows commands to be run to be named through indirection, instead of having to be written out directly and literally. In Bourne shell, you can do this:

$ cmd=echo; $cmd hello world
hello world
$ cmd="echo hi there"; $cmd
hi there

The Bourne shell doesn’t restrict this indirection to direct expansion of environment variables; any and all expansion operations can be used to generate the command to be run and some or all of its arguments. This includes subshell expansion, which is written either as $(…) in the modern way or as `…` in the old way (those are backticks, which may be hard to see in some fonts). Doing this even for ‘$(…)’ is reasonably sensible, probably sometimes useful, and definitely avoids making $(…) a special case here.
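To make this concrete, here is a minimal sketch in which the command to run comes out of a ‘$(…)’ rather than a variable. The ‘printf’ calls merely stand in for any command that prints a command name; out1 and out2 are names invented for the example.

```shell
# The command name itself is produced by a command substitution.
out1=$( $(printf 'echo') hello world )

# Here the substitution supplies the command name and its first argument;
# the shell's word splitting separates "echo" from "hi".
out2=$( $(printf 'echo hi') there )

echo "$out1"
echo "$out2"
```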

So now we have our perfect storm. If you write ‘AVAR= $(….)’, the Bourne shell first sees ‘AVAR=’ (with the space) and interprets it as you running some command with $AVAR set to a blank value. Then it takes the ‘$(…)’ and uses it to generate the command to run (and its command line). When your subshell prints out its results, for example the number of lines reported by ‘wc -l’, the Bourne shell will try to use that as a command and fail, resulting in our weird and obscure error message. What you’ve accidentally written is similar to:

$ cmd=$(… | wc -l)
$ AVAR= $cmd

(Assuming that the $(…) subshell doesn’t do anything different based on $AVAR, which it probably doesn’t.)
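The whole failure, and the fix, can be reproduced in a few lines. This is only a sketch: the two-line ‘printf’ pipeline is a stand-in for whatever your real pipeline is, the mistaken line is run in a child ‘sh’ so that its error can be captured, and the exact wording of the error message varies from shell to shell.

```shell
# The mistake: a space after '='. The shell runs the output of wc -l
# (here "2") as a command, with AVAR set to an empty value for it.
status=0
err=$(sh -c 'AVAR= $(printf "a\nb\n" | wc -l)' 2>&1) || status=$?

# The fix: no space, so the output of $(...) becomes the value of AVAR.
AVAR=$(printf 'a\nb\n' | wc -l)
AVAR=$((AVAR))   # normalize, since some wc versions pad the count with spaces

echo "error from the mistaken version: $err"
echo "exit status: $status"    # 127 is the conventional 'command not found' status
echo "AVAR with the fix: $AVAR"
```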

It’s hard to see any simple change in the Bourne shell that could avoid this error, because each of the individual parts is sensible in isolation. It’s only when they combine like this that a simple mistake compounds into a weird error message.

(The good news is that shellcheck warns about both parts of this, in SC1007 and SC2091.)