Thursday, November 26, 2009

More Fun With Bash Quoting

I've written about bash quoting before, and yes, it's about as exciting as watching paint dry or listening to the corn grow.

It can also be extremely frustrating when it doesn't do what you want, case in point: trying to script the updating of a field in a mysql table when the field to be changed contains quote characters.

Let's imagine we have a simple table with the following data and we would like to change the name field:
+----+----------------------+---------------------+
| id | name                 | create_date         |
+----+----------------------+---------------------+
|  1 | name 'with' "quotes" | 2009-11-19 08:48:59 |
+----+----------------------+---------------------+

Your first script might look something like this:
#!/bin/bash

USERNAME=${USERNAME:-user}
PASSWORD=${PASSWORD:-pwd}

mysql_cmd="mysql -u $USERNAME -p$PASSWORD test"

# Remove mysql header line.
function remove_header()
{
    shift
    echo $*
}

id=1
name=$(remove_header $($mysql_cmd -e "SELECT name FROM atable WHERE id='$id'"))

new_name="$name and more"
echo mysql -e "UPDATE atable SET name='$new_name' WHERE id='$id'"
$mysql_cmd -e "UPDATE atable SET name='$new_name' WHERE id='$id'"

# vim: tabstop=4: shiftwidth=4: noexpandtab:
# kate: tab-width 4; indent-width 4; replace-tabs false;

And when you run it, it will puke something like this:
$ bash badfix.sh
mysql -e UPDATE atable SET name='name 'with' quotes and more' WHERE id='1'
ERROR 1064 (42000) at line 1:
    You have an error in your SQL syntax;
    check the manual that corresponds to your MySQL server version
    for the right syntax to use near 'with' quotes and more' WHERE id='1''
    at line 1

Note, the function at the top (remove_header) removes the header line from the mysql output so that we don't get the name of the field included in the data.

We all know the solution here: we need to escape the quotes in the value so that both bash and mysql are happy.

However, this turns out to be easier said than done, and perhaps I missed the obvious, but after numerous attempts (on more than one occasion) the following finally did the trick:
#!/bin/bash

USERNAME=${USERNAME:-user}
PASSWORD=${PASSWORD:-pwd}

mysql_cmd="mysql -u $USERNAME -p$PASSWORD test"

# Remove mysql header line.
function remove_header()
{
    shift
    echo $*
}

# Quote any quotes in a mysql value.
function fix_quotes()
{
    local  val="$*"
    if [[ "$val" =~ .*\'.* ]]; then #'
        echo "String contains single quotes: $val" >&2
        #val="${val//'/\\\\'}"
        val=$(sed -e "s/'/' \"'\" '/g" <<<"$val")
        echo New Value: "$val" >&2
    fi
    echo "$val"
}



id=1
name=$(remove_header $($mysql_cmd -e "SELECT name FROM atable WHERE id='$id'"))

fixed_name="$(fix_quotes "$name") and more"
echo mysql -e "UPDATE atable SET name='$fixed_name' WHERE id='$id'"
$mysql_cmd -e "UPDATE atable SET name='$fixed_name' WHERE id='$id'"

# vim: tabstop=4: shiftwidth=4: noexpandtab:
# kate: tab-width 4; indent-width 4; replace-tabs false;

The fix_quotes function only checks for single quotes since our mysql value is contained in single quotes:
$mysql_cmd -e "UPDATE atable SET name='$fixed_name' WHERE id='$id'"
#                                     ^           ^

As you would expect, we don't need to escape double quotes inside single quotes for mysql. However, if we wanted to use a literal value in our SQL command we would need to escape double quotes since our SQL command is contained inside double quotes:
$mysql_cmd -e "UPDATE atable SET name='quoted \"value\"' WHERE id='$id'"
#             ^                                                        ^

We need to escape them in this case for bash's benefit and not for mysql: bash will "remove" the backslashes before passing the command to mysql.

One of my initial attempts (which you can see commented out in the code) was to try to change the value directly using a bash assignment statement.

I tried to change each single quote to an escaped single quote:
val="${val//'/\\\\'}"

Interestingly, this does not modify the string at all, a result that I don't quite understand. I tried a similar thing using sed and that also did not work.

The solution that finally worked is based on the fact that mysql, like C++, concatenates adjacent strings into a single string.

So, I change (using sed) all single quotes inside the string into the sequence: single-quote, space, double-quote, single-quote, double-quote, space, single-quote.

You may notice that the double quotes are escaped, but that's for escaping within the sed command, those don't make it into the value that's passed to mysql.

val=$(sed -e "s/'/' \"'\" '/g" <<<"$val")

Running this final version does the trick:
$ bash fix.sh
String contains single quotes: name 'with' \"quotes\"
New Value: name ' "'" 'with' "'" ' \"quotes\"
mysql -e UPDATE atable SET name='name ' "'" 'with' "'" ' \"quotes\" and more' WHERE id='1'

and you can see the result in the table.
+----+-------------------------------+---------------------+
| id | name                          | create_date         |
+----+-------------------------------+---------------------+
|  1 | name 'with' "quotes" and more | 2009-11-19 08:48:59 |
+----+-------------------------------+---------------------+
Have we had enough quoting yet????

No comments:

Post a Comment