Python, tuples, sequences, and parameterized SQL queries

I recently developed a teaching tool using the Python Flask framework to demonstrate SQL injection and XSS (cross-site scripting) vulnerabilities and how to remediate them.

The remediation step for SQL injection tripped me up though when I received the following error message:

sqlite3.ProgrammingError: Incorrect number of bindings supplied. The current statement uses 1, and there are 4 supplied.

The code that was being executed was:

cursor.execute("INSERT INTO query VALUES (NULL, ?)", (query_text))

(The value of query_text submitted was ‘test’.)

This looks very similar to the example in the documentation for sqlite3.Cursor.execute(), but the key difference here is that my query had one parameterized value instead of 2 or more.

Why does that make a difference? Because a tuple containing a single value must be followed by a trailing comma. (See the documentation for the tuple data type in Python.)

Without the trailing comma, ('test') becomes the string 'test'. Strings are a particular type of Python sequence, a sequence of characters. This is why the error message indicated that 4 bindings were supplied: the bindings were 't', 'e', 's', and 't'.

The solution, therefore, was to add a comma to the existing code:

cursor.execute("INSERT INTO query VALUES (NULL, ?)", (query_text,))

At first I thought the documentation could be improved. If parameters in the qmark format can take any sequence type, shouldn’t it state that explicitly? For example, using a Python list instead would solve this problem, since a list can contain a single item without the need for a trailing comma:

[query_text]

However, it doesn’t make sense to use a Python list to send parameters to a SQL query, because Python lists are mutable. A list can be sorted, which could change the order of your parameters. A list can have items appended and removed, which would change the number of parameters. A tuple, on the other hand, is immutable: it will have the same number of items in the same order as when it was defined. Therefore, it is the preferred sequence data type to use to send parameter values to a SQL statement.

The trailing comma needed by a tuple containing a single item is somewhat unfortunate, as the Python tutorial on the tuple data type admits:

A special problem is the construction of tuples containing 0 or 1 items: the syntax has some extra quirks to accommodate these. Empty tuples are constructed by an empty pair of parentheses; a tuple with one item is constructed by following a value with a comma (it is not sufficient to enclose a single value in parentheses). Ugly, but effective.
[Emphasis added]

This is something that regular Python programmers get used to, but for a beginner or casual user will likely cause some confusion.

Leave a Reply

Your email address will not be published. Required fields are marked *