How do I use taint mode?

What is taint mode?

Taint mode is a way of making your code more secure. It means that your program will be fussier about data it receives from an external source. External sources include users, the file system, the environment, locale information, other programs and some system calls (e.g. readdir).

Why should I use taint mode?

Some functions can unexpectedly cause problems with bad data. For example, the “magical” properties of the Perl open function mean that it may be used to open a pipe to any arbitrary shell commands. It’s in your interests to ensure that your program is opening the file that you think it should be opening, and not a command line supplied by a mischievous user.
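As a rough illustration of that "magical" behaviour, the sketch below assumes a Unix-like system with an echo command and is meant to be run without -T. The string in $input stands in for hypothetical user-supplied data, and the variable names are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical user input: the trailing "|" makes two-argument open
# run the text as a command instead of opening a file.
my $input = 'echo this came from a shell command |';

# Two-argument open reads the mode characters from the string itself.
open(my $pipe, $input) or die "Cannot open '$input': $!";
my $line = <$pipe>;
close $pipe;
print "Two-argument open ran a command: $line";

# Three-argument open treats the same string purely as a file name,
# so it simply fails unless a file with that odd name exists.
my $opened_as_file = open(my $fh, '<', $input) ? 1 : 0;
print "Three-argument open found a file: ",
    ($opened_as_file ? 'yes' : 'no'), "\n";
close $fh if $opened_as_file;
```

Taint mode is a backstop for this kind of mistake rather than a replacement for the three-argument form of open.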

When should I use taint mode?

There are differing opinions on when to use taint mode. Certainly as a minimum, any CGI application should use taint mode. When accepting external data you should always program defensively and use taint mode to ensure that the external data matches your expectations.

Some people argue that taint mode should always be used as it forces you to consider the implications of your use of external data. This is also useful because if you write your program without taint mode and then decide you need taint mode later on, you may have a lot of work adding the checks required.

How do I turn taint mode on?

You turn taint mode on by using the -T flag in your hashbang line. For example:

    #!/usr/bin/perl -T

Or if you turn warnings on using -w:

    #!/usr/bin/perl -Tw

Taint mode cannot be turned off in a script once it has been turned on. Note that the -T argument is read by Perl even on those platforms where the hashbang line itself is not used by the operating system.
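To confirm that taint mode is actually in effect, a script can inspect Perl's read-only ${^TAINT} variable. This is a minimal sketch; the sub name taint_mode_is_on and the file name in the comment are illustrative.

```perl
use strict;
use warnings;

# Run as: perl -T check_taint.pl   (check_taint.pl is a placeholder name)

# ${^TAINT} is 1 under -T and 0 when taint mode is off
# (it is -1 under -t, which only warns about taint violations).
sub taint_mode_is_on {
    return ${^TAINT} == 1 ? 1 : 0;
}

my $status = taint_mode_is_on() ? 'on' : 'off';
print "Taint mode is $status\n";
```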

What does taint mode do?

When your program receives any data in taint mode, that data is marked as tainted. Tainted data may not be used to affect anything outside your program (for example, to open a file for writing, or as part of a command passed to system), until you have specifically untainted it. Merely reading a file whose name is tainted is permitted, because that does not modify anything.

If you assign a variable a tainted value, that variable is also tainted. For example:
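A minimal sketch of such a program, assuming it is run with -T and given a file name as its first argument. The names example.pl, output.txt and open_for_writing are illustrative, not taken from any particular script.

```perl
use strict;
use warnings;

# Run as: perl -T example.pl output.txt
# (example.pl and output.txt are placeholder names.)

# Open the named file for writing and return the filehandle.
sub open_for_writing {
    my ($filename) = @_;
    my $target = $filename;    # $target is tainted whenever $filename is
    open(my $fh, '>', $target) or die "Cannot open '$target': $!";
    return $fh;
}

if (@ARGV) {
    # $ARGV[0] comes from outside the program, so it is tainted under -T
    # and the open inside open_for_writing is refused.
    my $fh = open_for_writing($ARGV[0]);
    print {$fh} "hello\n";
    close $fh;
}
```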

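This program would fail with an error of the following form (the operation named, the script name and the line number depend on the program):

    Insecure dependency in open while running with -T switch at <script> line <N>.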

How do I untaint data?

To untaint data, match it against a regular expression; whatever you capture in parentheses ($1, $2 and so on) is then untainted. When writing your regular expressions, remember that it is safer to list the characters you allow than to try to exclude the ones you don't.

For the example above, assume we want the argument passed in on the command line to be a filename (not a path to a file), so we will only allow an argument made up of alphanumeric characters, dots and underscores.

It is important to anchor your regular expression (use ^ and $ to ensure you match the entire string) and it is equally important to check that your regular expression has succeeded.
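The rules above can be sketched as follows. The sub name untaint_filename and the fallback value example.txt are illustrative assumptions; the pattern is one reasonable reading of "alphanumeric characters, dots and underscores".

```perl
use strict;
use warnings;

# Return an untainted copy of $candidate if it looks like a plain
# filename, or undef if it does not.
sub untaint_filename {
    my ($candidate) = @_;
    return unless defined $candidate;

    # Anchored allow-list: only letters, digits, dots and underscores.
    if ($candidate =~ /^([A-Za-z0-9._]+)$/) {
        return $1;    # the captured text is untainted
    }
    return;           # the check failed: do not use the value
}

my $arg      = defined $ARGV[0] ? $ARGV[0] : 'example.txt';
my $filename = untaint_filename($arg);

if (defined $filename) {
    print "Safe to use: $filename\n";
} else {
    print "Rejected: '$arg' is not an acceptable filename\n";
}
```

Because $ also matches just before a trailing newline, \z in place of $ is a stricter anchor. This pattern also still accepts names made only of dots, such as .., so tighten it if that matters for your program.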

Re-tainting

When writing modules and subroutines, you need to be careful that your code doesn’t untaint data where it shouldn’t. For example, the CGI module used to untaint all parameters because it used a regular expression with capturing parentheses to capture the data.

Using the above example, if we want the data to remain tainted despite our check, we could add use re 'taint'; to the code, which makes values captured from tainted data stay tainted.
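One way this might look, as a sketch: the re pragma's taint option is lexically scoped, so it is placed inside the checking sub here. The sub name check_filename and the fallback value are illustrative; Scalar::Util's tainted function is used only to report the result.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Validate a filename without untainting it.
sub check_filename {
    my ($candidate) = @_;
    use re 'taint';    # captures from tainted data stay tainted in this scope
    if (defined $candidate && $candidate =~ /^([A-Za-z0-9._]+)$/) {
        return $1;
    }
    return;
}

my $arg  = defined $ARGV[0] ? $ARGV[0] : 'example.txt';
my $name = check_filename($arg);

if (defined $name) {
    printf "%s passed the check and is %s\n",
        $name, tainted($name) ? 'still tainted' : 'not tainted';
} else {
    print "Rejected: '$arg'\n";
}
```

The value is reported as still tainted only when the script runs under -T and the name really came from outside the program, such as from the command line.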

Using the still-tainted value to open a file for writing would then fail like the first example, with an Insecure dependency error.

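If you then want to use capturing parentheses to untaint data, you can use no re 'taint'; inside a limited scope, which restores the normal untainting behaviour of captures within that scope.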
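A sketch of that arrangement, assuming re 'taint' is enabled for the whole file and switched off only inside the one sub that is meant to untaint. The sub names untaint_in_scope and first_word are illustrative.

```perl
use strict;
use warnings;
use re 'taint';    # for the rest of this file, captures stay tainted

# The one place where untainting is intended.
sub untaint_in_scope {
    my ($candidate) = @_;
    no re 'taint';    # within this sub only, captures untaint as usual
    if (defined $candidate && $candidate =~ /^([A-Za-z0-9._]+)$/) {
        return $1;
    }
    return;
}

# Elsewhere, re 'taint' is still in effect, so this capture does not
# untaint its input.
sub first_word {
    my ($text) = @_;
    return $text =~ /^(\w+)/ ? $1 : undef;
}

my $arg  = defined $ARGV[0] ? $ARGV[0] : 'example.txt';
my $name = untaint_in_scope($arg);
print defined $name ? "Untainted: $name\n" : "Rejected: '$arg'\n";
```

Keeping the no re 'taint'; scope as small as possible makes it easy to audit exactly where data is being untainted.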
