Tag Archives: filter_input_array()

Never Use $_GET Again

Please don’t need to use $_GET or $_POST anymore. In fact, probably we shouldn’t use $_GET and $_POST anymore. Since PHP 5.2, there is a new and better way to safely retrieve user-submitted data.

The clever developers have constructed a library that analyzes data and escapes it appropriately. But the problem of validating and sanitizing input is still a substantial issue. Many seasoned PHP developers still spend precious development cycles building custom code to filter input.

PHP (from 5.2 onward) has a built-in filtering system that makes the tasks of validating and sanitizing data trivially easy. Rather than accessing the $_GET and $_POST superglobals directly, you can make use of PHP functions like filter_input() and filter_input_array(). Let’s take a quick look at an example:

$my_string = filter_input(INPUT_GET, ‘my_string’, FILTER_SANITIZE_STRING);

The code above is roughly the equivalent of retrieving $_GET[‘my_string’] and then running it through some sort of filter that strips HTML and other undesirable characters. This represents data sanitization, one of the two things that the filtering system can do. These are the two tasks of the filtering system:

* Validation: Making sure the supplied data complies with specific expectations. In this mode, the filtering system will indicate (as a boolean) whether or not the data matches some criterion.
* Sanitizing: Removing unwanted data from the input and performing any necessary type coercion. In this mode the filtering system returns the sanitized data.

By default, the filter system provides a menagerie of filters ranging from validation and sanitization of basic types (booleans, integers, floats, etc.) to more advanced filters which allow regular expressions or even custom callbacks.

The utility of this library should be obvious. Gone are the days of rolling our own input checking tools. We can use a standard (and better performing) built-in system.

If we take things one step further than merely presenting this as an option. We can say that we should no longer directly access superglobals containing user input. There is simply no reason why we should. And the plethora of security issues related to failure to filter input provides more than sufficient justification for my claim. Always use the filtering system. Make it mandatory.

“But,” one might object, “what if I don’t want my data filtered?” The filtering system provides a null filter (FILTER_UNSAFE_RAW). In cases where the data needn’t be filtered (and these cases are rare), one ought to use something like this:

$unfiltered_data = filter_input(FILTER_GET, ‘unfiltered_data’, FILTER_UNSAFE_RAW);

Following this pattern provides a boon: We can very quickly discover all of the unfiltered variables in the code by running a simple find operation looking for the FILTER_UNSAFE_RAW constant. This is much easier than hunting through calls to $_GET to find those that are not correctly validated or sanitized. Risky treatment of input can be managed more efficiently by following this pattern.

Filters won’t solve every security-related problem, but they are a tremendous step in the right direction when it comes to writing safe (and performant) code. It’s also simpler. Sure, the function call is longer, but it relieves developers of the need to write their own filtering systems. These are darn good reasons to never use $_GET (or $_POST and the others) again.