Paul M. Jones | Sanitation with PHP filter

Sanitation with PHP filter_var()

I'm adding a combined validate-and-sanitize class to Solar, Solar_DataFilter. It uses some of the new filter extension functions internally.

However, I found a problem with the "float" sanitizing function in the 5.2.0 release, and thought others might want to be aware of it. In short, if you allow decimal places, the sanitizer allows any number of decimal points, not just one, and it returns an un-sanitary float.

I entered a bug on it, the text of which follows:

Description:
------------
When using FILTER_SANITIZE_NUMBER_FLOAT with FILTER_FLAG_ALLOW_FRACTION,
it seems to allow any number of decimal points, not just a single
decimal point.  This results in an invalid value being reported as
sanitized.

Reproduce code:
---------------
<?php
$val = 'abc ... 123.45 ,.../';
$san = filter_var($val, FILTER_SANITIZE_NUMBER_FLOAT,
    FILTER_FLAG_ALLOW_FRACTION);
var_dump($san);
?>

Expected result:
----------------
float 123.45

Actual result:
--------------
string(12) "...123.45..."

The bug has been marked as bogus, with various reasons and explanations that all make sense to the developers. "You misunderstand its use" and "it behaves the way we intended it to" seem to be the summary responses.

However, I would argue that the intended behavior is at best naive and of only minimal value. If I'm sanitizing a value to be a float, I expect to get back a float, or a notification that the value cannot be sanitized to become a float ... but maybe that's just me.

Regardless, I'm not going to belabor the point any further; I'll just avoid that particular sanitizing filter.

Update: Pierre responds with, essentially, "RTFM." I agree that the manual describes exactly what FILTER_SANITIZE_NUMBER_FLOAT does. My point is that what it does is not very useful. I think it's reasonable to expect that a value, once sanitized, should pass its related validation, and the situation described in the above bug report indicates that it will not. My opinion is that the filter should either (1) attempt to extract a float value, or (2) indicate in some way that the value cannot be reasonably sanitized (in the sense that the returned value is not "sane"). Since it does not, and since the developers seem unwilling to accept that approach, I'll just avoid using that filter and write my own.
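For illustration, here is a minimal sketch of what option (2) might look like in userland: sanitize first, then run the result through the matching validator, and report failure instead of handing back an unsanitary string. The function name sanitize_to_float() is hypothetical (it is not part of PHP or Solar), and returning null on failure is just one assumed way to signal "cannot be sanitized."

```php
<?php
// Hypothetical helper, not part of PHP or Solar: sanitize, then validate.
function sanitize_to_float($value)
{
    // Step 1: strip everything except digits, signs, and decimal points.
    $clean = filter_var($value, FILTER_SANITIZE_NUMBER_FLOAT,
        FILTER_FLAG_ALLOW_FRACTION);

    // Step 2: check that what is left actually parses as a float.
    $float = filter_var($clean, FILTER_VALIDATE_FLOAT);

    // Assumed convention: null means "could not be sanitized to a float".
    return ($float === false) ? null : $float;
}

var_dump(sanitize_to_float('abc ... 123.45 ,.../')); // NULL
var_dump(sanitize_to_float(' 123.45 '));             // float(123.45)
```

With this approach, anything that comes back non-null is guaranteed to pass FILTER_VALIDATE_FLOAT, because it already has.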

Update 2: Something just occurred to me. Pierre says in the comments that accepting "abc ... 123.45 ,.../" to create a float is a bad idea. Yet the PHP float sanitizer will happily accept "123.abc,/45" and return a float that will validate. Is *that* a good idea? If so, why?
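The contrast can be checked directly. The following sketch runs both inputs through the sanitizer and then through FILTER_VALIDATE_FLOAT; the variable names are arbitrary, and the first input is the one from the bug report.

```php
<?php
$flags = FILTER_FLAG_ALLOW_FRACTION;

// Junk in the middle of the digits is stripped, leaving one decimal point.
$a = filter_var('123.abc,/45', FILTER_SANITIZE_NUMBER_FLOAT, $flags);
var_dump($a);                                    // string(6) "123.45"
var_dump(filter_var($a, FILTER_VALIDATE_FLOAT)); // float(123.45)

// Junk around the digits leaves the stray decimal points behind.
$b = filter_var('abc ... 123.45 ,.../', FILTER_SANITIZE_NUMBER_FLOAT, $flags);
var_dump($b);                                    // string(12) "...123.45..."
var_dump(filter_var($b, FILTER_VALIDATE_FLOAT)); // bool(false)
```

So the input whose digits were split apart by garbage ends up validating, while the input with an intact "123.45" in it does not.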
