PHP Best Practices
by Craig Buchek
St. Louis GNU/Linux Users Group
December 20, 2007

Intro
Overview
PHP
PHP 5
When to Use PHP
Starting a New Project
Here's a list of things to do when starting a new project using PHP:
File Layout
If you can, place the following files and directories ABOVE the HTDOCS directory.
If possible, set up a hierarchy of multiple sub-sites. (Can set up a .htaccess file to implement these from a single site.)
If that's not possible, sub-directories for dev and testing would be a good idea.
As a rule, if you have more than about 5-10 PHP files, start using subdirectories. Subdirectories should gather related functionality. For example, if you've got a set of PHP scripts dealing with calendar events, put them in a subdirectory named 'events', 'event', or 'calendar'.
A good directory hierarchy might look like this:
    mysite.com
        production (or www)
            htdoc (or public)
                .htaccess
                index.php
                xyz.php
                js (or script(s) or javascript)
                    xyz.js
                css (or style(s))
                    xyz.css
                images (or graphics or pictures)
                admin
                    .htaccess (to ensure only admin users can access, or use PHP authentication)
                thing1
            include
                config.php (well-commented location for any global settings)
                common.php (frequently used routines)
                database.php (initialize the database connection, routines for working w/ the DB)
                classname.php (prefer this format; use lowercase and (maybe) underscores, or CamelCase)
                    (CamelCase makes it clear it's a class containing that specific name.)
            cgi-bin (if necessary -- hopefully not)
            bin (if PHP code needs to call any custom CLI commands)
            doc
                LICENSE.txt (if Open Sourced)
                TODO.txt
                IDEAS.txt
                README.txt
        staging (or beta)
        test
        development
        backups
If your site is simple, you can leave some of those out.
PHP CLI
Invoke with -a for interactive mode.
    Still need <? open tag.
    Can use readline to edit input (if enabled).
    Outputs immediately. (But leaves you on the output line; hit enter.)
    CTRL+D (at beginning of line) to end input.
Invoke with code specified on command-line (without <? tags) by giving -r:
    php -r '$name = "Craig"; echo "Hello, $name.\n";'
As the man page says, this is similar to running eval() on the passed-in string.
You can specify a command to run on each line of input from stdin with -R:
    php -R '$name = $argn; echo "Hello, $name ($argi).\n";'
The special variable $argn provides the text of each successive line; $argi gives the line number.
The -F flag is similar to -R, but the PHP code comes from a file instead. You can also specify code to run before and after all the input lines, using -B and -E, respectively.
Running php -i is about the same as calling phpinfo().
You can use php -w to output code stripped of comments and whitespace. I would have thought that this would be useful for counting lines of code, but it also removes line-feed characters. So I have no idea what use it is. Since PHP code doesn't get sent across the wire, there's no sense in compacting it. I came up with this to count lines of code:
    grep -E -h -v '^[[:blank:]]*([[:punct:]]{0,2}|#.*|//.*)$' *.php | wc
It ignores blank lines, comment lines, and lines with only white space and 1 or 2 punctuation characters. The only thing it doesn't handle is /* */ style comments, which would likely require a PHP parser.
You can get help on a function, class, or extension with --rf, --rc, or --re. Unfortunately, the help is not very helpful; it only gives signatures.
Running php -s will output a syntax-highlighted version of the code. The output is HTML, with spans using hard-coded colors. As far as I can tell, there's no way to change the colors or anything.
TODO: Write a function to determine if we were called from CLI, mod_php, CGI, or FastCGI.
Refactoring
Coding Conventions
These are my personal preferences.
    switch ( $case ) {
        case 1:
        case 2:
            some_code();
            break;
        case 3:
            do_something();
            do_something_else();
            break;
        default:
            blah();
            break;
    }
General Tips
General Coding
In order to prevent errors where you assign a variable in an if statement, instead of comparing the variable to a constant, put the constant BEFORE the variable:
    if ( 1 == $my_variable )    # preferred: a mistyped '=' becomes a parse error
    if ( $my_variable == 1 )    # risky: a mistyped '=' silently assigns
Return from functions as soon as possible, whenever you can. This decreases the amount of nesting, so when reading the code, you have less logic to keep in your head. Instead of:
    if ( $days_left < $DAYS_TO_SHOW_COUNTDOWN ) {
        ... show countdown ...
    }
use:
    if ( $days_left >= $DAYS_TO_SHOW_COUNTDOWN )
        return;
    ... show countdown ...
To decrease the amount of nesting and repeated code, set variables to some defaults, and continue on. Instead of:
    if ( condition1 ):
        // do thing1 to x
    else:
        if ( condition2 ):
            // do thing2 to x
            // do thing1 to x
        else:
            // do thing3 to x
            // do thing2 to x
            // do thing1 to x
        endif;
    endif;
or:
    if ( condition1 ):
        // do thing1 to x
    elseif ( condition2 ):
        // do thing2 to x
        // do thing1 to x
    else:
        // do thing3 to x
        // do thing2 to x
        // do thing1 to x
    endif;
use:
    if ( !condition1 && !condition2 ):
        // do thing3 to x
    endif;
    if ( !condition1 ):
        // do thing2 to x
    endif;
    // do thing1 to x
Note that the first test must check both conditions; testing !condition2 alone would run thing3 even when condition1 is true.
Quoting
Single Quotes
Double Quotes
Double-quotes and here documents will interpolate variables.
There are 2 syntaxes for complex variable interpolation:
    "${obj->field}"
    "{$obj->field}"
The 2nd form seems to be preferred.
As of PHP 5.0, string interpolation will work with function and method calls, though not as you might expect.
For function calls, it must be a variable function:
    function xyz() {return 5;}
    $x = 'xyz';
    echo "{$x()}\n";
For function calls and method calls, the ${} syntax does not work.
For method calls:
    class X { function y() {return 5;} }
    $x = new X;
    echo "{$x->y()}\n";
NOTE: PHP takes more time and memory to interpolate than to concatenate. In Ruby, it is definitely preferred (and faster) to interpolate.
HTML Mode
    function print_some_html($author) {
        ?>
        <h2>July 3, 2007</h2>
        <p>It was a nice day; the sun was shining, and the birds were singing.</p>
        <?php
        echo "<p> -- $author</p>\n"; # Even this is a bit questionable.
    }
The second-best choice is to use a here document. (See below.)
If you've got a PHP variable you want to print in HTML mode, do it like this:
    <p>So I went to the store with <?php echo $friend; ?> today.</p>
or with short tags, like this:
    <p>So I went to the store with <?= $friend ?> today.</p>
Here Documents
    $mailer->body = @<<<FORM
    Firstname = {$form['firstname']}
    Lastname = {$form['lastname']}
    FORM;
HTML Within PHP Loops
    <tr>
      <td colspan="4" align="left" valign="top">
        <span class="fieldLabel">Race:</span> <br>
        <? foreach($RACE_ARRAY as $key=>$value): ?>
          <input name="RACE" class="formText" type="radio" value="<? echo $key; ?>" <? echo $RACE == $key ? 'checked="checked"' : ''; ?>> <? echo $value; ?> <br>
        <? endforeach; ?>
      </td>
    </tr>
PHP Tags
It's preferable not to use short tags, as short tags are not XHTML or XML compliant. However, it's not expected that PHP code should be valid, only the output of a PHP document after it has been run through the PHP processor. So this requirement is not too important, unless you plan to run your PHP code through any XML tools, or if you're writing a library of code that others might use.
If short tags are enabled, you cannot directly include the XML declaration in the XML prolog. Instead, you'll have to have PHP echo it:
    <?= '<' . '?xml version="1.0" ?' . '>' ?>
Note that I separated the question marks from the angle brackets, because some text editors would have problems interpreting the PHP otherwise.
However, short tags are great for inserting the values of variables into the HTML portion:
    <p>So I went to the store with <?= $friend ?> today.</p>
If you do use short tags, ALWAYS put a space after the <? (and before the ?>), as otherwise the processing instruction is ambiguous, and if you use any other tools on the file that recognize PIs, they will fail.
In addition to long tags (<?php echo "hello";?>) there is the <SCRIPT> syntax:
    <script language="php">
    #<![CDATA[
    echo "Some HTML editors don't like processing instructions.";
    #]]>
    </script>
I would not be surprised, however, if some PHP editors don't know to look in there for PHP code.
The following code MIGHT work for replacing all short open tags with long open tags:
    find -name '*.php' | xargs perl -pi -e 's/<\?= ?(.*?) ?\?>/<?php echo($1); ?>/g'
    find -name '*.php' | xargs perl -pi -e 's/<\?/<?php/g'
    find -name '*.php' | xargs perl -pi -e 's/<\?phpphp/<?php/g'
(from http://us2.php.net/manual/en/language.basic-syntax.php#60798)
For files that contain only PHP code, the closing tag ("?>") is not required by PHP. Not including it prevents trailing whitespace from being accidentally injected into the output. (From Doctrine style guidelines.)
Conclusions:
Include Files
Try to place include files outside of DocumentRoot. Preferably in the directory above DocumentRoot.
Don't name include files *.inc, as web servers will generally serve them uninterpreted, allowing attackers to see the source code, and potentially see database passwords. Just name them *.php.
Add an .htaccess file to the includes directory to prevent direct access.
Add this to top-level Apache config or .htaccess file, just in case someone uses *.inc files.
    <Files ~ "\.inc$">
        Order allow,deny
        Deny from all
    </Files>
Check to see if someone is trying to run an include file directly, because include files running out of context might do something unpredictable and give up sensitive info.
    if ( realpath(__FILE__) == realpath($_SERVER['SCRIPT_FILENAME']) )
        exit ('Cannot run this file directly!');
This can also be used to run unit tests if an include file is run directly.
    if ( realpath(__FILE__) == realpath($_SERVER['SCRIPT_FILENAME']) )
        run_unit_tests();
Don't override the include_path; extend it:
    set_include_path('/new/path/to/include' . PATH_SEPARATOR . get_include_path());
require_once/require/include_once/include - almost always want require_once. Note that these are not functions, so we don't need parentheses.
Initialization
Use php_admin_value open_basedir /path/to/basedir (open_basedir takes a path, so it is set with php_admin_value rather than php_admin_flag).
Use a custom error handler. For production sites, don't display errors on-screen to the users. Send yourself an email.
If you're on a shared host, you should change the location where your sessions are stored, so that you're not sharing the same place in /tmp that all the other virtual hosts use and can easily access. You can either set the session.save_path INI setting, or call session_save_path('/new/path') before calling session_start(). This is a bit of security by obscurity; the stronger way is to keep session info in the database. (See the Storing Sessions in Database section below.)
Check for magic_quotes. Hope that it's off, but if not, fix things w/ stripslashes() and unset the INI variable.
    magic_quotes_gpc, magic_quotes_runtime
Here's code from Harry Fuecks at SitePoint:
    if (get_magic_quotes_gpc()) {
        $_GET = array_map('stripslashes', $_GET);
        $_POST = array_map('stripslashes', $_POST);
        $_COOKIE = array_map('stripslashes', $_COOKIE);
    }
    // TODO: I think we need to strip slashes from $_REQUEST and $_FILES too.
Here's some code from http://www.phpguru.org/article.php?ne_id=58 called dispelMagicQuotes():
    function remove_magic_quotes_gpc () {
        if ( !ini_get('magic_quotes_gpc') )
            return;
        // TODO: Apparently, $_FILES[]['name'] also has magic quotes applied.
        foreach ( array('_GET', '_POST', '_COOKIE') as $super ):
            foreach ( $GLOBALS[$super] as $k => $v ):
                $GLOBALS[$super][$k] = stripslashes_r($v);
                // TODO: I think we need to strip slashes from $_REQUEST too.
            endforeach;
        endforeach;
        ini_set('magic_quotes_gpc', false); // TODO: Make sure this works as expected.
    }
    /**
     * Recursive stripslashes. array_walk_recursive seems to have great trouble with stripslashes().
     * @param mixed $str String or array
     * @return mixed String or array with slashes removed
     */
    // NOTE: Might be able to replace this with return is_array($str) ? array_map('stripslashes', $str) : stripslashes($str);
    function stripslashes_r ( $str ) {
        if ( !is_array($str) )
            return stripslashes($str);
        foreach ( $str as $k => $v ):
            $str[$k] = stripslashes_r($v);
        endforeach;
        return $str;
    }
Check for register_globals. Perhaps unregister any globals from G/P/C/E/S. (See my code below.)
Turn off allow_url_fopen.
When using a database, include/database.php should probably open the connection.
Example code:
    $DB_CONNECTION = mysql_pconnect($DB_CONFIG['HOST'], $DB_CONFIG['USER_NAME'], $DB_CONFIG['PASSWORD']) or exit(mysql_error());
    mysql_select_db($DB_CONFIG['DATABASE'], $DB_CONNECTION);
Input Validation
Clean all data that comes from users.
Provide a whitelist of allowed characters. I like to start with a minimal set, and add any other characters that might be required. I change any disallowed characters to an underscore:
    $name = preg_replace('/[^a-zA-Z0-9.-]/', '_', $_POST['name']);
You should know if you're expecting a GET or POST (usually POST); don't use $_REQUEST, except in rare instances. This can help reduce CSRF vulnerabilities.
When stripping HTML, use strip_tags() and htmlentities(), not addslashes().
When escaping data for use in SQL, use mysql_real_escape_string($_POST['username']);
Use tokens to verify intent and that user came from a form. (See http://shiflett.org/articles/cross-site-request-forgeries for more info.)
    if ( $_SERVER['REQUEST_METHOD'] == 'POST' ) {
        if ( $_POST['token'] != $_SESSION['token'] ) { // can also check token_timestamp for recentness
            exit('Invalid token!'); // or add an error message and redirect_to_self();
        }
    } else {
        $token = md5(uniqid(rand(), TRUE));
        $_SESSION['token'] = $token;
        $_SESSION['token_timestamp'] = time();
        ?>
        <input type="hidden" name="token" value="<?php echo $token; ?>" />
        <?php
    }
Filtering
Use PHP 5.2's Filter extension.
    $html = filter_input(INPUT_GET, 'html', FILTER_SANITIZE_SPECIAL_CHARS);
    $url = filter_input(INPUT_POST, 'url', FILTER_SANITIZE_ENCODED);
    $myinputs = filter_input_array(INPUT_POST, array(
        'product_id' => FILTER_SANITIZE_ENCODED,
        'component' => array(
            'filter' => FILTER_VALIDATE_INT,
            'flags' => FILTER_REQUIRE_ARRAY,
            'options' => array('min_range' => 1, 'max_range' => 10),
        ),
    ));
    $email = filter_var('bob@example.com', FILTER_VALIDATE_EMAIL);
Error Messages
First of all, you should be using a custom error handler, to prevent the user from seeing any PHP errors. (See above.)
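A custom error handler along the lines recommended above might look like the following. This is only a minimal sketch: the names format_error(), logged_errors(), and custom_error_handler() are made up for illustration, and it writes to the server error log where a production site might send an email instead.

```php
<?php
// Hypothetical helpers for illustration; only set_error_handler(),
// error_log() and trigger_error() are built-in PHP functions here.

// Turn the pieces PHP hands to an error handler into one line of text.
function format_error($errno, $errstr, $errfile, $errline) {
    return sprintf('[%d] %s in %s on line %d', $errno, $errstr, $errfile, $errline);
}

// Keep an in-memory copy of everything reported (handy when testing).
// Call with a string to record it; call with no argument to read the list.
function logged_errors($add = null) {
    static $log = array();
    if ($add !== null) {
        $log[] = $add;
    }
    return $log;
}

function custom_error_handler($errno, $errstr, $errfile = '', $errline = 0) {
    $message = format_error($errno, $errstr, $errfile, $errline);
    logged_errors($message);
    // Assumption: write to the error log. On a production site this is
    // where you might call mail() to notify yourself instead.
    error_log($message);
    // Returning true stops PHP's built-in handler from running,
    // so nothing is displayed to the user.
    return true;
}

set_error_handler('custom_error_handler');

// Example: this warning is recorded, not shown on-screen.
trigger_error('Something went wrong', E_USER_WARNING);
```

Returning true from the handler is what suppresses the default on-screen output; fatal errors are not passed to handlers installed this way, so display_errors should still be turned off in production.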
For error messages you want the user to see, there are 3 methods:
    $MESSAGES = array();
    function add_error ( $text ) {
        global $MESSAGES;
        array_push($MESSAGES, $text);
    }
    if ( !valid($XYZ) ) {
        add_error("ERROR: Cannot do that because XYZ is invalid.");
    }
For errors related to form entry, there are 2 places to display them:
Fixing register_globals
See http://us2.php.net/manual/en/faq.misc.php#faq.misc.registerglobals for resolving issues if it's turned off and you need it on, or if it's on and you want it off.
But if you want it OFF, just set it in your .htaccess file.
    php_flag register_globals off
Note that $_SESSION is not populated until session_start() is called.
If you want to UNSET everything in $GLOBALS that got set due to register_globals, then run this function (from http://www.phpguru.org/article.php?ne_id=60, where they called it dispelGlobals):
    function unregister_globals () {
        if (!ini_get('register_globals'))
            return;
        if (isset($_REQUEST['GLOBALS']))
            exit('GLOBALS overwrite attempt detected');
        # Variables that shouldn't be unset. TODO: Should we add $_SESSION; does it get auto-globaled? Is $_FILES auto-globaled?
        $noUnset = array('GLOBALS', '_GET', '_POST', '_COOKIE', '_REQUEST', '_SERVER', '_ENV', '_FILES');
        # TODO: Should we sort these into ini_get('variables_order')?
        $input = array_merge($_GET, $_POST, $_COOKIE, $_SERVER, $_ENV, $_FILES, isset($_SESSION) ? (array)$_SESSION : array());
        # TODO: Consider looping through $GLOBALS instead of $input, and checking for $input instead of $GLOBALS.
        foreach ( $input as $k => $v ):
            if ( !in_array($k, $noUnset) && isset($GLOBALS[$k]) ):
                unset($GLOBALS[$k]);
            endif;
        endforeach;
        # NOTE: This doesn't actually do anything except flag for later users. Don't do this unless the above code worked.
        ini_set('register_globals', false); # TODO: Make sure this works as expected.
    }
File Downloads
    header('Content-Disposition: attachment; filename="' . $filename_to_save_as . '"');
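The Content-Disposition header above is usually sent together with a content type and length. The following is a rough sketch under stated assumptions: download_headers() is a hypothetical helper, the character whitelist is an illustrative choice, and the headers are returned as an array so they can be inspected before being passed to header().

```php
<?php
// Hypothetical helper: build the headers for a file download.
function download_headers($filename_to_save_as, $size_in_bytes) {
    // Assumption: strip any directory part and keep only a conservative
    // set of characters, so quotes or line breaks cannot corrupt the header.
    $safe_name = preg_replace('/[^a-zA-Z0-9._-]/', '_', basename($filename_to_save_as));
    return array(
        'Content-Type: application/octet-stream',
        'Content-Disposition: attachment; filename="' . $safe_name . '"',
        'Content-Length: ' . (int) $size_in_bytes,
    );
}

// Hypothetical usage in a download script ($path is the file on disk):
//     foreach ( download_headers('report.pdf', filesize($path)) as $h )
//         header($h);
//     readfile($path);
```

Building the header strings separately from sending them keeps the filename cleanup in one place and makes it easy to check.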
Database Configuration
On shared hosting, it's nearly impossible to hide your files from other virtual hosts, since all virtual hosts run as the same user. The best advice is to store all information in the database. However, you need some way to store the database name, database user, and database password.
One way to accomplish this is to set some environment variables in the Apache config files, and make the config files readable only by root (and the owner of the files). Apache is started by root, but then changes to a less-privileged user AFTER it has read the config files and started listening on the port(s).
Unfortunately, this has to be done at the VirtualHost level, and won't work from
It might be a good idea to use an Apache-config include file to store the protected info, so it's easy to ensure that it is protected. Here's an example:
In the Apache configuration:
    <VirtualHost *>
        ServerName mysite.com
        ServerAlias www.mysite.com
        UseCanonicalName On
        DocumentRoot /home/web/mysite.com/public
        Include /home/web/mysite.com/environment
        <Directory /home/web/mysite.com/public>
            AllowOverride All
            Options Indexes FollowSymLinks MultiViews IncludesNoExec
            Order allow,deny
            Allow from all
        </Directory>
    </VirtualHost>
Then in /home/web/mysite.com/environment:
    SetEnv DB_USER "myuser"
    SetEnv DB_PASS "mypass"
If you only want to set the environment variables for PHP files within a certain directory, you can do this:
    SetEnvIf Request_URI "/path/to/my/directory" DB_PASS="mypass"
In the PHP code, you can then refer to these as $_SERVER['DB_USER'] and $_SERVER['DB_PASS'].
The environment file can be owned by the site owner, and made readable only by the site owner and root.
    chown mysite:mysite /home/web/mysite.com/environment
    chmod 600 /home/web/mysite.com/environment
Note that if this file is changed, Apache will need to be restarted, or at least reload its configuration. But since the system admin has to create the database info anyway, this doesn't seem to be a major problem; the system admin should just add this to his procedures for creating a new virtual host. (Idea from PHP Cookbook via Chris Shiflett.)
Another way to do it is to use UNIX permissions, and use PHP via CGI instead of mod_php. This will run PHP as your user ID, instead of Apache's. However, there's a serious performance impact from having to start a new process for each request. If you're on a host that doesn't have PHP running through CGI (or suExec or suPhp), you can implement it yourself via the technique described at http://www.sonic.net/support/faq/advanced/phpwrap/.
Database Layers
Database Abstraction
See the section above about where to store the database configuration parameters (database name, database username, database password).
See the section above about initializing the database (opening the connection and selecting the database instance) in the include/database.php file. (Although for my framework, I open a (possibly pooled) connection for each table, and include the connection in all calls to the database instead of setting the default database.)
Should have a thin abstraction library to wrap SQL calls. It should properly escape user-supplied data. Example code:
    $where_clause = array('id' => $user_id); # $user_id gets properly escaped. Multiple items are ANDed.
    $where_clause = 'status = 2 OR status = 4'; # NOT RECOMMENDED! But useful for ORing conditions.
    $options = array('order' => 'id ASC', 'limit' => 10, 'group_by' => 'name');
    $column_names = array('id', 'name', 'middle' => 'middle_name'); # Not sure which order is best on the last one.
    $column_names = 'id, first_name, last_name, middle_name AS middle';
    $new_data = array('id' => 2, 'name' => 'Bob'); # Must be an associative array.
    sql_select_all('*', table_name, where_clause, options)
    sql_select_one(column_names, table_name, where_clause, options)
    sql_insert(table_name, new_data, options)
    sql_update(table_name, where_clause, new_data, options)
    sql_delete(table_name, where_clause, options)
PDO example:
    set_exception_handler('custom_exception_handler');
    $db = new PDO('mysql:host=localhost;dbname=testdb', 'username', 'password', array(PDO::ATTR_PERSISTENT => true));
    $db->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, true);
    $db->exec('INSERT blah...');
    $result = $db->query('SELECT * from table');
    foreach ($result as $row) {
        print $row['first_name'] . $row['last_name'];
    }
    # Named and positional placeholders cannot be mixed in one statement.
    $statement = $db->prepare('SELECT * from table WHERE first_name = :first AND last_name = :last');
    # bindValue() accepts literals; bindParam() binds by reference and needs a variable.
    $statement->bindValue(':first', 'Craig');
    $statement->bindValue(':last', 'Buchek');
    $db->beginTransaction();
    $statement->execute();
    $statement->rowCount();
    $result = $statement->fetch(); # Fetches just 1 row; use fetchAll() to get all rows.
    $db->commit(); # or $db->rollBack(); to abandon the transaction instead
Object-Relational Mappers
Storing Sessions in Database
Per Chris Shiflett:
    CREATE TABLE sessions (
        id VARCHAR(32) NOT NULL,
        access INT(10) UNSIGNED,
        data TEXT,
        PRIMARY KEY (id)
    );

    session_set_save_handler('_open', '_close', '_read', '_write', '_destroy', '_clean');
    function _open() {
        global $_sess_db;
        if ($_sess_db = mysql_connect('127.0.0.1', 'myuser', 'mypass')) {
            return mysql_select_db('sessions', $_sess_db);
        }
        return FALSE;
    }
    function _close() {
        global $_sess_db;
        return mysql_close($_sess_db);
    }
    function _read($id) {
        global $_sess_db;
        $id = mysql_real_escape_string($id);
        $sql = "SELECT data FROM sessions WHERE id = '$id'";
        if ($result = mysql_query($sql, $_sess_db)) {
            if (mysql_num_rows($result)) {
                $record = mysql_fetch_assoc($result);
                return $record['data'];
            }
        }
        return '';
    }
    function _write($id, $data) {
        global $_sess_db;
        $access = time();
        $id = mysql_real_escape_string($id);
        $access = mysql_real_escape_string($access);
        $data = mysql_real_escape_string($data);
        $sql = "REPLACE INTO sessions VALUES ('$id', '$access', '$data')";
        return mysql_query($sql, $_sess_db);
    }
    function _destroy($id) {
        global $_sess_db;
        $id = mysql_real_escape_string($id);
        $sql = "DELETE FROM sessions WHERE id = '$id'";
        return mysql_query($sql, $_sess_db);
    }
    function _clean($max) {
        global $_sess_db;
        $old = time() - $max;
        $old = mysql_real_escape_string($old);
        $sql = "DELETE FROM sessions WHERE access < '$old'";
        return mysql_query($sql, $_sess_db);
    }
    session_start();
    // Use $_SESSION['varname'] however you want;
Helpers
    function redirect_to ( $url ) {
        # Make sure URL is absolute. Required to be absolute by section 14.30 of RFC 2616.
        $url = absolute_url($url);
        # Output headers telling the browser to redirect to the desired URL.
        # NOTE: Should actually use a 303 return code, but most browsers don't handle that.
        # See http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html for details.
        header('HTTP/1.1 302 Moved Temporarily');
        header('Status: 302 Moved Temporarily');
        header("Location: $url");
        # We don't need to continue running any more PHP code, so exit.
        # Just in case the browser doesn't understand the headers we passed, send the new URL in the body.
        exit("Go to <a href='$url'>next page</a>.");
    }
    function absolute_url ( $url ) {}
    function validate_email_address ( $address ) {}
    function validate_phone_number ( $phone_num_str ) {}
    # From http://www.zend.com/lists/php-dev/200506/msg00232.html
    function firstNotEmpty() {
        $vars = func_get_args();
        foreach($vars as $var)
            if (!empty($var))
                return $var;
        return NULL;
    }
Define array_get() if it's not already defined. Use it whenever you need to use a default if an array key/value isn't defined.
    # From http://marc.info/?l=php-dev&m=118208519027980&w=2
    if ( !function_exists('array_get') ) {
        function array_get ( /*array*/ $arr, /*mixed*/ $key, /*mixed*/ $default = false ) {
            if ( array_key_exists($key, $arr) )
                return $arr[$key];
            else
                return $default;
        }
    }
Dates
Whenever possible, output dates in ISO 8601 format (YYYY-MM-DD). This format is unambiguous to any human reader, and it also sorts properly in any alphanumeric sort.
See this GNU manual for an interesting and extensive treatise on the subject of date/time input formats.
Look into PHP 5.2 date_create(), date_modify(), etc. for date manipulation. Should probably use them in preference to date(), mktime(), and gmdate().
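As a small illustration of the PHP 5.2 date functions mentioned above, the sketch below parses an ISO 8601 date, moves it forward a month, and prints it in ISO 8601 again. The UTC timezone and the sample date are arbitrary choices made for the example.

```php
<?php
// Assumption: pick an explicit timezone so the result does not depend
// on the server's configuration.
date_default_timezone_set('UTC');

$date = date_create('2007-12-20');    // parse an ISO 8601 date string
date_modify($date, '+1 month');       // shift the date forward, in place
$iso = date_format($date, 'Y-m-d');   // format back to ISO 8601

echo $iso, "\n";                      // prints 2008-01-20
```

Unlike arithmetic on mktime() timestamps, date_modify() handles the year rollover and month lengths for you.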
OOP
    class MyClass {
        public $anyone_can_access;
        protected $name = 'Bob';
        private $secret;
        function __construct($secret = 'password') {
            $this->secret = $secret;
        }
        function __destruct() {
            # NOTE: Don't call anything in here that might throw an exception.
        }
        public function xyz() {
            echo "Hello, {$this->name}!";
        }
    }
    class OtherClass extends MyClass {
        function __construct($name, $secret) {
            # Cannot access $this->secret;
            $this->name = $name;
            parent::__construct($secret);
        }
    }
    $x = new MyClass('better_password');
    $y = new OtherClass('craig', 'rt35erIM4');
    $x->xyz();
    $y->xyz();
Normal Form
    <?php
    if ( 'POST' == $_SERVER['REQUEST_METHOD'] ):
        // ... process ...
        $name = ...;
        $show_name = ...;
        // ... process ...
        if ( !$errors ):
            redirect_to($VIEW_URL); // Or possibly $SELF, if the only view we have is the same as the edit form.
        else:
            // Redirect to the same page, but as a GET so that refreshing the page works.
            // NOTE: Have to use sessions to retain error messages from this POST to that GET.
            redirect_to_self(); // Or just fall through to HTML below. (Delete else clause.)
        endif;
    endif;
    function abc () { .... }
    function xyz () { .... }
    ?>
    <html>
    ....
    <?php if ( $show_name ): ?>
    <p>Name: <?php echo $name; ?></p>
    <?php endif; ?>
    ....
    </html>
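The NOTE in the skeleton above says error messages have to be kept in the session to survive the redirect. One possible way to do that is sketched here; flash_add() and flash_take() are hypothetical helpers, and a plain array stands in for $_SESSION so the sketch runs without a web server.

```php
<?php
// Hypothetical helpers. $store stands in for $_SESSION; in a real page,
// call session_start() first and pass $_SESSION instead.

// Remember a message so it can be shown on the next request.
function flash_add(array &$store, $text) {
    $store['flash'][] = $text;
}

// Return the remembered messages and clear them, so each is shown once.
function flash_take(array &$store) {
    $messages = isset($store['flash']) ? $store['flash'] : array();
    unset($store['flash']);
    return $messages;
}

// During the POST, before redirecting:
$session = array();
flash_add($session, 'ERROR: Name is required.');

// During the GET that follows the redirect:
$to_show = flash_take($session);   // the messages to display
$again   = flash_take($session);   // empty if the page is refreshed
```

Clearing the messages as they are read is what keeps a page refresh from repeating them.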
MVC
Controller
View
Model
Frameworks
Popular MVC Frameworks
PEAR
pear list-all
pear list-channels
pear install xyz
pear search phpunit
Testing
Problems with PHP
Variable Arguments
    func_get_args()   # returns array of arguments
    func_get_arg(n)   # returns the nth argument
    func_num_args()   # returns number of arguments passed to the function
Summary
Resources
Thanks
Presentation Info