PHP Best Practices

by Craig Buchek

St. Louis GNU/Linux Users Group

December 20, 2007

Intro

  • Just started a year ago as a full-time independent web developer
  • Been working with real live PHP code
  • Seen a lot of problems with legacy code
  • Run into some problems on my own
  • Read a few good books and blogs about PHP
  • Read a few good books and blogs on coding in general
  • Learned how to apply good concepts from Ruby on Rails to PHP

Overview

  • PHP work environment
  • Coding conventions
  • Quoting
  • Include files
  • Form handling and input validation
  • Database abstraction
  • Model-View-Controller architecture
  • Separating code from HTML
  • Frameworks
  • Testing

PHP

  • PHP is a very popular web development language, especially on GNU/Linux servers
    • Available on almost any web server
  • Commonly the "P" part of the LAMP stack (Linux, Apache, MySQL, PHP/Perl/Python)
  • PHP makes it pretty easy to write simple web applications
    • Also makes it very easy to get yourself into trouble
  • Looked down upon by users of "real" languages
    • Perl with training wheels
    • No namespaces, so several hundred functions in the top-level namespace
    • Missing some other advanced features

PHP 5

  • Use it!
  • PHP 5 has been out for 3.5 years now (as of December 2007)
    • PHP 5.2.5 is current
  • PHP 4 is end-of-life
    • No more releases after 2007-12-31
    • No more security fixes after 2008-08-08
  • PHP 6 is on the way
  • If your web server provider does not support PHP 5 yet, find another provider!

When to Use PHP

  • That's all that is available
  • There's already existing PHP code
  • Project is small
  • You've got a PHP framework that you're comfortable with
  • You've got a package written in PHP that you need to integrate with
  • Only a few pages need to have programmatic control
  • App/site is page-based
  • You're not familiar/comfortable with another language/framework

Starting a New Project

Here's a list of things to do when starting a new project using PHP:

  1. Can you use a different language?
    • If you can use a different programming language, look into those choices as well.
    • Ruby and Python are usually "cleaner" and quicker to program once you learn the frameworks.
  2. Consider using a framework.
    • I like CakePHP, but there are plenty of others to choose from.
    • Your own "micro-framework" might work as well.
    • It will take some time to learn the framework, but it will pay off later.
    • Also consider CMSes.
  3. Consider any libraries you might be able to make use of.
    • Recommended database abstraction libraries: PDO (comes with PHP 5.1+), ADOdb.
    • Recommended ORMs: Doctrine (PHP 5, large).
  4. Set up equivalent environments on your development and production systems.
    • Database with same schema, as well as some test data.
    • Try to match the version of PHP, if possible.
  5. Set up bug tracking and source control systems.
    • I like FlySpray for bug tracking.
    • I like Subversion for source revision control, but will be investigating distributed source control.
  6. Set up the directory hierarchy. (See below.)
    • Include a documentation directory.
  7. Create some scripts (or Makefile recipes) to automate common tasks.
    • Uploading and downloading current code.
  8. Turn on all WARNING messages, to keep yourself from making common mistakes.
    • Set error_reporting to E_ALL in headers or INI file.
    • Turn off allow_call_time_pass_reference.
      • Turns off a deprecated feature.
      • Requires pass-by-reference to be declared in function definitions, not function calls.
  9. Make sure register_globals is off.
    • Set it in server config or .htaccess if possible.
    • Otherwise, use my unregister_globals script.
  10. Use a consistent style (indentations, etc.) in your code.
    • Many recommend always using braces in if/while/for statements.

File Layout

If you can, place the following files and directories ABOVE the HTDOCS directory.

  • include (see next section)
  • cgi-bin
  • backups
  • NOTES.txt
  • Makefile

If possible, set up a hierarchy of multiple sub-sites. (Can set up a .htaccess file to implement these from a single site.)

  • test
  • development
  • staging
  • production

If that's not possible, sub-directories for dev and testing would be a good idea.

As a rule, if you have more than about 5-10 PHP files, start using subdirectories. Subdirectories should gather related functionality. For example, if you've got a set of PHP scripts dealing with calendar events, put them in a subdirectory named 'events', 'event', or 'calendar'.

A good directory hierarchy might look like this:

mysite.com
	production (or www)
		htdocs (or public)
			.htaccess
			index.php
			xyz.php
			js (or script(s) or javascript)
				xyz.js
			css (or style(s))
				xyz.css
			images (or graphics or pictures)
			admin
				.htaccess (to ensure only admin users can access, or use PHP authentication)
			thing1
		include
			config.php (well-commented location for any global settings)
			common.php (frequently used routines)
			database.php (initialize the database connection, routines for working w/ the DB)
			classname.php (prefer this format; use lowercase and (maybe) underscores, or CamelCase)
							(CamelCase makes it clear it's a class containing that specific name.)
		cgi-bin (if necessary -- hopefully not)
		bin (if PHP code needs to call any custom CLI commands)
		doc
			LICENSE.txt (if Open Sourced)
			TODO.txt
			IDEAS.txt
		README.txt
	staging (or beta)
	test
	development
	backups

If your site is simple, you can leave some of those out.

PHP CLI

Invoke with -a for interactive mode. Still need <? open tag. Can use readline to edit input (if enabled). Outputs immediately. (But leaves you on the output line -- hit enter.) CTRL+D (at beginning of line) to end input.

Invoke with code specified on the command line (without <? tags) by giving -r:

php -r '$name = "Craig"; echo "Hello, $name.\n";'

As the man page says, this is similar to running eval() on the passed-in string.

You can specify a command to run on each line of input from stdin with -R:

php -R '$name = $argn; echo "Hello, $name ($argi).\n";'

The special variable $argn provides the text of each successive line; $argi gives the line number.

The -F flag is similar to -R, but the PHP code comes from a file instead. You can also specify code to run before and after all the input lines, using -B and -E, respectively.

Running php -i is about the same as calling phpinfo().

You can use php -w to output code stripped of comments and whitespace. I would have thought that this would be useful for counting lines of code, but it also removes line-feed characters. So I have no idea what use it is. Since PHP code doesn't get sent across the wire, there's no sense in compacting it. I came up with this to count lines of code:

grep -E -h -v '^[[:blank:]]*([[:punct:]]{0,2}|#.*|//.*)$' *.php | wc

It ignores blank lines, comment lines, and lines with only white space and 1 or 2 punctuation characters. The only thing it doesn't handle is /* */ style comments, which would likely require a PHP parser.

You can get help on a function, class, or extension with --rf, --rc, or --re. Unfortunately, the help is not very helpful -- it only gives signatures.

Running php -s will output a syntax-highlighted version of the code. The output is HTML, with spans using hard-coded colors. As far as I can tell, there's no way to change the colors or anything.

TODO: Write a function to determine if we were called from CLI, mod_php, CGI, or FastCGI.
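
Here is a minimal sketch of what such a function might look like, built on PHP's php_sapi_name(). The function names and the way SAPI names are grouped are my assumptions, not a tested list; check what your own server actually reports.

```php
<?php
# Classify a SAPI name (as returned by php_sapi_name()) into one of the four cases.
# ASSUMPTION: the groupings below are based on commonly seen SAPI names.
function classify_sapi ( $sapi )
{
	if ( $sapi == 'cli' )
		return 'CLI';
	if ( strpos($sapi, 'apache') === 0 )   # e.g. 'apache', 'apache2handler'
		return 'mod_php';
	if ( strpos($sapi, 'fcgi') !== false ) # e.g. 'cgi-fcgi'
		return 'FastCGI';
	if ( $sapi == 'cgi' )
		return 'CGI';
	return 'unknown';
}

# Hypothetical convenience wrapper for the current request.
function how_were_we_called ()
{
	return classify_sapi(php_sapi_name());
}

echo how_were_we_called(), "\n";
```

Keeping the classification separate from the call to php_sapi_name() makes it possible to test the mapping without a web server.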

Refactoring

  • Process of replacing bad code with good code
  • List of problem code types, and how to replace them
    • Each refactoring has a name
  • Many of the best practices here are based on (the results of) refactorings
  • Lots of books and web sites available

Coding Conventions

These are my personal preferences.

switch ( $case )
{
    case 1:
    case 2:
        some_code();
    break;
    case 3:
        do_something();
        do_something_else();
    break;
    default:
        blah();
    break;
}

General Tips

  • Names are important
    • Try to make your code read like English
      • Instead of: header('Location: error.php'); exit();
      • Use: redirect_to('error.php');
  • Use what PHP provides you
    • Functions
      • For anything you do more than 1 or 2 times
      • To isolate functionality, reducing amount of code in a single place
      • To increase readability
    • Classes
      • Most useful for "things" in your domain model

General Coding

In order to prevent errors where you assign a variable in an if statement, instead of comparing the variable to a constant, put the constant BEFORE the variable:

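if ( 1 == $my_variable ) # preferred: accidentally typing = here is a parse error
if ( $my_variable == 1 ) # risky: accidentally typing = here silently assigns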

Return from functions as soon as possible, whenever you can. This decreases the amount of nesting, so when reading the code, you have less logic to keep in your head.

Instead of:

if ( $days_left < $DAYS_TO_SHOW_COUNTDOWN )
{
	... show countdown ...
}

use:

if ( $days_left >= $DAYS_TO_SHOW_COUNTDOWN )
	return;
... show countdown ...

To decrease the amount of nesting and repeated code, set variables to some defaults, and continue on.

Instead of:

if ( condition1 ):
	// do thing1 to x
else:
	if ( condition2 ):
		// do thing2 to x
		// do thing1 to x
	else:
		// do thing3 to x
		// do thing2 to x
		// do thing1 to x
	endif;
endif;

or:

if ( condition1 ):
	// do thing1 to x
elseif ( condition2 ):
	// do thing2 to x
	// do thing1 to x
else:
	// do thing3 to x
	// do thing2 to x
	// do thing1 to x
endif;

use:

if ( !condition1 && !condition2 ):
	// do thing3 to x
endif;
if ( !condition1 ):
	// do thing2 to x
endif;
// do thing1 to x
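
A flattening like this is easy to get subtly wrong, so it can be worth checking both shapes against every combination of the conditions. The sketch below is only an illustration: the function names are made up, each "thing" is recorded in an array rather than actually done, and the flattened guards spell out every condition that must be false.

```php
<?php
# Nested version: returns the list of things done to x, in order.
function nested_version ( $condition1, $condition2 )
{
	$done = array();
	if ( $condition1 ):
		$done[] = 'thing1';
	elseif ( $condition2 ):
		$done[] = 'thing2';
		$done[] = 'thing1';
	else:
		$done[] = 'thing3';
		$done[] = 'thing2';
		$done[] = 'thing1';
	endif;
	return $done;
}

# Flattened version: each guard names every condition that must be false.
function flat_version ( $condition1, $condition2 )
{
	$done = array();
	if ( !$condition1 && !$condition2 ):
		$done[] = 'thing3';
	endif;
	if ( !$condition1 ):
		$done[] = 'thing2';
	endif;
	$done[] = 'thing1';
	return $done;
}

# Compare the two versions for all four combinations of the conditions.
function versions_agree ()
{
	foreach ( array(true, false) as $c1 ):
		foreach ( array(true, false) as $c2 ):
			if ( nested_version($c1, $c2) !== flat_version($c1, $c2) )
				return false;
		endforeach;
	endforeach;
	return true;
}

echo versions_agree() ? "same behavior\n" : "behavior differs\n";
```

Note that guarding thing3 with !condition2 alone would run it when condition1 is true and condition2 is false, which the nested version never does.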

Quoting

  • Single quotes
  • Double quotes
  • HTML mode
  • Here documents

Single Quotes

  • Use single quotes (') for most items
  • Single quotes denote "exact" literal strings
    • What you type is what you get
  • There are very few escape sequences
    • Only exceptions are \' and \\
  • Single-quoted strings can contain (literal) newlines
  • (The D programming language actually calls these types of strings WYSIWYG, and prefixes them with an 'r'.)

Double Quotes

  • Use double quotes (") any time you have a variable you want to insert into a string
    • It's preferable to use double quotes rather than concatenating single-quoted strings
  • Use double-quoted strings if you want to use special escaped characters
    • Such as newline ("\n") or tab ("\t")

Double-quotes and here documents will interpolate variables. There are 2 syntaxes for complex variable interpolation:

"${obj->field}"
"{$obj->field}"

The 2nd form seems to be preferred.

As of PHP 5.0, string interpolation will work with function and method calls, but not quite as you might expect. For function calls, it must be a variable function:

function xyz() {return 5;}
$x = 'xyz';
echo "{$x()}\n";

For method calls:

class X { function y() {return 5;} }
$x = new X;
echo "{$x->y()}\n";

For both function calls and method calls, the ${} syntax does not work.

NOTE: PHP takes more time and memory to interpolate than to concatenate. In Ruby, it is definitely preferred (and faster) to interpolate.
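
To make the choices concrete, here is a small sketch of simple interpolation, the {$...} form for array elements and object properties, and plain concatenation. The variable names and values are made up for illustration.

```php
<?php
$name = 'Craig';
$form = array('firstname' => 'Bob');
$obj = new stdClass;
$obj->field = 'value';

$simple  = "Hello, $name.";                    # simple variable
$indexed = "First name: {$form['firstname']}"; # array element with a quoted key
$member  = "Field: {$obj->field}";             # object property
$joined  = 'Hello, ' . $name . '.';            # concatenation; same text as $simple

echo $simple, "\n", $indexed, "\n", $member, "\n";
```

Interpolation and concatenation produce the same string here, so the choice comes down to readability and, if you have measured it, speed.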

HTML Mode

  • Do NOT echo/print out lines of HTML text
  • NEVER print more than 1 line from PHP mode
    • Instead, go back to HTML mode by closing your PHP tag
  • This is the preferred way:
function print_some_html($author)
{
	?>
    <h2>July 3, 2007</h2>
    <p>It was a nice day; the sun was shining, and the birds were singing.</p>
	<?php
    echo "<p> -- $author</p>\n"; # Even this is a bit questionable.
}
  • One reason is that you have to quote a lot more in PHP mode
    • Nested quotes can get pretty nasty
  • HTML mode is better at HTML
    • Your text editor can help you deal with the HTML as HTML, instead of considering it to be just another string in PHP.

The second-best choice is to use a here document. (See below.)

If you've got a PHP variable you want to print in HTML mode, do it like this:

<p>So I went to the store with <?php echo $friend; ?> today.</p>

or with short tags, like this:

<p>So I went to the store with <?= $friend ?> today.</p>

Here Documents

  • The final type of literal string in PHP is a "here document"
  • Starts with <<< followed (without spaces) by a legal PHP identifier
  • Ends with the identifier as the ONLY thing on a new line
    • With no whitespace preceding or following
    • ONLY a semi-colon may follow it
      • Nothing else may follow the semi-colon, not even comments
  • Other than that, and not needing to escape double quotes, the here document syntax works exactly like double-quoted strings
  • Note that there's no -EOF or "EOF" here-doc syntax, like the (Bash) shell has
  • PHP here documents do allow turning off PHP warning messages though:
$mailer->body = @<<<FORM
Firstname = {$form['firstname']}
Lastname = {$form['lastname']}
FORM;

HTML Within PHP Loops

  • If there's HTML code within a loop, use the end-style looping constructs
  • Lets you leave the HTML as HTML
    • Instead of wanting to print it with echo statements
      • Which gets messy due to escaping, quoting, etc.
  • Makes it much easier to see the end of the loop
  • Don't indent the PHP code that implements the loop
    • So the looping stands out
    <tr>
      <td colspan="4" align="left" valign="top" ><span class="fieldLabel">Race:</span> <br>
<? foreach($RACE_ARRAY as $key=>$value): ?>
        <input name="RACE" class="formText" type="radio" value="<? echo $key; ?>" <? echo $RACE == $key ? 'checked="checked"' : ''; ?>>
        <? echo $value; ?>
        <br>
<? endforeach; ?>
      </td>
    </tr>

PHP Tags

It's preferable not to use short tags, as short tags are not XHTML or XML compliant. However, it's not expected that PHP code should be valid, only the output of a PHP document after it has been run through the PHP processor. So this requirement is not too important, unless you plan to run your PHP code through any XML tools, or if you're writing a library of code that others might use.

If short tags are enabled, you cannot directly include the XML definition in the XML prolog. Instead, you'll have to have PHP echo it: <?= '<' . '?xml version="1.0" ?' . '>' ?> Note that I separated the question marks from the angle brackets, because some text editors would have problems interpreting the PHP otherwise.

However, short tags are great for inserting the values of variables into the HTML portion: <p>So I went to the store with <?= $friend ?> today.</p>

If you do use short tags, ALWAYS put a space after the <? (and before the ?>), as otherwise the processing instruction is ambiguous, and if you use any other tools on the file that recognize PIs, they will fail.

In addition to long tags (<?php echo "hello";?>) there is the <SCRIPT> syntax:

    <script language="php">
    #<![CDATA[
        echo "Some HTML editors don't like processing instructions.";
    #]]>
    </script>

I would not be surprised however, if some PHP editors don't know to look in there for PHP code.

The following code MIGHT work for replacing all short open tags with long open tags:

find -name '*.php' | xargs perl -pi -e 's/<\?= ?(.*?) ?\?>/<?php echo($1); ?>/g'
find -name '*.php' | xargs perl -pi -e 's/<\?/<?php/g'
find -name '*.php' | xargs perl -pi -e 's/<\?phpphp/<?php/g'

(from http://us2.php.net/manual/en/language.basic-syntax.php#60798)

For files that contain only PHP code, the closing tag ("?>") is not required by PHP. Not including it prevents trailing whitespace from being accidentally injected into the output. (From Doctrine style guidelines.)

Conclusions:

  1. ALWAYS use long tags in library code.
  2. ALWAYS use long tags at the beginning of all PHP files.
  3. Use long tags if you plan to run your PHP code through any XML tools.
  4. Otherwise, use short tags.
  5. If you use short tags, take advantage of the <?= value ?> syntax.

Include Files

Try to place include files outside of DocumentRoot. Preferably in the directory above DocumentRoot.

Don't name include files *.inc, as web servers will generally serve them uninterpreted, allowing attackers to see the source code, and potentially see database passwords. Just name them *.php.

Add an .htaccess file to the includes directory to prevent direct access.

Add this to top-level Apache config or .htaccess file, just in case someone uses *.inc files.

<Files ~ "\.inc$">
	Order allow,deny
	Deny from all
</Files>

Check to see if someone is trying to run an include file directly, because include files running out of context might do something unpredictable and give up sensitive info.

if ( realpath(__FILE__) == realpath($_SERVER['SCRIPT_FILENAME']) )
	exit ('Cannot run this file directly!');

This can also be used to run unit tests if an include file is run directly.

if ( realpath(__FILE__) == realpath($_SERVER['SCRIPT_FILENAME']) )
	run_unit_tests();

Don't override the include_path -- extend it:

set_include_path('/new/path/to/include' . PATH_SEPARATOR . get_include_path());

Of require_once, require, include_once, and include, you almost always want require_once. Note that these are language constructs, not functions, so they don't need parentheses.

Initialization

Use open_basedir to restrict all PHP scripts to only be able to open files within the specified directories. (Works regardless of whether safe_mode is enabled.) For best results, this should be set for all virtual hosts in the Apache configs:

php_admin_value open_basedir /path/to/basedir

Use a custom error handler. For production sites, don't display errors on-screen to the users. Send yourself an email.
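
A minimal sketch of such a handler follows. The function names are hypothetical, and it writes to the error log with error_log() so that it can run anywhere; on a production site you might send the same report with mail() instead.

```php
<?php
# Build the text of an error report. Kept separate so it can be logged or mailed.
function format_error_report ( $errno, $errstr, $errfile, $errline )
{
	return "PHP error $errno: $errstr in $errfile on line $errline";
}

# Hypothetical handler: record the details privately instead of showing them to the user.
function quiet_error_handler ( $errno, $errstr, $errfile = '', $errline = 0 )
{
	$report = format_error_report($errno, $errstr, $errfile, $errline);
	error_log($report); # ASSUMPTION: logging stands in for emailing yourself.
	return true;        # Returning true tells PHP not to run its built-in handler.
}

ini_set('display_errors', '0');           # Don't display errors on-screen.
set_error_handler('quiet_error_handler'); # Route PHP errors through our handler.
```

The handler only sees errors that PHP passes to user handlers; fatal errors and parse errors still need display_errors turned off in the configuration.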

If you're on a shared host, you should change the location where your sessions are stored, so that you're not sharing the same directory in /tmp that all the other virtual hosts use and can easily access. You can either set the session.save_path INI setting, or call session_save_path('/new/path') before calling session_start(). This is a bit of security by obscurity; the stronger way is to keep session info in the database. (See the Storing Sessions in Database section below.)

Check for magic_quotes. Hope that it's off, but if not fix things w/ stripslashes() and unset the INI variable.

magic_quotes_gpc, magic_quotes_runtime

Here's code from Harry Fuecks at SitePoint:

	if (get_magic_quotes_gpc()) {$_GET = array_map('stripslashes', $_GET); $_POST = array_map('stripslashes', $_POST); $_COOKIE = array_map('stripslashes', $_COOKIE);}
	// TODO: I think we need to strip slashes from $_REQUEST and $_FILES too.

Here's some code from http://www.phpguru.org/article.php?ne_id=58 (where it is called dispelMagicQuotes()):

function remove_magic_quotes_gpc ()
{
	if ( !ini_get('magic_quotes_gpc') )
		return;
	// TODO: Apparently, $_FILES[]['name'] also has magic quotes applied.
	foreach ( array('_GET', '_POST', '_COOKIE') as $super ):
		foreach ( $GLOBALS[$super] as $k => $v ):
			$GLOBALS[$super][$k] = stripslashes_r($v);
			// TODO: I think we need to strip slashes from $_REQUEST too.
		endforeach;
	endforeach;
	ini_set('magic_quotes_gpc', false); // TODO: Make sure this works as expected.
}
/**
* Recursive stripslashes. array_walk_recursive seems to have great trouble with stripslashes().
* @param  mixed $str String or array
* @return mixed      String or array with slashes removed
*/
// NOTE: Might be able to replace this with return is_array($str) ? array_map('stripslashes', $str) : stripslashes($str);
function stripslashes_r ( $str )
{
	if ( !is_array($str) )
		return stripslashes($str);
	foreach ( $str as $k => $v ):
		$str[$k] = stripslashes_r($v);
	endforeach;
	return $str;
}

Check for register_globals. Perhaps unregister any globals from G/P/C/E/S. (See my code below.)

Turn off allow_url_fopen.

When using a database, include/database.php should probably open the connection.

  • Because why would we include it if the page didn't need it?
  • If there's only ever 1 database used by the application at once, select it at this time as well.

Example code:

$DB_CONNECTION = mysql_pconnect($DB_CONFIG['HOST'], $DB_CONFIG['USER_NAME'], $DB_CONFIG['PASSWORD']) or exit(mysql_error());
mysql_select_db($DB_CONFIG['DATABASE'], $DB_CONNECTION);

Input Validation

Clean all data that comes from users. Provide a whitelist of allowed characters. I like to start with a minimal set, and add any other characters that might be required. I change any disallowed characters to an underscore:

$name = preg_replace('/[^a-zA-Z0-9.-]/', '_', $_POST['name']);
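
Wrapped in a helper, the whitelist approach might look like the sketch below. The name clean_name() is made up for illustration. Note that the uppercase range must be written A-Z; a range written A-z would also admit the characters [ \ ] ^ _ and the backtick.

```php
<?php
# Replace every character outside the whitelist with an underscore.
# The whitelist here is letters, digits, period, and hyphen.
function clean_name ( $raw )
{
	return preg_replace('/[^a-zA-Z0-9.-]/', '_', $raw);
}

echo clean_name("Bob O'Neil"), "\n";
```

If a field legitimately needs more characters (a space or an apostrophe in a person's name, say), add them to the character class one at a time.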

You should know if you're expecting a GET or POST (usually POST); don't use $_REQUEST, except in rare instances. This makes CSRF attacks harder, though it does not prevent them on its own.

When stripping HTML, use strip_tags() and htmlentities() -- not addslashes().
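
As a rough illustration of the difference between the two: strip_tags() removes markup, while htmlentities() keeps it but makes it harmless to display. The helper name h() and the sample text are my own, borrowed from the Rails habit of a short escaping helper.

```php
<?php
# Escape text so the browser displays it instead of interpreting it as markup.
# ENT_QUOTES also escapes single and double quotes, for use inside attributes.
function h ( $text )
{
	return htmlentities($text, ENT_QUOTES, 'UTF-8');
}

$comment = '<b>Hi</b> there';
echo strip_tags($comment), "\n"; # markup removed
echo h($comment), "\n";          # markup escaped
```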

When stripping data for use in SQL, use mysql_real_escape_string($_POST['username']);

Use tokens to verify intent and that user came from a form. (See http://shiflett.org/articles/cross-site-request-forgeries for more info.)

if ( $_SERVER['REQUEST_METHOD'] == 'POST' ) {
	if ( !isset($_POST['token'], $_SESSION['token']) || $_POST['token'] != $_SESSION['token'] ) { // can also check token_timestamp for recentness
		exit('Invalid token!'); // or add an error message and redirect_to_self();
	}
} else {
	$token = md5(uniqid(rand(), TRUE));
	$_SESSION['token'] = $token;
	$_SESSION['token_timestamp'] = time();
	?>
	<input type="hidden" name="token" value="<?php echo $token; ?>" />
	<?php
}

Filtering

Use PHP 5.2's Filter extension.

$html = filter_input(INPUT_GET, 'html', FILTER_SANITIZE_SPECIAL_CHARS);
$url = filter_input(INPUT_POST, 'url', FILTER_SANITIZE_ENCODED);
$myinputs = filter_input_array(INPUT_POST, array('product_id' => FILTER_SANITIZE_ENCODED,
		'component' => array('filter'    => FILTER_VALIDATE_INT,
		'flags'     => FILTER_REQUIRE_ARRAY,
		'options'   => array('min_range' => 1, 'max_range' => 10)),
        ));
$email = filter_var('bob@example.com', FILTER_VALIDATE_EMAIL);
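
The validating filters return the cleaned value on success and false on failure, so the result should be compared with ===. Here is a small sketch using filter_var(); the wrapper function names are made up, and the 1 to 10 range just mirrors the example above.

```php
<?php
# Validate an integer within a range; returns the integer, or false if invalid.
function valid_component ( $value )
{
	$options = array('options' => array('min_range' => 1, 'max_range' => 10));
	return filter_var($value, FILTER_VALIDATE_INT, $options);
}

# Returns the address if it looks valid, or false otherwise.
function valid_email ( $value )
{
	return filter_var($value, FILTER_VALIDATE_EMAIL);
}

var_dump(valid_component('7'), valid_component('11'), valid_email('bob@example.com'));
```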

Error Messages

First of all, you should be using a custom error handler, to prevent the user from seeing any PHP errors. (See above.)

For error messages you want the user to see, there are 3 methods:

  1. Display a JavaScript alert(). (Generally reserved for VERY SERIOUS problems.)
  2. If you're redirecting to another page, you'll need to use a session.
  3. If you're not redirecting to another page, you can simply collect the errors and show them.
$MESSAGES = array();
function add_error ( $text ) {
	global $MESSAGES;
	array_push($MESSAGES, $text);
}
if ( !valid($XYZ) ) {
	add_error("ERROR: Cannot do that because XYZ is invalid.");
}

For errors related to form entry, there are 2 places to display them:

  1. At the top, above the first entry field of the form.
  2. Next to (or above or below) the entry field that the error pertains to. In this case, our $MESSAGES array should be an associative array; each field will (potentially) have an associated item (or maybe even array of items) in the array.
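
For the second option, the associative version of the messages array might be sketched like this. The function names and the sample messages are hypothetical; each field name maps to an array of messages.

```php
<?php
$GLOBALS['MESSAGES'] = array(); # field name => array of error messages

# Record an error against a particular form field.
function add_field_error ( $field, $text )
{
	$GLOBALS['MESSAGES'][$field][] = $text;
}

# Return the errors for one field, or an empty array if there are none.
function errors_for ( $field )
{
	return isset($GLOBALS['MESSAGES'][$field]) ? $GLOBALS['MESSAGES'][$field] : array();
}

add_field_error('email', 'ERROR: Email address is required.');
add_field_error('email', 'ERROR: Email address must contain an @ sign.');
```

In the form template, you would then loop over errors_for('email') next to the email field.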

Fixing register_globals

See http://us2.php.net/manual/en/faq.misc.php#faq.misc.registerglobals for resolving issues if it's turned off, and you need it on, or if it's on and you want it off.

But if you want it OFF, just set it in your .htaccess file.

php_flag register_globals off

Note that $_SESSION is not populated until session_start() is called.

If you want to UNSET everything in $GLOBALS that got set due to register_globals, then run this function (from http://www.phpguru.org/article.php?ne_id=60, where they called it dispelGlobals):

function unregister_globals ()
{
	if (!ini_get('register_globals'))
		return;
	if (isset($_REQUEST['GLOBALS'])) 
		exit('GLOBALS overwrite attempt detected');
	# Variables that shouldn't be unset. TODO: Should we add $_SESSION; does it get auto-globaled? Is $_FILES auto-globaled?
	$noUnset = array('GLOBALS', '_GET', '_POST', '_COOKIE', '_REQUEST', '_SERVER', '_ENV', '_FILES');
	# TODO: Should we sort these into ini_get('variables_order')?
	$input = array_merge($_GET, $_POST, $_COOKIE, $_SERVER, $_ENV, $_FILES, isset($_SESSION) ? (array)$_SESSION : array());
	# TODO: Consider looping through $GLOBALS instead of $input, and checking for $input instead of $GLOBALS.
	foreach ( $input as $k => $v ):
		if ( !in_array($k, $noUnset) && isset($GLOBALS[$k]) ):
			unset($GLOBALS[$k]);
		endif;
	endforeach;
	# NOTE: This doesn't actually do anything except flag for later users. Don't do this unless the above code worked.
	ini_set('register_globals', false); # TODO: Make sure this works as expected.
} 

File Downloads

  • Allow user to click on a link to save the linked document
  • The resulting document should use the following code:
header('Content-Disposition: attachment; filename="' . $filename_to_save_as . '"');
  • NOTE: IE cannot have spaces or a colon (:) in the filename
    • Recommended to replace those with underscores (_)
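
One way to apply that replacement before sending the header is sketched below. The function names are made up, and the second function only builds the header text, so the sketch can run outside a web request; a real page would pass the string to header().

```php
<?php
# Replace spaces and colons with underscores, per the IE note above.
function safe_download_filename ( $filename )
{
	return str_replace(array(' ', ':'), '_', $filename);
}

# Build the header line for a download with the given file name.
function content_disposition_header ( $filename )
{
	return 'Content-Disposition: attachment; filename="' . safe_download_filename($filename) . '"';
}

echo content_disposition_header('Report 2007-12-20 10:30.pdf'), "\n";
```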

Database Configuration

On shared hosting, it's nearly impossible to hide your files from other virtual hosts, since all virtual hosts run as the same user. The best advice is to store all information in the database. However, you need some way to store the database name, database user, and database password.

One way to accomplish this is to set some environment variables in the Apache config files, and make the config files readable only by root (and the owner of the files). Apache is started by root, but then changes to a less-privileged user AFTER it has read the config files and started listening on the port(s).

Unfortunately, this has to be done at the VirtualHost level, and won't work from .htaccess files, since Apache has already dropped root privileges once it gets that far.

It might be a good idea to use an Apache-config include file to store the protected info, so it's easy to ensure that it is protected. Here's an example:

In /etc/apache2/sites-available/mysite:

<VirtualHost *>
	ServerName mysite.com
	ServerAlias www.mysite.com
	UseCanonicalName On
	DocumentRoot /home/web/mysite.com/public
	Include /home/web/mysite.com/environment
	<Directory /home/web/mysite.com/public>
		AllowOverride All
		Options Indexes FollowSymLinks MultiViews IncludesNoExec
		Order allow,deny
		Allow from all
	</Directory>
</VirtualHost>

Then in /home/web/mysite.com/environment:

  SetEnv DB_USER "myuser"
  SetEnv DB_PASS "mypass"

If you only want to set the environment variables for PHP files within a certain directory, you can do this:

  SetEnvIf Request_URI "/path/to/my/directory" DB_PASS="mypass"

In the PHP code, you can then refer to these as $_ENV["DB_USER"] and $_ENV["DB_PASS"]. (They're also available in $_SERVER.)
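
A sketch of reading those settings in one place follows. It uses getenv() rather than the superglobals, so the same code can be tried from the command line; the function name is hypothetical, and the putenv() calls only simulate what SetEnv would provide.

```php
<?php
# Collect database settings from the environment.
# getenv() returns false for a variable that is not set.
function db_config_from_env ()
{
	$config = array();
	foreach ( array('DB_USER', 'DB_PASS') as $name ):
		$value = getenv($name);
		if ( $value === false )
			return false; # Missing setting; the caller decides how to fail.
		$config[$name] = $value;
	endforeach;
	return $config;
}

# ASSUMPTION: simulate the Apache SetEnv lines so the sketch can run from the CLI.
putenv('DB_USER=myuser');
putenv('DB_PASS=mypass');
$config = db_config_from_env();
```

Failing when a setting is missing, rather than falling back to a default, keeps a misconfigured virtual host from quietly connecting with the wrong credentials.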

The environment file can be owned by the site owner, and made readable only by the site owner and root.

chown mysite:mysite /home/web/mysite.com/environment
chmod 600 /home/web/mysite.com/environment

Note that if this file is changed, Apache will need to be restarted, or at least reload its configuration. But since the system admin has to create the database info anyway, this doesn't seem to be a major problem -- the system admin should just add this to his procedures for creating a new virtual host.

(Idea from PHP Cookbook via Chris Shiflett.)

Another way to do it is to use UNIX permissions, and use PHP via CGI instead of mod_php. This will run PHP as your user ID, instead of Apache. However, there's a serious performance impact from having to start a new process for each request. If you're on a host that doesn't have PHP running through CGI (or suExec or suPhp), you can implement it yourself via the technique described at http://www.sonic.net/support/faq/advanced/phpwrap/.

Database Layers

  • Lots to choose from
    • database-specific
    • PDO
    • ADOdb
    • PEAR::DB
  • For PHP 5.1 or newer, use the built-in PDO
    • Abstracts various databases
  • Direct access to MySQL
    • mysql
    • mysqli

Database Abstraction

See the section above about where to store the database configuration parameters (database name, database username, database password).

See the section above about initializing (opening the connection and selecting the database instance) the database in include/database.php file. (Although for my framework, I open a (possibly pooled) connection for each table, and include the connection in all calls to the database instead of setting the default database.)

Should have a thin abstraction library to wrap SQL calls. It should properly escape user-supplied data.

Example code:

$where_clause = array('id' => $user_id); # $user_id gets properly escaped. Multiple items are ANDed.
$where_clause = 'status = 2 OR status = 4'; # NOT RECOMMENDED! But useful for ORing conditions.
$options = array('order' => 'id ASC', 'limit' => 10, 'group_by' => 'name');
$column_names = array('id', 'name', 'middle' => 'middle_name'); # Not sure which order is best on the last one.
$column_names = 'id, first_name, last_name, middle_name AS middle';
$new_data = array('id' => 2, 'name' => 'Bob'); # Must be an associative array.
sql_select_all('*', table_name, where_clause, options)
sql_select_one(column_names, table_name, where_clause, options)
sql_insert(table_name, new_data, options)
sql_update(table_name, where_clause, new_data, options)
sql_delete(table_name, where_clause, options)
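
To give a feel for how the array form of the where clause could be handled, here is a sketch that turns it into a clause with placeholders plus a list of values to bind. This is not the code of any existing library; sql_where() is a name I made up, and it assumes the column names come from your own code, never from user input.

```php
<?php
# Turn array('id' => 5, 'status' => 2) into 'id = ? AND status = ?'
# plus the values to bind, in the same order. Multiple items are ANDed.
# ASSUMPTION: column names are trusted; only the values are left to the database to escape.
function sql_where ( $conditions )
{
	$parts = array();
	$values = array();
	foreach ( $conditions as $column => $value ):
		$parts[] = "$column = ?";
		$values[] = $value;
	endforeach;
	return array(implode(' AND ', $parts), $values);
}

list($clause, $values) = sql_where(array('id' => 5, 'status' => 2));
echo "SELECT * FROM users WHERE $clause\n";
```

The values would then be passed to a prepared statement, so the database layer does the escaping rather than string concatenation.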

PDO example:

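set_exception_handler('custom_exception_handler'); # Takes a callback name, not a bare word.
$db = new PDO('mysql:host=localhost;dbname=testdb', 'username', 'password', array(PDO::ATTR_PERSISTENT => true));
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION); # Throw exceptions instead of failing silently.
$db->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, true); # An instance method, so call it after connecting.
$db->exec('INSERT blah...');
$result = $db->query('SELECT * FROM people');
foreach ($result as $row) {print $row['first_name'] . $row['last_name'];}
# A single statement cannot mix named (:name) and positional (?) placeholders.
$statement = $db->prepare('SELECT * FROM people WHERE first_name = :first AND last_name = :last');
$statement->bindValue(':first', 'Craig'); # bindParam() needs a variable; bindValue() accepts a literal.
$statement->bindValue(':last', 'Buchek');
$db->beginTransaction();
try {
	$statement->execute();
	$statement->rowCount(); # Reliable for INSERT/UPDATE/DELETE; not guaranteed for SELECT on all drivers.
	$result = $statement->fetch(); # Fetches just 1 row; use fetchAll() to get all of them.
	$db->commit();
} catch (PDOException $e) {
	$db->rollBack(); # Only roll back if the transaction did not commit.
	throw $e;
}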

Object-Relational Mappers

  • Object-Relational Mapper (ORM)
  • Lets you easily use database rows as objects

Storing Sessions in Database

Per Chris Shiflett:

   CREATE TABLE sessions (
     id VARCHAR(32) NOT NULL,
     access INT(10) UNSIGNED,
     DATA text,
     PRIMARY KEY (id)
   );
session_set_save_handler('_open', '_close', '_read', '_write', '_destroy', '_clean');
function _open()
{
    global $_sess_db;
    if ($_sess_db = mysql_connect('127.0.0.1', 'myuser', 'mypass')) {
        return mysql_select_db('sessions', $_sess_db);
    }
    return FALSE;
}
function _close()
{
    global $_sess_db;
    return mysql_close($_sess_db);
}
function _read($id)
{
    global $_sess_db;
    $id = mysql_real_escape_string($id);
    $sql = "SELECT data FROM sessions WHERE id = '$id'";
    if ($result = mysql_query($sql, $_sess_db)) {
        if (mysql_num_rows($result)) {
            $record = mysql_fetch_assoc($result);
            return $record['data'];
        }
    }
    return '';
}
function _write($id, $data)
{
    global $_sess_db;
    $access = time();
    $id = mysql_real_escape_string($id);
    $access = mysql_real_escape_string($access);
    $data = mysql_real_escape_string($data);
    $sql = "REPLACE INTO sessions VALUES ('$id', '$access', '$data')";
    return mysql_query($sql, $_sess_db);
}
function _destroy($id)
{
    global $_sess_db;
    $id = mysql_real_escape_string($id);
    $sql = "DELETE FROM sessions WHERE id = '$id'";
    return mysql_query($sql, $_sess_db);
}
function _clean($max)
{
    global $_sess_db;
    $old = time() - $max;
    $old = mysql_real_escape_string($old);
    $sql = "DELETE FROM sessions WHERE access < '$old'";
    return mysql_query($sql, $_sess_db);
}
session_start();
// Use $_SESSION['varname'] however you want;

Helpers

  • Small piece of code that is useful in many locations, that does one simple thing.
  • Often used to help format things, like dates, money, etc.
  • Often used in validation of data.
function redirect_to ( $url )
{
    # Make sure URL is absolute. Required to be absolute by section 14.30 of RFC 2616.
    $url = absolute_url($url);
 
    # Output headers telling the browser to redirect to the desired URL.
    # NOTE: A 303 (See Other) is the more correct status after a POST, but RFC 2616 warns
    #       that many pre-HTTP/1.1 user agents do not understand it, so we send a 302.
    #       See http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html for details.
    header('HTTP/1.1 302 Moved Temporarily');
    header('Status: 302 Moved Temporarily');
    header("Location: $url");
 
    # We don't need to continue running any more PHP code, so exit.
    # Just in case the browser doesn't understand the headers we passed, send the new URL in the body.
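    # Escape the URL so it cannot break out of the href attribute.
    $safe_url = htmlspecialchars($url, ENT_QUOTES);
    exit("Go to <a href='$safe_url'>next page</a>.");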
}
function absolute_url ( $url ) {}
function validate_email_address ( $address ) {}
function validate_phone_number ( $phone_num_str ) {}
# From http://www.zend.com/lists/php-dev/200506/msg00232.html
function firstNotEmpty() {
    $vars = func_get_args();
    foreach($vars as $var) if (!empty($var)) return $var;
    return NULL;
} 
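
The empty stubs above could be filled in along the following lines. This is only a sketch, not the talk's own implementation: the extra $host and $scheme parameters are assumptions added so the function can be exercised outside a web request, and any path is treated as relative to the site root. The email check leans on filter_var(), which is available from PHP 5.2.

```php
<?php
# Hypothetical sketch: turn a site-relative URL into an absolute one.
# $host and $scheme are assumed extras; by default they come from $_SERVER.
function absolute_url ( $url, $host = null, $scheme = null )
{
    # Already absolute (starts with a scheme such as http://)? Leave it alone.
    if ( preg_match('#^[a-z][a-z0-9+.-]*://#i', $url) ) {
        return $url;
    }
    if ( $host === null ) {
        $host = isset($_SERVER['HTTP_HOST']) ? $_SERVER['HTTP_HOST'] : 'localhost';
    }
    if ( $scheme === null ) {
        $scheme = ( !empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] != 'off' ) ? 'https' : 'http';
    }
    # Simplification: every path is resolved against the site root.
    return $scheme . '://' . $host . '/' . ltrim($url, '/');
}

# Hypothetical sketch: returns TRUE if the address looks syntactically valid.
function validate_email_address ( $address )
{
    # filter_var() returns the filtered address on success and FALSE on failure.
    return filter_var($address, FILTER_VALIDATE_EMAIL) !== false;
}
```

Neither function checks that the host or mailbox actually exists; they only look at the shape of the string.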

Define array_get() if it's not already defined. Use it whenever you need to use a default if an array key/value isn't defined.

# From http://marc.info/?l=php-dev&m=118208519027980&w=2
if ( !function_exists('array_get') )
{
	function array_get ( /*array*/ $arr, /*mixed*/ $key, /*mixed*/ $default = false )
	{
		if ( array_key_exists($key, $arr) )
			return $arr[$key];
		else
			return $default;
	}
}
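
A short usage sketch for array_get(), with the definition repeated so the snippet runs on its own. The $params array and its keys are made up for illustration. Note that because array_key_exists() is used, a key that is present but set to NULL does not fall back to the default.

```php
<?php
if ( !function_exists('array_get') )
{
    function array_get ( $arr, $key, $default = false )
    {
        return array_key_exists($key, $arr) ? $arr[$key] : $default;
    }
}

# Hypothetical request parameters.
$params = array('page' => '3', 'sort' => null);

$page     = array_get($params, 'page', 1);        # key exists, so its value is used
$per_page = array_get($params, 'per_page', 20);   # key missing, so the default is used
$sort     = array_get($params, 'sort', 'name');   # key exists with a NULL value: NULL, not 'name'
```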

Dates

Whenever possible, output dates in ISO 8601 format (YYYY-MM-DD). This format is unambiguous to any human reader, and it also sorts properly in any alphanumeric sort.

See this GNU manual for an interesting and extensive treatise on the subject of date/time input formats.

Look into PHP 5.2 date_create(), date_modify(), etc. for date manipulation. Should probably use them in preference to date(), mktime(), and gmdate().
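
A minimal sketch of the PHP 5.2 date functions mentioned above, printing the result in ISO 8601 form. The starting date is arbitrary, and the time zone is set explicitly here only so the example does not depend on the server's configuration.

```php
<?php
# Assumed for the example: pin the time zone so the result does not depend on php.ini.
date_default_timezone_set('UTC');

$date = date_create('2007-12-20');   # parse a date string into a DateTime object
date_modify($date, '+1 month');      # changes the object in place
$iso = date_format($date, 'Y-m-d');  # format as YYYY-MM-DD

echo $iso, "\n";
```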

OOP

class MyClass
{
    public $anyone_can_access;
    protected $name = 'Bob';
    private $secret;
 
    function __construct($secret = 'password') {
        $this->secret = $secret;
    }
    function __destruct() {
        # NOTE: Don't call anything in here that might throw an exception.
    }
    public function xyz() { echo "Hello, {$this->name}!"; }
}
 
class OtherClass extends MyClass
{
    function __construct($name, $secret) {
        # Cannot access $this->secret;
        $this->name = $name;
        parent::__construct($secret);
    }
}
 
$x = new MyClass('better_password');
$y = new OtherClass('craig', 'rt35erIM4');
 
$x->xyz();
$y->xyz();

Normal Form

  • If you're processing a form, your PHP code should typically have a layout similar to this:
<?php
if ( 'POST' == $_SERVER['REQUEST_METHOD'] ):
	// ... process ...
	$name = ...;
	$show_name = ...;
	// ... process ...
	if ( !$errors ):
		redirect_to($VIEW_URL); // Or possibly $SELF, if the only view we have is the same as the edit form.
	else:
		// Redirect to the same page, but as a GET so that refreshing the page works.
		// NOTE: Have to use sessions to retain error messages from this POST to that GET.
		redirect_to_self(); // Or just fall through to HTML below. (Delete else clause.)
	endif;
endif;
function abc () { .... }
function xyz () { .... }
?>
<html>
....
<?php if ( $show_name ): ?>
      <p>Name: <?php echo htmlspecialchars($name); ?></p>
<?php endif; ?>
....
</html>
  • The HTML portion should not contain any multi-line PHP.
  • The PHP portion should not contain any multi-line HTML inside of echo/print statements.
  • If you accept HTTP DELETE or PUT verbs, check for them the same way as POST, but before the POST check. HTML forms can only send GET and POST, so also allow the verb to be set through a form field such as $_POST['_method'].
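
The last point can be sketched as a small helper. The function name request_method() and the decision to honor only PUT and DELETE as overrides are assumptions for illustration, not part of the talk.

```php
<?php
# Hypothetical helper: work out the effective HTTP verb for a request.
# Takes the arrays as parameters so it can be tested without a web server.
function request_method ( $server, $post )
{
    $method = isset($server['REQUEST_METHOD']) ? strtoupper($server['REQUEST_METHOD']) : 'GET';

    # Only a real POST may carry an override, and only for the verbs we accept.
    if ( 'POST' == $method && isset($post['_method']) ) {
        $override = strtoupper($post['_method']);
        if ( in_array($override, array('PUT', 'DELETE')) ) {
            return $override;
        }
    }
    return $method;
}

# DELETE and PUT are checked before POST, as in the "normal form" layout.
switch ( request_method($_SERVER, $_POST) ) {
    case 'DELETE': /* ... delete ... */   break;
    case 'PUT':    /* ... update ... */   break;
    case 'POST':   /* ... process ... */  break;
    default:       /* GET: show page */   break;
}
```

In the form itself, the override would travel as a hidden field named _method.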

MVC

  • Model - domain objects
  • View - primarily HTML
  • Controller - application logic
  • Web MVC is very different from GUI MVC
    • If you know about GUI MVC, forget about it when learning Web MVC
    • Model does not need to worry about updating views
  • Primarily used with data-based applications
  • Very helpful for larger applications

Controller

  • Top part of the "normal form"
  • Checks input, decides what to output

View

  • Bottom/HTML part of the "normal form"
  • Should contain minimal PHP code

Model

  • Mostly the "include" files
  • Domain objects
    • Model your business needs in code

Frameworks

  • Provides the basic "framework" of an application to get you started
  • Sort of like a "blank" application that doesn't do anything (yet)
  • You "fill in" the parts specific to your application
  • Provides lots of routines common to (a certain subset of) applications

PEAR

  • Library of code for PHP
  • Much like Perl's CPAN
  • Command-line program to access it
  • Code is of varying quality
  • Libraries often have dependencies
pear list-all
pear list-channels
pear install xyz
pear search phpunit

Testing

  • Automated testing of code is recommended
  • Can test small pieces (unit testing)
  • Can test the user's view (acceptance testing)
  • SimpleTest (unit or acceptance tests)
  • PHPUnit (unit tests)
    • NOTE: Several versions exist
  • PHPT (unit tests, used to test PHP itself)
  • Selenium (acceptance tests, written in JavaScript)
  • FitNesse (acceptance tests)

Problems with PHP

  • No namespaces.
  • Have to use $this all the time.
    • Should be implicit in many places.
  • Cannot directly get an item out of an array returned by a function.
    • Not allowed (syntax error): $first_item = $this->calculate_items()[0];
    • Workaround: $items = $this->calculate_items(); $first_item = $items[0];
  • Array literal syntax is bad.
    • $x = array(1, 2, 3);
  • Function names are inconsistent.
  • OOP feels bolted on.
  • Too much need for ob_start()/ob_get_clean() to capture output.
    • Too many functions echo output, instead of returning result as a string.
  • Poor object/collection hierarchy.
  • Confusing way to declare and use instance variables.
    • var $name; $this->name
    • $this->$name means something very different than $this->name
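
To illustrate the last point, here is a small sketch of the difference. The Person class and its properties are made up for the example, and it uses the PHP 5 "public" keyword rather than "var".

```php
<?php
# Hypothetical class, used only to show the two kinds of property access.
class Person
{
    public $name = 'Bob';
    public $nickname = 'Bobby';

    # $this->name: always the property literally called "name".
    public function name ()
    {
        return $this->name;
    }

    # $this->$name: the property whose name is stored in the variable $name.
    public function get ( $name )
    {
        return $this->$name;
    }
}

$person = new Person();
echo $person->name(), "\n";           # the "name" property
echo $person->get('nickname'), "\n";  # the "nickname" property, looked up by string
```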

Variable Arguments

func_get_args() # returns array of arguments
func_get_arg(n) # returns the argument at position n (counting from 0)
func_num_args() # returns number of arguments passed to the function
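
A brief sketch of these three functions in use. The function names sum_all() and describe_args() are made up for illustration.

```php
<?php
# Hypothetical example: add up however many numbers are passed in.
function sum_all ()
{
    $total = 0;
    foreach ( func_get_args() as $arg ) {
        $total += $arg;
    }
    return $total;
}

# Hypothetical example: report how many arguments were passed and what the first one was.
function describe_args ()
{
    if ( func_num_args() == 0 ) {
        return 'no arguments';
    }
    # func_get_arg() counts from 0, so this is the first argument.
    return func_num_args() . ' argument(s), first is ' . func_get_arg(0);
}
```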

Summary

  • Set up a good PHP work environment
    • Make it as close to the server setup as possible
  • Use good (consistent) coding conventions
  • Know when to use which type of quoting
    • Don't print out HTML -- use HTML mode
  • Make effective use of include files
  • Form handling and input validation
  • Database abstraction
  • Use a Model-View-Controller architecture
    • Separate code from HTML
    • Separate page controller logic from database object logic
  • Use testing to ensure you don't break things as you go
  • Use libraries, frameworks, and packages when possible

Resources

  • PHP In Action -- very good book for advanced topics
    • Best description of web MVC I've seen
  • The Pragmatic Programmer
  • Practices of an Agile Developer

Thanks

  • My Clients
  • STLLUG / SLUUG
  • PHP developers
  • Rails developers

Presentation Info