DomDocument::loadXml not throwing exceptions in PHP

For some reason, Zend have decided not to make PHP throw an exception when you try to load invalid XML into a DOMDocument object. This includes XML containing invalid characters, e.g. an unescaped &. PHP raises a warning instead, so wrapping the call in a try/catch does absolutely no good whatsoever, which is annoying.

Is there a good way of trapping runtime errors using loadXml with the DomDocument library?
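
One approach that seems to work is to ask libxml to collect its errors internally and then throw an exception yourself. The sketch below assumes PHP 5.1 or later (where libxml_use_internal_errors() is available); loadXmlOrThrow is a made-up helper name, not part of PHP.

```php
<?php
// Hypothetical helper (not part of PHP): turns a failed loadXML() into an exception.
function loadXmlOrThrow($source)
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $document = new DOMDocument();
    $loaded = $document->loadXML($source);

    // Gather the parser's messages before clearing them.
    $messages = array();
    foreach (libxml_get_errors() as $error) {
        $messages[] = trim($error->message) . ' (line ' . $error->line . ')';
    }

    // Restore the previous error-handling mode before returning or throwing.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($loaded === false) {
        throw new RuntimeException('Invalid XML: ' . implode('; ', $messages));
    }

    return $document;
}

try {
    // The unescaped & makes this document invalid.
    loadXmlOrThrow('<labels><title>Fish & chips</title></labels>');
} catch (RuntimeException $e) {
    echo 'Caught: ', $e->getMessage(), PHP_EOL;
}
```

With a wrapper like this, the try/catch goes around the helper rather than around loadXML itself.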

PHP aaaaarrrrrggggghhhhhh

Things I hate about PHP No. 512 (notwithstanding the fact that in comparison to many other things I like PHP):

Inconsistency in parameter ordering

Say I want to find a small thing in a big thing, like a piece of a string in a bigger string. With strpos, you pass in the string you’re looking in first and the string you want to find second. So strpos("PHP aaaarrrgggh", "PHP") returns 0 (which looks just like false unless you compare with ===, but I’m sure that’s been ranted about enough).

But if I want to find, for example, a key in an array (with array_key_exists, say), I pass in the key I’m looking for first and the thing to look in second. So I have to remember the exact syntax of every single minuscule stupid little function and which order to pass parameters in, because there’s absolutely no consistency.
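
To make the complaint concrete, here is a small sketch of the two orderings side by side. The $settings array is made up for illustration.

```php
<?php
// strpos: the thing to look in comes first, the thing to find second.
$position = strpos('PHP aaaarrrgggh', 'PHP');
var_dump($position);           // int(0)
var_dump($position == false);  // bool(true): 0 and false look the same to ==
var_dump($position === false); // bool(false): === tells them apart

// array_key_exists: the thing to find comes first, the thing to look in second.
$settings = array('debug' => true); // hypothetical array for illustration
$hasDebug = array_key_exists('debug', $settings);
var_dump($hasDebug);           // bool(true)
```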

PHP 4 foreach as references

PHP 4 doesn’t seem to create references for objects in foreach loops. For example, the following will not change the original objects:

foreach($placeholder as $contentBlock)
{
 $contentBlock->setPosition(1);
}

The value of the position property in the original object will remain unaffected. This is quite rubbish. To get round it you need to get all the array keys and then create a separate variable to store a reference to the object you want to change:

foreach($placeholder as $key => $value)
{
 $contentBlock =& $placeholder[$key];
 $contentBlock->setPosition(1);
}

The variable $value isn’t used at all but if you’re stuck using PHP 4 then this is the best you can do. Luckily PHP 5 works much more sensibly, but we don’t always get a chance to use that so I’m stuck with this slightly messy alternative.
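
For comparison, here is a sketch of the same loop on PHP 5 or later, where objects are passed around as handles and the plain foreach changes the originals. ContentBlock here is a made-up stand-in for the real class, with just enough in it to show the effect.

```php
<?php
// Hypothetical stand-in for the content block class used above.
class ContentBlock
{
    private $position = 0;

    public function setPosition($position)
    {
        $this->position = $position;
    }

    public function getPosition()
    {
        return $this->position;
    }
}

$placeholder = array(new ContentBlock(), new ContentBlock());

// On PHP 5+, $contentBlock refers to the same object that sits in the array.
foreach ($placeholder as $contentBlock) {
    $contentBlock->setPosition(1);
}

echo $placeholder[0]->getPosition(), PHP_EOL; // prints 1
```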

Foreach in the PHP docs

PHP 4 annoyances ( = rant)

After working with PHP 5 for quite some time now, I’ve had to go back to PHP 4 to develop a CMS for a client’s site where we don’t have much control over the hosting environment. Going back to the old version has annoyed me quite a few times already, and it’s probably made worse by the fact that I never really did much PHP 4 anyway (I’m quite new to the language, even though I have been programming in it on and off for 2 years). Lately I’ve been working in a combination of PHP 5 and C# (as well as various others) and generally being a bit more object oriented about everything.

First off there’s the whole not really object orientedness of PHP 4. PHP 5 isn’t that object oriented, but PHP 4 is miles off. Passing variables by value instead of reference unless you put stupid & symbols everywhere… whose idea was that? Also it’s obviously a bit hacked together behind the scenes, as exemplified by the fact you can’t do things like $this->getThing()->getOtherThing() without it barfing. Instead you have to put the returned value from the first thing into a variable and then call the next method… not a good sign in an interpreter.

You can just about force yourself not to access member variables of objects directly, even if you can’t set things to be protected or private, and make get() and set() methods everywhere, but sometimes it’s really useful to put things into static classes. Database routines would be really useful, for example. Instead I’ve taken to creating a great big global variable to hold an object and refer to it everywhere and use the member variables of that object to store other useful objects. At least methods can be called statically, but without static member variables they’re not as much use. (I use a similar technique when programming in Lingo to save having to reference all my globals in each script: store them as properties of another object that keeps state for the whole application.)

And very little XML support either… Even ASP could handle XML well, and even XSL, four or five years ago, and ASP as we all know is the lowest form of programming language ever invented. (Okay, there are worse languages than VBScript, but not many that sit behind so many important web sites. I once worked on a half million pound project that was built in ASP… Scary!)

Rant ends

PHP 5 Static class variable inheritance

PHP 5 doesn’t seem to attempt to implement any kind of inheritance for properties within static classes. This can mean having to duplicate code within static subclasses. As an example, we would like to have a parent class (this is using the Singleton pattern) such as this:

abstract class DataTable
{
 protected static $instance;

 protected function __construct()
 {
  // No parent::__construct() call here: DataTable has no parent class.
 }

 public static function getInstance()
 {
  if(!isset(self::$instance))
  {
   self::$instance = new self();
  }
  return self::$instance;
 }
}

And then to create a subclass to inherit from it:

class PageTable extends DataTable
{
}

And then creating an instance of PageTable could be done by running a line of php such as this:

$pageTable = PageTable::getInstance();

Which would then return an instance of PageTable.

Then we might want another subclass such as:

class ContentTable extends DataTable
{
}

And so be able to create an instance of ContentTable with:

$contentTable = ContentTable::getInstance();

However, what really happens is that first PageTable::getInstance() is called, and since PageTable itself doesn’t have a method defined as getInstance(), the parent method is run. Inside that method self:: refers to DataTable, the class where the code is written, not to PageTable, the class it was called through. So new self() tries to create a DataTable rather than a PageTable (and, with DataTable declared abstract, fails outright), and the $instance property that gets set belongs to DataTable and not to PageTable. Even declaring a $instance property local to PageTable has the same result, as self::$instance still refers to the parent class (which is where the method is run from). Thus, even with a non-abstract parent, when ContentTable::getInstance() is called the value of self::$instance is detected as being set and the object created by the first call is returned. This is obviously not the intention.

The only way around this is to duplicate code by adding an $instance property to each subclass, as well as a duplicate of the getInstance() method.

As an aside, the same is also true of class constants which are also referred to with the self:: caller. From this we can infer that self:: always refers to the context of the current code block not the class from which a method is called.
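
For what it’s worth, PHP 5.3 and later add late static binding, which gives a way to refer to the class a method was called through. The sketch below assumes PHP 5.3+ and keeps one instance per class in an array keyed by class name. That array is my own arrangement for illustration, not something the language requires.

```php
<?php
abstract class DataTable
{
    // One slot per concrete class, keyed by class name (an illustrative choice).
    private static $instances = array();

    protected function __construct()
    {
    }

    public static function getInstance()
    {
        // get_called_class() names the class the call was made through (PHP 5.3+).
        $class = get_called_class();
        if (!isset(self::$instances[$class])) {
            // new static() instantiates the called class rather than DataTable.
            self::$instances[$class] = new static();
        }
        return self::$instances[$class];
    }
}

class PageTable extends DataTable
{
}

class ContentTable extends DataTable
{
}

$pageTable = PageTable::getInstance();
$contentTable = ContentTable::getInstance();

echo get_class($pageTable), PHP_EOL;    // prints PageTable
echo get_class($contentTable), PHP_EOL; // prints ContentTable
```

On versions before 5.3 neither static nor get_called_class() is available, so the duplication described above still applies there.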

Using importNode and appendChild with PHP 5 DOM

importNode is one of the DOM functions in PHP 5 that I struggled with for a while. What I wanted to do was to take an XML node from one document and insert it into another, and somehow that wasn’t particularly easy. I tried using ‘appendChild’ but kept getting ‘wrong document’ error messages. Once it was working it seemed obvious, but the example in the documentation wasn’t entirely clear to me.

As an outline, the steps that need to be gone through are:

  1. Import the node you want into the destination document. This is done by calling the importNode method and storing the result in a variable (the crucial step). At this stage, the node belongs to the document but won’t appear anywhere if you print it out, which may seem odd.
  2. Append the stored node to the destination document in the place you want it.

The thing that threw me is that the importNode method doesn’t so much import the node as make a copy of it in the destination document – so the original is actually left untouched. This seems to be standard across XML DOM methods in other languages such as C# and JavaScript.

The code then is as follows. This is for taking a complete document and copying it into a new DOMDocument object. The existing XML is assumed to be loaded into $oldXML.

$xml = new DOMDocument();
$xmlContent = $xml->importNode($oldXML->documentElement,true);
$xml->appendChild($xmlContent);

It is important to use the second parameter ‘true’ in importNode as this tells the method to import all children as well as the selected node. The node that $xmlContent is appended to can be any DOMElement. Note that importNode is a method of the DOMDocument (and must always be) as it is the document as a whole that the new node is being imported into, not a specific node.
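
As a slightly fuller sketch, here is the same technique used to copy a single node into a specific element of another document. Both XML strings and the element names (labels, title, page, header) are made up for illustration.

```php
<?php
// Source document holding the node we want to copy.
$source = new DOMDocument();
$source->loadXML('<labels><title>My title</title></labels>');

// Destination document with an element to receive the copy.
$destination = new DOMDocument();
$destination->loadXML('<page><header/></page>');

// Step 1: import (copy) the node into the destination document and keep the result.
$title = $source->getElementsByTagName('title')->item(0);
$imported = $destination->importNode($title, true);

// Step 2: append the imported copy where it should appear.
$header = $destination->getElementsByTagName('header')->item(0);
$header->appendChild($imported);

echo $destination->saveXML();
```

The original title node is still in $source afterwards, since importNode makes a copy rather than moving it.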

importNode in the PHP 5 DOM documentation
appendChild in the PHP 5 DOM documentation

Using removeChild with the PHP 5 DOM

The documentation for PHP 5’s DOM functions isn’t at its most helpful yet, so I thought an example of how to use ‘removeChild’ wouldn’t go amiss.

Assuming first that you have some DOMDocument XML in a variable called $xml, that may look something like this:


<labels>
 <title>My title</title>
</labels>

The first thing to do is to get a handle on the node you want to remove. We’re going to remove the node named ‘title’:

$node=$xml->getElementsByTagName("title")->item(0);

Then, simply remove $node from its parent:

$xml->getElementsByTagName("labels")->item(0)->removeChild($node);

Not too much to it at all.
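
Put together as a self-contained sketch, with the XML string assumed for illustration. This version removes the node via its parentNode property, which saves looking the parent up by name.

```php
<?php
$xml = new DOMDocument();
$xml->loadXML('<labels><title>My title</title></labels>');

// Get a handle on the node to remove.
$node = $xml->getElementsByTagName('title')->item(0);

// item(0) returns null when nothing matches, so check before removing.
if ($node !== null) {
    $node->parentNode->removeChild($node);
}

echo $xml->saveXML();
```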

PHP DOM removeChild documentation

PHP loop benchmarking

This benchmark of different ways of looping over a hash array is very interesting. Some things aren’t too surprising: counting the elements in an array once before looping over them is faster than counting them on every pass, as could be expected. But it’s good to see how fast the built-in functions run.
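
As a rough illustration of the counting point, here are the two loop shapes being compared. The data is made up and no timings are claimed here; this only shows what differs between them.

```php
<?php
$items = range(1, 1000); // hypothetical data

// count() is called again on every pass through the loop.
$sumCountedEachTime = 0;
for ($i = 0; $i < count($items); $i++) {
    $sumCountedEachTime += $items[$i];
}

// count() is called once, before the loop starts.
$sumCountedOnce = 0;
$total = count($items);
for ($i = 0; $i < $total; $i++) {
    $sumCountedOnce += $items[$i];
}

// Both loops do the same work and produce the same result.
var_dump($sumCountedEachTime === $sumCountedOnce); // bool(true)
```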

PHP loop benchmarking

PHP 5 and the magic __toString() method

Working with PHP 5 I thought the ‘magic’ method __toString would be a really great way of substituting objects for simple data types. That seemed the whole point of good object oriented design, so I could change the way a piece of data worked without having to track down every call to it and change that. Unfortunately, it seems that __toString() is only called if used directly from an echo or print statement. Now: what is the point of that, really?

I can see that we don’t want it called by default everywhere, otherwise there would never be a way to grab a reference to an object. But surely there are a large number of ‘string only’ functions that could invoke it? If I set up an object called $parameter with a method of __toString() and call it in code with echo "This is the value: " . $parameter then there really can’t be much else I’m doing with it than using it as a string. If the __toString() method isn’t called then, why do we have it at all?
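
For reference, from PHP 5.2.0 onwards __toString() is called in any string context, not just directly from echo or print. The sketch below assumes 5.2 or later; the Parameter class is a made-up example.

```php
<?php
// Hypothetical value object for illustration.
class Parameter
{
    private $value;

    public function __construct($value)
    {
        $this->value = $value;
    }

    public function __toString()
    {
        // __toString() must return a string, so cast explicitly.
        return (string) $this->value;
    }
}

$parameter = new Parameter(42);

// Concatenation is a string context, so __toString() is invoked here on PHP 5.2+.
$message = "This is the value: " . $parameter;
echo $message, PHP_EOL; // prints "This is the value: 42"
```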

It seems that this may be another case where PHP’s lack of strong typing is severely limiting its future development as a robust object oriented language, and the __toString() implementation smacks of a half-implemented hack to me.

PHP 5 magic __toString() method on the Zend site