
HipHop PHP -- some guidance for programmers

We've recently switched our site over to Facebook's re-implementation of PHP, HipHop PHP ("HPHP"). This is a PHP compiler that gives a more than 2x speed improvement. As we're one of the first sites other than Facebook to implement it, it's been a bit of a case of navigating the stormy early-adopter waters.

After sorting out our source code a few months ago so it could run on HPHP, we've now actually gone through the process of finding (and working around) all the little bugs HPHP has. I've found and reported about 30 now. Right now there's certainly a need for caution and care in adoption if you're moving over an existing code base. This blog post will talk about some of the things a programmer should do (lessons I've learnt), beyond the obvious need to make sure their codebase meets HPHP's constraints (no 'eval', etc).

Do not compile at first

HPHP contains an interpreter as well as a compiler. It makes things a lot easier to test on the interpreter before you think about compiling. This will help you find compatibility problems, as the interpreter runs essentially the same runtime that compiled code uses. Plan for problems – it won't be completely smooth sailing, and the last thing you need is an inability to step through code to try and debug it. Each compile on HPHP is a complete recompile, so it takes a few hours; that's a nightmare if you're trying to test code changes or run test scripts.
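
As a sketch of what I do while testing (the paths below are placeholders, and while I believe the flags are right, double-check them against the HPHP wiki):

Code

      # run the hphpi interpreter as a web server against your existing codebase
      sudo /path/to/hphpi -m server -p 80 -v Server.SourceRoot=/path/to/your/site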

After you've been running without errors on a live site for a few days, you can then think about compiling.

Don't rush into this as a performance quick-fix; you need to plan to go through it as a process, gradually working your way up to full compilation.

Faster compilation

When you do compile, you may find it takes many hours to complete. This is mostly unavoidable for those of us who don't have our own server farms to throw at it (via distcc). However, there are two things you can do to speed compilation up…

  1. Install 'ccache'. This will speed things up a bit for identical recompiles, though honestly not hugely in most scenarios: if you make more than trivial code changes you'll get mostly cache misses.
  2. Make sure you tell make to do parallel compilation (so multiple cores/CPUs are properly utilised): export MAKEOPTS=-j4.
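
In practice that just means something like the following before kicking off a build (a sketch assuming a Debian-style system; 'build.sh' is a stand-in for whatever script you use to invoke the compiler):

Code

      sudo apt-get install ccache   # cache identical C++ recompiles
      export MAKEOPTS=-j4           # parallel make jobs; match your core count
      ./build.sh                    # your own script that invokes the hphp compiler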

Use nginx

Nginx setup is described on the HPHP wiki. It is great for a number of reasons:
  1. You can support SSL with it.
  2. You can proxy specific paths or domains that have compatibility issues with HPHP to an Apache instance running on a different port or a different server (see the sketch after the fallback example below). Remember that there are many bugs in HPHP at the moment, and also some constraints, so if you rely on third-party systems for parts of your site (e.g. a bug tracker, or wiki software), you will need to plan to off-load them.
  3. You can set up automatic fallbacks to Apache, so if you have to take HPHP offline, or if it crashes, you don't have to worry about continuity of service.

Fallbacks can be defined like this…

Code

      location /whatever {
        ...
        error_page 502 = @fallback;
      }

      location @fallback {
        ...
      }
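
Proxying a problem path off to Apache (point 2 above) is just another location block. A sketch, assuming Apache is listening on port 8080 on the same machine and '/bugs' is just an example path you want to off-load…

Code

      location /bugs {
        proxy_pass http://127.0.0.1:8080;
      }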

Plan for crashes

HPHP may well crash, so you need to plan for it. Currently I am hitting a number of different intermittent cases that cause segmentation faults. I'm sure in due time Facebook will fix all the issues (I plan on reporting them properly when I have a proper debug-build/core-dump environment running).

Right now I would suggest just running HPHP in a loop…

Code

      while true; do
        sudo ./program (...) >> data_custom/errorlog.php
        date >> data_custom/errorlog.php
      done

I load up this script using 'screen', and altogether it solves four things:
  • Initialisation
  • Automatic re-run if it crashes
  • 'screen' keeps it running in the background even if I close my console, but at the same time I can pull it back up and watch its output
  • I also explicitly redirect output to my own log file because PHP error logging isn't implemented directly

Put aside time to consider build scripts and configuration

I'd advise reading the HPHP wiki very carefully. There are quite a few options, and you should consider exactly what you want and make sure you understand them.

You'll also likely need to be quite careful about which files you pass to HPHP for compilation. I have a script that invokes 'find' to build a list of files, with a number of exclusions; the list is saved to a file and then passed to the compiler via '--input-list'.
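
As a sketch (the exclusions and file names here are placeholders for my real ones):

Code

      # build the list of PHP files to compile, excluding anything HPHP shouldn't see
      find . -name '*.php' ! -path './data_custom/*' ! -path './exports/*' > file_list.txt

      # pass the list to the compiler, along with whatever other options you use
      hphp --input-list=file_list.txt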

You'll probably want to write a few scripts. I have three:
  • One for a full compilation
  • One to invoke a compiled build as a web server
  • One to invoke hphpi as a web server
It's better to write reusable scripts than do it ad-hoc, as there are quite a few options to pass through.

Don't rely on PHP files for configuration

It's very common for web systems (such as our own ocPortal) to use PHP files for configuration. This is more secure on shared hosting environments. However, you don't want to have to do a full recompile in the event you need to change options.
I suggest you use the PHP file to chain-load an ini file of settings, using parse_ini_file.
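
A minimal sketch of what I mean (the file and option names here are made up; the point is that the compiled PHP file never needs to change, only the ini file does):

Code

      <?php

      // config.php gets compiled into the binary, but it only chain-loads the real
      // settings from an ini file that can be edited without a recompile
      $SITE_INFO = parse_ini_file(dirname(__FILE__) . '/settings.ini');

      // ...then read e.g. $SITE_INFO['db_host'], $SITE_INFO['db_user'] as needed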

HTTPS

You can get HTTPS via nginx. It actually is surprisingly easy to set up. However, one small word of caution: because nginx terminates the HTTPS and then proxies the requests to HPHP as plain HTTP, your code will not be able to dynamically detect that HTTPS is running (a common technique, used to know whether to serve pure-HTTPS resources and hence avoid the mixed-security browser warnings). Plan for this by either:
  1. writing your code to be aware of when it should be running HTTPS
  2. passing through some kind of extra parameter from nginx (see the sketch below)
  3. writing your code to not need to know whether it's running HTTPS, by using relative URLs for CSS sheets, JavaScript files, etc. This isn't always viable, especially if you're using scripts hosted elsewhere.
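
For option 2, the usual trick is to have nginx stamp the proxied request with a header, and then check for it in PHP. A sketch, assuming you add something like 'proxy_set_header X-Forwarded-Proto https;' to the HTTPS server block in nginx, and that HPHP exposes the header in $_SERVER the same way Zend PHP does…

Code

      <?php

      // Behind the proxy $_SERVER['HTTPS'] won't be set, so fall back to the
      // header nginx adds (the header name is just a common convention)
      function running_https()
      {
          if (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] != 'off') return true;
          return isset($_SERVER['HTTP_X_FORWARDED_PROTO']) && ($_SERVER['HTTP_X_FORWARDED_PROTO'] == 'https');
      }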

Read the outstanding bugs

Take a look at the open bugs. Right now the documentation isn't great, and there are some compatibility issues described that you should be aware of.

How HPHP compares to other PHP compilers

HPHP certainly isn't the first PHP compiler. In fact there are quite a few. However, it's the first PHP compiler that really has the backing to be used in real-world scenarios. The other ones, whilst ingenious, are even buggier than HPHP and have no real ongoing development – it's harsh but fair to say they're toys, written by very smart people, but without the resources to get things far enough.
Here's a summary:
  • Quercus. This is a Java implementation, but unfortunately there is at least one show-stopper bug in it right now to do with it corrupting data during execution. I reported it some time back and there's been no response – I think the project is basically dead, as I did not get a response when I e-mailed their developers either. Running on Java will also inherently limit performance and increase complexity (as you need a JVM), so I don't think it's an ideal solution. Also, they tie their most efficient implementation into their commercial licensing, and it's too expensive because it's a per-node licensing arrangement (not good when the problem being overcome is likely one of scaling). It's a shame, because Quercus can run on Google AppEngine – if it were more stable and supported, that could open up some great opportunities.
  • Project Zero. This is IBM's implementation, again in Java. It is commercially backed, but I don't really think it is about performance – it seems slower than Zend PHP to me. I think it's more about allowing PHP developers to work inside IBM's web platform. It has its share of bugs, but I think it's the best after HPHP; ocPortal basically can run on it (after I worked around some of those bugs). I wish IBM had teamed up with the Quercus guys for this – it's a case of NIH ("not invented here") leading to duplication of effort, and we really need focused efforts on compatibility and performance more than anything.
  • Phalanger. This is a .NET implementation. The performance doesn't seem great, so it's debatable whether to use it. Moreover, it is seemingly a very dead project that only got to where it is now due to university theses. You can't even run MySQL on the latest version, which has been out for months, because nobody has updated the MySQL plugin (you can compile it yourself – but who wants to create a full Visual Studio build environment? In fact, who wants to even deploy on Windows, or try and make Mono work and suffer pain in the process?). I had to downgrade to the 32-bit version as the 64-bit one kept crashing all the time. ocPortal basically can run on it (after I worked around some of those bugs).
  • Roadsend (the original). This was once commercial (in fact, we had a license), but became open source a while ago. It's a very well engineered project, but there are too many bugs and development has now stopped. I also think that the performance improvements are limited by it going through Scheme as an intermediate language.
  • Roadsend (the rewrite, "Raven"). The Roadsend creators are (in their spare time) making a new implementation, but to be honest I don't know if it will ever be finished. The design seems very similar to HPHP, which of course is here today. I feel sorry for the Roadsend team, as Facebook were very secretive about their development until recently and now HPHP is out they're really sidelined, despite their genius – Facebook should have probably hired some of these compiler developers secretly early on, to pool talent and focus efforts (although Facebook have done extremely well on their own).
  • PHPC. I don't know much about this one. As far as I understand, it is built on the idea of making it easier to turn PHP code into PHP extensions, to give a performance boost. Personally I think that's quite a complex and limiting way to go about things: you get all the complexity of compilers, plus the added complexity of trying to integrate extensions into Zend PHP and of making some kind of build system for that which isn't insanely complex and disruptive to the structure of a code-base.
  • Numiton. This is a commercial one and not really (as far as I can tell) available as a product. I'd like to try it, but I can't, and to be honest I suspect its insulation from the PHP community will mean it is the buggiest of all the implementations.
That seemed negative. I think the developers of these things deserve huge respect. They've approached a very challenging problem and come up with solutions that mostly work. But the final mile, achieving a 'complete solution' and an ideal model, is extremely difficult, and none of them have really made it.

The fact there are so many PHP compilers illustrates a point: the mainstream PHP implementation (Zend) is inherently a sub-optimal design. It isn't multi-threaded, which means it relies on multiple processes to dispatch requests, and hence a lot more memory and process juggling than you'd ideally have. It is also an interpreter; even with opcode caching, it doesn't do particularly efficient execution (Java's a lot faster, for example, even though that is also in some sense interpreted). Lastly, it's a real memory hog: because of the design of how variables are stored, it takes many times more memory than it really needs to. PHP is great and Zend is great (which explains its overwhelming popularity), but the advantages gained from compilation are very real, though only for those who have high performance needs.

ocPortal certainly doesn't need a PHP compiler to run well, but even with the large amount of time invested in making this work, we're saving money overall compared to buying the high-performance server we'd need to handle the number of hits we get (and are planning for) as fast as we ideally want.
Besides, I enjoy this stuff and keeping our users on the cutting edge.

Whilst HPHP is certainly rough around the edges right now, it has an excellent design, and I think it makes exactly the correct trade-offs. Facebook have done an amazing job. I also expect they will continue to improve performance – a few people, including myself, passed on some further performance-improvement ideas a few months ago which seemed to be well received, so hopefully right now Facebook are working on further speed gains (if any Facebook devs want to reveal future plans it would please me ;)).
