


PHP performance: array appending

Posted 23 September 2012, 9:28 PM
Warning: this is a pretty technical blog post!

Something I've always wondered about with PHP is how to efficiently append one array to another. There is no built-in way to do it directly, so you need to choose something a bit inefficient:
  1. Do a loop, doing a single append in each loop (seems slow because PHP has to run the loop as PHP opcodes, when a native implementation could be in pure C).
  2. As with

Read more
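The excerpt cuts off here, but the kind of techniques under comparison presumably include an explicit loop and the native helpers; a minimal sketch (the function names here are my own, not necessarily those in the post):

```php
<?php
// Technique A: explicit loop, one append per iteration. Each iteration
// runs as PHP opcodes, whereas a native implementation would be pure C.
function append_loop(array &$dest, array $src)
{
    foreach ($src as $item) {
        $dest[] = $item;
    }
}

// Technique B: array_merge, implemented natively, but it returns a
// fresh copy rather than modifying in place (and renumbers numeric
// keys, which is fine for sequential arrays).
function append_merge(array &$dest, array $src)
{
    $dest = array_merge($dest, $src);
}

$a = array(1, 2);
append_loop($a, array(3, 4));   // $a is now array(1, 2, 3, 4)
append_merge($a, array(5, 6));  // $a is now array(1, 2, 3, 4, 5, 6)
```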

I disagree with the map version being 'functional', since these
append functions aren't maps, they're folds/reductions; they
reduce a pair of arrays into a single array. Using the 'map'
function as alternative syntax to a 'for' loop doesn't make
things functional (especially when there are globals inside the
function ;) ), just like using singleton classes as alternative
syntax for globals doesn't make things encapsulated.

I'd say the functional way of doing this is 'array_merge'. A
theoretically faster functional approach would be to abandon arrays
and use difference lists since they can be appended in constant time.

I've thrown together a few functional alternatives, including explicit
folds and difference lists:
Code here

It uses a library of useful functional components which lives
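For the curious, the difference-list idea can be sketched in PHP with closures. This is my own minimal reconstruction of the technique, not necessarily the linked implementation:

```php
<?php
// A difference list represents an array as a function which prepends
// that array's contents to whatever array it is given.
function to_dlist(array $xs)
{
    return function (array $rest) use ($xs) {
        return array_merge($xs, $rest);
    };
}

// Appending two difference lists is just function composition: one
// closure allocation, regardless of how long the underlying arrays are.
function dlist_append($f, $g)
{
    return function (array $rest) use ($f, $g) {
        return $f($g($rest));
    };
}

// Converting back to an array applies the composed chain to array().
// This is where the delayed array_merge calls actually run, starting
// from the end of the chain and working backwards.
function from_dlist($f)
{
    return $f(array());
}

$d = dlist_append(to_dlist(array(1, 2)), to_dlist(array(3, 4)));
from_dlist($d);  // array(1, 2, 3, 4)
```

All the real merging work is deferred to from_dlist, which matches the 'destroy' step dominating the timings below.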

Here are my results:

TIME FOR 1: 0.00004 seconds
TIME FOR 2: 0.00011 seconds
TIME FOR 3: 0.00004 seconds
TIME FOR 4: 0.00005 seconds
TIME FOR 5: 0.00002 seconds
TIME FOR FUNCTIONAL 0: 0.00006 seconds
TIME FOR FUNCTIONAL 1: 0.00016 seconds
TIME FOR FUNCTIONAL 2: 0.00020 seconds
TIME FOR FUNCTIONAL 3: 0.00046 seconds (difference lists)

The functional alternatives all run slower on your benchmark problem
because they involve multiple layers of wrapper functions, plus they
all return copies rather than modifying in place (since they're all
pure functions). Difference lists are slowest, since they spend a lot
of time converting to and from the difference-list format (in this
case, partially applied functions).

I've also added another benchmark which appends 25 arrays together,
each with 1000 elements. They are appended one at a time to simulate
the behaviour of a result being accumulated by a program (rather than
doing something more efficient by considering them all at
once). Here are the results:

1000x25 FOR 1: 0.01030 seconds
1000x25 FOR 2: 0.02666 seconds
1000x25 FOR 3: 0.16402 seconds
1000x25 FOR 4: 0.22038 seconds
1000x25 FOR 5: 0.00189 seconds
1000x25 FOR FUNCTIONAL 0: 0.23336 seconds
1000x25 FOR FUNCTIONAL 1: 0.66304 seconds
1000x25 FOR FUNCTIONAL 2: 5.23630 seconds
CREATING 1000x25 DIFFLISTS: 0.00454 seconds
APPENDING 1000x25 DIFFLISTS: 0.00023 seconds
DESTROYING 1000x25 DIFFLISTS: 16.70157 seconds

This problem is interesting since, in the normal implementations, it
takes longer and longer to append another array, as the first argument
is bigger each time. For example:

Code (php)

$result = array();
technique_1($result, $first);
technique_1($result, $second);
technique_1($result, $third);
// ...one array at a time, up to...
technique_1($result, $twenty_fifth);

This doesn't matter for difference lists, since their append is
constant time, regardless of argument size. Looking at the append
operations, difference lists are an order of magnitude faster than
anything else; however the cost of converting to and from difference
lists is quite high. Converting back to an array also slows down as we
add more iterations to the benchmark, due to garbage collection (these
use a lot of memory!)

This is because we're sort of cheating, since we delay the 'real'
appending until the point where we pull an array out of the difference
list (hence the 'destroy' step takes the longest). However, this is
still a nice solution since it always ends up constructing the list
from the end backwards, like this:

Code (php)

$result = array_merge($twenty_fifth, array());
$result = array_merge($twenty_fourth, $result);
$result = array_merge($twenty_third, $result);
// ...down to...
$result = array_merge($first, $result);
$result = array_merge(array(), $result);

This means that every append is done with the smallest-possible first
argument, and since the cost of appending is linear in the length of
the first argument, this is a good solution. Note that we *use*
difference lists in the same way as ordinary arrays; this reversal
happens internally due to function composition.

Unfortunately this particular implementation is very heavyweight
(especially with memory, due to partially-applied functions building
up), so I haven't found the breakeven point above which it's better to
use this difference list implementation rather than arrays. Also,
difference lists are ideally suited to optimisations like function
inlining, but this usually doesn't happen with PHP since it's mostly
interpreted rather than compiled.

So what's the moral? I suppose it's that I'm too honest. I should have
just shown this ;)

1000x25 FOR 1: 0.01030 seconds
1000x25 FOR 2: 0.02666 seconds
1000x25 FOR 3: 0.16402 seconds
1000x25 FOR 4: 0.22038 seconds
1000x25 FOR 5: 0.00189 seconds
APPENDING 1000x25 DIFFLISTS: 0.00023 seconds
I think you get the award for "most intense post ever" Chris :lol:.

Wow, great job.

My brain doesn't really do that level of functional coding so well, but what it basically comes down to is: delay the merge, and do some extra [relatively simpler] work on retrieval. It's a good strategy, and as so often with optimisation, actually stepping back and thinking of the problem in different terms can be quite rewarding :).

You're right, I did abuse the term 'functional' a bit. To me, anything with a callback (i.e. functions as variables) counts as functional, but technically even PHP will internally be using a procedural loop to do array_map.

It would be quite interesting to see if changing Tempcode to lists of lists of output parts ("seq_parts" in the engine), instead of just one long list, might improve performance. So you'd just be extending a set of references rather than O(n) extending one big list. I think it probably would work, but further work would be needed to make cache serialisation optimise that structure back down again. That way you kind of get the best of both worlds:
  • Faster composition for run-time Tempcode
  • Fairly fast output, as it just becomes a 2D loop rather than all the functional overhead for the more theoretical difference-lists implementation
  • Optimised output for cached Tempcode, to reduce the complexity of what needs serialising/deserialising in caches, as well as holding in run-time memory.
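The idea could be sketched roughly like this ("seq_parts" is the real engine term; the functions here are hypothetical, not the Tempcode engine's actual API):

```php
<?php
// Hypothetical sketch: keep output as a list of lists ("seq_parts"),
// so appending a whole chunk is a single array append, rather than
// O(n) merging into one big flat list.
function seq_append(array &$seq_parts, array $chunk)
{
    $seq_parts[] = $chunk;  // constant-time: just stores the chunk
}

// Output becomes a 2D loop over the parts.
function seq_output(array $seq_parts)
{
    $out = '';
    foreach ($seq_parts as $part) {
        foreach ($part as $piece) {
            $out .= $piece;
        }
    }
    return $out;
}

// Flattening back down to one list, e.g. for cache serialisation.
function seq_flatten(array $seq_parts)
{
    return call_user_func_array('array_merge', $seq_parts);
}
```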

hmmm pretty interesting find. I certainly wouldn't have guessed those results.

Been reading the PHP source code. No wonder array_splice isn't efficient – it actually completely rebuilds the input array unnecessarily :S.
That explains the "performance-bound based on the length of the first array" thing I noticed.

Here's another technique, slightly slower than 1, which works on the basis of array_push being able to take an arbitrary number of parameters:


Code (php)

function technique_6(&$a,&$b)

It's frustrating, as without the overhead of going via call_user_func_array this would probably be much, much faster. The performance is hurt because I have to create a new array and put $a on the front, to encode the first parameter of array_push (the destination array).
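Reading between the lines, the body of technique_6 presumably looks something like this (my reconstruction of the approach described, not the original code):

```php
<?php
// array_push($dest, $x, $y, ...) appends any number of values
// natively, but to splat $b's elements into the call we must go via
// call_user_func_array, building a parameter array with a reference
// to $a on the front -- this extra array construction is the
// overhead complained about above.
function technique_6(&$a, &$b)
{
    $params = $b;                  // copy of $b's elements
    array_unshift($params, null);  // placeholder for the destination
    $params[0] = &$a;              // first parameter of array_push, by reference
    call_user_func_array('array_push', $params);
}
```

(References inside the parameter array are passed through to array_push by reference, which is why $a gets modified in place.)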

I'm perfectly capable of programming in C, so I could optimise PHP and make it faster, but alas, no time :(.

While I guess every microsecond counts, I am wondering: beyond the time differences, which in those microseconds aren't necessarily huge, what impact would each method have on memory?

As you say, some options leave you rebuilding an array unnecessarily, which I can only presume means wasted memory? And if so, would any of the methods reduce that? Perhaps if there's a large difference in memory consumption between one method and another, even a method that takes a few microseconds longer would be the better option to work with?
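For what it's worth, PHP can answer the memory question directly with memory_get_usage; a rough harness (my own sketch, and note it measures memory retained after the call, not transient peaks):

```php
<?php
// Measure how much memory a strategy leaves allocated, by holding on
// to its result while sampling memory_get_usage before and after.
function measure_retained($fn)
{
    gc_collect_cycles();
    $before = memory_get_usage();
    $result = $fn();  // keep the result alive while we sample
    $used = memory_get_usage() - $before;
    unset($result);
    return $used;
}

$big = range(1, 10000);

$loop_bytes = measure_retained(function () use ($big) {
    $dest = array();
    foreach ($big as $item) {
        $dest[] = $item;  // in-place appends
    }
    return $dest;
});

$merge_bytes = measure_retained(function () use ($big) {
    return array_merge(array(), $big);  // returns a copy
});

printf("loop: %d bytes retained, merge: %d bytes retained\n",
       $loop_bytes, $merge_bytes);
```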

I don't think memory is such an issue. It will fluctuate up and down as the code runs, but the real issue for ocPortal is the compound effect of doing a whole lot of composition in the Tempcode engine.

Plucking a figure out of thin air, if PHP had an efficient array compounder, maybe it would take 4% off the ocPortal time.

4% still isn't a lot, but each optimisation win like that we make, adds up.
