Mutating a hierarchical array in PHP

A common setup for Express apps is to configure a template engine to handle rendering views. Alternatively, with some help from Babel you can embed JSX directly into your route definitions, n

Reading JSON-formatted configuration files into PHP can be a headache, especially when you need to populate default values.

JSON

[
  { "title": "About" },
  {
    "title": "Resources",
    "children": [
      { "title": "FAQs", "url": "/resources#faqs" },
      { "title": "Features" }
    ]
  },
  {
    "title": "Newsroom",
    "children": [
      { "title": "Press Releases", "url": "/newsroom#press-releases" },
      { "title": "Press Kit", "url": "/newsroom#press-kit" }
    ]
  },
  { "title": "Contact" },
  { "title": "En Espanol", "url": "/es" }
]

Let's look, for example, at a JSON file storing a website's navigation structure (above). See how some of the elements have a "url" key, while others don't?

Here's a challenge: write a function, in PHP, that sets a URL key (when it's missing) on each element of this tree.

  • The URL should be computed from the "title" key of the same element.
  • Assume it's expensive (in time or space) to copy elements, so you'll need to operate on the tree in-place.

In PHP, you can use array_walk_recursive() to modify each element of a tree. With a little bit of effort, you can even use that function to modify a tree in-place. Its shortcoming, though, is that it only operates on a single key at a time. In other words, array_walk_recursive() won't let us compute one key based on another - a "url" based on a "title".


Here's the solution I came up with instead:

PHP

class ref_stack {
  private $stack = array();
  private $i = -1;

  public function push(&$item){
    $this->i++;
    $this->stack[$this->i] =& $item;
  }

  public function &pop(){
    if($this->i < 0)
      return null;

    $item =& $this->stack[$this->i];
    $this->i--;
    return $item;
  }

  public function is_empty(){
    return $this->i < 0;
  }
}
class config {
  // Associative array of list items
  public $items = array();

  public function __construct($json)
  {
    $this->items = json_decode($json, true);

    // preorder traversal to populate all 'url' keys if missing (function of 'title')
    for($i = 0; $i < count($this->items); $i++) // can't use foreach, since we want to modify $this->items in-place (by value vs. by reference)
    {
      $item =& $this->items[$i];
      $stack = new ref_stack();
      $stack->push($item);

      while(!$stack->is_empty()){
        $cur_item =& $stack->pop();

        // visit $cur_item - this is a lame title to url converter, you can write better!
        $cur_item['url'] = '/' . strtolower($cur_item['title']);

        $has_children = array_key_exists('children', $cur_item);
        if(!$has_children)
          continue;

        for($child_i = 0; $child_i < count($cur_item['children']); $child_i++)
        {
          $child =& $cur_item['children'][$child_i];
          $stack->push($child);
        }
      }
    }
  }
  // ... other methods here
}

Normally you could use SplStack, or even a plain array (via array_push and array_pop), to manage the stack for an iterative tree traversal. The problem is the tree, $this->items above, is effectively one big value type. Since the traversal must modify the tree in-place, the stack needs to operate on references, not values. Otherwise, tree elements would be copied onto the stack, with the original tree unchanged after traversal.

Neither of SplStack's push or pop methods receive or return (respectively) stack elements by reference. array_push and array_pop likewise operate directly on values. Hence, the need for a ref_stack class above. It's a barebones implementation of a stack backed by a PHP array, but with care taken to preserve references as part of any push or pop operation.

The last trick in this code was avoiding foreach. When looping through an array of value types, the foreach construct makes copies of each element it encounters, and those copies are what's used in the body of the loop. In order to iterate over an array without generating copies, I had to use standard for loops, coupled with assign-by-reference.

In reality, this problem is somewhat contrived, if you accept the current rule of thumb that space is cheap. Fair enough. Even so, it's a tricky problem even if you make copies of the entire tree. You'll have to traverse both simultaneously - certainly do-able, but unpleasant.

© 2021 David Mann. All content provided on this site, code or otherwise, is licensed and/or copyright as specified below: