PHP Generators – an example

I’ve ignored generators in PHP for some time, but recently realised why they can be quite handy 🙂

As an example, imagine you are querying a web service which returns data in chunks of up to 100 results.

To get results 100-200, it’s necessary to pass an ‘offset’ value to the query ($after).

The JSON response from the service looks a little like:

{
    "paging": { "after": "foo" },
    "data": [
        "stuff.we.care.about",
        "stuff.we.care.about",
        "stuff.we.care.about"
    ]
}

Naive PHP code for this (without error handling etc.) could look a bit like the below:

function get_things_in_chunks($after = null) {
    $client = new \GuzzleHttp\Client(/* config */);
    $response = $client->get(
        'https://graph.facebook.com/vX.Y/something/blah',
        [
            // array_filter() drops 'after' while it is still null (first request).
            'query' => array_filter(
                ['limit' => 100, 'q' => 'foo', 'after' => $after]
            ),
        ]
    );

    // json_decode() returns objects (stdClass) by default.
    return json_decode($response->getBody()->getContents());
}

So:

  • My consumer (whatever is calling ‘get_things_in_chunks’) needs to know how paging is specified in the response.
  • My consumer needs to be able to figure out whether there could be more results from the web service, in order to know when to make a new request.

So, if I want to keep iterating through the data returned until I meet some condition, my calling code could look a bit like the below:

<?php

$dt = new DateTime("1 year ago");

$after = null;

while (true) {
    $data = get_things_in_chunks($after);

    // was there data returned? if not, break out of while(..)
    if (empty($data->data)) {
        break;
    }

    foreach ($data->data as $post) {
        if (!check_created_at_before_dt($post->created_at, $dt)) {
            break 2;
        }
        // do something
    }

    // find the paging key/data for the next request; stop if there is none,
    // otherwise we would request the first page again forever.
    $after = $data->paging->after ?? null;
    if ($after === null) {
        break;
    }
}

We can simplify things for the consumer by using generators – like the example below.

The nice thing about this is that the caller no longer needs to know or care about the underlying response format from the web service, so it no longer needs to read the ‘paging’ / ‘after’ value or check whether ‘data’ contains anything.

<?php
function get_things_in_chunks()
{
    $client = new \GuzzleHttp\Client(/* config */);
    $after = null;

    while (true) {
        $response = $client->get(
            'https://graph.facebook.com/vX.Y/something/blah',
            [
                'query' => array_filter(
                    ['limit' => 100, 'q' => 'foo', 'after' => $after]
                ),
            ]
        );

        // json_decode() returns objects (stdClass) by default.
        $data = json_decode($response->getBody()->getContents());

        if (empty($data->data)) {
            return;
        }

        foreach ($data->data as $post) {
            yield $post; // magic here!
        }

        // no cursor means there are no more pages to request.
        $after = $data->paging->after ?? null;
        if ($after === null) {
            return;
        }
    }
}

Now the caller can look more like:

<?php
$dt = new DateTime("1 year ago");

foreach (get_things_in_chunks() as $post) {
    if (!check_created_at_before_dt($post->created_at, $dt)) {
        break;
    }
    // do something.
}

So now the caller only has to care about the things it should be caring about ($post), and does not need to know about the paging mechanism that’s in place.
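One side effect worth spelling out: a generator is lazy, so breaking out of the foreach early means later pages are never requested. Below is a minimal sketch of that behaviour without any network calls. The fake_fetch_page() and get_fake_things() functions and the $requests counter are made up for illustration and stand in for the real web service, which here is assumed to hold 10 items served in pages of 3.

```php
<?php
// Hypothetical stand-in for the web service: returns one page of results
// plus the cursor for the next page (null when there are no more pages).
function fake_fetch_page(?int $after, int &$requests): array
{
    $requests++;                    // count how many "HTTP requests" we made
    $all = range(1, 10);            // pretend the service holds 10 items
    $offset = $after ?? 0;
    $next = $offset + 3;            // pages of up to 3 items
    return [
        'data'  => array_slice($all, $offset, 3),
        'after' => $next < count($all) ? $next : null,
    ];
}

// Generator that hides the paging from the caller.
function get_fake_things(int &$requests): Generator
{
    $after = null;
    do {
        $page = fake_fetch_page($after, $requests);
        foreach ($page['data'] as $item) {
            yield $item;
        }
        $after = $page['after'];
    } while ($after !== null);
}

$requests = 0;
$seen = [];
foreach (get_fake_things($requests) as $item) {
    if ($item > 4) {
        break; // stop early; the remaining pages are never fetched
    }
    $seen[] = $item;
}
```

Here the loop stops part-way through the second page, so only two of the four possible pages are ever fetched.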

If multiple consumers are using the same ‘get_things_in_chunks()’ function, then there’s less repetition / usage becomes far simpler (no $after, no need to check if there are more posts etc).
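As a rough illustration of that reuse, here are two consumers sharing one generator, each with its own stopping rule. Both get_demo_posts() and take() are hypothetical names invented for this sketch; get_demo_posts() simply stands in for get_things_in_chunks() so the example runs without a web service.

```php
<?php
// Hypothetical stand-in for get_things_in_chunks(): yields posts one by one.
function get_demo_posts(): Generator
{
    foreach (['a', 'b', 'c', 'd', 'e'] as $id) {
        yield (object) ['id' => $id];
    }
}

// Hypothetical helper: yield at most $limit items from any iterable.
function take(iterable $items, int $limit): Generator
{
    if ($limit <= 0) {
        return;
    }
    foreach ($items as $item) {
        yield $item;
        if (--$limit === 0) {
            return;
        }
    }
}

// Consumer 1: only wants the first two posts.
$firstTwo = [];
foreach (take(get_demo_posts(), 2) as $post) {
    $firstTwo[] = $post->id;
}

// Consumer 2: wants every post.
$allIds = [];
foreach (get_demo_posts() as $post) {
    $allIds[] = $post->id;
}
```

Neither consumer mentions paging, and each call to the generator function starts a fresh iteration.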
