by Kevin Schroeder | 12:00 am

With the web being what it is today, there are many times when you want to aggregate data from several different sources and bring it together on a single page. I have not done much of that on my site simply because it means learning a bunch of different APIs. However, since YouTube is the #2 search engine, I figured it might not be a bad idea to aggregate some of my YouTube content on my page automatically. I don't necessarily want to write a blog post about each individual video I post, but I wanted some place where I could just list them.

I have two places where I post content: YouTube and Facebook. Polling each site on every request, however, is not conducive to a page that renders quickly. The thing you do NOT want to do is poll YouTube each time someone comes to a page. The way around this is to cache the results of the YouTube or Facebook query so that later requests can re-use the stored data. That makes most requests to the page much faster, since they don't have to re-load the data from YouTube or Facebook. However, there's a bit of a problem there as well. Every X minutes the cache will expire and someone will take the hit of connecting to YouTube. On a moderately low-traffic site such as mine, there is a decent probability that the cache will have expired between one page request and the next, and that hit is something I didn't want my users to endure. And, working for Zend, I can't have a page that renders slowly, can I?

So what I did was create a new Zend Server Job Queue task (I have detailed Job Queue tasks several times, and there should be links to several of those posts on the side) that connects to both YouTube and Facebook. The task inserts the results into a cache (you could use a database if you liked) so that someone who comes to the page sees the cached data rather than waiting on a poll of YouTube. From a settings perspective, the cache is set to never expire its content. But because I set the task to run once an hour, the content is still refreshed. Using this pre-population method I am able to keep requests snappy while at the same time providing mostly up-to-date content.
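To make the difference from an expiring cache concrete, here is a minimal plain-PHP sketch of the pre-population idea. It does not use Zend Server Job Queue or Zend_Cache; the function names and the temp-file path are made up for illustration. The point is only the split of responsibilities: the scheduled job does the slow fetch and overwrites the stored copy, and the page request only ever reads.

```php
<?php
// Hypothetical helper names; a plain-PHP sketch, not the Zend Server Job Queue API.

/**
 * Runs on a schedule (hourly in this article). Fetches fresh data and
 * overwrites the stored copy, so the cache never has to expire on its own.
 */
function refreshVideoCache(callable $fetch, string $cacheFile): void
{
    $data = $fetch(); // the slow remote call happens here, off the request path

    // Write to a temp file and rename it, so a reader never sees a half-written file.
    $tmp = $cacheFile . '.tmp';
    file_put_contents($tmp, serialize($data));
    rename($tmp, $cacheFile);
}

/**
 * Runs on every page request. Only reads what the job stored;
 * it never calls the remote API.
 */
function loadCachedVideos(string $cacheFile): array
{
    if (!is_file($cacheFile)) {
        return array(); // the job has not run yet
    }
    $data = unserialize(file_get_contents($cacheFile));
    return is_array($data) ? $data : array();
}

// Demo with a stub standing in for the YouTube/Facebook calls.
$cacheFile = sys_get_temp_dir() . '/video-cache-demo.ser';
refreshVideoCache(function () {
    return array(array('title' => 'Demo video'));
}, $cacheFile);
$videos = loadCachedVideos($cacheFile);
```

With this split, a visitor never triggers the remote call. The worst case is that they see data that is up to an hour old.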

The task to do this is relatively simple.  First I edit my application.ini file to set up the cache manager.

resources.cachemanager.video.frontend.name = Core
resources.cachemanager.video.frontend.options.automatic_serialization = true
resources.cachemanager.video.frontend.options.lifetime = null
resources.cachemanager.video.backend.name = File

With these ini settings, Zend_Application will automatically create an instance of Zend_Cache_Manager and set up a cache named "video" with the options specified. It also means I could create another cache by copying these configuration lines under a different name and giving it its own settings. Those could be different options, a completely different backend, or a different frontend.
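As a sketch of what a second cache could look like, the lines below define one alongside "video". The name "feeds", the ten-minute lifetime and the cache directory are all assumptions made up for illustration; they are not part of my actual configuration.

```ini
; Hypothetical second cache named "feeds"; the name, lifetime and path are assumptions.
resources.cachemanager.feeds.frontend.name = Core
resources.cachemanager.feeds.frontend.options.automatic_serialization = true
resources.cachemanager.feeds.frontend.options.lifetime = 600
resources.cachemanager.feeds.backend.name = File
resources.cachemanager.feeds.backend.options.cache_dir = APPLICATION_PATH "/../data/cache"
```

It would then be retrieved from the same cache manager with getCache('feeds'), just as getCache('video') is used for the cache above.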

Then I create my task class.

class Admin_Task_VideoPreCache extends Esc_Queue_TaskAbstract
{
    protected function _execute(Zend_Application $app)
    {
        // Fetch the list of uploads for the configured YouTube account.
        $yt = new Zend_Gdata_YouTube();
        $options = $app->getOption('video');
        $uploads = $yt->getUserUploads($options['youtube']['id']);

        // Store the feed in the "video" cache defined in application.ini.
        $manager = $app->getBootstrap()->getResource('cachemanager');
        /* @var $manager Zend_Cache_Manager */
        $manager->getCache('video')->save($uploads, 'youtube');

        // Ask Facebook for the account's videos with an FQL query.
        $query = 'SELECT title, description, embed_html FROM video WHERE owner=' . $options['facebook']['id'];
        $url = 'https://api.facebook.com/method/fql.query?query='.urlencode($query);
        $response = file_get_contents($url);
        $data = ($response === false) ? false : simplexml_load_string($response);
        if ($data === false) {
            // Keep the previously cached entries if the request or the XML parsing fails.
            return;
        }

        // Reduce the XML to plain arrays so the view does not depend on SimpleXML.
        $videos = array();
        foreach ($data->video as $video) {
            $videos[] = array(
                'title'       => (string)$video->title,
                'description' => (string)$video->description,
                'embed_html'  => (string)$video->embed_html
            );
        }
        $manager->getCache('video')->save($videos, 'facebook');
    }
}

Because the Zend_Application instance is always passed in, I can easily get at the predefined cache manager object when I need to store the data. In the task I use Zend_Gdata_YouTube to query YouTube, and I do a simple FQL query against Facebook to get the Facebook videos (which stopped working between test, staging and production. Go figure).

The next thing I have to do is make that data available to a view.  To do that I need to create a new controller action that queries the cache manager.

    public function myvideosAction()
    {
        $app = $this->getInvokeArg('bootstrap')->getApplication();
        /* @var $app Zend_Application */
        $cm = $app->getBootstrap()->getResource('cachemanager');
        /* @var $cm Zend_Cache_Manager */
        $this->view->youtube = $cm->getCache('video')->load('youtube');
        $this->view->facebook = $cm->getCache('video')->load('facebook');
    }

Then all I need to do in my view is iterate over the data and I'm pretty much good to go. Because the cache has been pre-populated, my visitors should never have to take the hit of populating it, and with the Zend Server Job Queue the task of populating the cache is extremely easy to do.
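For the view side, here is a rough sketch of iterating over the Facebook entries. The function name and the markup are hypothetical and this is not my actual view script. It only assumes the array shape the task stores: title, description and embed_html.

```php
<?php
/**
 * Builds an HTML fragment from the cached Facebook entries.
 * Each entry is expected to have the keys stored by the task:
 * title, description and embed_html.
 */
function renderFacebookVideos(array $videos): string
{
    $html = '';
    foreach ($videos as $video) {
        $html .= '<div class="video">'
            . '<h3>' . htmlspecialchars($video['title'], ENT_QUOTES, 'UTF-8') . '</h3>'
            . '<p>' . htmlspecialchars($video['description'], ENT_QUOTES, 'UTF-8') . '</p>'
            . $video['embed_html'] // emitted as-is: this is the embed markup supplied by Facebook
            . '</div>' . "\n";
    }
    return $html;
}
```

In a view script this could be called with $this->facebook. Since Zend_Cache_Core::load() returns false when nothing has been stored yet, it is worth checking is_array($this->facebook) first in case the job has not run.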

Comments

SiNGH

NiCE

^_^

May 29.2010 | 05:43 pm

raystrach

am i missing something?

although i don’t use zend server at the moment(i have an interest in it), your post seemed quite good – thanks for sharing. however, this is what i don’t get…

when i first came to this page, it took an age to load – so long that i stopped loading it and reloaded. on reloading it took over 15 secs to completely load.

now this may well have something to do with my crap adsl2 connection or my browser idiosyncrasies, but it still seemed a long time.

i tested your page with the google page speed tool and the results were not that great – it seems you could make a big improvement with relatively little effort. there is little or no compression of files and lack of browser caching of images and other static files. you don’t need me to tell you that this is going to slow downloading a page considerably.

this is not intended as a criticism, but, being a self taught web developer who is interested in learning more, i am interested to know the reasons for spending time on the server side

May 31.2010 | 09:23 pm

Kevin Schroeder

The slow response was most likely due to networking issues with my hosting provider. This page is 34 KB in size and takes roughly 100ms to receive the response back (without network issues). At best I might be able to compress the page down to 4 KB, but the actual packet savings on a site this small would be negligible. The time to first byte is 61ms and the total transfer time is 71ms (±15ms on my connection). So I might be able to save some time by compressing, but not much.

Because of the Zend Server Monitoring that I use on there I can see what my page execution time is for long running requests. I have defined that as being a request that takes longer than a second to run, which, if this were a high traffic site, would be too long. But for here it’s fine. The last request that took longer than a second to execute was on April 26th at 5:36pm. So the delay you encountered was most likely a networking issue.

May 31.2010 | 10:34 pm

Marc

Seems to be the only info I could really find regarding any pre-cache methods on the net. And you seem to have a pretty good understanding of this process.

What I will be doing is streaming video from my (a) server to the end-user. Would these pre-caching methods provided by Zend eliminate the need for on-demand streaming, ultimately minimizing bandwidth?

I’m just trying to get a clear picture of what you have, very nicely might I add, demonstrated in your article. Or, did I miss the point?

Jun 21.2010 | 08:41 am

Kevin

Pre-caching is a method for pre-rendering content on a web page so that people who visit the page don’t have to wait for it to be generated. It is useful when the content of a page takes a long time to generate. An example of this is my videos page, which relies on data coming from YouTube and Facebook. So the method that I am describing here does not do anything for streaming or anything like that. It is only for content generation.

Jun 21.2010 | 08:56 am

Marc

When you say “videos page which relies on ‘data’ coming from Youtube and Facebook”, what is the data? Simple text from the page(s), or are you pre-caching the actual video(s) from those pages?

I do not want to stream anything, so I was wondering if I could pre-cache the actual video using the Zend cache manager, and then serve up the cached version of the video to the end-user?

Jun 21.2010 | 09:00 am

Kevin

It only has to do with data about the videos being rendered on the page. The videos themselves are streamed directly from their sources.

Jun 21.2010 | 09:02 am

Marc

Man, that’s a shame.

Do you know of any methods to pre-cache video to serve to the end-user?

Jun 21.2010 | 09:04 am

Kevin

I suppose you could add it as a downloadable element, such as an image, so it would start downloading immediately. But I have my doubts as to whether that would be a good practice.

Jun 21.2010 | 09:42 am

Marc

Thanks for the assistance, Kevin.

Jun 21.2010 | 09:55 am

Chris Henry

Just out of curiosity, and having never used Zend Server, what’s the advantage of using this over say, cron and MySQL?

May 28.2011 | 05:57 pm

    Kevin Schroeder

    In this scenario, perhaps not a ton, since this is a simple recurring job. But in other scenarios where you need asynchronous execution the ZSJQ provides much more flexibility than cron and MySQL.

    May 31.2011 | 09:14 am
