ZendFramework (performance) II
(Apologies, I still don't have another blog. :-))
Just a brief follow-up to my earlier article.
A disclaimer which I should have added to my last article: most of my pseudo benchmarks are very subjective and also way too basic. For example, our server setup is pretty comprehensive, but we would have to take everything into account in order to provide a real benchmark. And when I write everything, I mean CPU (cores), RAM, motherboard, HDD and so on. Maybe even the throughput of the network card -- if it's different.
Having said this, here are a couple follow-ups.
1) require/include(_once) and __autoload, or "Why is __autoload() 'better'?"
A lot of people asked me if __autoload() wasn't slower than a straight include, and of course they are correct.
They are correct because invoking __autoload() adds overhead before anything else happens; inside __autoload() there's just an include, too. When coding a simple application the developer should be on top of the code and all its dependencies, but sometimes they (or we) are not. ;-)
To illustrate my point from the above, look at the following example:
Faster:
<?php
include 'Foobar.php';
$foobar = new Foobar();
echo $foobar->hello('World');
?>
Slower:
<?php
function __autoload($class) {
    include "{$class}.php";
}
$foobar = new Foobar();
echo $foobar->hello('World');
?>
Even faster (though not really maintainable):
<?php
class Foobar {
    public function __construct() {}
    public function hello($var) { return $var; }
}
$foobar = new Foobar();
echo $foobar->hello('World');
?>
Alas, it's not always that easy. Applications are often bigger than a single page, and we trade the manual tracking of our dependencies for calls to require/include_once or, better, __autoload(). :-) And because the Zend Framework is full of require_once calls, which (as I explained) I stripped, I chose the less expensive route in this case: __autoload().
The bottom line is that it's all about convenience.
One reason for not being on top of things is that maintaining dependencies inside a framework such as Zend Framework is a nightmare. This is not about its 6 or so MB (mini release). It's also not about 20 MB -- we can all agree on the fact that disk space is cheap!
It's just that even though not all the files are loaded when we use only certain components of the framework, that does not make it easier for the developer to figure out which classes (and essentially files) are used. Not in a very straightforward way at least.
We could use Xdebug to figure out what is loaded and where - but that requires us to walk through everything our application does, revisiting this process when we add features and it overall adds to whatever we are working on already, thus taking away all the RAD benefits and reasons why we use a framework to begin with.
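For a rough first look without Xdebug, PHP's built-in get_included_files() lists every file loaded during the current request. What follows is only a minimal sketch, not something I run in production: summarizeIncludedFiles() is a hypothetical helper name, '/path/to/library/Zend/' is a placeholder for wherever your copy of the framework lives, and the closure assumes PHP 5.3 or later. It only sees the code paths a given request actually exercised, so the caveat about walking through everything the application does still applies.

```php
<?php
/**
 * Split a list of loaded files into those under a directory prefix
 * (for example the framework's library path) and everything else.
 *
 * Hypothetical helper for illustration; not part of Zend Framework.
 */
function summarizeIncludedFiles(array $files, $prefix)
{
    $summary = array('framework' => array(), 'other' => array());
    foreach ($files as $file) {
        // strpos() === 0 means the path starts with the prefix.
        $key = (strpos($file, $prefix) === 0) ? 'framework' : 'other';
        $summary[$key][] = $file;
    }
    return $summary;
}

// Log the count once the request is done, so everything that was
// included or autoloaded along the way is taken into account.
register_shutdown_function(function () {
    // Placeholder path (assumption): adjust to your own setup.
    $summary = summarizeIncludedFiles(get_included_files(), '/path/to/library/Zend/');
    error_log(count($summary['framework']) . ' framework files loaded');
});
```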
An alternative would be to break up the framework into components (as Zend advertises it) and to offer a PEAR-style installation. While this may not be convenient for unzip-and-go type of people, I'd offer it as an alternative installation method for the more knowledgeable user.
2) Zend_Loader ERRATA
My loader in my last blog entry looked like this:
function __autoload($className) {
include_once str_replace('_', '/', $className) . '.php';
}
But since __autoload() is only invoked when the class is not yet present, it makes no sense to include_once, so let's use this instead (thanks for the comments and pointing that out):
function __autoload($className) {
include str_replace('_', '/', $className) . '.php';
}
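On newer PHP versions the same loader would usually be registered through spl_autoload_register() instead (__autoload() was deprecated in PHP 7.2 and removed in PHP 8.0). A minimal sketch of that variant, assuming PHP 5.3 or later for the closure; classNameToPath() is a hypothetical helper name I made up to keep the mapping separate from the include:

```php
<?php
/**
 * Map a PEAR/Zend-style class name to a relative file path,
 * e.g. Zend_Db_Table becomes Zend/Db/Table.php.
 *
 * Hypothetical helper name, used only for this illustration.
 */
function classNameToPath($className)
{
    return str_replace('_', '/', $className) . '.php';
}

// Same idea as the __autoload() function, but registered through SPL
// so that several loaders can coexist.
spl_autoload_register(function ($className) {
    // Plain include: the loader only runs for classes that are not yet
    // defined, so the *_once bookkeeping buys nothing here.
    include classNameToPath($className);
});
```

The include path still has to contain the directory that holds the Zend/ folder, exactly as with the original function.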
3) Caching database results
In my first article, I talked about how you have to avoid queries and cache whatever is possible. Despite people raving about caches, all the awesomeness aside, caching also has major drawbacks which are almost always excluded and overlooked.
I'm sure many people know Cache_Lite, and have seen one or two examples - it's super nice and simple.
In a nutshell:
- set a TTL
- pull from the cache if it exists
- populate the cache if it doesn't
- Cache_Lite also takes care of deleting the cache when the TTL ran out
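The steps in that list can be sketched in a few lines of plain PHP. To be clear, this is not Cache_Lite's API, just a hand-rolled file cache under my own assumptions: cachedOrFresh() is a hypothetical function name, the cache lives in the system temp directory, and the closure needs PHP 5.3 or later.

```php
<?php
/**
 * Return the cached value for $id if it is younger than $ttl seconds,
 * otherwise call $producer, store its result and return it.
 *
 * Hypothetical helper; Cache_Lite's real API differs.
 */
function cachedOrFresh($cacheDir, $id, $ttl, $producer)
{
    $file = $cacheDir . '/' . md5($id) . '.cache';

    // Hit: the file exists and has not outlived its TTL.
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return unserialize(file_get_contents($file));
    }

    // Miss or expired: rebuild the value and overwrite the cache file.
    $value = call_user_func($producer);
    file_put_contents($file, serialize($value), LOCK_EX);
    return $value;
}

// Usage: the closure stands in for an expensive database query.
$users = cachedOrFresh(sys_get_temp_dir(), 'user-list', 300, function () {
    return array('till', 'markus');
});
```

Unlike Cache_Lite, this sketch never deletes expired files; it simply overwrites them on the next miss.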
What none of those examples talk about is the issue we run into when a user adds something and the cache is not invalidated by the application. For instance, if I decided to cache our users' data today, it would double or triple our support volume in an instant, simply because our customers don't know or care about the resources consumed by a database query. What they care about is their data, and they want to see it now.
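One way around this is to drop the cached entry whenever the underlying data is written, instead of waiting for the TTL. A minimal sketch of that idea; UserProfileStore is a hypothetical class, and both arrays are stand-ins for a real database and a real cache backend:

```php
<?php
/**
 * Tiny read-through cache in front of a data store, with explicit
 * invalidation on write. Both arrays stand in for real backends
 * (hypothetical names, for illustration only).
 */
class UserProfileStore
{
    private $database = array(); // stand-in for the real database
    private $cache = array();    // stand-in for Cache_Lite, memcached, ...

    public function read($userId)
    {
        if (!array_key_exists($userId, $this->cache)) {
            // Miss: fall through to the "database" and remember the result.
            $this->cache[$userId] = isset($this->database[$userId])
                ? $this->database[$userId]
                : null;
        }
        return $this->cache[$userId];
    }

    public function write($userId, $profile)
    {
        $this->database[$userId] = $profile;
        // Drop the stale entry so the next read sees the new data
        // immediately instead of waiting for a TTL to expire.
        unset($this->cache[$userId]);
    }
}

$store = new UserProfileStore();
$store->write(42, 'old signature');
$store->read(42);                    // populates the cache
$store->write(42, 'new signature');  // invalidates it
echo $store->read(42), "\n";         // prints "new signature"
```

The price is that every code path that writes the data has to know about the cache, which is exactly the kind of added complexity that is easy to overlook.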
In this case the web2.0 is not too helpful either. Applications are online, available at all times, instant, very responsive and the data is live. ;-)
So far so good -- no! No?
Well, we need to keep in mind that a file-based cache might not be exactly suitable for a high-traffic application anyway ("How slow is the disk?"). So we may need to set up a RAM disk (if you've got plenty of RAM) and put the cache on there, or look for a more robust solution such as memcached.
To back up my claim, and in an effort to normalize my blog post, head over to Tillate (a pretty awesome name, if I may add). They just posted a blog entry going into detail on the obstacles you run into when you cache content and why you should consider outsourcing some of your caching to the client side.
Last but not least -- two things.
- To re-iterate on my disclaimer, make sure to actually put a meter on things so you can measure an improvement and not guess ("I think it's faster."), because that is the worst. For an example of what I mean, please check out Sander van de Graaf's blog.
- We need to keep in mind that the added complexity needs to be taken into account -- during development (for example replicating the setup for development and staging) and also for maintenance (more services, more sorrow).
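As for putting a meter on things: even a crude timer beats guessing. A minimal sketch, assuming PHP 5.3 or later for the closures; meanRuntime() is a hypothetical helper name and the two closures are just placeholders for whatever you want to compare. It measures wall-clock time only, so run it several times on an otherwise idle box before drawing conclusions.

```php
<?php
/**
 * Run $callback $iterations times and return the mean wall-clock
 * time per call in seconds. Hypothetical helper for illustration.
 */
function meanRuntime($callback, $iterations = 1000)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        call_user_func($callback);
    }
    return (microtime(true) - $start) / $iterations;
}

// Compare two variants under identical conditions instead of guessing.
$direct  = meanRuntime(function () { return strtolower('FooBar'); });
$wrapped = meanRuntime(function () { return ucfirst(strtolower('FooBar')); });
printf("direct: %.9f s, wrapped: %.9f s\n", $direct, $wrapped);
```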
4) Zend_Db
I'm not trying to pick a fight with Zend_Db, but I can't really avoid talking about it. Aside from general shortcomings in the implementation, my number one issue currently is that the Mysqli driver prepares all queries that are sent to the MySQL server.
Leaving aside all the pseudo security (I believe each value is quoted anyway before prepare is used), prepared statements are slow and also suck because MySQL's query cache currently cannot deal with them. Even though Bill Karwin pointed out that MySQL 5.1.x will eventually solve this problem, we are still stuck in the here and now.
The quick fix would be to shortcut all queries and inject them directly into (I think) exec(), which would avoid the internal prepared statement. But this leads to getting rid of it altogether, which means we throw overboard all the convenience offered through insert(), update(), delete(), fetch*() and so on.
I have a patch to basically add in a fake Zend_Db_Adapter_Mysqli_Statement class (Did I get the name right?), which would basically mock mysqli_stmt, but this is not a solution that would go into the framework. I'm attempting to find a more general solution, but don't hold your breath.
5) Zend Framework
Zend Framework is both amazing and frightening at the same time. With all the feedback I should add that it's semi-open-source (open source backed by a company), so you can always report bugs, (sign a CLA to) supply patches and maybe even get access to commit code to the official repository.
With (of course) no offense meant, there are a few things that are severely lacking at the moment:
- Good coding guidelines vs. overengineering - I wonder who's gonna optimize all the weirdness for 2.0 that is apparently required by CS/design. While engineering in general is a good thing (TM), various people have already pointed out that some of the components (or helpers, validators, etc.) tend to be overengineered - for example, when you load a new class (and essentially file) to apply strtolower() on a variable.
- Code review - I feel like sometimes components are promoted to trunk before enough people have looked at them. For a great example of peer code review, let me pimp PEAR again because while we are sort of aggressive on our coding style (and also tend to drive people crazy or at least provoke flamewars), all the feedback you get during the proposal phase is worth so much.
You don't learn all that in school, open source gives it away for free.
- Even though the Zend Framework is advertised as a glue framework (you just use what you need), many components have hardcoded dependencies - an example is my favorite, Zend_Loader. Especially in the case of Zend_Loader there are two things I'd like to see -- one is allowing the user to override the loader used in any component, two is replacing all the require_once calls with the loader since by default it tends to be the less expensive operation anyway.
- Maintainers - some components seem to be rather unmaintained and it seems there is little/no communication that is visible for everyone on the outside. Not very open-sourcy. :-P
Also, various people who contributed to the framework initially, have left Zend's payroll and their work is sort of orphaned (so it looks).
There are just not enough people who have time to actively contribute to core components currently. And in a way, despite a company backing the code, the Zend Framework suffers from the same problems any open source project has.
Last but not least -- I don't hate the Zend Framework, or frameworks in general. ;-)
(Unlike some other people, whose blogs I like to read and find very entertaining.)
Comments
May I translate your articles into Chinese, and put the translations on my blog?
http://www.mikespook.com
I think you mentioned some really good points, I'm only going to add this:
Community, community and community!
Most open source projects have found their success almost completely due to their focus on community. Overengineered components are a clear symptom of a project that doesn't focus on community. Having a Wiki and Bug Tracking system helps, of course, but it's not enough. I think this particular project lacks community management and leadership processes, governance, responsibilities, goals, etc. How do they organize, allocate responsibility for different activities, create teams, select team members, team leaders?
All this is a mystery to me.
http://ideas.zendframework.com/
An idea can be anything, not just a component. It can be an application that uses ZF, a script, a desktop application, a business proposal, anything. Here's a good example:
http://epic.codeutopia.net/pack/
I'd encourage you to approach the community and the internal ZF team via one of the many, active mailing lists. The questions you raise would be particularly well suited for the zf-contributors list.
thanks for reading and thanks for commenting.
I can't speak for Frederico, but I often feel like the "review" process that is taking place is more in terms of "what feature set does a component offer", but it rarely ever touches implementation.
Also, I'm not sure when exactly that happens, but there seems to be no real public code review until someone is promoted to trunk and emails the lists. And some people don't email the lists. ;-) And the code review that seems to take place is maybe too short, because some of the code I've seen promoted to trunk didn't look like it's been through a deep review. ;-)
I don't want to blame the community, or Zend, but some parts of the framework seem indeed overengineered. For example, there are Filter classes that do strtolower(), strtoupper(), trim() etc. and while I agree that sometimes it's nice for the "end-developer" to have a unified interface, I also feel like someone needs to talk some sense into this code because aside from the pretty interface, the code violates whatever is called best practice. Let alone the performance - you're loading an extra file to wrap code in userland?
I also know that Filter is not a standalone component, and it's used in other framework code, but when I ran into that, I thought there has to be a better way to achieve the same. ;-)
Anyway, no offense meant to whoever worked on Zend_Filter, it's just an example. There are others -- you're probably well aware of the front controller discussions on the wiki, etc.. That's just another example.
Anyway, I know you guys put a lot of work in the framework, and a lot of people (including myself) like to build apps on it. All of this is feedback -- no offense meant, I hope none taken.
Till
Regarding the review process, code review typically happens by the Zend liaison prior to the code being pushed to trunk. In the past, we often have not been stringent enough about this -- but in the past six months, we've been making a better effort to push back. Additionally, we're encouraging contributors to get feedback from users prior to asking for our review -- so that they can have a sanity check from real-world use cases. I want to build this into the component life cycle.
Again -- I'd love for this kind of conversation to happen in the zf-contributors mailing list, as it gives a public forum for all interested parties, and also a central location for looking up these threads. I definitely appreciate the feedback, no matter what form it may take. :)
Last week, I found an interesting article on performance at http://adventure-php-framework.org/Seite/103-Yii-vs-APF, that deals with a performance comparison of different frameworks based on a hello world sample. Basically, the article on http://www.yiiframework.com/performance was taken to compare the candidates.
I was glad about the fact, that good frameworks do care about their performance and design! Hence, I think the APF is worth a try...
I think you are missing the big picture. In most applications features come first, and then you optimize. Premature optimization is the root of all evil. ;-) My article shows off ways to optimize Zend Framework to be able to serve more traffic.
On the flip side, the Zend Framework offers very clear coding standards, stable APIs, unit-tested and liberally licensed code (I try to avoid anything remotely related to the GPL!), and also lots of features. Those things are very important for companies who rely on PHP-based applications.
Furthermore, the development is backed by a commercial entity, which has both advantages and disadvantages -- the biggest advantage is that it won't be discontinued just because the lead developer went on holidays, etc.
You may have a different opinion on whether e.g. ZF needs a figlet implementation (just an example) or not, and why other features are not implemented already -- but that's the case with many frameworks and not specific to Zend Framework.
ZF offers a lot of peace of mind. And most of the people who use it will never hit the limitations, and the rest of us will know how to deal with them.
I personally believe that performance must be part of the design of any _web_ application. Sure, not every functionality can be implemented as fast as customers want it to be, but if you don't keep that in mind, the result can only be fast using tricks like APC, extensive application caching and so on.
Commodity hardware is cheap these days, but providing good scaling services - from the hoster's point of view - is directly dependent on the software you use. This means that one criterion in choosing your tools should be performance. I agree that there are multiple ways to accelerate applications, but being forced to do so, you increase the system's complexity accordingly.
Even if many people tend to be "feature-obsessed", the experience of the last 6 years tells me that they are wrong! :)
Cheers,
Markus
Nice article(s).
About Zend_Db auto-preparing statements thru mysqli adapter...
Another option is to use pdo_mysql adapter with Zend_Db. You then have the option to emulate prepared statements client-side and thus u can still use mysql query cache
$pdo-
See
http://netevil.org/blog/2006/apr/using-pdo-mysql
And
http://framework.zend.com/manual/en/zend.db.html#zend.db.adapter.connecting.parameters
I like to set the adapter globally
Zend_Db_Table_Abstract::setDefaultAdapter(Zend_Db::factory($oRegistry->config->database));
So until mysql 5.1 is palatable you can emulate prepares. And one day just switch it off in the config.ini
:-)