Not all browsers are equal and some are less equal than others

The title is a play on George Orwell’s line from Animal Farm:

All animals are equal, but some animals are more equal than others

We tend to think of the browser as a utility and don’t really care which version we are using when viewing a web page.

Unfortunately, the most common browser (and probably the one you are using right now), Internet Explorer 6, is also the least secure, and it has a number of page-rendering bugs that degrade the user experience. IE6 was launched in 2001 and, thanks to being the default install on Windows-based computers, enjoyed pretty much dominant status until recently. Microsoft had, for various reasons, disbanded IE development, and this left the browser very susceptible to security flaws which were actively exploited to install malware.

Over the last few years, there has been growing interest in alternatives to IE6 as individuals and corporations have realised that it’s not worth putting all your eggs in one basket. This has led to innovation in web standards and an improved user experience.

Microsoft has also responded to market pressure and customer demand by restarting IE development. It has launched Internet Explorer 7, which fixes a number of bugs in the IE6 engine and provides improved security. The security improvements come in the form of a built-in phishing filter, opt-in ActiveX, and an always-visible address bar so you know where a window is being launched from.

User experience improvements include tabbed browsing, easy auto-discovery of RSS feeds (we at mumineen.org use RSS feeds extensively), integrated web search in the default toolbar, and fixes to many rendering issues. These fixes make life much simpler for web developers, who no longer have to spend time working around IE6’s deficiencies and can instead focus on building a better product.

IE7 has been out for nearly two years, and alternative browsers with similar features, such as Firefox, Opera and Safari, have also been available. Unfortunately, our statistics show that nearly 40% of visits to our website still come from IE6, so many users haven’t upgraded to IE7 or switched to an alternative browser.

The advantage of Firefox and Opera is that they are cross-platform, so they are available to you even if you are using Mac OS X or Linux.

We want to take this opportunity to encourage our user base to switch away from IE6 to any of the better alternatives, namely IE7, Firefox, Opera or Safari, for a more secure and better browsing experience. To that end, we have coded a subtle message on our website, shown only to IE6 users, encouraging them to take the plunge (see the sketch below).
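One common technique for showing a message only to old IE versions is Microsoft’s conditional comments, which every other browser ignores as an ordinary HTML comment. A minimal sketch of the idea (the class name and wording here are illustrative, not our exact markup):

    <!--[if lte IE 6]>
    <div class="upgrade-notice">
      You are browsing with Internet Explorer 6. For a safer and faster
      experience, please consider upgrading to IE7, Firefox, Opera or Safari.
    </div>
    <![endif]-->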

If you have switched away from IE6, we applaud your judgement and hope that you will assist other mumineen in making that switch. Let’s use the mantra:

Friends don’t let friends use IE6

webserver upgrade

It seems that APC is causing some issues with our WordPress installation, so we’ve disabled it for now.

We took this opportunity to upgrade our webserver to Apache 2.2; readers may find its list of new features interesting. We’ve also got PHP 5.1.1 installed, and volunteers experienced with our content tool-chain are busy porting our existing site from PHP 4 to PHP 5.1+ (a typical incompatibility is sketched below).
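For readers curious about what such a port involves, the best-known incompatibility is that PHP 5 assigns objects by handle rather than by copy. A minimal sketch of the behaviour change (the class here is hypothetical, not from our codebase):

    <?php
    class Counter {
        public $n = 0;
    }

    $a = new Counter();
    $b = $a;          // PHP 4 copied the object here; PHP 5 makes $b
                      // refer to the same object as $a
    $b->n = 5;        // so under PHP 5 this also changes $a->n

    $c = clone $a;    // the PHP 5 way to get an independent copy
    $c->n = 99;       // leaves $a->n untouched
    ?>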

For this blog, we’ve also added the Akismet anti-spam plugin.

software upgrades

Sorry, it’s been a while since the last post. Real life interfered for some volunteers, but software toolchains were still upgraded and we have migrated our authoritative DNS servers to the new server.
Our DNS server is tinydns. This has been our choice for a long time, primarily for its ease of use and its security model: it separates the functions of content DNS and proxy (resolving) DNS.

For those who haven’t been following the news, DNS servers are generally configured to provide both proxy and content services in the same instance, which can leave them open to cache poisoning.
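Part of tinydns’s appeal is that it serves answers from a single flat data file compiled by tinydns-data. A sketch of what such a file looks like (the IP addresses and record values are placeholders, not our real ones):

    # '.' delegates the zone and creates the NS + SOA records,
    # '=' creates an A record plus the matching PTR,
    # '@' creates an MX record with distance 10.
    .mumineen.org:198.51.100.1:a:259200
    =www.mumineen.org:198.51.100.2:86400
    @mumineen.org:198.51.100.3:mail.mumineen.org:10:86400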

In terms of software upgrades, lighttpd is now at version 1.4.8, and its author has incorporated the sending of Cache-Control headers in mod_expire, so we no longer need a workaround (a sample configuration is sketched below). We have also updated to PHP 5.1 and incorporated APC (Alternative PHP Cache), an opcode cache that provides a performance speedup.
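As a sketch, mod_expire lets you attach a cache lifetime to URL prefixes; with 1.4.8 the matching responses carry Cache-Control: max-age as well as Expires (the path and lifetime below are illustrative):

    server.modules += ( "mod_expire" )
    # anything under /images/ may be cached for a week from first access
    expire.url = ( "/images/" => "access plus 7 days" )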

Impact of client connectivity on server performance

Many developers assume that the performance of a website depends only on the server’s CPU and on the connectivity of the server to its ISP.
Whilst these matter, another important factor is how the connectivity characteristics of the site’s end-users impact server performance.

Keep in mind that serving a dynamic website consists of two steps: creating the dynamic page, and then sending that data to the end-user.

Let’s look at a scenario from a plumbing perspective. Assume you have a water pump which takes a finite time to pull water from a well and then sends it out to various end-users, with only a fixed number of pipes available for delivery. The rate at which water drains from the pump is controlled by the rate at which each end-user can take it (the size of the pipe differs per end-user). If a lot of end-users with narrow pipes connect to the pump, all of its slots get taken up by those narrow pipes, and everyone else (whether with narrow or fatter pipes) has to wait. The pump becomes busy just trickling water into the narrow pipes and isn’t doing what it does best, namely pulling water from the ground.

Similarly, in a dynamic website one can only run a limited number of application server processes because of memory and CPU constraints. If they get tied up serving clients who can’t pull data off them fast enough, other requests are delayed. The solution is to have an intermediary talk to the end-user: the application server hands its response to the intermediary and moves on to handle other requests. This arrangement is termed having a ‘reverse proxy’ sitting in front of your webserver.

At mumineen.org we are using Squid as our reverse proxy. Squid is primarily known as a forward proxy cache used by many ISPs, and we have tested our cache-friendly headers with it. It also has a mode whereby it can act as a reverse proxy, or what Squid terms “httpd accelerator” mode. Since Squid is an event-driven, single-process program, it is very CPU-efficient, and by offloading our end-user connectivity to it, we free up our Apache/PHP webserver to do what it does best, namely creating the dynamic pages (a sample configuration is sketched below).
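A minimal sketch of the idea in the Squid 2.5-era accelerator style, assuming Apache has been moved to a local port (the addresses and ports are placeholders, and newer Squid releases use a different syntax):

    # Squid answers on port 80 and forwards cache misses to local Apache
    http_port 80
    httpd_accel_host 127.0.0.1
    httpd_accel_port 8080
    httpd_accel_single_host on
    httpd_accel_uses_host_header on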

Let’s see this with some numbers.

Let’s assume an average generated HTML page of 42KB, and an average PHP script that generates this response in 0.5 seconds. How many responses could this script produce during the time it takes the output to be delivered to a user connecting at 56Kbps (7KB/s)? Delivering 42KB at 7KB/s takes 6 seconds, and a simple calculation reveals pretty scary numbers:

(42KB / 7KB/s) / 0.5s = 6s / 0.5s = 12

Twelve other dynamic requests could be served in that time, if we could put PHP to do only what it’s best at: generating responses.

This very simple example shows that we need only one-twelfth the number of server children running, which means we need roughly one-twelfth of the memory (not quite, because some parts of the code are shared).

With a reverse proxy, one can also conceptually address security issues by filtering unwanted requests, such as Nimda worms or connections to an admin URL from an unauthorised network segment. Reverse proxies are also used to provide SSL offload capabilities. Whilst Squid is not the only software-based reverse proxy, and we have plans to evaluate and possibly move to others, it was fairly easy to set up, and with a brief tweak to our Apache config we can continue to capture client IPs in our logs (one way to do this is sketched below).
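Since Apache now only ever sees connections from Squid, the usual trick is to log the X-Forwarded-For header that Squid adds, instead of the remote address. A sketch of one way to do this (the format name is arbitrary, and this is not necessarily the exact change we made):

    # log the original client IP passed along by the reverse proxy
    LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b" xff_combined
    CustomLog logs/access_log xff_combined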

In the commercial space, load-balancer appliances from vendors such as NetScaler, F5 Networks, Array Networks and many others have now incorporated reverse-proxy functionality.

We hope that through the use of cache-friendly headers, compressed content and a reverse proxy, we have set up an efficient infrastructure to help us serve the Dawoodi Bohra community.

Size Matters

A website developer should always strive to provide a better user experience: a design and layout where the information the end-user seeks is easily found, combined with a performant site so the user does not suffer the World Wide Wait.

An engineer deals in tradeoffs. With Moore’s Law showing no sign of relenting, increasing CPU performance on both the server side and the client side is a given. In our case, we can easily switch to a dual-core Opteron by just replacing the processor chip and flashing a new BIOS, giving us an SMP system with the same physical dimensions.

However, the other aspect impacting performance is bandwidth. In spite of growing broadband usage, the majority of our user base, particularly in the subcontinent and Africa, connects to the Internet via dial-up at speeds typically ranging from 28.8 kbps to 56 kbps (kilobits per second). This translates to 3.6 KB/s to 7 KB/s (kilobytes per second).

With typical web pages being around 40KB (kilobytes) and at times reaching up to 100KB, this translates to a wait of roughly 10 to 30 seconds for a 28.8 kbps modem user, which is a very long time.

Is there anything we can do to improve the situation?

Yes. The designers of the HTTP protocol (HTTP is the language spoken between web browsers and web servers) created a mechanism whereby a browser can request that content be compressed before being sent to it, and then decompress it in real time and render it for the user. This is negotiated via the Accept-Encoding: gzip,deflate request header sent by the browser and the Content-Encoding: gzip response header sent by the server.
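A simplified exchange, with most headers trimmed for brevity, looks like this:

    GET /index.php HTTP/1.1
    Host: www.mumineen.org
    Accept-Encoding: gzip,deflate

    HTTP/1.1 200 OK
    Content-Type: text/html
    Content-Encoding: gzip
    Vary: Accept-Encoding

    ...gzip-compressed body...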

Apache 2.0 provides this via its deflate module (mod_deflate), while Apache 1.3 has the gzip module.
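For Apache 2.0, enabling compression for text content is only a few lines of configuration. A sketch along the lines of the stock mod_deflate documentation, including its suggested workarounds for older, buggy browsers (not necessarily our exact configuration):

    LoadModule deflate_module modules/mod_deflate.so
    # compress only text formats; images are already compressed
    AddOutputFilterByType DEFLATE text/html text/plain text/css
    # Netscape 4.x can only handle gzipped HTML, and some 4.0x
    # releases mishandle gzip entirely
    BrowserMatch ^Mozilla/4 gzip-only-text/html
    BrowserMatch ^Mozilla/4\.0[678] no-gzip
    # MSIE spoofs a Mozilla/4 user agent but handles gzip fine
    BrowserMatch \bMSIE !no-gzip !gzip-only-text/html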

We are very excited to offer this functionality in our new infrastructure. Our initial tests show that content shrinks on average by around 65%, so a page that was 40KB without compression comes in at around 14KB with compression. For a 28.8 kbps user, the time to fetch the page thus drops from around 12 seconds to 4 seconds, which makes the viewer far more likely to keep coming back.

We expect HTTP compression to become even more useful as our user base moves to an always-connected environment via cellular networks such as GPRS.