Crowdsourcing mobile browser technical details

April 30, 2008 on 11:57 pm | In ajax | No Comments

I’m fascinated by cases where seemingly banal technical details become precious commodities because very few have expended the time and energy necessary to document them. One good example is mobile browser connection profiles — there are thousands of combinations of mobile device and browser software, and each has its own particular connection limits and concurrency profile. No central body provides gratis access to this information, so those looking to study or test mobile browsers have few and costly options to choose from.

That’s why I was excited to see a post by Jason Grigsby of Cloud Four (via Ajaxian) about a research project to collect this information with some clever server-side magic. Just hit this link on your mobile device and help contribute to a worthy cause. The results will be published under a Creative Commons license for all to use.

More about native selector functionality

April 30, 2008 on 11:47 pm | In ajax | No Comments

I’ve talked before about the recent move by browser vendors to implement the Selectors API. There is potential for significant performance benefits from moving this code into the browser, but there is risk as well. If the provided functionality is buggy (as history tells us it must be), libraries will need to patch around these bugs on a case-by-case basis. If the spec is ambiguous or differs from de facto standards used in common practice, that’s yet more work for the library maintainers.
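To make the "patch around these bugs" point concrete, here is a minimal sketch of how a library might prefer the native Selectors API and fall back to its own engine. The function name select and the fallbackEngine parameter are hypothetical names of mine, not taken from any particular library; querySelectorAll is the method the spec defines.

```javascript
// Sketch: prefer the native Selectors API, fall back to a library engine.
// `fallbackEngine` is a hypothetical stand-in for a library's own selector code.
function select(root, selector, fallbackEngine) {
  if (typeof root.querySelectorAll === "function") {
    try {
      // Native path: copy the returned list into a plain array.
      return Array.from(root.querySelectorAll(selector));
    } catch (err) {
      // The native call throws on selectors it cannot parse,
      // so fall through to the library engine below.
    }
  }
  // No native support, or the native call failed.
  return fallbackEngine(root, selector);
}
```

A real library would likely also keep a list of selectors known to misbehave in specific browsers and route those straight to the fallback, which is exactly the case-by-case work described above.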

John Resig provided some insight with a post today into how browser vendors, the W3C, and library maintainers are coming together to smooth over the rough parts of the spec. It’s a fascinating read, providing a peek into the sausage-making process of spec wrangling for those who don’t frequent the public-webapi mailing list.

Cuzillion: simplifying page prototyping

April 29, 2008 on 11:22 pm | In ajax | 2 Comments

Testing new arrangements of DOM elements to improve the object load order or parallelism can be a cumbersome task. Fire up a text editor, create a test page with a meaningful name, hit it with different browsers, and repeat a few hundred times. As an exemplar of the old aphorism that good programmers are lazy, Steve Souders (formerly of Yahoo!, now of Google) created Cuzillion to remove some of the friction from these testing cycles.

Cuzillion is a simple web app that allows for easy arrangement of different page elements (external scripts, images, stylesheets) within a DOM. These sample pages are each defined by a simple RESTful URL, so they can be shared with other developers as examples of what to do (or what not to do). Loading a page in Cuzillion also reports a high-level number for page load time and some micro-metrics from within the page (the time to load an inline script, for example). You can use Page Detailer or HttpWatch to get a more detailed analysis of object load order.

When YSlow was released last year, one of the aspects of the project that excited me the most was the documentation it provided: just by ranking specific performance decisions made by the application, it served to educate developers on what they can do better. I could see a community developing around Cuzillion to serve a similar purpose, especially as the tool expands to handle more DOM elements or object load techniques (such as external scripts referenced via document.createElement).

On spam and comments

April 24, 2008 on 9:36 pm | In personal | 2 Comments

Since switching to Google hosting for my personal e-mail, all of my WordPress ‘comments pending approval’ e-mails have been silently going to my spam folder. I just finished digging through the 4000 messages that queued up. Damn comment spam.

Apologies to those whose comments were delayed. I’ve corrected the e-mail issue, and I’ll do a better job of staying on top of comment moderation from now on.

WebKit Inspector getting some attention from Google’s Summer of Code

April 21, 2008 on 5:32 pm | In ajax | No Comments

Just a quick piece of news today: assuming I’m reading this correctly, it looks like WebKit Inspector will be the beneficiary of some attention from a student by the name of Keishi Hattori as part of Google’s Summer of Code. Keishi will be “implementing Firebug API and a JavaScript profiler into WebKit,” moving WebKit Inspector ever closer to feature parity with Firebug.

It’s great to see pseudo-standards such as Firebug’s console and profiling APIs gain traction. That makes it much easier for users to get meaningful comparative data between browsers while testing their applications.

The WebKit team makes the case for preloading

March 24, 2008 on 7:13 am | In ajax, http | No Comments

Over at Surfin’ Safari, Antti Koivisto explains the preloading features in the latest WebKit nightlies. Antti begins by documenting the dominance of latency in determining total page load time, focusing on the slowdown caused by the blocking behavior of modern browsers while handling external scripts. As we’ve discussed here in the past, this has the effect of serializing object loads, resulting in a total page load time that increases linearly with network latency.
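A back-of-the-envelope model shows why serialized loads scale so badly with latency. This is a deliberate simplification of my own: each object is assumed to cost one round trip plus a fixed transfer time, and DNS lookups, connection setup, and parse time are ignored. The function names are hypothetical.

```javascript
// Simplified model (an assumption, not a measurement): every object costs
// one round trip plus a fixed transfer time.

// Blocking scripts force objects to load one after another.
function serializedLoadTime(objectCount, latencyMs, transferMs) {
  return objectCount * (latencyMs + transferMs);
}

// With several connections, objects load in "waves" of up to
// `connections` objects, and each wave costs one round trip plus transfer.
function parallelLoadTime(objectCount, latencyMs, transferMs, connections) {
  const waves = Math.ceil(objectCount / connections);
  return waves * (latencyMs + transferMs);
}
```

With 12 objects and 10 ms of transfer time each, going from 50 ms to 100 ms of latency moves the serialized estimate from 720 ms to 1320 ms, while the estimate for six parallel connections moves from 120 ms to 220 ms. Both grow linearly, but the serialized slope is the full object count.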

The new preloading feature available in WebKit nightlies attempts to maintain network parallelization even while the parser is blocked waiting for an external script to load. To achieve this, a separate parser is created to move through the remainder of the page, queuing up any additional objects to load. Scripts and stylesheets are also moved to the head of the queue of pending objects.
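The queue reordering can be illustrated with a toy example. This is only a sketch of the idea as described in the post, not WebKit's implementation; the resource objects and the prioritizePending name are assumptions of mine.

```javascript
// Toy illustration: move scripts and stylesheets to the head of the pending
// queue while keeping the original order within each group.
// Each resource is assumed to look like { type: "script", url: "..." }.
function prioritizePending(pending) {
  const isHighPriority = (r) => r.type === "script" || r.type === "stylesheet";
  // Array.prototype.filter preserves order, so this is a stable partition.
  const first = pending.filter(isHighPriority);
  const rest = pending.filter((r) => !isHighPriority(r));
  return first.concat(rest);
}
```

Given a queue of an image, a script, another image, and a stylesheet, the result is the script and stylesheet first, followed by the two images in their original order.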

The net result for end users is a faster page load:

It should be noted that IE8 promises a similar improvement to script load parallelization, as discussed by Steve Souders a few weeks back. I would guess that the underlying implementation is similar to that used by the WebKit team.

Testing IE8’s Connection Parallelism

March 16, 2008 on 7:51 pm | In ajax | 3 Comments

A few weeks ago, I discussed IE8’s improved connection parallelism, specifically the increase from 2 concurrent connections per host to 6. One open question was the total number of connections allowed — my speculation was that the IE team would stick with a max of 6 rather than triple that value as well.

I was wrong. The new max is an astonishing 18 (!) concurrent connections:

That is some serious parallelism, and it has significant implications for application performance.

In December of 2006, I discussed the CNAME trick for circumventing browser connection limits, using 3 hostnames to serve images to trick the browser into using all available connections. At the time, that was 6 for IE. The above capture from IBM Page Detailer confirms 18 concurrent connections in IE8.
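For readers who have not seen the CNAME trick, here is a minimal sketch of the URL-generation side. The hostnames are placeholders, and the function names and the character-sum hash are my own assumptions; the only real requirement is that a given image always maps to the same hostname so that browser caching still works.

```javascript
// Sketch of hostname sharding. The hostnames passed in are hypothetical
// aliases (CNAMEs) that all point at the same server.
function pickImageHost(path, hosts) {
  // Sum the character codes so the same path always picks the same host.
  let sum = 0;
  for (let i = 0; i < path.length; i++) {
    sum += path.charCodeAt(i);
  }
  return hosts[sum % hosts.length];
}

// Build the full image URL on the chosen host.
function shardedUrl(path, hosts) {
  return "http://" + pickImageHost(path, hosts) + path;
}
```

With three aliases and a per-host limit of 6, a page built this way is what lets IE8 open all 18 connections.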

As expected, IE8’s handling of the unoptimized version, where only one hostname is used, is comparable to the performance of the optimized page in previous IE versions:

As an aside, the out of the box optimization provided by IE8 is actually slightly faster than the CNAME trick applied to previous IE versions as it does not incur any hostname resolution cost when establishing the first connections. Both examples would use 6 total concurrent connections, and IE8 should be equal to or faster than optimized connection management in previous versions.

But what about IE8 against a page optimized for connection parallelism? If 6 concurrent connections is good, 18 should be terrific, right? Not so fast. While the Page Detailer captures above show some improvement in the 18 connection version, point in time metrics can only tell us so much. What we need is a tool that can collect a statistically significant sample of performance data using both 6 and 18 connections to see if any trends shake out.

For this analysis, I used a hosted performance testing solution from Gomez, my employer. This is the same tool used in my original connection parallelism article. I ran my tests in IE8 compatibility mode, mirroring the new connection levels. As before, one test is against the default (1 host) page, and one test uses the CNAME trick (3 hosts) for greater connection parallelism. The results surprised me:

This aggregate data is made up of hundreds of tests taken from 7 locations in the US over the last 14 hours. The same locations were used for both tests. The “IE8 Parallelized” test, which uses 18 connections, has a much higher standard deviation and a higher average test time than the 6 connection “IE8 Default” test. What gives?

The answer appears to be sporadic connection hangs. The median response time for the parallelized page is lower than that of the default page, but a higher incidence of outliers skews the average and leads to the increased variability. Looking at the outliers, I typically see a section of the page load that looks like this:

Here we see 2 object downloads taking more than 8 seconds to complete. The average response time for an entire page is around half a second, so this is a huge outlier. I see these outliers on between 5 and 10% of the test runs for the 18 connection page, but I have never seen any comparably high outliers for the 6 connection version.

Below is a revised version of the test averages taken by removing outliers:

Note that the parallelized version is now consistently faster than the default. As expected, the outliers are responsible for the counterintuitively poor performance of the parallelized page.
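The effect of a few hung connections on the average is easy to reproduce with made-up numbers. The sketch below is illustrative only: the sample values are invented, and the cutoff rule (drop anything above a multiple of the median) is an assumption of mine, not necessarily the method used to produce the revised averages.

```javascript
// Arithmetic mean of a list of response times.
function mean(samples) {
  return samples.reduce((total, s) => total + s, 0) / samples.length;
}

// Median: the middle value of the sorted samples.
function median(samples) {
  const sorted = samples.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Hypothetical outlier rule: keep samples at or below `factor` times the median.
function withoutOutliers(samples, factor) {
  const cutoff = median(samples) * factor;
  return samples.filter((s) => s <= cutoff);
}

// Invented page load times in seconds; one run hit a connection hang.
const runs = [0.4, 0.5, 0.45, 0.5, 8.2];
console.log(mean(runs), median(runs), mean(withoutOutliers(runs, 3)));
```

A single 8.2 second run drags the mean to roughly 2 seconds while the median stays at 0.5, and removing it brings the mean back under half a second.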

I suspect that my hosting provider (Dreamhost) simply can’t keep up with the dramatic increase in connection parallelism. 18 connections is simply too much of a good thing, and it will present a scaling problem for those who are on small to medium hosts. 10 users hitting at the same time will yield 180 concurrent connections, a pretty significant number for smaller providers.

[Note: This objection was anticipated and handled by the IE team. See below.] Dial-up and cellular network users are also likely to be negatively impacted by this change. In the broadband world where latency is the dominant factor, greater connection parallelism is a boon. But in bandwidth-constrained networks, it just leads to thrash where progress is slowed by all the connections trying to share a small pipe.

I’m curious what sort of testing Microsoft has conducted to determine the impact of this change. The connection parallelism approach is used widely (including by the Virtual Earth team), and some servers may not be ready for the increase. My tests were conducted against only one host, but if similar results are experienced elsewhere, this may fall under the rubric of “don’t break the web.”

My advice to anyone who is using the connection parallelism trick is to perform a similar analysis of your application before IE8 is released. The new connection levels will create greater strain on your servers, and that may lead to occasional performance hiccups for your users. There are a few different approaches you can take to dealing with this change, but the most important first step is to understand the extent to which your application is impacted.

Update: Kris Zyp and Steve Souders have pointed out that IE8 will use 2 connections per host for dial-up users. This nicely addresses that concern, but the concern about 18 connections for pages using the CNAME approach still stands.

Google Code performance improvements: the Souders factor

March 16, 2008 on 1:05 am | In ajax | No Comments

Steve Souders is now at Google, and the Google Code team has taken some of the advice from High Performance Web Sites and applied it to reduce user-perceived latency. There is no magic in their performance improvements — the techniques (JS/CSS concatenation, CSS sprites, and lazy loading) have been discussed here and elsewhere in the past — but the user-centricity of the approach is what I find most cheering.

The explosion of web performance optimization tools and techniques would be meaningless if we were not focused on improving user experience, and the Google Code team clearly understands this message. The last approach they discuss, lazy loading, is a nice illustration. Rather than initializing the Google loader module in the traditional blocking manner (<script src="blah.js"></script>), the team used the non-blocking DOM scripting approach (document.createElement('script'), set src, append to head). A callback fired when the script finishes loading then loads the required APIs.
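Here is a minimal sketch of that DOM scripting approach as I understand it. It is not the Google Code team's actual code: the loadScriptAsync name is mine, and the document is passed in as a parameter (an assumption made so the function can be exercised outside a browser). Older IE versions signal completion through onreadystatechange rather than onload, which this sketch leaves out.

```javascript
// Sketch: load a script without blocking the parser, then run a callback.
// `doc` is expected to be the page's `document` object.
function loadScriptAsync(doc, src, onLoad) {
  const script = doc.createElement("script");
  script.src = src;
  // Fire the callback once the script has finished loading.
  script.onload = function () {
    onLoad(src);
  };
  // Appending the element to the head starts the download.
  doc.getElementsByTagName("head")[0].appendChild(script);
  return script;
}
```

In a page this would be called as loadScriptAsync(document, "blah.js", callback), with the callback making whatever API calls depend on the loaded script.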

This approach prioritizes the load time of critical user-visible page elements. To understand the effectiveness of this optimization, you need to measure the time at which the user would perceive the page to be loaded, as total page download time may overstate the actual latency. Using experience-centric measurements, the Google Code team saw improvements between 30% and 70% depending on the page.

IE8: The Performance Implications

March 7, 2008 on 1:25 am | In ajax | No Comments

Mix08 is here, and with it the first beta of IE8. John has a great roundup of the JS/DOM work, noting that “Internet Explorer 8 is our release.” He’s right.

I’ll run through a few of the items that have particular implications for performance.

  • This one is the most exciting for me: the IE team has finally upped the connection limit to 6 per host from the default of 2. I’ve talked before about DNS tricks to get around the 2 connection limitation, but having this support out of the box will be a great assistance in the war on round-trip latency as it’s easier to make more expensive network calls in parallel. This is especially sweet for Comet and the like where the persistent connection could previously monopolize half of the connections to your site. As you would expect, Joe Walker of DWR is happy.

    One thing I haven’t seen mentioned anywhere is the total connection limit. Previous versions supported 2 per host and 6 total. Is the new version 6 per host / 6 total or 6 per host / 18 total? I really doubt the latter, but if no one has the answer I’ll grab the beta this weekend and test it out.

  • W3C Selectors API: Last month I discussed the work Firefox and WebKit have done to implement the new Selectors API spec, and it’s nice to see Microsoft is joining the list. I share John’s concern that these black boxes have a significant potential (make that inevitability) of browser bugs, so smoothing over these will, as always, remain the job of libraries. But it’s nice to have that blazing speed under the covers.
  • DOM Storage and offline events are techniques still on the fringes of relevance. DOM Storage in Firefox 2, as well as Google Gears and its less nerdly cousin Dojo Offline, have a lot of promise, but to this point they’ve lacked a killer app due in no small part to the chicken and egg problem. Having Microsoft on board finally offering these HTML 5 features may help push us to widespread adoption.
  • I’ve dinged Microsoft for the lack of a Firebug-like tool since, well, I first used Firebug, and they finally have a clone. A clone in serious need of a makeover. Yeah, I’m shallow. For those keeping score at home, the sexiness hierarchy goes WebKit Inspector > Firebug > IEBug (or whatever it’s eventually called).
  • For the truly performance obsessed, there is a collection of optimizations to common low-level functionality, such as string concatenation and array manipulation.

All in all, some really cool stuff in this beta. If you want to give it a try without downloading, it’s already up on BrowserCam. Just like this:

Include, a new JS compression wrapper

February 27, 2008 on 10:45 am | In ajax | No Comments

Earlier this week, I talked about a tool which removes much of the tedium from generating CSS sprite maps. In a similar vein, Brian Moschel of the JavaScriptMVC Project pointed our good friends at Ajaxian to Include, a wrapper around Dean Edwards’ excellent JS compression tool, Packer.

Include is itself a fairly small chunk of JS which is designed to run in the browser for both development and production users. This approach has some nice advantages: there’s no need for server-side compression scripts, and it’s easy to create many different compressed files depending on the library requirements of different parts of your application. Expanding on that last point, you can select at browser load time which library to use within a specific page, giving you runtime flexibility.

The one thing I don’t like is that Include is packaged as a separate .js file. As I’ve discussed here many times, performance in modern broadband networks is dominated by latency. The round trip time to request the initial include.js, which is only 3 kB, will offset some of the gains from compressing and concatenating library files. In most use cases, the best performance approach will be to use include.js to compress your libraries only during development, replacing all include.js references in production with a single compressed library call per page.
