Tag Archives: web performance optimization

The languages of the browser, HTML, CSS, and JavaScript, are berated, harped on, abused, misused, and contorted to do all sorts of things they weren’t designed to do. HTML, at its core, is a semantic language intended to organize documents and the relationships between them. CSS is simple and straightforward, but hardly dynamic. Even with Less and Sass, the resulting output is a flat, static stylesheet that was never meant to power rich interfaces. JavaScript was designed in 10 days and, as we all know, is the most misunderstood language – thank you, Papa Crockford. As someone who writes JavaScript day-in and day-out, I am fully aware of its issues. However, the community is massive and the toolsets are plentiful. What was originally used for superfluous page tricks is now a great option for building an entire application in a single language. JavaScript is the only viable way – today – to express a dynamic interface in a browser.

HTML5 has been a big step forward, but it does not really deliver in terms of fluid interfaces, particularly on mobile. Older methods of DOM access and manipulation are flawed and rife with performance issues. Dealing with these issues is complex, and everyday development cycles should be abstracted from having to deal with limitations like these. Famo.us does this for us, and from what I’ve seen so far, it does it well.

Browsers have gotten really good. Our desktop machines are crazy powerful. Most of the time folks aren’t cognizant of any performance issues with the sites they browse on their desktop machines. All of this often masks the limitations of HTML/CSS/JS, or poor implementation by the developer. On mobile devices, however, the cracks in HTML5 start to show. Although smartphones and tablets today are amazingly powerful for their size, the mobile browsers they tote are a far cry from their desktop cousins. This is where it quickly becomes apparent that HTML5 is not as magical as we think it is – or have been told it is. When Zuckerberg – of Facebook – said “The biggest mistake we made as a company was betting too much on HTML5 as opposed to native,” it was a huge blow to the perception of HTML5 in the mobile space. Recently there have been rumblings of a front-end framework that is solely focused on providing a toolset to use HTML5 in a performant way. This framework is Famo.us, and it wants badly for the web to win.

Famo.us is a lot of things: a 3D layout tool, a physics engine, a native wrapper (yet to be seen), an MVC (yet to be seen), but more importantly it’s an attempt to solve the biggest rendering performance issues of HTML5. Past attempts to crack this nut have fallen short, in my opinion. Sencha Touch is a very mature JS MVC with a rich UI kit, but it is beholden to a heavily nested DOM that will grind your app to a halt if you try to get too fancy. Ionic is a more modern attempt to create a UI kit for mobile and leverage AngularJS as the app convention. Although its DOM is much lighter, it doesn’t address the fundamental issues of nested elements, reflows, repaints, etc.

Famo.us recently entered a controlled beta – adding a few hundred accounts a day, and allowing access to early guides and documentation. I am excited to be attending HTML5 Dev Conf for hands-on Famo.us training and a hackathon, and to be part of this beta release. What I have seen so far is very promising. There are a few things Famo.us is doing that will give a speed boost to the DOM by addressing how we structure and think about modern web apps.

The Flat DOM

Famo.us uses the concept of surfaces and modifiers for layout. Surfaces are what they sound like: flat surfaces, each represented in the DOM by a div node. Surfaces are dumb; they don’t know where they are, and they don’t know what size they are. Modifiers are objects that hold, and can modify, the state of a surface: opacity, x/y/z position, and a host of other properties. A surface’s associated modifiers control its properties, as well as those of any surfaces that follow it. This separation between surface and modifier lets the developer chain modifiers and surfaces together, creating a representational render tree in which a modifier can affect many surfaces below it. This concept is central to how one architects a layout: the association between surfaces is held in the JavaScript code instead of in the markup, which allows greater flexibility and expressiveness in the tree. The result is a DOM of surfaces that are siblings, and we avoid the pesky nested DOM tree of yore.
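To make the surface/modifier relationship concrete, here is a minimal sketch in the style of the early Famo.us examples. Module paths and option names follow the public beta documentation and may shift as the framework evolves, so treat this as an illustration rather than canonical API.

// A context is the root of the render tree; adding a modifier before a
// surface means that modifier's transform and opacity apply to the surface.
var Engine    = require('famous/core/Engine');
var Surface   = require('famous/core/Surface');
var Modifier  = require('famous/core/Modifier');
var Transform = require('famous/core/Transform');

var mainContext = Engine.createContext();

var surface = new Surface({
  size: [200, 200],
  content: 'Hello Famo.us',
  properties: { backgroundColor: '#fa5c4f', color: 'white' }
});

var modifier = new Modifier({
  transform: Transform.translate(100, 50, 0),
  opacity: 0.9
});

// The surface stays a flat sibling div in the DOM; its position and opacity
// live in the modifier, not in nested markup.
mainContext.add(modifier).add(surface);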

Matrix3D

There is a great article on HTML5 Rocks about CSS3 and hardware acceleration. What is not mentioned in that article is the matrix3d transform, which takes a 16-value matrix describing a point in 3D space along with rotation, skew, and scale. Famo.us wraps this property up and uses it as the main driver of its layout engine – flipping brilliant if you ask me. This lets it run operations against the matrices of many modifiers at once for super performant, complex transitions. Deeper under the hood, Famo.us leverages requestAnimationFrame to know when to update each matrix, ensuring users perceive the UI at 60fps.
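To see what those two primitives look like outside the framework, here is a plain-JavaScript sketch (the element id is hypothetical) that drives a matrix3d transform from requestAnimationFrame:

// Move an element 300px to the right by rewriting a single matrix3d value
// once per animation frame (older browsers may need a vendor-prefixed
// transform property).
var card = document.getElementById('card'); // hypothetical element
var x = 0;

function step() {
  x += 5;
  // matrix3d takes 16 values in column-major order; the 13th-15th values
  // are the x/y/z translation components.
  card.style.transform =
    'matrix3d(1,0,0,0, 0,1,0,0, 0,0,1,0, ' + x + ',0,0,1)';
  if (x < 300) {
    requestAnimationFrame(step);
  }
}
requestAnimationFrame(step);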

Sticky DOM

Nothing really groundbreaking here, I would say. Caching references to DOM objects is one of the first things listed in any best-practices guide for DOM speed. Many frameworks have optimized themselves around reducing DOM touches. They all have their own special flavor of how they get that done, but the concept is straightforward. I like to think of JS and the DOM as being on opposite sides of a river. There is a bridge and, as with all bridges, there is a bridge troll. Every time the bridge is crossed so that JS can talk to the DOM, the troll blocks your path and charges a speed tax. Famo.us’s modifier/surface separation means the developer rarely needs to reach across to the surface at all, since the modifier is where all the slick transitioning is handled.
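A trivial illustration of paying the troll as few times as possible (the element id is hypothetical):

// Slow: crosses the JS/DOM bridge on every iteration.
for (var i = 0; i < 100; i++) {
  document.querySelector('#list').innerHTML += '<li>' + i + '</li>';
}

// Faster: query once, build the markup in JS, and touch the DOM a single time.
var list = document.querySelector('#list');
var html = '';
for (var j = 0; j < 100; j++) {
  html += '<li>' + j + '</li>';
}
list.innerHTML = html;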

The performance problems Famo.us is attempting to address, along with its rich catalog of physics, animations, and other features, make it a framework to keep an eye on. Bootsoft is committed to the open web, and we are looking to leverage any piece of technology that helps move that goal forward. When Famo.us is ready for prime time, we will happily find the right project to unleash it on.

The Bootsoft team recently completed the development of coldwellbankerpreviews.com. Coldwell Banker Previews is the luxury arm of Coldwell Banker, and therefore a high-level requirement of the project was for the website to convey an air of luxury. This meant using high-quality images while maintaining fast page load speeds, rendering single image files in both color and black & white, and building sophisticated property search technology.

Speed vs. Consistency

With such a visual site, it would be easy to load the page with lots of data for the user to download in order to achieve the desired look and feel. A few years ago, a site like this would easily have been over 2MB to download. Fast forward to today, and one of the mantras we repeat when developing is:

“Delivering speed and good UX is more important than visual or functional consistency”

Creating a website to be “pixel-perfect” across all supported browsers should no longer be a priority. When users interact with your site, it is more important to deliver the content quickly. The New York Times recently published an article showing that users’ expectations for page load times are getting shorter and shorter; if the page takes too long to load, users will click away. There was also a recent case study of how Wal-Mart improved online business through better page speed performance.

Paul Irish, one of the leading developers on the Google Chrome team, wrote a blog post about “Tiered, Adaptive Front-end Experiences”, or the TAFEE approach:

TAFEE (pronounced: taffy): tiered, adaptive front-end experiences. Customizing the experience to the unique capabilities of each browser, prioritizing a fast and good UX over consistency.

A good way of illustrating this approach is to compare it to an escalator. You can associate a working escalator with the latest browsers used today (Chrome, Firefox, etc.). However, when an escalator breaks down, you can still use it as regular stairs (IE7/8, and in some cases IE9).

(cartoon by Jody Zellman from BWOG.com)

Another way of illustrating it, as mentioned in Paul Irish’s blog post, is to compare browsers to the various TV sets on the market. There are black & white TV sets, and then there are the latest and greatest HDTVs you can get today. It doesn’t make sense for TV broadcasters to deliver their shows in black & white just so that a show can look the same on all TV sets.

HTML5/CSS3 With Fallbacks

To that end, we looked to HTML5/CSS3 to deliver a visually stunning site, while providing graceful fallbacks for older browsers based on feature support. To achieve this, we used the following technologies/tools:

Not too long ago, front-end developers had to use a fair amount of jQuery (or equivalent) to achieve visual effects like animations and fading. Today, we can handle most of these effects with CSS3. This shifts the heavy lifting to the browser and also cuts down the code that is loaded on the page. Some of these effects are even hardware accelerated, which is easier on the batteries of users’ laptops and tablets.
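As a simple illustration of the shift (the class name and CSS here are placeholders, not the site’s actual styles): instead of animating opacity from JavaScript, the script only toggles a class and a CSS transition does the rest.

// Before: a JS-driven fade, e.g. $('.hero-panel').fadeIn(300);
// After: JavaScript only flips a class and the browser runs the animation.
// Assumed CSS:
//   .hero-panel { opacity: 0; transition: opacity 0.3s ease; }
//   .hero-panel.is-visible { opacity: 1; }
document.querySelector('.hero-panel').classList.add('is-visible');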

Reducing Images

With CSS3, we were able to shave down our image download size by cutting corners – essentially by not cutting corners (amongst other tricks). We use CSS3 rounded corners, gradients, opacity, text shadows, transitions, the works. Even the styled buttons are not images, which was especially helpful since Coldwell Banker Previews is a multi-language site. Older browsers simply fall back to flat colors and square corners, which is acceptable since the site still loads quickly and still looks good.

The biggest “bang for the buck” was utilizing HTML5 canvas to render the black & white versions of all images. While the site itself is very lightweight, the largest download is the large image carousel on the homepage, where each image accounts for more than half the file size of the entire homepage. Downloading just one color image, then letting the browser render the black & white version, reduces the download size tremendously.
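A sketch of the general technique (not the production code): draw the downloaded color image into a canvas and desaturate its pixels in place, so only the color file ever travels over the network.

// Convert a loaded <img> into a grayscale rendering on a <canvas>.
// Note: getImageData requires the image to be same-origin (or CORS-enabled).
function renderGrayscale(img, canvas) {
  canvas.width = img.naturalWidth;
  canvas.height = img.naturalHeight;
  var ctx = canvas.getContext('2d');
  ctx.drawImage(img, 0, 0);
  var frame = ctx.getImageData(0, 0, canvas.width, canvas.height);
  var data = frame.data;
  for (var i = 0; i < data.length; i += 4) {
    // Luminosity formula: replace r, g, b with a single gray value.
    var gray = 0.299 * data[i] + 0.587 * data[i + 1] + 0.114 * data[i + 2];
    data[i] = data[i + 1] = data[i + 2] = gray;
  }
  ctx.putImageData(frame, 0, 0);
}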

Resource Loading

One of the key goals for any website we create is to only deliver what the user needs. It doesn’t serve the user to load the entire kitchen sink of images, styles, and JavaScript files when they only need to see a fraction of that at any given time. With that in mind, we utilize a couple of techniques: lazy loading and resource management.

Lazy Loading

There is no shortage of image carousels on the Coldwell Banker Previews website, and it would be a huge download if the user had to fetch all of the images at once. This is why each carousel only loads what the user sees, then lazy loads any image that comes into view as the user interacts with it. In addition, all pages load images only when they scroll into view. This is especially useful on smaller screens, and when the user has to scroll through anywhere from 10-50 properties or specialists on a search results page.
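A minimal version of the idea (the attribute and function names are illustrative): images carry their real URL in a data-src attribute and are swapped in as they scroll into view.

// Minimal scroll-based lazy loader: swap data-src into src once an image
// enters the viewport, then stop tracking it.
function loadVisibleImages() {
  var images = document.querySelectorAll('img[data-src]');
  for (var i = 0; i < images.length; i++) {
    var rect = images[i].getBoundingClientRect();
    if (rect.top < window.innerHeight && rect.bottom > 0) {
      images[i].src = images[i].getAttribute('data-src');
      images[i].removeAttribute('data-src');
    }
  }
}
window.addEventListener('scroll', loadVisibleImages);
window.addEventListener('resize', loadVisibleImages);
loadVisibleImages(); // load anything already in view on page load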

In the screenshots below, the image on the left shows only the above-the-fold images loaded, while the image on the right shows the page after it has been scrolled to the bottom, triggering lazy loading of all images on the page.

We do not limit lazy loading to images, either. Many JavaScript and template files are only loaded when a user interaction requires them. A good example is the property detail page: when the user scrolls down, a mini-map shows the property location via Bing Maps. The Bing Maps library is fairly large, so we only load it when the map scrolls into view.
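The same pattern applies to scripts. A hedged sketch of the approach (the id and script path below are placeholders; the real loader on the site may differ): inject the map library only once its container is near the viewport.

// Defer a heavy third-party library until its container scrolls into view.
var mapLoaded = false;

function maybeLoadMap() {
  var container = document.getElementById('mini-map'); // hypothetical id
  if (mapLoaded || container.getBoundingClientRect().top > window.innerHeight) {
    return;
  }
  mapLoaded = true;
  var script = document.createElement('script');
  script.src = '/js/vendor/map-loader.js'; // placeholder path, not the real one
  script.onload = function () {
    // initialize the map here once the library is available
  };
  document.body.appendChild(script);
}
window.addEventListener('scroll', maybeLoadMap);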

Resource Management

While users are browsing the site, they only need a fraction of the resources used for the entire website. For example, when a user is on the homepage, they do not need any JavaScript files or styles necessary for the property detail page. Since this site runs on Java, we utilized wro4j, a web resource optimizer for Java. One of the benefits of using this tool is that it also pre-compiles our LESS files, which is highly recommended over processing the files on the client side.

Filling the Gaps With Polyfills

In an ideal world, we could leave behind the browsers that do not support the latest and greatest features of modern browsers. Letting functionality like the HTML5 placeholder attribute or HTML5 form validation simply fall away in older browsers would be acceptable if we didn’t have a large user base on those browsers. To fill this gap, we use what are called “polyfills”. As described by Paul Irish, a polyfill is “a shim that mimics a future API providing fallback functionality to older browsers”.

To include these polyfills only for the browsers that need them, we use Modernizr for feature detection and load a polyfill only when the feature cannot be found.

An example of how it works:

Modernizr.load({
  // checks if the HTML5 placeholder feature exists in the browser
  test: Modernizr.input.placeholder,
  // if the feature isn't found, load the jQuery placeholder plugin
  nope: 'jquery.placeholder.min.js',
  callback: function() {
    // after the polyfill is loaded, initialize it for older browsers
  },
  complete: function() {
    // runs a set of code regardless of whether the feature exists or not
  }
});

As illustrated, the polyfill is only loaded if it is needed; otherwise we save the bandwidth. Modernizr has a built-in utility called “yepnope” (which powers Modernizr.load), adding this flexibility of conditional loading and behavior.

Putting It All Together

Today, the site averages 2-3 seconds for a first-time view, and comes in at a fraction of a second on repeat views.

A few novel approaches to development that were employed in addition to our normal process include:

  • A new object model for fetching and saving customizable data, using a website -> page -> component pattern that stores data directly in JSON format so that little to no pre-processing is needed for display, keeping front-end AJAX calls very fast.
  • A new versioning strategy to avoid the merge issues that occur when the trunk gets too far out of sync with the development branch.
  • Built-in language support for Spanish, German, French, Japanese, and English.
  • Locale support to handle automatic selection of currency and area units based on the selected language.

Both Coldwell Banker and Bootsoft are very happy with the final product. Great job team!

Visit http://www.coldwellbankerpreviews.com.

At the end of my last post on base64 encoding, I wanted to see how the numbers changed when using a large image sprite, and also how page speed was affected.

To do this I decided to work with one of Google’s larger image sprites, nav_logo107.png.

To focus the test, I removed two of the previous test cases (the individual binary images and the split base64 encodes), as it was clear from the first post that they would not be small enough to compete.

That leaves a pure binary sprite and a sprite encoded as base64, for which I created two test pages:

First: File Size

The good news is that the results from my previous post still held true.

If we set the baseline that the delivered CSS in both cases is minified and gzipped, then both end up being almost exactly the same overall byte size.

Byte Size Comparison
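For context, the difference between the two test pages comes down to how the sprite reaches the browser: as a separate binary file referenced from the CSS, or inlined into the CSS as a base64 data URI. The Node snippet below is purely illustrative (not the harness used for these numbers, and the .nav selector is just a placeholder); it shows how the two rules compare in raw byte size before gzipping.

// Illustrative only: compare the byte size of a CSS rule that references the
// binary sprite against the same rule with the sprite inlined as base64.
var fs = require('fs');

var png = fs.readFileSync('nav_logo107.png');

// Binary sprite version: tiny rule, but the image costs a second request.
var binaryRule = '.nav { background-image: url(nav_logo107.png); }';

// Base64 version: no extra request, but the rule carries the whole image
// (base64 inflates the raw bytes by roughly a third before gzip).
var base64Rule = '.nav { background-image: url(data:image/png;base64,' +
  png.toString('base64') + '); }';

console.log('binary rule bytes:', Buffer.byteLength(binaryRule));
console.log('base64 rule bytes:', Buffer.byteLength(base64Rule));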

Second: Page Speed

To check page speed I ran 3 sets of tests using webpagetest.org:

Each test was run 5 times for both First View (no cache) and Repeat View (cache).

My initial thought was that the base64-encoded sprite would be faster (as it is one less request)… and it is, when looking at total page load time.

In most tests, the total page load time for b64-sprite.html was around 200ms faster without the additional image request.

So if we were only looking at total load time then we have a clear winner.

But take a moment and look at start render times in those same tests.

On “First View” tests, sprite.html has start render times that are around 200ms to 300ms faster in each comparison.

You can see what this translates to visually in this video: http://www.webpagetest.org/video/view.php?id=120412_5129cebbfde1452b10358afa3a1f5dacd106366d

Have you ever heard a user say the phrase “this page feels slow”?

I find that, in most cases, when questioned further, the feeling of slowness comes down to the time spent waiting for something (anything) to render to the screen.

This is important to keep in mind when deciding if base64 encoding sprites will make sense for your site.

Yes, you are saving a request (which will reduce total load time), but you are also making your CSS files larger (which could increase start render time).

Even in “Repeat View”, where resources are cached locally, the start render times for sprite.html are faster. Yes, I know we are talking about tens of milliseconds, but think about this on a full production site with lots of other resources. Even though the CSS is cached locally, the browser still needs to pull it from the local cache and render it, so if the cached file is larger it will take longer to render.

As an aside: IE8 throws a nice wrench into the gears here as well. IE8 only supports data URIs up to 32KB in size, so if you plan to base64-encode large image sprites you will run into issues with IE8 (basically they won’t render at all in that version).

Conclusion:

I hate to say it, but which option is best will depend mostly on the project.

I would ask the following questions if you are using a large image sprite:

1) Are start render times more important than overall load time for your users? If yes, then based on the above findings it makes sense to try the pure binary sprite first and run speed tests.

2) Do you still support IE8? If yes, go with a pure binary sprite.

3) Do you plan to set the same expires headers on CSS and images? If not, consider a pure binary sprite. That way, when your CSS is cleared from a user’s local cache, they won’t have to re-download a larger file each time.

Thank you for reading, and please let me know your thoughts in the comments below.

A good friend introduced me to the concept of base64 encoding, and I have gone down the road of researching which scenarios it is best suited for.

On that road I found a great post by David Calhoun comparing the byte sizes of binary images to their base64 counterparts. After reading it, I thought it would be interesting to extend his UI icon test one step further and include an image sprite to see how it compares in byte size.

Note: I understand the drawbacks David mentioned regarding sprites; however, I believe those drawbacks are manageable considering the potential gains from a file size standpoint, especially if you are managing more than just icons in your sprite files.

How I set up the test:

So let’s see how the numbers work out…

What I thought I would see:

  1. Test Case #2 (icon-sprite.png + combined-binary.css) would be the smallest in total file size when compared to Test Case #1 (split-binaries.css plus its images) and Test Case #3 (split-b64.css)
  2. Test Case #4 (combined-b64.css, which replaces the image sprite with its base64 counterpart) would be the smallest total file size after gzipping
  3. Gzipping binary images would sometimes lead to larger file sizes, as those binaries are typically compressed fairly well to begin with (or at least should be, if you care about responsiveness) – a quick way to check sizes like these is sketched after this list
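A comparison like this can be sanity-checked with a few lines of Node. This is an illustrative sketch rather than the exact process used to produce the results, reusing the test-case file names from the list above:

// Illustrative sketch: print raw and gzipped sizes for each test-case file.
var fs = require('fs');
var zlib = require('zlib');

function gzippedSize(path) {
  return zlib.gzipSync(fs.readFileSync(path)).length;
}

['icon-sprite.png', 'combined-binary.css', 'combined-b64.css'].forEach(function (file) {
  console.log(file,
    'raw:', fs.statSync(file).size, 'bytes,',
    'gzipped:', gzippedSize(file), 'bytes');
});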

The Results:

  • View on Google Docs
  • Assumption #1 turned out to be true. The sprite created from the 5 icon images was smaller in file size than both the separate images and the separate base64 encodes.
  • Assumption #2 did not turn out to be true. However, the difference was so small that you could pretty much call it equal to Test Case #2 (we are talking about 23 bytes here).
  • Assumption #3 was interesting: gzipping the binary sprite resulted in a slightly larger file size, while gzipping the images individually ended up slightly smaller in total. The differences are tiny either way, and given the overhead of the gzip process this step is probably not needed if you are good about optimizing your source images.

So what does this really mean?

I think the numbers above depend on what you prefer to use and how you like to work. To be transparent, I have always preferred image sprites. In my opinion, managing them gets easier the more you use them, and there are a lot of good techniques today to get around some of their earlier drawbacks (such as using pseudo-selectors to work around clipping issues). So, considering the findings above, I would personally stick with sprites.

However… you could possibly gain some benefit by making a sprite, converting it to base64, and then gzipping it. The gain here would be one less request (and that could mean a lot on a larger site). To really test this idea, I am going to build a case using a larger sprite file that has more than just icons in it, to see how it scales.

In the future I am going to run performance tests to see the baseline speed for each case from different locations (I’ll use webpagetest.org). The number of requests can sometimes make a huge difference in those tests, so that should help put some additional clarity around this topic as well.

Further reading on base64:

  • Wikipedia entry
  • ftp://ftp.rfc-editor.org/in-notes/rfc3548.txt
  • Understanding Base64 Encoding
  • http://calebogden.com/base64-encoded-fonts-images/
  • http://www.nczonline.net/blog/2010/07/06/data-uris-make-css-sprites-obsolete/