A Snapshot of The Yahoo! Photos Beta (from 2006)

"DHTML" at the Web 1.0 Summit, San Francisco, 2005 (photo on flickr.com)

I moved to California in early 2005 to work at Yahoo! on the redesign of the Yahoo! Photos service. I built some really fun prototypes and worked with the UED/design group in an effort to make something more desktop-like (selection model, drag and drop, animations etc.) at a time when Google Maps had just been released, Jesse James Garrett of Adaptive Path had just written the article that coined the term "Ajax", and the wave of new Javascript libraries such as YUI was still in the planning stages.

Javascript-driven Desktop-like UI

Video: Photos 3 UI in action, showing the selection model, drag and drop, rearrange, keyboard modifiers, photo tray, animation effects and other UI elements. I also had an experimental easter-egg prototype that included a few key [read: cartoon-like] sound effects when the user successfully completed actions, and others for when the user failed. It was entertaining, though short-lived.

Thoughts from a Web Development perspective (2006)

The redesigned Yahoo! Photos went into beta in the US, and ultimately closed; its users migrated to other photo sharing services such as Flickr, which I currently work on. Douglas Crockford recently mentioned Yahoo! Photos in a talk on Ajax performance as one example of a project that hit "the wall", a performance issue in IE 6, and it seemed apt to bring up this incomplete article written around the time Y! Photos was going into public beta. The following are some notes, thoughts and learnings from my time working on the redesign of Yahoo! Photos, written in 2006, as they apply to Javascript/DOM and browser performance, UI considerations, design elements and so on. These are my own thoughts and opinions, and are not necessarily those of Yahoo! Inc.

A Brief History (And A Rant)

How Did We Get Here?

The new Yahoo! Photos started out with interest from the design group in using a drag-and-drop-style interface; this was before Jesse James Garrett's infamous "Ajax" article and the subsequent renewed interest in Javascript, so there were some questions around feasibility. Prototyping started on selection and drag-and-drop models for a grid-based thumbnail view, similar to the grid-based file views presented by Microsoft's Windows Explorer and Apple's Mac OS X, and turned out to be successful.

Given browsers do not have native interfaces, APIs or widgets for this sort of desktop-like behaviour, these features had to be developed from scratch. Thoroughly-tested prototypes developed for the selection and drag-and-drop models and other features have been evolved and extended during development, and are in use at the time of writing.

In more recent times, the Yahoo! User Interface Libraries have been released, providing Javascript packages that include event handling, element positioning, animation, XMLHttpRequest (Ajax) connection management and debugging/logging. The new Photos makes extensive use of the Connection Manager for retrieving photo data, copying and renaming photos, and retrieving other dynamic content, and it has made creating and handling XMLHttpRequest calls quite painless.
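
As a rough sketch of what a Connection Manager call looks like (this assumes YUI's yahoo.js and connection.js scripts are loaded; the URL and parsing routine here are hypothetical placeholders, not Photos' actual code):

// Sketch only: assumes YUI's yahoo.js and connection.js are included.
// The URL and parsePhotoData() are hypothetical placeholders.
var callback = {
  success: function(o) {
    parsePhotoData(o.responseXML); // parse the returned photo data
  },
  failure: function(o) {
    // o.status contains the HTTP status code of the failed request
  }
};

YAHOO.util.Connect.asyncRequest('GET', '/api/photos?start=1&count=25', callback);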

How Much Is Too Much?

This project has been a lot of fun to work on, though for myself it has not been without some sanity checks. I personally consider the new Yahoo! Photos to be pushing the boundaries of the browsers' capabilities, almost to the point where I find myself asking, "Should we really be doing this?"[1] We have made some really neat technical accomplishments with the new Photos, but the level of development complexity has risen significantly in an already-hostile development environment.[2]

Due to the excitement surrounding "Ajax" and related approaches to development, all kinds of ideas are being tried out on the web, some more sustainable than others. I think we've made an active effort on Photos to keep realistic features and goals in mind: being aware of, and working within, the limitations of the currently-supported browsers, considering maintainability of code and so on, so as not to paint ourselves into a corner with over-the-top requirements or unrealistic goals.

In talking to friends and co-workers, I remind them (and myself) to keep "How much is too much?" in mind when working on new projects such as this. Once your work is out there, the development method or language used remains important to the developers for maintenance, but it ultimately falls short of the user's interest in the product of that work: "Is the site fast? Does it have the features I want? Is it easy to use?", etcetera.

"Desktop Model" Considerations

In order to provide a more desktop- or "application-like" experience, a lot of time has been spent developing and optimising the Javascript and related code which makes it all possible. Choosing to implement drag-and-drop on a project, for example, may be expensive; the drag-and-drop "paradigm" brings with it many related actions from the desktop world, such as right-clicking (context menus), selection model (marquee selection) interactions, keyboard shortcuts and so on. Once users realise they can drag and drop items on your web site, they may be quick to try other actions they're used to from their desktops (and subsequently be confused or disappointed if said features do not work.) Ultimately, when emulating the desktop model, code has to be written to handle many different interaction use cases in an effective, efficient and maintainable way to be successful.

To quote our theoretical Web Developer, and to summarize: "I don't know if Javascript was meant to do all of this, but we've made it work thus far - so here are some findings we made along the way!"

Nuts and Bolts: Technical Considerations

Javascript can make the browser do more than most people might think, even if there is in fact a lot of trickery going on behind the scenes. Sometimes that's what it's all about.

The following are some of the areas that were considered when developing and testing features on the new Yahoo! Photos.

  • Javascript performance (looping, logic)
  • DOM performance/markup templating
  • Memory/CPU use
  • Throttling event handlers/timer-related calls
  • Debugging/Development Techniques

Javascript performance (looping, logic)

Some general theories apply for making common Javascript logic and evaluations faster; for example, considering the way looping comparisons are done. The performance gains in most looping cases may be negligible unless dealing with larger collections, but it probably doesn't hurt to write more efficient code from the get-go provided it's not significantly more difficult for the developer to read or debug.

For example, an inverse loop:

// "Inverse" loop: someArray.length is read only once, and each iteration's
// i-- doubles as a cheap comparison against zero. (Iteration order is
// reversed, which must not matter for the work being done.)
for (var i=someArray.length; i--;) {
  someArray[i].doSomething();
}

DOM Performance/Node Templating

Javascript itself seems to be pretty fast and efficient when working in its own world; optimizing loops isn't a bad idea, but my findings have been that interactions with the DOM are considerably more expensive, and are consequently one of the most common Javascript bottlenecks encountered.

In the age-old debate of "DOM vs. innerHTML" in regards to dynamically inserting and updating content, the standards-based ideal is appendChild(). Photos takes advantage of HTML "templates" which it clones and re-uses for elements like photo thumbnails, dialogs and so on. Having these kinds of "widgets" inline in the HTML (or externalised and fetched via XHR as needed) makes internationalisation efforts and later modification easier. Separation of concerns (presentation in CSS, logic/behaviour in Javascript, content in HTML) is also maintained this way.

To make the standard DOM methods comparable in performance to using innerHTML, simply modify, clone and append all nodes "offline" in Javascript, then append the completed document fragment (or node collection) to the document when finished. Many approaches I have seen for, say, 100 items call appendChild() once per item being added, which is akin to modifying innerHTML 100 times instead of building a string and changing innerHTML once. It's inefficient because, from what I understand, the browser will busily try to reflow the document each time innerHTML is changed or appendChild() is called on a node within the document.

Photos uses an inline HTML template for a photo item, defined as an LI, a few DIVs, an A and an INPUT. This element is first modified to match the kind of view (based on owner/non-owner, permissions, guest view and so on), and then a "page template" is created by cloning this item, children included, via cloneNode(true) 100 times (that is, 100 photos per page.)

This empty page template is then cloned and appended whenever a new, full page of photos is needed. A cloneNode() followed by an appendChild() call does the trick.
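
As a minimal sketch of the technique (the element IDs and count here are illustrative, not Photos' actual markup):

// Sketch: build nodes "offline" in a fragment, append to the document once.
var template = document.getElementById('photoItemTemplate'); // the inline LI template
var fragment = document.createDocumentFragment();

for (var i = 0; i < 100; i++) {
  var item = template.cloneNode(true); // deep clone: children come along
  item.id = 'photoItem' + i;
  fragment.appendChild(item); // still "offline" - no document reflow yet
}

// One appendChild() call; the browser reflows the document only once.
document.getElementById('photoList').appendChild(fragment);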

CSS Class Name vs. Direct Style Changing

Changing the class name of an element is a nice way to use Javascript to dynamically change elements, while maintaining separation of concerns. Javascript could apply an "active" class name to an image, with matching CSS: img.active { border:1px solid #ff3333; }

Performance seems to vary from browser to browser, but generally it seems to be faster to change an element's visual appearance directly via the Javascript "style" attribute, rather than to change a class name on that element. eg. someImage.style.border = '1px solid #ff3333';

This method appears to be more efficient when changing a small, specific number of items. Sometimes, however, a single class name change is the more effective route: if you need to change all elements under a given container, it is more efficient to change the class name of the parent container which holds the affected elements and let CSS do what it does best. eg., #someImages.active img { border:1px solid #ff3333; } - by changing the class name of "someImages" to "active", all of its child images will be updated.

Photos uses a mix of both CSS class and direct approaches where appropriate. When the user is drawing a selection marquee over a number of photos, it is faster to directly modify the style attributes of the selected photo items. Changing class names on the photo items as they were highlighted made the browser eat up more CPU and therefore took longer to update visually.

In the case of showing an "inline dialog", sometimes a modal effect is desired which focuses the user's attention on the dialog by graying out the body background. We accomplish the gray-out by simply changing the class name of the body element to "covered", which causes the "cover" div element to be shown and applies a few other rules. CSS' cascading nature makes it easier to maintain than the inline script method.
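
A sketch of the idea follows (the "cover"/"covered" names are from Photos as described above; the CSS rules and function names shown are assumptions):

/* Assumed CSS - the actual rules in Photos differ:
   #cover { display:none; }
   body.covered #cover { display:block; }
*/

function showModal() {
  // CSS takes care of showing the cover div and any related rules
  // (assumes no other class names are in use on the body element)
  document.body.className = 'covered';
}

function hideModal() {
  document.body.className = '';
}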

I have done some research and have written about class name changing relating to animation on my personal site in the past.

Managing many events, DOM Elements + JS Objects

  • Inverse (reverse?) looping
  • DOM performance / ease-of-templating considerations
  • Memory/CPU considerations
  • Create objects on-the-fly, as-needed
  • Destroy objects on-the-fly
  • IE: "The Wall"
  • Event Assignment/Handling
  • Throttling of handlers/timer-related calls
  • Debug early, debug often

Memory Management

In short, memory use becomes more of a consideration as Javascript is used to create more objects and store more data.

There are several approaches Photos has taken to prevent runaway memory use:

  • "Defensive programming" against known memory leak patterns
  • Object-Oriented Constructor/Destructor-style management approach
  • Minimize use of heavy/expensive objects/assignments
  • Retrieve, create and destroy data on-the-fly

Programming Against Memory Leaks

When Javascript objects hold references to DOM nodes (and nodes refer back to those objects), there is a chance memory will not be properly freed and released when the page is unloaded (ie., the user leaves the page or reloads.) The result is a browser process that takes an increasing amount of memory until it is finally closed.

Common Causes:

  • JS Object->DOM references
  • "Circular reference" (JS Object->DOM->JS Object)

A common leak condition occurs when a Javascript object has a reference to a DOM node, and an event handler (eg., mousedown) assigned to that node points back to a method within the Javascript object.

Example leak:

function SomeObject(oNode) {
  this.o = oNode; // reference to a DOM node
  this.mouseoverHandler = function() {
    // handler code goes here
  }
  // assignment of event handler pointing back to this object;
  // node -> handler -> closure scope -> node completes a circular reference
  this.o.onmouseover = this.mouseoverHandler;
}

This issue affects Internet Explorer most noticeably, and can be worked around by defensively programming against it.

  • "Build it up, tear it down"
  • On destruction of object (when no longer needed, or at unload time):

    • Unhook event handlers on DOM nodes
    • Unhook references to DOM nodes
    • If paranoid, "set everything to null" when objects are no longer needed.

Photos uses initialization/destructor methods in most objects and has separate methods for assigning and removing event handlers, so they're easy to reuse (and it's easy to keep track of which handlers you're using.)

Leak-free[3] example:

function SomeObject(oNode) {
  var self = this;
  this.o = oNode;
  this.mouseoverHandler = function() {
    // handler code goes here
  }
  this.mouseoutHandler = function() {
    // handler code goes here
  }
  this.assignEvents = function() {
    // use self (rather than this) so the method also works if called detached
    self.o.onmouseover = self.mouseoverHandler;
    self.o.onmouseout = self.mouseoutHandler;
  }
  this.removeEvents = function() {
    self.o.onmouseover = null;
    self.o.onmouseout = null;
  }
  this.init = function() {
    self.assignEvents();
  }
  this.destructor = function() {
    self.removeEvents();
    self.o = null;
    self = null;
  }
}

function doCleanup() {
  someObject.destructor();
  someObject = null;
  window.onunload = null;
}

var someObject = new SomeObject(document.getElementById('someNode'));
someObject.init();

window.onunload = doCleanup;

Having a separate "init" method makes objects more flexible, as it allows for an "Instantiate earlier, initialise later" approach.

On window.onunload in Photos, several core objects' destructor methods are called, which subsequently trigger destructor methods on their child objects, if applicable. This chain of events effectively cleans up and prevents most memory leaks. To be safe, the objects themselves are also nulled out after being destroyed.

Identifying Memory Leaks

I use different methods for Internet Explorer and Firefox. Firefox has a "leak monitor" extension which looks for particular types of leaks (Object->DOM references as above, I believe.) With IE, I don't use a particular program to determine which elements leaked - I watch the memory allocated to the process in Windows Task Manager.

Using IE, you can monitor the "Mem Usage" column under the "Processes" tab of Windows Task Manager. I observe the memory at load time, after some use, and finally after refreshing the page. IE typically starts at around 20 MB of RAM used, so if continual refreshing and reuse of your page increases memory use, you've got leaks. From what I've seen, IE's reported memory use can vary by several hundred KB on refresh, but it's generally stable if there aren't any leaks. Process of elimination, in combination with the Firefox leak extension, can help to identify and eliminate the sources of the issue.

Internet Explorer: "The Wall"

"The Wall3" is a mysterious browser behaviour specific to Internet Explorer (possibly just version 6,) when a certain amount of javascript objects are present in memory.

It may be a very implementation- and approach-specific issue, but Photos hit the wall quite hard in development and a compromise had to be made. Photos creates a "PhotoItem" object instance for each thumbnail that is shown. Each object has a number of DOM node references (eg. an image or a DIV), methods, properties and, in some instances, references to other objects.

Having many more than 100 "PhotoItem" objects active at a time begins to degrade DOM, XML parsing and general Javascript performance in IE (seen in development on IE 6, Windows XP) in a seemingly non-linear fashion: the more active photo objects, the more memory is used at once and the slower the browser runs. Though unconfirmed, there may also be a correlation between this and the number of image elements being generated and loaded.

Firefox seems to be immune to this issue - when going between "pages" of photos, no noticeable performance degradation (javascript, XML parsing, DOM) is seen. Douglas Crockford has theorised this may be related to the browser's garbage collection routines spending too much time trying to manage a large number of objects, looking for items to clean up.

"The Wall" Workaround

To avoid hitting the wall, Photos destroys PhotoItem objects when they are no longer needed. PhotoItem objects are created on demand as the user views and scrolls through photos. Once they move to the next "page" (Javascript-driven, same URL) of photos, the previous page is hidden from view and its related objects are destroyed. Throwing away these objects is enough to avoid the effects of the wall without introducing other performance side-effects.
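
A simplified sketch of this page-change cleanup follows (PhotoItem and destructor() are described above; the surrounding structure and parameters are assumptions):

// Sketch: tear down the old page's objects, then build the new page's.
var activeItems = []; // PhotoItem objects for the currently-visible page

function showPage(photoData) { // photoData: assumed array of photo info from the API
  // destroy the previous page's objects to stay clear of "the wall"
  for (var i = activeItems.length; i--;) {
    activeItems[i].destructor(); // unhooks events, nulls DOM references
    activeItems[i] = null;
  }
  activeItems = [];
  // create fresh PhotoItem objects only for the photos now in view
  for (var j = 0; j < photoData.length; j++) {
    activeItems.push(new PhotoItem(photoData[j]));
  }
}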

Event Handling Approach

To provide a desktop-like experience, Photos must support specific actions related to drag-and-drop, drawing a selection marquee, using keyboard shortcuts and so on. The following is the approach we took to assigning event handlers.

Lightweight event handling: Event Delegation

The ideal approach would allow assignment of event handlers to individual objects, but the number of handlers required per object adds up quickly. A thumbnail would have to watch for mousedown, "drag start", mouseup, mouseover/out for inline editing of its caption and so on. At five event handlers per photo item, this would result in 500 event handlers being assigned for one page of 100 items - very expensive.

Since event handlers can be somewhat expensive to individually create, assign and maintain, Photos has taken a higher-level approach of assigning a "controller" object which watches events at the document level, determines the object that should handle the event, and dispatches the event accordingly.

Event Delegation Pseudocode

This is the general approach taken by Photos to minimize the number of assigned event handlers. To begin with, only mousedown is watched at the global level; additional event handlers are then assigned and removed on the fly as needed, depending on the action being taken.

 - Watch "global" document.onmousedown
 On mousedown:
  Capture event target (element that received mousedown)

  IF (event target is a child element of the thumbnail area - ie., user clicked inside thumb area) {

   CASE: mousedown on a thumbnail {
     watch mousemove for start of drag
     ON (mousemove) {
       begin drag/drop, assign event handlers (mousemove/mouseup event handlers change)
       mousemove: drag photos and coordinate checking (target "collision detection")
       mouseup: stop drag/drop (reset handlers,) determine drop target, take action if necessary
     }
     ON (mouseup) {
       IF (CTRL key was held down, multi-select mode) {
         Toggle selection (highlight) on this photo, retaining other selections
       } ELSE IF (SHIFT key was held, range select mode) {
         Select "from last-clicked photo (or first photo) to this one"
       } ELSE (no modifier keys) {
         Select only this single photo
       }
     }
   }

   CASE: mousedown in empty/whitespace inside photo area {
     watch mousemove for start of selection marquee
     on (mousemove) {
       draw selection marquee, do coordinate checks to determine photos to highlight
       selection highlighting logic changes to "toggle" with CTRL key - existing selection retained, items "hovered over" toggle their selection status.
     }
     mouseup: reset event handlers
   }

  }
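
In Javascript terms, the top level of this approach might look something like the following sketch (the class name and dispatch functions are illustrative, not Photos' actual code):

// Sketch of the delegation "controller"; names are illustrative.
document.onmousedown = function(e) {
  e = e || window.event; // IE event model fallback
  var node = e.target || e.srcElement;
  // walk up from the event target, looking for a photo item
  while (node && node != document.body) {
    if (node.className && node.className.indexOf('photoItem') != -1) {
      handlePhotoMouseDown(node, e); // dispatch to the matching object's logic
      return;
    }
    node = node.parentNode;
  }
  // no photo item found: mousedown was on whitespace, perhaps start a marquee
  startMarquee(e);
};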

Efficiency: Throttling event handlers/timer-related calls

Something as simple as window.onresize = myResizeHandler; can be expensive, given this handler may be called many times during a resize operation; the responsiveness of the UI can appear slower depending on the amount of CPU time required to execute your handler - particularly if your code is causing the browser to reflow the document.

Browser implementations of event handlers differ in the number of times a handler may be called during an operation. IE will fire onresize() many times while you are "drag-resizing" (dragging the bottom right-hand corner); Firefox, in this case, will fire only once after you have stopped moving the mouse. (If you grab the side of the window and resize on the X axis, it appears to fire only after you release the mouse button.)

Either way, you can expect onresize() to fire excessively. The same applies to other events such as window.onscroll. You may want your function to run only once, after the last resize/scroll event has fired, to prevent excessive calls to expensive functions.

This could also be seen as "ensure this function is called 'n' msec after the end of a series of excessive calls."

Event Handler Throttling Example

The following will attempt to limit the number of times an event handler function actually gets called; in the event IE makes many calls to a resize handler, the handler will effectively be "rate limited" to two executions per second in this case. This will help prevent the user's CPU from being floored due to function code and reflow/calculation work from excessive event handler calls.

function SomeObject() {
  var self = this;
  this.lastExecThrottle = 500; // limit to one call every "n" msec
  this.lastExec = new Date();
  this.timer = null;
  this.resizeHandler = function() {
    var d = new Date();
    if (d-self.lastExec<self.lastExecThrottle) {
      // This function has been called "too soon," before the allowed "rate" of twice per second
      // Set (or reset) timer so the throttled handler execution happens "n" msec from now instead
      if (self.timer) {
        window.clearTimeout(self.timer);
      }
      self.timer = window.setTimeout(self.resizeHandler,self.lastExecThrottle);
      return false; // exit
    }
    self.lastExec = d; // update "last exec" time
    // At this point, actual handler code can be called (update positions, resize elements etc.)
    // self.callResizeHandlerFunctions();
  }
}

var someObject = new SomeObject();

window.onresize = someObject.resizeHandler;

Incomplete notice: This was as far as I expanded on my notes, so some items are sparse.

Loading Data, Creating Objects On Demand

 - watching of scroll, resize, key press, mousedown

Photo display logic:
 - Effective method to discuss: algorithm for "find number of photos on-screen"
 - scan left to right, then top to bottom

Creating objects / requesting data on-the-fly

 - Algorithm to show a page of photos:
   IF (!photo data) {
queue and send request for nearest block of photo data (based on current on-screen photos), re-call this function when complete
     exit
   } ELSE {
     photo data is available.
     IF (no DOM nodes for this page) {
       clone and append DOM nodes offline
     }
   }
   Map "photos" data object to this new collection of photos
   Swap current active "page" (DOM nodes) with new "page"
   exit

Loading content "on demand"

 - Load thumbnail images and create photo item objects on demand - optimize bandwidth and CPU use
 - Create a page of empty photo "templates" (100 copies of a template item) - DIV, span, IMG, A etc.
 - Effect is an empty-looking page with "placeholders", but full height of page/items is already allocated (scrollbar is present, etc.)
 - Determine which photo items are visible on screen
 - Create photo item objects on demand (results in dynamic API requests for photo data, if needed)

 - Algorithm:
 Get current on-screen photos (eg. #10 through #25)
 IF (photo objects exist for on-screen items) {
   IF (photos have not loaded) {
     queue loading thumbnails (up to a limit, eg. 8 at a time)
     onload: queue next available thumbnail for load ("next to load" determined at onload time)
   }
 } ELSE {
   if (photo data is not present) {
     API request: get photo data for nearest "block" of photos (eg. we load 25 photos at a time from the API, so 1 through 25), mark these photos as being "queued" for incoming data (preventing re-checking of load status)
     exit
   } ELSE {
     create photo objects with related data from API
     fill thumbnail loading queue with these items as applicable
   }
 }
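
A sketch of the thumbnail loading queue described in these notes (the limit of 8 comes from above; the photo item properties and method names are assumptions):

// Sketch: load up to 8 thumbnails at once; each onload pulls the next from the queue.
var loadLimit = 8;
var loadQueue = []; // photo items waiting for their thumbnail
var loading = 0;

function queueThumbnail(photoItem) {
  loadQueue.push(photoItem);
  loadNextThumbnail();
}

function loadNextThumbnail() {
  if (loading >= loadLimit || !loadQueue.length) {
    return; // at the concurrency limit, or nothing left to load
  }
  var item = loadQueue.shift();
  var img = new Image();
  loading++;
  img.onload = function() {
    loading--;
    item.showThumbnail(img.src); // assumed: swaps the placeholder for the real image
    loadNextThumbnail(); // "next to load" determined at onload time
  };
  img.src = item.thumbnailURL; // assumed property on the photo item
}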

"What we'd like to do" vs. "What we end up doing" / ideal vs. realistic cases

Assigning event handlers: Event handlers assigned to each individual object vs. "event delegation" ("traffic cop" approach, single event assignment which dispatches events to applicable objects)

Manipulating DOM elements: Changing class name vs. directly modifying style attributes via JS - the latter is faster, though not as clean because it mixes behaviour and presentation layers (CSS is now "inline" in Javascript, due to inline style references.)

Fun Stuff

 - Animation/UI effects considerations
 - Practical vs. eye candy

Animation/UI effects considerations

A tricky area at best.

Browsers handle Javascript animation (manipulating style properties of DOM elements such as width, height, position (left/top), opacity and so on) with varying degrees of efficiency.

Javascript animation has always been a bit unusual, given that it relies on timeout- or interval-based "timing" methods in order to work, and these vary in accuracy depending on CPU use and other factors. (JS timeout/interval calls are always "polite" in that they will wait for the CPU to be free before executing, so it's possible a 500 msec timeout would not actually run until 1000 msec have passed, for example. This makes it difficult to produce consistent frame rates across different browsers.)

From the research and testing I've done over the years, I think animation efficiency (the ability to run animations smoothly) may be tied to hardware - specifically, graphics card acceleration. An 800 MHz Pentium III runs Photos 3's animation effects quite smoothly with Internet Explorer, which seems to take advantage of DirectX or otherwise use hardware acceleration. By comparison, Firefox on the same machine has a much lower frame rate (and higher CPU use.)

Given that some older Javascript animation work I've done seems to run fairly smoothly on Safari in most cases, I wonder if layout complexity (large numbers of relatively-positioned, floated elements etc.) is also a factor. I have not done much testing or profiling here, but it's possible the browser may be spending a lot of time simply trying to render the document while animating.

The Mac platform is the most disappointingly slow in the tests I've done, and again I suspect hardware acceleration (or the lack of it) may be the cause. Firefox on the PC is generally much faster than Firefox on a comparable, or even faster, Mac.

On my modern 2.8 GHz Pentium 4, I see less-pronounced but still noticeable differences in frame rate between Internet Explorer and Firefox.

Practical vs. eye candy

Be creative, but don't overdo it.

The animation effects in Photos, while fun and unique, exist for practical reasons and are intended to enhance the UI; the goal is to make the UI feel more responsive and fun without slowing it down. Animation sequences should be brief and ideally fluid, dropping frames as needed to maintain a roughly-consistent run time.

Animation effects are used sparingly, only where they smooth out a transition from one state to another which would otherwise be "too sudden", and by their addition do not cause the UI to feel slow or unresponsive.
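
One way to get this "consistent run time, variable frame rate" behaviour is to base an animation's progress on elapsed time rather than on a frame counter. A minimal sketch, assuming a simple horizontal move:

// Sketch: position derives from elapsed time, so a slow machine renders
// fewer frames but the effect still completes in (roughly) the same duration.
function animateX(el, fromX, toX, duration) {
  var startTime = new Date().getTime();
  var timer = window.setInterval(function() {
    var progress = (new Date().getTime() - startTime) / duration;
    if (progress > 1) {
      progress = 1; // clamp at the end state
    }
    el.style.left = Math.round(fromX + (toX - fromX) * progress) + 'px';
    if (progress == 1) {
      window.clearInterval(timer);
    }
  }, 20); // ~50 fps at best; effectively fewer frames under CPU load
}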

As an example of practicality, consider the way photo thumbnails "group together" when being dragged. This was done in the interest of saving browser real estate, making a consistent visual representation of photos (regardless of the number selected) and, most importantly, preventing a large "ghost selection" of photos from moving around inside the browser window and creating undesirable horizontal scrollbars.

Using Windows Explorer as an example, when a group of files is selected in "tile" mode and a drag-and-drop begins, a "ghost" representation of the items being dragged is shown and moved relative to the cursor. I wanted to avoid this with Photos because, in the browser, it would result in horizontal scrollbars showing up in many drag-and-drop situations, in addition to a number of large items moving around the screen - visually distracting.

The first prototype of the selection/drag-and-drop had no transition effect, and would simply position all selected thumbnails directly under the cursor at the start of a drag. Out of this came the opportunity to use some animation effects: it was clear that a transition moving the photos from their original positions to underneath the cursor would give the user visual feedback as to what was happening in response to their action, while avoiding the distracting scroll issues. Early prototypes were well-received by peers and reviewers; users would actually play around with the UI.

Debug Early, Debug Often

writeDebug('some message',E_WARN);

With a fairly complex object-oriented structure, it becomes increasingly difficult to keep track of all event handlers, logic and methods that can be executed on a growing number of objects at any given time - particularly if you didn't write the code yourself, are new to the project or are a QA person trying to find more information about the cause of an error. Adding debug statements to your code can help immensely in troubleshooting logic errors, performance and other common development issues.
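
A minimal sketch of such a writeDebug() routine follows (the severity constants and the log element are assumptions, not Photos' actual implementation):

// Sketch: append log messages to a debug element; constants are assumed.
var E_INFO = 0, E_WARN = 1, E_ERROR = 2;

function writeDebug(sMessage, nSeverity) {
  var oLog = document.getElementById('debugLog'); // assumed log container
  if (!oLog) {
    return; // debug output disabled or element missing
  }
  var oLine = document.createElement('div');
  oLine.className = ['debugInfo', 'debugWarn', 'debugError'][nSeverity || E_INFO];
  oLine.appendChild(document.createTextNode(sMessage));
  oLog.appendChild(oLine);
}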

The YUI Logger control is now available and provides this kind of functionality - highly recommended.

That is all.

Footnotes:

  1. The web browser was originally developed as (and still is) a "document viewer."
  2. Javascript guru Douglas Crockford has called the web one of the most "hostile" development environments he knows of.
  3. "Leak-free" according to known common leak patterns.
  4. Douglas Crockford has referred to this mysterious "IE slowdown" issue as "The Wall", citing the name as coming from Microsoft.

Related links