Browser Input Events: Can We Do Better Than The Click?

About The Author

Dustan Kasten is a developer advocate at Skookum Digital Works. He spends his time exploring ideas behind browser-based UI application development.

You have likely experienced the 300-millisecond delay in mobile browsers or wrestled with touchmove versus scrolling. Certain events that used to be very clear are now filled with ambiguity. The click event used to mean one thing and one thing only, but touchscreens have complicated it: the browser now has to discern whether the action is a double-click, a scroll or some other OS-level gesture. In this article Dustan Kasten will introduce the event cascade and use this knowledge to implement a demo of a tap event that supports the many input methods while not breaking in proxy browsers such as Opera Mini.

Responding to user input is arguably the core of what we do as interface developers. In order to build responsive web products, understanding how touch, mouse, pointer and keyboard actions and the browser work together is key. You have likely experienced the 300-millisecond delay in mobile browsers or wrestled with touchmove versus scrolling.

In this article we will introduce the event cascade and use this knowledge to implement a demo of a tap event that supports the many input methods while not breaking in proxy browsers such as Opera Mini.

Overview

Three primary input methods are used to interact with the web today: digital cursors (mouse), tactile (direct touch or stylus) and keyboards. We have access to these in JavaScript through touch events, mouse events, pointer events and keyboard events. In this article we are primarily concerned with touch- and mouse-based interactions, although some events have standard keyboard-based interactions, such as the click and submit events.

You have very likely already implemented event handlers for touch and mouse events. There was a time in our not too distant past when the recommended method was something akin to this:


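/** DO NOT EVER DO THIS! */
// Binds only one input type, so a device with both a touchscreen
// and a mouse loses its mouse clicks.
$('a').on(('ontouchstart' in window) ? 'touchend' : 'click', handler);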

Microsoft has led the charge to create a better, future-facing event model with the “Pointer Events” specification. Pointer events are an abstract input mechanism that is now a W3C recommendation. Pointer events give the user agent (UA) flexibility to house numerous input mechanisms under one event system. Mouse, touch and stylus are all examples that easily come to mind today, although implementations extending to Myo or Ring are imaginable. While web developers seem to be really excited about this, not all browser engineers have felt the same. Namely, Apple and Google have decided not to implement pointer events at this time.

Google’s decision is not necessarily final, but there is no active work being done on pointer events. Our input and usage of pointer events through polyfills and alternative solutions will be part of the equation that could eventually tip the scale the other way. Apple made its statement against pointer events in 2012, and I am unaware of any further public response from Safari’s engineers.

The Event Cascade

When a user taps an element on a mobile device, the browser fires a slew of events. This action typically fires a series of events such as the following, in this order: touchstart, touchend, mouseover, mousemove, mousedown, mouseup, click.

This is due to backwards compatibility with the web. Pointer events take an alternative approach, firing compatibility events inline, in this order: mousemove, pointerover, mouseover, pointerdown, mousedown, gotpointercapture, pointerup, mouseup, lostpointercapture, pointerout, mouseout, focus, click.
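To observe the cascade on a particular device, it can help to record the order in which events actually arrive. The following is a minimal sketch, not part of the original demo: `logCascade` and `CASCADE_TYPES` are hypothetical names introduced here for illustration.

```javascript
// Hypothetical helper (not from the demo): records the order in which
// the given event types reach a target.
function logCascade(target, types) {
  var seen = [];

  function record(event) {
    seen.push(event.type);
  }

  types.forEach(function (type) {
    target.addEventListener(type, record, false);
  });

  return {
    // Live array of event types, in arrival order.
    seen: seen,
    // Removes every listener that logCascade added.
    stop: function () {
      types.forEach(function (type) {
        target.removeEventListener(type, record, false);
      });
    }
  };
}

// An assumed selection of event types worth watching.
var CASCADE_TYPES = [
  'touchstart', 'touchend',
  'pointerover', 'pointerdown', 'pointerup', 'pointerout',
  'mouseover', 'mousemove', 'mousedown', 'mouseup',
  'click'
];
```

In a browser, you might call `logCascade(document.getElementById('target'), CASCADE_TYPES)` (the element ID is an assumption), tap the element and then inspect the returned `seen` array.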

The event specification allows for UAs to differ in their implementation of compatibility events. Patrick Lauke and Peter-Paul Koch maintain extensive reference material on this topic, which is linked to in the resources section at the bottom of this article.

The following graphics show the event cascade for the following actions:

  1. an initial tap on an element,
  2. a second tap on an element,
  3. tapping off the element.

Please note: This event stack intentionally ignores where focus and blur events fit into this stack.

The event cascade on iOS devices for tapping on an element twice and then tapping away. (Image: Stephen Davis)
The event cascade on most Android 4.4 devices for tapping an element twice and then tapping away. (Image: Stephen Davis)
The event cascade on Internet Explorer 11 (before compatibility touch events were implemented) for tapping an element twice and then tapping away. (Image: Stephen Davis)

Applying The Event Cascade

Most websites built today for the desktop web “just work” on touch devices because of the efforts of browser engineers. Despite the cascade looking gnarly, the conservative approach of building for mouse events, as we always have, will generally work.

Of course, there is a catch. The 300-millisecond delay is the best-known issue, but the interplay between scrolling, touchmove and pointermove events, and browser painting causes problems too. Avoiding the 300-millisecond delay is easy if:

  • we optimize only for modern Chrome for Android and desktop, which use heuristics such as <meta name="viewport" content="width=device-width"> to disable the delay;
  • we optimize only for iOS, and the user does a clear press, but not a quick tap and not a long tap — just a good, normal, clear press of an element (oh, it also depends on whether it’s in a UIWebView or a WKWebView — read FastClick’s issue on the topic for a good cry).

If our goal is to build web products that compete with native platforms in user experience and polish, then we need to decrease interaction response latency. To accomplish this, we need to be building on the primitive events (down, move and up) and creating our own composite events (click, double-click). Of course, we still need to include fallback handlers for the native events for broad support and accessibility.
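As an illustration of composing a higher-level event from the primitives, the sketch below decides whether a down/up pair counts as a tap. It is a simplified example rather than the demo’s logic: the `isTap` name, the shape of the sample objects and both thresholds are assumptions chosen for illustration.

```javascript
// Assumed thresholds; tune them for your own product.
var TAP_MAX_DISTANCE = 10;  // pixels the pointer may travel
var TAP_MAX_DURATION = 300; // milliseconds between down and up

/**
 * Decides whether a down/up pair qualifies as a tap.
 * @param {{x: number, y: number, time: number}} down sample taken on the down event
 * @param {{x: number, y: number, time: number}} up sample taken on the up event
 * @param {Object} [options] optional maxDistance and maxDuration overrides
 * @return {boolean} true if the pointer barely moved and was released quickly
 */
function isTap(down, up, options) {
  options = options || {};
  var maxDistance = options.maxDistance === undefined ?
    TAP_MAX_DISTANCE : options.maxDistance;
  var maxDuration = options.maxDuration === undefined ?
    TAP_MAX_DURATION : options.maxDuration;

  var dx = up.x - down.x;
  var dy = up.y - down.y;
  var distance = Math.sqrt(dx * dx + dy * dy);
  var duration = up.time - down.time;

  return distance <= maxDistance && duration <= maxDuration;
}
```

The samples would typically be built from `clientX`, `clientY` and a timestamp captured in the down and up handlers. Keeping the decision in a pure function makes it easy to test without a browser.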

Doing this requires no small amount of code or knowledge. To avoid the 300-millisecond (or any length of) delay across browsers, we need to handle the full interaction lifecycle ourselves. For a given {type}down event, we will need to bind all events that will be necessary to complete that action. When the interaction is completed, we will then need to clean up after ourselves by unbinding all but the starting event.

You, the website developer, are the only one to know whether the page should zoom or has another double-tap event it must wait for. Allow a delay for the intended action only if you actually need the callback to be delayed.

In the following link, you will find a small, dependency-free tap demo to illustrate the effort required to create a multi-input, low-latency tap event. Polymer-gestures is a production-ready implementation of the tap and other events. Despite its name, it is not tied to the Polymer library in any way and can easily be used in isolation.

To be clear, implementing this from scratch is a bad idea. The following is for educational purposes only and should not be used in production. Production-ready solutions exist, such as FastClick, polymer-gestures and Hammer.js.

The Important Bits

Binding your initial event handlers is where it all begins. The following pattern is considered the bulletproof way to handle multi-device input.


/**
 * If there are pointer events, let the platform handle the input 
 * mechanism abstraction. If not, then it’s on you to handle 
 * between mouse and touch events.
 */

if (hasPointer) {
  tappable.addEventListener(POINTER_DOWN, tapStart, false);
  clickable.addEventListener(POINTER_DOWN, clickStart, false);
}

else {
  tappable.addEventListener('mousedown', tapStart, false);
  clickable.addEventListener('mousedown', clickStart, false);

  if (hasTouch) {
    tappable.addEventListener('touchstart', tapStart, false);
    clickable.addEventListener('touchstart', clickStart, false);
  }
}

clickable.addEventListener('click', clickEnd, false);
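The snippet above uses `hasPointer`, `hasTouch` and `POINTER_DOWN` without showing where they come from. One plausible way to derive them is sketched below; the `detectInput` function is hypothetical, and the demo may well define these values differently. It takes the window object as a parameter so that it can be exercised outside a browser.

```javascript
/**
 * Hypothetical feature detection for the flags used above.
 * @param {Object} win the window object (or a stand-in for testing)
 * @return {{hasPointer: boolean, hasTouch: boolean, POINTER_DOWN: string}}
 */
function detectInput(win) {
  var nav = win.navigator || {};

  // Standard pointer events expose a PointerEvent constructor.
  var hasStandardPointer = 'PointerEvent' in win;

  // Internet Explorer 10 shipped a prefixed version of the API.
  var hasMsPointer = !hasStandardPointer && !!nav.msPointerEnabled;

  return {
    hasPointer: hasStandardPointer || hasMsPointer,
    hasTouch: 'ontouchstart' in win,
    POINTER_DOWN: hasMsPointer ? 'MSPointerDown' : 'pointerdown'
  };
}
```

In a page, this might be called once as `var input = detectInput(window);` and the resulting values used in the binding code.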

Binding touch event handlers could compromise rendering performance, even if they don’t do anything. To decrease this impact, binding tracking events in the starting event’s handler is often recommended. Don’t forget to clean up after yourself and unbind the tracking events in your action-completed handlers.


/**
 * On tapStart we want to bind our move and end events to detect 
 * whether this is a “tap” action.
 * @param {Event} event the browser event object
 */

function tapStart(event) {
  // bind tracking events. “bindEventsFor” is a helper that automatically 
  // binds the appropriate pointer, touch or mouse events based on our 
  // current event type. Additionally, it saves the event target to give 
  // us similar behavior to pointer events’ “setPointerCapture” method.

  bindEventsFor(event.type, event.target);
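  // setPointerCapture lives on the element, not on the event object,
  // and it needs a valid pointerId.
  if (typeof event.currentTarget.setPointerCapture === 'function' &&
      event.pointerId !== undefined) {
    event.currentTarget.setPointerCapture(event.pointerId);
  }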

  // prevent the cascade
  event.preventDefault();

  // start our profiler to track time between events
  set(event, 'tapStart', Date.now());
}

/**
 * tapEnd. Our work here is done. Let’s clean up our tracking events.
 * @param {Element} target the html element
 * @param {Event} event the browser event object
 */

function tapEnd(target, event) {
  unbindEventsFor(event.type, target);
  var _id = idFor(event);
  log('Tap', diff(get(_id, 'tapStart'), Date.now()));
  setTimeout(function() {
    delete events[_id];
  });
}

The rest of the code should be pretty self-explanatory. In truth, it’s a lot of bookkeeping. Implementing custom gestures requires you to work closely with the browser event system. To save yourself the pain and heartache, don’t do this à la carte throughout your code base; rather, build or use a strong abstraction, such as Hammer.js, the Pointer Events jQuery polyfill or polymer-gestures.
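The `tapStart` and `tapEnd` handlers above rely on `bindEventsFor` and `unbindEventsFor`, which are not shown. The sketch below illustrates one way such helpers could map a starting event type to its tracking events. The names `TRACKING_EVENTS`, `trackingEventsFor`, `bindTracking` and `unbindTracking` are hypothetical, and the signatures are assumptions that differ from the demo’s (they take the handler explicitly).

```javascript
// Hypothetical mapping from a starting event type to the events that
// must be tracked until the interaction completes.
var TRACKING_EVENTS = {
  pointerdown: ['pointermove', 'pointerup', 'pointercancel'],
  MSPointerDown: ['MSPointerMove', 'MSPointerUp', 'MSPointerCancel'],
  touchstart: ['touchmove', 'touchend', 'touchcancel'],
  mousedown: ['mousemove', 'mouseup']
};

// Returns the tracking event types for a starting type, or an empty
// array if the type is unknown.
function trackingEventsFor(startType) {
  return Object.prototype.hasOwnProperty.call(TRACKING_EVENTS, startType) ?
    TRACKING_EVENTS[startType] : [];
}

// Binds the handler to every tracking event for the starting type.
function bindTracking(startType, target, handler) {
  trackingEventsFor(startType).forEach(function (type) {
    target.addEventListener(type, handler, false);
  });
}

// Removes what bindTracking added; call it when the interaction ends.
function unbindTracking(startType, target, handler) {
  trackingEventsFor(startType).forEach(function (type) {
    target.removeEventListener(type, handler, false);
  });
}
```

Because the same function reference is needed to unbind, the handler passed to `bindTracking` should be a named function or a stored reference rather than an inline anonymous function.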

Conclusion

Certain events that used to be very clear are now filled with ambiguity. The click event used to mean one thing and one thing only, but touchscreens have complicated it: the browser now has to discern whether the action is a double-click, a scroll or some other OS-level gesture.

The good news is that we now have a much better understanding of the event cascade and the interplay between a user’s action and the browser’s response. By understanding the primitives at work, we are able to make better decisions in our projects, for our users and for the future of the web.

What unexpected problems have you run into when building multi-device websites? What approaches have you taken to solve for the numerous interaction models we have on the web?

Additional Resources

Smashing Editorial (da, al, ml, mrn)