Responsive images, the picture element and the W3C: This is how you deal with Hixie and WHATWG

First of all, huge congratulations to Mat Marquis, Jason Grigsby, Scott Jehl, Ethan Marcotte, Florian Rivoal and all the other web geniuses who are working to make responsive images much, much easier for those working with responsive web design.

Today is a milestone. The Responsive Images Community Group of the W3C has now published their proposal for responsive images, the picture element, as a W3C draft. Make sure to read the draft of the picture element. It’s a great step towards a better web.


HTML5 grew out of the work of the Web Hypertext Application Technology Working Group, or WHATWG. At the time, HTML was stagnating, XHTML was going nowhere, and not enough was happening at the W3C to change that. WHATWG, with the cooperation of representatives from assorted web browsers, fashioned over time a proposal for the next generation of HTML, which became HTML5. Ian Hickson, aka Hixie, helmed this effort. There is no doubt that his diligent work has moved HTML forward, and HTML5 would not exist without him.

However, the model that allowed WHATWG to make faster progress than the W3C is that of a benevolent dictatorship. Some would quibble over how benevolent the dictatorship has been, but let’s just posit for now that some good has come out of this model.

The term dictator comes from the days of the Roman Republic. The form of government of the time was complicated, but some important elements included the Roman Senate, which consisted of patricians (old, rich men from the right families), who mostly served for life once selected to join this august body; two consuls, elected annually by those of sufficient social status, who served as heads of state; and ten tribunes, also elected annually, who had the power to veto nearly anything. This complicated system of government served to make change difficult.

From time to time, when the Roman Republic was threatened, the Senate chose one man to have nearly unlimited power in order to cut through all the bureaucracy and veto points. This dictator would serve for six months, and then—assuming the crisis had passed—return power to the Republic and retire from power.

Of course, eventually, crises grew so out of hand that power sometimes wasn’t turned over. Julius Caesar was selected as dictator numerous times, as the Republic descended into civil war. And after he died, his great-nephew and adopted heir Octavian picked up the reins of power, and before anyone noticed, the Republic had become an Empire, with an Emperor who decided everything.

Octavian, soon known as Augustus, managed the complicated expanse of the Roman Imperium quite well. Some of his successors, however, did not fare as well.

The point of my diversion into the days of swords and sandals is that a temporary dictatorship can be quite effective. For a time. When left unchecked, however, bad things can happen.

So let’s jump forward. A few years ago, Ethan Marcotte coined the term responsive web design. The web has been changing. The assumptions we used to have (screen sizes, resolution, input methods, seemingly everything) are changing in the wake of a transition to both a more mobile web and the potential of a larger-screened web, including both TVs and huge monitors. Retina-quality screens, from Apple but also from others, mean we can no longer assume that all images will display at 72 ppi. Responsive images are a necessary part of addressing this challenge.

Our current methods of displaying images just don’t work so well in the world of responsive design. Browsers pre-fetch images before JavaScript runs or cookies are evaluated, leaving few ways to choose which size or resolution of image to display based on the available space in a layout (which shifts under responsive web design techniques).

Smart folks worked on coming up with a new HTML element that could allow for selecting the image with the right size and the right resolution for the current layout. They formed the Responsive Images Community Group. They developed the picture element. Everything smelled like roses.

Then the benevolent dictator of WHATWG, Hixie, began working on the same problem. He looked at the various ideas on how to address this problem and fashioned his own solution, srcset. While that proposal handles resolution-switching fairly well, it doesn’t handle all the needs of designers. In particular, it would not be easy to line up the breakpoints available in srcset with the breakpoints used in the CSS to create a website layout, because srcset relies upon px, while best practices for CSS media queries call for using em. Fewer options are available for srcset as well, as it relies upon using either min-width or max-width, while designers might choose to use either with CSS media queries. The srcset proposal is also less future-friendly, as it doesn’t allow for other potential image-switching use cases, such as responding to monochrome or high-contrast displays. Not to mention that the srcset syntax is so succinct that it is very hard to understand at first glance.
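
To make the px versus em mismatch concrete, here is a rough sketch of the arithmetic. It is not the srcset selection algorithm, just a simplified comparison. The 40em breakpoint, the 700px viewport, and the function names are made up for illustration, and the 16px default font size is an assumption based on common browser defaults.

```python
# Illustrative only: the breakpoint values below are made up, and the
# comparison is a simplification, not the srcset selection algorithm.

DEFAULT_FONT_SIZE_PX = 16.0  # assumption: a common browser default


def em_to_px(breakpoint_em: float, font_size_px: float = DEFAULT_FONT_SIZE_PX) -> float:
    """Convert an em-based media query breakpoint to pixels."""
    return breakpoint_em * font_size_px


def layout_is_wide(viewport_px: float, breakpoint_em: float, font_size_px: float) -> bool:
    """True when an em-based min-width media query would match."""
    return viewport_px >= em_to_px(breakpoint_em, font_size_px)


def image_is_wide(viewport_px: float, breakpoint_px: float) -> bool:
    """True when a px-based image breakpoint would pick the wide image."""
    return viewport_px >= breakpoint_px


CSS_BREAKPOINT_EM = 40.0  # hypothetical layout breakpoint
IMAGE_BREAKPOINT_PX = em_to_px(CSS_BREAKPOINT_EM)  # px value tuned for 16px text

viewport = 700.0  # hypothetical viewport width in px
for font_size in (16.0, 20.0):
    print(
        f"font {font_size}px:",
        "layout wide =", layout_is_wide(viewport, CSS_BREAKPOINT_EM, font_size),
        "| image wide =", image_is_wide(viewport, IMAGE_BREAKPOINT_PX),
    )
```

With 16px text, the layout and the image agree. If a reader bumps the default font size to 20px, the em-based layout breakpoint moves to 800px while the px-based image breakpoint stays at 640px, so at a 700px viewport the narrow layout would be paired with the wide image.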

In any case, the way this has worked to date isn’t new. A tough challenge needs to be addressed in HTML5. Web designers and developers band together to advocate for a particular position, such as an accessibility feature. Hixie makes his decision, which is often orthogonal to the needs of designers and developers. A firestorm bursts out. He then responds to everybody’s concerns with a familiar set of techniques: he doesn’t understand the concern; the concern is not documented to his standards; data is required from those who disagree with him, but not from those who agree with him; he quite likes his idea; people didn’t respond in the right place; and so on and so forth. Maybe he is actually trying to find a better solution, but the way I have usually seen this play out is that Hixie gets his way, and it is not the way designers and developers would prefer.

If you would like to see a classic example of this, here is Hixie’s response to concerns about the srcset proposal. My favorite bits are the parts where he claims that most designers put images into CSS, so this doesn’t really matter much anyhow, and that nobody changes the default font size, because he has found many websites don’t work when he has tried it. Neither of these assertions is true, but in Hixieworld, it doesn’t much matter, because he will get his way in the end.

Remarkably, though, this time the story may end differently.

Largely through the leadership of Mat Marquis, the Responsive Images Community Group has persevered in proposing the picture element, which better addresses the concerns of web designers and developers, directly to the W3C.

For the past few years, WHATWG has managed the spec of HTML5, and the W3C has mostly rubber-stamped WHATWG’s proposals, with a few very important exceptions, such as the time element and some accessibility changes.

WHATWG has claimed at times that their version of HTML is the canonical version, and that they will be in charge of developing the living standard of HTML for the foreseeable future, while the W3C will codify snapshots of this living standard as HTML5 and its successors. This in effect would make Hixie dictator for life of perhaps the most important standard for the web, HTML.

Again, while Hixie has done very good things, he has made mistakes. Not necessarily because he is a flawed person, but because putting all of the decisions about HTML in the hands of one person is a flawed concept. One person alone cannot have the perspective to understand all possible needs. Some needs will inevitably be favored over others. Time and time again, that has meant the needs of browser makers over the needs of designers and developers and the needs of accessibility.

Now the W3C has the opportunity to consider the picture element before srcset is finalized in the WHATWG version of HTML5. This is an opportunity to change everything.

The W3C, which works on more of a consensus basis, can consider the picture element and the needs of those beyond just the browser makers. They can decide to put the picture element into the W3C standard. If Hixie develops the srcset attribute for the img element, that could be considered as well: even with its more limited capabilities, it could have a role in the future of HTML. Browser makers could then potentially include both options.

What excites me about this is that it is a potential way around the problem of the benevolent dictatorship. WHATWG can continue to work on the tough problems of HTML, from the perspective of the needs of browser makers (and hopefully still considering the needs of others as, to their credit, they have often done). Yet when a problem comes along where web authors’ and users’ needs diverge from WHATWG’s proposed solution, and where Hixie just can’t be convinced to change his mind, then interested parties can propose a different solution directly to the W3C, for inclusion in the HTML standard.

The hope is that if the W3C does make such a decision, WHATWG would incorporate it back into their version of HTML. If not, the standards will fork, which will help nobody.

If that were to happen, I would encourage web authors to follow the W3C standard, rather than the WHATWG standard. I would prefer to have consensus available as an option to resolve disputes, instead of only the fiat decision of one person, however well-intentioned that person may be. And only by giving priority to the W3C rather than WHATWG is that possible.

I dearly hope the picture element is approved by the W3C, and if so, that WHATWG—however reluctantly—puts the picture element in their version of the spec. That is a sustainable path forward for web standards.

Ultimately, web standards rely upon three pillars:

  • A codified spec that describes a standard
  • Browser makers who implement that spec
  • Web authors who use that spec

What happens if the W3C and WHATWG specs diverge? What spec will browser makers implement? If they side with WHATWG, are web authors out of luck?

A worst-case situation would be for web authors to rebel and use the W3C version of the spec, using polyfills to implement functionality even if browser makers follow the WHATWG version. That would be worse for web performance and would weaken the case for using web standards. That said, if few or no web authors implemented the WHATWG version of the spec for a particular issue, browser makers might feel pressured to follow what web authors are doing.

I don’t want to see it come to that. I would like us all to just play nicely together. However, I believe in web standards too much for the fate of HTML to be in the hands of just one person who has a history of making decisions with which the community of designers and developers disagrees. Griping has not changed anything. So having a backup plan of action seems wise.

For now, make sure to read the draft proposal for the picture element. If necessary, let’s find ways to improve it. If the W3C approves, we will have found at least one path forward for web standards.