Embedding third-party content in your site using oEmbed

There's a whole class of 'Web 2.0' technologies that have emerged recently with some common features: they solve a simple problem, they do so in a decentralised way, and they stay simple. As examples I'd cite things like XFN, OpenID, OAuth and even RSS and Atom feeds. They start off by solving a particular use case, and stay as simple as possible (or at least should - I'm looking at you, OpenID).

The latest such technology to interest me is oEmbed, which I found via a blog post by Ben Ward. The name is a bit cryptic, but the use case it addresses is embedding content from one site into another. That may sound esoteric, but looking back over the handful of blog posts I've done on this very site, a large number of them contain images from Flickr. Looking around the web as a whole, people are constantly embedding videos and images from other sites into their forums, blog posts and CMSes.

There are a couple of ways this is normally done in the wild, neither of which is that satisfactory.

  1. The site the content is hosted on generates a snippet of HTML - On the page showing the content, a couple of clicks will give the user some HTML that they can copy and paste into their HTML editor. This is OK for people who are happy with HTML and can actually edit the HTML in their posts rather than using some sort of WYSIWYG editor, but can be confusing for novice users. This technique also limits the ability of the receiving site to reformat the content to fit into any existing templating.
  2. The site the content is hosted on gets screen-scraped - Some blogging platforms and CMSes know how major sites like Flickr or YouTube structure their HTML so are able to extract images and videos from just a URL. This of course falls down if the HTML changes significantly, and if you're trying to post content from a site your platform doesn't know about, you're out of luck.

Of the two existing solutions, the second has the better user story. The user clicks a button, pastes in a URL to the content on another site, and the platform slurps up the content, reformats it to fit in with any house styles and inserts it into the content area. What's needed is a way to do this in a decentralised way, which is where oEmbed comes in.

How oEmbed works

What happens with a system that supports oEmbed is as follows:

  1. The user pastes in a URL at which content is hosted.
  2. The system checks that URL to find the address of its oEmbed API via a LINK element in the document's HEAD. This step could be cached as the API location is unlikely to change often.
  3. The system does a GET to the oEmbed API, essentially asking 'what is the content for this URL'?
  4. The system gets a JSON or XML response containing structured metadata for the item.
  5. The system formats the data however it deems appropriate.
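
Step 2 is the least obvious part, so here is a minimal Python sketch of it. It assumes the discovery LINK elements carry the MIME types `application/json+oembed` or `text/xml+oembed`; the `OEmbedLinkFinder` class, the `find_oembed_endpoints` helper and the sample page markup are all hypothetical, and a real system would fetch the page over HTTP rather than use a hard-coded string.

```python
from html.parser import HTMLParser

# MIME types assumed to mark oEmbed discovery links.
OEMBED_TYPES = {"application/json+oembed", "text/xml+oembed"}


class OEmbedLinkFinder(HTMLParser):
    """Collects the href of every oEmbed discovery <link> in a page."""

    def __init__(self):
        super().__init__()
        self.endpoints = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        if attributes.get("type") in OEMBED_TYPES and attributes.get("href"):
            self.endpoints.append(attributes["href"])


def find_oembed_endpoints(html_text):
    """Return the oEmbed hrefs advertised by an HTML document, in order."""
    finder = OEmbedLinkFinder()
    finder.feed(html_text)
    return finder.endpoints


# Hypothetical page source, standing in for the fetched photo page.
sample_page = """
<html><head>
<title>Rosella parrot</title>
<link rel="stylesheet" type="text/css" href="/style.css">
<link rel="alternate" type="application/json+oembed"
      href="http://flickr.com/services/oembed">
</head><body></body></html>
"""

print(find_oembed_endpoints(sample_page))
```

Since the API location is unlikely to change often, the list this returns is the natural thing to cache per site.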

A practical example with a picture from my Flickr would be:

  1. I give the URL http://flickr.com/photos/ciaranmcnulty/429868897, one of my holiday snaps.
  2. The system sees from the page that the oEmbed API is at http://flickr.com/services/oembed.
  3. The system does a GET to http://flickr.com/services/oembed?url=http://flickr.com/photos/ciaranmcnulty/429868897/
  4. The system gets the following response:
    <oembed>
        <version>1.0</version>
        <type>photo</type>
    
        <title>Rosella parrot</title>
        <author_name>CiaranJMcNulty</author_name>
        <author_url>http://www.flickr.com/photos/ciaranmcnulty/</author_url>
        <cache_age>3600</cache_age>
    
        <provider_name>Flickr</provider_name>
        <provider_url>http://www.flickr.com/</provider_url>
        <width>375</width>
        <height>500</height>
    
        <url>http://farm1.static.flickr.com/185/429868897_18ea03200a.jpg</url>
    </oembed>
    
  5. The system reformats this according to some sort of template into:
    <div class="figure">
        <div class="caption">Rosella parrot</div>
    
        <a href="http://www.flickr.com/photos/ciaranmcnulty/">
        	<img src="http://farm1.static.flickr.com/185/429868897_18ea03200a.jpg" width="375" height="500" />
        </a>
    
    </div>
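
Steps 4 and 5 could be sketched in Python roughly as follows. This is an illustration only, not how any particular platform does it: `render_photo` and `SAMPLE_RESPONSE` are hypothetical names, the sample is a trimmed copy of the Flickr response above, and the template uses a class rather than an id for the caption so that several figures can share a page.

```python
import xml.etree.ElementTree as ET
from html import escape

# Trimmed copy of the Flickr response shown above.
SAMPLE_RESPONSE = """<oembed>
    <version>1.0</version>
    <type>photo</type>
    <title>Rosella parrot</title>
    <author_url>http://www.flickr.com/photos/ciaranmcnulty/</author_url>
    <width>375</width>
    <height>500</height>
    <url>http://farm1.static.flickr.com/185/429868897_18ea03200a.jpg</url>
</oembed>"""


def render_photo(oembed_xml):
    """Turn an oEmbed 'photo' response into a figure snippet."""
    root = ET.fromstring(oembed_xml)
    # Flatten the child elements into a simple tag -> text mapping.
    fields = {child.tag: (child.text or "").strip() for child in root}
    if fields.get("type") != "photo":
        raise ValueError("this sketch only handles 'photo' responses")
    return (
        '<div class="figure">\n'
        '    <div class="caption">{title}</div>\n'
        '    <a href="{link}">\n'
        '        <img src="{src}" width="{width}" height="{height}" />\n'
        '    </a>\n'
        '</div>'
    ).format(
        # Escape everything: the values come from a third-party site.
        title=escape(fields.get("title", "")),
        link=escape(fields.get("author_url", ""), quote=True),
        src=escape(fields["url"], quote=True),
        width=int(fields["width"]),
        height=int(fields["height"]),
    )


print(render_photo(SAMPLE_RESPONSE))
```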
    

The oEmbed format docs specify a few different content types: 'photo', 'video', 'link' and 'rich' (i.e. HTML for embedding). They also define a whole bunch of URL parameters you can add to the request; for instance, you can ask for the result as JSON instead, specify a maximum size for images, specify that you only accept images, and so forth.
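
As a rough illustration of those request parameters, the sketch below builds a request URL in Python using the `url`, `format`, `maxwidth` and `maxheight` parameters. The `build_oembed_request` helper is hypothetical; note that it percent-encodes the content URL, which the hand-written URLs in this post skip for readability.

```python
from urllib.parse import urlencode


def build_oembed_request(endpoint, content_url, response_format=None,
                         maxwidth=None, maxheight=None):
    """Build the GET URL for an oEmbed request, skipping unset options."""
    params = {"url": content_url}
    if response_format is not None:
        params["format"] = response_format
    if maxwidth is not None:
        params["maxwidth"] = maxwidth
    if maxheight is not None:
        params["maxheight"] = maxheight
    # urlencode percent-encodes the nested URL so it survives as one value.
    return endpoint + "?" + urlencode(params)


request_url = build_oembed_request(
    "http://flickr.com/services/oembed",
    "http://flickr.com/photos/ciaranmcnulty/429868897/",
    response_format="json",
    maxwidth=500,
)
print(request_url)
```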

Why oEmbed is awesome, sort of

On the Internet it's mostly easier to complain than praise, but there are some things oEmbed has done well.

  • It addresses the problem. It's easy to underestimate how important this is. The authors of the oEmbed spec have managed to identify an area that could be improved for users, and have generated a system that does it in a fairly reasonable manner. Frankly even if oEmbed doesn't take off it's got people thinking about the problem.
  • The response formats are really well thought out. The spec defines a useful transfer format for the domain that covers most scenarios I can think of. There's obviously a lot of time and effort that's gone into it, and they've been careful to think about how it would be integrated into real-world JavaScript code.

Why oEmbed sucks, sort of

All in all oEmbed is a promising technology, and the big appeal for me is the way it simplifies the entire process for the user. There are however a few bits of the process I think are, well, lame.

  • Multiple URLs for a single resource is not very RESTful. The request to the 'API' can contain a parameter saying whether the response should be XML or JSON. This generates two separate URLs for what is essentially the same resource, a violation of REST principles and a waste of the HTTP Accept header, which handles this in a much nicer way.
  • 'Autodiscovery' is long-winded. From knowing a URL of an HTML representation of a resource, I then have to examine that HTML to find out an API location, then construct my own URL based on that and the original URL, from which I can get the oEmbed response... phew! Again, this just duplicates URLs for a resource - I have a URL to the resource so why can't I just do a request to the original URL, but ask for an oEmbed content-type?
  • The data is already there on the page. OK so this is me with my Microformats head on, but I wouldn't be very surprised if most of the data exposed in a typical oEmbed response is actually present in the HTML anyway. What would be pretty cool would be a way of embedding oEmbed data inside HTML, either via POSH-type semantic HTML, a Microformat, or even some sort of RDFa vocabulary. I mentioned Ben Ward earlier and he's expressed a strong intention to work on this so I'll be interested to see how he gets on.
  • People aren't using it. A few sites support oEmbed but it's not hit the critical mass needed yet where a CMS or blogging platform could use it for their main embedding solution.
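
To make the content negotiation idea in the first two points concrete, here is a purely hypothetical Python sketch of what such a request might look like. oEmbed does not define this behaviour and I'm not aware of any site that honours it; the `negotiated_request` helper and the choice of `application/json+oembed` as the Accept value are assumptions made for illustration. The request is only constructed, never sent.

```python
from urllib.request import Request


def negotiated_request(content_url, mime_type="application/json+oembed"):
    """Build (but do not send) a GET for the page URL itself,
    asking for an oEmbed representation via the Accept header.

    Hypothetical: no provider is known to support this.
    """
    return Request(content_url, headers={"Accept": mime_type})


request = negotiated_request("http://flickr.com/photos/ciaranmcnulty/429868897/")
print(request.full_url)
print(request.get_header("Accept"))
```

The point of the sketch is that there is one URL per resource, and the format choice lives in a header rather than in a second URL.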

oohEmbed - A third party solution

My first instinct when pondering the problems of autodiscovery and the lack of support described above was to build a tool that did oEmbed requests in the background but presented a nic(er) interface, and added support for sites that didn't yet support oEmbed.

Imagine my annoyance when I found out someone called Deepak Sarda had already done this, months ago at oohEmbed!

From the developer's point of view, oohEmbed takes out a big step in the oEmbed process - discovery. Rather than doing oEmbed requests to API endpoints defined by each separate service, oohEmbed exposes an oEmbed API that accepts requests for basically any service, and adds support for those with other APIs the author can tap into.

For example, if I wanted to get an oEmbed response for my Flickr via oohEmbed, rather than the long-winded process described above I'd do a single GET to a URL at oohEmbed - http://oohembed.com/oohembed/?url=http://flickr.com/photos/ciaranmcnulty/429868897/ - and get essentially the same response I'd have got if I'd gone to Flickr directly.

Furthermore, if I want to get a YouTube video I can do a similar request to http://oohembed.com/oohembed/?url=http://youtube.com/watch?v=fWUedF_eTvg and, despite the fact that YouTube does not yet support oEmbed, get an oEmbed JSON response that I can use right away:

{
    "version": "1.0",
    "type": "video",
    "provider_name": "YouTube",
    "width": 425,
    "height": 355,
    "html": "<embed src='http://www.youtube.com/v/fWUedF_eTvg' type='application/x-shockwave-flash' wmode='transparent' width='425' height='355'></embed>"

}

Now that's not perfect, but it works pretty well, returns something usable, and is far less complex than a full oEmbed discovery process, so it could very simply be used in some JavaScript code.
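
Consuming a response like that takes very little code. The following is a minimal Python sketch, assuming a hypothetical `embed_markup` helper that switches on the `type` field; the sample is the YouTube response above with the blank line removed.

```python
import json
from html import escape

# The oohEmbed response for the YouTube video shown above.
SAMPLE_JSON = """{
    "version": "1.0",
    "type": "video",
    "provider_name": "YouTube",
    "width": 425,
    "height": 355,
    "html": "<embed src='http://www.youtube.com/v/fWUedF_eTvg' type='application/x-shockwave-flash' wmode='transparent' width='425' height='355'></embed>"
}"""


def embed_markup(oembed_json):
    """Pick the markup to insert based on the response's type field."""
    data = json.loads(oembed_json)
    kind = data.get("type")
    if kind in ("video", "rich"):
        # The provider supplies ready-made HTML; using it as-is
        # means trusting the provider not to send anything nasty.
        return data["html"]
    if kind == "photo":
        return '<img src="{}" width="{}" height="{}" />'.format(
            escape(data["url"], quote=True),
            int(data["width"]),
            int(data["height"]),
        )
    raise ValueError("unhandled oEmbed type: {!r}".format(kind))


print(embed_markup(SAMPLE_JSON))
```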

Where do I go from here?

I've still got half an idea to implement a front end for oEmbed. I like the oohEmbed solution, but my qualms about the REST nature of oEmbed aren't really assuaged by it and it'd be an interesting process to go through.

I'll be watching the progress of the Microformats embedding efforts once they start, and attempting to put my 2p's worth into the process via the mailing list and wiki.

Also, I've half an idea to start embedding content into this site via oEmbed, taken from places like my Delicious bookmarks, my Flickr and other sites. I'd just need to make sure that that sort of Tumblr-style content didn't crowd out the sporadic 'real' blog posts I make, I guess!

Comments

1.

That was inspiring,

You have provided an excellent explanation on how oEmbed works

Anyway, thanks for the post

web development
22nd October 2009, 14:18

2.

Do you know http://embedit.me/ and can comment on it?
A friend of mine tells me it's like oohembed but even simpler.

Tobias
29th October 2009, 09:14

3.

To be honest, Tobias, it doesn't look finished - I'd certainly not use it yet

Ciaran McNulty
7th December 2009, 14:45
