February 9th, 2007

Yahoo! Pipes and the mashup pipedream

Posted by Phil Wainewright @ 4:09 pm

Categories: Development, Web 2.0

Ever since I first started playing with RSS feeds, I've always dreamt that one day someone would come up with an online tool that makes it easy to aggregate multiple feeds. For a brief half-hour today, I thought maybe Yahoo! Pipes, which launched yesterday, would prove to be the answer to my prayers. But no. Instead, it illustrates just how elusive remains the dream of easy data mashups.

What I wanted Yahoo! Pipes to do was create a composite of several different RSS feeds on the topic of web services and SOA. I first set up a web page to publish this selection of feeds in 2002, but I finally gave up on updating the page roughly a year ago when one of the feeds switched to FeedBurner and somehow broke the feedreading program I'd written. In any case I'd always wanted to upgrade the page to present a composite 'river of news' with the freshest items at the top, instead of publishing each individual feed separately.

This is the sort of mashup that Yahoo! Pipes ought to excel at, but it fails at a very simple hurdle. Let me start, though, by paying tribute to the designers of Pipes and making clear that they have made it exceptionally easy to link to feeds and mix them together. The trouble with making it so easy to get this far is that it just gets you to the next obstacle that much more quickly. It took me no more than a few minutes, aided by a quick scan of this introductory tutorial, to fetch two feeds, filter one of them, splice them and then run a sort. But here's the problem I encountered when I looked at the sort output:

Yahoo! Pipes sorts formatted date fields alphabetically instead of in date order

Examine the <pubDate> field in each of the three feed items you can see in this screenshot. The pubDate is in the very widely used RFC 822 format specified in Dave Winer's popular RSS 2.0 specification. But Yahoo! Pipes doesn't sort it as a date. The program treats the field as text and sorts it alphanumerically. So in descending order, instead of starting with the most recent date, it lists all the Wednesdays first, then all the Tuesdays, and so on until it finishes (not shown) in a flourish of Fridays. Of course the sort order bears no relation to any kind of calendar order. And that's just for feeds that all use pubDates. Many feeds put their item dates into a <dc:date> or a <published> field, using ISO 8601 format. Yahoo! Pipes provides no mechanism for normalizing these various date formats so that a composite feed can be sorted by publication date.
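To make the difference concrete, here is a minimal Python sketch of the two behaviours: sorting the raw date strings as text, and sorting after normalizing each string to a real date. It is only an illustration and says nothing about how Pipes works internally. The function name `parse_feed_date` and the sample dates are hypothetical, and treating a date with no timezone as UTC is an assumption made purely for the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical sample values: three RFC 822 pubDates and one ISO 8601 date.
FEED_DATES = [
    "Wed, 07 Feb 2007 10:00:00 GMT",
    "Fri, 09 Feb 2007 08:30:00 GMT",
    "Tue, 06 Feb 2007 14:45:00 GMT",
    "2007-02-08T09:15:00Z",
]


def parse_feed_date(value: str) -> datetime:
    """Parse an RFC 822 or ISO 8601 date string into a UTC datetime."""
    text = value.strip()
    try:
        # RFC 822 style, as used in RSS 2.0 <pubDate>.
        parsed = parsedate_to_datetime(text)
    except (TypeError, ValueError):
        # Fall back to ISO 8601; older Python versions reject a trailing "Z".
        if text.endswith("Z"):
            text = text[:-1] + "+00:00"
        parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # Assumption for this sketch: a date without a timezone is UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


# Text sort: descending alphabetical order, so "Wed" comes before "Tue".
sorted_as_text = sorted(FEED_DATES, reverse=True)

# Date sort: newest item first, whatever the original format.
sorted_as_dates = sorted(FEED_DATES, key=parse_feed_date, reverse=True)

print(sorted_as_text)
print(sorted_as_dates)
```

With these sample values the text sort puts Wednesday's item first and the ISO 8601 item last, while the date sort runs from Friday the 9th back to Tuesday the 6th, with the ISO 8601 item correctly slotted in on the 8th.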

This of course is the perfect illustration of why data mashups are so darned difficult. At least there's a chance that Yahoo! Pipes will overcome these problems without getting too complex, and I hope the Brickhouse team who are apparently responsible for the Pipes project will prioritize finding some straightforward solutions to this really fundamental stumbling block. [UPDATE (added Feb 10th): Kevin Cheng from the Pipes design team has posted a TalkBack comment to say "Fixing date sorting (and normalizing common formats) is one of our top priorities." That's great news, Kevin, thanks.]

But the problem here is the same problem I wrote about last summer when I described Google Maps as the fool's gold of mashups. It's all very well to do demonstration mashups that use deceptively well-structured data, but in the real world data structures are a semantic minefield. If the relatively shared semantics of RSS date fields contain so many pitfalls, imagine how much more difficult it is to mash up business-critical data from many different enterprise sources.

Having said all that, Yahoo! Pipes is a great advance, since it brings these issues into sharp relief when previously they were masked from view. If it can iron out these remaining crinkles and start to provide meaningful utility, it will give people a real spur to get to grips with all the other semantic dissonances, and perhaps to make an effort to structure their data using more easily shared formats and semantics. That can only be a good thing.

This is not my last word on Yahoo! Pipes and the whole notion of mashing up and linking data and processes from around the Web. Next week, I want to look at some other approaches in addition to Yahoo!'s new experiment, as well as exploring the potential impact that such tools can have in really unleashing the creative power of the Web.

Phil Wainewright is a commentator and strategist on emerging software industry trends. See his full profile and disclosure of his industry affiliations.



Talkback
Sorting on Date should work now
Hi Phil,
This issue should be resolved now. If you have any problems, feel free to drop me a mail.
Thanks!
-edward...
Posted by: edward.ho Posted on: 02/10/07
Date sorting is a known bug.  kevnull | 02/09/07
Useless  sparkeee | 02/09/07
