Category: Web 3.0
February 4th, 2010
Siri offers virtual assistance, with a little help from your iPhone
Back in June, I recorded a podcast with Tom Gruber of Siri. A week later I saw the company’s ‘Virtual Personal Assistant’ put through its paces on an iPhone, and was impressed. Earlier this month I got on the phone with CEO Dag Kittlaus and VP of Engineering Adam Cheyer for an update, and today you can download Siri for yourself via Apple’s App Store. Versions for Blackberry and Android will follow ‘soon after,’ and Kittlaus stresses that mobile is ‘just the beginning.’
Today’s iPhone app is the first consumer offering from a company that has spent a long time thinking about this space. Much of the core research resulted from the $150 million CALO project at SRI, funded by DARPA. Siri itself emerged from SRI to close an $8.5 million Series A round with Menlo Ventures and Morgenthaler back in October of 2008.
Explicitly described as complementary to web search, rather than a replacement for it, Siri seeks to move beyond a paradigm based upon keywords and links to embrace one that is personal, task oriented, and conversational in nature. Siri guides the user along a path, making query formulation iterative and relatively painless, and ensuring that the application gets the information it needs. Early use cases are optimised to ‘help you get things done,’ probably whilst mobile. You might, for example, ask (by speaking to Siri’s Nuance-powered speech processing engine) for ‘Sushi near work at 7pm.’ That simple request is relatively straightforward for a human being to understand, process, and act upon, but requires a significant degree of intelligence on the part of a software agent. Where is ‘work’? What is ‘sushi,’ and what do you actually want to do with it (find a restaurant where you can eat it, presumably)? When is ‘7pm’? Alternatively you could let Siri conversationally lead you through the ‘right’ questions to reach the same outcome, as the sequence of screenshots at the end of this post demonstrates.
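To make those three questions concrete, here is a deliberately naive sketch of pulling ‘what,’ ‘where’ and ‘when’ out of a request of exactly that shape. It is not how Siri works; the regular expression, the `parse_request` function and the `places` lookup are all invented here for illustration, and a real assistant needs far more than pattern matching.

```python
import re
from datetime import time

# Hypothetical pattern: "<what> near <where> at <hour><am|pm>"
PATTERN = re.compile(
    r"^(?P<what>.+?)\s+near\s+(?P<where>.+?)\s+at\s+"
    r"(?P<hour>\d{1,2})\s*(?P<ampm>am|pm)$",
    re.IGNORECASE,
)


def parse_request(text, places):
    """Split a request into what/where/when, or return None if it doesn't fit.

    `places` maps personal labels such as "work" to addresses (an assumption).
    """
    match = PATTERN.match(text.strip())
    if match is None:
        return None
    hour = int(match.group("hour")) % 12  # treat 12am as 0 and 12pm as 12
    if match.group("ampm").lower() == "pm":
        hour += 12
    where = match.group("where").lower()
    return {
        "what": match.group("what").lower(),
        "where": places.get(where, where),  # resolve "work" if we know it
        "when": time(hour, 0),
    }
```

Anything that does not fit the pattern returns `None`, which is the point at which a conversational interface would start asking follow-up questions instead.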
At this point, it’s worth mentioning that Siri (like so many location-powered applications on the Web, the iPhone, Android, or wherever) is currently only really effective in the United States. This is due to any number of factors, including consumer readiness and the easy availability of cheap yet comprehensive data, but the situation will hopefully improve over time. Siri currently makes the point explicit, refusing to allow registration of a home or work address (or timezone) outside the United States. For the purposes of experimentation, my office has temporarily relocated to 1600 Pennsylvania Avenue, Washington DC, where there seems to be plenty of sushi available for tonight’s dinner.
The speech processing works well on the whole, although I’m bemused that Siri interpreted my daughter’s ‘Where is the nearest Greek island?’ as ‘18 me.’ Whilst it’s possible that this is an AI’s attempt to avoid causing offence by responding ‘Greece, you silly girl’ it seems more likely that the engine got very confused by her non-American enunciation. Trying to sound like Hannah Montana just made it worse.
Even in the States, data is key to ensuring that apps such as this one deliver a rich and useful experience, as all the AI smarts and user interface polish in the world can’t help an app that ignores the Starbucks across the street when you ask it to find you a coffee. Siri has lined up an impressive group of data providers including OpenTable, MovieTickets, TaxiMagic, Citysearch, Yelp, Yahoo Local, Gayot, Rotten Tomatoes, NYTimes.com, WeatherBug, AllMenus, StubHub, LiveKick, Maponics, Nuance and TrueKnowledge. Kittlaus celebrates the recent explosion of accessible APIs from sites such as these, claiming that Siri has acquired ‘far more data than we’ve had time to integrate yet.’ In a number of cases, revenue sharing arrangements mean that Siri gets a cut when money changes hands. A selection of test searches focussed on areas of the US to which I travel regularly delivered the sorts of results that I’d expect, and there’s clear value in the integration of data from a number of different providers.
The Siri team looks forward to analyzing the logs once users start putting this app to work. Amazon’s Elastic Compute Cloud (EC2) will handle the heavy lifting in the early days, allowing computing resources to scale with demand. Once the team has an understanding of real-world loading, Cheyer suggests that they’ll pull much of the computing resource back in-house to lower costs. There is a clear expectation that Siri’s responses will iterate rapidly as data become available to show how users use the app.
Further out, there’s the ever-present need for more data. Kittlaus is also interested in increasing the opportunities for facilitating revenue-generating commercial transactions, and in allowing Siri to ‘know you better.’ Work, home, and current location are one thing. Why not favourite food, names and contact details for family (so I can have Siri ‘tell my wife I’ll be late home’), preferred airline, and more? It makes sense not to introduce these features from the outset, as consumers will need to both value and trust Siri before willingly giving up such detail. But you can be sure they’ll be included soon.
Voice already plays a role on mobile devices such as the iPhone, perhaps most usefully in Google’s search app. It remains to be seen whether consumers will really use two, or look for many of Siri’s features to move across and enrich the voice-powered search experience they’re already getting from Google, which presumably has many of the same data deals already in place.
Kittlaus stressed several times that Siri will deliver value on other platforms, suggesting a Siri email address (similar to plans@tripit.com, presumably), a destination web site with which users might converse, or a Siri IM buddy that could be drawn into conversations.
By delivering value to users, and by building an ongoing relationship (backed by data) that’s difficult to replicate, Siri seeks to offer a compelling and defensible business. Playing with the application from the other side of the Atlantic it shows clear promise, and I look forward to putting it through its paces on my next trip to the States.
And, of course, you can be pretty sure it’ll run on the iPad.
This sequence of screenshots illustrates Siri’s conversational approach to getting from my vague opening query about ‘restaurants’ to a reservation for specific people in a specific place at a specific time on a specific day. It would have been quicker to simply say what I wanted up front… but sometimes you just don’t know until prompted.
August 10th, 2009
Moving Data.gov towards the Semantic Web
Government transparency in all its forms would appear to be very much in vogue at present, spanning everything from the Obama administration’s Data.gov portal and Prime Ministerial pronouncements in the UK Parliament to municipal proclamations of openness in Vancouver and compelling grass-roots demonstrations by activists and even newspapers.
At the heart of many of today’s initiatives lie programmes to surface Government data for use and re-use by third parties. The ‘open’ in ‘Open Data’ is, of course, a very loaded term, and I’ve looked before at some of the ways in which data might become ‘open’ whilst remaining effectively useless. Nevertheless, Governments’ current enthusiasm for being seen to embrace transparency should certainly be both welcomed and encouraged, and there are real opportunities to work with Government in ensuring that today’s transparency fervour is not diminished, whether by omission or commission.
Given the complex and varied nature of the data involved, and the obvious linkages between the entities (you and I, our communities, our schools, our hospitals) described in numerous different databases, there’s a clear opportunity for technologies and approaches from the Semantic Web community to play a significant role in simplifying the whole process of moving these legacy databases online.
Already interested in Open Government from previous roles, and (obviously!) committed to encouraging real-world adoption of semantic technologies, I’ve spent some time recently talking to a number of those involved. A number of those conversations are now available as podcasts, and I’ll continue to seek out fresh examples and perspectives to share.
My most recent podcast conversation, released today, is with Professor Jim Hendler and Dr Li Ding of the Tetherless World Constellation at Rensselaer Polytechnic Institute in Troy, NY. The team at Rensselaer have been working with some of the US Federal Government’s data sets on Data.gov, and so far they’ve converted sixteen data sets from their original form, resulting in 2,927,398,352 freely available RDF triples and a number of demonstration applications.
Other conversations already released in the series include;
- David Eaves, talking about Vancouver’s commitment to Open Data
- John Sheridan, Head of e-Services at the UK Government’s Office of Public Sector Information, talking about his Department’s efforts to get Government data online
- Mark Birbeck, talking about work with the UK Government’s Central Office of Information to embed lightweight RDFa into workflows and web pages
Each offers an example of ways in which ‘open data’ contributes to Government transparency, or to increasing the value of the massive sunk investment in collecting, managing and curating the data upon which Governments depend. The Semantic Web’s notion of Linked Data (whether actually in RDF or not!) offers a means to increase the utility of the data we have, without a massive programme of reengineering the systems used to manage it. The examples we see today, and the work of the individuals and teams with whom I have been speaking, will teach us a lot about how to make this work at Government scale.
June 17th, 2009
Semantic Search Round Table at the Semantic Technology Conference
Wednesday’s opening Keynote here in San Jose sees Guidewire’s Carla Thompson joined on stage by senior representatives from many of the more interesting players in the Semantic Search space; Tomasz Imielinski from Ask, Peter Norvig from Google, Riza Berkan of Hakia, Scott Provost from Microsoft, William Tunstall-Pedoe of the UK’s True Knowledge, and Andrew Tomkins of Yahoo.
Carla asks each panellist to describe the differentiating aspects of their product in ‘one or two sentences;’
Tomasz; “we receive about three times as many questions as other search companies. We want to answer questions the best we can from multiple sources… using structured and unstructured data.”
Scott; “Bing really focusses on understanding the intent behind queries, and organising the page to help people get to their answer much faster.”
Peter; “We focus on being comprehensive, accurate and fast… so we have to keep on innovating in crawling, ranking, systems engineering. One thing that differentiates us… most companies decide whether to focus on marketing or sales. We focus on engineering.”
Riza; “We are a complete semantic search engine, from the bottom up. We don’t even have an index. We’ve optimised the entire process for semantic operations. We focus on credible and dynamic content, and offer users a new perspective.” Instead of popularity, they focus on credibility.
William; “True Knowledge is a platform that does direct question answering. There’s a knowledge base and an inference engine to answer questions we haven’t seen before.” True Knowledge tries to ‘help when it can, and stay quiet when it can’t,’ as can be seen demonstrated in their recently released Firefox plugin.
Andrew: “Yahoo! is very aggressive about semantic annotation… SearchMonkey is about acquiring semantic information and surfacing it in search results on the page.”
Carla mentions Tom Tague’s keynote from yesterday, where he suggested that ‘semantic search is an answer to a question no one is asking’… so “why do we need to change search?”
Tomasz responds, suggesting that users don’t necessarily demand new products that subsequently become successful. eg; no one was asking for the iPod before it launched. “When they see it, they will want it.”
Turning to Google and Yahoo!, Carla asks them “why do we need to change search?”
Peter… “as an industry, satisfaction is very high… but that is just because that’s what people know [now]… People don’t like technology… people like solutions. When we deliver it, people will want it.”
Andrew; “Does search need to change? It already is… Today, on any major search engine, if you search for a restaurant, you’ll see structured information about that restaurant; reviews, phone number, etc… This has been accelerating over the last 3-4 years… When we put this information up, and trigger it correctly, we see far higher levels of engagement from our users than anything else.”
Carla; “it may be a stupid question, but it has to be asked; what is semantic search?”
Scott; “it means a lot of different things. At Powerset we focussed on understanding the meaning in web pages, so we could present them, rank them…”
Carla; “Has Powerset’s focus been diluted by the [Microsoft] acquisition?”
Scott; “No.”
Carla asks Riza; “Someone from Hakia that I spoke to last year said you were the only one doing ‘true semantic search.’ Is that true?”
Riza; “No… Semantic Search can enrich search results… Semantic Search can improve precision/disambiguation… Semantic Search can organise results better. In the future, search will move to more conversational systems, and for that you really need semantic technology.”
Carla; “How do you measure the ‘semanticity’ of a search engine?”
Tomasz; “That’s my favourite question… We took a sample of ‘equivalent’ queries from the logs, and ran it to evaluate ranking etc; does the search engine give similar answers to questions like ‘Top 10 songs’ and ‘Top Ten songs,’ etc. Should they?”
Andrew; “It’s incredibly hard to understand what a user will like… if you mess with the logo, it changes the perception of the results… if you make tiny changes, it can have a big impact on perception… When it comes to understanding semantic content in search, we should identify the task the user is trying to solve… and have a metric that’s aligned to that use case… We can break search queries today into different classes; how do we do when a user is trying to book dinner, or a vacation? Semantic Technology should be judged on its impact based on these task metrics rather than any underlying notions of entity resolution, etc… SearchMonkey, for instance, lets users inject structured data into the process… The information can be incorporated in any way… and change how the results are presented. We have about 15,000 people in our development community, changing the way those results are presented every day.”
Tomasz; “I would expect a semantic search engine to deliver equivalent results to queries that would appear similar to a human being; ‘Top 10 songs’ and ‘Top Ten Songs’ should deliver the same answer. Today in most mainstream search engines they don’t.”
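[An aside from me rather than the panel: Tomasz didn’t describe how Ask scores this, but one plausible reading of the test is to measure how much the top results overlap for pairs of queries a human would call equivalent. A rough sketch, in which the `search` callable, the function names and the choice of Jaccard overlap are all my assumptions:]

```python
def overlap(results_a, results_b):
    """Jaccard overlap between two result lists (1.0 means identical sets)."""
    a, b = set(results_a), set(results_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def equivalence_score(search, query_pairs, k=10):
    """Average top-k overlap across pairs of 'equivalent' queries.

    `search` is any callable mapping a query string to a ranked list of
    result identifiers; it stands in for a real search engine here.
    """
    scores = [
        overlap(search(q1)[:k], search(q2)[:k]) for q1, q2 in query_pairs
    ]
    return sum(scores) / len(scores)
```

[A score near 1.0 would suggest the engine treats ‘Top 10 songs’ and ‘Top Ten songs’ as the same question; a low score suggests it is matching strings rather than meaning.]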
Carla; “Search v. Answers. True Knowledge is billed as ‘the Internet Answer Engine;’ is it necessary to move search to an answer-based format, or has Google trained users to think in keywords?”
William; “We support both keyword search and full-text questions. It’s important to answer users’ questions.”
Peter; “Different types of answers are appropriate for different types of questions; sometimes the answer is a fact, or a page, or a series of results to support a process of study. To say there’s going to be one technology or one type of answer doesn’t make sense.”
Riza; “You could be asking a ‘where,’ ‘why,’ ‘how’ type of question. Questions are important, and the search engine needs to be able to interpret the mode of the question and return results appropriately.”
Carla; “You mentioned talking about the credibility of search results. How do you define a ‘credible’ search result, and how much of a need is there really? I’m not hearing users question the credibility of search results they see today.”
Riza; “Practically, credibility is important in ‘serious’ subjects; medical information, etc. You want to know where the results come from and how credible they are. When it comes to credible content, you can’t really do a statistical search or have a ‘popularity vote,’ because much credible content isn’t ‘popular.’”
Scott; “People’s expectations for credibility are different depending upon the query. If you ask an ‘instant answer’ type query you expect the answer to be credible. If you do a broader search, you expect a mix of results to be returned”
William; “If a system understands structured knowledge, it can understand when different sources contradict one another”
Riza; “A system doesn’t need to know what’s credible; we can go to a librarian for that. Hakia doesn’t decide whether a resource is credible or not; we use librarians for that”
Tomasz; “If you ask for the capital of Japan we expect a single answer. If you ask about taxes, maybe the IRS is the best source but there are others. If you ask ‘how to get rid of acne’ you expect a lot of results.”
Carla; “We’ve seen three news-making launches in the past month; Wolfram Alpha, Bing, Siri. Is Wolfram the first step towards 2001? How is this engine valuable to those of us who don’t need to solve complex maths?”
Scott; “it’s not the first step… we’ve been working on these problems for a long time. There are a lot of questions people want to ask about the types of data that Wolfram aggregates… We see these things as part of full-search services. Powerset has moved along this path as well, pulling structured data in response to full-text queries.”
William; “Wolfram is a tremendous effort. An interesting example of question answering with structured data. I think people will find uses for it in particular use cases; I spoke to someone who’d used it to calculate when his visa expired, because it could do date calculation. I think there will be use cases in various scenarios; maths, nutrition information, etc… if you remember that it has that sort of information and remember to go to it… However one thing it doesn’t have is a decent back-fill. If it doesn’t have the data, or doesn’t understand the way you asked the query, it gives you nothing. We try to keep quiet and fail over to standard internet search in that sort of circumstance.”
Carla; “Does a semantic search engine know how not to answer a question?”
William; “that’s absolutely fundamental. You need the ability to reliably keep quiet when you don’t have the answer… and fail over reliably to other search services.” [True Knowledge does try to do this...] “That requires very high quality semantics.”
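[Another aside from me: the ‘keep quiet and fail over’ behaviour William describes can be pictured as a confidence gate in front of an ordinary search. This is only a sketch of the idea, not True Knowledge’s implementation; `answer_engine`, `web_search` and the 0.9 threshold are hypothetical.]

```python
def answer_or_fallback(question, answer_engine, web_search, threshold=0.9):
    """Give a direct answer only when confident; otherwise fall back to search.

    `answer_engine` returns (answer, confidence) or None when it has nothing;
    `web_search` returns a list of ordinary results. Both are assumptions.
    """
    result = answer_engine(question)
    if result is not None:
        answer, confidence = result
        if confidence >= threshold:
            return ("answer", answer)
    # Stay quiet about the uncertain answer and hand over to plain search
    return ("search", web_search(question))
```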
Andrew; “One way to characterise the approach of Wolfram Alpha is that it’s a centralised approach. The Wolfram Alpha team goes out to find data and bring it in-house to convert to a standard form. A different approach is to have an ecosystem contributing data in the public eye… It’s not clear yet how much of a value-add is going to come from this centralised knowledge mapping approach. Yahoo! is focussed on the ecosystem approach, and helping people with knowledge to make it available.”
Peter; “Our inclination would be that we don’t want a closed walled garden. We want all the information available to combine in different ways. We want the information to be open, and the tool set to be open for mashing up in different ways.”
Scott; “If Wolfram Alpha hadn’t taken a walled garden approach they might never have launched a product.”
Tomasz; “Wolfram Alpha is great, but it’s not a search engine”
Carla; “Siri… caused a lot of buzz, uses True Knowledge… what are your thoughts?”
Andrew; “To be counter-cultural… the notion of getting much deeper and assisting a user with a task is spot on. We’re going to see much more of that. Search has tended to be stateless. Each query you enter is more or less processed without context. Yahoo! is rolling out more stateful search tools, and other companies will do the same. We expect people to use these tools on lots of devices. Would we expect people to come to the same place for purchase, navigation, etc? Do we expect one interface? There are going to be virtual assistants… I just don’t know if they’re going to be embedded into a search box.”
Scott; “Conversation is the ultimate user interface… but it’s not clear that I want to have a conversation with my laptop during the working day. How do I display the results? But there’s a huge role for conversation and dialogue in refining search and getting a user to their results faster.”
Tomasz; “What is the goal of Siri? If you try to go too broad you become a search engine.”
Scott; “When people have a conversational interface, they won’t speak in keywords.”
Carla; “What are the larger goals for Bing?”
Scott; “Bing is trying to simplify key tasks that people do when they come to a search engine. In travel, health, shopping, we can understand what people are trying to do, and get them to better results faster. The thinking has evolved from ten blue links to the whole page, and organising things to help the user by understanding their tasks.”
Carla; “Peter; what did you think of Bing?”
Peter; “I like the idea of innovation in the user interface. There’s a lot of room for that. There’s been a lot of emphasis on getting the ranking right. You still need to do that, but other things are important too. I’m usually happy with results on my big screen. On a mobile device, I’m usually not happy with the results I get.”
June 16th, 2009
Semantic Technology Conference kicks off with Keynotes from Open Calais and Siri
This year’s Semantic Technology Conference got fully underway this morning, with Keynote presentations from Tom Tague of Thomson Reuters’ Open Calais Initiative and Tom Gruber from Siri.
Despite the wider economic situation, attendance for this fifth year of the event feels a little up on last year, and there’s clearly real enthusiasm in the buzzing Halls.
Tague’s Open Calais has been one of the success stories for useful and easy application of semantic technologies beyond a core community of enthusiasts and adopters, and has been covered here and on Cloud of Data a number of times since it launched. Just today, they announced a new set of partners and a postal service that should remove one more perceived barrier for another set of potential adopters.
Speaking to the theme of ‘Web 3.0 - the Web of Me,’ Tague’s abstract suggests;
“The mainstream adoption of Web 2.0 technologies – from RSS feeds to social networks – is hastening the demise of the portal. With each new face on Facebook, and each new Twitter account, our once routine habits and traffic patterns shift. This wave of change in the way we consume, transact and interact on the Web is dis-intermediating ‘destination’ sites of all kinds. Our once centralized content has been atomized.
And yet our fundamental problem persists. We’re overwhelmed with input, yet still can’t find the one thing we need… now.
Semantic technologies – and the content interoperability and Linked Data connections they beget – offer new hope. That is not to say the answer lies in building new search engines, and few would argue for another news aggregator. Rather, our point of inflection lies at the point of consumption. Our task is to simultaneously refine and enrich our digital experience of everything from content and community to commerce.”
Early on, Tague made a ‘non-apologetic statement;’
“People need to start deriving financial benefits from semantic technology. It’s time”
Absolutely!
Tague looks back at the move from ‘Web 1.0,’ described as ‘the last Web we agreed on,’ to ‘Web 2.0,’ which he sees as largely defined by the ‘addition of social.’ Today, he reckons, we are ‘extraordinarily content-rich’, ‘extraordinarily information-poor’ and ‘experientially deficient.’ Despite a wealth of content, we are failing to make the most of it.
‘We’re at the inflection point’ where ‘innovation is exploding’ as we move from developing and inventing toward mainstream adoption of technologies in the semantic technology space. Lots of things will be tried; 90% will fail, but that’s ok.
‘Everyone needs plumbing,’ and that’s what Calais is; semantic plumbing. 13 version releases in 18 months; about 100 presentations, 13,000 registered Open Calais developers, a million great ideas.
Tague reckons the various efforts he comes in contact with fall into six broad buckets;
Tools; Social; Advertising; Search; Publishing; Interface.
First, Enabling Tools. Data Management, Data generation, Databases, Integration and workflow. ‘A big yes.’ ‘We need tools.’ Everyone needs tools, especially as you move from early adopters toward the mainstream. Tools build the bridges that cross the chasm to enterprise adoption.
Enterprise adoption will not happen because it’s cool. Enterprise adoption will not be talked about on Twitter. Enterprise adoption will happen because it’s cheaper/faster/better than what they have just now.
‘Tool vendors need to simplify their story; it’s not about more functionality.’ ‘If I can’t understand your story, then Enterprise IT certainly can’t’
Second, ‘let’s put some frosting on top of social.’ ‘Wouldn’t it be cool if we could…’ Some of it might be cool, but there’s a challenge in monetising social. Adding frosting to the top of an industry that hasn’t worked out its own monetisation is fraught with risk.
‘I haven’t seen a compelling story yet.’
Next, advertising. Almost a dirty word in the semantic technology domain last year. But advertising is fuel, and semantic technologies have a clear role to play in enhancing advertising (see my podcast with Scott Brinker from last year…).
Semantic search; ‘the semantic industry’s brilliant yet under-achieving child.’ The answer to a question no one is asking? General, consumer-facing semantic search… directly competing with Google et al? Not viable.
But vertical search in specific domains… a huge growth opportunity, and people are willing to invest the time, effort and money to make it happen. Room for a handful of players in each domain?
Search; ‘a bifurcated marketplace.’
Publishing; content producers, editorial/aggregation, ‘robotic publishing.’
‘Classic publishers can get enormous value from this technology… not all of the value is in the user experience.’ Much of the value is being found in the back office, making existing data and investments work harder.
Little value in ‘robotic publishing,’ because the content isn’t that readable. Aggregation services like Huffington Post and Daily Me present ‘enormous opportunities.’
Interface; gaming a huge and growing market. $57bn industry. A ‘seamless, interactive and responsive experience,’ it’s ‘graphically engaging and fun.’
Zemanta, AdaptiveBlue, Feedly, Apture et al ‘trying to make the consumption experience different’ [better?]. Not suggesting that these are like a game, but many of the drivers may be similar?
“People are on their mobile devices and in the browser; go where the people are.” Which links well to the next keynote…
“Do you care about semantics or about user value?”
“Don’t fund/buy semantic infrastructure beyond what you need; use infrastructure built by others where possible.”
“Think very hard about the user experience; make it compelling and exciting.”
Following Tague’s presentation, Tom Gruber took to the stage to talk about Siri; a company building a Virtual Personal Assistant (with an interesting iPhone app to start things off) that we discussed during a podcast last week. As Gruber’s abstract says;
“We are beginning to see a new interaction paradigm for the web: the Virtual Personal Assistant (VPA). A VPA is task focused: it helps you get things done. You interact with it in natural language, in a conversation. It gets to know you, acts on your behalf, and gets better with time. The VPA paradigm builds on the information and services of the web, with new technical challenges of semantic intent understanding, context awareness, service delegation, and mass personalization.
Siri is a virtual personal assistant for the mobile Internet. Although just in its infancy, Siri can help with some common tasks that human assistants do, such as booking a restaurant, getting tickets to a show, and inviting a friend. We will describe the technology underlying Siri and how it fits in the larger ecosystem of services and data providers. And we will offer a vision of where assistants like Siri are going.”
Tom starts off by showing the Knowledge Navigator video from Apple… which dates all the way back to 1987. Many of the ideas are now coming to fruition; touch screens, a global network, awareness of temporal and social context, speech in and out, a ‘conversational interface,’ ‘delegation of work’ to the machine, and trusted use of personal data.
Is the Knowledge Navigator possible today? ‘No, but we’re getting there.’
Siri is pretty close… in certain well understood contexts, as Gruber shows in a video demo of the evolving iPhone application.
What is a Virtual Personal Assistant? It does things for you; it’s task-oriented. It understands your intent via a conversational metaphor. It gets to know you; it’s not the same for everybody, unlike a search engine.
‘Service delegation [like Siri]; the mother of all mashups’
‘Context is king’ in communicating with a VPA; where am I, what time is it, who am I, etc.
“This really is the beginning of the age of the start of Virtual Assistants.”
Need to solve authorisation/ authentication. If we reach a ‘data commons’ there will be more, better, information to drive choices and decisions.
Tom Tague is a regular member of the Semantic Web Gang podcast, which I moderate. Tom Gruber was the latest guest in my Executive Briefing podcast series.
April 3rd, 2009
AdaptiveBlue updates Glue; I avoid 'sticky' puns with this title
New York-based semantic technology startup AdaptiveBlue yesterday unveiled an update to their Glue product, and the world’s technology writers were unable to contain their enthusiasm for the obvious puns. I spoke with AdaptiveBlue’s CEO, Alex Iskold, ahead of the launch to hear about the latest enhancements.
Currently offering an iPhone App and a Firefox browser extension, Glue provides useful functionality in aggregating interactions with identifiable objects such as films and books from across the various sites on which people interact with them. As I described when Glue originally launched last year,
“There are plenty of offerings that will put you in touch with your social network on a single site. Glue is interesting because it escapes the tyranny of the site and connects people to things across a growing number of sites. My interactions with Social Networks and the Semantic Web on Amazon.com are visible to members of my network who prefer to shop with Barnes&Noble, and those who are amongst the 32 owners of this book hanging out on LibraryThing. My personal preferences are respected, as I only need to interact with the item on a site of my choosing. Members of my network gain the ‘benefit’ of that interaction without needing to change their habits and visit sites of my choosing. Behind the scenes, semantic technologies are hard at work reconciling the 0387710000 with the 978-0387710006, the 4561465 and the various other ways in which we choose to refer to a single body of intellectual expression. When a match is found, the Glue Firefox plugin does a nice job of subtly highlighting the fact… without getting in the way of whatever task you are trying to complete.”
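One small piece of that reconciliation is easy to show. The first two identifiers in the quote are the ISBN-10 and ISBN-13 forms of the same book, and getting from one to the other is a fixed calculation: prefix ‘978’, drop the old check digit, and compute a new one. The sketch below illustrates that arithmetic only; it is not AdaptiveBlue’s code, and the function names are my own.

```python
def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert an ISBN-10 to its ISBN-13 form (978 prefix, new check digit)."""
    chars = isbn10.replace("-", "").replace(" ", "")
    if len(chars) != 10:
        raise ValueError("expected a 10-character ISBN")
    core = "978" + chars[:9]  # the ISBN-10 check digit is discarded
    # ISBN-13 check digit: weights alternate 1, 3 over the first 12 digits
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)


def same_book(a: str, b: str) -> bool:
    """Treat two ISBNs as the same book if their ISBN-13 forms match."""
    def normalise(isbn: str) -> str:
        chars = isbn.replace("-", "").replace(" ", "")
        return isbn10_to_isbn13(chars) if len(chars) == 10 else chars
    return normalise(a) == normalise(b)
```

With this, `same_book("0387710000", "978-0387710006")` holds. Site-specific identifiers such as the 4561465 cannot be derived arithmetically, so matching those presumably relies on some form of lookup.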
Since that launch there have been 110,000 downloads of the browser extension, and Alex reports 35,000 ‘active’ users.
Yesterday’s visible additions to the product mostly appear quite superficial, but they’ll be important in converting more of those downloaders to active users, especially once the pool of potential users is expanded by the upcoming version for Internet Explorer.
First, the Glue Bar at the top of a browser window is 25% thinner. Even though the Glue Bar is ‘contextual,’ and only appears on pages where relevant content (a book, a film, a person, etc) is detected, the old version could still intrude quite a long way into the browsing experience. For those who actively engage with the bar only occasionally, it’s now an awful lot less intrusive and therefore more likely to be tolerated in day-to-day browsing for the benefits it brings.
More significantly, the ‘2 cent’ comments that users of the previous version were able to make are now aggregated and made far more visible to other Glue users via what Iskold described as ‘Connected Conversations.’ As with the core Glue offer, users’ 2 cent comments about a film on Netflix, Wikipedia or IMDB are pulled together and made visible to their peers, regardless of the site on which they happen to be viewing details of the same film. These comments can also be shared via a user’s social graph on Twitter, Tumblr and Friendfeed, reaching a community beyond Glue.
By aggregating the interactions of every Glue user with items scattered across the Web, Glue is also now able to compile - and display - lists of the most popular books, films, etc. Unlike traditional measures that might track rentals from Netflix and sales from Amazon, Glue is able to measure a far more complex set of interactions with a given resource. Users might buy from Amazon or rent from Netflix. They might also demonstrate their interest by reading about a film on Wikipedia or IMDB… and Glue would track those interactions too. Below, for example, we can see that Knowing is the most popular film on Glue this week.
These lists are calculated daily, and are based upon the aggregate of all user activity over the preceding seven days.
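A list of this kind can be sketched in a few lines. The following is an illustration only, assuming that every interaction counts equally regardless of the site on which it happened; Glue's real weighting was not disclosed to me, and the `weekly_top` function and the `(day, site, item)` event shape are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def weekly_top(events, today, n=10, window_days=7):
    """Rank items by interaction count over the window_days before today.

    events is an iterable of (day, site, item) tuples, where day is a
    datetime.date. The site is ignored here: a view on Wikipedia counts
    the same as a rental on Netflix (an assumption, not Glue's rule).
    """
    start = today - timedelta(days=window_days)
    counts = Counter(
        item for day, _site, item in events if start <= day < today
    )
    return counts.most_common(n)
```

Running this once a day, with `today` set to the current date, would reproduce the "calculated daily over the preceding seven days" behaviour described above.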
The list of sites that Glue understands is growing, but remains heavily biased to the US market. Alex suggested that the team were keen to consolidate their US position before devoting more attention to adding international sources. However, with a Glue API to follow their Internet Explorer release, it’s possible that a sufficiently motivated community will soon be able to start adding some of these new resources for themselves…
October 20th, 2008
Radar Networks opens Twine to the world with version 1.0
Less than a year after its unveiling at last year’s Web 2.0 Summit, and a mere eight months after closing a $13 million Series B funding round, Radar Networks’ Twine today moves out of beta as a 1.0 Release, open to all comers.
The ‘Semantic Web’ with which it was so closely associated at the outset (an association that has attracted flak) is almost nowhere to be seen, and this is bound to incite a further round of criticism, nay-saying, and mud slinging. What many of those critics forget, though, is that Twine is quite explicitly billing itself as a consumer application.
If my mother, my brother or my children can ‘see’ the Semantic Web, it has failed - big time.
When I spoke with Radar Networks’ CEO Nova Spivack ahead of today’s launch, he was keen to stress the
“big focus in this release upon usability.”
Twine is billing itself as
“a place to keep up with your interests”
A company briefing document suggests;
“Radar Networks is a venture-funded startup focused on ‘interest networking’ – the practice of connecting with others around the topics we care about most.
If a social network is about who we are interested in, an interest network is about what we are interested in.
The company’s first product, Twine, is the logical next step beyond a social network – It connects people around the content they find interesting.”
The team has invested a lot of effort in easing new users into Twine, and streamlining workflow once inside. There’s still some work to do; frankly the interest feed is a pain to keep ‘caught up’ when you are subscribed to a sizeable number of twines; especially given users’ penchant for cross-posting items to multiple twines, most of which you’re also likely to be subscribed to. It’s fixable, though, and this release is a significant step forward from earlier iterations in the beta process. [Update, 0017 PST, 21 October: responding to this post, Nova Spivack tells me that enhancements to the interest feed will be rolling out over the next 24 hours.]
Twine 1.0 is noticeably faster than previous releases, with Spivack suggesting that the site was
“1,000 times faster than last week”
Writing late on Monday evening in the UK as the Twine team add the last lick of Californian paint, I am still seeing the site occasionally slow to a crawl, but I’m not going to hold that against them. The site is technically still in beta as I write. If it’s still slow after this post sees the light of day, then I’ll complain.
As with so many ‘social’ sites, it can be difficult to clearly communicate value to a new user. Indeed, for many sites there is no value for a new user until they have invested significant effort in manually constructing their network. Twine is a little different, and the new signup screen encourages prospective users to enter some of their interests before actually signing up.
Straightaway, a prospective user is able to discover information that others have added to Twine. Behind the scenes, the semantic technologies that make Twine work are doing what they do best, without the user having to concern themselves with them. If interested in what they see, the visitor is then able to work through a straightforward sign-up process and begin to realise the additional benefits of connecting to other members and registering with subject threads (‘twines’) of interest in order both to post material of their own and to receive updates from other members of the twine.
Once a member, there are two main - linked - functions within Twine. The first is tracking and commenting upon content posted to twines by other people, and the second is bookmarking content that you discover out on the web.

New items and comments posted to twines of interest are visible in the Interest Feed that greets you each time you log in to Twine, as well as in optional email alerts, RSS feeds and the like. On the basis of user behaviour, Twine will also begin to recommend people and twines that may be of interest, and Spivack notes that an upcoming release will greatly enhance this feature by explaining why the recommendations are being made. In the same way as you can with Amazon, it would be useful to be able to declare non-interest in these recommendations, so that particular people and twines do not recur.

A simple bookmarklet enables Twine users to post items of interest into Twine. Around 50% of all twines are private and restricted to an individual or a group. The rest are public, and open to be read by anyone with a web browser. This example shows the result of trying to submit a page from the BBC. In this case, all of the text has been auto-generated by Twine, and all that I need to do is select the twine(s) and/or people with which I wish to share, and (optionally) add a comment of my own before saving. The result is as below (click to see the real thing), where you can see Twine’s power beginning to express itself in the series of facets and tags down the right hand side;
Items can also be submitted by email, and in an upcoming release Twine will be able to directly consume RSS.
During the beta programme, Twine has grown in size, complexity and utility. According to Radar Networks, they have seen 500,000 unique visitors during the beta, 50,000 of whom are described as ‘active’ in adding over 1,000,000 items to 20,000 twines.
More than half of those users originate outside the United States, and they tend (around 75%) to be male, well educated, comfortably employed, and between 31 and 50 years of age; a pretty good demographic to monetise, in other words.
Turning to monetisation, Spivack suggested that;
“social networks do not monetise because you’re basically there to communicate”
Twine, on the other hand,
“is different, because you’re there to keep up on a topic. [You might therefore welcome] targeted advertising around that topic”
Spivack reports that the company is actively signing up a variety of partners looking to benefit from Radar’s patent-pending recommender system, and he expects the first adverts to begin to appear early next year.
Enhancements are expected to roll out each month from now on. Features due to arrive in them include the release of an API and far more investment in making existing semantics or structure work that much harder.
As soon as November, for example, Spivack suggests that the company will release a new mining system that will use Natural Language Processing (NLP) to do a far better job of parsing information from pages that Twine users bookmark into the system.
During 2009, Spivack suggests that we will
“start to see the other 90% of our Platform.”
Into 2009, users will gain the ability to create far more item ‘types’ (events, product data, etc.) and a public API that’s already operational within the company will include capabilities such as the import of existing third party ontologies.
The API is apparently fully RESTful, and
“similar to Freebase.”
One (unnamed) partner is using the API to integrate Twine into Microsoft Office. Powerset, anyone?
Despite alluding to similarities, Spivack was quick to stress that he has
“No interest in doing what Freebase is doing… building an encyclopaedic view of the world. [He would] much rather make it easy to pull Freebase data into Twine.”
Twine has come a long way since I first saw it. As with all complex applications, some rough edges remain, but there is certainly enough utility for the avid hoarder of ‘stuff’ to get to work populating their twines today. Is it for everyone? No, probably not. But for all those people who want to track a professional subject, a hobby, or their favourite band, there’s something here. For people who want to do those things, and who see the value of doing it along with similarly enthused individuals around the planet, there’s even more.
The Semantic Web’s technologies lie behind Twine. Sometimes you can almost see that, if you know where to look. Often you can’t. Given Spivack’s ambitions for 2009, the semantics in Twine are going to get a whole lot richer. The trick will be adding that richness whilst ensuring that the application continues to get demonstrably faster and more usable at the same time.
See Radar Networks’ overview of Twine functionality in this short video, and listen to Radar Networks’ CEO Nova Spivack talking to me about the Semantic Web several months before Twine was announced.
Nova Spivack will be joining October’s episode of the Semantic Web Gang to report on the first week of full operation, and to discuss the company’s next moves.
August 5th, 2008
Everywhere I look, I see Clouds
No, not a comment on the weather in East Yorkshire this ‘summer,’ but rather a reflection on the recent eruption of content related to Cloud computing. Having taken a long weekend away from the computer, punctuated by occasional iPhone-powered checking of my feeds to sate the addiction, it really did feel as if there was a new Cloud computing post near the top of the pile every time I looked.
First up was a great piece from fellow ZDNet blogger, Dion Hinchcliffe. In ‘Enterprise cloud computing gathers steam’, Dion writes about the Cloud’s ability to lower IT costs within the enterprise and accelerate technological innovation at the same time.
“Interestingly, it’s at this very intersection of issues that cloud computing appears especially compelling. By offering easy access to more efficient IT capabilities across computing, storage, and applications while providing direct and immediate access to both external innovation and innovation capability, cloud computing offers an on-demand, scalable, and repeatable resource that can be used to solve two of the major challenges facing IT departments today. We’ll see in a moment how cloud computing can help with these issues in ways that traditional on-premises computing is hard pressed to match.”
Dion’s piece was followed (at least in my reading, if not necessarily chronologically) by a guest post on TechCrunchIT by Salesforce CEO Marc Benioff. In ‘Welcome to Web 3.0: Now Your Other Computer is a Data Center’, Marc paints a compelling picture of the shift back to shared compute resources;
“For almost ten years now, we have been witnessing a decisive shift from client-server software to software as a service. Google, eBay, and Amazon.com established the value of multi-tenant internet applications in the consumer market, and salesforce.com, Google, and others have been proving that this same multi-tenant model is winning in the enterprise as well.”
Interestingly, especially in the context of this blog, Marc makes use of the ‘Web 3.0’ moniker… but in a very different way;
“This shift to Web-based applications has generated two powerful waves so far. Now, we are seeing a third wave—one that we are calling Web 3.0—and it may prove to be the most significant and disruptive yet to the traditional software industry.
While the world doesn’t need another buzzword, I feel that both the emerging generation of entrepreneurs and developers, as well as traditional software ISVs, need to grasp the enormity of Web 3.0 and its potential to create change, disruption, and opportunity. Web 3.0 is about replacing existing software platforms with a new generation of platforms as a service.”
He suggests that Web 1.0 was, fundamentally, a ‘transactional’ web; Web 2.0 a ‘participatory’ web; Web 3.0 a web in which anyone can innovate by calling upon shared resources in the Cloud.
“Web 3.0 changes all of this by completely disrupting the technology and economics of the traditional software industry. The new rallying cry of Web 3.0 is that anyone can innovate, anywhere. Code is written, collaborated on, debugged, tested, deployed, and run in the cloud. When innovation is untethered from the time and capital constraints of infrastructure, it can truly flourish.”
The transitions that Dion and Marc identify and describe are significant shifts in the IT industry. These shifts are as significant to the increasingly mainstream capabilities of the Semantic Web as elsewhere, although neither Dion nor Marc directly makes that leap in his post.
Discussing these posts internally, a colleague was quick to remind us that these huge hosted data centres are not just full of powerful servers. They’re full of data, ripe for interconnection and manipulation in very similar ways to those in which computers and software applications are already being meshed and combined. The Linked Data movement is increasingly central to the Semantic Web and it is a small step to move beyond its current projects to consider web-based applications that draw seamlessly upon these web-addressable pools of accessible and usable data.
Google did web developers, application users (and themselves) a huge favour when they formalised the APIs that provided access to large bodies of map data. Mashups exploded, and everyone sat up and took notice of the opportunities for innovation boot-strapped upon the shared capabilities of Google’s code and servers, and the underlying data licensed by Google.
As more and more data - and compute capability - moves to the Cloud, and as the licensing frameworks formalise in order to explicitly ensure a wide range of re-use for those data, we’re moving ever closer to a mode in which software is available on demand (SaaS)… and so is data (and no, I’m not going to call it DaaS!)
This wealth of Web-addressable data needs web-native structures in which it can be stored and manipulated. The siloised mentality, code and structures of the RDBMS are unlikely to fit the bill here, whereas the web-native model of the Semantic Web is ready and waiting.
And yesterday evening, as I pretended to watch Dragons’ Den, a flurry of Cloud-related posts appeared in my BusinessWeek feeds to join the piece by Stacey Higginbotham that I wrote about last week.
Everywhere I look, I see Clouds. But the Dragons haven’t noticed yet.
And here in East Yorkshire? The sun might actually be coming out.
August 1st, 2008
Kevin Kelly looks to the next 5,000 days of the Web
Like most of us at Talis, I’m a big fan of the TED videos. There’s always a wow moment lurking in each of these recordings, and many of them do a great job of challenging assumptions. More often than not, they’re also powerful examples of how best to convey new ideas in presentation form.
Amongst the videos highlighted in the latest email alert from the TED team was this one, recorded at the Entertainment Gathering last December, in which Kevin Kelly notes that the Web is a mere 5,000 days old… and asks how we can predict what the next 5,000 will bring.
Mal Booth clearly got the same email, and was also impressed.
Kevin begins by looking back, reminding us about the extent to which things have changed in the past 5,000 days. He suggests that we could not have anticipated the services that have arisen, or the economic models by which they are supported.
He postulates the emergence of a global machine, more reliable than any other we have ever built. Within that machine, he suggests, we come together to navigate some 55 trillion links between documents, making 100 billion clicks per day and sending 2 million emails every second (most of which seem to arrive in my inbox!) As Tim Berners-Lee did in Beijing, Kelly draws parallels between the size of the Web and the size of the human brain. Kelly suggests that the size of the Web is doubling every year, such that the ‘total processing power… in raw bits’ of the Web will surpass that of humanity by 2040.
So how does this fit with the Semantic Web?
One of three broad trends that Kelly predicts for the next 5,000 days is a restructuring of the Web; a restructuring that Kelly argues is the Semantic Web.
He points to a progression from the pre-web linking of computers, to the web’s linking of pages, and on to the Semantic Web and the linking of the data lying behind pages and applications wherever they may be.
“You have to be open to having your data shared… which is a much bigger step than just sharing your web pages or your computer.”
“The next 5,000 days; it’s not going to be [just] the Web, only better.”
Worth 20 minutes of your time, to watch this…
July 16th, 2008
Yahoo! SearchMonkey Developer Challenge illustrates diversity
Back in May Yahoo! opened up their SearchMonkey platform, and kicked off a competition in which developers could put SearchMonkey through its paces.
The whole Yahoo! open platform initiative continues to grow apace (and largely unchallenged by Google and Microsoft), with BOSS rolling out earlier this month, and back in the SearchMonkey space the Developer Challenge’s winners have been announced.
According to UCLA student Marco Vitanza, who picks up the $10,000 Grand Prize for his Blogspot Infobar,
“SearchMonkey is a great first step towards the semantic web, creating the incentive for site owners to add semantic tags to their content while providing a richer, more useful search experience for users.”
Marco is joined by four other winners, one each for ‘Best Infobar’ (from BooRah), ‘Best Enhanced Result’ (from Greg Schechter, for Xbox.com), ‘Best Data Service’ (from David Hinckley), and most ‘Innovative Structured Data’ (from StumbleUpon).
StumbleUpon’s entry, for example, enriches results returned from Yahoo! Search, adding simple reviews and ratings from StumbleUpon to the Yahoo! page as illustrated by this screenshot from the Yahoo! Search Gallery listing for the entry.
Each entry takes a different path toward meshing data from diverse sources, yet each results in an end-user experience that is richer and more compelling than the vanilla search result from Yahoo! or either of their major competitors. With the arrival of BOSS, the ‘limitation’ of having this enriched interaction only available to those who happen to be on a Yahoo! Search web page diminishes, and opportunities for rich and innovative mashing and meshing of data in context and at the point of need draw ever closer.
Congratulations to this year’s winners, and to all those who entered; I look forward to seeing what those who follow them are capable of, and await responses from Mountain View and Redmond with anticipation.
May 20th, 2008
Ask questions of Tim Berners-Lee, Nova Spivack and others
On 11 June, Rensselaer Polytechnic Institute (RPI) is hosting a debate to explore the views of luminaries in the evolving Semantic Web. Somewhat unusually, the questions that panellists face will be influenced by “the collective wisdom of Web users from around the world.”
Anyone can ask a question, and the community is invited to vote on questions posed in order to select the most popular topics for discussion.
Panellists comprise (former podcast subject) Sir Tim Berners-Lee, (former podcast subject) Jim Hendler, (former podcast subject) Nova Spivack, Deborah McGuinness, Wendy Hall, and Nigel Shadbolt.
The debate will be livecast across the Web, and it’s down to all of you to make sure that the questions you want to be answered get asked. How often do you sit through an anodyne panel where the questions are hardly even worth asking, let alone answering? Well, if that happens this time, it’s your fault.
I look forward to seeing what happens.
Paul Miller provides consultancy and analysis services at the interface between the worlds of Cloud Computing and the Semantic Web. See his full profile and disclosure of his industry affiliations.