Category: Commercialisation
September 1st, 2009
Oracle delivers native support for Thomson Reuters' OpenCalais service
Thomson Reuters and Oracle today announced support for the media giant’s OpenCalais metadata generation service within release 2 of Oracle Spatial 11g. The integration gives Oracle users and developers direct access to OpenCalais’ natural language processing (NLP) capabilities.
More importantly, perhaps, direct integration with an Enterprise product such as Oracle’s database says much about how far the semantic technology community has come in being able to offer solutions capable of scaling - robustly - to meet Enterprise-scale demands.
Xavier Lopez, Oracle’s Director for Spatial and Semantic Technologies, is quoted in Thomson Reuters’ press release;
“This interoperability lets users quickly process documents in different formats (such as Microsoft Word and Adobe PDF), to extract semantic metadata that can be used for more semantically complete searches in Oracle11g.”
June 17th, 2009
Nova Spivack interviews Wolfram Alpha's Russell Foltz-Smith
Radar Networks attracted a fair degree of attention with their roll-out of Twine, and the company’s CEO has built a reputation as one of the more thoughtful voices in the space. Nova took to the stage at the Semantic Technology Conference today, not to talk about his own company or ideas, but to lead a conversation with Russell Foltz-Smith from Wolfram Research.
Wolfram Research, of course, is the company behind the recently launched Wolfram Alpha; a ‘computational knowledge engine’ that attracted a wave of attention that reached into the mainstream media.
“Putting all of the world’s computable knowledge; it sounds impossible… or over-confident, maybe. What is computable knowledge?”
“It’s ‘systematic knowledge.’ It can be compared, contrasted, correlated, computed on. It’s not a movie review. Examples are classical physics, financial data and models, weather data and models… It’s not the latest opinion on who Britney Spears is dating. We don’t have a model to do anything with that in our system.”
Nova asks if it’s the difference between objective and subjective… Alpha deals with objective information. ‘Facts,’ almost?
Nova asks about sources, pointing to the example of Tibet; is it a ‘fact’ that Tibet is part of China, or not… ?
“In the case of geo-political things, and religious things, we have to make choices… and allow the community to let us know whether they agree or not…” Couldn’t the system represent multiple views, tied to the diverse sources? Could we not show the different opinions, and allow the user to make informed decisions themselves?
Nova; “is the world’s computable knowledge infinite?”
Russell; “the foundation of computable knowledge is likely to be finite… The amount of knowledge that can be computed and generated from that is infinite…”
Nova; “I can see that maths could be finite. But geopolitics, health, etc… that’s much, much larger…”
Russell; “The instances seem very complex… Huge, but finite… I don’t want anyone to think we’ll have this done in ten years… It’s a long term thing.”
Nova; “Stephen [Wolfram] reckons it could be done in three years…?”
Nova; “Looking at the back end, the ontology seems to be implicit. I didn’t see any classes, just a lot of instances… a set of facts. As the team grows, how do you prevent people adding facts in different ways?”
Russell; “There are a set of stored facts; things you know about a city. But then there are computed facts that you couldn’t store in a traditional ontology.” Huh?
Nova; “Can you make a statement about what percent of the world’s computable knowledge is there today?”
Russell; “I can’t make a statement…”
Nova; “The syntax is quite interesting… but enigmatic. It wasn’t necessarily that the knowledge wasn’t there, but that I’d asked for it in the wrong way. Can’t you make a manual? … Stephen [Wolfram] said it would be an impossible task to write the manual… or to make a generic natural language on top.”
Nova; “In some cases a naive query will get you the answer, but maybe there’s a need for a layer that helps you when you don’t get what you want…”
Russell; “I think we’re getting close… we’re going to put an API out in the next few weeks, and hopefully someone will build the application using that to parse natural language and translate it for Alpha… Do we spend our time doing that, or putting more data and more models into the system… I reckon our time is best spent adding more data…”
Nova; “Is there a set of schemas or ontologies to link all of this stuff together?”
Russell; “There isn’t an ontology over the whole system… but within a domain there is structure… Is there some grand scheme that we have internally? Not really. The company has been doing this stuff for 23 years, so there’s a bit of a shared understanding internally.”
Examples keep coming back to mathematics… To succeed, Alpha has to offer compelling examples that are far broader…
Nova; “What about reasoning. You’ve said that you can derive additional knowledge. What kinds of reasoning is the system capable of?”
Russell; “I’d call it very simple reasoning. For example changing the currency based upon your geo-location… Is there any weak or strong AI in here? Not really. Could you build something like that? Probably. Will we? I don’t know…”
Nova; “Alpha seems to be a subset of Mathematica capabilities… Would you expand that, and bring a full Mathematica to the Web?”
Russell; “It is, and there are plans to extend the capabilities. I don’t know if we’d go to a full-blown Mathematica on the web.”
Russell’s mentioning a subscription service for people working with more data, or needing more compute time. The public web site tends to time out a query in 4-8 (or 48?) seconds… The professional version will have a monthly subscription that will allow you to compute bigger questions. There will also be a pay-per-use API… and ‘primitive’ advertising. More advanced advertising, based on transactions, to be launched soon.
Nova; “Alpha’s really cool, but I want to do this on my own knowledge… inside an enterprise, inside a government agency…”
Russell; “We can roll out a custom Wolfram Alpha for those who want it behind the firewall. We will also let people upload their own data sets. We need to find a sensible way to let people do this…”
Nova; “There was a lot of hype - possibly my fault - around Alpha being a Google Killer. Obviously it’s not that. It’s something quite different. Who is the user, and what are they using it for?”
Russell; “Use will evolve, and it already has. There’s an obvious use by students, but the school year has just ended.”
Nova; “Wolfram Alpha; now even Ph.D’s can cheat on their homework.”
Nova; “Are consumers using it? Obviously they’re having a play, but are they coming back and using it?”
Russell moves off to talk about academic use… Dodging the question?
Nova; “Are the financial capabilities in Alpha differentiated from the capabilities banks and investors already have in their vertical?”
Russell; “more sophisticated than a general finance web site, but probably less sophisticated than you’d find on a terminal in a bank.”
Nova; “Do I really need to know how long it would take an ant to get from San Francisco to Cairo?”
Russell; “Because of the way the system is engineered, it just keeps computing until it runs out of time. With simple queries you’ll get a lot of data. It just keeps computing.”
Nova; “What’s the big challenge, moving forward?”
Russell; “Setting priorities.”
Nova; “So let’s talk about Google. They made some aggressive marketing moves during the Alpha roll-out, and they’re continuing to roll products out to chip away… Do you think that what you’ve built is defensible, just because it’s hard… or can you defend it in other ways?”
Russell; “There are significant barriers to what we’re doing. Someone else could build this… but would they want to? That’s an open question.”
Nova; “Do you hope to work with other companies? Perhaps revenue share with them?”
Russell; “Obviously.”
Nova; “There’s been a lot of interest in how Alpha might connect with open standards and the Semantic Web…”
Russell; “If you want the platform to be used, we’ll have to do some of this stuff… RDF, OWL, etc could play a huge role.”
Nova; “Timeframe?”
Russell; “It’ll depend on pick-up of the API… which is due out in a few weeks.”
Nova; “So what’s the implication for education? It makes it possible to do some things without even thinking…”
Russell; “It’ll be a heated debate for a while… Some things are positive, some negative. There’s going to be a reorientation… It has to happen.”
Nova; “The danger is that if you delegate thinking [inside education] to a computation service… you may not actually understand enough to know if the answer that comes back is correct.”
Russell; “That’s a valid concern.”
Q&A
“You rely more on your computational engine than natural language… but you lay a lot of emphasis on the linguistics in your system. So if it’s not NLP what is it?”
Russell; “Domain linguistics, mainly; mathematical language, engineering language, etc… We think about how people describe things and search in these domains… and crawl the web looking for examples of how people use language in these domains.”
“Stephen is focussed on quality of data, which is important to a lot of people here. There aren’t a lot of tools. In addition to making your data store, I wonder if there might be scope to make some of your data curation tools available to the community, to improve the data out there.”
Russell; “Great point. Can we make these tools genuinely useful to people, without creating a support nightmare…”
June 16th, 2009
Semantic Technology Conference kicks off with Keynotes from Open Calais and Siri
This year’s Semantic Technology Conference got fully underway this morning, with Keynote presentations from Tom Tague of Thomson Reuters’ Open Calais Initiative and Tom Gruber from Siri.
Despite the wider economic situation, attendance for this fifth year of the event feels a little up on last year, and there’s clearly real enthusiasm in the buzzing Halls.
Tague’s Open Calais has been one of the success stories for useful and easy application of semantic technologies beyond a core community of enthusiasts and adopters, and has been covered here and on Cloud of Data a number of times since it launched. Just today, they announced a new set of partners and a postal service that should remove one more perceived barrier for another set of potential adopters.
Speaking to the theme of ‘Web 3.0 - the Web of Me,’ Tague’s abstract suggests;
“The mainstream adoption of Web 2.0 technologies – from RSS feeds to social networks – is hastening the demise of the portal. With each new face on Facebook, and each new Twitter account, our once routine habits and traffic patterns shift. This wave of change in the way we consume, transact and interact on the Web is dis-intermediating ‘destination’ sites of all kinds. Our once centralized content has been atomized.
And yet our fundamental problem persists. We’re overwhelmed with input, yet still can’t find the one thing we need… now.
Semantic technologies – and the content interoperability and Linked Data connections they beget – offer new hope. That is not to say the answer lies in building new search engines, and few would argue for another news aggregator. Rather, our point of inflection lies at the point of consumption. Our task is to simultaneously refine and enrich our digital experience of everything from content and community to commerce.”
Early on, Tague made a ‘non-apologetic statement;’
“People need to start deriving financial benefits from semantic technology. It’s time”
Absolutely!
Tague looks back at the move from ‘Web 1.0,’ described as ‘the last Web we agreed on,’ to ‘Web 2.0,’ which he sees as largely defined by the ‘addition of social.’ Today, he reckons, we are ‘extraordinarily content-rich’, ‘extraordinarily information-poor’ and ‘experientially deficient.’ Despite a wealth of content, we are failing to make the most of it.
‘We’re at the inflection point’ where ‘innovation is exploding’ as we move from developing and inventing toward mainstream adoption of technologies in the semantic technology space. Lots of things will be tried; 90% will fail, but that’s ok.
‘Everyone needs plumbing,’ and that’s what Calais is; semantic plumbing. 13 version releases in 18 months; about 100 presentations, 13,000 registered Open Calais developers, a million great ideas.
Tague reckons the various efforts he comes in contact with fall into six broad buckets;
Tools; Social; Advertising; Search; Publishing; Interface.
First, Enabling Tools. Data Management, Data generation, Databases, Integration and workflow. ‘A big yes.’ ‘We need tools.’ Everyone needs tools, especially as you move from early adopters toward the mainstream. Tools build the bridges that cross the chasm to enterprise adoption.
Enterprise adoption will not happen because it’s cool. Enterprise adoption will not be talked about on Twitter. Enterprise adoption will happen because it’s cheaper/faster/better than what they have just now.
‘Tool vendors need to simplify their story; it’s not about more functionality.’ ‘If I can’t understand your story, then Enterprise IT certainly can’t’
Second, ‘let’s put some frosting on top of social.’ ‘Wouldn’t it be cool if we could…’ Some of it might be cool, but there’s a challenge in monetising social. Adding frosting to the top of an industry that hasn’t worked out its own monetisation is fraught with risk.
‘I haven’t seen a compelling story yet.’
Next, advertising. Almost a dirty word in the semantic technology domain last year. But advertising is fuel, and semantic technologies have a clear role to play in enhancing advertising (see my podcast with Scott Brinker from last year…).
Semantic search; ‘the semantic industry’s brilliant yet under-achieving child.’ The answer to a question no one is asking? General, consumer-facing semantic search… directly competing with Google et al? Not viable.
But vertical search in specific domains… a huge growth opportunity, and people are willing to invest the time, effort and money to make it happen. Room for a handful of players in each domain?
Search; ‘a bifurcated marketplace.’
Publishing; content producers, editorial/aggregation, ‘robotic publishing.’
‘Classic publishers can get enormous value from this technology… not all of the value is in the user experience.’ Much of the value is being found in the back office, making existing data and investments work harder.
Little value in ‘robotic publishing,’ because the content isn’t that readable. Aggregation services like Huffington Post and Daily Me present ‘enormous opportunities.’
Interface; gaming a huge and growing market. $57bn industry. A ‘seamless, interactive and responsive experience,’ it’s ‘graphically engaging and fun.’
Zemanta, AdaptiveBlue, Feedly, Apture et al ‘trying to make the consumption experience different’ [better?]. Not suggesting that these are like a game, but many of the drivers may be similar?
“People are on their mobile devices and in the browser; go where the people are.” Which links well to the next keynote…
“Do you care about semantics or about user value?”
“Don’t fund/buy semantic infrastructure beyond what you need; use infrastructure built by others where possible.”
“Think very hard about the user experience; make it compelling and exciting.”
Following Tague’s presentation, Tom Gruber took to the stage to talk about Siri; a company building a Virtual Personal Assistant (with an interesting iPhone app to start things off) that we discussed during a podcast last week. As Gruber says;
“We are beginning to see a new interaction paradigm for the web: the Virtual Personal Assistant (VPA). A VPA is task focused: it helps you get things done. You interact with it in natural language, in a conversation. It gets to know you, acts on your behalf, and gets better with time. The VPA paradigm builds on the information and services of the web, with new technical challenges of semantic intent understanding, context awareness, service delegation, and mass personalization.
Siri is a virtual personal assistant for the mobile Internet. Although just in its infancy, Siri can help with some common tasks that human assistants do, such as booking a restaurant, getting tickets to a show, and inviting a friend. We will describe the technology underlying Siri and how it fits in the larger ecosystem of services and data providers. And we will offer a vision of where assistants like Siri are going.”
Tom starts off by showing the Knowledge Navigator video from Apple… which dates all the way back to 1987. Many of the ideas are now coming to fruition; touch screens, a global network, awareness of temporal and social context, speech in and out, a ‘conversational interface,’ ‘delegation of work’ to the machine, and trusted use of personal data.
Is the Knowledge Navigator possible today? ‘No, but we’re getting there.’
Siri is pretty close… in certain well understood contexts, as Gruber shows in a video demo of the evolving iPhone application.
What is a Virtual Personal Assistant? It does things for you; it’s task-oriented. It understands your intent via a conversational metaphor. It gets to know you; it’s not the same for everybody, unlike a search engine.
‘Service delegation [like Siri]; the mother of all mashups’
‘Context is king’ in communicating with a VPA; where am I, what time is it, who am I, etc.
“This really is the beginning of the age of the start of Virtual Assistants.”
Need to solve authorisation/authentication. If we reach a ‘data commons’ there will be more, better, information to drive choices and decisions.
Tom Tague is a regular member of the Semantic Web Gang podcast, which I moderate. Tom Gruber was the latest guest in my Executive Briefing podcast series.
May 29th, 2009
Bing is not alone; similar techniques alive and well in existing vertical search
Microsoft’s Bing is attracting plenty of interest today, and perhaps deservedly so as it brings some interesting fresh ideas to the world of generic search engines. Whether it is sufficiently compelling to break our deeply ingrained association of ‘search’ with ‘Google’ remains to be seen.
It should be remembered, of course, that broadly similar approaches are already taken to managing and navigating data inside the data centres of large corporations where Autonomy, FAST, Endeca and their peers provide powerful capabilities.
I recorded a podcast with Endeca Chief Scientist Daniel Tunkelang in January and, by chance, spoke with Robin Johnson yesterday. Robin is CEO of FT Search, part of the Financial Times Group, and responsible for a new vertical search tool called Newssift. Newssift combines components from various technology companies (including Endeca, Nstein, Lexalytics and ReelTwo) to offer a useful means of learning more about businesses and the external factors affecting them.
April 20th, 2009
Can semantic technologies help brands profit from social media?
In my latest podcast interview with those shaping our evolving engagement with Semantic Technologies, I speak with Eric Hillerbrand.
Drawing upon years of experience in the development and deployment of Semantic Web solutions, Eric has spent the past few years considering the ways in which semantic technologies could bring structure and value to the increasingly visible online conversations around products and brands.
Have a listen, and share your views on the ways in which this might impact your brand, or your interaction with those of others.
April 13th, 2009
True Knowledge API lies at the heart of real business model
Semantically powered question answering start-up True Knowledge today made its Semantic Search API available for public consumption, taking the next step on the company’s journey out of beta and providing a clear steer as to the way in which they intend to generate revenue.
As the company’s press release notes,
“True Knowledge offers two distinct API services for developers: the ‘Direct Answer API’ and the ‘Query API.’ The Direct Answer API allows developers to leverage True Knowledge’s natural language question answering technology, giving any search site or application the ability to provide a single direct answer for questions asked on any subject in plain English. This is especially well suited to mobile applications where providing a lengthy list of search results may be impractical.
The Query API allows developers to bypass True Knowledge’s natural language translation system and directly query True Knowledge’s knowledge base using a simple query language. This allows automated systems such as web and mobile applications to tap into True Knowledge’s vast machine-understandable knowledge of the world, making them behave more intelligently.”
The company was founded in August 2005 and is based in the British city of Cambridge, at the heart of ‘Silicon Fen.’ A $4m Series A investment round was closed in July 2008, led by Octopus Ventures.
I spoke with CEO William Tunstall-Pedoe ahead of today’s announcement to see how the core knowledge base continues to improve, and to discuss the company’s plans.
For those who haven’t tried it, True Knowledge offers an interesting slant on attempting to answer your question rather than simply return hundreds or thousands of documents that might contain the answer as traditional search engines tend to. Tunstall-Pedoe quoted Google’s Larry Page during our conversation, noting that Page has asserted that
“the perfect search engine would understand exactly what you mean and give back exactly what you want.”
It is this that True Knowledge attempts with their ‘Internet Answer Engine,’ and core to their solution are a comprehensive (137 million facts, and growing) knowledge base, a proprietary system for understanding a query and a powerful inference capability that enables the system to answer questions more reliably. Part of that reliability, as Tunstall-Pedoe frequently stresses, lies in the system’s ability to know when it doesn’t know the answer. Along with a success rate of less than 50% for providing answers to questions, this may seem little more than an academic curiosity, but an ability to reliably know when to fall back to less structured approaches (such as passing the query to Google) is far better than ‘guessing’ or delivering wholly inappropriate responses… especially once the Answer Engine’s capabilities are embedded in some third party site.
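The fall-back behaviour described here can be sketched in a few lines. This is only an illustration of the pattern, not True Knowledge’s implementation or API; the function names (`answer_or_fall_back`, `toy_engine`, `toy_search`) and the toy lookup table are hypothetical stand-ins invented for the example.

```python
from typing import Callable, List, Optional


def answer_or_fall_back(
    question: str,
    answer_engine: Callable[[str], Optional[str]],
    fallback_search: Callable[[str], List[str]],
) -> dict:
    """Return a direct answer when the engine has one, else search results."""
    answer = answer_engine(question)
    if answer is not None:
        return {"kind": "answer", "value": answer}
    # The engine has declined to guess, so hand the query to ordinary search.
    return {"kind": "results", "value": fallback_search(question)}


# Hypothetical stand-ins for an answer engine and a conventional search engine.
def toy_engine(question: str) -> Optional[str]:
    known = {"what is the capital of france?": "Paris"}
    return known.get(question.strip().lower())


def toy_search(question: str) -> List[str]:
    return [f"search results for: {question}"]


print(answer_or_fall_back("What is the capital of France?", toy_engine, toy_search))
print(answer_or_fall_back("Who will win the next election?", toy_engine, toy_search))
```

The important design choice is that the engine returns nothing at all rather than a low-confidence guess, which leaves the embedding site free to decide what to show instead.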
True Knowledge’s process of inference also allows the system to cope with ambiguity, and even with contradictory ‘facts.’ During our conversation, we told the system that President Obama was born in Cambridge. It allowed us to make this assertion, but subsequent analysis of the overwhelmingly contradictory data drawn from elsewhere in the knowledge base means that it was deemed to be untrue and flagged as such.

A different query, in which I ask ‘How far is San Jose from SFO?,’ shows both how the system copes with ambiguity and the manner in which supporting facts are drawn from sites such as Metaweb’s Freebase.
The current True Knowledge home page is not going to draw huge numbers of users away from their search engine of choice, but that isn’t really the point. As Tunstall-Pedoe pointed out, the site is intended to showcase the company’s capabilities and facilitate the addition of new knowledge (as well as the millions of facts drawn from Wikipedia, Freebase and a growing body of licensed commercial content, over 120,000 facts have already been added by individuals in the beta programme). The real utility of True Knowledge will lie in licensing the underlying system for use in vertical and horizontal third party applications, and public availability of the True Knowledge API begins that process. There’s a long way to go in further extending the knowledge base, suggesting that vertical search applications may be the first to sign up; it’s much easier to approach comprehensiveness within a bounded domain than across all areas of knowledge.
The market for semantically enhanced search is growing crowded, and stalwarts of the search industry have been hard at work too, with Google and others getting increasingly good at returning actual answers to factual questions.
Tunstall-Pedoe used a slide to demonstrate the differentiation the company sees between itself and ‘obvious’ competitors such as Wikipedia, Freebase, and the hotly anticipated Wolfram Alpha. Key differentiators in the diagram included True Knowledge’s ability to infer (something Wolfram Alpha also claims), its language independence (although currently only available in English, the concept extraction techniques used by True Knowledge should work equally well in other languages), and the system’s reliance upon an internal ontology comprising 20,000 classes (plus biological species, product information, etc). True Knowledge (unsurprisingly) scored far better than the competition, but in a market that also includes the likes of Hakia and Powerset (neither of which could usefully answer my question about San Jose and SFO) the true picture is a lot more complex.
True Knowledge is certainly interesting, and frequently impressive. It remains to be seen whether a Platform proposition will set them firmly on the road to riches, or if they’ll end up finding more success following the same route as Powerset and getting acquired by an existing (enterprise?) search provider.
January 14th, 2009
Thomson Reuters bets on Content remaining King with Calais 4.0
Global information behemoth Thomson Reuters today announces the latest version of its Calais web service, delivering on earlier promises with respect to ‘Linked Data’ and firmly staking out the company’s intention to be a significant player in the shifting market for timely and authoritative information.
I’ll take a more in-depth look at the importance of authoritative sources in the emerging Linked Data ecosystem in this related post, and concentrate on the specifics of the Calais 4.0 release here.
Thomson Reuters’ Tom Tague describes version 4.0 as
“a fundamental change to the underlying service; it’s basically a new service”
This re-engineering of Calais will deliver the functionality that users have come to rely upon, whilst ensuring Thomson Reuters’ ability to continue to scale in a timely and cost-effective manner on the back of Amazon’s Web Services offering.
Tague describes the service released today as a technology preview to run alongside the existing Calais service for a period, but he is confident that it is at production strength from Day 1. Developers, Tague suggested, would
“try it and stay.”
In addition to this strengthening of the core offering, Calais 4.0 includes five substantive developments.
First, the company has followed through on earlier talk about ‘Linked Data,’ ensuring that any of around 25 entity types (company names, geographic areas, album titles, etc) discovered in content submitted to Calais will now be returned to the submitter with a ‘dereferenceable URI’ that may be followed by either people or software in order to discover further information. The URI resolves to a Calais-hosted page of RDF with pointers to the Linked Data community’s usual suspects; DBpedia, MusicBrainz, GeoNames, the CIA Factbook, etc.
More unusually, and importantly, the second development sees the document include pointers to Thomson Reuters’ own content such as the (current) stock ticker, Board membership data, etc.
As the Press Release notes,
“In keeping with its commitment to the Linked Data standard, Thomson Reuters has also made a subset of its core data assets available for public use on the Web. The collection of business information represents the first contribution to the ‘Linked Data cloud’ made by a major publisher. It enables developers to programmatically query and use fundamental facts on hundreds of thousands of publically-traded companies, including company descriptions, stock tickers, management teams, locations, boards of directors and more.”
Thirdly, Calais 4.0 includes a ‘metadata transport layer’ to simplify the process of exposing and sharing large bodies of semantically rich data. Tague suggested that 200-300 million persistent and dereferenceable URIs are available today (and capable of servicing tens or hundreds of millions of hits per day) for content previously submitted to Calais, with many more to come as the service scales.
Fourth, Calais is making its first move beyond English language content, and version 4.0 now supports entity extraction in French. French-language relationship and event extraction will follow shortly, as will other languages. Tague suggested that Hebrew, Arabic and Chinese will be amongst those rolled out during 2009. Behind the scenes, the team are also experimenting with automated translation services, which Tague reports to be ‘working very well’ in the lab.
Fifth, and finally, the Calais team is publishing an RDFS version of their schema, giving developers far more flexibility as to the ways in which they integrate the Calais web service into their own applications.
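To make the first of those developments a little more concrete, here is a rough sketch of what software might do once it has followed a dereferenceable URI and received a page of RDF. The sample document, the example.org URIs and the use of `owl:sameAs` to express the pointers are all assumptions made for illustration; the actual shape of the Calais RDF is not described above and may well differ.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL_NS = "http://www.w3.org/2002/07/owl#"

# Invented sample; not real Calais output.
SAMPLE_RDF = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <rdf:Description rdf:about="http://example.org/entity/paris">
    <owl:sameAs rdf:resource="http://example.org/dbpedia/Paris"/>
    <owl:sameAs rdf:resource="http://example.org/geonames/paris"/>
  </rdf:Description>
</rdf:RDF>
"""


def same_as_links(rdf_xml: str) -> dict:
    """Map each described entity URI to the URIs it is declared equivalent to."""
    root = ET.fromstring(rdf_xml)
    links = {}
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        targets = [
            element.get(f"{{{RDF_NS}}}resource")
            for element in description.findall(f"{{{OWL_NS}}}sameAs")
        ]
        links[subject] = targets
    return links


print(same_as_links(SAMPLE_RDF))
```

A real application would more likely reach for a dedicated RDF library than the standard XML parser, but the principle is the same; each pointer is simply another URI that can be followed in turn.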
All in all, a welcome set of incremental improvements to Calais that also serves to raise an interesting set of questions about the role of ‘professional’ data in the Linked Data ecosystem.
Thomson Reuters’ Tom Tague is a regular member of the Semantic Web Gang, and should be discussing the release of Calais 4.0 in more depth on this month’s show, due to be recorded on 15 January.
December 16th, 2008
InQuira speaks in tongues with new release
San Bruno (CA) headquartered InQuira Inc. has unveiled the latest version of their InQuira product, and is making much of its multilingual capabilities.
I spoke with Chris Hall and Peter Tebbenhoff ahead of the announcement, to learn a little more about the company and its solutions. Hall is VP, Product Marketing, and Tebbenhoff the company’s Senior Director for Product Management. Both are relatively new hires and operate from the US eastern seaboard, remote from the company’s main offices in San Bruno and Los Angeles (California), Orlando (Florida) and Shanghai (China). InQuira employs approximately 125 staff at those four offices, servicing more than 80 companies and reporting some $30 million in revenue from software sales.
Hall and Tebbenhoff stressed InQuira’s roots in Natural Language Search, and talked about the way in which they (like Amplify, which I covered recently) focus upon analysing text in order to extract ‘metadata around the searcher’s intent.’ This deep analysis of the way in which searchers submit queries, and the relationship between what they want and the words they use to describe it allows InQuira to surface content from across a wide range of resources. The company does not produce CRM software of its own, but partners with solutions such as Oracle’s Siebel to enhance the capabilities of that product.
InQuira is used in a variety of contexts, including customer-facing web self service, within corporate call centres, and in powering internal knowledge bases, although Tebbenhoff suggested that
“web self-service is our sweet spot”
Hall echoed sentiments recently outlined by Burt Helm in a BusinessWeek article about Pittsburgh-based PNC, suggesting that
“Generation Y likes online self-service”
He went on to say that large corporations are seeing effective self-service as the next big opportunity to cut costs, following the last round of savings that led to the rise of their call centres. Effective self-service is in its infancy, though, and 50-60% of InQuira revenue still comes from call centre applications, with c.30% from self-service and the remainder from internal knowledge base and help desk sales.
In an interesting change of direction, I was told the story of a ‘major UK bank’ that is closing their impersonal and much-maligned call centres in an effort to save money. Instead, the bank is seeking to redirect customers back to the branches, and InQuira is being installed in the branches to provide staff with ready access to information that will enable them to support customers. The tool will also encourage branch staff to add content of their own.
Oh, the irony. So is my bank branch going to start opening longer hours, too?
Given the rise of non-English markets outside the US (and the increasing importance of Spanish in certain States), there is an increasing need to provide multilingual support, and it is this that InQuira 8.1 seeks to address.
The previous focus on understanding intention brings value here, making it easier to take the next step and begin to extract meaning from queries and documents in different languages. As the press release describes;
“According to ByteLevel Research, in 2003, very few global websites supported more than ten languages. In 2008, the average number of languages supported by all 225 sites reviewed in their Global Web Report Card is twenty, up from eighteen last year. Although most global companies have contact centers all around the world, the bulk of the knowledge content tends to be written in one primary language, and then translated to other languages, making content disparate and out-of-sync from region to region. InQuira’s cross-lingual search and retrieval enables consumers and agents to return context-relevant results regardless of either their native language or the content’s original language. For example, a query would first find intent-driven content in the same language as the search, then search automatically in English, and if specified, any or all of the supported languages. The cross-lingual features extend to authoring and translation support, which empowers companies to distribute the workload for authoring solutions by encouraging the authoring by frontline customer service agents, in their native language, at the moment of customer interaction.”
Sue Feldman, IDC’s VP for search and discovery technologies, is quoted as noting that;
“The ability to understand a question, and then find the answer in any language is an important requirement for any customer support site or intranet. Customers and service representatives operate today in a multilingual environment. So do global enterprises. Single language applications have hamstrung them, and waiting for translated materials to catch up with the original information is not a good option. In a flat world, cross-language capabilities are critical to working with multilingual users.”
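The search order described in the press release (the language of the query first, then English, then any further languages the searcher asks for) can be sketched in a few lines of Python. This is only an illustration of that ordering, not InQuira's implementation; the function name `search_order`, its parameters and the two-letter language codes are all assumptions made for the example.

```python
from typing import Iterable, List, Optional


def search_order(query_lang: str,
                 supported: Iterable[str],
                 extra: Optional[Iterable[str]] = None,
                 primary: str = "en") -> List[str]:
    """Return the languages to search, in order, without duplicates.

    Hypothetical helper: the order follows the press release's description
    (query language, then English, then any requested languages).
    """
    supported_set = set(supported)
    order: List[str] = []
    # 1. language of the query, 2. primary language, 3. requested extras
    for lang in [query_lang, primary, *(extra or [])]:
        if lang in supported_set and lang not in order:
            order.append(lang)
    return order


if __name__ == "__main__":
    # A Spanish query with no extra languages requested
    print(search_order("es", ["en", "es", "fr", "de"]))
```

A real system would then run the intent analysis against the content held in each language in turn; the sketch covers only the order in which those languages are tried.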
December 9th, 2008
Metatomix at work with County Court system in Florida
I ended my last post on Metatomix by noting;
“The demonstrations and the rhetoric are certainly impressive; the true test lies in seeing the extent to which real tasks are made easier and more effective for real people in the real world.”
Today, the company announced the implementation of their Active Warrant Alert System in Lee County, Florida, and I spoke with Sheila Mann, Court Operations Officer with the Twentieth Judicial Circuit of Florida, to hear her impressions.
The company’s Active Warrant Alert System
“[identifies] defendants appearing before the court who have open local, Florida, out-of-state or federal warrants, enabling court and law enforcement personnel to take appropriate action. Within the first seven days of implementation in Lee County, the Metatomix system identified 141 warrants that led to 16 arrests.”
I also understand that the system has been implemented here specifically because of a tragic case in which a police officer was allegedly shot and killed by a fugitive with an unknown warrant.
“The Active Warrant Alert System is an extension of the Metatomix technology which also powers the Judicial Inquiry System (JIS) and First Appearance Calendar, statewide solutions that query and deliver correlated results from 13 state and national data sources empowering criminal justice personnel with real-time information.”
“Lee County, with a population of over 400,000, is the first county in Florida to integrate the Active Warrant Alert System, which flags warrants for any subject appearing for court events such as arraignment, pretrial, bond, motion, case management or traffic hearings. The technology provides court officials with access to warrant information from sheriff’s departments and attorney’s offices across local, state and national information systems. Previously, bailiffs were required to log in to numerous computer systems to acquire information on a subject’s criminal history as well as any local, Florida, out-of-state or federal warrants.”
Metatomix clearly have a satisfied customer in Sheila Mann, who told me that the system takes minutes each day to carry out local, State and Federal checks on all those expected to appear in court on a given day. Mann suggested that local checks alone used to occupy two Court staff every day. The system also has the intelligence to avoid issuing ‘Failure to Appear’ notices for witnesses and suspects known by other parts of the judicial system to be in custody, and therefore unable to appear.
The system was in trial during October and November, and went live across the County on 17 November. Mann reports that
“we haven’t had any issues,”
and spoke highly of the partnership between Court and Sheriff’s Department officials and the team from Metatomix.
Finally, Mann noted that neighbouring Collier County is looking to implement the same system, and she looks forward to seeing additional benefits from further exchange of data.
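The two checks Mann described, flagging open warrants for everyone on the day's docket and suppressing ‘Failure to Appear’ notices for people already in custody, can be illustrated with a minimal Python sketch. It is not Metatomix's implementation. It assumes, unrealistically, that every source identifies a person by the same `subject_id`, and the `Warrant` class and both function names are invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Warrant:
    subject_id: str  # assumed shared identifier across all sources
    source: str      # e.g. "local", "state", "out-of-state", "federal"


def flag_docket(docket: List[str],
                warrants: List[Warrant]) -> Dict[str, List[Warrant]]:
    """Map each subject on today's docket to any open warrants found."""
    by_subject: Dict[str, List[Warrant]] = {}
    for warrant in warrants:
        by_subject.setdefault(warrant.subject_id, []).append(warrant)
    # Only subjects who are both on the docket and have a warrant are flagged
    return {s: by_subject[s] for s in docket if s in by_subject}


def failure_to_appear(docket: List[str],
                      appeared: Set[str],
                      in_custody: Set[str]) -> List[str]:
    """Subjects who missed court, excluding those known to be in custody."""
    return [s for s in docket
            if s not in appeared and s not in in_custody]
```

In practice the hard part is the step the sketch assumes away: deciding that records held in different local, state and federal systems refer to the same person.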
December 9th, 2008
Zemanta talks Linked Data with SDK and commercial API
I covered Slovene semantic technology startup Zemanta back in September when they secured investment from New York City’s Union Square Ventures, and the company also received frequent mentions in the Semantic Web Gang’s recent look back over 2008.
Yesterday, the company released an update to their popular WordPress plug-in and today they announced [PDF] commercial availability of their ‘Semantic API.’
The company describes the API as follows;
“We analyze your post through our proprietary natural language processing and semantic algorithms, and statistically compare its contextual framework to our preindexed database of content.
We are using a combination of machine learning techniques and end-user input from our widget users, that enables us to train the engine and constantly improve the recommendations.”
Users familiar with the blog plug-in will recognise - and probably value - these capabilities, which the API makes available for use in other situations.
Superficially, there are clear similarities with the capabilities of services such as Thomson Reuters’ Open Calais, which also permits third parties to pass data via an API and receive structured and enriched results in return. A news article discussing a merger, for example, might be returned marked up with structured information on the companies involved, their key personnel, etc.
Given the backgrounds of Zemanta and Thomson Reuters, and the different data sets upon which they draw, it’s likely that a quite clear distinction will emerge in the use cases for which each is appropriate. It appears likely that Zemanta is more suited to the informal Web (pulling content from IMDb, Twitter and the like) whilst Calais will excel in mission-critical applications at the Fortune 500 and their ilk. Both add value in the mid-range, and only time will tell which is preferred moving forward.
Interestingly, both are making moves to embrace the Semantic Web’s Linking Open Data movement, which I’ve covered frequently here. Calais made announcements in that direction back in September, and an upcoming release of their service will make good on that. Zemanta’s press release today states;
“Zemanta fully supports the Linking Open Data initiative. It is the first API that returns disambiguated entities linked to dbPedia, Freebase, MusicBrainz, and Semantic Crunchbase. The data can be returned in the standard format of Semantic web – RDF. It is an ideal gateway from unstructured web to semantic web. This represents a major step ahead for efforts to connect the Web into a semantic web of objects.”
Zemanta CTO, Andraž Tori, commented;
“I see it as a stargate portal from unstructured content into the world of Semantic Web.”
Zemanta has already signed up a number of partners, and one of those is Freebase. Jamie Taylor (who recorded an early podcast about Freebase here) commented on the way that end users might benefit from accessing Freebase data via Zemanta;
“For publishers, the Zemanta API acts as a front door to the universe of open data on the web, facilitating the jump from unstructured text to semantic entities. You can take plain text, use the Zemanta API to resolve that text into strongly identified entities, and then query Freebase for detailed information about the mentioned people, places, movies, etc. Truly empowering.”
Use of the API is free for up to 10,000 API calls per month, with a subscription fee above that level.
Paul Miller provides consultancy and analysis services at the interface between the worlds of Cloud Computing and the Semantic Web. See his full profile and disclosure of his industry affiliations.