Category: Talking Semantics
August 10th, 2009
Moving Data.gov towards the Semantic Web
Government transparency in all its forms would appear to be very much in vogue at present, spanning everything from the Obama administration’s Data.gov portal and Prime Ministerial pronouncements in the UK Parliament to municipal proclamations of openness in Vancouver and compelling grass-roots demonstrations by activists and even newspapers.
At the heart of many of today’s initiatives lie programmes to surface Government data for use and re-use by third parties. The ‘open’ in ‘Open Data’ is, of course, a very loaded term, and I’ve looked before at some of the ways in which data might become ‘open’ whilst remaining effectively useless. Nevertheless, Governments’ current enthusiasm for being seen to embrace transparency should certainly be both welcomed and encouraged, and there are real opportunities to work with Government in ensuring that today’s transparency fervour is not diminished, whether by omission or commission.
Given the complex and varied nature of the data involved, and the obvious linkages between the entities (you and I, our communities, our schools, our hospitals) described in numerous different databases, there’s a clear opportunity for technologies and approaches from the Semantic Web community to play a significant role in simplifying the whole process of moving these legacy databases online.
Already interested in Open Government from previous roles, and (obviously!) committed to encouraging real-world adoption of semantic technologies, I’ve spent some time recently talking to a number of those involved. A number of those conversations are now available as podcasts, and I’ll continue to seek out fresh examples and perspectives to share.
My most recent podcast conversation, released today, is with Professor Jim Hendler and Dr Li Ding of the Tetherless World Constellation at Rensselaer Polytechnic Institute in Troy, NY. The team at Rensselaer have been working with some of the US Federal Government’s data sets on Data.gov, and so far they’ve converted sixteen data sets from their original form, resulting in 2,927,398,352 freely available RDF triples and a number of demonstration applications.
Other conversations already released in the series include:
- David Eaves, talking about Vancouver’s commitment to Open Data
- John Sheridan, Head of e-Services at the UK Government’s Office of Public Sector Information, talking about his Department’s efforts to get Government data online
- Mark Birbeck, talking about work with the UK Government’s Central Office of Information to embed lightweight RDFa into workflows and web pages
Each offers an example of ways in which ‘open data’ contributes to Government transparency, or to increasing the value of the massive sunk investment in collecting, managing and curating the data upon which Governments depend. The Semantic Web’s notion of Linked Data (whether actually in RDF or not!) offers a means to increase the utility of the data we have, without a massive programme of reengineering the systems used to manage it. The examples we see today, and the work of the individuals and teams with whom I have been speaking, will teach us a lot about how to make this work at Government scale.
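To give a rough sense of what converting a tabular data set into RDF triples involves, here is a minimal Python sketch. It is not the Rensselaer team's method: the `http://example.org/dataset/` namespace, the sample records and the function names are all invented for illustration, and the sketch assumes column names that are already safe to use inside a URI.

```python
import csv
import io

# Hypothetical namespace for illustration only; not a real Data.gov or RPI URI.
BASE = "http://example.org/dataset/"


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def rows_to_ntriples(csv_text, id_column):
    """Turn each non-empty CSV cell into one triple: record, column, value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        subject = f"<{BASE}record/{row[id_column]}>"
        for column, value in row.items():
            # The id column names the subject, so it is not repeated as a property.
            if column == id_column or value == "":
                continue
            predicate = f"<{BASE}property/{column}>"
            lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return lines


# Invented sample records, standing in for a legacy database export.
SAMPLE = "id,name,state\n1,Example School,NY\n2,Example Hospital,CA\n"

for line in rows_to_ntriples(SAMPLE, "id"):
    print(line)
```

Each record becomes a subject with its own URI, which is what lets a second data set refer to the same school or hospital later on without either source system being rebuilt.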
July 14th, 2009
New open source Semantic Web store from Garlik capable of enterprise scale
An oft-repeated concern in discussing large-scale deployment of Semantic Web ideas is that of ‘scale.’ With many of the better known data stores upon which the Semantic Web depends capable of storing only tens or at best a few hundreds of millions of RDF triples, it can be difficult to argue that the technology is fit for real-world deployment at scale.
There are, of course, different ways of managing data, and it’s not always necessary to store everything in one massive store… but for those concerned about scale today’s news from UK-based Garlik may well put their minds at rest.
The company has taken their internally developed (and massively scalable) RDF triple store and released it to the world under an Open Source license as 4store.
I spoke with the company’s CEO and Head of Architecture just ahead of the launch, to learn more about the system and their motivation behind sharing it.
May 29th, 2009
Bing is not alone; similar techniques alive and well in existing vertical search
Microsoft’s Bing is attracting plenty of interest today, and perhaps deservedly so as it brings some interesting fresh ideas to the world of generic search engines. Whether it is sufficiently compelling to break our deeply ingrained association of ‘search’ with ‘Google’ remains to be seen.
It should be remembered, of course, that broadly similar approaches are already taken to managing and navigating data inside the data centres of large corporations where Autonomy, FAST, Endeca and their peers provide powerful capabilities.
I recorded a podcast with Endeca Chief Scientist Daniel Tunkelang in January and, by chance, spoke with Robin Johnson yesterday. Robin is CEO of FT Search, part of the Financial Times Group, and responsible for a new vertical search tool called Newssift. Newssift combines components from various technology companies (including Endeca, Nstein, Lexalytics and ReelTwo) to offer a useful means of learning more about businesses and the external factors affecting them.
April 20th, 2009
Can semantic technologies help brands profit from social media?
In my latest podcast interview with those shaping our evolving engagement with Semantic Technologies, I speak with Eric Hillerbrand.
Drawing upon years of experience in the development and deployment of Semantic Web solutions, Eric has spent the past few years considering the ways in which semantic technologies could bring structure and value to the increasingly visible online conversations around products and brands.
Have a listen, and share your views on the ways in which this might impact your brand, or your interaction with those of others.
April 15th, 2009
Leigh Dodds talks about Talis Connected Commons
I wrote about Talis’ Connected Commons last month, and today spent some time talking with the company’s Platform Programme Manager, Leigh Dodds.
The conversation has just been released as a podcast which looks at the rationale behind the company’s offer and the specific licensing choices that beneficiaries are asked to make.
Have a listen, and see if the Connected Commons might help your next project.
Disclaimer: Talis is my former employer.
April 8th, 2009
Ivan Herman discusses Semantic Web activity at the World Wide Web Consortium
Ivan Herman is Semantic Web Activity Lead at the World Wide Web Consortium (W3C), and in this podcast he talks about a range of current activities across the Semantic Web community.
December 8th, 2008
Mark Greaves of Vulcan sees business opportunities in the Semantic Web
Vulcan shares many traits with its reclusive founder, Paul Allen, yet behind the scenes the company is responsible for philanthropic support to research and community-building activities, as well as investing commercially in the likes of Radar Networks (the company behind Twine) and Evri.
Last week, I had the opportunity to talk with Mark Greaves, Vulcan’s Director of Knowledge Systems Research, and the resulting podcast was released earlier today.
Drawing upon a background that includes the likes of Boeing and DARPA, Greaves is persuaded of the benefits to be found in applying semantic technologies to existing business problems and processes.
Greaves identifies four broad areas ripe for development:
- Search
- Enterprise Information
- Social Semantic Web Applications
- Web-scale Knowledge Publishing
It will be interesting to see the extent to which Vulcan - and others - invest in these areas next year.
July 31st, 2008
Crunchbase meets the Semantic Web
Technology web site TechCrunch is one of those staples (like ZDNet, of course) to which we all turn for news and analysis on the companies shaping the Web. Their CrunchBase directory provides a wealth of information on the companies and people featured in their stories (and elsewhere, as it’s editable by anyone), and they recently took the step of opening up an API to the data.
Amongst those taking advantage of the API is Semantic Web developer, Benjamin Nowack.
As he reports on his blog, Benji has created Semantic Crunchbase, an expression of the Crunchbase content as that ‘Linked Data’ about which Sir Tim Berners-Lee and others are currently so passionate. Remember,
“Linked Open Data is the Web done as it should be.”
Benji is continuing to add features to his demonstration, and will be blogging some of them (including the intriguing-sounding ‘Pimp my API’) in future posts to his blog.
“Imagine [writes Nowack] you are looking for a job in California at a company that is at a specific funding stage. CrunchBase knows everything about companies, investments, and has structured location data. CrunchBoard on the other hand has job descriptions, but only a single field for City and State, and not the filter options to match our needs.”
And then stop imagining, and just run the query.
“This is where Linked Data shines. If we find a way to link from CrunchBoard to CrunchBase, we can use Semantic Web technology to run queries that include both sources. And with SPARQLScript, we can construct and leverage these links. Below is a script that first loads the CrunchBoard feed of current job offers (only the last 15 entries, due to common RSS’ limitations/practices, the use of e.g. hAtom could allow more data to be pulled in). In a second step, it uses the company name to establish a pattern join between CrunchBoard and CrunchBase, which then allows us to retrieve the list of matching jobs at stage-A companies with offices in California.”
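Nowack's SPARQLScript is not reproduced here. As a rough stand-in, the Python sketch below shows the join he describes, matching job listings to company records on the company name and then filtering by funding stage and office location. The records, field names and the `matching_jobs` function are all invented for illustration and do not reflect the actual CrunchBase or CrunchBoard data formats.

```python
# Invented sample data; the field names are assumptions, not the real API schema.
companies = [
    {"name": "Acme Widgets", "funding_stage": "a", "office_states": ["CA"]},
    {"name": "Beta Labs", "funding_stage": "b", "office_states": ["CA", "NY"]},
    {"name": "Gamma Soft", "funding_stage": "a", "office_states": ["TX"]},
]

jobs = [
    {"title": "Engineer", "company": "Acme Widgets"},
    {"title": "Designer", "company": "Beta Labs"},
    {"title": "Analyst", "company": "Gamma Soft"},
    {"title": "Developer", "company": "acme widgets"},
]


def matching_jobs(jobs, companies, stage, state):
    """Join jobs to companies on a normalised name, then filter the matches."""
    # Index companies by lower-cased name so the join tolerates case differences.
    by_name = {c["name"].strip().lower(): c for c in companies}
    result = []
    for job in jobs:
        company = by_name.get(job["company"].strip().lower())
        if company is None:
            continue  # no company record to link this job to
        if company["funding_stage"] == stage and state in company["office_states"]:
            result.append((job["title"], company["name"]))
    return result


print(matching_jobs(jobs, companies, "a", "CA"))
# prints [('Engineer', 'Acme Widgets'), ('Developer', 'Acme Widgets')]
```

Joining on a free-text company name is fragile, which is part of the case for Linked Data: once both sources refer to a company by the same URI, the link no longer depends on string matching.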
For more information on Benji, listen to a podcast interview he did with my colleague Danny Ayers earlier this year.
May 8th, 2008
Peter Mika offers bananas at Yahoo! Research
Yahoo! are certainly being a lot more open than competitors such as Google and Microsoft when it comes to talking about their use of semantic technologies. They’ve been active for several years in recruiting stalwarts of the Semantic Web community such as Dave Beckett, and there is a long tradition of the company’s employees actively contributing to the research side of the Semantic Web world.
More recently, at least part of that sometimes-esoteric research has begun to make the transition toward Yahoo!’s consumer-facing properties. FireEagle and, most recently, SearchMonkey are obvious examples of this transition. SearchMonkey, for example, has real potential to compellingly demonstrate the case for the Semantic Web and is likely to drive a scramble toward structured markup within the SEO sector.
It’s hard to believe that Microsoft and Google are not also actively engaged in this area, although their reticence in speaking about it creates a perfect opportunity for Yahoo! to make the most of the attention… for now.
SearchMonkey came out of hiding late last month, when Yahoo! CTO Ari Balogh introduced it to attendees at the Web 2.0 Expo in San Francisco. There’s a Developer event in Sunnyvale next week, and Yahoo! looks likely to be pretty visible at the Semantic Technology Conference in San Jose, 18-22 May.
It was in the context of the Semantic Technology Conference that I found myself in conversation with Peter Mika of Yahoo! Research earlier today. We talked about the potential for SearchMonkey before considering some of the issues posed by moving Semantic Web specifications such as RDF out of the relatively well behaved academic sphere and onto the open Web where honesty is not everyone’s priority. Peter is speaking at Semantic Technology, a few days after Yahoo!’s own SearchMonkey Developer event. I look forward to seeing how much more Yahoo! shares on those occasions. Peter did suggest that public access to SearchMonkey will be ‘sooner than [we] think.’ Next week, maybe?
As Peter enjoins listeners at the end of our conversation, ‘Follow the Monkey!’ It will be interesting to see where it leads. I will also be intrigued to see whether Google and Microsoft quietly join the followers… or are found hiding behind a tree waiting for us when we get where we’re going.
March 20th, 2008
Jim Hendler shares AI's lessons for the Semantic Web
Professor James A. Hendler goes by the daunting title of ‘Tetherless World Senior Constellation Professor’ at Rensselaer Polytechnic Institute (RPI) in Troy, New York. Behind the title stands a man who has been closely involved with Artificial Intelligence (AI) research for many years, and someone recognised as amongst the progenitors of the Semantic Web ideal. Hendler is also Associate Director of the Web Science Research Initiative (WSRI), an activity that is being pushed hard by Sir Tim Berners-Lee (a Director) and others.
I spoke to Jim recently, and in a wide-ranging conversation we touched upon early hype around the promise of Artificial Intelligence, conflicting aspirations for the Semantic Web…
Paul Miller provides consultancy and analysis services at the interface between the worlds of Cloud Computing and the Semantic Web. See his full profile and disclosure of his industry affiliations.