Category: Talking Semantics

August 10th, 2009

Moving Data.gov towards the Semantic Web

Posted by Paul Miller @ 3:46 am

Categories: Open Data, Podcasts, Research, Semantic Web, Semantic Web People, Standards, Talking Semantics, Web 3.0

Tags: Government, Semantic Web, Podcasts, RDF, XML, Internet, Software/Web Development, Web Development, Paul Miller

Government transparency in all its forms would appear to be very much in vogue at present, spanning everything from the Obama administration’s Data.gov portal and Prime Ministerial pronouncements in the UK Parliament to municipal proclamations of openness in Vancouver and compelling grass-roots demonstrations by activists and even newspapers.

At the heart of many of today’s initiatives lie programmes to surface Government data for use and re-use by third parties. The ‘open’ in ‘Open Data’ is, of course, a very loaded term, and I’ve looked before at some of the ways in which data might become ‘open’ whilst remaining effectively useless. Nevertheless, Governments’ current enthusiasm for being seen to embrace transparency should certainly be both welcomed and encouraged, and there are real opportunities to work with Government in ensuring that today’s transparency fervour is not diminished, whether by omission or commission.

Given the complex and varied nature of the data involved, and the obvious linkages between the entities (you and I, our communities, our schools, our hospitals) described in numerous different databases, there’s a clear opportunity for technologies and approaches from the Semantic Web community to play a significant role in simplifying the whole process of moving these legacy databases online.

Already interested in Open Government from previous roles, and (obviously!) committed to encouraging real-world adoption of semantic technologies, I’ve spent some time recently talking to a number of those involved. A number of those conversations are now available as podcasts, and I’ll continue to seek out fresh examples and perspectives to share.

My most recent podcast conversation, released today, is with Professor Jim Hendler and Dr Li Ding of the Tetherless World Constellation at Rensselaer Polytechnic Institute in Troy, NY. The team at Rensselaer have been working with some of the US Federal Government’s data sets on Data.gov, and so far they’ve converted sixteen data sets from their original form, resulting in 2,927,398,352 freely available RDF triples and a number of demonstration applications.
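
As a rough illustration of what converting a tabular data set to RDF involves, here is a minimal Python sketch that turns one record into N-Triples lines, one triple per non-empty column. It is not the Rensselaer team's pipeline: the function names, the example record and the `example.org` URIs are all assumptions made for illustration.

```python
def escape_literal(value: str) -> str:
    """Escape a string for use inside a double-quoted N-Triples literal."""
    return (
        value.replace("\\", "\\\\")  # backslash first, so later escapes survive
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def row_to_ntriples(subject: str, row: dict, vocab: str) -> list:
    """Turn one tabular record into N-Triples lines, one per non-empty column."""
    lines = []
    for column, value in row.items():
        if value is None or value == "":
            continue  # skip empty cells rather than emit empty literals
        # Derive a predicate URI from the column heading (a hypothetical convention).
        predicate = vocab + column.strip().lower().replace(" ", "_")
        lines.append(f'<{subject}> <{predicate}> "{escape_literal(str(value))}" .')
    return lines


# Hypothetical record and URIs, for illustration only.
row = {"Agency": "EPA", "Year": 2008, "Notes": ""}
triples = row_to_ntriples(
    "http://example.org/dataset/1/row/1", row, "http://example.org/vocab/"
)
for line in triples:
    print(line)
```

Every value is emitted as a plain string literal here. A real conversion would more likely type numbers and dates and link shared entities by URI, which is where much of the value of Linked Data comes from.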

Other conversations already released in the series include:

  • David Eaves, talking about Vancouver’s commitment to Open Data
  • John Sheridan, Head of e-Services at the UK Government’s Office of Public Sector Information, talking about his Department’s efforts to get Government data online
  • Mark Birbeck, talking about work with the UK Government’s Central Office of Information to embed lightweight RDFa into workflows and web pages

Each offers an example of ways in which ‘open data’ contributes to Government transparency, or to increasing the value of the massive sunk investment in collecting, managing and curating the data upon which Governments depend. The Semantic Web’s notion of Linked Data (whether actually in RDF or not! :-) ) offers a means to increase the utility of the data we have, without a massive programme of reengineering the systems used to manage it. The examples we see today, and the work of the individuals and teams with whom I have been speaking, will teach us a lot about how to make this work at Government scale.

July 14th, 2009

New open source Semantic Web store from Garlik capable of enterprise scale

Posted by Paul Miller @ 5:20 am

Categories: Podcasts, Semantic Web, Semantic Web Companies, Talking Semantics

Tags: Open Source, Semantic Web, RDF, XML, Internet, Software/Web Development, Web Development, Paul Miller

An oft-repeated concern in discussing large-scale deployment of Semantic Web ideas is that of ‘scale.’ With many of the better known data stores upon which the Semantic Web depends capable of storing only tens or at best a few hundreds of millions of RDF triples, it can be difficult to argue that the technology is fit for real-world deployment at scale.

There are, of course, different ways of managing data, and it’s not always necessary to store everything in one massive store… but for those concerned about scale, today’s news from UK-based Garlik may well put their minds at rest.

The company has taken their internally developed (and massively scalable) RDF triple store and released it to the world under an Open Source license as 4store.

I spoke with the company’s CEO and Head of Architecture just ahead of the launch, to learn more about the system and their motivation behind sharing it.

The result has just been released as a podcast.

May 29th, 2009

Bing is not alone; similar techniques alive and well in existing vertical search

Posted by Paul Miller @ 4:10 am

Categories: Commercialisation, Podcasts, Semantic Web Companies, Talking Semantics

Tags: Endeca Technologies Inc., Technique, Robin, Podcasts, Productivity, Search, Internet, Paul Miller

Microsoft’s Bing is attracting plenty of interest today, and perhaps deservedly so as it brings some interesting fresh ideas to the world of generic search engines. Whether it is sufficiently compelling to break our deeply ingrained association of ‘search’ with ‘Google’ remains to be seen.

It should be remembered, of course, that broadly similar approaches are already taken to managing and navigating data inside the data centres of large corporations where Autonomy, FAST, Endeca and their peers provide powerful capabilities.

I recorded a podcast with Endeca Chief Scientist Daniel Tunkelang in January and, by chance, spoke with Robin Johnson yesterday. Robin is CEO of FT Search, part of the Financial Times Group, and responsible for a new vertical search tool called Newssift. Newssift combines components from various technology companies (including Endeca, Nstein, Lexalytics and ReelTwo) to offer a useful means of learning more about businesses and the external factors affecting them.

April 20th, 2009

Can semantic technologies help brands profit from social media?

Posted by Paul Miller @ 8:16 am

Categories: Commercialisation, Podcasts, Semantic Web, Talking Semantics

Tags: Brand, Social Media, Branding, Semantic Web, Marketing, Internet, Paul Miller

In my latest podcast interview with those shaping our evolving engagement with Semantic Technologies, I speak with Eric Hillerbrand.

Drawing upon years of experience in the development and deployment of Semantic Web solutions, Eric has spent the past few years considering the ways in which semantic technologies could bring structure and value to the increasingly visible online conversations around products and brands.

Have a listen, and share your views on the ways in which this might impact your brand, or your interaction with those of others.

April 15th, 2009

Leigh Dodds talks about Talis Connected Commons

Posted by Paul Miller @ 11:05 am

Categories: Open Data, Podcasts, Semantic Web, Semantic Web Companies, Talking Semantics

Tags: Connected Corp., Podcasts, Internet, Paul Miller

I wrote about Talis’ Connected Commons last month, and today spent some time talking with the company’s Platform Programme Manager, Leigh Dodds.

The conversation has just been released as a podcast which looks at the rationale behind the company’s offer and the specific licensing choices that beneficiaries are asked to make.

Have a listen, and see if the Connected Commons might help your next project.

Disclaimer: Talis is my former employer

April 8th, 2009

Ivan Herman discusses Semantic Web activity at the World Wide Web Consortium

Posted by Paul Miller @ 8:33 am

Categories: Podcasts, Semantic Web, Semantic Web People, Standards, Talking Semantics, W3C

Tags: W3C, Ivan Herman, Semantic Web, Internet, Paul Miller

Ivan Herman is Semantic Web Activity Lead at the World Wide Web Consortium (W3C), and in this podcast he talks about a range of current activities across the Semantic Web community.

December 8th, 2008

Mark Greaves of Vulcan sees business opportunities in the Semantic Web

Posted by Paul Miller @ 11:35 am

Categories: Commercialisation, Investment, Open Data, Podcasts, Research, Semantic Web, Semantic Web Companies, Semantic Web People, Standards, Talking Semantics, W3C

Tags: Knowledge, Vulcan, Mark Greaves, Semantic Web, Podcasts, Strategy, Aerospace & Defense, Internet, Management, Manufacturing

Vulcan shares many traits with its reclusive founder, Paul Allen, yet behind the scenes the company is responsible for philanthropic support to research and community-building activities, as well as investing commercially in the likes of Radar Networks (the company behind Twine) and Evri.

Last week, I had the opportunity to talk with Mark Greaves, Vulcan’s Director of Knowledge Systems Research, and the resulting podcast was released earlier today.

Drawing upon a background that includes the likes of Boeing and DARPA, Greaves is persuaded of the benefits to be found in applying semantic technologies to existing business problems and processes.

Greaves identifies four broad areas ripe for development:

  • Search
  • Enterprise Information
  • Social Semantic Web Applications
  • Web-scale Knowledge Publishing

It will be interesting to see the extent to which Vulcan - and others - invest in these areas next year.

July 31st, 2008

Crunchbase meets the Semantic Web

Posted by Paul Miller @ 9:47 am

Categories: Podcasts, Semantic Web, Talking Semantics, Web 2.0

Tags: Job, API, Crunchbase, Semantic Web, Recruitment & Selection, Internet, Human Resources, Workforce Management, Paul Miller

Technology web site TechCrunch is one of those staples (like ZDNet, of course) to which we all turn for news and analysis on the companies shaping the Web. Their CrunchBase directory provides a wealth of information on the companies and people featured in their stories (and elsewhere, as it’s editable by anyone), and they recently took the step of opening up an API to the data.

Amongst those taking advantage of the API is Semantic Web developer Benjamin Nowack.

As he reports on his blog, Benji has created Semantic Crunchbase, an expression of the Crunchbase content as that ‘Linked Data’ about which Sir Tim Berners-Lee and others are currently so passionate. Remember,

“Linked Open Data is the Web done as it should be.”

Benji is continuing to add features to his demonstration, and will be covering some of them (including the intriguing-sounding ‘Pimp my API’) in future posts to his blog.

“Imagine [writes Nowack] you are looking for a job in California at a company that is at a specific funding stage. CrunchBase knows everything about companies, investments, and has structured location data. CrunchBoard on the other hand has job descriptions, but only a single field for City and State, and not the filter options to match our needs.”

And then stop imagining, and just run the query.

“This is where Linked Data shines. If we find a way to link from CrunchBoard to CrunchBase, we can use Semantic Web technology to run queries that include both sources. And with SPARQLScript, we can construct and leverage these links. Below is a script that first loads the CrunchBoard feed of current job offers (only the last 15 entries, due to common RSS’ limitations/practices, the use of e.g. hAtom could allow more data to be pulled in). In a second step, it uses the company name to establish a pattern join between CrunchBoard and CrunchBase, which then allows us to retrieve the list of matching jobs at stage-A companies with offices in California.”
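
Nowack's SPARQLScript is not reproduced here, but the shape of the join he describes can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not his script: the record layouts, the field names (`funding_round`, `offices`, `state`), the sample companies and the `match_jobs` helper are all hypothetical, and simple name normalisation stands in for the pattern join.

```python
def normalise(name: str) -> str:
    """Lower-case a company name and collapse whitespace so the two sources can be joined."""
    return " ".join(name.lower().split())


def match_jobs(jobs, companies, funding_round, state):
    """Return (job title, company name) pairs for jobs at companies that are at
    the given funding round and have an office in the given state."""
    by_name = {normalise(c["name"]): c for c in companies}
    matches = []
    for job in jobs:
        # The company name is the only link between the two sources.
        company = by_name.get(normalise(job["company"]))
        if company is None:
            continue  # job listing with no matching company record
        if company["funding_round"] != funding_round:
            continue
        if state not in {office["state"] for office in company["offices"]}:
            continue
        matches.append((job["title"], company["name"]))
    return matches


# Hypothetical sample data standing in for the job feed and the company directory.
jobs = [
    {"title": "Backend Engineer", "company": "Acme  Widgets"},
    {"title": "Designer", "company": "Globex"},
    {"title": "Analyst", "company": "Unknown Co"},
]
companies = [
    {"name": "Acme Widgets", "funding_round": "a", "offices": [{"state": "CA"}]},
    {"name": "Globex", "funding_round": "b", "offices": [{"state": "CA"}]},
]

matches = match_jobs(jobs, companies, funding_round="a", state="CA")
print(matches)
```

Joining on a normalised name is fragile, since two sources rarely spell a company identically. That is part of the argument for Linked Data, where a shared URI would do the job the name is doing in this sketch.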

For more information on Benji, listen to a podcast interview he did with my colleague Danny Ayers earlier this year.

May 8th, 2008

Peter Mika offers bananas at Yahoo! Research

Posted by Paul Miller @ 6:30 am

Categories: Podcasts, Research, Semantic Web, Talking Semantics

Tags: Yahoo! Inc., Semantic Web, Internet, Paul Miller

Yahoo! are certainly being a lot more open than competitors such as Google and Microsoft when it comes to talking about their use of semantic technologies. They’ve been active for several years in recruiting stalwarts of the Semantic Web community such as Dave Beckett, and there is a long tradition of the company’s employees actively contributing to the research side of the Semantic Web world.

More recently, at least part of that sometimes-esoteric research has begun to make the transition toward Yahoo!’s consumer-facing properties. FireEagle and, most recently, SearchMonkey are obvious examples of this transition. SearchMonkey, for example, has real potential to compellingly demonstrate the case for the Semantic Web and is likely to drive a scramble toward structured markup within the SEO sector.

It’s hard to believe that Microsoft and Google are not also actively engaged in this area, although their reticence in speaking about it creates a perfect opportunity for Yahoo! to make the most of the attention… for now.

SearchMonkey came out of hiding late last month, when Yahoo! CTO Ari Balogh introduced it to attendees at the Web 2.0 Expo in San Francisco. There’s a Developer event in Sunnyvale next week, and Yahoo! looks likely to be pretty visible at the Semantic Technology Conference in San Jose, 18-22 May.

It was in the context of the Semantic Technology Conference that I found myself in conversation with Peter Mika of Yahoo! Research earlier today. We talked about the potential for SearchMonkey before considering some of the issues posed by moving Semantic Web specifications such as RDF out of the relatively well behaved academic sphere and onto the open Web where honesty is not everyone’s priority. Peter is speaking at Semantic Technology, a few days after Yahoo!’s own SearchMonkey Developer event. I look forward to seeing how much more Yahoo! shares on those occasions. Peter did suggest that public access to SearchMonkey will be ‘sooner than [we] think.’ Next week, maybe?

As Peter enjoins listeners at the end of our conversation, ‘Follow the Monkey!’ It will be interesting to see where it leads. I will also be intrigued to see whether Google and Microsoft quietly join the followers… or are found hiding behind a tree waiting for us when we get where we’re going.

March 20th, 2008

Jim Hendler shares AI's lessons for the Semantic Web

Posted by Paul Miller @ 2:34 am

Categories: Podcasts, Research, Semantic Web, Semantic Web People, Standards, Talking Semantics, W3C

Tags: Web, Vision, Hendler, Semantic Web, Internet, Paul Miller

Professor James A. Hendler goes by the daunting title of ‘Tetherless World Senior Constellation Professor’ at Rensselaer Polytechnic Institute (RPI) in Troy, New York. Behind the title stands a man who has been closely involved with Artificial Intelligence (AI) research for many years, and someone recognised as amongst the progenitors of the Semantic Web ideal. Hendler is also Associate Director of the Web Science Research Initiative (WSRI), an activity that is being pushed hard by Sir Tim Berners-Lee (a Director) and others.

I spoke to Jim recently, and in a wide-ranging conversation we touched upon early hype around the promise of Artificial Intelligence, conflicting aspirations for the Semantic Web…

Paul Miller provides consultancy and analysis services at the interface between the worlds of Cloud Computing and the Semantic Web. See his full profile and disclosure of his industry affiliations.

