Archive

Archive for the ‘Google’ Category

Google Joins Semantic Web

May 28th, 2009

As I highlighted a few short weeks ago, Google has been dropping hints about the Semantic Web so subtle that even us chaps realised something exciting was going on over at the Googleplex. During the Searchology conference (their annual slap in the face to startups who dared think they were on to something unique and exciting) Big G revealed that the Christmastime rumors of data islands were no more and that RDFa was accepted!

The announcement focuses on hCard and hReview, which if found on your page will be turned into a visual presentation and added to your result on their SRP. Sound familiar? If it does, that’s because, as many bloggers pointed out, it’s incredibly similar to Yahoo! SearchMonkey Structured Objects. Competition aside, this is great news for publishers as it is yet another vindication of the benefits of structured data on your pages.

Google Rich Snippets

Yahoo! SearchMonkey

Where SearchMonkey has focused on complete Objects for presentation - e.g. a Video looks like this whilst a News article looks like that - Rich Snippets, as Google is calling this, call out single key/value pairs which can add value to a standard result. So far however their presentation appears to be behind flood controls as you need to add your domain to a waiting list. My hunch is that Google is treading carefully due to concerns as much about spam as the resulting visual impact on their end users.
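To make the "single key/value pairs" idea concrete, here is a minimal sketch in Python of pulling a few hCard properties out of a page. It is not how Google or Yahoo! actually process markup; the `HCardExtractor` class, the `extract_hcard` helper, the sample HTML and the small subset of property names are all assumptions made for illustration, and nested markup inside a property is ignored.

```python
from html.parser import HTMLParser

# Illustrative subset of hCard property class names; the real vocabulary is larger.
HCARD_PROPERTIES = {"fn", "org", "title", "tel", "locality"}


class HCardExtractor(HTMLParser):
    """Collect key/value pairs from elements whose class names an hCard property."""

    def __init__(self):
        super().__init__()
        self.pairs = {}
        self._current = None  # property whose text we are waiting to read

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in HCARD_PROPERTIES:
                self._current = name
                break

    def handle_data(self, data):
        text = data.strip()
        if self._current and text:
            # Keep the first value seen for each property.
            self.pairs.setdefault(self._current, text)
            self._current = None


def extract_hcard(html):
    """Return a dict of hCard property -> text found in the given HTML (hypothetical helper)."""
    parser = HCardExtractor()
    parser.feed(html)
    return parser.pairs


# Made-up sample markup, not taken from any real page.
SAMPLE = """
<div class="vcard">
  <span class="fn">Jane Example</span>,
  <span class="title">Editor</span> at
  <span class="org">Example Press</span>
</div>
"""

print(extract_hcard(SAMPLE))
```

Each extracted pair is the kind of small fact an engine could choose to surface next to a standard result.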

Now that the top two engines are adopting public, open standards we can expect to increasingly enjoy the benefits of ever richer, more accurate results with highly targeted presentations.

Wolfram goes Alpha

May 17th, 2009

Continuing their ‘WTF?!’ launch policy, Wolfram|Alpha chose to open the floodgates to their servers late on Friday afternoon, several days earlier than announced. Perhaps this was a ploy to reduce the likelihood of their hardware stumbling under the load. If it was, it didn’t work - mostly because their target audience doesn’t have much else to do on a Friday night.

As Google proved a few days prior, server problems happen to the best of us and I for one won’t be marking them down for that - it’s an Alpha for a reason. If anything it’s extra marks for thinking ahead and offering an error message aimed squarely at their audience.

Wolfram|Alpha Server Failure Message

Ok, so what are the results like? Overall I’m impressed: the linking of data is frankly excellent even if you get the feeling they’re just showing off at times. For example, knowing the height of the ‘tallest tree’ in the most appropriate unit would be satisfactory. Going on to convert 385ft to miles, yards, meters, km, cm and even fathoms is bordering on the autistic. Another classic example informs me that the speed ‘55mph’ is 0.62 x the speed at which Marty McFly needed to drive the DeLorean DMC-12 in order to time travel (88 mph) - now is that geeky, a fun Easter egg or just data because it was there?

Childlike fact telling aside, Wolfram doesn’t offer the most accurate Query Linguistic Analysis engine and that leads to many failed queries which it would appear Wolfram actually does have the answer to. For example ‘average salary’ fails whereas ‘salary’ returns average salary information for a set of major occupations. This is something that can be improved dramatically with access to a massive volume of real world queries, and this Alpha release and the associated ‘Google Killer’ hype will certainly enable the collection of that.

I’m also not going to knock off marks for the user interface or breadth of their dataset; both of those can be fixed over time if the proof of concept warrants it - and the first look suggests that it really does. Whilst Google wanted to index the world’s data, Freebase, Wikipedia and now Wolfram seem to have most of the world’s ‘factual content’ wrapped up.

Bogged down by Semantics

May 9th, 2009

I’m running massively behind on my Podcasts. The backlog has been building up for the past month whilst I’ve been focusing on that ever present joy - quarterly planning. As you might have guessed from my place of work, planning right now has a few more variables than one might hope for. Digressions aside, I grabbed a few hours this weekend to get psyched about Tech again.

Highest on my playlist was The Semantic Web Gang, and not just because my colleague Peter Mika was taking part this time. This is regularly a great show for anyone wanting to learn more. I ended up a little depressed as the conclusions of everyone on the panel sadly matched those I’ve been coming to for a while.

No one likes to ‘reinvent the wheel’ so before delving into code most of us look around to see if we need to. When investigating Semantic Objects today there is no clear source of truth as to prior art for any developer (corporate or personal) wanting to create an Ontology. Whilst this doesn’t surprise me at this stage in the Semantic Web, I am a little shocked that no one has attempted to take ownership of this space.

It’s in the interest of the community to offer a set of complete vocabularies for specific objects and all of us spend a fair amount of time trying to define the next set. With both these thoughts in mind, here’s my elevator pitch for a possible solution:

  • Offer a gallery style view of known and ‘complete’ objects.
  • This gallery would be user contributable.
  • This gallery would allow for comments and feedback to the authors to ensure the needs of the wider world are considered by the authors.
  • This gallery would offer links to ontology creation tools.
  • This gallery would support and allow for group collaboration on the definition of a new object.
  • When an ontology is complete and examples of real world usage have been linked to by more than 3 people, Yahoo!, MSN, Ask, Google etc. would extend support for it by adding crawler support (e.g. we would agree to accept this format for our indexes).
  • The entire Ontology set would be made available under CC licenses (or most appropriate alternative) and ‘donated’ to the community to ensure adoption.

Why is ‘something’ like the above useful? It would be a start point for the confused masses. Does an ontology exist for ‘bicycles’? A simple search could return nothing:  You’ll need to go and create something, and here are some tools and access to a community. Or something: Here’s an ontology you can go and use or contribute to in order to extend it as you need.

Well, that’s one possible way to lower the barriers to entry which people are increasingly telling me are too high right now. What do you think, is there a better way?

Wolfram|Alpha

April 28th, 2009

Ignoring the traditional ‘how to launch a new site’ playbook, which states you must whore yourself around expert commentators, provide personal updates on your blog for months in advance, build a following among an ever increasing alpha test group and finally issue an overblown PR announcement on the day of launch which preferably includes some quotes hinting ‘Google killer?’ from your new friendly commentators, Stephen Wolfram has seemingly rubbed much of the industry the wrong way with the mysteriously quiet run-up to the launch of Wolfram|Alpha.

As redundant as this may sound, geniuses aren’t stupid people. For a while there though I was starting to question the wisdom of the MacArthur genius grant review committee. Whilst Wolfram’s approach has garnered the biggest swell in anticipation prior to a launch since, well, probably since Teoma and Wisenut back in 2002/3, yesterday’s webcast was a bust for me. It was scheduled at a time I couldn’t attend, so I hoped to catch up later in the evening. No such luck. The broadcast appeared to be replay-free until over 30 hrs had passed, when some neat download options began to appear - download the video, stream it or grab the MP3 - cool! Speaking of which, Cuil followed the playbook, everyone seems to hate them, and even if they did publish an MP3 version of their most recent announcement (a timeline presentation seen a dozen times before elsewhere) nobody would have cared. It’s important to add that you don’t get a single screenshot of this ‘amazing’ new product during the entire 90 minute presentation. Stupidity or extreme genius? You decide.

What can it do? It can describe places, like Lexington, Mass., by its vital statistics, like location, population, weather, etc. It can compare Lexington with Moscow. If you type “LDL 180,” it will tell you the percentile of the population with higher or lower cholesterol and show you the answer on a chart. If you tell “LDL 180 male 45,” it will adjust the chart for gender and age group. It can chart the life expectancy of a male age 40 in Italy or tell you who was president of Brazil in 1928

http://bits.blogs.nytimes.com/2009/04/28/wolfram-alpha-veil-lifted

Without visual proof of the thing in action it’s hard to state this with any degree of conviction, but there appears to be nothing in the demo that couldn’t be achieved with a decent query parser and a triple store or perhaps, if we wanted to store the context of the data, a quad store. I have seen a few leaked screenshots from the initial webcast and it would seem that many of the examples can be knocked up with Freebase. So is Wolfram|Alpha one of the next generation of Object Data store powered Search Engines? Hard to say from this small ‘preview’, but the indications do hint at it.
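For anyone unfamiliar with the idea, a triple store boils down to a set of (subject, predicate, object) facts plus a way to match patterns against them. The Python below is a toy sketch of that pairing with a deliberately naive query parser; it makes no claim about how Wolfram|Alpha or Freebase work internally. The names `TRIPLES`, `match` and `answer`, the "<predicate> of <subject>" query shape and every value in the data are placeholders invented for the example.

```python
# Toy in-memory triple store: each fact is a (subject, predicate, object) tuple.
# All values below are made-up placeholders, not real statistics.
TRIPLES = [
    ("ExampleTown", "type", "Town"),
    ("ExampleTown", "population", 12345),
    ("ExampleTown", "country", "Exampleland"),
    ("OtherTown", "type", "Town"),
    ("OtherTown", "population", 67890),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return every triple matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    ]


def answer(query):
    """Naive 'query parser': only understands '<predicate> of <subject>'."""
    predicate, _, subject = query.partition(" of ")
    results = match(TRIPLES, subject=subject.strip(), predicate=predicate.strip())
    return [obj for (_, _, obj) in results]


print(answer("population of ExampleTown"))
print(match(TRIPLES, predicate="type"))
```

A quad store would simply add a fourth field to each tuple, for instance the source of the fact, so that the context travels with the data.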

To cap the growing excitement with the fateful rubber stamp of ‘Google Killer’, Google themselves came out with a Direct Display for the top of their results to show US Census data. A nothing launch on any day of the week - with the exception of the nice graphing animations thanks to Trendalyzer - this timing got the press buzzing. Do Google think this is a threat? Is Google trying to prove that whatever Wolfram can do they can do better? And so on until you lose the will to care. Well, at least until you get the chance to see for yourself in May when the real launch happens.

[UPDATE May 11th]  According to their blog, Wolfram|Alpha will open to the full force of the Web’s interest on 18th May 2009. If you’re lucky enough to stumble into a test bucket you may be able to experiment already.

Google sees the light?

March 28th, 2009

It’s been all quiet on the Semantic front over at Google until a flurry of recent press murmurings. Something’s definitely changed over at the Googleplex, but so far it appears to be just their PR department!

A pretty good article over at PCW discusses some of these recent announcements, but in short it appears that Semantic Search holds a different meaning for Google than everyone else… with the possible exception of your favourite dictionary. Semantic simply means ‘the meaning of language’ or ‘the relationship between symbols’. If we assume words and phrases are symbols, then Google is certainly pursuing Semantic Search. Their visible focus of late has been to provide links to related topics and longer summaries, both of which have been available at the competition for a very long time. Nothing new so far.

Rumours leaked in December hinting at some early work to create data-islands within the pages of a number of top publishers - a new form of non-standard markup which could lead to new presentations in the Google SRP. So far, there’s been little sign of this progressing, which is a tremendous relief, and not just because I head up the SearchMonkey program over at Y! where we’ve already launched an open approach to this. I truly believe in the power of utilising metadata for Search. Seeing our competitors follow suit at this stage is more of a vindication than a troubling development, but any attempt to force the market to use non-standard markup is not a good sign for the web at large.

Have you started to see signs of other Semantic Search developments on the web? What do you think of them so far, and do you think open standards are of any importance at this time?

Author: Nick Cox Categories: Google, Search, Semantic