Archive for the ‘Internet’ Category

SES Chicago Dec 7 - 11th

December 7th, 2009
Chicago, originally uploaded by aserpa.

Well, it’s been a long time, a fantastically busy time and that’s the end of my non-excuse for the lack of posting of late.

I’ve just landed in Chicago to attend SES and discovered some weather for the first time in 15 months (you’d be amazed to hear you can actually miss weather if you move to California!) and a bank of taxis outside the conference parading around with the new Yahoo! advertising on top. I have a feeling this is a happy accident, but marketing is rather a ‘dark art’ so I’m not committing to that.

I’m here because I was asked to join a panel (Developments in Information Retrieval on the Web) and talk to the crowd about Semantic Data along with Jamie Taylor (MetaWeb), Martin Hepp (Universität der Bundeswehr München) and Jay Myers (Best Buy).

It’s a great panel - two guys capable of talking about any aspect of RDF and Microformats and two guys who’ve had the pleasure of learning from them and implementing structured data solutions. Jay works over at Best Buy, where he’s done a cracking job of integrating structure on a public site with a massive catalog (500k+ pages). Oh, and he’s also been instrumental in developing the GoodRelations spec, so all in all a Semantic Superstar! That sort of implementation makes my life so much easier and hopefully, in turn, yours, as search engines and aggregators start to use this structure to help you find what you want.

If you’re in town and would like to meet for a drink please drop me a mail, IM or reach out on Twitter.

UPDATE:

It’s all over, and I’m heading back to California, where white stuff doesn’t fall out of the sky and coats are for people with holiday cabins in Tahoe. Huge thanks to Sean Golliher (great blog template, Sir!) for organising a great panel; it was a really enjoyable session.

Shame on those of us who felt the delay whilst Martin got his Mac ready would have made a good advert for Microsoft’s ‘I’m a PC’ series. The upshot is this great video of his presentation with slides: http://vimeo.com/8065914. It’s a pity we don’t have the full 126-slide original to compare it against!

In the spirit of sharing, as soon as I get to a stable connection I’ll be adding my slides to SlideShare and asking Jamie and Jay to do likewise. More soon.

UPDATE 2:

I’ve uploaded my slides from the panel to: http://www.slideshare.net/NickCox/ses-chicago-2009-searchmonkey

UPDATE 3:

I just saw a tweet from Jay Myers: his slides from SES are now up on SlideShare at http://slidesha.re/4UoQbg. 3 down, 1 to go!

Yahoo! Placemaker

June 5th, 2009

Recently Yahoo! launched a new Geo API called Placemaker. I’ve been playing with it all week and am continually delighted with the recall and accuracy it’s able to deliver.

Essentially, you can pass in a text string or web document (structured or unstructured) and the service will identify, disambiguate and extract the places contained within. For example, this sentence includes the location Sunnyvale, California, which, whilst seemingly completely out of context, is where I work. I ran this paragraph through the API and here’s an extract of what was returned:

<document>
<administrativeScope>
<woeId>2502265</woeId>
<type>Town</type>
<name><![CDATA[Sunnyvale, CA, US]]></name>
<centroid>
<latitude>37.3716</latitude>
<longitude>-122.038</longitude>
</centroid>
</administrativeScope>

</document>

Along with the location name and the latitude and longitude of both the centroid and each corner of a bounding box, we also have the superb WOEID (Where-on-Earth ID). Armed with all this information, there’s almost no location-based application I can’t build. Indeed, sites such as Just Landed, which searches Twitter for the text ‘just landed in’ and geocodes the places in order to provide intriguing visualisations, just became as simple as tying two APIs together!
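As a rough sketch of how those fields might be read in code, the Python below parses the extract shown above using only the standard library. It assumes the response has exactly the shape of that extract (no XML namespaces, a single administrativeScope element), which may not hold for a full response from the service. The helper name `parse_scope` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# The extract shown above, pasted in as a string. A full response may
# contain more elements (and possibly XML namespaces) than this sample.
SAMPLE = """<document>
<administrativeScope>
<woeId>2502265</woeId>
<type>Town</type>
<name><![CDATA[Sunnyvale, CA, US]]></name>
<centroid>
<latitude>37.3716</latitude>
<longitude>-122.038</longitude>
</centroid>
</administrativeScope>
</document>"""


def parse_scope(xml_text):
    """Return the administrative scope as a plain dict (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    scope = root.find("administrativeScope")
    if scope is None:
        raise ValueError("no administrativeScope element found")
    return {
        "woeid": int(scope.findtext("woeId")),
        "type": scope.findtext("type"),
        "name": scope.findtext("name"),  # CDATA is returned as ordinary text
        "latitude": float(scope.findtext("centroid/latitude")),
        "longitude": float(scope.findtext("centroid/longitude")),
    }


place = parse_scope(SAMPLE)
print(place["name"], place["woeid"], place["latitude"], place["longitude"])
```

With the WOEID and centroid in a dict like this, handing the result to a second API (a map, say) is mostly plumbing.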

As a supporter of all things Semantic, I think it’s important to highlight that this API goes far beyond complex string matching. Placemaker recognizes geographic semantic tags, such as the W3C Geo Vocabulary, and microformats such as geo and adr. Pretty neat, huh? Drop a note in the comments below to let me know what you think, and post links to any cool applications it’s allowed you to build.

Bogged down by Semantics

May 9th, 2009

I’m running massively behind on my podcasts. The backlog has been building up for the past month whilst I’ve been focusing on that ever-present joy - quarterly planning. As you might have guessed from my place of work, planning right now has a few more variables than one might hope for. Digressions aside, I grabbed a few hours this weekend to get psyched about tech again.

Highest on my playlist was The Semantic Web Gang, and not just because my colleague Peter Mika was taking part this time. This is regularly a great show for anyone wanting to learn more. I ended up a little depressed as the conclusions of everyone on the panel sadly matched those I’ve been coming to for a while.

No one likes to ‘reinvent the wheel’, so before delving into code most of us look around to see if we need to. When investigating Semantic Objects today, there is no clear source of truth on prior art for any developer (corporate or personal) wanting to create an ontology. Whilst this doesn’t surprise me at this stage in the Semantic Web, I am a little shocked that no one has attempted to take ownership of this space.

It’s in the interest of the community to offer a set of complete vocabularies for specific objects and all of us spend a fair amount of time trying to define the next set. With both these thoughts in mind, here’s my elevator pitch for a possible solution:

  • Offer a gallery style view of known and ‘complete’ objects.
  • This gallery would be user contributable.
  • This gallery would allow for comments and feedback to the authors, to ensure they consider the needs of the wider world.
  • This gallery would offer links to ontology creation tools.
  • This gallery would support and allow for group collaboration on the definition of a new object.
  • When an ontology is complete and examples of real-world usage are linked to by more than 3 people, Yahoo!, MSN, Ask, Google etc. would extend support for it by adding crawler support (i.e. we would agree to accept this format for our indexes).
  • The entire Ontology set would be made available under CC licenses (or most appropriate alternative) and ‘donated’ to the community to ensure adoption.

Why is ‘something’ like the above useful? It would be a starting point for the confused masses. Does an ontology exist for ‘bicycles’? A simple search could return nothing: you’ll need to go and create something, and here are some tools and access to a community. Or it could return something: here’s an ontology you can go and use, or contribute to in order to extend it as you need.
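To make the pitch a little more concrete, here is a minimal Python sketch of the lookup and the ‘more than 3 people’ rule. Everything in it is hypothetical: the class names, the field names and the placeholder message are invented for illustration, and nothing like this gallery exists yet.

```python
from dataclasses import dataclass, field

# Message shown when a search finds nothing. The wording is a placeholder.
NOTHING_FOUND = "No ontology found: create one with the linked tools and community."

# From the pitch: support follows once more than 3 people link to real usage.
MIN_USAGE_LINKS = 3


@dataclass
class Ontology:
    name: str
    complete: bool = False
    usage_links: set = field(default_factory=set)  # one entry per contributor
    comments: list = field(default_factory=list)   # feedback to the authors

    def eligible_for_crawler_support(self):
        # Must be 'complete' and linked to by more than 3 people.
        return self.complete and len(self.usage_links) > MIN_USAGE_LINKS


class Gallery:
    def __init__(self):
        self._entries = {}

    def contribute(self, ontology):
        # User-contributed entries are keyed by lower-cased name.
        self._entries[ontology.name.lower()] = ontology

    def search(self, term):
        """Return the matching ontology, or None if the gallery has nothing."""
        return self._entries.get(term.lower())


gallery = Gallery()
bike = Ontology("bicycles", complete=True)
gallery.contribute(bike)
for person in ("ann", "bob", "cat", "dan"):
    bike.usage_links.add(person)

found = gallery.search("Bicycles")
missing = gallery.search("unicycles")
print(found.eligible_for_crawler_support() if found else NOTHING_FOUND)
print(missing if missing else NOTHING_FOUND)
```

The exact-name lookup is the weakest part of the sketch; a real gallery would need fuzzier search, but the two outcomes (something or nothing) stay the same.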

Well, that’s one possible way to lower the barriers to entry which people are increasingly telling me are too high right now. What do you think, is there a better way?

Conference Depression

April 24th, 2009

I’ve just returned from a relaxing couple of weeks touring my new home state of California. Since moving to the US 7 months ago I’ve not taken any holiday and still find the SF tourist stuff intriguing. Among the 2,104 emails (real figure) awaiting my return, the award for most depressing email goes to O’Reilly. The Found conference has been cancelled.

This is depressing on two fronts - first (selfishly) I’m obviously sad not to be presenting my talk on the Semantic Web to the SEO community. Secondly, and more importantly, it’s one of the first major signs of the ‘Great Depression 2.0’ here in the Bay Area. Sure, people have been laid off in their thousands, property prices have plummeted, and queues have run from Job Centres out on to the street, but oddly this doesn’t seem to be reflected in the traffic on the 101 and 280 each morning.

For those interested in the statement from O’Reilly, here it is in full. 

O’Reilly Found Conference 2009

Due to the challenging economic environment, we’re sorry to announce that we’ve made the difficult business decision to postpone the O’Reilly Found Conference, which was to take place June 9-11 in Burlingame, CA.

We are grateful for the support of everyone involved in the event, particularly program co-chairs Vanessa Fox and Nathan Buggia, sponsor Microsoft Live Search, and the event partners and participants.

O’Reilly will continue to explore the topic of search-friendly architecture for developers, including the possibility of integrating some of the excellent Found program into other offerings from O’Reilly.

If you would like to continue the conversation on making the web easier to find, please visit janeandrobot.com and follow twitter.com/janeandrobot to become part of the community, read the latest on technical SEO issues from industry experts, and attend local technical SEO meetups. 

 

Is this the first of many? I expect so. Have you seen any other cancellations?

Author: Nick Cox Categories: Internet

Linked Data

April 3rd, 2009

Tim Berners-Lee used his 15 minutes at TED to state a fairly obvious point. Linked Data, or the Semantic Web, or the Deep Web are all important things we can’t do much with right now because people don’t take the time to semantically tag their data in a useful way. As one would expect from a certified genius, he’s quite correct about the problem. I do however feel that the reason people aren’t doing this at scale is threefold: complexity, lack of tangible benefit, and security concerns.

The issue of data security became highly apparent to me at SxSW. Discussions during the metadata panels would often focus on the issue of Intellectual Property (IP) and the fear of leaking proprietary information to competitors. My personal view is that semantically enriching your data is rather like a city investing in transport infrastructure. On the one hand, yes, people can move quickly to leave your town, but on the other you’ve provided a more satisfying travel experience and made it easier and more attractive for people to visit and return.

As for complexity and benefits: you won’t notice this unless you came to this page from Yahoo! Search, but I’ve taken the mandate from TBL to heart and embedded enough mark-up to identify his TED talk as a video with an associated thumbnail image in nice well-formed RDFa.

Here’s an example we’ve used fairly extensively which shows how to embed a Hulu video and benefit from this new approach. Just the first two lines of code are required to generate an enhanced result. The other four lines are optional and assist with the display.

<link rel="image_src" href="http://thumbnails.hulu.com/9/967/32912_145x80_generated__VfW.jpg" />

<link rel="video_src" href="http://www.hulu.com/embed/GREW9Qw0P7KjIyjJydQYRw" />

<meta name="video_height" content="296"/>

<meta name="video_width" content="512"/>

<meta name="description" content="Video description: Homer gets upset at a vending machine filled with apples."/>

<meta name="video_type" content="application/x-shockwave-flash"/>

Simple, huh?
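If you’d rather generate those lines than paste them, something like the Python below would do it. This is a sketch only: the function name `enhanced_result_tags` and the example.com URLs are made up, and the tag and attribute names are simply copied from the Hulu example above.

```python
from html import escape


def enhanced_result_tags(image_src, video_src, height=None, width=None,
                         description=None, video_type=None):
    """Build the tags shown above; only the two <link> lines are required."""
    lines = [
        '<link rel="image_src" href="%s" />' % escape(image_src, quote=True),
        '<link rel="video_src" href="%s" />' % escape(video_src, quote=True),
    ]
    optional = [
        ("video_height", height),
        ("video_width", width),
        ("description", description),
        ("video_type", video_type),
    ]
    for name, value in optional:
        if value is not None:
            # Escape the value so quotes or ampersands cannot break the tag.
            lines.append('<meta name="%s" content="%s"/>'
                         % (name, escape(str(value), quote=True)))
    return "\n".join(lines)


print(enhanced_result_tags(
    "http://example.com/thumb.jpg",   # placeholder URLs, not real media
    "http://example.com/embed/abc",
    height=296,
    width=512,
    video_type="application/x-shockwave-flash",
))
```

Leaving out every optional argument gives you just the two required link lines.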

The premise of Linked Data is simple, but until now the implementation was difficult and the tangible benefits lacking. At Yahoo! we’re trying to clarify the benefits with richer search results powered by SearchMonkey, and I spend large portions of my day evaluating ways to simplify the mark-up process for site owners and publishers.

If you have a Flash object such as a video, game, or document embedded on a page, adding a few lines of code will make it appear as an enhanced result after we re-crawl your page. No semantic mark-up knowledge is required, as you can simply cut-and-paste our example code, and you don’t have to build your own application to display the result, although you can go crazy with the SearchMonkey Developer tool if you so wish. SearchMonkey does the heavy lifting, taking your mark-up and extracting the necessary structured data to display it as an enhanced result.
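To give a feel for what ‘extracting the structured data’ involves, here is a toy extractor in Python. It is emphatically not how SearchMonkey works internally; it is a sketch using the standard library’s HTML parser, with the class and function names invented for illustration and the tag names taken from the example code above.

```python
from html.parser import HTMLParser

# Names taken from the example markup above.
LINK_RELS = {"image_src", "video_src"}
META_NAMES = {"video_height", "video_width", "description", "video_type"}


class MediaTagExtractor(HTMLParser):
    """Collect the <link>/<meta> values from a page into a dict."""

    def __init__(self):
        super().__init__()
        self.data = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also arrive here.
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel") in LINK_RELS:
            self.data[attrs["rel"]] = attrs.get("href")
        elif tag == "meta" and attrs.get("name") in META_NAMES:
            self.data[attrs["name"]] = attrs.get("content")


def extract_media(html_text):
    parser = MediaTagExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.data


page = """<html><head>
<link rel="video_src" href="http://www.hulu.com/embed/GREW9Qw0P7KjIyjJydQYRw" />
<meta name="video_height" content="296"/>
<meta name="video_width" content="512"/>
</head><body></body></html>"""

print(extract_media(page))
```

Running it over your own page is a quick way to check the tags are in place before waiting for a re-crawl.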

Those ‘hardcore’ few who started writing the Web using Vi or Notepad are the same people who understand RDFa and Semantic Mark-up technologies. The Linked world probably needs a FrontPage or Dreamweaver solution before we see mass adoption. Until that great day, until there’s a Semantic-FrontPage for the rest of us, why not give our documentation a whirl?