Yahoo! Working On A Technorati Killer

So the rumor is that Yahoo! is working on a Technorati killer (via Scoble). This was mentioned by JeremY! at Open Tech 2005. With Ice Rocket now searching feeds and supporting tags, things are really getting busy in the feed search (tag:feedsearch [Technorati, Ice Rocket]) field. With Yahoo! getting into the feed search game, it seems like Google and Microsoft will be pulled in eventually. Perhaps one of them will buy Technorati?

I always figured that Bloglines and Technorati would make an interesting merger, though.

Ranked Search, Merging Google And Yahoo

Last week I discovered Twingine via Russell. Twingine puts the results of your query from Google and Yahoo! into frames so that you can compare the results side by side. This seemed like an interesting idea, but the interface isn’t particularly useful; it’s too much work to visually compare the two. Then I remembered Matt’s announcement of using Yahoo’s search APIs at WordPress.org, which started me thinking about the availability of APIs from Yahoo! and from Google.

It seemed like there should be some way of combining these two resources, taking the search results from both Google and Yahoo! and mixing them in some semi-meaningful way. So last night I started putting together the Ranked Search website. You enter a query and the site requests the top 10 results from both Yahoo! and Google via their search APIs, giving each link a rank. The first link gets a rank of ten, the second a rank of nine, and so on down to a rank of one for the tenth link in each result set. The idea is that the links with the highest rank are more likely to be what you are looking for. Then I look for links that appear in both sets, merging them into one, with a new rank that is the sum of their original ranks. All of the unique links from each set are then merged in and the new set is sorted by rank. The highest possible score is 20 (where Google and Yahoo! both return the same link in the #1 position) and the lowest possible score is 1. It is really basic stuff.
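The ranking and merging steps above can be sketched roughly as follows. This is a minimal Python sketch, not the code behind Ranked Search: the function names `rank_links` and `merge_results` are made up for illustration, and plain lists of URLs stand in for the results that would come back from the Google and Yahoo! search APIs.

```python
def rank_links(links):
    """Give the first link a rank of 10, the second 9, ... the tenth 1."""
    ranked = {}
    for position, url in enumerate(links[:10]):
        # Assumption: if an engine repeats a URL, keep its best (first) rank.
        if url not in ranked:
            ranked[url] = 10 - position
    return ranked


def merge_results(google_links, yahoo_links):
    """Sum the ranks of links found in both sets, then sort by rank."""
    combined = {}
    for ranked in (rank_links(google_links), rank_links(yahoo_links)):
        for url, rank in ranked.items():
            combined[url] = combined.get(url, 0) + rank
    # Highest combined rank first; ties keep the order they were added in.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)
```

A link that both engines return in the #1 position ends up with a rank of 20, while a link that only one engine returns in tenth place ends up with a rank of 1.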

Making requests out over the Internet to both Google and Yahoo! isn’t the fastest thing in the west. So I put in some basic caching for every query. The result sets from every query are cached in a PostgreSQL database and are used when an exact query match is found and the results are less than 12 hours old. If the results are more than 12 hours old, the query is sent off to Google and Yahoo! and the new results are cached again.
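The caching rule described above might look something like this. It is only a sketch under a few assumptions: it uses Python's built-in `sqlite3` module instead of PostgreSQL so that it runs on its own, the table layout and the names `open_cache` and `cached_search` are invented for the example, and `fetch` is a placeholder for whatever function actually queries Google and Yahoo!.

```python
import json
import sqlite3
import time

MAX_AGE_SECONDS = 12 * 60 * 60  # results older than 12 hours are refreshed


def open_cache(path=":memory:"):
    """Open the cache database (sqlite3 here; the site itself uses PostgreSQL)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cache ("
        "query TEXT PRIMARY KEY, results TEXT NOT NULL, fetched_at REAL NOT NULL)"
    )
    return conn


def cached_search(conn, query, fetch, now=None):
    """Return cached results for an exact query match, refetching when stale."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT results, fetched_at FROM cache WHERE query = ?", (query,)
    ).fetchone()
    if row is not None and now - row[1] < MAX_AGE_SECONDS:
        return json.loads(row[0])  # fresh enough, skip the network
    results = fetch(query)  # hypothetical call out to the search engines
    conn.execute(
        "INSERT OR REPLACE INTO cache (query, results, fetched_at) VALUES (?, ?, ?)",
        (query, json.dumps(results), now),
    )
    conn.commit()
    return results
```

Passing `now` explicitly is just a convenience for trying out the expiry logic without waiting 12 hours.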

Everything is very plain and basic right now; consider it an experiment. If you have any additional thoughts, leave a comment or use my contact form to drop me a note.

Most of this information is also available on the about page for Ranked Search.

My Web 2.0, By Yahoo!

It really is amazing how quickly concepts can spread. Tagging data (URLs, images, etc.) has impressed a lot of people as a “better way” to organize content. Normally when having these types of discussions you point to del.icio.us, Flickr and more recently Technorati. Today a new, much larger player is added to that list: Yahoo!. Their announcement about My Web 2.0 emphasizes that the reasoning behind this is to capitalize on the community knowledge that comes from allowing virtually anyone to tag websites. JeremY! has some thoughts on why this is important.

My first impression is that this looks pretty darn cool! Go start at http://myweb2.search.yahoo.com/ (assuming you have a Yahoo! account already) and simply do some searches on the web and start tagging the results. You do this via the “Save” link for the site in the search results page. When you save a link it asks you for a description and tags, with the tags field providing suggestions (presumably from tags that other people are using). Permissions can be set to restrict this to just yourself, your community or everyone. Another feature I don’t remember seeing before is the “View as XML” link. It turns out that Yahoo! is identifying RSS/ATOM feeds and then providing a link. So if a site has the “View as XML” link on it in the search results, you know it has a feed. Nice to see search engines trying to divine what features sites have and do something with that knowledge.

You can import an existing list of links from IE, Yahoo! Bookmarks or an RSS feed. Although I haven’t tried it yet, I suspect you’ll be able to import your del.icio.us bookmarks into My Web 2.0 via the RSS import feature. There are some other features that you’d expect from this type of service, like the top 100 most popular sites and browsing by tags. Of course you can also search on just your own set of links. One thing I couldn’t find was a way to see how many other people had saved the same URL. This is something that you can do in del.icio.us and it’s rather disappointing to see that left out here. So far that is the most obvious feature that is missing. I should note here that so far the site seems to respond very quickly, which is something that del.icio.us has had problems with, either being down or just extremely slow.

As JeremY! noted, they’ve exposed My Web 2.0 via the Yahoo Search API, which was a very smart move. In the future I’d like to see more of this approach, where virtually every feature is exposed (to some degree) via an API that we can get our hands on the same day a new feature is released. For now this trend is still pretty close to the bleeding edge, but as time goes on and things mature a bit more, companies that don’t provide APIs will be missing the boat in a major, major way.

It is my sincere hope that Yahoo!’s My Web 2.0 doesn’t get completely overrun by those trying to game and spam search engines. Although Google is still the number one target for this type of “attack”, Yahoo! is a big enough player to attract the attention of those who would do evil in this regard. Since you need a Yahoo! account in order to use this feature, the obvious spot to defend yourself is at the account creation process. Shore up your defenses, Yahoo! I’m sure the bad guys will be coming with a renewed effort.

The new My Web 2.0 looks impressive.

Wait a minute. I can’t find any way to syndicate my bookmarks (saved sites) via an RSS/ATOM feed? What is up with that? I can’t find any mention of it in the FAQ. Come on, guys, you covered so many other features on launch, how could you possibly leave that one out? I was about to mention how this was going to be a del.icio.us killer, but I doubt anyone will give up on del.icio.us until they can get a feed of their links. Fix that and I’ll likely give up on del.icio.us and move to My Web 2.0.

In the meantime I’ll play with this a bit more and look at the API features to see what is possible.

UPDATE 2:30pm 29 Jun 2005: As Toby pointed out in the comments below, you can get a feed of your links via the API. Links for this should be plastered all over the place in My Web 2.0; tagging and feeds go hand in hand in many ways. Interestingly, you don’t need a valid application id (appid) in the URL for it to work. So a feed for a My Web 2.0 account looks like http://api.search.yahoo.com/MyWebService/rss/urlSearch.xml?appid=somestrangeid&yahooid=somestrangeid. You can get a feed for anyone that you have a Yahoo username for. I’m going to guess that permissions tie into this somehow; a link that I mark as private shouldn’t show up in a public feed. Thanks for the pointer, Toby.
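If you want to build those feed URLs from a script, something like the following should do. It is a small Python sketch that only assembles the URL shown above. The function name `my_web_feed_url` is made up for the example, and the default `appid` is just the placeholder value from the URL above, on the assumption that the id really isn't validated.

```python
from urllib.parse import urlencode

# Endpoint taken from the feed URL quoted in the update above.
FEED_ENDPOINT = "http://api.search.yahoo.com/MyWebService/rss/urlSearch.xml"


def my_web_feed_url(yahoo_id, app_id="somestrangeid"):
    """Build the RSS feed URL for the saved links of a given Yahoo! id."""
    # urlencode escapes any characters that are not safe in a query string.
    return FEED_ENDPOINT + "?" + urlencode({"appid": app_id, "yahooid": yahoo_id})
```

Calling `my_web_feed_url("somestrangeid")` gives back the same URL as in the update; swap in any Yahoo! username to point at that person's feed.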