We’re relevant! Search engine launched at edgeio

Hot on the heels of our acquisition of Adaptive Real Estate Services, we can today announce the initial rollout of edgeio’s relevance-based search engine. This is the first step in our efforts to make edgeio.com the best place to find “stuff” anywhere in the world.

By way of background, edgeio launched in March with zero listings. We took in about 100 new listings per day at that time. Today we take in about 700,000 new listings per day. The search engine we began with (free text matching and then results in reverse chronological order) simply was not good enough to function with this number of listings.

We now have a dedicated search team and this is their first push. It is not yet perfect but it is a vast improvement on what was there before.

In this upgrade we are acknowledging the way partners and users are using edgeio and trying to improve their experience. Many listings-based sites are uploading their listings to us and we are providing search traffic back to them. We are being used as a listings search service by companies with listings and by users looking for listings. A “search engine for stuff”, if you will.

Here are a few searches to try:

Used iPod
Sony Vaio
Baby Crib

These are all global searches (edgeio has data from about 15,000 cities worldwide). You can use the geography widget (top right of the results screen) to choose a city. Once you have done that, the slider control can be used to fine-tune the results (zip, city, state, country, continent, world). Of course, you can also sort by price or by date listed.

Arun Jagota, Josh Myer and Dale Johnson are the team – mostly quite new at edgeio – who are working on search, and have moved us from a reverse chronological display of results to a relevance-ranked display. Of course they have had a lot of help from others, most notably our technical advisors. And they have a lot of work still to do to make the results the best there is.

Going forward, as edgeio strives to bring together, organize and distribute the world’s marketplaces, edgeio.com will be the place that our organizing efforts are most obvious. It will be the place to find “stuff”.

From here on relevance will be our default sorting method. Of course we will enable users to modify the sort order (by time, by price, and in the future by other criteria). Our outbound APIs will eventually reflect these options also.

There is a whole lot more to come from us, and this is a baby step in many ways, but a significant directional move. Let us know what you think. In future posts we will talk about the “bring together” and “distribute” parts of our vision, which are realized through our edgedirect product.

But for now let’s meet the team working on search:

Arun Jagota

I am a search engineer at edgeio. I am working on the design and evaluation of algorithms for improving relevance in particular and search in general at edgeio.

One of the key challenges is the relevance problem itself. A tough nut to crack. The challenge is to find methods that are both simple and efficient, yet effective in returning relevant results. Another challenge (specific to edgeio) is to fetch relevant results from a variety of sources in real-time, recompute their relevance internally in real-time, and merge them into a single set of results that the user sees. A third issue, also specific to edgeio, is that our documents (unlike general web pages) are listings in verticals with varying degrees of structure. So there are special issues involving relevance and search for finding “stuff” rather than web pages.
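The merge step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not edgeio's actual code: the `Listing` type, the toy `score` function (fraction of query terms found in the title) and the de-duplication by URL are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    # Hypothetical minimal listing record; real listings carry far more structure.
    url: str
    title: str
    source: str


def score(query: str, listing: Listing) -> float:
    """Toy relevance: fraction of query terms that appear in the title."""
    terms = query.lower().split()
    if not terms:
        return 0.0
    title_words = set(listing.title.lower().split())
    return sum(t in title_words for t in terms) / len(terms)


def merge_results(query, result_lists, limit=10):
    """Rescore results from every source with one function, then merge them."""
    best = {}  # url -> (score, listing); keeps one entry per URL
    for results in result_lists:
        for listing in results:
            s = score(query, listing)
            if listing.url not in best or s > best[listing.url][0]:
                best[listing.url] = (s, listing)
    ranked = sorted(best.values(), key=lambda pair: pair[0], reverse=True)
    return [listing for _, listing in ranked[:limit]]
```

The point of rescoring everything internally is that scores from different providers are not comparable with each other, so a single scoring function has to be applied before the lists can be interleaved.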

What keeps me motivated is that “relevance and search” supply me with a constant source of challenging (but not impossible) problems to solve, and algorithms from computer science, statistics, and information retrieval present me with solution methods to consider and evaluate. Another thing that keeps me going is constant incremental progress and quick feedback. You have an idea, try it out, sometimes it improves relevance, and you notice it quickly.

Before working at edgeio, I worked at another start-up (Xoom corporation) as a data analyst and machine learner. There I designed improved algorithms for predictive modeling in an e-commerce setting and also some for improved fuzzy matching of names and addresses of people. Prior to that I taught graduate courses as an adjunct faculty member in computer engineering at Santa Clara University, including one on “Information Retrieval And Search Algorithms”.

Josh Myer

Hi, I’m Josh. I’m the Young Guy at the office, but I make up for it with an intense background. Before going to college, I spent a few years working as a reverse engineer and general puzzle-solver, in fields ranging from accounting to instant messaging. I just wrapped up two degrees from the University of North Carolina (Chapel Hill), one in Linguistics and one in Mathematics. I focused on the typically-impractical formal aspects of both, but it’s actually come in handy when working on search problems.

I spend a lot of time in the plumbing of edgeio, but have been working more on search lately. The user-visible bits that I’ve done so far are the real-time search results from external providers. I’m currently working on several things to make search better, faster, and more user-friendly.

Working here has been great: there’s always a new problem to solve and the freedom to solve it the way you want to. All told, I get to use my entire background at work, ranging from unix arcana to the acquisition of language in children. It’s all the fun parts of college (laid-back, lots of new knowledge) with the fun parts of a job (making useful things, getting paid).

Dale Johnson

I am variously “search engineer”, “sphinx developer”, “data platform engineer”, “senior database engineer”, roughly in that order.

I have done 15 years of database work, on relational databases, data warehousing and search. I have done work on PostgreSQL internals, and have studied MySQL internals. Most recently I worked at Tellme, where I designed and developed a 1.5TB data repository to drive data warehouse reporting functions for the call details of each of over 1 billion Tellme-answered phone calls. This involved a redundant and reliable cluster of over 50 MySQL servers using inexpensive off-the-shelf hardware, with a combination of MySQL and record-oriented raw data files.

I am currently coding extensions in C++ to our search engine, doing things like parsing out Chinese sentences into searchable blocks to support http://mulu100.com. I have also recently implemented some statistical approaches to our full text search, gathering a corpus profile and applying that in real time to search terms to improve the selectivity of results.
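One common way to turn a corpus profile into better selectivity is inverse document frequency (IDF) weighting, where rare terms count for more than common ones. The sketch below assumes that technique purely for illustration; the post does not say which statistics are actually used, and `build_profile` and `term_weights` are hypothetical names.

```python
import math
from collections import Counter


def build_profile(documents):
    """Count, for each term, how many documents contain it (document frequency)."""
    df = Counter()
    for doc in documents:
        df.update(set(doc.lower().split()))
    return df, len(documents)


def term_weights(query, profile):
    """Weight each query term by smoothed IDF: rarer terms get larger weights."""
    df, n_docs = profile
    return {
        term: math.log((n_docs + 1) / (df[term] + 1)) + 1.0
        for term in query.lower().split()
    }
```

With a profile built from listings where nearly everything contains "used" but few contain "ipod", the query "used ipod" would give "ipod" the larger weight, so matches on it dominate the ranking.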

The key challenge, I think, is to be flexible enough to implement a solution as we discover the most natural way for a user to navigate through millions of items: to provide a back-end that can support the dynamic, state-of-the-art interactive user experience that people now expect, and to provide these results in real time. Many requests need to distinguish between tens of thousands of documents which have one or more of the search terms present, and determine the top 10 / 100 / 1000 of those items in under a quarter of a second. Under these operational constraints, the traditional relational database approach completely falls down; quite the fun engineering challenge.
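Picking the top few items out of tens of thousands of candidates does not require sorting all of them; a bounded heap is enough. The following is a minimal sketch of that idea, assuming a hypothetical in-memory inverted index (`index`, mapping each term to a set of document ids) and a toy score equal to the number of matching terms. It is not a description of the production engine.

```python
import heapq


def candidates(query_terms, index):
    """Return (doc_id, score) pairs for documents containing at least one term.

    The score here is simply how many of the query terms the document matches.
    """
    counts = {}
    for term in query_terms:
        for doc_id in index.get(term, ()):
            counts[doc_id] = counts.get(doc_id, 0) + 1
    return counts.items()


def top_k(query_terms, index, k=10):
    """Select the k best-scoring candidates without fully sorting them."""
    return heapq.nlargest(k, candidates(query_terms, index), key=lambda pair: pair[1])
```

`heapq.nlargest` keeps only k items in play at a time, so the cost grows with the number of candidates times log k rather than with a full sort of every matching document.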

What keeps me motivated is the knowledge that the web is still 95% noise and 5% signal. Search is the thing that has the potential to cut through the noise, so we’re really fighting the good fight: taking listings from potentially obscure but highly useful sites, making them available to the people they will really matter to, and doing it in a fair and egalitarian way.

NB. Our email addresses are first name at edgeio dotcom

New edgeio features to be launched tonight

Later tonight we will be releasing a new version of the edgeio web site. It has a number of new features that we are really excited about. The new site should go live sometime after midnight Pacific time.

1. The ability to post an item for sale directly onto the edgeio web site. No blog needed. When you do this for the first time you will be asked to choose a login and password. On subsequent visits you will be able to post instantly. This is in effect giving you a listings blog hosted by edgeio. In the future we want to consistently add features to this blog platform. Some we already know about (the ability to skin your blog for example). Others we want to hear from you about. feedback@edgeio.com is the place to send product feature requests.

2. The ability to add posts from your blog to edgeio without tagging your posts in advance. To accomplish this simply put your URL in edgeio’s home page and click “get listings”. We will retrieve all recent posts from your blog and you can select which ones to add to edgeio.

3. Tagclouds for cities and for items. Click on the number of cities on the edgeio home page to see the most popular cities on edgeio. After 2 weeks we have around 1200 cities with listings. This grows by about 300 cities a week. We expect to be at 10,000 cities this year. To see category clouds just click on the “more” at the end of each category list on the home page.

4. Advice for power users about how to automate edgeio listings via their RSS feed. We describe the “edgeio control language” or ECL. This is a means of using tags to tell edgeio a lot about your listing and helps ensure it is listed in the correct cities, and for the correct categories. Over time ECL will evolve into a rich control language for power listers. This is a link from the top of the new edgeio home page.

Rob Hof has written a piece on the new features. We will add more links here as they are published.

Other links:

Read/Write Web

SomeWhat Frank


First listings from China

First Chinese Listing on edgeio

The first edgeio listing from China was posted today.

Here is a link to it:

First edgeio listing from China

Original Post

Pretty soon there were several more:

Anima Causa “适形椅”

This is really exciting for us. edgeio was built to provide for bottom-up publishing from every town and city on the earth. To have achieved 12,000 listings in 10 days, and to have listings from 1,600 cities, feels great. To see Chinese listings is awesome.

I also posted a fuller look at “instant listings” on my personal blog.

A busy week at edgeio

I wanted to give a status report on our first week from a product development point of view. First, it’s been a busy first week at edgeio. Since launch, over 5,000 items have been published from more than 1,400 cities. While this is great news, things did not go smoothly for many people who wanted their items listed. In the world of RSS, ping servers, tagging, and claiming blogs there are numerous quirks that we did not encounter during our beta tests. So, we have learned a lot. We are now addressing known issues as quickly as possible. While mostly under-the-covers, we’ve been launching updates to the edgeio service on almost a daily basis since and plan to continue that pace of improvement until publishing on edgeio is easy for everyone.

In addition to fixing bugs, we are also trying to answer the question, “what if I don’t know how to tag and I am not sure if my website notifies a ping server?”. Stay tuned for the answer which should be coming within the week. If you have specific issues please send feedback to feedback@edgeio.com. We read everything and learn from it.