Indexing the firehose

Both Google and Bing have signed agreements with Twitter allowing them to index the live feed of ‘tweets’. There are several things I’d love to know about this.

Firstly, purely out of technical curiosity: how fast is that data flow, exactly? I wonder what kind of infrastructure is needed to index it in real time. Presumably they’re going to index everything?

Secondly, the business side… Several companies have exited successfully by creating something interesting enough for Google or Microsoft to want to buy. I wonder how many healthy ongoing businesses can be made from creating a data stream interesting enough for them to want to index?

And thirdly, the statistics will be fascinating, if we ever get to hear them. For example, I wonder how often the search query will now be longer than the item returned…


© Copyright Quentin Stafford-Fraser