Archive for the ‘Web/Tech’ Category

Media Futures, Part 2/5: ALGORITHM

Wednesday, March 23rd, 2005

An algorithm is a set of instructions or procedures for solving a problem.

In the same way that computer scientists 50 years ago focused on the single problem of designing a general-purpose computer, there is a similar focus in 2005 among leading Internet service architects: creating a social media computer that leverages user-generated content to automate the production of commercial content. Insofar as this represents the important problem that the best and brightest of us are looking to solve, it is to an extent a race for the best algorithm.

From PageRank to PeopleRank

Hovering over this endeavor is the shadow of the last great algorithm, namely the Google search engine. At its core, Google is PageRank (a name that nods both to one of its founders, Larry Page, and to its subject of operation, Web pages):

"PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page’s value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves ‘important’ weigh more heavily and help to make other pages ‘important.’"

In this case, (1) the input for the algorithm is the population of web pages, (2) the instructions rank them in value based on their link structure, and (3) the output is the list of links that you see when you search for something.
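To make those three steps concrete, here is a minimal sketch of the voting idea in the quote above. It is the textbook power-iteration formulation, not Google's production system: the four-page link graph, the damping factor of 0.85, and the function name `pagerank` are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by link structure using simple power iteration.

    `links` maps each page to the list of pages it links to.
    The damping factor and iteration count are illustrative assumptions.
    """
    # (1) Input: the population of pages, including pages that are only linked to.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    # (2) Instructions: each page repeatedly splits its current rank among
    # the pages it links to, so votes from important pages weigh more.
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its vote evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: A links to B and C, B to C, C to A, D to C.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)

# (3) Output: pages ordered from most to least "important".
ordered = sorted(ranks, key=ranks.get, reverse=True)
print(ordered)
```

In this toy graph, page C collects the most votes and page D, which nobody links to, ends up last. Swapping the page names for people and the links for tags or recommendations is the "PeopleRank" thought experiment in miniature.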

Now transpose people for web pages, and you see how the race for the next great search algorithm has less to do with organizing static HTML content than with coordinating the constantly changing expressions of millions of distributed people.  For an interesting perspective, see my fellow entrepreneur Mark Pincus’s riff on the PeopleWeb.  Many Internet businesses have tried to direct user behavior into certain architectures of participation.  Services such as Friendster, Orkut, and even Pincus’s own Tribe, presume to address all of a person’s social communication needs in one place.  All of these services, however, are now rapidly trying to reinvent themselves to stay relevant to a community that refuses to be intermediated by somebody else’s system. 

The services that seem to do the best job at enabling users to communicate on their own terms are those that manage to find a middle ground between the DIY (do-it-yourself) ethos that is beginning to pervade the web and the need for structure to guide constructive interactions (i.e., the reason why Wikipedia succeeds and most other wikis fail). LinkedIn, with its two million profiles of professional affiliations, provides the tools for interesting social media production, even if the site itself limits one’s imagination (open up the API please). The annoying digerati blogfest on folksonomies (myself included) stems from the simple but mildly heretical notion that users, given decent primary (meta)data, might actually be able to create their own systems that scale. Clay Shirky (lighting designer for the Wooster Group, CTO of SiteSpecific, advisor to Flickr, current leading pundit for the digerati at shirky.com) captures the anxiety perfectly in the title of last week’s panel at ETech: "Folksonomy, or How I Learned to Stop Worrying and Love the Mess". The question, then, is whether a PeopleRank algorithm that uses community-driven tags as its input could do to About.com, Gawker Media, and Weblogs what Google did to Alta Vista, namely deliver a superior end-user experience that requires only incremental server bandwidth to scale.

Media Futures, Part 1/5: AUTOMATA

Monday, March 21st, 2005

Self Operating Machines

The most exciting new Internet companies are focused on lead generation, behavioral targeting, co-registration paths (aka coreg) and domain name brokerage.

I seem to stumble every day across some new firm propping itself up on the shoulders of Google, Yahoo! or others to take advantage of a current wrinkle in an otherwise perfectly efficient landscape. The fact is, however, that these wrinkles never seem to disappear. These advertising mechanisms emerged alongside the pure media companies, starting with Doubleclick back in 1996, followed by other advertising networks such as Flycast, Overture and Advertising.com, each firm iterating on the prior model to gain some headroom from the imminent internal advertising service offerings of the pure media companies.

Although these advertising technologies have focused on targeting the behavior of consumers, they have also tended to ignore the role of consumers in the production process of the media that they consume. This is why these networks tend to dynamically balloon in terms of sales growth. They capitalize on a behavioral blindspot, where the supply of inventory and the demands of advertising value are disjointed. As consumers become smart about these artificial mechanisms (banners, keywords, freeipods), the effectiveness of the mechanisms drops and the networks look to get acquired by larger media entities.

The elusive goal of Internet media (and the advertising that drives its value) has been to keep up with changing consumer preferences as the technologies of communication continue to evolve. The adoption of a new means of using the Internet (whether it be ecommerce, webmail, search engines, or, shortly, blog readers) creates enormous economic value for the donors incredibly quickly.

The latest shift in online consumption has been the institutionalization of amateur publishing tools. The unique attribute of RSS is how simply it enables individual publishers of data to connect with individual subscribers to such data. RSS tools like TypePad, Flickr, AdSense and del.icio.us have made it easy for individuals to syndicate their preferences, memories and desires. RSS tracking tools like FeedBurner make it easy for individuals to track who is paying attention. On this site, I am streaming my Flickr photos, recommending books as an Amazon associate, and promoting a Google AdSense ad.

Not unlike dial-up authentication protocols (remember the classic AOL logon "hand-shake"),  the interaction between RSS feeds and their readers is a structured negotiation:  do you, John Q. Public, agree to take the whole feed, nothing but the feed, until you delete it?  I do. Click.   

When you aggregate all of these individual reading and writing agents, it looks more like a landscape of cellular automata than a traditional publishing model. This would seem to be the essence of social media (props to my wife and guide Tina Sharkey for coining this years ago and registering the domain) and social computing, two memes that seem to be growing in influence. When individual decisions such as applying certain tags to pages or photos achieve a broad social consensus, it is as if these tags begin to self-replicate, which is the essence of automatic behavior.

There is a good term to describe this, which comes out of physics: excitable media. As Wikipedia puts it:

Cellular automata provide a simple model to aid the understanding of excitable media. Each cell of the automaton is made to represent some section of the medium (for example, a patch of trees in a forest, or stress in heart tissue). Each cell can be in one of the three following states:

Quiescent or excitable — the cell is unexcited, and can be excited. In the forest fire example, this corresponds to the trees being unburnt.

Excited — the cell is excited. The trees are on fire.

Refractory — the cell has recently been excited and has not yet been through the refractory period. A patch of land where the trees have burnt and the vegetation has yet to regrow.
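The three states quoted above can be sketched in a few lines. This is a minimal illustration that assumes a one-dimensional row of cells, a refractory period of exactly one step, and the rule that a quiescent cell becomes excited when a direct neighbor is excited; the names `step` and `render` are mine, not taken from the Wikipedia article.

```python
QUIESCENT, EXCITED, REFRACTORY = 0, 1, 2


def step(cells):
    """Advance a one-dimensional excitable medium by one time step."""
    nxt = []
    for i, state in enumerate(cells):
        if state == EXCITED:
            # Burning trees burn out.
            nxt.append(REFRACTORY)
        elif state == REFRACTORY:
            # Assumed one-step refractory period: the vegetation regrows.
            nxt.append(QUIESCENT)
        else:
            # A quiescent cell ignites if either direct neighbor is excited.
            left = cells[i - 1] if i > 0 else QUIESCENT
            right = cells[i + 1] if i < len(cells) - 1 else QUIESCENT
            nxt.append(EXCITED if EXCITED in (left, right) else QUIESCENT)
    return nxt


def render(cells):
    """Draw quiescent cells as '.', excited as '*', refractory as 'o'."""
    return "".join(".*o"[state] for state in cells)


# Start with a single excited cell in the middle of seven quiescent ones.
cells = [QUIESCENT] * 7
cells[3] = EXCITED

history = [cells]
for _ in range(3):
    cells = step(cells)
    history.append(cells)

for row in history:
    print(render(row))
```

Running it shows the excitation spreading outward from the middle as two fronts, each trailed by a refractory cell that keeps the front from doubling back. A tag that catches on and spreads through a community behaves, loosely, the same way.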

The concept of cellular automata is useful as a metaphor for next generation Internet content, which is similarly dynamic, member-generated, and excitable.  In the next post, I will focus on algorithms, as they transform the automatic social media into business rules and procedures. 

Technostalgia and Human Computers

Monday, February 28th, 2005

Technostalgia

A few weeks back, I went to a Christie’s auction on the history of cyberspace. In front of me was Mark Stahlman, one of the earliest cybergeeks in New York. To my right, with his wife, was Mitch Kapor. Kapor of course invented Lotus 1-2-3, the precursor to Microsoft Excel; Lotus was ultimately sold to IBM for a lot of money. In any case, he was aggressively bidding on the first computer business plan by Eckert and Norbert Wiener’s entire library of strange scientific studies.

I bid for a couple of interesting things that weren’t quite as famous (or expensive), like an early 19th-century British study on early automata (with extended commentary on the Turk, the "magic" chess machine that was revealed to have a dwarf inside) as well as examples of the first programming language guides (for a language called Algol).

When I brought my collection home, I opened one of the oldest items, which was a book from 1833 on Zerah Colburn, who was one of the first human computers. He could calculate in his head so quickly that crowds were literally amazed. I do not think that former AOL biz dev exec David Colburn is a descendant, although I am not sure of this insofar as he too could calculate accretive partnerships with unnatural alacrity. The book is so old that its spine has basically turned into dust and so all of the pages are barely caked together. A last remnant of the first human-powered computer.

It is now 2005, more than 50 years since the first electronic computer was built, and we have a massively popular open computing program that is the internet. It is generating more than $100 billion in annual online ecommerce and more than $10 billion in advertising, which is rapidly growing. It has become an automatic part of our daily lives, converting the 5-10 physical interactions that we might have had during a work day in the past into a constant interactive stream of tens if not hundreds of conversations.

A few weeks ago, Jordan Rohan, a solitary sell-side analyst, came out with a note that claimed paid search keyword prices were rapidly declining. This appears to have been based on a single anecdotal, off-the-record conversation with a 3rd tier search engine. The entire sector plummeted.

"Checks indicate pricing has shifted from robust to weaker-than-expected," Mr. Rohan said of Google, adding the pricing shortfall will prove difficult to make up in the last six weeks of this quarter. Advertising prices have fallen 10 per cent this quarter, and the company’s profit margins will narrow as hiring continues in the face of slowing sales growth, he said. Shares of Google fell $5.06 to $188.89 on the Nasdaq Stock Market yesterday after earlier dropping as low as $182.23. AP

Investors are grasping at any evidence of a turn for the worse (or the better) for GOOG and YHOO, and to a lesser extent, EBAY and AMZN. Almost a quarter-trillion dollars of market capitalization is currently being wagered on the fortunes of these companies. They are perceived in terms of "platforms, networks, marketplaces" central to the future of the Media and Technology markets. Just like CompuServe, Netscape, Excite, and of course AOL before them. What is defensible now is the "critical mass" of the consumer audience base (250 million or so) and the highly scalable merchandising, advertising, and search systems that have evolved to support their needs online.


The Reintermediation of  Community into the Online Equation 

What is curious is that when you press investors as to why these businesses can really continue to scale, their answers somehow point to the participation of the community in the production process. They talk about the "sellers" of eBay. For Google, they talk about the buyers of AdWords and the small site owners that can stick AdSense (and soon YPN) easily on their sites. Yahoo and Amazon suffer because they haven’t figured out how to syndicate people’s desires to sell to others as well as they have done with consumers themselves.


It was interesting to see About get bought for $400m by the New York Times. Have you been to About.com recently? It is riddled with highly curated ads. Cheesy, but satisfying in terms of answering certain kinds of commercial problems with human experts.

What I heard was that the digital gray lady was more interested in About.com’s ability to generate highly popular organic search results than it was in About.com’s content. The New York Times has figured out, correctly, that it may not be able to afford keywords on search engines for placing its stories. Since so much internet activity starts with search, and the engines control so much inventory, New York Times content therefore needs to show up algorithmically: About.com as the New York Times in-house SEO platform.

When Scott Kurnit started The Mining Company in 1997, he was late to the party and I thought that the idea of hiring actual guides to curate content was silly. Boy, was I wrong! Well, it was silly. And opinion-driven, amateur, and limited relative to the breadth of the automatic directories and engines.

And so now the traditional elite editorial voices of the Times are being undermined by a content management approach that in some strange way reintermediates human agency in a distributed fashion. How many guides are there, 500 or so? Perhaps this is the face of a new kind of media machine. You can download a presentation that I gave to the Stanford Media Center in early February with data supporting this thesis: Download sethgoldstein.mediacenter.2.9.pdf.


The Four (New) A’s of Internet Advertising:

And so from von Kempelen’s dwarf hidden within his chess-playing machine the Turk, to Kurnit’s human guides powering the SEO efficacy of About.com, there would seem to be a long history of computing machines exposing their human qualities as core to their effectiveness. In proposing a framework for online media futures, it is critical to incorporate the consumer as both a creator of content and a subject of advertisements at the same time. The next post will be an attempt to do so, along the following lines:

  1. Automata
  2. Algorithm
  3. Alchemy
  4. Arbitrage