Tuesday, May 26, 2009

Secret of Googlenomics: Data-Fueled Recipe Brews Profitability (From Wired.com)

In the midst of financial apocalypse, the gadflies and gurus of the global marketplace are gathered at the San Francisco Hilton for the annual meeting of the American Economics Association. The mood is similar to a seismologist convention in the wake of the Big One. Yet surprisingly, one of the most popular sessions has nothing to do with toxic assets, derivatives, or unemployment curves.

"I'm going to talk about online auctions," says Hal Varian, the session's first speaker. Varian is a lanky 62-year-old professor at UC Berkeley's Haas School of Business and School of Information, but these days he's best known as Google's chief economist. This morning's crowd hasn't come for predictions about the credit market; they want to hear about Google's secret sauce.

Varian is an expert on what may be the most successful business idea in history: AdWords, Google's unique method for selling online advertising. AdWords analyzes every Google search to determine which advertisers get each of up to 11 "sponsored links" on every results page. It's the world's biggest, fastest auction, a never-ending, automated, self-service version of Tokyo's boisterous Tsukiji fish market, and it takes place, Varian says, "every time you search." He never mentions how much revenue advertising brings in. But Google is a public company, so anyone can find the number: It was $21 billion last year.

His talk quickly becomes technical. There's the difference between the Generalized Second Price auction model and the Vickrey-Clarke-Groves alternative. Game theory takes a turn; so does the Nash Equilibrium. Terms involving the c-word—as in clicks—get tossed around like beach balls at a summer rock festival. Clickthrough rate. Cost per click. Supply curve of clicks. The audience is enthralled.

During the question-and-answer period, a man wearing a camel-colored corduroy blazer raises his hand. "Let me understand this," he begins, half skeptical, half unsure. "You say that an auction happens every time a search takes place? That would mean millions of times a day!"

Varian smiles. "Millions," he says, "is actually quite an understatement."

Why does Google even need a chief economist? The simplest reason is that the company is an economy unto itself. The ad auction, marinated in that special sauce, is a seething laboratory of fiduciary forensics, with customers ranging from giant multinationals to dorm-room entrepreneurs, all billed by the world's largest micropayment system.

Google depends on economic principles to hone what has become the search engine of choice for more than 60 percent of all Internet surfers, and the company uses auction theory to grease the skids of its own operations. All these calculations require an army of math geeks, algorithms of Ramanujanian complexity, and a sales force more comfortable with whiteboard markers than fairway irons.

Varian, an upbeat, avuncular presence at the Googleplex in Mountain View, California, serves as the Adam Smith of the new discipline of Googlenomics. His job is to provide a theoretical framework for Google's business practices while leading a team of quants to enforce bottom-line discipline, reining in the more propellerhead propensities of the company's dominant engineering culture.

Googlenomics actually comes in two flavors: macro and micro. The macroeconomic side involves some of the company's seemingly altruistic behavior, which often baffles observers. Why does Google give away products like its browser, its apps, and the Android operating system for mobile phones? Anything that increases Internet use ultimately enriches Google, Varian says. And since using the Web without using Google is like dining at In-N-Out without ordering a hamburger, more eyeballs on the Web lead inexorably to more ad sales for Google.

The microeconomics of Google is more complicated. Selling ads doesn't generate only profits; it also generates torrents of data about users' tastes and habits, data that Google then sifts and processes in order to predict future consumer behavior, find ways to improve its products, and sell more ads. This is the heart and soul of Googlenomics. It's a system of constant self-analysis: a data-fueled feedback loop that defines not only Google's future but the future of anyone who does business online.

When the American Economics Association meets next year, the financial crisis may still be topic A. But one of the keynote speakers has already been chosen: Googlenomist Hal Varian.

Ironically, economics was a distant focus in the first days of Google. After Larry Page and Sergey Brin founded the company in 1998, they channeled their energy into its free search product and left much of the business planning to a 22-year-old Stanford graduate named Salar Kamangar, Google's ninth employee. The early assumption was that although ads would be an important source of revenue, licensing search technology and selling servers would be just as lucrative. Page and Brin also believed that ads should be useful and welcome—not annoying intrusions. Kamangar and another early Googler, Eric Veach, set out to implement that ideal. Neither had a background in business or economics. Kamangar had been a biology major, and Veach's field of study was computer science.

Hal Varian, high priest of Googlenomics.
Photo: Joe Pugliese

Google's ads were always plain blocks of text relevant to the search query. But at first, there were two kinds. Ads at the top of the page were sold the old-fashioned way, by a crew of human beings headquartered largely in New York City. Salespeople wooed big customers over dinner, explaining what keywords meant and what the prices were. Advertisers were then billed by the number of user views, or impressions, regardless of whether anyone clicked on the ad. Down the right side were other ads that smaller businesses could buy directly online. The first of these, for live mail-order lobsters, was sold in 2000, just minutes after Google deployed a link reading see your ad here.

But as the business grew, Kamangar and Veach decided to price the slots on the side of the page by means of an auction. Not an eBay-style auction that unfolds over days or minutes as bids are raised or abandoned, but a huge marketplace of virtual auctions in which sealed bids are submitted in advance and winners are determined algorithmically in fractions of a second. Google hoped that millions of small and medium companies would take part in the market, so it was essential that the process be self-service. Advertisers bid on search terms, or keywords, but instead of bidding on the price per impression, they were bidding on a price they were willing to pay each time a user clicked on the ad. (The bid would be accompanied by a budget of how many clicks the advertiser was willing to pay for.) The new system was called AdWords Select, while the ads at the top of the page, with prices still set by humans, were renamed AdWords Premium.

One key innovation was that all the sidebar slots on the results page were sold off in a single auction. (Compare that to an early pioneer of auction-driven search ads, Overture, which held a separate auction for each slot.) The problem with an all-at-once auction, however, was that advertisers might be inclined to lowball their bids to avoid the sucker's trap of paying a huge amount more than the guy just below them on the page. So the Googlers decided that the winner of each auction would pay the amount (plus a penny) of the bid from the advertiser with the next-highest offer. (If Joe bids $10, Alice bids $9, and Sue bids $6, Joe gets the top slot and pays $9.01. Alice gets the next slot for $6.01, and so on.) Since competitors didn't have to worry about costly overbidding errors, the paradoxical result was that it encouraged higher bids.
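To make the mechanics concrete, here is a minimal sketch of the pricing rule described above: each winner pays the next-highest bid plus a penny. The bidder names and amounts come from the example in the text; everything else (reserve prices, quality scoring, budgets) is left out, so this is an illustration rather than Google's actual implementation.

```python
def gsp_prices(bids, slots, increment=0.01):
    """Return (bidder, price_paid) per slot, ordered from highest bid down."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    results = []
    for i in range(min(slots, len(ranked))):
        bidder, _ = ranked[i]
        # Each winner pays the next bidder's offer plus one cent; the lowest
        # winner would normally pay a reserve/minimum bid, not modeled here.
        next_bid = ranked[i + 1][1] if i + 1 < len(ranked) else 0.0
        results.append((bidder, round(next_bid + increment, 2)))
    return results

print(gsp_prices({"Joe": 10.00, "Alice": 9.00, "Sue": 6.00}, slots=3))
# [('Joe', 9.01), ('Alice', 6.01), ('Sue', 0.01)]
```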

"Eric Veach did the math independently," Kamangar says. "We found out along the way that second-price auctions had existed in other forms in the past and were used at one time in Treasury auctions." (Another crucial innovation had to do with ad quality, but more on that later.)

Google's homemade solution to its ad problem impressed even Paul Milgrom, the Stanford economist who is to auction theory what Letitia Baldrige is to etiquette. "I've begun to realize that Google somehow stumbled on a level of simplification in ad auctions that was not included before," he says. And applying a variation on second-price auctions wasn't just a theoretical advance. "Google immediately started getting higher prices for advertising than Overture was getting."

Google hired Varian in May 2002, a few months after implementing the auction-based version of AdWords. The offer came about when Google's then-new CEO, Eric Schmidt, ran into Varian at the Aspen Institute and they struck up a conversation about Internet issues. Schmidt was with Larry Page, who was pushing his own notions about how some of the big problems in business and science could be solved by using computation and analysis on an unprecedented scale. Varian remembers thinking, "Why did Eric bring his high-school nephew?"

Schmidt, whose father was an economist, invited Varian to spend a day or two a week at Google. On his first visit, Varian asked Schmidt what he should do. "Why don't you take a look at the ad auction?" Schmidt said.

Google had already developed the basics of AdWords, but there was still plenty of tweaking to do, and Varian was uniquely qualified to "take a look." As head of the information school at UC Berkeley and coauthor (with Carl Shapiro) of a popular book called Information Rules: A Strategic Guide to the Network Economy, he was already the go-to economist on ecommerce.

At the time, most online companies were still selling advertising the way it was done in the days of Mad Men. But Varian saw immediately that Google's ad business was less like buying traditional spots and more like computer dating. "The theory was Google as yenta—matchmaker," he says. He also realized there was another old idea underlying the new approach: A 1983 paper by Harvard economist Herman Leonard described using marketplace mechanisms to assign job candidates to slots in a corporation, or students to dorm rooms. It was called a two-sided matching market. "The mathematical structure of the Google auction," Varian says, "is the same as those two-sided matching markets."

Varian tried to understand the process better by applying game theory. "I think I was the first person to do that," he says. After just a few weeks at Google, he went back to Schmidt. "It's amazing!" Varian said. "You've managed to design an auction perfectly."

To Schmidt, who had been at Google barely a year, this was an incredible relief. "Remember, this was when the company had 200 employees and no cash," he says. "All of a sudden we realized we were in the auction business."

It wasn't long before the success of AdWords Select began to dwarf that of its sister system, the more traditional AdWords Premium. Inevitably, Veach and Kamangar argued that all the ad slots should be auctioned off. In search, Google had already used scale, power, and clever algorithms to change the way people accessed information. By turning over its sales process entirely to an auction-based system, the company could similarly upend the world of advertising, removing human guesswork from the equation.

The move was risky. Going ahead with the phaseout—nicknamed Premium Sunset—meant giving up campaigns that were selling for hundreds of thousands of dollars, for the unproven possibility that the auction process would generate even bigger sums. "We were going to erase a huge part of the company's revenue," says Tim Armstrong, then head of direct sales in the US. (This March, Armstrong left Google to become AOL's new chair and CEO.) "Ninety-nine percent of companies would have said, 'Hold on, don't make that change.' But we had Larry, Sergey, and Eric saying, 'Let's go for it.'"

News of the switch jacked up the Maalox consumption among Google's salespeople. Instead of selling to corporate giants, their job would now be to get them to place bids in an auction? "We thought it was a little half-cocked," says Jeff Levick, an early leader of the Google sales team. The young company wasn't getting rid of its sales force (though the system certainly helped Google run with far fewer salespeople than a traditional media company) but was asking them to get geekier, helping big customers shape online strategies as opposed to simply selling ad space.

Levick tells a story of visiting three big customers to inform them of the new system: "The guy in California almost threw us out of his office and told us to fuck ourselves. The guy in Chicago said, 'This is going to be the worst business move you ever made.' But the guy in Massachusetts said, 'I trust you.'"

That client knew math, says Levick, whose secret weapon was the numbers. When the data was crunched—and Google worked hard to give clients the tools needed to run the numbers themselves—advertisers saw that the new system paid off for them, too.

AdWords was such a hit that Google went auction-crazy. The company used auctions to place ads on other Web sites (that program was dubbed AdSense). "But the really gutsy move," Varian says, "was using it in the IPO." In 2004, Google used a variation of a Dutch auction for its IPO; Brin and Page loved that the process leveled the playing field between small investors and powerful brokerage houses. And in 2008, the company couldn't resist participating in the FCC's auction to reallocate portions of the radio spectrum.

Google even uses auctions for internal operations, like allocating servers among its various business units. Since moving a product's storage and computation to a new data center is disruptive, engineers often put it off. "I suggested we run an auction similar to what the airlines do when they oversell a flight. They keep offering bigger vouchers until enough customers give up their seats," Varian says. "In our case, we offer more machines in exchange for moving to new servers. One group might do it for 50 new ones, another for 100, and another won't move unless we give them 300. So we give them to the lowest bidder—they get their extra capacity, and we get computation shifted to the new data center."
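The server-reallocation auction Varian describes is easy to sketch as well: it is effectively a reverse auction in which the group demanding the fewest extra machines gets moved. The group names below are hypothetical; only the 50/100/300 machine counts come from the quote.

```python
def pick_group_to_move(bids):
    """bids: dict mapping group -> extra machines it demands in order to move.
    Returns the lowest bidder, i.e. the cheapest move for the data center."""
    group, machines = min(bids.items(), key=lambda kv: kv[1])
    return group, machines

group, machines = pick_group_to_move({"logs-pipeline": 300, "ads-batch": 100, "maps-tiles": 50})
print(f"{group} moves to the new data center and receives {machines} extra machines")
# maps-tiles moves to the new data center and receives 50 extra machines
```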

The transition to an all-auction sales model was a milestone for Google, ensuring that its entire revenue engine would run with the same computer-science fervor as its search operation. Now, when Google recruits alpha geeks, it is just as likely to have them focus on AdWords as on search or apps.

The across-the-board emphasis on engineering, mathematical formulas, and data-mining has made Google a new kind of company. But to fully understand why, you have to go back and look under AdWords' hood.

Most people think of the Google ad auction as a straightforward affair. In fact, there's a key component that few users know about and even sophisticated advertisers don't fully understand. The bids themselves are only a part of what ultimately determines the auction winners. The other major determinant is something called the quality score. This metric strives to ensure that the ads Google shows on its results page are true, high-caliber matches for what users are querying. If they aren't, the whole system suffers and Google makes less money.

Google determines quality scores by calculating multiple factors, including the relevance of the ad to the specific keyword or keywords, the quality of the landing page the ad is linked to, and, above all, the percentage of times users actually click on a given ad when it appears on a results page. (Other factors, Google won't even discuss.) There's also a penalty invoked when the ad quality is too low—in such cases, the company slaps a minimum bid on the advertiser. Google explains that this practice—reviled by many companies affected by it—protects users from being exposed to irrelevant or annoying ads that would sour people on sponsored links in general. Several lawsuits have been filed by would-be advertisers who claim that they are victims of an arbitrary process by a quasi monopoly.

You can argue about fairness, but arbitrary it ain't. To figure out the quality score, Google needs to estimate in advance how many users will click on an ad. That's very tricky, especially since we're talking about billions of auctions. But since the ad model depends on predicting clickthroughs as perfectly as possible, the company must quantify and analyze every twist and turn of the data. Susan Wojcicki, who oversees Google's advertising, refers to it as "the physics of clicks."
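The exact formula is not public, so the sketch below is only a simplified illustration of the idea that bids and quality scores jointly decide the ranking. It assumes ad rank = bid × quality (with quality standing in for predicted clickthrough rate) and that each winner pays the smallest amount that keeps it ahead of the next ad, a quality-weighted variant of the second-price rule described in the auction literature. All numbers are invented.

```python
def rank_ads(ads, minimum_bid=0.05):
    """ads: list of (name, bid_per_click, quality_score). Returns (name, price) per slot."""
    ranked = sorted(ads, key=lambda a: a[1] * a[2], reverse=True)
    slots = []
    for i, (name, bid, quality) in enumerate(ranked):
        if i + 1 < len(ranked):
            _, next_bid, next_quality = ranked[i + 1]
            # Smallest bid that still beats the next ad's rank score, plus a cent.
            price = round(next_bid * next_quality / quality + 0.01, 2)
        else:
            price = minimum_bid  # hypothetical floor for the last slot
        slots.append((name, price))
    return slots

print(rank_ads([("A", 4.00, 0.05), ("B", 6.00, 0.02), ("C", 1.50, 0.10)]))
# [('A', 3.01), ('C', 1.21), ('B', 0.05)] -- a low bid with high quality can outrank a high bid
```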

During Varian's second summer in Mountain View, when he was still coming in only a day or two a week, he asked a recently hired computer scientist from Stanford named Diane Tang to create the Google equivalent of the Consumer Price Index, called the Keyword Pricing Index. "Instead of a basket of goods like diapers and beer and doughnuts, we have keywords," says Tang, who is known internally as the Queen of Clicks.

The Keyword Pricing Index is a reality check. It alerts Google to any anomalous price bubbles, a sure sign that an auction isn't working properly. Categories are ranked by the cost per click that advertisers generally have to pay, weighted by distribution, and then separated into three bundles: high cap, mid cap, and low cap. "The high caps are very competitive keywords, like 'flowers' and 'hotels,'" Tang says. In the mid-cap realm you have keywords that may vary seasonally—the price to place ads alongside results for "snowboarding" skyrockets during the winter. Low caps like "Massachusetts buggy whips" are the stuff of long tails.
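As a rough sketch of what such an index might look like (the real methodology is not described beyond "cost per click weighted by distribution"), the snippet below computes a click-weighted average cost per click for each keyword category and sorts the categories into high-, mid-, and low-cap buckets. Category names, prices, and thresholds are all made up for illustration.

```python
def keyword_pricing_index(observations):
    """observations: iterable of (category, cost_per_click, clicks).
    Returns {category: (click-weighted average CPC, cap tier)}."""
    totals = {}
    for category, cpc, clicks in observations:
        spend, volume = totals.get(category, (0.0, 0))
        totals[category] = (spend + cpc * clicks, volume + clicks)

    def tier(price):  # hypothetical cut-offs
        return "high cap" if price >= 2.0 else "mid cap" if price >= 0.5 else "low cap"

    return {c: (round(spend / volume, 2), tier(spend / volume))
            for c, (spend, volume) in totals.items()}

sample = [("flowers", 3.10, 12000), ("flowers", 2.80, 9000),
          ("snowboarding", 0.90, 4000), ("buggy whips", 0.08, 30)]
print(keyword_pricing_index(sample))
# {'flowers': (2.97, 'high cap'), 'snowboarding': (0.9, 'mid cap'), 'buggy whips': (0.08, 'low cap')}
```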

Tang's index is just one example of a much broader effort. As the amount of data at the company's disposal grows, the opportunities to exploit it multiply, which ends up further extending the range and scope of the Google economy. So it's utterly essential to calculate correctly the quality scores that prop up AdWords.

"The people working for me are generally econometricians—sort of a cross between statisticians and economists," says Varian, who moved to Google full-time in 2007 (he's on leave from Berkeley) and leads two teams, one of them focused on analysis.

"Google needs mathematical types that have a rich tool set for looking for signals in noise," says statistician Daryl Pregibon, who joined Google in 2003 after 23 years as a top scientist at Bell Labs and AT&T Labs. "The rough rule of thumb is one statistician for every 100 computer scientists."

Keywords and click rates are their bread and butter. "We are trying to understand the mechanisms behind the metrics," says Qing Wu, one of Varian's minions. His specialty is forecasting, so now he predicts patterns of queries based on the season, the climate, international holidays, even the time of day. "We have temperature data, weather data, and queries data, so we can do correlation and statistical modeling," Wu says. The results all feed into Google's backend system, helping advertisers devise more-efficient campaigns.
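As an illustration of the kind of "correlation and statistical modeling" Wu mentions, the snippet below correlates a synthetic daily query series with temperature and fits a simple linear model. The data and the relationship are invented; Google's actual models and datasets are not public.

```python
import numpy as np

rng = np.random.default_rng(0)
days = np.arange(365)
temperature = 15 + 10 * np.sin(2 * np.pi * days / 365)        # seasonal temperature curve
queries = 1000 - 20 * temperature + rng.normal(0, 50, 365)    # e.g. a winter-heavy query like "ski resorts"

r = np.corrcoef(temperature, queries)[0, 1]
slope, intercept = np.polyfit(temperature, queries, 1)        # crude forecasting model
print(f"correlation = {r:.2f}; queries ~ {slope:.1f} * temperature + {intercept:.0f}")
```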

To track and test their predictions, Wu and his colleagues use dozens of onscreen dashboards that continuously stream information, a sort of Bloomberg terminal for the Googlesphere. Wu checks obsessively to see whether reality is matching the forecasts: "With a dashboard, you can monitor the queries, the amount of money you make, how many advertisers you have, how many keywords they're bidding on, what the rate of return is for each advertiser."

Wu calls Google "the barometer of the world." Indeed, studying the clicks is like looking through a window with a panoramic view of everything. You can see the change of seasons—clicks gravitating toward skiing and heavy clothes in winter, bikinis and sunscreen in summer—and you can track who's up and down in pop culture. Most of us remember news events from television or newspapers; Googlers recall them as spikes in their graphs. "One of the big things a few years ago was the SARS epidemic," Tang says. Wu didn't even have to read the papers to know about the financial meltdown—he saw the jump in people Googling for gold. And since prediction and analysis are so crucial to AdWords, every bit of data, no matter how seemingly trivial, has potential value.

Since Google hired Varian, other companies, like Yahoo, have decided that they, too, must have a chief economist heading a division that scrutinizes auctions, dashboards, and econometric models to fine-tune their business plan. In 2007, Harvard economist Susan Athey was surprised to get a summons to Redmond to meet with Steve Ballmer. "That's a call you take," she says. Athey spent last year working in Microsoft's Cambridge, Massachusetts, office.

Can the rest of the world be far behind? Although Eric Schmidt doesn't think it will happen as quickly as some believe, he does think that Google-style auctions are applicable to all sorts of transactions. The solution to the glut in auto inventory? Put the entire supply of unsold cars up for bid. That'll clear out the lot. Housing, too: "People use auctions now in cases of distress, like auctioning a house when there are no buyers," Schmidt says. "But you can imagine a situation in which it was a normal and routine way of doing things."

Varian believes that a new era is dawning for what you might call the datarati—and it's all about harnessing supply and demand. "What's ubiquitous and cheap?" Varian asks. "Data." And what is scarce? The analytic ability to utilize that data. As a result, he believes that the kind of technical person who once would have wound up working for a hedge fund on Wall Street will now work at a firm whose business hinges on making smart, daring choices—decisions based on surprising results gleaned from algorithmic spelunking and executed with the confidence that comes from really doing the math.

It's a satisfying development for Varian, a guy whose career as an economist was inspired by a sci-fi novel he read in junior high. "In Isaac Asimov's first Foundation Trilogy, there was a character who basically constructed mathematical models of society, and I thought this was a really exciting idea. When I went to college, I looked around for that subject. It turned out to be economics." Varian is telling this story from his pied-à-Plex, where he sometimes stays during the week to avoid driving the 40-some miles from Google headquarters to his home in the East Bay. It happens to be the ranch-style house, which Google now owns, where Brin and Page started the company.

There's a wild contrast between this sparsely furnished residence and what it has spawned—dozens of millionaire geeks, billions of auctions, and new ground rules for businesses in a data-driven society that is far weirder than the one Asimov envisioned nearly 60 years ago. What could be more baffling than a capitalist corporation that gives away its best services, doesn't set the prices for the ads that support it, and turns away customers because their ads don't measure up to its complex formulas? Varian, of course, knows that his employer's success is not the result of inspired craziness but of an early recognition that the Internet rewards fanatical focus on scale, speed, data analysis, and customer satisfaction. (A bit of auction theory doesn't hurt, either.) Today we have a name for those rules: Googlenomics. Learn them, or pay the price.

Senior writer Steven Levy (steven_levy@wired.com) wrote about the Kryptos sculpture at CIA headquarters in issue 17.05.

Sunday, May 10, 2009

Google - The 2008 Founders' Letter



Introduction

Since 2004, when Google began to have annual reports, Larry and I have taken turns writing an annual letter. I never imagined I would be writing one in the midst of an economic crisis unlike any we have seen in decades. As I write this, search queries are reflecting economic hardship, the major market indexes are one half of what they were less than 18 months ago, and unemployment is at record levels.

Nonetheless, I am optimistic about the future, because I believe scarcity breeds clarity: it focuses minds, forcing people to think creatively and rise to the challenge. While much smaller in scale than today's global collapse, the dot-com bust of 2000-2002 pushed Google and others in the industry to take some tough decisions — and we all emerged stronger as a result.

This new crisis punctuates the end of our first decade as a company, a decade that has brought great change to Google, the web and the Internet as a whole. As I reflect on this short time period, our accomplishments and our shortcomings, I am very excited about what the next ten years may bring.

But let me start a little farther back — in 1990, the very first web page was created at http://info.cern.ch/. By late 1992, there were only 26 websites in the world so there was not much need for a search engine. When NCSA Mosaic (the first widely used web browser) came out in 1993, every new website that was created would get posted to its "What's New" page at a rate of about one a day: http://www.dejavu.org/prep_whatsnew.htm. Just five years later, in 1998, web pages numbered in the tens of millions, and search became crucial. At this point, Google was a small research project at Stanford; later that year it became a tiny startup. The search index sat on a small number of disk drives enclosed within Lego-like blocks. Perhaps a few thousand people, mostly academics, used the service.

Fast-forward to today, the changes in scale are striking. The web itself has grown by about a factor of 10,000, as has our search index. The number of people who use Google's services every day is now in the hundreds of millions. More importantly, billions of people now have access to the Internet via computers and mobile phones. Like many other web companies, the vast majority of our services are available worldwide and free to users because they are supported by ads. So a child in an Internet cafe in a developing nation can use the same online tools as the wealthiest person in the world. I am proud of the small role Google has played in the democratization of information, but there is much more left to do.

Search

Search remains at the very core of what we do at Google, just as it has been from our earliest days. As the scale has changed dramatically over the years, the presentation and quality of our search results have also undergone many changes since 1998. In the past year alone we have made 359 changes to our web search — nearly one per day. Some are not easy to spot, such as changes in ranking based on personalization (launched broadly in 2005), but they are important in getting the most relevant search results. Others are very easy to see and improve search efficiency in a very clear way, such as spelling correction, annotations, and suggestions.

While I am proud of what has been accomplished in search over the past decade, there are important areas in which I wish we had made more progress. Perfect search requires human-level artificial intelligence, which many of us believe is still quite distant. However, I think it will soon be possible to have a search engine that "understands" more of the queries and documents than we do today. Others claim to have accomplished this, and Google's systems have more smarts behind the curtains than may be apparent from the outside, but the field as a whole is still shy of where I would have expected it to be. Part of the reason is the dramatic growth of the web — for any particular query, it is likely there are many documents on the topic using the exact same vocabulary. And as the web grows, so does the breadth and depth of the curiosity of those searching. I expect our search engine to become much "smarter" in the coming decade.

So too will the interfaces by which users look for and receive information. While many things have changed, the basic structure of Google search results today is fairly similar to how it was ten years ago. This is partly because of the benefits of simplicity; in fact, the Google homepage has become increasingly simple over the years: http://blogoscoped.com/archive/2006-04-21-n63.html. But we are starting to see more significant changes in search interfaces. Today you can search from your cell phone by just speaking into it and Google Reader can suggest interesting blogs without any query at all. It is my expectation that in the next decade our searches and results will look very different than they do today.

One of the most striking changes that has happened in the past few years is that search results are no longer just web pages. They include images, videos, books, maps, and more. From the outset, we realized that to have comprehensive search we would have to venture beyond web pages. In 2001, we launched Google Image Search and via Google Groups we made available and searchable the most comprehensive archive of Usenet postings ever assembled (800 million messages dating back to 1981).

Just this past fall we expanded Image Search to include the LIFE Magazine photo archive. This is a collection of 10 million photos, more than 95 percent of which have never been seen before, and includes historical pictures such as the Skylab space station orbiting above Earth and Neil Armstrong landing on the moon. Integrating images into search remains a challenge, primarily because we are so reliant on the surrounding text to gauge a picture's relevance. In the future, using enhanced computer vision technology, we hope to be able to understand what's depicted in the image itself.

YouTube

Video is often thought of as an entertainment medium, but it is also a very important source of high-quality information. Some queries seem like natural choices to show video results, such as for sports and travel destinations. Yet videos are also great resources for topics such as computer hardware and software (I bought my last RAID based on a video review), scientific experiments, and education such as courses on quantum mechanics.

Google Video was first launched in 2005 as a search service for television content because TV closed-captioning made search possible and user-generated video had yet to take off. But it subsequently evolved into a site where individuals and corporations alike could post their own videos. Today Google Video searches many different video hosting sites, the largest of which is YouTube, which we acquired in 2006.

Every minute, 15 hours' worth of video are uploaded to YouTube — the equivalent of 86,000 new full-length movies every week. YouTube channels now include world leaders (the President of the United States and prime ministers of Japan, the UK and Australia), royalty (the Queen of England and Queen Rania of Jordan), religious leaders (the Pope), and those seeking free expression (when Venezuelan broadcaster El Observador was shut down by the government, it started broadcasting on YouTube).
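The arithmetic above checks out if a "full-length movie" is taken to run roughly an hour and three quarters; that assumed running time is mine, not YouTube's.

```python
hours_uploaded_per_week = 15 * 60 * 24 * 7   # 15 hours per minute -> 151,200 hours a week
movies_per_week = 86_000
print(hours_uploaded_per_week / movies_per_week)  # ~1.76 hours per "full-length movie"
```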

When it began, online video was associated with small fuzzy images. Today, many of our uploads are in HD quality (720 rows and greater) and can be streamed to computers, televisions, and mobile phones with increasing fidelity (thanks to improvements in video compression). In the future, vast libraries of movie-theater-quality video (4000+ columns) will be available instantly on any device.

Books

Books are one of the greatest sources of information in the world and from the earliest days of Google we hoped to eventually incorporate them into our search corpus. Within a couple of years, Larry was experimenting with digitizing books using a jury-rigged contraption in our office. By 2003, we launched Google Print, now called Google Book Search. Today, we are able to search the full text of almost 10 million books. Moreover, in October we reached a landmark agreement with a broad class of authors and publishers, including the Authors' Guild and the Association of American Publishers. If approved by the Court, this deal will make millions of in-copyright, out-of-print books available for U.S. readers to search, preview, and buy online — something that has been simply unavailable to date. Many of these books are difficult, if not impossible, to find because they are not sold through bookstores or held on most library shelves; yet they make up the vast majority of books in existence. The agreement also provides other important public benefits, including increased access to users with disabilities, the creation of a non-profit registry to help others license these books, the creation of a corpus to promote basic research, and free access to full texts at a kiosk in every public library in the United States.

Geo

While digitizing all the world's books is an ambitious project, digitizing the world is even more challenging. Beginning with our acquisition of Keyhole (the basis of Google Earth) in October 2004, it has been our goal to provide high-quality information for geographic needs. By offering both Google Earth and Google Maps, we aim to provide a comprehensive world model encompassing all geographic information including imagery, topography, roads, buildings, and annotations. Today we stitch together images from satellites, airplanes, cars, and user uploads, as well as collect important data, such as roads, from numerous different sources including governments, companies, and directly from users. After the launch of Google Map Maker in Pakistan, users mapped 25,000 kilometers of uncharted road in just two months.

Ads

We always believed that we could have an advertising system that would add value not only to our bottom line but also to the quality of our search result pages. Rather than relying on distracting flashy ads, we developed relevant, clearly marked text-based ads above and to the right of our search results. After a number of early experiments, the first self-service system known as AdWords launched in 2000 starting with 350 advertisers. While these ads yielded small amounts of money compared to banner ads at the time, as the dot-com bubble burst, this system became our life preserver. As we syndicated it to EarthLink and then AOL, it became an important source of revenue for other companies as well.

Today, AdWords has grown beyond just being a feature of Google. It is a vast ecosystem that provides valuable traffic and leads to hundreds of thousands of businesses: indeed, in many ways it has helped democratize access to advertising, by creating an open marketplace where small businesses and start-ups can compete with well-established, well-funded companies. AdWords is also an important source of revenue for websites that create the content that we all search. Last year, AdSense (our publisher-facing program) generated more than $5 billion of revenue for our many publishing partners.

Also in the last year we ventured further into other advertising formats with the acquisition of DoubleClick. This may seem at odds with the value we place on relevant text-based ads. However, we have found that richer ad formats have their place such as video ads within YouTube and dynamic ads on game websites. In fact, we also now serve video ads on television with our AdSense for TV product. Our goal is to match advertisers and publishers using the formats and mediums most appropriate to their goals and audience.

Despite the progress in our advertising systems and the growth of our base of advertisers, I believe there are significant improvements still to be made. While our ad system has powerful features, it is also complex and can confuse many small and local advertisers whose products and services could be very useful to our users. Furthermore, the presentation formats of our advertisements are not the optimal way to browse large numbers of products. In the next decade, I hope we can more effectively incorporate commercial offerings from the tens of millions of businesses worldwide and present them to consumers when and where they are most useful.

Apps

Within a couple of years of our founding, a number of colleagues and I were starting to hit the limitations of our traditional email clients. Our mailboxes were too big for them to handle speedily and reliably. It was challenging or impossible to have email available and synchronized when switching between different computers and platforms. Furthermore, email access required VPN (virtual private networks) so everyone was always VPN'ing, thereby creating extra security risks. Searching mail was slow, awkward, and cumbersome.

By the end of 2001 we had a prototype of Gmail that was used internally. Like several existing services at the time, it was web-based. But unlike those services it was designed for power users with high volumes of email. While our initial focus was on internal usage, it soon became clear we had something of value for the whole world. When Gmail was launched externally, in 2004, other top webmail sites offered 2MB and 4MB mailboxes, less than the size of a single attachment I might find in a message today. Gmail offered 1 Gigabyte at launch, included full-text search, and a host of other features not previously found in webmail. Since then Gmail has continued to push the envelope of email systems, including functionality such as instant messaging, video-conferencing, and offline access (launched in Gmail Labs this past January). Today some Googlers have more than 25 gigabytes of email going back nearly 10 years that they can search through in seconds. By the time you read this, you should be able to receive emails written in French and read them in English.

The benefits of web-based services, also known as cloud computing, are clear. There is no installation. All data is stored safely in a data center (no worries if your hard drive crashes). It can be accessed anytime, anywhere there is a working web browser and Internet connection (and sometimes even if there is not one — see below).

Perhaps even more importantly, new forms of communication and collaboration become possible. I am writing this letter using Google Docs. There are several other people helping me edit it simultaneously. Moments ago I stepped away and worked on it on a laptop. Without having to hit save or manage any synchronization all the changes appeared in seconds on the desktop that I am back to using now. In fact, today I have worked on this document using three different operating systems and two different web browsers, all without any special software or complex logistics.

In addition to Gmail and Google Docs, the Google Apps suite of products now includes Spreadsheets, Calendar, Sites, and more. It is also now available to companies, universities, and other organizations. In fact, more than 1 million organizations use Google Apps today, including Genentech, the Washington D.C. city government, the University of Arizona, and Gothenburg University in Sweden.

Because tens of millions of consumers already use our products, it is easy for organizations — from businesses to non-profits — to adopt them. Very little training is required and the passionate Google users already in these organizations are usually excited to help those who need a hand. In many ways, Google Apps are even more powerful in a business or group than they are for individuals because Apps can change the way businesses operate and the speed at which they move. For example, with Google Apps Web Forms we innovated by addressing the key problem of distributed data collection, making it incredibly simple to collect survey data from within the enterprise — a critical feature for collecting internal feedback we use extensively when "dogfooding" all of our products.

There are a number of things we could improve about these web services. Because they have arisen from different groups and acquisitions, there is less uniformity across them than there should be. For example, they can have different sharing models and chat capabilities. We are working to shift all of our applications to a common infrastructure. I believe we will achieve this soon, creating greater uniformity and capability across all of them.

Chrome

We have found the web-based service model to have significant advantages. But it also comes with its own set of challenges, primarily related to web browsers, which can be slow, unreliable, and unable to function offline. Rather than accept these shortcomings, we have sought to remedy them in a number of ways. We have contributed code and generated revenue for several existing web browsers like Mozilla Firefox, enabling them to invest more in their software. We have also developed extensions such as Google Gears, which allows a browser to function offline.

In the past couple of years, however, we decided that we wanted to make some substantial architectural changes to how web browsers work. For example, we felt that different tabs should be segregated into separate sandboxes so that one poorly functioning website does not take down the whole browser. We also felt that for us to continue to build great web services we needed much faster JavaScript performance than current browsers offered.

To address these issues we have created a new browser, called Google Chrome. It has a multiprocess model and a very fast JavaScript engine we call V8. There are many other notable features, so I invite you to try it out for yourself. Chrome is not yet available on Mac and Linux so many of us, myself included, are not able to use it on a regular basis. If all goes well, this should be addressed later this year. Of course, this is just the start, and Chrome will continue to evolve. Furthermore, other web browsers have been spurred on by Chrome in areas such as JavaScript performance, making everyone better off.

Android

We first created mobile search for Google back in 2000 and then we started to create progressively more tailored and complex mobile offerings. Today, the phone I carry in my pocket is more powerful than the desktop computer I used in 1998. It is possible that this year, more Internet-capable smartphones will ship than desktop PCs. In fact, your most "personal" computer, the one that you carry with you in your pocket, is the smartphone. Today, almost a third of all Google searches in Japan are coming from mobile devices — a leading indicator of where the rest of the world will soon be.

However, mobile software development has been challenging. There are different mobile platforms, customized differently to each device and carrier combination. Furthermore, deploying mobile applications can require separate business arrangements with individual carriers and manufacturers. While the rise of app stores from Apple, Nokia, RIM, Microsoft, and others as well as the adoption of HTML 5 on mobile platforms have helped, it is still very difficult to provide a service to the largest group of network-connected people in the world.

We acquired the startup Android in 2005 and set about the ambitious goal of creating a new mobile operating system that would allow open interoperation across carriers and manufacturers. Last year, after a lot of hard work, we released Android to the world. As it is open source, anyone is free to use it and modify it. We look forward to seeing how this open platform will spur greater innovation. Furthermore, Android allows for easy creation of applications which can be deployed on any Android device. To date, more than 1000 apps have been uploaded to the Android Market including Shop Savvy (which reads bar codes and then compares prices), our own Latitude, and Guitar Hero World Tour.

AI

The past decade has seen tremendous changes in computing power amplified by the continued growth of Google's data centers. It has enabled the growth and processing of increasingly large data sets such as the web, the world's books, and video. This in turn has allowed problems once considered to be in the fantasy realm of artificial intelligence to come closer to reality.

Google Translate supports automatic machine translation between 1640 language pairs. This is made possible by large computer clusters and vast repositories of monolingual and multilingual texts: http://www.google.com/intl/en/help/faq_translation.html. This technology also allows us to support translated search where the query gets translated to another language and the results get translated back.
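As a side note, 1,640 is exactly n × (n − 1) for n = 41, i.e. the number of ordered source/target pairs if every supported language can be translated into every other; the figure of 41 languages is my inference from the total, not something stated in the letter.

```python
n_languages = 41                        # inferred from the 1640-pair figure
print(n_languages * (n_languages - 1))  # 1640 ordered translation pairs
```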

While the earliest Google Voice Search ran as a crude demo in 2001, today our own speech recognition technology powers GOOG411, the voice search feature of the Google Mobile App, and Google Voice. It, too, takes advantage of large training sets and significant computing capability. Last year, PicasaWeb, our photo hosting site, released face recognition, bringing a technology that is on the cutting edge of computer science to a consumer web service.

Just a few months ago we released Google Flu Trends, a service that uses our logs data (without revealing personally identifiable information) to predict flu incidence weeks ahead of estimates by the Centers for Disease Control (CDC). It is amazing how an existing data set typically used for improving search quality can be brought to bear on a seemingly unrelated issue and can help to save lives. I believe this sort of approach can do even more — going beyond monitoring to inferring potential causes and cures of disease. This is just one example of how large data sets such as search logs coupled with powerful data mining can improve the world while safeguarding privacy.

Conclusion

Given the tremendous pace of technology, it is impossible to predict far into the future. However, I think the past decade tells us some things to expect in the next. Computers will be 100 times faster still and storage will be 100 times cheaper. Many of the problems that we call artificial intelligence today will become accepted as standard computational capabilities, including image processing, speech recognition, and natural language processing. New and amazing computational capabilities will be born that we cannot even imagine today.

While about half the people in the world are online today via computers and mobile phones, the Internet will reach billions more in the coming decade. I expect that by using simple yet powerful models of computing such as web services, everyone will be more productive. These tools enable individuals, small groups, and small businesses to accomplish tasks that only large corporations could achieve before, whether it is making and releasing a movie, marketing a product, or reporting on a war.

When I was a child, researching anything involved a long trip to the local library and a good deal of luck that one of the books there would be about the subject of interest. I could not have imagined that today anyone would be able to research any topic in seconds. The dark clouds currently looming over the world economy are a hardship for us all, but by the time today's children grow up, this recession will be a footnote in history. Yet the technologies that we create between now and then will define their way of life.

Monday, May 4, 2009

Map of alcohol consumption in the world.

https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifvbc6V56V4n8v3yEzJapKteU7KcP4kiuUXVF5J_04pvUGxtcEO7FSdtOTHQONlAeauq1gUYwpiPN1HnIJ-XJ9e9R-gIgTCeH4oOPg2PHAs0zERk-Ffq-lYsgfFlxVuzY86HiphWouH3M/s1600-h/Alcohol.JPG

I guess it is quite safe to say:

  • The colder the climate, the higher the consumption.
  • Consumption has a link with religion, though this might also be coincidental with the climate factor.
  • China, being the most populous country in the world with a relatively high per capita consumption, should be a bigger market than India. The same might be the case with Brazil.

Tuesday, April 28, 2009

THE FOURTH QUADRANT: A MAP OF THE LIMITS OF STATISTICS - Nassim Nicholas Taleb

THE FOURTH QUADRANT: A MAP OF THE LIMITS OF STATISTICS [9.15.08]
By Nassim Nicholas Taleb

http://www.edge.org/3rd_culture/taleb08/taleb08_index.html

A great article on how the use of conventional statistical tools has made us very, very susceptible to the impact of rare events.

The article starts with a reference to the ongoing global financial crisis and talks about how the people quantifying risks in the financial system had, in general, dumbed it down to "it has not happened in the last 30 years, hence it is probably not likely to happen anytime soon", while almost the reverse was true.

The gist of the article, though, is how the current state of financial statistics plays down the improbable but highly dangerous event when modelling and deciding the future, mainly because probabilities based on past data fool the modellers. A classic case is the turkey story:

"A turkey is fed for 1,000 days. Every day confirms to its statistical department that the human race cares about its welfare 'with increased statistical significance'. On the 1,001st day, the turkey has a surprise: it's Thanksgiving."

He goes on to create a map based on two distinct types of decisions (the true/false binary type, and the "yes/no, but how much?" type) and two distinct types of randomness: Mediocristan (measurable randomness) and Extremistan (the event that hits you out of the blue).

He goes on to explain how, because of our fragile theories, a game-breaking event can throw everything up in the air, and that is what seems to have happened with the current financial crisis. Wise men bet so heavily on a set of self-reinforcing probabilities that at the first instance of an anomaly the whole system collapsed like a house of cards caught in a tornado.

The last section deals with what to do with such Extremistan items.

Thursday, February 12, 2009

Budgeting Time - A few points

While driving back home yesterday, a thought hit me. I think it was around the time I was hearing somebody on the radio talking about the business of selling money. The whole idea that for banks money is only a commodity that has to be sold to get more of the same back was pretty interesting. In many ways that puts me in the role of a person trying to budget time as a resource. Now this concept is a bit vague in my mind, and I will probably expand on it as we go along, but it is interesting nonetheless. For any project, budgeting time works on many levels similar to budgeting money. You have to identify the activities that will consume time and then allocate time to them. So along with money, manpower, equipment and so on, time acts as a resource. You need a certain minimum amount of time to do a job, and this is expandable to a certain extent without affecting the other activities. However, one aspect that is different here is that time can be overlapped, with activities working in parallel, and still the total time spent remains the same. This is a unique aspect that makes it more flexible than other resources. If money were to behave like this, we would have situations wherein the cost of the project would be the cost of the longest activity and nothing more. So in effect time does not work in the strictly cumulative sense that money does.
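A tiny sketch of the contrast drawn above: when activities run in parallel, the elapsed time is set by the longest activity, while cost still accumulates across all of them. The activities and figures are invented.

```python
# (days_required, cost) per activity, all assumed to run in parallel
activities = {"design": (3, 900), "procurement": (5, 2400), "training": (2, 600)}

elapsed_days = max(days for days, _ in activities.values())  # time behaves like a max: 5 days
total_cost = sum(cost for _, cost in activities.values())    # money behaves like a sum: 3900
print(elapsed_days, total_cost)
```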

The other aspect of time is how projects work on an external calendar and how that affects the way time is distributed across an activity. This has a slightly lesser impact on the other resources. For example, a set of unforeseen holidays in a project would directly stretch the affected activity's budgeted time but would have no direct effect on cost, manpower and other resources. It's when we start quantifying time in cost terms that we start seeing a knock-on effect on cost.

The other thing that skews the nature of our dealings with time is that even though we budget time, we are not remunerated back in time; we are remunerated in money. This is the primary driver of the concept of the cash flow. If, in a hypothetical case, remuneration were allowed in time too, we would have a scenario with a time surplus and a cash profit at the same time. Again, this is a vague concept, but it is not totally new. We get salaries for our work, but we also get holidays for every day of our work. The issue with time is that there is always a saving on time but never a time profit. So currently there is no system by which I can claim time or ask for a time credit for a task done early. Interestingly, some time back I was reading about the concept of time debt, wherein I would owe somebody the time that I promised and then wasted.

Another thing about time, the manner in which we do things, and the issue of procrastination has to do with risk. Yesterday, while talking to Ashwini, I again had a brainwave (funny how most of these come when I am talking to her). The assertion I make here is not new, but we never perceive it in the way I am putting it forward: procrastination, after a point, exponentially increases the risk sensitivity of your budgeted time in the future. Let me make it clear with an example. If I had four days to pack my stuff for a road trip, putting it off to the last moment would surely increase the risk sensitivity of that last moment. Any small impact on that last moment would have a hugely disproportionate impact on my project completion. So while it is very convenient and tempting to procrastinate, it is a very costly strategy.

Tuesday, February 3, 2009

Gloating time

Now that I have had my share of crow to eat on the Satyam saga, and the questions I've had to ask myself about where we were going with Satyam as a company and as an investible stock, let me announce with some self-congratulatory glee that if you had been among those pumping in money when Satyam was around Rs 25 per share, you would have doubled your money today, and that in just one month. The underlying logic has always been to go against the prevailing trend and not to panic. Going ahead from here, this is what I feel will happen with Satyam:
  • The price will be news-specific and will fluctuate around Rs 55 levels till we get concrete news on an ownership change.
  • Once that happens, the share will start floating along with the general Sensex.

So my recommendations are:

  • Buy on every fall. Be prudent and don't overdo it; the general trend for this stock seems to be up only.

On a different note though:

How many people today believe that the Zimbabwean dollar is a good investment avenue? I know this will sound like a conventionally nonsensical idea, but I have a feeling that, judging by the manner in which the currency has depreciated, whenever the currency starts to appreciate it will do so in much the same manner. Again, sentiment seems very much against the currency, and any minor good news will drastically improve the returns. So do we start buying our first billions of Zimbabwean dollars?

Sunday, January 18, 2009

http://www.businessweek.com/bwdaily/dnflash/content/jan2009/db20090116_786365.htm

So it appears not all is gloom and doom in a recession. A very nice article from BusinessWeek about jobs unaffected by the recession. Some of them are quite logical: trades relating to repair tend to perform better as people resort to more reuse, as do training and education.