Data Vu All Over Again

April 6th, 2009


I’m very excited about some of the work my company OpenBI, a BI consultancy focused on open source (OS) solutions, is currently doing. Over the last month or so, we’ve been collaborating with OS analytics database leader Infobright and OS BI platform vendor Jaspersoft — with “special sauce” from the freely-available R Project for Statistical Computing — to build an enterprise-level, cloud-based analytics demo to showcase at the upcoming MySQL User Conference.

The plan was to assemble a complete open source BI prototype: database management, ETL, query/reporting, dashboards, OLAP, and statistical models/graphics. We decided early on to use real monthly data from the Current Population Survey, and were then off to the development races. We started work on a script that included database/table definitions, ETL, and comprehensive BI and statistical back ends. We'd deploy the completed script, in turn, to our Amazon cloud account. No sweat, I thought. We've honed our expertise on open source BI in the cloud. We've performed this drill several times already. Alas, not so fast…

We had no problem with the initial database and ETL sides, correctly loading all 206,404 records to Infobright.  In fact, we had little difficulty deploying the initial demo reports, dashboards, and OLAP cubes. We were even able to seamlessly integrate R with Jaspersoft to run statistical scripts that exploit R’s powerful graphics capabilities as if they were native to Jaspersoft. What I hadn’t planned for in our tight schedule, however, were the data problems/delays that routinely plague BI development. Shame on me!

The team quickly got the raw data loaded correctly using JasperETL. The main data file actually represented two different record types, household and individual. Once loaded, we conducted sanity inspections of each of the attributes and further reviewed individual tabulations and summaries to sync with those posted on the census site.

Our travails started when we looked to condense the initial data tables to consider households, individuals, and income-generating adults, respectively. It took a bit of experimentation to find the right combinations of attributes and categories to restrict the data as we wished. Truth be told, I cheated a bit at the end, eliminating several hundred records of full-time employed adults with little or no income.

We committed more time than originally allocated to cross-tabulate, summarize, and validate separate variables we knew to be related. With over 100 attributes, many of which relate in theory, there was quite a bit of sanity checking to do, but it was work that had to be done to give us comfort with the data.

Other challenges emerged from both the absolute numbers of attributes and the granularity of the important categorical variables. There are, for example, 21 separate values for race, and 17 categories for education – far more than are sensible for basic OLAP and statistical analysis. So for these and several other categorical attributes, we created new items collapsing levels of the old to create more compact, analytic-friendly dimensions. Of course, testing the new creations took time not allocated on the plan.
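
To give a flavor of the recoding, here is a minimal R sketch of collapsing a detailed categorical attribute into a compact dimension. The labels and groupings below are hypothetical stand-ins, not the actual CPS codes:

educ <- c("5th or 6th grade", "High school graduate",
          "Some college but no degree", "Bachelor's degree")

# Lookup table mapping detailed labels to a handful of analysis-friendly groups
educ_group <- c("5th or 6th grade"           = "Less than high school",
                "High school graduate"       = "High school",
                "Some college but no degree" = "Some college",
                "Bachelor's degree"          = "College degree")

educ_compact <- factor(educ_group[educ])   # collapsed dimension
table(educ_compact)                        # sanity check the new categories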

We also had to map the text “dimension” columns to their integer-valued counterparts. Education, for example, came in two flavors: an attribute of labels such as “5th or 6th grade” and “Some college but no degree” and one of corresponding arbitrary integers like 33 and 40. The challenge was to sync those representations so that the ordering of categories for OLAP and statistical analysis would be sensible. Left to their own devices, most analytics tools present categories in alphabetical order, which is almost certainly not what is desired.
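
The ordering problem has a one-line fix in R: supply the level order explicitly when building the factor, rather than accepting the default alphabetical sort. Again, the labels here are illustrative:

educ_labels <- c("Less than high school", "High school",
                 "Some college", "College degree")
x <- sample(educ_labels, 20, replace = TRUE)

levels(factor(x))                        # default: alphabetical, "College degree" first
levels(factor(x, levels = educ_labels))  # substantive ordering for OLAP and reports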

The additional data work expanded our demo deployment by over 30%. As a stats guy who routinely sets expectations that the data side comprises more than three quarters of BI effort, I felt a bit humbled by my oversight. Hopefully, this refresher lesson will stay with me for a while this time!


The Divine R

March 30th, 2009

A colleague recently asked me for a good introductory text on the R statistical computing platform. Though there are a seemingly endless number of published books on R, I recommended a personal favorite, Introductory Statistics with R, Second Edition (Peter Dalgaard). The book does an excellent job introducing the R language as well as demonstrating R’s usage for solving real world statistical problems.

I chuckle when I read uncomplimentary reviews of R documentation by analytics pundits. In addition to scores of books, comprehensive reference manuals, and online help and documentation, there’s a wealth of R “how to” publications written by the community and freely available to anyone willing to search the internet. One such gem that I recently discovered is The R Inferno by Patrick Burns. The abstract to this brief (103-page) PDF concisely conveys the tome’s goal: “If you are using R and you think you’re in hell, this is a map for you.”

Not only is Burns a capable R analyst, he’s also a very clever writer. The R Inferno is a play on the Inferno cantica of Dante Alighieri’s The Divine Comedy, in which Dante navigates the nine circles of hell. The circles are concentric, each progressively more depraved, representing ever more grievous sins, culminating with Satan at the center of hell.

Burns sees the journey through R learning hell through a similar lens. His concentric circles depict problems that typically trip up those new to R. Much attention is focused on vectorizing computations so they perform efficiently. My experience is proof positive that new R programmers often bring procedural baggage to their learning. Burns also obsesses over the many benefits of modular function development in R, as well as its various flavors of object orientation. The eighth circle, Believing It Does What is Intended, addresses scores of R gotchas, and is pertinent for even the most experienced R programmers. Finally, circle nine clearly articulates the community-established norms for asking for help on the many R support lists. The uninitiated who routinely leap before they look are not treated charitably in R land.
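
To make the vectorization point concrete, here is a tiny, hedged example of the procedural habit Burns warns against (growing a result one element at a time in a loop) versus the idiomatic R equivalent. Timings will vary by machine:

x <- rnorm(5e4)

# Procedural habit: grow the result vector inside a loop
squares_loop <- function(x) {
  out <- numeric(0)
  for (i in seq_along(x)) out <- c(out, x[i]^2)
  out
}

system.time(s1 <- squares_loop(x))   # takes seconds, and gets worse as x grows
system.time(s2 <- x^2)               # vectorized: effectively instantaneous
identical(s1, s2)                    # same answer, very different effort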

After reading Inferno, I was prompted to look in the attic for one of my all-time favorite computer books, the now 35-year-old The Elements of Programming Style, by Kernighan and Plauger. (Aging analysts might recognize Brian Kernighan as co-author, with Dennis Ritchie, of The C Programming Language, one of the most important programming books of the last 30 years.) Just as Burns uses the Divine Comedy as a metaphor for his writing, Kernighan and Plauger take the timeless and concise writing manifesto The Elements of Style, by Strunk and White, as their guide. And just as I try to remember important S&W dictums like “Put sentences in a positive form”, “Omit needless words”, and “Revise and rewrite” when writing, so too do I look to K&P’s wisdom (“Let the data structure the program”, “Don’t patch bad code; rewrite it”, “Watch out for off-by-one errors”, “Make sure your code ‘does nothing’ gracefully”, and “Make it right before you make it faster”) to structure programming work. Much like The Elements of Style and The Elements of Programming Style, The R Inferno is destined to become a manuscript that ages well, one that always rewards those who invest the time to review it.

 

Planning for Predictive Models – Wisdom From Regression Modeling Strategies

March 9th, 2009

I’m getting ready to start another predictive modeling effort and decided to turn to several trusted stats books for a quick review. Three favorites include Maindonald and Braun’s Data Analysis and Graphics Using R; The Elements of Statistical Learning, by Hastie, Tibshirani, and Friedman; and Frank Harrell’s Regression Modeling Strategies. The books provide a nice balance of theory and practice, statistical inference and statistical learning.

 

I didn’t even get past the Preface to RMS before I started taking notes on important considerations for planning my new prediction studies. Indeed, I found the emphases spot on, even though I’m not certain whether I’ll use the regression models that Frank espouses or the statistical learning models of ESL.

 

The following are nuggets of wisdom from RMS for planning/executing modeling studies, along with a statistical blogger’s commentary:

 

1)      The cost of data collection outweighs the cost of data analysis. This means it’s critical to maximize the value of data in hand and to analyze it judiciously. It also underscores the oft-heard warning from Predictive Analytics World that quality data is perhaps the leading critical risk/success factor for predictive analytics projects.

2)      Prudent handling of missing data is critical. Simple deletion of cases with missing attributes can lead to prediction coefficients that are either terribly biased or grossly inefficient. There are well-developed methodologies and statistical procedures for “imputing” missing values that should be part of the analyst’s arsenal.

3)      Mean square error, which equals variance plus squared bias, is generally the criterion for evaluating a model. Statisticians often look first for unbiased estimates, but in many cases it may be better to trade a small amount of bias for a large reduction in variance.

4)      Analysts need to pay special attention to non-linearity and non-additivity in their models. The careless deployment of simple linear models is often a by-product of the regression capabilities of BI tools. A misspecified model may lead to erroneous predictions and results. Techniques like cubic splines are available for testing and incorporating these complications in standard models (see the short R sketch after this list).

5)      Graphical methods to support the understanding of complex models are critical. The connection of predictive models to graphics is particularly strong in R. The lattice package, R’s implementation of the Trellis graphics pioneered by William Cleveland, is central to its productivity and popularity.

6)      Methods for handling large numbers of predictors are central to today’s predictive models. Fortunately, there are answers like data reduction methods (e.g. principal components) from the multivariate statistics world, as well as Least Angle Regression (LARS), the Lasso, Random Forests, and Gradient Boosting from statistical learning.

7)      Overfitting is a common problem. Model validation approaches that include the bootstrap and cross-validation are now central to estimation and testing. The stepwise regression procedures I learned in grad school 30 years ago are now persona non grata in the prediction world. Fortunately, resampling techniques that are part and parcel of modern statistical practice have come to the rescue.
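
As a concrete companion to items 4 and 7, here is a hedged sketch (simulated data, base R only) of fitting a cubic spline term and checking it against a straight linear fit on held-out cases:

library(splines)               # ns() provides natural cubic spline bases

set.seed(42)
n <- 2000
experience <- runif(n, 0, 40)
# Hypothetical wage relationship that bends downward at high experience
logwage <- 2 + 0.08 * experience - 0.0015 * experience^2 + rnorm(n, sd = 0.4)
d <- data.frame(logwage, experience)

test_idx <- sample(n, round(n / 3))            # hold out a third for honest assessment
train <- d[-test_idx, ]; test <- d[test_idx, ]

fit_linear <- lm(logwage ~ experience, data = train)
fit_spline <- lm(logwage ~ ns(experience, df = 4), data = train)

mse <- function(fit) mean((test$logwage - predict(fit, newdata = test))^2)
c(linear = mse(fit_linear), spline = mse(fit_spline))   # the spline should win here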


Rattle Redux and Predictive Analytics World Potpourri

March 2nd, 2009

 

Rattle

 

I received an email from John Maindonald the other day. A little over a year ago, I wrote a review of an excellent statistical text, Data Analysis and Graphics Using R, which John co-authored with John Braun. Part of his message was to let me know that the third edition of Data Analysis would be coming out soon. Maindonald is also on the faculty of the Australian National University, co-teaching a course on data mining with Graham Williams. Williams is the developer of Rattle, the R Analytical Tool To Learn Easily, a front end to the significant machine learning/data mining capabilities of R. The second piece of John’s message was a request to update the URL to the course for Information Management readers. Done. I would highly recommend Math3346 for those seeking an accessible treatment of applied data mining.
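
For the curious, Rattle lives on CRAN and, assuming the GTK+ libraries it depends on are in place, getting to the GUI is a three-liner:

install.packages("rattle")   # one-time install from CRAN
library(rattle)
rattle()                     # launches the point-and-click data mining interface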

 

Predictive Analytics World

 

As I mentioned in last week’s blogs, I was pleasantly surprised by version one of Predictive Analytics World, finding it quite useful on a number of levels. Today, I offer a few final observations on the conference.

 

I guess I shouldn’t be too surprised that the most oft-cited success (or risk) factors for analytics deployments have to do not with analytics per se, but rather with business sponsorship, business/IT/analytics team alignment, methodology, data quality, communication, incremental wins, and governance. It appears lessons learned for predictive analytics look much like those for broader business intelligence.

 

On the evening of Wednesday, February 18, the Bay Area useR Group (R Programming Language) held its meeting in the PAW hotel facilities. Seventy people, many of whom were not R users, listened to presentations by commercial R vendor Revolution Computing as well as web titans Facebook and Google. Both Facebook and Google are big advocates of R’s open source analytics and graphical capabilities, employing analysts who learned the package in grad school. R is particularly popular for preliminary, exploratory data analysis (EDA) tasks.

 

I was a bit surprised by the limited range of analytics techniques demonstrated in the technical sessions I attended. Logistic regression and CART seemed the norm for classification problems, while ordinary least squares and stepwise regression appeared the choice for interval-level prediction. One session presented a hand-rolled ensemble of logistic regressions, demonstrating reduced variance and sharpened predictions – results R users take for granted with Random Forests and Gradient Boosting. Maybe I’m just spoiled by the embarrassment of riches available to predictive modelers in R. There are now scores of the very latest techniques accessible for free.
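
For contrast, here is roughly what an R user types to get an ensemble classifier with out-of-bag error estimates. The stock iris data stands in for a real marketing file:

library(randomForest)

set.seed(123)
# 500 bagged trees with random feature selection at each split
rf <- randomForest(Species ~ ., data = iris, ntree = 500, importance = TRUE)

print(rf)        # confusion matrix plus out-of-bag error rate
varImpPlot(rf)   # which attributes drive the predictions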

 

The Bay Area is home to the top two statistics departments in the U.S., Stanford and Cal Berkeley. It would have been nice to have an academic perspective on the current state of predictive analytics, especially given the rapid developments in both statistics and machine learning. Any of the Stanford professors Trevor Hastie, Rob Tibshirani, or Jerome Friedman, co-authors of the just-released The Elements of Statistical Learning, Second Edition, would have been an ideal presenter. Perhaps next year there can be sessions surveying both statistical learning and Bayesian modeling.

 

Looking forward to PAW 2010!


Predictive Analytics World — Methodology and Business Learning

February 23rd, 2009

This is the second correspondence on last week’s Predictive Analytics World (PAW) in San Francisco. About a year and a half ago, I wrote a book review on Super Crunchers by Yale economist Ian Ayres, in which I noted that super crunching is the amalgam of predictive modeling and randomized experiments. Randomization to treatment and control groups allows investigators to minimize the risk of study bias so that the only important differences between groups out of the gate are that one is named treatment while the other is called control. Predictive modeling by itself allows analysts to infer relationships and correlation; the addition of experiments sharpens the focus to cause and effect. The combination of predictive modeling and experiments is thus a very potent tool in the business learning arsenal of hypothesize/experiment/learn.

 

The power of analytics plus experiments was understood well by PAW participants. Conference chair Eric Siegel noted the importance of experiments in demonstrating the value of predictive modeling, citing the oft-told story of Harrah’s Entertainment that “not using a control group” is rationale for termination. Siegel also detailed the champion/challenger experimental analogy used by enterprise decision management practitioners.

 

SAS’s Anne Milley improved her standing with me quite a bit with a short but incisive presentation. Anne is just now starting to get over an unfortunate remark, in a January New York Times article, about the risk of using the open source analytics platform R.

 

In this talk, she quotes Derek Bok, president of Harvard University from 1971 to 1991: “If you think education is expensive, try ignorance”. Anne proceeds to frame predictive analytics in the broader context of applying scientific principles to business. This framework for business analytics is one of:

1)      Observe, Define, Measure

2)      Experiment

3)      Act

She also proposes an Analytics Center of Excellence to promote dialog between producers and consumers of analytics, sagely noting that the social is every bit as important as the analytical, and that data quality is king. Sounds like someone who’s been around the modeling block more than a few times.

 

John McConnell of Analytical People discusses the popular CRISP-DM (CRoss-Industry Standard Process for Data Mining) methodology in his study of customer retention. The steps of the CRISP-DM feedback loop include Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment. Randomized experiments or other rigorous designs are part and parcel of the evaluation step.

 

Jun Zhong, VP of Targeting and Analytics, Card Services Customer Marketing at Wells Fargo, uses randomized experiments as well as propensity adjustments in his response modeling so he can distinguish reactive purchasers from proactive and non-purchasers and best allocate scarce targeting dollars.

 

Finally, Andreas Weigend, former Chief Scientist of Amazon.com, is a big proponent of the scientific method for learning in business. His talk, The Unrealized Power of Data, articulated a methodology, PHAME, for measuring the power of data. Weigend’s approach, Problem → Hypothesis → Action → Metrics → Experiments, supplements top-down problem definition, hypothesis formulation, and evaluation metrics with the bottom-up performance measurement of experiments in a learning feedback loop. Tom Davenport would be proud.

Predictive Analytics World — Keynotes +

February 23rd, 2009

I just returned from two days at Predictive Analytics World in San Francisco. I must admit, my expectations weren’t very high. I’m not enthusiastic about version one of just about anything, a reticence that has generally served me well. This time was an exception. Kudos to conference chair Eric Siegel and producers Prediction Impact, Inc. and Rising Media Ltd.

 

A hobbled Siegel kicked off the conference on Wednesday with his keynote, Five Ways to Lower Costs with Predictive Analytics. His focus was on variations of response and churn modeling along with risk management. Uplift modeling has to do with understanding whether a positive response was caused by the solicitation or would have occurred anyway; it is, of course, cheaper not to intervene with those already inclined to respond. Siegel is a proponent of proving the value of analytics by contrasting the results of experimental and control groups.
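
Uplift is often approximated with a two-model approach: fit separate response models for the solicited and control groups, then score the difference. A hedged sketch on simulated data, with plain logistic regression standing in for whatever classifier you prefer:

set.seed(1)
n <- 5000
d <- data.frame(treated = rbinom(n, 1, 0.5),      # randomized solicitation flag
                tenure  = runif(n, 0, 10),
                spend   = rexp(n, 1 / 50))
# Simulated truth: some customers respond regardless of contact
p <- plogis(-2 + 0.10 * d$tenure + 0.004 * d$spend + 0.4 * d$treated)
d$response <- rbinom(n, 1, p)

m_treat   <- glm(response ~ tenure + spend, binomial, data = subset(d, treated == 1))
m_control <- glm(response ~ tenure + spend, binomial, data = subset(d, treated == 0))

# Estimated uplift: incremental response probability attributable to the contact
d$uplift <- predict(m_treat, d, type = "response") -
            predict(m_control, d, type = "response")
summary(d$uplift)   # target customers with high uplift, skip the sure things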

 

Though not a keynote, John Elder’s multiple case study talk, The High ROI of Data Mining Solutions for Innovative Organizations, offered wisdom for practical predictive analysis. Elder sees the primary functions of data mining as 1) eliminating the bad; 2) discovering the good; and 3) streamlining/automating for efficiency. His examples for Anheuser-Busch, Walt Disney World, the IRS, and hedge fund optimization were quite interesting, but his broader message was on the alignment of business and technology to ensure success. Critical factors include committed business champions, a strong interdisciplinary team, data vigilance, and a methodology that analyzes across multiple time periods.

 

Usama Fayyad, CEO of Open Insights, LLC and former Chief Data Officer of Yahoo, followed the data path in his keynote, New Challenges in Predictive Analytics. At both Yahoo and Open Insights, data strategy involves turning data into insights and strategic business assets. For Yahoo, the magnitude of data handled (over 25 terabytes and hundreds of millions of events per day) is staggering. Fayyad sees Yahoo’s “model” as the funnel from awareness → purchase, involving brand advertising and marketing.

 

The confluence of search, behavioral targeting, and social media has both expanded and complicated traditional modeling. Fayyad cites Flickr photo sharing, with 90 million users organizing, generating, and distributing their own content using the “wisdom of crowds”. This new generation of online marketing involves individual targeting plus social networks, leading to social targeting. The opportunities (and challenges) for predictive modeling in this evolving context are enormous.

 

Finally, Andreas Weigend, former Chief Scientist of Amazon.com, delivered a passionate talk entitled The Unrealized Power of Data. Weigend’s point of departure is the evolution of Customer Relationship Management (CRM) to Customer Managed Relationships (CMR). Observing that 50% of Amazon customers do not come onto the site with the intention of buying, Weigend opines that companies must fully utilize both existing and new data to impact their bottom lines and strengthen relationships with customers. He argues that location data, individual self-reports, and individual relationship information are special keys to marketing 2.0: viral or social marketing.

 

Weigend is also a big proponent of scientific methods for learning in business. His PHAME approach will be covered in a subsequent blog.

 

Hypothesize/Experiment/Learn + Incentives

February 16th, 2009


Two recent WSJ articles offer insight into the Stats Man’s Corner theme of hypothesize/test/learn for business. I’ll cover one today and the other in a future blog.

More Smokers Quit if Paid, Study Shows (WSJ, Feb 12, 2009) details the results of a study published in the New England Journal of Medicine that calibrates the success of getting smokers to quit by offering financial incentives. Over 20% of adults in the U.S. smoke, costing their employers $3,400 per smoker annually, and 480,000 Americans die each year from smoking-related diseases, so smoking remains a significant health and business problem.

The study tracked a group of 878 smoking General Electric employees for 18 months spanning 2005 and 2006. The employees were first given information on smoking cessation programs and then randomly divided into two groups. Employees in the intervention group were offered cash incentive payments of up to $750 over the course of the investigation to abstain from smoking, while those in the control group were provided no such cash subsidy. The maximum $750 payment included $100 for completing the program, $250 for not smoking six months after enrolling in the study, and an additional $400 for another six months of abstinence. Smoking habits were self-reported, with validation from saliva and urine testing.

The results of the experiment were somewhat heartening, with 14.7% of the intervention group, in contrast to 5% of the controls, reporting smoking cessation for the first year of the study, a significant difference. At the conclusion of the 18 months, the figures were 9.4% and 3.6% respectively. The study raises important policy questions, but the fact that individuals were assigned to intervention and control groups at random supports the internal validity of the results: other unmeasured or potentially confounding explanations for the differences in cessation between groups should be minimized. Steven Schroeder, director of the Smoking Cessation Leadership Center at UCSF, remarked that the study “shows that incentives work”. At the same time, the study offers little in terms of the external validity or generalizability of findings. Are the positive results specific to the population tested? To the study’s time frame and payouts? What will the results look like in five years?
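
For those who want to check the arithmetic, the headline comparison is a one-liner in R. The group sizes below are my assumption (roughly half of the 878 employees per arm); the published counts may differ slightly:

# Roughly 14.7% of ~436 intervention employees vs. 5% of ~442 controls quit in year one
quitters <- c(round(0.147 * 436), round(0.05 * 442))
groups   <- c(436, 442)

prop.test(quitters, groups)   # test of equal proportions, with a confidence interval
# The small p-value is consistent with the paper's claim of a significant incentive effect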

Lead researcher Kevin Volpp, a physician and faculty member of the prestigious Wharton School of the University of Pennsylvania, is also Director of the Leonard Davis Institute of Health Economics Center for Health Incentives (LDI CHI). The charter of LDI CHI is to facilitate research that makes significant contributions to reducing the disease burden from major public health problems such as tobacco cessation, obesity, and medication non-adherence for cardiovascular and other diseases through better understanding of how to design and apply incentives and other behavioral economic approaches to improving health. The center has three primary missions:

1.      To advance knowledge about incentive design

2.      To develop and test scalable and cost-effective applications

3.      To work with private and public sector entities such as large employers, insurers and health systems to improve health care delivery and the health of the population

The LDI CHI is research engagement in action: combining evidence-based health care with the behavioral economics of incentives and nudges, and our now-familiar business tool chest of hypothesize/test/learn. A powerful learning and change platform indeed.

Learning from the Black Swan

February 11th, 2009

Nassim Nicholas Taleb was right. The world financial system was recently devastated by unpredicted catastrophes of grand proportion (a financial black swan) and struggles today to recover and explain the carnage. Taleb predicted such an event in his 2007 bestseller The Black Swan: The Impact of the Highly Improbable. In 2005, Taleb took on the investment community with his highly entertaining Fooled by Randomness: The Hidden Role of Chance in the Markets and in Life, in which he assails the financial services community for its dumb luck, its hubris, and its reckless conduct. With no shortage of ego, Taleb reiterates his financial doom-and-gloom theme with a vengeance in a February 2, 2009 Forbes interview, castigating the industry for its overconfidence and its lax risk-control models built on faulty assumptions. For Taleb, the extraordinary booms and busts experienced over the last 20 years are far outside the tidy predictions of our cherished bell curve models. Fat tails are, unfortunately, facts of financial life.
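
A quick, purely illustrative R comparison makes the fat-tail point: how likely is a far-out negative value under the bell curve versus a heavy-tailed t distribution? (The choice of 3 degrees of freedom is arbitrary, and the t is not rescaled to unit variance.)

pnorm(-5)        # about 2.9e-07 under the normal model: essentially "never"
pt(-5, df = 3)   # about 7.6e-03 under a t with 3 df: orders of magnitude more likely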

I wrote an article on Fooled By Randomness for Information Management last year in which I focused less on the financial implications of Taleb’s meanderings than on his insights into human behavior. It seems people are often deluded when attributing causes to their own behavior. Success is generally interpreted as a strict causal trail of events (I did such-and-such, which resulted in a favorable outcome), while failure is simply bad luck, or random. Survivorship bias occurs when the weak die young, skewing results toward the successes still alive; it’s all too easy to forget the deceased. Hindsight bias, a particularly pernicious human foible, allows failings to be adroitly “predicted” after the fact (I knew it all along). And the narrative fallacy allows us to fool ourselves with anecdotes and stories, which are much “easier” than rigorous evidence. Finally, humans expect progress to be linear, when it often plays out like an S curve: “Tomato ketchup in a bottle, none will come and then the lot’ll”.

The antidote for these flawed explanations of cause and effect? None other than the hypothesize/test/learn cycle outlined in an earlier blog.

L. Gordon Crovitz, Information Age columnist for the WSJ, offers hope in a February 9, 2009 article with observations from the latest Technology, Entertainment and Design (TED) conference. He cites linked data, an exciting concept proposed by Tim Berners-Lee, inventor of the World Wide Web, which will facilitate correlating digital data in disparate formats. Linked data will enable analysts to better formulate and test business performance hypotheses using the Web, thus providing a more scientific basis for decision-making. Indeed, Berners-Lee thinks linked data may be the salvation of financial services, promoting an intelligent antidote to flying blind with new — and risky — financial instruments.

 

The R Learning Lasso

February 4th, 2009

I got an email during the last week of January from the R help list announcing the release of the newest version of glmnet, a statistical learning package that fits lasso and elastic net regularization paths for squared-error, binomial, and multinomial models via coordinate descent. Don’t be ashamed if you find that description a bit abstruse: just know you’re not alone! Suffice it to say that glmnet is a state-of-the-art modeling package that handles the prediction of interval and categorical dependent variables efficiently.

The package’s creator is Trevor Hastie, co-author with Jerome Friedman and Rob Tibshirani of the accompanying arcane-sounding paper: Regularized Paths for Generalized Linear Models via Coordinate Descent, published last summer. Hastie, Friedman and Tibshirani are also eminent professors of Statistics at Stanford University, the top-rated such department in the country. Last Fall, I attended a statistical learning seminar with Hastie and Tibshirani where similar models were presented at a dizzying pace.

So the R user community had just been given access to one of the latest learning algorithms, hot off the development presses from three world-renowned practitioners, for free. And glmnet is readily accessible from the internet, installing on existing R platforms painlessly. No commercial stats package that I know of (certainly not the market leader) is even close to releasing a competitive offering. I’d say that’s a pretty good deal for stats types like me, and a benefit of working with a fertile, worldwide open source initiative like R.

After installing glmnet on my PC, I tested it against a 1988 Current Population Survey (CPS) data set that consists of 25,631 cases. My objective was to predict the log of weekly wages from experience and education, both measured in years. I first divided the base data set into two subsets: a training set with two thirds of the cases randomly selected, and a test set with the remainder of the records. I then developed two separate models with the training data, one a straight linear model with an interaction term, the other using cubic spline mappings of experience and education. Once the model parameters were estimated on the training data, I evaluated and graphed the results using the separate test data.
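
The workflow looks roughly like the sketch below. The data are simulated stand-ins, since I can’t reproduce the CPS extract here, and glmnet expects a numeric model matrix rather than a formula:

library(glmnet)
library(splines)

set.seed(2009)
n <- 25631
education  <- sample(8:18, n, replace = TRUE)
experience <- runif(n, 0, 45)
# Hypothetical curvilinear wage surface, peaking in mid-career
logwage <- 1 + 0.07 * education + 0.06 * experience - 0.001 * experience^2 +
  rnorm(n, sd = 0.45)
d <- data.frame(logwage, education, experience)

# Two design matrices: linear with interaction vs. cubic spline mappings
x_linear <- model.matrix(~ education * experience, d)[, -1]
x_spline <- model.matrix(~ ns(education, 4) * ns(experience, 4), d)[, -1]
y <- d$logwage

train <- sample(n, round(2 * n / 3))   # two thirds of the cases for training

fit_linear <- cv.glmnet(x_linear[train, ], y[train])
fit_spline <- cv.glmnet(x_spline[train, ], y[train])

# Compare error on the held-out third
mse <- function(fit, x) mean((y[-train] - predict(fit, x[-train, ], s = "lambda.min"))^2)
c(linear = mse(fit_linear, x_linear), spline = mse(fit_spline, x_spline))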

The plot on the left shows the linear plane generated by glmnet; the one on the right depicts the curvilinear plane from the cubic spline mapping. The linear model seems naïve in contrast to the cubic spline alternative, which provides a much closer fit between actual and predicted wages. Indeed, preliminary exploration of the training data set confirmed the curvilinear nature of the relationships between education, experience, and wages, with wages actually declining at the high end of experience. The linear model incorrectly details uniformly increasing wages across the ranges of both education and experience.

The relationship on the left is thus misspecified and produces predictions out of sync with actual outcomes. A naive linear specification like this is, unfortunately, more the rule than the exception for BI analysts using Excel or other standard BI tools for their models. Prudent analysts will turn to the sophisticated packages of platforms like R for predictions that closely reflect the subtlety of their data.

For those interested, Hastie and Tibshirani are offering a new two-day seminar, Statistical Learning and Data Mining III (http://www-stat.stanford.edu/~hastie/sldm.html), March 16-17, 2009, in Palo Alto, CA.

 

[Figure: linear regression plane (left) vs. cubic spline surface (right) for predicted log weekly wages]


 

Hypothesize/Experiment/Learn

February 2nd, 2009

I’ve subscribed to the Harvard Business Review for about five years. When the monthly magazine arrives in the mail, it often seems there are either several articles pertinent for business intelligence or none at all. The February 2009 edition was one of the former.

The article Why Good Leaders Make Bad Decisions cites neuroscience research to observe that leaders often make decisions through the unconscious processes of pattern recognition and emotional tagging. Pattern recognition uses assumptions from prior experiences to categorize a current decision situation, often suggesting solutions similar to those that worked in the past. Emotional tagging is about the emotionally committed preferences of the decision-maker, which of course can have substantial impact on the action taken. These processes, which are in ways similar to rules of thumb or heuristics, may produce effective decisions. They may also, however, be sources of systematic bias that can lead to faulty decisions.

The article cites examples from a book by one of the authors, Sydney Finkelstein of Dartmouth’s Tuck School of Business, that conducts post-mortems of flawed business decisions. The authors suggest that businesses build safeguards into their management decision processes, driven by red flag conditions, to guard against such sources of bias.

Tom Davenport’s article, How to Design Smart Business Experiments, offers a scientific antidote to the “on a wing and a prayer” approach to business decision-making. The culture of hypothesize/experiment/learn for operational decisions is closely aligned with the Evidence-Based Management philosophy espoused by Stanford professors Jeff Pfeffer and Bob Sutton. Indeed, the hypothesize/test foundation for business decisions is promoted in the Balanced Scorecard, Super Crunchers, and Enterprise Decision Management.

Davenport espouses a cycle for putting ideas to the test that includes:

1)      Create/Refine Hypotheses

2)      Design Experiment

3)      Execute Experiment

4)      Analyze Results

5)      Plan Rollout

6)      Rollout

Findings from all steps in the process are submitted to a Learning Library for posterity and, hopefully, for reuse.

Testing generally makes the most sense for smaller, operational decisions that are repeated often: the core of business transactions. At eBay, Amazon, and Google, randomized testing is the norm for website development. Sears has tested several formats for including its merchandise in Kmart stores, and vice versa. Capital One has been in the testing vanguard since 1988, using experiments to design new offerings and moving to the top ranks of credit card companies through its “ability to turn a business into a scientific laboratory … subject to testing using thousands of experiments.” And Harrah’s Entertainment has given teeth to its hypothesize/test/learn culture with a mandate that “not using a control group” is rationale for termination.

Business intelligence, which distinguishes between exploratory and confirmatory analytics, loves the focus given its efforts by an evidence-based culture built on hypotheses and experimentation. Evidence-based companies generally embrace BI early and significantly; return on investment (ROI) exercises for BI are strategic and substantive.