Saturday, September 22, 2007

Kick Starting Your Analytics Strategy

“Did you watch this movie ‘The Bridges of Madison County’? It’s really great,” my cousin said to me one afternoon. “Isn’t that a really old chick flick?” I said. “Nah, I don’t watch those.” “Well,” she replied, “I really like these kinds of movies. I can’t get enough.” “How do you get your movies?” I asked. “Netflix,” she replied. “Why not Blockbuster? They offer the same service, and you can even get movies from the store if you want to watch something right away. Isn’t that better? I don’t know why Netflix is still around.” “Well,” she answered, “I’m quite happy with my Netflix service. I don’t see why I should change.”

One of the reasons Netflix prevailed over its more powerful and well-entrenched competitor is its shrewd use of analytics. At the heart of Netflix’s analytics strategy is ‘Cinematch’, proprietary software that analyzes customers’ choices and feedback on the movies they have previously rented, and recommends new movies in ways that both suit the customer’s taste and make the best use of Netflix’s inventory. This is also one of the primary reasons why so many of Netflix’s rentals are old movies rather than new releases (and explains why my cousin ended up renting ‘The Bridges of Madison County’). Blockbuster, on the other hand, focuses on new releases and how quickly titles move from its shelves. If a title sits on Blockbuster’s shelf too long, it’s discarded. In similar circumstances, Netflix - with the help of Cinematch - tries to find a customer for those titles. The result is greater customer satisfaction (which is why my cousin refuses to ditch Netflix for Blockbuster) at lower cost (after all, old titles cost much less than new releases). No wonder Netflix has offered to pay $1 million to anyone who can improve Cinematch’s accuracy by a mere 10 percent.

Now you want to do something similar for your organization. You want to harness the awesome power of analytics to crush your competitors and leave them in the dust. But how do you get started?

To jump start your analytics strategy, here are some of the steps you need to take:

Define your business problem.

There used to be a saying at IBM, "There are no problems, only opportunities". True. The impetus to do anything in a business environment is a need to act. That need is your business problem. For example, the need to find out why sales are down would obviously qualify as a business problem. Likewise, the need to further increase sales can also be considered a business problem. Although the purpose of an analytics strategy is to provide solutions to business problems, the problem itself may not be so obvious. In our example above, a company might treat both problems as the same and offer heavy discounts to increase sales. However, if the company were in the business of making floppy disks and had looked into why its sales were down, it would have realized that more people were using memory sticks because they're more convenient. By discounting, the company just hurt itself financially by lowering the price for diehard floppy disk users, who would have bought anyway. Successful analytical solutions depend to a great degree on how well you have analyzed and scoped your business problem. If you have not had at least one meeting with all your stakeholders, you have not yet defined your business problem. Also, keep in mind that every business problem is also an opportunity to improve.

Determine your metrics.

After you have spent significant time understanding and defining your business problem, the next step is to determine the data points or measures needed to execute and evaluate your strategy. Analysts, for instance, use anything from earnings to net income to measure the health of a company; but I knew a stock analyst who liked to use gross profit minus operating expenses because he believed that the trend for that data gave him a raw assessment of how seriously a company was trying to boost productivity. The metrics have to be determined and agreed on before you execute your analytics strategy; it cannot be 'let's figure it out as we go along'. Think of your strategy as driving to a destination, and the data points as road signs telling you which road to take and how close you are to your destination. If you're driving from Monterey to Salinas and do not see any road signs after half an hour, that road will not take you to Salinas.

See what information you have.

Once you've determined the data points to execute and evaluate your strategy, you need to figure out whether the data you need is available. Data that already exists or was gathered for some other purpose is known as Secondary data. It is always preferable to look for secondary data first, because it makes it cheaper and easier to carry out your strategy. Most large organizations suffer from an information silo effect, which means the information that you need is gathered and used somewhere in the organization, but is not known to you or your group. Sometimes, you may have to purchase the required data from a vendor like Gallup or Nielsen. If data is sparse or non-existent, then you may have to gather information from Primary sources. This may involve conducting a census or a survey. Obtaining data from primary sources should be one of your last resorts, after every effort to obtain secondary data has failed. That is because collecting primary data would quickly increase the cost of executing your analytics strategy. In dire circumstances, where data is sparse or lacking and you can't spend too much money to obtain data from primary sources, you can look for data that was obtained for a similar purpose but does not quite meet the current requirements, and make minor adjustments to it (e.g., a company trying to introduce a new sports drink in a town can use existing data from a survey of athletes who drink carbonated beverages). However, in these cases, you should be aware of the risks inherent in using data obtained in such a manner.

Validate your data.

After you have obtained your data, the next step is to assure its integrity. Never assume that your data is 100% reliable, no matter who the source is or how strong the brand name for that source. A recent survey by Gartner found that at least 25 percent of all data for Fortune 1000 firms was bad or corrupted. There are many ways to determine whether your data is good. These include checking whether the data is in the proper format, whether you have missing or null values where there should be none, and whether the data falls into proper ranges. Another crude way is to count the records being input into the system and verify that count at different points in the process to ensure that no records are getting dropped. Finally, you could pull a small random sample of 100 to 500 records from the raw dataset and simply look through it to see that there are no obvious errors.
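The checks described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the records, the field names (customer_id, order_date, amount), the expected date format, and the allowed range are all invented for the example, and a real dataset would need its own rules.

```python
import random
from datetime import datetime

# Hypothetical raw records; fields and values are invented for illustration.
records = [
    {"customer_id": "C001", "order_date": "2007-09-01", "amount": 25.0},
    {"customer_id": "C002", "order_date": "2007-09-02", "amount": None},
    {"customer_id": "C003", "order_date": "09/03/2007", "amount": 40.0},
    {"customer_id": "C004", "order_date": "2007-09-04", "amount": -5.0},
]

def is_iso_date(value):
    """Format check: True only if value parses as YYYY-MM-DD."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except (TypeError, ValueError):
        return False

def validate(record, min_amount=0.0, max_amount=10_000.0):
    """Return a list of problems found in one record (empty if clean)."""
    problems = []
    if not is_iso_date(record.get("order_date")):
        problems.append("bad date format")
    if record.get("amount") is None:
        problems.append("missing amount")
    elif not (min_amount <= record["amount"] <= max_amount):
        problems.append("amount out of range")
    return problems

# Format, null, and range checks on every record.
report = {r["customer_id"]: validate(r) for r in records}
bad = {cid: probs for cid, probs in report.items() if probs}

# Record-count check: nothing should be dropped between input and output.
counts_match = len(records) == len(report)

# Small random sample to eyeball (use 100 to 500 on a real dataset).
sample = random.sample(records, k=min(2, len(records)))

print(bad)
print("counts match:", counts_match)
```

On a real dataset the same pattern applies at each stage of the process: run the checks, compare the record counts, and read through the sample before trusting anything built on the data.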

Select your tools.

This depends on what tools you already have, the amount of data required to execute your strategy, and what kind of analytics you intend to perform. If your dataset is relatively small, Microsoft Excel - utilizing Visual Basic for Applications (VBA) - can be used to perform a surprisingly large range of sophisticated analyses. Excel is not a good tool, however, if you have to perform analytics on large datasets or have to transfer data to other applications. Statistical software such as SAS, on the other hand, can perform analytics on large datasets and can output data in different formats that can be fed into other applications. Personally, I have found that using SAS in conjunction with Excel (an advantage of SAS is that it allows you to automatically kick off Excel VBA macros, simplifying much of your work) provides plenty of latitude to execute a relatively wide range of analytics strategies.

Validate the strategy.

Before relying on your strategy to provide the solution to your business problem, it is a good idea to test its reliability. You can do this by carrying out pilot tests prior to the actual implementation. Create a small subset from your data population. Work out in advance what the end result should be. Run the data through your solution. See whether the end result matches your expectations. Another benefit of conducting pilot tests is that they allow you to uncover and fix any unexpected quirks that you failed to anticipate.
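A pilot test of this kind might look like the following minimal Python sketch. Everything in it is hypothetical: the function total_sales_by_region stands in for whatever solution you have built, and the sales rows are made-up numbers chosen so the expected result can be worked out by hand.

```python
def total_sales_by_region(rows):
    """Stand-in for the analytics solution under test (hypothetical)."""
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0) + amount
    return totals

# Hypothetical data population of (region, sale amount) rows.
population = [
    ("north", 100), ("south", 250), ("north", 50),
    ("east", 75), ("south", 25), ("east", 125),
]

# Step 1: create a small subset that can be checked by hand.
pilot = population[:4]

# Step 2: decide what the end result should be before running anything.
expected = {"north": 150, "south": 250, "east": 75}

# Step 3: run the subset through the solution.
actual = total_sales_by_region(pilot)

# Step 4: compare the result against the expectation.
pilot_passed = actual == expected
print("pilot passed" if pilot_passed else f"mismatch: {actual} vs {expected}")
```

The important part is the order of steps 2 and 3: the expected result is written down before the solution runs, so a mismatch points to a quirk in the solution or the data and cannot be quietly explained away afterwards.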

Don't reinvent the wheel.

Finally, do not reinvent the wheel if you don't have to. Analytics primarily involves data and quantitative concepts. More often than not, the same strategy can be tweaked and applied to similar business problems across the board. The probability is also high that someone in your organization has faced a similar problem and come up with a successful strategy. If that is the case, you can save a lot of money and energy by borrowing that strategy or building your solution from it.


Wednesday, September 19, 2007

A Working Definition for Analytics

It's Christmas season and your wife finally got you to agree to mail the Christmas gifts. Perturbed, you drag the small hill of packages to the local post office, only to see a half-mile-long line and two very indifferent postal clerks taking their sweet time handling the customers. Two hours later you are still standing in line, very cross with the U.S. Postal Service and your wife for wasting your entire afternoon. Next year, you decide, you will make the trip to UPS instead.

The U.S. Postal Service has just lost a customer. As it happens, USPS does gather data on how much time customers spend at the window. Unfortunately, it also gathers data on 2000+ other metrics. Critical information such as customer wait time often gets overlooked because managers are bombarded with too much information. Adding another postal clerk would have brought the wait time down from 2 hours to 45 minutes. Adding two more postal clerks would have brought it down to 15 minutes. Adding three more postal clerks would have brought it down to 5 minutes. Customer satisfaction surveys show that postal customers start getting agitated if wait time exceeds 18 minutes. Consequently, we can see that adding one more postal clerk would not have improved the situation enough, while adding three more postal clerks would have resulted in the postal service incurring unnecessary costs. Therefore, there is an optimal number of postal clerks that the USPS could have assigned to the above-mentioned post office, and that number is four: the two already there plus two more. The postal service could have retained its customer if it had had a good analytics strategy in place. So what is "analytics"?
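The reasoning in the post office example can be written out as a small Python sketch. The wait times and the 18-minute threshold come from the example above; the variable and function names are my own, and the rule "pick the fewest clerks that keep the wait tolerable" assumes that every extra clerk adds cost.

```python
# Extra clerks added -> resulting wait in minutes (figures from the example).
wait_by_extra_clerks = {0: 120, 1: 45, 2: 15, 3: 5}

CURRENT_CLERKS = 2          # clerks already at the window
MAX_TOLERABLE_WAIT = 18     # minutes before customers get agitated

def fewest_extra_clerks(wait_table, max_wait):
    """Smallest number of extra clerks that keeps the wait within max_wait.

    Returns None if no option in the table is good enough.
    """
    acceptable = [extra for extra, wait in wait_table.items() if wait <= max_wait]
    return min(acceptable) if acceptable else None

extra = fewest_extra_clerks(wait_by_extra_clerks, MAX_TOLERABLE_WAIT)
optimal_total = CURRENT_CLERKS + extra

print(f"add {extra} clerks, for {optimal_total} in total")
```

One extra clerk leaves the wait at 45 minutes, which is over the threshold, and three extra clerks only buy a shorter wait that customers were not asking for, so the sketch settles on two extra clerks and four in total.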

The simplest definition of analytics would be that it is "the science of analysis". In reality, the word "Analytics" has not been properly defined by the professional community and may mean different things to different people. A simple and practical definition, however, would be how an entity arrives at the optimal and most realistic decision from a variety of available options, based on existing data. There are several aspects to this definition. First, the purpose or goal of this endeavor is to arrive at a decision. Second, the process should be able to identify the best option from a range of available options. Finally, the decision-making process should be based on data. Business managers may choose to make decisions based on past experiences or rule of thumb, or there might be other qualitative aspects to decision making; but unless there is data involved in the process, it would be considered beyond the purview of analytics.

Many people think that analytics only involves the use of statistical analyses or mathematics to predict and improve business performance. It would, however, be erroneous to limit the field of analytics to only statistics and mathematics. Good analytics professionals should be well trained in business concepts and the social sciences, as well as have a good grasp of statistics and mathematics. A good analytics professional should be willing and able to work across various fields to come up with the proper solutions. Others argue that an analytics professional should also be cognizant of his data sources, which includes knowledge of his organization's IT infrastructure (how else would he know that his data is not being compromised or corrupted by the system?). That is why analytics is unique and much broader than the use of statistics or mathematics in business.

With computers getting more powerful and Business Intelligence (BI) tools growing in popularity, analytics is becoming more important to businesses. Analytics has been credited with helping Netflix ward off competition from Blockbuster Video and helping Google overtake Yahoo! to become the most profitable portal on the internet.