5 Questions to Ask About Predictive Analytics

Predictive Analytics is a branch of data mining that uses a variety of statistical and analytical techniques to develop models that help predict future events and behaviors. It helps find patterns in recruitment, hiring, sales, customer attrition, optimization, business models, crime prevention, and supply chain management, to name a few. As we move to self-learning organizations, it is imperative that we understand the value of Business Analytics in general and Predictive Analytics in particular.

It turns out that Predictive Analytics is about Business Transformation. But in order for this Business Transformation to take place, you have to consider the organizational context from the following perspectives:

  1. Strategic Perspectives: Not all organizations are the same, and thus what works in one organization might not work in yours. Based on your knowledge of your organization’s maturity, you have to decide if Predictive Analytics is going to be a top-down, bottom-up, cross-functional, or hybrid approach. Additionally, take into account what should be measured and for how long, but stay open to the possibility that insights might be gained from data that initially seems unrelated.
  2. Tactical Perspectives: One of the key factors in Business Transformation is change management. You need to understand how a change would affect your organization in terms of people, processes, and technologies. You have to take into account the practical implications of this change and what kind of training is needed within your organization.
  3. Operational Perspectives: This is about how Predictive Analytics is executed within your organization. To fully integrate Predictive Analytics into your organization, you have to learn from best practices, learn the pros and cons of your technology infrastructure, and determine if the necessary tools are intuitive enough for people to make use of them.

Now that you understand the different organizational perspectives, it is time to ask the following:

 

Today | Tomorrow
Who uses Predictive Analytics to make decisions? | Who should use Predictive Analytics to make decisions?
What happens to decisions when Predictive Analytics is used? | What would happen to decisions if Predictive Analytics were used?
Where does the data for Predictive Analytics come from? | Where should the data for Predictive Analytics come from?
When is Predictive Analytics relevant? | When should Predictive Analytics be relevant?
Why is Predictive Analytics being used? | Why should Predictive Analytics be used?

When you ask the above questions, keep in mind that the reliability of the information and how it is used within the organization are paramount. A pretty picture does not guarantee that the insights you get are correct, but you can reduce decision-making errors by having people who understand what the data actually means and what it does not.

Measurement

 

5 Questions to Ask About Your Information

Collecting, understanding, and sharing information have been worthwhile pursuits since the dawn of humanity. These pursuits will continue for the foreseeable future, even if the “tools” change. We will continue to use information to make short-term and long-term decisions for our groups and ourselves. But depending upon the sources of the information, we might make good decisions or we might not. It is not until the results of the decisions are evident that we will know if where we ended up is where we wanted to be. Sometimes we will make quick decisions and sometimes we will take our time. But in all of these circumstances, we will always hope that the information sources we used to make our decisions are credible.

In order to understand the information, we need to understand the various “flavors” of information that we receive. Let’s explore them below:

  1. Redundant Information: Think about how many times you have received the same information from two different secondary sources. In your mind, you might be thinking that since two different secondary sources are providing the same information then it must be true. But what if the primary source of the information is the same? What if nothing new has been added to the information that you received? This is the concept of Redundant Information where the primary source of the information is the same and nothing new has been added to it.
  2. Corroborated Information: Think about how many times you have received the same information from two different secondary sources and are sure that the primary sources of the information are different. In your mind, you might be thinking that since the two primary sources are different then it must be true. This is the concept of Corroborated Information where the primary sources of the information are not dependent on each other.
  3. Contradicting Information: Think about how many times you have received information on the same topic from two different secondary sources and found out that they were saying opposite things. This is the concept of Contradicting Information, where the pieces of information that we receive do not agree with each other.
  4. Perspective-Dependent Information: Think about how many times you have received the same information from two different secondary sources and determined that there are various versions of the truth. One version might be at a high level while another version might be at a lower level. This is the concept of Perspective-Dependent Information, where the information that you receive has been looked at from top-down, bottom-up, and horizontal perspectives.
  5. Biased Information: Let’s face it, everyone has biases at some level based on their history, culture, societal norms, politics, religion, age, experiences, interactions with others, and various other factors. These biases can creep into the information that we receive from others and can also influence us when we make our own decisions. This is the concept of Biased Information, where even in the face of mounting evidence that challenges your views, you still hold on to your conscious and unconscious thought processes to make decisions.
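
The first three flavors can be told apart mechanically once you know each report's primary source. The following is a minimal sketch of that reasoning, not a definitive method: the `Report` type, its field names, and the `classify` function are hypothetical names introduced here for illustration, and it assumes each report simply supports or disputes a single claim. Perspective-dependent and biased information need human judgment and are not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """One piece of information about a single claim (hypothetical structure)."""
    secondary_source: str   # who passed the information to you
    primary_source: str     # where that source originally got it
    supports_claim: bool    # True if the report says the claim is true


def classify(a: Report, b: Report) -> str:
    """Label two reports about the same claim with one of the first three flavors."""
    if a.supports_claim != b.supports_claim:
        # The two reports say opposite things.
        return "contradicting"
    if a.primary_source == b.primary_source:
        # Same origin repeated twice: nothing new has been added.
        return "redundant"
    # Independent origins that agree with each other.
    return "corroborated"


blog = Report("Blog A", "Wire Service X", True)
paper = Report("Newspaper B", "Wire Service X", True)
survey = Report("Analyst C", "Field Survey Y", True)
skeptic = Report("Analyst D", "Field Survey Z", False)

print(classify(blog, paper))     # same primary source
print(classify(blog, survey))    # different primary sources that agree
print(classify(blog, skeptic))   # opposite conclusions
```

The point of the sketch is that agreement between two secondary sources tells you little until you trace each one back to where it came from.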

Now that you understand the various flavors of the information that you receive, it is time to ask the following:

 

Today | Tomorrow
Who receives the information? | Who should receive the information?
What happens to the information? | What would happen to the information?
Where does information come from? | Where would information come from?
When is information being shared? | When would information be shared?
Why is information collected? | Why should information be collected?

When you ask the above questions, keep in mind that the information flavors and contexts are closely related. If you understand the information flavors being used but do not understand the context around them, your decisions will be skewed. Also, be wary of looking only at information that confirms your views (aka cherry-picking), since you will miss something that might have helped you better understand the world around you.

Information Flavors

5 Questions to Ask About Your Business Processes

The term business process is used to describe the connectivity of the various “steps” performed to achieve a certain goal. These steps are performed by information systems (e.g., calculate products sold per region), individuals (e.g., print/read reports), or a combination of both. The basis for these steps comes from policies (e.g., thou shall not eat at the computer), procedures (e.g., after you have created a report, make a list of people who actually read it), governance (e.g., when information comes in or is created by the organization, who should distribute it and how), etc. These steps can be for a particular division (e.g., finance) and/or cross-functional (e.g., financial reports used by HR to make offers to potential hires). These steps can also be wasteful (e.g., a division is creating reports for an individual who is no longer with the organization).
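
To make the idea of a wasteful step concrete, here is a minimal sketch, assuming you have already recorded each step along with who consumes its output. The `Step` type, its fields, the `wasteful_steps` function, and the sample names are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One step in a business process (hypothetical structure)."""
    name: str
    performed_by: str   # "system", "individual", or "both"
    division: str
    consumers: list[str] = field(default_factory=list)  # who uses the output


def wasteful_steps(steps: list[Step], current_staff: set[str]) -> list[str]:
    """Return the names of steps whose output nobody still in the organization uses."""
    return [
        step.name
        for step in steps
        if not any(person in current_staff for person in step.consumers)
    ]


steps = [
    Step("calculate products sold per region", "system", "sales", ["finance_analyst"]),
    Step("print weekly report", "individual", "finance", ["former_manager"]),
]

# Only finance_analyst is still with the organization in this example.
flagged = wasteful_steps(steps, current_staff={"finance_analyst"})
print(flagged)
```

A step with no recorded consumers at all is also flagged, which is a deliberate choice in this sketch: if nobody can say who uses the output, the step deserves a second look.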

In order to understand the complexities of the business processes that are ingrained into the organization, the following questions need to be asked about your current and future business processes:

 

Today | Tomorrow
Who follows business processes? | Who should follow business processes?
What happens in business processes? | What should happen in business processes?
Where do business processes take place? | Where should business processes take place?
When do business processes happen? | When should business processes happen?
Why do business processes happen? | Why should business processes happen?

When you are asking the above questions across all levels of the organization, keep in mind that there is an interconnectedness among the information that you are collecting, even if it is not evident at first glance. During or after the collection of this information, it is useful to create business process maps to show what happens today and what would happen in the future. These maps should not be created for their own sake but to support intelligent decisions. They should be kept in a central place where people can easily access them, and people should be able to understand them without the need for an expert.

Another thing to be cognizant of is who you talk to in the organization, since depending upon who you talk to, their definition of “a business process” might be different from what you are trying to understand. One more term that is used interchangeably with business process is workflow. For technical folks, this can also mean the business process that happens within an information system.

In conclusion, too often organizations struggle because of the ineffective communication and management mechanisms they have in place. By mapping the business processes and determining their qualitative and quantitative values, you will be able to see these gaps and make decisions that can prove to be beneficial to you as an individual and to the organization as a whole.

Process

Zillow.com and the MLS CIO

Let’s suppose that you are the Chief Information Officer (CIO) of a Multiple Listings Service (MLS) and a proposal has been put forth by Zillow.com to join Zillow’s Partnership Program (ZPP). For this scenario assume: (1) hiring, business processes, and technology infrastructure would remain unchanged and (2) a budget would only be provided to create data feeds used by Zillow.com. Here are some of the risks, disadvantages, and advantages of joining ZPP:

Risks to MLS:

  1. Brand recognition: The MLS brand recognition would be compromised if (1) current MLS users transition completely to Zillow and (2) future users never become aware of the MLS’s existence.
  2. Zillow’s Zestimate: Zillow provides a property’s cost estimate to users based on a proprietary algorithm called the “Zestimate”. Research indicates that (1) in certain areas these estimates are wildly off and (2) Zillow has changed the algorithm in the past without prior notice. This would result in user confusion and the perception that it could be an MLS issue thus affecting the MLS’s credibility.
  3. Zillow’s acquisition strategy: Zillow has grown through acquisition and it is expected that this strategy would continue. Due to the complexity of management and systems integration during acquisitions, there is a possibility that not enough resources would be available from Zillow if there were issues with the MLS at the same time.
  4. Customer conversions: By joining ZPP, the MLS would exponentially increase the number of users who can view the MLS data through Zillow’s website and mobile applications. However, an increase in the number of views is not a guarantee that those users would become customers.

Risks to MLS Information Systems:

  1. Technology infrastructure: The MLS could encounter an exponential increase in the number of users viewing MLS data, which could overwhelm the servers. This could be an issue if the MLS (1) is currently running at full capacity and (2) does not have an updated technology infrastructure.
  2. Data security incidents: By sharing data with Zillow, the MLS could anticipate an increase in security incidents, either from (1) data in transit from the MLS systems to Zillow and/or (2) data compromised at Zillow.

Disadvantages by not joining ZPP:

  1. Users: Zillow draws 55.7 million mobile and web visitors, more than the entire population of a major metropolitan area. The MLS would not have access to such a large user base if it did not join ZPP.
  2. Adoption: If the MLS does not adopt ZPP in a timely manner, it could be perceived by the industry in general and the MLS community in particular as behind the times, which could erode the MLS’s ability to hire top talent for projects.
  3. Information relevance: Since (1) Zestimate pulls information from previous years’ tax assessments and (2) users have the ability to edit data, careful consideration should be made about the relevance of the information since the accurate reflection of the up-to-date fair market value could be an issue for the user.

Advantages by joining ZPP:

  1. Users: Access to 55.7 million mobile and web visitors that access Zillow monthly.
  2. Account Executive: A dedicated account executive would be assigned to the MLS. This could help in coordination and quickly resolving issues between Zillow and MLS.
  3. Metrics and traffic statistics: Zillow would be sharing user metrics and traffic statistics. This information (1) could be used by MLS to prepare for peak times and enhance maintenance schedules and (2) could be used by brokers and agents to improve their business through trends and predictive analytics.

Recommendations:

Based on the risks, disadvantages, and advantages above, the MLS is not yet ready to join ZPP; however, joining would be significantly beneficial. Thus the CIO should recommend:

  1. A budget increase to develop data feeds and update the technology infrastructure so that it is robust, resilient, scalable, and reliable enough to handle exponential user growth.
  2. New policies, procedures, processes, and governance models to address optimal firewall settings, data integrity issues, security, escalation, prioritization, and communication channels between the MLS and Zillow.
  3. Recruiting an experienced account executive who has taken their MLS through the same ZPP process.
Risks, disadvantages and advantages of joining ZPP

Where is My Big Data Coming From and Who Can Handle It

Recently, a reader asked for my insights on the article “Data Scientists are the New Rock Stars as Big Data Demands Big Talent”. Here is my response.

It seems like in today’s world people and organizations are struggling with this big data concept and do not know where to begin. For this reason, they are collecting everything they can think of in the hope that one day they will be able to use this data in a meaningful way, such as better customer experience, new products/services, better collaboration, increased revenue, etc. This approach of “let’s collect data and later decide what we can use it for” might seem sound on the surface, but last I checked hope is not a strategy. Perhaps this is one of the reasons that even now only <1% of the data collected is actually being analyzed. What good is more data when one cannot even make sense of the other 99%+ of data it already has? Are we chasing a ghost?

While it is true that vast amounts of data are and will be generated from financial transactions, medical records, mobile phones, social media, and the Internet of Things, there are questions that need to be asked to understand data’s meaningful use:

  1. How will data be managed?
  2. How will data be shared?

I believe that in order to come to a point where data becomes meaningful and useful it would require (broadly speaking) three phases:

  1. Establishment of standards, governance, guidelines. (E.g., open architectures)
  2. Creation of industry specific data exchanges. (E.g., healthcare data exchanges, environment data exchanges, etc.)
  3. Creation of cross-industry data exchanges. (E.g., healthcare data exchanges seamlessly interacting with environmental data exchanges, etc.)

Additionally, let’s keep in mind that the data we are talking about is data that can be captured by current tools and systems. The data that is perhaps the most difficult to capture is unstructured human data, which within organizations is called Institutional Knowledge. This does not reside in a document or a system but in the minds of the people of an organization who understand what needs to be done in order to move things forward.

So the question becomes: do we really need Data Scientists who have a mix of coding skills, PhDs in scientific disciplines, and business sense, or do we need someone who is able to connect the dots and has the ability to create the future? The answer is not a simple one. Perhaps you need both. The ability to code should not be the deciding factor; rather, the ability to leverage technology and data should be. I agree that there is a shortage of people with diverse talent, but there is also a shortage of people who actually know how to leverage this kind of talent.

Before organizations go on a hiring spree they should consider:

  1. Why do they need a Data Scientist? (E.g., have strategic intent, jumping on the bandwagon, etc.)
  2. Who will the Data Scientist report to? (E.g., Board, CEO, CFO, COO, CIO, etc.)
  3. Does the organization have the ability to enhance/change its business model? (E.g., making customers happy, leading employees, etc.)
  4. Is the Data Scientist really an IT person with advanced skills or does s/he have advanced skills and happens to know how to leverage technology and data?
  5. How often will you measure the relevancy of the data? (E.g., key data indicators)
3 Phases of Big Data Harmonization