The Customer-Centric Marketer  |  iperceptions Blog

3 Principles You Can’t Ignore for Feedback Text Analysis


Dec 11, 2015, By Naoko Tomioka, Ph.D.


What do your customers want? This is a question that every company is striving to answer. Typically, the best method is to just ask. Asking customers on your website to provide feedback and answer open-ended questions, like “what do you want to see on this website”, allows customers to describe in their own words what they want.

Nothing beats open-ended questions for insight, level of detail, and context. However, open-ended answers are harder to analyze, especially when you collect a large volume of responses. You therefore need a proven method to capture what people say and zoom in on the most important comments.

But not all text is the same, and the methods that extract the greatest insight depend on where the text came from. For example, open-ended responses collected from customer feedback left on your digital properties are completely different from text scraped from your Twitter feed or other social media platforms.

In this post, I will outline three guiding principles to consider when unearthing actionable open text insights from a Voice of the Customer program.

1. Understand how people speak

Text data collected from a Voice of the Customer (VoC) project has very distinct linguistic patterns. The most commonly used words in VoC text typically do not match the most commonly used words of formal English (cf. [1]).

For example, below are comments left by visitors on a website responding to the question, “What could [The site] add or improve on the website to make it better?”

  • “Try to make the pages with less stuff that to make it go faster”
  • “More facts, less fluff”
  • “Give me the real prices!”
  • “Let me find and order accessories easier.”

As shown above, sentences from VoC text data are succinct, often leaving out subjects, objects, and other linguistic items that prescriptive English grammar would treat as obligatory. The topics are also narrowly focused: under a generic semantic categorization, nearly all of them fall within a single taxonomic category.

These features of VoC text heavily limit the effectiveness of generic text analytic tools, because such tools are built on the assumption that the text has a “standard linguistic form”, conforming to the rules of English grammar narrowly defined for formal speech and publications. In addition, the granularity of information that digital analytics requires makes general-purpose semantic categorization tools ineffective. These tools generally assign the semantic category “computer” to most keywords relating to the technical aspects of digital analytics.

In the case of VoC text data from a hospitality website, most of the text will be assigned to the semantic category “travel”. To extract actionable insights from VoC text data, you need a much finer-grained semantic categorization system, with a clear focus on the semantic categories that are most relevant to digital analytics. It is therefore essential, when analyzing VoC open-text answers, to consider both the distinct linguistic nature of these responses and the information needs of digital analytics.
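To make the idea of finer-grained categories concrete, here is a minimal sketch in Python. It is only a keyword lookup, far simpler than a real categorization system, and the category names and keywords are hypothetical choices made for this illustration. It is applied to the four visitor comments quoted earlier in this section.

```python
import re

# Hypothetical fine-grained categories and trigger keywords.
# These are assumptions for illustration, not an actual taxonomy.
CATEGORIES = {
    "page speed": {"faster", "slow", "load"},
    "content quality": {"facts", "fluff"},
    "pricing": {"price", "prices"},
    "ordering": {"order", "checkout"},
}


def categorize(comment):
    """Return the sorted categories whose keywords appear in the comment."""
    # Lowercase and split into alphabetic tokens; no grammar is assumed,
    # so fragments such as "More facts, less fluff" are handled as-is.
    tokens = set(re.findall(r"[a-z]+", comment.lower()))
    return sorted(
        name for name, keywords in CATEGORIES.items() if tokens & keywords
    )


comments = [
    "Try to make the pages with less stuff that to make it go faster",
    "More facts, less fluff",
    "Give me the real prices!",
    "Let me find and order accessories easier.",
]

for comment in comments:
    print(categorize(comment), "<-", comment)
```

A generic tool might file all four comments under a single broad category; here each one lands in a different, more specific category that a digital team could act on.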

2. Combining digital analytics and Voice of Customer data

VoC data is linked with information that is crucial to digital analytics – “Who were these digital visitors?”, “What was their intent?” The answers to these questions are best captured through the integration of multiple pieces of information, some from the VoC data and others from structured data of digital analytics.

A visitor’s intent is best captured through a closed-ended question in a VoC study, while the visitor’s path to your digital property and their geographical location are captured through digital analytics data. Such information allows you to segment your visitors and identify the most important group. Once the target group is identified, you can use VoC text analysis to understand the experience of these selected visitors. The key to performing this type of deep-dive analysis effectively is integrating all the necessary information into your text analytic tool. By combining the granular insights from verbatim text with the precise segmentation that digital analytics data provides, you can extract the most relevant information that is aligned with your digital strategy.
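The segment-first, read-second workflow described here can be sketched in a few lines of Python. Everything in this example is an assumption made for illustration: the Response record, its field names, and the intent and traffic-source values attached to each comment are invented, and a real integration would pull these fields from the survey and analytics platforms.

```python
from dataclasses import dataclass


@dataclass
class Response:
    intent: str   # closed-ended survey answer (hypothetical values)
    source: str   # traffic source from digital analytics (hypothetical values)
    comment: str  # open-ended verbatim


# Toy records; the intent and source attached to each comment are made up.
RESPONSES = [
    Response("purchase", "search", "Give me the real prices!"),
    Response("purchase", "email", "Let me find and order accessories easier."),
    Response("research", "search", "More facts, less fluff"),
]


def comments_for_segment(responses, intent, source):
    """Return the verbatims of visitors matching both segment criteria."""
    return [
        r.comment
        for r in responses
        if r.intent == intent and r.source == source
    ]


# Step 1: pick the target segment. Step 2: read only that segment's text.
target = comments_for_segment(RESPONSES, intent="purchase", source="search")
print(target)
```

Keeping the structured fields and the verbatim in the same record is the design choice that matters: the text analysis can then be restricted to any segment without a separate join.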

3. Integrating Key Metrics with Text Analysis

In order to prioritize your digital strategies, you need to address how your visitors feel about you. So, what is the best method to measure how digital visitors feel about their experience?

Assessing sentiment from text is unreliable and imprecise. This is partly because emotions are context-specific. A bad movie doesn’t make people as unhappy as a website that crashed just as they were about to buy a long-anticipated product. And people don’t talk to their friends the same way they talk to a customer service agent (or to the person who designed the website that crashed).

These differences affect the accuracy of text analysis that measures sentiment. General-purpose sentiment analysis tools are designed for one type of text (e.g. movie reviews; see [2]) and will not be accurate when applied to a different type of text. Most importantly, iperceptions research comparing the language of VoC text with the customer’s experience shows that the use of negative language does not equal a negative experience. Instead of relying on text-based sentiment analysis, at iperceptions we combine semantic categories with structured data to identify issues, underperformers, and over-performers based on known key metrics.

By integrating multiple data streams, we are able to take advantage of the strengths of each source, in particular the reliability and robustness of quantitative data such as the customer’s evaluation of their digital experience, their digital intent, and whether or not their goals were met. These are the metrics most relevant to digital analytics and customer experience management, and integrating text analysis with them allows us to extract the most pertinent insights from the survey verbatims. This allows us to identify concepts associated with the lowest overall experience, the highest friction, or other metrics that accurately measure the quality of the digital customer experience.
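As a rough illustration of tying semantic categories to a key metric, the Python sketch below ranks categories by the average overall-experience rating of the visitors who mentioned them. It is not the iperceptions method: the category labels, the 1 to 10 rating scale, and the ratings themselves are all invented for this example.

```python
from collections import defaultdict
from statistics import mean

# Each pair is (semantic category of a comment, overall experience rating).
# Categories, scale (1-10), and values are made up for illustration.
TAGGED = [
    ("pricing", 3),
    ("pricing", 4),
    ("page speed", 6),
    ("page speed", 8),
    ("content quality", 9),
]


def rank_by_mean_rating(tagged):
    """Return (category, mean rating) pairs, lowest-rated category first."""
    ratings = defaultdict(list)
    for category, rating in tagged:
        ratings[category].append(rating)
    return sorted(
        ((category, mean(values)) for category, values in ratings.items()),
        key=lambda pair: pair[1],
    )


ranking = rank_by_mean_rating(TAGGED)
for category, average in ranking:
    print(f"{category}: {average}")
```

Because the ranking uses the rating each visitor gave rather than the tone of their wording, a bluntly worded comment from a satisfied visitor does not drag its category down.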

Not all text analytic tools are created equal

By asking your visitors for feedback, you can access a wealth of information. But just having mountains of customer feedback is not enough. The power of feedback comes from gleaning key insights through text analysis, but beware: not all text analytic tools are created equal. When selecting a VoC solution, ensure that it has text analytic capabilities specifically designed for digital analytics, and that it has the capacity to integrate multiple data streams to provide precise segmentation and produce reliable quantitative measurement.

[1] R. Harald Baayen. Word Frequency Distributions. Springer Netherlands, Dordrecht, 2001.
[2] Richard Socher, Alex Perelygin, Jean Y Wu, Jason Chuang, Christopher D Manning, Andrew Y Ng, and Christopher Potts. Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1631–1642. Citeseer, 2013.

Naoko Tomioka, Ph.D.

Naoko Tomioka is an experienced data scientist with a Ph.D. in linguistics from McGill University. As a data scientist, Naoko works with some of iperceptions’ biggest clients developing integrated linguistic analyses to extract actionable insights from text-based customer feedback.
