The Gist
- Hidden potential. Dark data, often ignored, can be a treasure trove of business intelligence when properly harnessed using AI.
- Security risks. Holding onto dark data without analyzing it poses significant security risks, like unanticipated data breaches.
- AI revolution. Advanced AI technologies like NLP and ML are unlocking new ways to turn dark data into actionable insights.
Dark data refers to the information a business collects, processes and stores, but fails to analyze or use for any business purpose. It can come from sources such as website interactions, IoT sensors, social media activity, transaction records, chat records, and more. While dark data has always represented missed value, artificial intelligence (AI) can finally provide the analytics horsepower needed to effectively tap into its potential on an ongoing basis. Marketers can shift from broad demographics to analyzing the detailed customer intelligence that is hidden within this unused asset. Let’s examine dark data, the ways that AI can be used to leverage it, and how it can benefit marketers.
What Is Dark Data?
The term dark data was coined by analogy with dark matter in physics. Astronomers estimate that roughly 27% of the universe’s total mass and energy is dark matter, so called because it does not interact with light and thus cannot be seen. Similarly, the data we are unaware of may play a larger role for marketers than the data we are aware of.
Dark data can be thought of as the digital equivalent of an old attic, crammed with forgotten stuff that could either be valuable antiques or simply junk. Essentially, dark data is all the data a brand collects, processes, and stores during regular business activities, but generally fails to use for other purposes. For example, think about customer service call logs that just sit there, collecting digital dust. They might contain valuable insights into customer behavior but are largely ignored.
Dark data comes from a variety of sources: IoT devices, server logs, old emails, business documents, social media posts, webpages, tables, spreadsheets, images and more. With the exponential growth of data, thanks to the digitalization of virtually everything, the amount of dark data is drastically increasing. According to a 2019 study by Splunk, 55% of an organization’s total data is dark. Additionally, a report from Veritas indicated that 52% of the average company’s data storage budget is spent on dark data. Dark data is not only potentially useful; when it is not being used to obtain actionable insights, it is costing businesses money with no ROI.
Dark data can also be found in the Deep Web, an umbrella term for parts of the internet that are not indexed by standard search engines. This should not be confused with the Dark Web, a smaller subset of the Deep Web that can only be accessed using special software such as the Tor Browser, which the Tor Project maintains to anonymize access to information, including for people living under the control of oppressive governments. Dark data should also not be confused with dark patterns, which are deceptive design practices that trick consumers into handing over their data and that can breach regulations such as the GDPR.
Dark social is another aspect of dark data and refers to the social sharing of content that happens outside of public view. The term was created by journalist Alexis C. Madrigal in 2012, and refers to social communications that can’t be tracked through the use of conventional web analytics tools. It typically occurs via private channels such as emails, DMs on social media platforms, messaging apps, SMS, and other private forms of digital communications.
Dark data can be a wealth of insights. Imagine if those old customer service call logs could tell you exactly what tweaks could elevate the customer experience. Conversely, holding onto dark data can pose security risks. Consider a data breach involving information that no one knew existed. Dark data is an opportunity that brands can’t afford to ignore.
Bob Brauer, founder and CEO of Interzoid, a data usability consultancy and generative AI-powered data quality solutions provider, told CMSWire that a great deal of insight sits in the unorganized, raw data that organizations accumulate over time. Unfortunately, this potentially valuable resource is often overlooked or forgotten, or its existence is unknown to the business that generated it.
“If this underutilized, often vast ‘dark data’ can be properly harnessed, organized, and readily made available to a business’s decision-makers, these data assets can in many cases become a treasure trove of business intelligence,” said Brauer. “These signals can often be immediately actionable and provide significant value to a company’s operations, influence product road maps, assist in the development of sales game plans, refine customer communication frameworks, and sharpen marketing strategies.”
AI Can Mine Unstructured Data for Actionable Insights
AI has been able to analyze structured data for many years, but recent advances in natural language processing (NLP), natural language understanding (NLU), speech recognition, and machine learning (ML) have enabled AI to handle the ambiguity and complexity of unstructured data such as audio, video, text, documents, emails, chats, social media posts and more.
“Dark data within a company can take various forms, including social media posts, server logs, customer emails, survey responses, mobile GPS logs, website clicks, sales transaction records, employee files, and more,” explained Brauer. “When this diverse data is extracted and organized into a ‘state of usability,’ advanced AI analytics can transform it from a heap of digital clutter into a valuable source of internal intelligence, offering a significant competitive edge.”
Marshal Davis, president and founder at Ascendly Marketing, a marketing agency and consulting firm, told CMSWire that dark data is often replete with unstructured text, which can be an opportunity for better understanding consumer sentiments. “This text can come from a variety of sources such as customer reviews, social media comments, or even chat logs from customer service interactions. While this data is usually ignored or underutilized, it holds the key to understanding your audience on a more personal and emotional level,” said Davis.
Examples of Brands Using Dark Data for Marketing
Davis said that to tap into this reservoir of consumer sentiment, his firm uses advanced sentiment analysis algorithms. “These algorithms are based on NLP techniques that can understand the context, tone, and emotional undertones of the text,” explained Davis. “They can differentiate between positive, negative, and neutral sentiments, and even pick up on subtleties like sarcasm or urgency. This allows us to turn what was once considered ‘noise’ into actionable data.”
“By transforming these vague impressions into quantifiable metrics, we can tailor our marketing strategies more effectively,” said Davis. “For instance, if the sentiment analysis reveals a spike in negative sentiments following a product launch, immediate corrective actions can be taken. This could range from addressing product issues to modifying advertising messages.” Davis said that this level of responsiveness and customization was previously unattainable and has revolutionized their approach to customer engagement.
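To make the idea of turning comments into a quantifiable metric concrete, here is a minimal Python sketch. It is not the method Davis's firm uses: it swaps in a toy word-list scorer for a real NLP model, and the word lists, sample comments, baseline, and the `spike_days` threshold are all assumptions chosen for illustration. A simple scorer like this cannot detect sarcasm or urgency.

```python
from collections import defaultdict
from datetime import date

# Toy lexicons (assumed for illustration; a real system would use a trained model).
POSITIVE = {"love", "great", "excellent", "happy", "fast"}
NEGATIVE = {"broken", "slow", "terrible", "refund", "disappointed"}

def score(text: str) -> int:
    """Return +1 (positive), -1 (negative) or 0 (neutral) by counting lexicon hits."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos > neg) - (neg > pos)

def negative_share_by_day(comments):
    """comments: iterable of (date, text). Returns {date: share of negative comments}."""
    totals = defaultdict(int)
    negatives = defaultdict(int)
    for day, text in comments:
        totals[day] += 1
        if score(text) < 0:
            negatives[day] += 1
    return {day: negatives[day] / totals[day] for day in totals}

def spike_days(shares, baseline, factor=2.0):
    """Days whose negative share exceeds `factor` times the assumed baseline."""
    return sorted(day for day, share in shares.items() if share > baseline * factor)

# Hypothetical comments collected before and after a product launch.
comments = [
    (date(2024, 3, 1), "Love the new app, great work"),
    (date(2024, 3, 1), "Delivery was slow"),
    (date(2024, 3, 1), "Happy with support"),
    (date(2024, 3, 1), "Works fine"),
    (date(2024, 3, 2), "Update is broken, want a refund"),
    (date(2024, 3, 2), "Terrible and slow since launch"),
    (date(2024, 3, 2), "Great price"),
    (date(2024, 3, 2), "Disappointed!"),
]

shares = negative_share_by_day(comments)
print(spike_days(shares, baseline=0.25))
```

In this sample, the second day is flagged because its share of negative comments is more than double the assumed baseline, which is the kind of signal that could prompt a review of the launch.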
Once dark data has been located and analyzed, the opportunities it provides are myriad. Alex Hall, technical director at Intuita Consulting, a data and analytics consultancy, told CMSWire that his business used dark data to improve the customer experience. Through the analysis of the data from a video-on-demand (VoD) platform, Hall’s firm was able to take IP addresses from anonymized set-top box (STB) content access and error logs, and match them with customer router DHCP requests. This enabled them to identify customer issues with hardware and software combinations and feed this data into customer satisfaction and churn reduction initiatives. This required his firm to change the wording of its privacy policy to allow for the use of the content access logs in order to obtain this dark data.
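The matching step Hall describes can be pictured as a join between two log sources on IP address. The following Python sketch is only an illustration under stated assumptions: the record layouts, field names (`router_model`, `firmware`, `error`), and sample values are hypothetical, and it ignores the fact that real DHCP leases change over time, so a production version would also need to match on timestamps.

```python
from collections import Counter

# Hypothetical DHCP lease records: which router setup held which IP address.
dhcp_leases = [
    {"ip": "10.0.0.5", "router_model": "R100", "firmware": "1.2"},
    {"ip": "10.0.0.6", "router_model": "R100", "firmware": "1.3"},
    {"ip": "10.0.0.7", "router_model": "R200", "firmware": "2.0"},
]

# Hypothetical set-top box error log entries, keyed only by IP address.
stb_errors = [
    {"ip": "10.0.0.5", "error": "BUFFER_UNDERRUN"},
    {"ip": "10.0.0.5", "error": "BUFFER_UNDERRUN"},
    {"ip": "10.0.0.6", "error": "DRM_TIMEOUT"},
    {"ip": "10.0.0.9", "error": "BUFFER_UNDERRUN"},  # no matching lease
]

def errors_by_combo(errors, leases):
    """Count errors per (router model, firmware, error type); also count unmatched rows."""
    lease_by_ip = {lease["ip"]: lease for lease in leases}
    counts = Counter()
    unmatched = 0
    for entry in errors:
        lease = lease_by_ip.get(entry["ip"])
        if lease is None:
            unmatched += 1
            continue
        counts[(lease["router_model"], lease["firmware"], entry["error"])] += 1
    return counts, unmatched

counts, unmatched = errors_by_combo(stb_errors, dhcp_leases)
for combo, n in counts.most_common():
    print(combo, n)
print("unmatched:", unmatched)
```

Ranking the combinations by error count is one way such a join could surface the hardware and software pairings worth investigating first.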
Hall was able to analyze the breadcrumb and UI access path to content on the STB to determine portions of code that were seldom used and could probably be removed, saving development costs while enabling ad placement in the UI to be optimized. These two use cases provide examples of how out-of-the-box thinking and dark data provide brands with opportunities to increase ROI while enhancing and improving the customer experience.
The Challenges of Dark Data
While dark data represents a potential wealth of insights and opportunities, extracting value from it is far from simple. First, organizations must invest in cataloging and centralizing their dark data into accessible repositories. Legacy systems often silo data in ways that obstruct analysis. Moreover, dark data tends to be messy, requiring preprocessing and cleansing to be usable. The skills needed to handle large-scale data engineering and orchestration are in short supply.
Once aggregated, actually applying analytics at scale has its difficulties too. Advanced AI techniques like NLP and computer vision are data-hungry, needing huge training sets to be effective. Tagging and labeling unstructured data to feed AI is manually intensive. Many companies also lack in-house ML and data science expertise.
AI-driven data solutions such as Splunk, Microsoft Azure Cognitive Services, and IBM Datacap feature data analysis tools that enable brands to stay on top of their dark data. Ironically, one of the least obvious applications of dark data lies in its role in providing additional data that will be used to train AI applications. The more data that AI ingests, the better the actionable insights it is able to provide.
Brauer said that achieving a “state of data usability” is often the major prerequisite — and stumbling block — for conducting advanced AI data analyses, especially when dealing with dark data. “Common challenges include inconsistent data forms where the same data is represented in a number of different ways, redundant or duplicated entries (including the inability to either separate or combine interactions by the same individual or company), unverified information, and mismatched data formats,” said Brauer. “These challenges render the data difficult to make use of effectively.”
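Two of the challenges Brauer lists, inconsistent data forms and duplicated entries, can be illustrated with a short Python sketch. This is not Interzoid's approach: it uses a simple rule-based normalization as a stand-in, and the suffix list and sample names are assumptions. Real entity matching usually needs fuzzier comparison than this.

```python
import re

# Assumed list of legal suffixes to ignore when comparing company names.
SUFFIXES = r"\b(inc|incorporated|corp|corporation|llc|ltd|co)\b"

def normalize_company(name: str) -> str:
    """Reduce a company name to a comparison key: lowercase, no punctuation or suffixes."""
    key = name.lower()
    key = re.sub(r"[^\w\s]", " ", key)   # replace punctuation with spaces
    key = re.sub(SUFFIXES, " ", key)     # drop common legal suffixes
    return " ".join(key.split())         # collapse repeated whitespace

def group_duplicates(names):
    """Group raw names that share the same normalized key."""
    groups = {}
    for name in names:
        groups.setdefault(normalize_company(name), []).append(name)
    return groups

# Hypothetical entries as they might appear across CRM exports and email logs.
raw_names = ["Acme, Inc.", "ACME Inc", "acme corporation", "Globex LLC", "Globex"]
groups = group_duplicates(raw_names)
for key, variants in groups.items():
    print(key, variants)
```

Once variants are grouped under one key, interactions by the same company can be combined before any AI analysis is run on them.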
There are process challenges as well. IT teams accustomed to structured databases are now tasked with rapidly iterating to activate dark data findings. Responsible use of data demands solid governance frameworks be in place. And the technical complexity of deploying AI models on dark data adds overhead.
“Additional challenges in leveraging dark data involve the intricacies of data integration, where diverse data assets must be cohesively and coherently merged within analytics platforms to be available for use,” said Brauer. “Also, organizations must often grapple with the sheer volume of data that requires curation, which requires considerable effort and specialized expertise. Furthermore, data security, compliance, and regulatory considerations add layers of complexity. A lack of expertise in applying relevant AI to these newly-prepared datasets is another obstacle, as is the ability to translate AI-generated insights into actionable business strategies.”
Final Thoughts on Dark Data
While bringing dark data to light poses challenges, the customer and market insights buried within present an enticing opportunity for brands. With the right data infrastructure, AI expertise, and governance, brands can efficiently mine enterprise data for actionable insights, a deeper understanding of customers and increased ROI while improving the customer experience.