In a world stocked to the brim with products vying for consumers’ attention, brands need to focus on personalized feedback to earn the trust of the user.

For my parents’ generation, personalization was simply a recommendation from a friend or family member who knew their situation and could suggest a product that might help. Today, Amazon sells more products through its use of sophisticated ratings and recommendation systems.

From an enterprise’s perspective, platforms such as Amazon don’t offer brands much control to build a relationship with their consumers outside of damage control events (for example, contacting a customer after bad reviews and ratings).


[Image: screenshot]

This person is distraught after a trial-and-error experience, and the brand was not there to support them.


Outside of the brand packaging, enterprises have no means of directly connecting with consumers in the store. However, there is another way that everyday products can be enhanced with a layer of personalization: the smartphone.


Every Product Can Be Personalized with a Smartphone

Enterprises, especially in Silicon Valley, are investing in the Internet of Things (IoT) because they realize that consumers are expecting more and more “smartness” in their shopping experience.

But how can this “smartness” be delivered? There are two competing options right now:

  1. Devices specifically designed by an enterprise containing RFID chips/sensors, or
  2. Something more ubiquitous, like processing images and sensor data from a smartphone.

External devices carry a high per-unit cost and have a much lower adoption rate than simply building product personalization into the smartphone.

The question we ask is: what are the right elements to create such a platform?


Platforms for End-to-End Brand Management Employing Machine Learning Are Built from the Right Datasets

There are two elements you require to build a successful platform: (1) machine learning tools that are optimized for the use case and (2) an understanding of the end user and how these products affect their lives.

Together, these two elements create the need for a dataset that reflects the needs of the end user, so that optimal analysis and recommendation tools powered by machine learning can be built on it.

My argument here is that startups working on machine learning for consumer healthcare and CPG need to continuously collect their own datasets rather than rely on datasets from enterprise clients, or even on datasets from a couple of years ago.

Enterprises that are in the CPG and pharmaceutical space need to keep up with a burgeoning number of new devices out there and determine the best ways to integrate with these devices to engage their customers. This means they cannot rely on old datasets collected over the last several years to design tools involving machine learning.

As a company, CRIXlabs is continuously collecting new and updated datasets to refine our tools for the latest devices. This allows us to keep up with the trends in how consumers engage with their devices.

One of the factors that assists us in acquiring precise datasets is our years of research and development in machine learning on images and multimodal datasets. This lets us statistically define the bounds of the datasets we need to collect (images and sensor data) and enables functionality associated with user engagement, such as diagnosing a skin condition and predicting the next instance of that condition.

In my efforts to study the consumer and find appropriate and effective user acquisition strategies, I’ve approached the acquisition process along two dimensions:

  • Spatial: images, sensor data, GPS, and other methods of interaction. The right combination depends on the specific use case.
  • Temporal (time series): observing this data over time to see changes and patterns.
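The two dimensions above can be sketched as a simple data structure. This is only an illustration, not a description of the CRIXlabs platform: the `Observation` fields and the `build_time_series` helper are hypothetical names chosen for the example. Each record holds the spatial inputs captured at one moment, and grouping records per user in chronological order gives the temporal view.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Observation:
    """One capture from a user's smartphone (all fields are illustrative)."""
    user_id: str
    captured_at: datetime                    # temporal axis
    image_path: Optional[str] = None         # spatial: photo of the area of interest
    sensor_readings: Optional[dict] = None   # spatial: e.g. ambient light
    gps: Optional[tuple] = None              # spatial: (latitude, longitude)


def build_time_series(observations):
    """Group observations per user and sort each group chronologically."""
    series = defaultdict(list)
    for obs in observations:
        series[obs.user_id].append(obs)
    for user_obs in series.values():
        user_obs.sort(key=lambda o: o.captured_at)
    return dict(series)


observations = [
    Observation("user-a", datetime(2016, 3, 2), image_path="a_2.jpg"),
    Observation("user-a", datetime(2016, 3, 1), image_path="a_1.jpg"),
    Observation("user-b", datetime(2016, 3, 1), gps=(37.77, -122.42)),
]
series = build_time_series(observations)
```

Which spatial fields are actually filled in would depend on the use case; the optional fields reflect that not every capture needs every input.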

This approach has helped us secure precise datasets that make it possible to diagnose skin conditions that previously could not be diagnosed from images, and to predict triggers for a specific condition. Teams that have resorted to mining big data to create their algorithms do not have these capabilities.


Old School Data Acquisition Experts

In our experience working with multiple enterprise clients, they usually either (1) are sitting on big, outdated datasets or (2) have no relevant data for their desired projects. One reason for this is the advancement of modern technology and the way users engage with it. For example, just look at how smartphone camera resolutions have increased and at the sheer number of pictures and videos being taken on a daily basis.

Enterprises need to pick a method for data acquisition. The first is to go out and grab everything they can (Big Data) and, after a certain amount of time, use tools and teams to start finding insights in the data. The second is to start with a specific set of tools and precise data, with heavy usability testing in the real world, to derive an offering of value to the end user.

Let me explain the two methods above in an analogy about the Gold Rush:

Think of data as the dirt covering the United States. The only data that matters for your needs is the gold nuggets. If you’re tasked with finding gold, are you going to pick a random spot to start digging and hope for the best? I’d hope not. You need to locate areas where veins of gold are highly likely to be found (such as in California) and put all of your efforts there.

But how do you find that vein? What’s worked for us is (1) a laser focus on a well-defined use case and (2) tried and tested data acquisition tools. We were able to gather the right datasets to build our platform by carefully defining the bounds of the user base, the nature of their interaction with devices, and the quality and number of data points we needed to generate a training dataset that works in conjunction with the tools we created. I often find that tool optimization and data acquisition for precise prediction and diagnosis is a process of constant iteration.
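As a rough sketch of what "defining the bounds" could look like in practice, the example below writes the bounds down as explicit acceptance criteria and filters incoming samples against them. Everything here is an assumption made for illustration: the `DatasetBounds` fields, the sample dictionary layout, and the thresholds are hypothetical and are not taken from our platform.

```python
from dataclasses import dataclass


@dataclass
class DatasetBounds:
    """Hypothetical acceptance criteria for one well-defined use case."""
    min_image_width: int
    min_image_height: int
    required_sensors: frozenset
    min_samples_per_user: int


def sample_is_in_bounds(sample: dict, bounds: DatasetBounds) -> bool:
    """Check a single capture against the image and sensor requirements."""
    width, height = sample["image_size"]
    has_resolution = (width >= bounds.min_image_width
                      and height >= bounds.min_image_height)
    # Every required sensor must be present among the sample's readings.
    has_sensors = bounds.required_sensors <= set(sample["sensors"])
    return has_resolution and has_sensors


def select_training_data(samples_by_user: dict, bounds: DatasetBounds) -> dict:
    """Keep in-bounds samples, then keep only users with enough of them."""
    selected = {}
    for user_id, samples in samples_by_user.items():
        kept = [s for s in samples if sample_is_in_bounds(s, bounds)]
        if len(kept) >= bounds.min_samples_per_user:
            selected[user_id] = kept
    return selected


bounds = DatasetBounds(1024, 768, frozenset({"light"}), 2)
samples_by_user = {
    "u1": [
        {"image_size": (2000, 1500), "sensors": {"light": 0.4}},
        {"image_size": (2000, 1500), "sensors": {"light": 0.5}},
        {"image_size": (640, 480), "sensors": {"light": 0.5}},
    ],
    "u2": [{"image_size": (2000, 1500), "sensors": {}}],
}
selected = select_training_data(samples_by_user, bounds)
```

In an iterative process like the one described above, the bounds themselves would be revisited as the tools and the use case evolve.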


Gathering Precise Data Is Now Mandatory for All Enterprises

My role at CRIXlabs centers around studying how users interact with their devices and acquiring appropriate datasets to enable the platform to accurately diagnose, predict and return recommendations and specific guidance for the user.

As a system design engineer, I’ve become an advocate for the Right Data and not Big Data. The acquisition of precise datasets has enabled our Quantified Skin platform to use the images and sensor data available on the smartphone to personalize products for a consumer.

Lately, we’ve seen that it’s not intuitive how to find the Right Data, and large enterprises are looking for guidance on how to form teams to go out and gather this precise data. We’ll be holding some webinars and in-person meetups in the coming months to help those interested in finding the right data and appropriate tools for building real bonds with their consumers.



Jon Stenstrom is a co-founder of CRIXlabs (DBA Quantified Skin). Prior to co-founding CRIXlabs, Jon worked as a systems engineer for Invetech and ran an eCommerce site for a Grammy Award-winning, Southern California-based rap artist. Jon brings his expertise in design, systems engineering, cognitive science, and CRM tools to CRIXlabs. At CRIXlabs, Jon focuses on acquiring datasets for creating optimal tools based on machine learning and on studying the nature of consumer interaction with smartphones and wearables.