FROM SPECIFIC TO GENERAL INTELLIGENCE

BY ADAM GROSSMAN

Team members from Google Brain, Google Research, and the University of Toronto recently published a paper entitled “One Model To Learn Them All.” What is the model, and what is it trying to learn? The paper opens with the observation that “deep learning yields great results across many fields, from speech recognition, image classification, to translation.” The catch is that “for each problem, getting a deep model to work well involves research into the architecture and a long period of tuning.”

What does this mean in English? One of the challenges in machine learning is that computers focus on specific, rather than general, intelligence. Machines are typically trained to perform a single task (e.g. speech recognition, image classification, or text translation), so solving each new problem requires a new model. A human brain takes a more general approach to problem solving: one person can solve multiple problems with the same “machine” (in this case, the brain).

For machine learning to be more practical in a business context, moving from specific to more general approaches is critical. Sports sponsorship is a great example of why. A partnership usually entails multiple activation elements where machine learning could be applied. For example, a corporate partner of a team or athlete might want to track television-viewable signage during a live game broadcast and examine the social media conversation around its brand at the same time. In the past, this would have required two separate models, and likely working with two separate companies, to achieve these results.

Google’s research demonstrates that this may no longer be necessary. Not only could one model be used for both of these tasks, but the Google researchers also found that even if “a block is not crucial for a task, we observe that adding it never hurts performance and in most cases improves it on all tasks.” A block (as in a building block) in this case means “convolutional layers, an attention mechanism, and sparsely-gated layers…crucial for a subset of the tasks we train on.” Translation: the model is filled with “blocks” that are important for completing specific machine learning tasks.
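
To make the “block” idea concrete, here is a minimal sketch of a trunk that mixes the three block types the paper names. It is written in PyTorch under our own assumptions, not the researchers’ actual MultiModel code, and the gating shown is dense rather than truly sparse:

```python
# Minimal, hypothetical sketch (not Google's MultiModel code) of a shared
# trunk mixing the three kinds of "blocks" the paper names: convolutional
# layers, an attention mechanism, and (sparsely-)gated expert layers.
import torch
import torch.nn as nn

class MixedBlockTrunk(nn.Module):
    def __init__(self, dim=128, heads=4, experts=4):
        super().__init__()
        # Convolutional block: useful for image-like inputs.
        self.conv = nn.Conv1d(dim, dim, kernel_size=3, padding=1)
        # Attention block: useful for sequence tasks such as translation.
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Gated expert block (a dense toy version of sparse gating).
        self.gate = nn.Linear(dim, experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(experts))

    def forward(self, x):  # x: (batch, seq, dim)
        # Every input passes through every block, whether or not that
        # block is "crucial" for the task at hand.
        x = x + self.conv(x.transpose(1, 2)).transpose(1, 2)
        attn_out, _ = self.attn(x, x, x)
        x = x + attn_out
        weights = torch.softmax(self.gate(x), dim=-1)            # (batch, seq, experts)
        expert_out = torch.stack([e(x) for e in self.experts], dim=-1)
        x = x + (expert_out * weights.unsqueeze(-2)).sum(-1)
        return x
```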

What is interesting here is that the whole is greater than the sum of its parts. The researchers found that the model often works better when all of the blocks are used together than when individual blocks work on their specific tasks, even when a block is not required to solve a given problem. In a sports context, suppose a partner is only interested in results for television-viewable signage. A general model that can examine both television-viewable signage and social media conversation actually analyzes the signage better than a specific model built for signage alone.
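
A hypothetical illustration of that claim, reusing the trunk sketch above: two sponsorship tasks (the task heads and their label sets are invented for this example) train jointly, so gradients from both tasks flow through the same shared blocks:

```python
# Hypothetical illustration: two sponsorship tasks sharing one trunk.
# "signage_head" and "social_head" are invented names for this sketch.
import torch
import torch.nn as nn

trunk = MixedBlockTrunk(dim=128)    # shared blocks from the sketch above
signage_head = nn.Linear(128, 2)    # logo on screen: yes / no
social_head = nn.Linear(128, 3)     # sentiment: negative / neutral / positive

optimizer = torch.optim.Adam(
    list(trunk.parameters())
    + list(signage_head.parameters())
    + list(social_head.parameters())
)
loss_fn = nn.CrossEntropyLoss()

def train_step(signage_batch, social_batch):
    """One joint step: both tasks push gradients through the same trunk,
    which is how a "social" block can end up helping the signage task."""
    x1, y1 = signage_batch          # x: (batch, seq, 128), y: (batch,)
    x2, y2 = social_batch
    loss = (loss_fn(signage_head(trunk(x1).mean(dim=1)), y1)
            + loss_fn(social_head(trunk(x2).mean(dim=1)), y2))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```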

Why does this happen? According to the research team, “there are computational primitives shared between different tasks that allow for some transfer learning even between such seemingly unrelated tasks.” Translation: while the blocks are different, they share some of the same foundational elements. The blocks can therefore learn from each other, with even seemingly unrelated blocks adding new information to a task. This would be the equivalent of the sponsorship team helping the general manager decide whether to sign a player to a new contract. While seemingly unrelated, the sponsorship team can quantify a player’s off-field value, meaning his ability to engage with the team’s fans in ways that drive new revenue for the organization, in part by examining the perception, engagement, and value of the social media conversation around that player. This insight can help a team’s general manager better determine whether it makes sense to sign that player.

At Block Six Analytics (B6A), we share the view that cross-channel analysis of similar tasks generates better results than deploying new technology for each task. We have developed products that leverage machine learning to create holistic insights for buyers and sellers of sports sponsorship. Although we do not use a single machine learning model, we do use optical character recognition (OCR), a computer vision algorithm, to “look” at images. B6A’s Media Analysis Platform (MAP) uses OCR to “see” text-based logos on screen (think of seeing the word “Pepsi” on screen), while our Social Sentiment Analysis Platform uses OCR to “see” logo images in social media posts.
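
As a simplified sketch of the idea (not B6A’s proprietary MAP or Social Sentiment code), an open-source OCR engine such as Tesseract can spot a text-based logo in either a broadcast frame or a social media image with the same routine:

```python
# Simplified illustration of OCR-based logo spotting using an open-source
# OCR engine; this is a sketch, not B6A's actual MAP pipeline.
import pytesseract
from PIL import Image

def brand_visible(image_path: str, brand: str = "Pepsi") -> bool:
    """Return True if the brand's text appears in the image."""
    text = pytesseract.image_to_string(Image.open(image_path))
    return brand.lower() in text.lower()

# The same routine "sees" a broadcast frame or a social media photo
# (file names here are hypothetical):
# brand_visible("broadcast_frame_0412.png")
# brand_visible("tweet_photo.jpg")
```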

B6A does, however, employ a single valuation model to determine the ROI of corporate partnerships. Our Corporate Asset Valuation (CAV) Model applies the same framework of quantity (how many people see a partnership), quality (how valuable the activation element and audience are to the corporate partner), and engagement (how much interaction an audience has with an activation element) across all in-venue, traditional media, digital, event, hospitality, and IP activations.
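
The three factors above come from the CAV framework; the arithmetic below is purely illustrative. A multiplicative form is one plausible way to put every activation element on a common dollar scale (the function name, CPM baseline, and numbers are assumptions for this sketch, not B6A’s actual formula):

```python
# Hypothetical arithmetic for the quantity / quality / engagement framing.
# The three factors come from the article; the multiplicative form, CPM
# baseline, and numbers are illustrative assumptions only.
def asset_value(impressions: float, quality: float, engagement: float,
                rate_per_thousand: float = 25.0) -> float:
    """Score one activation element on a common dollar scale.

    impressions       -- quantity: how many people see the asset
    quality           -- 0-1 multiplier for asset/audience fit to the partner
    engagement        -- 0-1 multiplier for audience interaction with the asset
    rate_per_thousand -- baseline media rate (CPM) in dollars
    """
    return (impressions / 1000) * rate_per_thousand * quality * engagement

# The same function scores a TV-viewable sign and a sponsored tweet,
# which is what makes cross-channel comparison possible:
tv_sign = asset_value(impressions=2_400_000, quality=0.8, engagement=0.3)
tweet = asset_value(impressions=150_000, quality=0.9, engagement=0.6)
```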

Using a general model for specific assets enables our clients to compare results across different channels while also improving insight generation for each channel, much as Google’s general model produces improved results on individual tasks. In addition, the results of activation elements in one channel can inform the analysis of another channel. For example, our CAV Model determines the value a television-viewable logo receives in part by examining lifts in social media conversation on Twitter, Facebook, and Instagram.
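
As a purely illustrative sketch of that cross-channel adjustment (the lift definition and numbers are assumptions for this example, not the actual CAV calculation):

```python
# Illustrative only: adjust a TV asset's value by the social "lift" it drives.
def social_lift(mentions_during: int, mentions_baseline: int) -> float:
    """Fractional increase in brand mentions during the broadcast window."""
    return (mentions_during - mentions_baseline) / max(mentions_baseline, 1)

tv_logo_value = 14_400.0   # e.g. the asset_value(...) output from the sketch above
adjusted_value = tv_logo_value * (1 + social_lift(1_800, 1_200))   # +50% lift
```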

The research team makes it clear that its work “treads a path towards interesting future work on more general deep learning architectures.” In other words, machine learning still needs to evolve before it can match the human brain’s general intelligence. Moving from specific to general approaches will be vital to the future success of machine learning analysis.