April 2023

Four challenges when running logo tracking on sport


The goal

Sponsorship is ultimately about brands paying to be in front of audiences. Armed with a logo detection model, I wanted to measure the amount of TV exposure that sponsors get from sporting events. Easier said than done.

 

How it works

In brief, this kind of machine learning (ML) relies on a neural network: a model set up to recognise or classify images and video. The premise is that we programmatically tell our model to scan an image with a filter, which mimics how a human scans the image. This produces a map of image features. With training (telling our model what it’s looking at), we pass the image through a series of filters. This allows the model to produce predicted values for the image it is observing.
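To make the "scan an image with a filter" idea concrete, here is a minimal sketch in plain Python. The tiny image and the hand-written edge filter are assumptions for illustration only; a trained network learns its own filter values and stacks many of them.

```python
def apply_filter(image, kernel):
    """Slide a small square kernel over a 2D image and return the feature map."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Multiply the patch under the kernel by the kernel and sum it up.
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map

# Toy 4x4 grayscale image (hypothetical): dark left half, bright right half.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
# Hand-written vertical-edge filter, chosen only for illustration.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]
feature_map = apply_filter(image, vertical_edge)
```

The feature map is largest where the dark half meets the bright half, which is the kind of "feature" later layers build on.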

Different looks

To be successful the model needs to be able to identify the object in full visibility, partial visibility and with varying levels of light. Interpreting shaded images is a must too. This is why it is called supervised (machine) learning: we are telling the machine what to learn.


Low-quality images should be included when training a machine learning model. They help the model develop a better understanding of how logos appear in different settings. The images you supply to train the model should broadly represent the standard of inputs you will feed into your live processing.
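One way to widen the range of "looks" in a training set is to generate shaded and partly hidden variants of an image you already have. The sketch below is an assumption about how that could be done on a plain grid of pixel values; the function names and the toy logo are hypothetical, and real footage still belongs in the training set alongside any generated variants.

```python
def adjust_brightness(image, factor):
    """Scale pixel values to mimic darker or brighter footage (clamped to 0-255)."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]

def occlude(image, top, left, height, width, fill=0):
    """Cover a rectangle to mimic a partially hidden logo."""
    out = [row[:] for row in image]  # copy so the original stays untouched
    for r in range(top, min(top + height, len(out))):
        for c in range(left, min(left + width, len(out[r]))):
            out[r][c] = fill
    return out

# Hypothetical 3x3 grayscale "logo".
logo = [[200, 200, 200], [200, 100, 200], [200, 200, 200]]
shaded = adjust_brightness(logo, 0.5)
partly_hidden = occlude(logo, top=0, left=2, height=3, width=1)
training_variants = [logo, shaded, partly_hidden]
```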

 

Sourcing data

Machine learning (a subset of artificial intelligence) requires significant data processing. For each new object you want to train your model to recognise, you need, as a rule of thumb, at least 1,000 labelled examples of that class.


While you can find ready-made datasets, the task is no less daunting. Many datasets from public sources are poorly maintained. Others may not be exactly relevant to your use case. In my case I wanted to track logos, so I used Google’s logo image database, containing over 100,000 corporate logos. Even so, that doesn’t lessen the sheer scale of labelling required if you are starting from scratch.

Similarly, a model won’t miraculously be able to see something that you can’t. A blurred input image is a blurred input image. An object that you can’t make out by eye won’t become visible to the model.

A brilliant article and example by Brian Cockfield showed how artificial intelligence can provide “smoothing” of video sequences. For test cases like the World Rally Championship (WRC) and motorsports, his technique could correct images in live TV and deliver higher levels of consistent logo visibility. 


Capturing motion

Think of how your eyes track a sequence of events as motion or interrupted motion, such as an overtake in a race. For a computer to process this, it must break the sequence into a series of frames. This is challenging for a model, especially when using automated motion tracking.
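As a rough sketch of what "breaking the sequence into frames" means in numbers, the snippet below lists the timestamps of the frames a tracker would look at. The frame rate, clip length and sampling step are assumptions for illustration, not values from my setup.

```python
def sampled_frame_times(duration_s, fps, every_nth=1):
    """Return the timestamp (in seconds) of every Nth frame in a clip."""
    total_frames = int(duration_s * fps)
    return [i / fps for i in range(0, total_frames, every_nth)]

# Hypothetical two-second clip at 25 frames per second.
all_frames = sampled_frame_times(2, 25)
one_per_second = sampled_frame_times(2, 25, every_nth=25)
```

Even a two-second overtake becomes 50 separate images at this frame rate, each of which has to be interpreted on its own.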


A running sequence would go something like this:

·       We identify a leading object and a trailing object.

·       The trailing object gets closer to the leading object.

·       We have static background objects or markings, but we can tell that they don’t interrupt what we are observing.

·       Our trailing object now becomes level with our leading object.

·       Our objects merge.

·       Our new leading object accelerates and the two shapes become separated.
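The sequence above can be sketched with bounding boxes. The example below measures how much two boxes overlap in each frame using intersection over union (IoU), a common overlap measure. The coordinates are hypothetical, and this is only an illustration of where the "merge" happens, not the method used by any particular tracking tool.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Hypothetical boxes per frame: (leading object, trailing object), each 4 wide and 2 high.
frames = [
    ((10, 0, 14, 2), (0, 0, 4, 2)),    # trailing object well behind
    ((12, 0, 16, 2), (7, 0, 11, 2)),   # trailing object closing in
    ((14, 0, 18, 2), (13, 0, 17, 2)),  # almost level: boxes mostly overlap
    ((15, 0, 19, 2), (15, 0, 19, 2)),  # the two shapes merge on screen
    ((16, 0, 20, 2), (21, 0, 25, 2)),  # new leader pulls away
]
overlaps = [iou(lead, trail) for lead, trail in frames]
```

Frames with high overlap are the ones where a tracker is most at risk of treating two objects as one.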


I saw automated object tracking tools incorrectly merge shapes when they overlapped. Take a look at this example of a penalty kick.



Our model needs to learn that the two objects should not combine into one new (elongated) shape. This is why the model needs supervision during labelling. 

 

The model needs to learn about the object you want it to track. This way it can assign the different sets of coordinates to the objects it sees. 
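A minimal sketch of assigning coordinates to the right object from one frame to the next is shown below. It uses greedy nearest-centroid matching, which is a deliberately simple stand-in I am assuming for illustration; the labels and coordinates are hypothetical, and production trackers use more robust matching.

```python
import math

def centroid(box):
    """Centre point of an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def assign_ids(previous, detections):
    """Greedy matching: give each known ID the closest unused new box."""
    assigned = {}
    unused = list(detections)
    for obj_id, old_box in previous.items():
        if not unused:
            break
        nearest = min(unused, key=lambda b: math.dist(centroid(old_box), centroid(b)))
        assigned[obj_id] = nearest
        unused.remove(nearest)
    return assigned

# Hypothetical boxes from the previous frame, keyed by label.
previous = {"striker": (0, 0, 2, 4), "goalkeeper": (10, 0, 12, 4)}
# New detections arrive unlabelled and in a different order.
detections = [(9, 0, 11, 4), (1, 0, 3, 4)]
current = assign_ids(previous, detections)
```

Matching on position alone breaks down when two objects fully overlap, which is one reason the appearance of each object also has to be learned.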


Fascinating. Frustrating. Interesting. Important.


Tracking against video is challenging. Motion creates new shapes, shadows and tones which the model can pick up as an object to identify. 

 

Another example to think about is a post-goal celebration. Players on the same team come together to hug and embrace. They raise their arms in the air or jump on top of each other. All of these motions create new shapes, shadows and blocks of colour. The model is observing and interpreting to see if it can identify a logo it has been trained to detect, but quite often it generates false identifications.
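One possible way to reduce false identifications, sketched below, is to count a logo only when the detection stays confident for several consecutive frames. This assumes the model returns a confidence score per frame; the threshold, run length and scores are hypothetical values chosen for illustration.

```python
def stable_detections(frame_scores, min_confidence=0.6, min_run=3):
    """Return (start, end) frame ranges where confidence stays high for at least min_run frames."""
    runs, start = [], None
    for i, score in enumerate(frame_scores):
        if score >= min_confidence:
            if start is None:
                start = i  # a confident run begins
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    # Close a run that lasts until the final frame.
    if start is not None and len(frame_scores) - start >= min_run:
        runs.append((start, len(frame_scores) - 1))
    return runs

# Hypothetical per-frame confidence for one logo: a one-frame flicker, then a sustained sighting.
scores = [0.1, 0.9, 0.2, 0.7, 0.8, 0.9, 0.85, 0.3]
kept = stable_detections(scores)
```

The single confident frame at index 1 is dropped as a likely false identification, while the sustained run is kept. The trade-off is that very brief genuine appearances are discarded too.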


Cost

There is a huge amount of processing power that goes into object recognition models. The tools rely on cloud computing power, which brings processing overheads from global data centres. While the cost per unit is small (cents), costs can accumulate rapidly once you begin to run this over increasing amounts of data. As useful as running logo tracking across entire games is, it may not always be worth the cost.
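A back-of-the-envelope sketch shows how quickly a small unit cost adds up. The price per frame below is a made-up figure for illustration, not any provider's actual rate, and the function name is my own.

```python
def estimated_cost_dollars(match_minutes, sampled_fps, cents_per_frame):
    """Estimate processing cost for one event from the number of frames analysed."""
    frames = match_minutes * 60 * sampled_fps
    return frames * cents_per_frame / 100

# Hypothetical price: 0.1 cents per analysed frame.
every_frame = estimated_cost_dollars(90, sampled_fps=25, cents_per_frame=0.1)
one_per_second = estimated_cost_dollars(90, sampled_fps=1, cents_per_frame=0.1)
```

Under these assumed numbers, analysing every frame of a 90-minute match costs roughly 25 times more than sampling one frame per second, which is why the sampling rate is worth deciding before processing whole seasons.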

Results

In short, there’s a large amount of error involved in working with video. A balance must be struck between practicality and precision. To get more accurate results, monitor over longer timeframes so that you can arrive at an average; running on a small number of events introduces error. Consideration would also need to go towards exposure from press conferences, interviews and media days.
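To illustrate why more events give a steadier average, the sketch below compares the standard error of the mean exposure for a small and a larger set of events. The exposure figures are invented for illustration and are not measured results.

```python
import statistics

def exposure_summary(seconds_per_event):
    """Return the mean exposure and its standard error (None if fewer than two events)."""
    mean = statistics.mean(seconds_per_event)
    if len(seconds_per_event) < 2:
        return mean, None
    std_error = statistics.stdev(seconds_per_event) / len(seconds_per_event) ** 0.5
    return mean, std_error

# Hypothetical seconds of on-screen logo exposure per event.
few_events = [120, 300, 90]
many_events = [120, 300, 90, 180, 210, 150, 240, 170, 200, 160]

mean_few, error_few = exposure_summary(few_events)
mean_many, error_many = exposure_summary(many_events)
```

With these made-up figures the standard error shrinks as events are added, so the average becomes a more trustworthy basis for valuing the exposure.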

Summary

  • The data requirements for A.I. are significant: training requires vast amounts of data and is prone to error.
  • A.I. and technology will become extremely sophisticated in helping to assess and value sponsorship opportunities.
  • Logo tracking is one component of a complex, data-driven valuation.
