Camera Shot Segmentation

SoccerNet Camera Shot Segmentation and Boundary Detection

Selecting the proper camera at the right moment is a crucial task for the broadcast producer, who aims to trigger the strongest emotions in the viewer during a live game. Hence, identifying camera shots not only provides a better understanding of the editing process but is also a major step towards automating broadcast production. This task naturally generalizes to any sports broadcast, and could also prove interesting for other content such as cultural events or movie summarization.

Our task.

Camera shot temporal segmentation consists of classifying each video frame into one of our 13 camera types. In parallel, we define a camera shot boundary detection task, where the objective is to find the timestamps of the transitions between camera shots.
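The two tasks are closely related: once every frame has a camera label, a boundary sits wherever the label changes. The sketch below illustrates this under a few assumptions that are not part of the official tooling: labels are plain integers, the default sampling rate is 2 frames per second, and each boundary is stamped at the first frame of the new shot. The function name `shot_boundaries` is hypothetical.

```python
from typing import List, Sequence, Tuple


def shot_boundaries(
    labels: Sequence[int], fps: float = 2.0
) -> List[Tuple[float, int, int]]:
    """Return (timestamp_seconds, previous_class, next_class) per label change.

    Hypothetical helper: `labels` holds one integer camera class per frame,
    sampled at `fps` frames per second.
    """
    boundaries = []
    for i in range(1, len(labels)):
        if labels[i] != labels[i - 1]:
            # Assumption: the boundary is stamped at the first frame
            # of the new shot.
            boundaries.append((i / fps, labels[i - 1], labels[i]))
    return boundaries


if __name__ == "__main__":
    # Six frames at 2 fps: class 0, then class 3, then class 1.
    print(shot_boundaries([0, 0, 0, 3, 3, 1]))
```

Note that this only recovers where a transition happens, not its type (abrupt, logo, or smooth), which would need to be predicted separately.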

Our classes.

We provide annotations for all common camera shot types: {Main camera center, Close-up player or field referee, Main camera left, Main camera right, Goal line technology camera, Main behind the goal, Spider camera, Close-up side staff, Close-up corner, Close-up behind the goal, Inside the goal, Public, Other, I don't know} as well as all transition types between cameras: {abrupt, logo, smooth}.

Our data.

The data consists of 500 videos from soccer broadcast games, available at two resolutions (720p and 224p). For easier use, we also provide features extracted at 2 frames per second, including the features used by the 2021 challenge winners, Baidu Research.

Our metric.

The mean intersection over union (mIoU) is the main metric for camera shot temporal segmentation. For boundary detection, we use the spotting mAP metric with a single tolerance δ of 1 second, since transitions are precisely localized and happen within short durations.
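To make the two metrics more concrete, here is a minimal sketch and not the official evaluation code. `mean_iou` averages the per-class intersection over union of frame-level labels, assuming that classes absent from both prediction and ground truth are skipped. `match_boundaries` shows only the tolerance-based matching that underlies a spotting metric, not the full mAP computation, and assumes that δ is the total width of a window centered on each ground-truth transition (that is, ±δ/2). Both function names and both assumptions are illustrative; the development kit remains the reference.

```python
from typing import List, Sequence, Tuple


def mean_iou(pred: Sequence[int], truth: Sequence[int], num_classes: int) -> float:
    """Mean per-class IoU over frame-level labels (hypothetical helper)."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same number of frames")
    ious: List[float] = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union > 0:
            # Assumption: classes absent from both sequences are ignored.
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def match_boundaries(
    pred_times: Sequence[float], true_times: Sequence[float], delta: float = 1.0
) -> Tuple[int, int, int]:
    """Greedy one-to-one matching; returns (true_pos, false_pos, false_neg).

    Assumption: a prediction is correct if it lies within delta / 2 seconds
    of a ground-truth transition that has not been matched yet.
    """
    unmatched = list(true_times)
    true_pos = 0
    for t in pred_times:
        candidates = [g for g in unmatched if abs(g - t) <= delta / 2]
        if candidates:
            closest = min(candidates, key=lambda g: abs(g - t))
            unmatched.remove(closest)
            true_pos += 1
    return true_pos, len(pred_times) - true_pos, len(unmatched)


if __name__ == "__main__":
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=3))
    print(match_boundaries([10.2, 30.0, 50.9], [10.0, 50.0], delta=1.0))
```

In a full mAP computation, predictions would additionally be ranked by confidence and the precision-recall curve averaged per class; the matching step above is only the part that depends on δ.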

For more details, check out our development kit on GitHub.

SoccerNet-v2: a new MASSIVE soccer dataset

In this video, we present our paper, “SoccerNet-v2: A Dataset and Benchmarks for Holistic Understanding of Broadcast Soccer Videos”, published at the CVPR 2021 workshop CVSports. We provide 300,000 temporal annotations across 500 soccer games. These annotations enable a 17-class action spotting task, a 13-class camera boundary detection task, and a novel replay grounding task. We provide benchmarks for all of these tasks to launch an international challenge.