Realtime Scoring and Feedback for Dance Moves Using TensorFlow.js

In pretty much every game that’s ever existed, there’s some way to keep track of how you’re doing. This is usually in the form of some sort of score value, such as how many points you have or how many enemies you’ve defeated.

In dance and rhythm games, such as Just Dance and Dance Dance Revolution, there are also scoring systems to give you realtime feedback on your performance. We’ll dive into these soon to see why and how they work.

As it happens, I am working on a dance game as well! It’s much more similar to Just Dance than DDR, but I think there are things that can be learned from both scoring systems. The goal here is to design a scoring system that feels rewarding while not rewarding the wrong moves. I want the player to feel good when they are doing well, and on the flip side, I want to give the player accurate feedback when they aren’t doing well so they can improve.

Examining Existing Scoring Systems

Before trying to create a scoring system from scratch, let’s take a look at some existing scoring systems for similar dance games to see how they did it.

Dance Dance Revolution

Dance Dance Revolution has been out for over 20 years, and its scoring system has been battle-tested throughout this time. If you’re unfamiliar with DDR, it’s a rhythm dance game where you have to step on arrows on a dance pad as they line up with the arrows on the screen to music:

(that’s me playing above!)

It’s a simple concept and it makes the scoring system pretty easy to figure out! If you hit the note right on, you get a perfect score. If not, you get less than perfect, depending on the timing.

Timing is a big concept in DDR. The scoring system uses timing windows, which are periods of time around each note during which a step earns a certain judgement. The timing windows for DDR look like this:

  • Marvelous: +/- 16.7 ms (+/- One Frame)
  • Perfect: +/- 33 ms (+/- Two Frames)
  • Great: +/- 92 ms (+/- 5.5 Frames)
  • Good: +/- 142 ms (+/- 8.5 Frames)
  • Boo: +/- 225 ms (+/- 13.5 Frames)

You get the idea - it’s tricky to get the best scores because you have to hit it perfectly in time with the music and on-screen visuals! But there’s no ambiguity to scoring - you either hit it on time, or you didn’t.
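To make the timing windows concrete, here’s a minimal sketch of how a step could be judged from its timing offset. The window sizes come straight from the list above; the function name, the table layout, and the "Miss" result for anything outside the Boo window are my own assumptions for illustration, not DDR’s actual code.

```typescript
// Judgement names and window sizes are taken from the list above.
// "Miss" for anything outside the Boo window is an assumption of this sketch.
type Judgement = "Marvelous" | "Perfect" | "Great" | "Good" | "Boo" | "Miss";

interface TimingWindow {
  judgement: Judgement;
  maxOffsetMs: number; // half-width of the window (the +/- value)
}

// Ordered from the tightest window to the widest.
const TIMING_WINDOWS: TimingWindow[] = [
  { judgement: "Marvelous", maxOffsetMs: 16.7 },
  { judgement: "Perfect", maxOffsetMs: 33 },
  { judgement: "Great", maxOffsetMs: 92 },
  { judgement: "Good", maxOffsetMs: 142 },
  { judgement: "Boo", maxOffsetMs: 225 },
];

// Returns the judgement for a step at hitTimeMs against a note at noteTimeMs.
function judgeStep(noteTimeMs: number, hitTimeMs: number): Judgement {
  const offsetMs = Math.abs(hitTimeMs - noteTimeMs);
  for (let i = 0; i < TIMING_WINDOWS.length; i++) {
    if (offsetMs <= TIMING_WINDOWS[i].maxOffsetMs) {
      return TIMING_WINDOWS[i].judgement;
    }
  }
  return "Miss";
}
```

Since the windows are checked from tightest to widest, a step that’s 30 ms early lands in Perfect, while one that’s 100 ms late falls through to Good.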

Your overall score at the end of a song is calculated from these “judgement scores” (the names in the list above). If you got all Marvelous, you’ll get a perfect score. Otherwise, it’ll be a little bit lower. If you’re interested in learning more, there’s a fun documentary video about the best DDR player in the world here.
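As a rough illustration of turning judgements into a final score, here’s a small sketch. The weights and the maximum score are hypothetical numbers that I picked for the example (the real DDR formula isn’t covered here); the only property it’s meant to show is that all Marvelous gives a perfect score and anything else gives less.

```typescript
type Judgement = "Marvelous" | "Perfect" | "Great" | "Good" | "Boo" | "Miss";

// Hypothetical weights: the fraction of a note's value each judgement earns.
const JUDGEMENT_WEIGHTS: Record<Judgement, number> = {
  Marvelous: 1.0,
  Perfect: 0.9,
  Great: 0.6,
  Good: 0.3,
  Boo: 0,
  Miss: 0,
};

// Hypothetical score awarded for a flawless run.
const MAX_SCORE = 1000000;

// Averages the weights of all judgements and scales the result to MAX_SCORE.
function overallScore(judgements: Judgement[]): number {
  if (judgements.length === 0) {
    return 0;
  }
  let total = 0;
  for (let i = 0; i < judgements.length; i++) {
    total += JUDGEMENT_WEIGHTS[judgements[i]];
  }
  return Math.round((total / judgements.length) * MAX_SCORE);
}
```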

Just Dance

The scoring system in Just Dance is quite similar to what mine will be: every few seconds, you receive a “judgement score,” or text that indicates how well you’ve been doing in the time since the last judgement.

It looks like this during gameplay:

screenshot of just dance gameplay

Each player is represented on the top and receives various judgements (from OK to MEGASTAR) based on how well they did in the last period. If they got the moves perfectly, they’ll get a MEGASTAR, and if they didn’t do so well, they might get an OK or GOOD judgement.
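Here’s a minimal sketch of that “judgement every few seconds” idea. It assumes we already have some per-frame similarity score between 0 and 1 (how that score is produced is out of scope here), averages it over the interval, and maps the average to a judgement. The class name and thresholds are hypothetical, and I’m only using the three judgement names mentioned above; this isn’t how Just Dance actually does it.

```typescript
// Judgement names are from the text above; the thresholds are hypothetical.
type IntervalJudgement = "MEGASTAR" | "GOOD" | "OK";

class IntervalScorer {
  private frameScores: number[] = [];

  // Record one per-frame similarity score, clamped to [0, 1].
  addFrameScore(score: number): void {
    this.frameScores.push(Math.min(1, Math.max(0, score)));
  }

  // Call this every few seconds: averages the scores collected since the
  // last call, clears them, and returns the judgement for that interval.
  finishInterval(): IntervalJudgement {
    const count = this.frameScores.length;
    let sum = 0;
    for (let i = 0; i < count; i++) {
      sum += this.frameScores[i];
    }
    const mean = count === 0 ? 0 : sum / count;
    this.frameScores = [];

    if (mean >= 0.9) return "MEGASTAR";
    if (mean >= 0.6) return "GOOD";
    return "OK";
  }
}
```

Because the scores are cleared after each judgement, one bad interval doesn’t drag down the next one.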

Just Dance has been available on almost every platform imaginable throughout the last decade or so. Each platform is a little bit different on how the dance moves are detected. For example, when playing on the Nintendo Switch, the gyroscope and accelerometer data from your controllers is used to estimate your arm positions. After that, I imagine that the data is compared to the expected dance movements and you’re given a score. Only your hand positions are taken into account with this approach.

This same approach is taken with the modern Just Dance games on PC, where it connects to your smartphone to estimate your movements. Now it only has one hand to use to give you scores, which doesn’t seem like dancing at all to me! You could sit in a chair and get perfect scores with this approach.

This isn’t the whole story, though. Upon doing more research, I found this quote from an article from 2010:

“We have built in a certain amount of leniency into the rhythm and precision detection, but to say that it’s smoke and mirrors simply isn’t true. We have constructed symmetrical movements using natural body dynamics to identify how well someone is playing even with just one controller. We look at the Wii remote as an extension of the players’ hands and body.

“At first we used the nunchuk as well, but found that the wire would hit you in the face and that you lost that sense of freedom you want when dancing. The system we use takes into account the way you move, dance and handle the Wii Remote to offer more precision in the detection frames. This is seamless for the player, but it takes a lot of work to enable the proper detection through a single input device.”

The game features a DDR-esque scoring system, measuring the player’s performance on a wide variety of factors. I ask Granger whether a better player will always beat a poorer player, or whether the machine can be duped by flailing arms and raw energy. “No way. If you pick up the moves and hit the beat, your performance will be scored accordingly. A better dancer will always beat a poorer player. If you want to be in time and accurate, you have a much higher chance if you follow both arm movements and keep the rhythm with both your lower and upper body. So essentially, you’re better off doing what the dancer is doing… Otherwise you will find the transitions difficult to match.”

What they did for the first Just Dance (and likely subsequent versions) sounds much more impressive than I’ve been giving them credit for! However, based on customer reviews, results from this “one-handed movement detection system” are mixed. There’s only so much data you can extrapolate from a single controller.

In contrast to the single controller though, there have been versions that use the Xbox Kinect (and Xbox One Kinect) throughout the years. These camera sensors would allow you to track full body movements, but I am unsure of the extent to which they were used in the scoring system. Reviewers who used the Kinect seem to have had much better results than those using a single controller.

Side note: I have always been impressed by the fidelity of the dance videos in Just Dance and didn’t think that it was possible to achieve with just animated characters, and I was right! It’s super cool to see how much effort they’ve put into producing their dance videos.

Once you’ve finished playing a song, you’re greeted by a summary screen:

screenshot of just dance summary screen

It shows you how many of each judgement you received and an overall score based on those judgements. There are a few additional scoring attributes, but those are the main aspects.

sparky.tv

As it turns out, someone has already made what I’m making! It’s called sparky.tv and it’s very similar to what I am working on. It uses AI (or rather, ML) to analyze your movements in realtime and compare them to the video to give you scores.

I’m unsure if it uses the same TensorFlow model that I am using or some custom in-house solution instead. One interesting note is that all the processing is done server-side and takes about a half hour to process what are called “game intervals”, or areas of the game that the player will be scored in. My game’s processing is all done client-side and will usually take the duration of the video to process, if not less (depending on configuration options).

The gameplay screen looks a little something like this:

screenshot of sparky.tv gameplay

As you do better or worse, the little yellow running dude above the progress bar in the top right moves forwards or backwards. Every few seconds, a judgement (the “excellent” text and blue glow on the sides) appears based on where the yellow dude is on the progress bar. This means that the current judgement scores aren’t based on your current performance, but your aggregated performance over time. If you fail a move, your dude will go down and cause your next judgement to be lower, but it will not be terribly low (versus Just Dance or Dance Dance Revolution where if you miss a move you get a “miss” judgement score).

This scoring system has its merits in an experimental AI game - the AI that analyzes you isn’t perfect yet, so there are bound to be analysis mistakes. Instead of taking the score each frame, it aggregates it to give you a more consistent score that’s still meaningful and mostly representative of how well you’re doing.
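To show what aggregating over time could look like, here’s a small sketch of a meter that drifts towards each new frame score instead of jumping to it. I’m using an exponential moving average as a stand-in; I don’t know what sparky.tv actually uses. The class name, the smoothing factor, the thresholds, and the "good" and "ok" labels are all hypothetical ("excellent" is the only label taken from the game). Treating an undetected player as a score of 0 is also my own choice.

```typescript
class PerformanceMeter {
  // Meter position in [0, 1]; starting in the middle is an assumption.
  private value = 0.5;
  // How quickly the meter follows new frame scores (0 = never, 1 = instantly).
  private smoothing: number;

  constructor(smoothing: number = 0.1) {
    this.smoothing = smoothing;
  }

  // Move the meter towards the latest frame score (expected in [0, 1]).
  // A null score means no player was detected and is treated as 0 here.
  update(frameScore: number | null): number {
    const target = frameScore === null ? 0 : Math.min(1, Math.max(0, frameScore));
    this.value += this.smoothing * (target - this.value);
    return this.value;
  }

  // Called every few seconds; depends only on where the meter currently is.
  currentJudgement(): string {
    if (this.value >= 0.8) return "excellent";
    if (this.value >= 0.5) return "good";
    return "ok";
  }
}
```

With a small smoothing factor, a single misread frame barely moves the meter, which is the kind of tolerance you want when the pose analysis makes occasional mistakes.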

After the game, it displays a summary screen of statistics showing how well you did, as seen below.

screenshot of sparky.tv summary screen

As seen in the screenshot, it gives you a grade based on your point score and also lists out how many of each type of judgement you got. The summary pictured here is a theoretical maximum score; that is, I used the video that you’re supposed to dance to as my webcam feed, so the game was receiving the exact movements that it was displaying. This should result in the maximum score possible since the movements are exact, but it did not result in a perfect score. (The video was also mirrored before being used as my webcam feed, since the game assumes you are doing mirrored movements.)
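For the mirroring part, here’s a rough sketch of what flipping a detected pose horizontally could look like if you did it on keypoints instead of on the video itself. The Keypoint shape and the left_/right_ naming convention are assumptions for this example, and this isn’t taken from sparky.tv or from my game’s code.

```typescript
// Assumed keypoint shape: a name plus pixel coordinates within the frame.
interface Keypoint {
  name: string; // assumed to look like "left_wrist", "right_knee", "nose"
  x: number;
  y: number;
}

// Swaps a "left_" prefix for "right_" and vice versa; other names are unchanged.
function swapSide(name: string): string {
  if (name.indexOf("left_") === 0) return "right_" + name.slice("left_".length);
  if (name.indexOf("right_") === 0) return "left_" + name.slice("right_".length);
  return name;
}

// Mirrors a pose horizontally: flips x across the frame and swaps the
// left/right labels, approximating what you'd get by flipping the image
// first and then running pose detection on it.
function mirrorKeypoints(keypoints: Keypoint[], frameWidth: number): Keypoint[] {
  return keypoints.map((kp) => ({
    name: swapSide(kp.name),
    x: frameWidth - kp.x,
    y: kp.y,
  }));
}
```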

I also found a bug where if the game can’t detect you in the frame, it will continue to give you the same score. So if you get to the maximum “excellent” score, you will continue to get excellents until the video finishes. It may be possible to get a higher score by utilizing this bug.

sparky.tv has a lot of potential and is still a very new platform. It looks like they started as a simple site to watch workout and dance videos on, but now they’ve expanded to allow users to play alongside the videos and get scored using their AI. My dance game is similar in some aspects, but will have enough differences to set it apart.

My Scoring System In Action

Based on my research, I’ve implemented a comprehensive scoring system in DanceTime. This is what the gameplay currently looks like in my game with all the scoring logic above. In this video, I am feeding the direct dance video feed back into the game, so in theory, it should give a perfect score. But it doesn’t, and that’s okay! The TensorFlow model isn’t perfect, but it works pretty well!

Surprisingly, I can actually beat the “theoretical maximum score” (by feeding the dance video back in) myself sometimes! And other times…not so much! 😅 They’re a little more difficult.

There’s still a lot more work to do on this game, but it’s starting to finally come together! It’s playable now and I’ve been having a blast playing it myself and with friends.