Last weekend, I volunteered for the fifth time with the Blue Lobster Bowl, the National Ocean Sciences Bowl regional competition for the Massachusetts region. My main role was as a moderator, reading questions quickly and clearly for my seventh year since competing myself back in high school. But this year, I also played another role behind the scenes, working with the regional coordinator to automatically post results to a partially published Google spreadsheet. Now teams (and anyone else!) could follow along in real time with how each division was going.

My presentation of the standings after the Round Robin portion of the 2017 Blue Lobster Bowl. This year, we gave everyone access to a Google spreadsheet with this information, updated throughout the day.

Volunteering for the BLB also gave me the opportunity to reflect on what makes a competition great. I’ve been involved in competitions in a range of capacities over the last 20 years, so some of these thoughts have been simmering for a while. That said, most of my examples will come from the Blue Lobster Bowl last weekend and from other recent experience, which includes:

Playing in a church softball league with a regular season and playoffs.

Teaching high school students extracurricular (“contest”) math through weekly team competitions.

Playing in frisbee tournaments both in college and in a Boston-area hat league in the fall.

In all of these contexts, I’ve taken a special interest in improving the structures present. This has ranged from reasonable discussions with my softball team captain about the logistics of seeding a double-elimination tournament with some missing games to a moment at a college frisbee tournament where I went a little overboard on the analysis. The schedule for the second day of our tournament wasn’t what I was expecting, and as I pieced it together, I eventually realized that the team from Claremont had dropped out. That realization became a running joke on our team; for instance, a picture of me looking for a piece of my watch that had fallen off later that day got the caption: “WHERE IS CLAREMONT?”

All of this to say, I’ve cared a lot about competitions and running them well for a long time, and I thought it might be helpful to share some of my thoughts and general approach to making tournaments better.

The Competing Purposes of Competitions

Why compete at all? Nothing is really hinging on which church in the Boston area has the best softball team, or even which team is the best at answering buzzer questions about science. Winning that second competition gave me, along with the rest of my high school National Science Bowl team, the opportunity to meet President Bush, but that didn’t mean he started consulting us on policy.

Everyone says Bush looks Photoshopped here, but it’s real! His comment to us went something like this: “When I look at you, I see… well, I don’t know what I see, but I see smart people!”

The way I see it, competition tends to have three purposes: (1) to recognize, reward and encourage excellence, (2) to build enthusiasm for the subject of the competition, and (3) to entertain both the competitors and audience. Good competitions must do all three of these well, and the trickiest decisions relate to tradeoffs between them.

With these purposes in mind, I’d like to go through some of the common considerations and structures that I’ve found effective in helping competitions to best achieve these objectives. As a note, I’ll generically use “competitor” to refer to either teams or individuals who join the competition.

Regular Seasons and Playoffs

There is a tension in most tournaments between trying to identify the best (objective 1, to reward their excellence) and making sure everyone gets an opportunity to play (objective 3, to entertain every competitor and their fans). If a competitor has already demonstrated through the first few matches that they aren’t the best, they don’t need to keep playing for the winner to be identified, but it’s often more enjoyable not to eliminate poor performers too quickly.

The natural solution that many competitions take is to split the matches into a “regular season” and “playoffs” for the top teams, as happens in all of the major sports. For shorter tournaments, the first phase of the competition often involves splitting the teams into “pools” of teams who all play each other (“round-robin”) and from which only a fixed number advance. This is the format illustrated in the Blue Lobster Bowl results above; in each of the four “Divisions”, every pair of teams played one match against each other, and as you can see in the rightmost column, the top two teams advanced.
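To make the round-robin idea concrete, here is a minimal Python sketch of the standard "circle method" for scheduling a pool so that every pair meets exactly once. The function name and the team names are hypothetical, and this is not necessarily how the Blue Lobster Bowl schedule was produced.

```python
def round_robin(teams):
    """Return a list of rounds; each round is a list of (team, team) matches.

    Circle method: one slot stays fixed while the others rotate.
    With an odd number of teams, one team sits out each round.
    """
    slots = list(teams)
    if len(slots) % 2:
        slots.append(None)  # None marks a bye
    n = len(slots)
    rounds = []
    for _ in range(n - 1):
        # Pair the first slot with the last, the second with the
        # second-to-last, and so on.
        pairs = [(slots[i], slots[n - 1 - i]) for i in range(n // 2)]
        rounds.append([p for p in pairs if None not in p])
        # Keep the first slot fixed and rotate everyone else by one.
        slots = [slots[0], slots[-1]] + slots[1:-1]
    return rounds


# Hypothetical six-team division: five rounds of three matches each.
schedule = round_robin(["Team 1", "Team 2", "Team 3", "Team 4", "Team 5", "Team 6"])
for number, matches in enumerate(schedule, start=1):
    print(number, matches)
```

A pool of n teams needs n - 1 rounds when n is even and n rounds when n is odd, which is worth knowing before committing to a pool size for a one-day event.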

The “playoffs” in such a structure are typically geared towards determining the overall best competitor. This is often done through an elimination-style tournament, where losers are eliminated and winners advance until there is only one left. The two most popular such tournaments are single-elimination tournaments, the simplest and most common style that you’ve definitely heard of, and double-elimination tournaments, which are a little more complicated because it takes two losses to be eliminated. I’m a big fan of double-elimination tournaments for a couple of reasons: they allow competitors to recover from an off-game, and they naturally produce a top four without any post-elimination matches.

Either way, keep in mind that people have already thought carefully about how to set up such a bracket. I’ve heard horror stories of someone making the top two seeds play in the first round (they just filled it in from the top, 1-8). Don’t do that, but also get some of the other details right. For instance, when competitors from the winners’ bracket in double elimination drop down to the losers’ bracket, they need to alternate halves so that rematches are rare and, if they must occur, happen late.
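As an illustration of the first-round seeding point, here is a small Python sketch of the usual way to lay out seeds so that the top two can only meet in the final. The function names are hypothetical, the sketch assumes the number of competitors is a power of two, and it does not cover the losers' bracket details of double elimination.

```python
def seeded_bracket(n):
    """Return seeds in bracket order for n competitors (n a power of two).

    Adjacent entries play each other in the first round. The bracket is
    built by repeatedly pairing each seed with its "mirror" seed, so the
    strongest seeds are kept apart for as long as possible.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    order = [1]
    while len(order) < n:
        size = 2 * len(order)
        order = [s for seed in order for s in (seed, size + 1 - seed)]
    return order


def first_round(n):
    """Return the first-round matches as (seed, seed) pairs."""
    order = seeded_bracket(n)
    return [(order[i], order[i + 1]) for i in range(0, n, 2)]


print(first_round(8))  # [(1, 8), (4, 5), (2, 7), (3, 6)]
```

Seeds 1 and 2 land in opposite halves, and each first-round pair of seeds sums to n + 1, which is a quick sanity check for a hand-drawn bracket.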

When I first took over the MIT Integration Bee, the format I inherited was a form of elimination tournament from start to finish. At first, I modified this to end in a standard bracket playoff, but then realized that the first phase didn’t need elimination. I could just have the students compete in groups to solve a certain number of integrals, with the schedule written out ahead of time. I called this a “regular season” and it also gave me a chance to use a Balanced Incomplete Block Design, even visiting MIT’s library to find this listing in The Handbook of Combinatorial Designs:

Each column corresponds to an integral, offered to four of the 16 students, labeled with 0-9 and a-f. Naturally, I rearranged the order so that the player corresponding to the label “0” didn’t just get the first five integrals.
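For readers who want to experiment with such a schedule, here is a Python sketch of one standard construction, the affine plane over the four-element field GF(4). It yields 20 groups of four from 16 labels, with each label in five groups and every pair of labels together exactly once, which matches the numbers described here (16 students, four per integral, five integrals each). It is not necessarily the same listing as the one in the Handbook, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Multiplication table for GF(4) with elements 0, 1, 2, 3
# (2 stands for a root of x^2 + x + 1, and 3 = 2 + 1).
# Addition in GF(4) is bitwise XOR on these encodings.
GF4_MUL = [
    [0, 0, 0, 0],
    [0, 1, 2, 3],
    [0, 2, 3, 1],
    [0, 3, 1, 2],
]

LABELS = "0123456789abcdef"


def affine_plane_blocks():
    """Return the 20 lines of the affine plane over GF(4) as lists of labels."""
    def label(x, y):
        return LABELS[4 * x + y]

    blocks = []
    for c in range(4):  # vertical lines x = c
        blocks.append([label(c, y) for y in range(4)])
    for m in range(4):  # lines y = m*x + b
        for b in range(4):
            blocks.append([label(x, GF4_MUL[m][x] ^ b) for x in range(4)])
    return blocks


def pair_counts(blocks):
    """Count how many blocks contain each unordered pair of labels."""
    return Counter(
        frozenset(pair) for block in blocks for pair in combinations(block, 2)
    )


blocks = affine_plane_blocks()
counts = pair_counts(blocks)
print(len(blocks), "groups;", len(counts), "pairs, each meeting",
      set(counts.values()), "time(s)")
```

The `pair_counts` check is useful on its own: it can verify any hand-copied schedule, and shuffling the order of the groups afterwards does not affect the balance.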

My point is that whatever competition you’re designing, someone has probably found the best way to do it, and you should at least know what it is.

Teaming Up

This is my fifth year teaching classes of 6th-10th graders as a part of the IdeaMath weekend program. The program is designed to help students learn problem solving through problems from math competitions like the AMCs. Naturally, many come because they are competitive, and I’ve slowly learned that mixing actual competition into my weekly classes goes a long way in encouraging them to do math on a Saturday.

But what type of competition? When I considered this question a couple years ago, lots of directions were pointing towards team competitions. The standard math competitions they take are almost entirely individual, leaving many strong math students lacking in people skills. (Even some “team” competitions just require students to pass an answer from one problem to the next.) In addition, I tend to have more material than almost any student can solve in the time we have, so having them work in teams gives me a chance to get through more problems and give everyone a taste of some of them before I go over solutions.

I’ve developed my system to the point where I think it’s worth sharing in detail, but if it doesn’t interest you, feel free to skip to the next section.

Still with me? Okay, I start by dividing the students into 4-6 equally strong teams of 3-4 students each, a rather involved process that I’ll describe later. In class, I have them sit together with their team and give everyone a copy of the handout. I’ll announce which problems we’re doing in chunks of around 4-5, and they’ll get started. If someone solves a problem, they have to submit an answer to me to get points. The first team to get a problem is awarded 10 points, the second team 9 points, and so on. If a team submits an incorrect answer, they lose 3 points from their final score on that problem, though their score on a problem never drops below 0. After a certain time, I’ll call the end of the round (giving them plenty of warning and a last chance to submit), and go over the problems. Then I repeat for the next handout and so on.
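The scoring rule for a single problem can be sketched in a few lines of Python. Several details here are assumptions rather than anything stated above: that a team may resubmit after a wrong answer, that penalties accumulate per wrong answer, that the award drops by one point per correct team regardless of penalties, and that a team which never solves the problem scores 0. The function and team names are hypothetical.

```python
def score_problem(submissions, first_place_points=10, penalty=3):
    """Score one problem from its submissions, given in the order received.

    submissions: list of (team, is_correct) tuples.
    Returns a dict mapping each submitting team to its points on the problem.
    """
    wrong_counts = {}  # wrong answers per team so far
    scores = {}        # final points for teams that have solved it
    next_points = first_place_points
    for team, is_correct in submissions:
        if team in scores:
            continue  # already solved; ignore anything further
        if is_correct:
            lost = penalty * wrong_counts.get(team, 0)
            scores[team] = max(0, next_points - lost)  # never below 0
            next_points = max(0, next_points - 1)
        else:
            wrong_counts[team] = wrong_counts.get(team, 0) + 1
    # Assumption: teams that only submitted wrong answers get 0.
    for team in wrong_counts:
        scores.setdefault(team, 0)
    return scores


example = [("Red", True), ("Blue", False), ("Blue", True), ("Green", False)]
print(score_problem(example))  # {'Red': 10, 'Blue': 6, 'Green': 0}
```

In the example, Blue is the second team to solve the problem (9 points) but loses 3 for its earlier wrong answer, which is the kind of penalty that rewards checking work with a teammate before submitting.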

I think this “team race” system sets up the right incentive structure. Students are rewarded for solving problems correctly, but there is also a significant penalty for making mistakes, so they’re encouraged to check their work with each other. At the beginning of the round, they tend to claim problems (“I’ll do #3”) and they’re encouraged to communicate so they don’t duplicate their work. Stronger students generally get to more problems, as will be the case any time they’re working hard, but weaker students can still contribute to their team by solving a different problem.

How do I pick the teams? The first step was finding out who would be missing each week; one member gone can really mess up the equal teams. So I had to collect the e-mail addresses of the students (or their parents) and remind them every week to let me know if they’d be gone. I like to think that this also gives them practice checking e-mail like an adult.

A small sample of my students’ normalized and weighted scores. The sixth column corresponded to a test, on which the fourth student did quite well while the fifth did quite poorly. Most of the other scores were team competitions.

It also requires me to know the relative strengths of the students in the class, so I can balance the teams. To do this, I keep a running table of previous rounds’ results, and normalize all of their scores (from team races and tests) to have mean zero and standard deviation one. I then take a weighted average of each student’s normalized scores, with a significant weight on their normalized test scores (we have tests every three weeks), to determine numerical strengths. I then renormalize these to have mean 300 and standard deviation 100; this usually gives all of the students positive “strengths” (which is only relevant if the teams have different sizes).
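The normalization steps can be sketched as follows in Python. The weights and scores below are made up for illustration, the function names are hypothetical, and the sketch assumes every student has a score in every round (handling absences would need extra care) and uses the population standard deviation.

```python
from statistics import mean, pstdev


def normalize(scores, target_mean=0.0, target_sd=1.0):
    """Rescale scores linearly to the given mean and standard deviation."""
    mu = mean(scores)
    sd = pstdev(scores)
    if sd == 0:
        # Everyone scored the same, so there is nothing to distinguish.
        return [float(target_mean)] * len(scores)
    return [target_mean + target_sd * (s - mu) / sd for s in scores]


def strengths(rounds, weights):
    """Combine per-round scores into one strength per student.

    rounds: list of score lists, one per round, all in the same student order.
    weights: one weight per round (e.g. a larger weight for tests).
    """
    z_rounds = [normalize(r) for r in rounds]
    total_weight = sum(weights)
    n_students = len(rounds[0])
    combined = [
        sum(w * z[i] for w, z in zip(weights, z_rounds)) / total_weight
        for i in range(n_students)
    ]
    return normalize(combined, target_mean=300, target_sd=100)


# Hypothetical data: one team race and one test for three students,
# with the test weighted twice as heavily.
print(strengths([[8, 5, 2], [10, 20, 30]], weights=[1, 2]))
```

Normalizing each round first keeps a high-scoring handout from dominating the average, and the final rescaling only changes units, not the ordering of students.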

Finally, I put all of those strengths into a simulated annealing algorithm I wrote up in Mathematica, which tries to find the optimal teams to minimize the variance of the total team scores. This also allows me to include information about which pairs of students have been on teams in the past; I add this “diversity score” to the objective function to encourage the algorithm to pick teams different from the past (with the most recent weeks weighted more highly). I also include a term to encourage equal-sized teams; students notice if there’s a team of 3 and a team of 5, and I’m not that confident that the mean-300 estimate is the best way to value an extra player. I then run the simulated annealing from a few different initializations and keep the result with the smallest composite objective function.
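Here is a rough Python sketch of this kind of team balancing. It is not the Mathematica code described above: the cooling schedule, the weights, and all names are assumptions, and instead of a separate term for team sizes it simply fixes the sizes up front and only ever swaps two students between teams.

```python
import math
import random
from itertools import combinations
from statistics import pvariance


def cost(teams, strength, past_pairs, diversity_weight):
    """Variance of team totals plus a penalty for repeated teammates."""
    totals = [sum(strength[s] for s in team) for team in teams]
    repeats = sum(
        past_pairs.get(frozenset(pair), 0.0)
        for team in teams
        for pair in combinations(team, 2)
    )
    return pvariance(totals) + diversity_weight * repeats


def anneal_teams(strength, n_teams, past_pairs=None, diversity_weight=100.0,
                 steps=20000, t_start=1000.0, t_end=0.1, seed=None):
    """Search for balanced teams by simulated annealing over student swaps.

    strength: dict student -> numerical strength.
    past_pairs: dict frozenset({a, b}) -> weight for having teamed up before.
    Returns (best_teams, best_cost).
    """
    if not 2 <= n_teams <= len(strength):
        raise ValueError("need between 2 and len(strength) teams")
    rng = random.Random(seed)
    past_pairs = past_pairs or {}
    students = list(strength)
    rng.shuffle(students)
    # Deal students out like cards so team sizes differ by at most one.
    teams = [students[i::n_teams] for i in range(n_teams)]
    current = cost(teams, strength, past_pairs, diversity_weight)
    best, best_cost = [list(t) for t in teams], current
    for step in range(steps):
        # Geometric cooling from t_start down towards t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)
        a, b = rng.sample(range(n_teams), 2)
        i, j = rng.randrange(len(teams[a])), rng.randrange(len(teams[b]))
        teams[a][i], teams[b][j] = teams[b][j], teams[a][i]
        new = cost(teams, strength, past_pairs, diversity_weight)
        # Always accept improvements; sometimes accept a worse arrangement.
        if new <= current or rng.random() < math.exp((current - new) / temp):
            current = new
            if current < best_cost:
                best, best_cost = [list(t) for t in teams], current
        else:
            teams[a][i], teams[b][j] = teams[b][j], teams[a][i]  # undo swap
    return best, best_cost


# Hypothetical class of eight students split into two teams, trying a few
# initializations and keeping the best result.
demo = {f"student{k}": 100 * k for k in range(1, 9)}
runs = [anneal_teams(demo, n_teams=2, steps=2000, seed=s) for s in range(3)]
best_teams, best_value = min(runs, key=lambda run: run[1])
print(best_teams, best_value)
```

Because variance and the repeat penalty are in different units, the `diversity_weight` has to be tuned by looking at a few outputs, much like comparing the variance and diversity score of each initialization by eye.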

The output of my algorithm last week. The first three printed lines are intermediate results of the computation (essentially a progress bar). As you can see, the final result on the first initialization was the best; while it had a slightly higher variance than the other two, its diversity score was much better.

This also naturally gives me lots of information about which students are doing well compared to others, which is useful for large classes like the ones I’ve been teaching recently. In several ways, this teaching system scales well and has allowed me to take on classes of 20 or more, which is typically fairly large for IdeaMath. And the students love it — one former student e-mailed me to ask if I was going to be teaching this class again; he had a great time and wanted to encourage his brother to take it the next year.

Minimizing Blowouts

Another consideration when designing a competition is to minimize the games played between mismatched competitors. This is important both because such matches are usually not very entertaining or fun, and because they often provide the least useful information for ranking competitors.

However, if you aren’t careful, you can end up disadvantaging some competitors. For instance, when we were designing the Blue Lobster Bowl’s pools, our first iteration had all of the A teams listed before their corresponding B teams. Naturally, since schools will make their A team stronger, this loaded up Division 1 with a bunch of A teams, while Division 4 had only B and C teams. This was a problem because our structure was fixed: only two teams would be advancing to the playoffs from Division 1, so we had just made it much more difficult for those A teams to advance.

When designing pools in such a system, therefore, one unfortunately can’t do much to avoid mismatches. To make each division nearly equally difficult to advance from, we ended up spreading the 12 A teams evenly across the divisions, then distributing the B teams evenly (also making sure no two teams from the same school were together), and finally placing the C teams in two different pools.
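The same spreading idea can be written as a simple greedy assignment in Python. This is a sketch, not the procedure actually used for the Blue Lobster Bowl: the school names and the numbers of B and C teams below are invented, the function name is hypothetical, and a greedy pass is not guaranteed to find the most even split in every case.

```python
def assign_divisions(teams, n_divisions=4):
    """Spread teams across divisions one level at a time (A, then B, then C).

    teams: list of (school, level) tuples, with levels such as "A", "B", "C".
    Each team goes to the division that has the fewest teams of its level
    (ties broken by fewest teams overall), skipping any division that
    already holds a team from the same school.
    """
    divisions = [[] for _ in range(n_divisions)]
    for level in sorted({lvl for _, lvl in teams}):
        for school, lvl in teams:
            if lvl != level:
                continue
            allowed = [d for d in divisions
                       if all(other != school for other, _ in d)]
            if not allowed:
                raise ValueError(f"no division available for {school} {lvl}")
            target = min(
                allowed,
                key=lambda d: (sum(1 for _, l in d if l == level), len(d)),
            )
            target.append((school, lvl))
    return divisions


# Hypothetical field: 12 schools with A teams, 8 of them with B teams,
# and 2 of them with C teams.
schools = [f"School {k}" for k in range(12)]
field = ([(s, "A") for s in schools]
         + [(s, "B") for s in schools[:8]]
         + [(s, "C") for s in schools[:2]])
for number, division in enumerate(assign_divisions(field), start=1):
    print(number, division)
```

Placing the strongest level first matters: it guarantees the A teams are split as evenly as possible before the same-school constraint starts to restrict where the B and C teams can go.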

In other structures, though, one can try to minimize the mismatches. One classic way to do this is a Swiss-style tournament, where competitors only play others that have a similar number of wins so far. On the other hand, this system needs to be constantly updated, which isn’t very feasible for something like the BLB, and the choice of matchups can be rather opaque if the community isn’t used to it already. I’ve mainly seen it in chess tournaments, although I tried to introduce it to an online Dominion tournament among friends (that we unfortunately never actually completed).
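To show why a Swiss-style tournament needs constant updating, here is a simplified Python sketch of pairing a single round: rank everyone by wins so far, then pair each competitor with the highest-ranked opponent they have not already played, backtracking if that leaves someone stranded. Real Swiss systems (in chess, for example) have more detailed rules; this sketch assumes an even number of competitors and uses hypothetical names.

```python
def swiss_pairings(wins, played):
    """Pair competitors with similar records, avoiding rematches.

    wins: dict competitor -> number of wins so far.
    played: set of frozenset({a, b}) for matches already played.
    Returns a list of (a, b) pairs, or None if no rematch-free pairing
    exists (which includes the case of an odd number of competitors).
    """
    ranked = sorted(wins, key=lambda c: -wins[c])  # most wins first

    def pair_up(remaining):
        if not remaining:
            return []
        first, rest = remaining[0], remaining[1:]
        for k, opponent in enumerate(rest):
            if frozenset((first, opponent)) in played:
                continue  # skip rematches
            tail = pair_up(rest[:k] + rest[k + 1:])
            if tail is not None:
                return [(first, opponent)] + tail
        return None  # nobody left for `first`; caller tries another option

    return pair_up(ranked)


# Hypothetical standings after two rounds; A and B have already met.
standings = {"A": 2, "B": 2, "C": 1, "D": 1, "E": 0, "F": 0}
already_played = {frozenset(("A", "B"))}
print(swiss_pairings(standings, already_played))
# [('A', 'C'), ('B', 'D'), ('E', 'F')]
```

Every round's pairings depend on all previous results, so the schedule cannot be printed ahead of time, which is the feasibility problem for an event like the BLB.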

Engaging Spectators

Spectators are often underappreciated when it comes to these math and science competitions. While some competitions are excellent at showmanship (the Mathcounts Countdown Round, which is streamed live every year, is an excellent if still imperfect example), many still lag. One former Integration Bee organizer told me that his biggest piece of advice to me was to find a good MC; he had seen how they enhanced the experience.

I like to think of publishing the standings I referenced at the beginning as one small way to make the audience more engaged and interested in the outcome. Publishing the link on the BLB website and in the competition programs allowed anyone to check on the current standings, with all of the round results on the first tab, each division’s standings on the second tab, and the playoff bracket on the third. I worked up that spreadsheet to automatically display results from a simple hidden input sheet, with almost all of the tiebreak scenarios automatically worked in. We have some hope of applying a similar method at Nationals, so parents and other competitors back home can follow along without having to wait for texts or updates online at the end of the day.

Still, there is plenty of room for improvement in making some of these competitions more enjoyable for spectators. Our softball league could have a better website. Videotaped science and math competitions could use some commentary to help the audience understand how the students likely solved the problems, like the commentary in some football games on the route that the wide receiver ran, say. If I had a lot of extra time, I’d love to put together an online version of my contest math teaching style, so my students could submit answers that could be instantly checked and I could accommodate an even larger class.

All of that said, many of the competitions I’ve been involved with have gone pretty well, and I’m proud to have been part of them.

I enjoy your posts Sam. I work in the youth baseball, fastpitch, volleyball and basketball world. We have lots of discussions in the office about the best way to create brackets. The latest one that we’ve used in volleyball is to put all of the top teams into the same pool. Of course we must know ahead of time who they are but assuming we do, we put them all in the “A” pool. They play 3 or 4 games/matches against “like” competition. When we go from pool into bracket play, these teams (1 in A, 2 in A, 3 in A….) are put into strategic spots on the brackets such that they will probably end up as the final teams in the bracket but no guarantee. This also gives “lower teams” some good matches in their pools. Everyone then gets a chance to “win” the bracket. Too often the better teams are having easy games until they get to the semi-finals and therefore only have 2 good games. The best “defense” we have for this is when the “lower” teams complain that they aren’t given a fair shake in the bracket. We tell them that if they couldn’t beat the “like competition” in their pool to get a good seed then for sure they weren’t going to beat the top A teams anyway. Probably not the best explanation but it has been very popular in volleyball, especially with the college coaches who come to recruit. They want to see the girls play in a competitive match, not a blow-out.

Just noticed that you did touch on that concept with the “Swiss” version. Somehow missed that when I was skipping the team part ;-).

Yeah, if it’s possible to be flexible, something like a Swiss system seems to do a good job of giving competitive matches for most of the day while also sorting pretty fairly. You can think of it as similar to what would happen if you added a 2-loss bracket, a 3-loss bracket, and so on, to a double-elimination tournament.

I will have to look into that.
