It was clear almost immediately after I published my first pass at a predictive model for forecasting the Olympic team competition that it was inadequate. The European Championships took place the following weekend, and the British team beat Russia in qualification, making it the second time in a year they had won against Russia in a competition format with three counting scores. Although the Russians prevailed in the team final, if the Brits are coming out on top that often, they probably have more than a 5.5% chance of medaling.

So although I wanted to update the rankings periodically in the two months since I posted it, it instead took me this long to reassess and tweak the model to a point where I think it is more useful. As before, I hope for it to be a starting point for this type of work, and an imperfect starting point is better than nothing.

The general principle remains the same: it still simulates the competition from qualification to the team final 20,000 times and calculates the probability of each team attaining each result. A refresher of the components:

Tenths Above Average (TAA): A statistic to measure the scoring potential of a team’s gymnasts in relation to the 2016 average for that event. To calculate TAA, I created a pool of scores where domestic scores were adjusted to be more in line with how the gymnasts from that country scored at Worlds in 2015, relative to domestically. Then I averaged together each gymnast’s high and average score per event and calculated the difference between that and the same statistic for the entire pool. So, it’s an approximate calculation of how much a gymnast is worth per event, with an attempt to account for their scoring potential and how well they generally perform. You can see individual gymnast TAA per event on 4for4.info.

The model then runs through qualification and the team final 20,000 times, using TAA to determine line-ups.
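As a rough illustration of the TAA arithmetic described above, here is a minimal Python sketch. It is not the actual code behind the model. The function names, the made-up scores, the choice to average the statistic per gymnast across the pool, and the conversion from points to tenths (multiplying by 10) are all assumptions for the sake of the example, and the domestic-score adjustment is assumed to have been applied already.

```python
from statistics import mean


def event_value(scores):
    """Average of a gymnast's high score and mean score on one event."""
    return (max(scores) + mean(scores)) / 2


def tenths_above_average(gymnast_scores, pool):
    """TAA for one gymnast on one event.

    pool maps gymnast name -> list of scores on that event
    (assumed to be already adjusted for domestic scoring).
    """
    # Assumption: "the same statistic for the entire pool" is taken
    # as the mean of the per-gymnast values.
    pool_value = mean(event_value(s) for s in pool.values())
    # Assumption: the difference in points is expressed in tenths.
    return (event_value(gymnast_scores) - pool_value) * 10


# Hypothetical scores, invented purely for illustration.
pool = {
    "gymnast_a": [15.0, 14.6, 14.8],
    "gymnast_b": [14.0, 13.6],
    "gymnast_c": [13.2, 13.0, 12.8],
}
taa_a = tenths_above_average(pool["gymnast_a"], pool)
```

Under these assumptions a positive TAA means the gymnast is worth more than the pool average on that event, and the TAA values across the whole pool sum to roughly zero.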
The model will not be right about line-ups 100% of the time, but the instances where it isn’t correct probably involve a choice between two gymnasts who have pretty similar scores. It generates a score for each gymnast based on their 2015-2016 scores from the pool with adjusted domestic scores, with their 2016 scores given more weight. The model uses those numbers to calculate a team score and placements. The odds are how often during those 20,000 simulations each of these results occurred.

In this round, I added the average simulated score: the average score a gymnast received over the course of the 20,000 simulations. Here are the tweaks I made to bring it closer to representative:

Introducing more randomness: Gymnastics is a war of inches. There is only so much that is mathematically possible, sure, but the ramifications of mistakes, sometimes even minor ones, can be significant in this sport. I tried to embrace this aspect of gymnastics a little more. One way was introducing the possibility of a gymnast receiving a 0 on vault, which obviously is a threat that, when realized, has been a major game changer in the past. Because there did not seem to be any good way to quantify how often it actually happens, I informally crowdsourced the gymternet’s knowledge on Twitter. Based on the results, I went conservative and have a random gymnast get a zero about two percent of the time.

Emphasizing recent results: Generally in statistics, more data = better results. Unfortunately, gymnastics just doesn’t have very many observations to begin with, and things change quickly. So although the input into the model is still each gymnast’s 2015 and 2016 scores, the recent 2016 results are now given more weight. I’m a little dubious of how hot the model is on Germany relative to Italy and Japan, but it’s because of that bias toward recency: Germany has looked very strong in the last couple of months, versus Italy and Japan’s better record in recent history.
Striking the appropriate balance on these factors is still a work in progress, but this is the model’s current conclusion. Rio will be a chance to evaluate it.

Using only the finalized rosters: Now that all the teams are confirmed, I decided to do away with a six-gymnast pool for each team and use only their named five-member rosters. Of course, injuries are still an unfortunate possibility in the lead-up to Rio. But a) hopefully that won’t affect the majority of the roster, b) injuries would not change each team’s outlook equally, so this way the model is more reflective of what will happen on the competition floor, and c) this will give us another metric to gauge how an individual gymnast’s loss impacts the competition.

To provide a counterargument to this, check out this team E-score analysis/set of team rankings from GymPOW. That methodology emphasizes World/Test Event execution and current difficulty, and rates Japan and Italy more highly. The more tools that are out there for this kind of analysis, the better off we are.

Obviously, I will be updating this pending the results of tomorrow’s IOC ruling about whether or not to ban Russia.
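For readers who want to experiment with this kind of simulation themselves, here is a heavily simplified Python sketch of the loop described in this post. It is an illustration under stated assumptions, not the model itself. It skips qualification and simulates only a three-up, three-count team final. It picks line-ups by a weighted mean score as a stand-in for TAA. The 2:1 weighting of 2016 over 2015 scores is an invented placeholder, and the vault zero is applied as a two percent chance per team per simulation, which is only one possible reading of the rule above. All function and variable names are hypothetical.

```python
import random
from collections import Counter

EVENTS = ("vault", "bars", "beam", "floor")  # assumed: women's four events
WEIGHT_BY_YEAR = {2015: 1.0, 2016: 2.0}      # assumed weights, not the model's
VAULT_ZERO_PROB = 0.02                       # "about two percent"


def draw_score(history, rng):
    """Sample one past score; history is a list of (year, score) pairs."""
    weights = [WEIGHT_BY_YEAR[year] for year, _ in history]
    return rng.choices([s for _, s in history], weights=weights, k=1)[0]


def weighted_mean(history):
    """Recency-weighted mean score, used here as a stand-in for TAA."""
    total_weight = sum(WEIGHT_BY_YEAR[year] for year, _ in history)
    return sum(WEIGHT_BY_YEAR[year] * s for year, s in history) / total_weight


def pick_lineup(team, event, size=3):
    """Choose the top `size` gymnasts who have scores on this event."""
    eligible = [name for name in team if team[name].get(event)]
    eligible.sort(key=lambda name: weighted_mean(team[name][event]), reverse=True)
    return eligible[:size]


def simulate_team_final(team, rng):
    """One simulated three-up, three-count team final total."""
    total = 0.0
    for event in EVENTS:
        lineup = pick_lineup(team, event)
        scores = [draw_score(team[name][event], rng) for name in lineup]
        # Occasionally one randomly chosen vaulter scores a zero.
        if scores and event == "vault" and rng.random() < VAULT_ZERO_PROB:
            scores[rng.randrange(len(scores))] = 0.0
        total += sum(scores)
    return total


def medal_odds(teams, n_sims=20_000, seed=0):
    """Share of simulations in which each team finishes in the top three."""
    rng = random.Random(seed)
    medals = Counter()
    for _ in range(n_sims):
        totals = {name: simulate_team_final(team, rng) for name, team in teams.items()}
        medals.update(sorted(totals, key=totals.get, reverse=True)[:3])
    return {name: medals[name] / n_sims for name in teams}
```

Here `teams` is expected to map a team name to a dictionary of gymnasts, each holding per-event lists of (year, score) pairs. Sampling from past scores rather than from a fitted distribution is a design shortcut in this sketch. It keeps the example short, but it cannot produce a score a gymnast has never actually received.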