Plotting a hockey rink in R

Raise your hand if you thought you’d be reading about W. E. B. Du Bois and hockey on the same website. If your hand is up and your name isn’t Willie O’Ree, you’re lying. If you are actually Willie O’Ree… YOU’RE READING MY WEBSITE RIGHT NOW?! Also, congrats on your long overdue induction into the Hockey Hall of Fame last year.

Anyway, regardless of your name situation, if you read that post and thought “I want to know more about this hockey plot though,” today’s your day.

A long time ago in a galaxy far, far away, I analyzed NHL shot data for a spatial statistics project. I decided it’d be cool to visualize this with a plot conforming to the official dimensions of an NHL rink, and here was the result:

An NHL rink is \(200 \times 85\) feet, with each blue line 25 feet from center ice. The play-by-play data I scraped recorded shot coordinates as \((x, y) \in [-100, 100] \times [-42.5, 42.5]\), so to keep everything in one zone relative to the goalie’s perspective, I negated both \(x\) and \(y\) whenever \(x\) was negative.
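As a minimal sketch of that flip (not the original script), the reflection might look like the following. The `shots` data frame, its column names, and the helper `flip_to_one_zone` are all made up for illustration.

```r
# Hypothetical shot coordinates in feet; center ice is (0, 0).
shots <- data.frame(
  x = c(-75, 60, -30.5, 88),
  y = c( 10, -5,  20.0,  0)
)

# Reflect shots taken in the left end through center ice so that every
# shot is plotted as if it were aimed at the right-hand net.
flip_to_one_zone <- function(df) {
  flip <- df$x < 0
  df$x[flip] <- -df$x[flip]
  df$y[flip] <- -df$y[flip]
  df
}

flipped <- flip_to_one_zone(shots)
flipped
```

Negating both coordinates is a 180-degree rotation about center ice, so a shot from the goalie’s left in one end stays on the goalie’s left after the flip. Negating only \(x\) would mirror the rink and swap left and right.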

One thing that annoyed me at the time is that the corners aren’t quite rounded like they should be. Another thing that really annoys me right now is looking at my code and seeing the following:


This is an actual line of code I once wrote. Three strikes¹ of bad coding:

  1. I don’t know what 76.441140475368970 is. (Well, it’s obviously a coordinate of the center of this circle, but you get my point.)
  2. I don’t know how I computed 76.441140475368970.
  3. I don’t know why the level of precision to… 15 decimal places is necessary.
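For contrast, here is a rough sketch of how a corner arc could be built from named constants, so that every number has an obvious origin. This is not the original code or the eventual fix: the 28-foot corner radius is the standard NHL figure rather than a value taken from my script, and `corner_arc` and the variable names are hypothetical.

```r
# Rink dimensions in feet. The 28-foot corner radius is the standard NHL
# figure, assumed here rather than taken from the original script.
rink_length   <- 200
rink_width    <- 85
corner_radius <- 28

# Center of the circle that forms the top-right corner.
corner_x <- rink_length / 2 - corner_radius   # 72
corner_y <- rink_width  / 2 - corner_radius   # 14.5

# Hypothetical helper: n points along a circular arc between two angles
# (in radians), centered at (cx, cy) with radius r.
corner_arc <- function(cx, cy, r, from, to, n = 50) {
  theta <- seq(from, to, length.out = n)
  data.frame(x = cx + r * cos(theta), y = cy + r * sin(theta))
}

# Quarter circle from the side boards (angle 0) up to the end of the
# straight top boards (angle pi / 2).
top_right <- corner_arc(corner_x, corner_y, corner_radius, 0, pi / 2)

# Quick look at the shape; asp = 1 keeps the arc from being stretched.
plot(top_right$x, top_right$y, type = "l", asp = 1,
     xlab = "x (ft)", ylab = "y (ft)")
```

The other three corners follow by changing the signs of `corner_x` and `corner_y` and shifting the angle range by multiples of `pi / 2`.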

Actually, there’s more than three strikes here, but I’m stopping there at the risk of further embarrassment. So anyway, I tried to fix those things:

(Aside from the corners, note the differences between the two plots. My first plot was made circa 2012–2013, and the new one was made circa… this past weekend. Somewhere in that time between 2013 and this past weekend, the NHL tweaked the rink configuration a bit; namely, the hash marks outside the face-off circles are farther apart,² and the restricted area behind the net is wider.)

One of the goals of my project was to account for spatial trends in determining… something about scoring chances? I don’t really remember, honestly. According to my project proposal, the original plan was to examine the differences in the rate of goals and scoring chances pre- and post-2005 lockout, but I recall not having enough pre-lockout data to do so.

Anyway, I divided the offensive zone into separate regions based on distance and shot angle from the net, and defined a neighboring structure as zones sharing either a similar shot angle or distance:

The lone exception is the black semicircle surrounding the crease, which I assigned as a zone with no neighbors. The assumption was that shots in this area might represent unique scoring chances not present in the other regions (e.g., tip-ins and rebounds). In retrospect, though, I made no similar adjustment for the non-trivial number of shots from the point and other areas higher in the zone that are solely intended to set up those chances (and I’m not even sure the data allows anything to be done about that).
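To make the zoning idea concrete, here is a rough sketch of one way to bin shots by distance and angle and build a neighbor matrix. It is not the original implementation. The cut points, the crease radius, and the names `assign_zone`, `zones`, and `W` are assumptions for illustration, as is my reading of "neighbor" as "same bin in one dimension, adjacent bin in the other." It assumes shots have already been flipped into the right-hand zone, where the net sits at \((89, 0)\) because the goal line is 11 feet from the end boards.

```r
net_x <- 89   # goal line is 11 ft from the end boards at x = 100

# Assumed cut points, for illustration only.
dist_breaks   <- c(0, 15, 30, 45, Inf)   # feet from the net
angle_breaks  <- c(-90, -30, 30, 90)     # degrees; 0 is straight on
crease_radius <- 8                       # hypothetical "no neighbors" zone

# Label each shot with a zone such as "d2a3", or "crease" if it is
# within crease_radius of the net.
assign_zone <- function(x, y) {
  d <- sqrt((net_x - x)^2 + y^2)
  a <- atan2(y, net_x - x) * 180 / pi
  d_bin <- cut(d, dist_breaks,  labels = FALSE, include.lowest = TRUE)
  a_bin <- cut(a, angle_breaks, labels = FALSE, include.lowest = TRUE)
  ifelse(d <= crease_radius, "crease", paste0("d", d_bin, "a", a_bin))
}

# All distance-by-angle zones.
zones <- expand.grid(d = 1:4, a = 1:3)
zones$id <- paste0("d", zones$d, "a", zones$a)

# Two zones are neighbors if they share one bin and sit in adjacent
# bins of the other dimension.
adjacent <- function(i, j) {
  dd <- abs(zones$d[i] - zones$d[j])
  da <- abs(zones$a[i] - zones$a[j])
  (dd == 0 & da == 1) | (da == 0 & dd == 1)
}

n <- nrow(zones)
W <- outer(seq_len(n), seq_len(n), adjacent) * 1
dimnames(W) <- list(zones$id, zones$id)

# The crease gets a row and a column of zeros: no neighbors.
W <- rbind(cbind(W, crease = 0), crease = 0)

assign_zone(c(85, 60, 59), c(0, 0, 30))
```

A symmetric 0/1 matrix like `W` is the usual input to areal models (for example, conditional autoregressive priors), which is why the crease’s all-zero row matters: it removes that zone from any spatial smoothing.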

Here’s the shot chart for the entire 2010 Stanley Cup playoffs³ superimposed over my plot (points jittered for clarity):
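Jittering is easy to do in base R. The sketch below uses made-up shots that share exact coordinates, which is an assumption about why jittering helps here, and nudges each point by up to half a foot so overlapping shots become visible. The data and the `amount` value are illustrative only.

```r
set.seed(1)  # make the random nudges reproducible

# Hypothetical shots with repeated coordinates, so several points coincide.
shots <- data.frame(
  x = c(70, 70, 70, 82, 82),
  y = c(-10, -10, -10, 5, 5)
)

# jitter() adds uniform noise in [-amount, amount] to each value.
shots_jit <- data.frame(
  x = jitter(shots$x, amount = 0.5),
  y = jitter(shots$y, amount = 0.5)
)

plot(shots_jit$x, shots_jit$y, pch = 16, asp = 1,
     xlab = "x (ft)", ylab = "y (ft)")
```

The jittered coordinates are only for display; any zone assignment or modeling should use the original locations.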

In my project proposal, I noted that treating these regions as discrete might not be the ideal way to account for the spatial correlation, but we hadn’t yet learned about spatial Poisson point processes, which (based on my cursory introduction at the time) seemed more appropriate for what I wanted to do. (My exact words: “Since the shots come from random locations on the ice, they should be treated as point-pattern data.”) Maybe I’ll revisit my analysis with this in the future, unless an expert in spatial statistics tells me I’m an idiot. They may not be wrong.

I’d be remiss to talk about this project without shouting out my friend Ryan Parker for his assistance; he shared his code for creating a basketball court, which I used as a guide for my own. He also helped me write a script to scrape play-by-play data from the NHL’s website. I’d talk more about the latter, but it seems to have been corrupted during transfer to/from various laptops; I’m staring at four files (NHLpbp.R, hockeygrid.R, st733projectgrid.R, st733projectplotgrid.R) that are all just completely blank. But because this was at a time when I didn’t believe in organizing code in any way, I at least copied the plotting part into one giant, 1,079-line script for this project, allowing me to find it years later and write this post. Morals of this story:

  • Use Git as version control and also as a backup of your files.

  • Never write a 1,079-line program unless you have a really good reason for doing so. I can’t believe I got an A in this class.

  • Seriously, learn how to name your files. I can only assume that NHLpbp.R is the one that actually gave me the play-by-play data, and I have no idea what the difference is among the three aforementioned grid files.

I extended this plot a bit to include the choice of the offensive zone (right or left) and which rink type (NHL/North American or IIHF/international) to use. I also went ahead and made a ggplot version, which additionally plots an entire half of the rink instead of just the offensive zone. Code available on GitLab and GitHub.

After finding this, I wondered if someone else had done this, and lo and behold, the answer is yes, with code that looks substantially simpler than mine. It also plots the entire rink rather than just one offensive zone. I also found an interactive plot made with Python’s Plotly, and the result appears similar to my original one.

Also, my high school physics teacher liked hockey,⁴ and I think he was a big fan of the St. Louis Blues. So, Mr. Stüller, if you’re reading this and are indeed a Blues fan, congrats.

  1. Three strikes do not mean I’m out, though. Subject of a future post.↩︎

  2. I’ve been watching hockey for 20 years and just realized that I actually have no idea what purpose these hash marks serve. Maybe something to do with positioning on face-offs?↩︎

  3. Excluding the following: shots from below the goal line, shots from outside the offensive zone, empty net goals, penalty shots, shootout attempts (not applicable for the playoffs)↩︎

  4. All of his bonus questions on assignments were about hockey. My classmates didn’t think this was fair since I was the only one who watched hockey. I didn’t think it was fair that time I actually lost points for getting the “bonus” question wrong.↩︎

Matthew A.

Statistician. Ohio State alumnus. I like jerk chicken and Prince. I don’t like anything else. Just because I’m kidding doesn’t mean I’m not dead serious.