Blog Post

Fantasy Hockey Sim: The Big Damn Database Refactor

Fantasy Hockey Sim: The Big Damn Database Refactor

I recently completed a year’s worth of upgrades to Fantasy Hockey Sim.  They were all important changes and the site is better with them completed, but it was never supposed to be that way.  I want to try to make some sense of it here.

It started with a suggestion.  “Can we track TOI and ATOI for players in their season stats?”  That would be time on ice and average time on ice.  And, given that we have the number of minutes each player skates in every game, yeah, that should have been easy enough to add.

So I started to add the two new fields to the stat_player table, where each players season stats are stored.  Looking at that, though, I was struck by how wrong that table was.  And that’s where the snowball started.

A brief history lesson…

Long before Fantasy Hockey Sim was a thought, I was building tools for managing the drwcentral.net Fantasy Hockey League (now the DetroitHockey.Net Fantasy Hockey League).  In 2006, I greatly expanded on those tools and developed a system for not just managing the league in its current state but also displaying historical data, such as career stats and past games played.

To do that, I wrote code and designed a database centered around importing archived historical data.  In that context, I always knew that stats belonged to Team X and/or Player Y, as played for a given game type of Season Z.

In 2013, I updated this system to handle multiple leagues, so then the stats belonged to Team A and/or Player B in League C for a given game type of Season D.

The problem was that was exactly how the stat_player table was keyed.  From the original 2006 design, it had never been updated to have an auto-incrementing key.  Instead it had a multi-dimensional key consisting of player_id, team_id, league_id, game_type, and season_id.  Additionally, it wasn’t even accurate, as farm league stats were simply another set of fields on the end of the player_stats table.

This might have made sense in the context of the format from which the stats were imported in 2006 but it was wrong for how things were used in 2019.

So I started to fix it.

Leagues have seasons.  Seasons have schedules.  Schedules are made up of games and have a game type.  Leagues have franchises.  Each season, a franchise fields a team.  A farm team has a parent team.  Now stats_player has a player_id, a team_id, and a schedule_id.  From that, you can find what league the stat record is for, what franchise it’s for, and whether or not it’s from the farm league.

But then I didn’t stop there.

Games had a home team and an away team, with scores for each.  They also had a set of power play stats for each team.  But that could be combined into a game_stat_team table, with one team having a flag denoting them as the home team.  Goalie stats and skater stats for each game had their own tables but they were merged to become game_stat_player, with position_id denoting whether they’re a goalie or not.

This continued until I had a database structure that I was happy with.

Now I just had to update all of my code to match the new database layout.

That would be no small thing regardless of the state of the codebase coming into this project.  I’d just redefined or added the concepts of seasons, schedules, franchises, and teams.  I had to build new tooling around that.

On top of that, like the database structure, the site code was based around importing historical data.  Much of the code was historical itself, having been originally written in 2006, with new features bolted on as necessary since then.  As such, the database redesign led to a major code refactor project.

All started by a request for two new stats fields, for data the system already had available.

So what have I learned from this?

I think it shows that I’d gotten complacent.  Both the database and the code connecting to it should have been refactored ages ago but, because they worked just fine, I didn’t touch them.  FHS is a personal project, so I can justify this by saying that I didn’t have the time available to go back and do it right, but I think that would just be an excuse.

That said, this project shouldn’t have snowballed in the way that it did.  I should have broken it down into smaller chunks.  I never would have allowed this in a project at the office, but because it was for one of my own sites, I let it get out of control.

Finally, in an effort to force myself to get the project done, I rolled in several unrelated features and launched them, so that I wouldn’t have the option of continuing on with normal operations of FHS using the old codebase and schema.

By not breaking it down into smaller pieces, I ended up burning out a bit on the project.  It took so long to get done and involved so much downtime that I was desperate to work on anything else.

Now that it’s done, I’m relieved to have the site back up and running and that off my plate.  I’m also glad to have the opportunity to look back, see what I did wrong, and work to do better next time around.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.