This post discusses some of the analysis I did for a recent project: the creation of a basketball manager game. During the design phase it became clear that one of the most important aspects of the project would be defining and implementing accurate player behaviour, so I needed a method of defining different players and how they play. I came across a talk, From 5 to 13: Redefining the Positions in Basketball, from the 2012 MIT Sloan Sports Analytics Conference. This started me down the track of finding different player types and the players that fit into each group. Then, from reading Basketball Analytics: Spatial Tracking by Stephen Shea, I had a method of calculating these groups.
The first step is to collect the data, which came from two places. During the design phase it came from basketball-reference.com, which contains a large database of basketball statistics. All the player data was collected from the 2013-2014 season and saved in a spreadsheet. During the evaluation phase the data came from the output of simulations. Once the data is stored in a spreadsheet it must be standardized. This is done by subtracting the average of each statistic and dividing by its standard deviation. To speed up this process I used the STANDARDIZE function supplied with Google Sheets rather than doing the calculation by hand.
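For anyone working outside a spreadsheet, the standardization step can be sketched in a few lines of Python. This is only an illustration of the subtract-the-average, divide-by-the-standard-deviation calculation described above. The function name and the sample numbers are made up, and the choice of the population standard deviation is an assumption, since a spreadsheet can be fed either version.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores: (value - average) / standard deviation."""
    mu = mean(values)
    # Assumption: population standard deviation. Swap in statistics.stdev
    # if the sample version is what the spreadsheet was given.
    sigma = pstdev(values)
    if sigma == 0:
        # Every player has the same value, so nobody deviates from average.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical assists-per-game values, not real player data.
assists = [2.0, 4.0, 6.0, 8.0]
z_assists = standardize(assists)
```

After this step each statistic has an average of 0 and a standard deviation of 1, so assists, rebounds and shot attempts can be compared on the same scale.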
The next step is to apply a filtration to the data. This gives each player an x and y coordinate on a chart that represents their play style. The coordinates are calculated from certain standardized statistics that describe a player. For this project I used a simple filtration that looks at assists, rebounds, three point attempts and two point attempts, pictured above. A much more complex filtration could be used, but I felt it best to start from a simple base. The book describes a filtration containing 8 elements (directions) that uses a wider array of statistics. Assists and rebounds are on the x axis, and three point and two point attempts are on the y axis. The calculation to get a player’s coordinates is:
PlayerX = rebounds - assists
PlayerY = two point attempts - three point attempts
This is applied to each player in the spreadsheet. The data can then be put into a chart using the x and y coordinates. Similar players end up close together, which makes certain play styles easy to spot. The chart below is a bubble chart of the NBA players in 2013-2014. It only includes players who played more than 30 minutes a game, to keep the chart readable. Players are grouped by their team, and the size of each bubble is the player's points per game.
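The coordinate calculation and the 30 minute cut-off can also be written as a short Python sketch. The Player class, its field names and the three sample players are hypothetical and exist only for illustration. The inputs are assumed to be already standardized, apart from minutes per game.

```python
from dataclasses import dataclass


@dataclass
class Player:
    name: str
    minutes: float     # raw minutes per game
    z_rebounds: float  # standardized statistics from here down
    z_assists: float
    z_two_pa: float    # two point attempts
    z_three_pa: float  # three point attempts


def coordinates(p):
    """PlayerX = rebounds - assists, PlayerY = two point - three point attempts."""
    x = p.z_rebounds - p.z_assists
    y = p.z_two_pa - p.z_three_pa
    return (x, y)


def chart_points(players, min_minutes=30.0):
    """Keep only players above the minutes cut-off and map name -> (x, y)."""
    return {p.name: coordinates(p) for p in players if p.minutes > min_minutes}


# Made-up players, not real data.
players = [
    Player("Guard A", 34.0, -1.0, 1.5, -0.5, 1.0),
    Player("Centre B", 32.0, 2.0, -1.0, 1.0, -1.0),
    Player("Bench C", 12.0, 0.0, 0.0, 0.0, 0.0),
]
points = chart_points(players)
```

With these sample values the pass-first, outside-shooting guard lands at negative x and negative y, while the rebounding centre lands at positive x and positive y. The bench player is dropped by the minutes filter.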
The final step is to group the players together. The image above shows a rough outline of the 9 player types that I found during analysis. Each of these types had to be implemented in the game so that the AI players could behave in that manner. The image below shows the 5 different positions in basketball playing as each of the different strategies as an “average” player. It's worth noting the slope across the 5 positions, which is what you would expect: the point guard has more assists and takes more shots outside, while the centre gets a lot more rebounds. The increase or decrease on the y axis also matches how each strategy should be playing.
Finally we have the recreation of the Memphis Grizzlies and the Golden State Warriors in the game, with their stats compared to real life. While the players don't all match up exactly, most are close to where they should be and play in a similar manner. The project has been quite successful in recreating the game of basketball, and it used some good analytic techniques to show this. I will continue to work on it in the future, trying to get it even closer to the NBA and real life basketball.
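The comparison in this post was done by eye on the charts. If a number is wanted, one simple option is the straight-line distance between a player's real and simulated coordinates. The sketch below is an assumption about how that could be measured, not something the project does, and the player names and coordinates are invented.

```python
from math import hypot


def coordinate_gap(real, simulated):
    """Distance between real and simulated (x, y) for players found in both."""
    return {
        name: hypot(simulated[name][0] - real[name][0],
                    simulated[name][1] - real[name][1])
        for name in real
        if name in simulated
    }


# Invented coordinates for illustration only.
real = {"Player A": (-2.5, -1.5), "Player B": (3.0, 2.0)}
simulated = {"Player A": (-2.5, -1.5), "Player B": (0.0, -2.0)}
gaps = coordinate_gap(real, simulated)
```

A gap of 0 means the simulated player sits exactly on the real one, and larger values flag the players whose in-game behaviour needs the most tuning.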