This post is a data analysis project using Python.

I mainly used pandas, matplotlib, and numpy, which are packages for data analysis and data visualization.


 


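I am absorbed in computer science these days, but as a physical education graduate I love sports.

Basketball is the sport I enjoyed most. In middle school I was captivated by an NBA player named Ray Allen

and started playing basketball because of him. At the time of writing he had made the most 3-pointers in NBA history,

and he is a legendary shooting guard (SG) famous for his clean, beautiful 3-point shooting form.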




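So, using Python, I decided to collect this player's

per-season stats by category vs. the NBA-wide average stats for the same seasons && the stats of the players regarded as the best shooting guards ever

and visualize the data as graphs myself,

in order to prove why he is such a great player

(quite an abstract topic... lol).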


Everything in the project is, of course, in *English*.


Below is the project description and content, copied as-is from the repository on my personal GitHub.


=======================================================================


- Brief summary -


Title - Why NBA player Ray Allen is the best shooter


Data source - basketball-reference.com


Data analysis graphs / packages and methods used

    Line plot / pandas , matplotlib.pyplot

    Scatter plot / pandas , matplotlib.pyplot 

    Histogram plot / pandas , matplotlib.pyplot

    Heatmap / seaborn , numpy , matplotlib.pyplot

    Plot Animation / matplotlib.patches , matplotlib.path , matplotlib.animation , matplotlib.pyplot


References


Software used - Jupyter Notebook



NBA-Ray-Allen-analysis

Description

Title

Why NBA player Ray Allen is the best shooter


Title Selection Reason


1) Love of NBA basketball
2) Favorite player in NBA is Ray Allen
3) Many people don't know about this player even though he is one of the best shooters in NBA history
4) Wanted to show why he is the best shooter in the NBA through NBA stats

Hypothesis


Ray Allen is the best shooter in NBA history

Attaining Internet Data Source


On www.basketball-reference.com I searched for the player I wanted (Ray Allen) and for the NBA overall per-season average stats, saved the data as xlsx files, and read them using the pandas package.
On http://www.espn.com/nba/story/ I looked up the top ten shooting guards in NBA history, built the Excel data by hand, and sorted out the 

Data Analysis / Visualization Method


  • Line plot / pandas , matplotlib.pyplot

  • Scatter plot / pandas , matplotlib.pyplot 

  • Histogram plot / pandas , matplotlib.pyplot

  • Heatmap / seaborn , numpy , matplotlib.pyplot

  • Plot Animation / matplotlib.patches , matplotlib.path , matplotlib.animation , matplotlib.pyplot

Reference material


Selected Software


Jupyter Notebook - clearly shows every step of the code

Effectiveness of NBA data Plot Analysis

A player's performance can be shown explicitly by visualizing stat data. Here are a few examples of how effectively data visualization can do this.
Let's take a look at how the average weight of NBA players has changed over the years.

Import the required packages with import ~ as ~, which shortens each package name for easier use, and read the data using the pandas package.
Read the Excel data with pd.read_excel(file_location) and sort it with the .sort_values method.

In [1]:
# pandas for tables, matplotlib/seaborn for plots, numpy for arrays
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns


# League-wide per-season averages
print("\n\n                                            The Chart Below is NBA average stats\n\n")
NBA_stat = pd.read_excel('../data/NBA_stats.xlsx')
NBA_stat_sorted = NBA_stat.sort_values(by='Season', ascending = True)  # oldest season first
print(NBA_stat)
# Ray Allen's per-season career averages
print("\n\n                                            The Chart Below is Ray Allen's NBA career stats\n\n")
Allen_stat = pd.read_excel('../data/Ray_Allen_average.xlsx')
print(Allen_stat)
# Second copy of the league table; its sorted version feeds the plots below
print('\nFirstly I will extract important columns from NBA stats\n')
NBA_want = pd.read_excel('../data/NBA_stats.xlsx')
NBA_want_sorted = NBA_want.sort_values(by='Season', ascending = True)
print(NBA_want)
                                            The Chart Below is NBA average stats


    Rk   Season   Lg   Age   Wt     G     MP    FG   FGA    3P  ...      PTS  \
0    1  2017-18  NBA  26.4  218  1230  241.4  39.6  86.1  10.5  ...    106.3   
1    2  2016-17  NBA  26.6  220  1230  241.6  39.0  85.4   9.7  ...    105.6   
2    3  2015-16  NBA  26.7  221  1230  241.8  38.2  84.6   8.5  ...    102.7   
3    4  2014-15  NBA  26.7  222  1230  242.0  37.5  83.6   7.8  ...    100.0   
4    5  2013-14  NBA  26.5  223  1230  242.0  37.7  83.0   7.7  ...    101.0   
5    6  2012-13  NBA  26.7  223  1229  241.9  37.1  82.0   7.2  ...     98.1   
6    7  2011-12  NBA  26.6  223   990  241.9  36.5  81.4   6.4  ...     96.3   
7    8  2010-11  NBA  26.6  223  1230  241.9  37.2  81.2   6.5  ...     99.6   
8    9  2009-10  NBA  26.6  222  1230  241.7  37.7  81.7   6.4  ...    100.4   
9   10  2008-09  NBA  26.6  221  1230  241.7  37.1  80.9   6.6  ...    100.0   
10  11  2007-08  NBA  26.8  220  1230  241.5  37.3  81.5   6.6  ...     99.9   
11  12  2006-07  NBA  26.6  219  1230  242.2  36.5  79.7   6.1  ...     98.7   
12  13  2005-06  NBA  26.5  220  1230  242.1  35.8  79.0   5.7  ...     97.0   
13  14  2004-05  NBA  26.9  220  1230  241.9  35.9  80.3   5.6  ...     97.2   
14  15  2003-04  NBA  27.0  220  1189  241.7  35.0  79.8   5.2  ...     93.4   
15  16  2002-03  NBA  27.2  219  1189  242.0  35.7  80.8   5.1  ...     95.1   
16  17  2001-02  NBA  27.4  218  1189  241.7  36.2  81.3   5.2  ...     95.5   
17  18  2000-01  NBA  27.7  216  1189  242.0  35.7  80.6   4.8  ...     94.8   
18  19  1999-00  NBA  27.8  216  1189  241.5  36.8  82.1   4.8  ...     97.5   
19  20  1998-99  NBA  27.9  215   725  241.8  34.2  78.2   4.5  ...     91.6   
20  21  1997-98  NBA  27.7  214  1189  241.9  35.9  79.7   4.4  ...     95.6   
21  22  1996-97  NBA  27.7  213  1189  241.9  36.1  79.3   6.0  ...     96.9   
22  23  1995-96  NBA  27.5  213  1189  241.6  37.0  80.2   5.9  ...     99.5   
23  24  1994-95  NBA  27.2  212  1107  241.9  38.0  81.5   5.5  ...    101.4   
24  25  1993-94  NBA  27.2  210  1107  241.1  39.3  84.4   3.3  ...    101.5   
25  26  1992-93  NBA  27.1  210  1107  241.7  40.7  86.0   3.0  ...    105.3   
26  27  1991-92  NBA  27.2  208  1107  241.8  41.3  87.3   2.5  ...    105.3   
27  28  1990-91  NBA  27.2  209  1107  241.8  41.4  87.2   2.3  ...    106.3   
28  29  1989-90  NBA  27.1  208  1107  241.5  41.5  87.2   2.2  ...    107.0   
29  30  1988-89  NBA  26.9  208  1025  241.5  42.5  89.0   2.1  ...    109.2   
30  31  1987-88  NBA  26.9  209   943  241.3  42.1  87.7   1.6  ...    108.2   
31  32  1986-87  NBA  26.6  209   943  241.6  42.6  88.8   1.4  ...    109.9   
32  33  1985-86  NBA  26.8  208   943  241.6  43.2  88.6   0.9  ...    110.2   
33  34  1984-85  NBA  26.4  207   943  241.4  43.8  89.1   0.9  ...    110.8   
34  35  1983-84  NBA  26.4  206   943  242.0  43.5  88.4   0.6  ...    110.1   
35  36  1982-83  NBA  26.1  205   943  241.3  43.5  89.7   0.5  ...    108.5   
36  37  1981-82  NBA  26.2  205   943  241.6  43.3  88.2   0.6  ...    108.6   
37  38  1980-81  NBA  26.2  205   943  241.4  43.0  88.4   0.5  ...    108.1   
38  39  1979-80  NBA  26.4  204   902  241.8  43.6  90.6   0.8  ...    109.3   
39  40  1978-79  NBA  26.3  203   902  241.1  44.5  91.7   NaN  ...    110.3   

      FG%    3P%    FT%   Pace   eFG%  TOV%  ORB%  FT/FGA   ORtg  
0   0.460  0.362  0.767   97.3  0.521  13.0  22.3   0.193  108.6  
1   0.457  0.358  0.772   96.4  0.514  12.7  23.3   0.209  108.8  
2   0.452  0.354  0.757   95.8  0.502  13.2  23.8   0.209  106.4  
3   0.449  0.350  0.750   93.9  0.496  13.3  25.1   0.205  105.6  
4   0.454  0.360  0.756   93.9  0.501  13.6  25.5   0.215  106.6  
5   0.453  0.359  0.753   92.0  0.496  13.7  26.5   0.204  105.8  
6   0.448  0.349  0.752   91.3  0.487  13.8  27.0   0.208  104.6  
7   0.459  0.358  0.763   92.1  0.498  13.4  26.4   0.229  107.3  
8   0.461  0.355  0.759   92.7  0.501  13.3  26.3   0.228  107.6  
9   0.459  0.367  0.771   91.7  0.500  13.3  26.7   0.236  108.3  
10  0.457  0.362  0.755   92.4  0.497  13.2  26.7   0.231  107.5  
11  0.458  0.358  0.752   91.9  0.496  14.2  27.1   0.246  106.5  
12  0.454  0.358  0.745   90.5  0.490  13.7  27.3   0.248  106.2  
13  0.447  0.356  0.756   90.9  0.482  13.6  28.7   0.245  106.1  
14  0.439  0.347  0.752   90.1  0.471  14.2  28.6   0.228  102.9  
15  0.442  0.349  0.758   91.0  0.474  14.0  28.5   0.229  103.6  
16  0.445  0.354  0.752   90.7  0.477  13.6  28.9   0.221  104.5  
17  0.443  0.354  0.748   91.3  0.473  14.1  28.2   0.231  103.0  
18  0.449  0.353  0.750   93.1  0.478  14.2  28.9   0.231  104.1  
19  0.437  0.339  0.728   88.9  0.466  14.6  30.2   0.240  102.2  
20  0.450  0.346  0.737   90.3  0.478  14.5  31.4   0.243  105.0  
21  0.455  0.360  0.738   90.1  0.493  14.8  30.8   0.236  106.7  
22  0.462  0.367  0.740   91.8  0.499  14.7  30.6   0.243  107.6  
23  0.466  0.359  0.737   92.9  0.500  14.6  31.4   0.245  108.3  
24  0.466  0.333  0.734   95.1  0.485  14.3  32.2   0.232  106.3  
25  0.473  0.336  0.754   96.8  0.491  14.0  32.0   0.243  108.0  
26  0.472  0.331  0.759   96.6  0.487  13.6  32.9   0.232  108.2  
27  0.474  0.320  0.765   97.8  0.487  13.9  32.3   0.245  107.9  
28  0.476  0.331  0.764   98.3  0.489  13.9  32.1   0.250  108.1  
29  0.477  0.323  0.768  100.6  0.489  14.5  33.0   0.249  107.8  
30  0.480  0.316  0.766   99.6  0.489  14.3  32.8   0.254  108.0  
31  0.480  0.301  0.763  100.8  0.488  14.3  33.4   0.262  108.3  
32  0.487  0.282  0.756  102.1  0.493  14.9  32.4   0.258  107.2  
33  0.491  0.282  0.764  102.1  0.496  14.9  32.9   0.252  107.9  
34  0.492  0.250  0.760  101.4  0.495  15.0  33.0   0.255  107.6  
35  0.485  0.238  0.740  103.1  0.488  15.8  33.4   0.233  104.7  
36  0.491  0.262  0.746  100.9  0.495  15.0  33.0   0.241  106.9  
37  0.486  0.245  0.751  101.8  0.489  15.6  33.5   0.245  105.5  
38  0.481  0.280  0.764  103.1  0.486  15.5  33.5   0.235  105.3  
39  0.485    NaN  0.752  105.8  0.485  16.0  32.8   0.232  103.8  

[40 rows x 31 columns]


                                            The Chart Below is Ray Allen's NBA career stats


     Season  Age   Tm   Lg Pos   G  GS    MP   FG   FGA  ...     FT%  ORB  \
0   1996-97   21  MIL  NBA  SG  82  81  30.9  4.8  11.1  ...   0.823  1.2   
1   1997-98   22  MIL  NBA  SG  82  82  40.1  6.9  16.0  ...   0.875  1.5   
2   1998-99   23  MIL  NBA  SG  50  50  34.4  6.1  13.5  ...   0.903  1.1   
3   1999-00   24  MIL  NBA  SG  82  82  37.4  7.8  17.2  ...   0.887  1.0   
4   2000-01   25  MIL  NBA  SG  82  82  38.2  7.7  16.0  ...   0.888  1.2   
5   2001-02   26  MIL  NBA  SG  69  67  36.6  7.7  16.6  ...   0.873  1.2   
6   2002-03   27  MIL  NBA  SG  47  46  35.8  7.5  17.1  ...   0.913  1.0   
7   2003-04   28  SEA  NBA  SG  56  56  38.4  8.0  18.2  ...   0.904  1.2   
8   2004-05   29  SEA  NBA  SG  78  78  39.3  8.2  19.2  ...   0.883  1.0   
9   2005-06   30  SEA  NBA  SG  78  78  38.7  8.7  19.2  ...   0.903  0.9   
10  2006-07   31  SEA  NBA  SG  55  55  40.3  9.2  21.0  ...   0.903  1.0   
11  2007-08   32  BOS  NBA  SG  73  73  35.9  6.0  13.5  ...   0.907  1.0   
12  2008-09   33  BOS  NBA  SG  79  79  36.4  6.3  13.2  ...   0.952  0.8   
13  2009-10   34  BOS  NBA  SG  80  80  35.2  5.8  12.2  ...   0.913  0.6   
14  2010-11   35  BOS  NBA  SG  80  80  36.1  6.0  12.2  ...   0.881  0.6   
15  2011-12   36  BOS  NBA  SG  46  42  34.0  4.9  10.7  ...   0.915  0.3   
16  2012-13   37  MIA  NBA  SG  79   0  25.8  3.7   8.2  ...   0.886  0.5   
17  2013-14   38  MIA  NBA  SG  73   9  26.5  3.3   7.4  ...   0.905  0.3   

    DRB  TRB  AST  STL  BLK  TOV   PF   PTS  
0   2.8  4.0  2.6  0.9  0.1  1.8  2.7  13.4  
1   3.4  4.9  4.3  1.4  0.1  3.2  3.0  19.5  
2   3.1  4.2  3.6  1.1  0.1  2.4  2.3  17.1  
3   3.4  4.4  3.8  1.3  0.2  2.2  2.3  22.1  
4   4.0  5.2  4.6  1.5  0.2  2.5  2.3  22.0  
5   3.3  4.5  3.9  1.3  0.3  2.3  2.3  21.8  
6   3.7  4.6  3.5  1.2  0.2  2.5  3.2  21.3  
7   3.9  5.1  4.8  1.3  0.2  2.8  2.4  23.0  
8   3.4  4.4  3.7  1.1  0.1  2.2  2.1  23.9  
9   3.3  4.3  3.7  1.3  0.2  2.4  1.9  25.1  
10  3.5  4.5  4.1  1.5  0.2  2.8  2.1  26.4  
11  2.6  3.7  3.1  0.9  0.2  1.7  2.0  17.4  
12  2.7  3.5  2.8  0.9  0.2  1.7  2.0  18.2  
13  2.6  3.2  2.6  0.8  0.3  1.6  2.3  16.3  
14  2.8  3.4  2.7  1.0  0.2  1.5  1.8  16.5  
15  2.8  3.1  2.4  1.1  0.2  1.5  1.8  14.2  
16  2.2  2.7  1.7  0.8  0.2  1.3  1.6  10.9  
17  2.5  2.8  2.0  0.7  0.1  1.2  1.6   NaN  

[18 rows x 30 columns]

Firstly I will extract important columns from NBA stats

[Same table as the NBA average stats output above: 40 rows x 31 columns]


Extract the wanted columns with variable[column_name]

In [2]:
NBA_season = NBA_want_sorted['Season']
NBA_weight = NBA_want_sorted['Wt']


Draw the line plot
plt.figure(figsize=...) creates the figure and sets its size, and plt.plot(...) puts the wanted data on the x and y axes and sets the styling options. plt.xlabel(label) and plt.ylabel(label) set the label of each axis, and plt.title(title) sets the title of the plot.
plt.grid() shows the grid, which is the set of reference lines inside the plot.

Last but not least, plt.show() renders the finished plot.

In [3]:
plt.figure(figsize=(40,20))
plt.plot(NBA_season, NBA_weight, color = 'red', linestyle='dashed', marker = 'o', markerfacecolor ='blue', markersize=10, lw=4)
plt.xlabel('Season', size =20)
plt.ylabel('Player Weight', size = 20)
plt.title('NBA Player Weight Change', size=30)
plt.xticks(rotation = 90)
plt.grid()
plt.show()

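The plot shows an almost steady increase in average weight, from 203 lb in 1978-79 to a peak of 223 lb in the early 2010s, followed by a slight decrease to 218 lb by 2017-18.
One possible reading is that players came to need more muscle than lung capacity.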
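
The same trend can be checked numerically. This is only a minimal sketch: the small `weights` table is an assumption that re-types five rows from the league output above instead of reading the Excel file, and the variable names are mine.

```python
import pandas as pd

# Five (season, average weight) rows copied from the league table printed above.
weights = pd.DataFrame({
    "Season": ["1978-79", "1996-97", "2010-11", "2013-14", "2017-18"],
    "Wt": [203, 213, 223, 223, 218],
})

first, last = weights.iloc[0], weights.iloc[-1]
# idxmax returns the label of the first row that holds the maximum weight
peak = weights.loc[weights["Wt"].idxmax()]

print("change over the whole span:", last["Wt"] - first["Wt"])
print("peak season in this sample:", peak["Season"], peak["Wt"])
print("change since the peak:", last["Wt"] - peak["Wt"])
```

With the full NBA_want_sorted table the same three lines would work on the 'Wt' column directly.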



Let's look at how another important stat, shooting, has changed by creating line plots.


Extract the wanted columns with variable[column_name] and draw two separate line plots

In [4]:
NBA_FG_percent = NBA_want_sorted['FG%']
NBA_three_percent = NBA_want_sorted['3P%']

plt.figure(figsize = (80,25))
plt.plot(NBA_season, NBA_FG_percent, lw=10)
plt.xlabel('Season', size =50)
plt.ylabel('FieldGoal Percentage', size = 50)
plt.title('NBA Player FieldGoal Percentage', size=65)
plt.grid()
plt.show()

plt.figure(figsize=(80,25))
plt.plot(NBA_season, NBA_three_percent, color = 'purple', linestyle='-.', marker = 'p', markerfacecolor ='white', markersize=20, lw=13)
plt.xlabel('Season', size =50)
plt.ylabel('3point Percentage', size = 50)
plt.title('NBA Player 3point Percentage', size=65)
plt.grid()
plt.show()

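Due to NBA rule changes and strategic shifts, field goal percentage showed a clear drop until the late 1990s, bottoming out at 0.437 in 1998-99.
It has risen again over the last 3 seasons shown (0.452, 0.457, 0.460).
3-point percentage rose steadily from about 0.24 in the early 1980s to about 0.36 by the mid-1990s and has stayed near that level since.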

Ray Allen Comparison Analysis

Now let's compare these overall NBA average stats with Ray Allen's stats.
Ray Allen is an SG (shooting guard) famous for specializing in 3-point shooting.

Restrict the NBA average stats to the 1996-97 through 2013-14 seasons so that they match Ray Allen's career

In [5]:
NBA_short_stat = NBA_want_sorted[18:36]
NBA_short_stat_season = NBA_short_stat['Season']
NBA_short_stat_three_percent = NBA_short_stat['3P%']
NBA_short_stat_FG_percent = NBA_short_stat['FG%']
NBA_short_stat_freethrow_percent = NBA_short_stat['FT%']
Allen_FG_percent = Allen_stat['FG%']
Allen_season = Allen_stat['Season']
Allen_freethrow_percent = Allen_stat['FT%']
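
The positional slice [18:36] depends on the row order of the file. As a hedged alternative, the matching seasons can be selected by label instead. The two small frames below are stand-ins I typed in from the outputs above (they are not the real NBA_want_sorted and Allen_stat), and `league_short` is a name of my own.

```python
import pandas as pd

# Stand-in for the league table: five seasons with FT% copied from the output above.
league = pd.DataFrame({
    "Season": ["1994-95", "1995-96", "1996-97", "1997-98", "1998-99"],
    "FT%": [0.737, 0.740, 0.738, 0.737, 0.728],
})
# Stand-in for Ray Allen's table: his first three seasons.
allen = pd.DataFrame({
    "Season": ["1996-97", "1997-98", "1998-99"],
    "FT%": [0.823, 0.875, 0.903],
})

# Keep only the league rows whose season also appears in Allen's table.
league_short = league[league["Season"].isin(allen["Season"])].reset_index(drop=True)
print(league_short)
```

Selecting by season label keeps the two tables aligned even if rows are added to the league file later.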


Extract the wanted columns again. This time use plt.scatter to draw a scatter plot
plt.legend adds the key that identifies the two marker shapes; loc = 0 lets matplotlib choose the best location for it

In [6]:
plt.figure(figsize = (40,18))
plt.scatter(NBA_short_stat_season, NBA_short_stat_FG_percent, marker = 'd', s = 200, label = 'NBA')
plt.scatter(Allen_season, Allen_FG_percent, marker = 'p', s = 200, label = 'Allen')
plt.xlabel('Season', size = 33)
plt.ylabel('FieldGoal Percentage', size = 30)
plt.title('NBA and Ray Allen FieldGoal Percentage Comparison per Season', size = 40)
plt.grid()
plt.legend(loc = 0, prop ={'size' : 32})
plt.show()

Allen_three_percent = Allen_stat['3P%']
plt.figure(figsize=(40,18))
plt.scatter(NBA_short_stat_season, NBA_short_stat_three_percent, color='b', label = 'NBA', marker = '>', s=200)
plt.scatter(Allen_season, Allen_three_percent,color='g', label = 'Allen', s=200)
plt.title('NBA and Ray Allen 3point Percentage Comparison per Season', size = 40)
plt.xlabel('Season', size = 33)
plt.ylabel('3Point Percentage', size = 30)
plt.grid()
plt.legend(loc=0, prop = {'size':32})
plt.show()

plt.figure(figsize=(40,18))
plt.scatter(NBA_short_stat_season, NBA_short_stat_freethrow_percent, color = 'purple', label = 'NBA', marker = '.', s = 450)
plt.scatter(Allen_season, Allen_freethrow_percent, color = 'r', label = 'Allen', marker = '^', s = 250)
plt.title('NBA and Ray Allen Freethrow Percentage Comparison per Season', size = 40)
plt.xlabel('Season', size = 30)
plt.ylabel('Freethrow Percentage', size = 30)
plt.grid()
plt.legend(loc = 0, prop = {'size':32})
plt.show()

The results are mixed for field goal percentage.
However, the 3-point and free throw percentages (especially free throws) clearly show that Ray Allen had a higher success rate than the NBA average.
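
To put a number on "higher than the NBA average", the two tables can be joined on Season and subtracted. This is a rough sketch under assumptions: it uses only Allen's first three seasons, with values copied from the tables printed in this post, and the `_allen` / `_nba` suffixes and `_diff` column names are mine.

```python
import pandas as pd

# League averages for three seasons, copied from the league output above.
league = pd.DataFrame({
    "Season": ["1996-97", "1997-98", "1998-99"],
    "FT%": [0.738, 0.737, 0.728],
    "3P%": [0.360, 0.346, 0.339],
})
# Ray Allen's first three seasons, copied from the tables in this post.
allen = pd.DataFrame({
    "Season": ["1996-97", "1997-98", "1998-99"],
    "FT%": [0.823, 0.875, 0.903],
    "3P%": [0.393, 0.364, 0.356],
})

# Join on Season so each row holds Allen's value next to the league value.
merged = allen.merge(league, on="Season", suffixes=("_allen", "_nba"))
merged["FT%_diff"] = merged["FT%_allen"] - merged["FT%_nba"]
merged["3P%_diff"] = merged["3P%_allen"] - merged["3P%_nba"]

print(merged[["Season", "FT%_diff", "3P%_diff"]].round(3))
```

A positive difference means Allen shot better than the league average in that season.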

As a matter of fact, at the time this data was collected Ray Allen was the all-time leader in 3-pointers made in NBA history.
Let me show the top 20 all-time NBA 3-point leaders.


Read the data of the 20 NBA 3-point leaders

In [7]:
NBA_three_leader = pd.read_excel('../data/NBA_3pt_leader.xlsx')


Select the '3P' and 'Player' columns to use as the axes of the plot

In [8]:
NBA_three_leader_point = NBA_three_leader['3P']
NBA_three_leader_name = NBA_three_leader['Player']


Pull out Ray Allen's row separately with the .loc[] method so that his bar can be given a different color

In [9]:
NBA_three_leader_Allen = NBA_three_leader.loc[0]
NBA_three_leader_Allen_three = NBA_three_leader_Allen['3P']
NBA_three_leader_Allen_name = NBA_three_leader_Allen['Player']


Sort the other 19 players

In [10]:
NBA_three_leader_else = NBA_three_leader[1:20]
NBA_three_leader_else_sort = NBA_three_leader_else.sort_values(by = '3P', ascending = True)
NBA_three_leader_else_three = NBA_three_leader_else['3P']
NBA_three_leader_else_sort_player = NBA_three_leader_else_sort['Player']


Take the player and 3P columns from the ascending-sorted data so that the highest totals end up at the top of the chart

In [11]:
NBA_three_leader_else_sort_player = NBA_three_leader_else_sort['Player']
NBA_three_leader_else_sort_three = NBA_three_leader_else_sort['3P']


Draw the horizontal bar chart
This graph is built from two separate plt.barh calls, since I wanted Ray Allen's bar to have a different height and color

In [12]:
plt.figure(figsize=(20,12))
plt.barh(NBA_three_leader_else_sort_player, NBA_three_leader_else_sort_three, height = 0.6)
plt.barh(NBA_three_leader_Allen_name, NBA_three_leader_Allen_three, height = 0.8, color = 'r', align= 'center')
plt.title('3point All Time NBA Leaders', size = 25)
plt.xlabel('3point Made', size = 20)
plt.ylabel('Player', size = 20, rotation = 60)
plt.show()

This bar chart of all-time NBA 3-point leaders shows that Ray Allen has made the most 3-pointers,
which is an explicit indicator of why Ray Allen can be called the best shooter in NBA history
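
As an alternative to two barh calls, one call with a per-bar color list can highlight a single player. The sketch below is an assumption-heavy illustration: only Ray Allen's 2973 comes from this post, while "Player B", "Player C" and their totals are made-up placeholders, and the Agg backend is chosen only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder leaderboard: only Allen's 2973 is real, the other rows are invented.
leaders = pd.DataFrame({
    "Player": ["Ray Allen", "Player B", "Player C"],
    "3P": [2973, 2500, 2200],
})
leaders = leaders.sort_values("3P")  # ascending, so the leader is drawn at the top

# One color per bar: red for Ray Allen, the default blue for everyone else.
colors = ["r" if name == "Ray Allen" else "C0" for name in leaders["Player"]]

fig, ax = plt.subplots(figsize=(8, 4))
bars = ax.barh(leaders["Player"], leaders["3P"], color=colors)
ax.set_xlabel("3point Made")
ax.set_title("3point All Time NBA Leaders")
```

The two-call version in the notebook additionally gives Allen's bar a different height, which a single call with one shared height does not reproduce.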


Use variable.mean() to get the average of the data

In [13]:
print('20 Leader Three Point Average')
print(NBA_three_leader_point.mean())
print('\nAllen Three Point Average')
print(NBA_three_leader_Allen_three.mean())
print('\nDifference')
NBA_three_leader_Allen_three.mean() - NBA_three_leader_point.mean()
20 Leader Three Point Average
2011.55

Allen Three Point Average
2973.0

Difference
Out[13]:
961.45

Ray Allen's total is about 961 three-pointers above the average of the 20 leaders

Being among the top ten SGs (shooting guards) in NBA history means being one of the best shooters in the NBA, and Ray Allen is ranked 8th on that list

One of the main indicators of a top SG is the actual percentage of 3-point field goals and free throws made
Here I brought in Excel data with each top 10 SG's 3-point and free throw percentages from their 1st through 12th seasons

  • Reference - www.basketball-reference.com
    (George Gervin and Jerry West weren't considered since they lack 3-point stats comparable to the other players)

With these two datasets (3-point percentage and free throw percentage), I will draw a heatmap and an animation using the seaborn, numpy, and matplotlib.pyplot packages

Heatmap

Read the two datasets using pandas

In [20]:
print('\n\n                       Top SG 3point percentage(12 seasons)\n')
Top_SG_3p = pd.read_excel('../data/Top_SG_3percent.xlsx')
print(Top_SG_3p)

print('\n\n                       Top SG freethrow percentage(12 seasons)\n')

Top_SG_free = pd.read_excel('../data/Top_SG_Freepercent.xlsx')
print(Top_SG_free)
                       Top SG 3point percentage(12 seasons)

   Season  Mcgrady  Allen  Drexler  Iverson  Jordan   Wade  Miller  Carter  \
0     1st    0.341  0.393    0.250    0.341   0.173  0.302   0.355   0.288   
1     2nd    0.229  0.364    0.216    0.298   0.167  0.289   0.402   0.403   
2     3rd    0.277  0.356    0.200    0.291   0.182  0.171   0.414   0.408   
3     4th    0.355  0.423    0.234    0.341   0.132  0.266   0.348   0.387   
4     5th    0.364  0.433    0.212    0.320   0.276  0.286   0.378   0.344   
5     6th    0.386  0.434    0.260    0.291   0.376  0.317   0.399   0.383   
6     7th    0.339  0.377    0.283    0.277   0.312  0.300   0.421   0.406   
7     8th    0.326  0.395    0.319    0.286   0.270  0.306   0.415   0.322   
8     9th    0.312  0.351    0.337    0.308   0.352  0.268   0.410   0.425   
9    10th    0.331  0.392    0.233    0.323   0.500  0.258   0.427   0.341   
10   11th    0.292  0.376    0.324    0.315   0.427  0.281   0.429   0.357   
11   12th    0.376  0.412    0.360    0.226   0.374  0.284   0.385   0.359   

    Bryant  Ginobili  
0    0.375     0.345  
1    0.341     0.359  
2    0.267     0.376  
3    0.319     0.382  
4    0.305     0.396  
5    0.250     0.401  
6    0.383     0.330  
7    0.327     0.377  
8    0.339     0.349  
9    0.347     0.413  
10   0.344     0.353  
11   0.361     0.349  


                       Top SG freethrow percentage(12 seasons)

   Season  Mcgrady  Allen  Drexler  Iverson  Jordan   Wade  Miller  Carter  \
0     1st    0.712  0.823    0.728    0.702   0.845  0.747   0.801   0.761   
1     2nd    0.726  0.875    0.759    0.729   0.840  0.762   0.844   0.791   
2     3rd    0.707  0.903    0.769    0.751   0.857  0.783   0.868   0.765   
3     4th    0.733  0.887    0.760    0.713   0.841  0.807   0.918   0.798   
4     5th    0.748  0.888    0.811    0.814   0.850  0.758   0.858   0.806   
5     6th    0.793  0.873    0.799    0.812   0.848  0.765   0.880   0.806   
6     7th    0.796  0.916    0.774    0.774   0.851  0.761   0.908   0.798   
7     8th    0.774  0.913    0.794    0.745   0.832  0.758   0.897   0.694   
8     9th    0.747  0.920    0.794    0.835   0.837  0.791   0.863   0.817   
9    10th    0.707  0.904    0.839    0.814   0.801  0.725   0.880   0.799   
10   11th    0.684  0.883    0.777    0.795   0.834  0.733   0.868   0.802   
11   12th    0.801  0.903    0.824    0.885   0.833  0.768   0.915   0.816   

    Bryant  Ginobili  
0    0.819     0.737  
1    0.794     0.802  
2    0.839     0.803  
3    0.821     0.778  
4    0.853     0.860  
5    0.829     0.860  
6    0.843     0.884  
7    0.852     0.870  
8    0.816     0.871  
9    0.850     0.871  
10   0.868     0.796  
11   0.840     0.851  


Build a 2D array with one row per player using np.array([]); the player and season lists provide the axis labels for the heatmap

In [21]:
player = ['Jordan','Bryant','Wade', 'Iverson', 'Drexler', 'Mcgrady', 'Allen', 'Miller', 'Carter', 'Ginobili']
season = ['1st','2nd','3rd','4th','5th','6th','7th','8th', '9th', '10th', '11th','12th']
three_heatmap_data = np.array([
    Top_SG_3p['Jordan'],
    Top_SG_3p['Bryant'],
    Top_SG_3p['Wade'],
    Top_SG_3p['Iverson'],
    Top_SG_3p['Drexler'],
    Top_SG_3p['Mcgrady'],
    Top_SG_3p['Allen'],
    Top_SG_3p['Miller'],
    Top_SG_3p['Carter'],
    Top_SG_3p['Ginobili']
                           ])
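
The same array can also be built by selecting the columns in the wanted order and transposing. This is just a sketch of the idea: `top_sg_3p` here is a three-player, three-season stand-in typed in from the table above, not the real Excel data.

```python
import numpy as np
import pandas as pd

# Stand-in for Top_SG_3p with three players and three seasons from the table above.
top_sg_3p = pd.DataFrame({
    "Season": ["1st", "2nd", "3rd"],
    "Allen": [0.393, 0.364, 0.356],
    "Jordan": [0.173, 0.167, 0.182],
    "Miller": [0.355, 0.402, 0.414],
})
player = ["Jordan", "Allen", "Miller"]

# Select the columns in row order, then transpose: rows = players, columns = seasons.
heatmap_data = top_sg_3p[player].to_numpy().T

# The explicit version used in the notebook, for comparison.
manual = np.array([top_sg_3p[name] for name in player])
print(np.array_equal(heatmap_data, manual))
```

Keeping the row order in one list (player) means the array and the y-axis labels cannot drift apart.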


Draw the heatmap using seaborn
ax.invert_yaxis() flips the y axis so that the first player in the list sits at the bottom

In [22]:
fig, ax = plt.subplots(figsize = (13,10))
sns.heatmap(three_heatmap_data, annot =True, fmt = 'g', linewidths =1, 
                 cmap = 'Oranges', vmin = 0.15, vmax = 0.5 )
ax.set_xticklabels(season)
ax.set_yticklabels(player, rotation = 360)
ax.invert_yaxis()
plt.ylabel('Player', size = 15)
plt.xlabel('Season', size = 15)
plt.title('3 Point make Percentage of NBA top 10 best Shooting Guards Heatmap', size = 20)
Out[22]:
Text(0.5, 1.0, '3 Point make Percentage of NBA top 10 best Shooting Guards Heatmap')


As you can see, the darker the cell, the higher the player's 3-point success rate
Jordan seems to have had a hot hand in his 10th season, with a 50% 3-point success rate

But the players with consistently high percentages are Miller and Allen

This heatmap indicates that Ray Allen did not just make a lot of 3-pointers but also made them at a high success rate!

Use the data.describe() method to check the details; it is quite useful since it automatically shows 8 main summary statistics

In [23]:
Top_SG_3p.describe()
Out[23]:
         Mcgrady      Allen    Drexler    Iverson     Jordan       Wade     Miller     Carter     Bryant   Ginobili
count  12.000000  12.000000  12.000000  12.000000  12.000000  12.000000  12.000000  12.000000  12.000000  12.000000
mean    0.327333   0.392167   0.269000   0.301417   0.295083   0.277333   0.398583   0.368583   0.329833   0.369167
std     0.044665   0.028695   0.054279   0.031512   0.115751   0.037758   0.026885   0.040590   0.040006   0.025626
min     0.229000   0.351000   0.200000   0.226000   0.132000   0.171000   0.348000   0.288000   0.250000   0.330000
25%     0.307000   0.373000   0.228750   0.289750   0.179750   0.267500   0.383250   0.343250   0.315500   0.349000
50%     0.335000   0.392500   0.255000   0.303000   0.294000   0.285000   0.406000   0.371000   0.340000   0.367500
75%     0.357250   0.414750   0.320250   0.320750   0.374500   0.300500   0.416500   0.403750   0.350500   0.385500
max     0.386000   0.434000   0.360000   0.341000   0.500000   0.317000   0.429000   0.425000   0.383000   0.413000

As you can see, Allen has the second highest mean (behind Miller) and the second highest max (behind Jordan) among the 10 greatest shooters
In the min (minimum) row, Allen has the highest percentage.
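
The ranking can be read off programmatically instead of by eye. A small sketch, assuming the mean values are simply re-typed from the describe() output above into a Series named `mean_3p` (my own name):

```python
import pandas as pd

# Mean 3P% per player, copied from the 'mean' row of the describe() output.
mean_3p = pd.Series({
    "Mcgrady": 0.327333, "Allen": 0.392167, "Drexler": 0.269000,
    "Iverson": 0.301417, "Jordan": 0.295083, "Wade": 0.277333,
    "Miller": 0.398583, "Carter": 0.368583, "Bryant": 0.329833,
    "Ginobili": 0.369167,
})

# rank(ascending=False) gives 1 to the highest mean; there are no ties here.
ranks = mean_3p.rank(ascending=False).astype(int)
top_three = mean_3p.sort_values(ascending=False).head(3)

print(top_three)
print("Allen's rank by mean 3P%:", ranks["Allen"])
```

With the real DataFrame, Top_SG_3p.describe().loc['mean'] would give the same Series without retyping.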

Animation


Import the patches, path, and animation tools from the matplotlib package and set up the data,
using the variable.iloc[] method to pick out each season's row

In [24]:
import matplotlib.patches as patches
import matplotlib.path as path
import matplotlib.animation as animation

data = Top_SG_free.drop(columns=['Season'])  # keyword form; the positional axis argument was removed in pandas 2.0

first_season = data.iloc[0]
second_season = data.iloc[1]
third_season = data.iloc[2]
fourth_season = data.iloc[3]
fifth_season = data.iloc[4]
sixth_season = data.iloc[5]
seventh_season = data.iloc[6]
eighth_season = data.iloc[7]
ninth_season = data.iloc[8]
tenth_season = data.iloc[9]
eleventh_season = data.iloc[10]
twelveth_season = data.iloc[11]



Stack the twelve season rows into a 2-D array using numpy.array

In [25]:
ani_data = np.array([
first_season,
second_season,
third_season,
fourth_season,
fifth_season,
sixth_season,
seventh_season,
eighth_season,
ninth_season,
tenth_season,
eleventh_season,
twelveth_season
])
print(ani_data)
[[0.712 0.823 0.728 0.702 0.845 0.747 0.801 0.761 0.819 0.737]
 [0.726 0.875 0.759 0.729 0.84  0.762 0.844 0.791 0.794 0.802]
 [0.707 0.903 0.769 0.751 0.857 0.783 0.868 0.765 0.839 0.803]
 [0.733 0.887 0.76  0.713 0.841 0.807 0.918 0.798 0.821 0.778]
 [0.748 0.888 0.811 0.814 0.85  0.758 0.858 0.806 0.853 0.86 ]
 [0.793 0.873 0.799 0.812 0.848 0.765 0.88  0.806 0.829 0.86 ]
 [0.796 0.916 0.774 0.774 0.851 0.761 0.908 0.798 0.843 0.884]
 [0.774 0.913 0.794 0.745 0.832 0.758 0.897 0.694 0.852 0.87 ]
 [0.747 0.92  0.794 0.835 0.837 0.791 0.863 0.817 0.816 0.871]
 [0.707 0.904 0.839 0.814 0.801 0.725 0.88  0.799 0.85  0.871]
 [0.684 0.883 0.777 0.795 0.834 0.733 0.868 0.802 0.868 0.796]
 [0.801 0.903 0.824 0.885 0.833 0.768 0.915 0.816 0.84  0.851]]
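
The twelve iloc lookups followed by np.array can also be collapsed into a single slice. This is a minimal sketch, assuming a small hypothetical DataFrame named demo in place of Top_SG_free (three seasons and three players copied from the output above; the Season values are made up):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for Top_SG_free: first three seasons, three players.
demo = pd.DataFrame({
    "Season": [1, 2, 3],
    "Mcgrady": [0.712, 0.726, 0.707],
    "Allen": [0.823, 0.875, 0.903],
    "Drexler": [0.728, 0.759, 0.769],
})

values = demo.drop(columns=["Season"])

# Row-by-row version, as in the notebook cells above.
stacked = np.array([values.iloc[0], values.iloc[1], values.iloc[2]])

# One-step version: slice the rows and convert to a NumPy array.
sliced = values.iloc[:3].to_numpy()

print(stacked.shape)
print(np.array_equal(stacked, sliced))
```

Both versions give the same seasons-by-players array, so the shorter form is mostly a matter of taste.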


Make the animation runnable inside Jupyter Notebook with the %matplotlib notebook magic command

In [26]:
%matplotlib notebook


Compute the histogram bins and set the outlines (left, right, bottom, top) of the bars for the animation frame

In [27]:
n, bins = np.histogram(ani_data, 12)
left = np.array(bins[:-1])
right = np.array(bins[1:])
bottom = np.zeros(len(left))
top = bottom + n
nrects = len(left)
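
np.histogram returns both the counts and the bin edges, and the edges can be passed back in through the bins argument. This is a minimal sketch, assuming only the first two rows of ani_data from the output above, of how one set of edges can be reused so that each season is counted against the same x positions (the names two_seasons and edges are only for illustration):

```python
import numpy as np

# First two rows of ani_data from the output above (seasons 1 and 2).
two_seasons = np.array([
    [0.712, 0.823, 0.728, 0.702, 0.845, 0.747, 0.801, 0.761, 0.819, 0.737],
    [0.726, 0.875, 0.759, 0.729, 0.840, 0.762, 0.844, 0.791, 0.794, 0.802],
])

# Bin edges computed once from all values (the 2-D array is flattened).
_, edges = np.histogram(two_seasons, 12)

# The same edges reused for each season, so every frame shares the x positions.
counts_s1, _ = np.histogram(two_seasons[0], bins=edges)
counts_s2, _ = np.histogram(two_seasons[1], bins=edges)

print(len(edges))                        # 13 edges for 12 bins
print(counts_s1.sum(), counts_s2.sum())  # 10 10 (one count per player)
```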


Set up the vertex and path-code arrays using path.Path.MOVETO, path.Path.LINETO, and path.Path.CLOSEPOLY for each rectangle.

Each rectangle needs one MOVETO, which sets the initial point, and three LINETOs, which tell Matplotlib to draw lines from vertex 1 to vertex 2, v2 to v3, and v3 to v4. It then needs one CLOSEPOLY, which tells Matplotlib to draw a line from v4 back to the initial vertex (the MOVETO vertex) in order to close the polygon.

In [28]:
nverts = nrects *(1+3+1)
verts = np.zeros((nverts, 2))
codes = np.ones(nverts, int) * path.Path.LINETO
codes[0::5] = path.Path.MOVETO
codes[4::5] = path.Path.CLOSEPOLY
verts[0::5, 0] = left
verts[0::5, 1] = bottom
verts[1::5, 0] = left
verts[1::5, 1] = top
verts[2::5, 0] = right
verts[2::5, 1] = top
verts[3::5, 0] = right
verts[3::5, 1] = bottom
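
To make the five-vertex layout concrete, here is a minimal sketch of a single bar built the same way. The numbers (a bar from 0.80 to 0.85 with height 3) are made up for illustration, and contains_point is used only to confirm that the path really encloses a rectangle:

```python
import numpy as np
from matplotlib.path import Path

# One histogram bar spanning x in [0.80, 0.85] with height 3 (illustrative numbers).
left, right, bottom, top = 0.80, 0.85, 0.0, 3.0

verts = np.array([
    [left, bottom],   # MOVETO: start at the lower-left corner
    [left, top],      # LINETO: up to the upper-left corner
    [right, top],     # LINETO: across to the upper-right corner
    [right, bottom],  # LINETO: down to the lower-right corner
    [0.0, 0.0],       # CLOSEPOLY: vertex is ignored, the path returns to the start
])
codes = [Path.MOVETO, Path.LINETO, Path.LINETO, Path.LINETO, Path.CLOSEPOLY]

bar = Path(verts, codes)
print(bar.contains_point((0.82, 1.0)))  # a point inside the bar
print(bar.contains_point((0.90, 1.0)))  # a point to the right of the bar
```

The notebook cell above fills the same pattern for all twelve bars at once by striding through the array in steps of five.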


Make an animate function, which generates and updates the locations of the vertices for the histogram. patch will eventually be a Patch object.

In [29]:
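patch = None
def animate(i):
    # Frame i shows season i: one row of ani_data (wraps around past the last season)
    season = ani_data[i % len(ani_data)]
    # Reuse the bin edges computed earlier so the counts match the rectangles
    n, _ = np.histogram(season, bins=bins)
    top = bottom + n
    verts[1::5, 1] = top
    verts[2::5, 1] = top
    return [patch, ]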


Build the animation plot
and run the animation with an interval of 1000 ms (1 second) per frame

In [30]:
fig, ax = plt.subplots()
barpath = path.Path(verts, codes)
patch = patches.PathPatch(
    barpath, facecolor = 'red', edgecolor = 'black', alpha = 0.5)
ax.add_patch(patch)
ax.set_xlim(left[0], right[-1])
ax.set_ylim(bottom.min(), 7)
ani = animation.FuncAnimation(fig, animate, frames = len(ani_data), repeat = False, blit = False, interval = 1000)
plt.title('Top10 Shooting Guard Freethrow Proportion per Season', size = 15)
plt.ylabel('Rate Proportion', size = 15)
plt.xlabel('Each Season(per slide)', size = 15)
plt.show()
<IPython.core.display.Javascript object>


Now, check Ray Allen's free-throw percentage for each season

In [31]:
Allen_freethrow = Allen_stat['FT%']
print(Allen_freethrow)
0     0.823
1     0.875
2     0.903
3     0.887
4     0.888
5     0.873
6     0.913
7     0.904
8     0.883
9     0.903
10    0.903
11    0.907
12    0.952
13    0.913
14    0.881
15    0.915
16    0.886
17    0.905
Name: FT%, dtype: float64


From the animation chart above, we can clearly see that his FT% falls in the 0.85 to 0.90 range, or even above 0.90.

Let's look at the actual overall average of each player and of Ray Allen, and finally check the gap between each player's percentage and Allen's.

In [32]:
print('10 Top SG Freethrow Average\n')
print(Top_SG_free.mean())


print('\n\nRay Allen Freethrow Average\n')
print(Allen_freethrow.mean())


print('\n\nNBA Freethrow Average\n')
print(NBA_stat['FT%'].mean())


print('\n\nAllen / NBA overall freethrow Difference\n')
gap = Allen_freethrow.mean() - NBA_stat['FT%'].mean()
print(gap)


print('\n\nAllen / Top 10 SG freethrow Difference\n')
Allen_freethrow.mean() - Top_SG_free.mean()
10 Top SG Freethrow Average

Mcgrady     0.744000
Allen       0.890667
Drexler     0.785667
Iverson     0.780750
Jordan      0.839083
Wade        0.763167
Miller      0.875000
Carter      0.787750
Bryant      0.835333
Ginobili    0.831917
dtype: float64


Ray Allen Freethrow Average

0.8952222222222223


NBA Freethrow Average

0.75385


Allen / NBA overall freethrow Difference

0.14137222222222223


Allen / Top 10 SG freethrow Difference

Out[32]:
Mcgrady     0.151222
Allen       0.004556
Drexler     0.109556
Iverson     0.114472
Jordan      0.056139
Wade        0.132056
Miller      0.020222
Carter      0.107472
Bryant      0.059889
Ginobili    0.063306
dtype: float64

Allen ranks 1st in free-throw success rate (0.895) among the top 10 shooting guards. His rate is also more than 14 percentage points higher than the NBA overall average.
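
The gap printed above is a difference between two rates, so it is measured in percentage points. As a rough sketch using only the two averages from the output above, the same gap can also be expressed relative to the NBA average (the variable names are only for illustration):

```python
allen_ft = 0.8952222222222223  # Allen's FT% mean, from the output above
nba_ft = 0.75385               # NBA FT% mean, from the output above

# Absolute gap: difference of the two rates, in percentage points.
gap_points = (allen_ft - nba_ft) * 100

# Relative gap: how much higher Allen's rate is as a share of the NBA rate.
gap_relative = (allen_ft / nba_ft - 1) * 100

print(f"{gap_points:.1f} percentage points")
print(f"{gap_relative:.1f} % relative to the NBA average")
```

The absolute gap is about 14.1 percentage points, while the relative gap is closer to 19%, so it helps to say which of the two is meant.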

Conclusion

Ray Allen shows clearly high rates on the standards used to judge a great shooting guard.


  • Shows up to an 11% higher 3-point field goal success rate

  • Ranks 1st in 3-point field goals made in all-time NBA history, reaching almost 3,000 makes

  • Maintains a high and steady 3-point percentage, averaging 0.401, even compared with the other top 10 greatest shooting guards in the NBA

  • Shows an extremely high free-throw percentage (0.895) compared with the NBA, and a clearly high rate in the top 10 SG comparison

  • More than 14 percentage points higher free-throw success rate than the NBA overall average


The hypothesis

'Ray Allen is the best shooter in NBA history'


is supported by the various data analyses and visualizations above.



The final simple animation is attached as a file.
If a black screen appears at the start, rewind once and the animation will play.


ani.mp4


*** If that doesn't work,

download the ani.mp4 file from https://github.com/kyjyeon/NBA-Ray-Allen-analysis
and play it!


The end


