%matplotlib inline
import pandas as pd
import seaborn as sns
!wget -P data/ https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2016/06/NHS-111-from-Aug10-to-Nov16-web-file.csv
--2017-01-14 12:36:01-- https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2016/06/NHS-111-from-Aug10-to-Nov16-web-file.csv Resolving www.england.nhs.uk... 54.240.166.101, 54.240.166.61, 54.240.166.184, ... Connecting to www.england.nhs.uk|54.240.166.101|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 509128 (497K) [text/csv] Saving to: 'data/NHS-111-from-Aug10-to-Nov16-web-file.csv' NHS-111-from-Aug10- 100%[=====================>] 497.20K 1.49MB/s in 0.3s 2017-01-14 12:36:01 (1.49 MB/s) - 'data/NHS-111-from-Aug10-to-Nov16-web-file.csv' saved [509128/509128]
#Detailed Excel data files (require parsers...)
#MDS-Web-File-National-November-2016.xlsx
#MDS-Web-File-North-November-2016.xlsx
#MDS-Web-File-London-November-2016.xlsx
#MDS-Web-File-Midlands-East-England-November-2016.xlsx
#!wget -P data/ https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2016/06/MDS-Web-File-South-November-2016.xlsx
df=pd.read_csv('data/NHS-111-from-Aug10-to-Nov16-web-file.csv',skiprows=4)
df.head()
Yearnumber | Periodname | Provider Org Code | Provider Name | Commissioner Org Code | Commissioner Org Name | Nhs1 Population SUM | Nhs1 Number Calls Offered SUM | Nhs1 Calls Through 111 SUM | Nhs1 Calls Through Other SUM | ... | Nhs1 Recommend To Other SUM | Nhs1 Not Recommend SUM | Nhs1 Given Health Information SUM | Nhs1 Recommended Home Care SUM | Nhs1 Recommended Non Clinical SUM | Nhs1 Time Call Handlers SUM | Nhs1 Time Clinical Staff SUM | Nhs1 Cost Call Handlers SUM | Nhs1 Cost Clinical Staff SUM | Nhs1 Total Other Costs SUM | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2010-11 | Dec-10 | 111AA4 | NOTTINGHAM CITY NHS 111 | 5EM | Nottingham City PCT | 299753 | 5782 | 1317.0 | 4465.0 | ... | 161 | 1034 | 164.0 | 509 | 361.0 | 33143.06667 | 17844.93333 | NCA | NCA | NCA |
1 | 2010-11 | Jan-11 | 111AA4 | NOTTINGHAM CITY NHS 111 | 5EM | Nottingham City PCT | 303899 | 5865 | 1280.0 | 4585.0 | ... | 177 | 1236 | 216.0 | 389 | 631.0 | 28714.61667 | 17309.8 | NCA | NCA | NCA |
2 | 2010-11 | Feb-11 | 111AA4 | NOTTINGHAM CITY NHS 111 | 5EM | Nottingham City PCT | 303899 | 3567 | 932.0 | 2635.0 | ... | 121 | 801 | 120.0 | 257 | 424.0 | 17761.16667 | 13719.93333 | NCA | NCA | NCA |
3 | 2010-11 | Mar-11 | 111AA4 | NOTTINGHAM CITY NHS 111 | 5EM | Nottingham City PCT | 303899 | 4942 | 1687.0 | 3255.0 | ... | 160 | 1160 | 201.0 | 386 | 573.0 | 23035.85 | 17147.86667 | NCA | NCA | NCA |
4 | 2011-12 | Apr-11 | 111AA4 | NOTTINGHAM CITY NHS 111 | 5EM | Nottingham City PCT | 303899 | 5797 | 1596.0 | 4201.0 | ... | 168 | 1377 | 204.0 | 410 | 763.0 | 33396 | 19839 | NCA | NCA | NCA |
5 rows × 42 columns
df.columns
Index(['Yearnumber', 'Periodname', 'Provider Org Code', 'Provider Name', 'Commissioner Org Code', 'Commissioner Org Name', 'Nhs1 Population SUM', 'Nhs1 Number Calls Offered SUM', 'Nhs1 Calls Through 111 SUM', 'Nhs1 Calls Through Other SUM', 'Nhs1 Abandoned Calls SUM', 'Nhs1 Answered Calls SUM', 'Nhs1 Answered Through 111 SUM', 'Nhs1 Answered Through Other SUM', 'Nhs1 Answered 60sec SUM', 'Nhs1 Calls Person Triaged SUM', 'Nhs1 Cnt Caller Terminated SUM', 'Nhs1 Cnt Caller Referred SUM', 'Nhs1 Cnt Caller Given Info SUM', 'Nhs1 Cnt Other Reason SUM', 'Nhs1 Transferred To Ca SUM', 'Nhs1 Wtransferred To Ca SUM', 'Nhs1 Ave Wtransfer Time SUM', 'Nhs1 Person Offered Call Back SUM', 'Nhs1 Calls Back Within 10mins SUM', 'Nhs1 Ave Episode Length SUM', 'Nhs1 Amb Dispatches SUM', 'Nhs1 Recommend To Ae SUM', 'Nhs1 Recommend To Primcare SUM', 'Nhs1 Rec Contact Prim Com Care SUM', 'Nhs1 Rec Speak Prim Com Care SUM', 'Nhs1 Rec Dental Pharmacy SUM', 'Nhs1 Recommend To Other SUM', 'Nhs1 Not Recommend SUM', 'Nhs1 Given Health Information SUM', 'Nhs1 Recommended Home Care SUM', 'Nhs1 Recommended Non Clinical SUM', 'Nhs1 Time Call Handlers SUM', 'Nhs1 Time Clinical Staff SUM', 'Nhs1 Cost Call Handlers SUM', 'Nhs1 Cost Clinical Staff SUM', 'Nhs1 Total Other Costs SUM'], dtype='object')
df['_period']=pd.to_datetime(('01-'+df['Periodname']),format='%d-%b-%y').dt.to_period('M')
df.head()
Yearnumber | Periodname | Provider Org Code | Provider Name | Commissioner Org Code | Commissioner Org Name | Nhs1 Population SUM | Nhs1 Number Calls Offered SUM | Nhs1 Calls Through 111 SUM | Nhs1 Calls Through Other SUM | ... | Nhs1 Not Recommend SUM | Nhs1 Given Health Information SUM | Nhs1 Recommended Home Care SUM | Nhs1 Recommended Non Clinical SUM | Nhs1 Time Call Handlers SUM | Nhs1 Time Clinical Staff SUM | Nhs1 Cost Call Handlers SUM | Nhs1 Cost Clinical Staff SUM | Nhs1 Total Other Costs SUM | _period | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2010-11 | Dec-10 | 111AA4 | NOTTINGHAM CITY NHS 111 | 5EM | Nottingham City PCT | 299753 | 5782 | 1317.0 | 4465.0 | ... | 1034 | 164.0 | 509 | 361.0 | 33143.06667 | 17844.93333 | NCA | NCA | NCA | 2010-12 |
1 | 2010-11 | Jan-11 | 111AA4 | NOTTINGHAM CITY NHS 111 | 5EM | Nottingham City PCT | 303899 | 5865 | 1280.0 | 4585.0 | ... | 1236 | 216.0 | 389 | 631.0 | 28714.61667 | 17309.8 | NCA | NCA | NCA | 2011-01 |
2 | 2010-11 | Feb-11 | 111AA4 | NOTTINGHAM CITY NHS 111 | 5EM | Nottingham City PCT | 303899 | 3567 | 932.0 | 2635.0 | ... | 801 | 120.0 | 257 | 424.0 | 17761.16667 | 13719.93333 | NCA | NCA | NCA | 2011-02 |
3 | 2010-11 | Mar-11 | 111AA4 | NOTTINGHAM CITY NHS 111 | 5EM | Nottingham City PCT | 303899 | 4942 | 1687.0 | 3255.0 | ... | 1160 | 201.0 | 386 | 573.0 | 23035.85 | 17147.86667 | NCA | NCA | NCA | 2011-03 |
4 | 2011-12 | Apr-11 | 111AA4 | NOTTINGHAM CITY NHS 111 | 5EM | Nottingham City PCT | 303899 | 5797 | 1596.0 | 4201.0 | ... | 1377 | 204.0 | 410 | 763.0 | 33396 | 19839 | NCA | NCA | NCA | 2011-04 |
5 rows × 43 columns
df.groupby(['_period','Provider Name'])['Nhs1 Recommend To Ae SUM'].sum().plot()
<matplotlib.axes._subplots.AxesSubplot at 0x108438400>
df.groupby(['_period','Provider Name'])['Nhs1 Recommend To Ae SUM'].sum().unstack(1).fillna(0).plot()
<matplotlib.axes._subplots.AxesSubplot at 0x10b332588>
df.groupby(['_period'])['Nhs1 Recommend To Ae SUM'].sum().fillna(0).plot()
<matplotlib.axes._subplots.AxesSubplot at 0x10b64ef60>
cols=[c for c in df.columns if c.startswith('Nhs1 Rec')]
sns.set_palette("Set2", len(cols))
pd.melt(df,id_vars='_period',value_vars=cols
).groupby(['_period','variable']).sum().unstack(1).fillna(0).plot()
<matplotlib.axes._subplots.AxesSubplot at 0x10ca35eb8>
#A&E referrals: https://www.england.nhs.uk/statistics/statistical-work-areas/ae-waiting-times-and-activity/statistical-work-areasae-waiting-times-and-activityae-attendances-and-emergency-admissions-2016-17/
!wget -P data/ https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2016/06/Quarterly-timeseries-Nov-2016.xls
--2017-01-14 13:56:15-- https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2016/06/Quarterly-timeseries-Nov-2016.xls Resolving www.england.nhs.uk... 54.240.166.58, 54.240.166.39, 54.240.166.87, ... Connecting to www.england.nhs.uk|54.240.166.58|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 91648 (90K) [application/vnd.ms-excel] Saving to: 'data/Quarterly-timeseries-Nov-2016.xls' Quarterly-timeserie 100%[=====================>] 89.50K --.-KB/s in 0.03s 2017-01-14 13:56:16 (2.62 MB/s) - 'data/Quarterly-timeseries-Nov-2016.xls' saved [91648/91648]
df2=pd.read_excel?
df2=pd.read_excel('data/Quarterly-timeseries-Nov-2016.xls',skiprows=16,na_values='-').dropna(how='all')
df2['Year'].fillna(method='ffill',inplace=True)
#First group of cols are A&E attendances. I really need to sort out how to read multirow headers!
df2.head()
Unnamed: 0 | Area Team | Year | Quarter | Type 1 Departments - Major A&E | Type 2 Departments - Single Specialty | Type 3 Departments - Other A&E/Minor Injury Unit | Total attendances | Type 1 Departments - Major A&E.1 | Type 2 Departments - Single Specialty.1 | ... | Percentage in 4 hours or less (type 1) | Percentage in 4 hours or less (all) | Emergency Admissions via Type 1 A&E | Emergency Admissions via Type 2 A&E | Emergency Admissions via Type 3 and 4 A&E | Total Emergency admissions via A&E | Other Emergency admissions (i.e not via A&E) | Total emergency admissions | Number of patients spending >4 hours from decision to admit to admission | Number of patients spending >12 hours from decision to admit to admission | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | NaN | NaN | 2004-05 | Q1: April - June | 3377850.0 | 159966.0 | 964762.0 | 4502578.0 | 232110.0 | 2612.0 | ... | 0.931285 | 0.947135 | 651785.0 | 6778.0 | 7830.0 | 666393.0 | NaN | NaN | 27665.0 | NaN |
2 | NaN | NaN | 2004-05 | Q2: July - Sept | 3381219.0 | 156697.0 | 1018779.0 | 4556695.0 | 182554.0 | 1830.0 | ... | 0.946009 | 0.958960 | 673452.0 | 6670.0 | 8256.0 | 688378.0 | NaN | NaN | 20989.0 | NaN |
3 | NaN | NaN | 2004-05 | Q3: Oct - Dec | 3257398.0 | 150643.0 | 966886.0 | 4374927.0 | 136953.0 | 1415.0 | ... | 0.957956 | 0.968061 | 705901.0 | 6802.0 | 8691.0 | 721394.0 | NaN | NaN | 19831.0 | NaN |
4 | NaN | NaN | 2004-05 | Q4: Jan - Mar | 3249353.0 | 151926.0 | 1001701.0 | 4402980.0 | 115978.0 | 982.0 | ... | 0.964307 | 0.973280 | 724814.0 | 6975.0 | 5414.0 | 737203.0 | NaN | NaN | 24178.0 | NaN |
5 | NaN | NaN | 2005-06 | Q1: April - June | 3520931.0 | 165801.0 | 1172846.0 | 4859578.0 | 83845.0 | 742.0 | ... | 0.976187 | 0.982480 | 719644.0 | 7717.0 | 6048.0 | 733409.0 | NaN | NaN | 12991.0 | NaN |
5 rows × 22 columns
df2[['Type 1 Departments - Major A&E',
'Type 2 Departments - Single Specialty',
'Type 3 Departments - Other A&E/Minor Injury Unit']].plot();
df2['period']=df2['Year']+df2['Quarter']
ax=df2[['Type 1 Departments - Major A&E',
'Type 2 Departments - Single Specialty',
'Type 3 Departments - Other A&E/Minor Injury Unit']][:-4].plot(logy=True, xticks=df2.index, rot=90)
#ax.set_xticklabels(df2['period'])
ticks = ax.xaxis.get_ticklocs()
ax.xaxis.set_ticks(ticks[::8])
ax.xaxis.set_ticklabels(df2['period'][::8]);
df2["Year"].unique()
array(['2004-05', '-', '2005-06', '2006-07', '2007-08', '2008-09', '2009-10', '2010-11', '2011-12', '2012-13', '2013-14', '2014-15', '2015-16', '2016-17', nan], dtype=object)
df2[:-4].tail()
Unnamed: 0 | Area Team | Year | Quarter | Type 1 Departments - Major A&E | Type 2 Departments - Single Specialty | Type 3 Departments - Other A&E/Minor Injury Unit | Total attendances | Type 1 Departments - Major A&E.1 | Type 2 Departments - Single Specialty.1 | ... | Percentage in 4 hours or less (all) | Emergency Admissions via Type 1 A&E | Emergency Admissions via Type 2 A&E | Emergency Admissions via Type 3 and 4 A&E | Total Emergency admissions via A&E | Other Emergency admissions (i.e not via A&E) | Total emergency admissions | Number of patients spending >4 hours from decision to admit to admission | Number of patients spending >12 hours from decision to admit to admission | period | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
46 | NaN | NaN | 2015-16 | Q2: July - Sep | 3708943.0 | 153743.000000 | 1815327.0 | 5.678013e+06 | 319789.0 | 1130.0 | ... | 0.942399 | 1003270.0 | 3868.0 | 12113.0 | 1019251.0 | 374290.0 | 1393541.0 | 63188.0 | 73.0 | 2015-16Q2: July - Sep |
47 | NaN | NaN | 2015-16 | Q3: Oct - Dec | 3730654.0 | 144729.000000 | 1789612.0 | 5.664995e+06 | 470383.0 | 1345.0 | ... | 0.915334 | 1039186.0 | 3176.0 | 11859.0 | 1054221.0 | 389128.0 | 1443349.0 | 98691.0 | 116.0 | 2015-16Q3: Oct - Dec |
48 | NaN | NaN | 2015-16 | Q4: Jan - Mar | 3818750.0 | 145501.428571 | 1900998.0 | 5.865249e+06 | 694924.0 | 1641.0 | ... | 0.878743 | 1045021.0 | 3286.0 | 12326.0 | 1060633.0 | 381140.0 | 1441773.0 | 155277.0 | 695.0 | 2015-16Q4: Jan - Mar |
49 | NaN | NaN | 2016-17 | Q1: April - June | 3849762.0 | 154719.000000 | 1889452.0 | 5.893933e+06 | 561036.0 | 1682.0 | ... | 0.902890 | 1038795.0 | 3274.0 | 11484.0 | 1053553.0 | 380057.0 | 1433610.0 | 112135.0 | 387.0 | 2016-17Q1: April - June |
50 | NaN | NaN | 2016-17 | Q2: July - Sep | 3885494.0 | 157886.000000 | 1918063.0 | 5.961443e+06 | 547973.0 | 1511.0 | ... | 0.906234 | 1044748.0 | 3512.0 | 11395.0 | 1059655.0 | 377188.0 | 1436843.0 | 107350.0 | 255.0 | 2016-17Q2: July - Sep |
5 rows × 23 columns