MathJax.Hub.Config({ tex2jax: { inlineMath: [ ['$','$'], ["\\(","\\)"] ], displayMath: [ ['$$','$$'], ["\\[","\\]"] ], processEscapes: true, processEnvironments: true }, // Center justify equations in code and markdown cells. Elsewhere // we use CSS to left justify single line equations in code cells. displayAlign: 'center', "HTML-CSS": { styles: {'.MathJax_Display': {"margin": 0}}, linebreaks: { automatic: true } } });
In [1]:
#Import libraries
import pandas as pd
import numpy as np
from numpy import nan
pd.set_option('display.max_columns', None)
In [2]:
data=pd.read_excel('WK11Q2.xlsx')
data.head(20)
Out[2]:
ID DRG HCC Dr LOS
0 1 1 3 Peer 3
1 2 5 3 Peer 3
2 3 2 3 Clinician 3
3 4 1 1 Clinician 1
4 5 4 2 Peer 3
5 6 2 3 Clinician 3
6 7 8 3 Clinician 4
7 8 1 1 Peer 1
8 9 7 3 Clinician 4
9 10 6 3 Peer 4
10 11 10 1 Peer 4
11 12 2 3 Clinician 3
12 13 3 3 Peer 3
13 14 4 2 Peer 3
14 15 2 2 Clinician 2
15 16 5 1 Peer 2
16 17 4 3 Clinician 3
17 18 3 1 Clinician 2
18 19 5 2 Clinician 3
19 20 8 2 Clinician 4

The data used in this analysis consists of the following fields:

  • ID: Patients' ID. There are 151 patients.

  • HCC: Hierarchical Condition Category. HCC field contains 3 different values for low, medium, and high severity as 1,2,3.

  • Dr: indicates patients was cared for by clinician or peer group.

  • DRG: Diagnostic Related Groups. The DRG field contains 10 different values.

  • LOS: Length of Stay. Outcomes of care are in column LOS.

In [3]:
data.dtypes
Out[3]:
ID      int64
DRG     int64
HCC     int64
Dr     object
LOS     int64
dtype: object
In [4]:
###Create a binary column to convert clinician or peer group to 1/0
data['Clinician'] = np.where(data['Dr'] == 'Clinician',1,0)
data["Clinician"] = data["Clinician"].astype(float)
data['Peer'] = np.where(data['Dr']== 'Peer',1,0)
data["Peer"] = data["Peer"].astype(float)
data.dtypes
Out[4]:
ID             int64
DRG            int64
HCC            int64
Dr            object
LOS            int64
Clinician    float64
Peer         float64
dtype: object
In [5]:
data1 = data[data['Clinician']== 1].reset_index()
data1
Out[5]:
index ID DRG HCC Dr LOS Clinician Peer
0 2 3 2 3 Clinician 3 1.0 0.0
1 3 4 1 1 Clinician 1 1.0 0.0
2 5 6 2 3 Clinician 3 1.0 0.0
3 6 7 8 3 Clinician 4 1.0 0.0
4 8 9 7 3 Clinician 4 1.0 0.0
... ... ... ... ... ... ... ... ...
76 140 141 3 3 Clinician 3 1.0 0.0
77 141 142 10 1 Clinician 4 1.0 0.0
78 145 146 5 2 Clinician 3 1.0 0.0
79 146 147 7 3 Clinician 4 1.0 0.0
80 147 148 10 2 Clinician 4 1.0 0.0

81 rows × 8 columns

In [6]:
data2 = data[data['Peer']== 1].reset_index()
data2
Out[6]:
index ID DRG HCC Dr LOS Clinician Peer
0 0 1 1 3 Peer 3 0.0 1.0
1 1 2 5 3 Peer 3 0.0 1.0
2 4 5 4 2 Peer 3 0.0 1.0
3 7 8 1 1 Peer 1 0.0 1.0
4 9 10 6 3 Peer 4 0.0 1.0
... ... ... ... ... ... ... ... ...
65 143 144 10 3 Peer 5 0.0 1.0
66 144 145 9 3 Peer 4 0.0 1.0
67 148 149 9 1 Peer 4 0.0 1.0
68 149 150 2 1 Peer 1 0.0 1.0
69 150 151 3 1 Peer 3 0.0 1.0

70 rows × 8 columns

In [7]:
#Calculate pattern of care for Clinician group which satisfy the 
#conditions of HCC & DRG at the same time
NumberC= data1.groupby(['HCC','DRG'],as_index = True).size().reset_index(name='Numb_C')
groupC1 = pd.pivot_table(NumberC,
                       index=['HCC','DRG'],
                       values='Numb_C',                            
                       fill_value = 0,
                       dropna=False,
                       aggfunc=np.sum).reset_index()
groupC1
Out[7]:
HCC DRG Numb_C
0 1 1 2
1 1 2 2
2 1 3 3
3 1 4 1
4 1 5 2
5 1 6 5
6 1 7 2
7 1 8 2
8 1 9 1
9 1 10 1
10 2 1 2
11 2 2 5
12 2 3 7
13 2 4 4
14 2 5 2
15 2 6 3
16 2 7 3
17 2 8 2
18 2 9 3
19 2 10 4
20 3 1 3
21 3 2 5
22 3 3 3
23 3 4 2
24 3 5 0
25 3 6 3
26 3 7 6
27 3 8 1
28 3 9 0
29 3 10 2
In [8]:
LOSC=data1.groupby(['HCC','DRG'])['LOS'].max().reset_index(name='LOS_C')
groupC2 = pd.pivot_table(LOSC,
                       index=['HCC','DRG'],
                       values='LOS_C',                            
                       fill_value = 0,
                       dropna=False,
                       aggfunc=np.sum).reset_index()
groupC2
Out[8]:
HCC DRG LOS_C
0 1 1 1
1 1 2 1
2 1 3 2
3 1 4 2
4 1 5 2
5 1 6 3
6 1 7 3
7 1 8 3
8 1 9 4
9 1 10 4
10 2 1 2
11 2 2 2
12 2 3 2
13 2 4 3
14 2 5 3
15 2 6 3
16 2 7 3
17 2 8 4
18 2 9 4
19 2 10 4
20 3 1 3
21 3 2 3
22 3 3 3
23 3 4 3
24 3 5 0
25 3 6 4
26 3 7 4
27 3 8 4
28 3 9 0
29 3 10 5
In [9]:
#Calculate pattern of care for Peer group which satisfy the conditions 
#of HCC & DRG at the same time
NumberP= data2.groupby(['HCC','DRG'],as_index = True).size().reset_index(name='Numb_P')
groupP1 = pd.pivot_table(NumberP,
                       index=['HCC','DRG'],
                       values='Numb_P',                            
                       fill_value = 0,
                       dropna=False,
                       aggfunc=np.sum).reset_index()
groupP1
Out[9]:
HCC DRG Numb_P
0 1 1 2
1 1 2 2
2 1 3 1
3 1 4 1
4 1 5 2
5 1 6 2
6 1 7 2
7 1 8 2
8 1 9 7
9 1 10 8
10 2 1 1
11 2 2 0
12 2 3 2
13 2 4 8
14 2 5 1
15 2 6 0
16 2 7 1
17 2 8 2
18 2 9 0
19 2 10 0
20 3 1 3
21 3 2 1
22 3 3 1
23 3 4 2
24 3 5 3
25 3 6 4
26 3 7 2
27 3 8 1
28 3 9 5
29 3 10 4
In [10]:
LOSP=data2.groupby(['HCC','DRG'])['LOS'].max().reset_index(name='LOS_P')
groupP2 = pd.pivot_table(LOSP,
                       index=['HCC','DRG'],
                       values='LOS_P',                            
                       fill_value = 0,
                       dropna=False,
                       aggfunc=np.sum).reset_index()
groupP2
Out[10]:
HCC DRG LOS_P
0 1 1 1
1 1 2 1
2 1 3 3
3 1 4 2
4 1 5 2
5 1 6 3
6 1 7 3
7 1 8 3
8 1 9 4
9 1 10 4
10 2 1 2
11 2 2 0
12 2 3 2
13 2 4 3
14 2 5 3
15 2 6 0
16 2 7 3
17 2 8 4
18 2 9 0
19 2 10 0
20 3 1 3
21 3 2 3
22 3 3 3
23 3 4 3
24 3 5 3
25 3 6 4
26 3 7 4
27 3 8 4
28 3 9 4
29 3 10 5
In [11]:
groupC_final = pd.merge(groupC1, groupC2, how='left',on=['HCC','DRG']).reset_index()
groupC_final = groupC_final.drop(columns = 'index', axis = 1)
groupC_final
Out[11]:
HCC DRG Numb_C LOS_C
0 1 1 2 1
1 1 2 2 1
2 1 3 3 2
3 1 4 1 2
4 1 5 2 2
5 1 6 5 3
6 1 7 2 3
7 1 8 2 3
8 1 9 1 4
9 1 10 1 4
10 2 1 2 2
11 2 2 5 2
12 2 3 7 2
13 2 4 4 3
14 2 5 2 3
15 2 6 3 3
16 2 7 3 3
17 2 8 2 4
18 2 9 3 4
19 2 10 4 4
20 3 1 3 3
21 3 2 5 3
22 3 3 3 3
23 3 4 2 3
24 3 5 0 0
25 3 6 3 4
26 3 7 6 4
27 3 8 1 4
28 3 9 0 0
29 3 10 2 5
In [12]:
groupP_final = pd.merge(groupP1, groupP2, how='left',on=['HCC','DRG']).reset_index()
groupP_final = groupP_final.drop(columns = 'index', axis = 1)
groupP_final
Out[12]:
HCC DRG Numb_P LOS_P
0 1 1 2 1
1 1 2 2 1
2 1 3 1 3
3 1 4 1 2
4 1 5 2 2
5 1 6 2 3
6 1 7 2 3
7 1 8 2 3
8 1 9 7 4
9 1 10 8 4
10 2 1 1 2
11 2 2 0 0
12 2 3 2 2
13 2 4 8 3
14 2 5 1 3
15 2 6 0 0
16 2 7 1 3
17 2 8 2 4
18 2 9 0 0
19 2 10 0 0
20 3 1 3 3
21 3 2 1 3
22 3 3 1 3
23 3 4 2 3
24 3 5 3 3
25 3 6 4 4
26 3 7 2 4
27 3 8 1 4
28 3 9 5 4
29 3 10 4 5
In [13]:
###Match clinicians and peer group on common strata
df = pd.merge(groupC_final, groupP_final, how='left',on=['HCC','DRG']).reset_index()
df = df.drop(columns = 'index', axis = 1)
df
Out[13]:
HCC DRG Numb_C LOS_C Numb_P LOS_P
0 1 1 2 1 2 1
1 1 2 2 1 2 1
2 1 3 3 2 1 3
3 1 4 1 2 1 2
4 1 5 2 2 2 2
5 1 6 5 3 2 3
6 1 7 2 3 2 3
7 1 8 2 3 2 3
8 1 9 1 4 7 4
9 1 10 1 4 8 4
10 2 1 2 2 1 2
11 2 2 5 2 0 0
12 2 3 7 2 2 2
13 2 4 4 3 8 3
14 2 5 2 3 1 3
15 2 6 3 3 0 0
16 2 7 3 3 1 3
17 2 8 2 4 2 4
18 2 9 3 4 0 0
19 2 10 4 4 0 0
20 3 1 3 3 3 3
21 3 2 5 3 1 3
22 3 3 3 3 1 3
23 3 4 2 3 2 3
24 3 5 0 0 3 3
25 3 6 3 4 4 4
26 3 7 6 4 2 4
27 3 8 1 4 1 4
28 3 9 0 0 5 4
29 3 10 2 5 4 5
In [14]:
#Calculate Total Number of patients for Clinician and Peer group 
df['Total_C'] = data1['Clinician'].count()
df['Total_P'] = data2['Peer'].count()
#Calculate Probability for Clinician and Peer group 
df['Prob_C']=df['Numb_C']/df['Total_C']
df['Prob_P']=df['Numb_P']/df['Total_P']
#Solve for 'Prob_C *CASE WHEN LOS_C is null 
df['ClinicianLOS'] = (df['Prob_C']) * df['LOS_C']
#Solve for 'Prob_C*LOS_P'
df['PeerLOS'] = (df['Prob_P']) * df['LOS_P']
df
Out[14]:
HCC DRG Numb_C LOS_C Numb_P LOS_P Total_C Total_P Prob_C Prob_P ClinicianLOS PeerLOS
0 1 1 2 1 2 1 81 70 0.024691 0.028571 0.024691 0.028571
1 1 2 2 1 2 1 81 70 0.024691 0.028571 0.024691 0.028571
2 1 3 3 2 1 3 81 70 0.037037 0.014286 0.074074 0.042857
3 1 4 1 2 1 2 81 70 0.012346 0.014286 0.024691 0.028571
4 1 5 2 2 2 2 81 70 0.024691 0.028571 0.049383 0.057143
5 1 6 5 3 2 3 81 70 0.061728 0.028571 0.185185 0.085714
6 1 7 2 3 2 3 81 70 0.024691 0.028571 0.074074 0.085714
7 1 8 2 3 2 3 81 70 0.024691 0.028571 0.074074 0.085714
8 1 9 1 4 7 4 81 70 0.012346 0.100000 0.049383 0.400000
9 1 10 1 4 8 4 81 70 0.012346 0.114286 0.049383 0.457143
10 2 1 2 2 1 2 81 70 0.024691 0.014286 0.049383 0.028571
11 2 2 5 2 0 0 81 70 0.061728 0.000000 0.123457 0.000000
12 2 3 7 2 2 2 81 70 0.086420 0.028571 0.172840 0.057143
13 2 4 4 3 8 3 81 70 0.049383 0.114286 0.148148 0.342857
14 2 5 2 3 1 3 81 70 0.024691 0.014286 0.074074 0.042857
15 2 6 3 3 0 0 81 70 0.037037 0.000000 0.111111 0.000000
16 2 7 3 3 1 3 81 70 0.037037 0.014286 0.111111 0.042857
17 2 8 2 4 2 4 81 70 0.024691 0.028571 0.098765 0.114286
18 2 9 3 4 0 0 81 70 0.037037 0.000000 0.148148 0.000000
19 2 10 4 4 0 0 81 70 0.049383 0.000000 0.197531 0.000000
20 3 1 3 3 3 3 81 70 0.037037 0.042857 0.111111 0.128571
21 3 2 5 3 1 3 81 70 0.061728 0.014286 0.185185 0.042857
22 3 3 3 3 1 3 81 70 0.037037 0.014286 0.111111 0.042857
23 3 4 2 3 2 3 81 70 0.024691 0.028571 0.074074 0.085714
24 3 5 0 0 3 3 81 70 0.000000 0.042857 0.000000 0.128571
25 3 6 3 4 4 4 81 70 0.037037 0.057143 0.148148 0.228571
26 3 7 6 4 2 4 81 70 0.074074 0.028571 0.296296 0.114286
27 3 8 1 4 1 4 81 70 0.012346 0.014286 0.049383 0.057143
28 3 9 0 0 5 4 81 70 0.000000 0.071429 0.000000 0.285714
29 3 10 2 5 4 5 81 70 0.024691 0.057143 0.123457 0.285714
In [15]:
###Does not matter if outcomes for clinician is null,
###but peer group's null values require synthetic control calculations.
Null_P=df[df['LOS_P'].isin(['0'])]
Null_P
Out[15]:
HCC DRG Numb_C LOS_C Numb_P LOS_P Total_C Total_P Prob_C Prob_P ClinicianLOS PeerLOS
11 2 2 5 2 0 0 81 70 0.061728 0.0 0.123457 0.0
15 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0
18 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0
19 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0
In [16]:
###df['Total_C'] = data1['Clinician'].count() = 81
###Overlap between peer and clinician cases
Sum_Null=Null_P['Numb_C'].sum()
SumOverlap = (81 - Sum_Null)/df['Numb_C'].sum() *100
SumOverlap
Out[16]:
81.48148148148148
In [17]:
df1 = pd.merge(df, data, on =['HCC'])
MergeonHCC = df1[df1['Peer']== 1]
MergeonHCC
Out[17]:
HCC DRG_x Numb_C LOS_C Numb_P LOS_P Total_C Total_P Prob_C Prob_P ClinicianLOS PeerLOS ID DRG_y Dr LOS Clinician Peer
1 1 1 2 1 2 1 81 70 0.024691 0.028571 0.024691 0.028571 8 1 Peer 1 0.0 1.0
2 1 1 2 1 2 1 81 70 0.024691 0.028571 0.024691 0.028571 11 10 Peer 4 0.0 1.0
3 1 1 2 1 2 1 81 70 0.024691 0.028571 0.024691 0.028571 16 5 Peer 2 0.0 1.0
5 1 1 2 1 2 1 81 70 0.024691 0.028571 0.024691 0.028571 25 10 Peer 4 0.0 1.0
6 1 1 2 1 2 1 81 70 0.024691 0.028571 0.024691 0.028571 26 9 Peer 4 0.0 1.0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
1502 3 10 2 5 4 5 81 70 0.024691 0.057143 0.123457 0.285714 130 9 Peer 4 0.0 1.0
1503 3 10 2 5 4 5 81 70 0.024691 0.057143 0.123457 0.285714 132 10 Peer 5 0.0 1.0
1504 3 10 2 5 4 5 81 70 0.024691 0.057143 0.123457 0.285714 135 6 Peer 4 0.0 1.0
1507 3 10 2 5 4 5 81 70 0.024691 0.057143 0.123457 0.285714 144 10 Peer 5 0.0 1.0
1508 3 10 2 5 4 5 81 70 0.024691 0.057143 0.123457 0.285714 145 9 Peer 4 0.0 1.0

700 rows × 18 columns

In [18]:
df2 = pd.merge(df, data, on =['DRG'])
MergeonDRG = df2[df2['Peer']== 1]
MergeonDRG
Out[18]:
HCC_x DRG Numb_C LOS_C Numb_P LOS_P Total_C Total_P Prob_C Prob_P ClinicianLOS PeerLOS ID HCC_y Dr LOS Clinician Peer
0 1 1 2 1 2 1 81 70 0.024691 0.028571 0.024691 0.028571 1 3 Peer 3 0.0 1.0
2 1 1 2 1 2 1 81 70 0.024691 0.028571 0.024691 0.028571 8 1 Peer 1 0.0 1.0
4 1 1 2 1 2 1 81 70 0.024691 0.028571 0.024691 0.028571 31 3 Peer 3 0.0 1.0
5 1 1 2 1 2 1 81 70 0.024691 0.028571 0.024691 0.028571 33 2 Peer 2 0.0 1.0
6 1 1 2 1 2 1 81 70 0.024691 0.028571 0.024691 0.028571 34 3 Peer 3 0.0 1.0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
443 3 10 2 5 4 5 81 70 0.024691 0.057143 0.123457 0.285714 85 3 Peer 5 0.0 1.0
445 3 10 2 5 4 5 81 70 0.024691 0.057143 0.123457 0.285714 100 1 Peer 4 0.0 1.0
447 3 10 2 5 4 5 81 70 0.024691 0.057143 0.123457 0.285714 124 1 Peer 4 0.0 1.0
449 3 10 2 5 4 5 81 70 0.024691 0.057143 0.123457 0.285714 132 3 Peer 5 0.0 1.0
451 3 10 2 5 4 5 81 70 0.024691 0.057143 0.123457 0.285714 144 3 Peer 5 0.0 1.0

210 rows × 18 columns

In [19]:
df3=MergeonHCC[MergeonHCC['LOS_P'].isin(['0'])]
df3
Out[19]:
HCC DRG_x Numb_C LOS_C Numb_P LOS_P Total_C Total_P Prob_C Prob_P ClinicianLOS PeerLOS ID DRG_y Dr LOS Clinician Peer
550 2 2 5 2 0 0 81 70 0.061728 0.0 0.123457 0.0 5 4 Peer 3 0.0 1.0
551 2 2 5 2 0 0 81 70 0.061728 0.0 0.123457 0.0 14 4 Peer 3 0.0 1.0
556 2 2 5 2 0 0 81 70 0.061728 0.0 0.123457 0.0 24 4 Peer 3 0.0 1.0
559 2 2 5 2 0 0 81 70 0.061728 0.0 0.123457 0.0 33 1 Peer 2 0.0 1.0
560 2 2 5 2 0 0 81 70 0.061728 0.0 0.123457 0.0 37 8 Peer 4 0.0 1.0
565 2 2 5 2 0 0 81 70 0.061728 0.0 0.123457 0.0 45 4 Peer 3 0.0 1.0
566 2 2 5 2 0 0 81 70 0.061728 0.0 0.123457 0.0 46 4 Peer 3 0.0 1.0
567 2 2 5 2 0 0 81 70 0.061728 0.0 0.123457 0.0 50 4 Peer 3 0.0 1.0
571 2 2 5 2 0 0 81 70 0.061728 0.0 0.123457 0.0 61 5 Peer 3 0.0 1.0
573 2 2 5 2 0 0 81 70 0.061728 0.0 0.123457 0.0 69 3 Peer 2 0.0 1.0
583 2 2 5 2 0 0 81 70 0.061728 0.0 0.123457 0.0 106 3 Peer 2 0.0 1.0
584 2 2 5 2 0 0 81 70 0.061728 0.0 0.123457 0.0 111 7 Peer 3 0.0 1.0
585 2 2 5 2 0 0 81 70 0.061728 0.0 0.123457 0.0 117 8 Peer 4 0.0 1.0
596 2 2 5 2 0 0 81 70 0.061728 0.0 0.123457 0.0 138 4 Peer 3 0.0 1.0
597 2 2 5 2 0 0 81 70 0.061728 0.0 0.123457 0.0 143 4 Peer 3 0.0 1.0
750 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0 5 4 Peer 3 0.0 1.0
751 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0 14 4 Peer 3 0.0 1.0
756 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0 24 4 Peer 3 0.0 1.0
759 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0 33 1 Peer 2 0.0 1.0
760 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0 37 8 Peer 4 0.0 1.0
765 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0 45 4 Peer 3 0.0 1.0
766 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0 46 4 Peer 3 0.0 1.0
767 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0 50 4 Peer 3 0.0 1.0
771 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0 61 5 Peer 3 0.0 1.0
773 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0 69 3 Peer 2 0.0 1.0
783 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0 106 3 Peer 2 0.0 1.0
784 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0 111 7 Peer 3 0.0 1.0
785 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0 117 8 Peer 4 0.0 1.0
796 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0 138 4 Peer 3 0.0 1.0
797 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0 143 4 Peer 3 0.0 1.0
900 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 5 4 Peer 3 0.0 1.0
901 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 14 4 Peer 3 0.0 1.0
906 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 24 4 Peer 3 0.0 1.0
909 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 33 1 Peer 2 0.0 1.0
910 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 37 8 Peer 4 0.0 1.0
915 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 45 4 Peer 3 0.0 1.0
916 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 46 4 Peer 3 0.0 1.0
917 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 50 4 Peer 3 0.0 1.0
921 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 61 5 Peer 3 0.0 1.0
923 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 69 3 Peer 2 0.0 1.0
933 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 106 3 Peer 2 0.0 1.0
934 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 111 7 Peer 3 0.0 1.0
935 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 117 8 Peer 4 0.0 1.0
946 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 138 4 Peer 3 0.0 1.0
947 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 143 4 Peer 3 0.0 1.0
950 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 5 4 Peer 3 0.0 1.0
951 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 14 4 Peer 3 0.0 1.0
956 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 24 4 Peer 3 0.0 1.0
959 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 33 1 Peer 2 0.0 1.0
960 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 37 8 Peer 4 0.0 1.0
965 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 45 4 Peer 3 0.0 1.0
966 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 46 4 Peer 3 0.0 1.0
967 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 50 4 Peer 3 0.0 1.0
971 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 61 5 Peer 3 0.0 1.0
973 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 69 3 Peer 2 0.0 1.0
983 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 106 3 Peer 2 0.0 1.0
984 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 111 7 Peer 3 0.0 1.0
985 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 117 8 Peer 4 0.0 1.0
996 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 138 4 Peer 3 0.0 1.0
997 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 143 4 Peer 3 0.0 1.0
In [20]:
df4=MergeonDRG[MergeonDRG['LOS_P'].isin(['0'])]
df4
Out[20]:
HCC_x DRG Numb_C LOS_C Numb_P LOS_P Total_C Total_P Prob_C Prob_P ClinicianLOS PeerLOS ID HCC_y Dr LOS Clinician Peer
60 2 2 5 2 0 0 81 70 0.061728 0.0 0.123457 0.0 66 3 Peer 3 0.0 1.0
64 2 2 5 2 0 0 81 70 0.061728 0.0 0.123457 0.0 103 1 Peer 1 0.0 1.0
68 2 2 5 2 0 0 81 70 0.061728 0.0 0.123457 0.0 150 1 Peer 1 0.0 1.0
236 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0 10 3 Peer 4 0.0 1.0
240 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0 48 1 Peer 3 0.0 1.0
243 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0 59 3 Peer 4 0.0 1.0
244 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0 63 3 Peer 4 0.0 1.0
249 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0 109 1 Peer 3 0.0 1.0
252 2 6 3 3 0 0 81 70 0.037037 0.0 0.111111 0.0 135 3 Peer 4 0.0 1.0
364 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 23 3 Peer 4 0.0 1.0
365 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 26 1 Peer 4 0.0 1.0
366 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 38 1 Peer 4 0.0 1.0
367 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 47 3 Peer 4 0.0 1.0
368 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 56 1 Peer 4 0.0 1.0
369 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 58 1 Peer 4 0.0 1.0
370 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 81 3 Peer 4 0.0 1.0
372 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 97 1 Peer 4 0.0 1.0
373 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 115 1 Peer 4 0.0 1.0
377 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 130 3 Peer 4 0.0 1.0
378 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 145 3 Peer 4 0.0 1.0
379 2 9 3 4 0 0 81 70 0.037037 0.0 0.148148 0.0 149 1 Peer 4 0.0 1.0
415 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 11 1 Peer 4 0.0 1.0
417 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 25 1 Peer 4 0.0 1.0
418 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 32 1 Peer 4 0.0 1.0
419 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 49 1 Peer 4 0.0 1.0
420 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 55 1 Peer 4 0.0 1.0
421 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 60 3 Peer 5 0.0 1.0
422 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 62 1 Peer 4 0.0 1.0
424 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 85 3 Peer 5 0.0 1.0
426 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 100 1 Peer 4 0.0 1.0
428 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 124 1 Peer 4 0.0 1.0
430 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 132 3 Peer 5 0.0 1.0
432 2 10 4 4 0 0 81 70 0.049383 0.0 0.197531 0.0 144 3 Peer 5 0.0 1.0
In [21]:
###Average for a marginal
LOS1=df3['LOS'].mean()
LOS1
Out[21]:
2.933333333333333
In [22]:
###Average for complement marginal
LOS2=df4['LOS'].mean()
LOS2
Out[22]:
3.8484848484848486
In [23]:
LOS3=data2['LOS'].mean()
LOS3
Out[23]:
3.3285714285714287
In [24]:
###Average for entire set
SUMLOSPAVG= LOS1 * LOS2 / LOS3
print(SUMLOSPAVG)
3.3915116833571766
In [25]:
### Select  is not null
df['LOS_P'] = df['LOS_P'].replace(0, nan)
df['LOS_P']=df['LOS_P'].fillna(value=SUMLOSPAVG)
df['LOS_P']
Out[25]:
0     1.000000
1     1.000000
2     3.000000
3     2.000000
4     2.000000
5     3.000000
6     3.000000
7     3.000000
8     4.000000
9     4.000000
10    2.000000
11    3.391512
12    2.000000
13    3.000000
14    3.000000
15    3.391512
16    3.000000
17    4.000000
18    3.391512
19    3.391512
20    3.000000
21    3.000000
22    3.000000
23    3.000000
24    3.000000
25    4.000000
26    4.000000
27    4.000000
28    4.000000
29    5.000000
Name: LOS_P, dtype: float64
In [26]:
###Switch probabilities of peer group to clinician
df['Prob_P'] = df['Prob_C'].values
df['Prob_P']
Out[26]:
0     0.024691
1     0.024691
2     0.037037
3     0.012346
4     0.024691
5     0.061728
6     0.024691
7     0.024691
8     0.012346
9     0.012346
10    0.024691
11    0.061728
12    0.086420
13    0.049383
14    0.024691
15    0.037037
16    0.037037
17    0.024691
18    0.037037
19    0.049383
20    0.037037
21    0.061728
22    0.037037
23    0.024691
24    0.000000
25    0.037037
26    0.074074
27    0.012346
28    0.000000
29    0.024691
Name: Prob_P, dtype: float64
In [27]:
###Second Overlap between peer and clinician cases
Null_P2=df[df['LOS_P'].isin(['0'])]
Sum_Null2=Null_P2['Numb_C'].sum()
SumOverlap2 = (81 - Sum_Null2)/df['Numb_C'].sum() *100
SumOverlap2
Out[27]:
100.0
In [28]:
#Solve for 'Prob_C*LOS_C'
df['PeerLOS'] = (df['Prob_P']) * df['LOS_P']
df
Out[28]:
HCC DRG Numb_C LOS_C Numb_P LOS_P Total_C Total_P Prob_C Prob_P ClinicianLOS PeerLOS
0 1 1 2 1 2 1.000000 81 70 0.024691 0.024691 0.024691 0.024691
1 1 2 2 1 2 1.000000 81 70 0.024691 0.024691 0.024691 0.024691
2 1 3 3 2 1 3.000000 81 70 0.037037 0.037037 0.074074 0.111111
3 1 4 1 2 1 2.000000 81 70 0.012346 0.012346 0.024691 0.024691
4 1 5 2 2 2 2.000000 81 70 0.024691 0.024691 0.049383 0.049383
5 1 6 5 3 2 3.000000 81 70 0.061728 0.061728 0.185185 0.185185
6 1 7 2 3 2 3.000000 81 70 0.024691 0.024691 0.074074 0.074074
7 1 8 2 3 2 3.000000 81 70 0.024691 0.024691 0.074074 0.074074
8 1 9 1 4 7 4.000000 81 70 0.012346 0.012346 0.049383 0.049383
9 1 10 1 4 8 4.000000 81 70 0.012346 0.012346 0.049383 0.049383
10 2 1 2 2 1 2.000000 81 70 0.024691 0.024691 0.049383 0.049383
11 2 2 5 2 0 3.391512 81 70 0.061728 0.061728 0.123457 0.209353
12 2 3 7 2 2 2.000000 81 70 0.086420 0.086420 0.172840 0.172840
13 2 4 4 3 8 3.000000 81 70 0.049383 0.049383 0.148148 0.148148
14 2 5 2 3 1 3.000000 81 70 0.024691 0.024691 0.074074 0.074074
15 2 6 3 3 0 3.391512 81 70 0.037037 0.037037 0.111111 0.125612
16 2 7 3 3 1 3.000000 81 70 0.037037 0.037037 0.111111 0.111111
17 2 8 2 4 2 4.000000 81 70 0.024691 0.024691 0.098765 0.098765
18 2 9 3 4 0 3.391512 81 70 0.037037 0.037037 0.148148 0.125612
19 2 10 4 4 0 3.391512 81 70 0.049383 0.049383 0.197531 0.167482
20 3 1 3 3 3 3.000000 81 70 0.037037 0.037037 0.111111 0.111111
21 3 2 5 3 1 3.000000 81 70 0.061728 0.061728 0.185185 0.185185
22 3 3 3 3 1 3.000000 81 70 0.037037 0.037037 0.111111 0.111111
23 3 4 2 3 2 3.000000 81 70 0.024691 0.024691 0.074074 0.074074
24 3 5 0 0 3 3.000000 81 70 0.000000 0.000000 0.000000 0.000000
25 3 6 3 4 4 4.000000 81 70 0.037037 0.037037 0.148148 0.148148
26 3 7 6 4 2 4.000000 81 70 0.074074 0.074074 0.296296 0.296296
27 3 8 1 4 1 4.000000 81 70 0.012346 0.012346 0.049383 0.049383
28 3 9 0 0 5 4.000000 81 70 0.000000 0.000000 0.000000 0.000000
29 3 10 2 5 4 5.000000 81 70 0.024691 0.024691 0.123457 0.123457
In [29]:
#continue solving the code
SumClinicianLOS = df['ClinicianLOS'].sum()
print(SumClinicianLOS)
SumPeerLOS = df['PeerLOS'].sum()
print(SumPeerLOS)
2.9629629629629624
3.0478108055599704
In [30]:
PercentMoreEfficient = (SumPeerLOS-SumClinicianLOS)*100/SumPeerLOS
print(PercentMoreEfficient)
2.7838946709626575

Conclusion

  • According to the results, LOS is more in the peer group (3.05) as compared to the clinician (2.96).
  • Clinician results for LOS are 2.78% more efficent than peer group.
In [ ]: