/*****************************************************
SAS FILE THAT GENERATES FIGURE 1B-D AND FIGURE 2
Simonsohn, Nelson, Simmons (JEP:G 2013) "p-curve: a key to the file-drawer"

The code below starts by defining a macro, "pc()".
For a given sample size (n) and effect size (d) it reports power and the
proportion of significant p-values in each of 5 bins of p-values
(p<.01, .01<p<.02, etc.).
The macro uses variable &k to identify that calculation so that it can be
aggregated later. k is just a label.

After creating the macro, it is used for Figures 1b-d and Figure 2.
Figure 1e-h is created using another file (as it requires simulating data
rather than consulting t-distribution tables).

This version: 2013 04 24
Uri Simonsohn (uws@wharton.upenn.edu)
*****************************************************/;

*Macro starts here;
%macro pc(n,d,k);

  *Start datafile with entered n & d, name the file pc&k;
  data pc&k;
    n=&n;
    d=&d;
  run;

  *Compute noncentrality parameter (ncp) and degrees of freedom (df) for two-sample t-test;
  data pc&k;
    set pc&k;
    df=2*n-2;
    ncp=sqrt(n/2)*d;

    *Compute critical t-values for p=.01,.02....05;
    *note, because we deal with 2-sided test, for p<.05 we want 97.5% of t-values to be smaller than the critical one;
    q05=tinv(.975,df);
    q04=tinv(.98,df);
    q03=tinv(.985,df);
    q02=tinv(.99,df);
    q01=tinv(.995,df);

    *Compute proportion of t-values exceeding each cutoff, i.e., power at each significance level;
    power5=1-probt(q05,df,ncp);
    power4=1-probt(q04,df,ncp);
    power3=1-probt(q03,df,ncp);
    power2=1-probt(q02,df,ncp);
    power1=1-probt(q01,df,ncp);

    *Compute relative frequency in each bin;
    p1=power1/power5;           *proportion of p-values p<.01 among those p<.05;
    p2=(power2-power1)/power5;  *proportion of p-values .01<p<.02;
    p3=(power3-power2)/power5;  *proportion of p-values .02<p<.03;
    p4=(power4-power3)/power5;  *proportion of p-values .03<p<.04;
    p5=(power5-power4)/power5;  *proportion of p-values .04<p<.05;
  run;

%mend pc;

/* Figure 1b-d: [calls to pc() not preserved in this copy] */

* Figure 2a
  Compute expected p-curves for two-sample t-tests with n=10-100 per cell, with effect size d=.1 to 1;
%pc(n=10,d=.1,k=11);
%pc(n=20,d=.2,k=12);
%pc(n=30,d=.3,k=13);
%pc(n=40,d=.4,k=14);
%pc(n=50,d=.5,k=15);
%pc(n=60,d=.6,k=16);
%pc(n=70,d=.7,k=17);
%pc(n=80,d=.8,k=18);
%pc(n=90,d=.9,k=19);
%pc(n=100,d=1,k=20);

* Figure 2b
  Compute expected p-curves for two-sample t-tests with n=10-100 per cell, with effect size d=1 to .1;
%pc(n=10,d=1,k=21);
%pc(n=20,d=.9,k=22);
%pc(n=30,d=.8,k=23);
%pc(n=40,d=.7,k=24);
%pc(n=50,d=.6,k=25);
%pc(n=60,d=.5,k=26);
%pc(n=70,d=.4,k=27);
%pc(n=80,d=.3,k=28);
%pc(n=90,d=.2,k=29);
%pc(n=100,d=.1,k=30);

* Figure 2c
  Compute expected p-curves for two-sample t-tests with n=40 per cell, with effect size d=1 to .1;
%pc(n=40,d=1,k=31);
%pc(n=40,d=.9,k=32);
%pc(n=40,d=.8,k=33);
%pc(n=40,d=.7,k=34);
%pc(n=40,d=.6,k=35);
%pc(n=40,d=.5,k=36);
%pc(n=40,d=.4,k=37);
%pc(n=40,d=.3,k=38);
%pc(n=40,d=.2,k=39);
%pc(n=40,d=.1,k=40);

*Combine the 10 rows of each set, giving three datasets;
data f2_pos; set pc11-pc20; run;
data f2_neg; set pc21-pc30; run;
data f2_0;   set pc31-pc40; run;

*Show them to a human;
title "Fig 2a - r(n,d)>0";
proc means data=f2_pos mean;
  var p1-p5;
run;

*this is reported in the Figure 2 caption;
proc print data=f2_pos;
run;

title "Fig 2b - r(n,d)<0";
proc means data=f2_neg mean;
  var p1-p5;
run;

title "Fig 2c - r(n,d)=0";
proc means data=f2_0 mean;
  var p1-p5;
run;
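
/* Optional sanity check. This is an illustrative sketch, not part of the
   original analysis. With a true effect of zero (d=0) the noncentrality
   parameter is 0, so the macro should return a flat p-curve: each of
   p1-p5 should come out at about .20.
   Assumptions: the label k=99 is a hypothetical value chosen only to avoid
   clashing with the labels used above, and the bins p3-p5 are defined
   analogously to p1 and p2. */
%pc(n=20,d=0,k=99);

title "Check - d=0 should give a flat p-curve";
proc print data=pc99;
  var p1-p5;
run;
title; /* clear the title so it does not carry over to later output */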