spBVS: fast Bayesian variable selection for spatial Gaussian/Poisson regression

  1. Woznicki, S. A., Nejadhashemi, A. P., Ross, D. M., Zhang, Z., Wang, L. and Esfahanian, A. (2015). Ecohydrological model parameter selection for stream health evaluation. Science of the Total Environment. 511, 341-353.
  2. Herman, M., Nejadhashemi, A. P., Daneshvar, F., Ross, D. M., Woznicki, S. A., Zhang, Z. and Esfahanian, A. (2015). Optimization of Conservation Practice Implementation Strategies in the Context of Stream Health. Ecological Engineering. 84, 1-12.
  3. Woznicki, S. A., Nejadhashemi, A. P., Abouali, M., Herman, H. R., Esfahanian, E., Hamaamin, A. Y. and Zhang, Z. (2016). Ecohydrological Modeling for Large-scale Environmental Impact Assessment. Science of the Total Environment. 543, 274-286.
  4. Herman, M., Nejadhashemi, A. P., Daneshvar, F., Abouali, M., Ross, D. M., Woznicki, S. A. and Zhang, Z. (2016). Optimization of bioenergy crop selection and placement based on a stream health indicator using an evolutionary algorithm. Journal of Environmental Management. 181, 413-424.
  5. Li, Y., Bence, J. R. and Zhang, Z. (2016). Variable selection for the determination of factors related to Lake Whitefish net movement distance in Lake Huron. Manuscript in preparation.

Demo on spBVS with reproducible results

function [] = demo()
% This is a reproducible demo for using spBVS. In all simulation studies, the data are generated so that only the first q=5 of the p=30 candidate variables are truly important.

% =========== Example 1: Simulation study for nonspatial data
disp('Example 1: Run ordinary Bayesian variable selection for nonspatial data')

%--------------- Prepare your data: response Y, design matrix X without intercept, and
%spatial adjacency matrix W (optional)
N = 300; p = 30;
x.invtau2 = 1/0.64; %residual variance = 0.64
betas0 = [1, 0.5, -1, 0.35, -0.35];  q = numel(betas0);
betas = zeros([p,1]); betas(1:q) = betas0;
rng('default'); rng(20)
X = zscore(randn([N, p])); % does not include the intercept!
% In this example, we use no spatial information
x.gamma = 0;  W = 0;  %nonspatial data. W should be the N*N adjacency matrix for spatial data
Y = X*betas + randn([N,1])/sqrt(x.invtau2);
save('yourdata.mat','Y','X','W')

%---------------- run Bayesian variable selection
% number of MCMC runs; total iterations, burn-in period, and thinning interval (keep every thin-th draw) for each run
nchain = 3; niter = 1e4; burn = 5e3; thin = 5;
spatial = 0; % 1 if the spatial adjacency matrix is provided
% nchain MCMC runs. These can be parallelized, e.g. by submitting one job per chain on an HPC server (a sketch follows this loop)
for i = 1:nchain
    mainProBVS(i, nchain, niter, burn, thin, spatial); %save out*.mat file for each chain
end
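% Optional parallel sketch (an assumption: requires the Parallel Computing Toolbox; not run here).
% Each chain writes its own out*.mat file, so the nchain runs are independent:
%   parpool(nchain);                 % open a local pool, one worker per chain
%   parfor i = 1:nchain
%       mainProBVS(i, nchain, niter, burn, thin, spatial);
%   end
% Alternatively, submit one HPC job per chain index i.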

% summarize results. Full results are stored in paras*.mat
sumProBVS(spatial) % posterior summary
disp('The true important-variable index set is recovered, and all true parameter values are covered by the 95% credible intervals.')
Example 1: Run ordinary Bayesian variable selection for nonspatial data

MCMC Chain 1: 10000 iterations are done with elapsed time 0.36 minutes.
MCMC Chain 2: 10000 iterations are done with elapsed time 0.36 minutes.
MCMC Chain 3: 10000 iterations are done with elapsed time 0.36 minutes.

Posterior probability mass of the number of selected variables:
   Value    Count   Percent
      5     1426     47.53%   ------------- posterior mode
      6      966     32.20%
      7      418     13.93%
      8      149      4.97%
      9       32      1.07%
     10        8      0.27%
     11        1      0.03%
     
mean birth rate = 0.1191, death rate = 0.1404.
Posterior mean [lower, upper] 95% credible interval for noise level:
  0.6776  [0.5742, 0.8007]

Posterior mean [lower, upper] 95% credible interval for overall signal-to-noise ratio:
  0.6762  [0.2633, 1.6503]

Top variable index (marginal inclusion probability), posterior mean [lower, upper] 95% CI of effect
 1 (1.0000):   1.0524  [0.9598, 1.1447]
 2 (1.0000):   0.5117  [0.4132, 0.6055]
 3 (1.0000):   -0.9885  [-1.0855, -0.8925]
 4 (1.0000):   0.3561  [0.2616, 0.4484]
 5 (1.0000):   -0.3038  [-0.3973, -0.2070]
26 (0.0773):   0.0756  [-0.0154, 0.1690]
 9 (0.0630):   0.0722  [-0.0139, 0.1642]
21 (0.0550):   0.0684  [-0.0201, 0.1618]
29 (0.0520):   0.0439  [-0.0643, 0.1384]
25 (0.0470):   0.0263  [-0.0529, 0.1326]
17 (0.0453):   -0.0529  [-0.1417, 0.0384]
14 (0.0433):   0.0508  [-0.0449, 0.1604]
16 (0.0360):   -0.0403  [-0.1509, 0.0608]
13 (0.0343):   0.0356  [-0.0440, 0.1313]
 6 (0.0337):   -0.0516  [-0.1298, 0.0504]
 
The true important-variable index set is recovered, and all true parameter values are covered by the 95% credible intervals.
 
% =========== Example 2: Simulation study for spatial data
disp('Example 2: Run spatial BVS for spatial data')

% We simulate data over the N=49 areal units of the contiguous U.S., using the adjacency matrix stored in neighbor.txt
W = dlmread('neighbor.txt'); M = diag(sum(W,1)); N = size(W,1);
x.gamma = 0.6; % true spatial dependence
L = chol((M - x.gamma*W)\eye(N), 'lower'); % Cholesky factor of the proper CAR covariance (M - gamma*W)^(-1)
rng('default'); rng(40)
x.invtau2 = 1/0.16; %residual variance = 0.16
betas0 = [1, 0.5, -1, 0.35, -0.35];  q = numel(betas0);
p = 30; betas = zeros([p,1]); betas(1:q) = betas0;
rng('default'); rng(20)
X = zscore(randn([N, p])); % does not include the intercept!
Y = X*betas + L*randn([N,1])/sqrt(x.invtau2);
save('yourdata.mat','Y','X','W')

%---------------- run Bayesian variable selection
% number of MCMC runs; total iterations, burn-in period, and thinning interval for each run
nchain = 3; niter = 5e4; burn = 4e4; thin = 5;
spatial = 1; % 1 if the spatial adjacency matrix is provided
for i = 1:nchain
    mainProBVS(i, nchain, niter, burn, thin, spatial); %save out*.mat file for each chain
end

% summarize results. Full results are stored in paras*.mat
sumProBVS(spatial) % posterior summary
disp('The true important-variable index set is recovered, and all true parameter values are covered by the 95% credible intervals.')

Example 2: Run spatial BVS for spatial data

MCMC Chain 1: 50000 iterations are done with elapsed time 0.59 minutes.
MCMC Chain 2: 50000 iterations are done with elapsed time 0.59 minutes.
MCMC Chain 3: 50000 iterations are done with elapsed time 0.59 minutes.

Posterior probability mass of the number of selected variables:
   Value    Count   Percent
      5     1991     33.18%   ------------- posterior mode
      6     1749     29.15%
      7     1133     18.88%
      8      583      9.72%
      9      339      5.65%
     10      120      2.00%
     11       47      0.78%
     12       22      0.37%
     13       13      0.22%
     14        2      0.03%
     15        1      0.02%

mean birth rate = 0.1898, death rate = 0.1968.
Posterior mean [lower, upper] 95% credible interval for noise level:
  0.3333  [0.1000, 1.6014]

Posterior mean [lower, upper] 95% credible interval for overall signal-to-noise ratio:
  1.7133  [0.2556, 4.8959]

Posterior mean [lower, upper] 95% credible interval for spatial dependence:
  0.0203  [-0.8048, 0.7624]

Top variable index (marginal inclusion probability), posterior mean [lower, upper] 95% CI of effect
 1 (1.0000):   1.0309  [0.9471, 1.1116]
 2 (1.0000):   0.4789  [0.3875, 0.5675]
 3 (1.0000):   -0.9843  [-1.0644, -0.9018]
 4 (1.0000):   0.3464  [0.2584, 0.4310]
 5 (1.0000):   -0.3694  [-0.4601, -0.2839]
11 (0.2247):   -0.0708  [-0.1851, 0.0321]
17 (0.1057):   0.0547  [-0.0568, 0.1541]
22 (0.1042):   -0.0576  [-0.1532, 0.0391]
10 (0.0887):   0.0560  [-0.0551, 0.1765]
27 (0.0830):   0.0569  [-0.0474, 0.1601]
25 (0.0708):   -0.0466  [-0.1650, 0.0475]
 8 (0.0698):   0.0442  [-0.0474, 0.1491]
 7 (0.0633):   -0.0420  [-0.1475, 0.0606]
 9 (0.0553):   0.0400  [-0.0740, 0.1362]
29 (0.0518):   -0.0335  [-0.1250, 0.0719]

The true important-variable index set is recovered, and all true parameter values are covered by the 95% credible intervals.

% =========== Example 3: Continuing Example 2: if you have spatial data but
% discard the spatial information and run standard nonspatial Bayesian
% variable selection, what happens?
disp('Example 3: Continue example 2, run nonspatial BVS for spatial data by discarding spatial information.')

%---------------- run Bayesian variable selection
% number of MCMC runs; total iterations, burn-in period, and thinning interval for each run
spatial = 0; % run nonspatial BVS by discarding the spatial information in W
for i = 1:nchain
    mainProBVS(i, nchain, niter, burn, thin, spatial); %save out*.mat file for each chain
end

% summarize results. Full results are stored in paras*.mat
sumProBVS(spatial) % posterior summary
disp('The posterior mode is no longer the true value (5): discarding the spatial dependence leads to a misleading conclusion!')

fprintf('\n\nDemo completed.\n')
end
 
Example 3: Continue example 2, run nonspatial BVS for spatial data by discarding spatial information.

MCMC Chain 1: 50000 iterations are done with elapsed time 0.49 minutes.
MCMC Chain 2: 50000 iterations are done with elapsed time 0.50 minutes.
MCMC Chain 3: 50000 iterations are done with elapsed time 0.50 minutes.

Posterior probability mass of the number of selected variables:
   Value    Count   Percent
      5     1192     19.87%   
      6     1439     23.98%   ------------- posterior mode
      7     1268     21.13%
      8      870     14.50%
      9      541      9.02%
     10      306      5.10%
     11      160      2.67%
     12      105      1.75%
     13       53      0.88%
     14       28      0.47%
     15       15      0.25%
     16       10      0.17%
     17        7      0.12%
     18        2      0.03%
     19        2      0.03%
     20        0      0.00%
     21        2      0.03%

mean birth rate = 0.2614, death rate = 0.2689.
Posterior mean [lower, upper] 95% credible interval for noise level:
  0.6585  [0.0293, 4.5925]

Posterior mean [lower, upper] 95% credible interval for overall signal-to-noise ratio:
  3.4426  [0.1666, 13.6107]

Top variable index (marginal inclusion probability), posterior mean [lower, upper] 95% CI of effect
 1 (1.0000):   1.0066  [0.7116, 1.2760]
 2 (1.0000):   0.4767  [0.2093, 0.7568]
 3 (1.0000):   -0.9473  [-1.2109, -0.6742]
 5 (1.0000):   -0.3689  [-0.6295, -0.1059]
 4 (0.9980):   0.3317  [0.0568, 0.5965]
22 (0.2300):   -0.0649  [-0.3641, 0.2463]
11 (0.1992):   -0.0744  [-0.4005, 0.2340]
17 (0.1692):   0.0584  [-0.2912, 0.3461]
27 (0.1372):   0.0595  [-0.2450, 0.4112]
29 (0.1042):   -0.0613  [-0.4261, 0.2839]
 8 (0.0995):   0.0459  [-0.2928, 0.3746]
16 (0.0787):   -0.0250  [-0.3210, 0.2834]
 6 (0.0775):   -0.0256  [-0.3132, 0.2905]
13 (0.0775):   -0.0296  [-0.3671, 0.2742]
25 (0.0748):   -0.0302  [-0.2755, 0.3521]

The posterior mode is no longer the true value (5): discarding the spatial dependence leads to a misleading conclusion!
Demo completed.

Main function for Bayesian variable selection via MCMC runs

function [] = mainProBVS(id, nchain, niter, burn, thin, spatial)
% MCMC setup
verbose = 0;

%============= total iterations, burn-in period and thin (samples every thin iteration) for each MCMC run

nbin = 1e3; % if verbose, summarize the acceptance rates every nbin iterations
repl = str2double(num2str(id)); ch0 = repl;
ncase = ceil(repl/nchain);
repl = repl - (ncase-1)*nchain;
% fprintf('case = %d, chain = %d:\n', [ncase, repl])  % partition jobs

% data input and initialize parameters
rng('default'); rng(ch0*8);
[x, E] = readDataBVS(ncase, repl, nchain, spatial);

% run MCMC, store the results
nsample = (niter-burn)/thin;
xall = cell(1, nsample); ps = zeros(1,nsample);
births = nan(1,niter); deaths = nan(1,niter);
eta_a = 0; batchLen = 50; batchNum = 0; batchTot = niter/batchLen;  eta_rates = zeros(1, batchTot);
sIter = 1;

tic
for iter = 1:niter
    if verbose==1; fprintf('%6d', x.p); if(~mod(iter,20)); fprintf('\n'); end; end

    % -------------- update marginal likelihood
    x = getLoglikeBVS(x, E, 0);

    % -------------- update random subset inds (model)
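    % Birth/death Metropolis-Hastings move on the active index set: with
    % probability 1/2 propose adding a variable not currently in the model,
    % otherwise propose deleting one at random. E.alpha penalizes each added
    % variable, and the log(.5)/log(2) terms below correct the proposal
    % probabilities at the boundaries x.p == 0 and x.p == E.pstar, where only
    % one move type is available.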
    U0 = rand(1); U1 = log(rand(1));
    if (U0 <= 0.5 || x.p == 0) && (x.p < E.pstar) % add another variable
        cand = setdiff(1:E.p, x.inds); i0 = cand(randi(numel(cand))); % pick a candidate uniformly (randi avoids randsample's scalar-input pitfall)
        x1 = x; x1.inds = [x.inds, i0]; x1.p = length(x1.inds);
        x1 = getLoglikeBVS(x1, E, 1);
        logratio = x1.logLik - x.logLik - E.alpha;
        if x.p==0; logratio = logratio + log(.5); end %boundary correction
        MHratio = min(logratio, 0); births(iter) = 0;
        if U1 <= MHratio; x = x1; births(iter) = 1; end
    else % delete an existing variable
        i0 = randsample(x.p,1);
        x1 = x; x1.inds(i0) = []; x1.p = length(x1.inds);
        x1 = getLoglikeBVS(x1, E, 1);
        logratio = x1.logLik - x.logLik + E.alpha;
        if x.p == E.pstar; logratio = logratio + log(2); end %boundary correction
        MHratio = min(logratio,0); deaths(iter) = 0;
        if U1 <= MHratio; x = x1; deaths(iter) = 1; end
    end

    % -------------- update variation \tau2 and fixed-effects \beta
    Xtild = [ones(E.N,1), E.X(:,x.inds)];
    Sigma = E.M-x.gamma.*E.W; Lo = chol(Sigma, 'lower');
    L = 0.5*sum((Lo'*(x.eta-Xtild*x.beta)).^2) + 0.5*x.invlambda*sum((x.beta).^2);
    x.invtau2 = gamrnd(E.a_tau2+0.5*(E.N+x.p+1), 1/(E.invb_tau2+L));

    Sigma = Xtild'*Sigma; Mu = Sigma*x.eta;
    Sigma = Sigma*Xtild + x.invlambda.*eye(x.p+1);
    Lo = chol(Sigma, 'lower');
    x.beta = Lo'\( 1/sqrt(x.invtau2)*randn(size(Mu)) + Lo\Mu );
    mu = Xtild*x.beta;  u = x.eta - mu;

    % -------------- update signal-to-noise \Lambda
    x.invlambda = gamrnd(E.a_lambda+0.5*(x.p+1), 1/(E.invb_lambda+0.5*x.invtau2*sum(x.beta.^2)));

    if E.spatial == 1
        % -------------- update spatial dependence \gamma
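        % Grid-based draw from the full conditional of gamma: E.loglike0 holds
        % the precomputed terms 0.5*log|M - gamma*W| on the grid E.gammas (see
        % readDataBVS), the residual quadratic form is added, the probabilities
        % are normalized over the grid, and gamma is drawn by inverse-CDF
        % sampling with linear interpolation within the selected grid cell.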
        P = E.loglike0 + E.gammas*(u'*E.W*u)*(0.5*x.invtau2);
        P = exp(P-max(P))/sum(exp(P-max(P)));
        cump = cumsum([0 P(1:(end-1))]);
        U0 = rand(1);
        i0 = sum(U0 > cump);
        x.gamma = E.gammas(1); if(i0>1); x.gamma = E.gammas(i0-1) + E.gap/P(i0)*(U0-cump(i0)); end
    end

    % -------------- update latent parameter \eta for Poisson model (E.fixeta==0) using tMALA
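    % Truncated MALA in whitened coordinates: with Lo the Cholesky factor of the
    % Gaussian precision, v = Lo'*(eta - mu) is a priori standard normal. The
    % proposal is v1 = v + 0.5*E.h*grad + sqrt(E.h)*z, where the Poisson score
    % E.Y - E.offset.*exp(eta) is truncated at E.H, and it is accepted with the
    % usual MALA Metropolis-Hastings ratio.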
    if E.fixeta == 0
        Lo = sqrt(x.invtau2) * chol(E.M - x.gamma*E.W, 'lower');
        v = Lo'*u;
        grad = E.Y - E.offset.*exp(x.eta); if any(grad>E.H); grad(grad>E.H) = E.H; end % truncate the Poisson score at E.H
        grad = Lo\grad - v;
        v1 = v + 0.5*E.h*grad + E.h_sqrt*randn([E.N, 1]);
        eta1 = mu + Lo'\v1;
        grad1 = E.Y - E.offset.*exp(eta1); if any(grad1>E.H); grad1(grad1>E.H) = E.H; end
        grad1 = Lo\grad1 - v1;
        P = sum(E.Y.*eta1 - E.offset.*exp(eta1) - 0.5*v1.^2 + 0.5/E.h*(v1 - v - 0.5*E.h*grad).^2) ...
            - sum(E.Y.*x.eta - E.offset.*exp(x.eta) - 0.5*v.^2 + 0.5/E.h*(v - v1 - 0.5*E.h*grad1).^2);
        U1 = log(rand(1));
        if U1 <= P; x.eta = eta1; eta_a = eta_a + 1; end
    end

    if verbose == 1 && iter > nbin && ~mod(iter,nbin)
        rbirths = births((iter-nbin):iter); rdeaths = deaths((iter-nbin):iter);
        fprintf('iter=%d, birth=%.3f, death=%.3f, p=%d, ML = %4.2f, tau2 = %4.2f, lambda = %4.2f\n',...
            [iter, mean(rbirths(~isnan(rbirths))), mean(rdeaths(~isnan(rdeaths))), x.p, x.logLik, 1/x.invtau2, 1/x.invlambda]);
    end

    if ~mod(iter, batchLen); batchNum = batchNum+1; eta_a = eta_a/batchLen; eta_rates(batchNum) = eta_a; eta_a = 0; end %disp(eta_rates(batchNum));

    % record results
    if(iter <= nsample);  ps(iter) = x.p; end
    if(iter > burn) && ~mod(iter, thin); xall{sIter} = x; sIter = sIter+ 1; end
end

runtime = toc/60;
fprintf('MCMC Chain %d: %d iterations are done with elapsed time %.2f minutes.\n', [repl, niter, runtime])
nam = strcat('out',num2str(ncase), '_', num2str(repl),'.mat');
save(nam,'xall','runtime','births','deaths','ps','eta_rates')
end

Import data, set up the environment struct, specify priors, and initialize parameters

function [x, E] = readDataBVS(ncase, repl, nchain, spatial)
% import data, initialize parameters, set up the environment struct E
load('yourdata.mat') %X full design matrix, Y response, W spatial adjacency
E.N = size(X, 1); %size(W,1);
E.X = zscore(X);  %standardized predictors without intercept
E.Y = Y;
E.p = size(E.X, 2);
E.pstar = min(E.p, E.N-1); %upper bound of p
E.alpha = 0; % per-variable model-size penalty on the log-likelihood: 0 = none, 1 = AIC, log(N)/2 = BIC
E.spatial = spatial;  % spatial or nonspatial?
if E.spatial == 1
    E.W = W;   E.M = diag(sum(W,1)); invM = inv(E.M);
    eigs = eig(sqrt(invM)*W*sqrt(invM));
    lgamma = max(1/min(eigs),-1); ugamma = 1/max(eigs);
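    % M - gamma*W is positive definite exactly when gamma lies strictly between
    % 1/min(eigs) and 1/max(eigs) of M^(-1/2)*W*M^(-1/2); the lower bound is
    % additionally capped at -1, and the grid of candidate gammas covers this range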
    gap = 1e-3; gammas = (lgamma+gap):gap:(ugamma-gap); len = length(gammas);
    if(~exist('loglike0.mat','file')) % precalculation: makes CAR model sampling much faster!
        loglike0 = zeros(1,len); for i = 1:len; loglike0(i) = 0.5*sum(log(eig(E.M-gammas(i)*E.W))); end; save('loglike0.mat','loglike0');
    end
    load('loglike0.mat')
    E.gap = gap; E.gammas = gammas; E.loglike0 = loglike0;
else
    E.W = 0; E.M = eye(E.N);
end

% Poisson part
E.fixeta = 1; % 1 = Gaussian continuous response, 0 = Poisson count response
if ncase > 2; E.fixeta = 0; end

if E.fixeta == 0
    E.offset =  ones(E.N,1);     %no offset
    % % for truncated MALA
    E.H = 50; E.h = 0.0016; E.h_sqrt = sqrt(E.h);
end

% set prior
meantau2 = 0.01; vartau2 = 10^2;
E.a_tau2 = 2+meantau2^2/vartau2;
E.invb_tau2 = meantau2*(E.a_tau2-1);

meanlambda = 1; varlambda = 1e2; %100
E.a_lambda = 2+meanlambda^2/varlambda;
E.invb_lambda = meanlambda*(E.a_lambda-1);
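% Note on the parameterization above: writing 1/x ~ Gamma(shape a, rate b),
% x has prior mean b/(a-1) and variance b^2/((a-1)^2*(a-2)), so setting
% a = 2 + mean^2/var and b = mean*(a-1) reproduces the target prior mean and
% variance (meantau2/vartau2 and meanlambda/varlambda) specified above.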

% may consider certain transformation here
logTransformY = 0; if logTransformY == 1; x.eta = log(E.Y); else x.eta=E.Y; end
if E.fixeta == 0;  x.eta = log((E.Y+ 0.5*(E.Y==0))./E.offset); end % for Poisson

% set initial value
p0s = floor(linspace(1,E.p,nchain)); % initial model sizes are spread over the support, one per chain
p0 = p0s(repl);
x.inds = randsample(1:E.p, p0); x.p = length(x.inds);
Xtild = [ones(E.N,1), E.X(:,x.inds)];
x.beta = (Xtild'*Xtild)\(Xtild'*x.eta) .* (1 + 1/50*randn([size(Xtild,2),1]));
x.invtau2 = 1/( sum((sqrt(diag(E.M)).*(x.eta-Xtild*x.beta)).^2)/(E.N-E.p) );

x.gamma = 0;  x.invlambda = 1/meanlambda;
x = getLoglikeBVS(x, E, 1);
end

Util function: compute marginal likelihood based on multivariate T density
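
A sketch of what this function evaluates, stated up to an additive constant that does not depend on the selected subset: with Q = M - gamma*W, Xtild = [1, X(:,inds)] (written $\tilde{X}$ below), lambda = 1/x.invlambda, a_tau = E.a_tau2 and b_tau = E.invb_tau2, integrating beta and tau^2 out of the conjugate normal-inverse-gamma model gives

$$\log p(\eta \mid \mathrm{inds}, \gamma, \lambda) = \mathrm{const} + \tfrac{p+1}{2}\log\lambda^{-1} - \tfrac{1}{2}\log\bigl|\tilde{X}'Q\tilde{X} + \lambda^{-1}I\bigr| - \bigl(a_{\tau} + \tfrac{N}{2}\bigr)\log\Bigl(1 + \tfrac{S}{2\,b_{\tau}}\Bigr),$$

with $S = \eta'Q\eta - \eta'Q\tilde{X}\,(\tilde{X}'Q\tilde{X} + \lambda^{-1}I)^{-1}\tilde{X}'Q\eta$, i.e. a multivariate-t kernel in $\eta$; x.logLik below computes exactly these three terms via the Cholesky factor Lo.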

function [x] = getLoglikeBVS(x, E, proposeParameter)
Xtild = [ones(E.N,1), E.X(:, x.inds)]; % include intercept
Lo = E.M-x.gamma.*E.W; L = x.eta'*Lo*x.eta;
Lo = Xtild'*Lo; Mu = Lo*x.eta;
Lo = Lo*Xtild + x.invlambda.*eye(x.p+1);
Lo = chol(Lo, 'lower');
Mu = Lo\Mu; L = (L - sum(Mu.^2))/2;
n1 = 0.5*E.N;
if proposeParameter == 1
    x.invtau2 = gamrnd(E.a_tau2+n1, 1/(E.invb_tau2+L));
    Mu = Mu + sqrt(x.invtau2)*randn(size(Mu));
    x.beta = Lo'\Mu;
end
x.logLik = 0.5*(x.p+1)*log(x.invlambda) - sum(log(diag(Lo))) - (E.a_tau2+n1)*log(1+L/E.invb_tau2);
end

Summarize Bayesian variable selection output

function [] = sumProBVS(spatial)
% summary function for BVS results
collectit = 1;
for ncase = 1:1
    % fprintf('case = %d\n', ncase)
    if collectit == 1
        load('yourdata.mat')
        E.X = zscore(X);  %standardized predictors without intercept
        [~, p] = size(E.X);
        E.spatial = 0;

        dirs = dir(strcat('out',num2str(ncase),'_*'));
        chs = 1:length(dirs); nch = length(chs);
        for ch = 1:nch
            load(dirs(chs(ch)).name)
            if ch == 1 % initialization
                niter = length(xall); burn = 0; thin = 1; %#ok<*USENS>
                nsample = (niter-burn)/thin;    tot = nch*nsample; allps0 = zeros(nsample,nch);
                allps = zeros(nsample,nch); allgammas = zeros(nsample,nch); alltau2s = zeros(nsample,nch); alllambdas = zeros(nsample,nch);
                allbirths = zeros(nsample,nch); alldeaths = zeros(nsample,nch);
                indmat = zeros(tot,p); betamat = zeros(tot,p); beta0mat = zeros(tot,1);
            end
            allps0(:,ch) = ps(burn+(1:nsample).*thin);
            allbirths(:,ch) = births(burn+(1:nsample).*thin);
            alldeaths(:,ch) = deaths(burn+(1:nsample).*thin);
            for i = 1:nsample
                indmat((ch-1)*nsample+i,xall{burn+i*thin}.inds) = 1;
                beta0mat((ch-1)*nsample+i) = xall{burn+i*thin}.beta(1);
                betamat((ch-1)*nsample+i,xall{burn+i*thin}.inds) = xall{burn+i*thin}.beta(2:end)';
                allgammas(i,ch) = xall{burn+i*thin}.gamma;
                alltau2s(i,ch) = 1/xall{burn+i*thin}.invtau2;
                alllambdas(i,ch) = 1/xall{burn+i*thin}.invlambda;
                allps(i,ch) = xall{burn+i*thin}.p;
            end
        end
        save(strcat('sumout',num2str(ncase),'.mat'),'allps','allps0','tot','allbirths','alldeaths','indmat','betamat','beta0mat','allgammas','alltau2s','alllambdas') % include all quantities used by the summaries below
    end
    load(strcat('sumout',num2str(ncase),'.mat'))
    kplot = 1;
    subplot(2,2,kplot); plot(allps0); kplot = kplot+1;
    title('Initial runs of the number of selected variables','FontSize',8)
    ps = reshape(allps, [1,tot]);
    fprintf('\nPosterior probability mass of the number of selected variables:\n')
    tabulate(ps)
    mbirth = nanmean(reshape(allbirths, [1,tot]));
    mdeath = nanmean(reshape(alldeaths, [1,tot]));
    fprintf('\nmean birth rate = %5.4f, death rate = %5.4f.\n\n', [mbirth, mdeath])

    tau2s = reshape(alltau2s, [1,tot]);
    subplot(2,2,kplot); plot(alltau2s); kplot = kplot+1;
    title('Traceplot of the noise level','FontSize',8)
    disp('Posterior mean [lower, upper] 95% credible interval for noise level:')
    %[l,u] = FindHPDset(tau2s', 0.95, []);
    l = quantile(tau2s',0.025);
    u = quantile(tau2s',0.975);
    fprintf('  %5.4f  [%5.4f, %5.4f]\n\n', [mean(tau2s),l,u]);

    lambdas = reshape(alllambdas, [1,tot]);
    subplot(2,2,kplot); plot(alllambdas); kplot = kplot+1;
    title('Traceplot of the overall signal-to-noise ratio','FontSize',8)
    disp('Posterior mean [lower, upper] 95% credible interval for overall signal-to-noise ratio:')
    %[l,u] = FindHPDset(tau2s', 0.95, []);
    l = quantile(lambdas',0.025);
    u = quantile(lambdas',0.975);
    fprintf('  %5.4f  [%5.4f, %5.4f]\n\n', [mean(lambdas),l,u]);

    if spatial == 1
        gammas = reshape(allgammas, [1,tot]);
        subplot(2,2,kplot); plot(allgammas);
        title('Traceplot of the spatial dependence','FontSize',8)
        disp('Posterior mean [lower, upper] 95% credible interval for spatial dependence:')
        %[l,u] = FindHPDset(gammas', 0.95, []);
        l = quantile(gammas',0.025);
        u = quantile(gammas',0.975);
        %disp([mean(gammas),l,u]);
        fprintf('  %5.4f  [%5.4f, %5.4f]\n\n', [mean(gammas),l,u]);
    end

    % -------------- variable wise summary
    P = size(betamat, 2);
    mat = zeros(P, 3);
    for i = 1:P
        betavec = betamat(:,i); betavec = betavec(betavec~=0);
        mat(i,:) = [mean(betavec), quantile(betavec,0.025), quantile(betavec, 0.975)];
    end
    mat_var = mean(indmat); np = min(15, length(mat_var)); % marginal inclusion probability = proportion of posterior samples that include each variable
    disp('Top variable index (marginal inclusion probability), posterior mean [lower, upper] 95% CI of effect')
    [mat_var,I] = sort(mat_var,'descend'); mat_var = [I',mat(I,:), mat_var']; %disp(num2str(mat_var(1:np,:))); fprintf('\n')
    for i = 1:np; fprintf('%2d (%5.4f):   %5.4f  [%5.4f, %5.4f]\n', mat_var(i,[1,5,2:4]));  end; fprintf('\n')

    % -------------- model wise summary
    [xu, ~, k] = unique(indmat, 'rows');
    count = histc(k, 1:size(xu, 1));
    [~,I] = sort(count,'descend');
    tab = [xu, count/sum(count)];
    tab = tab(I,:);
    bmat = nan(size(tab,1),size(tab,2)-1);
    lb = bmat; ub = bmat;
    bmat0 = nan(size(tab,1),1); lb0 = bmat0; ub0 = bmat0;
    for i = 1:length(I)
        bmat(i,:) = mean(betamat(k==I(i), :),1);
        lb(i,:) = quantile(betamat(k==I(i), :),0.025,1);
        ub(i,:) = quantile(betamat(k==I(i), :),0.975,1);
        bmat0(i,:) = mean(beta0mat(k==I(i)));
        lb0(i,:) = quantile(beta0mat(k==I(i)),0.025);
        ub0(i,:) = quantile(beta0mat(k==I(i)),0.975);
    end
    save(strcat('paras',num2str(ncase),'.mat'),'allps', 'ps','tab','bmat','lb','ub','mat_var','bmat0','lb0','ub0')
end
end