The General Social Survey (GSS) is an area-probability sample that uses the NORC National Sampling Frame to draw an equal-probability multistage cluster sample of housing units for the entire United States. Because the GSS sample is a cluster sample, its standard errors are larger than those calculated under a simple random sample assumption (that is, without correction for the design). To calculate standard errors correctly, the design variables must be supplied to statistical software (for example, PROC SURVEYFREQ in SAS). Without these design variables, statistical software will assume a simple random sample and underestimate standard errors.
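To make the size of this effect concrete, the following Python sketch compares a naive simple-random-sample standard error with a design-based one for a proportion. It is only an illustration: the twelve toy records, the function names `srs_se` and `design_se`, and the choice of a Taylor-linearization estimator that treats PSUs as sampled with replacement within strata are assumptions made here, not a description of NORC's production method. The keys `vstrat` and `vpsu` mirror the design variables described in this note.

```python
from collections import defaultdict
from math import sqrt


def srs_se(rows):
    """Naive standard error of the mean, ignoring strata, clusters, and weights."""
    n = len(rows)
    mean = sum(r["y"] for r in rows) / n
    s2 = sum((r["y"] - mean) ** 2 for r in rows) / (n - 1)
    return sqrt(s2 / n)


def design_se(rows):
    """Linearization standard error of the weighted mean for a stratified
    cluster design (PSUs assumed sampled with replacement within strata)."""
    total_w = sum(r["w"] for r in rows)
    mean = sum(r["w"] * r["y"] for r in rows) / total_w
    # Sum the linearized values within each PSU of each stratum.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for r in rows:
        z = r["w"] * (r["y"] - mean) / total_w
        psu_totals[r["vstrat"]][r["vpsu"]] += z
    variance = 0.0
    for psus in psu_totals.values():
        n_h = len(psus)
        if n_h < 2:
            raise ValueError("stratum with single sampling unit")
        zbar = sum(psus.values()) / n_h
        variance += n_h / (n_h - 1) * sum((z - zbar) ** 2 for z in psus.values())
    return sqrt(variance)


# Hypothetical toy data: (vstrat, vpsu) -> 0/1 answers of three respondents.
toy = {
    (1, 1): [1, 1, 1],
    (1, 2): [0, 0, 1],
    (2, 1): [1, 1, 0],
    (2, 2): [0, 0, 0],
}
rows = [
    {"vstrat": s, "vpsu": p, "w": 1.0, "y": y}
    for (s, p), ys in toy.items()
    for y in ys
]

print(f"naive SRS SE:    {srs_se(rows):.4f}")
print(f"design-based SE: {design_se(rows):.4f}")
```

On these toy records, where answers cluster within PSUs, the design-based standard error (about 0.236) is noticeably larger than the naive one (about 0.151).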
We provide two design variables for every GSS interview from 1975-2012: VSTRAT and VPSU. VSTRAT is the Variance STRATum, and VPSU is the Variance Primary Sampling Unit. The stratum and PSU reflect the first-stage and second-stage units selected as part of the NORC National Sampling Frame, and are unique to a particular round. There are two second-stage units (VPSU) for each first-stage unit (VSTRAT).
First-stage units in the NORC National Sampling Frame are called National Frame Areas (NFAs), each of which is composed of one or more counties (before the 2010 National Frame, NFAs were called PSUs). The largest urban areas are selected with certainty to guarantee their representation in NORC’s National Sampling Frame. Second-stage units in the NORC National Sampling Frame are called segments, each of which is a block, a group of blocks, or an entire census tract. The first-stage and second-stage units are selected with probabilities proportional to size (in housing units), and the sample housing units (third-stage units) are then selected to be an equal-probability sample, which results in roughly the same number of housing units selected per second-stage sampling unit.
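The arithmetic behind this equal-probability property can be sketched in Python. The frame sizes, sample counts, and the function name `overall_probability` below are hypothetical, and the sketch covers only non-certainty NFAs under textbook probability-proportional-to-size (PPS) selection; it is not a description of NORC's actual selection procedure.

```python
from fractions import Fraction


def overall_probability(n_first, total_hu, nfa_hu, n_second, segment_hu, take):
    """Overall selection probability of one housing unit in a three-stage design.

    All arguments are hypothetical illustration values:
      n_first    - number of NFAs sampled
      total_hu   - housing units in the whole frame
      nfa_hu     - housing units in this NFA
      n_second   - segments sampled per NFA
      segment_hu - housing units in this segment
      take       - housing units sampled per segment (fixed)
    """
    p_nfa = Fraction(n_first * nfa_hu, total_hu)          # stage 1: PPS
    p_segment = Fraction(n_second * segment_hu, nfa_hu)   # stage 2: PPS within the NFA
    p_unit = Fraction(take, segment_hu)                   # stage 3: fixed take
    return p_nfa * p_segment * p_unit


# Two housing units in very different (hypothetical) NFAs and segments.
p_large = overall_probability(10, 1_000_000, 50_000, 4, 500, 5)
p_small = overall_probability(10, 1_000_000, 20_000, 4, 200, 5)

print(p_large, p_small)
```

The NFA and segment sizes cancel in the product, leaving `n_first * n_second * take / total_hu` for every housing unit. That is why a fixed take per segment yields an equal-probability sample when the first two stages are drawn with probability proportional to size.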
To create the variables VSTRAT and VPSU, we recode the NFAs and segments, depending on whether the NFA was selected with certainty. In certainty NFAs, segments are paired into strata, with one segment assigned to VPSU = 1 and the other assigned to VPSU = 2. Often, small segments are combined into one VPSU. Non-certainty NFAs are paired into strata, with one NFA assigned to VPSU = 1 and the other assigned to VPSU = 2. It is rare, but possible, for NFAs to be combined into one VPSU. This strategy has been adapted from the National Longitudinal Survey of Youth, 1997 cohort strategy designed by Kirk Wolter.
Example Code for Stata
Here is sample Stata code to analyze the variable ANALYSISVAR where SEX = 1 (Male) within a GSSDATAFILE with the weight variable WTVAR (either WTSSALL or WTSSNR):

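* Load the data and declare the survey design (PSU, weight, and strata)
use GSSDATAFILE.dta, clear
svyset vpsu [pweight=WTVAR], strata(vstrat) singleunit(scaled)
* Design-corrected tabulation for males, keeping the full design via subpop()
svy, subpop(if sex == 1): tabulate ANALYSISVAR
* Weighted full-sample tabulation without design correction, including missing values
tab ANALYSISVAR [aweight=WTVAR], missing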
Note that it is possible to combine multiple years of GSS data into one GSSDATAFILE. SPSS is menu-driven, so no code is given here, but you can create design-corrected standard errors within SPSS using the Complex Samples add-on.
Stata error handling: “missing standard error because of stratum with single sampling unit”
VSTRAT and VPSU were created so that there is a minimum of three GSS respondents within each VSTRAT/VPSU cell. If all three are missing on a variable, this error can occur in Stata. If a GSS round is subset (to males or females, for example), this error becomes more likely. For this reason, we recommend that users use the subpop option for any subdomain analysis, as in the example Stata code.
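The reason subsetting triggers this error can be seen in a small Python sketch. The toy records and the helper name `single_unit_strata` are hypothetical. Dropping rows before declaring the design can remove every respondent in a VPSU, leaving its VSTRAT with a single sampling unit; a subpopulation analysis keeps all rows in the design, so every stratum still has both units.

```python
from collections import defaultdict


def single_unit_strata(rows):
    """Return the strata that have fewer than two PSUs among the given rows."""
    psus_by_stratum = defaultdict(set)
    for r in rows:
        psus_by_stratum[r["vstrat"]].add(r["vpsu"])
    return sorted(s for s, psus in psus_by_stratum.items() if len(psus) < 2)


# Hypothetical toy records as (vstrat, vpsu, sex); sex 1 = male, 2 = female.
# Every VSTRAT/VPSU cell has three respondents, but cell (1, 2) has no males.
records = [
    (1, 1, 1), (1, 1, 2), (1, 1, 1),
    (1, 2, 2), (1, 2, 2), (1, 2, 2),
    (2, 1, 1), (2, 1, 2), (2, 1, 2),
    (2, 2, 1), (2, 2, 1), (2, 2, 2),
]
rows = [{"vstrat": s, "vpsu": p, "sex": x} for s, p, x in records]

# Subsetting first discards the whole of cell (1, 2).
males_only = [r for r in rows if r["sex"] == 1]

print("full data, single-unit strata: ", single_unit_strata(rows))
print("males only, single-unit strata:", single_unit_strata(males_only))
```

With the full file, no stratum is reduced to one unit; after subsetting to males, stratum 1 is left with only VPSU 1, which is the situation Stata reports as a stratum with a single sampling unit.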
Example Code for SAS
Here is sample SAS code to analyze the variable ANALYSISVAR by SEX within a GSSDATAFILE with the weight variable WTVAR (either WTSSALL or WTSSNR):

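proc sort data=GSSDATAFILE; by sex; run;
/* Move MISSING out of the comment to treat missing values as a valid category */
proc surveyfreq data=GSSDATAFILE /*missing*/ nosummary;
   tables ANALYSISVAR*SEX;   /* crosstab of the analysis variable by sex */
   strata vstrat;            /* variance stratum */
   cluster vpsu;             /* variance PSU */
   weight WTVAR;             /* WTSSALL or WTSSNR */
run;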
Example Code for R
Here is sample R code to analyze the variable ANALYSISVAR by SEX within a GSSDATAFILE with the weight variable WTVAR (either WTSSALL or WTSSNR):

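library(survey)
# nest = TRUE because the VPSU codes (1, 2) repeat within every VSTRAT
gss.design <- svydesign(ids = ~vpsu, weights = ~WTVAR, strata = ~vstrat, data = GSSDATAFILE, nest = TRUE)
# Design-corrected estimates of ANALYSISVAR by SEX
tX <- svyby(~ANALYSISVAR, by = ~SEX, design = gss.design, FUN = svymean)
TableX <- round(ftable(tX) * 100, 2)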
Calculating Design-Corrected Standard Errors for the General Social Survey, 1975-2014
Steven Pedlow, NORC at the University of Chicago (pedlowsteven@norc.uchicago.edu)
Rene Bautista, NORC at the University of Chicago (bautistarene@norc.org)