15.3 EXAMINE

     EXAMINE
             VARIABLES=var_list [BY factor_list ]
             /STATISTICS={DESCRIPTIVES, EXTREME[(n)], ALL, NONE}
             /PLOT={BOXPLOT, NPPLOT, HISTOGRAM, ALL, NONE}
             /CINTERVAL n
             /COMPARE={GROUPS,VARIABLES}
             /ID=var_name
             /{TOTAL,NOTOTAL}
             /PERCENTILE=[value_list]={HAVERAGE, WAVERAGE, ROUND, AEMPIRICAL, EMPIRICAL }
             /MISSING={LISTWISE, PAIRWISE} [{EXCLUDE, INCLUDE}]
                      [{NOREPORT,REPORT}]
     

The EXAMINE command is used to test how closely a distribution follows a normal distribution. It also shows outliers and extreme values.

The VARIABLES subcommand specifies the dependent variables and the independent variables to use as factors for the analysis. Variables listed before the first BY keyword are the dependent variables. The dependent variables may optionally be followed by a list of factors, which tell PSPP how to break down the analysis for each dependent variable. The format for each factor is

     var [BY var].
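
For illustration, assuming a data set that contains the hypothetical variables score1, score2, gender, and culture, the following sketch names two dependent variables and two factors, gender and gender BY culture:

     * score1, score2, gender, and culture are hypothetical variable names.
     EXAMINE VARIABLES=score1 score2 BY gender gender BY culture.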

The STATISTICS subcommand specifies the analysis to be done. DESCRIPTIVES produces a table showing some parametric and non-parametric statistics. EXTREME produces a table showing extreme values of the dependent variable. A number in parentheses determines how many upper and lower extremes to show. The default number is 5.
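
As a minimal sketch, assuming hypothetical variables score1 and gender, the following requests the three highest and three lowest values of score1 for each value of gender, rather than the default of five:

     * score1 and gender are hypothetical variable names.
     EXAMINE VARIABLES=score1 BY gender
             /STATISTICS=EXTREME(3).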

The PLOT subcommand specifies which plots, if any, are to be produced. Available plots are HISTOGRAM, NPPLOT and BOXPLOT.

The COMPARE subcommand is relevant only when producing boxplots, and it is useful only if there is more than one dependent variable and at least one factor. If /COMPARE=GROUPS is specified, then one plot per dependent variable is produced, containing boxplots for all the factors. If /COMPARE=VARIABLES is specified, then one plot per factor is produced, each containing one boxplot per dependent variable. If the /COMPARE subcommand is omitted, then PSPP uses the default value of /COMPARE=GROUPS.
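
A sketch of the non-default setting, assuming hypothetical variables score1, score2, and gender: with two dependent variables and one factor, /COMPARE=VARIABLES asks for the boxplots of score1 and score2 to be placed together on the plot for gender.

     * score1, score2, and gender are hypothetical variable names.
     EXAMINE VARIABLES=score1 score2 BY gender
             /PLOT=BOXPLOT
             /COMPARE=VARIABLES.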

The ID subcommand also pertains to boxplots. If given, it must specify a variable name. Outliers and extreme cases plotted in boxplots will be labelled with the value of that variable for the case. Numeric or string variables are permissible. If the ID subcommand is not given, then the case number will be used for labelling.
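
For example, assuming the data contain a hypothetical string variable called name alongside score1 and gender, the following sketch labels boxplot outliers and extremes using name:

     * score1, gender, and name are hypothetical variable names.
     EXAMINE VARIABLES=score1 BY gender
             /PLOT=BOXPLOT
             /ID=name.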

The CINTERVAL subcommand specifies the confidence interval to use when calculating the descriptive statistics. The default is 95%.
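
As a sketch that follows the form given in the synopsis, and assuming a hypothetical variable score1, the following requests descriptive statistics with a 99% confidence interval:

     * score1 is a hypothetical variable name.
     EXAMINE VARIABLES=score1
             /STATISTICS=DESCRIPTIVES
             /CINTERVAL 99.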

The PERCENTILES subcommand specifies which percentiles are to be calculated, and which algorithm to use for calculating them. The default is to calculate the 5, 10, 25, 50, 75, 90, and 95 percentiles using the HAVERAGE algorithm.

The TOTAL and NOTOTAL subcommands are mutually exclusive. If TOTAL is given and factors have been specified in the VARIABLES subcommand, then statistics for the unfactored dependent variables are produced in addition to the factored variables. If NOTOTAL is given, then statistics are produced only for the factored variables. If no factors are specified, then TOTAL and NOTOTAL have no effect.

Warning! If many dependent variables are given, or factors are given for which there are many distinct values, then EXAMINE will produce a very large quantity of output.