CADStat: Statistical Tools for Causal Analysis
Linear regression is a method for quantifying the relationship between a dependent (response) variable and one or more independent (explanatory) variables. These quantitative models can then be used to predict the value of the response variable for new values of the explanatory variables or to estimate the value of an explanatory variable needed to account for a change in the response variable.
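The two uses described above, predicting a response and solving for the explanatory value that yields a target response, can be sketched for a model with one explanatory variable. This is a minimal illustration in Python, not CADStat's own code. The function names (fit_line, predict, solve_for_x) and the data values are hypothetical and chosen only for the example.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squares of x about its mean, and cross-products of x and y
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x_new):
    """Predicted response for a new value of the explanatory variable."""
    return intercept + slope * x_new


def solve_for_x(intercept, slope, y_target):
    """Explanatory value at which the fitted line reaches y_target."""
    return (y_target - intercept) / slope


# Made-up illustrative data that lie exactly on y = 1 + 2x
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]

intercept, slope = fit_line(x, y)
y_at_6 = predict(intercept, slope, 6.0)
x_for_15 = solve_for_x(intercept, slope, 15.0)
print(intercept, slope, y_at_6, x_for_15)
```

With real data the points will not fall exactly on a line, so both the prediction and the solved-for explanatory value carry uncertainty that this sketch does not quantify.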
Select Analysis Tools -> Linear Regression from the menus. A dialog box will open. Select the data set of interest from the pull-down menu, or browse for a tab-delimited text file. The Data Subsetting tab can be used to select a subset of the data file by choosing a variable from the pull-down menu and then selecting the levels of that variable to include. Hold down the <CTRL> key to select several levels.
Select a response variable (the variable whose value you wish to predict). An appropriate distribution for your response variable can be selected by clicking on a radio button. Normal distributions are most common. A Poisson distribution may be appropriate if your response variable reflects a count of some attribute (e.g., number of distinct taxa), and a Binomial distribution may be appropriate if you are modeling the probability of success and your response variable is a number of successes for a fixed number of trials (e.g., the presence of a species in a sample).
Select all explanatory variables you wish to include in the model (the variables used to predict the dependent variable). You can hold down the <CTRL> key to add multiple variables. Note: the dependent variable is in the list of possible independent variables, but it should not be selected as an independent variable.
By default, an intercept is included in the model. The intercept can be excluded by selecting Remove Intercept, in which case at least one independent variable must be selected.
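To illustrate what removing the intercept changes, the sketch below fits the same data with and without an intercept for a single explanatory variable. It is a Python illustration, not CADStat's own code. The function names and data values are hypothetical.

```python
def fit_with_intercept(x, y):
    """Least-squares fit of y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def fit_through_origin(x, y):
    """Least-squares fit of y = b * x, with the line forced through (0, 0)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


# Made-up illustrative data that lie exactly on y = 1 + 2x
x = [1.0, 2.0, 3.0]
y = [3.0, 5.0, 7.0]

b0, b1 = fit_with_intercept(x, y)
b_origin = fit_through_origin(x, y)
print(b0, b1, b_origin)
```

Because the illustrative data have a nonzero intercept, forcing the line through the origin gives a steeper slope than the fit that includes the intercept. Without an intercept, the slope is the only parameter left to estimate, which is why at least one independent variable is required.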
Confidence intervals for the regression coefficients are generated only if Compute Confidence Intervals is selected. The confidence level can be changed only after Compute Confidence Intervals is selected.
Five diagnostic plots are available. None is produced by default, but any subset of the five can be selected for display.
Should you wish to save the linear regression as an R object, select Save R Results? and type a variable name in R Result Name.
The output consists of the regression coefficients, along with their standard errors, t-statistics, and p-values. Confidence intervals and diagnostic plots are produced only if selected.
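As a rough guide to how these quantities relate, the sketch below computes the coefficients, standard errors, and t-statistics for a model with one explanatory variable using the usual least-squares formulas. It is a Python illustration, not CADStat's own code. The function name and data values are hypothetical. P-values are omitted because they require the t distribution, which is not in the Python standard library.

```python
import math


def simple_regression_summary(x, y):
    """Estimates, standard errors, and t-statistics for y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x

    # Residual variance on n - 2 degrees of freedom (two estimated coefficients)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)

    se_b1 = math.sqrt(s2 / sxx)
    se_b0 = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))

    return {
        "intercept": (b0, se_b0, b0 / se_b0),
        "slope": (b1, se_b1, b1 / se_b1),
    }


# Made-up illustrative data scattered around a straight line
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

summary = simple_regression_summary(x, y)
for name, (estimate, std_error, t_stat) in summary.items():
    print(name, estimate, std_error, t_stat)
```

Each t-statistic is the estimate divided by its standard error. A confidence interval for a coefficient is the estimate plus or minus a t-distribution quantile times that standard error.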
For this example, select the fish length and weight data (fish.lwr). (Visit the Loading and merging data help page for information on loading CADStat example data.) Then, select Analysis Tools -> Linear Regression.
Once this is selected, a dialog window opens with the available options for the analysis.
For this example, specify length as the dependent (response) variable and weight as the independent (explanatory) variable. The resulting model is shown in the CADStat console, or in your default browser window if Display Results in Browser is selected.
Diagnostic Plots check boxes can be selected to further evaluate whether the assumptions inherent in linear regression are supported by the data.