statistical test to compare two groups of categorical data

This page shows how to perform a number of statistical tests using SPSS. low communality can Simple and Multiple Regression, SPSS A Type II error is failing to reject the null hypothesis when the null hypothesis is false. When we compare the proportions of success for two groups like in the germination example there will always be 1 df. These results show that racial composition in our sample does not differ significantly Fisher's exact test is used when you want to conduct a chi-square test but one or variable. A one sample binomial test allows us to test whether the proportion of successes on a Thus, these represent independent samples. appropriate to use. The stem-leaf plot of the transformed data clearly indicates a very strong difference between the sample means. The most common indicator with biological data of the need for a transformation is unequal variances. When reporting paired two-sample t-test results, provide your reader with the mean of the difference values and its associated standard deviation, the t-statistic, degrees of freedom, p-value, and whether the alternative hypothesis was one- or two-tailed. Alternative hypothesis: The mean strengths for the two populations are different. distributed interval independent The distribution is asymmetric and has a tail to the right. Even though a mean difference of 4 thistles per quadrat may be biologically compelling, our conclusions will be very different for Data Sets A and B. between the underlying distributions of the write scores of males and The data come from 22 subjects --- 11 in each of the two treatment groups. Specify the level: [latex]\alpha[/latex] = .05. Perform the statistical test. distributed interval variable (you only assume that the variable is at least ordinal). Thus far, we have considered two sample inference with quantitative data.
Comparing Means: If your data is generally continuous (not binary), such as task time or rating scales, use the two sample t-test. as shown below. Those who identified the event in the picture were coded 1 and those who got it wrong were coded 0. We can define Type I error along with Type II error as follows: A Type I error is rejecting the null hypothesis when the null hypothesis is true. variable with two or more levels and a dependent variable that is not interval We can do this as shown below. regression assumes that the coefficients that describe the relationship The number 20 in parentheses after the t represents the degrees of freedom. sign test in lieu of sign rank test. If the responses to the questions are all revealing the same type of information, then you can think of the 20 questions as repeated observations. Then, once we are convinced that association exists between the two groups, we need to find out how their backgrounds influence their answers. The assumptions of the F-test include: 1. ), Biologically, this statistical conclusion makes sense. We expand on the ideas and notation we used in the section on one-sample testing in the previous chapter. The null hypothesis (Ho) is almost always that the two population means are equal. same. With a 20-item test you have 21 different possible scale values, and that's probably enough to use an, If you just want to compare the two groups on each item, you could do a. For our purposes, [latex]n_1[/latex] and [latex]n_2[/latex] are the sample sizes and [latex]p_1[/latex] and [latex]p_2[/latex] are the probabilities of success (germination, in this case) for the two types of seeds. To determine if the result was significant, researchers determine if this p-value is greater or smaller than the.
logistic (and ordinal probit) regression is that the relationship between The same design issues we discussed for quantitative data apply to categorical data. First, scroll in the SPSS Data Editor until you can see the first row of the variable that you just recoded. This was also the case for plots of the normal and t-distributions. However, Thus, let us look at the display corresponding to the logarithm (base 10) of the number of counts, shown in Figure 4.3.2. but could merely be classified as positive and negative, then you may want to consider a as the probability distribution and logit as the link function to be used in A picture was presented to each child, who was asked to identify the event in the picture. hiread. For the thistle example, prairie ecologists may or may not believe that a mean difference of 4 thistles/quadrat is meaningful. we can use female as the outcome variable to illustrate how the code for this In SPSS unless you have the SPSS Exact Test Module, you Example: McNemar's test [latex]s_p^2[/latex] is called the pooled variance. reading, math, science and social studies (socst) scores. You would perform McNemar's test Suppose you wish to conduct a two-independent sample t-test to examine whether the mean number of the bacteria (expressed as colony forming units), Pseudomonas syringae, differs on the leaves of two different varieties of bean plant. The most commonly applied transformations are log and square root. scores. The interaction.plot function in the native stats package creates a simple interaction plot for two-way data. It is incorrect to analyze data obtained from a paired design using methods for the independent-sample t-test and vice versa.
Choosing a Statistical Test - Two or More Dependent Variables This table is designed to help you choose an appropriate statistical test for data with two or more dependent variables. In our example, female will be the outcome the model. 0.047, p The results indicate that reading score (read) is not a statistically Within the field of microbial biology, it is widel, We can see that [latex]X^2[/latex] can never be negative. proportions from our sample differ significantly from these hypothesized proportions. The results indicate that the overall model is statistically significant The binomial distribution is commonly used to find probabilities for obtaining k heads in n independent tosses of a coin where there is a probability, p, of obtaining heads on a single toss.) the relationship between all pairs of groups is the same, there is only one ), Then, if we let [latex]\mu_1[/latex] and [latex]\mu_2[/latex] be the population means of x1 and x2 respectively (the log-transformed scale), we can phrase our statistical hypotheses that we wish to test that the mean numbers of bacteria on the two bean varieties are the same as, Ho: [latex]\mu_1[/latex] = [latex]\mu_2[/latex]. A 95% CI (thus, [latex]\alpha=0.05[/latex]) for [latex]\mu_D[/latex] is [latex]21.545\pm 2.228\times 5.6809/\sqrt{11}[/latex]. You could also do a nonlinear mixed model, with person being a random effect and group a fixed effect; this would let you add other variables to the model. more of your cells has an expected frequency of five or less. the predictor variables must be either dichotomous or continuous; they cannot be "Thistle density was significantly different between 11 burned quadrats (mean=21.0, sd=3.71) and 11 unburned quadrats (mean=17.0, sd=3.69); t(20)=2.53, p=0.0194, two-tailed.
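As a rough check on the reported thistle result, the pooled two-sample t-statistic can be recomputed from the summary statistics quoted above. This is only a sketch: the function name `pooled_t` is illustrative, and the critical value 2.086 (two-tailed, 0.05 level, 20 df) is taken from a standard t-table rather than from the text.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance two-sample t-test from summary statistics.

    Returns the t-statistic, its degrees of freedom, and the pooled variance.
    """
    df = n1 + n2 - 2
    # Pool the variances (not the standard deviations), weighted by df.
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df, sp2

# Summary statistics for burned vs. unburned quadrats, as quoted in the text.
t_stat, df, sp2 = pooled_t(21.0, 3.71, 11, 17.0, 3.69, 11)

T_CRIT_20 = 2.086  # assumption: tabled two-tailed 0.05 critical value, 20 df
reject_h0 = abs(t_stat) > T_CRIT_20

print(f"t({df}) = {t_stat:.3f}, pooled variance = {sp2:.2f}")
print("reject H0 at the 0.05 level:", reject_h0)
```

Because the means and standard deviations are rounded, the recomputed statistic may differ in the last digit from the reported t(20)=2.53.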
The examples linked provide general guidance which should be used alongside the conventions of your subject area. and based on the t-value (10.47) and p-value (0.000), we would conclude this It will show the difference between more than two ordinal data groups. For example, Let's round Thus, in some cases, keeping the probability of Type II error from becoming too high can lead us to choose a probability of Type I error larger than 0.05 such as 0.10 or even 0.20. Thus, from the analytical perspective, this is the same situation as the one-sample hypothesis test in the previous chapter. of ANOVA and a generalized form of the Mann-Whitney test method since it permits can do this as shown below. Recall that for the thistle density study, our scientific hypothesis was stated as follows: We predict that burning areas within the prairie will change thistle density as compared to unburned prairie areas. In this data set, y is the For the example data shown in Fig. For example, one or more groups might be expected. This is because the descriptive means are based solely on the observed data, whereas the marginal means are estimated based on the statistical model. next lowest category and all higher categories, etc. In other instances, there may be arguments for selecting a higher threshold. and normally distributed (but at least ordinal). The results indicate that the overall model is statistically significant (F = 58.60, p Again, this just states that the germination rates are the same. distributed interval variables differ from one another. We'll use a two-sample t-test to determine whether the population means are different. Some practitioners believe that it is a good idea to impose a continuity correction on the [latex]\chi^2[/latex]-test with 1 degree of freedom.
Examples: Regression with Graphics, Chapter 3, SPSS Textbook For Set B, recall that in the previous chapter we constructed confidence intervals for each treatment and found that they did not overlap. Thus, there is a very statistically significant difference between the means of the logs of the bacterial counts which directly implies that the difference between the means of the untransformed counts is very significant. This makes very clear the importance of sample size in the sensitivity of hypothesis testing. ", "The null hypothesis of equal mean thistle densities on burned and unburned plots is rejected at 0.05 with a p-value of 0.0194. categorical independent variable and a normally distributed interval dependent variable Each test has a specific test statistic based on those ranks, depending on whether the test is comparing groups or measuring an association. A factorial logistic regression is used when you have two or more categorical. The formula for the t-statistic initially appears a bit complicated. The best known association measure is the Pearson correlation: a number that tells us to what extent 2 quantitative variables are linearly related. The predictors can be interval variables or dummy variables, variable and you wish to test for differences in the means of the dependent variable y1 is 21,000 and the smallest If, for example, seeds are planted very close together and the first seed to absorb moisture robs neighboring seeds of moisture, then the trials are not independent. 4.4.1): Figure 4.4.1: Differences in heart rate between stair-stepping and rest, for 11 subjects (shown in a stem-leaf plot that can be drawn by hand). At the bottom of the output are the two canonical correlations. (We will discuss different [latex]\chi^2[/latex] examples in a later chapter.)
relationship is statistically significant. 3 pulse measurements from each of 30 people assigned to 2 different diet regimens and In this example, because all of the variables loaded onto value. using the thistle example also from the previous chapter. We reject the null hypothesis very, very strongly! Note that for one-sample confidence intervals, we focused on the sample standard deviations. The [latex]\chi^2[/latex]-distribution is continuous. (The exact p-value is 0.071. The threshold value we use for statistical significance is directly related to what we call Type I error. command to obtain the test statistic and its associated p-value. broken down by the levels of the independent variable. In other words, the proportion of females in this sample does not (write), mathematics (math) and social studies (socst). We will use the same example as above, but we second canonical correlation of .0235 is not statistically significantly different from To create a two-way table in SPSS: Import the data set. From the menu bar select Analyze > Descriptive Statistics > Crosstabs. Click on variable Smoke Cigarettes and enter this in the Rows box. Note that the smaller value of the sample variance increases the magnitude of the t-statistic and decreases the p-value. A typical marketing application would be A-B testing. You Although the Wilcoxon-Mann-Whitney test is widely used to compare two groups, the null It would give me a probability to get an answer more than the other one I guess, but I don't know if I have the right to do that. For categorical variables, the [latex]\chi^2[/latex] statistic was used to make statistical comparisons. Statistically (and scientifically) the difference between a p-value of 0.048 and 0.0048 (or between 0.052 and 0.52) is very meaningful even though such differences do not affect conclusions on significance at 0.05.
This data file contains 200 observations from a sample of high school The smallest observation for equal number of variables in the two groups (before and after the with). Suppose you have concluded that your study design is paired. Scientific conclusions are typically stated in the Discussion sections of a research paper, poster, or formal presentation. If the responses to the question reveal different types of information about the respondents, you may want to think about each particular set of responses as a multivariate random variable. The point of this example is that one (or Suppose that one sandpaper/hulled seed and one sandpaper/dehulled seed were planted in each pot, one in each half. Figure 4.1.3 can be thought of as an analog of Figure 4.1.1 appropriate for the paired design because it provides a visual representation of this mean increase in heart rate (~21 beats/min), for all 11 subjects. SPSS Learning Module: An Overview of Statistical Tests in SPSS, SPSS Textbook Examples: Design and Analysis, Chapter 7, SPSS Textbook A stem-leaf plot, box plot, or histogram is very useful here. Note that we pool variances and not standard deviations! Graphing Results in Logistic Regression, SPSS Library: A History of SPSS Statistical Features. [latex]Y_{1}\sim B(n_1,p_1)[/latex] and [latex]Y_{2}\sim B(n_2,p_2)[/latex]. As noted above, for Data Set A, the p-value is well above the usual threshold of 0.05. 0.56, p = 0.453. variable. measured repeatedly for each subject and you wish to run a logistic
Chapter 2, SPSS Code Fragments: Quantitative Analysis Guide: Choose Statistical Test for 1 Dependent Variable Choosing a Statistical Test This table is designed to help you choose an appropriate statistical test for data with one dependent variable. We note that the thistle plant study described in the previous chapter is also an example of the independent two-sample design. Ordered logistic regression, SPSS We emphasize that these are general guidelines and should not be construed as hard and fast rules. Another key part of ANOVA is that it splits the independent variable into 2 or more groups. Figure 4.5.1 is a sketch of the [latex]\chi^2[/latex]-distributions for a range of df values (denoted by k in the figure). Hence, there is no evidence that the distributions of the Analysis of covariance is like ANOVA, except in addition to the categorical predictors Because Suppose that you wish to assess whether or not the mean heart rate of 18 to 23 year-old students after 5 minutes of stair-stepping is the same as after 5 minutes of rest. These first two assumptions are usually straightforward to assess. By use of D, we make explicit that the mean and variance refer to the difference! Because that assumption is often not With a 20-item test you have 21 different possible scale values, and that's probably enough to use an independent groups t-test as a reasonable option for comparing group means. Suppose that 15 leaves are randomly selected from each variety and the following data presented as side-by-side stem leaf displays (Fig. The choice of Type II error rates in practice can depend on the costs of making a Type II error. Similarly, when the two values differ substantially, then [latex]X^2[/latex] is large.
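To make the behaviour of [latex]X^2[/latex] concrete, the sketch below computes the Pearson statistic for two 2x2 tables of germination counts. The counts are hypothetical (they are not taken from the germination example), the function name `chi_square_2x2` is illustrative, and the critical value 3.841 is the standard 0.05 cutoff for a [latex]\chi^2[/latex]-distribution with 1 df.

```python
def chi_square_2x2(table):
    """Pearson X^2 for a 2x2 table of counts given as [[a, b], [c, d]]."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if the two groups share one success probability.
            expected = row_totals[i] * col_totals[j] / n
            x2 += (table[i][j] - expected) ** 2 / expected
    return x2

# Hypothetical counts: rows = seed type, columns = (germinated, not germinated).
similar_rates = [[30, 70], [32, 68]]
different_rates = [[30, 70], [55, 45]]

CHI2_CRIT_1DF = 3.841  # assumption: tabled 0.05 critical value for 1 df

x2_similar = chi_square_2x2(similar_rates)
x2_different = chi_square_2x2(different_rates)

print(f"similar rates:   X^2 = {x2_similar:.3f}")
print(f"different rates: X^2 = {x2_different:.3f}")
```

In this sketch the observed and expected counts nearly coincide in the first table, so [latex]X^2[/latex] stays close to zero; in the second they differ substantially and [latex]X^2[/latex] exceeds the 1 df cutoff. No continuity correction is applied.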
Using notation similar to that introduced earlier, with [latex]\mu[/latex] representing a population mean, there are now population means for each of the two groups: [latex]\mu_1[/latex] and [latex]\mu_2[/latex]. The variance ratio is about 1.5 for Set A and about 1.0 for Set B. In some circumstances, such a test may be a preferred procedure. variables (listed after the keyword with). The scientist must weigh these factors in designing an experiment. To conduct a Friedman test, the data need As noted, experience has led the scientific community to often use a value of 0.05 as the threshold. example showing the SPSS commands and SPSS (often abbreviated) output with a brief interpretation of the correlation. 100, we can then predict the probability of a high pulse using diet Choosing the Correct Statistical Test in SAS, Stata, SPSS and R. The following table shows general guidelines for choosing a statistical analysis. In analyzing observed data, it is key to determine the design corresponding to your data before conducting your statistical analysis. The students in the different Also, in some circumstances, it may be helpful to add a bit of information about the individual values. suppose that we believe that the general population consists of 10% Hispanic, 10% Asian, In order to conduct the test, it is useful to present the data in a form as follows: The next step is to determine how the data might appear if the null hypothesis is true. Sure, you can compare groups one-way ANOVA style or measure a correlation, but you can't go beyond that. For example, using the hsb2 data file we will use female as our dependent variable, Comparing the two groups after 2 months of treatment, we found that all indicators in the TAC group were more significantly improved than that in the SH group, except for the FL, in which the difference had no statistical significance (P < 0.05).
The statistical test on [latex]b_1[/latex] tells us whether the treatment and control groups are statistically different, while the statistical test on [latex]b_2[/latex] tells us whether test scores after receiving the drug/placebo are predicted by test scores before receiving the drug/placebo. In such a case, it is likely that you would wish to design a study with a very low probability of Type II error since you would not want to approve a reactor that has a sizable chance of releasing radioactivity at a level above an acceptable threshold. We will develop them using the thistle example also from the previous chapter. Let [latex]D[/latex] be the difference in heart rate between stair and resting. If I may say, you are trying to find if answers given by participants from different groups have anything to do with their backgrounds. The mean of the variable write for this particular sample of students is 52.775, ANOVA - analysis of variance, to compare the means of more than two groups of data. It assumes that all The exercise group will engage in stair-stepping for 5 minutes and you will then measure their heart rates. The results indicate that even after adjusting for reading score (read), writing Thus, we might conclude that there is some but relatively weak evidence against the null. Immediately below is a short video providing some discussion on sample size determination along with discussion on some other issues involved with the careful design of scientific studies. and socio-economic status (ses). We concluded that: there is solid evidence that the mean numbers of thistles per quadrat differ between the burned and unburned parts of the prairie. As with the first possible set of data, the formal test is totally consistent with the previous finding. (Note: In this case past experience with data for microbial populations has led us to consider a log transformation.
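For the paired heart-rate design, where [latex]D[/latex] is the per-subject difference between stair-stepping and rest, the 95% confidence interval quoted earlier, [latex]21.545\pm 2.228\times 5.6809/\sqrt{11}[/latex], can be evaluated directly. The sketch below uses only the summary values given in the text (mean difference 21.545, standard deviation 5.6809, n = 11, critical value 2.228 for 10 df); the function name `paired_ci` is illustrative.

```python
import math

def paired_ci(mean_d, sd_d, n, t_crit):
    """Confidence interval for the mean of paired differences.

    Returns (lower, upper, standard error of the mean difference).
    """
    se = sd_d / math.sqrt(n)
    half_width = t_crit * se
    return mean_d - half_width, mean_d + half_width, se

# Values as quoted in the text for the 11 subjects.
lower, upper, se = paired_ci(21.545, 5.6809, 11, 2.228)

# Paired t-statistic for H0: mu_D = 0, on n - 1 = 10 degrees of freedom.
t_paired = 21.545 / se

print(f"95% CI for mu_D: ({lower:.2f}, {upper:.2f}) beats/min")
print(f"paired t(10) = {t_paired:.2f}")
```

The interval lies well above zero, which agrees with the strong rejection of the null hypothesis of no difference described for this example.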
The remainder of the "Discussion" section typically includes a discussion on why the results did or did not agree with the scientific hypothesis, a reflection on reliability of the data, and some brief explanation integrating literature and key assumptions. For our example using the hsb2 data file, let's T-tests are used when comparing the means of precisely two groups (e.g., the average heights of men and women). Careful attention to the design and implementation of a study is the key to ensuring independence. As noted with this example and previously, it is good practice to report the p-value rather than just state whether or not the results are statistically significant at (say) 0.05. Before developing the tools to conduct formal inference for this clover example, let us provide a bit of background. Your analyses will be focused on the differences in some variable between the two members of a pair. Clearly, studies with larger sample sizes will have more capability of detecting significant differences. Communality (which is the opposite These plots in combination with some summary statistics can be used to assess whether key assumptions have been met. (3) Normality: The distributions of data for each group should be approximately normally distributed. command is the outcome (or dependent) variable, and all of the rest of You could sum the responses for each individual. If we assume that our two variables are normally distributed, then we can use a t-statistic to test this hypothesis (don't worry about the exact details; we'll do this using R). to assume that it is interval and normally distributed (we only need to assume that write Thus, we can write the result as [latex]0.20\leq p\text{-val} \leq 0.50[/latex].
(For the quantitative data case, the test statistic is T.) If there are potential problems with this assumption, it may be possible to proceed with the method of analysis described here by making a transformation of the data.