Bivariate Data

We wish to emphasize that the data sets presented so far are associated with a single random variable. This means that each sample refers to a single, specific population or population characteristic. The technical name for such data is univariate. Data which derives from two separate populations or population characteristics is called bivariate.

Categorical Data

Statistics is often introduced using quantitative data such as height, weight, waiting time, etc. But there is a wide world of important data of a different type: qualitative, or categorical, data. As the name suggests, categorical data is expressed, as a random variable, in terms of categories or cells; two or more information[…]

Testing Categorical Data

Introduction. Chi-square tests are generally applied to categorical data. Yet the chi-square distribution is the distribution of a continuous random variable. How can this be? The answer may be found in the history of statistical methods. Just as de Moivre and Laplace sought and found the normal approximation to the binomial, Karl Pearson sought for and[…]
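As a small illustration of how counts in categories give rise to a chi-square statistic, the sketch below computes the usual Pearson goodness-of-fit statistic, the sum over cells of (observed - expected)^2 / expected. The function name `goodness_of_fit_statistic` and the die-roll counts are hypothetical, chosen only for this example; they do not come from the post.

```python
def goodness_of_fit_statistic(observed, expected):
    """Pearson statistic: sum of (O - E)^2 / E over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 rolls of a die, tallied by face (illustrative only).
observed_counts = [8, 12, 9, 11, 10, 10]

# Under the claim that the die is fair, each face is expected 60 / 6 = 10 times.
expected_counts = [sum(observed_counts) / 6] * 6

gof_stat = goodness_of_fit_statistic(observed_counts, expected_counts)

# With 6 cells and the total fixed, there are 6 - 1 = 5 degrees of freedom.
degrees_of_freedom = len(observed_counts) - 1

print(gof_stat, degrees_of_freedom)
```

The statistic itself can take only a discrete set of values, since it is built from counts; the continuous chi-square distribution enters as an approximation to its sampling distribution when the expected counts are reasonably large.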

Some Distribution Theory

Here we outline some of the theory behind the chi-square, t- and F-distributions. Recall from Inferential Statistics that the chi-squared test uses the chi-squared statistic to determine whether or not two population characteristics are related in some way. While a large value for the chi-squared statistic suggests that the two characteristics are not independent, a[…]
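To make the link between the size of the statistic and independence concrete, here is a minimal sketch that computes the statistic for a two-way table of counts, using expected counts of (row total × column total) / grand total. The function name `independence_statistic` and both tables are hypothetical, invented for illustration.

```python
def independence_statistic(table):
    """Pearson statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two characteristics were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical table whose rows are exactly proportional (illustrative only).
proportional_table = [[10, 20], [20, 40]]

# Hypothetical table with a strong association between rows and columns.
associated_table = [[30, 10], [10, 30]]

small_stat = independence_statistic(proportional_table)
large_stat = independence_statistic(associated_table)
print(small_stat, large_stat)
```

In the first table the observed counts match the expected counts exactly, so the statistic is zero; in the second, every cell differs from its expected count of 20 by 10, and the statistic is correspondingly large.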

Single Sample Inferences

This page is under construction. In Inferential Statistics we saw that the methods of data interpretation depend, among other things, on the number of samples involved and on the size, n, of the sample(s) under consideration. Here we examine situations where the sample size is large; that is, where n >[…]

Inferential Statistics

As we saw in Statistics Defined, inferential statistics may be defined as the science of data interpretation. Data interpretation involves drawing one or more conclusions about a population underlying a data set accompanied by statements about the reliability of such conclusions. The process begins with a claim about some numerical feature of a population (a[…]