Categorical Data and GLMs (CDA)

To enable students to use generalized linear models (GLMs) and other methods to analyse categorical data with proper attention to the underlying assumptions. There is an emphasis on the practical interpretation and communication of results to colleagues and clients who may not be statisticians.


Dr Michael Waller, School of Public Health, University of Queensland

General outline


Epidemiology, Mathematical Background for Biostatistics, Probability and Distribution Theory, Principles of Statistical Inference


Linear Models

Time commitment

8-12 hours total study time per week

Semester availability

Semester 2


Three assignments: the first covers modules 1-3 (30%), the second covers modules 4-5 (35%), and the third covers module 6 (35%)

Prescribed Texts

References will be listed in the unit Study Guide

Special Computer Requirements

Stata or R statistical software


Introduction to and revision of conventional methods for contingency tables, especially in epidemiology: odds ratios and relative risks, chi-squared tests for independence, Mantel-Haenszel methods for stratified tables, and methods for paired data. The exponential family of distributions, generalized linear models (GLMs) and parameter estimation for GLMs. Inference for GLMs, including residuals and the use of score, Wald and deviance statistics for confidence intervals and hypothesis tests. Binary variables and logistic regression models, including methods for assessing model adequacy. Nominal and ordinal logistic regression for categorical response variables with more than two categories. Count data, Poisson regression and log-linear models.

Special Computer Requirements

Course notes, assignment material, tutorials and interaction facilities available online