Why is it important?

As data become more complex, correct handling and analysis are essential for obtaining valid results.

How is it done?

So far, we have discussed comparing one outcome with one variable, and with multiple variables, without taking time or missing data into account. In the medical literature, the most common type of data involving both time and incomplete observation is survival data. Patients are followed up to a time point, at which they have either died or are still alive (censored). The most common regression method in this setting is Cox proportional hazards regression, which takes both time and censoring into account. The measure of association is the hazard ratio, which compares the rate of death at any given time between groups and is often interpreted as a relative risk of death.
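To make the idea concrete, here is a minimal sketch of how a hazard ratio can be estimated for a single binary variable (for example, treatment group) by maximising the Cox partial likelihood with Newton-Raphson. The function name `cox_hazard_ratio` and the toy data are invented for illustration, and the code assumes there are no tied death times. In practice you would use an established statistics package rather than hand-written code.

```python
import math

def cox_hazard_ratio(times, events, groups, max_iter=50, tol=1e-9):
    """Hazard ratio for group 1 vs group 0 from a one-covariate Cox model.

    times  : follow-up time for each patient
    events : 1 if the patient died, 0 if censored (alive at last follow-up)
    groups : 0 or 1 (hypothetical binary covariate)
    """
    beta = 0.0  # log hazard ratio, starting at "no difference"
    for _ in range(max_iter):
        gradient = 0.0
        information = 0.0
        for t_i, died, x_i in zip(times, events, groups):
            if not died:
                continue  # censored patients contribute only through risk sets
            # Risk set: everyone still under follow-up at time t_i
            s0 = 0.0
            s1 = 0.0
            for t_j, x_j in zip(times, groups):
                if t_j >= t_i:
                    weight = math.exp(beta * x_j)
                    s0 += weight
                    s1 += x_j * weight
            p = s1 / s0  # weighted share of the risk set that is in group 1
            gradient += x_i - p
            information += p * (1 - p)  # this form holds because x is 0 or 1
        step = gradient / information
        beta += step
        if abs(step) < tol:
            break
    return math.exp(beta)

# Toy data, invented for illustration only (time in months)
times  = [5, 8, 12, 15, 20, 24,  2, 3, 6, 9, 10, 14]
events = [1, 0,  1,  0,  1,  0,  1, 1, 1, 0,  1,  1]
groups = [0, 0,  0,  0,  0,  0,  1, 1, 1, 1,  1,  1]

hr = cox_hazard_ratio(times, events, groups)
print(f"Estimated hazard ratio (group 1 vs group 0): {hr:.2f}")
```

A hazard ratio above 1 here means deaths occur at a higher rate in group 1. Note that censored patients are not discarded: they remain in the risk set until the time they were last seen.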

As data become more complicated, more sophisticated methods are applied, such as longitudinal data analysis, used when outcomes at multiple time points are of interest. The most common thoracic surgical example is longitudinal lung function after lung volume reduction surgery. The analysis needs to take into account time, irregular time intervals, correlation between measurements within each patient, and correlation with time, before a comparison can be made with another group.
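One common way to meet these requirements is a linear mixed-effects model. The formulation below is a sketch under assumptions, not necessarily the method used in any particular study. The symbols are chosen here for illustration: y_ij is the lung function of patient i at measurement j, t_ij is the time of that measurement (which may be irregular), and g_i is the treatment group.

```latex
y_{ij} = \beta_0 + \beta_1 t_{ij} + \beta_2 g_i + \beta_3 \, g_i t_{ij}
         + b_{0i} + b_{1i} t_{ij} + \varepsilon_{ij}
```

The patient-specific terms b_0i and b_1i (a random intercept and a random slope) capture the correlation between repeated measurements within the same patient, and beta_3 is the between-group difference in the rate of change over time.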

What is the relevance?

If proper statistical methods are not used to account for time and missing data, erroneous conclusions can result. For example, suppose a new surgical intervention has only 1 month of follow-up, whereas the old technique has been in use for over 30 years and has 30 years of follow-up. Few deaths will occur in the new technique group in 1 month compared with the old technique group over 30 years, and a researcher could falsely conclude that there were statistically significantly fewer deaths on follow-up with the new technique.
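This pitfall can be illustrated with a small simulation. The sketch below assumes that both techniques have exactly the same constant risk of death (an invented hazard of 0.05 per year) and differ only in follow-up time. The function name, sample size, and hazard are all hypothetical values chosen for illustration.

```python
import random

def simulate_group(n, hazard_per_year, follow_up_years, rng):
    """Return (deaths, person_years) for n patients with a constant hazard."""
    deaths = 0
    person_years = 0.0
    for _ in range(n):
        time_to_death = rng.expovariate(hazard_per_year)
        if time_to_death <= follow_up_years:
            deaths += 1
            person_years += time_to_death
        else:
            # Censored: still alive when follow-up ends
            person_years += follow_up_years
    return deaths, person_years

rng = random.Random(42)   # fixed seed so the run is reproducible
HAZARD = 0.05             # assumed identical in both groups
N = 100_000               # patients per group (hypothetical)

new_deaths, new_py = simulate_group(N, HAZARD, 1 / 12, rng)  # 1 month
old_deaths, old_py = simulate_group(N, HAZARD, 30, rng)      # 30 years

# Naive comparison: proportion of patients who died, ignoring follow-up time
print(f"New technique: {new_deaths / N:.1%} died")
print(f"Old technique: {old_deaths / N:.1%} died")

# Time-aware comparison: deaths per person-year of follow-up
new_rate = new_deaths / new_py
old_rate = old_deaths / old_py
print(f"New technique: {new_rate:.3f} deaths per person-year")
print(f"Old technique: {old_rate:.3f} deaths per person-year")
```

The naive proportions differ enormously even though the underlying risk is identical, while the rates per person-year are similar. Accounting for follow-up time, as survival methods do, removes the false difference.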

If you find this type of teaching useful and would like to learn more, I run an online statistics course for clinicians and researchers: