The log transformation, a widely used method to address skewed data, is one of the most popular transformations used in biomedical and psychosocial research. Due to its ease of use and popularity, the log transformation is included in most major statistical software packages, including SAS, Splus and SPSS. Unfortunately, its popularity has also made it vulnerable to misuse, even by statisticians, leading to incorrect interpretation of experimental results. Such misuse and misinterpretation are not unique to this particular transformation; they are a common problem with many popular statistical methods. For example, the two-sample t-test is widely used to compare the means of two independent samples with normally distributed (or approximately normal) data, but many researchers take this critical assumption for granted, using t-tests without bothering to check or even acknowledge it. Another example is the Cox regression model used in survival analysis: many studies apply this popular model without even being aware of the proportionality assumption (i.e., the relative hazard of groups of interest is constant over time) required for valid inference.

In this article we focus on the log transformation and discuss major problems of using this method in practice. We use examples and simulated data to show that this method often does not resolve the original problem for which it is being used (i.e., non-normal distribution of primary data) and that using this transformation can introduce new problems that are even more difficult to deal with than the problem of non-normal data. We conclude with recommendations of alternative analytic methods that eliminate the need to transform non-normal data prior to analysis.

2.1. Using the log transformation to make data conform to normality

The normal distribution is widely used in basic and clinical research studies to model continuous outcomes. Unfortunately, the symmetric bell-shaped distribution often does not adequately describe the observed data from research projects. Quite often data arising in real studies are so skewed that standard statistical analyses of these data yield invalid results. Many methods have been developed to test the normality assumption of observed data. When the distribution of the continuous data is non-normal, transformations of data are applied to make the data as "normal" as possible and, thus, increase the validity of the associated statistical analyses. The log transformation is, arguably, the most popular among the different types of transformations used to transform skewed data to approximately conform to normality.

If the original data follow a log-normal distribution, or approximately so, then the log-transformed data follow a normal or near-normal distribution. In this case, the log transformation does remove or reduce skewness. Unfortunately, data arising from many studies do not approximate the log-normal distribution, so applying this transformation does not reduce the skewness of the distribution. In fact, in some cases applying the transformation can make the distribution more skewed than the original data. To show how this can happen, we first simulated data $u_i$ uniformly distributed between 0 and 1, and then constructed two variables as follows: $x_i = 100(\exp(u_i) - 1) + 1$, $y_i = \log(x_i)$. Figure 1 shows the histogram of $x_i$ in the left panel and the histogram of $y_i$ (the log-transformed version of $x_i$) in the right panel, based on a sample size of n = 10,000.
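
The simulation described in this article (uniform u_i on [0, 1], a constructed variable x_i, and its log y_i) can be reproduced roughly as in the sketch below. It is an illustration under stated assumptions, not the authors' code: the formula is read as x_i = 100(exp(u_i) - 1) + 1, the function names, the seed, and the moment-based skewness estimator are choices made here, and the log-normal comparison is added only to contrast the case in which the transformation does work.

```python
import math
import random


def sample_skewness(values):
    """Moment-based sample skewness, m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def simulate_skewness(n=10_000, seed=0):
    """Return the skewness of each variable before and after the log transformation."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability

    # u_i ~ Uniform(0, 1), as in the text.
    u = [rng.random() for _ in range(n)]
    # Assumed reading of the formula: x_i = 100 * (exp(u_i) - 1) + 1, so x_i >= 1.
    x = [100 * (math.exp(ui) - 1) + 1 for ui in u]
    y = [math.log(xi) for xi in x]

    # Contrast case (not part of the original simulation): log-normal data,
    # for which the log transformation yields normally distributed values.
    z = [rng.lognormvariate(0.0, 1.0) for _ in range(n)]
    log_z = [math.log(zi) for zi in z]

    return {
        "x": sample_skewness(x),
        "y": sample_skewness(y),
        "lognormal": sample_skewness(z),
        "log_of_lognormal": sample_skewness(log_z),
    }


if __name__ == "__main__":
    for name, skew in simulate_skewness().items():
        print(f"skewness of {name}: {skew:.3f}")
```

Under these assumptions, x_i is only mildly right-skewed, whereas y_i is left-skewed with a larger magnitude, so the transformation increases rather than reduces skewness. For the log-normal comparison, the skewness of the logged values should be close to zero.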