20 Data transformations

Before a multivariate dataset is analysed, the different data types it contains (such as presence/absence of taxa, gene counts, metabolic capacity indices, and metabolite concentrations) may need to be transformed. Data transformation is the process of changing the scale or distribution of data, either to meet the assumptions of a statistical model or to make the data easier to interpret. Transformations can be applied to individual variables or to the entire dataset, and can involve a variety of mathematical operations, such as scaling, centring, and rescaling. Data transformations can be categorised by the objective they pursue: