Sample collection is a nuanced process that requires careful planning before heading into the field. What samples to gather, when to collect them, and how to obtain them are three critical decisions that should be made in advance to ensure an effective sampling endeavour.
Researchers must bear in mind that, because microbial communities are intricate and dynamic, any sample acquired may already deviate from its original state. This discrepancy can be attributed to temporal shifts, spatial disparities, and differences between the sampled substrate and the actual subject of study. An illustrative instance is the collection of faecal samples, which are known to represent the true community within the lower intestine only imperfectly.
Furthermore, if faecal samples are collected from the environment rather than directly from the host animal, their microbial communities may change in response to shifts in physicochemical parameters (such as oxygen levels and temperature) caused by exposure to air and sunlight. Colonisation by new bacteria from the surrounding environment can also alter the samples. Hence, it is advisable to obtain faecal samples directly from the animals or from sterile containers in which animals are temporarily housed, thereby minimising environmental influences.
It is important to acknowledge that samples typically exhibit spatial structure. Consequently, when subsampling is performed, the composition of the subsample will depend on the precise location (down to the micron level) from which it was taken. Where collecting faecal samples from the environment is unavoidable, biases introduced by exposure can be mitigated by targeting internal subsamples and avoiding external layers. However, this approach might introduce its own spatial biases.
Large samples often cannot be processed in full, because DNA preservation and extraction methods limit the amount of material that can be handled. Researchers may therefore consider gathering multiple small subsamples using either random or structured subsampling strategies. Unlike a single large sample, this approach captures the spatial diversity of the sample. The subsamples can then be pooled to represent the entire sample or, ideally, processed individually as distinct biological replicates. This practice could also improve metagenomic binning.
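
The difference between random and structured subsampling can be illustrated with a minimal sketch. It assumes, purely for illustration, that the sample is approximated as a flat rectangle measured in millimetres; the function names and parameters below are hypothetical and do not come from any established protocol or library.

```python
import random


def random_subsample_positions(width_mm, height_mm, n, seed=None):
    """Draw n subsample positions uniformly at random within the sample.

    Assumption: the sample is approximated as a width_mm x height_mm rectangle.
    """
    rng = random.Random(seed)  # a fixed seed makes the sampling plan reproducible
    return [
        (rng.uniform(0, width_mm), rng.uniform(0, height_mm))
        for _ in range(n)
    ]


def structured_subsample_positions(width_mm, height_mm, rows, cols):
    """Place one subsample position at the centre of each cell of a regular grid."""
    cell_w = width_mm / cols
    cell_h = height_mm / rows
    return [
        ((c + 0.5) * cell_w, (r + 0.5) * cell_h)
        for r in range(rows)
        for c in range(cols)
    ]


def label_replicates(positions, prefix="subsample"):
    """Give each position its own identifier so that subsamples can be
    processed individually as replicates rather than pooled."""
    return {f"{prefix}_{i + 1}": pos for i, pos in enumerate(positions)}


# Example plan: six random positions and a 2 x 3 grid on a 30 mm x 20 mm sample.
random_plan = label_replicates(random_subsample_positions(30, 20, n=6, seed=1))
grid_plan = label_replicates(structured_subsample_positions(30, 20, rows=2, cols=3))
```

A grid guarantees even coverage of the sample surface, whereas random placement avoids aligning the subsamples with any regular pattern in the material. Which strategy is preferable depends on the sample and the study question.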