ANALYSIS OF THE BACTERIAL EPIPHYTIC MICROBIOTA OF OAK LEAF LETTUCE WITH 16S RIBOSOMAL RNA GENE ANALYSIS

The leaf microbiota has a major influence on the quality of ready-to-eat lettuce. While studies investigating the combined epi- and endophytic microbiota of lettuce have been published, no protocols focusing only on the epiphytic microbiota exist. As the epiphytic microbiota may be especially influenced by technological steps in the production of ready-to-eat lettuce, an in-depth knowledge of these microorganisms is essential with regard to consumer safety and spoilage. Currently it is not clear to what extent results gained from single samples are representative of the community composition. A technique for the separation of bacterial cells from the leaf surface was applied to green oak leaf lettuce. The bacterial diversity was analysed in triplicate with high-throughput Roche 454 sequencing of prokaryotic 16S rRNA genes to assess the intra-sample variation. Sequence analysis revealed members of the phyla Acidobacteria, Actinobacteria, Bacteroidetes, Firmicutes, Gemmatimonadetes, Proteobacteria and Verrucomicrobia, and of the candidate division WYO. Of the ten most abundant genera in all three samples, eight were proteobacterial: Alkanindiges (24.6%), Pseudomonas (11.3%), Sphingomonas (8.6%), Janthinobacterium (8.3%), Acinetobacter (4.3%), Polaromonas (1.3%), Erwinia (1.1%), and Methylobacterium (1.1%). The genera Pedobacter (2.5%) and Hymenobacter (1.4%) dominated the phylum Bacteroidetes. The intra-sample variation was less than 0.7% for seven of these ten genera; the exceptions were Pseudomonas, Janthinobacterium and Alkanindiges, for which larger standard deviations were obtained. This low intra-sample variation demonstrates that the technique established with oak leaf lettuce is suitable for the culture-independent analysis of the epiphytic bacterial microbiota of produce.

In the present study, we developed an extensive protocol, from sample preparation to next generation sequencing, to assess the bacterial diversity on the surface of oak leaf lettuce. This cultivar was chosen as an example of a cultivar with a delicate leaf structure. A procedure for the separation of the bacterial cells from the leaf surfaces while retaining leaf integrity was established. A biodiversity analysis using Roche 454 sequencing of 16S rRNA gene amplicons was performed. This procedure was evaluated with DNA retrieved from a single green oak leaf lettuce head to determine the reproducibility of the method and the degree of intra-sample variation within a single lettuce sample.

Bacterial biomass harvest and DNA isolation
One head of conventionally grown green oak leaf lettuce (Lactuca sativa var. crispa) was obtained from a local retail store. The wrapper leaves were removed under sterile conditions by cutting with a scalpel. To gain a representative amount of biomass, ten middle-sized leaves from this lettuce head were used as described below. DNA was isolated from each of the leaves and the DNA preparations were subsequently combined into a single DNA solution. Bacterial biomass harvest was carried out as follows: the cut edges of the ten middle-sized leaves (approximately 10 g of lettuce each) were sealed with commercial nail polish (Manhattan Lotus Effect, Coty Germany GmbH, Germany) to reduce chloroplast effusion. The leaves were transferred into sterile blender bags with lateral filters (BBAG-03, Gosselin SAS, France) and the appropriate nine-fold volume of sterile buffer (1% (w/v) TWEEN® 80 (Sigma-Aldrich Co., U.S.A.), 1% (w/v) buffered peptone water (Merck KGaA, Germany), 10 mM EDTA (Biomol GmbH, Germany); pH 6.7) was added. The leaves were completely covered by the buffer. The bags were then heat-sealed, placed into a Pulsifier® PUL 100E device (Microgen Bioproducts Ltd., UK) and treated for 45 s to detach microorganisms from the leaf surface. With this procedure, removal of bacteria was highest and leaves remained largely intact (data not shown). Afterwards, the complete buffer of each leaf was divided into two equal volumes, which were transferred into two sterile 50 mL centrifuge tubes. The tubes were centrifuged at 8,000 × g for 20 min at 4°C. Supernatants were discarded, and the four cell pellets of two leaves were pooled into 2 mL reaction tubes, resulting in five tubes, and stored at -20°C.
Genomic DNA was prepared from the five cell pellets with a protocol modified from Wilson (1997). Each bacterial cell pellet was re-suspended in 567 µL of lysozyme solution (3 mg/mL in TE buffer (10 mM Tris, 1 mM EDTA; pH 8.0)) by vigorous blending. After the addition of cetyl trimethylammonium bromide (CTAB) and 0.7 M sodium chloride and incubation at 65°C for 10 min, the samples were cooled to room temperature, and 50 µL of ribonuclease A solution (3 mg/mL in TE buffer) was added. The samples were mixed and incubated at 37°C for 2 h. The first extraction step was performed with 800 µL of a mixture of phenol:chloroform:isoamyl alcohol (25:24:1) (Carl Roth GmbH & Co. KG, Germany) by mixing and centrifugation in a bench top centrifuge (17,950 × g, 4°C, 5 min). The upper aqueous phase was transferred into a fresh tube and 800 µL of chloroform:isoamyl alcohol (24:1) (Carl Roth GmbH & Co. KG) was added. The sample was mixed and centrifuged under the same conditions as described above. The aqueous phase was again transferred to a new tube and this step was repeated once. Finally, 600 µL of ice-cold isopropanol (-20°C; Merck KGaA) was added and the DNA was precipitated for at least 0.5 h at -20°C. The precipitate was centrifuged (17,950 × g, 4°C, 15 min), the supernatant was discarded, and the pellet was washed with 1 mL of 70% (v/v) ethanol (-20°C). The dried pellet was re-suspended in 40 µL of DNase- and RNase-free sterile water (Gibco®, Thermo Fisher Scientific Inc.) and stored at 4°C overnight. The DNA was solubilised by heating at 65°C for 5 min. The solutions of the five parallel samples were pooled, dried with a SpeedVac™ Vacuum System (Thermo Fisher Scientific Inc.) at ambient temperature and re-suspended in 40 µL of DNase- and RNase-free sterile water. The DNA concentration and quality were determined with a NanoDrop™ ND-2000 device (PEQLAB Biotechnologie GmbH, Germany) and by agarose gel electrophoresis.
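The volumes and tube counts of the harvest step can be checked with a few lines of arithmetic. The following is only a bookkeeping sketch: the function name `harvest_plan` and its dictionary keys are hypothetical, and it assumes that the nine-fold buffer volume is calculated by treating 1 g of leaf as 1 mL, which the text does not state explicitly.

```python
# Bookkeeping sketch for the biomass harvest (names are illustrative only).
BUFFER_FOLD = 9        # nine-fold buffer volume (from the text)
TUBES_PER_LEAF = 2     # buffer of each leaf split into two 50 mL tubes
LEAVES_PER_POOL = 2    # pellets of two leaves pooled into one 2 mL tube


def harvest_plan(leaves=10, leaf_mass_g=10.0):
    """Return buffer volumes and tube counts for the harvest step."""
    # Assumption: 1 g of leaf material is treated as 1 mL.
    buffer_ml = BUFFER_FOLD * leaf_mass_g
    pellets = leaves * TUBES_PER_LEAF
    pooled_tubes = leaves // LEAVES_PER_POOL
    return {
        "buffer_ml_per_leaf": buffer_ml,
        "ml_per_centrifuge_tube": buffer_ml / TUBES_PER_LEAF,
        "pellets": pellets,
        "pooled_tubes": pooled_tubes,
        "pellets_per_pooled_tube": pellets // pooled_tubes,
    }


if __name__ == "__main__":
    for key, value in harvest_plan().items():
        print(f"{key}: {value}")
```

Under these assumptions each leaf receives about 90 mL of buffer, so each half (45 mL) fits into a 50 mL centrifuge tube, and the 20 pellets collapse into the five pooled tubes mentioned above.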

Amplification of 16S rRNA genes, library preparation and Roche 454 sequencing
The primers for the paired-end 16S rRNA gene community sequencing were 515F (5'-GTGCCAGCMGCCGCGGTAA-3') and 806R (5'-GGACTACHVGGGTWTCTAAT-3') (Caporaso et al., 2012). Three samples of the DNA solution were used for Roche 454 sequencing. For each of the three samples, four different PCRs were performed to reduce bias, using dilutions of the purified DNA at concentrations of 10⁻¹, 10⁻², 10⁻³ and 10⁻⁴. Each 25 µL reaction contained 200 nM of each primer, 1 µL of template DNA as indicated above, 10 µL of 5 PRIME HotMasterMix (5 PRIME GmbH, U.S.A.), and 13 µL of PCR grade water. PCR cycle parameters consisted of an initial denaturation at 94°C for 3 min, then 35 cycles of denaturation at 94°C for 45 s, annealing at 50°C for 1 min and extension at 72°C for 1.5 min, followed by a final extension step at 72°C for 10 min, as described on the Earth Microbiome website (http://www.earthmicrobiome.org/emp-standard-protocols/16S). PCR products were obtained from template dilutions of 10⁻², 10⁻³ and 10⁻⁴. These products were pooled for each sample separately (55 µL each), mixed with 275 µL of PB buffer (Qiagen GmbH, U.S.A.), and purified using a MinElute® PCR Purification Kit (Qiagen GmbH) according to the manufacturer's protocol. The amplicons were eluted with 20 µL of PCR grade water and quantified with a NanoDrop® ND-1000 spectrophotometer (Thermo Fisher Scientific Inc.). The concentrations of the three samples were 50, 35 and 50 ng/µL, respectively.
The following steps were performed for each sample independently. For the fragment end repair, 500 ng of purified amplicon was used, and an adaptor ligation was carried out according to the Rapid Library Preparation Method Manual V. 3/2012 (F. Hoffmann-La Roche Ltd., Switzerland). The resulting sample (27 µL) was mixed with 3 µL of BlueJuice™ gel loading buffer (Thermo Fisher Scientific Inc.) and run on a 1% agarose gel (1× TAE) for 40 min at 4.5 V/cm. The ligation product was excised on a DarkReader® transilluminator (Clare Chemical Research, U.S.A.) and recovered using the MinElute® Gel Extraction Kit (Qiagen GmbH). Library quantitation was performed according to the Roche manual with a QuantiFluor™ ST fluorimeter (Promega, U.S.A.). An aliquot of the DNA library was diluted to a working stock of 1 × 10⁷ molecules/µL in TE buffer. An emulsion PCR was then performed with 11 µL of the working stock according to the emPCR Amplification Method Manual - Lib-L V. 3/2012 (F. Hoffmann-La Roche Ltd.). DNA library bead enrichment and sequencing primer annealing were performed according to the manufacturer's specifications using a REM-e module (F. Hoffmann-La Roche Ltd.) on a Biomek® 3000 robot (Beckman Coulter GmbH, U.S.A.). Finally, the parallel samples were independently sequenced on a 454 GS Junior System (F. Hoffmann-La Roche Ltd.) according to the Sequencing Method Manual V. 1/2013 (F. Hoffmann-La Roche Ltd.).
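The pipetting volumes implied by this section can be cross-checked with a short calculation. This is a minimal sketch rather than part of the published protocol: the helper names are hypothetical, and the primer stock concentration of 10 µM is an assumption, since the text gives only the final concentration of 200 nM.

```python
# Cross-check of PCR and library input volumes (helper names are illustrative).
REACTION_UL = 25.0
PRIMER_STOCK_UM = 10.0   # ASSUMPTION: stock concentration is not stated in the text


def primer_volume_ul(final_nm, reaction_ul, stock_um):
    """Volume of primer stock needed, from C1 * V1 = C2 * V2."""
    return (final_nm / 1000.0) * reaction_ul / stock_um


def volume_for_mass_ul(mass_ng, conc_ng_per_ul):
    """Volume of amplicon solution that contains the requested DNA mass."""
    return mass_ng / conc_ng_per_ul


primer_ul = primer_volume_ul(200, REACTION_UL, PRIMER_STOCK_UM)
# Two primers + 1 uL template + 10 uL master mix + 13 uL water (from the text).
total_ul = 2 * primer_ul + 1.0 + 10.0 + 13.0

# 275 uL of PB buffer for 55 uL of pooled product is a 5:1 ratio.
pb_ratio = 275 / 55

# Volume holding 500 ng for the measured concentrations of 50, 35 and 50 ng/uL.
input_volumes = [volume_for_mass_ul(500, c) for c in (50, 35, 50)]

if __name__ == "__main__":
    print(f"primer volume: {primer_ul:.2f} uL each, total {total_ul:.1f} uL")
    print(f"PB buffer to sample ratio: {pb_ratio:.0f}:1")
    print("volumes for 500 ng:", [round(v, 1) for v in input_volumes])
```

With a 10 µM stock, 0.5 µL of each primer closes the reaction volume to exactly 25 µL, which suggests that the stated component volumes are internally consistent under this assumption.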

16S rRNA amplicon sequencing data analysis
Forward and reverse reads were joined using fastq-join from the ea-utils software package (Aronesty, 2011). The sequences were then quality-filtered to exclude sequences with ambiguous bases (N) and sequences containing base calls with less than 99% confidence (Phred score of < 20). The QIIME bioinformatics software suite (version 1.5.0) was used to subsample the resultant sequences (82,311, 99,654 and 95,216 sequences, respectively), to cluster those sequences into operational taxonomic units (OTUs) at 97% identity with UCLUST (Edgar, 2010), and to taxonomically classify each OTU with the RDP classifier (Wang et al., 2007) by placement at 97% sequence identity into the GreenGenes 12_10 database (DeSantis et al., 2006). All QIIME scripts were run using default parameters unless otherwise stated. The sequences were deposited in the NCBI Sequence Read Archive (SRA) under the accession numbers SRR1523744 (sample 1), SRR1265099 (sample 2) and SRR1265426 (sample 3).
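The link between the Phred threshold and the 99% confidence level, and the two exclusion rules, can be illustrated as follows. This is a simplified sketch and not the code of the tools actually used; the function names are hypothetical, and the Phred+33 ASCII encoding of the quality string is an assumption.

```python
# Sketch of the quality filter described above (not the original tool code).

def phred_to_error_prob(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def decode_phred33(quality_string):
    """Convert a FASTQ quality string to scores (ASSUMES Phred+33 encoding)."""
    return [ord(ch) - 33 for ch in quality_string]


def passes_filter(sequence, qualities, min_q=20):
    """Keep a read only if it has no ambiguous base and no base call below min_q."""
    has_ambiguous = "N" in sequence.upper()
    return not has_ambiguous and all(q >= min_q for q in qualities)


if __name__ == "__main__":
    print(phred_to_error_prob(20))                          # error probability at Q20
    print(passes_filter("ACGT", decode_phred33("IIII")))    # high-quality read
    print(passes_filter("ACNT", decode_phred33("IIII")))    # contains N
    print(passes_filter("ACGT", decode_phred33("II4I")))    # one call at Q19
```

A Phred score of 20 corresponds to an error probability of 0.01, so requiring every base call to reach at least Q20 is equivalent to the 99% confidence criterion stated in the text.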

16S rRNA gene amplicon sequencing results
In order to assess a method for the biodiversity analysis of leaf surface microbiota and to measure intra-sample variation, the total DNA of the leaf surface microbiota of a single lettuce head was isolated and quantified. A DNA concentration of 2,727.4 ng/µL with absorption ratios of A260/A280 of 1.85 and A260/A230 of 1.77 was obtained for the pooled DNA solution. Three samples of this DNA solution, called sample 1, sample 2 and sample 3, were analysed by 16S rRNA gene sequencing using a Roche 454 next generation sequencer as described above. The universal primers used have been described to cover the variable region 4 of the 16S rRNA gene of almost all bacterial phyla (Walters et al., 2011). The results and quality parameters of these runs are shown in Table 1. On average, 92,393 sequences were obtained per sample, with an average median length of 275 bp. Members of the domain Archaea were not detected, and the sequences assigned to eukaryotes originated from their mitochondria and chloroplasts (see Table 1). The covered richness expressed as the Chao1 index was rather high, reaching an average of 4,164.1. For all samples the rarefaction curves showed asymptotic curve progression, indicating that a very high bacterial diversity was detected in this study (Figure 1). As expected, the total number of sequences included mitochondrial as well as chloroplast sequences (see Table 1), which were discarded from the subsequent analysis. Furthermore, sequences that could not be assigned to a phylum or a candidate division were excluded from the analysis. The mean number of bacterial 16S rRNA gene sequences was therefore 80,126. These data indicate a high level of taxonomic identification and classification with the method applied, and that the database used was well suited to the analysed samples.
The data obtained by next generation sequencing were evaluated for the intra-sample variation. Approximately 450 OTUs per sample were detected. This demonstrates a diverse microbiota present on oak leaf lettuce, as the majority of sequences could be assigned to specific bacterial phyla. The number of sequences that remained unclassified or could only be assigned to the kingdom Bacteria strongly declined with increasing number of sequences per OTU in all samples. It has to be noted that diversity estimates are highly sensitive to the abundance of rare members of populations (White et al., 2010). Kunin et al. (2010) found that, when OTUs with single reads were included in the evaluation, sequencing errors inflated the actual diversity 100-fold; the present analysis was therefore restricted to OTUs containing at least five sequences. Of the remaining OTUs, on average six unclassified OTUs consisted of more than five sequences. For the kingdom Bacteria, on average twelve OTUs with more than five sequences were found that could not be assigned to a further taxonomic level. These may represent hitherto unidentified bacteria, but might also be attributed to sequencing errors that were not removed from the dataset despite de-noising.
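The two quantities that drive this part of the analysis, the Chao1 richness estimate and the restriction to OTUs with at least five sequences, can be sketched on a toy OTU table. The read counts below are invented purely for illustration, the function names are hypothetical, and the bias-corrected form of Chao1 is used here as an assumption; the text does not state which variant was calculated.

```python
# Toy illustration of Chao1 and the abundance cut-off (invented counts).

def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    observed = [c for c in counts if c > 0]
    singletons = sum(1 for c in observed if c == 1)
    doubletons = sum(1 for c in observed if c == 2)
    return len(observed) + singletons * (singletons - 1) / (2 * (doubletons + 1))


def abundant_otus(otu_table, min_reads=5):
    """Keep only OTUs that contain at least min_reads sequences."""
    return {otu: n for otu, n in otu_table.items() if n >= min_reads}


# Hypothetical OTU table: OTU identifier -> number of reads.
toy_table = {
    "otu1": 120, "otu2": 40, "otu3": 5, "otu4": 2,
    "otu5": 2, "otu6": 1, "otu7": 1, "otu8": 1,
}

kept = abundant_otus(toy_table)
share_of_reads = sum(kept.values()) / sum(toy_table.values())

if __name__ == "__main__":
    print("Chao1 estimate:", chao1(toy_table.values()))
    print("abundant OTUs:", sorted(kept))
    print(f"share of reads in abundant OTUs: {share_of_reads:.1%}")
```

Even in this small example the cut-off removes most OTUs while keeping almost all reads, which mirrors the pattern reported for the lettuce samples, where a minority of the OTUs held the large majority of the sequences.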

Phylogenetic analysis of abundant OTUs
The results gained from the three analysed samples were compared to determine the intra-sample variation. On average 452 OTUs contained at least five sequences, which corresponded to 32.0% of the bacterial 16S rRNA gene OTUs (see Table 1). Nevertheless, these OTUs contained on average 97.8% of the bacterial 16S rRNA gene sequences. Furthermore, on average 0.4% of the sequences could be identified at the kingdom level as Bacteria only and could not be assigned to a specific phylum (Table 2). Among the sequences of the abundant OTUs, members of the phyla Acidobacteria, Actinobacteria, Bacteroidetes, Firmicutes, Gemmatimonadetes, Proteobacteria, and Verrucomicrobia as well as the candidate division WYO were detected. As shown in Table 2, 89.9 ± 1.0% of the sequences belonged to the phylum Proteobacteria, followed by the phyla Bacteroidetes (7.1 ± 1.0%) and Actinobacteria (2.5 ± 0.4%). Only a minority of sequences was allocated to the remaining phyla. Within the Proteobacteria, on average 56.0% of the sequences were γ-Proteobacteria, 16.8% β-Proteobacteria, and 15.9% α-Proteobacteria, while only 0.1% of the sequences were δ-Proteobacteria. Only one OTU in sample 3, containing five sequences, was allocated to the ε-Proteobacteria, namely to the genus Arcobacter. The phyla detected in the current study match those reported for Romaine lettuce (Rastogi et al., 2012) and greenhouse-grown lettuce (Erlacher et al., 2014) for the total leaf microbiota of endophytic and epiphytic microorganisms, but, as expected, the composition differed. The majority of sequences were assigned to Gram-negative bacteria. This is in accordance with the results of other authors (e.g., Rastogi et al., 2012; Vorholt, 2012), even though those authors reported more members of the phylum Firmicutes. The focus of this work was on the Gram-negative population, as several members are known to produce biofilms and to be responsible for spoilage. Nevertheless, the DNA isolation procedure applied in the present study was re-assessed with a panel of selected representative Gram-positive and Gram-negative pure cultures. Furthermore, harsher DNA isolation procedures including bead-beating steps were evaluated for epiphytic phyllosphere community biomass samples, which yielded inferior DNA quality (data not shown). In their comparison of a commercial DNA isolation kit with a conventional DNA isolation procedure for application in 454 pyrosequencing, Cruaud et al. (2014) found only low levels of dissimilarity in taxonomic affiliations. Furthermore, the primers applied in the current study should also have amplified Gram-positive sequences. The low numbers of Gram-positives detected could also be attributed to the leaf type, as the microbiota of lettuce was shown to depend on the cultivar (Hunter et al., 2010), the season (Williams et al., 2013) and the geographical location (Finkel et al., 2011; Redford et al., 2010).
In total, 58 genera were detected in each of samples 1 and 2, and 65 genera in sample 3 (Table 2). They contained on average 65.6% of the bacterial 16S rRNA gene sequences of abundant OTUs. Of these, only ten genera were assigned at least 1.0% of the sequences in one of the samples (see Table 3). Of these ten genera, eight were identical in all three samples, and only Methylobacterium and Polaromonas were below 1.0% in one of the samples. For these ten most abundant genera, the number of sequences of the most abundant OTUs was comparable between the samples. For example, for the genus Alkanindiges two OTUs per sample contained the majority of the sequences. These were 8,665 and 8,610 sequences for sample 1, 10,357 and 9,396 sequences for sample 2 and 10,918 and 8,999 sequences for sample 3. The only exception was Polaromonas, where two abundant OTUs were detected in sample 1, and one abundant OTU in each of the other two samples. Of the Proteobacteria, these main genera were the α-proteobacterial genera Sphingomonas and Methylobacterium, the β-proteobacterial genera Janthinobacterium and Polaromonas, as well as the γ-proteobacterial genera Alkanindiges, Pseudomonas, Acinetobacter, and Erwinia. To display the proteobacterial diversity, Figure 2 compares all proteobacterial genera that amounted to at least 0.1% of the total number of sequences in the three samples. As shown in Table 3, the standard deviations of the abundances were below 0.7% for seven of the ten genera. Larger standard deviations were calculated for Pseudomonas, Janthinobacterium and Alkanindiges. These results correlate well with higher numbers of sequences identified only at the respective family level (data not shown). For example, 20.5%, 2.6% and 11.2% of the sequences in abundant OTUs were allocated to the genus Pseudomonas. On the other hand, 3.5%, 22.4% and 9.6% were additionally allocated to the family Pseudomonadaceae, but no genus affiliation was possible.
The most abundant bacterial genus was Alkanindiges, a γ-proteobacterium, which comprised on average 25.1 ± 1.5% of the bacterial 16S rRNA gene sequences in abundant OTUs. No species affiliation within this genus was possible in any of the samples. This genus was already reported to be an important member of the leaf microbiota of Romaine lettuce (Rastogi et al., 2012), and in that study its presence correlated positively with that of Xanthomonas campestris pv. vitians. In contrast, although Alkanindiges was the most abundant genus in the present study, no Xanthomonas sequences were detected. On the other hand, Alkanindiges and Acinetobacter were reported as indicators for healthy plants (Erlacher et al., 2014), and it should be studied further whether the difference in Alkanindiges abundance can be attributed to the lettuce cultivar and growth conditions. The only genus of the family Enterobacteriaceae among the ten most abundant genera was Erwinia. Depending on the species, Erwinia may cause plant disease or be saprophytic or epiphytic (Hauben and Swings, 2005). It is commonly found on lettuce (Rastogi et al., 2012), and the presence of Erwinia on Romaine lettuce was hypothesized to correlate with a decreased survival of E. coli O157:H7 (Williams et al., 2013). Escherichia was only rarely detected on Romaine lettuce (Williams et al., 2013), which is also the case in the current study. Each sample contained only one OTU of Escherichia, with 25, 1 and 27 sequences, respectively.
The 1,192, 1,254 and 1,142 sequences allocated to rare OTUs amounted to 1.5% of the total bacterial 16S rRNA gene sequences on average. Table 4 gives an overview of the assignment of rare OTUs to the diverse phyla and candidate divisions. On average, only 744 (0.9%) single reads were found. As expected from the length of the sequenced amplicons, a species affiliation was possible for only 26.2% (36 OTUs), 7.7% (34 OTUs) and 16.1% (48 OTUs) of the sequences of the three samples. These were mainly associated with the most abundant genera, and therefore no separate evaluation was carried out. Furthermore, no correlation was found between the abundance of rRNA gene sequences detected in this study and the mean number of rRNA genes present in the respective genus (Lee et al., 2008). As expected, the standard deviations between the samples, and thus the intra-sample variation, increased with the taxonomic level from phylum to genus. This could be attributed to the length of the sequenced amplicons. In this study, a method published within the Earth Microbiome Project was applied in order to evaluate its suitability for studying the plant microbiota as well. To further decrease the intra-sample variation, longer amplicons could be studied, as they might allow a higher taxonomic resolution.
Thus it was demonstrated that the established procedure consisting of bacterial biomass harvest, DNA isolation and next generation sequencing is indeed suitable for the biodiversity analysis of the epiphytic bacterial microbiota of lettuce.
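The intra-sample variation discussed in this section is a mean and standard deviation over the three replicate samples. The sketch below recomputes this for the Pseudomonas figures quoted in the text. It is only an illustration: the variable names are hypothetical, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption, because the text does not say which form was applied.

```python
# Replicate statistics for values quoted in the text (sample order 1, 2, 3).
from statistics import mean, stdev


def summarize(values):
    """Return (mean, sample standard deviation) of replicate values."""
    return mean(values), stdev(values)


pseudomonas_genus = [20.5, 2.6, 11.2]      # % assigned to the genus Pseudomonas
pseudomonadaceae_only = [3.5, 22.4, 9.6]   # % assigned to the family only
combined = [g + f for g, f in zip(pseudomonas_genus, pseudomonadaceae_only)]

genus_mean, genus_sd = summarize(pseudomonas_genus)
combined_mean, combined_sd = summarize(combined)

# Share of rare-OTU reads relative to the mean number of bacterial sequences.
rare_reads = [1192, 1254, 1142]
MEAN_BACTERIAL_READS = 80126
rare_pct = 100 * mean(rare_reads) / MEAN_BACTERIAL_READS

if __name__ == "__main__":
    print(f"genus only:     {genus_mean:.1f} +/- {genus_sd:.1f} %")
    print(f"genus + family: {combined_mean:.1f} +/- {combined_sd:.1f} %")
    print(f"rare OTU reads: {rare_pct:.1f} % of bacterial sequences")
```

Adding the reads that stopped at the family level to the genus-level reads reduces the spread between the replicates considerably. This is consistent with the interpretation given above that the large standard deviation for Pseudomonas reflects limited taxonomic resolution rather than a real difference between the three samples.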

CONCLUSION
In this study, a complete procedure for the assessment of the bacterial leaf surface microbiota of lettuce was successfully established. Comparable results with low intra-sample variation were obtained with the established protocol when three samples of the same lettuce head were analysed. In the current study, γ-Proteobacteria were most abundant, as already reported for other lettuce varieties and spinach. In contrast to these studies, the genus Alkanindiges comprised approximately a quarter of the abundant 16S rRNA gene sequences. While the implications of the variations in the bacterial biodiversity remain to be elucidated, this study provides a sound protocol for the extraction of epiphytic bacterial DNA from delicate leaf structures and for the generation and analysis of 16S rRNA gene sequencing data from it, which should improve the understanding of the lettuce phyllosphere microbiota.

Figure 1 Rarefaction curves showing the covered richness expressed as the Chao1 index for the samples 1 (A), 2 (B) and 3 (C)

Figure 2 Overview of the genera of the phylum Proteobacteria with an abundance of at least 0.1% of the total sequences in one of the samples

Table 1
16S rRNA gene amplicon sequencing results

Table 2
Number of classified sequences allocated to abundant OTUs, respective number of OTUs and genera assigned to diverse phyla, and their proportion within the number of bacterial 16S rRNA gene sequences contained in abundant OTUs [%]. Sequ.: sequences.

Table 3
Number of classified sequences allocated to abundant OTUs, respective number of OTUs assigned to the ten most abundant genera and their proportion within the number of bacterial 16S rRNA gene sequences contained in abundant OTUs [%]