Published online 2016 Jul 7. doi: 10.3389/fmicb.2016.01048


PMID: 27458442

Abstract

The Mississippi River (MR) serves as the primary source of freshwater and nutrients to the northern Gulf of Mexico (nGOM). Whether this input of freshwater also enriches microbial diversity as the MR plume migrates and mixes with the nGOM is the central question addressed herein. Specifically, in this study physicochemical properties and planktonic microbial community composition and diversity were determined using iTag sequencing of 16S rRNA genes in 23 samples collected along a salinity (and nutrient) gradient from the mouth of the MR, in the MR plume, in the canyon, at the Deepwater Horizon wellhead and out to the loop current. Analysis of these datasets revealed that the MR influenced microbial diversity as far offshore as the Deepwater Horizon wellhead. The MR had the highest microbial diversity, which decreased with increasing salinity. MR bacterioplankton communities were distinct compared to the nGOM, particularly in the surface where Actinobacteria and Proteobacteria dominated, while the deeper MR was also enriched in Thaumarchaeota. Statistical analyses revealed that nutrients input by the MR, along with salinity and depth, were the primary drivers in structuring the microbial communities. These results suggested that the reduced-salinity, nutrient-enriched MR plume could act as a seed bank for microbial diversity as it mixes with the nGOM. Whether introduced microorganisms are active at salinities higher than those of freshwater would determine if this seed bank for microbial diversity is ecologically significant. Alternatively, microorganisms that are physiologically restricted to freshwater habitats and are entrained in the plume could be used as tracers for freshwater input to the marine environment.

Keywords: Mississippi River, Gulf of Mexico, ITag, microbial ecology, microbial diversity, bacterioplankton community composition, 16S rRNA gene sequencing

Introduction

Over a decade ago, the ubiquity of specific microbial lineages inhabiting global freshwater environments was reported (Zwart et al., 2002). A more recent synthesis summarized what is known about freshwater microorganisms, including detailed phylogenies of freshwater clades such as subclades of the Actinobacteria, one of the dominant phyla reported in lakes (Allgaier and Grossart, 2006). In aquatic environments, distinct microbial communities have been identified in freshwater as compared to marine environments. In comparing microbial diversity along a salinity trajectory from freshwater to the ocean, rivers have been implicated in enriching archaeal and bacterial diversity in coastal waters. Evidence of such an enrichment has been reported in the Columbia River estuary. One study there showed that salinity and depth were the primary drivers structuring microbial communities, with microbial diversity decreasing from a riverine freshwater end member out to the shelf bottom in the Columbia River coastal margin. Another similarly reported changes in microbial taxonomic composition along a salinity gradient profiled in three stations from the Columbia River to the Pacific Ocean, and found that microbial activity varied, but not systematically, with changes in salinity. In the Amazon River, metagenomic analysis of samples collected from the upper course of the river revealed that genes encoding heterotrophic processes were more abundant than in the marine environment. A further study reported an increase in transcript copies compared to gene copies in five stations sampled from the Amazon River to the ocean and suggested a greater per-cell activity along this trajectory. These studies reveal that microbial community structure and function vary along a river to open ocean transect. 
They also show that microbial communities in river systems are distinct from high salinity ocean end members, with river systems often exhibiting higher diversity than open ocean sites. That rivers generally host a more diverse microbial community than marine environments suggests that river plumes, with reduced salinities and higher nutrient concentrations compared to the open ocean, may act as a seed bank for microbial diversity to marine waters as the plume migrates away from the river mouth and mixes with seawater. Whether freshwater microorganisms are active in river plumes as they migrate and mix with seawater would determine the ecological significance of this seed bank.

In the nGOM, the Mississippi (and Atchafalaya) Rivers are the primary sources of fresh water and key nutrients, delivering 80% of the freshwater inflow, 91% of the estimated annual nitrogen load, and 88% of the phosphorus load (Dunn, 1996). The freshwater, sediments, and dissolved and particulate materials are carried predominantly westward along the Louisiana/Texas inner to mid-continental shelf, especially during peak spring discharge (Rabalais et al., 2007). The influence of nutrient rich freshwater, in combination with stratification that arises from salinity differences and intensifies during summer with thermal warming of surface waters (Wiseman et al., 1997; Rabalais et al., 2010), is best exemplified by the formation of the annual, mid-summer hypoxic water mass that is distributed across the Louisiana shelf west of the MR and onto the upper Texas coast (Rabalais and Turner, 2001; Rabalais et al., 2007). However, MR plume transport is influenced by several factors such as wind, topography, interactions with boundary currents and eddies which can lead to offshore transport of the plume (Schiller et al., 2011 and references therein). This offshore movement of the low salinity, nutrient rich MR plume could influence the in situ microbial community structure and ecology beyond the historical area of low oxygen that occurs annually by acting as a dispersion mechanism for microbial diversity and conduit for introducing freshwater microorganisms, as it does with nutrients to the nGOM.

A previous study analyzed the microbial community from the MR to the open ocean in the nGOM. In the shallow, low salinity MR plume sample from that study, Proteobacteria were abundant, as were unclassified Bacteria, unclassified microbes and Verrucomicrobia. Although the authors reported that microbial diversity did not differ when comparing MR to marine samples, they did show that the MR sample had a distinct microbial community (see Figure 1). The distinct nature of their MR sample as compared to marine sites is likely influenced by the resident microbial community in the MR. For example, the Upper MR was dominated by Proteobacteria, Actinobacteria, Bacteroidetes, Cyanobacteria, and Verrucomicrobia, some of which overlapped with the major players reported in the MR plume.

Station map of six sites sampled, nutrient concentrations (NO3-, PO43-), salinity (practical salinity units, psu) and relative abundances (RA) of Actinobacteria; acI and Thaumarchaeota; Nitrosopumilus profiles over depth. In all figures distance is zero at the MR site and then proceeds: MRP2, MRP1, canyon, DWH to the loop current site at 800 km. These sites are plotted along a surface sample salinity trajectory from lowest (MR) to highest (LC). The sites are: Site 1 MR (S1), Site 2 MRP2 (S2), Site 3 MRP1 (S3), Site 4 canyon (S4), Site 5 DWH (S5), and Site 6 loop current (S6).

Earlier work in the nGOM showed that the MR did not harbor greater microbial diversity than nGOM seawater, in contrast to the Columbia River, which enriches microbial diversity in seawater as these salinity end members mix. Here we sought to resolve whether the previously reported similarity in microbial diversity between the MR and the nGOM is a transient or stable feature by sampling from the mouth of the MR to the loop current in the nGOM during the summer, right before the onset of hypoxia, whereas the earlier sampling took place in the spring. Specifically, we examined how the MR plume influenced microbial diversity from the mouth of the MR, in the MR plume, in the canyon, to the Deepwater Horizon wellhead and out to the loop current. To do so, we used iTag sequencing of 16S rRNA genes to characterize the microbial communities in 23 samples. We then analyzed these data along with in situ geochemistry and physical properties of the water column to determine the organizing principles structuring microbial communities along a salinity (and depth) gradient from a river end-member to the mesopelagic ocean.

Materials and Methods

Sample Collection

A total of six different sites were sampled during the CARTHE III pelagic cruise in the nGOM from July 7 to 10, 2014 aboard the R/V Pelican (Figure 1). These six sites were the MI River mouth Southwest Pass (MR-P1; abbreviated as MR), MI River plume one (MRP1-P1; abbreviated as MRP1), MI River plume two (MRP2-S7; abbreviated as MRP2), loop current (LC-P4; abbreviated as LC), MI canyon (C-P5; abbreviated as C), and Deepwater Horizon wellhead (DWH-P3; abbreviated as DWH) (Figure 1). In total, 23 samples were obtained. Sample nomenclature indicates the sampling location as well as the collection depth, which follows an underscore (e.g., MR-P1_3). Station MR-P1 bottom depth was 19 m. Collection depths were 1 m to 300 m, depending on the site. Physical properties including temperature, depth, pressure, and salinity (conductivity) were determined in situ using a Conductivity-Temperature-Depth (CTD) instrument.

Oxygen and Nutrients

From one Niskin bottle, duplicate water samples for dissolved inorganic nutrients (NO3- + NO2-, NO2-, PO43-, SiO2, and NH4+) were analyzed in triplicate using a Lachat Instruments QuikChem® FIA+ 8000 Series Automated Ion Analyzer with an ASX-400 Series XYZ autosampler after being filtered through acid-cleaned (10% HCl) 47 mm diameter, 0.2 μm pore size, membrane filters (Pall Supor®-200) under low vacuum pressure. Samples were analyzed simultaneously for dissolved NO3- + NO2- (by Cu-Cd reduction followed by azo dye colorimetry), PO43- (by the automated ascorbic acid reduction method), and SiO2, but were analyzed separately for dissolved NH4+ (by phenate colorimetry) to prevent contamination of the samples by fumes from the NH4Cl buffer used in the analysis for NO3- + NO2- (APHA, 1992). Dissolved NO2- was determined separately by azo dye colorimetry (without Cu-Cd reduction) and NO3- concentration was determined by difference. Known volumes of water were filtered under low vacuum pressure onto precombusted (500°C for 4 h) 25 mm glass fiber filters (Whatman GF/F) and stored frozen prior to determination of particulate phosphorus concentrations using a modification of a previously published method. Filters were placed in acid-cleaned (10% HCl) borosilicate vials and combusted at 550°C for 2 h. After cooling, 10 mL of 1N HCl was added to each vial and the vials were shaken for 16 h at 250-300 rpm. After settling, the samples were diluted 10-100 times and PO4-P was quantified using the automated ascorbic acid reduction method on the Lachat as described above.
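The "by difference" step for nitrate can be written out as a short calculation. The sketch below is illustrative only: the function name is hypothetical, and the inputs are the rounded values reported for sample MR-P1_3 in Table 2, so the result can differ from the tabulated NO3- value in the last digit.

```python
def nitrate_by_difference(nox_um, nitrite_um):
    """Return NO3- (uM) as the combined (NO3- + NO2-) value minus NO2-.

    Hypothetical helper for illustration; both inputs are in micromolar.
    """
    if nitrite_um > nox_um:
        raise ValueError("nitrite exceeds combined nitrate + nitrite")
    return nox_um - nitrite_um


# Rounded values reported for sample MR-P1_3 in Table 2 (uM).
nox, no2 = 130.46, 0.16
no3 = nitrate_by_difference(nox, no2)
print(f"NO3- by difference: {no3:.2f} uM")
```

Because the inputs are already rounded to two decimals, the difference here is 130.30 μM, whereas Table 2 lists 130.29 μM, presumably computed from unrounded measurements.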

Dissolved oxygen concentrations were determined at all locations and depths using the CTD oxygen sensor. At five stations and multiple depths (total n = 15), one or two (n = 4 occasions) 300-mL biological oxygen demand (BOD) bottles were filled from the same Niskin bottle using Tygon tubing inserted to the bottom of the BOD bottle, allowing water to overflow the bottle by 2–3 volumes and taking care to expel all bubbles from the sample before capping. The bottles were immediately fixed (with 1 mL each of manganous sulfate and alkaline iodide solutions), stoppered, topped with deionized water to fill the flared top, sealed with a secondary plastic cap to ensure a tight seal, and stored in the dark prior to titrations. Dissolved oxygen concentrations were determined using a modification of the Winkler titration method detailed in Strickland and Parsons (1972). Titrations were performed on three to five 50-mL aliquots per bottle using a Mettler Toledo DL28 auto-titrator. The mean and median coefficients of variation in DO concentrations among duplicate BOD bottles filled from the same Niskin bottle were 1.02 and 0.74%, respectively. Dissolved oxygen concentrations measured via Winkler titration (95–237 μmol O2 L-1) spanned most of the range of DO concentrations observed in any profile during the sampling campaign. Ambient DO concentrations measured by Winkler titration (DO_Winkler) were not significantly different from CTD measurements of DO (DO_CTD) (p = 0.60; paired t-test), with a mean DO_Winkler – DO_CTD of -0.69 mmol O2 m-3 (range: -5.7 to +6.8 mmol m-3) and a mean (DO_Winkler – DO_CTD)/DO_Winkler value of -0.5%, providing a high degree of confidence in the DO patterns observed in CTD profiles.
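The Winkler versus CTD comparison can be sketched as follows. This is a minimal illustration, not the authors' analysis: the function name and the three paired readings are hypothetical, and only the paired t statistic is computed (a p-value would additionally require the t distribution, which is not available in the Python standard library).

```python
import math
from statistics import mean, stdev


def paired_comparison(winkler, ctd):
    """Summarize paired DO readings (same units, same sample order).

    Returns (mean difference, mean relative difference, paired t statistic),
    where differences are Winkler minus CTD and the relative difference is
    scaled by the Winkler value, as in the text above.
    """
    if len(winkler) != len(ctd) or len(winkler) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [w - c for w, c in zip(winkler, ctd)]
    rel = [(w - c) / w for w, c in zip(winkler, ctd)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences identical; t statistic undefined")
    t_stat = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return mean(diffs), mean(rel), t_stat


# Hypothetical paired readings (mmol O2 m-3), for illustration only.
winkler_do = [100.0, 150.0, 200.0]
ctd_do = [101.0, 149.0, 203.0]
mean_diff, mean_rel, t_stat = paired_comparison(winkler_do, ctd_do)
print(f"mean diff = {mean_diff:.2f}, mean rel = {100 * mean_rel:.2f}%, t = {t_stat:.3f}")
```

A t statistic close to zero relative to its degrees of freedom (n - 1) corresponds to the kind of non-significant difference reported above.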

Microbial Sample Collection and DNA Extractions

At six stations, 6-12 L of seawater was collected and filtered with a peristaltic pump. A 2.7 μm Whatman GF/D pre-filter was used and samples were concentrated on 0.22 μm Sterivex filters (EMD Millipore, Billerica, MA, USA). Sterivex filters were sparged and filled with RNAlater. DNA was extracted directly off of the filter by placing half of the Sterivex filter in a Lysing Matrix E (LME) glass/zirconia/silica beads tube (MP Biomedicals, Santa Ana, CA, USA) and following a previously described protocol that combines phenol:chloroform:isoamyl alcohol (25:24:1) extraction and bead beating. Genomic DNA was stored at -80°C until purified. DNA was purified using a QIAGEN (Valencia, CA, USA) AllPrep DNA/RNA Kit. DNA quantity was determined using a Qubit 2.0 Fluorometer (Life Technologies, Grand Island, NY, USA).

16S rRNA Gene Sequencing (iTag) and Analysis

16S rRNA genes were amplified from ∼10 ng of purified DNA in duplicate using primers 515F and 806R, which amplify both bacteria and archaea and target the V4 region of the 16S rRNA gene (E. coli position numbering), in accordance with a previously described protocol used by the Earth Microbiome Project, with a slight modification: the annealing temperature was changed from 50°C to 60°C. PCR amplicons were purified using Agencourt AMPure XP PCR Purification beads (Beckman Coulter, Indianapolis, IN, USA). Sequencing was carried out using the MiSeq (Illumina, San Diego, CA, USA) platform. Sequences were analyzed using the QIIME version 1.9.0 pipeline. Paired end reads were joined using fastq-join (Aronesty, 2011). Sequences were then demultiplexed and quality filtered using QIIME version 1.9.0 default parameters. These sequences are available at http://mason.eoas.fsu.edu and from NCBI’s sequence read archive (accession SRP077603). Sequences were then clustered into operational taxonomic units (OTUs), defined as ≥97% 16S rRNA gene sequence similarity, with the open reference clustering protocol and Greengenes version 13.5. The resulting OTU table was filtered to keep only OTUs that had at least 10 observations (6490 OTUs in total). Data were normalized using cumulative-sum scaling. For the actinobacterial OTU350458, all 16S rRNA gene sequences from all samples that were 97% or more similar and thus grouped within this OTU were extracted and analyzed further by blastn. Specifically, using blastn the OTU350458 representative 16S rRNA gene sequence was compared to all of the 16S rRNA gene sequences (55,163 sequences) in this OTU to examine phylogenetic cohesion.
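The abundance filter on the OTU table can be illustrated with a toy example. The sketch below is not the QIIME implementation; it assumes that "at least 10 observations" means a total count of at least 10 summed across all samples, and the OTU identifiers and counts are hypothetical.

```python
def filter_otus(otu_table, min_count=10):
    """Keep OTUs whose total count across all samples is >= min_count.

    otu_table maps an OTU identifier to a list of per-sample read counts.
    """
    return {
        otu_id: counts
        for otu_id, counts in otu_table.items()
        if sum(counts) >= min_count
    }


# Hypothetical OTU table: three OTUs observed across three samples.
toy_table = {
    "otu_a": [6, 3, 2],   # total 11, retained
    "otu_b": [1, 0, 2],   # total 3, removed
    "otu_c": [0, 10, 0],  # total 10, retained (threshold is inclusive)
}
filtered = filter_otus(toy_table)
print(sorted(filtered))
```

Normalization by cumulative-sum scaling is a separate step that is not reproduced here.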

Statistics

Nutrient data were interpolated using Ocean Data View (Schlitzer, 2013). Alpha diversity (Shannon index) was determined according to Seaby and Henderson (2006) in QIIME (the QIIME script uses a default logarithm of base 2 instead of e). The normalized OTU abundances in the 23 different samples were then analyzed using non-metric multidimensional scaling (NMDS) in R using metaMDS with default parameters in the Vegan package. To fit environmental vectors onto the ordination, the Vegan function envfit was used. P-values were derived from 999 permutations of these data. A bipartite network of the 16S rRNA gene data was generated using QIIME. The network was visualized using Cytoscape’s edge-weight spring-embedded algorithm (edges were weighted by the abundance of an observation).
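Because the base of the logarithm changes the numerical value of the Shannon index, a small worked sketch may help when comparing the values reported here with studies that use the natural logarithm. The function below is a generic implementation of H = -Σ p_i log(p_i), not the QIIME script itself, and the counts are hypothetical.

```python
import math


def shannon_index(counts, base=2.0):
    """Shannon diversity H = -sum(p_i * log_base(p_i)) over non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * (math.log(p) / math.log(base))
    return h


# Hypothetical community: four equally abundant OTUs.
even_counts = [5, 5, 5, 5]
h_base2 = shannon_index(even_counts)               # log base 2
h_nat = shannon_index(even_counts, base=math.e)    # natural log
print(f"H (base 2) = {h_base2:.3f}, H (base e) = {h_nat:.3f}")
```

The two conventions differ by a constant factor, H_e = H_2 × ln 2, so a base 2 value of 7.10 corresponds to roughly 4.92 in natural log units.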

Results and Discussion

Chemical and Physical Properties of Samples

At 1.47 practical salinity units (psu), the MR sample collected at 3 m depth (MR-P1_3) serves as the low salinity, freshwater end-member (Figure 1; Table 1). At this same station, the 15 m MR sample salinity was 34.91 psu (Figure 1; Table 1), indicating penetration of nGOM water [using 36 psu as the reference salinity (Morey, 2003; Rong et al., 2014)] below the pycnocline in the river. The MR plume samples (MRP1 and MRP2) had lower salinities in the surface (31.79 and 29.46 psu, respectively), increasing with depth (Figure 1; Table 1). The canyon and DWH samples had a similar salinity profile with surface sample salinities of 34.07 and 35.89 psu, respectively (Figure 1; Table 1). The loop current was the farthest from the MR and did not show a surface salinity dip compared to deeper depths, as observed at all other sampling locations.
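One way to read these salinities is as a two end-member mixture of river water and seawater. The sketch below is an illustration under stated assumptions rather than a calculation performed in this study: it takes the 36 psu reference salinity cited above as the marine end-member and assumes a river end-member of 0 psu; the function name is hypothetical.

```python
def freshwater_fraction(salinity, s_marine=36.0, s_river=0.0):
    """Fraction of river water in a two end-member mixture.

    Assumes conservative mixing between a marine end-member (s_marine)
    and a river end-member (s_river, assumed 0 psu here). Salinities
    above s_marine give small negative values and are not clamped.
    """
    if s_marine <= s_river:
        raise ValueError("marine end-member must be saltier than the river")
    return (s_marine - salinity) / (s_marine - s_river)


# Salinities (psu) reported in Table 1.
samples = {"MR-P1_3": 1.47, "MR-P1_15": 34.91, "MRP2-S7_1": 29.46}
for name, sal in samples.items():
    print(f"{name}: {100 * freshwater_fraction(sal):.1f}% river water")
```

Under these assumptions the 3 m MR sample is about 96% river water, while the 15 m sample at the same station is about 3%, in line with the penetration of nGOM water below the pycnocline described above.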

Table 1

Metadata and physical data for MR, MRP, canyon, Deepwater Horizon wellhead, and loop current samples.

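| Sample name | Station | Depth (m) | Date sampled | Latitude | Longitude | Salinity (PSU) | Temperature (°C) | Density σθ (kg/m3) | Fluorescence (RFU) | Pressure (dbar) |
|---|---|---|---|---|---|---|---|---|---|---|
| MR-P1_3 | River | 3.3 | 7/7/14 | 28.987 | -89.367 | 1.47 | 28.49 | -2.80 | 0.71 | 3.28 |
| MR-P1_15 | River | 14.5 | 7/7/14 | 28.99 | -89.364 | 34.91 | 25.37 | 23.16 | 0.24 | 14.60 |
| MRP1-P2_1 | Plume 1 | 1.1 | 7/7/14 | 29.255 | -88.478 | 31.79 | 29.56 | 19.47 | 1.20 | 1.09 |
| MRP1-P2_25 | Plume 1 | 25.4 | 7/7/14 | 29.255 | -88.478 | 36.29 | 23.31 | 24.82 | 0.12 | 25.54 |
| MRP1-P2_55 | Plume 1 | 55.4 | 7/7/14 | 29.255 | -88.478 | 36.39 | 20.63 | 25.66 | 0.20 | 55.83 |
| MRP1-P2_66 | Plume 1 | 66 | 7/7/14 | 29.255 | -88.478 | 36.40 | 20.11 | 25.80 | 0.27 | 66.45 |
| MRP2-S7_1 | Plume 2 | 1 | 7/10/14 | 28.497 | -90.054 | 29.46 | 30.11 | 17.54 | 0.24 | 1.00 |
| MRP2-S7_37 | Plume 2 | 36.5 | 7/10/14 | 28.495 | -90.055 | 36.30 | 22.16 | 25.16 | 0.38 | 36.70 |
| C-P5_1 | Canyon | 1.3 | 7/10/14 | 28.233 | -90.174 | 34.07 | 29.14 | 21.32 | 0.04 | 1.29 |
| C-P5_25 | Canyon | 25.2 | 7/10/14 | 28.233 | -90.174 | 35.61 | 27.55 | 23.00 | 0.05 | 25.36 |
| C-P5_50 | Canyon | 50.1 | 7/10/14 | 28.233 | -90.173 | 36.31 | 21.59 | 25.33 | 0.13 | 50.49 |
| C-P5_100 | Canyon | 99.7 | 7/10/14 | 28.233 | -90.173 | 36.38 | 18.59 | 26.18 | 0.09 | 100.35 |
| DWH-P3_1 | Wellhead | 1.2 | 7/8/14 | 28.65 | -88.561 | 35.89 | 28.89 | 22.77 | 0.04 | 1.25 |
| DWH-P3_25 | Wellhead | 25.4 | 7/8/14 | 28.65 | -88.561 | 36.07 | 27.67 | 23.31 | 0.04 | 25.58 |
| DWH-P3_50 | Wellhead | 49.7 | 7/8/14 | 28.651 | -88.561 | 36.28 | 23.18 | 24.86 | 0.08 | 50.09 |
| DWH-P3_75 | Wellhead | 74.5 | 7/8/14 | 28.651 | -88.561 | 36.44 | 20.93 | 25.62 | 0.27 | 74.99 |
| DWH-P3_200 | Wellhead | 199.6 | 7/8/14 | 28.65 | -88.558 | 35.97 | 15.02 | 26.72 | 0.02 | 201.09 |
| DWH-P3_300 | Wellhead | 300.4 | 7/8/14 | 28.65 | -88.558 | 35.56 | 12.55 | 26.93 | 0.02 | 302.62 |
| LC-P4_1 | Loop Current | 1 | 7/9/14 | 27.532 | -88.75 | 36.30 | 28.95 | 23.06 | 0.04 | 1.03 |
| LC-P4_25 | Loop Current | 25 | 7/9/14 | 27.531 | -88.75 | 36.30 | 28.95 | 23.06 | 0.05 | 25.20 |
| LC-P4_50 | Loop Current | 50 | 7/9/14 | 27.531 | -88.75 | 36.30 | 28.21 | 23.30 | 0.07 | 50.34 |
| LC-P4_90 | Loop Current | 89.9 | 7/9/14 | 27.532 | -88.75 | 36.37 | 26.64 | 23.87 | 0.26 | 90.55 |
| LC-P4_150 | Loop Current | 150 | 7/9/14 | 27.532 | -88.75 | 36.83 | 22.33 | 25.52 | 0.04 | 151.11 |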

Spatial patterns in dissolved nutrients across the sampling region were consistent with a strong influence of high nutrient MR water on the receiving waters of its plume (Figure 1). Locations within the plume (lower salinity) were higher in dissolved inorganic nitrogen (DIN) and phosphate (Figure 1; Table 2). Below the pycnocline, nutrient concentrations tended to increase with depth, as has been reported in previous studies (Shiller and Joung, 2012; Rakowski et al., 2015).

Table 2

Chemical data and Shannon diversity for MR, MRP, canyon, Deepwater Horizon wellhead, and loop current samples.

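| Sample name | Station | NO2 + NO3 (μM) | NO2 (μM) | NO3 (μM) | NH4 (μM) | DIN (μM) | PO4 (μM) | SiO2 (μM) | O2 (mmol/m3) | Shannon diversity |
|---|---|---|---|---|---|---|---|---|---|---|
| MR-P1_3 | River | 130.46 | 0.16 | 130.29 | 1.31 | 131.76 | 3.93 | 24.87 | 5.85 | 6.93 |
| MR-P1_15 | River | 13.30 | 1.56 | 11.75 | 2.56 | 15.86 | 1.52 | 15.72 | 3.25 | 7.26 |
| MRP1-P2_1 | Plume 1 | 1.80 | 0.38 | 1.42 | 1.53 | 3.33 | 0.68 | 5.21 | 7.46 | 5.90 |
| MRP1-P2_25 | Plume 1 | 1.67 | 0.29 | 1.39 | 3.98 | 5.65 | 0.74 | 5.07 | 7.28 | 7.03 |
| MRP1-P2_55 | Plume 1 | 1.06 | 0.36 | 0.70 | 3.61 | 4.67 | 0.71 | 1.51 | 5.89 | 7.09 |
| MRP1-P2_66 | Plume 1 | 3.01 | 0.50 | 2.51 | 3.71 | 6.72 | 0.71 | 1.87 | 5.96 | 7.24 |
| MRP2-S7_1 | Plume 2 | 0.51 | 0.38 | 0.13 | 1.65 | 2.16 | 0.26 | 4.00 | 6.72 | 6.15 |
| MRP2-S7_37 | Plume 2 | 1.14 | 0.50 | 0.64 | 0.06 | 1.20 | 0.35 | 5.14 | 5.12 | 6.88 |
| C-P5_1 | Canyon | 10.30 | 0.34 | 9.96 | 3.03 | 13.34 | 1.21 | 2.66 | 6.59 | 5.83 |
| C-P5_25 | Canyon | 1.91 | 0.52 | 1.39 | 3.10 | 5.01 | 0.83 | 1.24 | 6.72 | 5.63 |
| C-P5_50 | Canyon | 1.20 | 0.42 | 0.78 | 3.78 | 4.98 | 0.82 | 0.76 | 7.12 | 5.94 |
| C-P5_100 | Canyon | 0.99 | 0.33 | 0.66 | 1.25 | 2.23 | 0.75 | 1.30 | 4.46 | 7.21 |
| DWH-P3_1 | Wellhead | 1.08 | 0.32 | 0.76 | 3.51 | 4.60 | 0.71 | 1.21 | 6.65 | 5.47 |
| DWH-P3_25 | Wellhead | 0.96 | 0.35 | 0.60 | 2.49 | 3.45 | 0.71 | 1.28 | 6.90 | 5.70 |
| DWH-P3_50 | Wellhead | 0.98 | 0.35 | 0.63 | 2.95 | 3.93 | 0.74 | 1.19 | 7.36 | 5.01 |
| DWH-P3_75 | Wellhead | 3.48 | 0.45 | 3.03 | 3.37 | 6.85 | 0.74 | 1.62 | 6.24 | 6.78 |
| DWH-P3_200 | Wellhead | 19.14 | 0.42 | 18.72 | 2.68 | 21.82 | 1.31 | 6.03 | 4.35 | 7.10 |
| DWH-P3_300 | Wellhead | 25.29 | 0.34 | 24.95 | 2.46 | 27.75 | 1.61 | 10.86 | 3.92 | 7.07 |
| LC-P4_1 | Loop Current | 0.93 | 0.37 | 0.56 | 4.28 | 5.21 | 0.74 | 0.69 | 6.48 | 5.19 |
| LC-P4_25 | Loop Current | 0.84 | 0.30 | 0.54 | 3.33 | 4.16 | 0.72 | 1.25 | 6.49 | 5.21 |
| LC-P4_50 | Loop Current | 1.05 | 0.35 | 0.71 | 3.20 | 4.26 | 0.72 | 2.68 | 6.62 | 4.91 |
| LC-P4_90 | Loop Current | 1.29 | 0.33 | 0.96 | 4.88 | 6.17 | 0.74 | 1.24 | 6.17 | 6.94 |
| LC-P4_150 | Loop Current | 5.79 | 0.36 | 5.43 | 3.38 | 9.17 | 0.84 | 1.68 | 4.93 | 7.93 |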

Alpha Diversity

Two patterns emerged when comparing Shannon diversity across sample sites: (1) the MR hosted the highest diversity, which decreased with increasing distance away from the MR and (2) outside of the MR diversity increased with depth. With regard to the first trend, surface and near-surface sites of comparable depths (two samples/site) revealed that Shannon diversity (H) was highest in the MR (H avg = 7.10, SD = ± 0.19) and MR plume samples MRP1 and MRP2 (H avg = 6.50 ± 0.66 and 6.52 ± 0.42, respectively; Table 2). Moving away from the MR, Shannon diversity decreased from the canyon samples (H avg = 5.73 ± 0.11), to the DWH site (H avg = 5.57 ± 0.13) and finally to the loop current samples, which had the lowest Shannon diversity (H avg = 5.20 ± 0.01) (Table 2). The decrease in microbial diversity with distance from the MR contrasts with an earlier report of no differences in diversity based on geographic location. We note that (1) those samples were collected at a different time of year (March) than ours (July), when environmental conditions may have been different and (2) our sample locations differed, both of which likely contributed to the differences in microbial diversity observed between the studies. However, our data did agree with a previous report of the highest diversity in the Columbia River, which decreased as freshwater and seawater mixed, resulting in increasing salinity. Our findings suggested that the MR hosts a more diverse microbial community than the surface waters of the nGOM. It has long been known that the MR influences in situ chemistry, but our data suggested that it may also act as a seed bank for microbial diversity as it mixes with the surface nGOM. This is consistent with previous reports that suggested the Columbia River acted as a source for enriching microbial diversity (both archaeal and bacterial) when mixed with seawater in the Columbia River estuary. 
However, we acknowledge that disentangling the cause of higher microbial diversity in the MR plume as compared to the non-plume loop current samples is challenging; for example, higher nutrient concentrations may have allowed marine microorganisms to proliferate, rather than the MR acting as a seed bank for diversity. To this end, evaluating the surface samples (6 samples from 6 sites) revealed that the 3 m MR sample had 2084 unique OTUs (47% of the 4401 total OTUs in all surface samples). The loop current surface sample, which had no salinity anomaly that would have suggested freshwater mixing with seawater, had 99 unique OTUs (2%). This suggested that the MR plume could have influenced microbial diversity in two ways: (1) higher nutrient concentrations promoted the growth of marine microorganisms and (2) the MR plume introduced non-marine microorganisms to the nGOM.
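The count of unique OTUs per sample can be sketched with set operations on a presence/absence table. The example below uses hypothetical sample names and OTU identifiers and is only meant to show the logic; the final lines simply reproduce the percentages quoted above from the reported counts (2084 and 99 unique OTUs out of 4401).

```python
def unique_otus(presence):
    """Map each sample to the OTUs observed in that sample and no other.

    presence maps a sample name to the set of OTU identifiers detected in it.
    """
    result = {}
    for sample, otus in presence.items():
        others = set().union(
            *(o for s, o in presence.items() if s != sample)
        )
        result[sample] = otus - others
    return result


# Hypothetical presence/absence data for three surface samples.
toy_presence = {
    "river_3m": {"a", "b", "c", "d"},
    "plume_1m": {"c", "d", "e"},
    "loop_1m": {"d", "f"},
}
uniques = unique_otus(toy_presence)
print({sample: sorted(otus) for sample, otus in uniques.items()})

# Percentages quoted in the text, recomputed from the reported counts.
total_surface_otus = 4401
pct_river = round(100 * 2084 / total_surface_otus)
pct_loop = round(100 * 99 / total_surface_otus)
print(pct_river, pct_loop)
```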

Deeper in the water column, the trend of decreasing Shannon diversity with increasing distance from the MR continued, with the MR plume sample MRP1 having higher diversity at 55 and 66 m (H = 7.09 and 7.24, respectively) than the canyon 50 m (H = 5.94), DWH 50 m (H = 5.01) and finally the loop current 50 m sample (H = 4.91) (Table 2). A secondary noticeable trend, discussed above, is that the highest diversity at each deep water site was observed below 75 m (Table 2), which was not in agreement with previously reported findings, but has been reported in the Columbia River coastal margin.

Beta Diversity

The primary drivers in structuring the microbial communities were nutrient concentrations, salinity and depth (Figure 2A) as determined by non-metric multidimensional scaling ordination of normalized 16S rRNA gene data. In particular, DIN, nitrate+nitrite, silicate, and phosphate were enriched in the MR, decreasing with distance from the mouth of the MR as shown by the vectors in Figure 2 [these values were significantly correlated with an axis, with p-values ranging from 0.001 to 0.004 and correlation coefficients (r2) from 0.76 to 0.91]. Along this trajectory the microbial communities differed, particularly the shallow MR sample collected at 3 m depth (Figure 2). As expected, salinity increased with distance from the MR (p-value = 0.006, r2 = 0.77). Depth (p-value = 0.002, r2 = 0.77) and the variables that change with depth (e.g., temperature and pressure) were also highly correlated with axes. Along this vertical trajectory the microbial communities formed shallow (0-50 m) and deeper (50-300 m) water column clusters (Figure 2B). The results of our beta diversity analysis are similar to previously reported results (see Figure 2 of that study), in which both salinity and depth were important drivers in structuring the microbial communities from the Columbia River out to the Pacific Ocean.

Non-metric multidimensional scaling ordination plot of normalized 16S rRNA gene sequence data (A,B). (A) Biplot showing correlations between environmental variables and ordination axes. (B) The same ordination with depth shown by bubble size, which increases with increasing depth; samples cluster by depth (<50 m and >50 m).

Network Analysis

The MR samples, and in particular the shallow 3 m sample, hosted a unique microbial community with many OTUs found only in this sample, as shown in the network analysis figure (Figure 3). The number of unique OTUs observed in this sample (2,019 or 31% of all OTUs) agreed with the high Shannon diversity, relative to the other samples, discussed above. While there were also unique OTUs (782 or 12%) in the deeper MR sample (15 m), most were shared with its shallow MR counterpart or with nGOM seawater samples (Figure 3). The pattern of shared OTUs from the freshwater MR end-member to nGOM marine samples in this sample was expected given the observed mixing of freshwater with nGOM seawater at this sampling location, as indicated by the salinity. Similar to the ordination (Figure 2), the shallow MR plume samples clustered together and with nGOM samples collected in the near surface (≤50 m) of the water column (Figure 3). Finally, the remaining samples clustered by depth (Figures 2 and 3). As observed in the ordination (Figure 2), samples clustered by nutrient concentrations, salinity and depth (Figure 3). Further, the pattern of unique and shared OTUs in the shallow MR sample relative to that in the deeper MR sample supported the idea that the MR acts as a microbial seed bank, introducing microbial diversity to the nGOM as freshwater and seawater mix.

Network analysis of OTU data from 16S rRNA gene sequences. Sample nodes are black, OTU (7,111) nodes are blue and edges are gray. OTUs from MR-P1_3 m are connected via red edges. Sample depths, including those of the two MR samples, are indicated on the figure. The node for the actinobacterial acI OTU that was particularly prevalent in the shallow MR sample is shown in yellow.

Microbial Community Structure in the Mississippi River, Plume and Northern Gulf of Mexico

The data revealed that the 3 m MR sample (MR-P1_3 m) had a highly divergent microbial community compared to the other samples (Figures 2–4). Specifically, this high nutrient, low salinity end member sample collected in the mouth of the MR was dominated by the Actinobacteria phylum (42% of all phyla in MR-P1_3 m/avg. 5% ± 3% in non-MR surface samples collected from 1 m) and to a lesser degree Proteobacteria and Planctomycetes (both were 14%/avg. 39% ± 12% and 0.6% ± 0.6%), Verrucomicrobia (4%/avg. 0.9% ± 0.5%), Chloroflexi (4%/avg. 0.2% ± 4%), and unclassified microorganisms (17%/avg. 2% ± 0.5%) (Figure 4). This shallow MR sample had lower relative abundances of Cyanobacteria (1%/avg. 38% ± 8%), Bacteroidetes (3%/avg. 9% ± 4%), and Marine Group II Euryarchaeota (MGII) (0.01%/avg. 3% ± 3%) (Figure 4). The relative abundances of the other phyla presented in Figure 4 were similar in these surface samples, or were less than 1% in relative abundance and are not discussed in more detail here. In the 3 m MR sample, relative betaproteobacterial abundance was high (41% of all proteobacterial subphyla), followed by Alphaproteobacteria (27%), Gammaproteobacteria (22%), and Deltaproteobacteria (10%) (Figure 4B). A previous study reported that MR surface water had high abundances of unassigned bacteria (their Figure 4A) and Betaproteobacteria. In contrast, our non-MR surface samples had low betaproteobacterial abundances ranging from less than 1 to 4.5% of the Proteobacteria, with Alphaproteobacteria being most abundant (47 to 61%) followed by Gammaproteobacteria (34 to 44%) (Figure 4B). The dominance of Proteobacteria, and in particular Alphaproteobacteria, in our shallow nGOM samples is consistent with previously reported findings. Our surface plume, canyon, loop current and DWH samples also had high normalized abundances of Cyanobacteria (28-49% of all phyla, compared to 1%), which were excluded from the previously published analysis. 
In contrast to our shallow MR sample, Bacteroidetes were well represented in our surface nGOM samples (4-15% of all phyla compared to 3%), consistent with earlier reports.

Bar graph of normalized 16S rRNA gene sequence data. (A) Shows the relative abundance of the most abundant phyla. Less abundant phyla are grouped under “Other.” The figure key shows the highest (Proteobacteria) to lowest abundances (Other). (B) Shows relative abundance of the different proteobacterial sub-phyla. Less abundant proteobacterial sub-phyla are grouped under “Other.” The figure key shows the highest (Alphaproteobacteria) to lowest abundances (Proteobacteria; Other).

In the deeper MR sample (MR-P1_15 m) the microbial community was less divergent from the other sites than that of the shallow 3 m MR sample (Figures 2–4). This greater congruency of the MR-P1_15 m sample with marine samples collected from 25 to 50 m is likely due to mixing of MR and seawater, as indicated by a salinity of 34.91 psu. Specifically, in the 15 m MR sample, Proteobacteria were most abundant (31% of all phyla) with Gammaproteobacteria accounting for 56% of the Proteobacteria (avg. 43% ± 4% in non-MR samples collected from 25 to 50 m) while Alphaproteobacteria and Deltaproteobacteria were 20%/avg. 48% ± 6% and avg. 7% ± 3%. Betaproteobacteria were not abundant in MR_15 m (3% of Proteobacteria) compared to its shallow counterpart (41% of Proteobacteria), which is more congruent with non-MR samples in the <50 m depth interval, in which Betaproteobacteria abundances were low (avg. 0.9% ± 0.2% of all Proteobacteria). Cyanobacteria abundances were low (6% of all phyla) in the 15 m MR sample as compared to the non-MR sites (avg. 35% ± 16%). The deeper MR sample had high relative Thaumarchaeota abundances compared to other samples in this depth interval (21%/avg. 0.7% ± 2%). The high relative abundance of Thaumarchaeota in the MR was not a feature of the shallow MR sample analyzed in earlier work and, to our knowledge, this is the first observation that Thaumarchaeota are abundant in the MR. The relative abundance of MGII was low in the shallow MR sample (less than 0.01%) and increased to 6% in the 15 m sample, but was slightly lower than in the <50 m non-MR samples at avg. 7% ± 6%. The relative abundance of Bacteroidetes was 12% in the 15 m MR sample, compared to 9% ± 5% in non-MR samples. Finally, the MR_15 m sample differed from the other samples in that it had higher relative abundances of Planctomycetes (7% compared to avg. 0.3% ± 0.3%), which was not a feature observed in the previously published shallow MR sample.

The Mississippi River as a Conduit for Introducing Microorganisms to the Northern Gulf of Mexico

Bacteria

The high microbial diversity in the MR and the elevated diversity in surface seawater influenced by the MR plume in this study suggested that the MR could in fact act as a seed bank for microbial diversity as it mixes with the nGOM. The difference in microbial community composition between the low salinity end member and marine samples was primarily due to the high relative abundance of Actinobacteria, which decreased with increasing distance from the MR. Thus a more detailed analysis of this group follows. Blastn analysis revealed that the most abundant actinobacterial OTU (OTU350458) averaged 99.5% similarity, 0.8 mismatches, 463.6 bit score and 1.2 × 1020 Expect value to the 16S rRNA gene sequences in this OTU (55,163 sequences in total). This additional analysis suggested that the representative OTU350458 captured only highly similar to identical 16S rRNA gene sequences in this OTU. OTU350458 was classified as ACK-M1 (Zwart et al., 2002), now referred to as acI (). This single actinobacterial OTU (OTU350458) was observed in nearly all of the samples (Figure 3), but its relative abundance was highest in the 3 m MR sample (32% of the microbial community), decreased to 0.9% in the 15 m MR sample and was low (< 0.01%) to undetectable outside of the MR. Further, in our dataset this acI OTU was inversely correlated with salinity (Spearman ρ = -0.47, p-value = 0.01). This actinobacterial OTU was most similar in 16S rRNA gene sequence to microorganisms from freshwater environments, but was also 100% similar to several clones from the Baltic Sea in seawater collected below ice (e.g., Acc LM652066). Although OTU350458 was the most abundant acI, up to 23 different acI OTUs were observed across the samples, with the highest diversity observed in the MR (3 m) sample (all 23 OTUs were present). 
All acI OTUs were significantly inversely correlated with depth (ρ = -0.67, p-value = < 0.0), salinity (ρ = -0.50, p-value = 0.01) and density (ρ = -0.60, p-value = < 0.0) and positively correlated with temperature (ρ = 0.51, p-value = 0.01).
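
The Spearman correlations reported above are Pearson correlations computed on ranks. The following is a minimal sketch of that calculation, not the statistical software used in the study; the function names and the salinity and abundance values are hypothetical, and no p-value is computed.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long series with at least two points")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when one series is constant")
    return cov / (sx * sy)


# Hypothetical values: abundance falls monotonically as salinity rises.
salinity_psu = [2.0, 10.0, 20.0, 30.0, 35.0]
aci_percent = [32.0, 5.0, 0.9, 0.01, 0.0]
rho = spearman_rho(salinity_psu, aci_percent)
```

A strictly monotonic decrease such as this one gives ρ of -1; weaker values such as the reported -0.47 indicate a noisier inverse relationship.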

Recently provided a comprehensive synthesis of freshwater microbial communities from lakes and reported that Actinobacteria was one of the five most numerically dominant phyla in lake epilimnia. This finding is consistent with an earlier synthesis of freshwater microbial communities by Zwart et al. (2002) who reported that specific clades of Actinobacteria were well represented in freshwater. Zwart et al. (2002) also revealed that Actinobacteria were found in estuaries and the coastal ocean, but not in the open ocean. delineated and described the acI as a freshwater actinobacterial cluster that is highly represented in lakes and rivers, and to a lesser degree in estuaries.

Although Actinobacteria in the acI clade are numerically dominant in freshwater ecosystems, they have eluded cultivation efforts; thus their salinity optimum and activity along a salinity gradient are unknown. Further, the lack of cultivated representatives required methodologies that circumvent the need to culture to determine physiology. In this vein used single cell genomics and described the organism SCGC AAA027-L06. This single cell was 95% similar to our acI OTU350458 (it should be noted that neither our nor their 16S rRNA gene sequences are full-length) and based on 16S rRNA gene sequence is the most similar single cell genome, inclusive of those presented in . Using the acI representative sequences in places the OTU350458 in the acI-C2 sub-clade (data not shown) while their SCGC AAA027-L06 is in the acI-B1 sub-clade. They reported that SCGC AAA027-L06 has a small genome that encodes a facultative aerobic lifestyle, with numerous enzymes involved in pentose utilization. Additionally, microautoradiography and fluorescence in situ hybridization (MAR-FISH) showed that acI actively assimilated low-molecular-weight organic compounds, the source of which was suggested to be phytoplankton exudates (). These previous studies reveal a ubiquitous freshwater clade, representatives of which could degrade phytoplankton exudates.

The metabolism of acI led us to turn our attention to our other dataset (), namely the annual nGOM dead zone. In that dataset the same OTU (99% similar) was present and most abundant at the mouth of the MR in 2013, but was nearly absent moving westward over the shelf. We also evaluated its abundance in the 2014 hypoxic zone, where the hypoxic area was concentrated at the mouth of the MR, and found that it was abundant in the MR (29% relative abundance in the mouth) and in surface samples moving westward from the MR (Gillies et al., unpublished data). On this trajectory salinities ranged from 2 in the MR to 35.5 psu (avg. 22 psu), which is close to the seawater salinity value defined above. We hypothesize that acI is abundant in the MR, and during spring runoff it, along with excessive nutrients and reduced salinity, are introduced to the nGOM. During the resulting algal bloom if acI is metabolically active in the freshwater MR plume at salinities that exceed freshwater, but are below marine salinities, it would be well poised to rapidly degrade low molecular weight compounds, such as those found in algal exudates, and could even do so in low oxygen environments. Thus, we hypothesize that acI may play a role in establishing the hypoxic conditions that prevail near the mouth of the MR during the summer in the reduced salinity layer that overlays the saline nGOM bottom water. However, this hypothesis has not been tested herein, nor do we have the data to determine if the acI are metabolically active in the lower salinity MR plume, but the presence of some acI clade members in estuaries and in the Baltic Sea suggest that it may be active along a salinity gradient. 
Alternatively, if the acI is physiologically restricted to freshwater, its decreasing relative abundance outside of the mouth of the MR with increasing salinities and depths, along with early reports describing this clade as freshwater adapted, suggested that acI Actinobacteria may be a plausible tracer for freshwater input to the marine environment. We envision a quantitative assay for acI, such as the quantitative polymerase chain reaction that could theoretically provide a tracer for river input to seawater.

Archaea

Thaumarchaeota in the MR have not previously been reported (), suggesting that at certain times during the year, deeper water in the MR may host greater archaeal diversity than the shallow nGOM. In our samples, the majority of Thaumarchaeota were in the genus Nitrosopumilus, a marine microorganism; therefore we focus on the distribution of this genus from the MR to the loop current. The relative abundance of this genus was low in all surface samples, including the 3 m MR sample (although at <1% of all genera in this sample it was the highest of any surface sample) (Figures 1 and 4). The deeper MR sample (15 m) had high Nitrosopumilus abundances at 19% of all genera compared to 0.63% ± 2% in the other samples of comparable depths (25-50 m) (Figures 1 and 4). High relative abundances of Thaumarchaeota closely related to Nitrosopumilus maritimus () have been reported in the 2013 nGOM hypoxic zone (). In the 2014 hypoxic zone N. maritimus was again highly abundant, as were amoA genes, particularly in or near the mouth of the MR, where salinity was lower than typical seawater (Gillies et al., unpublished). These data suggested that the deeper MR may introduce Thaumarchaeota closely related to N. maritimus to the shallow nGOM water column as it mixes and moves westward. In we postulated that persistent archaeal hotspots that were predominantly Nitrosopumilus in the hypoxic area would serve as a site where energy flow is diverted from higher trophic levels to, in this case, ammonia-oxidizing Thaumarchaeota. This aerobic metabolism would result in sustained oxygen draw down in an oxygen-depleted environment. Thaumarchaeota, and N. maritimus in particular, are viable and abundant at seawater salinities. Thus their introduction to the nGOM from the deeper MR may be an example of the MR enriching seawater with ecologically significant microorganisms; however, different methodological approaches are required to test this hypothesis.

Conclusion

In this study we used iTag sequencing of 16S rRNA genes for 23 samples collected in the MR, in the MR plume, in the canyon, at the Deepwater Horizon wellhead and out to the loop current, and in situ geochemistry to determine whether the MR influences microbial diversity in the nGOM. This analysis revealed that the MR had a distinct microbial community compared to marine samples and also had the highest diversity of any sample site. We suggest that the MR could in fact act as a seed bank for microbial diversity as it mixes with seawater in the nGOM. Future work will be directed at creating a quantitative assay to determine whether the freshwater acI actinobacterial clade could be used as a tracer for freshwater input to the marine environment. Additionally, we endeavor to determine if the acI clade and Thaumarchaeota that are introduced by the surface and near surface waters of the MR to the nGOM are ecologically significant.

Author Contributions

OM conceived of the experiments, did the bioinformatics and statistical analyses of the data and wrote the manuscript. EC carried out DNA extractions, library preparation and sequencing. LG and TP participated in the cruise to obtain samples. BR determined nutrient and oxygen concentrations.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

We thank the CARTHE consortium for the opportunity to participate in a GoMRI funded cruise to obtain the samples that we describe herein. We thank Matthew Rich (LUMCON/CWC) for his contributions to the collection of samples and subsequent analyses of dissolved nutrients and dissolved oxygen concentrations. Finally we thank the Captain and crew of the R/V Pelican.

Funding. Funding for this research cruise was provided by a grant from the Gulf of Mexico Research Initiative to the CARTHE research consortium. Dissolved inorganic nutrient and dissolved oxygen concentration data are publicly available through the Gulf of Mexico Research Initiative and Data Cooperative (GRIIDC) at https://data.gulfresearchinitiative.org (doi: http://dx.doi.org/10.7266/N7BZ63Z7).

1 http://www.earthmicrobiome.org/emp-standard-protocols/16s/

2 http://qiime.org/scripts/pick_open_reference_otus.html

References

  • Allgaier M., Grossart H. P. (2006). Seasonal dynamics and phylogenetic diversity of free-living and particle-associated bacterial communities in four lakes in northeastern Germany. Aquat. Microb. Ecol. 45, 115–128. doi: 10.3354/ame045115
  • APHA (1992). Standard Methods for the Examination of Water and Wastewater, 18th Edn. Washington, DC: American Public Health Association.
  • Aronesty E. (2011). Command-line Tools for Processing Biological Sequencing Data, ea-utils. Available at: http://code.google.com/p/ea-utils
  • Aspila K., Agemain H., Cahu A. (1976). A semi-automated method for the determination of inorganic, organic and total phosphate in sediments. Analyst 101, 187–197. doi: 10.1039/an9760100187
  • Buck U., Grossart H. P., Amann R., Pernthaler J. (2009). Substrate incorporation patterns of bacterioplankton populations in stratified and mixed waters of a humic lake. Environ. Microbiol. 11, 1854–1865. doi: 10.1111/j.1462-2920.2009.01910.x
  • Caporaso J. G., Kuczynski J., Stombaugh J., Bittinger K., Bushman F., Costello E. K., et al. (2010). QIIME allows analysis of high-throughput community sequencing data. Nat. Methods 7, 335–336. doi: 10.1038/nmeth.f.303
  • Caporaso J. G., Lauber C. L., Walters W. A., Berg-Lyons D., Huntley J., Fierer N., et al. (2012). Ultra-high-throughput microbial community analysis on the Illumina HiSeq and MiSeq platforms. ISME J. 6, 1621–1624. doi: 10.1038/ismej.2012.8
  • Caporaso J. G., Lauber C. L., Walters W. A., Berg-Lyons D., Lozupone C. A., Turnbaugh P. J., et al. (2011). Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proc. Natl. Acad. Sci. U. S. A. 108(Suppl.), 4516–4522. doi: 10.1073/pnas.1000080107
  • Crump B., Baross J. (2000). Archaeaplankton in the Columbia river, its estuary and the adjacent coastal ocean, U.S.A. FEMS Microbiol. Ecol. 31, 231–239. doi: 10.1111/j.1574-6941.2000.tb00688.x
  • Crump B. C., Armbrust E. V., Baross J. A. (1999). Phylogenetic analysis of particle-attached and free-living bacterial communities in the Columbia River, its estuary, and the adjacent coastal ocean. Appl. Environ. Microbiol. 65, 3192–3204.
  • Dunn D. D. (1996). Trends in nutrient inflows to the Gulf of Mexico from streams draining the conterminous United States, 1972-93. USGS Water-Resour. Inves. Rep. 96–4113, 60.
  • Fortunato C. S., Crump B. C. (2015). Microbial gene abundance and expression patterns across a river to ocean salinity gradient. PLoS ONE 10:e0140578. doi: 10.1371/journal.pone.0140578
  • Fortunato C. S., Herfort L., Zuber P., Baptista A. M., Crump B. C. (2012). Spatial variability overwhelms seasonal patterns in bacterioplankton communities across a river to ocean gradient. ISME J. 6, 554–563. doi: 10.1038/ismej.2011.135
  • Garcia S. L., McMahon K. D., Martinez-Garcia M., Srivastava A., Sczyrba A., Stepanauskas R., et al. (2012). Metabolic potential of a single cell belonging to one of the most abundant lineages in freshwater bacterioplankton. ISME J. 7, 137–147. doi: 10.1038/ismej.2012.86
  • Ghai R., Rodriguez-Valera F., McMahon K. D., Toyama D., et al. (2011). Metagenomics of the water column in the pristine upper course of the Amazon river. PLoS ONE 6:e23785. doi: 10.1371/journal.pone.0023785
  • Ghylin T. W., Garcia S. L., Moya F., Oyserman B. O., Schwientek P., Forest K. T., et al. (2014). Comparative single-cell genomics reveals potential ecological niches for the freshwater acI Actinobacteria lineage. ISME J. 8, 2503–2516. doi: 10.1038/ismej.2014.135
  • Gillies L. E., Thrash J. C., de Rada S., Rabalais N. N., Mason O. U. (2015). Archaeal enrichment in the hypoxic zone in the northern Gulf of Mexico. Environ. Microbiol. 17, 3847–3856. doi: 10.1111/1462-2920.12853
  • Glockner F. O., Zaichikov E., Belkova N., Denissova L., Pernthaler J., Pernthaler A., et al. (2000). Comparative 16S rRNA analysis of lake bacterioplankton reveals globally distributed phylogenetic clusters including an abundant group of actinobacteria. Appl. Environ. Microbiol. 66, 5053–5065. doi: 10.1128/AEM.66.11.5053-5065.2000
  • King G. M., Smith C. B., Tolar B., Hollibaugh J. T. (2013). Analysis of composition and structure of coastal to mesopelagic bacterioplankton communities in the northern gulf of Mexico. Front. Microbiol. 3:438. doi: 10.3389/fmicb.2012.00438
  • Könneke M., Bernhard A. E., de la Torre J. R., Walker C. B., Waterbury J. B., Stahl D. A. (2005). Isolation of an autotrophic ammonia-oxidizing marine archaeon. Nature 437, 543–546. doi: 10.1038/nature03911
  • McDonald D., Price M. N., Goodrich J., Nawrocki E. P., DeSantis T. Z., Probst A., et al. (2012). An improved greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea. ISME J. 6, 610–618. doi: 10.1038/ismej.2011.139
  • Morey S. L. (2003). Export pathways for river discharged fresh water in the northern Gulf of Mexico. J. Geophys. Res. 108, 1–15. doi: 10.1029/2002JC001674
  • Newton R. J., Jones S. E., Eiler A., McMahon K. D., Bertilsson S. (2011). A guide to the natural history of freshwater lake bacteria. Microbiol. Mol. Biol. Rev. 75, 14–49. doi: 10.1128/MMBR.00028-10
  • Paulson J. N., Stine O. C., Bravo H. C., Pop M. (2013). Differential abundance analysis for microbial marker-gene surveys. Nat. Methods 10, 1200–1202. doi: 10.1038/nmeth.2658
  • Rabalais N., Turner R. (2001). “Coastal Hypoxia: Consequences for Living Resources and Ecosystems,” in Coastal and Estuarine Studies 58, eds Bowman M. J., Barber R. T., Mooers C. N. K., Raven J. A. (Washington, DC: American Geophysical Union), 454.
  • Rabalais N. N., Díaz R. J., Levin L. A., Turner R. E., Gilbert D., Zhang J. (2010). Dynamics and distribution of natural and human-caused hypoxia. Biogeosciences 7, 585–619. doi: 10.5194/bg-7-585-2010
  • Rabalais N. N., Turner R. E., Sen Gupta B. K., Boesch D. F., Chapman P., Murrell M. C. (2007). Characterization and long-term trends of hypoxia in the northern Gulf of Mexico: does the science support the Action Plan? Estuaries Coasts 30, 753–772. doi: 10.1007/BF02841332
  • Rakowski C. V., Magen C., Bosman S., Rogers K. L., Gillies L. E., Chanton J. P., et al. (2015). Methane and microbial dynamics in the Gulf of Mexico water column. Front. Mar. Sci. 2:69. doi: 10.1126/science.1196830
  • Rappé M. S., Vergin K., Giovannoni S. J. (2000). Phylogenetic comparisons of a coastal bacterioplankton community with its counterparts in open ocean and freshwater systems. FEMS Microbiol. Ecol. 33, 219–232. doi: 10.1111/j.1574-6941.2000.tb00744.x
  • Rong Z., Hetland R. D., Zhang W., Zhang X. (2014). Current–wave interaction in the Mississippi–Atchafalaya river plume on the Texas–Louisiana shelf. Ocean Model. 84, 67–83. doi: 10.1016/j.ocemod.2014.09.008
  • Salcher M. M., Posch T., Pernthaler J. (2013). In situ substrate preferences of abundant bacterioplankton populations in a prealpine freshwater lake. ISME J. 7, 896–907. doi: 10.1038/ismej.2012.162
  • Satinsky B. M., Fortunato C. S., Doherty M., Smith C. B., Sharma S., Ward N. D., et al. (2015). Metagenomic and metatranscriptomic inventories of the lower Amazon River, May 2011. Microbiome 3:39. doi: 10.1186/s40168-015-0099-0
  • Schiller R. V., Kourafalou V. H., Hogan P., Walker N. D. (2011). The dynamics of the Mississippi River plume: impact of topography, wind and offshore forcing on the fate of plume waters. J. Geophys. Res. 116:C06029. doi: 10.1029/2010JC006883
  • Shiller A. M., Joung D. (2012). Nutrient depletion as a proxy for microbial growth in Deepwater Horizon subsurface oil/gas plumes. Environ. Res. Lett. 7:045301. doi: 10.1088/1748-9326/7/4/045301
  • Schlitzer R. (2013). Ocean Data View. Available at: http://odv.awi.de
  • Seaby R. M., Henderson P. A. (2006). Species Diversity and Richness Version 4. Lymington: Pisces Conservation Ltd.
  • Staley C., Unno T., Gould T. J., Jarvis B., Phillips J., Cotner J. B., et al. (2013). Application of Illumina next-generation sequencing to characterize the bacterial community of the Upper Mississippi River. J. Appl. Microbiol. 115, 1147–1158. doi: 10.1111/jam.12323
  • Strickland J., Parsons T. (1972). A Practical Handbook of Sea-Water Analysis. Ottawa, ON: Fisheries Research Board of Canada, 310.
  • Warnecke F., Amann R., Pernthaler J. (2004). Actinobacterial 16S rRNA genes from freshwater habitats cluster in four distinct lineages. Environ. Microbiol. 6, 242–253. doi: 10.1111/j.1462-2920.2004.00561.x
  • Warnecke F., Sommaruga R., Sekar R., Hofer J. S., Pernthaler J. (2005). Abundances, identity, and growth state of actinobacteria in mountain lakes of different UV transparency. Appl. Environ. Microbiol. 71, 5551–5559. doi: 10.1128/AEM.71.9.5551-5559.2005
  • Wiseman W. J. J., Rabalais N. N., Turner R. E., Dinnel S. P., MacNaughton A. (1997). Seasonal and interannual variability within the Louisiana coastal current: stratification and hypoxia. J. Mar. Syst. 12, 237–248. doi: 10.1016/S0924-7963(96)00100-5
  • Zwart G., Crump B. C., Kamst-van Agterveld M. P., Hagen F., Han S. K. (2002). Typical freshwater bacteria: an analysis of available 16S rRNA gene sequences from plankton of lakes and rivers. Aquat. Microb. Ecol. 28, 141–155. doi: 10.3354/ame028141
  • Zwart G., Hannen E. J., Agertveld M. P., Van der Gucht K., Lindstrom E. S., Van Wichelen J., et al. (2003). Rapid Screening for Freshwater Bacterial groups by using reverse line blot hybridization. Appl. Environ. Microbiol. 69, 5875–5883.
Articles from Frontiers in Microbiology are provided here courtesy of Frontiers Media SA

Abstract

The Sol Genomics Network (SGN; http://solgenomics.net/ ) is a clade-oriented database (COD) containing biological data for species in the Solanaceae and their close relatives, with data types ranging from chromosomes and genes to phenotypes and accessions. SGN hosts several genome maps and sequences, including a pre-release of the tomato ( Solanum lycopersicum cv Heinz 1706) reference genome. A new transcriptome component has been added to store RNA-seq and microarray data. SGN is also an open source software project, continuously developing and improving a complex system for storing, integrating and analyzing data. All code and development work is publicly visible on GitHub ( http://github.com ). The database architecture combines SGN-specific schemas and the community-developed Chado schema ( http://gmod.org/wiki/Chado ) for compatibility with other genome databases. The SGN curation model is community-driven, allowing researchers to add and edit information using simple web tools. Currently, over a hundred community annotators help curate the database. SGN can be accessed at http://solgenomics.net/ .

INTRODUCTION

The Solanaceae, also known as the Nightshades, are a flowering plant family with important crop species such as potato, tomato and eggplant. The Solanaceae have a unique biology, with highly conserved genomes, yet extraordinarily diverse phenotypes and specialized adaptations. Thus, highly comparative approaches of genome study within the Solanaceae seem likely to yield important discoveries. As a step in this direction, several Solanaceae genomes are currently being sequenced, including a high-quality tomato reference sequence ( 1 ), several wild tomato species and a wild potato species ( Solanum phureja ). The Sol Genomics Network (SGN; http://solgenomics.net/ ) integrates this information in a clade-oriented database (COD), containing genomic, genetic, transcriptomic, phenotypic and taxonomic information with the data of major Euasterid families such as the Solanaceae (tomato, potato, eggplant, pepper and petunia), Plantaginaceae (snapdragon) and Rubiaceae (coffee).

Due to rapid progress in the development of new scientific methods, the database design needs to be constantly adapted and revised to accommodate the ever-larger volume of information. Over the last few years, genomics has undergone a significant transformation based on new sequencing technologies that can generate millions of sequences in a single run ( 2–6 ), enabling fast and low cost sequencing even of complex genomes and transcriptomes. The speed of sequencing and the resulting amount of sequence data pose novel challenges for how these data can be stored efficiently and presented to the research community.

Whereas model organism databases (MODs) such as the yeast database [Saccharomyces Genome Database (SGD); http://yeastgenome.org/ ] ( 7 ) or Arabidopsis [The Arabidopsis Information Resource (TAIR); http://www.arabidopsis.org/ ] ( 8 ) can rely on a large staff of in-house curators who extract relevant information from the literature, providing their databases with deeper, richer information on genes and other data types, this approach is not scalable to CODs, which hold multiple species. Therefore, SGN has developed a powerful and easy to use community-based annotation system that uses a mixed approach of ‘trusted users’ in which SGN curators assign editor privileges to members of the research community who are gene experts, as evidenced by a publication or a meeting presentation ( 9 ). SGN is one of the largest community-curated databases and an open source project, such that both the database content and the code driving it can be advanced by the respective communities.

In this paper, we examine SGN from three perspectives: tools and data, technology and community participation.

TOOLS AND DATA

The SGN database hosts a wide range of biological data for various species and accessions in the Solanaceae ( Figure 1 ), from genomic sequences to phenotype images. This data originates from a variety of sources including user submissions and data from other curated public databases ( Figure 2 ).

The home page of the SGN. The home page is the main entry page, providing quick access to resources through graphical menus. Every SGN page consistently contains the same toolbar at the top with pull-down menus and links to login and help pages. On the lower part of the home page, the news and events sections keep the community informed and certain elements of the database are highlighted in different feature topics, such as a ‘locus of the week’. Links to other important resources are also provided.

SGN data type relationship diagram, in which the locus data type is a central node, from which most data on SGN data can be accessed with a few clicks. Other important data types include sequences and phenotypes.

Transcriptomes

For many years, SGN has collected expressed sequence tag (EST) sequences from many sources such as user submissions or public databases and has processed and assembled them into unigene builds, which are annotated using sequence homology and predicted protein domains and then grouped into gene families. Currently there are 14 unigene builds for 18 species with more than 270 000 member sequences used in the assemblies.

New sequencing technologies have made it necessary to change how individual reads are processed, stored and presented online. In the last year, seven Solanaceae RNA-seq datasets have been submitted to the sequence read archive (SRA) database ( 10 ) and many more will follow. To better process these larger data sets, a new SGN transcriptome system has been designed and implemented, which uses a hybrid approach for data storage: (i) unigenes, assembly protocol and filenames on the one hand are stored in a relational database, using the Chado schema ( 11 ) and sample tables from the SGN biosource component (unpublished); (ii) original reads and assembly data are stored in the filesystem using indexed standard formats such as FASTA and GFF3. This provides enhanced scalability while preserving seamless integration in the user interface.
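
The idea of keeping reads in indexed flat files can be sketched as follows. This is an illustration of offset-based FASTA indexing in general, not SGN's implementation; the function names, the record identifiers and the sequences are hypothetical.

```python
import os
import tempfile


def build_fasta_index(path):
    """Map each sequence ID to the byte offset of its header line."""
    index = {}
    with open(path, "rb") as handle:
        while True:
            offset = handle.tell()
            line = handle.readline()
            if not line:
                break
            if line.startswith(b">"):
                fields = line[1:].split()
                if fields:
                    # The ID is the first whitespace-delimited token.
                    index[fields[0].decode("ascii")] = offset
    return index


def fetch_sequence(path, index, seq_id):
    """Seek straight to one record and return its sequence as a string."""
    parts = []
    with open(path, "rb") as handle:
        handle.seek(index[seq_id])
        handle.readline()  # skip the header line
        for line in handle:
            if line.startswith(b">"):
                break  # reached the next record
            parts.append(line.strip())
    return b"".join(parts).decode("ascii")


# Hypothetical two-record file written to a temporary location.
demo = ">unigene_1 hypothetical\nATGGCC\nTTAA\n>unigene_2\nGGGCCC\n"
with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
    tmp.write(demo)
fasta_path = tmp.name
fasta_index = build_fasta_index(fasta_path)
first = fetch_sequence(fasta_path, fasta_index, "unigene_1")
second = fetch_sequence(fasta_path, fasta_index, "unigene_2")
os.remove(fasta_path)
```

With such an index, only the small ID-to-offset table needs to live in memory or in the relational database, while the bulk sequence data stay on disk and are read one record at a time.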

SGN system architecture diagram. SGN is a three-tiered system, consisting of a front-end web interface, back-end code and a data store, which includes both files and a relational database. For example, the GEM component is composed of Javascript and Mason components to create the user-facing web interface, DBIx::Class-based Perl modules to manipulate and model the data and a relational database schema for storage.

Next generation sequencing (NGS) data can also be mined for expression and single-nucleotide polymorphism (SNP) information. Although there are general expression databases like the Gene Expression Omnibus (GEO) ( 12 ) or ArrayExpress ( 13 ) that contain expression data from microarray and RNA-seq analysis, expression information is an important resource for researchers that should be tightly integrated with other data. To this end, SGN has developed a new expression component called the general expression module (GEM), which stores and displays expression data from different technologies such as microarrays and RNA-seq that are directly incorporated into the SGN relational database. Expression data are generated from the NGS data by first associating the expression values with the sequence assembly and then loading the results into the GEM database, where they are visible and searchable on SGN.

We have focused on integrating high quality Affymetrix-based expression data ( 14 ) for species such as tobacco ( 15 ) and tomato ( 16 ), for which more than 43 conditions and 167 hybridizations have been loaded. Expression data associated with RNA-seq from 7 tomato trichomes experiments ( 17 ) have been loaded. Currently, the system provides a simple interface for searching, querying and visualizing expression data and more advanced functionality and graphical views will be provided in the near future. In addition, plant ontology (PO) annotations ( 18 ) can be associated with samples in the expression data to improve downstream sequence annotation.

Protein datasets

Most MODs contain little or no proteomics data, although a number of specialized protein databases such as Pride ( 19 ) or the ExPaSy proteomics server ( 20 ) exist. Species-specific proteomic databases have also been developed, e.g. the Nottingham arabidopsis stock centre (NASC) proteomics database ( 21 ). Usually, such databases collect proteins from different sources like Swiss-Prot ( 22 ), protein information resource (PIR) ( 23 ) or protein data bank (PDB) ( 24 ) and/or predicted proteins from gene models or messenger ribonucleic acid (mRNA) datasets.

In SGN, protein sequences are derived from protein predictions on both unigene transcript sequences and predicted genomic gene models, such as those from the international tomato annotation group (ITAG). For unigenes, several different methods for protein prediction are used, including the analysis of the longest open reading frame (ORF), detection of the coding region using hidden Markov models [using EstScan ( 25 )], or detection of the most probable translation initiation site using NetStart ( 26 ). Sequences are stored in a Chado relational database schema and in bulk FASTA files available for download via file transfer protocol (FTP). In addition to standard homology searches using basic local alignment search tool (BLAST) and similar tools, Mascot- ( 27 ) and Protein Pilot- (Applied Biosystems) compatible FASTA datasets are generated and published on SGN's FTP site. Currently SGN hosts 28 unigene-based protein datasets for various species and one gene-model-based protein dataset for tomato from the ITAG. These datasets also contain domain annotations based on InterProScan ( 28 ) and signal peptide analysis using SignalP ( 29 ).
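
Of the prediction methods listed above, the longest-ORF analysis is the simplest to illustrate. The sketch below is a deliberately reduced version under stated assumptions: it scans only the three forward frames, requires an ATG start and an in-frame stop, and ignores the partial ORFs and reverse-strand frames that a pipeline for EST-derived unigenes would normally also consider. The function name and the test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return the longest ATG-to-stop ORF (stop included) in the forward frames."""
    seq = seq.upper()
    best = ""
    for frame in range(3):
        start = None
        # Walk the frame codon by codon.
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:pos + 3]
                if len(orf) > len(best):
                    best = orf
                start = None  # look for the next ORF in this frame
    return best


# Hypothetical transcript with a short ORF in one frame and a longer one in another.
transcript = "CCATGAAATAGGGATGCCCGGGTTTTAACC"
orf = longest_orf(transcript)
```

Here the two candidate ORFs lie in different reading frames, so the function has to compare across frames before returning the 15-nucleotide one.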

Gene family analyses are performed on these protein datasets using an SGN-developed pipeline ( http://solgenomics.net/about/family_analysis.pl ). This pipeline pre-clusters sequences via coarse homology searches with BLAST and then uses TRIBE-MCL ( 30 ) for final gene family clustering. Multiple alignments of the sequences in each cluster are performed with Muscle ( 31 ), and PAUP ( 32 ) is used to calculate the phylogenetic trees. The results are loaded into the SGN database and can be searched and visualized on the web. Family sequence alignments can be loaded into the Tree Browser tool to explore the relations between different family members. The latest analysis, which used eight different species, produced 11 851 gene families stored in the database, ranging in size from 2 to 647 members.
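The pre-clustering step can be pictured with the following Python sketch. The text does not specify how the coarse BLAST pre-clustering is done, so this uses single-linkage grouping with a union-find structure as a stand-in; the function name `precluster`, the input tuple layout and the e-value cutoff are all assumptions, and the TRIBE-MCL step is not reproduced.

```python
def precluster(hits, max_evalue=1e-10):
    """Group sequences into coarse clusters from pairwise homology hits.

    `hits` is an iterable of (query_id, subject_id, evalue) tuples, e.g. parsed
    from tabular BLAST output. Single-linkage grouping is an assumption of this
    sketch, not the documented SGN method.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for query, subject, evalue in hits:
        root_q, root_s = find(query), find(subject)  # registers both ids
        if evalue <= max_evalue and root_q != root_s:
            parent[root_q] = root_s

    clusters = {}
    for seq_id in list(parent):
        clusters.setdefault(find(seq_id), set()).add(seq_id)
    # Largest clusters first, ties broken alphabetically.
    return sorted((sorted(c) for c in clusters.values()),
                  key=lambda c: (-len(c), c))


if __name__ == "__main__":
    # Made-up hits, for illustration only.
    example_hits = [("a", "b", 1e-50), ("b", "c", 1e-20),
                    ("d", "e", 1e-30), ("c", "d", 1.0)]
    print(precluster(example_hits))
```

Each coarse cluster would then be passed to a finer clustering method, which is where a tool such as TRIBE-MCL fits in the pipeline described above.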

Genomics and genetics

The tomato genome project has been actively sequencing tomato using a BAC-by-BAC approach since 2004, with sequenced BACs continually released on SGN ( 1 ). In 2009, a whole-genome-shotgun component was added to complement the BAC-by-BAC sequencing, with both being merged to form a finished assembly. This assembly is being continually refined and corrected with new assembly versions released frequently. The first pre-release of the S. lycopersicum whole-genome shotgun was published on 1 December 2009 as a common effort of the international tomato genome sequencing consortium. The previously-sequenced BAC sequences have been incorporated in the assembly covering the 12 tomato chromosomes with 91 scaffolds (release version 2.30) ( http://solgenomics.net/genomes/Solanum_lycopersicum/index.pl ).

SGN is also involved in the annotation of the S. lycopersicum genome as part of the ITAG. The ITAG group has created a distributed annotation pipeline, where each group runs a part of the analysis ( http://www.ab.wur.nl/TomatoWiki ) ( 1 ). Other genome sequences are hosted at the SGN database, such as S. phureja (wild potato) or S. pimpinellifolium (wild tomato species; http://solgenomics.net/genomes/Solanum_pimpinellifolium/ ), soon to be followed by new genome sequences ( http://solgenomics.net/static_content/solanaceae-project/docs/SOL_newsletter_Jun_10.pdf ), e.g. S. pennellii , a wild tomato species.

SGN stores genome sequences and annotations using two different approaches: genomic elements are stored in a Chado relational schema ( 11 ) and also as GFF3 ( http://sequenceontology.org/resources/gff3.html ) and FASTA bulk files available for download. This combination allows tight integration of genomic features with other elements of the SGN database, such as markers and unigenes, but also allows straightforward integration of tools like BLAST ( 33 ) for sequence searches based on homology or GBrowse to visualize genome regions and their annotations ( 34 ).
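For readers who want to work with the bulk GFF3 files directly, the sketch below parses a single GFF3 feature line into a Python object. It relies only on the GFF3 format itself (nine tab-separated columns, with URL-escaped key=value attributes in the last column). The class and function names are hypothetical, and the example line is made up rather than taken from an SGN file.

```python
from dataclasses import dataclass
from typing import Dict
from urllib.parse import unquote


@dataclass
class Gff3Feature:
    seqid: str
    source: str
    feature_type: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    score: str
    strand: str
    phase: str
    attributes: Dict[str, str]


def parse_gff3_line(line: str) -> Gff3Feature:
    """Parse one non-comment GFF3 feature line (nine tab-separated columns)."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 columns, found {len(cols)}")
    attributes = {}
    for pair in cols[8].split(";"):
        if pair:
            key, _, value = pair.partition("=")
            attributes[unquote(key)] = unquote(value)
    return Gff3Feature(cols[0], cols[1], cols[2], int(cols[3]), int(cols[4]),
                       cols[5], cols[6], cols[7], attributes)


if __name__ == "__main__":
    # Made-up feature line, for illustration only.
    example = "chr01\texample_source\tgene\t1000\t2500\t.\t+\t.\tID=gene:example1;Name=example%201"
    print(parse_gff3_line(example))
```

Comment lines (starting with `#`) and embedded FASTA sections would need to be skipped before calling the parser on a whole file.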

Complementing the genome sequence, SGN hosts more than 20 genetic and physical maps for tomato ( 35 ), potato ( 36 ), pepper and tobacco ( 37 ) with thousands of markers. Genetic marker types in the database include AFLP, CAPS, PCR, RFLP, SNP, SSR and dCAPS. Genetic and physical maps are stored in a custom schema and can be accessed from the SGN toolbar or using different tools, including database searches, BLAST ( 33 ) or the SGN comparative viewer ( 38 ). SGN has also developed tools to facilitate the design of genetic markers, such as the CAPS designer ( http://solgenomics.net/tools/caps_designer/caps_input.pl ).
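The idea behind a CAPS (cleaved amplified polymorphic sequence) marker is that a restriction enzyme cuts the amplified fragment from one allele but not from the other. The Python sketch below illustrates that idea only; it is not the algorithm used by the SGN CAPS designer. It assumes a small hard-coded set of palindromic recognition sites and reports enzymes whose number of sites differs between two allele sequences. The function names and example sequences are hypothetical.

```python
# Recognition sequences of three common restriction enzymes (all palindromic,
# so scanning one strand is sufficient).
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def site_positions(seq: str, site: str) -> list:
    """Return 0-based start positions of every occurrence of `site` in `seq`."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def caps_candidates(allele_a: str, allele_b: str, enzymes=ENZYMES) -> dict:
    """Return enzymes that cut the two alleles a different number of times.

    The result maps enzyme name -> (site positions in allele A, in allele B).
    """
    a, b = allele_a.upper(), allele_b.upper()
    candidates = {}
    for name, site in enzymes.items():
        pos_a, pos_b = site_positions(a, site), site_positions(b, site)
        if len(pos_a) != len(pos_b):
            candidates[name] = (pos_a, pos_b)
    return candidates


if __name__ == "__main__":
    # Made-up alleles differing by one SNP inside an EcoRI site.
    print(caps_candidates("ACGTGAATTCTTGACC", "ACGTGAGTTCTTGACC"))
```

A real design tool would also consider fragment sizes after digestion and a much larger enzyme catalogue.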

Metabolic pathways

Another important component of SGN is the annotation and cataloguing of genes involved in metabolic pathways. SolCyc is a Pathway Genome Database (PGDB) for Solanaceae species, such as tomato, potato, tobacco, pepper, eggplant, petunia, and close relatives, such as coffee ( http://solcyc.solgenomics.net/ ). Currently, SolCyc comprises 7 PGDBs with approximately 1250 pathways, 6200 enzymatic reactions, 8600 enzymes and 4900 compounds for seven different species. SolCyc is based on the pathway tools software suite ( 39 ).

Phenomics

One of the most important problems of the post-genomic era is linking sequences to phenotypes. To solve this problem, generation of vast amounts of sequence data must be accompanied by a corresponding amount of phenotypic data for hundreds or thousands of accessions and mutants. SGN has developed an infrastructure for storing, displaying and curating phenotypic data called Phenome, which relies heavily on elements of the Chado schema and makes significant use of JavaScript/JSON ( 9 ) to provide a dynamic and responsive user interface.

Phenotypic data can be linked to loci, alleles, accessions, ontology annotations, publications and populations, with loci acting as the central data type linking phenomic and genomic data. A locus can have different alleles responsible for different phenotypes in a given accession or group of accessions. Accessions are also grouped into plant populations, for example, quantitative trait loci (QTL) or mapping populations. A trait is a phenotypic character analyzed in some population to study the distribution among the accessions. Currently SGN contains information on 7100 alleles, 5800 loci, 8200 accessions and 20 populations.
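The relationships described above (loci with alleles, accessions carrying alleles, populations grouping accessions) can be pictured with a small Python sketch. This is only an illustration of the concepts and does not reproduce the Chado-based schema that SGN actually uses; all class names, field names and example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Allele:
    symbol: str
    phenotype: str  # free-text phenotype description


@dataclass
class Locus:
    name: str
    alleles: List[Allele] = field(default_factory=list)


@dataclass
class Accession:
    name: str
    alleles: List[Allele] = field(default_factory=list)


@dataclass
class Population:
    name: str
    accessions: List[Accession] = field(default_factory=list)


def phenotypes_for_locus(locus: Locus, population: Population) -> Dict[str, List[str]]:
    """Map accession name -> phenotypes attributed to alleles of `locus`."""
    result = {}
    for accession in population.accessions:
        hits = [a.phenotype for a in accession.alleles if a in locus.alleles]
        if hits:
            result[accession.name] = hits
    return result


if __name__ == "__main__":
    # Made-up example data.
    x1 = Allele("x-1", "elongated fruit")
    x2 = Allele("x-2", "round fruit")
    locus_x = Locus("locus_x", [x1, x2])
    pop = Population("example_population",
                     [Accession("acc1", [x1]), Accession("acc2", [x2]),
                      Accession("acc3", [])])
    print(phenotypes_for_locus(locus_x, pop))
```

Here the locus is the hub of the query, mirroring its role in the text as the central data type linking phenomic and genomic data.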

An ontology has been developed over several years to describe the phenotypic traits of the Solanaceae (Solanaceae phenotype ontology, SPO http://solgenomics.net/chado/cvterm.pl?action=view&cvterm_id=23057 ) with an emphasis on usability by both the scientific and breeder communities. The ontology currently contains about 200 terms and more terms are added as needed. Terms are mapped whenever applicable to the standard PO ( 18 ), as well as the phenotype and trait ontology (PATO, http://obofoundry.org/wiki/index.php/PATO:Main_Page ).

A web-based quantitative trait locus (QTL) analysis tool ( http://solgenomics.net/qtl ) based on R/QTL ( 40 ) has been developed for mapping QTLs, experimental crosses and cross-linking putative QTLs to relevant genomic, genetic and expression datasets in SGN (manuscript in preparation). There are currently three QTL populations with more than 40 different quantitative traits stored in the SGN database ( 41 ). Users can upload and analyze their own data on the fly and decide whether their data should be publicly visible or kept private.

TECHNOLOGY

From the user's perspective, SGN is a COD containing biological data for Solanaceae and related species, but from a technical perspective, it is a highly complex system for integrating diverse data, standard tools and custom code ( Figure 3 ), written primarily in the Perl programming language. All software source code and daily development logs are publicly viewable at http://github.com/solgenomics/ .

Like many websites, SGN is implemented as a three-tiered architecture consisting of user-facing view code, control and data modules and a relational database backend. The site runs on a complete modern Perl software stack: Mason, Catalyst and DBIx::Class, with adaptors providing support for legacy CGI and custom-SQL code. The relational database is PostgreSQL ( 42 ). Flat files are used for some storage purposes to complement the relational database, e.g. for storing large assemblies, images and sequence sets.

Many of the site's core functions are provided by generic model organism database (GMOD) tools ( http://gmod.org ), such as Chado, Bio::Chado::Schema and GBrowse, leaving SGN developers free to integrate more data and custom tools. Currently, over 600 SGN-developed Perl classes underlie the site, with a rich data model ( Figure 2 ), but the codebase is made more concise and powerful by contributing to and integrating code from many open source projects, including GBrowse, Chado, BioPerl, DBIx::Class, Catalyst and Moose.

To produce the site software, SGN uses an Agile software development process, incorporating test-driven development (TDD) ( 43 ) and continuous integration ( 44 ). Under TDD, detailed test programs are written for each aspect of the system's function and tied together to be run easily, usually many times per day. Consistent adherence to TDD is a powerful tool for accelerating development, since the tests immediately pinpoint most problems, thus greatly reducing debugging time. This increases efficiency by allowing SGN to produce new site features and open-source software quickly and with high quality.

COMMUNITY PARTICIPATION

Traditional data curation demands tremendous personnel resources for a database. For example, in tomato research alone, more than 800 gene-related articles were published in 2009 according to PubMed ( http://www.ncbi.nlm.nih.gov/pubmed/ ). Most databases do not have enough curators to keep the site up-to-date in the face of such overwhelming numbers of publications. Therefore, SGN has developed a community curation system under which certain genes and phenotypes can be curated by domain experts called ‘locus editors’, who have privileges to add, edit and remove information on genes and phenotypes for a certain gene. Locus editors are chosen by SGN curators, who invite researchers to become locus editors based on journal articles or on meeting presentations ( 9 ). As of July 2010, there are over 100 locus editors curating 261 loci.

Community annotations provide important high-quality information from experts in their field for both genes and phenotypes, but they also provide a dynamic social network where researchers of different disciplines can meet, share their work and continuously submit updates on their genes of interest.

Other resources provided for the SGN user community include database help ( http://solgenomics.net/help/index.pl ) and SGN tool tutorials such as the community annotation tutorial ( http://www.slideshare.net/nm249/sgn-community-annotation-tutorial?type=presentation ). SGN also supplies other ‘social tools’ such as email lists for news announcements ( http://rubisco.sgn.cornell.edu/mailman/listinfo/sgn-announce/ ) and an SGN blog ( http://solgenomics.blogspot.com/ ) where data, community or code topics are discussed.

FUTURE DIRECTIONS

In the near future, over 100 Solanaceae genomes will be sequenced under the SOL-100 project, which will include many less-studied Solanaceae species, varieties and cultivars reflecting the natural biodiversity of this family. In addition, RNA-Seq will be performed on a much larger scale than ever before in the Solanaceae. The challenge for SGN will be to integrate this information while retaining an easy-to-use and responsive user interface. In the short term, many of the existing databases and tools will be improved; e.g. a clustering feature will be added to the expression database and improved metabolic and plant ontology searches will be added. In the mid-term, systems biology tools for the browsing, curation and visualization of gene network data will be implemented.

Behind the scenes, SGN's codebase is being developed in the direction of creating a generic, reusable, modular and flexible platform suitable for use by other organism databases. In the long run, it is our hope that this will provide opportunities for organizations to cooperate more closely on software development, thereby reducing the endless re-implementation of the same site features at many different databases that is still so common today. SGN actively participates in and contributes to the GMOD project, which has made great strides to combat ‘reinventing the wheel’, but a lot of work remains to be done.

While SGN has been created primarily with the molecular biologist and the geneticist in mind, a prime focus of current development is on improving the site's usefulness to breeders, who are the crucial link between the advances in the laboratory and improvements in the field, ultimately translating scientific progress into better varieties and contributing to healthier diets and more sustainable agriculture. The recently created breeders’ toolbox ( http://solgenomics.net/breeders/ ) will be expanded further, in collaboration with the breeders themselves, to create a comprehensive solution to give breeders easy, intuitive access to the wealth of data in SGN.

CONCLUSIONS

The SGN database is an important resource for Solanaceae scientific research. It currently has over 1000 registered users and more than 6000 unique visitors per month, generating more than 150 000 page views. With many new resources for the Solanaceae coming on-line, usage of SGN can be expected to grow considerably in the future.

Besides the new datasets that have been added, the way SGN interacts with the rest of the world has evolved. SGN actively contributes to open source projects. SGN's ‘radically open’ software development model offers possibilities for increasing software cooperation with other databases. Most importantly, new community curation tools establish a direct line of communication between the online database and the data producers, with many positive implications for the whole research community.

FUNDING

This project was funded by the National Science Foundation (NSF), the United States Department of Agriculture (USDA) and ATC Inc (Advanced Technologies, Cambridge).

Conflict of interest statement . None declared.

ACKNOWLEDGEMENTS

The authors gratefully acknowledge Joyce van Eck for her ongoing contribution to the breeders’ toolbox, and the National Science Foundation (NSF), the United States Department of Agriculture (USDA) and ATC Inc. for funding of SGN.

REFERENCES

1. LA, RK, SD, JJ, R, J, ZJ, Jv, R, AA, et al. Plant Genome, vol. 2 (pg. -92)
2. M, M, WE, S, JS, LA, J, MS, YJ, Z, et al. Genome sequencing in microfabricated high-density picolitre reactors, 2005 (pg. 376-)
3. G, A, M, AP. A new class of cleavable fluorescent nucleotides: Synthesis and optimization as reversible terminators for DNA sequencing by synthesis, 2008, pg. e25
4. J, GJ, NB, X, JP, AM, MD, K, RD, GM. Accurate multiplex polony sequencing of an evolved bacterial genome, 2005 (pg. 1728-)
5. TD, PR, H, E, J, I, M, J, J, JW, et al. Science, vol. 320 (pg. -109)
6. KJ, CS, DR, JS, SW. A flexible and efficient template format for circular consensus sequencing and SNP detection, 2010, pg. e159
7. SR, R, G, KR, MC, SS, DG, JE, BC, EL, et al. Saccharomyces genome database provides mutant phenotype data, 2010 (pg. D433-)
8. D, C, P, TZ, M, H, D, T, R, L, et al. The Arabidopsis information resource (TAIR): Gene structure and function annotation, 2008 (pg. D1009-)
9. N, RM, I, LA. A community-based annotation framework for linking Solanaceae genomes with phenomes, 2008 (pg. 1788-)
10. M, G, H. Nucleic Acids Res., vol. 38 (pg. -D871)
11. CJ, DB. A Chado case study: An ontology-based modular schema for representing genome-associated biological information, 2007 (pg. i337-)
12. T, DB, SE, P, D, C, IF, A, M, KA, et al. NCBI GEO: Archive for high-throughput functional genomic data, 2009 (pg. D885-)
13. H, M, N, G, M, N, H, M, I, A, et al. ArrayExpress update – from an archive of functional genomics experiments to the atlas of gene expression, 2009 (pg. D868-)
14. AD, JE, PA, C, T, JA, GH. Photolithographic synthesis of high-density oligonucleotide probe arrays, 2001 (pg. 525-)
15. KD, A, GW, F, LA, SA, L. TobEA: An atlas of tobacco gene expression from seed to senescence, 2010, pg. 142
16. S, Y, K, A, T, N, Y, T, T, C, et al. Coexpression analysis of tomato genes and experimental verification of coordinated expression of genes found in a functionally enriched coexpression module, 2010 (pg. 105-)
17. AL, DP, M, E, DR, C, RL. Studies of a biochemical factory: Tomato trichome deep expressed sequence tag sequencing and proteomics, 2010 (pg. 1212-)
18. S, CW, K, P, EA, S, A, L, SY, MM, et al. The plant ontology database: A community resource for plant structure and developmental stages controlled vocabulary and annotations, 2008 (pg. D449-)
19. JA, R, F, H, JM, J, H, L. The proteomics identifications database: 2010 update, 2010 (pg. D736-)
20. E, A, C, I, RD, A. ExPASy: The proteomics server for in-depth protein knowledge and analysis, 2003 (pg. 3784-)
21. N, N, D, B, S. Methods Mol. Biol., vol. 406 (pg. -227)
22. U, Consortium. From protein sequences to 3D-structures and beyond: The example of the UniProt knowledgebase, 2010 (pg. 1049-)
23. CH, LS, H, L, J, Y, Z, P, RS, BE, et al. Nucleic Acids Res., vol. 31 (pg. -347)
24. S, K, J, GJ, T, K, H, HM. Data deposition and annotation at the worldwide protein data bank, 2009 (pg. 1-)
25. C, CV, P. ESTScan: A program for detecting, evaluating, and reconstructing potential coding regions in EST sequences, 1999 (pg. -148)
26. AG, H. Neural network prediction of translation initiation sites in eukaryotes: Perspectives for EST and genome analysis, 1997 (pg. 226-)
27. M, M, M, T. MASCOT: Multiple alignment system for protein sequences based on three-way dynamic programming, 1993 (pg. 161-)
28. EM, R. InterProScan–an integration platform for the signature-recognition methods in InterPro, 2001 (pg. 847-)
29. JD, H, G, S. Improved prediction of signal peptides: SignalP 3.0, 2004 (pg. 783-)
30. AJ, S, CA. An efficient algorithm for large-scale detection of protein families, 2002 (pg. 1575-)
31. RC. MUSCLE: Multiple sequence alignment with high accuracy and high throughput, 2004 (pg. 1792-)
32. JC, D. Curr. Protoc. Bioinformatics, Chapter 6, Unit 6.4
33. SF, W, W, EW, DJ. J. Mol. Biol., vol. 215 (pg. -410)
34. LD, C, S, M, M, A, E, JE, TW, A, et al. The generic genome browser: A building block for a model organism system database, 2002 (pg. 1599-)
35. MJ, E. A comparative analysis into the genetic bases of morphology in tomato varieties exhibiting elongated fruit shape, 2008 (pg. 647-)
36. SD, MW, JP, MC, MW, P, TM, JJ, S, GB. High density molecular linkage maps of the tomato and potato genomes, 1992 (pg. 1141-)
37. G, R, I, J, M, L, F, P. A microsatellite marker based linkage map of tobacco, 2007 (pg. 341-)
38. LA, TH, N, B, R, J, C, MH, R, Y, et al. The SOL genomics network: A comparative resource for Solanaceae biology and beyond, 2005 (pg. 1310-)
39. PD, S, P. Bioinformatics, vol. 18 (pg. S225-)
40. KW, H, S, GA. Bioinformatics, vol. 19 (pg. -890)
41. MT, JB, AJ, E. Morphological variation in tomato: A comprehensive study of quantitative trait loci controlling fruit shape and development, 2007 (pg. 1339-)
42. PostgreSQL documentation ( http://www.postgresql.org/docs/9.0/static/intro-whatis.html )
43. K. 2003. Introduction, xvii. Addison Wesley Longman Publishing Co., Reading, Massachusetts
44. C. Agile and Iterative Development: A Manager's Guide. Addison-Wesley Longman Publishing Co.
© The Author(s) 2010. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.