Αr35 RNA


αr35 is a family of bacterial small non-coding RNAs with representatives in a restricted group of Alphaproteobacteria from the order Hyphomicrobiales. The first member of this family (Smr35B) was found in a Sinorhizobium meliloti 1021 locus located in the symbiotic plasmid pSymB. Further homology and structure conservation analyses have identified full-length Smr35B homologs in other legume symbionts, as well as in the human and plant pathogens Brucella anthropi and Agrobacterium tumefaciens, respectively. αr35 RNA species are 139-142 nt long and share a common secondary structure consisting of two stem-loops and a well-conserved Rho-independent terminator. Most of the αr35 transcripts can be catalogued as trans-acting sRNAs expressed from well-defined promoter regions of independent transcription units within intergenic regions of the alphaproteobacterial genomes.

Discovery and Structure

Smr35B sRNA was first described by del Val et al. as a result of a computational comparative genomic approach applied to the intergenic regions of the reference S. meliloti 1021 strain. Northern hybridization experiments confirmed that the predicted smr35B locus did express a single transcript of the expected size, which accumulated differentially in free-living and endosymbiotic bacteria. TAP-based 5'-RACE experiments mapped the transcription start site of the full-length Smr35B transcript to position 577,730 in the S. meliloti 1021 genome, whereas the 3'-end was assumed to be located at position 577,868, matching the last residue of the consecutive stretch of Us of a bona fide Rho-independent terminator. A later deep sequencing-based characterization of the small RNA fraction of S. meliloti further confirmed the expression of Smr35B and mapped the 5'- and 3'-ends of the molecule to the positions proposed earlier.
The nucleotide sequence of Smr35B was initially used as a query to search the Rfam database. This homology search returned no matches to known bacterial sRNAs. Smr35B was next searched with BLAST, using default parameters, against all the bacterial genomes then available. The regions exhibiting significant homology to the query sequence were extracted to build a seed alignment, from which a covariance model (CM) was created using Infernal. This CM was used in a further search for new members of the αr35 family in the existing bacterial genomic databases.
| CM model | Name | GI number | Accession number | Begin | End | Strand | %GC | Length | Organism |
|---|---|---|---|---|---|---|---|---|---|
| αr35 | Smr35B | 16263748 | NC_003078.1 | 577730 | 577868 | + | 52 | 139 | Sinorhizobium meliloti 1021 plasmid pSymB |
| αr35 | Atr35C | 159185562 | NC_003063.2 | 132595 | 132733 | + | 48 | 139 | Agrobacterium tumefaciens str. C58 chromosome linear |
| αr35 | Rlvr35C | 116249766 | NC_008380.1 | 2256716 | 2256853 | + | 55 | 138 | Rhizobium leguminosarum bv. viciae 3841 |
| αr35 | Rlt1325r35p04 | 241258599 | NC_012852.1 | 114247 | 114385 | - | 56 | 139 | Rhizobium leguminosarum bv. trifolii WSM1325 plasmid pR132504 |
| αr35 | Rlt1325r35p02 | 241666492 | NC_012858.1 | 466255 | 466394 | - | ? | 140 | Rhizobium leguminosarum bv. trifolii WSM1325 plasmid pR132502 |
| αr35 | ReCFNr35f | 86360734 | NC_007766.1 | 136368 | 136508 | + | 57 | 141 | Rhizobium etli CFN 42 plasmid p42f |
| αr35 | Oar35CII | 153010078 | NC_009668.1 | 1587138 | 1587279 | - | 52 | 142 | Brucella anthropi ATCC 49188 chromosome 2 |
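The begin and end columns of the table above are 1-based, inclusive genome coordinates, so the listed length of each locus should equal end - begin + 1. The following is a minimal sketch of that consistency check in Python; the names LOCI, locus_length and check_lengths are illustrative and do not come from the original study.

```python
# Coordinates copied from the table above: (name, begin, end, listed length).
LOCI = [
    ("Smr35B", 577730, 577868, 139),
    ("Atr35C", 132595, 132733, 139),
    ("Rlvr35C", 2256716, 2256853, 138),
    ("Rlt1325r35p04", 114247, 114385, 139),
    ("Rlt1325r35p02", 466255, 466394, 140),
    ("ReCFNr35f", 136368, 136508, 141),
    ("Oar35CII", 1587138, 1587279, 142),
]


def locus_length(begin: int, end: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates."""
    return end - begin + 1


def check_lengths(loci):
    """Return the names whose computed length differs from the listed one."""
    return [name for name, begin, end, listed in loci
            if locus_length(begin, end) != listed]


if __name__ == "__main__":
    for name, begin, end, listed in LOCI:
        print(name, locus_length(begin, end), listed)
    print("mismatches:", check_lengths(LOCI))
```

An empty list of mismatches indicates that the coordinates and lengths in the table agree with each other.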

The results were manually inspected to deduce a consensus secondary structure for the family. The consensus structure was also independently predicted with the program locARNATE, which gave very similar results. Manual inspection of the 84 sequences found with the CM using Infernal identified seven true homologs: two copies in Rhizobium leguminosarum bv. viciae, two copies in Rhizobium leguminosarum bv. trifolii WSM1325, one in Rhizobium etli CFN 42 plasmid p42f, and one each in the chromosomes of Agrobacterium tumefaciens and Brucella anthropi. All these sequences showed significant Infernal E-values and bit-scores. In the case of S. meliloti, a second copy was identified in the symbiotic plasmid pSymB with a significant E-value, but no expression has been detected under any of the tested conditions. The rest of the sequences found with the model showed high E-values and very low bit-scores, which is usually a sign of remote homology. However, manual inspection of these cases showed that the Rho-independent terminator and the second stem were the only conserved regions, while the first stem was missing. This arrangement of second stem plus terminator was widespread among the Alphaproteobacteria and was especially conserved in Brucella species.

Expression information

Smr35B expression was first assessed by del Val et al. in S. meliloti 1021 under different biological conditions, i.e. bacterial growth in TY, minimal medium (MM) and luteolin-MM broth, and endosymbiotic bacteria. Expression of Smr35B in free-living bacteria was found to be growth-dependent, with the gene being down-regulated when bacteria entered the stationary phase. Supplementation of MM with luteolin, the plant flavone that specifically induces transcription of the S. meliloti nodulation genes, stimulated the expression of Smr35B by ~4-fold. In contrast, the Smr35B transcript was not detected in mature nodule tissues. Schlüter et al. further described up-regulation of Smr35B upon an osmotic upshift.

Promoter Analysis

All αr35 loci have recognizable σ70-dependent promoters showing a -35/-10 consensus motif CTTAGAC-n17-CTATAT, previously shown to be widely conserved among several other genera in the Alphaproteobacteria. To identify binding sites for other known transcription factors, the FASTA sequences provided by RegPredict and the position weight matrices provided by RegulonDB were used. A position-specific weight matrix (PSWM) was built for each transcription factor from the RegPredict sequences using the Consensus/Patser program, choosing the best final matrix for motif lengths between 14 and 30 when the length had not been previously specified for that matrix. In addition, conserved unknown motifs were searched for using MEME, and relaxed regular expressions were applied to the promoters of all Smr35B homologs. Only an inverted repeat structure built around the motif T-N11-A was found, 55 nt upstream of the transcription start site of Smr35B in S. meliloti; it is a degenerate form of the known conserved nod boxes. This characteristic sequence has been proposed as the specific binding site for LysR-type proteins. All promoter regions of the seed Smr35B homologs presented the motif as well.
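A regular-expression scan for the -35/-10 consensus described above can be sketched as follows. This is only an illustration: the relaxed pattern (one free position in each box, spacer of 16 to 18 nt) and both demo sequences are assumptions made for the example, not the expressions or promoter sequences used in the original analysis.

```python
import re

# -35/-10 consensus from the text: CTTAGAC, a 17-nt spacer, then CTATAT.
STRICT = re.compile(r"CTTAGAC[ACGT]{17}CTATAT")

# Hypothetical relaxed form (an assumption for illustration): one free
# position in each box and a spacer of 16-18 nt.
RELAXED = re.compile(r"CTTA[ACGT]AC[ACGT]{16,18}CTA[ACGT]AT")


def find_promoters(seq, pattern=STRICT):
    """Return (0-based start, matched text) for non-overlapping matches."""
    seq = seq.upper()
    return [(m.start(), m.group()) for m in pattern.finditer(seq)]


# Synthetic sequences, made up for this example only.
DEMO_EXACT = "GGCA" + "CTTAGAC" + "A" * 17 + "CTATAT" + "GGCATC"
DEMO_VARIANT = "TT" + "CTTATAC" + "G" * 16 + "CTACAT" + "AA"

if __name__ == "__main__":
    print(find_promoters(DEMO_EXACT))             # strict consensus
    print(find_promoters(DEMO_VARIANT))           # no strict match expected
    print(find_promoters(DEMO_VARIANT, RELAXED))  # relaxed pattern
```

The strict pattern only reports exact copies of the consensus, whereas the relaxed one tolerates a variable base in each box and a slightly shorter or longer spacer, which is the general idea behind scanning homologous promoters with relaxed expressions.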

Genomic Context

Most of the members of the αr35 family are trans-encoded sRNAs transcribed from independent promoters in the intergenic regions (IGRs) of the rhizobial megaplasmids. Exceptions are the Smr35B homologs of R. leguminosarum bv. viciae and R. etli CFN 42 plasmid p42f, which are encoded in the opposite strand of annotated genes, partially overlapping ORFs. The predicted protein products of these overlapping ORFs could not be assigned to any functional category on the basis of amino acid sequence homology. Thus, these αr35 members are putative cis-encoded antisense sRNAs.
The genomic regions of the trans-encoded αr35 sRNAs exhibit partial conservation mainly limited to the sRNA-coding sequence and one flanking gene. Most of the flanking genes of the αr35 loci encode transcription factors and proteins related to nitrogen and glutamine metabolism.
| Family | Feature | Name | Strand | Begin | End | Protein name | Annotation | Organism |
|---|---|---|---|---|---|---|---|---|
| αr35 | gene | SM_b20551 | R | 576952 | 577398 | NP_437070.1 | proteolysis | Sinorhizobium meliloti 1021 plasmid pSymB |
| αr35 | sRNA | Smr35B | D | 577730 | 577868 | | | Sinorhizobium meliloti 1021 plasmid pSymB |
| αr35 | gene | SM_b20552 | D | 578150 | 578881 | NP_437071.1 | nitrogen compound metabolic process | Sinorhizobium meliloti 1021 plasmid pSymB |
| αr35 | gene | Oant_4157 | D | 1586007 | 1587065 | YP_001372686.1 | nitrogen compound metabolic process | Brucella anthropi ATCC 49188 chromosome 2 |
| αr35 | sRNA | Oar35CII | R | 1587138 | 1587279 | | | Brucella anthropi ATCC 49188 chromosome 2 |
| αr35 | gene | Oant_4158 | R | 1587338 | 1587724 | YP_001372687.1 | proteolysis | Brucella anthropi ATCC 49188 chromosome 2 |
| αr35 | gene | RHE_PF00127 | R | 133963 | 134406 | YP_472745.1 | hypothetical protein | Rhizobium etli CFN 42 plasmid p42f |
| αr35 | gene | RHE_PF00128 | D | 136269 | 136700 | YP_472746.1 | hypothetical protein | Rhizobium etli CFN 42 plasmid p42f |
| αr35 | sRNA | ReCFNr35f | D | 136368 | 136508 | | | Rhizobium etli CFN 42 plasmid p42f |
| αr35 | gene | RHE_PF00129 | D | 137962 | 138264 | YP_472747.1 | membrane protein | Rhizobium etli CFN 42 plasmid p42f |
| αr35 | gene | Atu3124 | D | 132103 | 132318 | NP_357476.1 | | Agrobacterium tumefaciens str. C58 chromosome linear |
| αr35 | sRNA | Atr35C | D | 132595 | 132733 | | | Agrobacterium tumefaciens str. C58 chromosome linear |
| αr35 | gene | Atu3126 | D | 133057 | 133344 | NP_357475.1 | nitrogen compound metabolic process | Agrobacterium tumefaciens str. C58 chromosome linear |
| αr35 | gene | RL2133 | D | 2256297 | 2256500 | YP_767731.1 | hypothetical protein | Rhizobium leguminosarum bv. viciae 3841 |
| αr35 | gene | RL2134 | R | 2256617 | 2256982 | YP_767732.1 | hypothetical protein | Rhizobium leguminosarum bv. viciae 3841 |
| αr35 | sRNA | Rlvr35C | D | 2256716 | 2256853 | | | Rhizobium leguminosarum bv. viciae 3841 |
| αr35 | gene | RL2135 | D | 2256994 | 2257383 | YP_767733.1 | transposase-related protein | Rhizobium leguminosarum bv. viciae 3841 |
| αr35 | gene | Rleg_6079 | D | 113829 | 114197 | YP_002978585.1 | membrane protein | Rhizobium leguminosarum bv. trifolii WSM1325 plasmid pR132502 |
| αr35 | sRNA | Rlt132504r35p04 | R | 114247 | 114385 | | | Rhizobium leguminosarum bv. trifolii WSM1325 plasmid pR132502 |
| αr35 | gene | Rleg_6080 | R | 114489 | 115121 | YP_002978586.1 | endonuclease | Rhizobium leguminosarum bv. trifolii WSM1325 plasmid pR132502 |
| αr35 | gene | Rleg_7049 | D | 465959 | 466222 | YP_002985022.1 | | Rhizobium leguminosarum bv. trifolii WSM1325 plasmid pR132504 |
| αr35 | sRNA | Rlt132502r35p02 | R | 466255 | 466394 | | | Rhizobium leguminosarum bv. trifolii WSM1325 plasmid pR132504 |
| αr35 | gene | Rleg_7050 | R | 466934 | 467824 | YP_002985023.1 | transcription regulator | Rhizobium leguminosarum bv. trifolii WSM1325 plasmid pR132504 |
| αr35 | gene | pRL110105 | D | 122566 | 123456 | YP_771137.1 | transcription regulator | Rhizobium leguminosarum bv. viciae 3841 plasmid pRL11 |
| αr35 | sRNA | Rlvr35p11 | D | 124030 | 124162 | | | Rhizobium leguminosarum bv. viciae 3841 plasmid pRL11 |
| αr35 | gene | pRL110106 | R | 124229 | 124447 | YP_771138.1 | hypothetical protein | Rhizobium leguminosarum bv. viciae 3841 plasmid pRL11 |
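The distinction between trans-encoded sRNAs in intergenic regions and putative cis-encoded antisense sRNAs can be illustrated by comparing the coordinates and strands listed above. The sketch below assumes 1-based, inclusive coordinates and treats "D" and "R" as the direct and reverse strands; the Feature type and the function names are illustrative and not taken from the original study.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    name: str
    strand: str  # "D" (direct) or "R" (reverse), as in the table above
    begin: int   # 1-based, inclusive
    end: int     # 1-based, inclusive


def overlap(a: Feature, b: Feature) -> int:
    """Number of positions shared by two features (0 if they are disjoint)."""
    return max(0, min(a.end, b.end) - max(a.begin, b.begin) + 1)


def is_antisense(srna: Feature, gene: Feature) -> bool:
    """True if the sRNA overlaps the gene and lies on the opposite strand."""
    return overlap(srna, gene) > 0 and srna.strand != gene.strand


# Coordinates copied from the table above.
smr35b = Feature("Smr35B", "D", 577730, 577868)
sm_b20551 = Feature("SM_b20551", "R", 576952, 577398)
sm_b20552 = Feature("SM_b20552", "D", 578150, 578881)

rlvr35c = Feature("Rlvr35C", "D", 2256716, 2256853)
rl2134 = Feature("RL2134", "R", 2256617, 2256982)

if __name__ == "__main__":
    # Smr35B lies between its two flanking genes without overlapping either.
    print(overlap(smr35b, sm_b20551), overlap(smr35b, sm_b20552))
    # Rlvr35C lies within RL2134 on the opposite strand.
    print(overlap(rlvr35c, rl2134), is_antisense(rlvr35c, rl2134))
```

With these coordinates, Smr35B shares no positions with either flanking gene, while Rlvr35C lies entirely within RL2134 on the opposite strand, in line with its description as a putative cis-encoded antisense sRNA.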