Results 1 - 16 of 16
1.
Front Genet ; 15: 1353553, 2024.
Article in English | MEDLINE | ID: mdl-38505828

ABSTRACT

Post-genomic implementations have expanded the experimental strategies to identify elements involved in the regulation of transcription initiation. Here, we present for the first time a detailed analysis of the sources of knowledge supporting the collection of transcriptional regulatory interactions (RIs) of Escherichia coli K-12. An RI groups the transcription factor, its effect (positive or negative), and the regulated target (a promoter, a gene, or a transcription unit). We improved the evidence codes so that specific methods are incorporated and classified into independent groups. On this basis we updated the computation of confidence levels (weak, strong, or confirmed) for the collection of RIs. These updates enabled us to map the RI set to the current collection of high-throughput (HT) TF-binding datasets from ChIP-seq, ChIP-exo, gSELEX and DAP-seq in RegulonDB, enriching in this way the evidence of close to one-quarter (1329) of the RIs from the current total of 5446 RIs. Based on the new computational capabilities of our improved annotation of evidence sources, we can now analyze the internal architecture of evidence, their categories (experimental, classical, HT, computational), and confidence levels. This is how we know that the joint contribution of HT and computational methods increases the overall fraction of reliable RIs (the sum of confirmed and strong evidence) from 49% to 71%. Thus, the current collection has 3912 reliable RIs, with 2718 or 70% of them with classical evidence, which can be used to benchmark novel HT methods. Users can selectively exclude the method they want to benchmark, or keep, for instance, only the confirmed interactions. The recovery of regulatory sites in RegulonDB by the different HT methods ranges from 33% for ChIP-exo to 76% for ChIP-seq, although, as discussed, many potential confounding factors limit their interpretation.
The collection of improvements reported here provides a solid foundation to incorporate new methods and data, and to further integrate the diverse sources of knowledge of the different components of the transcriptional regulatory network. There is no other genomic database that offers this comprehensive high-quality architecture of knowledge supporting a corpus of transcriptional regulatory interactions.
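
The proportions quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the counts stated above; the variable names and the `pct` helper are illustrative and do not come from RegulonDB.

```python
# Counts quoted in the abstract above; the names are illustrative only.
total_ris = 5446           # current total of regulatory interactions (RIs)
ht_mapped_ris = 1329       # RIs enriched with HT TF-binding evidence
reliable_ris = 3912        # RIs with confirmed or strong evidence
reliable_classical = 2718  # reliable RIs that also have classical evidence


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


ht_share = pct(ht_mapped_ris, total_ris)                # close to one quarter
reliable_share = pct(reliable_ris, total_ris)           # reported as 71%
classical_share = pct(reliable_classical, reliable_ris) # reported as 70%

print(ht_share, reliable_share, classical_share)
```

The results agree with the rounded percentages given in the text.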

2.
Nucleic Acids Res ; 52(D1): D255-D264, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37971353

ABSTRACT

RegulonDB is a database that contains the most comprehensive corpus of knowledge of the regulation of transcription initiation of Escherichia coli K-12, including data from both classical molecular biology and high-throughput methodologies. Here, we describe biological advances since our last NAR paper of 2019. We explain the changes to satisfy FAIR requirements. We also present a full reconstruction of the RegulonDB computational infrastructure, which has significantly improved data storage, retrieval and accessibility and thus supports a more intuitive and user-friendly experience. The integration of graphical tools provides clear visual representations of genetic regulation data, facilitating data interpretation and knowledge integration. RegulonDB version 12.0 can be accessed at https://regulondb.ccg.unam.mx.


Subjects
Databases, Genetic; Escherichia coli K12; Gene Expression Regulation, Bacterial; Computational Biology/methods; Escherichia coli K12/genetics; Internet; Transcription, Genetic
3.
Microb Genom ; 8(5)2022 05.
Article in English | MEDLINE | ID: mdl-35584008

ABSTRACT

Genomics has set the basis for a variety of methodologies that produce high-throughput datasets identifying the different players that define gene regulation, particularly regulation of transcription initiation and operon organization. These datasets are available in public repositories, such as the Gene Expression Omnibus or ArrayExpress. However, accessing and navigating such a wealth of data is not straightforward. No resource currently exists that offers all available high- and low-throughput data on transcriptional regulation in Escherichia coli K-12 for easy use either as whole datasets or as individual interactions and regulatory elements. RegulonDB (https://regulondb.ccg.unam.mx) began gathering high-throughput dataset collections in 2009, starting with transcription start sites, then adding ChIP-seq and gSELEX in 2012, with up to 99 different experimental high-throughput datasets available in 2019. In this paper we present a radical upgrade to more than 2000 high-throughput datasets, processed to facilitate their comparison, introducing up-to-date collections of transcription termination sites and transcription units, as well as transcription factor binding interactions derived from ChIP-seq, ChIP-exo, gSELEX and DAP-seq experiments, besides expression profiles derived from RNA-seq experiments. For ChIP-seq experiments we offer both the data as presented by the authors and data uniformly processed in-house, enhancing their comparability as well as the traceability of the methods and the reproducibility of the results. Furthermore, we have expanded the tools available for browsing and visualization across and within datasets. We include comparisons against previously existing knowledge in RegulonDB from classic experiments, a nucleotide-resolution genome viewer, and an interface that enables users to browse datasets by querying their metadata.
A particular effort was made to automatically extract detailed experimental growth conditions by implementing an assisted curation strategy applying natural language processing and machine learning. We provide summaries with the total number of interactions found in each experiment, as well as tools to identify common results among different experiments. This is a long-awaited resource to make use of such a wealth of knowledge and advance our understanding of the biology of the model bacterium E. coli K-12.


Subjects
Escherichia coli K12; Escherichia coli; Escherichia coli/genetics; Escherichia coli K12/genetics; Escherichia coli K12/metabolism; Gene Expression Regulation, Bacterial; Operon/genetics; Reproducibility of Results
4.
Biochim Biophys Acta Gene Regul Mech ; 1864(11-12): 194753, 2021.
Article in English | MEDLINE | ID: mdl-34461312

ABSTRACT

The number of published papers in biomedical research makes it nearly impossible for a researcher to keep up to date. This is where manually curated databases contribute by facilitating access to knowledge. However, the structure required by databases strongly limits the type of valuable information that can be incorporated. Here, we present Lisen&Curate, a curation system that facilitates linking sentences or parts of sentences (both considered sources) in articles with their corresponding curated objects, so that rich additional information about these objects is easily available to users. These sources will be offered both within RegulonDB and in a new database, L-Regulon. To show the relevance of our work, two senior curators performed a curation of 31 articles on the regulation of transcription initiation of E. coli using Lisen&Curate. As a result, 194 objects were curated and 781 sources were recorded. We also found that these sources are useful to develop automatic approaches to detect objects in articles by observing word frequency patterns and by carrying out an open information extraction task. Sources may help to elaborate a controlled vocabulary of experimental methods. Finally, we discuss our ecosystem of interconnected applications, RegulonDB, L-Regulon, and Lisen&Curate, to facilitate access to knowledge on the regulation of transcription initiation in bacteria. We see our proposal as the starting point to change the way experimentalists connect a piece of knowledge with its evidence using RegulonDB.


Subjects
Data Curation/methods; Databases, Genetic; Gene Expression Regulation, Bacterial; Transcription Initiation, Genetic; Escherichia coli/genetics
5.
Nucleic Acids Res ; 44(D1): D133-43, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26527724

ABSTRACT

RegulonDB (http://regulondb.ccg.unam.mx) is one of the most useful and important resources on bacterial gene regulation, as it integrates the scattered scientific knowledge of the best-characterized organism, Escherichia coli K-12, in a database that organizes large amounts of data. Its electronic format enables researchers to compare their results with the legacy of previous knowledge and supports bioinformatics tools and model building. Here, we summarize our progress with RegulonDB since our last Nucleic Acids Research publication describing RegulonDB, in 2013. In addition to keeping curation up to date, we report a collection of 232 interactions with small RNAs affecting 192 genes, and the complete repertoire of 189 Elementary Genetic Sensory-Response units (GENSOR units), integrating the signal, regulatory interactions, and metabolic pathways they govern. These additions represent major progress toward a higher level of understanding of regulated processes. We have updated the computationally predicted transcription factors, which total 304 (184 with experimental evidence and 120 from computational predictions); we updated our position-weight matrices and have included tools for clustering them in evolutionary families. We describe our semiautomatic strategy to accelerate curation, including datasets from high-throughput experiments, a novel coexpression distance to search for 'neighborhood' genes to known operons and regulons, and computational developments.


Subjects
Databases, Genetic; Escherichia coli K12/genetics; Gene Expression Regulation, Bacterial; Regulon; Cluster Analysis; Escherichia coli K12/metabolism; Gene Regulatory Networks; Operon; Position-Specific Scoring Matrices; RNA, Small Untranslated/metabolism; Transcription Factors/classification
6.
Database (Oxford) ; 2013: bas059, 2013.
Article in English | MEDLINE | ID: mdl-23327937

ABSTRACT

RegulonDB provides curated information on the transcriptional regulatory network of Escherichia coli and contains both experimental data and computationally predicted objects. To account for the heterogeneity of these data, we introduced in version 6.0 a two-tier rating system for the strength of evidence, classifying evidence as either 'weak' or 'strong' (Gama-Castro,S., Jimenez-Jacinto,V., Peralta-Gil,M. et al. RegulonDB (Version 6.0): gene regulation model of Escherichia coli K-12 beyond transcription, active (experimental) annotated promoters and Textpresso navigation. Nucleic Acids Res., 2008;36:D120-D124.). We now add to our classification scheme the classification of high-throughput evidence, including chromatin immunoprecipitation (ChIP) and RNA-seq technologies. To integrate these data into RegulonDB, we present two strategies for the evaluation of confidence: statistical validation and independent cross-validation. Statistical validation involves verification of ChIP data for transcription factor-binding sites, using tools for motif discovery and quality assessment of the discovered matrices. Independent cross-validation combines independent evidence with the intention to mutually exclude false positives. Both statistical validation and cross-validation allow subsets of data that are supported by weak evidence to be upgraded to a higher confidence level. Likewise, cross-validation of strong confidence data extends our two-tier rating system to a three-tier system by introducing a third confidence score, 'confirmed'. Database URL: http://regulondb.ccg.unam.mx/
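
The three-tier idea described in this abstract can be illustrated with a toy function. This is a minimal sketch under stated assumptions, not RegulonDB's actual algorithm: the `Evidence` class, the group labels, the example evidence codes, and the specific thresholds (two independent groups needed for an upgrade) are all hypothetical simplifications of "independent cross-validation".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    code: str      # hypothetical evidence code, for illustration only
    strength: str  # "weak" or "strong"
    group: str     # hypothetical label for an independent method group


def confidence_level(evidences):
    """Assumed rule set (not the published one):
    - two or more independent strong groups -> "confirmed"
    - one strong group, or two or more independent weak groups -> "strong"
    - anything else (including no evidence) -> "weak"
    """
    strong_groups = {e.group for e in evidences if e.strength == "strong"}
    weak_groups = {e.group for e in evidences if e.strength == "weak"}
    if len(strong_groups) >= 2:
        return "confirmed"
    if strong_groups or len(weak_groups) >= 2:
        return "strong"
    return "weak"


# Toy evidence sets with made-up codes and groups.
single_weak = [Evidence("EXAMPLE-W1", "weak", "expression")]
two_weak = single_weak + [Evidence("EXAMPLE-W2", "weak", "binding")]
two_strong = [
    Evidence("EXAMPLE-S1", "strong", "binding"),
    Evidence("EXAMPLE-S2", "strong", "mutation"),
]

print(confidence_level(single_weak))
print(confidence_level(two_weak))
print(confidence_level(two_strong))
```

Counting distinct groups rather than individual evidence items reflects the abstract's emphasis on independence: two weak items from the same method group do not cross-validate each other in this sketch.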


Subjects
Computational Biology/methods; Databases, Genetic; Escherichia coli/genetics; Regulon/genetics; Statistics as Topic; Biosynthetic Pathways/genetics; Chromatin Immunoprecipitation; Gene Expression Regulation, Bacterial; Gene Regulatory Networks; Position-Specific Scoring Matrices; Reproducibility of Results; Transcription Initiation Site
7.
Methods Mol Biol ; 804: 179-95, 2012.
Article in English | MEDLINE | ID: mdl-22144154

ABSTRACT

RegulonDB contains the largest and currently best-known data set on transcriptional regulation in a single free-living organism, that of Escherichia coli K-12 (Gama-Castro et al. Nucleic Acids Res 36:D120-D124, 2008). This organized knowledge has been the gold standard for the implementation of bioinformatic predictive methods on gene regulation in bacteria (Collado-Vides et al. J Bacteriol 191:23-31, 2009). Given the complexity of the different types of interactions, the difficulty of visualizing the whole network in a single figure, and the different uses of this knowledge, we are making available different views of the genetic network. This chapter describes case studies about how to access these views via precomputed files, web services and SQL, including sigma-gene relationships corresponding to transcription of alternative RNA polymerase holoenzyme promoters, as well as transcription factor (TF)-gene, TF-operon, TF-TF, and TF-regulon interactions.


Subjects
Computational Biology/methods; Data Mining/methods; Databases, Genetic; Escherichia coli K12/genetics; Gene Regulatory Networks/genetics; Regulon/genetics; Internet; Operon/genetics; Transcription Factors/genetics
8.
Nucleic Acids Res ; 39(3): 808-24, 2011 Feb.
Article in English | MEDLINE | ID: mdl-20923783

ABSTRACT

Position-specific scoring matrices (PSSMs) are routinely used to predict transcription factor (TF)-binding sites in genome sequences. However, their reliability in predicting novel binding sites can be far from optimal, due to the use of a small number of training sites or the inappropriate choice of parameters when building the matrix or when scanning sequences with it. Measures of matrix quality such as E-value and information content rely on theoretical models, and may fail in the context of full genome sequences. We propose a method, implemented in the program 'matrix-quality', that combines theoretical and empirical score distributions to assess the reliability of PSSMs for predicting TF-binding sites. We applied 'matrix-quality' to estimate the predictive capacity of matrices for bacterial, yeast and mouse TFs. The evaluation of matrices from RegulonDB revealed some poorly predictive motifs, and allowed us to quantify the improvements obtained by applying multi-genome motif discovery. Interestingly, the method reveals differences between global and specific regulators. It also highlights the enrichment of binding sites in sequence sets obtained from high-throughput ChIP-chip (bacterial and yeast TFs) and ChIP-seq experiments (mouse TFs). The method presented here has many applications, including: selecting reliable motifs before scanning sequences; improving motif collections in TF databases; evaluating motifs discovered using high-throughput data sets.
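
For readers unfamiliar with PSSM scanning, the sketch below shows the generic technique of building a log-odds matrix from aligned sites and scoring every window of a sequence. It is not the 'matrix-quality' program and does not compute theoretical or empirical score distributions; the training sites, the uniform background of 0.25 per base, and the pseudocount of 1 are assumptions chosen for illustration.

```python
import math


def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds matrix from aligned, equal-length DNA sites."""
    width = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pssm = []
    for pos in range(width):
        column = [site[pos] for site in sites]
        pssm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pssm


def scan(sequence, pssm):
    """Score every window of the sequence; return (start, score) pairs."""
    width = len(pssm)
    return [
        (start, sum(pssm[j][sequence[start + j]] for j in range(width)))
        for start in range(len(sequence) - width + 1)
    ]


# Toy training sites, invented for this example.
sites = ["TGACA", "TGACA", "TGTCA", "TGACT"]
pssm = build_pssm(sites)

scores = scan("AATGACAGG", pssm)
best_pos, best_score = max(scores, key=lambda item: item[1])
print(best_pos, round(best_score, 2))
```

With so few training sites the pseudocount dominates the rare-base scores, which is one concrete form of the small-training-set problem the abstract mentions.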


Subjects
Position-Specific Scoring Matrices; Promoter Regions, Genetic; Sequence Analysis, DNA; Transcription Factors/metabolism; Animals; Bacterial Proteins/metabolism; Binding Sites; Chromatin Immunoprecipitation; Genomics; Mice; Oligonucleotide Array Sequence Analysis; ROC Curve; Repressor Proteins/metabolism; Serine Endopeptidases/metabolism; Software
10.
Nucleic Acids Res ; 36(Database issue): D120-4, 2008 Jan.
Article in English | MEDLINE | ID: mdl-18158297

ABSTRACT

RegulonDB (http://regulondb.ccg.unam.mx/) is the primary reference database offering curated knowledge of the transcriptional regulatory network of Escherichia coli K12, currently the best-known electronically encoded database of the genetic regulatory network of any free-living organism. This paper summarizes the improvements, new biology and new features available in version 6.0. Curation of original literature is, from now on, up to date for every new release. All the objects are supported by their corresponding evidence, now classified as strong or weak. Transcription factors are classified by origin of their effectors and by gene ontology class. We now have computational predictions for sigma(54) and five different promoter types of the sigma(70) family, as well as their corresponding -10 and -35 boxes. In addition to those curated from the literature, we added about 300 experimentally mapped promoters coming from our own high-throughput mapping efforts. RegulonDB v.6.0 now expands beyond transcription initiation, including RNA regulatory elements, specifically riboswitches, attenuators and small RNAs, with their known associated targets. The data can be accessed through overviews of correlations about gene regulation. The original literature associated with RegulonDB, together with more than 4000 curation notes, can now be searched with the Textpresso text mining engine.


Subjects
Databases, Genetic; Escherichia coli K12/genetics; Gene Expression Regulation, Bacterial; Gene Regulatory Networks; Computational Biology; Internet; Models, Genetic; Promoter Regions, Genetic; Regulatory Sequences, Ribonucleic Acid; Regulon; Sigma Factor/metabolism; Software; Transcription Factors/metabolism; Transcription Initiation Site; Transcription, Genetic