April 09, 2025
YeastPathways, the database of metabolic pathways and enzymes in the budding yeast Saccharomyces cerevisiae, is manually curated and maintained by the biocuration team at SGD.
This resource is jam-packed with information, but has been somewhat hidden from view. We have recently taken several steps to make the pathways more readily accessible. Some time ago, we added a new section with pathway links to the relevant gene pages (e.g., DFR1).
We also made the pathways available in SGD Search.
Now we have transformed the metabolic pathways and their associated genes/enzymes into Gene Ontology (GO) annotations (e.g., DFR1).
Because many fundamental molecular processes and pathways are evolutionarily conserved between yeast and higher eukaryotes, including humans, the curated metabolic pathway information has great value for the transfer of knowledge to other organisms. For this reason, the YeastPathways data were exported in BioPAX format (Demir et al. 2010) for import into Noctua, a tool for collaborative curation of biological pathways and gene annotations developed by the GO Consortium (Thomas et al. 2019). BioPAX provides a standardized format for representing biological pathways, allowing researchers to integrate pathway information from different sources and databases. Noctua can import BioPAX-encoded pathway data to populate its pathway editor with molecular interactions, biological processes, and regulatory relationships, and can use BioPAX files to combine pathway data from multiple datasets for curation and analysis.
Pathways curated and edited in Noctua can be exported either as GO annotations for yeast and orthologous genes in other species, or as pathway annotations in BioPAX. The standard format facilitates sharing curated pathways with other researchers, databases, and pathway analysis tools, promoting data exchange and collaboration within the scientific community.
Categories: Data updates
June 19, 2024
The S. cerevisiae strain S288C reference genome annotation was updated. The new genome annotation is release R64.5.1, dated 2024-05-29. Note that the underlying genome sequence itself was not altered; the chromosome sequences remain unchanged.
This annotation update included (details in table below):
Chr | Feature | Description of change | Reference |
---|---|---|---|
II | ATG12/YBR217W | New uORF chrII:657824..657835, partially overlaps CDS | Yang Y, et al. (2023) PMID:35363116 |
IV | YDL204W-A | New ORF chrIV:94133..94285 | Wacholder A, et al. (2023) PMID:37164009 |
VI | YFR035W-A | New ORF chrVI:226260..226550 | Wacholder A and Carvunis AR (2023) PMID:38048358 |
VII | YGR016C-A | New ORF chrVII:523353..523246 | Wacholder A, et al. (2023) PMID:37164009, Chang S, et al. (2023) PMID:37927910 |
IX | EFM4/YIL064W | Move start 84 nucleotides downstream, new coordinates chrIX:242027..242716 | Hamey JJ, et al. (2024) PMID:38199565 |
IX | YIL059C | Change ORF qualifier from Dubious to Verified because stable translation product detected | Wacholder A and Carvunis AR (2023) PMID:38048358 |
XIII | YMR106W-A | New ORF chrXIII:480924..481187 | Wacholder A and Carvunis AR (2023) PMID:38048358 |
XIV | YNL040C-A | New ORF chrXIV:552558..552478 | Wacholder A, et al. (2023) PMID:37164009 |
XIV | YNL155C-A | New ORF chrXIV:342135..341911 | Wacholder A and Carvunis AR (2023) PMID:38048358 |
XV | ATG19/YOL082W | New uORF chrXV:168632..168679 | Yang Y, et al. (2023) PMID:35363116 |
XVI | ATG5/YPL149W | 4 new uORFs: chrXVI:271236..271277, chrXVI:271252..271302, chrXVI:271299..271307, chrXVI:271302..271307 | Yang Y, et al. (2023) PMID:35363116 |
XVI | ATG13/YPR185W | New uORF chrXVI:907211..907351, partially overlaps CDS | Yang Y, et al. (2023) PMID:35363116 |
Categories: Data updates
March 01, 2024
The saccharomyces_cerevisiae.gff contains sequence features of Saccharomyces cerevisiae and related information such as Locus descriptions and GO annotations. It is fully compatible with Generic Feature Format Version 3. It is updated weekly.
After November 2020, SGD updated the transcripts in the GFF file to reflect the experimentally determined transcripts (Pelechano et al. 2013, Ng et al. 2020), when possible. The longest transcripts were determined for two different growth media: galactose and dextrose. When available, experimentally determined transcripts for one or both conditions were added for a gene. When these data were absent, transcripts matching the start and stop coordinates of an open reading frame (ORF) were used.
Beginning in February 2024, SGD increased the start and stop coordinates of genes to encompass the start and stop coordinates of the longest experimentally determined transcripts, regardless of condition. This change was made in order to comply with JBrowse 2, a newer and more extensible genome browser, which requires that parent features in GFF files (genes) are larger than child features (mRNA, CDS, etc.) (Diesh et al. 2023).
GFF is a standard format used by many groups. SGD uses the GFF file to load the reference tracks in SGD’s genome browser resource.
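The coordinate change described above can be pictured with a small sketch. It is not SGD's actual pipeline: the `Feature` class, the function name, the gene name `YFG1`, and all coordinates are hypothetical and exist only for illustration. The one firm assumption is the GFF3 convention that coordinates are 1-based and inclusive, with start never greater than end.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """Minimal stand-in for one GFF3 row (1-based, inclusive coordinates)."""
    feature_id: str
    start: int
    end: int


def expand_gene_to_transcripts(gene, transcripts):
    """Return a copy of `gene` whose span covers every transcript given."""
    starts = [gene.start] + [t.start for t in transcripts]
    ends = [gene.end] + [t.end for t in transcripts]
    return Feature(gene.feature_id, min(starts), max(ends))


# Hypothetical gene and transcripts; these are not real SGD coordinates.
gene = Feature("YFG1", 1000, 2000)
transcripts = [
    Feature("YFG1_galactose", 950, 2100),
    Feature("YFG1_dextrose", 980, 2250),
]

expanded = expand_gene_to_transcripts(gene, transcripts)
print(expanded.feature_id, expanded.start, expanded.end)
```

Taking the minimum start and maximum end across both conditions mirrors the "regardless of condition" wording: the parent gene ends up covering whichever transcript reaches furthest on each side.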
Categories: Announcements, Data updates
Tags: biology, blog, genetics, news, Saccharomyces cerevisiae
September 28, 2023
YeastMine is SGD’s data warehouse, powered by InterMine. We have so many templates (i.e., pre-defined queries) that provide access to so many different kinds of data!
A big area of focus for SGD and the yeast community is alleles. Alleles are different versions of genes that vary in DNA and sometimes protein sequence. Did you know that you can easily and quickly get all curated yeast allele data directly from YeastMine?
From the YeastMine home page, click ‘Templates’ at top left. From there, filter for ‘allele’.
The Genes -> Alleles template returns data for one gene, a list of genes, or the entire genome! Data include standard and systematic names for genes, gene name descriptions, allele names and descriptions, allele types, aliases, and references. SGDIDs for genes were already included; based on user feedback, the query now also returns SGDIDs for the alleles, so that they can be used to identify and distinguish different alleles. Enjoy!
There are thousands of alleles in SGD! Give the YeastMine Genes -> Alleles template a whirl! Get all the alleles for your favorite gene or list of genes.
For help using YeastMine, please see the SGD Help Pages and YouTube Channel.
Categories: Data updates, Website changes
September 20, 2023
Back in the day, SGD maintained an FTP site to distribute data in various files. More recently, you have found these files on the SGD Downloads site. We have now moved these files to YeastMine.
From the YeastMine homepage, click Templates at top left. In the Filter, select ‘Downloads’ to constrain the list of templates.
The following templates are listed under Downloads:
• Deleted Merged Features: Retrieve all deleted and merged features.
• Retrieve Functional Complementation for genes: For gene(s), retrieve information about cross-species functional complementation between yeast and another species.
• Retrieve GO Terms: Retrieve GO Terms, including name, ID, namespace, and definition.
• Retrieve SGD chromosomal Features: Retrieve genes and other chromosomal features, including IDs, coordinates, and descriptions.
• Retrieve all cross-references for all genes: Retrieve IDs for yeast genes and gene products in other databases.
• Retrieve all domains of all genes: Retrieve Proteins/Genes that have a given domain.
• Retrieve all interactions for all genes: Retrieve physical and genetic interactions for all genes.
• Retrieve all pathways for all genes: Retrieve all metabolic pathways for all genes.
• Retrieve protein properties of all proteins of ORFs: Retrieve protein properties, including pI, molecular weight, N-terminal and C-terminal sequences, codon bias, etc. of all proteins.
For help using YeastMine, please see the SGD Help Pages and YouTube Channel.
Categories: Data updates, Tutorial, Website changes
September 08, 2023
The S. cerevisiae strain S288C reference genome annotation was updated. The new genome annotation is release R64.4.1, dated 2023-08-23. Note that the underlying genome sequence itself was not altered in any way.
This annotation update included:
Chr | Feature | Description of change | Reference |
---|---|---|---|
III | SUT035/YNCC0015W | New ncRNA chrIII:205766..205942 (+ strand) | Xu Z, et al. (2009) PMID:19169243, Balarezo-Cisneros LN, et al. (2021) PMID:33493158 |
IV | YDR278C | Change ORF qualifier from Uncharacterized to Dubious | Requested by NCBI |
IV | SUT053/YNCD0033W | New ncRNA chrIV:506334..507774 (+ strand) | Xu Z, et al. (2009) PMID:19169243, Balarezo-Cisneros LN, et al. (2021) PMID:33493158 |
IV | SUT468/YNCD0034C | New ncRNA chrIV:506546..507450 (- strand) | Xu Z, et al. (2009) PMID:19169243, Balarezo-Cisneros LN, et al. (2021) PMID:33493158 |
VII | SUT532/YNCG0047C | New ncRNA chrVII:17213..17709 (- strand) | Xu Z, et al. (2009) PMID:19169243, Balarezo-Cisneros LN, et al. (2021) PMID:33493158 |
VII | SUT125/YNCG0048W | New ncRNA chrVII:650855..651159 (+ strand) | Xu Z, et al. (2009) PMID:19169243, Balarezo-Cisneros LN, et al. (2021) PMID:33493158, Feng MW, et al. (2022) PMID:36712349 |
VII | SUT126/YNCG0049W | New ncRNA chrVII:660087..661399 (+ strand) | Xu Z, et al. (2009) PMID:19169243, Balarezo-Cisneros LN, et al. (2021) PMID:33493158 |
XII | FPS1/YLL043W | New uORF uORF2 3 codons chrXII:49924..49932 (+ strand) ATGCATTAA | Cartwright SP, et al. (2017) PMID:28279185 |
XIV | ACC1/YNR016C | New uORF 4 codons chrXIV:661704..661715 (- strand) ATGTGTTTATAA | Blank HM, et al. (2017) PMID:28057705 |
XIV | HOL1/YNR055C | New uORF 7 codons chrXIV:730381..730401 (- strand) ATGCTATTACTACCAAGTTGA | Vindu A, et al. (2021) PMID:34375581 |
XV | YOL013W-A | Change ORF qualifier from Uncharacterized to Dubious | Requested by NCBI |
XVI | SUT390/YNCP0025W | New ncRNA chrXVI:52977..53465 (+ strand) | Xu Z, et al. (2009) PMID:19169243, Feng MW, et al. (2022) PMID:36712349 |
XVI | SUT418/YNCP0026W | New ncRNA chrXVI:588998..589830 (+ strand) | Xu Z, et al. (2009) PMID:19169243, Feng MW, et al. (2022) PMID:36712349 |
XVI | YPR108W-A | Change ORF qualifier from Uncharacterized to Dubious | Requested by NCBI |
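As a quick consistency check on the uORF rows in the table above, the stated codon counts can be compared with the coordinate spans and the listed sequences. This is a minimal sketch: the regular expression and the helper name `span_length` are our own, and it assumes the coordinates are 1-based and inclusive, so a span `a..b` covers `|b - a| + 1` nucleotides.

```python
import re

# Matches spans such as "chrXIV:661704..661715"; chromosomes are Roman numerals.
COORD_RE = re.compile(r"chr([IVX]+):(\d+)\.\.(\d+)")


def span_length(coord):
    """Return the inclusive length, in nucleotides, of a 'chrN:a..b' span."""
    match = COORD_RE.fullmatch(coord)
    if match is None:
        raise ValueError(f"unrecognized coordinate string: {coord!r}")
    first, second = int(match.group(2)), int(match.group(3))
    # abs() tolerates spans written high..low, as some tables do for the - strand.
    return abs(second - first) + 1


# (feature, coordinates, stated codons, stated sequence) copied from the table.
uorfs = [
    ("FPS1/YLL043W", "chrXII:49924..49932", 3, "ATGCATTAA"),
    ("ACC1/YNR016C", "chrXIV:661704..661715", 4, "ATGTGTTTATAA"),
    ("HOL1/YNR055C", "chrXIV:730381..730401", 7, "ATGCTATTACTACCAAGTTGA"),
]

for feature, coord, codons, sequence in uorfs:
    length = span_length(coord)
    consistent = length == 3 * codons == len(sequence)
    print(feature, length, "consistent" if consistent else "MISMATCH")
```

The codon counts in the table include the stop codon, which is why each span is exactly three times the stated count.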
Various sequence and annotation files are available on SGD’s Downloads site.
Categories: Data updates
Tags: genome annotation update, Saccharomyces cerevisiae
January 20, 2022
In an exciting new paper, Humphreys et al. describe the use of deep-learning-based algorithms to predict the structures not only of single proteins, but also of protein assemblies. The team used rapid RoseTTAFold combined with the more accurate AlphaFold to build structural models for 106 previously unidentified protein assemblies and 806 complexes that had not been structurally characterized. The complexes have up to five subunits and are involved in numerous critical roles in cell biology.
Look for your own proteins of interest at the ModelArchive by searching from its home page. You can also find the link in the Resources section of the SGD Interaction and Protein pages.
Categories: Announcements, Data updates, Paper of the Week
Tags: protein complex, Saccharomyces cerevisiae, yeast protein assembly
December 01, 2021
SGD has updated our protein complex pages to have the same format as gene pages, with tabs across the top for each category of information, including a Summary page, a new Gene Ontology page, and a new Literature page for each complex. Just as we do for all of your favorite genes, Gene Ontology and Literature curation for complexes will be ongoing.
If you have any questions or feedback about the updates to our complex pages, please do not hesitate to contact us at any time.
Categories: Announcements, Data updates, Website changes
Tags: protein complex, Saccharomyces cerevisiae
November 09, 2021
Would you like to see the shape of your protein?
SGD now contains links to AlphaFold in the Resources section of the Summary, Protein and Homology pages for every gene.
Categories: Data updates
November 05, 2021
SGD has long been the keeper of the official Saccharomyces cerevisiae gene nomenclature. Robert Mortimer handed over this responsibility to SGD in 1993 after maintaining the yeast genetic map and gene nomenclature for 30 years.
The accepted format for gene names in S. cerevisiae comprises three uppercase letters followed by a number. The letters typically signify a phrase (referred to as the “Name Description” in SGD) that provides information about a function, mutant phenotype, or process related to that gene, for example “ADE” for “ADEnine biosynthesis” or “CDC” for “Cell Division Cycle”. Gene names for many types of chromosomal features follow this basic format regardless of the type of feature named, whether an ORF, a tRNA, another type of non-coding RNA, an ARS, or a genetic locus. Some S. cerevisiae gene names that pre-date the current nomenclature standards do not conform to this format, such as MRLP38, RPL1A, and OM45.
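The naming format described above can be expressed as a simple pattern check. This is only a sketch: the pattern and the function name are our own, not an official SGD validator, and it assumes "a number" means one or more digits with nothing after them.

```python
import re

# Three uppercase letters followed by a number (assumed: digits only, no suffix).
STANDARD_NAME_RE = re.compile(r"[A-Z]{3}[0-9]+")


def is_standard_gene_name(name):
    """Return True if `name` is three uppercase letters followed by digits."""
    return STANDARD_NAME_RE.fullmatch(name) is not None


# Gene names taken from elsewhere on this page, plus the three exceptions
# named in the paragraph above.
for name in ["DFR1", "ATG12", "ACC1", "MRLP38", "RPL1A", "OM45"]:
    print(name, is_standard_gene_name(name))
```

Under this pattern DFR1, ATG12, and ACC1 conform, while MRLP38 (four letters), RPL1A (trailing letter), and OM45 (two letters) do not, which matches the exceptions listed in the text.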
A few historical gene names predate both the nomenclature standards and the database, and were less computer-friendly than more recent gene names, due to the presence of punctuation. SGD recently updated these gene names to be consistent with current standards and to be more software-friendly by removing punctuation. The old names for these four genes have been retained as aliases.
ORF | Old gene name | New gene name |
---|---|---|
YGL234W | ADE5,7 | ADE57 |
YER069W | ARG5,6 | ARG56 |
YBR208C | DUR1,2 | DUR12 |
YIL154C | IMP2′ | IMP21 |
Categories: Announcements, Data updates
Tags: gene nomenclature