October 20, 2022
SGD has updated our RNA pages to add secondary structures provided by RNAcentral and generated by R2DT. Thumbnails and linkouts to RNAcentral via RNAcentral IDs are shown on the Summary and Sequence pages.
Interactive secondary structure viewers are available on the Sequence pages.
Take the pages for a spin! For more information about the structures, please see the Help page at RNAcentral.
Categories: New Data, Website changes
Tags: RNA structure
March 25, 2021
SGD is excited to introduce our new Homology Pages! These pages can be accessed by clicking on the Homology tab in the header of SGD gene pages, as seen below.
The information displayed on the Homology Pages is divided into several sections:
If you have any questions or feedback regarding our new Homology Pages, please do not hesitate to contact us at any time.
Categories: Data updates, Homologs, New Data, Yeast and Human Disease
June 12, 2019
We are excited to announce that 50 new “Variants” data tracks are now available for use in our genome browsing tool JBrowse. Utilizing whole-genome sequencing data published by Song et al. (2015), these data tracks visualize how the sequences of 25 S. cerevisiae strains differ from that of the reference genome strain, S288C.
Two data tracks are available for each of the 25 strains: a track that indicates single nucleotide polymorphisms (SNPs) relative to strain S288C, and a track that shows insertions or deletions (“indels”) relative to S288C.
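As a rough illustration of how the two kinds of tracks differ, the sketch below labels a variant as a SNP or an indel from its reference and alternate alleles. The function name and the example alleles are hypothetical and are not taken from the Song et al. dataset or from SGD's pipeline; the rule is the common convention that a SNP swaps one base for another, while an indel changes the sequence length.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Label a variant from its reference (S288C) and alternate (strain) alleles."""
    if len(ref) == 1 and len(alt) == 1 and ref != alt:
        return "SNP"      # single base substituted for another
    if len(ref) != len(alt):
        return "indel"    # bases inserted or deleted relative to the reference
    return "other"        # identical alleles or multi-base substitutions

# Made-up alleles, for illustration only
examples = [("A", "G"), ("AT", "A"), ("C", "CTT"), ("AC", "GT")]
print([classify_variant(ref, alt) for ref, alt in examples])
```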
Accessing these new data tracks is easy—just enter JBrowse and click on the “Select tracks” tab on the upper-left hand part of the page. Then, select the “variants” category. You can also download the variants, annotation, and sequence files on these strains for use in your own analyses.
If you’re new to JBrowse, don’t miss out—getting started takes no time at all. For information on how to use this tool, be sure to check out the JBrowse playlist on the SGD YouTube Channel or visit the JBrowse help page. If you have any questions or feedback about the new “Variants” data tracks or about our genome browsing tool, please don’t hesitate to contact us.
Table of strains with “Variants” data tracks in JBrowse, along with links to download their respective dataset:
Categories: New Data
April 25, 2019
We have recently equipped our genome browsing tool JBrowse with 9 new Transcriptome data tracks, making JBrowse an even more powerful way to explore the vast heterogeneity of the S288C transcriptome. These information-rich data tracks visualize RNA transcripts from the TIF-seq dataset published by Pelechano et al. (2013), enabling quick and easy viewing of the position, length, and abundance of transcript isoforms sequenced in the study.
You can easily access these new tracks by entering JBrowse and clicking on the left-hand “Select tracks” tab. They are located in the Transcriptome category. In addition to viewing the data in JBrowse, you can also download the .gff3 and .bw files for these tracks for use in your own analyses.
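If you download a .gff3 file, each feature line has nine tab-separated columns (seqid, source, type, start, end, score, strand, phase, attributes), as defined by the GFF3 format. The sketch below parses one such line with the Python standard library. The function name and the example line are made up for illustration and do not reflect the actual contents of the SGD files.

```python
from urllib.parse import unquote

def parse_gff3_line(line):
    """Parse one GFF3 feature line into a dict; return None for comments and blanks."""
    if not line.strip() or line.startswith("#"):
        return None
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    attributes = {}
    for field in attrs.split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            # GFF3 percent-encodes reserved characters in attribute values
            attributes[unquote(key)] = unquote(value)
    return {
        "seqid": seqid,
        "source": source,
        "type": ftype,
        "start": int(start),   # 1-based, inclusive
        "end": int(end),       # 1-based, inclusive
        "score": score,
        "strand": strand,
        "phase": phase,
        "attributes": attributes,
    }

# Hypothetical feature line, not copied from any SGD file
example = "chrI\texample\ttranscript\t100\t250\t.\t+\t.\tID=t1;Name=demo"
print(parse_gff3_line(example))
```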
Check out our video tutorial from the SGD YouTube channel at the top of this page for a quick overview of the new transcriptome data tracks and how to access them. More information about these tracks and how SGD created them can also be found on our Genome Browser help page.
If you have any questions or feedback about the new Transcriptome data tracks or about our genome browser, please don’t hesitate to contact us.
Data tracks that visualize transcript isoforms that fully overlap a gene coding region:
Data Track Title | Description |
longest_full-ORF_transcripts_ypd | This track contains the longest transcript overlapping each individual ORF completely for WT cells grown in glucose (ypd) media. |
longest_full-ORF_transcripts_gal | This track contains the longest transcript overlapping each individual ORF completely for WT cells grown in galactose (gal) media. |
most_abundant_full-ORF_transcripts_ypd | This track contains the most abundant transcript overlapping each individual ORF completely for WT cells grown in glucose (ypd) media. |
most_abundant_full-ORF_transcripts_gal | This track contains the most abundant transcript overlapping each individual ORF completely for WT cells grown in galactose (gal) media. |
unfiltered_full-ORF_transcripts | This track contains all transcripts that overlap an individual open reading frame (ORF) completely for WT cells grown in either glucose (ypd) or galactose (gal) media. |
Data tracks that quantify the number of transcripts that cover a given nucleotide in the S288C genome:
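To make the distinction between the "longest" and "most abundant" tracks concrete, here is a minimal sketch of how one might pick both transcripts for each ORF from a set of isoforms. It is not the procedure SGD or Pelechano et al. used; the function names, the tuple layout, and the coordinates are assumptions, and the sketch ignores chromosome and strand for brevity.

```python
def fully_overlaps(t_start, t_end, orf_start, orf_end):
    """True if the transcript spans the ORF from its first to its last base."""
    return t_start <= orf_start and t_end >= orf_end

def pick_transcripts(orfs, transcripts):
    """orfs: {name: (start, end)}; transcripts: list of (start, end, read_count)."""
    result = {}
    for name, (orf_start, orf_end) in orfs.items():
        covering = [t for t in transcripts
                    if fully_overlaps(t[0], t[1], orf_start, orf_end)]
        if not covering:
            continue  # no isoform covers this ORF completely
        result[name] = {
            "longest": max(covering, key=lambda t: t[1] - t[0]),
            "most_abundant": max(covering, key=lambda t: t[2]),
        }
    return result

# Made-up coordinates and counts, for illustration only
orfs = {"ORF1": (100, 200)}
transcripts = [(90, 210, 5), (50, 250, 2), (120, 260, 40)]
print(pick_transcripts(orfs, transcripts))
```

In this toy input the third isoform is the most highly expressed overall, but it starts inside the ORF, so it is excluded from both selections.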
Data Track Title | Description |
plus_strand_coverage_ypd | For WT cells grown in glucose media (ypd), the number of transcripts covering each position on the plus strand is represented in this track. |
plus_strand_coverage_gal | For WT cells grown in galactose media (gal), the number of transcripts covering each position on the plus strand is represented in this track. |
minus_strand_coverage_ypd | For WT cells grown in glucose media (ypd), the number of transcripts covering each position on the minus strand is represented in this track. |
minus_strand_coverage_gal | For WT cells grown in galactose media (gal), the number of transcripts covering each position on the minus strand is represented in this track. |
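A coverage track of this kind can be thought of as a per-position count of the transcripts that span each nucleotide. The sketch below computes such counts for a single strand using a difference array. It is only an illustration of the idea, not SGD's pipeline; the function name, the input layout, and the numbers are assumptions.

```python
def coverage(transcripts, length):
    """Per-position transcript counts for one strand of a sequence.

    transcripts: list of (start, end, count) with 1-based inclusive coordinates
                 that all fall within 1..length.
    Returns a list where index 0 holds the coverage at position 1.
    """
    diff = [0] * (length + 2)
    for start, end, count in transcripts:
        diff[start] += count      # coverage rises where the transcript begins
        diff[end + 1] -= count    # and falls just past where it ends
    cov = []
    running = 0
    for pos in range(1, length + 1):
        running += diff[pos]
        cov.append(running)
    return cov

# Made-up transcripts on an 8-nucleotide sequence
print(coverage([(2, 5, 3), (4, 6, 1)], 8))
```

Running the same function separately on plus-strand and minus-strand transcripts would give the two strand-specific profiles.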
Categories: New Data, Tutorial
March 11, 2019
SGD has now incorporated proteome-wide protein abundance data obtained from a comprehensive meta-analysis by Ho et al., 2018. The authors normalized and combined 21 different S. cerevisiae protein abundance datasets—including data from both untreated cells and cells treated with various environmental stressors—to create a unified protein abundance dataset where all values are in the intuitive units of molecules per cell. The original datasets were initially obtained using different methodologies (mass spectrometry, fluorescence microscopy, flow cytometry, and TAP-immunoblot), allowing Ho et al. to evaluate the strengths and weaknesses of these methods in addition to providing the community with a comprehensive reference map of the yeast proteome.
Normalized abundance measurements and associated metadata from untreated and treated cells are displayed in tabular form in the experimental data section of protein-tabbed pages (e.g. CDC28). Several different controlled vocabularies have been employed to standardize the metadata display. In addition, calculated median abundance and median absolute deviation (MAD) values are displayed in the protein section of Locus Summary pages (e.g. PHO85). Two new YeastMine templates have been created to provide access to these data: Gene -> Protein Abundance and Gene -> Median Protein Abundance.
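For readers unfamiliar with the median absolute deviation, the sketch below shows the textbook calculation: take the median of the measurements, then the median of each measurement's absolute distance from it. The abundance values are invented, and whether the values displayed at SGD apply any additional scaling factor is not stated here, so treat this as the plain, unscaled MAD.

```python
from statistics import median

def median_and_mad(values):
    """Return (median, unscaled median absolute deviation) of a list of numbers."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return med, mad

# Invented abundance measurements (molecules per cell) for a single protein
measurements = [1000, 1200, 900, 5000, 1100]
print(median_and_mad(measurements))
```

Unlike the mean and standard deviation, both statistics are barely affected by the single outlying measurement of 5000.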
Special thanks to Brandon Ho and Grant Brown for generating this comprehensive reference map of protein abundance, and for their help in making this data available to the larger community.
Categories: New Data
January 15, 2019
SGD has updated our JBrowse genome browser with 157 new data tracks related to genome-wide experiments and omics data for you to explore. You can easily access these new tracks, which visualize data from the twenty publications listed below, by entering JBrowse and clicking on the left-hand “Select tracks” tab. Then, search for the PMID associated with the reference of interest.
Note that some references appear more than once, as they have multiple data tracks associated that belong to different categories in JBrowse.
For more information on using JBrowse, be sure to check out our playlist of JBrowse video tutorials on YouTube. If you have any questions or feedback about the new tracks or about our genome browser, please don’t hesitate to contact us.
Transcription & Transcriptional Regulation
Reference | PMID | Description in JBrowse |
Baptista et al. (2017) | 28918903 | ChEC-seq to map the genome-wide binding of the SAGA coactivator complex in budding yeast. |
Castelnuovo et al. (2014) | 24497191 | Genome-wide measurement of whole transcriptome versus histone modified mutants |
El Hage et al. (2014) | 25357144 | Genome-wide distribution of RNA-DNA hybrids identifies RNase H targets in tRNA genes, retrotransposons and mitochondria. |
Freeberg et al. (2013) | 23409723 | Mapped regions of untranslated, polyadenylated transcriptome bound by RNA-binding proteins (RBPs) |
Kang et al. (2015) | 25213602 | Genome-wide transcript profiling by paired-end ditag sequencing |
Lee et al. (2018) | 29339748 | ChIP-Seq, mRNA-seq, ATAC-seq, and MNase-seq samples in wild-type (WT) and various mutants were prepared using Saccharomyces cerevisiae. |
Park et al. (2014) | 24413663 | Simultaneous mapping of RNA ends by sequencing (SMORE-seq) to identify the strongest transcription start sites and polyadenylation sites genome-wide |
Rossbach et al. (2017) | 28924058 | Authors utilized the Calling Cards Ty5 retrotransposon insertion method to identify binding sites of cdc7kd, cdc7kdΔcterm and Gal4 transcription factor within the yeast genome. |
Schaughency et al. (2014) | 25299594 | Genome-wide identification of transcription termination sites; pA pathway and non-polyadenylation pathway in strains missing Sen1p or Nrd1p |
Histone Modification
Reference | PMID | Description in JBrowse |
Castelnuovo et al. (2014) | 24497191 | Genome-wide measurement of whole transcriptome versus histone modified mutants |
Hu J. et al. (2015) | 26628362 | ChIP-seq and MNase-seq to determine how histone modifications and chromatin structure directly regulate meiotic recombination. Identified acetylation of histone H4 at Lys44 (H4K44ac) as a new histone modification |
Joo et al. (2017) | 29203645 | Next-Generation-Sequencing (NGS)-derived genome-wide occupancy of TAF (Taf1) compared with other basal initiation components (TBP and TFIIB), histones (H3, H4, Htz1 and H4 acetylation) and histone regulator complexes (Swr1, Bdf1) in S. cerevisiae |
Kniewel et al. (2017) | 28986445 | ChIP-seq to determine the whole-genome enrichment of Mek1 targeted histone H3 threonine 11 phosphorylation (H3 T11ph) during Saccharomyces cerevisiae meiosis. |
Lee et al. (2018) | 29339748 | ChIP-Seq, mRNA-seq, ATAC-seq, and MNase-seq samples in wild-type (WT) and various mutants were prepared using Saccharomyces cerevisiae. |
Weiner et al. (2015) | 25801168 | Examining chromatin dynamics through genome-wide mapping of 26 histone modifications at 0, 4, 8, 15, 30, and 60 minutes after diamide addition using MNase-ChIP |
Chromatin Organization
Reference | PMID | Description in JBrowse |
Chereji et al. (2018) | 29426353 | Genome binding/occupancy profiling of single nucleosomes and linkers by high throughput sequencing |
Gutierrez et al. (2017) | 29212533 | Authors sought to correct sequence bias of MNase-Seq with a method based on the digestion of naked DNA and the use of the bioinformatic tool DANPOS |
Hu Z. et al. (2014) | 24532716 | Genome-wide measurement of nucleosome occupancy during cell aging |
Hu J. et al. (2015) | 26628362 | ChIP-seq and MNase-seq to determine how histone modifications and chromatin structure directly regulate meiotic recombination. Identified acetylation of histone H4 at Lys44 (H4K44ac) as a new histone modification |
Joo et al. (2017) | 29203645 | Next-Generation-Sequencing (NGS)-derived genome-wide occupancy of TAF (Taf1) compared with other basal initiation components (TBP and TFIIB), histones (H3, H4, Htz1 and H4 acetylation) and histone regulator complexes (Swr1, Bdf1) in S. cerevisiae |
Lee et al. (2018) | 29339748 | ChIP-Seq, mRNA-seq, ATAC-seq, and MNase-seq samples in wild-type (WT) and various mutants were prepared using Saccharomyces cerevisiae. |
RNA Catabolism
Reference | PMID | Description in JBrowse |
Geisberg et al. (2014) | 24529382 | Half-lives of 21,248 mRNA 3′ isoforms in yeast were measured by rapidly depleting RNA polymerase II from the nucleus and performing direct RNA sequencing throughout the decay process. |
Smith et al. (2014) | 24931603 | Identification of genome-wide transcripts; looking at nonsense-mediated RNA decay pathway |
Transposons
Reference | PMID | Description in JBrowse |
Lee et al. (2018) | 29339748 | ChIP-Seq, mRNA-seq, ATAC-seq, and MNase-seq samples in wild-type (WT) and various mutants were prepared using Saccharomyces cerevisiae. |
Michel et al. (2017) | 28481201 | Genome-wide examination of protein function by using transposons for targeted gene disruption |
Rossbach et al. (2017) | 28924058 | Authors utilized the Calling Cards Ty5 retrotransposon insertion method to identify binding sites of cdc7kd, cdc7kdΔcterm and Gal4 transcription factor within the yeast genome. |
DNA Replication, Recombination, and Repair
Reference | PMID | Description in JBrowse |
Mao et al. (2017) | 28912372 | Map of N-methylpurine (NMP) lesion alkylation damage across the yeast genome |
Categories: New Data
December 22, 2018
To promote the use of yeast as a catalyst for biomedical research, SGD utilizes the Disease Ontology (DO) to describe human diseases that are associated with yeast homologs. Disease Ontology annotations to yeast genes are now available through SGD’s new Disease pages. Each page corresponds to a Disease Ontology term, such as amyotrophic lateral sclerosis, and lists out all yeast genes annotated to the term by SGD.
Yeast genes with one or more human disease associations will also have a new Disease Summary tab (example: MIP1), accessible from the genes’ respective locus pages. The Disease summary tab shows all manually curated, high-throughput, and computational disease annotations for the yeast gene. Additionally, these pages feature a network diagram that depicts shared disease annotations for other yeast genes and their human homologs.
For more information, check out SGD’s Disease Ontology help page. Explore the new Disease pages and features, and be sure to let us know if you have any feedback or questions.
Categories: New Data
December 14, 2018
Macromolecular complexes, already retrievable from SGD’s YeastMine data warehouse, are now available on new pages on the SGD website. These new Complex pages (example: GAL3-GAL80 complex) provide manually curated information about the complex as well as helpful links and diagrams. Key features of Complex pages include:
Complex pages can be accessed by running a search for the complex, or by visiting the gene summary pages of its subunits. For example, to find the GAL3-GAL80 complex page, simply run a search for “GAL3-GAL80” and click on the Complexes category (symbolized by the gold dot). Or, go to the GAL3 or GAL80 gene page and locate the Complex section.
SGD curated these macromolecular complex data in collaboration with curators at EMBL-EBI’s Complex Portal. Be sure to check out the page for your favorite complex, and let us know if you have any feedback or questions.
Categories: New Data
April 17, 2018
1,011. That’s the number of different Saccharomyces cerevisiae yeast strains that were whole-genome sequenced and phenotyped by a team of researchers jointly led by Joseph Schacherer and Gianni Liti, published this week in Nature (Peter et al., 2018; data at: http://bit.ly/1011genomes-DataAtSGD).
Scrupulously gathering isolates of S. cerevisiae from as many diverse geographical locations and ecological niches as possible, the authors and their collaborators plucked yeast cells not only from the familiar wine, beer and bread sources, but also from rotting bananas, sea water, human blood, sewage, termite mounds, and more. The authors then surveyed the evolutionary relationships among the strains to describe the worldwide population distribution of this species and deduce its historical spread.
They found that the greatest amount of genome sequence diversity existed among the S. cerevisiae strains collected from Taiwan, mainland China, and other regions of East Asia. This means that in all likelihood the geographic origin of S. cerevisiae lies somewhere in East Asia. According to the authors, our budding yeast friend began spreading around the globe about 15,000 years ago, undergoing several independent domestication events during its worldwide journey. For example, it turns out that wine yeast and sake yeast were domesticated from different ancestors, thousands of years apart from each other. Whereas genomic markers of domestication appeared about 4,000 years ago in sake yeast, such markers appeared in wine yeast only 1,500 years ago.
Additionally — and similar to the situation where human interspecific hybridization with Neanderthals occurred only after humans migrated out of Africa — it appears that S. cerevisiae has inter-bred very frequently with other Saccharomyces species, especially S. paradoxus, but that most of these interspecific hybridization events occurred after the out-of-China dispersal.
There are many more gems to be found among the treasure trove of information in this paper. Some notable conclusions from the authors include: diploids are the most fit ploidy; copy number variation (CNV) is the most prevalent type of variation; most single nucleotide polymorphisms (SNPs) are very rare alleles in the population; extensive loss of heterozygosity is observed among many strains. There are also phenotype results (fitness values) for 971 strains across 36 different growth conditions.
As is often the case for yeast, the ability to sequence and analyze whole genomes at very deep coverage has yielded broad insights on eukaryotic genome evolution. The team’s work highlights this by presenting a comprehensive view of genome evolution on many different levels (e.g., differences in ploidy, aneuploidy, genetic variants, hybridization, and introgressions) that is difficult to obtain at the same scale and accuracy for other eukaryotic organisms.
SGD is happy to announce that in conjunction with the authors and publishers, we are hosting the datasets from the paper at this SGD download site. These datasets include: the actual genome sequences of the 1,011 isolates; the list of 4,940 common “core” ORFs plus 2,856 ORFs that are variable within the population (together these make up the “pangenome”); copy number variation (CNV) data; phenotyping data for 36 conditions; SNPs and indels relative to the S288C genome; and much more. We hope that the easy availability of these large datasets will be useful to many yeast (and non-yeast) researchers, and as the authors say, will help to “guide future population genomics and genotype–phenotype studies in this classic model system.”
Categories: Announcements, New Data
Tags: evolution, genome wide association study, Saccharomyces cerevisiae, strains
September 08, 2016
Ever wonder how quickly your favorite protein turns over within the cell? SGD has just incorporated half-life data for 3700 yeast proteins from a paper by Christiano et al., 2014. In this study, Christiano and colleagues pulse-labeled exponentially growing wild-type yeast cells in synthetic medium with a heavy lysine isotope (pulse SILAC) and followed the decay of native, untagged proteins using high-resolution mass spectrometry-based proteomics. The data generated in this study can be accessed by viewing the Experimental Data section of the Protein tab for your favorite gene, such as the short-lived Ctk1p or the long-lived Rsc1p.
In addition, you can use YeastMine to retrieve these half-life data for one or more proteins with the Gene–>Protein Half-life template, or obtain a list of proteins with half-lives within a given range using the Retrieve–>Proteins with half-life in a given range template. Both of these templates can be found in the “Templates” section of YeastMine under the “Protein” category.
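Once half-life values have been exported, a range query like the one the second template performs is easy to reproduce locally. The sketch below is plain Python and does not call YeastMine; the function name, the protein names, the values, and the choice of hours as the unit are all hypothetical.

```python
def proteins_in_range(half_lives, low, high):
    """Return the names of proteins whose half-life lies within [low, high]."""
    return sorted(name for name, value in half_lives.items()
                  if low <= value <= high)

# Hypothetical half-lives in hours; not values from Christiano et al.
half_lives = {"PROT_A": 0.8, "PROT_B": 4.5, "PROT_C": 9.0, "PROT_D": 30.0}
print(proteins_in_range(half_lives, 1.0, 10.0))
```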
Thanks to Romain Christiano and Tobias Walther for their help integrating this information into SGD.
Categories: New Data