Richard Yates Tao Lin Epub Converter
Free pdf to excel converter download the mind map book tony buzan pdf free download. Torrent Jeux Gamecube Iso. Tao lin richard yates epub download il gioco dell eroe ebook download.

Bottom-up proteomics relies on the use of proteases and is the method of choice for identifying thousands of protein groups in complex samples. Top-down proteomics has been shown to be robust for direct analysis of small proteins and offers a solution to the “peptide-to-protein” inference problem inherent with bottom-up approaches. Here, we describe the first large-scale integration of genomic, bottom-up and top-down proteomic data for the comparative analysis of patient-derived mouse xenograft models of basal and luminal B human breast cancer, WHIM2 and WHIM16, respectively. Using these well-characterized xenograft models established by the National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium, we compared and contrasted the performance of bottom-up and top-down proteomics to detect cancer-specific aberrations at the peptide and proteoform levels and to measure differential expression of proteins and proteoforms. Bottom-up proteomic analysis of the tumor xenografts detected almost 10 times as many coding nucleotide polymorphisms and peptides resulting from novel splice junctions than top-down. For proteins in the range of 0–30 kDa, where quantitation was performed using both approaches, bottom-up proteomics quantified 3,519 protein groups from 49,185 peptides, while top-down proteomics quantified 982 proteoforms mapping to 358 proteins.
Examples of both concordant and discordant quantitation were found in a ∼60:40 ratio, providing a unique opportunity for top-down to fill in missing information. The two techniques showed complementary performance, with bottom-up yielding eight times more identifications of 0–30 kDa proteins in xenograft proteomes, but failing to detect differences in certain posttranslational modifications (PTMs), such as phosphorylation pattern changes of alpha-endosulfine.
This work illustrates the potency of a combined bottom-up and top-down proteomics approach to deepen our knowledge of cancer biology, especially when genomic data are available. Recent advances in high-throughput genomics have allowed deep characterization of cancer at the DNA and RNA level.
Large-scale initiatives, such as The Cancer Genome Atlas at the National Cancer Institute, have provided comprehensive genomic analyses of human tumors from many cancer types and, thus, the prospect for novel insights into the pathways leading to cancer and new possibilities for medical advances. It is well known that genomic aberrations and an inability to properly maintain and repair genetic material enable tumor initiation and progression (). The large-scale mapping of cancer genomes has provided a detailed catalogue of mutations and polymorphisms that may translate into proteome variation and has left researchers wondering which genomic abnormalities drive tumor biology and which are functionally irrelevant.
Although RNA sequencing can provide supporting evidence for the translation of DNA-level mutations into the proteome and alternative splicing, events, including signal peptide cleavage and a multitude of biologically active posttranslational modifications (PTMs) can significantly increase protein variation that RNA-seq data could not reliably predict. Recent studies have also shown that RNA transcript measurements poorly predict protein abundance differences between tumors (). Thus, detection of mutations and PTMs at the protein level provides a direct readout of the biological impact of cancer-related genomic abnormalities. Proteomic technologies, especially those based on mass spectrometry (MS), have the potential to detect genetic aberrations at the protein level. These technologies aim to identify the genes that give rise to proteins, characterize any modifications from the primary amino acid sequence, and quantify differences in relative expression levels between samples.
Ideally, these techniques would be operable for all the proteins expressed in a cell, tissue, or other complex protein mixture; however, this is not the case. Different technologies exist, each with its unique strengths and weaknesses. Two forms of proteomics analyses are shotgun bottom-up (BU) and top-down (TD) (). In BU proteomics, the proteins are digested with a protease, such as trypsin, prior to peptide detection and sequencing using tandem mass spectrometry. Protease digestion results in a complex mixture of peptides between 500–3,500 Da that are usually separated by reverse phase liquid chromatography or multidimensional chromatography in-line with a mass spectrometer (, ). Precursor mass measurements, along with MS/MS fragmentation information, allow inference of the protein composition of the sample via these peptides.
Extremely sensitive BU methods have been developed and are capable of identifying >5,000 protein groups within a single sample, with some peptide sequences present in multiple proteins or isoforms. Such shared peptides can lead to ambiguities in identifying the unique proteins present in the sample, the so called protein parsimony problem (). Also, enzymatic digestion can result in the loss of information about combinatorial PTMs and sequence variants.
Top-down (TD) proteomics, on the other hand, does not rely on the use of proteases and examines proteins as a whole. In doing so, top-down proteomics can fully characterize the composition of individual proteoforms (), including proteolysis products, signal peptide cleavage, sequence variants, and PTMs co-occurring on the same molecule. A typical TD workflow consists of single or multi-step protein separations, such as reverse-phase liquid chromatography () and GELFrEE (), and the resulting protein fractions are further separated by liquid chromatography in line with a mass spectrometer. Advances in MS instruments and protein separations have allowed TD proteomics to become a robust technique for the identification and characterization of ∼2,000–3,000 proteoforms (–). Unlike BU, TD proteomics routinely links proteins to their parental genes without the problem of protein inference. With the recent advent of methods for differential quantitation using TD on proteins below 30 kDa (), it is now possible to begin comparing BU and TD techniques for three primary proteomic tasks: gene identification, whole proteoform characterization, and detection of differential expression.
While some efforts have explored the complementarity of BU and TD technologies in the study of less complex proteomes (, ) and the structural analysis of antibodies (, ), herein we describe the first evaluation of the complementarity of BU and TD technologies for the qualitative and quantitative analysis of cancer proteomes. To accomplish this task, we employed two samples from patient-derived xenografts (PDXs) established from a basal-like (WHIM2-P32) and luminal B (WHIM16-P33) breast cancer (–). Patient-derived breast cancer xenografts have been established as reliable models of human tumors that provide a renewable resource for studying the human disease (,, ). These patient-derived xenograft tumor lines are genomically well-characterized (, ) and have been used to generate Comparison Reference (CompRef) samples within the Clinical Proteomic Tumor Analysis Consortium (CPTAC) () for performance validation of mass spectrometry protocols and workflows. Genome and RNA sequencing of the xenografts has provided us with lists of sequence variants, due to single nucleotide polymorphisms (SNPs), and novel splice junctions. Using these well-characterized xenograft models, we compared and contrasted the performance of BU and TD proteomic approaches to detect cancer-specific aberrations at the peptide and proteoform levels and to measure differential expression of proteins and proteoforms.
This work represents the first large-scale integration of genomic, BU, and TD proteomic data for comparative analysis of PDXs comprised of the studies described in. In brief, Study 1 was designed to provide information on the ability to detect tumor-specific features informed by prior RNA-seq data of these samples (, ). Study 2 tested the applicability of the recently established label-free top-down quantitative proteomics platform () for the analysis of tumors. Finally, Study 3 sought to detect differential expression of proteins and proteoforms between basal and luminal B breast cancer samples for the low molecular weight proteome (. Endoprotease Digestion of GELFrEE Fractions The proteins in GELFrEE fractions for Study 1 were precipitated using acetone. Protein pellets were dissolved in 20 μl of Tris buffer (100 m m, pH 8.5) containing 8 m urea.
GELFrEE fractions for Study 3 were received as protein pellets and dissolved in 20 μl of Tris buffer (100 m m, pH 8.5) containing 8 m urea. Horseradish peroxidase (1 μg) was added to each digest for Study 1 and Study 3 samples as a digest standard. The proteins were reduced using TCEP (5 m m) (Thermo) for 30 min, and alkylated with iodoacetamide (40 m m) (Sigma) at room temperature in the dark for 30 min. The reaction was quenched with DTT (20 m m) (Sigma) for 15 min. The methods described by Zybailov et al. () were followed with minor modifications. Specifically, the proteins were digested for 4 h with endoprotease LysC (5 μg) (Sigma) on a Thermomixer (750 rpm) at 37 °C.
The digests were then diluted fourfold with Tris buffer (100 m m, pH 8.5) and trypsin (5 μg) was added with continued incubation overnight. Due to the different protein concentrations of the individual GELFrEE fractions, the enzyme to protein ratio for the LysC and trypsin digests ranged from 1:25–1:50 and 1:5–1:10, respectively. The digests were acidified to 5% formic acid (Fluka) and filtered through a Microcon centrifugal filter (30K molecular weight cutoff) (Millipore). The peptides were desalted in parallel on Glygen Nutips containing C4 and graphitic carbon solid phase on a Biomek NXP (Beckman Coulter), as previously described (). The eluted peptides were dried in a SpeedVac and dissolved in water/acetonitrile/formic acid (98%/1%/1%) and transferred to autosampler vials for storage at −80 °C prior to LC-MS analysis.
High-Performance Liquid Chromatography with High-Resolution Tandem Mass Spectrometry A NanoLC 2D Plus System with a cHiPLC-Nanoflex and AS2 autosampler (ABSciex, Concord, ON) was configured with two columns in parallel. One cHiPLC column (ChromXP C 18 (200 μm × 15 cm; particle size 3 μm, 120 Å) was used to inject calibrant solution (β-galactosidase peptides (625 pmol/vial)), and the other cHiPLC column was used for sample analysis. The calibrant solution (500 fmol) was injected in solvent A (water/acetonitrile/formic acid, 98%/1%/1%). The samples were loaded in a volume of 10 μl at a flow rate of 0.8 μl/min followed by gradient elution of peptides at a flow rate of 800 nl/min.
The calibrant solution was eluted with the following gradient conditions with solvent B (water/formic acid/acetonitrile, 1%/1%/98%):0, 2%; 3 min, 2%; 73 min, 50%; 83 min, 80%; 86 min, 80%; 87 min 2%; and 102 min, 2%. The digests from the five fractions from 0–30 kDa (Study 3) were analyzed under the following gradient conditions (time, percentage solvent B): 0, 2%; 5 min, 2%; 365 min, 35%; 400 min, 80%; 405 min, 2%; and 425 min, 2%. The digests from the 12 GELFrEE fractions (Study 1) were analyzed under the following gradient conditions (time, percentage solvent B): 0, 2%; 5 min, 2%; 650 min, 35%; 695 min, 80%; 700 min, 2%; and 720 min, 2%. Data acquisition was performed with a TripleTOF 5600+ mass spectrometer (AB SCIEX, Concord, ON) fitted with a PicoView Nanospray source (PV400) (New Objectives, Woburn, MA) and a 10 μm Silica PicoTip emitter (New Objectives) for bottom-up proteomics. Data were acquired using an ion spray voltage of 2.9 kV, curtain gas of 20 psi, nebulizer gas of 25 psi, and an interface heater temperature of 175 °C. The MS was operated with a resolution of greater than or equal to 25,000 (fwhm) for TOF-MS scans.
For data-dependent acquisition, survey scans were acquired in 250 ms from which 100 product ion scans were selected for MS2 acquisition for a dwell time of 20 ms. Precursor charge state selection was set at +2 to +5. The survey scan threshold was set to 100 counts per second. The total cycle time was fixed at 2.25 s.
Four time bins were summed for each scan at a pulser frequency value of 15.4 kHz through monitoring of the 40 GHz multichannel time to digital converter detector with four-anode/channel detection. A rolling collision energy was applied to all precursor ions for collision-induced dissociation as described in the Analyst software. The raw LC-MS data (*.wiff) were converted to *.mzML format utilizing the AB SCIEX MS Data Converter v1.3 (AB SCIEX, Concord, ON) within PEAKS STUDIO 7.0 (Bioinformatics Solutions Inc., Waterloo, Canada) (). The resulting files were used for database searching by the PEAKS software using the following databases.
Tumor-specific protein sequence databases were created by starting with RefSeq release 50 and adding variants detected in whole genome sequencing of the xenografts and the corresponding germline. Alternative splice forms detected by RNA-seq of the tumors were also added to the protein sequence database. The variant calling for the whole genome sequencing data was done using GATK version 2.6, and the RNA-seq data were analyzed using TopHat version 2.0.3.
The searches were conducted with trypsin cleavage specificity, allowing three missed cleavages, oxidation of methionine and carbamidomethylation of cysteine as variable and fixed modifications, respectively. A parent ion tolerance of 25 ppm and a fragment ion tolerance of 100 millimass units were used. The MS2-based peptide identifications were validated within PEAKS software using a modified target decoy approach, decoy fusion, to estimate the false discovery rate (FDR).
A 1% FDR for peptide spectral matches was used as the quality filter to identify peptides and an FDR of. LC/MS For all studies, proteins were resuspended by pipetting vigorously with 40 μl solvent A (95% water, 5% acetonitrile, 0.2% formic acid) after SDS removal. Resuspended protein fractions (5 μl) were injected onto a trap column (150 μm inner diameter × 3 cm) using an autosampler (Thermo Dionex). For Study 1, a nanobore analytical column (75 μm inner diameter × 15 cm) was coupled to the trap in a vented tee setup.
Upstream of the column a 15 μm spray tip from New Objective was connected. The trap and analytical column were packed with polymeric reverse phase (PLRP-S, Phenomenex) media (5 μm d p, 1,000 Å pore size). The Dionex Ultimate 3000 system was operated at a flow rate of 2.5 μl/min for loading samples onto the trap.
Proteins were separated on the analytical column and eluted into the mass spectrometer using a flow rate of 300 nl/min and the following gradient: 5% B at 0 min.; 15% B at 5 min.; 55% B at 55 min.; 95% B from 58–61 min.; and 5% B from 64 to 80 min. Solvent A consisted of 95% water, 5% acetonitrile and 0.2% formic acid, and solvent B consisted of 5% water, 95% acetonitrile, and 0.2% formic acid.
In Studies 2 and 3, proteins were injected onto a PepSwift trap column (200 μm inner diameter × 5 mm, Thermo Fisher) at 10 μl/min, separated onto a monolithic ProSwift RP-4H analytical column (100 μm inner diameter × 50 cm) and eluted into the mass spectrometer using a flow rate of 1 μl/min and the following gradient: 1% B at 0 min.; 55% B at 55 min.; 95% B from 58–61 min.; and 5% B from 64 to 80 min. MS data were obtained on an Orbitrap Elite (Thermo) mass spectrometer fitted with a custom nanospray ionization source. Previous studies (, ) have demonstrated that high-energy collisional dissociation results in higher number of identifications than other fragmentation techniques, such as electron transfer dissociation and thus, high-energy collisional dissociation was the fragmentation of choice in the work described here. For proteins of molecular weight 30 kDa, the MS method included the following events: (1) precursor scan, ion trap, 25 microscans, m/z 500–2,000 and (2) data-dependent MS/MS on the top two peaks in each spectrum from scan event 1 using high-energy collisional dissociation with normalized collision energy of 25, isolation width 200 m/z, four microscans, and detection of ions with resolving power of 60,000 (at m/z 400). Data Analysis ProSightPC PUF files were created using a custom version of the cRAWler application. These neutral mass data were searched against an eight-step search tree ().
First, each target was searched with strict search criteria (mass tolerance of 2.2 Da for precursor mass and 10 ppm for fragment masses) against a mouse-specific database (UniProt Release 2014_05) to remove proteins that were a good match to the murine xenograft host. This implies that proteins with identical sequence in both human and mouse will be filtered from further analysis. The filtered proteins are listed in under the heading Search 0. Next, a WHIM-specific PTM-annotated database was created according to the workflow in. Any target failing to be identified with the mouse search was then searched against this database, first with a strict absolute mass search ( i.e.
Mass tolerances of 2.2 Da for precursor mass and 10 ppm for fragment masses), followed by a strict biomarker search ( i.e. Mass tolerance of 10 ppm for precursor and fragment masses). This step identified proteoforms from the xenograft that were a good match to non-sample-specific proteoforms. The remaining unidentified proteoforms were then searched against sample-specific databases to identify proteoforms that are uniquely associated with genetic events in each of the WHIM2 and WHIM16 tumor samples. Lastly, still unidentified targets were subjected to broad searches designed to identify previously unknown proteoforms. Lists the number of targets identified at each of these steps. Study 2: Label-Free Top-Down Quantitation (Single Fraction up to 30 kDa) An 8% GELFrEE cartridge was used to obtain a single fraction containing proteins of MW from 0 to 30 kDa.
After SDS removal, proteins were resuspended in solvent A and injected onto the RP-4H LC setup described above. The GELFrEE was performed three times for each CompRef sample and the resulting protein fractions were analyzed in six LC/MS replicates for a total of 18 RAW files per sample.
Neutral mass data were created and searched as described in Study 1 above, and quantitative results generated by the same analysis of variance (ANOVA) analysis described in Study 3 below. Study 3: Quantitative BU-TD Comparison of CompRef Samples (Multiple Fractions up to 30 kDa) A 10% GELFrEE cartridge was used to obtain five protein fractions ranging in molecular weight from 0 to 30 kDa for each sample.
After SDS removal, proteins were resuspended in solvent A and injected onto the RP-4H LC setup described above. The GELFrEE was performed three times for each CompRef sample and the resulting protein fractions were analyzed in five LC/MS replicates, for a total of 150 RAW files. The research design is illustrated in. Data Analysis The RAW files (150) generated for this study were analyzed in two steps: quantitation and proteoform identification. To identify proteoforms, for each MS1-based mass group, neutral masses were determined from all 150 RAW files, and ProSightPC PUF files were created using a custom version of the cRAWler application. These neutral mass data were searched as described above for Study 1.
In the quantitation step, neutral masses were inferred from all files, and then only those with identifications from tandem MS were grouped based on accurate MS1 mass and retention time. Next, the intensity from the mass groups, for each proteoform, were standardized within each fraction. Specifically, the average intensity for all measurements of a given proteoform was subtracted from each measurement, and the resulting difference was divided by the standard deviation of all measurements of that proteoform. Subtracting the mean centers the proteoform intensity data on zero, and the division rescales the data into units of standard deviations.
The standardized values were then subjected to a hierarchical linear model-based ANOVA, with Benjamini and Hochberg FDR correction at α = 0.05, to find proteoforms that were differentially expressed between WHIM2 and WHIM16. TD and BU Comparisons TD measures intact proteoforms, while BU measures tryptic peptides derived from sets of proteoforms sharing amino acid sequences. In order to consistently compare these two techniques, we chose to call sets of peptides from BU that map to a single RefSeq identification (ID) number as detecting a “protein.” Likewise, TD frequently identified more than one proteoform associated with a single RefSeq ID. Therefore, the number of proteoforms reported from each study is greater than the number of proteins identified by TD; if five proteoforms associated with a single RefSeq identifier were discovered with TD, this was reported as one protein and five proteoforms. In this study, unlike TD, BU analysis of CompRef sets allowed comprehensive identification of protein groups without any MW restriction.
All of the primary mass spectrometry data are deposited at the CPTAC Data Coordinating Center as raw files for public access (). RESULTS We set out to compare the ability of TD and BU to (1) identify the genes from which protein products were derived; (2) characterize proteoforms, including any PTMs, SNPs, and novel splice junctions; and (3) detect differential expression of proteins and proteoforms between a basal-like (WHIM2) and a luminal B (WHIM16) breast tumor xenograft sample. The three studies employed are described in. The workflow for all three studies included GELFrEE separation prior to LC/MS and data analysis as illustrated in. It was expected that TD would identify fewer molecular entities but that these would be characterized proteoforms, while BU would identify a greater number of proteins but do so with lower sequence coverage. Protein Identifications Studies 1 and 3 offered head-to-head comparisons of TD and BU.
Using GELFrEE to separate the proteome into molecular weight bins, we obtained information about proteins present with molecular weights ranging from 0–100 kDa and 0–30 kDa in Studies 1 and 3, respectively. The number of protein and proteoform identifications is enumerated in and a detailed list is included in, for TD and BU, respectively.
In all TD analyses, any proteoform that was consistent with mouse, even when having sequence homology with human, as in the case of histones, was removed from the counts. In both studies, BU resulted in a greater number of protein identifications than TD, as anticipated. Despite significant improvements for top-down proteomics in discovery mode (), BU outperforms TD in the >40 kDa range. Identification of Sequence Variants and Alternative Sequences Unique to the WHIM Tumor Samples Whole genome and RNA sequencing of the WHIM2 and WHIM16 PDX models, as well as the corresponding primary tumor sample (, ) provide an excellent foundation for evaluating the proteomic technology capacity for detecting sample or “WHIM-specific” coding SNPs and alternate splice variants that may give rise to unique proteoforms.
The BU datasets included the detection of 188 peptides containing sample-specific SNPs and 27 peptides crossing the junction of sample-specific novel splice junctions for each WHIM sample. In comparison, analysis of the TD datasets allowed the detection of 10 proteins containing WHIM-specific SNPs as shown in. A single proteoform resulting from a WHIM-specific novel splice junction was also detected in both WHIM2 and WHIM16 samples, and its sequence coverage by TD appears in. Since proteoforms containing sequences differing by one amino acid, for example, a reference protein sequence and a protein containing a cSNP, will likely coelute, TD was expected to be well suited for detecting allelic expression ratios. Indeed, as demonstrated in, gamma-synuclein (RefSeq:, UniProt:) and ribosomal protein L35 (RefSeq:, UniProt:) displayed that protein products from heterozygous alleles are being expressed at a roughly 1:1 ratio.
In both cases, TD detected both protein forms and gave relative quantitative information about the abundance of the protein products resulting from the expression of the two different allelles. BU could provide that information only for gamma-synuclein, while in the case of ribosomal protein L35, peptides containing the site of the coding SNP were not detected.
Label-Free Top-Down Quantitative Analysis Recently, a workflow for label-free top-down quantitation in discovery mode was described (). Study 2 was designed to demonstrate the efficacy of this new method on the CompRef samples. In this scenario, a 2×3×6 study design ( i.e. Two states, three GELFrEE replicates, and six LC/MS injection technical replicates) was established for the comparative proteome analysis of WHIM2 and WHIM16. Proteins ranging from 0–30 kDa were isolated using GELFrEE followed by LC-MS/MS as described above. Next, a hierarchical linear model was applied to quantify intact proteoforms within the samples.
A volcano plot ( A) was generated, in which each proteoform was represented as a function of estimated effect size (in log 2 fold-change) and the statistical confidence (FDR) that there was a difference in the normalized intensities between the two samples. Of the 5,975 quantitation mass targets detected in total, 1,031 of them were above the 5% FDR value comparing the WHIM2 and WHIM16 samples. Of all the quantitation mass targets, 538 were unambiguously identified using MS/MS information obtained during LC-MS. Among the differentially expressed proteoforms is the canonical isoform of gamma-synuclein ( A), a protein known to be expressed in late-stage breast tumors ().
A list of all differentially expressed proteoforms from this study appears in. Comparison of TD and BU Label-Free Quantitation In Study 3, the ability of TD and BU to quantify proteoform differential expression was evaluated. As described above, TD label-free quantitation is limited to the low MW proteome (. Overall quantitative results for Study 3 reveal a prevalence of concordant examples where proteoform-level changes differ substantially from that determined by BU Of those proteins detected by both methods and mapping to the same RefSeq identifiers, the TD and BU quantitation agreed that there was differential expression at 60% of the time at the protein level. BU can only quantify what is happening on average to all proteoforms of a given protein due to prior proteolysis and not the individual proteoforms themselves.
However, it is often the relative abundance of proteoforms harboring PTMs that changes and not the absolute abundance of the protein group. Developmental Testing Program Radiography on this page. With regard to DE concordance, we found that many of the measurements agreed between the two methods. Disagreements in DE often arise from changes in PTM stoichiometry, creating dynamic behavior in TD proteomics, which is often obscured through peptide–protein inference in BU proteomics.
Furthermore, the estimates of fold change between the two techniques agreed somewhat for those proteins and proteoforms where both methods agreed on differential expression (R-squared of 0.39, r = 0.62), as seen in D. DISCUSSION Across all three studies comparing the WHIM samples, it is clear that BU proteomics is able to identify more proteins than TD.
However, TD proteomics identifies and characterizes different entities than BU, namely intact proteoforms ( i.e. The different molecular forms of a protein arising from a single gene). The number of proteoforms per RefSeq identification as discovered by TD varies significantly and the level of variation can be seen in.
Of note, only 21% of proteoforms are the sole representative of a RefSeq ID, and 52% of RefSeq IDs are seen by only one proteoform. The ability to detect and quantify proteoforms makes TD more sensitive at determining changes in PTMs and variant expression within complex samples that may be crucial in biological processes responsible for signal transduction.
One clear example of these differences comes from alpha-endosulfine (RefSeq:, UniProt:) from Study 3, as shown in. It is known that phosphorylation of this protein affects its secondary structure and its corresponding protein–protein interactions (). Both techniques found a greater abundance of the unmodified protein in WHIM2 ( A and C). However, TD discovered a significant difference in phosphorylation stoichiometry that accompanied the abundance change and detected a diphosphorylated proteoform that was only present in WHIM16.
Both methods had strong supporting evidence ( B); TD confidently detected Isoform 1 of this protein with 121 observations of the unmodified form, and 48 observations on the diphosphorylated form, and BU detected the protein with five matched peptides covering 39.2% of the sequence and with 302 spectral counts. Although the BU analysis detected a peptide spanning one of the two phosphorylation sites, the analysis did not detect the phosphorylated peptide. As seen from D, the level of phosphorylation in WHIM16 is higher than in WHIM2, with both mono- and di- phosphorylated forms at higher relative abundance than in WHIM2, yet the unmodified form is higher in WHIM2.
Despite the excellent sequence coverage by BU, it was not possible to capture this level of dynamism in PTM levels. Phosphopeptide enrichment prior to BU may have resulted in the detection of these phosphorylations, as shown in the analysis of similar xenografts ().
However, BU could not possibly report on the exact proteoform present, as co-occurrence information of PTMs is lost during proteolysis. Differential expression of alpha-endosulfine. ( A) Bottom-up heatmap illustrating number of alpha-endosulfine peptides identified in each replicate.
Each row represents a separate peptide reporting uniquely on alpha-endosulfine, while columns in the map. Now consider an example from those proteins that TD detected as differentially expressed but were not classified as such by the BU analysis. A shows the results for d-dopachrome decarboxylase (RefSeq:, UniProt: ), a protein in this class. Based on 24 MS1 observations and a strong characterization, TD data ( E) showed this protein to be DE with an estimated fold change of over 36x more abundant in WHIM16 than WHIM2 (instantaneous q = 0.00001). The BU analysis had 91.5% sequence coverage from 18 different peptides and 786 spectral observations.
The t test used to detect differential expression of the protein had a p value of.0086, but the critical value to maintain the 1% FDR for this test was.0019, thus, the protein was not considered DE by BU. B shows the BU results for androgen-induced gene 1 (RefSeq:, UniProt: ), a protein not detected by TD. The t test for this protein had a p score five ranks better than d-dopachrome decarboxylase, but it was identified by only five peptides, four of which were only seen once, and the other peptide had at most only three spectra in a single file. The effect of these marginal identifications is that they lengthen the list of identifications and force stronger identifications such as d-dopachrome decarboxylase to meet more stringent criteria in the DE analysis.
This is an inherent tradeoff in quantitative omics studies. Less-stringent criteria can be used to accept more identifications, but in doing so, entities with less support are passed forward to the DE analysis. These less-well-supported entities, in this case BU protein IDs, increase the difference in treatment means needed for all entities detected to pass the multiple testing correction. Discordant examples of differential expression profiles as measured by BU and TD. Panels A–D show heatmaps generated from BU spectral count data, while panels E–G contain corresponding boxplots from TD MS1 intensity data. Each row of the.
Decisions made in the data analysis pipeline can have a drastic effect on the sensitivity of either technique. Differences in the statistical criteria used in the DE analysis can cause a protein to be identified by BU as DE but not so by TD. An example is illustrated in the case of cytochrome b-c1 complex subunit 8 (RefSeq:, UniProt: ) ( C). The classification of DE hinges on the assumption of uniform variances in the log-transformed spectral counts. For the BU analysis, we assumed the log-transformed spectral counts between the WHIM2 and WHIM16 GELFrEE replicates had equal variance, and thus the corresponding increase in p value of the t test was sufficient to move this result onto the DE list for the BU study; if that assumption is relaxed, then the results are not sufficiently great for this protein to be considered DE, and the protein would be in agreement with the hierarchical linear model used for the TD analysis as not DE. In some cases, the two methods can simply disagree as shown in the case of protein phosphatase 1 regulatory subunit 1B (RefSeq:, UniProt: ) (D). BU found 10 peptides each with one to three spectral counts in those LC runs that detected them in WHIM2, while no spectra were detected in WHIM16, leading to the conclusion that the protein is present in WHIM2 and absent in WHIM16.
Meanwhile, TD had 33 observations, 12 from WHIM16 where they showed no difference in mean intensity ( G). This case highlights the inferential problems that can arise from datasets containing many missing values. The BU dataset has 78.5% missing values. Therefore, when the WHIM2 data are near the detection threshold, i.e.
No single file spotted any supporting peptide more than three times, the total absence of spectral counts from the WHIM16 LC-MS/MS runs is not necessarily compelling evidence of the protein not being expressed. In TD, it is easy to determine if the intensity of a proteoform is near the detection limit by looking at the signal-to-noise ratio of the intact measurement.
The difference in the ability of the two techniques to characterize major changes to proteins is highlighted by the response to cytoskeletal keratins. Keratin, in general, is an intermediate filament protein and one of the most common contaminants of proteomic studies.
Like other intermediate filament proteins, keratins come in two complementary types, Type 1 and Type 2, and form polymeric complexes that shape both intracellular and extracellular structures (). Type 2 intermediate filament proteins are known to have unique head and tail regions that differentiate their cellular function, while retaining the highly conserved central region responsible for forming the filamentous dimers ().
Forming the primary component of human hair and skin and containing a large and highly conserved middle rod section, keratins are frequently dismissed in MS-based proteomics as contaminants from sample preparation. Type 2 cytoskeletal keratin 8 (K2C8, RefSeq:, UniProt: ), however, is a cytoplasmic keratin used as a variable diagnostic tool in differentiating lobular and ductal breast cancers (). Ductal carcinomas tend to stain diffusely positive for K2C8 markers (), while lobular do not.
In Study 3, the BU experiment found 98 peptides spanning nearly the entire length of K2C8 with 88.2% sequence coverage. The BU analysis found a strong increase of K2C8 in WHIM16 (1,727 versus 639 spectral counts; instantaneous q = 3.5 × 10 −5; 2.7x fold increase in WHIM16). Unfortunately, because of the conserved nature of keratins, these results are easily dismissed as contamination. Panel A of shows that much of the BU data come from peptides unique to K2C8, and it is easy to see both its increase in WHIM 16 and a fairly uniform distribution of peptides across all three regions of the protein (head, rod, and tail). By virtue of the “biomarker” search strategy (effectively a “no-enzyme” type search for TD data; see experimental section), the TD study found 17 proteoforms derived from K2C8, all of which were proteolytic fragments from the unique head or tail regions (see ). Eight out of 17 of the proteoforms were significantly increased in WHIM16 (ranging from 2–18-fold), the luminal-B cancer subtype.
Two of the DE proteoforms from the N-terminal variable region contain phosphorylation, and one also contains a 3-hydroxy- l-proline. These proteoforms cannot be explained as hair or skin contamination as they represent unique sequences found only in the cytoplasmic keratin. Furthermore, all of the proteoforms from the head region end either one or two amino acids from S74, which is known to play an important role in keratin filament reorganization ().
These observations are consistent with increased proteolytic release of the head and tail domains of K2C8 in luminal versus basal PDX models. While cytokeratins and intermediate filaments have been identified as possible probes of breast cancer subtypes previously (), these are the first such observations of head/tail proteolytic events only made possible via detection of intact proteoforms instead of tryptic peptides. Summary and Future Directions While BU and TD generally display complementary sensitivities, the trends found within the data here provide a first tranche of specific observations.
For example, BU identified 7.4 times as many proteins as TD, and 6.3 times as many proteins were found to be differentially expressed. TD proved sensitive for detecting proteoform-level differences below 30 kDa, such as the multiple phosphorylation forms of alpha-endosulfine, relative expression of heterozygous alleles like in gamma-synuclein or ribosomal protein L35, and domain-specific regions of keratin.
BU discovered 10 times as many cancer–events but was not able to accurately predict which of these events were DE. While precise mapping of BU and TD data is complicated because they measure fundamentally different things, an early estimate of the proteoform-level dynamics not captured by BU can be made: For small, abundant proteins, changes in primary structure not captured by BU occur in about 40% of cases. Future quantitative TD studies will benefit from the analysis of larger proteins (30–60 kDa) as many proteins fall within that MW range.
Given this study (and others), it is clear that there are significant benefits from the integration of BU and TD proteomics analyses, as a strong complementarity exists between peptide- and proteoform-level measurements. Contributed by Author contributions: I.N., R.D.L., R.T.F., S.R.D., P.M.T., P.D.C., E.S.B., H.R., R.R.T., and N.L.K. Designed the research; I.N., R.D.L., and P.E. Performed the research; I.N., R.D.L., R.T.F., J.R., B.P.E., K.V.R., and D.F. Analyzed data; I.N., R.D.L., R.R.T., and N.L.K. Wrote the paper; and S.L. Established and provided xenografts.
* This work was supported by Award No. GM067193 from the National Institute of General Medical Sciences (NLK), Federal Funds from the National Cancer Institute (Office of Cancer Clinical Proteomics Research), National Institutes of Health, under Contract No. HHSN01E, and the Office for Research at Northwestern University.
This research was also partially supported by the National Science Foundation under grant no. ABI-1062432 to Indiana University (RDL). The content of this publication does not necessarily reflect the views of policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation, the National Institutes of Health, the National Center for Genome Analysis Support, Northwestern University, or Indiana University. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. This article contains.
1 The abbreviations used are: BU bottom up CPTAC Clinical Proteomic Tumor Analysis Consortium DE differential expression GELFrEE gel-eluted liquid fraction entrapment electrophoresis K2C8 Type 2 cytoskeletal keratin 8 NGS next-generation sequencing PDX patient-derived xenograft PFR proteoform record PTM posttranslational modification SNP single nucleotide polymorphism TD top down WHIM Washington University Human-in-Mouse psi pounds per square inch. Ding L., Ellis M. J., Li S., Larson D. E., Chen K., Wallis J. W., Harris C. C., McLellan M. D., Fulton R.
S., Fulton L. L., Abbott R. M., Hoog J., Dooling D. J., Koboldt D. C., Schmidt H., Kalicki J., Zhang Q., Chen L., Lin L., Wendl M. C., McMichael J.
F., Magrini V. J., Cook L., McGrath S. D., Vickery T. L., Appelbaum E., Deschryver K., Davies S., Guintoli T., Lin L., Crowder R., Tao Y., Snider J.
F., Sanderson G. S., Delehaunty K. D., Fronick C. S., Robinson J.
S., Hodges J. S., Schierding W., Dees N. D., Shen D., Locke D. P., Wiechert M. E., Eldred J.
B., Oberkfell B. J., Lolofie J. T., Du F., Hawkins A. E., O'Laughlin M. D., Bernard K.
E., Cunningham M., Elliott G., Mason M. D., Thompson D. Jr., Ivanovich J.
L., Goodfellow P. M., Weinstock G. M., Aft R., Watson M., Ley T. J., Wilson R. K., and Mardis E. (2010) Genome remodelling in a basal-like breast cancer metastasis and xenograft.
Nature 464, 999–1005 [] []. Li S., Shen D., Shao J., Crowder R., Liu W., Prat A., He X., Liu S., Hoog J., Lu C., Ding L., Griffith O.
L., Miller C., Larson D., Fulton R. S., Harrison M., Mooney T., McMichael J.
F., Luo J., Tao Y., Goncalves R., Schlosberg C., Hiken J. F., Saied L., Sanchez C., Giuntoli T., Bumb C., Cooper C., Kitchens R.
T., Lin A., Phommaly C., Davies S. R., Zhang J., Kavuri M. S., McEachern D., Dong Y. Y., Ma C., Pluard T., Naughton M., Bose R., Suresh R., McDowell R., Michel L., Aft R., Gillanders W., DeSchryver K., Wilson R.
K., Wang S., Mills G. B., Gonzalez-Angulo A., Edwards J. R., Maher C., Perou C. M., Mardis E. R., and Ellis M. (2013) Endocrine-therapy-resistant ESR1 variants revealed by genomic characterization of breast-cancer-derived xenografts.
4, 1116–1130 [] []. Zhang X., Claerhout S., Prat A., Dobrolecki L. E., Petrovic I., Lai Q., Landis M. D., Wiechmann L., Schiff R., Giuliano M., Wong H., Fuqua S. W., Contreras A., Gutierrez C., Huang J., Mao S., Pavlick A. C., Froehlich A. F., Tsimelzon A., Hilsenbeck S.
S., Zuloaga P., Shaw C. A., Rimawi M. C., and Lewis M. (2013) A renewable tissue resource of phenotypically stable, biologically and ethnically diverse, patient-derived human breast cancer xenograft models. 73, 4885–4897 [] [].
Mertins P., Yang F., Liu T., Mani D. R., Petyuk V. A., Gillette M. A., Clauser K.
W., Gritsenko M. J., Levine D. A., Townsend R., Erdmann-Gilmore P., Snider J.
E., Davies S. R., Ruggles K. V., Fenyo D., Kitchens R. T., Li S., Olvera N., Dao F., Rodriguez H., Chan D. W., Liebler D., White F., Rodland K.
D., Paulovich A. G., Ellis M., and Carr S. (2014) Ischemia in tumors induces early and sustained phosphorylation changes in stress kinase pathways but does not affect global protein levels.
Proteomics 13, 1690–1704 [] [].
Rules of the Game 3 by Mary Page Pete ->->->->DOWNLOAD BOOK ->->->->ONLINE BOOK Rules of the Game grammar lessons start with examples and direct questions from which students arrive at grammatical rules and principles. Miyabi - 5 Japanese Impressions For Piano Solo by Naoko Ikeda >>>DOWNLOAD BOOK >>>ONLINE BOOK (Willis). This brief, beautiful collection was inspired by flowers that bloom in late spring and early summer. Miyabi is a Japanese aesthetic meaning 'elegant' or 'refined.' Summoned: Identification and Religious Life in a Jewish Neighborhood by Iddo Tavory ->>->>->>DOWNLOAD BOOK ->>->>->>ONLINE BOOK On a typical weekday, men of the Beverly-La Brea Orthodox community wake up early, beginning their day with Talmud reading and prayer at 5:45am, before joining Los Angeles’ traffic.
A Short History of Sweden - Volume II (Illustrated) by Victor Nilsson ->>>DOWNLOAD BOOK ->>>ONLINE BOOK Charles XII., the most famous of Swedish kings, was a boy of fifteen at the death of his father. He was born June 17, 1682, at the castle of Stockholm. Pictograms, Icons, and Signs by Rayan Abdullah ->->->->DOWNLOAD BOOK ->->->->ONLINE BOOK A copiously illustrated and practical guide to informational graphics. Pictograms and icons are a keystone of nonverbal and multicultural communication. What Book Did Geoffrey Chaucer Write >>Geoffrey Chaucer: The outstanding.
Upon his own writing. Chaucers most important.
Must continue his search in other books. For this poem Chaucer also.Home Books Bloom's How to Write about Geoffrey Chaucer.
Bloom's How to Write about Geoffrey Chaucer. His most recent books include How to Read and.C S Lewis's book, The Allegory of Love. We praise thee for the gift which thou didst bestow upon thy servant Geoffrey Chaucer, of writing poems concerning human.Free essay on Geoffrey Chaucer and his Effect on the English Language available. To those interested in nature and books. His writing was very.literary analysis - Consolation in the Book of the Duchess by Geoffrey ChaucerThe Canterbury Tales is a book of stories written by Geoffrey Chaucer and is universally despised by students across the globe. It was written in the 14th century.Shmoop guide to Geoffrey Chaucer Writing Poetry.Lively, absorbing, often outrageously funny, Chaucer's 'The Canterbury Tales' is a work of genius, an undisputed classic that has held a special appeal for each.Discover librarian-selected research resources on Geoffrey Chaucer from the Questia.
Geoffrey Chaucer: Selected full-text books and. How to write an essay; Book Guides;. Why is Chaucer considered to be 'the father of. Retired Clerk of Works Geoffrey Chaucer finishes prologue of his.Search for Book Chaucer Geoffrey.In what language did chaucer write his tales. His use of these books in his writing takes various forms..MidwayUSA is a privately held American retailer of various hunting and outdoor-related products.MidwayUSA is a privately held American retailer of various hunting and outdoor-related products.Geoffrey Chaucer and Dante Alighieri were well known writers of the Middle Ages. They were known for their books that questioned the clergy and the society.Chaucer's Biography:. On 'A Chaucer Chronology' in John H.
Fisher's Complete Poetry and Prose of Geoffrey Chaucer,. Chaucer write Troilus and Criseyde,.Why did Geoffrey Chaucer write 'Canterbury Tales'?. What was happening when Geoffrey Chaucer wrote The. Why do academics write books that are. Get an answer for 'Why did Geoffrey Chaucer write The Book of the Duchess?'
And find homework help for other Geoffrey Chaucer questions at eNotesGeoffrey Chaucer has 733 books on Goodreads with 308200 ratings. Geoffrey Chaucers most popular book is The Canterbury Tales.Geoffrey Chaucer, hailed as the Father. It is not known when he first started writing,.
Apart from Geoffrey, John and Agnes Chaucer possibly had a daughter.' The Canterbury Tales' by Geoffrey Chaucer. Chaucer composed the Book of the Duchess in her memory. Chaucer's writing picked up an Italian flavour through his.Start studying Geoffrey Chaucer Notes - Lippe..
What book did William Langland write?.Chaucer decided to write in the language of the peoplemedieval Englishand. What three jobs did Geoffrey Chaucer have?. Book of the Duchess. Geoffrey Chaucer.Geoffrey Chaucer Essays: Over 180,000 Geoffrey Chaucer Essays, Geoffrey Chaucer Term Papers, Geoffrey Chaucer Research Paper, Book Reports.
184 990 ESSAYS, term and. E0ec752d1c wahaloomorfa. A Man's Guide to Life and Love in the Philippines by Larry Elterman ->>->>->>DOWNLOAD BOOK ->>->>->>ONLINE BOOK A practical guide about life and love in the Philippines; a no holds barred, no subjects avoided, non sugar coated look at all aspects of Expat life, love and sex in the Philippines. Mail Order Brides of Last Chance: A 15-Book Western Romance Box Set (Mail Order Bride) by Lucille Chisum ->->->->DOWNLOAD BOOK ->->->->ONLINE BOOK 15 Mail Order Bride Romances! Five complete series!Also includes new material. And all for just $0.99, for a limited time only! Always free with Kindle Unlimited!
The Grand Universe: Parts I and II of the Book of Urantia by Celestial Authors ->>->>->>DOWNLOAD BOOK ->>->>->>ONLINE BOOK THE BOOK OF URANTIA was presented by celestial authors through an anonymous human receiver. It was published in 1955 as The Urantia Book, a 2,097-page volume divided into four parts and 196 papers.