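In proteomics research, high-throughput methods produce large amounts of protein data, which then need to be processed with bioinformatics methods. Here I introduce one paper in the hope of starting a discussion about which other methods could be used for the bioinformatics processing.
The paper applies the following: merging the results of two search engines, label-free quantification, relative abundance comparison (normalized and non-normalized), GO term comparison, comparison of protein localization predictions from three algorithms, pathway analysis, and protein modification analysis (including amino acid modifications and protein degradation). The results table also lists signal peptides, transmembrane regions, and whether each protein is a serum protein. Article link: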
Characterization of the Vitreous Proteome in Diabetes without Diabetic Retinopathy and Diabetes with Proliferative Diabetic Retinopathy
J. Proteome Res., ASAP Article, 10.1021/pr800112g
http://pubs.acs.org/cgi-bin/abstract.cgi/jprobs/asap/abs/pr800112g.html
For copyright reasons I cannot post the paper here; if you cannot download it, send me a message.
Figure 1. Proteomic analysis process and the number of proteins identified. A. Schematic of gel-LC-MS/MS analysis and data processing. B. Venn diagram of proteins identified using X!Tandem and SEQUEST algorithms. The number of proteins identified from 17 independent vitreous samples and percent of total number of proteins identified by each algorithm are shown.
This table contains the total list of IPI, protein name, gene symbol, sequence coverage, unique peptides, spectral count of NDM, noDR, and PDR group (Mean ± SEM), p value, search engine, GO term (biological process, cellular component, molecular function), and protein subcellular localization prediction for the 252 proteins identified in this study.
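The overlap shown in the Figure 1B Venn diagram can be reproduced from two lists of identified proteins with plain set operations. This is only a minimal sketch: the function name `venn_counts` and the accessions below are made up for illustration and are not taken from the paper.

```python
def venn_counts(tandem_ids, sequest_ids):
    """Return {region: (count, percent of all identified proteins)} for two search engines."""
    tandem, sequest = set(tandem_ids), set(sequest_ids)
    total = len(tandem | sequest)  # proteins identified by at least one engine
    regions = {
        "X!Tandem only": len(tandem - sequest),
        "both": len(tandem & sequest),
        "SEQUEST only": len(sequest - tandem),
    }
    return {
        name: (n, 100.0 * n / total if total else 0.0)
        for name, n in regions.items()
    }


# Hypothetical accessions, for illustration only
tandem_hits = ["IPI_A", "IPI_B", "IPI_C"]
sequest_hits = ["IPI_B", "IPI_C", "IPI_D", "IPI_E"]
overlap = venn_counts(tandem_hits, sequest_hits)
```

The union of the two sets is also the merged protein list that the later steps (quantification, GO terms, localization) would work from.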
Relative abundance analysis:
Figure 2. Fractional distribution of the most abundant proteins in human vitreous. A. Chart showing a summary of the relative amounts of highly abundant proteins in PDR vitreous. B. Table showing the mean percent of number of total peptides for the 15 most abundant proteins identified in NDM, noDR, and PDR samples relative to the number of total peptides detected from respective samples.
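The "percent of total peptides" in Figure 2B can be sketched as follows: within each sample, divide a protein's peptide count by the total peptide count of that sample, then average the percentages over the samples of one group. The function names and the counts are hypothetical, and the paper may have handled details (for example proteins missing from a sample) differently; here a missing protein is assumed to count as 0%.

```python
def percent_of_total(counts):
    """Convert {protein: peptide count} for one sample into percent of total peptides."""
    total = sum(counts.values())
    if total == 0:
        return {protein: 0.0 for protein in counts}
    return {protein: 100.0 * c / total for protein, c in counts.items()}


def mean_percent(samples):
    """Average each protein's percentage over all samples of one group.

    A protein not detected in a sample contributes 0% for that sample (assumption).
    """
    proteins = set().union(*samples)
    per_sample = [percent_of_total(s) for s in samples]
    return {
        p: sum(ps.get(p, 0.0) for ps in per_sample) / len(per_sample)
        for p in proteins
    }


# Hypothetical peptide counts for two samples of one group
group_samples = [{"protA": 50, "protB": 50}, {"protA": 75, "protB": 25}]
group_mean = mean_percent(group_samples)
```

Normalizing within each sample first keeps a sample with many more spectra from dominating the group mean.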
Figure 3. Comparison of protein abundance in noDR or PDR vitreous relative to NDM vitreous. Ratio of the mean total peptides detected in noDR or PDR groups relative to the NDM group. The absence of protein detection in a group is indicated by > 20 fold.
GO terms and comparison of protein localization predictions:
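The ratio in Figure 3 can be sketched like this. The caption only says that a protein not detected in one group is shown as "> 20 fold"; returning a capped value plus a flag, and using 1/20 for the opposite direction, are my own assumptions for illustration, not the paper's stated procedure.

```python
import math

FOLD_CAP = 20.0  # the caption reports undetected proteins as "> 20 fold"


def fold_change(mean_group, mean_ref, cap=FOLD_CAP):
    """Ratio of group mean to reference (NDM) mean.

    Returns (value, is_capped). is_capped is True when one side was not
    detected, so the value should be read as "> cap" or "< 1/cap".
    """
    if mean_group == 0 and mean_ref == 0:
        return (math.nan, False)  # not detected in either group
    if mean_ref == 0:
        return (cap, True)        # detected only in the compared group
    if mean_group == 0:
        return (1.0 / cap, True)  # detected only in the reference group
    return (mean_group / mean_ref, False)
```

For example, `fold_change(6.0, 2.0)` gives a plain ratio of 3, while `fold_change(4.0, 0.0)` is flagged as capped instead of dividing by zero.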
Figure 4. A. Frequency of Gene Ontology terms in human vitreous proteome annotation. Predicted protein subcellular localization by MultiLoc (B), TargetP (C) and SubLoc (D).
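Comparing the three localization predictors comes down to counting labels per tool and checking how often they agree. The sketch below assumes the labels of MultiLoc, TargetP and SubLoc have already been mapped to one shared vocabulary, which the tools do not do by themselves; the protein identifiers and labels are invented.

```python
from collections import Counter


def localization_summary(predictions):
    """Summarize predictions given as {protein: {tool: label}}.

    Returns (per-tool label counts, number of proteins where all tools agree).
    """
    per_tool = {}
    unanimous = 0
    for calls in predictions.values():
        for tool, label in calls.items():
            per_tool.setdefault(tool, Counter())[label] += 1
        if len(set(calls.values())) == 1:
            unanimous += 1
    return per_tool, unanimous


# Hypothetical, already harmonized labels
example_predictions = {
    "P1": {"MultiLoc": "extracellular", "TargetP": "extracellular", "SubLoc": "extracellular"},
    "P2": {"MultiLoc": "cytoplasm", "TargetP": "other", "SubLoc": "nuclear"},
}
tool_counts, agreed = localization_summary(example_predictions)
```

The per-tool counters correspond to the per-tool charts in Figure 4, and the agreement count is a quick check on how far the predictions can be trusted.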
Ingenuity analysis. The paper presents this as a table rather than a figure.
Proteins related to complement and coagulation cascades differentially changed in human vitreous proteome in patients with NDM or PDR (red: increase, blue: decrease).
Protein modification analysis:
Figure 5. Identification of protein fragments in the vitreous. A. Schematic of peptide coverage for proteins with a lower than predicted molecular weight. The location of peptides identified is indicated. B & C. The spectral count for cadherin-2 and vitronectin on SDS-PAGE at different molecular weight from gel slices.
Protein modification analysis, comparison among groups (not in the paper)
A. Frequency of protein modification in human vitreous proteome.
B. Comparison of protein modification among NDM, noDR and PDR groups.
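Building on these two, many further analyses are possible. I have seen a study that combined the sample groupings with the differentially expressed genes and the GO analysis results, relating the sample groups to the GO results, and it gave very good results. For details, see the article "Integration of GO annotations in Correspondence Analysis: facilitating the interpretation of microarray data".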
(1) You have a system called MS Result manager in your lab. Could you recommend similar systems that are freely available?
(2) What do you think of the system described in http://www.biomedcentral.com/1471-2105/9/302 ?
(3) Someone gave me a lot of Mascot results. Do you know of any tools that extract the useful information? I don't want to write a parser for that.
PS: Did you do your PhD in H & W lab? If yes, we are old friends.
Author: 草木叉 Time: 2013-10-11 10:57 Subject: Reply to post #39 by 89tongzijun
There are several free software packages available for proteomics data analysis. To my knowledge, no good free package can handle 1D gel-based proteomics data. I haven't published my software, so my software is free or open source now.
The software you mentioned focuses on 2D gel data. I don't know what kind of data you are dealing with.
Mascot parsers are available; you can find them with a Google search.
I wrote a parser for Mascot data a long time ago, so my software package can definitely handle Mascot data. As long as the output file is in XML or even ASCII format, results from different search engines can easily be imported into my system.
If your data come from 1D gel-based proteomics or shotgun proteomics, maybe we can cooperate to analyze them. The software can also output label-free quantification results. See the link for details: http://www.dxy.cn/bbs/post/view?bid=65&id=12463538&sty=1
Author: jrwyyplt Time: 2013-10-11 10:57
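Proteomics based on 1D gel + 1D-LC/MS/MS has some notable advantages over 2D gel-based and 2D-LC/MS/MS (shotgun) approaches. Compared with the 2D gel-based approach, 1D gel + 1D-LC/MS/MS mainly saves a great deal of work and makes it easy to obtain the proteins of the whole sample rather than a few differential spots. In addition, the quantification results, whether from labeled or label-free methods, are in theory more accurate than with 2D gels, because each spot on a 2D gel usually contains many proteins and a single protein is usually spread across several spots.
I would like to ask the original poster the following. Although each spot on a 2D gel may contain many proteins, the separation should still be better than on a 1D gel, shouldn't it? Also, after a 1D gel band containing several proteins is digested, how are the individual proteins in the mixture identified? What is the principle? Is it also based on an algorithm? Can the 1D gel + 1D-LC/MS/MS approach be used for species that have not been sequenced? And what exactly is the quantification method? I am not familiar with the 1D gel + 1D-LC/MS/MS approach and would appreciate your guidance. Thank you!
Author: 草木叉 Time: 2013-10-11 10:59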
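I wrote a program that converts a large number of IPI identifiers into protein sequences and then submits them to the http://www.cbs.dtu.dk/ website for queries. The programs on that site have limits; for TargetP, for example:
At most 2,000 sequences and 200,000 amino acids per submission; each sequence not more than 4,000 amino acids.
Enforcing these limits in the program is enough. For anyone who can program, this is not a hard problem.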
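As a rough sketch of how those limits can be enforced, the code below splits a list of sequences into batches that each stay within the quoted TargetP limits and sets aside any sequence that is too long. Only the three numbers come from the limits above; the function names, the choice to skip over-long sequences, and the FASTA helper are my own assumptions, and the actual upload step is left out.

```python
MAX_SEQS = 2000         # sequences per submission
MAX_RESIDUES = 200_000  # amino acids per submission
MAX_SEQ_LEN = 4000      # amino acids per sequence


def make_batches(records, max_seqs=MAX_SEQS, max_residues=MAX_RESIDUES,
                 max_len=MAX_SEQ_LEN):
    """Split (identifier, sequence) pairs into submission-sized batches.

    Returns (batches, skipped), where skipped lists the identifiers of
    sequences longer than max_len. Assumes max_len <= max_residues.
    """
    batches, skipped = [], []
    current, residues = [], 0
    for ident, seq in records:
        if len(seq) > max_len:
            skipped.append(ident)
            continue
        # Start a new batch if adding this sequence would break a limit.
        if current and (len(current) >= max_seqs
                        or residues + len(seq) > max_residues):
            batches.append(current)
            current, residues = [], 0
        current.append((ident, seq))
        residues += len(seq)
    if current:
        batches.append(current)
    return batches, skipped


def to_fasta(batch):
    """Format one batch as FASTA text ready to paste or upload."""
    return "".join(f">{ident}\n{seq}\n" for ident, seq in batch)
```

Each batch can then be written out with `to_fasta` and submitted separately; the skipped identifiers need to be handled by hand or with another tool.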
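I also handle the GO annotation for cellular localization with my own program. You can use the DAVID website (http://david.abcc.ncifcrf.gov/home.jsp) for the classification; they do it very well. GO annotation gives localization information for only some of the proteins, and the rest still has to be predicted computationally. In the mouse IPI database, 64% of entries have annotation, but if a program is used to choose among the multiple IPI identifiers that correspond to the same sequence, the annotation rate can be raised to 94%. This step may be harder for you, though.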
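One way to read the "choose among multiple IPI identifiers for the same sequence" step is sketched below: group identifiers by identical sequence and, within each group, prefer an identifier that carries GO annotation. This is my interpretation for illustration only, not the actual program; the identifiers and sequences are invented, and the function names are hypothetical.

```python
from collections import defaultdict


def pick_annotated_ids(id_to_seq, id_to_go):
    """For each distinct sequence, pick one identifier, preferring an annotated one."""
    by_seq = defaultdict(list)
    for ident, seq in id_to_seq.items():
        by_seq[seq].append(ident)
    chosen = {}
    for seq, idents in by_seq.items():
        annotated = [i for i in idents if id_to_go.get(i)]
        # Fall back to any identifier if none in the group is annotated.
        chosen[seq] = sorted(annotated or idents)[0]
    return chosen


def annotation_rate(chosen, id_to_go):
    """Fraction of distinct sequences whose chosen identifier has GO annotation."""
    if not chosen:
        return 0.0
    hits = sum(1 for ident in chosen.values() if id_to_go.get(ident))
    return hits / len(chosen)


# Hypothetical identifiers and sequences
example_seqs = {"ID1": "MKT", "ID2": "MKT", "ID3": "MAA"}
example_go = {"ID2": ["GO:0005576"]}
example_choice = pick_annotated_ids(example_seqs, example_go)
example_rate = annotation_rate(example_choice, example_go)
```

Here "ID1" has no annotation but shares its sequence with the annotated "ID2", so picking "ID2" for that sequence recovers the annotation.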