MS-Align+ is a software tool for top-down protein identification based on spectral alignment that enables searches for unexpected post-translational modifications. This manual helps you install and run MS-Align+.
The current version of MS-Align+ works with Thermo raw data.
The MS-Align+ pipeline for analyzing raw data includes three components: format conversion, deconvolution, and database search.
See the paper by Liu et al., 2012.
MS-Align+ requires a computer with at least 12 GB of memory, a 64-bit Linux or Microsoft Windows operating system, and Java Runtime Environment 7.0.
Download the zipped file MS-Align+ 0.7.1.7143 and extract it to install MS-Align+.
To extract the zipped file on Microsoft Windows, right-click the file, click "Extract all", and follow the instructions.
To extract the zipped file on a Linux system, use the following command:
unzip MS-Align-0.7.1.7143.zip
MS-Align+ needs three input files: a protein database file in FASTA format, a spectrum data file, and a configuration file. All three files should be placed in the directory INSTALL_DIR/msalign+/msinput/, where INSTALL_DIR is the directory in which MS-Align+ has been installed.
The protein database file in FASTA format can be downloaded from SWISS-PROT or other protein databases.
The spectrum data file, in a format similar to MGF, can be obtained by using MS-Deconv. Please see the manual of MS-Deconv for details.
The name of the configuration file is input.properties. The user can edit this file to set parameters of MS-Align+.
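Before starting a search, it can help to confirm that the three input files are where MS-Align+ expects them. The following is a minimal sketch in Python, not part of MS-Align+; the function name missing_inputs is hypothetical, and the database and spectrum file names are placeholders that should match the names set in input.properties.

```python
from pathlib import Path


def missing_inputs(install_dir, database_name, spectrum_name):
    """Return the expected input files that are absent from msinput."""
    input_dir = Path(install_dir) / "msalign+" / "msinput"
    expected = [database_name, spectrum_name, "input.properties"]
    return [name for name in expected if not (input_dir / name).is_file()]


if __name__ == "__main__":
    # "." stands in for INSTALL_DIR; the two file names are placeholders.
    for name in missing_inputs(".", "prot.fasta", "spectra.msalign"):
        print("missing:", name)
```

An empty result means all three files were found in INSTALL_DIR/msalign+/msinput/.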
Here is a sample of the configuration file: input.properties.
databaseFileName=prot.fasta
spectrumFileName=spectra.msalign
activation=FILE
searchType=TARGET
cysteineProtection=C0
shiftNumber=2
errorTolerance=15
cutoffType=EVALUE
cutoff=0.01
doOneDaltonCorrection=false
doChargeCorrection=false
tableOutputFileName=result_table.txt
detailOutputFileName=result_detail.txt
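For scripted workflows, the key=value lines of the sample above can be read into a dictionary. The sketch below is an illustration in Python and not part of MS-Align+; parse_properties is a hypothetical helper that handles only the simple key=value form shown in the sample, not the full Java properties syntax (for example, line continuations are not supported).

```python
def parse_properties(text):
    """Parse simple key=value lines into a dict.

    Blank lines and lines starting with '#' or '!' are skipped.
    """
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, separator, value = line.partition("=")
        if separator:
            settings[key.strip()] = value.strip()
    return settings


# A short excerpt of the sample configuration shown above.
SAMPLE = """\
databaseFileName=prot.fasta
spectrumFileName=spectra.msalign
searchType=TARGET
cutoffType=EVALUE
cutoff=0.01
"""

settings = parse_properties(SAMPLE)
print(settings["databaseFileName"], settings["cutoff"])
```

All values are kept as strings, so numeric parameters such as cutoff need to be converted explicitly before use.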
databaseFileName
Specify the name of the protein database file.
spectrumFileName
Specify the name of the spectrum file.
activation
Specify the fragmentation type of tandem mass
spectra. "activation" can be set as CID, HCD, ETD or FILE. When
activation=FILE, MS-Align+ uses the fragmentation information provided in
the spectrum file.
searchType
The parameter searchType can be either TARGET or TARGET+DECOY.
When searchType=TARGET+DECOY, a concatenated target+decoy database will be
generated and false discovery rates will be calculated.
cysteineProtection
The parameter cysteineProtection can be set as C0,
C57 or C58. C0: no modification; C57: carbamidomethylation;
C58: carboxymethylation.
shiftNumber
The maximum number of unexpected post-translational
modifications in a protein-spectrum-match. The parameter can be set as 0, 1, or 2.
errorTolerance
The error tolerance for precursor and fragment
masses in PPM. The default value is 15.
cutoffType
The type of the cutoff value for reporting
identified protein-spectrum-matches. The parameter can be EVALUE or FDR.
When cutoffType=FDR, searchType must be set as TARGET+DECOY.
cutoff
The cutoff value for reporting identified
protein-spectrum-matches. If cutoffType=FDR, the value is a cutoff with
respect to FDRs. If cutoffType=EVALUE, the value is a cutoff with respect to
E-values.
doOneDaltonCorrection
The parameter can be set as true or false. If
doOneDaltonCorrection=true, MS-Align+ will try to automatically correct
+/-1 Da errors introduced in the deconvolution of precursor ions. This
function has not been thoroughly tested.
doChargeCorrection
The parameter can be set as true or false. If
doChargeCorrection=true, MS-Align+ will try to automatically correct
errors in the precursor masses introduced by deconvolution tools which
may report an incorrect charge state of a precursor ion.
This function has not been thoroughly tested.
tableOutputFileName
The name of the output file in a tab-delimited
format.
detailOutputFileName
The name of the output file in XML format.
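The allowed values and the FDR constraint described above can be checked before a search is started. The following Python sketch is an illustration under stated assumptions, not MS-Align+ code: check_settings and ALLOWED are hypothetical names, the settings are assumed to be already loaded into a dictionary of strings, and values are compared case-sensitively, which may be stricter than MS-Align+ itself.

```python
# Allowed values, taken from the parameter descriptions above.
ALLOWED = {
    "activation": {"CID", "HCD", "ETD", "FILE"},
    "searchType": {"TARGET", "TARGET+DECOY"},
    "cysteineProtection": {"C0", "C57", "C58"},
    "shiftNumber": {"0", "1", "2"},
    "cutoffType": {"EVALUE", "FDR"},
    "doOneDaltonCorrection": {"true", "false"},
    "doChargeCorrection": {"true", "false"},
}


def check_settings(settings):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key, allowed in ALLOWED.items():
        value = settings.get(key)
        if value is not None and value not in allowed:
            problems.append(f"{key}={value} is not one of {sorted(allowed)}")
    # An FDR cutoff needs decoy hits, so it requires a target+decoy search.
    if settings.get("cutoffType") == "FDR" and settings.get("searchType") != "TARGET+DECOY":
        problems.append("cutoffType=FDR requires searchType=TARGET+DECOY")
    return problems


print(check_settings({"cutoffType": "FDR", "searchType": "TARGET"}))
```

Parameters that are absent from the dictionary are not reported, so the check can also be run on a partial configuration.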
Windows:
cd msalign+
java -Xmx12G -classpath jar\*; edu.ucsd.msalign.align.console.MsAlignPipeline .\
Linux:
cd msalign+
java -Xmx12G -classpath jar/*: edu.ucsd.msalign.align.console.MsAlignPipeline ./
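The two commands above differ only in the path and class path separators. As a sketch, the same command line can be assembled in Python for either platform; build_command is a hypothetical helper, and the class name and arguments are copied from the commands above.

```python
import os


def build_command(memory="12G"):
    """Build the MS-Align+ command line for the current platform."""
    # os.pathsep is ";" on Windows and ":" on Linux; os.sep is "\\" or "/".
    classpath = os.path.join("jar", "*") + os.pathsep
    return [
        "java",
        f"-Xmx{memory}",
        "-classpath",
        classpath,
        "edu.ucsd.msalign.align.console.MsAlignPipeline",
        "." + os.sep,
    ]


print(" ".join(build_command()))
```

The list can then be passed to subprocess.run(build_command(), cwd="msalign+", check=True), run from INSTALL_DIR, so that the command is executed inside the msalign+ directory as in the commands above.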
The resulting files of MS-Align+ are in three formats: a tab-delimited text file, an XML
file, and HTML files. The tab-delimited text file and the XML file, whose names
are specified in the configuration file, are in the
directory msoutput. The files in the directory html, including proteins.html,
provide web pages for displaying identified protein-spectrum-matches.
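The tab-delimited result file can be loaded with standard tools for further filtering. The sketch below uses Python's csv module; read_result_table is a hypothetical helper, and it assumes that the first row of the file holds the column names, which this manual does not state.

```python
import csv


def read_result_table(path):
    """Read a tab-delimited file into a list of dicts, one per data row.

    Assumption: the first row contains the column headers.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

For example, read_result_table("msoutput/result_table.txt") would load the table written with the sample configuration, assuming the call is made from the directory that contains msoutput.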
4. Citation
If you use MS-Align+ in your research, please include
Liu et al., 2012 in your reference
list.
5. Feedback and bug reports
Your comments, bug reports, and suggestions are very welcome. They will
help us to further improve MS-Align+.
If you have any trouble running MS-Align+, please email us at
liuxiaowencs@gmail.com or post your questions to the
Google group of MS-Align+.
6. New software tool
TopPIC (TOP-Down Mass Spectrometry Based Proteoform Identification and
Characterization) is a new software tool for identification and
characterization of proteoforms at the whole proteome level by top-down tandem
mass spectra using database search. It uses several techniques, such as indexes,
spectral alignment, and a generating function method, to increase the speed,
sensitivity, and accuracy. It also provides a web browser based user interface.
You can download TopPIC here.