MS-Align+ is a software tool for top-down protein identification based on spectral alignment that enables searches for unexpected post-translational modifications. This manual helps you install and run MS-Align+.
The current version of MS-Align+ works with Thermo raw data.
The MS-Align+ pipeline for analyzing raw data includes three components: format conversion, deconvolution, and database search.
For details, see the paper Liu et al., 2012.
MS-Align+ requires a computer with at least 12 GB of memory, a 64-bit Linux or Microsoft Windows operating system, and Java Runtime Environment 7.0.
Download the zipped file MS-Align+ 0.7.1.7143 and extract it to install MS-Align+.
To extract the zipped file on Microsoft Windows, right-click the file, click "Extract all", and follow the instructions.
To extract the zipped file on a Linux system, use the following command:
unzip MS-Align-0.7.1.7143.zip 
MS-Align+ needs three input files: a protein database file in FASTA format, a spectrum data file, and a configuration file. All three files should be in the directory INSTALL_DIR/msalign+/msinput/, where INSTALL_DIR is the directory in which MS-Align+ has been installed.
The protein database file in FASTA format can be downloaded from SWISS-PROT or other protein databases.
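Before starting a long search, it can help to confirm that the downloaded database parses as FASTA. The following is a minimal sketch, not part of MS-Align+: the function name read_fasta and the two example entries are made up for illustration.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from the lines of a FASTA file."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous entry, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Two made-up entries (not real database records).
example = [
    ">protein_1 hypothetical example",
    "MKTAYIAKQR",
    "QISFVKSHFS",
    ">protein_2 hypothetical example",
    "MSDNE",
]
entries = list(read_fasta(example))
```

For a real database, pass an open file handle instead of the example list and count the entries that come back.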
The spectrum data file, in a format similar to MGF, can be obtained by using MS-Deconv. Please see the manual of MS-Deconv for details.
The name of the configuration file is input.properties. The user can edit this file to set parameters of MS-Align+.
Here is a sample configuration file (input.properties):
databaseFileName=prot.fasta
spectrumFileName=spectra.msalign
activation=FILE
searchType=TARGET
cysteineProtection=C0
shiftNumber=2
errorTolerance=15
cutoffType=EVALUE
cutoff=0.01
doOneDaltonCorrection=false
doChargeCorrection=false
tableOutputFileName=result_table.txt
detailOutputFileName=result_detail.txt
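The sample above is a plain list of key=value lines, so it is easy to read from a script, for example to log the settings used for a run. The sketch below is an illustration only: parse_properties is a hypothetical helper, and it handles just simple key=value lines and comment lines, not the full Java properties syntax.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines; lines starting with '#' or '!' are comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {raw!r}")
        settings[key.strip()] = value.strip()
    return settings


# The sample configuration from this manual.
sample = """\
databaseFileName=prot.fasta
spectrumFileName=spectra.msalign
activation=FILE
searchType=TARGET
cysteineProtection=C0
shiftNumber=2
errorTolerance=15
cutoffType=EVALUE
cutoff=0.01
doOneDaltonCorrection=false
doChargeCorrection=false
tableOutputFileName=result_table.txt
detailOutputFileName=result_detail.txt
"""
settings = parse_properties(sample)
```

All values are kept as strings; convert numeric ones such as errorTolerance or cutoff only where needed.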
    databaseFileName
        Specify the name of the protein database file.
    spectrumFileName
        Specify the name of the spectrum file.
    activation
        Specify the fragmentation type of tandem mass spectra.
    "activation" can be set as CID, HCD, ETD or FILE. When
    activation=FILE, MS-Align+ uses the fragmentation information provided in
    the spectrum file.
    searchType
        The parameter searchType can be either TARGET or TARGET+DECOY.
    When searchType=TARGET+DECOY, a concatenated target+decoy database will be
    generated and false discovery rates will be calculated. 
    cysteineProtection  
        The parameter cysteineProtection can be set as C0,
    C57 or C58.  C0: no modification, C57: carbamidomethylation,
    C58: carboxymethylation.
    shiftNumber  
        The maximum number of unexpected post-translational
    modifications in a protein-spectrum-match. The parameter can be set as 0, 1, or 2. 
    errorTolerance  
        The error tolerance for precursor and fragment
    masses in PPM. The default value is 15. 
    cutoffType  
        The type of the cutoff value for reporting
    identified protein-spectrum-matches. The parameter can be EVALUE or FDR.
    When cutoffType=FDR, searchType must be set as TARGET+DECOY. 
    cutoff  
        The cutoff value for reporting identified
    protein-spectrum-matches. If cutoffType=FDR, the value is a cutoff with 
    respect to FDRs. If cutoffType=EVALUE, the value is a cutoff with respect to
    E-values. 
    doOneDaltonCorrection  
        The parameter can be set as true or false. If 
    doOneDaltonCorrection=true, MS-Align+ will try to automatically correct 
    +/-1 Da errors introduced in the deconvolution of precursor ions. This 
    function has not been thoroughly tested. 
    doChargeCorrection  
        The parameter can be set as true or false. If 
    doChargeCorrection=true, MS-Align+ will try to automatically correct 
    errors in the precursor masses introduced by deconvolution tools which 
    may report an incorrect charge state of a precursor ion. 
    This function has not been thoroughly tested. 
    tableOutputFileName  
        The name of the output file in a tab-delimited
    format. 
    detailOutputFileName  
        The name of the output file in XML format. 
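The allowed values listed above, and the rule that cutoffType=FDR requires searchType=TARGET+DECOY, can be checked before a search is started. The following is a sketch under assumptions: check_settings and ALLOWED are hypothetical names, the settings are assumed to be already loaded into a dictionary of strings, and a missing parameter is reported as a problem even though MS-Align+ itself may behave differently.

```python
# Allowed values as described in this manual.
ALLOWED = {
    "activation": {"CID", "HCD", "ETD", "FILE"},
    "searchType": {"TARGET", "TARGET+DECOY"},
    "cysteineProtection": {"C0", "C57", "C58"},
    "shiftNumber": {"0", "1", "2"},
    "cutoffType": {"EVALUE", "FDR"},
    "doOneDaltonCorrection": {"true", "false"},
    "doChargeCorrection": {"true", "false"},
}


def check_settings(settings: dict) -> list:
    """Return a list of human-readable problems; the list is empty if none were found."""
    problems = []
    for key, allowed in ALLOWED.items():
        value = settings.get(key)  # None if the parameter is missing
        if value not in allowed:
            problems.append(f"{key}={value!r} is not one of {sorted(allowed)}")
    # Rule from the cutoffType description: FDR cutoffs need a target+decoy search.
    if settings.get("cutoffType") == "FDR" and settings.get("searchType") != "TARGET+DECOY":
        problems.append("cutoffType=FDR requires searchType=TARGET+DECOY")
    return problems


good = {
    "activation": "FILE",
    "searchType": "TARGET",
    "cysteineProtection": "C0",
    "shiftNumber": "2",
    "cutoffType": "EVALUE",
    "doOneDaltonCorrection": "false",
    "doChargeCorrection": "false",
}
bad = dict(good, cutoffType="FDR")  # FDR cutoff without a decoy search
good_problems = check_settings(good)
bad_problems = check_settings(bad)
```

Returning a list instead of raising on the first error lets a user fix every problem in input.properties in one pass.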
To run MS-Align+, use the following commands.
Windows:
cd msalign+
java -Xmx12G -classpath jar\*; edu.ucsd.msalign.align.console.MsAlignPipeline .\
Linux:
cd msalign+
java -Xmx12G -classpath jar/*: edu.ucsd.msalign.align.console.MsAlignPipeline ./
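When MS-Align+ is launched from a script, for example to process several data sets in sequence, the two platform-specific commands above can be assembled programmatically. This is a minimal sketch: build_command and run_msalign are hypothetical helper names, and the arguments are copied from the commands in this manual.

```python
import os
import subprocess


def build_command(windows: bool, memory: str = "12G") -> list:
    """Return the java argument list from the manual for the chosen platform."""
    classpath = "jar\\*;" if windows else "jar/*:"
    work_dir = ".\\" if windows else "./"
    return [
        "java",
        f"-Xmx{memory}",
        "-classpath",
        classpath,
        "edu.ucsd.msalign.align.console.MsAlignPipeline",
        work_dir,
    ]


def run_msalign(install_dir: str) -> None:
    """Run the pipeline from INSTALL_DIR/msalign+ (defined here, not called)."""
    subprocess.run(
        build_command(windows=(os.name == "nt")),
        cwd=os.path.join(install_dir, "msalign+"),
        check=True,  # raise CalledProcessError if java exits with an error
    )


linux_cmd = build_command(windows=False)
windows_cmd = build_command(windows=True)
```

Because the arguments are passed as a list, no shell is involved, so this relies on the Java launcher expanding the classpath wildcard itself.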
The output of MS-Align+ comes in three formats: a tab-delimited text file, an XML
file, and HTML files. The tab-delimited text file and the XML file, whose names
are specified in the configuration file, are in the
directory msoutput. The files in the directory html, including proteins.html,
provide web pages for displaying identified protein-spectrum-matches. 
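The tab-delimited text file can be loaded with a few lines of script for further filtering or sorting. The sketch below assumes that the first row of the file is a header. The column names Protein_name and E-value and the sample rows are hypothetical placeholders, so check the header of your own result file before relying on them.

```python
import csv
import io


def read_table(handle) -> list:
    """Read a tab-delimited table whose first row is assumed to be a header."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Hypothetical sample; real column names and values may differ.
sample = io.StringIO(
    "Protein_name\tE-value\n"
    "protein_1\t1e-12\n"
    "protein_2\t0.004\n"
)
rows = read_table(sample)
# Pick the match with the smallest E-value.
best = min(rows, key=lambda row: float(row["E-value"]))
```

For a real run, open the file named by tableOutputFileName in the msoutput directory and pass the handle to read_table.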
 
4. Citation
If you use MS-Align+ in your research, please include Liu et al., 2012 in your reference list.

5. Feedback and bug reports
Your comments, bug reports, and suggestions are very welcome. They will help us to further improve MS-Align+.

If you have any trouble running MS-Align+, please email us at liuxiaowencs@gmail.com or post your questions at the Google group of MS-Align+.

6. New software tool
TopPIC (TOP-Down Mass Spectrometry Based Proteoform Identification and
Characterization) is a new software tool for identification and
characterization of proteoforms at the whole proteome level by top-down tandem
mass spectra using database search.  It uses several techniques, such as indexes,
spectral alignment, and a generation function method, to increase the speed,
sensitivity, and accuracy. It also provides a web browser based user interface.
You can download TopPIC here.