Molecular mass of proteins and peptides

Is there a package that can compute the molecular mass based on an amino acid sequence?
MolecularGraph has the machinery, but that appears to be focused on small molecules.
I can’t find a method for this in BioSequences.

Probably not what you want, but from the PDB file you can do this:

julia> using PDBTools

julia> pdb = wget("1LBD")
   Array{Atoms,1} with 1870 atoms with fields:
   index name resname chain   resnum  residue        x        y        z  beta occup model segname index_pdb
       1    N     SER     A      225        1   45.228   84.358   70.638 67.05  1.00     1       -         1
       2   CA     SER     A      225        1   46.080   83.165   70.327 68.73  1.00     1       -         2
       3    C     SER     A      225        1   45.257   81.872   70.236 67.90  1.00     1       -         3
                                                       ⋮ 
    1868  OG1     THR     A      462      238  -27.462   74.325   48.885 79.98  1.00     1       -      1868
    1869  CG2     THR     A      462      238  -27.063   71.965   49.222 78.62  1.00     1       -      1869
    1870  OXT     THR     A      462      238  -25.379   71.816   51.613 84.35  1.00     1       -      1870


julia> mass(pdb)
24719.768399999957

julia> formula(pdb)
C₁₁₉₂N₃₂₀O₃₄₆S₁₂


Thanks for the link. I was not aware of PDBTools. I looked in BioStructures.jl, but did not find anything.
I love the formula :smile:
I don’t normally have a PDB-file, but now we have alphafold, so maybe :wink:

It’s not in BioStructures.jl - PRs welcome of course.

Well, that was really easy (not surprisingly) to implement in PDBTools. Now you can do (with version 0.12.12, which will be available at any moment):

julia> using PDBTools

julia> seq = "ATVR"
"ATVR"

julia> mass(Sequence(seq))
427.4986

julia> seq = ["ARG", "THR", "GLU"]
3-element Vector{String}:
 "ARG"
 "THR"
 "GLU"

julia> mass(Sequence(seq))
386.4036

It does not handle the termini in any way: the result is just the sum of the average (isotopically speaking) masses of the amino acid residues in the sequence.
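If the mass of the free peptide is needed, the usual correction is to add one water (an H on the N-terminus and an OH on the C-terminus) to the sum of the residue masses. Below is a minimal, self-contained sketch of that idea. It does not use PDBTools: the names `RESIDUE_MASS`, `WATER_MASS`, `residue_sum` and `peptide_mass` are made up for illustration, and the table holds commonly tabulated average residue masses, so the numbers may differ in the last digits from what `mass(Sequence(seq))` returns.

```julia
# Average (isotope-weighted) residue masses in Da, i.e. amino acid minus one water.
# Commonly tabulated values; an assumption of this sketch, not taken from PDBTools.
const RESIDUE_MASS = Dict(
    'G' => 57.0519,  'A' => 71.0788,  'S' => 87.0782,  'P' => 97.1167,
    'V' => 99.1326,  'T' => 101.1051, 'C' => 103.1388, 'L' => 113.1594,
    'I' => 113.1594, 'N' => 114.1038, 'D' => 115.0886, 'Q' => 128.1307,
    'K' => 128.1741, 'E' => 129.1155, 'M' => 131.1926, 'H' => 137.1411,
    'F' => 147.1766, 'R' => 156.1875, 'Y' => 163.1760, 'W' => 186.2132,
)

# Average mass of H2O in Da, accounting for the terminal H and OH.
const WATER_MASS = 18.01528

# Hypothetical helper: sum of residue masses, termini ignored.
# Throws a KeyError for any letter that is not one of the 20 standard residues.
residue_sum(seq::AbstractString) = sum(RESIDUE_MASS[aa] for aa in uppercase(seq))

# Hypothetical helper: mass of the free linear peptide (residues plus one water).
peptide_mass(seq::AbstractString) = residue_sum(seq) + WATER_MASS

println(residue_sum("ATVR"))   # residues only
println(peptide_mass("ATVR"))  # residues plus terminal H and OH
```

With PDBTools the same correction would presumably amount to adding the mass of water to the result of `mass(Sequence(seq))`. Modified termini, disulfide bonds, and non-standard residues are not covered by this sketch.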

Thank you so much, @lmiq. This is exactly what I need!

How I just love this community!