protr: R package for generating various numerical representation schemes of protein sequences
Introduction
An example predictive modeling workflow
Package overview
Commonly used descriptors
  Amino acid composition descriptor
  Dipeptide composition descriptor
  Tripeptide composition descriptor
  Autocorrelation descriptors
    Normalized Moreau-Broto autocorrelation descriptors
    Moran autocorrelation descriptors
    Geary autocorrelation descriptors
  Composition/Transition/Distribution
    Composition
    Transition
    Distribution
  Conjoint triad descriptors
  Quasi-sequence-order descriptors
    Sequence-order-coupling number
  Pseudo-amino acid composition (PseAAC)
  Amphiphilic pseudo-amino acid composition (APseAAC)
  Profile-based descriptors
Proteochemometric modeling descriptors
Similarity calculation by sequence alignment
  Scaling up for a large number of sequences
Similarity calculation by GO semantic similarity measures
Miscellaneous tools
  Retrieve protein sequences from UniProt
  Read FASTA files
  Read PDB files
  Sanity check for amino acid types
  Remove gaps from sequences
  Protein sequence partition
  Auto cross covariance computation
  Pre-computed 2D and 3D descriptor sets for the 20 amino acids
  BLOSUM and PAM matrices for the 20 amino acids
  Metadata of the 20 amino acids
ProtrWeb
Summary
References