citation("protr") now gives better output with the BibTeX citation key.
This is improved by adding the key argument to the bibentry() call in
inst/CITATION (#49).

Updated vignette("protr") to fix typos and grammar issues.
Updated images to use knitr::include_graphics() chunks, resolving
pkgdown 2.1.0 accessibility hints for missing alt text (#50).

crossSetSim()
now gains two new arguments, batches and verbose.
The batches argument allows users to split the similarity computations
into multiple batches, which is useful when dealing with
a large number of sequences and limited RAM.
The verbose argument enables progress updates during the computation.
This brings crossSetSim() to feature parity with parSeqSim()
(thanks, @ofleitas, #41).
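The batching idea described above can be illustrated with a minimal base R sketch. This is not protr's implementation: toy_sim() and batched_cross_sim() are hypothetical names, and the score is a simple position-matching fraction standing in for an alignment-based similarity. The sketch only shows how the cross-set index pairs can be partitioned and processed one batch at a time, with optional progress messages.

```r
# Hypothetical stand-in for an alignment-based score (NOT what protr computes):
# fraction of matching characters over the length of the shorter sequence.
toy_sim <- function(a, b) {
  n <- min(nchar(a), nchar(b))
  x <- strsplit(substr(a, 1, n), "")[[1]]
  y <- strsplit(substr(b, 1, n), "")[[1]]
  mean(x == y)
}

# Hypothetical helper: compute a set1-by-set2 score matrix in `batches` chunks.
batched_cross_sim <- function(set1, set2, batches = 1, verbose = FALSE) {
  # Enumerate every (i, j) index pair between the two sets
  pairs <- expand.grid(i = seq_along(set1), j = seq_along(set2))
  n <- nrow(pairs)
  # Assign each pair to one of `batches` roughly equal-sized groups
  batch_id <- ceiling(seq_len(n) * batches / n)
  idx <- split(seq_len(n), batch_id)

  sim <- matrix(NA_real_, nrow = length(set1), ncol = length(set2))
  for (k in seq_along(idx)) {
    if (verbose) message("Batch ", k, " of ", length(idx))
    p <- pairs[idx[[k]], ]
    scores <- mapply(function(i, j) toy_sim(set1[[i]], set2[[j]]), p$i, p$j)
    # Write this batch's scores into the matching matrix cells
    sim[cbind(p$i, p$j)] <- scores
  }
  sim
}

s1 <- c("MKTAY", "MKTAA")
s2 <- c("MKTAY", "AAAAA", "MKT")
res <- batched_cross_sim(s1, s2, batches = 3, verbose = TRUE)
```

The result does not depend on the number of batches; only the amount of work materialized at one time changes. How much memory that saves in practice depends on what each batch holds, which this toy example does not measure.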
A new function crossSetSimDisk() has been implemented as a disk-based
version of crossSetSim().
This function follows a similar approach to parSeqSimDisk(),
where partial results from each batch are cached on the hard drive and
merged at the end. This allows for processing larger protein sequence
sets that may not fit into RAM (#41).
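The cache-then-merge pattern described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the code behind crossSetSimDisk(): score_pair() and disk_cached_scores() are hypothetical names, the score is a trivial placeholder, and the on-disk format (one RDS file per batch in a temporary directory) is chosen only for the example.

```r
# Hypothetical per-pair score standing in for an alignment-based similarity:
# 1 if the two sequences have the same length, 0 otherwise.
score_pair <- function(a, b) as.numeric(nchar(a) == nchar(b))

# Hypothetical helper: score all pairs batch by batch, caching each batch
# on disk, then merge the cached pieces into one matrix at the end.
disk_cached_scores <- function(set1, set2, batches = 2,
                               path = tempfile("batch_")) {
  dir.create(path)
  pairs <- expand.grid(i = seq_along(set1), j = seq_along(set2))
  n <- nrow(pairs)
  batch_id <- ceiling(seq_len(n) * batches / n)

  files <- character(0)
  for (k in unique(batch_id)) {
    p <- pairs[batch_id == k, ]
    p$score <- mapply(function(i, j) score_pair(set1[[i]], set2[[j]]),
                      p$i, p$j)
    f <- file.path(path, sprintf("batch_%03d.rds", k))
    saveRDS(p, f)  # cache this batch's partial result on disk
    files <- c(files, f)
    rm(p)          # drop the in-memory copy before the next batch
  }

  # Merge step: read each cached piece back and fill the result matrix
  sim <- matrix(NA_real_, nrow = length(set1), ncol = length(set2))
  for (f in files) {
    p <- readRDS(f)
    sim[cbind(p$i, p$j)] <- p$score
  }
  unlink(path, recursive = TRUE)  # clean up the temporary cache
  sim
}

m <- disk_cached_scores(c("AA", "AAA"), c("CC", "C", "CCC"), batches = 3)
```

In this toy version the final matrix is still held in memory; the point is only that intermediate batch results live on disk rather than accumulating in RAM while the computation runs.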
crossSetSim() is added for calculating pairwise similarity between two sets
of protein sequences based on sequence alignment (thanks, @seb-mueller, #34).

extractProtFP() and extractProtFPGap() when index = NULL
(thanks, @fcampelo, #30).

system.file() usage to avoid confusion (thanks, @jonalv, #31).

Added a new argument batches to parSeqSim().
The new argument supports breaking down the pairwise similarity computation
into smaller batches. This is useful when you have a large number of
protein sequences and enough CPU cores, but not enough RAM to
compute and hold all the pairwise similarities in a single batch.
Also, use the other new argument verbose to track the computation progress.

Added a new function parSeqSimDisk().
Compared to the in-memory version parSeqSim(), this new function caches
the partial results in each batch to the hard drive and merges the results
together at the end. This could further reduce the memory usage for parallel
similarity computations involving a large number of protein sequences.

parGOSim() that will create minor numerical
inconsistencies in results due to argument matching.

Updated twoGOSim()
and parGOSim() to use the latest GOSemSim API
for computing GO-based semantic similarity.
Issues in the code examples are also fixed.
We thank Denisa Duma for the feedback.

getUniProt().

Added gap.opening and gap.extension to parSeqSim(),
allowing more flexible tuning of the sequence alignment for more types of
amino acid sequence data. We thank Dr. Maisa Pinheiro for the feedback.

Added removeGaps()
for removing/replacing gaps (-) or
any irregular characters from protein sequences, to make them suitable
for feature extraction or sequence-alignment-based similarity computation.
We thank Dr. Maisa Pinheiro for the feedback.

ifelse conditioning (nanxstats/protr@3f6e106) for the
distribution descriptor in CTD. We thank Jielu Yan from
the University of Macau for kindly reporting this issue.

curl -I -L:
2015-12-28
extractCTDD().
2014-12-18
2014-09-20
2014-06-20
LICENSE file according to CRAN policies.
readFASTA().
getUniProt().
protcheck().
protseg().