basecontent {Biostrings} | R Documentation |
Description

WARNING: Both basecontent and countbases have been deprecated in favor of alphabetFrequency.

These functions accept a character vector representing the nucleotide sequences and compute the frequencies of each base (A, C, G, T).
Usage

basecontent(seq)
countbases(seq, dna = TRUE)
Arguments

seq | Character vector.
dna | Logical value indicating whether the sequence is DNA (TRUE) or RNA (FALSE).
Details

The base frequencies are calculated separately for each element of seq. The elements of seq can be in upper case, lower case, or mixed case.
Value

A matrix with 4 columns and length(seq) rows. The columns are named A, C, T, G, and the values in each column are the counts of the corresponding base in each element of seq. When dna=FALSE, the T column is replaced with a U column.
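As a rough illustration of the return shape described here, the following base R sketch counts bases per element, ignoring case. The helper name count_bases_sketch is hypothetical and is not part of Biostrings. It only mimics the documented behavior under the assumption that the columns are ordered A, C, T (or U), G, and it is not a substitute for alphabetFrequency.

```r
# Hypothetical helper (NOT part of Biostrings): mimics the documented
# return shape of countbases() using base R only.
count_bases_sketch <- function(seq, dna = TRUE) {
  # Assumed column order, taken from the documented column names
  bases <- if (dna) c("A", "C", "T", "G") else c("A", "C", "U", "G")
  seq <- toupper(seq)  # accept upper, lower, or mixed case input

  # vapply() yields one column per sequence; transpose to one row per sequence
  counts <- t(vapply(
    strsplit(seq, "", fixed = TRUE),
    function(chars) vapply(bases, function(b) sum(chars == b), integer(1)),
    integer(length(bases))
  ))
  colnames(counts) <- bases
  counts
}

m <- count_bases_sketch(c("AAACT", "GGGTT", "ggAtT"))
m
```

Each row of m corresponds to one input sequence, so "ggAtT" is counted as if it were "GGATT". Characters other than the four listed bases are simply not counted in this sketch.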
Author(s)

R. Gentleman, W. Huber, S. Falcon
See Also

alphabetFrequency, reverseComplement
Examples

library(Biostrings)

v <- c("AAACT", "GGGTT", "ggAtT")

## Do not use these functions anymore:
if (interactive()) {
  basecontent(v)
  countbases(v)
}

## But use more efficient alphabetFrequency() instead:
v <- DNAStringSet(v)
alphabetFrequency(v, baseOnly=TRUE)

## Comparing efficiencies:
if (interactive()) {
  library(hgu95av2probe)
  system.time(y1 <- countbases(hgu95av2probe$sequence))
  x <- DNAStringSet(hgu95av2probe)
  system.time(y2 <- alphabetFrequency(x, baseOnly=TRUE))
}