# HG changeset patch
# User Oleksandr Gavenko
# Date 1456661407 -7200
# Node ID eb1e2557ebf76a32a35b2fb51e696134a8598c56
# Parent  417ffd620c128a6fa36c45edaefc3442f196e8d5
Looping over data, Exploring data

diff -r 417ffd620c12 -r eb1e2557ebf7 r.rst
--- a/r.rst	Sun Feb 28 01:00:06 2016 +0200
+++ b/r.rst	Sun Feb 28 14:10:07 2016 +0200
@@ -19,17 +19,24 @@
 Brief info about any object::
 
   typeof(str)
+  class(str)
+  unclass(str)
   str(c(1, 2))
   str(summary)
 
-Brief info about vectors and matrixes::
+Column names of datasets::
+
+  names(...)
+  names(list(colA=1, colB=2))
+
+Column/row names of matrixes::
 
-  summary(1:8)
-  summary(matrix(1:20, 4, 5))
+  colnames(matrix(...))
+  rownames(matrix(...))
 
-Brief info on datasets and matrixes::
+List objects in global context: ``ls()``.
 
-  names(list(colA=1, colB=2))
+Object size in memory: ``object.size(1:2)``
 
 Debugging
 =========
@@ -82,6 +89,9 @@
   rbinom(n, size, prob)
   rpois(n, lambda)
   runif(n, min = 0, max = 1)
+  rexp
+  rchisq
+  rgamma
 
 In order to generate predictable sequences use::
 
@@ -94,3 +104,47 @@
 
   sample(1:10, 10) ## permutation!!
   sample(1:10, 100, replace=TRUE)
+
+Looping over data
+=================
+
+``lapply`` iterates over data and returns a list of function application results::
+
+  lapply(1:5, function(x) x^2)
+  lapply(matrix(rnorm(20*10),20,10), mean)
+
+Exploring data
+==============
+
+Check `Inspecting objects`_ section.
+
+Investigating unique values::
+
+  sapply(data, unique)
+  sapply(data$col, unique)
+  sapply(data[,c("col1","col2")], unique)
+  sapply(data[,5:10], unique)
+
+  table(data$col)
+
+  tapply(data$what, data$by, unique)
+  tapply(data$what, data$by, summary)
+  tapply(data$what, data$by, range)
+  tapply(data$what, data$by, mean)
+  tapply(data$what, data$by, sd)
+
+Brief info about vectors and matrixes::
+
+  summary(1:8)
+  summary(matrix(1:20, 4, 5))
+
+Simple plots::
+
+  i<-1:100
+  x<-i/10
+  y<-x^2
+  plot(x,y)
+
+  hist(rpois(100,10))
+  hist(rpois(100,10),breaks=20)
+