.. -*- coding: utf-8 -*-

===
 R
===

.. contents::
   :local:

Inspecting objects
==================

Info about object dimensions::

  length(c(1,2,3))
  dim(matrix(1:6, 2, 3))
  ncol(matrix(1:6, 2, 3))
  nrow(matrix(1:6, 2, 3))

Brief info about any object::

  typeof(str)
  class(str)
  unclass(str)
  str(c(1, 2))
  str(summary)

Column names of datasets::

  names(...)
  names(list(colA=1, colB=2))

Column/row names of matrices::

  colnames(matrix(...))
  rownames(matrix(...))

List objects in the global environment: ``ls()``.

Object size in memory: ``object.size(1:2)``

Interactive session
===================

Controlling output precision::

  options(digits=3)

List of all options::

  str(options())

Debugging
=========

To flag a function for debugging, call::

  debug(fun, text = "", condition = NULL)
  debugonce(fun, text = "", condition = NULL)

To return a function to normal execution::

  undebug(fun)
  isdebugged(fun)    ## check whether the function is still flagged

You can enter debug mode at any point in the code by calling ``browser()``.

``traceback`` prints out the function call stack after an error occurs; it does
nothing if there was no error.

``trace`` allows you to insert debugging code into a function at specific places.

``recover`` allows you to modify the error behavior so that you can browse the
function call stack.

Profiling
=========

How long an expression takes to execute (coarse, second/millisecond resolution)::

  system.time(expr, gcFirst = TRUE)
  unix.time(expr, gcFirst = TRUE)    ## older alias of system.time; deprecated in recent R

The ``Rprof`` function enables global profiling.
The ``summaryRprof`` function summarizes the collected profiling data::

  Rprof()                 ## start profiling
  Rprof(NULL)             ## suspend profiling
  Rprof(append = TRUE)    ## resume profiling
  Rprof(NULL)             ## end profiling
  summaryRprof()          ## investigate profiling report

Generating random numbers
=========================

For each distribution there is a corresponding generation function, named
with the prefix ``r``::

  rnorm(n, mean = 0, sd = 1)
  rt(n, df, ncp)
  rbinom(n, size, prob)
  rpois(n, lambda)
  runif(n, min = 0, max = 1)
  rexp
  rchisq
  rgamma

To generate reproducible sequences, use::

  set.seed(seed, kind = NULL, normal.kind = NULL)

Sampling from a vector::

  sample(x, size, replace = FALSE, prob = NULL)
  sample.int(n, size = n, replace = FALSE, prob = NULL)
  sample(1:10, 10)                   ## a random permutation of 1:10
  sample(1:10, 100, replace=TRUE)

Looping over data
=================

``lapply`` iterates over data and returns a list with the results of applying
the function to each element::

  lapply(1:5, function(x) x^2)
  ## a matrix is treated as a plain vector here: one result per element
  lapply(matrix(rnorm(20*10),20,10), mean)

Usually you need a vector rather than a list.
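
A minimal sketch of flattening the list returned by ``lapply`` into a plain
vector with ``unlist`` (the variable names ``squares`` and ``flat`` are
illustrative only)::

  squares <- lapply(1:5, function(x) x^2)    ## list with 5 elements
  flat <- unlist(squares)                    ## numeric vector: 1 4 9 16 25
  is.list(squares)                           ## TRUE
  is.vector(flat, mode = "numeric")          ## TRUE
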
``sapply`` works like ``lapply`` but also tries to convert the result to a
matrix or vector if the dimensions and element types permit this::

  lapply(list(1:5), mean)
  [[1]]
  [1] 3

  sapply(list(1:5), mean)
  [1] 3

``apply`` works along a specific dimension of the data, so it is useful for
matrices and data frames::

  apply(matrix(1:6, 2, 3), 1, min)
  [1] 1 2
  apply(matrix(1:6, 2, 3), 2, max)
  [1] 2 4 6
  apply(array(rnorm(2*2*10), c(2, 2, 10)), c(1, 2), mean)
             [,1]       [,2]
  [1,] -0.2733804  0.3154234
  [2,]  0.1830982 -0.5889010

``colSums``, ``rowSums``, ``colMeans`` and ``rowMeans`` are optimized
equivalents of::

  rowSums(x)     ## same as apply(x, 1, sum)
  colSums(x)     ## same as apply(x, 2, sum)
  rowMeans(x)    ## same as apply(x, 1, mean)
  colMeans(x)    ## same as apply(x, 2, mean)

``split`` partitions data by a factor (analogous to SQL ``GROUP BY``)::

  data <- data.frame(rnorm(10), rbinom(10, 1, prob = .7))
  sdata <- split(data[, 1], data[, 2])
  lapply(sdata, mean)

Exploring data
==============

See the `Inspecting objects`_ section.

Investigating unique values::

  sapply(data, unique)
  sapply(data$col, unique)
  sapply(data[,c("col1","col2")], unique)
  sapply(data[,5:10], unique)

  table(data$col)

  tapply(data$what, data$by, unique)
  tapply(data$what, data$by, summary)
  tapply(data$what, data$by, range)
  tapply(data$what, data$by, mean)
  tapply(data$what, data$by, sd)

Brief info about vectors and matrices::

  summary(1:8)
  summary(matrix(1:20, 4, 5))

Simple plots::

  i <- 1:100
  x <- i/10
  y <- x^2
  plot(x, y)

  hist(rpois(100, 10))
  hist(rpois(100, 10), breaks = 20)

Renaming columns
================

::

  names(d)[names(d)=="beta"] <- "two"
  names(d)[2] <- "two"

  library(plyr)
  newd <- rename(d, c("beta"="two", "gamma"="three"))

Removing names of rows and columns
==================================

::

  rownames(dt) <- NULL
  colnames(dt) <- NULL

Filtering rows and columns
==========================

::

  TODO

Dropping rows and columns
=========================

Drop columns from a data frame by number::

  dfnew <- df[-1]               # first
  dfnew <- df[-ncol(df)]        # last
  dfnew <- df[-c(1, 3:4, 7)]    # several at once

Drop columns from a data frame by name::

  newdf <- df[ , !(names(df) %in% c("lat", "long"))]

  df <- data.frame( a = 1:10, b = 2:11, c = 3:12 )
  kept <- subset(df, select = c(a,c))        # keep only columns a and c
  dropped <- subset(df, select = -c(a,c))    # drop columns a and c
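
The by-name approaches above can be tried on a small data frame. This is a
minimal sketch; the data frame ``df`` and the names ``one_col`` and
``by_subset`` are made up for illustration. It also shows that the
``df[ , mask]`` form collapses to a plain vector when only one column is left,
unless ``drop = FALSE`` is given::

  df <- data.frame(a = 1:3, b = 4:6, c = 7:9)

  ## logical mask over column names; drop = FALSE keeps a data frame
  ## even though only column c remains
  one_col <- df[ , !(names(df) %in% c("a", "b")), drop = FALSE]
  is.data.frame(one_col)                                 ## TRUE

  ## without drop = FALSE the same selection yields a plain vector
  is.data.frame(df[ , !(names(df) %in% c("a", "b"))])    ## FALSE

  ## subset() with a negated selection always returns a data frame
  by_subset <- subset(df, select = -c(a, b))
  names(by_subset)                                       ## "c"
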