r.rst
author Oleksandr Gavenko <gavenkoa@gmail.com>
Sun, 28 Feb 2016 14:10:07 +0200

.. -*- coding: utf-8 -*-

===
 R
===
.. contents::
   :local:

Inspecting objects
==================

Info about object dimensions::

  length(c(1,2,3))
  dim(matrix(1:6, 2, 3))
  ncol(matrix(1:6, 2, 3))
  nrow(matrix(1:6, 2, 3))

Brief info about any object::

  typeof(str)
  class(str)
  unclass(str)
  str(c(1, 2))
  str(summary)

Column names of datasets::

  names(...)
  names(list(colA=1, colB=2))

Column/row names of matrices::

  colnames(matrix(...))
  rownames(matrix(...))

List objects in the global environment: ``ls()``.

Object size in memory: ``object.size(1:2)``
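
As a sketch of how these calls fit together, the snippet below inspects a small
data frame. The data frame ``df`` and its column names are made up for
illustration::

  ## hypothetical data frame, used only for illustration
  df <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))

  class(df)        ## "data.frame"
  typeof(df)       ## "list": a data frame is stored as a list of columns
  dim(df)          ## 3 2
  names(df)        ## "id" "score"
  str(df)          ## compact per-column overview
  object.size(df)  ## size in bytes, platform dependent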

Debugging
=========

To mark a function for debugging, call::

  debug(fun, text = "", condition = NULL)
  debugonce(fun, text = "", condition = NULL)

To return a function to normal execution, and to check whether it is still
flagged::

  undebug(fun)
  isdebugged(fun)   ## TRUE while fun is flagged by debug()

You can enter debug mode at any point in the code by calling ``browser()``.

``traceback`` prints out the function call stack after an error occurs; does
nothing if there's no error.

``trace`` allows you to insert debugging code into a function at specific places.

``recover`` allows you to modify the error behavior so that you can browse the
function call stack.
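
A minimal sketch of flagging and unflagging a function, assuming a hypothetical
function ``f`` that rejects non-numeric input. The interactive calls are left
as comments so that the snippet also runs in a non-interactive session::

  ## hypothetical function, used only for illustration
  f <- function(x) {
    if (!is.numeric(x)) stop("x must be numeric")
    sqrt(x)
  }

  debug(f)         ## flag f: the next call would open the browser
  isdebugged(f)    ## TRUE
  undebug(f)       ## back to normal execution
  isdebugged(f)    ## FALSE

  ## capture the error message instead of aborting the script
  msg <- tryCatch(f("a"), error = function(e) conditionMessage(e))
  msg              ## "x must be numeric"

  ## in an interactive session:
  ## traceback()                ## call stack of the last error
  ## options(error = recover)   ## browse the call stack on every error
  ## options(error = NULL)      ## restore the default behaviour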

Profiling
=========

How long an expression takes to evaluate (coarse resolution, seconds to
milliseconds)::

  system.time(expr, gcFirst = TRUE)
  unix.time(expr, gcFirst = TRUE)   ## older alias, deprecated in recent R versions

The ``Rprof`` function enables global profiling. The ``summaryRprof`` function
summarizes the collected profiling data::

  Rprof()       ## start profiling
  Rprof(NULL)   ## suspend profiling
  Rprof(append = TRUE)  ## resume profiling
  Rprof(NULL)   ## end profiling
  summaryRprof() ## investigate profiling report
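
A sketch of both approaches on a made-up workload. The function ``slow_sum``
and the temporary file are assumptions for this example; ``summaryRprof`` is
wrapped in ``tryCatch`` because it signals an error when the run was too short
to collect any samples::

  ## hypothetical workload, used only for illustration
  slow_sum <- function(n) {
    total <- 0
    for (i in seq_len(n)) total <- total + sqrt(i)
    total
  }

  timing <- system.time(result <- slow_sum(1e6))
  timing[["elapsed"]]                      ## wall-clock seconds

  prof_file <- tempfile(fileext = ".out")  ## samples are written here
  Rprof(prof_file)                         ## start profiling
  result <- slow_sum(1e7)
  Rprof(NULL)                              ## stop profiling

  ## NULL if no samples were collected
  report <- tryCatch(summaryRprof(prof_file), error = function(e) NULL)
  if (!is.null(report)) head(report$by.self)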

Generating random numbers
=========================

For each distribution there is a corresponding generator function, named with
the prefix ``r``::

  rnorm(n, mean = 0, sd = 1)
  rt(n, df, ncp)
  rbinom(n, size, prob)
  rpois(n, lambda)
  runif(n, min = 0, max = 1)
  rexp
  rchisq
  rgamma

To generate reproducible sequences, set the seed first::

  set.seed(seed, kind = NULL, normal.kind = NULL)
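
For instance, resetting the seed replays the same draws. The seed value 42 is
an arbitrary choice for this sketch::

  set.seed(42)      ## arbitrary seed
  a <- rnorm(3)

  set.seed(42)      ## same seed again
  b <- rnorm(3)

  identical(a, b)   ## TRUE: both draws are the same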

Sampling from a vector::

  sample(x, size, replace = FALSE, prob = NULL)
  sample.int(n, size = n, replace = FALSE, prob = NULL)

  sample(1:10, 10)  ## permutation!!
  sample(1:10, 100, replace=TRUE)
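
A short sketch of the ``prob`` argument, which the calls above leave at its
default. The labels and weights are made up for illustration::

  perm <- sample(1:10)          ## size defaults to length(x): a permutation
  sort(perm)                    ## 1 2 ... 10, every value appears exactly once

  ## weighted draw with replacement
  coin <- sample(c("H", "T"), 1000, replace = TRUE, prob = c(0.7, 0.3))
  table(coin) / length(coin)    ## proportions, roughly 0.7 and 0.3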

Looping over data
=================

``lapply`` iterates over the elements of its argument and returns a list with
the result of applying the function to each element::

  lapply(1:5, function(x) x^2)
  lapply(matrix(rnorm(20*10),20,10), mean)
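
Note that ``lapply`` treats a matrix as a plain vector of its elements, so the
second call above returns one mean per cell. A sketch of the difference, using
the base functions ``apply`` and ``sapply`` for per-column results; the matrix
``m`` is an illustrative assumption::

  m <- matrix(1:6, 2, 3)

  per_cell <- lapply(m, mean)
  length(per_cell)                       ## 6: one result per matrix element

  col_means <- apply(m, 2, mean)         ## 1.5 3.5 5.5: one mean per column
  squares <- sapply(1:5, function(x) x^2)
  squares                                ## 1 4 9 16 25 as a vector, not a list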

Exploring data
==============

See the `Inspecting objects`_ section.

Investigating unique values::

  sapply(data, unique)
  sapply(data$col, unique)
  sapply(data[,c("col1","col2")], unique)
  sapply(data[,5:10], unique)

  table(data$col)

  tapply(data$what, data$by, unique)
  tapply(data$what, data$by, summary)
  tapply(data$what, data$by, range)
  tapply(data$what, data$by, mean)
  tapply(data$what, data$by, sd)
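
A worked sketch on a tiny data frame, so the grouped results can be checked by
hand. The data frame and its values are made up; the column names ``what`` and
``by`` mirror the placeholders above::

  ## hypothetical data, used only for illustration
  data <- data.frame(by = c("a", "a", "b", "b", "b"),
                     what = c(1, 3, 2, 4, 6))

  counts <- table(data$by)                      ## a: 2, b: 3
  means <- tapply(data$what, data$by, mean)     ## a: 2, b: 4
  ranges <- tapply(data$what, data$by, range)   ## a: 1 3, b: 2 6
  n_unique <- sapply(data, function(col) length(unique(col)))
  n_unique                                      ## by: 2, what: 5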

Brief info about vectors and matrices::

  summary(1:8)
  summary(matrix(1:20, 4, 5))

Simple plots::

  i <- 1:100
  x <- i / 10
  y <- x^2
  plot(x, y)

  hist(rpois(100, 10))
  hist(rpois(100, 10), breaks = 20)