Plotting Data in R

Base vs. lattice graphics, with examples from the Week 3 lectures of “Computing for Data Analysis” by Dr. Roger Peng (Johns Hopkins/Coursera) and from R's built-in datasets:

https://www.coursera.org/course/compdata

Base Graphing

For documentation on the base graphics parameters, use this:

>?par
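
Besides the help page, par() can be used to look up and change parameters directly. The following is a minimal sketch (not from the lectures) of querying two parameters and of saving and restoring a setting; the margin values are arbitrary.

```r
# Query current graphics parameters by name
par("mar")    # margins in lines of text: bottom, left, top, right
par("mfrow")  # rows and columns of plots per page

# Setting a parameter returns the previous value, so it can be restored later
old <- par(mar = c(4, 4, 2, 1))
plot(1:10)
par(old)      # back to the previous margins
```

Saving the return value of par() this way keeps one experiment from changing the layout of every later plot in the session.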

Using the base graphics function plot(), add a title (main="") and axis labels (xlab="", ylab="") like this:

>plot(x,y,main="Eruptions of Old Faithful",xlab="eruption time (x)",ylab="waiting time (min)")

R_testgraph
[1] http://stat.ethz.ch/R-manual/R-devel/library/graphics/html/text.html
[2] http://www.dummies.com/how-to/content/how-to-add-titles-and-axis-labels-to-a-plot-in-r.html
[3] http://stackoverflow.com/questions/3453695/adding-text-to-a-plot
[4] http://stackoverflow.com/questions/12121060/adding-text-to-a-plot-axes-without-removing-existing-axes-labels-in-r
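
The plot() call above uses x and y without showing where they come from. A runnable sketch, assuming they are the two columns of R's built-in faithful data set (an assumption based on the plot title, not something the post states):

```r
# Assumption: x and y are the columns of the built-in faithful data set
x <- faithful$eruptions  # eruption duration in minutes
y <- faithful$waiting    # waiting time until the next eruption in minutes

plot(x, y,
     main = "Eruptions of Old Faithful",
     xlab = "Eruption time (min)",
     ylab = "Waiting time (min)")
```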

You can also use the separate functions title("") and text(x,y,""), where the x,y coordinates specify where on the graph the text is displayed.
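
As a rough sketch of that approach, again assuming the faithful data set, the plot can be drawn without annotations and then labelled in separate steps. The label position (2, 90) is an arbitrary point inside the plotted range.

```r
x <- faithful$eruptions
y <- faithful$waiting

# Draw the points only; ann = FALSE suppresses the default title and axis labels
plot(x, y, ann = FALSE)

# Add the title and axis labels afterwards
title(main = "Eruptions of Old Faithful",
      xlab = "Eruption time (min)", ylab = "Waiting time (min)")

# text() places a label at the given data coordinates
text(2, 90, paste("n =", length(x)))
```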


Rplot_scatterplotexample_ablinedefault

>abline(fit,lwd=3,col="blue")

Rplot_abline_lwd3_colblue
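
The abline() call above needs a fitted model object named fit, which the post does not define. A self-contained sketch, assuming fit is a simple linear regression on the faithful data, also shows that abline() accepts an explicit intercept and slope, or horizontal and vertical positions:

```r
x <- faithful$eruptions
y <- faithful$waiting

fit <- lm(y ~ x)   # assumed definition of fit: linear regression of y on x
b <- coef(fit)     # b[1] is the intercept, b[2] is the slope

plot(x, y)
abline(fit, lwd = 3, col = "blue")                # line taken from the fitted model
abline(a = b[1], b = b[2], lty = 2, col = "red")  # same line from intercept and slope
abline(h = mean(y), v = mean(x), col = "grey")    # reference lines at the two means
```

The dashed red line should sit exactly on top of the blue one, since both come from the same coefficients.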

To fit more than one plot in a single graphic, set the margins and the plot grid with par():

> par(mar=c(2,2,1,1))
> par(mfrow=c(2,2))
> plot(x,y)
> plot(x,z)
> plot(z,x)
> plot(y,x)

Rplot_mfrow2_2
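
The four plot() calls above rely on x, y and z already existing. A sketch that can be pasted in as-is, using simulated data (the variables and the seed are made up for illustration), and that restores the layout afterwards:

```r
set.seed(1)        # arbitrary seed, only for reproducibility
x <- rnorm(50)
y <- x + rnorm(50)
z <- rnorm(50)     # hypothetical third variable

# 2 x 2 grid filled row by row, with small margins around each plot
old <- par(mar = c(2, 2, 1, 1), mfrow = c(2, 2))
plot(x, y)
plot(x, z)
plot(z, x)
plot(y, x)
par(old)           # restore the previous margins and single-plot layout
```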

Subset the data, then use the points() function to add each subset to the plot in turn, distinguishing the subsets with different colors, symbols, etc.

> par(mfrow=c(1,2))
> x<-rnorm(100)
> y<-x+rnorm(100)
> g<-gl(2,50,labels=c("Male","Female"))
> str(g)
 Factor w/ 2 levels "Male","Female": 1 1 1 1 1 1 1 1 1 1 ...
> plot(x,y)
> plot(x,y,type="n")
> points(x[g=="Male"],y[g=="Male"],col="green")
> points(x[g=="Female"],y[g=="Female"],col="blue")

Rplot_points

Labels still need to be added, and the margins need to be adjusted with par(mar=c(a,b,c,d)) so that the plots are sized more appropriately, but you get the idea.
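
One way those finishing touches might look is sketched below. The margin values, title, and legend position are arbitrary choices, not taken from the lectures.

```r
set.seed(1)        # arbitrary seed, only for reproducibility
x <- rnorm(100)
y <- x + rnorm(100)
g <- gl(2, 50, labels = c("Male", "Female"))
cols <- c(Male = "green", Female = "blue")

old <- par(mar = c(4, 4, 2, 1))   # room for axis labels and a title
plot(x, y, type = "n", main = "y vs. x by group", xlab = "x", ylab = "y")

# Add one subset per factor level, each in its own color
for (lev in levels(g)) {
  points(x[g == lev], y[g == lev], col = cols[[lev]])
}

legend("topleft", legend = levels(g), col = unname(cols[levels(g)]), pch = 1)
par(old)
```

Looping over levels(g) avoids repeating the points() call by hand and keeps the legend in the same order as the colors.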

See an example of pch point options available in the base graphics plotting function plot():

>example(points)

pch_plotsymbols
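
For a quick reference chart without running the whole example, the symbol codes can be plotted directly. This is a small sketch; the layout values are arbitrary.

```r
# Draw plot symbols 0 through 25 in a row, each labelled with its pch code
codes <- 0:25
plot(codes, rep(1, length(codes)), pch = codes, cex = 1.5,
     ylim = c(0.5, 1.5), yaxt = "n", xlab = "pch value", ylab = "",
     bg = "grey")   # bg sets the fill color of symbols 21 to 25
text(codes, 0.8, labels = codes, cex = 0.7)
```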

Lattice Graphing

The main advantage of lattice is that you can specify conditioning variables after the | in the formula, which lets you visualize a relationship across a lattice of panels, one per level or interval of the conditioning variable. You will sometimes end up writing separate functions embedded within the function call, so that one long function call produces one graphic, whereas base graphics uses multiple single-line function calls to set up and edit the graphic. Lattice graphics documentation:

> package ? lattice


> library(lattice)
> library(nlme)
> xyplot(distance~age|Subject,data=Orthodont, type="b")

Rplot_latticeOrthodont_typeb

> data(environmental)
> ?environmental
> head(environmental)
  ozone radiation temperature wind
1    41       190          67  7.4
2    36       118          72  8.0
3    12       149          74 12.6
4    18       313          62 11.5
5    23       299          65  8.6
6    19        99          59 13.8
> xyplot(ozone~radiation,data=environmental)

Rplot_lattice_environmentalscatter

In the last xyplot() call, the data=environmental argument tells the function which data frame to look in for the variables ozone and radiation.
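
To make the role of data= concrete, the sketch below builds the same scatterplot twice, once with data= and once with the columns spelled out in the formula. It also stores the result and prints it explicitly, which is needed when the code runs from a script or inside a function rather than at the console.

```r
library(lattice)
data(environmental)

# Variables in the formula are looked up in the data frame given by data=
p1 <- xyplot(ozone ~ radiation, data = environmental)

# Equivalent call without data=, naming the columns in full
p2 <- xyplot(environmental$ozone ~ environmental$radiation)

# xyplot() returns a "trellis" object; printing it draws the plot
print(p1)
```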

> temp.cut<-equal.count(environmental$temperature,4)
> xyplot(ozone~radiation|temp.cut,data=environmental)

Rplot_lattice_environmental_temp.cut
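
equal.count() does not cut the variable into disjoint bins. It returns a lattice "shingle": a set of intervals, overlapping by default, chosen so that each holds roughly the same number of observations. A short sketch for inspecting what the four panels correspond to:

```r
library(lattice)
data(environmental)

temp.cut <- equal.count(environmental$temperature, 4)

temp.cut           # prints the intervals and the number of points in each
levels(temp.cut)   # the lower and upper temperature bound of each interval
```

Because the intervals overlap, an observation near a boundary can appear in two neighbouring panels.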

> xyplot(ozone~radiation|temp.cut,data=environmental, layout=c(1,4),as.table=TRUE)

Rplot_lattice_environmental_tempcut_layout1_4

> xyplot(ozone~radiation|temp.cut,data=environmental,as.table=TRUE, pch=20,
       panel=function(x,y,...){
             panel.xyplot(x,y,...)
             fit<-lm(y~x)
             panel.abline(fit,lwd=2)
       })

Rplot_lattice_environmental_tempcut_regressionlinefunction
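
The panel function is called once per panel, with x and y holding only the points that fall in that panel, so each panel gets its own regression line. As a sketch of an alternative to fitting lm() by hand, lattice's panel.lmline() fits and draws the least-squares line in one step:

```r
library(lattice)
data(environmental)
temp.cut <- equal.count(environmental$temperature, 4)

p <- xyplot(ozone ~ radiation | temp.cut, data = environmental,
            as.table = TRUE, pch = 20,
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)      # points for this panel only
              panel.lmline(x, y, lwd = 2)  # least-squares line for this panel
            })
print(p)   # explicit print so the plot is drawn when run from a script
```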

xyplot(ozone~radiation|temp.cut,data=environmental,as.table=TRUE, pch=20,
       panel=function(x,y,...){
             panel.xyplot(x,y,...)
             panel.loess(x,y)
       }, xlab="Solar Radiation",ylab="Ozone (ppb)",
       main="Ozone vs. Solar Radiation")

Rplot_lattice_environmental_tempcut_loessregression_axislabels

> wind.cut<-equal.count(environmental$wind,4)
> xyplot(ozone~radiation|temp.cut*wind.cut,data=environmental,as.table=TRUE, pch=20,
       panel=function(x,y,...){
             panel.xyplot(x,y,...)
             panel.loess(x,y)
       }, xlab="Solar Radiation",ylab="Ozone (ppb)",
       main="Ozone vs. Solar Radiation")

Rplot_lattice_environmental_tempcut_windcut_loessregression_axislabels

> splom(~environmental)

Rplot_splom

About Lisa Cohen

PhD student at UC Davis.