Wednesday, September 12, 2012

Graphics for Statistics - figures with ggplot - Chapter 3 - Bar Charts, Dot plot, add pic

chapter3


1 Chapter 3


Graphics out of the book Graphics for Statistics and Data Analysis with R by Kevin Keen (book home page)

  • here are the data
item<-c("Canada",
"Mexico",
"Saudi Arabia",
"Venezuela",
"Nigeria")

amount<-c(2460,1538,1394,1273,1120)
amount<-amount/1000

df <- data.frame(item=factor(1:5,labels=item),amount=amount)
# the barrel picture for figure 3.7 is read in section 1.3, once ReadImages is loaded

1.1 Figure 3.4 - simple bar chart


  • we use geom_bar to create the bar chart
  • customizing the y-axis by using scale_y_continuous: limits set the limits, expand defines the multiplicative and additive expansion constants
  • coord_flip rotates it (so we get a horizontal bar chart)
  • then we set the background to white
  • set the colour of the axis lines to black (we have to do this to axis.line not just axis.line.x because of inheritance)
  • get rid of the vertical axis
  • set colour of the ticks of the x-axis to black
  • get rid of the ticks of the y-axis
  • set the colour of the axis labels to black
  • left-align the labels of the vertical axis (hjust=0)
  • get rid of the grid lines (otherwise they are still visible if one looks carefully)
ggplot(df,aes(y=amount,x=reorder(item,-as.numeric(item)))) +
  geom_bar(stat="identity",fill="white",colour="black") +
  scale_y_continuous("Millions of Barrels per Day",limits=c(0,2.5),expand=c(0,0)) +
  xlab("") +
  coord_flip() +
  theme(panel.background=element_rect(fill="white"),
        axis.line=element_line(colour="black"),
        axis.line.y=element_blank(),
        axis.ticks.x=element_line(colour="black"),
        axis.ticks.y=element_blank(),
        axis.text=element_text(colour="black",size=11),
        axis.text.y=element_text(hjust=0),
        panel.grid=element_blank())
ggsave("fig3_4.png")
Saving 7 x 6.99 in image
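
To make the effect of expand a bit more concrete, here is a small base R sketch of the arithmetic as I understand it: the axis range is widened on each side by a multiple of its width plus an additive constant. The helper expand_range_sketch() is hypothetical and not part of ggplot2, and the 5% value is only my assumption about the usual default for continuous scales.

```r
# Hypothetical helper (not a ggplot2 function): widen a range on both sides
# by mult * width + add, following the description of the two expand constants.
expand_range_sketch <- function(limits, mult = 0, add = 0) {
  width <- diff(limits)
  c(limits[1] - mult * width - add,
    limits[2] + mult * width + add)
}

limits <- c(0, 2.5)

# a 5% multiplicative expansion leaves a gap between the bars and the axis
padded <- expand_range_sketch(limits, mult = 0.05, add = 0)
print(padded)

# expand=c(0,0): the drawn axis ends exactly at the limits
tight <- expand_range_sketch(limits, mult = 0, add = 0)
print(tight)
```

With both constants set to zero the bars start directly on the axis line, which is the look of the book figure.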

1.2 Figure 3.5 - simple dot chart


  • we use geom_point to create the chart with dots (set the size of the dots to 3)
  • via geom_segment we add the dotted lines (linetype=3)
  • customizing the x-axis by using scale_x_continuous: limits set the limits, expand defines the multiplicative and additive expansion constants
  • set the colour of the axis lines to black (we have to do this to axis.line not just axis.line.x because of inheritance)
  • get rid of the vertical axis
  • set colour of the ticks of the x-axis to black
  • get rid of the ticks of the y-axis
  • set the colour of the axis labels to black
  • then we set the background to transparent and the colour of the frame to black (panel.background=element_rect)
  • get rid of the grid lines (otherwise they are still visible if one looks carefully)
  • get rid of the title of the y-axis
  • set the colour and the size of the title of the x-axis to black and 11 respectively

ggplot(df,aes(x=amount,y=item)) +
  geom_point(size=3) +
  geom_segment(aes(yend=as.numeric(item)),xend=0,linetype=3) +
  scale_x_continuous("Millions of Barrels per Day",limits=c(0,2.5),expand=c(0,0)) +  
  theme(axis.line=element_line(colour="black"),
        axis.line.y=element_blank(),
        axis.ticks.y=element_blank(),
        axis.ticks.x=element_line(colour="black"),
        axis.text=element_text(colour="black",size=11),
        panel.background=element_rect(fill="transparent",colour="black"),
        panel.grid=element_blank(),
        axis.title.y=element_blank(),
        axis.title.x=element_text(colour="black",size=11)
        )
ggsave("fig3_5.png")


Saving 7 x 6.99 in image

1.3 Figure 3.7


  • the clipart can be downloaded here
  • first we load two additional packages: ReadImages for reading the jpeg and grid for the grid graphics functions, which let us divide the page into several parts
  • the next part (definition of the dot chart) is nearly the same as in figure 3.5; here the items are ordered by amount via reorder()
  • load the jpeg with the barrel (barrel <- read.jpeg("barrel.jpg"))
  • the next commands are part of the grid package, which is the underlying graphics system of ggplot2
    • grid.newpage moves to a new page
    • pushViewport adds a new viewport (plotting region) to the page; x and y set its position (here the top left corner), width and height are given relative to the page (0.6 and 0.95), and just sets the justification
    • print(p,newpage=F) prints the dot chart in this viewport
    • popViewport() closes the viewport
    • create another viewport next to the other one with width 0.4 and the same height
    • grid.raster inserts the picture of the barrel
    • grid.text inserts the text


library(grid)
library(ReadImages)

p <- ggplot(df,aes(x=amount,y=reorder(item,amount))) +
  geom_point(size=3) +
  geom_segment(aes(yend=reorder(item,amount)),xend=0,linetype=3) +
  scale_x_continuous("Millions of Barrels per Day",limits=c(0,2.5),expand=c(0,0)) +  
  theme(axis.line=element_line(colour="black"),
        axis.line.y=element_blank(),
        axis.ticks.y=element_blank(),
        axis.text=element_text(colour="black",size=12),
        panel.background=element_rect(fill="white",colour="black"),
        panel.grid=element_blank(),
        axis.title.y=element_blank(),
        axis.title.x=element_text(colour="black",size=11)
        )

barrel <- read.jpeg("barrel.jpg") 

grid.newpage()
pushViewport(viewport(x=unit(0,"line"),y=unit(1,"npc")-unit(2,"mm"),width=0.6,height=0.95,name="vp1",just=c("left","top")))
print(p,newpage=FALSE)
popViewport()
pushViewport(viewport(x=unit(0.7,"npc"),y=unit(0,"npc"),width=0.4,height=0.95,name="vp2",just=c("left","bottom")))
grid.raster(barrel,width=unit(1,"npc"),just=c("centre","bottom"),x=unit(0.2,"npc"),y=unit(3,"line"))
grid.text("Top Five Importing\nCountries of Crude Oil\nand Petroleum\nProducts in 2007\nfor the United States",x=unit(0.2,"npc"),y=unit(1,"npc")-unit(2,"line"),just=c("center","top"))
savePlot("fig3_7.png")
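
The viewport mechanics can be tried without the plot and the picture. The following is only a sketch using the grid package that ships with R: a dotted rectangle stands in for the dot chart, and a small matrix of gray levels stands in for the barrel jpeg (both are placeholders of mine, not part of the figure). The function name draw_two_panels() is made up for this illustration.

```r
library(grid)

# Stand-in for the barrel picture: a 4 x 4 matrix of gray levels in [0, 1]
# (an assumption for illustration; the figure itself uses a jpeg).
picture <- matrix(seq(0, 1, length.out = 16), nrow = 4)

draw_two_panels <- function() {
  grid.newpage()

  # left viewport: 60% of the page width, anchored at the top left corner
  pushViewport(viewport(x = unit(0, "npc"), y = unit(1, "npc"),
                        width = 0.6, height = 0.95,
                        just = c("left", "top"), name = "left"))
  grid.rect(gp = gpar(lty = 3))   # placeholder for print(p, newpage = FALSE)
  grid.text("plot goes here")
  popViewport()

  # right viewport: the remaining 40%, anchored at the bottom
  pushViewport(viewport(x = unit(0.6, "npc"), y = unit(0, "npc"),
                        width = 0.4, height = 0.95,
                        just = c("left", "bottom"), name = "right"))
  grid.raster(picture, width = unit(0.5, "npc"), interpolate = FALSE)
  grid.text("caption text", y = unit(1, "npc") - unit(2, "lines"),
            just = c("centre", "top"))
  popViewport()

  invisible(NULL)
}

# draw into a temporary pdf so the sketch also runs non-interactively
out <- tempfile(fileext = ".pdf")
pdf(out)
draw_two_panels()
invisible(dev.off())
```

Every pushViewport() is matched by a popViewport(), so the second viewport is again positioned relative to the whole page and not relative to the first one.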


Date: 2012-09-12 21:56:59 CEST

Author: mandy

Org version 7.8.02 with Emacs version 23



Monday, September 10, 2012

Graphics for Statistics - figures with ggplot - Chapter 2 Part 3 - Pie Charts

Graphics for Statistics - Chapter 2 - Pie Charts: Figures 2.11-2.12

Graphics out of the book Graphics for Statistics and Data Analysis with R by Kevin Keen (book home page)


Pie charts of the United Nations budget for 2008-2009


  • in the first two lines we define a vector of grays - using the definition out of the book
  • using geom_bar() with width 1
  • mapping x to "", y to amount1 and fill to item1
  • to put the labels on the plot we use geom_text mapping y to the mid of each block
  • and we use scale_fill_manual to set the colours to our predefined grays

Maybe now it is time to look at what we have done so far:


grays1<-gray(((2*length(df$amount1)-1):0)/(2*length(df$amount1)-1))
grays<-grays1[1:length(df$amount1)]

ggplot(df,aes(x="",y=amount1,fill=item1)) +
  geom_bar(stat="identity",width=1,colour="black") +
  geom_text(aes(y=c(0,cumsum(df$amount1)[-nrow(df)]) + df$amount1/2,label=df$item1),x=1.5,size=4) +
  scale_fill_manual(values=grays)
ggsave("fig2_11a.png")
Saving 7 x 6.99 in image
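
The gray vector and the label positions can be checked in isolation with base R. This is a sketch on made-up toy numbers (three blocks of height 10, 20 and 30), not the UN budget data.

```r
# toy data (made up for illustration): three stacked blocks
amount <- c(10, 20, 30)
n <- length(amount)

# 2n gray levels running from white (1) down to black (0) ...
grays_all <- gray(((2 * n - 1):0) / (2 * n - 1))
# ... of which only the lighter half is used for filling
grays <- grays_all[1:n]
print(grays)

# each block starts where the previous blocks end
starts <- c(0, cumsum(amount)[-n])
# the label sits half a block height above the start
mids <- starts + amount / 2
print(mids)
```

Taking only the first half of the gray levels keeps the fills light enough for black labels to remain readable.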


  • now we transform our coordinate system via coord_polar using the y-axis to define the angle within the pie chart
  • we get rid of the legend, background, axis ticks, text etc
grays1<-gray(((2*length(df$amount1)-1):0)/(2*length(df$amount1)-1))
grays<-grays1[1:length(df$amount1)]

ggplot(df,aes(x="",y=amount1,fill=item1)) +
  geom_bar(stat="identity",width=1,colour="black") +
  geom_text(aes(y=c(0,cumsum(df$amount1)[-nrow(df)]) + df$amount1/2,label=df$item1),x=1.5,size=4) +
  scale_fill_manual(values=grays) +
  coord_polar(theta="y") +
  theme(panel.background=element_rect(fill="white"),
        axis.text.x=element_blank(),
        axis.text.y=element_blank(),
        axis.ticks.y=element_blank(),
        axis.title.x=element_blank(),
        axis.title.y=element_blank(),
        legend.position="none"
        )
ggsave("fig2_11.png")
Saving 7 x 6.99 in image


  • this is one of the cases one should consider using classical graphics
  • here is the code used by K. Keen:
pie(df$amount1,labels=df$item1, 
               radius = 0.85, 
               clockwise=TRUE,
               col=grays,
               angle=120)
savePlot("fig2_11b.png")

So far, I have no solution for the pattern in figure 2.12


Saturday, September 8, 2012

Graphics for Statistics - figures with ggplot - Chapter 2 Part 2 - Bar Chart Flavours

Graphics for Statistics - Chapter 2 - Bar Charts: Figures 2.3-2.10 + 2.13

Graphics out of the book Graphics for Statistics and Data Analysis with R by Kevin Keen (book home page)


Bar charts of the United Nations budget for 2008-2009


  • using geom_bar()
  • mapping x to item1 and y to amount1
  • set stat="identity" because of presummarised data
  • and there is the basic plot

ggplot(df,aes(x=item1,y=amount1)) +
  geom_bar(stat="identity")
ggsave("fig2_3.png")



Saving 7 x 6.99 in image


But of course there is a lot left to do: you cannot read the labels of the x-axis, and we have to change the axis titles


ggplot(df,aes(x=item1,y=amount1)) +
  geom_bar(stat="identity") +
  xlab("") +
  ylab("Millions of US Dollars") +
  opts(axis.text.x=theme_text(angle=90,size=12))
ggsave("fig2_3b.png")

Saving 7 x 6.99 in image



  • and there is the graph in default ggplot style
  • now we reproduce the style of the plot in the book:
  • add the expand argument to the definition of the y-axis to let the drawn axis end exactly at the limits
  • the width of the bars is changed through the width argument in geom_bar(); in this case it is a bit tricky, because using the identity stat resets width, so we have to put width into the aes() argument (further information)
  • we add a hjust argument in the axis.text.x to change the alignment
  • we set fill and colour of the background to white
  • we use a simple extension by Rudolf Cardinal (source line), because we want to remove just one axis not the two of them (further information)
  • and finally, as in chapter 1, we again use the hack to get rid of the ticks of the x-axis

source("http://egret.psychol.cam.ac.uk/statistics/R/extensions/rnc_ggplot2_border_themes.r")
png("fig2_3c.png",height=500, width=500)
ggplot(df,aes(x=item1,y=amount1)) +
  geom_bar(aes(width=0.7),stat="identity") +
  scale_y_continuous("Millions of US Dollars",limits=c(0,800),expand=c(0,0)) +
  xlab("") +
  opts(axis.text.x=theme_text(angle=90,size=12,hjust=1),
       axis.text.y=theme_text(size=12),
       panel.background=theme_rect(fill="white",colour="white"),
       panel.border=theme_left_border()
       )
g <- grid.gget(gPath("axis-b", "", "", "", "axis.ticks.segments"))
grid.remove(g$name)
dev.off()
X11cairo 
       2


ggplot2 0.9.2 is out - and everything is much easier:

  • you do not need to manipulate the grid elements directly, axis.ticks.x and axis.ticks.y are now available
  • there is also no need for additional functions anymore: axis.line, axis.line.x and axis.line.y do a good job
  • this may be a bit confusing: first you have to set axis.line and then blank the axis you do not want to see (element_blank()); this is necessary because of the inheritance
  • some functions were also renamed: use theme() instead of opts() and element_*() instead of theme_*()

## 0.9.2 version
ggplot(df,aes(x=item1,y=amount1)) +
  geom_bar(aes(width=0.7),stat="identity") +
  scale_y_continuous("Millions of US Dollars",limits=c(0,800),expand=c(0,0)) +
  xlab("") +
  theme(axis.text.x=element_text(angle=90,size=12,hjust=1,colour="black"),
        axis.text.y=element_text(size=12,colour="black"),
        axis.line=element_line(colour="black"),
        axis.line.x=element_blank(),
        axis.ticks.x=element_blank(),
        panel.background=element_rect(fill="white",colour="white")
       )
ggsave("fig2_3n.png")

Saving 7 x 6.99 in image
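
For readers on a considerably newer ggplot2 than the one used here (I am assuming 2.2.0 or later, where geom_col() exists), the bar width no longer has to go into aes(). The following sketch uses a made-up three-row data frame called toy and is not a replacement for the code above, just the same idea in later syntax.

```r
library(ggplot2)

# toy data (made up for illustration)
toy <- data.frame(item = factor(c("a", "b", "c")),
                  amount = c(3, 1, 2))

p <- ggplot(toy, aes(x = item, y = amount)) +
  geom_col(width = 0.7) +                             # bars for presummarised values
  scale_y_continuous("amount", limits = c(0, 4), expand = c(0, 0)) +
  xlab("") +
  theme(axis.line = element_line(colour = "black"),   # draw both axis lines ...
        axis.line.x = element_blank(),                # ... then blank the unwanted one
        axis.ticks.x = element_blank(),
        panel.background = element_rect(fill = "white", colour = "white"))

# the computed bar geometry, one row per bar
bars <- layer_data(p)
```

The order inside theme() mirrors the inheritance remark above: axis.line sets the parent element, axis.line.x overrides it for one axis.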




  • in figure 2.4 only the angle of the labels changes, so we have to adjust the alignment (add the vjust argument)
  • also set the size of the labels to 11
  • savePlot() is an alternative to opening and closing a device explicitly

source("http://egret.psychol.cam.ac.uk/statistics/R/extensions/rnc_ggplot2_border_themes.r")
ggplot(df,aes(x=item1,y=amount1)) +
  geom_bar(aes(width=0.7),stat="identity") +
  scale_y_continuous("Millions of US Dollars",limits=c(0,800)) +
  xlab("") +
  opts(axis.text.x=theme_text(angle=45,size=12,hjust=1,vjust=1),
       axis.text.y=theme_text(size=12),
       panel.background=theme_rect(fill="white",colour="white"),
       panel.border=theme_left_border()       )
g <- grid.gget(gPath("axis-b", "", "", "", "axis.ticks.segments"))
grid.remove(g$name)
savePlot("fig2_4.png")


  • and here is also the code for ggplot2 v0.9.2

## 0.9.2 version
ggplot(df,aes(x=item1,y=amount1)) +
  geom_bar(aes(width=0.7),stat="identity") +
  scale_y_continuous("Millions of US Dollars",expand=c(0,0),limits=c(0,800)) +
  xlab("") +
  theme(axis.text.x=element_text(angle=45,size=11,hjust=1,vjust=1,colour="black"),
         axis.text.y=element_text(size=12,colour="black"),
         axis.line=element_line(colour="black"),
         axis.line.x=element_blank(),
         axis.ticks.x=element_blank(),
         panel.background=element_rect(fill="white",colour="white")
         )
ggsave("fig2_4n.png")



  • in figure 2.5 the axes are exchanged - so we can use the final code from figure 2.3
  • and do some minor changes (alignment, angle of labels)
ggplot(df,aes(x=item1,y=amount1)) +
  geom_bar(aes(width=0.7),stat="identity") +
  scale_y_continuous("Millions of US Dollars",limits=c(0,800),expand=c(0,0)) +
  xlab("") +
  coord_flip() +
  opts(axis.text.x=theme_text(size=11,vjust=-1),
       axis.text.y=theme_text(hjust=1,size=12),
       panel.background=theme_rect(fill="white",colour="white"),
       panel.border=theme_bottom_border()
       )
g <- grid.gget(gPath("axis-l", "", "", "", "axis.ticks.segments"))
grid.remove(g$name)
savePlot("fig2_5.png")


  • and here is the 0.9.2 version

ggplot(df,aes(x=item1,y=amount1)) +
  geom_bar(aes(width=0.7),stat="identity") +
  scale_y_continuous("Millions of US Dollars",limits=c(0,800),expand=c(0,0)) +
  xlab("") +
  coord_flip() +
  theme(axis.text.x=element_text(size=11,vjust=-1,colour="black"),
        axis.text.y=element_text(hjust=1,size=12,colour="black"),
        axis.line=element_line(colour="black"),
        axis.line.y=element_blank(),
        axis.ticks.y=element_blank(),
        axis.ticks.x=element_line(colour="black"),
        panel.background=element_rect(fill="white",colour="white")
       )
ggsave("fig2_5n.png")




  • for figure 2.6 we just remove the colour argument from panel.background and the panel.border option, and add panel.grid.major=theme_blank() to get rid of the traces of the grid lines
ggplot(df,aes(x=item1,y=amount1)) +
  geom_bar(aes(width=0.7),stat="identity") +
  scale_y_continuous("Millions of US Dollars",limits=c(0,800),expand=c(0,0)) +
  xlab("") +
  coord_flip() +
  opts(axis.text.x=theme_text(size=11,vjust=-1),
       axis.text.y=theme_text(hjust=1,size=12),
       panel.background=theme_rect(fill="white"),
       panel.grid.major=theme_blank()
       )
g <- grid.gget(gPath("axis-l", "", "", "", "axis.ticks.segments"))
grid.remove(g$name)
savePlot("fig2_6.png")


  • and again the 0.9.2 version

ggplot(df,aes(x=item1,y=amount1)) +
  geom_bar(aes(width=0.7),stat="identity") +
  scale_y_continuous("Millions of US Dollars",limits=c(0,800),expand=c(0,0)) +
  xlab("") +
  coord_flip() +
  theme(axis.text.x=element_text(size=11,vjust=-1,colour="black"),
       axis.text.y=element_text(hjust=1,size=12,colour="black"),
       axis.ticks.y=element_blank(),
        axis.ticks.x=element_line(colour="black"),
       panel.background=element_rect(fill="white",colour="black"),
       panel.grid.major=element_blank()
       )
ggsave("fig2_6n.png")

Saving 7 x 6.99 in image



  • from now on all code is for ggplot2 version 0.9.2
  • figure 2.8 keeps the vertical grid lines, but removes the horizontal ones: this is controlled by panel.grid.major.x and panel.grid.major.y (line elements)

ggplot(df,aes(x=item1,y=amount1)) +
  geom_bar(aes(width=0.7),stat="identity") +
  scale_y_continuous("Millions of US Dollars",limits=c(0,800),expand=c(0,0)) +
  xlab("") +
  coord_flip() +
  theme(axis.text.x=element_text(size=11,vjust=-1,colour="black"),
        axis.text.y=element_text(hjust=1,size=12,colour="black"),
        axis.ticks.y=element_blank(),
        axis.ticks.x=element_line(colour="black"),
        panel.background=element_rect(fill="white",colour="black"),
        panel.grid.major.y=element_blank(),
        panel.grid.major.x=element_line(colour="black")
       )
ggsave("fig2_8.png")

Saving 7 x 6.99 in image



  • for figure 2.9 we change the colour of the borders of the bars to black (colour) and the colour of the filling to grey (fill)

ggplot(df,aes(x=item1,y=amount1)) +
  geom_bar(aes(width=0.7),stat="identity",fill="grey",colour="black") +
  scale_y_continuous("Millions of US Dollars",limits=c(0,800),expand=c(0,0)) +
  xlab("") +
  coord_flip() +
  theme(axis.text.x=element_text(size=12,colour="black"),
        axis.text.y=element_text(size=12,colour="black"),
        axis.ticks.x=element_line(colour="black"),
        axis.ticks.y=element_blank(),
        axis.line=element_line(colour="black"),
        axis.line.y=element_blank(),
        panel.background=element_rect(fill="white")
        )
ggsave("fig2_9.png")

Saving 7 x 6.99 in image



  • for figure 2.10 just the filling of the bars has to be changed to white

ggplot(df,aes(x=item1,y=amount1)) +
  geom_bar(aes(width=0.7),stat="identity",fill="white",colour="black") +
  scale_y_continuous("Millions of US Dollars",limits=c(0,800),expand=c(0,0)) +
  xlab("") +
  coord_flip() +
  theme(axis.text.x=element_text(size=12,colour="black"),
        axis.text.y=element_text(size=12,colour="black"),
        axis.ticks.x=element_line(colour="black"),
        axis.ticks.y=element_blank(),
        axis.line=element_line(colour="black"),
        axis.line.y=element_blank(),
        panel.background=element_rect(fill="white")
        )
ggsave("fig2_10.png")


Saving 7 x 6.99 in image




  • for figure 2.13, figure 2.4 is a good starting point
  • set the colour of the filling of the bars to grey
  • we set the breaks and labels of the y-axis manually
  • add horizontal white lines via geom_hline
## 2.13
dollars <- paste("US$",c(200,400,600),"k",sep="")
ggplot(df,aes(x=item1,y=amount1)) +
  geom_bar(aes(width=0.7),stat="identity",fill="grey") +
  scale_y_continuous(expand=c(0,0),breaks=c(0,200,400,600),labels=c("0",dollars)) +
  geom_hline(yintercept=c(200,400,600),colour="white") +
  theme(axis.text=element_text(size=11.5,colour="black"),
        axis.text.x=element_text(angle=45,hjust=1,vjust=1),
        axis.ticks=element_blank(),
        axis.title=element_blank(),
        axis.line=element_line(colour="grey"),
        axis.line.y=element_blank(),
        panel.background=element_rect(fill="white",colour="white")
         )
ggsave("fig2_13.png")
Saving 7 x 6.99 in image




Graphics for Statistics - figures with ggplot - Chapter 2 - Cleveland Dot plot

Chapter 2 - Dot Charts


Graphics out of the book Graphics for Statistics and Data Analysis with R by Kevin Keen (book home page)


Dot charts of the United Nations budget for 2008-2009


  • data:

item1<-factor(1:14,
             labels=c("Overall coordination",
               "Political affairs",
               "International law",
               "International cooperation",
               "Regional cooperation",
               "Human rights",
               "Public information",
               "Management",
               "Internal oversight",
               "Administrative",
               "Capital",
               "Safety & security",
               "Development",
               "Staff assessment"))
amount1<-c(718555600,626069600,87269400,398449400,
477145600,259227500,184000500,540204300,35997700,
108470900,58782600,197169300,18651300,461366000)
amount1<-amount1/1000000
df <- data.frame(item1=item1,amount1=amount1)
df

item1  amount1
1       Overall coordination 718.5556
2          Political affairs 626.0696
3          International law  87.2694
4  International cooperation 398.4494
5       Regional cooperation 477.1456
6               Human rights 259.2275
7         Public information 184.0005
8                 Management 540.2043
9         Internal oversight  35.9977
10            Administrative 108.4709
11                   Capital  58.7826
12         Safety & security 197.1693
13               Development  18.6513
14          Staff assessment 461.3660

  • now we can build the chart using geom_point() and geom_hline()
  • first we build a ggplot object and map x to amount1 and y to item1
  • then we add the point layer (geom_point()) setting the shape to 19 (filled circle)
  • now we need the horizontal lines, therefore we use geom_hline() and map as.numeric(item1) (which gives 1:14) to yintercept

ggplot(df,aes(x=amount1,y=item1)) +
  geom_point(shape=19) +
  geom_hline(aes(yintercept=as.numeric(item1)),linetype=3)
ggsave("fig2_1.png")


  • first we reverse the order of the category using reorder() by the negative of the number of the item
  • then we increase the size of the points a little (size argument in geom_point())
  • then we change the title of the x-axis and set the limits to c(0,800) (scale_x_continuous())
  • setting axis.title.y to theme_blank() gets rid of the title of the y-axis
  • axis.title.x is managed by theme_text(): we set the text size to 12 and adjust the vertical position (vjust) downwards
  • last we set the panel background to white using theme_rect() (and because there are some leftovers of the grid lines visible in the frame we set the major grid lines to blank)

ggplot(df,aes(x=amount1,y=reorder(item1,-as.numeric(item1)))) +
  geom_point(shape=19,size=4) +
  geom_hline(aes(yintercept=as.numeric(item1)),linetype=3) +
  scale_x_continuous("Millions of US Dollars",limits=c(0,800)) +
  opts(axis.title.y=theme_blank(),
       axis.text.y=theme_text(size=12),
       axis.title.x=theme_text(size=12,vjust=-0.7),
       axis.text.x=theme_text(size=12),
       panel.background=theme_rect(fill="white"),
       panel.grid.major=theme_blank())
ggsave("fig2_1b.png")
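
Why reorder() with the negated item number reverses the axis can be seen on a small factor in base R. This is a sketch with three made-up labels; the factor is built the same way as item1, so its integer codes follow the order of the labels.

```r
# small factor whose level order follows its codes, as with factor(1:14, labels = ...)
item <- factor(1:3, labels = c("first", "second", "third"))

# the integer codes behind the labels
codes <- as.numeric(item)
print(codes)

# sorting the levels by the negated codes flips their order
flipped <- reorder(item, -codes)
print(levels(flipped))
```

Only the level order changes, the values themselves stay attached to the same labels. Since the first level is drawn at the bottom of the y-axis, flipping the levels puts the first item at the top.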


  • what remains are the ticks of the y-axis; again we must use the hack (as in chapter 1 - have a look there for further information)

png("fig2_1c.png",height=500, width=500)
ggplot(df,aes(x=amount1,y=reorder(item1,-as.numeric(item1)))) +
  geom_point(shape=19,size=4) +
  geom_hline(aes(yintercept=as.numeric(item1)),linetype=3) +
  scale_x_continuous("Millions of US Dollars",limits=c(0,800)) +
  opts(axis.title.y=theme_blank(),
       axis.text.y=theme_text(size=12),
       axis.title.x=theme_text(size=12,vjust=-0.7),
       axis.text.x=theme_text(size=12),
       panel.background=theme_rect(fill="white"),
       panel.grid.major=theme_blank())
g <- grid.gget(gPath("axis-l", "", "", "", "axis.ticks.segments"))
grid.remove(g$name)
dev.off()

X11cairo 
       2


  • to change this figure into figure 2.2 we just have to replace geom_hline() by geom_segment() and change some mappings accordingly

png("fig2_1d.png",height=500, width=500)
ggplot(df,aes(x=amount1,y=reorder(item1,-as.numeric(item1)))) +
  geom_point(shape=19,size=4) +
  geom_segment(aes(yend=reorder(item1,-as.numeric(item1))),xend=0,linetype=3) +
  scale_x_continuous("Millions of US Dollars",limits=c(0,800)) +
  opts(axis.title.y=theme_blank(),
       axis.text.y=theme_text(size=12),
       axis.title.x=theme_text(size=12,vjust=-0.7),
       axis.text.x=theme_text(size=12),
       panel.background=theme_rect(fill="white"),
       panel.grid.major=theme_blank())
g <- grid.gget(gPath("axis-l", "", "", "", "axis.ticks.segments"))
grid.remove(g$name)
dev.off()

X11cairo 
       2




R - Graphics for Statistics - figures with ggplot 2

chapter1


Graphics out of the book Graphics for Statistics and Data Analysis with R by Kevin Keen (book home page)


dot chart of prevalence of allergy in endoscopic sinus surgery (figure 1.1)


  • first create the data frame (which is mandatory)

names<-factor(1:6,labels=c("Epidermals","Dust Mites","Weeds","Grasses","Molds","Trees"))
prevs<-c(38.2,37.8,31.1,31.1,29.3,26.7)
df <- data.frame(names=names,prevs=prevs)
df

names prevs
1 Epidermals  38.2
2 Dust Mites  37.8
3      Weeds  31.1
4    Grasses  31.1
5      Molds  29.3
6      Trees  26.7

  • now we can create the dot chart using geom_segment() (lines) and geom_point()
  • we map x to prevs and y to names for all layers
  • in geom_segment() we map additionally yend to names and set xend to zero and linetype to 3 (dotted)
  • in geom_point() we set shape to 19 (small filled circle)
  • then we set the limits of the x axis to c(0,50) according to the book chart, set the title to Percent and get rid of the title of the y axis

ggplot(df,aes(x=prevs,y=names)) + 
   geom_segment(aes(yend=names),xend=0,linetype=3) + 
   geom_point(shape=19) +
   scale_x_continuous("Percent",limits=c(0,50)) +
   opts(axis.title.y=theme_blank())
ggsave("fig1_1.png")


Saving 7 x 6.99 in image

bar chart of prevalence of allergy in endoscopic sinus surgery (figure 1.2)


  • now we map x to names and y to prevs
  • we use geom_bar(); we have to change the stat to "identity" because we use presummarised data (the default stat of the geom is "bin")
  • then we change the appearance of the axes as above

ggplot(df,aes(x=names,y=prevs)) + 
   geom_bar(stat="identity") +
   scale_y_continuous("Percent",limits=c(0,50)) +
   opts(axis.title.x=theme_blank())
ggsave("fig1_2.png")

Saving 7 x 6.99 in image
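
The difference between a counting stat and stat="identity" can be illustrated without ggplot at all. The following base R sketch uses the data frame from above; table() only stands in for what a counting stat would compute and is not what ggplot calls internally.

```r
names <- factor(1:6, labels = c("Epidermals", "Dust Mites", "Weeds",
                                "Grasses", "Molds", "Trees"))
prevs <- c(38.2, 37.8, 31.1, 31.1, 29.3, 26.7)
df <- data.frame(names = names, prevs = prevs)

# a counting stat would plot the number of rows per category: always 1 here
counts <- table(df$names)
print(counts)

# stat="identity" plots the stored values themselves
heights <- setNames(df$prevs, df$names)
print(heights)
```

With one row per category the counts carry no information, which is why presummarised data need the identity stat.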




  • this looks fine for now; but in the book graph the labels are rotated and the bars look a bit narrower
  • the width of the bars is changed through the width argument in geom_bar(); in this case it is a bit tricky, because using the identity stat resets width, so we have to put width into the aes() argument (further information)
  • rotating the labels is done via opts() and theme_text() (angle)
  • I also resize the labels (size)
  • and get rid of the axis ticks (axis.ticks=theme_blank())

ggplot(df,aes(x=names,y=prevs)) + 
       geom_bar(aes(width=0.7),stat="identity") +
       scale_y_continuous("Percent",limits=c(0,50)) +
       opts(axis.title.x=theme_blank(),
            axis.text.x=theme_text(angle=90,size=12),
            axis.ticks=theme_blank())
ggsave("fig1_2b.png")

Saving 7 x 6.99 in image


  • unfortunately the ticks on the y axis are now gone as well; furthermore, in the current version of ggplot2 there is no equivalent to axis.ticks.x, so if you want to get rid of the ticks of just one axis you must use this hack (link)
  • another consequence is that ggsave does not pick up the grid.remove edit - so we have to save the chart the old-fashioned way

png("fig1_2c.png",height=500, width=500)
ggplot(df,aes(x=names,y=prevs)) + 
       geom_bar(aes(width=0.5),stat="identity") +
       scale_y_continuous("Percent",limits=c(0,50)) +
       opts(axis.title.x=theme_blank(),
       axis.text.x=theme_text(angle=90,size=12))
g <- grid.gget(gPath("axis-b", "", "", "", "axis.ticks.segments"))
grid.remove(g$name)
dev.off()

X11cairo 
       2



ggplot2 0.9.2 is out - and everything is much easier:

  • you do not need to manipulate the grid elements directly, axis.ticks.x and axis.ticks.y are now available
  • axis.line does a good job of customizing the axes
  • some functions were also renamed: use theme() instead of opts() and element_*() instead of theme_*()


ggplot(df,aes(x=names,y=prevs)) + 
  geom_bar(aes(width=0.5),stat="identity") +
  scale_y_continuous("Percent",limits=c(0,50),expand=c(0,0)) +
  theme(axis.title.x=element_blank(),
        axis.text.x=element_text(angle=90,size=12,colour="black",hjust=1),
        axis.text.y=element_text(size=12,colour="black"),
        axis.line=element_line(colour="black"),
        axis.ticks.x=element_blank(),
        panel.background=element_rect(fill="white")
        )