
Sunday, November 2, 2014

Graphics for Statistics - figures with ggplot2 - Chapter 3 part 2 Bar charts with errorbars, dot-whisker charts

1 Keen Chapter 3 part 2

Graphics out of the book Graphics for Statistics and Data Analysis with R by Kevin Keen (book home page)

1.1 Figure 3.8 Bar-whisker chart

  • we will work out the graphic on flipped axes and flip them later
  • set the aesthetics:
    • x to the names of the allergens
    • y to the percentage (prevalence)
    • ymin to y minus the standard error and
    • ymax to y plus the standard error
  • using again geom_bar() with the stat="identity" option because we have already aggregated data (so the height of the bars is set by the y aesthetic)
  • set the filling and the colour of the edges (fill and colour), finally adjust the width (width) of the bars
  • there is no axis title on the axis with the names, so set xlab("") (remember we will flip the axis later)
  • set the title of the continuous axis to Percent and
  • set the limits of the axis to 0 and 50
  • set its expansion to c(0,0) because we do not want the axis to be padded; it should run exactly from 0 to 50
  • now we flip the axes
  • and set the appearance of the text, axis and background elements

require(ggplot2)

names<-c("Epidermals","Dust Mites","Weeds","Grasses","Molds","Trees")
prevs<-c(38.2,37.8,31.1,31.1,29.3,26.7)
se<-c(3.2,3.2,3.1,3.1,3.0,2.9)

df <- data.frame(item=factor(1:6,labels=names),prevs=prevs,se=se)

ggplot(df,aes(x=item,y=prevs,ymin=prevs-se,ymax=prevs+se)) +
    geom_bar(stat="identity",fill="transparent",colour="black",width=0.7) +
    geom_errorbar(width=0.3) +
    xlab("") +
    ylab("Percent") +
    scale_y_continuous(limits=c(0,50),expand=c(0,0)) +
    coord_flip() +
    theme(
        panel.background=element_blank(),
        axis.line=element_line(colour="black"),
        axis.line.y=element_blank(),
        axis.text=element_text(colour="black",size=14),
        axis.ticks.x=element_line(colour="black"),
        axis.ticks.y=element_blank()
        )
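
One thing worth checking when using limits=c(0,50): by default a continuous scale treats values outside its limits as missing, so a bar or whisker reaching past 50 would be dropped with a warning. A minimal sketch in base R, reusing the vectors above (the names bounds, limits and inside are only illustrative), confirms that every whisker end stays inside the limits:

```r
# Data as defined above
names <- c("Epidermals","Dust Mites","Weeds","Grasses","Molds","Trees")
prevs <- c(38.2,37.8,31.1,31.1,29.3,26.7)
se    <- c(3.2,3.2,3.1,3.1,3.0,2.9)

# Whisker ends: prevalence minus/plus one standard error
bounds <- data.frame(item = names, lower = prevs - se, upper = prevs + se)

# Axis limits used in scale_y_continuous()
limits <- c(0, 50)

# TRUE for each row whose whisker lies completely inside the limits
inside <- bounds$lower >= limits[1] & bounds$upper <= limits[2]

print(bounds)
print(all(inside))
```

If a whisker did cross a limit, widening the limits would be the simplest fix.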


1.2 Figure 3.9 Bar and single whisker chart

  • the same as above, only change the filling of the bars to black (you do not need the colour argument any more) and the width of the error bars to 0

ggplot(df,aes(x=item,y=prevs,ymin=prevs-se,ymax=prevs+se)) +
    geom_bar(stat="identity",fill="black",width=0.7) +
    geom_errorbar(width=0) +
    xlab("") +
    ylab("Percent") +
    scale_y_continuous(limits=c(0,50),expand=c(0,0)) +
    coord_flip() +
    theme(
        panel.background=element_blank(),
        axis.line=element_line(colour="black"),
        axis.line.y=element_blank(),
        axis.text=element_text(colour="black",size=14),
        axis.ticks.x=element_line(colour="black"),
        axis.ticks.y=element_blank()
        )
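
Figures 3.8 and 3.9 share their axis, flip and theme settings, so these can be kept in one place. A sketch, assuming a ggplot2 version in which a list of components can be added with +; keen_common and p39 are made-up names, not something from the book or the package:

```r
library(ggplot2)

df <- data.frame(
    item  = factor(1:6, labels = c("Epidermals","Dust Mites","Weeds",
                                   "Grasses","Molds","Trees")),
    prevs = c(38.2,37.8,31.1,31.1,29.3,26.7),
    se    = c(3.2,3.2,3.1,3.1,3.0,2.9)
)

# Shared settings (hypothetical name); only the geoms differ between the figures
keen_common <- list(
    xlab(""),
    ylab("Percent"),
    scale_y_continuous(limits = c(0, 50), expand = c(0, 0)),
    coord_flip(),
    theme(
        panel.background = element_blank(),
        axis.line = element_line(colour = "black"),
        axis.line.y = element_blank(),
        axis.text = element_text(colour = "black", size = 14),
        axis.ticks.x = element_line(colour = "black"),
        axis.ticks.y = element_blank()
    )
)

# Figure 3.9 again, built from the shared settings
p39 <- ggplot(df, aes(x = item, y = prevs, ymin = prevs - se, ymax = prevs + se)) +
    geom_bar(stat = "identity", fill = "black", width = 0.7) +
    geom_errorbar(width = 0) +
    keen_common
print(p39)
```

Figure 3.8 would then differ only in the arguments to geom_bar() and geom_errorbar().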


1.3 Figure 3.10 Dot-whisker chart

  • for the dot-whisker chart we replace geom_bar() by geom_point()
  • in geom_point() we set the point size to four and the colour to black
  • in geom_errorbar() we set the width to 0.25
  • add geom_vline() for the dotted lines and set the aesthetics xintercept to as.numeric(item) (because this aesthetic expects a numeric argument)
  • then change some elements in the theme section
    • set the colour in panel.border to black and do not forget to set fill to NA (you won't see anything if you don't)
    • remove the axis.line.y line


ggplot(df,aes(x=item,y=prevs,ymin=prevs-se,ymax=prevs+se)) +
    geom_point(colour="black",size=4) +
    geom_errorbar(width=0.25) +
    geom_vline(aes(xintercept=as.numeric(item)),linetype=3,size=0.4) +
    xlab("") +
    ylab("Percent") +
    scale_y_continuous(limits=c(0,50),expand=c(0,0)) +
    coord_flip() +
    theme(
        panel.background=element_blank(),
        panel.border=element_rect(colour="black",fill=NA),
        axis.line=element_line(colour="black"),
        axis.text=element_text(colour="black",size=14),
        axis.title=element_text(colour="black",size=14),
        axis.ticks.x=element_line(colour="black"),
        axis.ticks.y=element_blank()
        )
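
In ggplot2 3.4.0 and later, the thickness of lines is set with linewidth, and size is deprecated for line geoms. A sketch of the dotted reference lines under that assumption (on older versions keep size=0.4 as above); the object name p310 is only illustrative and the theme is left out for brevity:

```r
library(ggplot2)

df <- data.frame(
    item  = factor(1:6, labels = c("Epidermals","Dust Mites","Weeds",
                                   "Grasses","Molds","Trees")),
    prevs = c(38.2,37.8,31.1,31.1,29.3,26.7),
    se    = c(3.2,3.2,3.1,3.1,3.0,2.9)
)

p310 <- ggplot(df, aes(x = item, y = prevs, ymin = prevs - se, ymax = prevs + se)) +
    geom_point(colour = "black", size = 4) +
    geom_errorbar(width = 0.25) +
    # one dotted reference line per category; linewidth replaces size for lines
    geom_vline(aes(xintercept = as.numeric(item)), linetype = 3, linewidth = 0.4) +
    scale_y_continuous(limits = c(0, 50), expand = c(0, 0)) +
    coord_flip()
print(p310)
```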


1.4 Figure 3.11 Dot-whisker chart

  • only minor changes to the previous plot
  • remove geom_vline()
  • adjust the widths of the error bars


ggplot(df,aes(x=item,y=prevs,ymin=prevs-se,ymax=prevs+se)) +
    geom_point(colour="black",size=4) +
    geom_errorbar(width=0.1) +
    xlab("") +
    ylab("Percent") +
    scale_y_continuous(limits=c(0,50),expand=c(0,0)) +
    coord_flip() +
    theme(
        panel.background=element_blank(),
        panel.border=element_rect(colour="black",fill=NA),
        axis.line=element_line(colour="black"),
        axis.text=element_text(colour="black",size=14),
        axis.title=element_text(colour="black",size=14),
        axis.ticks.x=element_line(colour="black"),
        axis.ticks.y=element_blank()
        )


1.5 Figure 3.12 Dot-whisker chart

  • only adjust the widths of the error bars


ggplot(df,aes(x=item,y=prevs,ymin=prevs-se,ymax=prevs+se)) +
    geom_point(colour="black",size=4) +
    geom_errorbar(width=0) +
    xlab("") +
    ylab("Percent") +
    scale_y_continuous(limits=c(0,50),expand=c(0,0)) +
    coord_flip() +
    theme(
        panel.background=element_blank(),
        panel.border=element_rect(colour="black",fill=NA),
        axis.line=element_line(colour="black"),
        axis.text=element_text(colour="black",size=14),
        axis.title=element_text(colour="black",size=14),
        axis.ticks.x=element_line(colour="black"),
        axis.ticks.y=element_blank()
        )


1.6 Figure 3.13 two-tiered dot-whisker chart

  • there are several ways to do this; I decided to use two error bar layers
  • first, move the ymin and ymax aesthetics from ggplot() into geom_errorbar() and set the width to 0.2
  • then add a second geom_errorbar() with its own aesthetics, now with ymin set to prevs-1.96*se and ymax to prevs+1.96*se, and a width of 0


ggplot(df,aes(x=item,y=prevs)) +
    geom_point(colour="black",size=4) +
    geom_errorbar(aes(ymin=prevs-se,ymax=prevs+se),width=0.2) +
    geom_errorbar(aes(ymin=prevs-1.96*se,ymax=prevs+1.96*se),width=0) +
    xlab("") +
    ylab("Percent") +
    scale_y_continuous(limits=c(0,50),expand=c(0,0)) +
    coord_flip() +
    theme(
        panel.background=element_blank(),
        panel.border=element_rect(colour="black",fill=NA),
        axis.line=element_line(colour="black"),
        axis.text=element_text(colour="black",size=14),
        axis.title=element_text(colour="black",size=14),
        axis.ticks.x=element_line(colour="black"),
        axis.ticks.y=element_blank()
        )
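
The factor 1.96 is presumably the 97.5% quantile of the standard normal distribution, which would make the outer whiskers approximate 95% confidence intervals; the post does not say so, so treat this as an assumption. A short base R sketch computing both tiers (the names z and tiers and the column names are illustrative):

```r
prevs <- c(38.2,37.8,31.1,31.1,29.3,26.7)
se    <- c(3.2,3.2,3.1,3.1,3.0,2.9)

# 97.5% quantile of the standard normal distribution, roughly 1.96
z <- qnorm(0.975)

tiers <- data.frame(
    inner_low  = prevs - se,      # inner tier: plus/minus one standard error
    inner_high = prevs + se,
    outer_low  = prevs - z * se,  # outer tier: approximate 95% interval
    outer_high = prevs + z * se
)
print(round(tiers, 1))
```

All outer whisker ends stay within the axis limits of 0 and 50, so nothing is dropped by the scale.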


Wednesday, September 12, 2012

Graphics for Statistics - figures with ggplot - Chapter 3 - Bar Charts, Dot plot, add pic



1 Chapter 3


Graphics out of the book Graphics for Statistics and Data Analysis with R by Kevin Keen (book home page)

  • here are the data
item<-c("Canada",
"Mexico",
"Saudi Arabia",
"Venezuela",
"Nigeria")

amount<-c(2460,1538,1394,1273,1120)
amount<-amount/1000

df <- data.frame(item=factor(1:5,labels=item),amount=amount)
# barrel.jpg is read further down, for Figure 3.7, once the ReadImages package is loaded

1.1 Figure 3.4 - simple bar chart


  • we use geom_bar to create the bar chart
  • customizing the y-axis by using scale_y_continuous: limits set the limits, expand defines the multiplicative and additive expansion constants
  • coord_flip rotates it (so we get a horizontal bar chart)
  • then we set the background to white
  • set the colour of the axis lines to black (we have to do this to axis.line not just axis.line.x because of inheritance)
  • get rid of the vertical axis
  • set colour of the ticks of the x-axis to black
  • get rid of the ticks of the y-axis
  • set the colour of the axis labels to black
  • change the adjustment of the labels of vertical axis
  • get rid of the grid lines (otherwise they remain faintly visible if one looks carefully)
ggplot(df,aes(y=amount,x=reorder(item,-as.numeric(item)))) +
  geom_bar(stat="identity",fill="white",colour="black") +
  scale_y_continuous("Millions of Barrels per Day",limits=c(0,2.5),expand=c(0,0)) +
  xlab("") +
  coord_flip() +
  theme(panel.background=element_rect(fill="white"),
        axis.line=element_line(colour="black"),
        axis.line.y=element_blank(),
        axis.ticks.x=element_line(colour="black"),
        axis.ticks.y=element_blank(),
        axis.text=element_text(colour="black",size=11),
        axis.text.y=element_text(hjust=0),
        panel.grid=element_blank())
ggsave("fig3_4.png")
Saving 7 x 6.99 in image

1.2 Figure 3.5 - simple dot chart


  • we use geom_point to create the chart with dots (set the size of the dots to 3)
  • via geom_segment we add the dotted lines (linetype=3)
  • customizing the x-axis by using scale_x_continuous: limits set the limits, expand defines the multiplicative and additive expansion constants
  • set the colour of the axis lines to black (we have to do this to axis.line not just axis.line.x because of inheritance)
  • get rid of the vertical axis
  • set colour of the ticks of the x-axis to black
  • get rid of the ticks of the y-axis
  • set the colour of the axis labels to black
  • then we set the background to transparent and the colour of the frame to black (panel.background=element_rect)
  • get rid of the grid lines (otherwise they remain faintly visible if one looks carefully)
  • get rid of the title of the y-axis
  • set the colour and the size of the title of the x-axis to black and 11 respectively

ggplot(df,aes(x=amount,y=item)) +
  geom_point(size=3) +
  geom_segment(aes(yend=as.numeric(item)),xend=0,linetype=3) +
  scale_x_continuous("Millions of Barrels per Day",limits=c(0,2.5),expand=c(0,0)) +  
  theme(axis.line=element_line(colour="black"),
        axis.line.y=element_blank(),
        axis.ticks.y=element_blank(),
        axis.ticks.x=element_line(colour="black"),
        axis.text=element_text(colour="black",size=11),
        panel.background=element_rect(fill="transparent",colour="black"),
        panel.grid=element_blank(),
        axis.title.y=element_blank(),
        axis.title.x=element_text(colour="black",size=11)
        )
ggsave("fig3_5.png")


Saving 7 x 6.99 in image

1.3 Figure 3.7


  • the clipart can be downloaded here
  • first we load two additional packages: ReadImages for reading the jpeg, and grid for dividing the page into regions that hold the plot, the picture and the text
  • the next part (definition of the dot chart) is exactly the same as in figure 3.5
  • load the jpeg with the barrel (barrel <- read.jpeg("barrel.jpg"))
  • the next commands are part of the grid package, which is the underlying graphics system of ggplot2
    • grid.newpage moves to a new page
    • pushViewport adds a new viewport (plotting region) to the page; x and y set its position (here the top left corner), width is set to 0.6 and height to 0.95 relative to the page, and just sets the justification
    • print(p,newpage=F) prints the dot chart in this viewport
    • popViewport() closes the viewport
    • create another viewport next to the other one with width 0.4 and the same height
    • grid.raster inserts the picture of the barrel
    • grid.text inserts the text


library(grid)
library(ReadImages)

p <- ggplot(df,aes(x=amount,y=reorder(item,amount))) +
  geom_point(size=3) +
  geom_segment(aes(yend=reorder(item,amount)),xend=0,linetype=3) +
  scale_x_continuous("Millions of Barrels per Day",limits=c(0,2.5),expand=c(0,0)) +  
  theme(axis.line=element_line(colour="black"),
        axis.line.y=element_blank(),
        axis.ticks.y=element_blank(),
        axis.text=element_text(colour="black",size=12),
        panel.background=element_rect(fill="white",colour="black"),
        panel.grid=element_blank(),
        axis.title.y=element_blank(),
        axis.title.x=element_text(colour="black",size=11)
        )

barrel <- read.jpeg("barrel.jpg") 

grid.newpage()
pushViewport(viewport(x=unit(0,"line"),y=unit(1,"npc")-unit(2,"mm"),width=0.6,height=0.95,name="vp1",just=c("left","top")))
print(p,newpage=F)
popViewport()
pushViewport(viewport(x=unit(0.7,"npc"),y=unit(0,"npc"),width=0.4,height=0.95,name="vp2",just=c("left","bottom")))
grid.raster(barrel,width=unit(1,"npc"),just=c("centre","bottom"),x=unit(0.2,"npc"),y=unit(3,"line"))
grid.text("Top Five Importing\nCountries of Crude Oil\nand Petroleum\nProducts in 2007\nfor the United States",x=unit(0.2,"npc"),y=unit(1,"npc")-unit(2,"line"),just=c("center","top"))
savePlot("fig3_7.png")
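
The viewport steps can be tried without the plot or the jpeg. The following sketch uses only the grid package, which ships with R; a small grey-level matrix stands in for the barrel picture and a rectangle for the dot chart. The names placeholder, vp_left and vp_right are made up, and the right viewport is placed at x=0.6 (an assumption) so that both regions fit on the page. If ReadImages cannot be installed, another reader such as readJPEG() from the jpeg package may serve for the real picture.

```r
library(grid)

# Placeholder image: a 2 x 2 matrix of grey levels instead of the barrel jpeg
placeholder <- matrix(c(0.2, 0.8, 0.8, 0.2), nrow = 2)

# Left region for the chart, anchored at the top left corner of the page
vp_left <- viewport(x = 0, y = 1, width = 0.6, height = 0.95,
                    just = c("left", "top"))
# Right region for picture and text, anchored at the bottom
vp_right <- viewport(x = 0.6, y = 0, width = 0.4, height = 0.95,
                     just = c("left", "bottom"))

grid.newpage()
pushViewport(vp_left)
grid.rect()                      # stands in for print(p, newpage = FALSE)
popViewport()

pushViewport(vp_right)
grid.raster(placeholder, width = unit(0.5, "npc"), interpolate = FALSE)
grid.text("caption text", y = unit(1, "npc") - unit(2, "lines"),
          just = c("centre", "top"))
popViewport()
```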


Date: 2012-09-12 21:56:59 CEST

Author: mandy



Saturday, September 8, 2012

Graphics for Statistics - figures with ggplot - Chapter 2 - Cleveland Dot plot

Chapter 2 - Dot Charts


Graphics out of the book Graphics for Statistics and Data Analysis with R by Kevin Keen (book home page)


Dot charts of the United Nations budget for 2008-2009


  • data:

item1<-factor(1:14,
             labels=c("Overall coordination",
               "Political affairs",
               "International law",
               "International cooperation",
               "Regional cooperation",
               "Human rights",
               "Public information",
               "Management",
               "Internal oversight",
               "Administrative",
               "Capital",
               "Safety & security",
               "Development",
               "Staff assessment"))
amount1<-c(718555600,626069600,87269400,398449400,
477145600,259227500,184000500,540204300,35997700,
108470900,58782600,197169300,18651300,461366000)
amount1<-amount1/1000000
df <- data.frame(item1=item1,amount1=amount1)
df

item1  amount1
1       Overall coordination 718.5556
2          Political affairs 626.0696
3          International law  87.2694
4  International cooperation 398.4494
5       Regional cooperation 477.1456
6               Human rights 259.2275
7         Public information 184.0005
8                 Management 540.2043
9         Internal oversight  35.9977
10            Administrative 108.4709
11                   Capital  58.7826
12         Safety & security 197.1693
13               Development  18.6513
14          Staff assessment 461.3660

  • now we can build the chart using geom_point() and geom_hline()
  • first we build a ggplot object and map x to amount1 and y to item1
  • then we add the point layer (geom_point()) setting the shape to 19 (filled circle)
  • now we need the horizontal lines, therefore we use geom_hline() and map as.numeric(item1) (which gives 1:14) to yintercept

ggplot(df,aes(x=amount1,y=item1)) +
  geom_point(shape=19) +
  geom_hline(aes(yintercept=as.numeric(item1)),linetype=3)
ggsave("fig2_1.png")


  • first we reverse the order of the category using reorder() by the negative of the number of the item
  • then we increase the size of the points a little (size argument in geom_point())
  • then we change the title of the x-axis and set the limits to c(0,800) (scale_x_continuous())
  • setting axis.title.y to theme_blank() removes the title of the y-axis
  • axis.title.x is managed by theme_text(): we set the text size to 12 and adjust the vertical position (vjust) downwards
  • last we set the panel background to white using theme_rect() (and because some leftovers of the grid lines are visible in the frame, we set the major grid lines to blank)

ggplot(df,aes(x=amount1,y=reorder(item1,-as.numeric(item1)))) +
  geom_point(shape=19,size=4) +
  geom_hline(aes(yintercept=as.numeric(item1)),linetype=3) +
  scale_x_continuous("Millions of US Dollars",limits=c(0,800)) +
  opts(axis.title.y=theme_blank(),
       axis.text.y=theme_text(size=12),
       axis.title.x=theme_text(size=12,vjust=-0.7),
       axis.text.x=theme_text(size=12),
       panel.background=theme_rect(fill="white"),
       panel.grid.major=theme_blank())
ggsave("fig2_1b.png")
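
What reorder() does here can be seen on a small factor in base R. This is only a sketch with the first three budget items; the names item_demo and reversed are illustrative:

```r
item_demo <- factor(1:3, labels = c("Overall coordination",
                                    "Political affairs",
                                    "International law"))

# Integer codes of the factor levels
as.numeric(item_demo)            # 1 2 3

# Ordering the levels by the negative code reverses them
reversed <- reorder(item_demo, -as.numeric(item_demo))

levels(item_demo)
levels(reversed)
```

ggplot2 places the first level of a discrete y-axis at the bottom, so reversing the levels puts the first item at the top. The mapping yintercept=as.numeric(item1) still uses the original codes, which does no harm because every position from 1 to 14 gets a line either way.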


  • what remains are the ticks of the y-axis; again we must use the hack (as in chapter 1, have a look there for further information)

png("fig2_1c.png",height=500, width=500)
ggplot(df,aes(x=amount1,y=reorder(item1,-as.numeric(item1)))) +
  geom_point(shape=19,size=4) +
  geom_hline(aes(yintercept=as.numeric(item1)),linetype=3) +
  scale_x_continuous("Millions of US Dollars",limits=c(0,800)) +
  opts(axis.title.y=theme_blank(),
       axis.text.y=theme_text(size=12),
       axis.title.x=theme_text(size=12,vjust=-0.7),
       axis.text.x=theme_text(size=12),
       panel.background=theme_rect(fill="white"),
       panel.grid.major=theme_blank())
g <- grid.gget(gPath("axis-l", "", "", "", "axis.ticks.segments"))
grid.remove(g$name)
dev.off()

X11cairo 
       2
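
The opts() and theme_*() functions used in this post were replaced by theme() and element_*() in ggplot2 0.9.2 and are not available in current releases. A sketch of the same figure in the newer syntax, assuming a current ggplot2: the y-axis ticks are removed with axis.ticks.y=element_blank(), as in the Chapter 3 posts, instead of the grid hack. The vjust value is carried over unchanged and may need adjusting; the object name p is illustrative.

```r
library(ggplot2)

item1 <- factor(1:14,
                labels = c("Overall coordination", "Political affairs",
                           "International law", "International cooperation",
                           "Regional cooperation", "Human rights",
                           "Public information", "Management",
                           "Internal oversight", "Administrative",
                           "Capital", "Safety & security",
                           "Development", "Staff assessment"))
amount1 <- c(718555600, 626069600, 87269400, 398449400,
             477145600, 259227500, 184000500, 540204300, 35997700,
             108470900, 58782600, 197169300, 18651300, 461366000) / 1000000
df <- data.frame(item1 = item1, amount1 = amount1)

p <- ggplot(df, aes(x = amount1, y = reorder(item1, -as.numeric(item1)))) +
  geom_point(shape = 19, size = 4) +
  geom_hline(aes(yintercept = as.numeric(item1)), linetype = 3) +
  scale_x_continuous("Millions of US Dollars", limits = c(0, 800)) +
  theme(axis.title.y = element_blank(),
        axis.text.y = element_text(size = 12),
        axis.title.x = element_text(size = 12, vjust = -0.7),
        axis.text.x = element_text(size = 12),
        axis.ticks.y = element_blank(),        # replaces the grid hack
        panel.background = element_rect(fill = "white"),
        panel.grid.major = element_blank())
print(p)
```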


  • to change this figure into figure 2.2 we just have to replace geom_hline() by geom_segment() and change some mappings accordingly

png("fig2_1d.png",height=500, width=500)
ggplot(df,aes(x=amount1,y=reorder(item1,-as.numeric(item1)))) +
  geom_point(shape=19,size=4) +
  geom_segment(aes(yend=reorder(item1,-as.numeric(item1))),xend=0,linetype=3) +
  scale_x_continuous("Millions of US Dollars",limits=c(0,800)) +
  opts(axis.title.y=theme_blank(),
       axis.text.y=theme_text(size=12),
       axis.title.x=theme_text(size=12,vjust=-0.7),
       axis.text.x=theme_text(size=12),
       panel.background=theme_rect(fill="white"),
       panel.grid.major=theme_blank())
g <- grid.gget(gPath("axis-l", "", "", "", "axis.ticks.segments"))
grid.remove(g$name)
dev.off()

X11cairo 
       2