Monday, July 22, 2013

R ggbio - visualize a single chromosome

  • plotIdeogram() is a wrapper function, synonyms to plotSingleChrom()
  • requires packages rtracklayer, GenomicRanges, BioGenerics, parallel
  • takes a name of a genome as argument and downloads the cytoband table from UCSC
require(ggbio)
p <- plotIdeogram()
1: hg19       2: hg18       3: hg17       4: hg16       5: vicPac1
  6: dasNov3    7: otoGar3    8: papHam1    9: papAnu2   10: felCat5
 11: felCat4   12: felCat3   13: panTro4   14: panTro3   15: panTro2
 16: panTro1   17: bosTau7   18: bosTau6   19: bosTau4   20: bosTau3
 21: bosTau2   22: canFam3   23: canFam2   24: canFam1   25: turTru2
 26: loxAfr3   27: musFur1   28: nomLeu3   29: nomLeu2   30: nomLeu1
 31: gorGor3   32: cavPor3   33: eriEur1   34: equCab2   35: equCab1
 36: dipOrd1   37: triMan1   38: calJac3   39: calJac1   40: pteVam1
 41: myoLuc2   42: mm10      43: mm9       44: mm8       45: mm7    
 46: micMur1   47: hetGla2   48: hetGla1   49: monDom5   50: monDom4
 51: monDom1   52: ponAbe2   53: ailMel1   54: susScr3   55: susScr2
 56: ochPri2   57: ornAna1   58: oryCun2   59: rn5       60: rn4    
 61: rn3       62: rheMac3   63: rheMac2   64: proCap1   65: oviAri1
 66: sorAra1   67: choHof1   68: speTri2   69: saiBol1   70: sorAra1
 71: sarHar1   72: echTel1   73: tupBel1   74: macEug2   75: cerSim1
 76: gadMor1   77: melUnd1   78: galGal4   79: galGal3   80: galGal2
 81: latCha1   82: fr3       83: fr2       84: fr1       85: petMar2
 86: petMar1   87: anoCar2   88: anoCar1   89: oryLat2   90: geoFor1
 91: oreNil2   92: chrPic1   93: gasAcu1   94: tetNig2   95: tetNig1
 96: melGal1   97: xenTro3   98: xenTro2   99: xenTro1  100: taeGut1
101: danRer7  102: danRer6  103: danRer5  104: danRer4  105: danRer3
106: ci2      107: ci1      108: braFlo1  109: strPur2  110: strPur1
111: apiMel2  112: apiMel1  113: anoGam1  114: droAna2  115: droAna1
116: droEre1  117: droGri1  118: dm3      119: dm2      120: dm1    
121: droMoj2  122: droMoj1  123: droPer1  124: dp3      125: dp2    
126: droSec1  127: droSim1  128: droVir2  129: droVir1  130: droYak2
131: droYak1  132: caePb2   133: caePb1   134: cb3      135: cb1    
136: ce10     137: ce6      138: ce4      139: ce2      140: caeJap1
141: caeRem3  142: caeRem2  143: priPac1  144: aplCal1  145: sacCer3
146: sacCer2  147: sacCer1  

Selection: 1
Loading…
Done
use chr1 automatically

  • plot chromosom 1 from the hg19 genome (chromosom 1 is the default)
p <- plotIdeogram(genome="hg19")
p

  • plot second chromosome
p <- plotIdeogram(genome="hg19",subchr="chr2" )
p

  • get rid of the cytoband
p <- plotIdeogram(genome="hg19",subchr="chr2",cytoband=F)
p

  • or change aspect ratio (you can change it also afterwards via + theme(aspect.ratio= )
p <- plotIdeogram(genome="hg19",subchr="chr2",aspect.ratio=1/10 )
p

  • have a closer look at the structure of p
attributes(p)
$.S3Class
[1] "gg"     "ggplot"

$class
[1] "ideogram"
attr(,"package")
[1] "ggbio"

$names
[1] "data"        "layers"      "scales"      "mapping"     "theme"      
[6] "coordinates" "facet"       "plot_env"    "labels"     

$xlab
[1] ""

$main
[1] ""

$ideogram

$xlabel
[1] FALSE

$ideogram.data
GRanges with 62 ranges and 2 metadata columns:
       seqnames                 ranges strand   |     name gieStain
          <Rle>              <IRanges>  <Rle>   | <factor> <factor>
   [1]     chr2   [       0,  4400000]      *   |    p25.3     gneg
   [2]     chr2   [ 4400000,  7100000]      *   |    p25.2   gpos50
   [3]     chr2   [ 7100000, 12200000]      *   |    p25.1     gneg
   [4]     chr2   [12200000, 16700000]      *   |    p24.3   gpos75
   [5]     chr2   [16700000, 19200000]      *   |    p24.2     gneg
   ...      ...                    ...    ... ...      ...      ...
  [58]     chr2 [225200000, 226100000]      *   |    q36.2     gneg
  [59]     chr2 [226100000, 231000000]      *   |    q36.3  gpos100
  [60]     chr2 [231000000, 235600000]      *   |    q37.1     gneg
  [61]     chr2 [235600000, 237300000]      *   |    q37.2   gpos50
  [62]     chr2 [237300000, 243199373]      *   |    q37.3     gneg
  ---
  seqlengths:
    chr1 chr10 chr11 chr12 chr13 chr14 ...  chr6  chr7  chr8  chr9  chrX  chrY
      NA    NA    NA    NA    NA    NA ...    NA    NA    NA    NA    NA    NA

$subchr
[1] "chr2"

$aspect.ratio
[1] 0.1

$color
[1] "red"

$fill
[1] "red"

$alpha
[1] 0.7

$size
[1] 1

$zoom.offset
[1] 0.1


  • so we see: data are stored within p, so it is not necessary to download them again
  • change aspect ratio (you can change it also afterwards via + theme(aspect.ratio= )
md <- attr(p,"ideogram.data")
plotIdeogram(md, aspect.ratio = 0.3)

  • so you can always download data and load it later using the getIdeogram() function, which has three arguments:
    • genome must be one of the result from ucscGenomes()$db
    • subchr defines a subset (character vectors)
    • cytobands must be TRUE or FALSE

  • you can also use the underlying functions available in the rtracklayer package
  • ucscGenomes() returns a data frame with information about the available UCSC genomes (identifier, specie, date, official name)
head(ucscGenomes())
db   species      date                               name
1    hg19     Human Feb. 2009 Genome Reference Consortium GRCh37
2    hg18     Human Mar. 2006                    NCBI Build 36.1
3    hg17     Human  May 2004                      NCBI Build 35
4    hg16     Human Jul. 2003                      NCBI Build 34
5 vicPac1    Alpaca Jul. 2008          Broad Institute VicPac1.0
6 dasNov3 Armadillo Dec. 2011            Broad Institute DasNov3
  • plot ideogram of hg19 (data are contained in the biovizBase package)
data(hg19IdeogramCyto, package="biovizBase" )
plotIdeogram(hg19IdeogramCyto)

  • add zoom.region
plotIdeogram(hg19IdeogramCyto,zoom.region=c(1e7,5e7) )

  • change color of the zoom region (border and filling)
plotIdeogram(hg19IdeogramCyto,zoom.region=c(1e7,5e7),fill="blue",color="blue" )

  • change zoom offset (zoom.offset) and size of the frame (size)
p <- plotIdeogram(hg19IdeogramCyto,zoom.region=c(1e7,5e7),fill="blue",color="blue",zoom.offset=4,size=2)
p

  • visualize chromosome 2 instead
p + xlim(GRanges("chr2", IRanges(1e7, 5e7)))

  • add labels on the x-axis
plotIdeogram(hg19IdeogramCyto,zoom.region=c(1e7,5e7),fill="blue",color="blue",zoom.offset=4,size=2,xlabel=T)

  • plot overview
autoplot(seqinfo(hg19Ideogram)[paste("chr",c(14:20,"X","Y"),sep="")])

No comments :

Post a Comment