Learning to draw with bioinformatics | Drawing Seqlogo diagrams

Foreword

In April, I published a gene family analysis tutorial based on TBtools. About 80% of the analysis in that tutorial is done with TBtools, which makes it well suited to beginners in gene family analysis. I have been making some related optimizations recently. Today I was drawing Seqlogo diagrams, so I went through some tutorials and wrote down these notes.


Seqlogo diagram

The MEME website also provides a Seqlogo image for each motif. These images can be used directly, but you need to assemble them into a figure manually.

Download process

  1. Step one
  2. Step two
  3. Done!

Drawing based on ggseqlogo package

The ggseqlogo package was published in the journal Bioinformatics. It is simple and convenient to use, although I found it difficult to change the colors when drawing in batches (PS: most likely I was not using the parameters correctly).

Article URL:

https://academic.oup.com/bioinformatics/article/33/22/3645/3980251?login=false

ggseqlogo package documentation URL:

https://omarwagih.github.io/ggseqlogo/

The site contains more detailed documentation.


Install the R package

install.packages("ggseqlogo")

or 

devtools::install_github("omarwagih/ggseqlogo")

Load the required packages

# Load the required packages (library() stops with an error if a package is missing)
library(ggplot2)
library(ggseqlogo)

Here we directly use the sample sequence data bundled with the ggseqlogo package.

To import your own sequences, you can use the read.table() function.
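As a minimal sketch of importing your own sequences, assume a plain-text file with one aligned sequence per line. The temporary file and its contents below are made up purely for illustration:

```r
library(ggseqlogo)

# Hypothetical input: one aligned sequence per line, all of equal length
seq_file <- tempfile(fileext = ".txt")
writeLines(c("ATGCA", "ATGGA", "TTGCA", "ATGCT"), seq_file)

# read.table() returns a data frame; the sequences end up in the first column.
# colClasses = "character" keeps values such as "T" from being read as logical.
seq_df <- read.table(seq_file, header = FALSE, colClasses = "character")
my_seqs <- seq_df$V1

# Draw the logo from the character vector
p <- ggseqlogo(my_seqs, seq_type = "dna")
print(p)
```

If the file has a header row or several columns, adjust header and the column you select accordingly.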

Batch Import:

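## Generate the file names in batch
filelist <- paste0('motif', 1:10, '.txt')
filelen <- length(filelist)

## Read the files in batch: one character vector of sequences per file
data.list <- list()
for (i in seq_len(filelen)) {
  data.list[[paste0('motif', i)]] <- scan(filelist[i], what = '')
}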
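The named list built by the loop above can be handed to ggseqlogo() directly, which draws one panel per list element. Here is a sketch of the same idea using lapply(); the temporary files and sequences are made-up stand-ins for motif1.txt, motif2.txt and so on:

```r
library(ggseqlogo)

# Hypothetical stand-ins for the motif files, written to a temporary directory
motif_dir <- tempfile("motifs")
dir.create(motif_dir)
example_seqs <- list(
  c("ATGCA", "ATGGA", "TTGCA"),
  c("GGCTA", "GGCTT", "GACTA"),
  c("CCATG", "CCTTG", "CCATG")
)
filelist <- file.path(motif_dir, paste0("motif", 1:3, ".txt"))
for (i in seq_along(filelist)) writeLines(example_seqs[[i]], filelist[i])

# Read every file into a list, one character vector per motif
data.list <- lapply(filelist, scan, what = "", quiet = TRUE)

# Name each element after its file (without the .txt extension)
names(data.list) <- sub("\\.txt$", "", basename(filelist))

# One logo per list element, three logos per row
p <- ggseqlogo(data.list, ncol = 3)
print(p)
```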

# Load the sample data shipped with the package (seqs_dna, pfms_dna and seqs_aa)
data(ggseqlogo_sample)

Basic graphics:

ggplot() + geom_logo( seqs_dna$MA0001.1 ) + theme_logo()
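According to the package documentation, ggseqlogo() is a shortcut for the ggplot() + geom_logo() + theme_logo() call above, and both accept a method argument ('bits' by default, or 'prob'). A short sketch on the same sample motif, which is worth checking against your installed version:

```r
library(ggplot2)
library(ggseqlogo)
data(ggseqlogo_sample)

# Default: letter heights measured in bits (information content)
p_bits <- ggseqlogo(seqs_dna$MA0001.1, method = "bits")

# Alternative: letter heights shown as probabilities
p_prob <- ggseqlogo(seqs_dna$MA0001.1, method = "prob")

print(p_bits)
print(p_prob)
```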


Add related parameters:
ggseqlogo supports amino acid, DNA and RNA sequence types. By default, ggseqlogo will try to guess your sequence type. You can set the sequence type explicitly via the seq_type option.

ggseqlogo( seqs_aa$AKT1, seq_type='aa' )

Numeric form (custom alphabet)

# Replace DNA characters with numbers
seqs_numeric = chartr('ATGC','1234', seqs_dna$MA0001.1)
ggseqlogo(seqs_numeric, method='p', namespace=1:4) 

# Replace DNA characters with Greek ones
seqs_greek = chartr('ATGC', 'δεψλ', seqs_dna$MA0001.1)
ggseqlogo(seqs_greek, namespace='δεψλ', method='p')

Color adjustment

Use the col_scheme parameter to adjust the colors

ggseqlogo(seqs_dna$MA0001.1, col_scheme='base_pairing')


The available values of col_scheme are: auto, chemistry, chemistry2, hydrophobicity, nucleotide, nucleotide2, base_pairing, clustalx, taylor
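To compare schemes side by side, here is a small sketch. It assumes your installed ggseqlogo exports list_col_schemes(), as the package documentation suggests; the choice of three schemes is arbitrary:

```r
library(ggplot2)
library(ggseqlogo)
data(ggseqlogo_sample)

# Print the names of the built-in color schemes (assumed to be exported)
list_col_schemes()

# Draw the same motif with a few of the schemes listed above
schemes <- c("nucleotide", "nucleotide2", "base_pairing")
plots <- lapply(schemes, function(s) {
  ggseqlogo(seqs_dna$MA0001.1, col_scheme = s) + ggtitle(s)
})
names(plots) <- schemes

print(plots$base_pairing)
```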

Specify the colors

## Set the colors: A/T share one group color, C/G share another
cs1 = make_col_scheme(chars=c('A', 'T', 'C', 'G'), groups=c('gr1', 'gr1', 'gr2', 'gr2'), 
                      cols=c('purple', 'purple', 'blue', 'blue'))

## Generate sequence logo
ggseqlogo(seqs_dna$MA0001.1, col_scheme=cs1)

Color scheme two (continuous values)

cs2 = make_col_scheme(chars=c('A', 'T', 'C', 'G'), values=1:4)

# Generate sequence logo
ggseqlogo(seqs_dna$MA0001.1, col_scheme=cs2)

Batch drawing

ggseqlogo(seqs_dna, ncol=4)

## ncol: the number of logos displayed per row
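On the difficulty with colors in batch drawing mentioned earlier: ggseqlogo() appears to forward extra arguments to geom_logo(), so passing col_scheme together with ncol should color every panel the same way. A sketch under that assumption (cs_batch is just an illustrative name):

```r
library(ggseqlogo)
data(ggseqlogo_sample)

# Custom discrete scheme: A/T in purple, C/G in blue
cs_batch <- make_col_scheme(
  chars  = c("A", "T", "C", "G"),
  groups = c("gr1", "gr1", "gr2", "gr2"),
  cols   = c("purple", "purple", "blue", "blue")
)

# Apply the same scheme to all motifs in the list, four logos per row
p <- ggseqlogo(seqs_dna, ncol = 4, col_scheme = cs_batch)
print(p)
```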

Custom height

# Create a custom matrix 
set.seed(123)
custom_mat = matrix( rnorm(20), nrow=4, dimnames=list(c('A', 'T', 'G', 'C')))

# Generate sequence logo
ggseqlogo(custom_mat, method='custom', seq_type='dna') + ylab('my custom height')
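A matrix can also carry plain letter counts instead of custom heights; the package documentation describes position frequency matrices (such as the bundled pfms_dna) as valid input. Here is a sketch that builds such a matrix in base R from made-up sequences:

```r
library(ggseqlogo)

# Made-up aligned sequences, for illustration only
seqs <- c("ATGCA", "ATGGA", "TTGCA", "ATGCT")
letters_dna <- c("A", "T", "G", "C")

# Split the sequences into a character matrix: rows = sequences, columns = positions
char_mat <- do.call(rbind, strsplit(seqs, ""))

# Count each letter at each position: rows = letters, columns = positions
pfm <- sapply(seq_len(ncol(char_mat)), function(j) {
  tabulate(match(char_mat[, j], letters_dna), nbins = length(letters_dna))
})
rownames(pfm) <- letters_dna

# Row names tell ggseqlogo which letter each row stands for
p <- ggseqlogo(pfm, seq_type = "dna")
print(p)
```

Without method='custom', the matrix is treated as counts rather than as ready-made letter heights.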


For more details, see the package help documentation!


Previous articles:

1. The most complete WGCNA tutorial (replace the data to get all the results and graphics)


2. Beautiful graphics drawing tutorial

3. Transcriptome Analysis Tutorial

Xiao Du's Bioinformatics Notes mainly publishes or collects bioinformatics tutorials and R-based analysis and visualization (data analysis, plotting, etc.), and shares interesting literature and learning materials!

Origin blog.csdn.net/kanghua_du/article/details/130751862