Visualization with the UCSC Genome Browser

Introduction

In bioinformatics research, presenting data is often a very important step, and choosing how to present it can be harder still. Here we introduce visualization in the UCSC Genome Browser, which displays data against the reference genome, such as methylation values, expression values, or differential regions near a gene of interest. More importantly, it helps you understand your own data.


1 Create a project


Address: http://genome.ucsc.edu/

First, click My Sessions under My Data to open the sessions page.

image


Create a new project.

image

Click the newly created project to open the visualization page, then click add custom tracks below the image to upload your own data.

image

You can also set the species, the genome, and the reference assembly. If coordinates need to be converted between assemblies (for example, between hg19 and hg38), use LiftOver.


image


2 Upload data


The browser supports bigBed, bigChain, bigGenePred, bigMaf, bigPsl, bigWig, barChart, bigBarChart, BAM, VCF, BED, BED detail, bedGraph, broadPeak, CRAM, GFF, GTF, MAF, narrowPeak, Personal Genome SNP, PSL, and WIG. Files in these formats are uploaded in one of two ways: submitted directly through the web page, or loaded from a URL that points to the file on a server.

Because there are so many data types, we use the BED format, which can be uploaded directly, as the example:

A BED file has 3 required fields and 9 additional optional fields. The three required fields are the chromosome, the start position, and the end position. The 9 optional fields are as follows:

image
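The column layout described above can be sketched in Python. This is only an illustration: the field names follow the UCSC BED format description, while `parse_bed_line` and the sample line are hypothetical and not part of the original article.

```python
from typing import Dict, List

# Field names in column order, following the UCSC BED format description.
BED_FIELDS: List[str] = [
    "chrom", "chromStart", "chromEnd",        # the 3 required fields
    "name", "score", "strand",                # the 9 optional fields start here
    "thickStart", "thickEnd", "itemRgb",
    "blockCount", "blockSizes", "blockStarts",
]


def parse_bed_line(line: str) -> Dict[str, str]:
    """Map the columns of one BED data line onto their field names."""
    columns = line.split()
    if not 3 <= len(columns) <= len(BED_FIELDS):
        raise ValueError(f"expected 3 to 12 columns, got {len(columns)}")
    record = dict(zip(BED_FIELDS, columns))
    # chromEnd must not be smaller than chromStart.
    if int(record["chromStart"]) > int(record["chromEnd"]):
        raise ValueError("chromStart is larger than chromEnd")
    return record


# Hypothetical 6-column line: only the first three columns are mandatory.
record = parse_bed_line("chr7\t127471196\t127472363\tPos1\t0\t+")
print(record["name"], record["strand"])
```

Because optional fields are positional, a line that uses a later column has to supply every column before it.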

The first line of the BED file is the track line, which sets the attributes of the track, including its name, description, and color. The color is chosen by giving an RGB value. A plain BED track shows only where the regions are; it cannot show how values rise and fall. Conveying magnitude therefore takes a further setting: color each region according to the size of its value. This is done with the itemRgb field, the 9th column of the BED file. Note that to use the 9th column, all the preceding columns must also be present and cannot be left empty.

image
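As a rough sketch of how such a file could be produced, the Python below writes a 9-column BED track whose itemRgb shade depends on a methylation value between 0 and 1. The white-to-red scale, the helper names, and the sample regions are assumptions made for illustration. The track line sets itemRgb="On", which the browser needs in order to use the colors from column 9.

```python
from typing import Iterable, List, Tuple

# (chrom, start, end, methylation value between 0 and 1); sample data only.
Region = Tuple[str, int, int, float]


def value_to_rgb(value: float) -> str:
    """Map 0..1 onto a white-to-red shade: the higher the value, the deeper the red."""
    clamped = min(max(value, 0.0), 1.0)
    fade = round(255 * (1.0 - clamped))
    return f"255,{fade},{fade}"


def build_bed9(regions: Iterable[Region], track_name: str) -> List[str]:
    """Return a track line followed by one 9-column BED line per region."""
    lines = [f'track name="{track_name}" description="{track_name}" itemRgb="On"']
    for index, (chrom, start, end, value) in enumerate(regions, start=1):
        fields = [
            chrom, str(start), str(end),
            f"region{index}",        # name
            "0",                     # score (unused here)
            ".",                     # strand (not applicable)
            str(start), str(end),    # thickStart, thickEnd
            value_to_rgb(value),     # itemRgb
        ]
        lines.append("\t".join(fields))
    return lines


bed_lines = build_bed9([("chr1", 1000, 2000, 0.2), ("chr1", 3000, 3500, 1.0)], "methylation")
print("\n".join(bed_lines))
```

Columns 4 to 8 carry placeholder values here only because column 9 cannot be used unless they are filled in.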

After saving, click go to view the track on the genome. Navigate to the region of interest and judge the methylation level from the shade of the color.

image

In its simplest form, a BED file needs only the first three columns to be uploaded and viewed, and data in other formats can be converted to BED for upload.


When a file is larger than 50 MB, uploading it through the web page is not recommended. Instead, provide a URL link to the file; this is the usual route for bigBed, bigWig, bigGenePred, BAM, and VCF data. We take the bigWig format as the example.

bigWig is a compressed binary file converted from a file in wig format. Like a BED file, a wig file describes regions of the genome. A wig file is typically produced after peak calling with MACS. A wig file can be uploaded directly, but if it is larger than 50 MB, converting it to bigWig before upload is recommended. The conversion command is as follows:

wigToBigWig input.wig chrom.sizes myBigWig.bw

If the command reports that an item extends past the end of a chromosome, first check that chrom.sizes belongs to the same assembly as the wig file. The workaround used here is to edit chrom.sizes and increase the length of that chromosome; wigToBigWig also has a -clip option that turns this error into a warning.

wigToBigWig program download address: http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/wigToBigWig

chrom.sizes (chromosome size information) download link: http://hgdownload.soe.ucsc.edu/admin/exe/
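Before running the conversion, it can help to find which wig entries run past the end of a chromosome. The Python sketch below is one way to do that, assuming a wig file made of variableStep and fixedStep sections and a two-column chrom.sizes file. The function names and sample data are hypothetical, and the parser is deliberately minimal rather than a full implementation of the wig format.

```python
from typing import Dict, Iterable, List, Tuple


def read_chrom_sizes(lines: Iterable[str]) -> Dict[str, int]:
    """Parse chrom.sizes lines of the form '<chrom> <length>'."""
    sizes: Dict[str, int] = {}
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:
            sizes[parts[0]] = int(parts[1])
    return sizes


def find_out_of_range(wig_lines: Iterable[str], sizes: Dict[str, int]) -> List[Tuple[str, int]]:
    """Return (chrom, end) for every wig item that ends beyond its chromosome."""
    problems: List[Tuple[str, int]] = []
    chrom, mode, span, step, position = "", "", 1, 1, 1
    for raw in wig_lines:
        line = raw.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        if line.startswith(("variableStep", "fixedStep")):
            tokens = line.split()
            mode = tokens[0]
            options = dict(token.split("=", 1) for token in tokens[1:])
            chrom = options["chrom"]
            span = int(options.get("span", 1))
            step = int(options.get("step", 1))
            position = int(options.get("start", 1))
            continue
        if mode == "variableStep":
            start = int(line.split()[0])   # data line is "<position> <value>"
        elif mode == "fixedStep":
            start = position               # data line is just "<value>"
            position += step
        else:
            continue                       # data before any declaration line
        end = start + span - 1             # wig positions are 1-based, inclusive
        if end > sizes.get(chrom, 0):
            problems.append((chrom, end))
    return problems


sizes = read_chrom_sizes(["chr1\t100"])
wig = ["variableStep chrom=chr1 span=10", "91 1.0", "95 2.0"]
print(find_out_of_range(wig, sizes))
```

An item whose chromosome is missing from chrom.sizes is reported as well, since the conversion would fail on it too.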

An example upload is shown below. The link to the data on the server goes after bigDataUrl.

image
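Since the screenshot is not reproduced here, the following Python sketch shows what such a track line might look like when assembled. The helper `bigwig_track_line` and the example.org URL are hypothetical; `type`, `name`, `description`, `visibility`, and `bigDataUrl` are standard track line attributes.

```python
def bigwig_track_line(name: str, description: str, big_data_url: str,
                      visibility: str = "full") -> str:
    """Build the one-line custom track definition for a remotely hosted bigWig."""
    if not big_data_url.startswith(("http://", "https://", "ftp://")):
        raise ValueError("bigDataUrl must point to a file reachable over the network")
    return (
        f'track type=bigWig name="{name}" description="{description}" '
        f"visibility={visibility} bigDataUrl={big_data_url}"
    )


# The URL below is a placeholder; replace it with the real location of the file.
line = bigwig_track_line("H3K4me3", "H3K4me3 signal", "https://example.org/data/myBigWig.bw")
print(line)
```

The resulting line is pasted into the custom track box in place of the data itself, and the browser then fetches the file from the server.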

The advantage of wig and bigWig files is that they show how the signal rises and falls, such as the peaks of a histone modification.

image


3 Image settings


First are the controls at the top, which zoom the displayed region in and out (by number of bases). In the input box below them, you can enter a specific genomic position (tab separated) or the name of an item to jump to it, or click the image and drag left and right for fine adjustment.

image
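The same jump can be scripted by turning a position into a browser link. The Python sketch below accepts either the chr:start-end form or whitespace-separated columns. It assumes the hgTracks page takes `db` and `position` query parameters, and the function names and the hg38 example are illustrative only.

```python
import re
from typing import Tuple
from urllib.parse import urlencode


def parse_position(text: str) -> Tuple[str, int, int]:
    """Accept 'chr1:1,000-2,000' or 'chr1<TAB>1000<TAB>2000' and return (chrom, start, end)."""
    cleaned = text.replace(",", "").strip()
    match = re.fullmatch(r"([^\s:]+)[:\s]+(\d+)[-\s]+(\d+)", cleaned)
    if match is None:
        raise ValueError(f"unrecognized position: {text!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError("start is larger than end")
    return chrom, start, end


def browser_url(db: str, position: str) -> str:
    """Build a link that opens the given assembly at the given position."""
    chrom, start, end = parse_position(position)
    query = urlencode({"db": db, "position": f"{chrom}:{start}-{end}"})
    return f"https://genome.ucsc.edu/cgi-bin/hgTracks?{query}"


print(browser_url("hg38", "chr1\t1000\t2000"))
```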

Each track in the figure can be moved up or down by clicking the gray bar on its left and dragging it.

image

For each uploaded track, right-click the gray bar in the image to set its display mode. If each item in the uploaded file has a name, the pack and squish options are also available.

image

Take the CpG island track that comes with the browser as an example. With pack selected, the name is shown in front of each CpG island, which suits this kind of data. Choose whichever display mode fits the data you upload.

image


For tracks that display peaks, left-click the gray bar to set the display height.

image

image


In the row of buttons below the image, track search finds tracks of interest, manage custom tracks uploads files, configure sets the width of the image, and resize fits the image width to the width of the browser window.

image


The controls at the bottom list the reference annotation that comes with the browser's genome. You can hide tracks that are not useful. If a track of interest is not shown, use track search to find and add it.

image


The image can be saved by clicking PDF/PS above.

image


Last is saving the project. The page itself has no save option, so you repeat the steps used to create the project: click My Sessions under My Data at the top of the page, and save under the name of the current project to overwrite it.


Summary: This article covers only the simplest and most direct way to visualize data. A more detailed introduction is available on the UCSC official website: http://genome.ucsc.edu/FAQ/FAQformat.html#format1. Visualization in the Genome Browser is mostly about understanding your own data: you can check more intuitively whether the differential regions you identified are meaningful, and inspect regions of interest such as CpG islands and promoters, which helps with further analysis.





Origin blog.51cto.com/15127592/2674467