Manual of CircPlant v1.0

CircPlant is an integrated tool for identifying circRNAs and their miRNA-related interactions in plants.

1. Installation

CircPlant is written in Perl. The following dependencies are required to run it:

	BWA (Please make sure that BWA is added to $PATH.)
	Perl (version>=5.10)
	CIRI2
	Targetfinder
	FASTA36 (needed only to detect circRNA-miRNA interactions; must be added to $PATH)
	Note: CIRI2 and Targetfinder are already bundled with the CircPlant software. FASTA36 is also included in the CircPlant package, or you can download a preferred version from http://faculty.virginia.edu/wrpearson/fasta/CURRENT/.
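
The $PATH requirements above can be checked before the first run. The following is a minimal sketch, not part of CircPlant itself; the helper name missing_tools is hypothetical, and the executable names bwa and fasta36 are assumptions that may differ on your installation.

```shell
# Hypothetical helper: print the name of every listed tool that is not found on $PATH.
missing_tools() {
    for tool in "$@"; do
        # command -v exits non-zero when the shell cannot resolve the tool
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Executable names are assumptions; adjust them to your installation.
missing=$(missing_tools bwa perl fasta36)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names are joined on one line
    echo "Not found on PATH:" $missing
else
    echo "All external tools were found on PATH."
fi
```

FASTA36 only needs to pass this check if you plan to detect circRNA-miRNA interactions.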
	
2. Commands and arguments

How to run CircPlant: 

For single-end reads:
perl CircPlant.pl -I read -O outdir -R ref.fa
For paired-end reads:
perl CircPlant.pl -I read1,read2 -O outdir -R ref.fa

The arguments of CircPlant are as follows:

    -I, --in
          input file name(s), FASTA/FASTQ file from total/non-poly(A) RNA-Seq (required)
    -O, --out
          output directory (required)
    -R, --ref_dir
          FASTA file of reference genome (required)
    -A, --anno
          input GTF/GFF3 formatted annotation file name (optional)
    -MI, --miRNA
          FASTA file of mature miRNAs (optional)
    -H, --help
          show this help information
    -S, --max_span
          max spanning distance of circRNAs (default: 200000)
    -L, --low_strigency
          output circRNAs supported by more than * junction reads
    -U, --mapq
          set threshold for mapping quality of each segment of junction reads (default: 10; should be within [0,30])
    -M, --chrM
          tell CircPlant the ID of mitochondrion in reference file (default: chrM)
    -P, --chrP
          tell CircPlant the ID of chloroplast in reference file (default: chrP)
    -T, --thread_num
          number of threads for parallel running (default: 1)
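
To illustrate how the required arguments combine with an optional one, here is a small sketch that only prints a command line so it can be inspected before running. The function name circplant_cmd and all file names are hypothetical; the flags are the ones listed above.

```shell
# Hypothetical helper: print a CircPlant command line for one or two read files.
# Usage: circplant_cmd OUTDIR REF THREADS READ1 [READ2]
circplant_cmd() {
    outdir=$1; ref=$2; threads=$3; reads=$4
    # Paired-end reads are passed to -I as a comma-separated pair
    if [ -n "${5:-}" ]; then
        reads="$reads,$5"
    fi
    printf 'perl CircPlant.pl -I %s -O %s -R %s -T %s\n' "$reads" "$outdir" "$ref" "$threads"
}

circplant_cmd outdir ref.fa 4 read1.fq read2.fq
# prints: perl CircPlant.pl -I read1.fq,read2.fq -O outdir -R ref.fa -T 4
```

Other optional flags such as -A, -MI, -S or -U can be appended to the printed command in the same way.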

3. Example
Input

Before you start, please make sure that you have installed parallel Perl 5.10 or higher and that you are using a Mac OS X or Linux operating system.
Please download CircPlant from http://bis.zju.edu.cn/CircPlant. Test data sets (FASTQ file, annotation file, reference sequence and mature miRNA file) are packaged with the CircPlant software.

Go to the directory to which CircPlant was downloaded and type the following in your terminal:
tar -xzvf CircPlant.tar.gz
cd CircPlant
tar -xzvf fasta36-linux64.tar.gz

Please add the FASTA36 directory, usually "./fasta-36.3.8e/bin", to your system's $PATH.
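
One possible way to do this for the current shell session is sketched below. It assumes that you are still in the CircPlant directory, that the archive unpacked to fasta-36.3.8e, and that the executable is named fasta36; adjust these if your copy differs.

```shell
# Assumed location after unpacking; change it if your FASTA36 version differs.
fasta_bin="$PWD/fasta-36.3.8e/bin"

# Prepend the directory to PATH for the current shell session only.
case ":$PATH:" in
    *":$fasta_bin:"*) ;;                          # already present, nothing to do
    *) PATH="$fasta_bin:$PATH"; export PATH ;;
esac

# Confirm that the shell can now resolve the executable (name assumed to be fasta36).
command -v fasta36 || echo "fasta36 not found; check the directory above"
```

To make the change permanent, the same PATH line can be placed in your shell start-up file.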

cd CircPlant_test
perl ../scripts/CircPlant.pl -I test_1.fq,test_2.fq -O outdir -R ref.fa -A anno.gtf -MI miRNA
Alternatively, CircPlant can search for circRNAs without the GTF annotation file:
perl ../scripts/CircPlant.pl -I test_1.fq,test_2.fq -O outdir -R ref.fa -MI miRNA


Output
1) circRNA.circ
circRNA_ID	chr	circRNA_strand	circRNA_start	circRNA_end	circRNA_type	gene_id	junction_reads	junction_reads_ID
1:720024|720517	1	-	720024	720517	other	n/a	4	SRR1005257.21206194,SRR1005257.4272244,SRR1005257.19436260,SRR1005257.21206194,
1:2951210|2952311	1	+	2951210	2952311	other	n/a	4	SRR1005257.25532273,SRR1005257.24061122,SRR1005257.25532273,SRR1005257.3057260,
...
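
Because circRNA.circ is tab-separated with junction_reads in column 8, it can be filtered with standard tools. The sketch below is an illustration only; the helper name filter_by_reads and the example path are hypothetical.

```shell
# Hypothetical helper: keep the header plus circRNAs with at least MIN junction reads.
# Usage: filter_by_reads FILE MIN
filter_by_reads() {
    # Column 8 is junction_reads; "+ 0" forces a numeric comparison
    awk -F '\t' -v min="$2" 'NR == 1 || $8 + 0 >= min + 0' "$1"
}

# Example call (the path assumes the results are in the -O directory used for the run):
# filter_by_reads outdir/circRNA.circ 5 > circRNA.min5.tsv
```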

2) isoforms.circ
All circRNA sequences in FASTA format, including circRNA isoforms.

3) circRNA-miRNA.circ
circRNA_ID	chr	circRNA_strand	circRNA_start	circRNA_end	circRNA_type	gene_id	miRNA	junction_reads	junction_reads_ID
10:40601|153925	10	+	40601	153925	other	n/a	osa-miR818e,osa-miR5788,osa-miR2880,osa-miR531b,osa-miR1425-5p,osa-miR1441,osa-miR2925	7	SRR1005257.23904149,SRR1005257.9441661,SRR1005257.6350517,SRR1005257.23904149,SRR1005257.6359405,SRR1005257.2088732,SRR1005257.6350517,
10:40727|154181	10	-	40727	154181	other	n/a	osa-miR2905,osa-miR818e,osa-miR5788,osa-miR812f,osa-miR1439,osa-miR812j,osa-miR2925	15	SRR1005257.19485088,SRR1005257.12469781,SRR1005257.22487639,SRR1005257.22834101,SRR1005257.22884390,SRR1005257.23026386,SRR1005257.9945901,SRR1005257.10102062,SRR1005257.10361472,SRR1005257.9305723,SRR1005257.9511172,SRR1005257.9630851,SRR1005257.9764860,SRR1005257.19485088,SRR1005257.22140429,
...
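
In circRNA-miRNA.circ the miRNA column (column 8) holds a comma-separated list. For downstream analysis it can be convenient to have one circRNA-miRNA pair per line. The following sketch shows one way to do that; the helper name mirna_pairs is hypothetical.

```shell
# Hypothetical helper: print one "circRNA_ID<TAB>miRNA" pair per line.
# Usage: mirna_pairs FILE
mirna_pairs() {
    awk -F '\t' 'NR > 1 {
        n = split($8, mirnas, ",")        # column 8 holds the comma-separated miRNA list
        for (i = 1; i <= n; i++)
            if (mirnas[i] != "") print $1 "\t" mirnas[i]
    }' "$1"
}

# Example call (the path assumes the results are in the -O directory used for the run):
# mirna_pairs outdir/circRNA-miRNA.circ > pairs.tsv
```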

4. Files
Reference sequences
ftp://ftp.ensemblgenomes.org/pub/current/plants/fasta/

Annotation file (optional)
ftp://ftp.ensemblgenomes.org/pub/current/plants/gtf/
ftp://ftp.ensemblgenomes.org/pub/current/plants/gff3/

Mature miRNA sequences (optional)
http://www.mirbase.org/

GFF format
http://genome.ucsc.edu/FAQ/FAQformat.html#format3
GTF format
http://genome.ucsc.edu/FAQ/FAQformat.html#format4

Note: Please make sure you are using exactly the same version of genomic sequences and their annotations when running CircPlant. 
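
A quick sanity check for this is to compare the sequence names in the annotation with those in the reference FASTA. The sketch below is an assumption about how such a check could look, with a hypothetical helper name; it cannot prove that the versions match, but it does reveal naming mismatches such as "chr1" versus "1".

```shell
# Hypothetical helper: list sequence names used in the annotation but absent from the FASTA.
# Usage: unmatched_seqids REF.fa ANNO.gtf
unmatched_seqids() {
    awk -F '\t' '
        NR == FNR {                       # first file: reference FASTA
            if (/^>/) { id = substr($0, 2); sub(/[ \t].*/, "", id); seen[id] = 1 }
            next
        }
        /^#/ || NF == 0 { next }          # skip annotation comments and blank lines
        !($1 in seen) && !reported[$1]++ { print $1 }
    ' "$1" "$2"
}

# Example call with the test data file names used in section 3:
# unmatched_seqids ref.fa anno.gtf
```

An empty result means every sequence name in the annotation also appears in the reference file.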


If you have any questions, please contact Zhang Peijing @ Ming Chen's Bioinformatics Group, Zhejiang University.
Email: zhangpj@zju.edu.cn