Run Bwa Aln With Paired End Reads

Provided by: bwa_0.7.5a-2_amd64

NAME

          bwa - Burrows-Wheeler Alignment Tool

SYNOPSIS

          bwa index ref.fa

          bwa mem ref.fa reads.fq > aln-se.sam

          bwa mem ref.fa read1.fq read2.fq > aln-pe.sam

          bwa aln ref.fa short_read.fq > aln_sa.sai

          bwa samse ref.fa aln_sa.sai short_read.fq > aln-se.sam

          bwa sampe ref.fa aln_sa1.sai aln_sa2.sai read1.fq read2.fq > aln-pe.sam

          bwa bwasw ref.fa long_read.fq > aln.sam

DESCRIPTION

          BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. It consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM. The first algorithm is designed for Illumina sequence reads up to 100bp, while the other two are for longer sequences ranging from 70bp to 1Mbp. BWA-MEM and BWA-SW share similar features such as long-read support and split alignment, but BWA-MEM, which is the latest, is generally recommended for high-quality queries as it is faster and more accurate. BWA-MEM also has better performance than BWA-backtrack for 70-100bp Illumina reads.

          For all the algorithms, BWA first needs to construct the FM-index for the reference genome (the index command). Alignment algorithms are invoked with different sub-commands: aln/samse/sampe for BWA-backtrack, bwasw for BWA-SW and mem for the BWA-MEM algorithm.
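
As a sketch of how these sub-commands chain together for paired-end reads with BWA-backtrack, the Python snippet below only assembles the command lines shown in the SYNOPSIS (index, one aln run per read file, then sampe); it does not execute anything. The helper names and the default output file names are illustrative assumptions, while the sub-commands and their argument order are taken from the synopsis.

```python
import shlex
from typing import List, Optional, Tuple

# One workflow step: (argv, file that stdout is redirected to, or None).
Step = Tuple[List[str], Optional[str]]


def backtrack_pe_steps(ref: str, read1: str, read2: str,
                       sai1: str = "aln_sa1.sai", sai2: str = "aln_sa2.sai",
                       sam: str = "aln-pe.sam") -> List[Step]:
    """Assemble the BWA-backtrack paired-end workflow (hypothetical helper)."""
    return [
        (["bwa", "index", ref], None),        # build the FM-index once
        (["bwa", "aln", ref, read1], sai1),   # SA coordinates for the first ends
        (["bwa", "aln", ref, read2], sai2),   # SA coordinates for the second ends
        # pair the two .sai files with their reads and write SAM
        (["bwa", "sampe", ref, sai1, sai2, read1, read2], sam),
    ]


def as_shell(step: Step) -> str:
    """Render one step as the equivalent shell command line."""
    argv, out = step
    line = shlex.join(argv)
    return f"{line} > {shlex.quote(out)}" if out else line


if __name__ == "__main__":
    for step in backtrack_pe_steps("ref.fa", "read1.fq", "read2.fq"):
        print(as_shell(step))
```

Keeping each step as an argument list plus an output path makes it straightforward to hand the same data to a process runner later, with stdout redirected to the named file.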

COMMANDS AND OPTIONS

          index  bwa index [-p prefix] [-a algoType] db.fa

              Index database sequences in the FASTA format.

              OPTIONS:

              -p STR    Prefix of the output database [same as db filename]

              -a STR    Algorithm for constructing BWT index. BWA implements two algorithms for BWT construction: is and bwtsw. The first algorithm is a little faster for small databases but requires large RAM and does not work for databases with total length longer than 2GB. The second algorithm is adapted from the BWT-SW source code. It in theory works with databases with trillions of bases. When this option is not specified, the appropriate algorithm will be chosen automatically.

          mem    bwa mem [-aCHMpP] [-t nThreads] [-k minSeedLen] [-w bandWidth] [-d zDropoff] [-r seedSplitRatio] [-c maxOcc] [-A matchScore] [-B mmPenalty] [-O gapOpenPen] [-E gapExtPen] [-L clipPen] [-U unpairPen] [-R RGline] [-v verboseLevel] db.prefix reads.fq [mates.fq]

              Align 70bp-1Mbp query sequences with the BWA-MEM algorithm. Briefly, the algorithm works by seeding alignments with maximal exact matches (MEMs) and then extending seeds with the affine-gap Smith-Waterman algorithm (SW).

              If the mates.fq file is absent and option -p is not set, this command regards input reads as single-end. If mates.fq is present, this command assumes the i-th read in reads.fq and the i-th read in mates.fq constitute a read pair. If -p is used, the command assumes the 2i-th and the (2i+1)-th read in reads.fq constitute a read pair (such an input file is said to be interleaved). In this case, mates.fq is ignored. In the paired-end mode, the mem command will infer the read orientation and the insert size distribution from a batch of reads.

              The BWA-MEM algorithm performs local alignment. It may produce multiple primary alignments for different parts of a query sequence. This is a crucial feature for long sequences. However, some tools such as Picard's markDuplicates do not work with split alignments. One may consider using option -M to flag shorter split hits as secondary.

              OPTIONS:

              -t INT    Number of threads [1]

              -k INT    Minimum seed length. Matches shorter than INT will be missed. The alignment speed is usually insensitive to this value unless it significantly deviates from 20. [19]

              -w INT    Band width. Essentially, gaps longer than INT will not be found. Note that the maximum gap length is also affected by the scoring matrix and the hit length, not solely determined by this option. [100]

              -d INT    Off-diagonal X-dropoff (Z-dropoff). Stop extension when the difference between the best and the current extension score is above |i-j|*A+INT, where i and j are the current positions of the query and reference, respectively, and A is the matching score. Z-dropoff is similar to BLAST's X-dropoff except that it doesn't penalize gaps in one of the sequences in the alignment. Z-dropoff not only avoids unnecessary extension, but also reduces poor alignments inside a long good alignment. [100]

              -r FLOAT  Trigger re-seeding for a MEM longer than minSeedLen*FLOAT. This is a key heuristic parameter for tuning the performance. Larger value yields fewer seeds, which leads to faster alignment speed but lower accuracy. [1.5]

              -c INT    Discard a MEM if it has more than INT occurrences in the genome. This is an insensitive parameter. [10000]

              -P        In the paired-end mode, perform SW to rescue missing hits only but do not try to find hits that fit a proper pair.

              -A INT    Matching score. [1]

              -B INT    Mismatch penalty. The sequence error rate is approximately: {.75 * exp[-log(4) * B/A]}. [4]

              -O INT    Gap open penalty. [6]

              -E INT    Gap extension penalty. A gap of length k costs O + k*E (i.e. -O is for opening a zero-length gap). [1]

              -L INT    Clipping penalty. When performing SW extension, BWA-MEM keeps track of the best score reaching the end of query. If this score is larger than the best SW score minus the clipping penalty, clipping will not be applied. Note that in this case, the SAM AS tag reports the best SW score; clipping penalty is not deducted. [5]

              -U INT    Penalty for an unpaired read pair. BWA-MEM scores an unpaired read pair as scoreRead1+scoreRead2-INT and scores a paired as scoreRead1+scoreRead2-insertPenalty. It compares these two scores to determine whether we should force pairing. A larger value leads to more aggressive read pairing. [17]

              -p        Assume the first input query file is interleaved paired-end FASTA/Q. See the command description for details.

              -R STR    Complete read group header line. '\t' can be used in STR and will be converted to a TAB in the output SAM. The read group ID will be attached to every read in the output. An example is '@RG\tID:foo\tSM:bar'. [null]

              -T INT    Don't output alignment with score lower than INT. This option affects output and occasionally SAM flag 2. [30]

              -a        Output all found alignments for single-end or unpaired paired-end reads. These alignments will be flagged as secondary alignments.

              -C        Append FASTA/Q comment to SAM output. This option can be used to transfer read meta information (e.g. barcode) to the SAM output. Note that the FASTA/Q comment (the string after a space in the header line) must conform to the SAM spec (e.g. BC:Z:CGTAC). Malformed comments lead to incorrect SAM output.

              -H        Use hard clipping 'H' in the SAM output. This option may dramatically reduce the redundancy of output when mapping long contig or BAC sequences.

              -M        Mark shorter split hits as secondary (for Picard compatibility).

              -v INT    Control the verbose level of the output. This option has not been fully supported throughout BWA. Ideally, a value 0 for disabling all the output to stderr; 1 for outputting errors only; 2 for warnings and errors; 3 for all normal messages; 4 or higher for debugging. When this option takes value 4, the output is not SAM. [3]

          aln    bwa aln [-n maxDiff] [-o maxGapO] [-e maxGapE] [-d nDelTail] [-i nIndelEnd] [-k maxSeedDiff] [-l seedLen] [-t nThrds] [-cRN] [-M misMsc] [-O gapOsc] [-E gapEsc] [-q trimQual] <in.db.fasta> <in.query.fq> > <out.sai>

              Find the SA coordinates of the input reads. Maximum maxSeedDiff differences are allowed in the first seedLen subsequence and maximum maxDiff differences are allowed in the whole sequence.

              OPTIONS:

              -n NUM    Maximum edit distance if the value is INT, or the fraction of missing alignments given 2% uniform base error rate if FLOAT. In the latter case, the maximum edit distance is automatically chosen for different read lengths. [0.04]

              -o INT    Maximum number of gap opens [1]

              -e INT    Maximum number of gap extensions, -1 for k-difference mode (disallowing long gaps) [-1]

              -d INT    Disallow a long deletion within INT bp towards the 3'-end [16]

              -i INT    Disallow an indel within INT bp towards the ends [5]

              -l INT    Take the first INT subsequence as seed. If INT is larger than the query sequence, seeding will be disabled. For long reads, this option is typically ranged from 25 to 35 for `-k 2'. [inf]

              -k INT    Maximum edit distance in the seed [2]

              -t INT    Number of threads (multi-threading mode) [1]

              -M INT    Mismatch penalty. BWA will not search for suboptimal hits with a score lower than (bestScore-misMsc). [3]

              -O INT    Gap open penalty [11]

              -E INT    Gap extension penalty [4]

              -R INT    Proceed with suboptimal alignments if there are no more than INT equally best hits. This option only affects paired-end mapping. Increasing this threshold helps to improve the pairing accuracy at the cost of speed, especially for short reads (~32bp).

              -c        Reverse query but not complement it, which is required for alignment in the color space. (Disabled since 0.6.x)

              -N        Disable iterative search. All hits with no more than maxDiff differences will be found. This mode is much slower than the default.

              -q INT    Parameter for read trimming. BWA trims a read down to argmax_x{\sum_{i=x+1}^l(INT-q_i)} if q_l<INT where l is the original read length. [0]

              -I        The input is in the Illumina 1.3+ read format (quality equals ASCII-64).

              -B INT    Length of barcode starting from the 5'-end. When INT is positive, the barcode of each read will be trimmed before mapping and will be written at the BC SAM tag. For paired-end reads, the barcodes from both ends are concatenated. [0]

              -b        Specify the input read sequence file is the BAM format. For paired-end data, two ends in a pair must be grouped together and options -1 or -2 are usually applied to specify which end should be mapped. Typical command lines for mapping pair-end data in the BAM format are:

                            bwa aln ref.fa -b1 reads.bam > 1.sai
                            bwa aln ref.fa -b2 reads.bam > 2.sai
                            bwa sampe ref.fa 1.sai 2.sai reads.bam reads.bam > aln.sam

              -0        When -b is specified, only use single-end reads in mapping.

              -1        When -b is specified, only use the first read in a read pair in mapping (skip single-end reads and the second reads).

              -2        When -b is specified, only use the second read in a read pair in mapping.

          samse  bwa samse [-n maxOcc] <in.db.fasta> <in.sai> <in.fq> > <out.sam>

              Generate alignments in the SAM format given single-end reads. Repetitive hits will be randomly chosen.

              OPTIONS:

              -n INT    Maximum number of alignments to output in the XA tag for reads paired properly. If a read has more than INT hits, the XA tag will not be written. [3]

              -r STR    Specify the read group in a format like `@RG\tID:foo\tSM:bar'. [null]

          sampe  bwa sampe [-a maxInsSize] [-o maxOcc] [-n maxHitPaired] [-N maxHitDis] [-P] <in.db.fasta> <in1.sai> <in2.sai> <in1.fq> <in2.fq> > <out.sam>

              Generate alignments in the SAM format given paired-end reads. Repetitive read pairs will be placed randomly.

              OPTIONS:

              -a INT    Maximum insert size for a read pair to be considered being mapped properly. Since 0.4.5, this option is only used when there are not enough good alignments to infer the distribution of insert sizes. [500]

              -o INT    Maximum occurrences of a read for pairing. A read with more occurrences will be treated as a single-end read. Reducing this parameter helps faster pairing. [100000]

              -P        Load the entire FM-index into memory to reduce disk operations (base-space reads only). With this option, at least 1.25N bytes of memory are required, where N is the length of the genome.

              -n INT    Maximum number of alignments to output in the XA tag for reads paired properly. If a read has more than INT hits, the XA tag will not be written. [3]

              -N INT    Maximum number of alignments to output in the XA tag for discordant read pairs (excluding singletons). If a read has more than INT hits, the XA tag will not be written. [10]

              -r STR    Specify the read group in a format like `@RG\tID:foo\tSM:bar'. [null]

          bwasw  bwa bwasw [-a matchScore] [-b mmPen] [-q gapOpenPen] [-r gapExtPen] [-t nThreads] [-w bandWidth] [-T thres] [-s hspIntv] [-z zBest] [-N nHspRev] [-c thresCoef] <in.db.fasta> <in.fq> [mate.fq]

              Align query sequences in the in.fq file. When mate.fq is present, perform paired-end alignment. The paired-end mode only works for reads from Illumina short-insert libraries. In the paired-end mode, BWA-SW may still output split alignments but they are all marked as not properly paired; the mate positions will not be written if the mate has multiple local hits.

              OPTIONS:

              -a INT    Score of a match [1]

              -b INT    Mismatch penalty [3]

              -q INT    Gap open penalty [5]

              -r INT    Gap extension penalty. The penalty for a contiguous gap of size k is q+k*r. [2]

              -t INT    Number of threads in the multi-threading mode [1]

              -w INT    Band width in the banded alignment [33]

              -T INT    Minimum score threshold divided by a [37]

              -c FLOAT  Coefficient for threshold adjustment according to query length. Given an l-long query, the threshold for a hit to be retained is a*max{T,c*log(l)}. [5.5]

              -z INT    Z-best heuristics. Higher -z increases accuracy at the cost of speed. [1]

              -s INT    Maximum SA interval size for initiating a seed. Higher -s increases accuracy at the cost of speed. [3]

              -N INT    Minimum number of seeds supporting the resultant alignment to skip reverse alignment. [5]
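
The read-trimming rule quoted for the -q option of aln can be illustrated with a short Python sketch. It implements only the formula as stated: if the last base quality is below INT, keep the first x bases, where x maximizes the sum of (INT - q_i) over the discarded tail. Any further safeguards BWA itself may apply (for example a minimum retained length) are not modelled, and the function name is hypothetical.

```python
from typing import Sequence


def trimmed_length(quals: Sequence[int], threshold: int) -> int:
    """Return the number of leading bases kept after -q style trimming.

    quals holds Phred qualities q_1..q_l in read order; threshold is INT.
    """
    length = len(quals)
    # Trimming is only triggered when the last base is below the threshold.
    if length == 0 or quals[-1] >= threshold:
        return length

    best_x, best_sum, running = length, 0, 0
    # Walk from the 3' end; after adding quals[x], running equals
    # sum_{i=x+1}^{l} (INT - q_i) in the 1-based notation of the manual.
    for x in range(length - 1, -1, -1):
        running += threshold - quals[x]
        if running > best_sum:
            best_sum, best_x = running, x
    return best_x


if __name__ == "__main__":
    # The low-quality tail (10, 5, 2) is removed, leaving 3 bases.
    print(trimmed_length([30, 30, 30, 10, 5, 2], 20))
```

With the default threshold of 0 no quality can fall below it, so the read is returned untrimmed.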

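The scoring formulas quoted in the option descriptions are easy to check numerically. The Python sketch below evaluates the mem gap cost O + k*E, the approximate sequence error rate .75 * exp[-log(4) * B/A], and the bwasw retention threshold a*max{T,c*log(l)}, using the defaults listed in this section. The function names are illustrative, and reading log as the natural logarithm is an assumption.

```python
import math


def mem_gap_cost(k: int, gap_open: int = 6, gap_ext: int = 1) -> int:
    """Cost of a gap of length k under mem's -O and -E: O + k*E."""
    return gap_open + k * gap_ext


def mem_error_rate(match: int = 1, mismatch: int = 4) -> float:
    """Approximate error rate implied by -A and -B: .75 * exp[-log(4) * B/A]."""
    return 0.75 * math.exp(-math.log(4) * mismatch / match)


def bwasw_threshold(query_len: int, a: int = 1, t: int = 37, c: float = 5.5) -> float:
    """Score needed for a bwasw hit to be retained: a*max{T, c*log(l)}.

    Assumption: log denotes the natural logarithm.
    """
    return a * max(t, c * math.log(query_len))


if __name__ == "__main__":
    print(mem_gap_cost(3))
    print(mem_error_rate())
    print(bwasw_threshold(100), bwasw_threshold(10000))
```

With the defaults, a 3-base gap in mem costs 9, and short bwasw queries are governed by T while long queries are governed by the c*log(l) term.
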
SAM ALIGNMENT FORMAT

          The output of the `aln' command is binary and designed for BWA use only. BWA outputs the final alignment in the SAM (Sequence Alignment/Map) format. Each line consists of:

          ┌────┬───────┬──────────────────────────────────────────────────────────┐
          │Col │ Field │ Description                                              │
          ├────┼───────┼──────────────────────────────────────────────────────────┤
          │ 1  │ QNAME │ Query (pair) NAME                                        │
          │ 2  │ FLAG  │ bitwise FLAG                                             │
          │ 3  │ RNAME │ Reference sequence NAME                                  │
          │ 4  │ POS   │ 1-based leftmost POSition/coordinate of clipped sequence │
          │ 5  │ MAPQ  │ MAPping Quality (Phred-scaled)                           │
          │ 6  │ CIGAR │ extended CIGAR string                                    │
          │ 7  │ MRNM  │ Mate Reference sequence NaMe (`=' if same as RNAME)      │
          │ 8  │ MPOS  │ 1-based Mate POSition                                    │
          │ 9  │ ISIZE │ Inferred insert SIZE                                     │
          │10  │ SEQ   │ query SEQuence on the same strand as the reference       │
          │11  │ QUAL  │ query QUALity (ASCII-33 gives the Phred base quality)    │
          │12  │ OPT   │ variable OPTional fields in the format TAG:VTYPE:VALUE   │
          └────┴───────┴──────────────────────────────────────────────────────────┘

          Each bit in the FLAG field is defined as:

          ┌────┬────────┬───────────────────────────────────────┐
          │Chr │ Flag   │ Description                           │
          ├────┼────────┼───────────────────────────────────────┤
          │ p  │ 0x0001 │ the read is paired in sequencing      │
          │ P  │ 0x0002 │ the read is mapped in a proper pair   │
          │ u  │ 0x0004 │ the query sequence itself is unmapped │
          │ U  │ 0x0008 │ the mate is unmapped                  │
          │ r  │ 0x0010 │ strand of the query (1 for reverse)   │
          │ R  │ 0x0020 │ strand of the mate                    │
          │ 1  │ 0x0040 │ the read is the first read in a pair  │
          │ 2  │ 0x0080 │ the read is the second read in a pair │
          │ s  │ 0x0100 │ the alignment is not primary          │
          │ f  │ 0x0200 │ QC failure                            │
          │ d  │ 0x0400 │ optical or PCR duplicate              │
          └────┴────────┴───────────────────────────────────────┘

          Please check <http://samtools.sourceforge.net> for the format specification and the tools for post-processing the alignment.

          BWA generates the following optional fields. Tags starting with `X' are specific to BWA.

          ┌────┬───────────────────────────────────────────────────────┐
          │Tag │ Meaning                                               │
          ├────┼───────────────────────────────────────────────────────┤
          │NM  │ Edit distance                                         │
          │MD  │ Mismatching positions/bases                           │
          │AS  │ Alignment score                                       │
          │BC  │ Barcode sequence                                      │
          ├────┼───────────────────────────────────────────────────────┤
          │X0  │ Number of best hits                                   │
          │X1  │ Number of suboptimal hits found by BWA                │
          │XN  │ Number of ambiguous bases in the reference            │
          │XM  │ Number of mismatches in the alignment                 │
          │XO  │ Number of gap opens                                   │
          │XG  │ Number of gap extensions                              │
          │XT  │ Type: Unique/Repeat/N/Mate-sw                         │
          │XA  │ Alternative hits; format: /(chr,pos,CIGAR,NM;)*/      │
          ├────┼───────────────────────────────────────────────────────┤
          │XS  │ Suboptimal alignment score                            │
          │XF  │ Support from forward/reverse alignment                │
          │XE  │ Number of supporting seeds                            │
          ├────┼───────────────────────────────────────────────────────┤
          │XP  │ Alt primary hits; format: /(chr,pos,CIGAR,mapQ,NM;)+/ │
          └────┴───────────────────────────────────────────────────────┘

          Note that XO and XG are generated by BWT search while the CIGAR string by Smith-Waterman alignment. These two tags may be inconsistent with the CIGAR string. This is not a bug.
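
To make the FLAG table concrete, here is a small Python sketch that decodes a numeric FLAG into the one-character codes and descriptions from the table. The bit values, codes and descriptions come from the table; the function names are illustrative only.

```python
from typing import List, Tuple

# (bit, one-character code, description), in the order of the FLAG table.
FLAG_BITS: List[Tuple[int, str, str]] = [
    (0x0001, "p", "the read is paired in sequencing"),
    (0x0002, "P", "the read is mapped in a proper pair"),
    (0x0004, "u", "the query sequence itself is unmapped"),
    (0x0008, "U", "the mate is unmapped"),
    (0x0010, "r", "strand of the query (1 for reverse)"),
    (0x0020, "R", "strand of the mate"),
    (0x0040, "1", "the read is the first read in a pair"),
    (0x0080, "2", "the read is the second read in a pair"),
    (0x0100, "s", "the alignment is not primary"),
    (0x0200, "f", "QC failure"),
    (0x0400, "d", "optical or PCR duplicate"),
]


def decode_flag(flag: int) -> List[Tuple[str, str]]:
    """Return (code, description) for every bit set in a SAM FLAG value."""
    return [(code, desc) for bit, code, desc in FLAG_BITS if flag & bit]


def flag_codes(flag: int) -> str:
    """Compact string of the one-character codes for the set bits."""
    return "".join(code for code, _ in decode_flag(flag))


if __name__ == "__main__":
    for value in (99, 147):
        print(value, flag_codes(value))
        for code, desc in decode_flag(value):
            print("   ", code, desc)
```

For example, FLAG 99 (0x63) has bits 0x1, 0x2, 0x20 and 0x40 set, so it describes the first read of a properly paired pair whose mate maps to the reverse strand.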

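The XA format listed in the tag table, /(chr,pos,CIGAR,NM;)*/, can be split into records with a few lines of Python. This is a minimal sketch: it expects the tag value only (the text after XA:Z:), and it assumes that pos carries a leading + or - sign for the strand, as is commonly seen in BWA output; the table itself only names the four fields. The AltHit type and parse_xa name are hypothetical.

```python
from typing import List, NamedTuple


class AltHit(NamedTuple):
    chrom: str
    strand: str   # "+" or "-", taken from the sign of pos (assumption)
    pos: int      # position without the sign
    cigar: str
    nm: int       # edit distance


def parse_xa(value: str) -> List[AltHit]:
    """Parse the value of an XA tag into a list of alternative hits."""
    hits = []
    for entry in value.split(";"):
        if not entry:          # the value ends with ';', leaving an empty piece
            continue
        # Split from the right so the last three fields are always pos, CIGAR, NM.
        chrom, pos, cigar, nm = entry.rsplit(",", 3)
        strand = "-" if pos.startswith("-") else "+"
        hits.append(AltHit(chrom, strand, abs(int(pos)), cigar, int(nm)))
    return hits


if __name__ == "__main__":
    for hit in parse_xa("chr1,+1000,36M,1;chr2,-2000,35M1S,2;"):
        print(hit)
```
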
NOTES ON SHORT-READ ALIGNMENT

Alignment Accuracy

When seeding is disabled, BWA guarantees to find an alignment containing maximum maxDiff differences including maxGapO gap opens which do not occur within nIndelEnd bp towards either end of the query. Longer gaps may be found if maxGapE is positive, but it is not guaranteed to find all hits. When seeding is enabled, BWA further requires that the first seedLen subsequence contains no more than maxSeedDiff differences.

When gapped alignment is disabled, BWA is expected to generate the same alignment as Eland version 1, the Illumina alignment program. However, as BWA changes `N' in the database sequence to random nucleotides, hits to these random sequences will also be counted. As a consequence, BWA may mark a unique hit as a repeat, if the random sequences happen to be identical to the sequences which should be unique in the database.

By default, if the best hit is not highly repetitive (controlled by -R), BWA also finds all hits containing one more mismatch; otherwise, BWA finds all equally best hits only. Base quality is NOT considered in evaluating hits. In the paired-end mode, BWA pairs all hits it found. It further performs Smith-Waterman alignment for unmapped reads to rescue reads with a high error rate, and for high-quality anomalous pairs to fix potential alignment errors.

Estimating Insert Size Distribution

BWA estimates the insert size distribution per 256*1024 read pairs. It first collects pairs of reads with both ends mapped with a single-end quality 20 or higher and then calculates median (Q2), lower and higher quartile (Q1 and Q3). It estimates the mean and the variance of the insert size distribution from pairs whose insert sizes are within interval [Q1-2(Q3-Q1), Q3+2(Q3-Q1)]. The maximum distance x for a pair considered to be properly paired (SAM flag 0x2) is calculated by solving equation Phi((x-mu)/sigma)=x/L*p0, where mu is the mean, sigma is the standard error of the insert size distribution, L is the length of the genome, p0 is prior of anomalous pair and Phi() is the standard cumulative distribution function. For mapping Illumina short-insert reads to the human genome, x is about 6-7 sigma away from the mean. Quartiles, mean, variance and x will be printed to the standard error output.

Memory Requirement

With bwtsw algorithm, 5GB memory is required for indexing the complete human genome sequences. For short reads, the aln command uses ~3.2GB memory and the sampe command uses ~5.4GB.

Speed

Indexing the human genome sequences takes 3 hours with bwtsw algorithm. Indexing smaller genomes with IS algorithms is faster, but requires more memory.

The speed of alignment is largely determined by the error rate of the query sequences (r). Firstly, BWA runs much faster for near perfect hits than for hits with many differences, and it stops searching for a hit with l+2 differences if a l-difference hit is found. This means BWA will be very slow if r is high because in this case BWA has to visit hits with many differences and looking for these hits is expensive. Secondly, the alignment algorithm behind makes the speed sensitive to [k log(N)/m], where k is the maximum allowed differences, N the size of database and m the length of a query. In practice, we choose k w.r.t. r and therefore r is the leading factor. I would not recommend to use BWA on data with r>0.02.

Pairing is slower for shorter reads. This is mainly because shorter reads have more spurious hits and converting SA coordinates to chromosomal coordinates is very costly.
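
The quartile-based filtering described under insert size estimation can be sketched in Python as follows. Only the outlier filter and the mean/variance step are shown; the proper-pair bound x is not computed. The quartile convention (the default of statistics.quantiles) and the use of the population variance are assumptions, since the text does not say which conventions BWA uses, and the function name is hypothetical.

```python
import statistics
from typing import Sequence, Tuple


def estimate_insert_size(sizes: Sequence[float]) -> Tuple[float, float, int]:
    """Return (mean, variance, pairs kept) after the quartile-based filter.

    sizes: insert sizes of pairs whose two ends both mapped with
    single-end quality 20 or higher (the caller is assumed to have
    applied that filter already).
    """
    q1, _q2, q3 = statistics.quantiles(sizes, n=4)   # Q1, median, Q3
    spread = q3 - q1
    low, high = q1 - 2 * spread, q3 + 2 * spread     # [Q1-2(Q3-Q1), Q3+2(Q3-Q1)]
    kept = [s for s in sizes if low <= s <= high]
    return statistics.fmean(kept), statistics.pvariance(kept), len(kept)


if __name__ == "__main__":
    # 21 well-behaved pairs plus one chimeric-looking outlier.
    observed = list(range(190, 211)) + [5000]
    mean, variance, kept = estimate_insert_size(observed)
    print(mean, variance, kept)
```

In this toy input the 5000bp pair falls outside the interval and is excluded, so it does not inflate the estimated mean or variance.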

CHANGES IN BWA-0.6

          Since version 0.6, BWA has been able to work with a reference genome longer than 4GB. This feature makes it possible to integrate the forward and reverse complemented genome in one FM-index, which speeds up both BWA-short and BWA-SW. As a tradeoff, BWA uses more memory because it has to keep all positions and ranks in 64-bit integers, twice larger than 32-bit integers used in the previous versions.

          The latest BWA-SW also works for paired-end reads longer than 100bp. In comparison to BWA-short, BWA-SW tends to be more accurate for highly unique reads and more robust to relative long INDELs and structural variants. Nonetheless, BWA-short usually has higher power to distinguish the optimal hit from many suboptimal hits. The choice of the mapping algorithm may depend on the application.

SEE ALSO

          BWA website <http://bio-bwa.sourceforge.net>, Samtools website <http://samtools.sourceforge.net>

AUTHOR

          Heng Li at the Sanger Institute wrote the key source codes and integrated the following codes for BWT construction: bwtsw <http://i.cs.hku.hk/~ckwong3/bwtsw/>, implemented by Chi-Kwong Wong at the University of Hong Kong and IS <http://yuta.256.googlepages.com/sais> originally proposed by Nong Ge <http://www.cs.sysu.edu.cn/nong/> at the Sun Yat-Sen University and implemented by Yuta Mori.

LICENSE AND CITATION

          The full BWA package is distributed under GPLv3 as it uses source codes from BWT-SW which is covered by GPL. Sorting, hash table, BWT and IS libraries are distributed under the MIT license.

          If you use the BWA-backtrack algorithm, please cite the following paper:

          Li H. and Durbin R. (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754-1760. [PMID: 19451168]

          If you use the BWA-SW algorithm, please cite:

          Li H. and Durbin R. (2010) Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics, 26, 589-595. [PMID: 20080505]

          If you use BWA-MEM or the fastmap component of BWA, please cite:

          Li H. (2013) Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997v1 [q-bio.GN].

          It is likely that the BWA-MEM manuscript will not appear in a peer-reviewed journal.

HISTORY

BWA is largely influenced by BWT-SW. It uses source codes from BWT-SW and mimics its binary file formats; BWA-SW resembles BWT-SW in several ways. The initial idea about BWT-based alignment also came from the group who developed BWT-SW. At the same time, BWA is different enough from BWT-SW. The short-read alignment algorithm bears no similarity to the Smith-Waterman algorithm any more. While BWA-SW learns from BWT-SW, it introduces heuristics that can hardly be applied to the original algorithm. In all, BWA does not guarantee to find all local hits as BWT-SW is designed to do, but it is much faster than BWT-SW on both short and long query sequences.

I started to write the first piece of code on 24 May 2008 and got the initial stable version on 02 June 2008. During this period, I became aware that Professor Tak-Wah Lam, the first author of the BWT-SW paper, was collaborating with Beijing Genomics Institute on SOAP2, the successor to SOAP (Short Oligonucleotide Analysis Package). SOAP2 came out in November 2008. According to the SourceForge download page, the third BWT-based short read aligner, bowtie, was first released in August 2008. At the time of writing this manual, at least three more BWT-based short-read aligners are being implemented.

The BWA-SW algorithm is a new component of BWA. It was conceived in November 2008 and implemented ten months later.

The BWA-MEM algorithm is based on an algorithm finding super-maximal exact matches (SMEMs), which was first published with the fermi assembler paper in 2012. I first implemented the basic SMEM algorithm in the fastmap command for an experiment and then extended the basic algorithm and added the extension part in February 2013 to make BWA-MEM a fully featured mapper.

Source: http://manpages.ubuntu.com/manpages/trusty/man1/bwa.1.html
