Run Bwa Aln With Paired End Reads

Provided by: bwa_0.7.5a-2_amd64

        

NAME

          bwa - Burrows-Wheeler Alignment Tool        

SYNOPSIS

       bwa index ref.fa
       bwa mem ref.fa reads.fq > aln-se.sam
       bwa mem ref.fa read1.fq read2.fq > aln-pe.sam
       bwa aln ref.fa short_read.fq > aln_sa.sai
       bwa samse ref.fa aln_sa.sai short_read.fq > aln-se.sam
       bwa sampe ref.fa aln_sa1.sai aln_sa2.sai read1.fq read2.fq > aln-pe.sam
       bwa bwasw ref.fa long_read.fq > aln.sam

DESCRIPTION

       BWA is a software package for mapping low-divergent sequences against a large reference
       genome, such as the human genome. It consists of three algorithms: BWA-backtrack, BWA-SW
       and BWA-MEM. The first algorithm is designed for Illumina sequence reads up to 100bp,
       while the other two are for longer sequences ranging from 70bp to 1Mbp. BWA-MEM and
       BWA-SW share similar features such as long-read support and split alignment, but BWA-MEM,
       which is the latest, is generally recommended for high-quality queries as it is faster
       and more accurate. BWA-MEM also has better performance than BWA-backtrack for 70-100bp
       Illumina reads.

       For all the algorithms, BWA first needs to construct the FM-index for the reference
       genome (the index command). Alignment algorithms are invoked with different
       sub-commands: aln/samse/sampe for BWA-backtrack, bwasw for BWA-SW and mem for the
       BWA-MEM algorithm.
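       The paired-end BWA-backtrack workflow implied by the sub-commands above (index, then
       aln once per read file, then sampe) can be sketched as a small Python helper that only
       builds the command lines from the SYNOPSIS. The function name backtrack_pe_commands and
       the default output name are illustrative assumptions, not part of BWA; the helper does
       not run bwa itself.

```python
import shlex


def backtrack_pe_commands(ref, read1, read2, out_sam="aln-pe.sam"):
    """Return shell command lines for a paired-end BWA-backtrack run.

    Hypothetical helper: it only formats the commands shown in the SYNOPSIS
    (index, aln for each end, sampe) and never executes them.
    """
    q = shlex.quote  # protect file names that contain spaces or shell characters
    sai1, sai2 = "aln_sa1.sai", "aln_sa2.sai"  # intermediate SA-coordinate files
    return [
        f"bwa index {q(ref)}",
        f"bwa aln {q(ref)} {q(read1)} > {q(sai1)}",
        f"bwa aln {q(ref)} {q(read2)} > {q(sai2)}",
        f"bwa sampe {q(ref)} {q(sai1)} {q(sai2)} {q(read1)} {q(read2)} > {q(out_sam)}",
    ]


for line in backtrack_pe_commands("ref.fa", "read1.fq", "read2.fq"):
    print(line)
```

       Each read file is aligned separately with aln, and only sampe sees both ends together,
       which is why the two .sai files and the two FASTQ files are passed in matching order.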

COMMANDS AND OPTIONS

       index  bwa index [-p prefix] [-a algoType] db.fa

              Index database sequences in the FASTA format.

              OPTIONS:

              -p STR    Prefix of the output database [same as db filename]

              -a STR    Algorithm for constructing the BWT index. BWA implements two algorithms
                        for BWT construction: is and bwtsw. The first algorithm is a little
                        faster for small databases but requires large RAM and does not work for
                        databases with a total length longer than 2GB. The second algorithm is
                        adapted from the BWT-SW source code. In theory it works with databases
                        of trillions of bases. When this option is not specified, the
                        appropriate algorithm will be chosen automatically.

       mem    bwa mem [-aCHMpP] [-t nThreads] [-k minSeedLen] [-w bandWidth] [-d zDropoff]
              [-r seedSplitRatio] [-c maxOcc] [-A matchScore] [-B mmPenalty] [-O gapOpenPen]
              [-E gapExtPen] [-L clipPen] [-U unpairPen] [-R RGline] [-v verboseLevel]
              db.prefix reads.fq [mates.fq]

              Align 70bp-1Mbp query sequences with the BWA-MEM algorithm. Briefly, the
              algorithm works by seeding alignments with maximal exact matches (MEMs) and then
              extending seeds with the affine-gap Smith-Waterman algorithm (SW).

              If the mates.fq file is absent and option -p is not set, this command treats the
              input reads as single-end.
              If mates.fq is present, this command assumes the i-th read in reads.fq and the
              i-th read in mates.fq constitute a read pair. If -p is used, the command assumes
              the 2i-th and the (2i+1)-th reads in reads.fq constitute a read pair (such an
              input file is said to be interleaved). In this case, mates.fq is ignored. In the
              paired-end mode, the mem command will infer the read orientation and the insert
              size distribution from a batch of reads.

              The BWA-MEM algorithm performs local alignment. It may produce multiple primary
              alignments for different parts of a query sequence. This is a crucial feature
              for long sequences. However, some tools such as Picard's markDuplicates do not
              work with split alignments. Consider using option -M to flag shorter split hits
              as secondary.

              OPTIONS:

              -t INT    Number of threads [1]

              -k INT    Minimum seed length. Matches shorter than INT will be missed. The
                        alignment speed is usually insensitive to this value unless it
                        deviates significantly from 20. [19]

              -w INT    Band width. Essentially, gaps longer than INT will not be found. Note
                        that the maximum gap length is also affected by the scoring matrix and
                        the hit length, not solely determined by this option. [100]

              -d INT    Off-diagonal X-dropoff (Z-dropoff).
                        Stop extension when the difference between the best and the current
                        extension score is above |i-j|*A+INT, where i and j are the current
                        positions of the query and reference, respectively, and A is the
                        matching score. Z-dropoff is similar to BLAST's X-dropoff except that
                        it doesn't penalize gaps in one of the sequences in the alignment.
                        Z-dropoff not only avoids unnecessary extension, but also reduces poor
                        alignments inside a long good alignment. [100]

              -r FLOAT  Trigger re-seeding for a MEM longer than minSeedLen*FLOAT. This is a
                        key heuristic parameter for tuning the performance. A larger value
                        yields fewer seeds, which leads to faster alignment speed but lower
                        accuracy. [1.5]

              -c INT    Discard a MEM if it has more than INT occurrences in the genome. This
                        is an insensitive parameter. [10000]

              -P        In the paired-end mode, perform SW to rescue missing hits only, but do
                        not try to find hits that fit a proper pair.

              -A INT    Matching score. [1]

              -B INT    Mismatch penalty. The sequence error rate is approximately:
                        {.75 * exp[-log(4) * B/A]}. [4]

              -O INT    Gap open penalty. [6]

              -E INT    Gap extension penalty. A gap of length k costs O + k*E (i.e. -O is for
                        opening a zero-length gap). [1]

              -L INT    Clipping penalty.
                        When performing SW extension, BWA-MEM keeps track of the best score
                        reaching the end of the query. If this score is larger than the best SW
                        score minus the clipping penalty, clipping will not be applied. Note
                        that in this case, the SAM AS tag reports the best SW score; the
                        clipping penalty is not deducted. [5]

              -U INT    Penalty for an unpaired read pair. BWA-MEM scores an unpaired read pair
                        as scoreRead1+scoreRead2-INT and scores a paired one as
                        scoreRead1+scoreRead2-insertPenalty. It compares these two scores to
                        determine whether pairing should be forced. A larger value leads to
                        more aggressive read pairing. [17]

              -p        Assume the first input query file is interleaved paired-end FASTA/Q.
                        See the command description for details.

              -R STR    Complete read group header line. '\t' can be used in STR and will be
                        converted to a TAB in the output SAM. The read group ID will be
                        attached to every read in the output. An example is
                        '@RG\tID:foo\tSM:bar'. [null]

              -T INT    Don't output alignments with a score lower than INT. This option
                        affects output and occasionally SAM flag 2. [30]

              -a        Output all found alignments for single-end or unpaired paired-end
                        reads. These alignments will be flagged as secondary alignments.

              -C        Append the FASTA/Q comment to the SAM output. This option can be used
                        to transfer read meta information (e.g. barcode) to the SAM output.
                        Note that the FASTA/Q comment (the string after a space in the header
                        line) must conform to the SAM spec (e.g. BC:Z:CGTAC). Malformed
                        comments lead to incorrect SAM output.

              -H        Use hard clipping 'H' in the SAM output. This option may dramatically
                        reduce the redundancy of output when mapping long contig or BAC
                        sequences.

              -M        Mark shorter split hits as secondary (for Picard compatibility).

              -v INT    Control the verbose level of the output. This option has not been
                        fully supported throughout BWA. Ideally, a value 0 for disabling all
                        the output to stderr; 1 for outputting errors only; 2 for warnings and
                        errors; 3 for all normal messages; 4 or higher for debugging. When
                        this option takes value 4, the output is not SAM. [3]

       aln    bwa aln [-n maxDiff] [-o maxGapO] [-e maxGapE] [-d nDelTail] [-i nIndelEnd]
              [-k maxSeedDiff] [-l seedLen] [-t nThrds] [-cRN] [-M misMsc] [-O gapOsc]
              [-E gapEsc] [-q trimQual] <in.db.fasta> <in.query.fq> > <out.sai>

              Find the SA coordinates of the input reads. Maximum maxSeedDiff differences are
              allowed in the first seedLen subsequence and maximum maxDiff differences are
              allowed in the whole sequence.

              OPTIONS:

              -n NUM    Maximum edit distance if the value is INT, or the fraction of missing
                        alignments given 2% uniform base error rate if FLOAT. In the latter
                        case, the maximum edit distance is automatically chosen for different
                        read lengths.
                        [0.04]

              -o INT    Maximum number of gap opens [1]

              -e INT    Maximum number of gap extensions, -1 for k-difference mode
                        (disallowing long gaps) [-1]

              -d INT    Disallow a long deletion within INT bp towards the 3'-end [16]

              -i INT    Disallow an indel within INT bp towards the ends [5]

              -l INT    Take the first INT subsequence as seed. If INT is larger than the
                        query sequence, seeding will be disabled. For long reads, this option
                        typically ranges from 25 to 35 for `-k 2'. [inf]

              -k INT    Maximum edit distance in the seed [2]

              -t INT    Number of threads (multi-threading mode) [1]

              -M INT    Mismatch penalty. BWA will not search for suboptimal hits with a score
                        lower than (bestScore-misMsc). [3]

              -O INT    Gap open penalty [11]

              -E INT    Gap extension penalty [4]

              -R INT    Proceed with suboptimal alignments if there are no more than INT
                        equally best hits. This option only affects paired-end mapping.
                        Increasing this threshold helps to improve the pairing accuracy at the
                        cost of speed, especially for short reads (~32bp).

              -c        Reverse the query but do not complement it, which is required for
                        alignment in the color space. (Disabled since 0.6.x)

              -N        Disable iterative search. All hits with no more than maxDiff
                        differences will be found. This mode is much slower than the default.

              -q INT    Parameter for read trimming.
                        BWA trims a read down to argmax_x{\sum_{i=x+1}^l(INT-q_i)} if q_l<INT,
                        where l is the original read length. [0]

              -I        The input is in the Illumina 1.3+ read format (quality equals
                        ASCII-64).

              -B INT    Length of barcode starting from the 5'-end. When INT is positive, the
                        barcode of each read will be trimmed before mapping and will be
                        written at the BC SAM tag. For paired-end reads, the barcodes from
                        both ends are concatenated. [0]

              -b        Specify that the input read sequence file is in the BAM format. For
                        paired-end data, two ends in a pair must be grouped together and
                        options -1 or -2 are usually applied to specify which end should be
                        mapped. Typical command lines for mapping paired-end data in the BAM
                        format are:

                            bwa aln ref.fa -b1 reads.bam > 1.sai
                            bwa aln ref.fa -b2 reads.bam > 2.sai
                            bwa sampe ref.fa 1.sai 2.sai reads.bam reads.bam > aln.sam

              -0        When -b is specified, only use single-end reads in mapping.

              -1        When -b is specified, only use the first read in a read pair in
                        mapping (skip single-end reads and the second reads).

              -2        When -b is specified, only use the second read in a read pair in
                        mapping.

       samse  bwa samse [-n maxOcc] <in.db.fasta> <in.sai> <in.fq> > <out.sam>

              Generate alignments in the SAM format given single-end reads. Repetitive hits
              will be randomly chosen.
              OPTIONS:

              -n INT    Maximum number of alignments to output in the XA tag for reads paired
                        properly. If a read has more than INT hits, the XA tag will not be
                        written. [3]

              -r STR    Specify the read group in a format like `@RG\tID:foo\tSM:bar'. [null]

       sampe  bwa sampe [-a maxInsSize] [-o maxOcc] [-n maxHitPaired] [-N maxHitDis] [-P]
              <in.db.fasta> <in1.sai> <in2.sai> <in1.fq> <in2.fq> > <out.sam>

              Generate alignments in the SAM format given paired-end reads. Repetitive read
              pairs will be placed randomly.

              OPTIONS:

              -a INT    Maximum insert size for a read pair to be considered as mapped
                        properly. Since 0.4.5, this option is only used when there are not
                        enough good alignments to infer the distribution of insert sizes. [500]

              -o INT    Maximum occurrences of a read for pairing. A read with more
                        occurrences will be treated as a single-end read. Reducing this
                        parameter speeds up pairing. [100000]

              -P        Load the entire FM-index into memory to reduce disk operations
                        (base-space reads only). With this option, at least 1.25N bytes of
                        memory are required, where N is the length of the genome.

              -n INT    Maximum number of alignments to output in the XA tag for reads paired
                        properly. If a read has more than INT hits, the XA tag will not be
                        written. [3]

              -N INT    Maximum number of alignments to output in the XA tag for disconcordant
                        read pairs (excluding singletons).
                        If a read has more than INT hits, the XA tag will not be written. [10]

              -r STR    Specify the read group in a format like `@RG\tID:foo\tSM:bar'. [null]

       bwasw  bwa bwasw [-a matchScore] [-b mmPen] [-q gapOpenPen] [-r gapExtPen]
              [-t nThreads] [-w bandWidth] [-T thres] [-s hspIntv] [-z zBest] [-N nHspRev]
              [-c thresCoef] <in.db.fasta> <in.fq> [mate.fq]

              Align query sequences in the in.fq file. When mate.fq is present, perform
              paired-end alignment. The paired-end mode only works for reads from Illumina
              short-insert libraries. In the paired-end mode, BWA-SW may still output split
              alignments but they are all marked as not properly paired; the mate positions
              will not be written if the mate has multiple local hits.

              OPTIONS:

              -a INT    Score of a match [1]

              -b INT    Mismatch penalty [3]

              -q INT    Gap open penalty [5]

              -r INT    Gap extension penalty. The penalty for a contiguous gap of size k is
                        q+k*r. [2]

              -t INT    Number of threads in the multi-threading mode [1]

              -w INT    Band width in the banded alignment [33]

              -T INT    Minimum score threshold divided by a [37]

              -c FLOAT  Coefficient for threshold adjustment according to query length. Given
                        an l-long query, the threshold for a hit to be retained is
                        a*max{T,c*log(l)}. [5.5]

              -z INT    Z-best heuristics. Higher -z increases accuracy at the cost of speed.
                        [1]

              -s INT    Maximum SA interval size for initiating a seed. Higher -s increases
                        accuracy at the cost of speed. [3]

              -N INT    Minimum number of seeds supporting the resultant alignment to skip
                        reverse alignment. [5]
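       The read-trimming rule given for the -q option of aln above can be read as: when the
       last base has quality below INT, keep the prefix length x that maximizes the sum of
       (INT - q_i) over the trimmed-off bases. The Python sketch below evaluates that formula
       literally. The function name trim_length and the tie-breaking choice are assumptions
       made for illustration; the actual BWA code may differ in details such as a minimum
       retained read length.

```python
def trim_length(quals, threshold):
    """Return the length x a read is trimmed down to under the -q formula.

    quals     -- Phred base qualities q_1..q_l, already decoded to integers
    threshold -- the INT given to -q (0 or less disables trimming)

    Illustrative sketch of argmax_x sum_{i=x+1}^{l} (INT - q_i); not BWA's code.
    """
    l = len(quals)
    # Trimming only applies when the last base quality q_l is below INT.
    if threshold <= 0 or l == 0 or quals[-1] >= threshold:
        return l
    best_x, best_sum, running = l, 0, 0
    # Walk from the 3'-end towards the 5'-end; after processing index x,
    # `running` equals the sum of (INT - q_i) for 1-based i = x+1 .. l.
    for x in range(l - 1, -1, -1):
        running += threshold - quals[x]
        if running > best_sum:  # strict '>' keeps the longest read on ties
            best_sum, best_x = running, x
    return best_x


quals = [30, 30, 30, 30, 10, 25, 5, 2]
print(trim_length(quals, 20))  # prints 4: the last four bases are trimmed
```

       In this example the isolated quality-25 base is trimmed as well, because the low
       qualities on both sides of it outweigh its contribution to the sum.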

SAM ALIGNMENT FORMAT

       The output of the `aln' command is binary and designed for BWA use only. BWA outputs the
       final alignment in the SAM (Sequence Alignment/Map) format. Each line consists of:

       ┌────┬───────┬──────────────────────────────────────────────────────────┐
       │Col │ Field │ Description                                              │
       ├────┼───────┼──────────────────────────────────────────────────────────┤
       │ 1  │ QNAME │ Query (pair) NAME                                        │
       │ 2  │ FLAG  │ bitwise FLAG                                             │
       │ 3  │ RNAME │ Reference sequence NAME                                  │
       │ 4  │ POS   │ 1-based leftmost POSition/coordinate of clipped sequence │
       │ 5  │ MAPQ  │ MAPping Quality (Phred-scaled)                           │
       │ 6  │ CIGAR │ extended CIGAR string                                    │
       │ 7  │ MRNM  │ Mate Reference sequence NaMe (`=' if same as RNAME)      │
       │ 8  │ MPOS  │ 1-based Mate POSition                                    │
       │ 9  │ ISIZE │ Inferred insert SIZE                                     │
       │10  │ SEQ   │ query SEQuence on the same strand as the reference       │
       │11  │ QUAL  │ query QUALity (ASCII-33 gives the Phred base quality)    │
       │12  │ OPT   │ variable OPTional fields in the format TAG:VTYPE:VALUE   │
       └────┴───────┴──────────────────────────────────────────────────────────┘

       Each bit in the FLAG field is defined as:

       ┌────┬────────┬───────────────────────────────────────┐
       │Chr │  Flag  │ Description                           │
       ├────┼────────┼───────────────────────────────────────┤
       │ p  │ 0x0001 │ the read is paired in sequencing      │
       │ P  │ 0x0002 │ the read is mapped in a proper pair   │
       │ u  │ 0x0004 │ the query sequence itself is unmapped │
       │ U  │ 0x0008 │ the mate is unmapped                  │
       │ r  │ 0x0010 │ strand of the query (1 for reverse)   │
       │ R  │ 0x0020 │ strand of the mate                    │
       │ 1  │ 0x0040 │ the read is the first read in a pair  │
       │ 2  │ 0x0080 │ the read is the second read in a pair │
       │ s  │ 0x0100 │ the alignment is not primary          │
       │ f  │ 0x0200 │ QC failure                            │
       │ d  │ 0x0400 │ optical or PCR duplicate              │
       └────┴────────┴───────────────────────────────────────┘

       Please check <http://samtools.sourceforge.net> for the format specification and the
       tools for post-processing the alignment.

       BWA generates the following optional fields. Tags starting with `X' are specific to BWA.

       ┌────┬───────────────────────────────────────────────────────┐
       │Tag │ Meaning                                               │
       ├────┼───────────────────────────────────────────────────────┤
       │NM  │ Edit distance                                         │
       │MD  │ Mismatching positions/bases                           │
       │AS  │ Alignment score                                       │
       │BC  │ Barcode sequence                                      │
       ├────┼───────────────────────────────────────────────────────┤
       │X0  │ Number of best hits                                   │
       │X1  │ Number of suboptimal hits found by BWA                │
       │XN  │ Number of ambiguous bases in the reference            │
       │XM  │ Number of mismatches in the alignment                 │
       │XO  │ Number of gap opens                                   │
       │XG  │ Number of gap extensions                              │
       │XT  │ Type: Unique/Repeat/N/Mate-sw                         │
       │XA  │ Alternative hits; format: /(chr,pos,CIGAR,NM;)*/      │
       ├────┼───────────────────────────────────────────────────────┤
       │XS  │ Suboptimal alignment score                            │
       │XF  │ Support from forward/reverse alignment                │
       │XE  │ Number of supporting seeds                            │
       ├────┼───────────────────────────────────────────────────────┤
       │XP  │ Alt primary hits; format: /(chr,pos,CIGAR,mapQ,NM;)+/ │
       └────┴───────────────────────────────────────────────────────┘

       Note that XO and XG are generated by BWT search while the CIGAR string is generated by
       Smith-Waterman alignment. These two tags may be inconsistent with the CIGAR string.
       This is not a bug.
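       Because FLAG is a bitwise field, a value such as 99 is easiest to read by testing each
       bit from the FLAG table above. The Python sketch below does this; the names FLAG_BITS,
       flag_chars and describe_flag are illustrative and not part of BWA or samtools.

```python
# (bit, one-character code, description) as listed in the FLAG table
FLAG_BITS = [
    (0x0001, "p", "the read is paired in sequencing"),
    (0x0002, "P", "the read is mapped in a proper pair"),
    (0x0004, "u", "the query sequence itself is unmapped"),
    (0x0008, "U", "the mate is unmapped"),
    (0x0010, "r", "strand of the query (1 for reverse)"),
    (0x0020, "R", "strand of the mate"),
    (0x0040, "1", "the read is the first read in a pair"),
    (0x0080, "2", "the read is the second read in a pair"),
    (0x0100, "s", "the alignment is not primary"),
    (0x0200, "f", "QC failure"),
    (0x0400, "d", "optical or PCR duplicate"),
]


def flag_chars(flag):
    """Return the one-character codes of all bits set in a SAM FLAG value."""
    return "".join(ch for bit, ch, _ in FLAG_BITS if flag & bit)


def describe_flag(flag):
    """Return the descriptions of all bits set in a SAM FLAG value."""
    return [desc for bit, _, desc in FLAG_BITS if flag & bit]


print(flag_chars(99))   # prints pPR1
print(flag_chars(147))  # prints pPr2
for line in describe_flag(99):
    print(line)
```

       The values 99 and 147 are the two FLAGs of a typical properly paired read pair: the
       first read on the forward strand with its mate reversed, and the second read reversed.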

NOTES ON SHORT-READ ALIGNMENT

   Alignment Accuracy
       When seeding is disabled, BWA guarantees to find an alignment containing at most maxDiff
       differences including maxGapO gap opens which do not occur within nIndelEnd bp towards
       either end of the query. Longer gaps may be found if maxGapE is positive, but BWA is not
       guaranteed to find all hits. When seeding is enabled, BWA further requires that the
       first seedLen subsequence contains no more than maxSeedDiff differences.

       When gapped alignment is disabled, BWA is expected to generate the same alignment as
       Eland version 1, the Illumina alignment program. However, as BWA changes `N' in the
       database sequence to random nucleotides, hits to these random sequences will also be
       counted. As a consequence, BWA may mark a unique hit as a repeat if the random
       sequences happen to be identical to sequences which should be unique in the database.

       By default, if the best hit is not highly repetitive (controlled by -R), BWA also finds
       all hits containing one more mismatch; otherwise, BWA finds all equally best hits only.
       Base quality is NOT considered in evaluating hits. In the paired-end mode, BWA pairs
       all hits it found. It further performs Smith-Waterman alignment for unmapped reads to
       rescue reads with a high error rate, and for high-quality anomalous pairs to fix
       potential alignment errors.

   Estimating Insert Size Distribution
       BWA estimates the insert size distribution per 256*1024 read pairs. It first collects
       pairs of reads with both ends mapped with a single-end quality of 20 or higher and then
       calculates the median (Q2) and the lower and higher quartiles (Q1 and Q3).
       It estimates the mean and the variance of the insert size distribution from pairs whose
       insert sizes are within the interval [Q1-2(Q3-Q1), Q3+2(Q3-Q1)]. The maximum distance x
       for a pair to be considered properly paired (SAM flag 0x2) is calculated by solving the
       equation Phi((x-mu)/sigma)=x/L*p0, where mu is the mean, sigma is the standard error of
       the insert size distribution, L is the length of the genome, p0 is the prior of an
       anomalous pair and Phi() is the standard cumulative distribution function. For mapping
       Illumina short-insert reads to the human genome, x is about 6-7 sigma away from the
       mean. Quartiles, mean, variance and x will be printed to the standard error output.

   Memory Requirement
       With the bwtsw algorithm, 5GB of memory is required for indexing the complete human
       genome sequences. For short reads, the aln command uses ~3.2GB of memory and the sampe
       command uses ~5.4GB.

   Speed
       Indexing the human genome sequences takes 3 hours with the bwtsw algorithm. Indexing
       smaller genomes with the IS algorithm is faster, but requires more memory.

       The speed of alignment is largely determined by the error rate of the query sequences
       (r). Firstly, BWA runs much faster for near perfect hits than for hits with many
       differences, and it stops searching for a hit with l+2 differences if an l-difference
       hit is found. This means BWA will be very slow if r is high because in this case BWA
       has to visit hits with many differences and looking for these hits is expensive.
       Secondly, the underlying alignment algorithm makes the speed sensitive to
       [k log(N)/m], where k is the maximum allowed differences, N the size of the database
       and m the length of a query. In practice, we choose k w.r.t. r and therefore r is the
       leading factor. I would not recommend using BWA on data with r>0.02.

       Pairing is slower for shorter reads. This is mainly because shorter reads have more
       spurious hits and converting SA coordinates to chromosomal coordinates is very costly.
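       The quartile-based filtering described under Estimating Insert Size Distribution can be
       sketched in Python as follows. This is only an illustration of the idea: the function
       name estimate_insert_size is hypothetical, the quartiles come from Python's
       statistics.quantiles (whose interpolation need not match BWA's own quartile
       computation), and the final step of solving for the maximum distance x is left out
       because it depends on L and p0.

```python
import statistics


def estimate_insert_size(sizes):
    """Estimate mean and standard deviation of insert sizes, ignoring outliers.

    sizes -- insert sizes of pairs with both ends mapped confidently.
    Pairs outside [Q1 - 2(Q3 - Q1), Q3 + 2(Q3 - Q1)] are dropped before the
    mean and standard deviation are computed. Illustrative sketch only.
    """
    q1, q2, q3 = statistics.quantiles(sizes, n=4)  # lower quartile, median, upper
    spread = q3 - q1
    low, high = q1 - 2 * spread, q3 + 2 * spread
    kept = [s for s in sizes if low <= s <= high]
    return {
        "quartiles": (q1, q2, q3),
        "bounds": (low, high),
        "n_used": len(kept),
        "mean": statistics.fmean(kept),
        "std": statistics.pstdev(kept),
    }


# One chimeric-looking pair (5000) among otherwise ~200bp inserts
result = estimate_insert_size([190, 195, 200, 200, 205, 210, 5000])
print(result["n_used"], result["mean"])  # prints 6 200.0
```

       Filtering on quartiles before computing the mean keeps a handful of anomalous pairs
       from inflating the estimated variance, which would otherwise widen the range accepted
       as properly paired.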

CHANGES IN BWA-0.6

       Since version 0.6, BWA has been able to work with a reference genome longer than 4GB.
       This feature makes it possible to integrate the forward and reverse complemented genome
       in one FM-index, which speeds up both BWA-short and BWA-SW. As a tradeoff, BWA uses more
       memory because it has to keep all positions and ranks in 64-bit integers, twice as
       large as the 32-bit integers used in the previous versions.

       The latest BWA-SW also works for paired-end reads longer than 100bp. In comparison to
       BWA-short, BWA-SW tends to be more accurate for highly unique reads and more robust to
       relatively long INDELs and structural variants. Nevertheless, BWA-short usually has
       higher power to distinguish the optimal hit from many suboptimal hits. The choice of
       the mapping algorithm may depend on the application.

SEE ALSO

       BWA website <http://bio-bwa.sourceforge.net>, Samtools website
       <http://samtools.sourceforge.net>

AUTHOR

       Heng Li at the Sanger Institute wrote the key source codes and integrated the following
       codes for BWT construction: bwtsw <http://i.cs.hku.hk/~ckwong3/bwtsw/>, implemented by
       Chi-Kwong Wong at the University of Hong Kong, and IS
       <http://yuta.256.googlepages.com/sais>, originally proposed by Nong Ge
       <http://www.cs.sysu.edu.cn/nong/> at the Sun Yat-Sen University and implemented by Yuta
       Mori.

LICENSE AND CITATION

       The full BWA package is distributed under GPLv3 as it uses source codes from BWT-SW
       which is covered by GPL. Sorting, hash table, BWT and IS libraries are distributed under
       the MIT license.

       If you use the BWA-backtrack algorithm, please cite the following paper:

       Li H. and Durbin R. (2009) Fast and accurate short read alignment with Burrows-Wheeler
       transform. Bioinformatics, 25, 1754-1760. [PMID: 19451168]

       If you use the BWA-SW algorithm, please cite:

       Li H. and Durbin R. (2010) Fast and accurate long-read alignment with Burrows-Wheeler
       transform. Bioinformatics, 26, 589-595. [PMID: 20080505]

       If you use BWA-MEM or the fastmap component of BWA, please cite:

       Li H. (2013) Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM.
       arXiv:1303.3997v1 [q-bio.GN].

       It is likely that the BWA-MEM manuscript will not appear in a peer-reviewed journal.

HISTORY

       BWA is largely influenced by BWT-SW. It uses source codes from BWT-SW and mimics its
       binary file formats; BWA-SW resembles BWT-SW in several ways. The initial idea about
       BWT-based alignment also came from the group who developed BWT-SW. At the same time, BWA
       is different enough from BWT-SW. The short-read alignment algorithm bears no similarity
       to the Smith-Waterman algorithm any more. While BWA-SW learns from BWT-SW, it introduces
       heuristics that can hardly be applied to the original algorithm. In all, BWA does not
       guarantee to find all local hits as BWT-SW is designed to do, but it is much faster
       than BWT-SW on both short and long query sequences.

       I started to write the first piece of code on 24 May 2008 and got the initial stable
       version on 02 June 2008. During this period, I learned that Professor Tak-Wah Lam, the
       first author of the BWT-SW paper, was collaborating with Beijing Genomics Institute on
       SOAP2, the successor to SOAP (Short Oligonucleotide Analysis Package). SOAP2 came out
       in November 2008. According to the SourceForge download page, the third BWT-based short
       read aligner, bowtie, was first released in August 2008. At the time of writing this
       manual, at least three more BWT-based short-read aligners are being implemented.

       The BWA-SW algorithm is a new component of BWA. It was conceived in November 2008 and
       implemented ten months later.

       The BWA-MEM algorithm is based on an algorithm finding super-maximal exact matches
       (SMEMs), which was first published with the fermi assembler paper in 2012. I first
       implemented the basic SMEM algorithm in the fastmap command for an experiment and then
       extended the basic algorithm and added the extension part in February 2013 to make
       BWA-MEM a fully featured mapper.


Source: http://manpages.ubuntu.com/manpages/trusty/man1/bwa.1.html
