Aligns short reads using strobemers

Overview

StrobeAlign

Strobealign is a single-end and paired-end short-read aligner that uses syncmer-thinned strobemers. It is multithreaded and implements both alignment (SAM) and mapping (PAF). It is 12-15 times faster than BWA and Bowtie2 with similar accuracy for single-end reads, and about 10 times faster with a loss of 0.1-0.2% accuracy for paired-end reads. See the experiments in the preprint.

The default parameter setting is tailored for Illumina single or paired-end reads of lengths about 150-500nt.

Strobealign is currently not recommended for reads shorter than 150nt as a lower value for parameter -k is needed (e.g. 15-17) and extensive testing in this setting remains to be done.

Strobealign is also currently not recommended for long reads (>500nt), as significant implementation changes are needed to keep its relative speed. Long reads need a different extension algorithm (chaining of seeds instead of the current approach described in the preprint) and split-mapping functionality.

INSTALLATION

Precompiled binaries for Linux and Mac OS X are available here. For example, on Linux, simply do

wget https://github.com/ksahlin/StrobeAlign/tree/main/bin/Linux/StrobeAlign-v0.0.3.1
mv StrobeAlign-v0.0.3.1 strobealign  # rename to strobealign
chmod +x strobealign # make executable
./strobealign  # test program

To compile from source, you need a recent g++ and zlib installed. Then do the following:

git clone https://github.com/ksahlin/StrobeAlign
cd StrobeAlign
# Needs a newer g++ version. Tested with version 8 and upwards.
g++ -std=c++14 main.cpp source/index.cpp source/ksw2_extz2_sse.c -lz -fopenmp -o StrobeAlign -O3 -mavx2

Common errors when installing from source

If you have zlib installed, and the zlib.h file is in folder /path/to/zlib/include and the libz.so file in /path/to/zlib/lib but you get

main.cpp:12:10: fatal error: zlib.h: No such file or directory
 #include <zlib.h>
          ^~~~~~~~
compilation terminated.

add -I/path/to/zlib/include -L/path/to/zlib/lib to the compilation, that is

g++ -std=c++14 -I/path/to/zlib/include -L/path/to/zlib/lib main.cpp source/index.cpp source/ksw2_extz2_sse.c -lz -fopenmp -o StrobeAlign -O3 -mavx2

USAGE

For alignment to SAM file:

StrobeAlign [-k 22 -s 18 -f 0.0002] -o <output.sam> ref.fa reads.fa 

For mapping to PAF file (option -x):

StrobeAlign [-k 22 -s 18 -f 0.0002] -x -o <output.paf> ref.fa reads.fa

TODO

  1. Add an option to build the index and perform alignment in separate steps.

CREDITS

Kristoffer Sahlin. Faster short-read mapping with strobemer seeds in syncmer space. bioRxiv, 2021. doi:10.1101/2021.06.18.449070. Preprint available here.

VERSION INFO

Version 0.0.3.1

  1. Bugfix. Takes care of segmentation fault bug in paired-end mapping mode (-x) when none of the reads have NAMs.

Version 0.0.3

  1. Implements a paired-end alignment mode.
  2. Implements a rescue mode both in SE and PE alignment modes (described in preprint v2).
  3. Changed to symmetrical strobemer hashvalues due to inversions (described in preprint v2).

Version 0.0.2

  1. Implements multi-threading.
  2. Allows reads in fast[a/q] format and gzipped files through the kseqpp library.

Version 0.0.1

The aligner used for the experiments presented in the preprint (v1) on bioRxiv. Single-threaded only, and aligns reads as single-end reads (no PE mapping).

LICENCE

GPL v3.0, see LICENSE.txt.

Issues
  • pipe output to downstream programs

    hi @ksahlin , your approach looks really promising. I already started to work with it and do some benchmarks in respect with downstream analyses. Would it be possible to allow StrobeAlign to directly print the sam alignments to standard output? that would be very useful to pipe downstream tasks, such as marking duplicates and sorting with samtools. Thanks in advance!

    feature 
    opened by TDDB-limagrain 30
  • suggestion to align medium size indel

    I have targeted Illumina paired-end reads of 150 bp. The average coverage is pretty high. There is a known homozygous duplication of 77 bp. It looks very good in IGV (see snapshot).

    With the default parameters the alignment looks like this. Is there a way I can adjust the settings so that the alignment properly anchors both ends of the reads?

    opened by husamia 14
  • ALIGNMENT TO REF LONGER THAN 2000bp

    hi @ksahlin , sorry for coming back with a new possible issue! I used the very latest patch you've made to align a full sample against the public genome we discussed later. The output was correctly piped to samtools for further duplicate marking and sorting, so thank you very much for allowing the redirection !

    Total mapping sites tried: 260811459
    Total calls to ssw: 83238353
    Calls to ksw (rescue mode): 24136154
    Did not fit strobe start site: 452500
    Tried rescue: 8059431
    Total time mapping: 4317.84 s.
    Total time reading read-file(s): 178.746 s.
    Total time creating strobemers: 188.722 s.
    Total time finding NAMs (non-rescue mode): 618.305 s.
    Total time finding NAMs (rescue mode): 341.146 s.
    Total time sorting NAMs (candidate sites): 75.2316 s.
    Total time reverse compl seq: 0 s.
    Total time base level alignment (ssw): 2239.28 s.
    Total time writing alignment to files: 525.32 s.
    

    Everything goes well, but looking at the standard output, I obtained the following message several times: ALIGNMENT TO REF LONGER THAN 2000bp - REPORT TO DEVELOPER.

    For example: ALIGNMENT TO REF LONGER THAN 2000bp - REPORT TO DEVELOPER. Happened for read: TGCCAATCCTGTGGGACGAATATGGGGCTCAAACCTCAGCCAAAACTCAATAGACACAGTGACGAATGTCTGGTAAAAAATTCAGACCAAAATACCAAAGGAGTAAGGCGTAGCAAGTCCCAGACCGAGAGTGAATAAAACCGGTTTTCCG ref len:53438756

    Do you have any idea about such a behavior? I ran strobealign with default parameters.

    opened by TDDB-limagrain 12
  • bam file header not compatible with GATK

    Hi, using GATK as a variant caller raises an error about the sort order of the chromosomes and indeed they are not sorted although samtools sort was used. The header looks like this

    @HD     VN:1.6  SO:coordinate
    @SQ     SN:StSOLv1.1ch07_RagTag LN:51859799
    @SQ     SN:StSOLv1.1ch01_RagTag LN:89994189
    @SQ     SN:StSOLv1.1ch12_RagTag LN:108042198
    @SQ     SN:Chr0_RagTag  LN:74722543
    @SQ     SN:StSOLv1.1ch08_RagTag LN:122763581
    @SQ     SN:StSOLv1.1ch02_RagTag LN:52958123
    @SQ     SN:StSOLv1.1ch11_RagTag LN:82684344
    @SQ     SN:StSOLv1.1ch09_RagTag LN:110327934
    @SQ     SN:StSOLv1.1ch06_RagTag LN:77416214
    @SQ     SN:StSOLv1.1ch04_RagTag LN:108266590
    @SQ     SN:StSOLv1.1ch05_RagTag LN:91627361
    @SQ     SN:StSOLv1.1ch10_RagTag LN:63742240
    @SQ     SN:StSOLv1.1ch03_RagTag LN:45113244
    @PG     ID:strobealign  PN:strobealign  VN:0.7  CL:strobealign
    @PG     ID:samtools     PN:samtools     PP:strobealign  VN:1.13 CL:samtools fixmate [email protected] -u -m - -
    @PG     ID:samtools.1   PN:samtools     PP:samtools     VN:1.13 CL:samtools view -bhS -
    @PG     ID:samtools.2   PN:samtools     PP:samtools.1   VN:1.13 CL:samtools sort [email protected]
    @PG     ID:samtools.3   PN:samtools     PP:samtools.2   VN:1.13 CL:samtools view -H Altus.ColombaNRGene.ragtag.strobealign.bam
    

    any ideas what is going wrong ? Other aligners like bwa produce a proper sorted bam file
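One way to investigate a report like this is to list the @SQ names in header order and compare them with the contig order of the reference. The following is a minimal Python sketch, not part of strobealign: the function names are hypothetical, and it assumes the header text was obtained separately (for example with samtools view -H) and is tab-separated as in a real SAM header.

```python
def sq_names(header_text):
    """Return the SN values of all @SQ lines, in header order."""
    names = []
    for line in header_text.splitlines():
        fields = line.strip().split("\t")
        if fields[0] != "@SQ":
            continue
        for field in fields[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def first_order_mismatch(header_names, reference_names):
    """Return the first index where the two orders differ, or None if identical."""
    for i, (a, b) in enumerate(zip(header_names, reference_names)):
        if a != b:
            return i
    if len(header_names) != len(reference_names):
        # One list is a strict prefix of the other.
        return min(len(header_names), len(reference_names))
    return None
```

If the first mismatch index is not None, the header and the reference list their contigs in different orders, which is the kind of inconsistency that strict downstream tools tend to reject.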

    opened by danessel 11
  • sam file format error

    Dear StrobeAlign author,

    Use default settings, the mapping sam file output cannot be transformed into sorted bam file by samtools:

    (base) [[email protected] Competitive_mapping]$ time StrobeAlign -t 24 T4Aer_MAG.fasta T4AerOil_R1.fa T4AerOil_R2.fa
    Using n: 2 k: 22 s: 18 t: 24 R: 2 w_min: 6 w_max: 14
    [w_min, w_max] under thinning w roughly corresponds to sampling from downstream read coordinates (under random minimizer sampling): [30, 70]
    Time reading references: 7.76452 s

    ref vector approximate size: 9290379
    Ref vector actual size: 8664874
    Unique strobemers: 8651483
    Total time generating flat vector: 2.85387 s

    Flat vector size: 8664874
    Total strobemers count: 8664874
    Total strobemers occur once: 8639067
    Total strobemers highly abundant > 100: 0
    Total strobemers mid abundance (between 2-100): 12415
    Total distinct strobemers stored: 8651483
    Ratio distinct to non distinct: 696
    Filtered cutoff index: 1730
    Filtered cutoff count: 2

    Total time generating hash table index: 0.72899 s

    Total time indexing: 11.3475 s

    Using rescue cutoff: 4
    Running PE mode
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 104.07, stddev: 124.369)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 96.6191, stddev: 175.71)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 90.6641, stddev: 208.097)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 117.778, stddev: 245.558)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 106.226, stddev: 274.345)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 101.158, stddev: 301.601)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 97.5571, stddev: 323.28)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 102.831, stddev: 345.341)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 109.84, stddev: 371.438)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 114.511, stddev: 392.006)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 117.354, stddev: 407.773)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 138.058, stddev: 426.87)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 114.614, stddev: 440.569)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 108.915, stddev: 452.172)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 132.276, stddev: 467.806)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 129.727, stddev: 483.589)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 109.746, stddev: 500.041)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 107.748, stddev: 512.066)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 107.68, stddev: 525.201)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 117.557, stddev: 539.586)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 107.239, stddev: 553.863)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 116.547, stddev: 566.07)
    Mapping chunk of 1000000 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 110.742, stddev: 578.011)
    Mapping chunk of 148502 query sequences...
    Estimated diff in start coordinates b/t mates, (mean: 110.161, stddev: 589.893)
    Total mapping sites tried: 13055987
    Total calls to ksw: 3218137
    Calls to ksw (rescue mode): 1276608
    Did not fit strobe start site: 1776340
    Tried rescue: 33208162
    Total time mapping: 103.476 s.
    Total time reading read-file(s): 25.1783 s.
    Total time creating strobemers: 18.6664 s.
    Total time finding NAMs (non-rescue mode): 3.60437 s.
    Total time finding NAMs (rescue mode): 0.409914 s.
    Total time sorting NAMs (candidate sites): 0.0890606 s.
    Total time reverse compl seq: 0 s.
    Total time extending alignment: 17.081 s.
    Total time writing alignment to files: 5.61499 s.

    real 1m55.191s
    user 17m49.681s
    sys 0m14.051s
    (base) [[email protected] Competitive_mapping]$ ls -lhs mapped.sam
    7.3G -rw-r--r-- 1 jzhao399 p-ktk3 7.3G Nov 21 20:23 mapped.sam
    (base) [[email protected] Competitive_mapping]$ samtools view -bS -@ 24 mapped.sam | samtools sort -@ 24 -O bam -o mapped_sorted.bam
    samtools view: error reading file "mapped.sam": Input/output error
    samtools view: error closing "mapped.sam": -5
    [bam_sort_core] merging from 0 files and 24 in-memory blocks...

    Any idea?

    Thanks,

    Jianshu

    bug 
    opened by jianshu93 10
  • terminate called after throwing an instance of 'std::out_of_range'

    Hi @ksahlin , I am giving a try to strobealign v0.6. I was able to compile it correctly on my computing nodes. Genome indexing works fine but in the PE mode, it crashes from the beginning of the mapping part:

    ...
    Using rescue cutoff: 208
    @SQ     SN:Chr01        LN:51433939
    @SQ     SN:Chr04        LN:48048378
    @SQ     SN:Chr08        LN:63048260
    @SQ     SN:Chr11        LN:53580169
    @SQ     SN:Chr09        LN:38250102
    @SQ     SN:Chr05        LN:40923498
    @SQ     SN:Chr02        LN:49670989
    @SQ     SN:Chr10        LN:44302882
    @SQ     SN:Chr07        LN:40041001
    @SQ     SN:Chr06        LN:31236378
    @SQ     SN:Chr03        LN:53438756
    @PG     ID:strobealign  PN:strobealign  VN:0.6  CL:strobealign
    Running PE mode
    Mapping chunk of 1000000 query sequences...
    terminate called after throwing an instance of 'std::out_of_range'
      what():  basic_string::substr
    Aborted (core dumped)
    

    This is strange, because running it in SE mode with either the forward or the reverse reads work perfectly. Any idea about that possible issue?

    Thanks!

    opened by TDDB-limagrain 8
  • Strategies for multimapping reads

    Hello,

    First of all, thanks for this efficient software. I was wondering whether there is currently any option to retain all reads with their associated flags in the .sam output file, especially the FLAG field specifying whether the read is mapped 0, 1 or >1 times. This would behave like the -a option of Bowtie2.

    Multimapping reads seem to be discarded at the moment, am I right?
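For context, the SAM FLAG field marks unmapped records with bit 0x4, secondary alignments with bit 0x100 and supplementary alignments with bit 0x800. The sketch below is a hypothetical helper (not a strobealign feature) that counts how many placements each read has in a SAM stream. It assumes single-end records, so that one read name corresponds to one read, and it only finds multiple placements if the aligner actually wrote secondary records.

```python
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def count_placements(sam_lines):
    """Map read name -> number of mapped primary and secondary records."""
    counts = {}
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        counts.setdefault(name, 0)
        # Supplementary records are pieces of one placement, not extra placements.
        if flag & UNMAPPED or flag & SUPPLEMENTARY:
            continue
        counts[name] += 1
    return counts
```

A count of 0 means the read is unmapped, 1 means a single placement, and more than 1 means the read has a primary record plus at least one record flagged with SECONDARY.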

    Thanks in advance

    feature 
    opened by sebgra 6
  • Fasta accession modification

    From issue #3:

    "Small suggestion: remove the fasta annotation (only keep after > and before the first tab) from reference in sam output so that the sam output can be a little bit smaller. Or add an option to only keep them or an option to include annotation."

    feature 
    opened by ksahlin 4
  • CMake compilation supported?

    Hi there,

    tried to compile v0.4 (on Ubuntu) from source via cmake, but - https://github.com/ksahlin/StrobeAlign/blob/main/CMakeLists.txt#L6 refers to source/edlib.cpp source/edlib.h which are not present in the repository/release tarball.

    Compiling via g++ -std=c++14 main.cpp source/index.cpp source/xxhash.c source/ksw2_extz2_sse.c source/ssw_cpp.cpp source/ssw.c -lz -fopenmp -o strobealign -O3 -mavx2, as outlined in the README, works as intended, though.

    Also, could you consider lowering cmake_minimum_required? 3.19 isn't widely deployed, Ubuntu 20.04 LTS currently ships with 3.16.3, for example.

    Thanks a lot!

    opened by sjaenick 3
  • -r option

    Hi @ksahlin,

    for automated pipelines that employ strobealign, it's difficult to come up with a sensible setting for the -r parameter (even though it has a default value) since datasets to be processed will likely have different read lengths. Would it make sense to e.g. automatically infer a value here by looking at the first n reads to be mapped?
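The suggestion can be sketched as follows. This is a hypothetical Python illustration of inferring a read length from the first n reads, not the logic strobealign uses; it assumes uncompressed four-line FASTQ records, and the function name and the choice of the median are assumptions made here.

```python
import io
from statistics import median


def estimate_read_length(fastq_handle, n=500):
    """Return the median sequence length of the first n FASTQ records."""
    lengths = []
    while len(lengths) < n:
        header = fastq_handle.readline()
        if not header:
            break  # end of file before n records
        sequence = fastq_handle.readline().strip()
        fastq_handle.readline()  # '+' separator line
        fastq_handle.readline()  # quality line
        lengths.append(len(sequence))
    if not lengths:
        raise ValueError("no reads found")
    return int(median(lengths))


demo = io.StringIO(
    "@r1\nACGT\n+\nIIII\n"
    "@r2\nACGTAC\n+\nIIIIII\n"
    "@r3\nACGTACGT\n+\nIIIIIIII\n"
)
print(estimate_read_length(demo))  # median of 4, 6 and 8
```

The median is used here rather than the mean so that a few trimmed or unusually short reads among the first n do not pull the estimate down.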

    feature 
    opened by sjaenick 2
  • cmake: add install target

    Hi @ksahlin,

    I've now also added an install target to cmake; by default, make install would attempt installation to /usr/local, but this can be adapted with -DCMAKE_INSTALL_PREFIX:PATH=${PREFIX} when invoking cmake.

    opened by sjaenick 2
  • Allow adding read group (RG) tags

    It would be great if StrobeAlign could (optionally) add an @RG header and RG tags to each read. Having RG tags is a requirement in some of our pipelines and for some programs, and I would consider it best practice. It’s possible, but awkward, to add read group tags afterwards with samtools addreplacerg.

    For compatibility with BWA(-MEM)/minimap2, I would suggest to use command-line parameter -R, but at the moment, that is already used to set the "Rescue level". And although there would certainly be nicer ways to provide read group information than the way in which BWA does it, it would be good to at least also support the syntax that it uses (-R '@RG\tID:foo\tSM:bar').
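To make the requested syntax concrete, here is a minimal Python sketch of how a BWA-style read group string could be turned into an @RG header line and a per-read RG:Z tag. The function name is hypothetical and this is not strobealign code; it only assumes the SAM convention that @RG requires an ID field and that reads refer to it through RG:Z.

```python
def parse_read_group(arg):
    """Turn a BWA-style read group string into (header_line, rg_id)."""
    # Accept both a literal backslash-t (as typed in a shell) and real tabs.
    line = arg.replace("\\t", "\t")
    fields = line.split("\t")
    if fields[0] != "@RG":
        raise ValueError("read group must start with @RG")
    # Each remaining field is KEY:value; split only on the first colon.
    tags = dict(field.split(":", 1) for field in fields[1:])
    if "ID" not in tags:
        raise ValueError("read group needs an ID field")
    return line, tags["ID"]


header_line, rg_id = parse_read_group(r"@RG\tID:foo\tSM:bar")
read_tag = f"RG:Z:{rg_id}"  # appended to every alignment record
```

With the example string from the request, the header line contains real tabs and every read would carry the tag RG:Z:foo.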

    opened by marcelm 1
  • Segmentation fault when reference FASTA does not exist or has wrong format

    The program exits with a segfault when the reference FASTA file does not exist:

    $ strobealign ref.fasta dummy.fasta
    ...
    Unique strobemers: 1
    Total time generating flat vector: 3.2896e-05 s
    
    Flat vector size: 0
    Segmentation fault (core dumped)
    

    The same happens when a compressed FASTA (.gz) was provided (even if it exists).
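Until the program validates its input itself, a wrapper script can check the reference before launching the aligner. The following Python sketch is a hypothetical pre-flight check, not part of strobealign; the function name and the returned status strings are assumptions made for illustration.

```python
import os
import tempfile


def classify_reference(path):
    """Return 'missing', 'gzip', 'empty', 'not_fasta' or 'ok' for a reference path."""
    if not os.path.isfile(path):
        return "missing"
    with open(path, "rb") as handle:
        if handle.read(2) == b"\x1f\x8b":  # gzip magic bytes
            return "gzip"
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            if line.strip():
                # The first non-blank line of a FASTA file is a '>' header.
                return "ok" if line.startswith(">") else "not_fasta"
    return "empty"


with tempfile.TemporaryDirectory() as tmp:
    ref = os.path.join(tmp, "ref.fasta")
    print(classify_reference(ref))  # file has not been created yet
    with open(ref, "w") as out:
        out.write(">chr1\nACGT\n")
    print(classify_reference(ref))
```

A pipeline could refuse to start the aligner unless the status is "ok", and report "gzip" separately so the user knows to decompress the reference first.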

    bug 
    opened by marcelm 0
  • The index: 32bit hashes instead of 64bit

    Note to developer:

    Using 32-bit hashes may reduce both vector and hash table size. For human, it is expected to reduce peak memory from 32Gb (64bit) down to around 20-24Gb

    At the final hash value computation h1/2+h2/2 when the seed has been decided, simply hash this 64 bit down to 32 bit with wyhash or XXH32.

    This will result in more collisions on large genomes though as there are only 4.2billion unique slots. For human, we store about 450M unique values, but for larger genomes this may be a problem (can this be solved with 32/64 bit template?).

    I see quite a reduction in both space and alignment time if we change to 32bit hash value (on a small dataset!).
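The collision concern can be put into rough numbers. The Python sketch below is a back-of-the-envelope estimate under the assumption that the 32-bit hash distributes the values uniformly; the 450M figure is taken from the note above, and the function name is hypothetical.

```python
import math


def expected_distinct(n_values, n_slots):
    """Expected number of occupied slots when n_values are hashed uniformly."""
    return n_slots * (1.0 - math.exp(-n_values / n_slots))


n_values = 450_000_000  # unique seed hash values for human, from the note above
n_slots = 2 ** 32       # number of possible 32-bit hash values

occupied = expected_distinct(n_values, n_slots)
merged = n_values - occupied           # expected loss in distinct keys
fraction_merged = merged / n_values
print(f"{fraction_merged:.1%} of values are expected to share a slot with an earlier one")
```

Under this model roughly 5 percent of the human values would collide with another value, and the fraction grows with the number of unique values, which is why larger genomes may need the 64-bit variant.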

    optimization 
    opened by ksahlin 0
  • Base level alignment

    Note to developer:

    The extension step (nucleotide level alignment) is the bottleneck in strobealign. There are three different ways to reduce this:

    1. Direction 1 (change the alignment module):
      1. Change to base level alignment with WFA (WFA publ) as is done in Accelalign
    2. Direction 2 (speed up the current module, SSW):
      1. By using 8bit slots in alignment matrix?
      2. By not computing the alignment twice (does SSW do this?)
    3. Direction 3 (partitioned SSW):
      1. Finish implementing partitioned SW (split alignment into several small hamming or SW alignments) if seeds are in middle.
    optimization 
    opened by ksahlin 0
  • MAPQ scores

    Note to developer:

    It is not clear if strobealign has optimal mapq scores for variant calling. For example, having an accurate MAPQ score is crucial for SNP calling. See this tweet thread and this subthread.

    For info, a relevant paper about this is found here.

    feature high priority 
    opened by ksahlin 0
  • Alignment of long reads

    Note to developer:

    1. Use randstrobes of order n=3.
    2. Collinear chaining: is needed, perhaps heuristic by sorting based on reference (see find_NAMs_alt() function)
    3. How to implement split mapping?
    4. Need partitioned local alignments of reads (regions extracted from seeds) - not SW-aligning the whole read at once
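As an illustration of item 2, the sketch below shows a simple collinear chaining heuristic in Python: anchors are sorted by reference position and a quadratic dynamic program picks the highest-scoring chain whose anchors increase on both reference and query. It is a generic textbook-style sketch, not the find_NAMs_alt() function; the anchor tuple layout, the scoring by summed anchor length and the absence of gap penalties are all assumptions made here, and only one strand is considered.

```python
def chain_anchors(anchors):
    """anchors: list of (ref_pos, query_pos, length). Returns the best collinear chain."""
    if not anchors:
        return []
    anchors = sorted(anchors)  # sort by reference position
    n = len(anchors)
    score = [a[2] for a in anchors]  # best chain score ending at each anchor
    prev = [-1] * n                  # back-pointers for chain recovery
    for i in range(n):
        r_i, q_i, len_i = anchors[i]
        for j in range(i):
            r_j, q_j, len_j = anchors[j]
            # j may precede i only if it ends before i starts on both sequences.
            if r_j + len_j <= r_i and q_j + len_j <= q_i:
                candidate = score[j] + len_i
                if candidate > score[i]:
                    score[i] = candidate
                    prev[i] = j
    best = max(range(n), key=score.__getitem__)
    chain = []
    while best != -1:
        chain.append(anchors[best])
        best = prev[best]
    return chain[::-1]
```

The quadratic inner loop is fine for a handful of anchors per read; for long reads with many seeds, a bounded look-back window or a range-query structure would be needed to keep this step fast.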
    feature 
    opened by ksahlin 0
Releases(v0.7.1)
  • v0.7.1(Apr 17, 2022)

    Improvements mainly for large repetitive genomes.

    • Introduces maximum limit on repetitive seeds before calling optimized merged match finder (optimized for repetitive reads). This reduces the computational time if the genome is large and repetitive, e.g., maize (2.4Gb), rye (7.8Gb), significantly.
    • Fixes sam header issue https://github.com/ksahlin/StrobeAlign/issues/22
    • Removes dependency on ksw2.
  • v0.7(Apr 1, 2022)

    Major update to the parallelization. The new parallel implementation allows a much more efficient interplay between reading input, aligning, and writing output. This results in much better CPU usage as the number of threads increases. For example, I observed an almost 2x speedup (30-50% reduced runtime) across four larger datasets when using 16 cores (SIM and GIAB 150bp and 250bp reads, see README benchmarks).

    For reference, previous naive parallelization ran in sequential order: 1. Read batch of reads with one thread 2. Align batch input in parallel with OpenMP 3. Write output with one thread. New parallelization performs 1-3 across threads with mutex on input and output. Such types of parallelization are commonly applied in other tools.
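The pattern described above can be sketched in Python as follows. This only illustrates the structure (every worker reads a batch under an input lock, aligns it without holding a lock, then writes under an output lock); it is not the C++ implementation, the names are hypothetical, and the align argument stands in for the real alignment step.

```python
import threading
from itertools import islice


def run_pipeline(reads, align, n_threads=4, batch_size=2):
    """Process reads with workers that each read, align and write in turn."""
    reads_iter = iter(reads)
    input_lock = threading.Lock()
    output_lock = threading.Lock()
    output = []

    def worker():
        while True:
            with input_lock:  # only one thread reads input at a time
                batch = list(islice(reads_iter, batch_size))
            if not batch:
                return  # input exhausted
            results = [align(read) for read in batch]  # no lock held here
            with output_lock:  # only one thread writes output at a time
                output.extend(results)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return output
```

Because batches finish in arbitrary order, the output order is not guaranteed to match the input order in this sketch.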

    This release also includes:

    • Implemented automatic inference of read length, which removes the need of specifying -r (as reported in https://github.com/ksahlin/StrobeAlign/issues/19)
    • Some minor bugfixes. For example, this bug is fixed.

    This release has identical or near-identical alignments to the previous version v0.6.1 (same accuracy and SV calling stats across tested datasets)

  • v0.6.1(Feb 23, 2022)

  • v0.6(Feb 20, 2022)

    Version 0.6 fixes a crucial bug introduced in v0.5 and has two additional bug fixes that improve accuracy. It is highly recommended to update to this version.

    1. Crucial fix for a bug in v0.5 that caused rare alignments to very long reference regions due to a coordinate bug. Such alignments were detrimental to speed.
    2. Identifying symmetrical hash collisions and in those cases test the reverse orientation. This leads to a further slight bump in alignment accuracy over previous versions, particularly for shorter read lengths.
    3. Fix to rare but occasional uninitialized joint alignment score S calculation that would cause suboptimal alignment
    4. Fixes reporting of template len field in SAM output if deletion in alignment.
  • v0.5(Feb 16, 2022)

    Added features, some improvements in alignment (accuracy), and minor bugfixes.

    1. Added parameter -N [INT] to output secondary alignments
    2. Base level alignment parameters can now be specified from command line -A -B -E -O
    3. Improved MAPQ calculation: calculating them from alignments (if alignment mode) instead of from seeds.
    4. Update default base-level alignment parameters for better alignments around indels.
    5. Added Quality values, AS:i and NM:i tags to SAM output.

    See INDEL/SNV calling benchmark in README.

  • v0.4(Jan 16, 2022)

  • v0.3(Jan 13, 2022)

  • v0.2.1(Jan 9, 2022)

  • v0.2(Dec 30, 2021)

  • v0.1(Dec 27, 2021)

    Major update of strobealign. This version comes with an improvement in accuracy (and in the number of aligned reads) for reads of around 100-125nt, and it is also faster than older versions for these lengths. Most notable changes:

    • Algorithm changes

      • Using xxhash instead of no hash for strobes. Gives a better pseudorandom generation of hashes for linking.
      • Linking strobes using bitcount( (h_1 ^ h_2) ^ q) which creates a skewed seed length distribution towards shorter seeds in the window. This improves mapping candidate read detection particularly for shorter reads (100nt).
    • Parameters

      • Adding the option to customize sampling window of second strobe with -l and -u.
      • Adding a parameter -r [INT] for approximate read length (default 150). This will make strobealign customize parameters -l -u, and -k
    • Also cuts the reference accessions at first space, which fixes issue #4
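The linking rule in the algorithm changes above can be illustrated with a short Python sketch. It reads the expression bitcount((h_1 ^ h_2) ^ q) literally, with ^ as bitwise XOR, and picks the candidate in the window with the smallest bit count. The function names are hypothetical, q is treated as a fixed integer whose actual value is not given here, the tie-breaking toward the earliest candidate is a choice of this sketch, and the operator applied to q in the actual source may differ.

```python
def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def pick_second_strobe(h1, window_hashes, q):
    """Return the index in window_hashes minimizing popcount((h1 ^ h2) ^ q)."""
    # min() keeps the earliest index on ties.
    return min(
        range(len(window_hashes)),
        key=lambda i: popcount((h1 ^ window_hashes[i]) ^ q),
    )
```

Here h1 is the hash of the first strobe and window_hashes holds the hashes of the candidate second strobes inside the sampling window set by -l and -u.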

  • v0.0.3.2(Nov 30, 2021)

  • v0.0.3(Nov 3, 2021)

    Version 0.0.3

    1. Has paired-end alignment mode
    2. Implements a rescue mode both in SE and PE alignment modes (described in preprint).
    3. Changed to symmetrical strobemer hash values due to inversions (described in preprint).

    Known bugs:

    • Negative SAM coordinate bug in Single-end alignment mode. Observed once in 150M simulated reads
    • Segfault in paired-end mapping mode (never in alignment mode). Observed for the shortest reads (100nt) three times in 150M simulated reads
  • v0.0.2(Sep 27, 2021)

    StrobeAlign is now parallelized with OpenMP and can read fastq and gzipped fastq files with kseqpp.

    TODO

    • PE-alignment mode and joint scoring
    • Separate creation and storage of reference index
  • v0.0.1(Sep 21, 2021)

Owner
Kristoffer