vg
tools for working with variation graphs
minimizer_main.cpp File Reference
#include "subcommand.hpp"
#include <vg/io/vpkg.hpp>
#include <algorithm>
#include <iostream>
#include <vector>
#include <getopt.h>
#include <omp.h>
#include "../index_manager.hpp"
#include <gbwtgraph/index.h>
#include "../min_distance.hpp"
#include "../handle.hpp"
#include "../utility.hpp"

Functions

int get_default_threads ()
 
size_t get_default_k ()
 
size_t get_default_w ()
 
size_t get_default_s ()
 
void help_minimizer (char **argv)
 
int main_minimizer (int argc, char **argv)
 

Variables

constexpr int DEFAULT_MAX_THREADS = 16
 

Detailed Description

Defines the "vg minimizer" subcommand, which builds the minimizer index.

For each window of w successive kmers, the index contains the lexicographically smallest kmer among those kmers and their reverse complements. A kmer that contains characters other than A, C, G, and T is not indexed.
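The selection rule above can be illustrated on a plain string. The following is a minimal sketch, not the code used by vg or gbwtgraph: it works on a linear sequence rather than a graph, it compares kmers as strings exactly as the description states, and the names `Minimizer`, `reverse_complement`, and `find_minimizers` are hypothetical. Breaking ties in favor of the leftmost kmer is also an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record for one selected minimizer.
struct Minimizer {
    std::string kmer;   // smaller of the kmer and its reverse complement
    std::size_t offset; // start of the kmer on the forward strand
};

// Returns the reverse complement, or an empty string if the kmer
// contains a character other than A, C, G, or T.
std::string reverse_complement(const std::string& kmer) {
    std::string result;
    result.reserve(kmer.size());
    for (auto it = kmer.rbegin(); it != kmer.rend(); ++it) {
        switch (*it) {
            case 'A': result.push_back('T'); break;
            case 'C': result.push_back('G'); break;
            case 'G': result.push_back('C'); break;
            case 'T': result.push_back('A'); break;
            default: return std::string();
        }
    }
    return result;
}

// For every window of w successive kmers, select the lexicographically
// smallest kmer among the kmers and their reverse complements.
// Kmers with invalid characters are skipped (assumption: ties go to
// the leftmost kmer in the window).
std::vector<Minimizer> find_minimizers(const std::string& seq,
                                       std::size_t k, std::size_t w) {
    std::vector<Minimizer> result;
    if (k == 0 || w == 0 || seq.size() < k + w - 1) { return result; }

    // canonical[i] is empty if the kmer starting at i is not indexable.
    std::vector<std::string> canonical(seq.size() - k + 1);
    for (std::size_t i = 0; i < canonical.size(); ++i) {
        std::string forward = seq.substr(i, k);
        std::string reverse = reverse_complement(forward);
        if (!reverse.empty()) { canonical[i] = std::min(forward, reverse); }
    }

    const std::size_t none = canonical.size();
    for (std::size_t start = 0; start + w <= canonical.size(); ++start) {
        std::size_t best = none;
        for (std::size_t i = start; i < start + w; ++i) {
            if (canonical[i].empty()) { continue; }
            if (best == none || canonical[i] < canonical[best]) { best = i; }
        }
        if (best == none) { continue; } // no indexable kmer in this window
        assert(best >= start && best < start + w);
        // Adjacent windows often share a minimizer; report it only once.
        if (result.empty() || result.back().offset != best) {
            result.push_back({canonical[best], best});
        }
    }
    return result;
}
```

With `find_minimizers("ACGTTGCA", 3, 2)`, this sketch selects ACG at offset 0, AAC at offset 2, CAA at offset 3, and GCA at offset 4. Several windows sharing one minimizer, as in the window starting at offset 2 here, is the effect the description refers to when it notes that many windows have the same minimizer.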

The index contains either all minimizers or only the haplotype-consistent ones. Indexing all minimizers from complex graph regions can take a long time (e.g. tens of hours vs. 5-10 minutes for 1000GP), because many windows have the same minimizer. As the total number of minimizers is manageable (e.g. 1.5x more for 1000GP), it should be possible to develop a better algorithm for finding the minimizers.

A quick idea for indexing the entire graph:

Function Documentation

◆ get_default_k()

size_t get_default_k ( )

◆ get_default_s()

size_t get_default_s ( )

◆ get_default_threads()

int get_default_threads ( )

◆ get_default_w()

size_t get_default_w ( )

◆ help_minimizer()

void help_minimizer ( char **  argv)

◆ main_minimizer()

int main_minimizer ( int  argc, char **  argv )

Variable Documentation

◆ DEFAULT_MAX_THREADS

constexpr int DEFAULT_MAX_THREADS = 16
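This page does not document how DEFAULT_MAX_THREADS is used. One plausible reading, given get_default_threads() and the <omp.h> include, is that the constant caps the default thread count. The sketch below illustrates that reading only; the helper `capped_thread_count` and its parameter are hypothetical, and the available thread count is passed in explicitly so the example does not depend on OpenMP.

```cpp
#include <algorithm>
#include <cassert>

constexpr int DEFAULT_MAX_THREADS = 16;

// Hypothetical helper: clamp the thread count reported by the runtime
// to the range [1, DEFAULT_MAX_THREADS]. This is an assumption about
// the intent of the constant, not the documented behavior.
int capped_thread_count(int available_threads) {
    int result = std::max(1, std::min(available_threads, DEFAULT_MAX_THREADS));
    assert(result >= 1 && result <= DEFAULT_MAX_THREADS);
    return result;
}
```

Under this assumption, a machine reporting 64 threads would default to 16, while a machine reporting 8 would keep all 8.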