View on GitHub


Combinatorial benchmarks for SAT solvers

Download this project as a .zip file Download this project as a tar.gz file

CNFgen is produces combinatorial benchmarks in DIMACS format, ready to be fed to SAT solvers. These benchmarks come mostly from research in Proof Complexity (e.g. pigeonhole principle, ordering principle, k-clique, …). Many of these formulas encode structured combinatorial problems well known to be challenging for certain SAT solvers.


  • cnfgen generator for CNF formula benchmarks,
  • the cnfformula library (read the docs) for more flexible CNF generation and manipulation,
  • CNF based on graph structures (see the supported graph formats),
  • behind-the-scene integration with many SAT solvers,
  • output in DIMACS and LaTeX formats,
  • formula post-processing.


You can install CNFgen from Python Package Index, together with all its dependencies, typing either

easy_install [--user] cnfgen


pip install  [--user] cnfgen

Otherwise it is possible to install from source using

python install [--user]

The --user option allows to install the package in the user home directory. If you do that please check that the location where the command line utilities are installed is in your PATH environment variable.


Run cnfgen -h to get information on how to invoke cnfgen. To get more information on how to generate some specific type of formula run cnfgen <formula_type> -h. See below for a gentler explanation of to how invoke cnfgen. A quick guide for each formula family follows.

Pigeonhole principle

Pigeonhole principle from \(m\) pigeons to \(n\) holes,

cnfgen php <m> <n>

You can add functional and onto clauses using the command line options.

cnfgen php [--functional] [--onto] <m> <n>
Graph pigeonhole Principle

The variant of pigeonhole principle over bipartite graphs, given in input or generated from the command line.

cnfgen gphp -i <bipartite_graph>
cnfgen gphp --bregular <P> <H> <d>

The latter command produces a pigeonhole formula over a random bipartite graph with \(P\) pigeons, \(H\) holes, and degree \(d\) on the left side. The graph is regular on the right side as well, therefore \(H\) must divide \(P\cdot d\).

Tseitin formulas

The Tseitin formula on a graph <G> (given in input) has the command line

cnfgen tseitin –charge <type> -i <G>

The initial charge of vertices is either first (only the first vertex is labeled 1) or one of random, randomodd, randomeven.

For example this produces a tseitin formula over a 10 by 10 grid graph, with random labeling of the vertices.

cnfgen tseitin --charge random --grid 10 10
Ordering principle

The ordering principle over N elements is generated by

cnfgen op <N>

It is possible to consider total orders with

cnfgen -t op <N>
Graph ordering principle

The graph ordering principle over an input graph <G> is produced by

cnfgen gop -i <G>

The vanilla ordering principle equivalent to the graph ordering principle if the underlying graph is the complete one, as in the example

cnfgen gop --complete 30
Matching principle

The formula claims that <N> elements can be matched in pairs.

cnfgen matching <N>
Graph matching principle

The formula claims a graph <G> in input has a perfect matching

cnfgen gmatching -i <G>

The matching formula is equivalent to the graph matching formula over a complete graph

cnfgen gmatching --complete 10
Counting principle

A formula that claims that a set of <M> elements can be partitioned into sets of size <p>.

cnfgen count <M> <p>

so that

cnfgen count 10 4

gives an unsatisfiable formula while

cnfgen count 9 3 

gives a satisfiable one.

Pebbling formula

We can generate a pebbling formula for a directed acyclic graph <D> in input using

cnfgen peb -i <D>

and, for example, we can generate the pebbling formula for a rooted binary tree of height 10 with

cnfgen peb --tree 10
Stones formulas

A variant of the pebbling formula. To generate the stone formula for a directed acyclic graph <D> in input and with a collection of <s> stones use

cnfgen stone -i <D> <s>
Graph isomorphism formula

Produce a CNF that claims two graphs <G1> and <G2> to be isomorphic.

cnfgen giso -1 <G1> -2 <G2>
Graph automorphism formula

Produce a CNF that claims that a graph <G> has no nontrivial automorphism.

cnfgen gauto -i <G>
Ramsey number formula

The command line

cnfgen ram <s> <k> <N>

produces a formula that claims that the Ramsey number r(s,k) is larger than N.

Ramsey counterexample

The formula claims that the graph <G> given in input is not a counter example for the Ramsey number \(r(k,s)\). Namely it claims that the graph has either a clique of size <k> or an independent set of size <s>.

cnfgen ramlb <k> <s> -i <G>
K-clique formula

The formula claims that there is no clique of size at least <k> in the input graph <G>.

cnfgen kclique <k> -i <G>

It is possible to plant a clique in the graph with the option --plantclique. The formula generated by

cnfgen kclique 10 --gnp 200 0.5 --plantclique 15

claims that there is a clique of size 10 in a random Erdös-Renyi graph with 200 vertices where a clique of size 15 has been planted.

K-coloring formula

The formula asserts that the input graph <G> is colorable with <k> colors.

cnfgen kcolor <k> -i <G>

For example a formula generated as

cnfgen kcolor --gnd 100 2 --plantclique 4 

is clearly unsatisfiable.

K-coloring formula

The formula asserts that there exists a graph \(G\) of <n> vertices that simultaneously

  • contains a clique of <k> vertices
  • it has a coloring with <c> colors.
cnfgen cliquecoloring <n> <k> <c>

In particular the theoretical interesting cases are when <k> is greater than <c>. For example

cnfgen cliquecoloring 100 25 24

is clearly unsatisfiable.

Random K-CNF

Sample a random CNF with <m> clauses of <k> literals over <n> variables. Clauses are sampled without replacement.

Warning the sampling could take infinite time if the ratio between remaining clauses and clauses to sample is too small.

cnfgen kcnf <k> <n> <m>
Single conjunction

Produce a CNF made by unit clauses, <p> positives and <n> negatives.

cnfgen and <p> <n>
Single disjunction

Produce a CNF made by a single clause, with <p> positives literals and <n> negatives ones.

cnfgen or <p> <n>

The CNFgen command line

The command line for producing a DIMACS file is

cnfgen [-o <output_file>] <formula_type> <formula_parameters>

where each choice of <formula_type> has its own parameters and options. For example here's how to get a pigeonhole principle formula from 10 pigeons to 7 holes printed to standard output:

cnfgen php 10 7 

We implement several families of formula in cnfgen tool. For a full list of implemented formula families type cnfgen --help. To get specific information on one family use

cnfgen <formula_type> --help

Some formulas require input graph(s), and the cnfgen tool supports several graph file formats. More importantly cnfgen can generate (and save on a side) the graph itself. Here's how to generate a formula encoding the Graph ordering principle on a random regular graph with 10 vertices and degree 3.

cnfgen gop --gnd 10 3

Or the formula claiming the 3-colorability formula of a 15 by 15 grid graph.

cnfgen kcolor 3 --grid 15 15

In the next example we generate the formula that claims the graph isomorphism between (1) the bidimensional torus of 3x1 and (2) the complete graph over three vertices. This formula is clearly satisfiable.

cnfgen giso --torus1 3 1 --complete2 3


The CNFgen project is by Massimo Lauria (, with helps and contributions by Marc Vinyals, Jan Elffers, Mladen Mikša and Jakob Nordström from KTH Royal Institute of Technology in Stockholm, Sweden. The work on this software has been funded

  • [2016-2017] The European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement ERC-2014-CoG 648276 AUTAR)
  • [2012-2015] The European Research Council under the European Union's Seventh Framework Programme (FP7/2007–2013) ERC grant agreement no. 279611.