# CNFgen

# CNFgen

## Combinatorial benchmarks for SAT solvers

CNFgen is produces combinatorial benchmarks in DIMACS format, ready to be fed to SAT solvers. These benchmarks come mostly from research in Proof Complexity (e.g. pigeonhole principle, ordering principle, k-clique, …). Many of these formulas encode structured combinatorial problems well known to be challenging for certain SAT solvers.

### Features

`cnfgen`

generator for CNF formula benchmarks,- the
`cnfformula`

library (read the docs) for more flexible CNF generation and manipulation, - CNF based on graph structures (see the supported graph formats),
- behind-the-scene integration with many SAT solvers,
- output in DIMACS and LaTeX formats,
- formula post-processing.

### Installation

You can install CNFgen from Python Package Index, together with all its dependencies, typing either

pip3 install [--user] cnfgen

or

python3 -m pip install [--user] cnfgen

if `pip3`

is not a program on your path. Otherwise it is possible
to install from source, assuming the requirements are already
installed, using

python3 setup.py install [--user]

The `--user`

option allows to install the package in the user home
directory. If you do that please check that the target location for
the command line utilities are in your $PATH.

### Usage

Run `cnfgen -h`

to get information on how to invoke `cnfgen`

. To get
more information on how to generate some specific type of formula
run `cnfgen <formula_type> -h`

. See below for a gentler explanation
of to how invoke `cnfgen`

. A quick guide for each formula
family follows.

- Pigeonhole principle
- Graph pigeonhole principle
- Tseitin formulas
- Ordering principle
- Graph ordering principle
- Parity principle
- Matching principle
- Counting principle
- Pebbling formula
- Stones formula
- Graph isomorphism formula
- Graph automorphism formula
- Ramsey number formula
- Ramsey counterexample
- K-clique formula
- K-coloring formula
- Clique-coloring formula
- Random k-CNF
- Single conjunction
- Single disjunction

##### Pigeonhole principle

Pigeonhole principle from \(m\) pigeons to \(n\) holes,

cnfgen php <m> <n>

You can add functional and onto clauses using the command line options.

cnfgen php [--functional] [--onto] <m> <n>

##### Graph pigeonhole Principle

The variant of pigeonhole principle over bipartite graphs, given in input or generated from the command line.

cnfgen gphp -i <bipartite_graph>

cnfgen gphp --bregular <P> <H> <d>

The latter command produces a pigeonhole formula over a random bipartite graph with \(P\) pigeons, \(H\) holes, and degree \(d\) on the left side. The graph is regular on the right side as well, therefore \(H\) must divide \(P\cdot d\).

##### Tseitin formulas

The Tseitin formula on a graph `<G>`

(given in input) has the
command line

cnfgen tseitin –charge <type> -i <G>

The initial charge of vertices is either `first`

(only the
first vertex is labeled 1) or one of `random`

, `randomodd`

,
`randomeven`

.

For example this produces a `tseitin`

formula over a 10 by 10 grid
graph, with random labeling of the vertices.

cnfgen tseitin --charge random --grid 10 10

##### Ordering principle

The ordering principle over N elements is generated by

cnfgen op <N>

It is possible to consider total orders with

cnfgen -t op <N>

##### Graph ordering principle

The graph ordering principle over an input graph `<G>`

is produced by

cnfgen gop -i <G>

The vanilla ordering principle equivalent to the graph ordering principle if the underlying graph is the complete one, as in the example

cnfgen gop --complete 30

##### Matching principle

The formula claims a graph `<G>`

in input has a perfect matching

cnfgen matching -i <G>

The parity principle is equivalent to the matching formula over a complete graph

cnfgen matching --complete 10

##### Counting principle

A formula that claims that a set of `<M>`

elements can be
partitioned into sets of size `<p>`

.

cnfgen count <M> <p>

so that

cnfgen count 10 4

gives an unsatisfiable formula while

cnfgen count 9 3

gives a satisfiable one.

##### Pebbling formula

We can generate a pebbling formula for a directed acyclic graph
`<D>`

in input using

cnfgen peb -i <D>

and, for example, we can generate the pebbling formula for a rooted binary tree of height 10 with

cnfgen peb --tree 10

##### Stones formulas

A variant of the pebbling formula. To generate the stone formula
for a directed acyclic graph `<D>`

in input and with a collection
of `<s>`

stones use

cnfgen stone -i <D> <s>

##### Graph isomorphism formula

Produce a CNF that claims two graphs `<G1>`

and `<G2>`

to
be isomorphic.

cnfgen giso -1 <G1> -2 <G2>

##### Graph automorphism formula

Produce a CNF that claims that a graph `<G>`

has no nontrivial automorphism.

cnfgen gauto -i <G>

##### Ramsey number formula

The command line

cnfgen ram <s> <k> <N>

produces a formula that claims that the Ramsey number r(s,k) is larger than N.

##### Ramsey counterexample

The formula claims that the graph `<G>`

given in input is not
a counter example for the Ramsey number \(r(k,s)\). Namely it claims
that the graph has either a clique of size `<k>`

or an independent
set of size `<s>`

.

cnfgen ramlb <k> <s> -i <G>

##### K-clique formula

The formula claims that there is no clique of
size at least <k> in the input graph `<G>`

.

cnfgen kclique <k> -i <G>

It is possible to plant a clique in the graph with the option
`--plantclique`

. The formula generated by

cnfgen kclique 10 --gnp 200 0.5 --plantclique 15

claims that there is a clique of size 10 in a random Erdös-Renyi graph with 200 vertices where a clique of size 15 has been planted.

##### K-coloring formula

The formula asserts that the input graph `<G>`

is colorable with
<k> colors.

cnfgen kcolor <k> -i <G>

For example a formula generated as

cnfgen kcolor --gnd 100 2 --plantclique 4

is clearly unsatisfiable.

##### K-coloring formula

The formula asserts that there exists a graph \(G\) of `<n>`

vertices that simultaneously

- contains a clique of
`<k>`

vertices - it has a coloring with
`<c>`

colors.

cnfgen cliquecoloring <n> <k> <c>

In particular the theoretical interesting cases are when `<k>`

is
greater than `<c>`

. For example

cnfgen cliquecoloring 100 25 24

is clearly unsatisfiable.

##### Random K-CNF

Sample a random CNF with `<m>`

clauses of `<k>`

literals over
`<n>`

variables. Clauses are sampled without replacement.

**Warning** the sampling could take infinite time if the ratio
between remaining clauses and clauses to sample is too small.

cnfgen randkcnf <k> <n> <m>

### The CNFgen command line

The command line for producing a DIMACS file is

cnfgen [-o <output_file>] <formula_type> <formula_parameters>

where each choice of `<formula_type>`

has its own parameters and
options. For example here's how to get a pigeonhole principle
formula from 10 pigeons to 7 holes printed to standard output:

cnfgen php 10 7

We implement several families of formula in `cnfgen`

tool.
For a full list of implemented formula families type ```
cnfgen
--help
```

. To get specific information on one family use

cnfgen <formula_type> --help

Some formulas require input graph(s), and the `cnfgen`

tool supports
several graph file formats. More importantly `cnfgen`

can generate
(and save on a side) the graph itself. Here's how to generate
a formula encoding the Graph ordering principle on a random regular
graph with 10 vertices and degree 3.

cnfgen gop --gnd 10 3

Or the formula claiming the 3-colorability formula of a 15 by 15 grid graph.

cnfgen kcolor 3 --grid 15 15

In the next example we generate the formula that claims the graph isomorphism between (1) the bidimensional torus of 3x1 and (2) the complete graph over three vertices. This formula is clearly satisfiable.

cnfgen giso --torus1 3 1 --complete2 3

### Acknowledgments

The CNFgen project is by Massimo Lauria (lauria.massimo@gmail.com), with helps and contributions by Marc Vinyals, Jan Elffers, Mladen Mikša and Jakob Nordström from KTH Royal Institute of Technology in Stockholm, Sweden. Special thanks to Adam Lugowski who updated the codebase to work on Python 3 and NetworkX 2.

A large part of this work has been funded by

- [2016-2017] The European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement ERC-2014-CoG 648276 AUTAR)
- [2012-2015] The European Research Council under the European Union's Seventh Framework Programme (FP7/2007–2013) ERC grant agreement no. 279611.