# CNFgen - list of benchmarks

# CNFgen

## Combinatorial benchmarks for SAT solvers

Here we highlight some of the most convenient and simple ways to get
benchmark formulas using `cnfgen`

.

- Pigeonhole principle
- Tseitin formulas
- Ordering principle
- Subset Cardinality
- Parity principle
- Counting principle
- Matching principle
- K-coloring formula
- K-clique formula
- Pebbling formula
- Stones formula
- Graph isomorphism formula
- Ramsey number formula
- Ramsey counterexample
- Clique-coloring formula
- Random k-CNF

### Pigeonhole principle

Pigeonhole principle claims that it is possible to assign \(M\) pigeons to \(N\) holes injectively. It is unsatisfiable when \(M>N\), and is generated by command line

cnfgen php M N

The following "sparse" variant of the pigeonhole principle has much shorter clauses.

cnfgen php M N d

Indeed, it generates a pigeonhole principle formula from \(m\) pigeons to \(n\) holes, too, but each pigeon is only allowed to fly into one among \(d\) of the available holes.

You can add functional and onto clauses using the command line options.

cnfgen php [--functional] [--onto] M N cnfgen php [--functional] [--onto] M N d

### Tseitin formulas

Basic UNSAT examples of Tseitin formulas can be generated via

cnfgen tseitin N cnfgen tseitin N d

which produce Tseitin formulas over random \(d\)-regular graphs, with random odd charge on the vertices (\(d\) defaults to 4 if unspecified).

For more control, Tseitin formula on a graph \(G\) has the command line

cnfgen tseitin [first|random|randomodd|randomeven] $G$

where \(G\) is a graph file or a graph specification, and where
charge can be either `first`

(only the first vertex is labeled 1)
or a random charge with, respectively, uspecified, odd, even
total charge.

For example the following command produces a `tseitin`

formula
over a 10 by 10 grid graph, with random labeling of the vertices.

cnfgen tseitin random grid 10 10

### Ordering principle

The ordering principle over \(N\) elements is generated by

cnfgen op N

and the *graph* ordering principle over a \(d\)-regular random graph
of \(N\) vertices is generate by

cnfgen op N d

It is possible to consider total orders with

cnfgen op --total N cnfgen op --total N d

For further control the graph ordering principle over an input graph \(G\) is produced by

cnfgen op G

### Subset cardinality

Subset cardinality formulas encode the requirement that, given \(\{-1,1\}\) values to the edges of a bipartite graph, the sum of the values incident to the left and right vertices must be, respectively, \(\geq 0\) and \(\leq 0\). To get UNSAT instances that are hard (at least for resolution) use

cnfgen subsetcard N

for increasing values of \(N\).

### Parity principle

The formula claims that \(N\) elements can be grouped in pairs. Clearly satisfiable if and only if \(N\) is even.

cnfgen parity N

### Counting principle

This formula that claims that a set of \(M\) elements can be partitioned into sets of size \(d\). Of course the formula is satisfiable if and only if \(d\) divides \(M\).

cnfgen count M d

so that

cnfgen count 10 4

gives an unsatisfiable formula while

cnfgen count 9 3

gives a satisfiable one.

### Matching principle

The formula claims some graph \(G\) has a matching. To generate the benchmark with \(G\) being a random \(d\)-regular on \(N\) vertices we can use

cnfgen matching gnd N d

The parity principle is equivalent to the matching formula over a complete graph.

cnfgen matching complete N

### K-coloring formula

The formula asserts that the input graph \(G\) is colorable with \(k\) colors.

cnfgen kcolor k G

For example a formula generated as

cnfgen kcolor 3 gnd 100 2 plantclique 4

is clearly unsatisfiable. Hard instances of \(3\)-coloring are random \(G(n,m)\) graphs with \(n\) vertices and \(m=2.522n\) edges placed at random. Unfortunately \(n\) must be large for the hardness to show up in practice.

cnfgen kcolor 3 gnm 1000 2530

### K-clique formula

The formula claims that there is no clique of size at least \(k\) in the input graph \(G\).

cnfgen kclique k G

It is possible to plant a clique in the graph with the option
`--plantclique`

. The formula generated by

cnfgen kclique 10 gnp 200 0.5 plantclique 15

claims that there is a clique of size 10 in a random ErdÃ¶s-Renyi graph with 200 vertices where a clique of size 15 has been planted.

For some hard instances you can ask for \(k\)-cliques in a \(G(n,p)\) random graph with \(p \ll n^{-2/(k-1)}\).

### Pebbling formula

Pebbling formulas are UNSAT CNFs which represent propagation from the sources to the sink of a directed acyclic graph. Command line

cnfgen peb pyramid H cnfgen peb tree H cnfgen peb path H

build pebbling formulas on pyramid, trees or path. of height \(H\).

Usually such formula are lifted via XOR substitution as in

cnfgen peb pyramid 10 -T xor 2

### Stones formulas

A more complex variant of the pebbling formula, apt to study hardness with respect to subsystems of resolution. The formula considers a set of \(s\) stones and a directed acyclic graph. Each vertex of the graph obtains a stone and then a process of propagation from the sources to the sink (similar to the one in the pebbling formula) is considered on the vertices with a stone on it. To generate a stone formula for, say, 12 stones and a pyramid graph:

cnfgen stone 12 pyramid 5

### Graph isomorphism formula

Produce a CNF that claims two graphs \(G_1\) and \(G_2\) to be
isomorphic. For example if the graphs are saved on `gml`

files use
the command.

cnfgen iso G1.gml -e G2.gml

If only one graph is given, the formula claims that the formula has a non trivial automorphism.

cnfgen iso G.gml

### Ramsey number formula

The command line

cnfgen ram s k N

produces a formula that claims that the Ramsey number \(r(s,k)\) is larger than \(N\).

### Ramsey counterexample

The formula claims that the graph \(G\) given in input is not a counter example for the Ramsey number \(r(k,s)\). Namely it claims that the graph has either a clique of size \(k\) or an independent set of size \(s\).

cnfgen ramlb k s G

### Clique-coloring formula

The formula asserts that there exists a graph \(G\) of \(n\) vertices that simultaneously

- contains a clique of \(k\) vertices
- it has a coloring with \(c\) colors.

cnfgen cliquecoloring n k c

Interesting hard UNSAT cases for resolution and cutting planes are when \(k \approx n^{2/3}\) and \(c=k-1\). For example

cnfgen cliquecoloring 64 16 15

is clearly unsatisfiable.

### Random K-CNF

Sample a random CNF with \(m\) clauses of \(k\) literals over \(n\) variables. Clauses are sampled without replacement. To get hard UNSAT random \(3\)-CNFs the user should pick about \(c n\) clauses where \(c>4.5\). Unfortunately this hardness is asymptotic, therefore \(n\) may need to be quite large.

cnfgen randkcnf 3 1000 4500