Welcome to pygobnilp’s documentation!¶
The Gobnilp class¶

class
pygobnilp.gobnilp.
Gobnilp
¶ Subclass of the Gurobi Model class specific to learning Bayesian networks. See documentation for the Gurobi Model class for all methods not documented here.

exception
StageError
¶ Raised when a method is called at the wrong stage of learning.

exception
UserConstraintError
¶ Raised when there is a problem with a user-defined constraint.

__init__
()¶ Initialise a Gobnilp object

absolute_generation_difference
¶ Dictionary of absolute generation difference variables (if these variables have been created, by default they are not)
Assuming the appropriate constraints have been added, if it exists absolute_generation_difference[v1,v2] is the absolute value of generation_difference[v1,v2].
Raises: Gobnilp.StageError – If no absolute generation difference variables have been created
Type: dict

add_basic_constraints
()¶ Adds the most useful constraints
Adds the constraints added by the following methods:

add_basic_variables
()¶ Adds the most useful Gurobi MIP variables
Adds the variables added by the following methods:
Arrow and adjacency variables are given a higher branching priority than family variables to encourage a balanced search tree.

add_constraints_4b
()¶ Adds “4B” constraints.
All possibly useful 4B constraints are created, but these are stored as lazy constraints with a lazy setting of 3, which means that a 4B constraint is pulled into the MIP model only when it cuts off the current linear relaxation solution (or integer solution).
See Bayesian Network Structure Learning with Integer Programming: Polytopes, Facets and Complexity (Cussens et al., JAIR) for details.

add_constraints_absgendiff
()¶ Adds constraints linking generation difference variables to absolute generation difference variables

add_constraints_arrow_adjacency
()¶ Add constraints that there is an adjacency between v1 and v2 in the undirected skeleton if there is either an arrow from v1 to v2, or an arrow from v2 to v1 in the DAG

add_constraints_arrow_family
()¶ Adds constraints linking arrow variables to family variables
If pa is not a parent of ch in any family then the corresponding arrow variable ch<-pa is simply removed.

add_constraints_arrow_total_order
()¶ Adds constraints linking arrow variables to total order variables

add_constraints_bests
()¶ Add the constraint that at least one BN variable has its best scoring parent set selected

add_constraints_choose_best_for_order
()¶ Adds the constraint that the highest scoring parent set should be chosen whenever the total order variables indicate that doing so would not cause a cycle

add_constraints_chordal
()¶ Adds simple constraints so that only chordal DAGs, i.e. those without v-structures (aka immoralities), are feasible.
Constraints are roughly of this sort:
a<{b,c} + b<{a,c} + c<{a,b} <= ab

add_constraints_clusters
(cluster_cuts=True, matroid_cuts=False, matroid_constraints=False)¶ Adds cluster constraints
For any cluster of BN variables, the cluster constraint states that at least one element in the cluster has no parents in that cluster.
These constraints are always added lazily, since there are exponentially many of them.
Parameters:  cluster_cuts (bool) – If True then cluster constraints are added as cuts (i.e. as soon as the linear relaxation is solved). If False, we wait until a candidate integer solution is found.
 matroid_cuts (bool) – If True then cuts corresponding to rank 2 matroids are also added.
 matroid_constraints (bool) – If True then constraints corresponding to rank 2 matroids are added when an integer solution corresponding to a cyclic digraph is generated.

add_constraints_cycles
()¶ Adds cycle constraints (on arrow variables)
Since there are exponentially many possible cycles, these constraints are added lazily (via a callback).

add_constraints_gen_arrow_indicator
()¶ Adds constraints stating that an arrow from parent to child means that the child’s generation number is strictly greater than the parent’s.

add_constraints_gen_index_link
()¶ Adds the constraint linking generation to generation index variables

add_constraints_gendiff
()¶ Adds constraints linking generation difference variables to generation variables.

add_constraints_genindex
()¶ Adds the constraint that each variable has exactly one generation index and that each of these indices is distinct

add_constraints_kbranching
(k=0)¶ Adds a constraint so that the learned BN is a k-branching.
A DAG is a branching if each child has at most one parent. A DAG is a k-branching if there is a set of at most k edges the removal of which results in a branching.
Parameters: k (int) – The value of k
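The k-branching definition above can be illustrated with a short sketch. This is not part of the pygobnilp API (the helper name is made up): since a branching allows each child at most one parent, the minimum number of edges to delete is the sum over children of max(0, number of parents - 1).

```python
# Illustration only: edges_to_delete_for_branching is a hypothetical
# helper, not a pygobnilp method. It computes the minimum number of
# edges whose removal turns a DAG into a branching: each child may
# keep at most one parent, so the answer is
# sum over children of max(0, |parents| - 1).
def edges_to_delete_for_branching(parent_sets):
    """parent_sets maps each node to the set of its parents."""
    return sum(max(0, len(ps) - 1) for ps in parent_sets.values())

# 'd' has three parents, so two of its in-edges must be deleted:
# this DAG is a 2-branching but not a 1-branching.
dag = {'a': set(), 'b': set(), 'c': set(), 'd': {'a', 'b', 'c'}}
```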

add_constraints_one_dag_per_MEC
(dynamic=True, careful=False)¶ Adds a constraint that only one DAG per Markov equivalence class is feasible.
The constraint is effected by adding appropriate lazy constraints when Gurobi generates a DAG solution.
Parameters:  dynamic (bool) – If True then which DAG is feasible for each Markov equivalence class is arbitrary (it will be the first one Gurobi comes across). If False then the representative DAG is fixed (and is determined by the method mec2dag).
 careful (bool) – If True then all the lazy constraints are stored (not just posted) to ensure that new solutions satisfy them. (The value for careful is ignored if dynamic is False.)

add_constraints_oneparentset
()¶ Adds the constraint that each child has exactly one parent set.

add_constraints_polytree
()¶ Adds the constraint that the DAG should be a polytree
Constraints (and cuts) ruling out cycles in the undirected skeleton are always added lazily, since there are exponentially many of them.
Cluster constraints are removed if this constraint is added, since ruling out cycles in the undirected skeleton prevents any in the DAG.

add_constraints_setpacking
()¶ Adds constraints like a<b,c + b<a,c + c<a,b <= 1. That is an example for a “triple”. Also adds similar constraints for 4-tuples.

add_constraints_sumgen
()¶ Adds the constraint that the sum of generation numbers is n*(n-1)/2

add_constraints_total_order
(lazy=0)¶ Adds constraints so that total order variables represent a total order
Parameters: lazy (int) – Controls the ‘laziness’ of these constraints by setting the Lazy attribute of the constraints. See the Gurobi documentation.

add_forbidden_adjacency
(uv)¶ Add a constraint that a pair of vertices must not be adjacent.
Parameters: uv (iter) – Pair of nodes
Raises: Gobnilp.UserConstraintError – If the adjacency is also obligatory.

add_forbidden_ancestor
(u, v)¶ Add constraint that there can be no directed path from u to v
Parameters:  u (str) – Start of path
 v (str) – End of path
Raises: Gobnilp.UserConstraintError
– If the vertices are the same or the directed path is also obligatory.

add_forbidden_arrow
(u, v)¶ Add a constraint that there can be no arrow from u to v
Parameters:  u (str) – Tail of forbidden arrow
 v (str) – Head of forbidden arrow
Raises: Gobnilp.UserConstraintError
– If the vertices are the same or the arrow is also obligatory.

add_obligatory_adjacency
(uv)¶ Add a constraint that a pair of vertices must be adjacent.
Parameters: uv (iter) – Pair of nodes
Raises: Gobnilp.UserConstraintError – If the adjacency is also forbidden.

add_obligatory_ancestor
(u, v)¶ Add a constraint that there must be a directed path between specified vertices.
Parameters:  u (str) – Start of path
 v (str) – End of path
Raises: Gobnilp.UserConstraintError
– If the vertices are the same or the directed path is also forbidden or would create a cycle.

add_obligatory_arrow
(u, v)¶ Add constraint that there must be an arrow from u to v
Parameters:  u (str) – Tail of arrow (the arrow goes from u to v)
 v (str) – Head of arrow
Raises: Gobnilp.UserConstraintError
– If the vertices are the same or the arrow is also forbidden or would create a cycle.

add_obligatory_conditional_independence
(a, b, s)¶ Add a constraint that each BN variable in a must be independent of each BN variable in b conditional on s.
Parameters:  a (iter) – A set of BN variables
 b (iter) – A set of BN variables
 s (iter) – A set, possibly empty, of BN variables
Raises: Gobnilp.UserConstraintError
– If a or b is empty or the 3 sets are not disjoint or the desired conditional independence is not possible (given other constraints).

add_obligatory_independence
(a, b)¶ Add a constraint that each BN variable in a must be independent of each BN variable in b.
Parameters:  a (iter) – A set of BN variables
 b (iter) – A set of BN variables
Raises: Gobnilp.UserConstraintError
– If the two sets are not disjoint or the desired independence is not possible (given other constraints).

add_variables_absgendiff
(branch_priority=0)¶ Adds variables representing the absolute difference in generation number between each pair of distinct BN variables.
These variables are constrained to have a lower bound of 1. So as long as constraints are posted connecting these variables to the generation_difference variables and thus ultimately to the generation variables, then each BN variable will have a different generation number.
Calling add_constraints_absgendiff ensures that these variables indeed are equal to the absolute values of generation difference variables. Generation variables are added with add_variables_gen. Generation difference variables are added with add_variables_gendiff. See the documentation for these two methods for details of how to add appropriate constraints.
All these variables are given objective value 0. (This can be overridden using the Obj attribute of the variable.)
Parameters: branch_priority (int) – The Gurobi branching priority for the absolute generation difference variables.

add_variables_adjacency
(branch_priority=0)¶ Adds binary Gurobi MIP adjacency variables to the model
The adjacency variable corresponding to {v1,v2} is set to 1 iff there is an arrow from v1 to v2 or an arrow from v2 to v1.
To connect these variables appropriately to arrow variables it is necessary to call add_constraints_arrow_adjacency.
All these variables are given objective value 0. (This can be overridden using the Obj attribute of the variable.)
Parameters: branch_priority (int) – The Gurobi branching priority for the adjacency variables.

add_variables_arrow
(branch_priority=0)¶ Adds binary Gurobi MIP arrow variables to the model
The arrow variable corresponding to (pa,ch) is set to 1 iff there is an arrow from pa to ch in a learned BN.
To connect these variables appropriately to family variables it is necessary to call add_constraints_arrow_family.
All these variables are given objective value 0. (This can be overridden using the Obj attribute of the variable.)
Parameters: branch_priority (int) – The Gurobi branching priority for the arrow variables.

add_variables_family
(branch_priority=0, best_first_branch_priority=False)¶ Adds binary Gurobi MIP family variables to the model
This method should only be called after data (or local scores) have been read in using, for example, a method such as
input_discrete_data
Parameters:  branch_priority (int) – The Gurobi branching priority for the family variables. This value is ignored if best_first_branch_priority is True.
 best_first_branch_priority (bool) – If True then the branching priority for the family variables for any given child are (totally) ordered according to local score, with the higher scoring families given higher branching priority than lower ones.

add_variables_gen
(branch_priority=0)¶ Adds generation variables to the model
A generation number for a variable in a DAG is an integer such that any variable has a generation number strictly greater than any of its parents.
To connect these variables appropriately to arrow variables it is necessary to call add_constraints_gen_arrow_indicator. To set the sum of all generation numbers to n*(n-1)/2 use add_constraints_sumgen.
All these variables are given objective value 0. (This can be overridden using the Obj attribute of the variable.)
Parameters: branch_priority (int) – The Gurobi branching priority for the generation variables.

add_variables_gendiff
(branch_priority=0)¶ Adds variables representing the difference in generation number between distinct BN variables
Generation and generation difference variables are connected appropriately with add_constraints_gendiff.
All these variables are given objective value 0. (This can be overridden using the Obj attribute of the variable.)
Parameters: branch_priority (int) – The Gurobi branching priority for the generation difference variables.

add_variables_genindex
(branch_priority=0, earlyfirst=True)¶ Adds binary variables indicating whether a BN variable has a particular generation number
Parameters:  branch_priority (int) – The Gurobi branching priority for the generation index variables. (Ignored if earlyfirst==True.)
 earlyfirst (bool) – If True then generation index variables for low generation numbers have higher branching priority than those for high generation numbers.

add_variables_kbranching
(branch_priority=0, ub=None)¶ Adds a variable which is the number of arcs that must be deleted for the learned DAG to be a branching. In a branching each node has at most one parent.
Parameters:  branch_priority (int) – The Gurobi branching priority for this variable.
 ub (int) – An upper bound for this variable

add_variables_kbranching_ch
(branch_priority=0)¶ Adds variables recording max(0, number of parents - 1) for each child
Parameters: branch_priority (int) – The Gurobi branching priority for these variables.

add_variables_total_order
(branch_priority=0)¶ Adds binary Gurobi MIP total order variables to the model
The total order variable corresponding to (v1,v2) is set to 1 iff v1 > v2 in the total order associated with a learned BN. Parents always come before children in the total order.
To connect these variables appropriately to arrow variables it is necessary to call add_constraints_arrow_total_order.
All these variables are given objective value 0. (This can be overridden using the Obj attribute of the variable.)
Parameters: branch_priority (int) – The Gurobi branching priority for the total order variables.

adjacency
¶ Dictionary of adjacency variables (if these variables have been created, by default they are)
Assuming the appropriate constraints have been added, adjacency[{v1,v2}] is the adjacency variable indicating that v1 and v2 are adjacent.
Raises: Gobnilp.StageError – If no adjacency variables have been created
Type: dict

allowed_user_constypes
= ('forbidden_arrows', 'forbidden_adjacencies', 'obligatory_arrows', 'obligatory_adjacencies', 'obligatory_ancestors', 'forbidden_ancestors', 'obligatory_conditional_independences')¶ Constraint types available to users. Used as key values when providing constraints via a dictionary. For each constraint type there is also a method for adding constraints whose name has add_ as a prefix and is in the singular.
See also
input_user_conss_from_dict
add_forbidden_arrow
add_forbidden_adjacency
add_obligatory_arrow
add_obligatory_adjacency
add_obligatory_ancestor
add_forbidden_ancestor
add_obligatory_conditional_independence
Type: tuple

arrow
¶ Dictionary of arrow variables (if these variables have been created, by default they are)
Assuming the appropriate constraints have been added, arrow[pa,ch] is the arrow variable indicating that there is an arrow from pa to ch.
Raises: Gobnilp.StageError – If no arrow variables have been created
Type: dict

before
(stage1, stage2)¶ Is stage1 strictly before stage2?
Parameters:  stage1 (str) – A Gobnilp learning stage
 stage2 (str) – A Gobnilp learning stage
Raises: KeyError – If any of the arguments is not the name of a Gobnilp learning stage
Returns: Whether stage1 is strictly before stage2
Return type: bool

between
(stage1, stage2, stage3)¶ Is stage2 strictly after stage1 but not strictly after stage3?
Parameters:  stage1 (str) – A Gobnilp learning stage
 stage2 (str) – A Gobnilp learning stage
 stage3 (str) – A Gobnilp learning stage
Raises: KeyError – If any of the arguments is not the name of a Gobnilp learning stage
Returns: Whether stage2 is strictly after stage1 but not strictly after stage3
Return type: bool

bn_variables
¶ The BN variables (in order)
Type: list

bn_variables_index
¶ Maps each BN variable to its index in the sorted list of BN variables
Type: dict

child
¶ child[i] is the child in the family with index i.
Type: list

clear_basic_model
()¶ Removes variables and constraints added by make_basic_model.

dag2mec
(dag)¶ Finds the Markov equivalence class for a DAG
Parameters: dag (iterable) – A DAG represented by a collection of (indices for) ‘families’, where each family is a BN variable together with its parents. Returns: A set of sets which is a sparse representation of the characteristic imset which represents the Markov equivalence class. If (and only if) a set has value 1 in the characteristic imset then it is included as an element in the returned set. Return type: frozenset

data
¶ Data associated with the instance
Type: pandas.DataFrame

data_arities
¶ Arities of the variables in the data
The order of the arities matches the order of the variables in the data (see data_variables), not the order in bn_variables (which is always sorted).
Raises: AttributeError – If continuous data is being used
Type: numpy.array

data_variables
¶ The variables in the data
Variables are in the order supplied by the original data source not the order in bn_variables (which is always in sorted order).
Type: list

family
¶ Dictionary of family variables (if these variables have been created, by default they are)
Assuming the appropriate constraints have been added, family[child][parent_set] is the family variable indicating that parent_set is the parent set for child.
Raises: Gobnilp.StageError – If no family variables have been created
Type: dict

family_list
¶ family_list[i] is the family variable for the family with index i (if these variables have been created; by default they are).
Raises: Gobnilp.StageError – If no family variables have been created
Type: list

family_scores
¶ Dictionary of family scores (aka local scores)
family_scores[child][parentset] is the local score for child having parentset (a frozenset) as its parents.
Raises: Gobnilp.StageError – If no local scores have been created
Type: dict

forbidden_adjacencies
¶ If {u,v} is included then an adjacency between u and v is forbidden.
Type: set

forbidden_ancestors
¶ If (u,v) is included then a directed path from u to v is forbidden.
See also add_forbidden_ancestor
Type: set

forbidden_arrows
¶ If (u,v) is included then an arrow from u to v is forbidden.
Type: set

generation
¶ Dictionary of generation variables (if these variables have been created by calling add_variables_gen)
Assuming appropriate constraints have been added, by, for example, calling the method add_constraints_gen_arrow_indicator, generation[v1] is the generation number for v1.
See Section 3.1.2 of Maximum likelihood pedigree reconstruction using integer programming (Cussens, 2010) for details of how generation variables can be used to rule out cycles in directed graphs.
Raises: Gobnilp.StageError – If no generation variables have been created
Type: dict

generation_difference
¶ Dictionary of generation difference variables (if these variables have been created by calling add_variables_gendiff)
Assuming the appropriate constraints have been added, by, for example, calling the method add_constraints_gendiff, generation_difference[v1,v2] = generation[v1] - generation[v2].
Raises: Gobnilp.StageError – If no generation difference variables have been created
Type: dict

generation_index
¶ Dictionary of generation index variables (if these variables have been created, by default they are not)
Assuming the appropriate constraints have been added, if it exists generation_index[v1,pos] indicates whether v1 has generation number pos.
Raises: Gobnilp.StageError – If no generation index variables have been created
Type: dict

get_family_index
¶ Maps a family to its index
get_family_index[child][parents] is the index for the given family.
See also child, parents and family_list.
Type: dict

gobnilp_optimize
()¶ Solve the MIP model constructed by Gobnilp
This overrides Gurobi’s optimize method by hard coding in calls to a callback function for adding Gobnilp specific lazy constraints (and cuts).

input_local_scores
(local_scores)¶ Read local scores from a dictionary.
Once this method has been run, methods for adding MIP variables, such as add_variables_family and add_basic_variables, can be used.
Parameters: local_scores (dict) – Dictionary containing local scores. local_scores[child][parentset] is the score for child having parents parentset where parentset is a frozenset.
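For instance, a minimal sketch of the expected dictionary shape (the variable names and scores below are invented for illustration):

```python
# Hypothetical local scores for three BN variables A, B, C.
# Inner keys must be frozensets of parent variables; the numbers
# are made up for illustration.
local_scores = {
    'A': {frozenset(): -10.2, frozenset({'B'}): -9.7},
    'B': {frozenset(): -12.0, frozenset({'A'}): -11.1,
          frozenset({'A', 'C'}): -10.8},
    'C': {frozenset(): -8.5},
}
# Assuming `m` is a Gobnilp object, this dictionary could then be
# passed in with: m.input_local_scores(local_scores)
```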

input_user_conss
(consfile)¶ Read user constraints from a Python module and store them
If consfile is None then this method returns silently.
Such constraints can be read in prior to the computation of local scores, and can make that computation more efficient.
Parameters: consfile (str/None) – If not None then a file containing user constraints

input_user_conss_from_dict
(consdict)¶ Read user constraints from a dictionary and store them
The keys of the dictionary must be strings each of which should name an allowed constraint type. Each value should be either (1) a sequence of items where each item is a sequence (perhaps of length one) providing argument(s) to give to the corresponding “add_..” method. For example, dkt1 (below) would be an acceptable dictionary (assuming A, B, C, E and F are BN variables):
dkt1 = {'forbidden_adjacencies': [['AB'], ['BC']], 'obligatory_arrows': [['E','F']]}
or (2) a function which, when given the Gobnilp object, returns a sequence of the same form as in case (1). For example, dkt2 would be an acceptable dictionary:
def no_arrows(gobnilp):
    return [(v, w) for v in gobnilp.bn_variables for w in gobnilp.bn_variables if w != v]
dkt2 = {'forbidden_arrows': no_arrows}
See also
allowed_user_constypes
add_forbidden_arrow
add_forbidden_adjacency
add_obligatory_arrow
add_obligatory_adjacency
add_obligatory_ancestor
add_forbidden_ancestor
add_obligatory_conditional_independence
Such constraints can be read in prior to the computation of local scores, and can make that computation more efficient.
Parameters: consdict (dict/None) – Dictionary mapping the names of allowed constraint types to constraints
Raises: ValueError – If the dictionary contains a key that is not (the name of) an allowed constraint type
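As a sketch of case (2), the following runs a dkt2-style function against a minimal stand-in object. A real Gobnilp object would be used in practice; the stub below is hypothetical and only provides bn_variables:

```python
def no_arrows(gobnilp):
    # Forbid every arrow: return all ordered pairs of distinct variables.
    return [(v, w) for v in gobnilp.bn_variables
            for w in gobnilp.bn_variables if w != v]

class _StubGobnilp:
    # Hypothetical stand-in for a Gobnilp object; only bn_variables
    # is needed for this example.
    bn_variables = ['A', 'B', 'C']

dkt2 = {'forbidden_arrows': no_arrows}
pairs = dkt2['forbidden_arrows'](_StubGobnilp())
# pairs holds all 6 ordered pairs of distinct variables, e.g. ('A', 'B')
```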

learn
(data_source=None, varnames=None, header=True, comments='#', delimiter=None, start='no data', end='output written', data_type='discrete', score='BDeu', local_score_fun=None, k=1, sdresidparam=True, standardise=False, arities=None, palim=3, alpha=1.0, nu=None, alpha_mu=1.0, alpha_omega=None, starts=(), local_scores_source=None, nsols=1, kbest=False, mec=False, polytree=False, chordal=False, consfile=None, consdict=None, settingsfile=None, pruning=True, edge_penalty=0.0, plot=True, abbrev=True, output_scores=None, output_stem=None, output_dag=True, output_cpdag=True, output_ext=('pdf', ), verbose=0, gurobi_output=False, **params)¶ Parameters:  data_source (str/array_like) – If not None, name of the file containing the discrete data or an array_like object. If None, then it is assumed that data has previously been read in.
 varnames (iterable/None) – Names for the variables in the data. If data_source is a filename then this value is ignored and the variable names are those given in the file. Otherwise if None then the variable names will be X1, X2, …
 header (bool) – Ignored if data is not a filename with continuous data. Whether a header containing variable names is the first noncomment line in the file.
 comments (str) – Ignored if data is not a filename with continuous data. Lines starting with this string are treated as comments.
 delimiter (None/str) – Ignored if data is not a filename with continuous data. String used to separate values. If None then whitespace is used.
 start (str) – Starting stage for learning. Possible stages are: ‘no data’, ‘data’, ‘local scores’, ‘MIP model’, ‘MIP solution’, ‘BN(s)’ and ‘CPDAG(s)’.
 end (str) – End stage for learning. Possible values are the same as for start.
 data_type (str) – Indicates the type of data. Must be either ‘discrete’ or ‘continuous’
 score (str) – Name of scoring function used for computing local scores. Must be one of the following: BDeu, BGe, DiscreteLL, DiscreteBIC, DiscreteAIC, GaussianLL, GaussianBIC, GaussianAIC, GaussianL0. This value is ignored if local_score_fun is not None.
 local_score_fun (fun/None) – If not None a local score function such that local_score_fun(child,parents) computes (score,ub) where score is the desired local score for child having parentset parents and ub is either None or an upper bound on the local score for child with any proper superset of parents
 k (float) – Penalty multiplier for penalised log-likelihood scores (e.g. BIC, AIC) or tuning parameter (lambda^2) for l_0-penalised Gaussian scoring (as per van de Geer and Buehlmann)
 sdresidparam (bool) – For Gaussian BIC and AIC, whether (like bnlearn) to count the standard deviation of the residuals as a parameter when computing the penalty
 standardise (bool) – Whether to standardise continuous data.
 arities (array_like/None) – Arities for the discrete variables. If data_source is a filename then this value is ignored and the arities are those given in the file. Otherwise if None then the arity for a variable is set to the number of distinct values observed for that variable in the data. Ignored for continuous data.
 palim (int/None) – If an integer, this should be the maximum size of parent sets.
 alpha (float) – The equivalent sample size for BDeu local score generation.
 nu (iter/None) – The mean vector for the Normal part of the normalWishart prior for BGe scoring. If None then the sample mean is used.
 alpha_mu (float) – Imaginary sample size value for the Normal part of the normalWishart prior for BGe scoring.
 alpha_omega (float/None) – Degrees of freedom for the Wishart part of the normalWishart prior for BGe scoring. Must be at least the number of variables. If None then set to 2 more than the number of variables.
 starts (iter) – A sequence of feasible DAGs the highest scoring one of which will be the initial incumbent solution. Each element in the sequence can be either a bnlearn model string or an nx.DiGraph instance. If this value is not empty, a local scoring function must be provided.
 local_scores_source (str/file/dict/None) – Ignored if None. If not None then local scores are not computed from data but come from local_scores_source. If a string then the name of a file containing local scores. If a file then the file containing local scores. If a dictionary, then local_scores[child][parentset] is the score for child having parents parentset where parentset is a frozenset.
 nsols (int) – Number of BNs to learn
 kbest (bool) – Whether the nsols learned BNs should be a highest scoring set of nsols BNs.
 mec (bool) – Whether only one BN per Markov equivalence class should be feasible.
 polytree (bool) – Whether the BN must be a polytree.
 chordal (bool) – Whether the BN must represent a chordal undirected graph (i.e. have no immoralities).
 consfile (str/None) – If not None then a file (Python module) containing user constraints. Each such constraint is stored indefinitely and it is not possible to remove them.
 consdict (dict/None) – If not None then a dictionary containing user constraints.
The dictionary is used as input to
input_user_conss_from_dict
 settingsfile (str/None) – If not None then a file (Python module) containing values for the arguments for this method. Any such values override both default values and any values set by the method caller.
 pruning (bool) – Whether to exclude parent sets which cannot be optimal when acyclicity is the only constraint.
 edge_penalty (float) – The local score for a parent set with p parents will be reduced by p*edge_penalty.
 plot (bool) – Whether to plot learned BNs/CPDAGs once they have been learned.
 abbrev (bool) – When plotting whether to abbreviate variable names to the first 3 characters.
 output_scores (str/file/None) – If not None, then a file or name of a file to write local scores
 output_stem (str/None) – If not None, then learned BNs will be written to “output_stem.ext” for each extension defined in output_ext. If multiple DAGs have been learned then output files are called “output_stem_0.ext”, “output_stem_1.ext” …
 output_dag (bool) – Whether to write DAGs to any output files
 output_cpdag (bool) – Whether to write CPDAGs to any output files
 output_ext (tuple) – File extensions.
 verbose (int) – How much information to show when adding variables and constraints (and computing scores)
 gurobi_output (bool) – Whether to show output generated by Gurobi.
 **params – Arbitrary Gurobi model parameter settings. For example if this method is called with TimeLimit=3, then the Gurobi model parameter TimeLimit will be set to 3
Raises: ValueError – If start='no data' but no data source or local scores source has been provided

learned_bns
¶ Learned BNs
Type: tuple

learned_scores
¶ Scores of the learned BNs
Type: tuple

local_scores
¶ Alternative name for family_scores

make_basic_model
(nsols=1, kbest=False, mec=False, polytree=False, chordal=False)¶ Adds standard variables and constraints to the model, together with any user constraints
Variables added by add_basic_variables. Constraints added by add_basic_constraints.
Parameters:  nsols (int) – Number of BNs to learn
 kbest (bool) – Whether the nsols learned BNs should be a highest scoring set of nsols BNs.
 mec (bool) – Whether only one BN per Markov equivalence class should be feasible.
 polytree (bool) – Whether the BN must be a polytree
 chordal (bool) – Whether the BN must contain no immoralities
Raises: Gobnilp.StageError
– If local scores are not yet available.

mec2dag
(c, vs=None)¶ Returns a DAG representative from a Markov equivalence class.
Parameters:  c (frozenset) – A Markov equivalence class represented by a characteristic imset which is itself (sparsely) represented by the set of all sets with value 1 in the characteristic imset. This method assumes that c is a valid characteristic imset without any checking.
 vs (iterable) – An ordering for the BN variables which determines the particular DAG representative returned. If None, the ordering determined by Python’s sort is used.
Returns: The DAG represented by a list of family indices, or None if the required families are not represented.
Return type: list/None
See also
A rough ‘inverse’ of this method is dag2mec.

n
¶ The number of BN variables
Type: int

obligatory_adjacencies
¶ If {u,v} is included then an adjacency between u and v is obligatory.
Type: set

obligatory_ancestors
¶ If (u,v) is included then a directed path from u to v is obligatory.
See also add_obligatory_ancestor
Type: set

obligatory_arrows
¶ If (u,v) is included then an arrow from u to v is obligatory.
Type: set

obligatory_conditional_independences
¶ If (a,b,s) is included, where each is a frozenset, then each BN variable in a must be independent of each BN variable in b conditional on s.
Type: set

ordered_parentsets
¶ For each child a list of parent sets sorted by local score
Higher scoring parent sets come before lower scoring ones. Each parent set is a frozenset.
Raises: Gobnilp.StageError – If no local scores have been created
Type: dict

parents
¶ parents[i] is the parent set in the family with index i. The parent set is a frozenset.
Type: list

rawdata
¶ Raw data associated with the instance
Returns a two-dimensional array with one row for each datapoint (and thus one column for each variable).
If the data is discrete the array entries are of dtype uint32; if the data is continuous the entries are of dtype float64.
Type: numpy.array

return_local_scores
(local_score_fun, palim=3, pruning=True)¶ Return a dictionary whose keys are the child variables and whose values map parent sets to local scores.
Not all parent sets are included. If palim is not None, then only those parent sets of cardinality at most palim can be included.
Also, when pruning=True, a parent set is only included if its local score exceeds that of all its proper subsets.
local_score_fun should be a function that computes a local score.
Parameters:  local_score_fun (fun/None) – If not None a local score function such that local_score_fun(child,parents) computes (score,ub) where score is the desired local score for child having parentset parents and ub is either None or an upper bound on the local score for child with any proper superset of parents
 palim (int/None) – If not None then the maximal size of a parent set
 pruning (bool) – Whether to exclude parent sets which cannot be optimal.
Returns: A dictionary dkt such that dkt[child][parentset] is the local score for child having parent set parentset (where parentset is a frozenset).
Return type: dict
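The pruning rule above (keep a parent set only if it scores better than every proper subset) can be sketched for a single child as follows. This is an illustrative helper, not pygobnilp's own implementation; `prune_local_scores` is a hypothetical name.

```python
from itertools import combinations

def prune_local_scores(scores):
    """Keep only parent sets whose local score exceeds that of every proper subset.

    `scores` maps frozenset parent sets (for one child) to local scores.
    Hypothetical helper illustrating the pruning rule, not pygobnilp's code.
    """
    pruned = {}
    for parentset, score in scores.items():
        dominated = any(
            score <= scores[frozenset(sub)]
            for k in range(len(parentset))
            for sub in combinations(parentset, k)
            if frozenset(sub) in scores
        )
        if not dominated:
            pruned[parentset] = score
    return pruned
```

A dominated parent set can never appear in an optimal DAG, since replacing it by its better-scoring subset cannot create a cycle.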

set_bn_variables
(bnvars)¶ Set the BN variables to be a subset of existing BN variables
Parameters: bnvars (iter) – A subset of the existing BN variables Raises: ValueError
– If bnvars is not a subset of the variables in the data

set_stage
(stage)¶ Manually set the stage of learning
Parameters: stage (str) – The desired stage of learning Raises: ValueError
– If stage is not among the list of possible stages

set_starts
(dags)¶ Provide a set of ‘starting’ DAGs
The highest scoring of these DAGs will become the initial incumbent in the search for the best DAG. So the learned DAG will have a score at least as good as the best of these.
This method should be called prior to computing local scores to ensure that the local scores required for each starting DAG are computed even if parent sets in starting DAGs are bigger than the current limit on parent sets. (So limits on parent set size do not affect starting DAGs).
Parameters: dags (iter) – Collection of DAGS. Each individual DAG must be either a bnlearn modelstring or a nx.DiGraph object.

sol2fvs
()¶ Extracts the family variables set to true by some Gurobi solution
The solution corresponding to the current value of Gurobi parameter
SolutionNumber
is used.
Returns: A pair of lists. The first list contains the families as (child, parents) tuples, where parents is a frozenset. The second list contains Gurobi binary MIP variables for the families.
Return type: tuple

stage
¶ Stage of solving
Type: str

stages
= ('no data', 'data', 'local scores', 'MIP model', 'MIP solution', 'BN(s)', 'CPDAG(s)', 'output shown', 'output written')¶ A tuple of strings giving Gobnilp’s stages of learning (in order).
Type: tuple

stages_set
= frozenset({'no data', 'output shown', 'MIP solution', 'BN(s)', 'CPDAG(s)', 'data', 'output written', 'local scores', 'MIP model'})¶ The set of Gobnilp’s stages of learning.
Type: frozenset

total_order
¶ Dictionary of total order variables (if these variables have been created, by default they are not)
Assuming the appropriate constraints have been added, if it exists
total_order[v1,v2]
is the total order variable indicating that v1 > v2.
Raises: Gobnilp.StageError
– If no total order variables have been created
Type: dict

write_local_scores
(f)¶ Write local scores to a file
Parameters: f (str/file) – If a string the name of the file to write to (where “” leads to writing to standard output). Otherwise a file object.

The BN class¶

class
pygobnilp.gobnilp.
BN
(*args, **kwargs)¶ Subclass of networkx.DiGraph. See documentation for networkx.DiGraph for all methods not documented here.
At present this class only implements the structure of a BN - the DAG.

__str__
()¶ Returns a textual representation of the BN
Returns: A textual representation of the BN Return type: str

adjacency_matrix
()¶ The adjacency matrix
Returns: The adjacency matrix Return type: numpy.matrix

bnlearn_modelstring
()¶ Return a string representation suitable for bnlearn’s “modelstring” function
Returns: A string representation of the BN structure (DAG) suitable for bnlearn’s “modelstring” function Return type: str
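A bnlearn modelstring lists each node in brackets with its parents after a `|`, separated by `:`, e.g. `[A][B|A][C|A:B]`. A minimal sketch of producing one from a parents dictionary (an illustrative stand-in for the method, which works on the DiGraph itself; `to_modelstring` is a hypothetical name):

```python
def to_modelstring(parents):
    """Build a bnlearn-style modelstring from a dict mapping each node to an
    iterable of its parents. Illustrative sketch, not BN.bnlearn_modelstring."""
    parts = []
    for node in sorted(parents):
        ps = sorted(parents[node])
        # "[node]" for a root, "[node|p1:p2:...]" otherwise
        parts.append(f"[{node}|{':'.join(ps)}]" if ps else f"[{node}]")
    return ''.join(parts)
```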

compute_compelled
(compelled=())¶ Determines which directed edges are present in all DAGs Markov equivalent to the given DAG (i.e. which are compelled to have this direction).
Whether an edge has its direction compelled is given by the “compelled” attribute of the edge.
Starting from an initial set of edges whose direction is compelled to be that given in the DAG, the following 3 rules from Koller and Friedman (here shown in Prolog) are used:
%R1 compelled(Y,Z) :- edge(Y,Z), compelled(X,Y), not edge(X,Z), not edge(Z,X).
%R2 compelled(X,Z) :- edge(X,Z), compelled(X,Y), compelled(Y,Z).
%R3 compelled(X,Z) :- edge(X,Z), compelled(Y1,Z), compelled(Y2,Z), (edge(X,Y1);edge(Y1,X)), (edge(X,Y2);edge(Y2,X)).
This method uses the "semi-naive evaluation" algorithm for computing the relevant least Herbrand model, which is the set of all the compelled edges.
Parameters: compelled (iter) – Edges to set as compelled in addition to those involved in immoralities.
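The three rules can be applied to a fixed point as sketched below. Note this sketch uses plain naive iteration for clarity, whereas the method itself uses the more efficient semi-naive evaluation; `compelled_edges` is a hypothetical name.

```python
def compelled_edges(edges, initial):
    """Naive fixed-point evaluation of rules R1-R3 above.

    `edges` is a set of (u, v) arcs of the DAG; `initial` is the set of arcs
    already known to be compelled (e.g. those in immoralities).
    """
    def adjacent(a, b):
        return (a, b) in edges or (b, a) in edges

    compelled = set(initial)
    changed = True
    while changed:
        changed = False
        new = set()
        for (x, y) in compelled:
            for (y2, z) in edges:
                # R1: X->Y compelled, edge Y->Z, X and Z not adjacent
                if y2 == y and not adjacent(x, z):
                    new.add((y, z))
            for (y2, z) in compelled:
                # R2: X->Y and Y->Z compelled, edge X->Z
                if y2 == y and (x, z) in edges:
                    new.add((x, z))
        for (x, z) in edges:
            # R3: edge X->Z, two compelled arcs into Z from nodes adjacent to X
            preds = [y for (y, z2) in compelled
                     if z2 == z and y != x and adjacent(x, y)]
            if len(preds) >= 2:
                new.add((x, z))
        if not new <= compelled:
            compelled |= new
            changed = True
    return compelled
```

For the DAG A→C←B, C→D the immorality compels A→C and B→C, and R1 then compels C→D (since A and D are not adjacent).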

connected
(u, v)¶ Are u and v connected (in either direction)
Parameters:  u (str) – A node
 v (str) – A node
Returns: Whether u and v are connected
Return type: bool

cpdag_str
()¶ Returns a textual representation of the CPDAG
Returns: A textual representation of the CPDAG Return type: str

directed_arrow_colour
= 'red'¶ Colour to indicate a directed arrow

directed_arrow_text
= '->'¶ Text to indicate a directed arrow

minimal_ancestral_graph
(nodes)¶ Find the minimal ancestral graph of self containing nodes
Parameters: nodes (iter) – The nodes for the minimal ancestral graph Returns: The minimal ancestral graph Return type: network.DiGraph

plot
(abbrev=True)¶ Generate and show a plot of the CPDAG/DAG
A DAG from the Markov equivalence class defined by the CPDAG is shown. Reversible and irreversible arrows are distinguished by colour. By default the colours are black and red, respectively.
Parameters: abbrev (bool) – Whether to abbreviate variable names to their first 3 characters.

satisfy_ci
(a, b, s)¶ Does the DAG satisfy this conditional independence relation: a is independent of b conditional on s?
Parameters:  a (iter) – A set of BN variables
 b (iter) – A set of BN variables
 s (iter) – A set, possibly empty, of BN variables
Returns: A pair where the first element is a bool stating whether the given conditional independence relation is satisfied and the second is the minimal ancestral graph containing a, b and s.
Return type: tuple

undirected_arrow_colour
= 'black'¶ Colour to indicate an undirected edge

undirected_arrow_text
= '-'¶ Text to indicate an undirected edge

The MN class¶

class
pygobnilp.gobnilp.
MN
(*args, **kwargs)¶ Subclass of networkx.Graph. See documentation for networkx.Graph for all methods not documented here.
At present this class only implements the structure of a Markov network - an undirected graph.

satisfy_ci
(a, b, s)¶ Does the Markov network satisfy a ⊥ b | s? i.e. is there no path from a node in a to a node in b which avoids nodes in s?
This method does not check that a, b and s are disjoint or that a and b are nonempty.
Parameters:  a (iter) – A set of nodes in a Markov network
 b (iter) – A set of nodes in a Markov network
 s (iter) – A set of nodes in a Markov network
Returns: Whether the Markov network satisfies a ⊥ b | s
Return type: bool
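This separation test amounts to a graph search that refuses to pass through `s`. A minimal sketch, assuming the graph is given as an adjacency dict (this is an illustrative stand-in for the method, which operates on the networkx.Graph itself; `satisfies_ci` is a hypothetical name):

```python
from collections import deque

def satisfies_ci(adj, a, b, s):
    """Check a ⊥ b | s in an undirected graph given as an adjacency dict.

    True iff every path from a node in `a` to a node in `b` passes through `s`.
    Like MN.satisfy_ci, this does not check that a, b and s are disjoint.
    """
    a, b, s = set(a), set(b), set(s)
    seen, queue = set(a), deque(a)
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v in s or v in seen:
                continue  # paths through s are blocked; don't revisit nodes
            if v in b:
                return False  # found a path into b avoiding s
            seen.add(v)
            queue.append(v)
    return True
```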

The Data class¶

class
pygobnilp.scoring.
Data
¶ Complete data (either discrete or continuous)
This is an abstract class

rawdata
()¶ The data without any information about variable names.
Returns: The data Return type: numpy.ndarray

variables
()¶ Returns: The variable names Return type: list

varidx
()¶ Returns: Maps a variable name to its position in the list of variable names. Return type: dict

The DiscreteData class¶

class
pygobnilp.scoring.
DiscreteData
(data_source, varnames=None, arities=None)¶ Bases:
pygobnilp.scoring.Data
Complete discrete data

__init__
(data_source, varnames=None, arities=None)¶ Initialises a DiscreteData object.
If data_source is a filename then it is assumed that:
 All values are separated by whitespace
 Empty lines are ignored
 Comment lines start with a ‘#’
 The first line is a header line stating the names of the variables
 The second line states the arities of the variables
 All other lines contain the actual data
Parameters:  data_source (str/array_like/Pandas.DataFrame) – Either a filename containing the data or an array_like object or Pandas data frame containing it.
 varnames (iter) – Variable names corresponding to columns in the data. Ignored if data_source is a filename or Pandas DataFrame (since they will supply the variable names). Otherwise if not supplied (=None) then variables names will be: X1, X2, …
 arities (iter) – Arities for the variables corresponding to columns in the data. Ignored if data_source is a filename or Pandas DataFrame (since they will supply the arities). Otherwise if not supplied (=None) the arity for each variable will be set to the number of distinct values observed for that variable in the data.

arities
()¶ Returns: The arities of the variables. Return type: numpy.ndarray

arity
(v)¶ Parameters: v (str) – A variable name Returns: The arity of v Return type: int

data
()¶ The data with all values converted to unsigned integers.
Returns: The data Return type: pandas.DataFrame

data_length
()¶ Returns: The number of datapoints in the data Return type: int

make_contab_adtree
(variables)¶ Compute a marginal contingency table from data or report that the desired contingency table would be too big.
Parameters: variables (iter) – The variables in the marginal contingency table. Returns:  1st element is of type ndarray:
 If the contingency table would have too many entries then the array is empty (and the 2nd element of the tuple should be ignored), else it is an array of counts of length equal to the product of the arities. Counts are in lexicographic order of the joint instantiations of the columns (=variables). 2nd element: the 'strides' for each column (=variable)
Return type: tuple
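The lexicographic ordering and the strides can be sketched as below. This is an illustrative sketch (using a plain list in place of a numpy array, and omitting the size check the real method performs); `contab_with_strides` is a hypothetical name.

```python
def contab_with_strides(data, cols, arities):
    """Marginal contingency table over `cols` in lexicographic order of the
    joint instantiations, together with the stride of each column.

    `data` is a sequence of integer-valued rows; `arities[c]` is the arity of
    column c.
    """
    # stride of a column = product of the arities of the columns after it
    strides, step = [], 1
    for c in reversed(cols):
        strides.append(step)
        step *= arities[c]
    strides.reverse()
    counts = [0] * step  # step is now the product of all the arities
    for row in data:
        # index of a joint instantiation = dot product of values and strides
        counts[sum(row[c] * st for c, st in zip(cols, strides))] += 1
    return counts, strides
```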

The AbsDiscreteLLScore class¶

class
pygobnilp.scoring.
AbsDiscreteLLScore
(data_source, varnames=None, arities=None)¶ Bases:
pygobnilp.scoring.DiscreteData
Abstract class for discrete log likelihood scores

entropy
(variables)¶ Compute the entropy for the empirical distribution of some variables
Parameters: variables (iter) – Variables Returns: The entropy for the empirical distribution of variables, together with the number of joint instantiations of variables (or None if that number is too big)

ll_score
(child, parents)¶ The fitted log-likelihood score for child having parents
In addition to the score, the number of joint instantiations of the parents is returned. If this number would cause an overflow, None is returned instead of the number.
Parameters:  child (str) – The child variable
 parents (iter) – The parent variables
Returns:  The fitted log-likelihood local score for the given family and
 the number of joint instantiations of the parents (or None if too big)
Return type: tuple
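The fitted log-likelihood is the quantity sum over (pa, c) of n(pa, c) * log(n(pa, c) / n(pa)), with counts taken from the data. A minimal sketch of that computation (illustrative only; it takes rows as dicts and omits the overflow check on parent instantiations; `fitted_ll` is a hypothetical name):

```python
from collections import Counter
from math import log

def fitted_ll(data, child, parents):
    """Fitted log-likelihood sum_{pa,c} n(pa,c) * log(n(pa,c) / n(pa)).

    `data` is a sequence of dicts mapping variable names to values.
    """
    # joint counts n(pa, c) and marginal counts n(pa)
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    marg = Counter(tuple(row[p] for p in parents) for row in data)
    return sum(n * log(n / marg[pa]) for (pa, c), n in joint.items())
```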

score
(child, parents)¶ Return the LL score minus the complexity penalty for child having parents, and also an upper bound on the score for proper supersets of parents.
To compute the penalty the number of joint instantiations is multiplied by the arity of the child minus one. This value is then multiplied by log(N)/2 for BIC and 1 for AIC.
Parameters:  child (str) – The child variable
 parents (iter) – The parent variables
Raises: ValueError
– If the number of joint instantiations of the parents would cause an overflow when computing the penalty
Returns: The local score for the given family and an upper bound on the local score for proper supersets of parents
Return type: tuple
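The penalty arithmetic just described can be written out directly. A small sketch of the BIC case (illustrative; `bic_penalty` is a hypothetical name):

```python
from math import log, prod

def bic_penalty(child_arity, parent_arities, n, k=1.0):
    """BIC penalty for one family: the number of free parameters,
    (child arity - 1) times the number of joint parent instantiations,
    multiplied by k * log(N)/2. For AIC the log(N)/2 factor would be 1."""
    free_params = (child_arity - 1) * prod(parent_arities)
    return k * free_params * log(n) / 2.0
```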

The DiscreteLL class¶

class
pygobnilp.scoring.
DiscreteLL
(data)¶ Bases:
pygobnilp.scoring.AbsDiscreteLLScore

__init__
(data)¶ Initialises a DiscreteLL object.
Parameters: data (DiscreteData) – data

score
(child, parents)¶ Return the fitted log-likelihood score for child having parents, and also an upper bound on the score for proper supersets of parents.
Parameters:  child (str) – The child variable
 parents (iter) – The parent variables
Returns: The fitted log-likelihood local score for the given family and an upper bound on the local score for proper supersets of parents
Return type: tuple

The DiscreteBIC class¶

class
pygobnilp.scoring.
DiscreteBIC
(data, k=1)¶ Bases:
pygobnilp.scoring.AbsDiscreteLLScore

__init__
(data, k=1)¶ Initialises a DiscreteBIC object.
Parameters:  data (DiscreteData) – data
 k (float) – Multiply standard BIC penalty by this amount, so increase for sparser networks

The DiscreteAIC class¶

class
pygobnilp.scoring.
DiscreteAIC
(data, k=1)¶ Bases:
pygobnilp.scoring.AbsDiscreteLLScore

__init__
(data, k=1)¶ Initialises an DiscreteAIC object.
Parameters:  data (DiscreteData) – data
 k (float) – Multiply standard AIC penalty by this amount, so increase for sparser networks

The BDeu class¶

class
pygobnilp.scoring.
BDeu
(data, alpha=1.0)¶ Bases:
pygobnilp.scoring.DiscreteData
Discrete data with attributes and methods for BDeu scoring

__init__
(data, alpha=1.0)¶ Initialises a BDeu object.
Parameters:  data (DiscreteData) – data
 alpha (float) – The equivalent sample size

alpha
¶ The equivalent sample size used for BDeu scoring
Type: float

bdeu_score_component
(variables, alpha=None)¶ Compute the BDeu score component for a set of variables (from the current dataset).
The BDeu score for a child v having parents Pa is the BDeu score component for Pa subtracted from that for v+Pa
Parameters:  variables (iter) – The names of the variables
 alpha (float) – The effective sample size parameter for the BDeu score. If not supplied (=None) then the value of self.alpha is used.
Returns: The BDeu score component.
Return type: float

bdeu_scores
(palim=None, pruning=True, alpha=None)¶ Exhaustively compute BDeu scores for all child variables and all parent sets up to size palim. If pruning is enabled, parent sets which have a subset with a better score are deleted. Return a dictionary dkt where dkt[child][parents] = bdeu_score.
Parameters:  palim (int/None) – Limit on parent set size
 pruning (bool) – Whether to prune
 alpha (float) – ESS for BDeu score. If not supplied (=None) then the value of self.alpha is used.
Returns: dkt where dkt[child][parents] = bdeu_score
Return type: dict
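The component decomposition described under bdeu_score_component (score of child given parents = component of child+parents minus component of parents) can be sketched directly from the counts. This is an illustrative sketch, not pygobnilp's cached implementation; `bdeu_component` and `bdeu_score` are hypothetical names and rows are taken as dicts.

```python
from collections import Counter
from math import lgamma, prod

def bdeu_component(data, variables, arities, alpha):
    """BDeu component: sum over joint instantiations w of
    lgamma(a + n(w)) - lgamma(a), where a = alpha / (product of arities)."""
    q = prod(arities[v] for v in variables)  # number of joint instantiations
    a = alpha / q
    counts = Counter(tuple(row[v] for v in variables) for row in data)
    # unobserved instantiations contribute lgamma(a) - lgamma(a) = 0
    return sum(lgamma(a + n) - lgamma(a) for n in counts.values())

def bdeu_score(data, child, parents, arities, alpha=1.0):
    """BDeu local score: component(child + parents) - component(parents)."""
    return (bdeu_component(data, list(parents) + [child], arities, alpha)
            - bdeu_component(data, list(parents), arities, alpha))
```

For a single binary variable with one observation of each value and alpha=1, the marginal likelihood is Γ(1)/Γ(3) · (Γ(1.5)/Γ(0.5))² = 1/8, so the score is -3·log 2.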

clear_cache
()¶ Empty the cache of stored BDeu component scores
This should be called, for example, if new scores are being computed with a different alpha value

upper_bound_james
(child, parents, alpha=None)¶ Compute an upper bound on the local score for proper supersets of parents
Parameters:  child (str) – Child variable.
 parents (iter) – Parent variables
 alpha (float) – ESS value for BDeu score. If not supplied (=None) then the value of self.alpha is used.
Returns: An upper bound on the local score for parent sets for child which are proper supersets of parents
Return type: float

The ContinuousData class¶

class
pygobnilp.scoring.
ContinuousData
(data, varnames=None, header=True, comments='#', delimiter=None, standardise=False)¶ Bases:
pygobnilp.scoring.Data
Complete continuous data

__init__
(data, varnames=None, header=True, comments='#', delimiter=None, standardise=False)¶ Continuous data
Parameters:  data (numpy.ndarray/str) – The data (either as an array or a filename containing the data)
 varnames (iterable/None) – The names of the variables. If not given (=None) then if data is a file having the variable names as a header then these are used else the variables are named X1, X2, X3, etc
 header (bool) – Ignored if data is not a filename. Whether a header containing variable names is the first non-comment line in the file.
 comments (str) – Ignored if data is not a filename. Lines starting with this string are treated as comments.
 delimiter (None/str) – Ignored if data is not a filename. String used to separate values. If None then whitespace is used.
 standardise (bool) – Whether to standardise the data to have mean 0 and sd = 1.

data
()¶ The data as a Pandas dataframe.
Returns: The data Return type: pandas.DataFrame

The AbsGaussianLLScore class¶

class
pygobnilp.scoring.
AbsGaussianLLScore
(data, varnames=None, header=True, comments='#', delimiter=None, standardise=False)¶ Bases:
pygobnilp.scoring.ContinuousData
Abstract class for Gaussian loglikelihood scoring

gaussianll
(variables)¶ Compute the Gaussian log-likelihood of some variables
Parameters: variables (iter) – Variables Returns: The Gaussian log-likelihood of the given variables

ll_score
(child, parents)¶ The Gaussian log-likelihood score for a given family, plus the number of free parameters
Parameters:  child (str) – The child variable
 parents (iter) – The parents
Returns:  First element of tuple is the Gaussian log-likelihood score for the family for the current data
Second element is the number of free parameters, which is the number of parents plus 1 (for the intercept)
Return type: tuple

The GaussianLL class¶

class
pygobnilp.scoring.
GaussianLL
(data)¶ Bases:
pygobnilp.scoring.AbsGaussianLLScore

__init__
(data)¶ Initialises a GaussianLL object.
Parameters: data (ContinuousData) – data

score
(child, parents)¶ Return the fitted log-likelihood score for child having parents, and also an upper bound on the score for proper supersets of parents.
Parameters:  child (str) – The child variable
 parents (iter) – The parent variables
Returns: The fitted log-likelihood local score for the given family and an upper bound on the local score for proper supersets of parents
Return type: tuple

The GaussianBIC class¶

class
pygobnilp.scoring.
GaussianBIC
(data, k=1, sdresidparam=True)¶ Bases:
pygobnilp.scoring.AbsGaussianLLScore

__init__
(data, k=1, sdresidparam=True)¶ Initialises a GaussianBIC object.
Parameters:  data (ContinuousData) – data
 k (float) – Multiply standard BIC penalty by this amount, so increase for sparser networks
 sdresidparam (bool) – Whether to count the standard deviation of the residuals as a parameter when computing the penalty

score
(child, parents)¶ Return the fitted Gaussian BIC score for child having parents, and also an upper bound on the score for proper supersets of parents.
Parameters:  child (str) – The child variable
 parents (iter) – The parent variables
Returns: The Gaussian BIC local score for the given family and an upper bound on the local score for proper supersets of parents
Return type: tuple

The GaussianAIC class¶

class
pygobnilp.scoring.
GaussianAIC
(data, k=1, sdresidparam=True)¶ Bases:
pygobnilp.scoring.AbsGaussianLLScore

__init__
(data, k=1, sdresidparam=True)¶ Initialises a GaussianAIC object.
Parameters:  data (ContinuousData) – data
 k (float) – Multiply standard AIC penalty by this amount, so increase for sparser networks
 sdresidparam (bool) – Whether to count the standard deviation of the residuals as a parameter when computing the penalty

score
(child, parents)¶ Return the fitted Gaussian AIC score for child having parents, and also an upper bound on the score for proper supersets of parents.
Parameters:  child (str) – The child variable
 parents (iter) – The parent variables
Returns: The Gaussian AIC local score for the given family and an upper bound on the local score for proper supersets of parents
Return type: tuple

The GaussianL0 class¶

class
pygobnilp.scoring.
GaussianL0
(data, k=1)¶ Bases:
pygobnilp.scoring.AbsGaussianLLScore
Implements the score discussed in "l_0-Penalized Maximum Likelihood for Sparse Directed Acyclic Graphs" by Sara van de Geer and Peter Buehlmann, Annals of Statistics 41(2):536-567, 2013.

__init__
(data, k=1)¶ Initialises a GaussianL0 object.
Parameters:  data (ContinuousData) – data
 k (float) – Tuning parameter for L0 penalty. Called “lambda^2” in van de Geer and Buehlmann

score
(child, parents)¶ Return the fitted Gaussian L0 score for child having parents, and also an upper bound on the score for proper supersets of parents.
Parameters:  child (str) – The child variable
 parents (iter) – The parent variables
Returns: The Gaussian L0 local score for the given family and an upper bound on the local score for proper supersets of parents
Return type: tuple

The BGe class¶

class
pygobnilp.scoring.
BGe
(data, nu=None, alpha_mu=1.0, alpha_omega=None, prior_matrix=None)¶ Bases:
pygobnilp.scoring.ContinuousData
Continuous data with attributes and methods for BGe scoring

__init__
(data, nu=None, alpha_mu=1.0, alpha_omega=None, prior_matrix=None)¶ Create a BGe scoring object
Parameters:  data (ContinuousData) – The data
 nu (numpy.ndarray/None) – The mean vector for the normal part of the normal-Wishart prior. If not given (=None), then the sample mean is used.
 alpha_mu (float) – Imaginary sample size for the normal part of the normal-Wishart prior.
 alpha_omega (int/None) – The degrees of freedom for the Wishart part of the normal-Wishart prior. Must exceed p-1 where p is the number of variables. If not given (=None) then alpha_omega is set to p+2.
 prior_matrix (numpy.ndarray/None) – The prior matrix 'T' for the Wishart part of the normal-Wishart prior. If not given (=None), then this is set to t*I_n where t = alpha_mu*(alpha_omega-n-1)/(alpha_mu+1)

bge_component
(vs)¶ Compute the BGe component for given variables
The BGe score for a family child<-parents is the component for child+parents minus the component for parents (+ a constant term which depends only on the number of parents).
Parameters: vs (iter) – Variable names Returns: The BGe component for the given variables Return type: float

bge_score
(child, parents)¶ The BGe score for a given family, plus upper bound
Parameters:  child (str) – The child variable
 parents (iter) – The parents
Returns:  First element of tuple is the BGe score for the family for the current data (using current hyperparameters)
Second element is an upper bound.
Return type: tuple

Functions in the gobnilp module¶
Python version of GOBNILP

pygobnilp.gobnilp.
from_bnlearn_modelstring
(modelstring)¶ Return a DAG from a bnlearn modelstring
Parameters: modelstring (str) – A bnlearn modelstring defining a DAG Returns: The DAG as a networkx.DiGraph Return type: networkx.DiGraph
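Parsing the modelstring is a matter of splitting the bracketed blocks and the `|`/`:` separators. A minimal sketch that returns a plain dict of parent sets rather than the networkx.DiGraph the real function builds (`parse_modelstring` is a hypothetical name):

```python
import re

def parse_modelstring(modelstring):
    """Parse a bnlearn modelstring such as "[A][B|A][C|A:B]" into a dict
    mapping each node to its set of parents. Illustrative sketch only."""
    dag = {}
    for block in re.findall(r'\[([^\]]+)\]', modelstring):
        node, _, ps = block.partition('|')  # "C|A:B" -> ("C", "|", "A:B")
        dag[node] = set(ps.split(':')) if ps else set()
    return dag
```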

pygobnilp.gobnilp.
read_local_scores
(f, verbose=False)¶ Read local scores from a named file, standard input or a file object, and return a dictionary dkt where dkt[child][parentset] is the local score for the family child<parentset
The file is assumed to be in “Jaakkola” format.
Parameters: f (str/file object) – The file containing the local scores. Returns: Dictionary containing local scores Return type: dict

pygobnilp.gobnilp.
mhs
(subsets, ground_set=None)¶ Return a minimal hitting set for a set of subsets
A hitting set is a set of elements from the ground set that has non-empty intersection with each of the given subsets. A minimal hitting set is a hitting set with minimal cardinality. This function uses Gurobi to solve this NP-hard problem.
Parameters:  subsets (iter) – The collection of subsets for which the minimal hitting set is sought. These could be, for example, a list of lists of strings where the strings are elements of the ground_set.
 ground_set (iter) – The ground set: each subset must be a subset of this ground set. If missing (=None) then the ground set is the union of all the given subsets.
Raises: ValueError
– If Gurobi cannot solve the minimal hitting set problem.
Returns: A minimal hitting set, which will be a subset of the ground set, or None if there is no hitting set.
Return type: list/None
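The specification can be illustrated without Gurobi by exhaustive search over candidate sets of increasing size. This brute-force sketch is exponential in the ground set size and only illustrates what mhs computes; `mhs_bruteforce` is a hypothetical name.

```python
from itertools import combinations

def mhs_bruteforce(subsets, ground_set=None):
    """Minimum-cardinality hitting set by exhaustive search.

    Tries candidate sets of size 0, 1, 2, ... and returns the first one that
    intersects every subset, or None if no hitting set exists (e.g. when some
    subset is empty).
    """
    subsets = [set(s) for s in subsets]
    if ground_set is None:
        # default ground set: union of all the given subsets
        ground_set = set().union(*subsets) if subsets else set()
    ground = sorted(ground_set)
    for k in range(len(ground) + 1):
        for cand in combinations(ground, k):
            c = set(cand)
            if all(c & s for s in subsets):
                return sorted(c)
    return None
```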