flex.gp namespace
Submodules
flex.gp.cochain_primitives module
- class flex.gp.cochain_primitives.CochainBasePrimitive(base_name: str, base_fun: Callable, input: List[str], output: str, att_input: Dict, map_rule: Dict)[source]
Bases: object
A simple class to handle a cochain base primitive function.
- Parameters:
base_name – name of the base primitive.
base_fun – callable base function.
input – a list containing the input types (str) of base_fun.
output – a string containing the output type of base_fun.
att_input – a dictionary with keys ‘complex’ (primal/dual), ‘dimension’ (0, 1, 2), and ‘rank’ (“SC” for scalar, “V” for vector, “T” for tensor).
map_rule – a dictionary with the same keys as att_input. Each value is a callable that maps the input complex/dimension/rank to the corresponding output complex/dimension/rank.
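The relationship between att_input and map_rule can be illustrated with a minimal sketch. The value layout of att_input (lists of admissible values), the helper apply_map_rule, and the coboundary-like rule that raises the dimension by one are all assumptions made for illustration; they are not taken from the library.

```python
# Hypothetical attribute description: which complexes, dimensions, and ranks
# the base function accepts. The list-valued layout is an assumption.
att_input = {
    "complex": ["P", "D"],
    "dimension": [0, 1],
    "rank": ["SC"],
}

# One callable per key, mapping an input attribute to the output attribute.
map_rule = {
    "complex": lambda c: c,        # output stays on the same complex
    "dimension": lambda d: d + 1,  # output dimension is raised by one
    "rank": lambda r: r,           # rank is unchanged
}


def apply_map_rule(rule, complex_, dimension, rank):
    """Return the output (complex, dimension, rank) for one input triple."""
    return (
        rule["complex"](complex_),
        rule["dimension"](dimension),
        rule["rank"](rank),
    )


out = apply_map_rule(map_rule, "P", 0, "SC")
print(out)  # ('P', 1, 'SC')
```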
- class flex.gp.cochain_primitives.Complex(*values)[source]
Bases: Enum
Enum class for complex.
- DUAL = 'D'
- PRIMAL = 'P'
- class flex.gp.cochain_primitives.Dimension(*values)[source]
Bases: IntEnum
Enum class for dimension.
- ONE = 1
- TWO = 2
- ZERO = 0
- class flex.gp.cochain_primitives.Rank(*values)[source]
Bases: Enum
Enum class for rank.
- SCALAR = ''
- TENSOR = 'T'
- VECTOR = 'V'
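The three enums can be reproduced with the standard library to see how their values behave. The sketch below redefines them locally with the member values listed above. Concatenating the values into a suffix such as "P1V" is an assumption based on the “addP1V” example given for compute_primitive_in_out_type.

```python
from enum import Enum, IntEnum


class Complex(Enum):
    PRIMAL = "P"
    DUAL = "D"


class Dimension(IntEnum):
    ZERO = 0
    ONE = 1
    TWO = 2


class Rank(Enum):
    SCALAR = ""
    VECTOR = "V"
    TENSOR = "T"


# Looking a member up by value returns the member itself.
dual = Complex("D")

# IntEnum members support integer arithmetic; the result is a plain int,
# so it is converted back to a member explicitly.
next_dim = Dimension(Dimension.ZERO + 1)

# Assumed naming scheme: concatenate the three values. A scalar rank
# contributes an empty string.
vector_suffix = f"{Complex.PRIMAL.value}{Dimension.ONE.value}{Rank.VECTOR.value}"
scalar_suffix = f"{Complex.DUAL.value}{Dimension.ZERO.value}{Rank.SCALAR.value}"
print(vector_suffix, scalar_suffix)  # P1V D0
```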
- flex.gp.cochain_primitives.compute_primitive_in_out_type(primitive: CochainBasePrimitive, eval_with_globals: Callable, in_complex: Complex, in_dim: Dimension, in_rank: Rank)[source]
Resolves the specific variant name and types for a primitive.
Based on the input complex (Primal/Dual), dimension (0, 1, 2), and rank (Scalar, Vector, Tensor), this function generates a unique name for the primitive variant, resolves the Python types for all input arguments, and calculates the resulting output type using defined mapping rules.
- Parameters:
primitive – a CochainBasePrimitive object.
eval_with_globals – The evaluation function created by define_eval_with_suitable_imports.
in_complex – The current complex.
in_dim – The current dimension.
in_rank – The current rank.
- Returns:
A tuple containing the concatenated name (e.g., “addP1V”), a list of resolved Python type objects for the inputs, and the resolved Python type object for the output.
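A rough sketch of the resolution step described above follows. It is not the library's implementation: the function resolve_variant, the stand-in classes CochainP0 and CochainP1, the base name "cob", and the convention of appending the complex/dimension/rank suffix to a type-name stem are hypothetical.

```python
# Stand-in types; in practice these would be the classes named by the
# type strings of the primitive.
class CochainP0:
    pass


class CochainP1:
    pass


def resolve_variant(base_name, input_types, output_type, map_rule,
                    eval_type, in_complex, in_dim, in_rank):
    """Build the variant name and resolve input/output types (sketch)."""
    in_suffix = f"{in_complex}{in_dim}{in_rank}"
    out_suffix = (
        f"{map_rule['complex'](in_complex)}"
        f"{map_rule['dimension'](in_dim)}"
        f"{map_rule['rank'](in_rank)}"
    )
    name = base_name + in_suffix
    # Each type string is turned into a class by the evaluation function.
    in_types = [eval_type(stem + in_suffix) for stem in input_types]
    out_type = eval_type(output_type + out_suffix)
    return name, in_types, out_type


namespace = {"CochainP0": CochainP0, "CochainP1": CochainP1}


def eval_type(expression):
    # Evaluate a type string inside the restricted namespace.
    return eval(expression, namespace)


map_rule = {
    "complex": lambda c: c,
    "dimension": lambda d: d + 1,
    "rank": lambda r: r,
}

name, in_types, out_type = resolve_variant(
    "cob", ["Cochain"], "Cochain", map_rule, eval_type, "P", 0, ""
)
print(name)  # cobP0
```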
- flex.gp.cochain_primitives.define_eval_with_suitable_imports(imports: Dict)[source]
Creates a scoped evaluation function with pre-loaded modules.
This prevents repetitive imports and ensures that string-based type definitions (like “IntP1V”) can be converted into actual class references during GP tree construction.
- Parameters:
imports – A dictionary where keys are module paths and values are lists of function/class names to import.
- Returns:
A function eval_with_globals(expression) that evaluates strings within the context of the imported components.
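One way such a scoped evaluation function could be built is sketched below, using importlib and a closure over a private namespace. The name make_eval_with_imports is hypothetical, and standard-library modules are used as placeholders for the modules that would hold the primitive types.

```python
from importlib import import_module
from typing import Callable, Dict, List


def make_eval_with_imports(imports: Dict[str, List[str]]) -> Callable[[str], object]:
    """Import the requested names once and return an evaluator bound to them."""
    namespace = {}
    for module_path, names in imports.items():
        module = import_module(module_path)
        for name in names:
            namespace[name] = getattr(module, name)

    def eval_with_globals(expression: str):
        # The expression only sees the imported names (plus builtins).
        return eval(expression, namespace)

    return eval_with_globals


eval_with_globals = make_eval_with_imports(
    {"fractions": ["Fraction"], "math": ["sqrt"]}
)
fraction_cls = eval_with_globals("Fraction")  # a class reference, not a string
root = eval_with_globals("sqrt(16.0)")
print(fraction_cls, root)
```

Because the imports are performed once when the evaluator is created, later calls only pay for the eval itself.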
- flex.gp.cochain_primitives.generate_primitive_variants(primitive: CochainBasePrimitive, imports: Dict)[source]
Generate primitive variants given a typed primitive.
- Parameters:
primitive – a CochainBasePrimitive object.
imports – dictionary whose keys and values are the modules and the functions to be imported in order to evaluate the input/output types of the primitive.
- Returns:
A dict in which each key is the name of a primitive variant and each value is a PrimitiveParams object.
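The enumeration of variants can be pictured as a Cartesian product over the admissible attribute values. The sketch below only generates the variant names; enumerate_variant_names and the list-valued att_input layout are assumptions, and the rank suffixes follow the Rank enum values ('' for scalar).

```python
from itertools import product


def enumerate_variant_names(base_name, att_input):
    """Return one name per (complex, dimension, rank) combination."""
    names = []
    for complex_, dim, rank in product(
        att_input["complex"], att_input["dimension"], att_input["rank"]
    ):
        names.append(f"{base_name}{complex_}{dim}{rank}")
    return names


att_input = {"complex": ["P", "D"], "dimension": [0, 1], "rank": ["", "V"]}
names = enumerate_variant_names("add", att_input)
print(names)
# ['addP0', 'addP0V', 'addP1', 'addP1V', 'addD0', 'addD0V', 'addD1', 'addD1V']
```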
flex.gp.jax_primitives module
flex.gp.numpy_primitives module
flex.gp.primitives module
- class flex.gp.primitives.PrimitiveParams(op: Callable, in_types: Type, out_type: Type)[source]
Bases: object
A simple class to handle a primitive function.
- Parameters:
op – the callable function.
in_types – input types of the primitive.
out_type – output type of the primitive.
- flex.gp.primitives.add_primitives_to_pset_from_dict(pset: PrimitiveSetTyped, primitives_dict: Dict)[source]
Add a given set of primitives to a PrimitiveSetTyped object.
- Parameters:
pset – a primitive set.
primitives_dict – a dictionary with two keys: imports, containing the import location of the pre-defined primitives, and used, containing a list of dictionaries (of the same structure as the one in add_primitives_to_pset).
- Returns:
the updated primitive set
flex.gp.regressor module
- class flex.gp.regressor.GPSymbolicRegressor(pset_config: PrimitiveSet | PrimitiveSetTyped, fitness: Callable, predict_func: Callable, score_func: Callable | None = None, select_fun: str = 'tools.selection.tournament_with_elitism', select_args: str = "{'num_elitist': self.n_elitist, 'tournsize': 3, 'stochastic_tourn': { 'enabled': False, 'prob': [0.8, 0.2] }}", mut_fun: str = 'gp.mutUniform', mut_args: str = "{'expr': toolbox.expr_mut, 'pset': pset}", expr_mut_fun: str = 'gp.genHalfAndHalf', expr_mut_args: str = "{'min_': 1, 'max_': 3}", crossover_fun: str = 'gp.cxOnePoint', crossover_args: str = '{}', min_height: int = 1, max_height: int = 3, max_length: int = 100, num_individuals: int = 10, generations: int = 1, num_islands: int = 1, remove_init_duplicates: bool = False, mig_freq: int = 10, mig_frac: float = 0.05, crossover_prob: float = 0.5, mut_prob: float = 0.2, variation_mechanism: str = 'varAnd', frac_elitist: float = 0.0, overlapping_generation: bool = False, common_data: Dict | None = None, validate: bool = False, preprocess_args: Dict | None = None, callback_func: Callable | None = None, seed_str: List[str] | None = None, print_log: bool = False, num_best_inds_str: int = 1, save_best_individual: bool = False, save_train_fit_history: bool = False, save_detailed_log: bool = False, detailed_log_filename: str = 'population_detailed_log.csv', early_stop_fitness_threshold: float | None = None, output_path: str | None = None, batch_size: int = 1, num_cpus: int = 1, max_calls: int = 0, custom_logger: Callable = None, multiprocessing: bool = True)[source]
Bases: RegressorMixin, BaseEstimator
Symbolic regression via Genetic Programming (GP).
This regressor evolves symbolic expressions represented as GP trees in order to minimize a user-defined fitness function. It is built on top of DEAP and follows the scikit-learn estimator interface.
The regressor supports:
- Arbitrary user-defined fitness, prediction, and scoring functions
- Multi-island evolution with periodic migration
- Elitism and overlapping or non-overlapping generations
- Parallel fitness evaluation using Ray
- Validation-set monitoring
- Conversion of the best individuals to a SymPy expression
- Parameters:
pset_config – set of primitives and terminals (loosely or strongly typed).
fitness – fitness evaluation function. It must return a tuple containing a single scalar fitness value, e.g. (fitness_value,).
predict_func – function that returns a prediction given an individual and a test dataset as inputs.
score_func – score metric used for validation and for the score method.
select_fun – string representing the selection operator to use.
select_args – stringified dictionary of keyword arguments passed to the selection operator. The string is evaluated at runtime.
mut_fun – mutation operator.
mut_args – arguments for the mutation operator.
expr_mut_fun – expression generator used during mutation.
expr_mut_args – arguments for the mutation expression generator.
crossover_fun – crossover operator.
crossover_args – arguments for the crossover operator.
min_height – minimum height of GP trees at initialization.
max_height – maximum height of GP trees at initialization.
max_length – maximum number of nodes allowed in a GP tree.
num_individuals – population size per island.
generations – number of generations.
num_islands – number of islands (for a multi-island model).
remove_init_duplicates – whether to remove duplicate individuals from the initial populations.
mig_freq – migration frequency (in generations).
mig_frac – fraction of individuals exchanged during migration.
crossover_prob – probability of applying crossover.
mut_prob – probability of applying mutation.
variation_mechanism – variation operator used to generate offspring. Supported values are "varAnd" and "varOr".
frac_elitist – fraction of elite individuals preserved each generation.
overlapping_generation – True if the offspring competes with the parents for survival.
common_data – dictionary of arguments shared between fitness, prediction, and scoring functions.
validate – whether to use a validation dataset.
preprocess_args – configuration for a function applied to individuals prior to fitness evaluation. It must contain three keys: func, the callable to execute, which must accept an individual and the toolbox as its first two arguments; func_args, a dictionary of additional arguments for func; and callback, a function used to assign the resulting preprocessed values back to each individual.
callback_func – function called after fitness evaluation to perform custom processing.
seed_str – list of GP expressions used to seed the initial population.
print_log – whether to print the log containing the population statistics during the run.
num_best_inds_str – number of best individuals printed at each generation.
save_best_individual – whether to save the string representation of the best individual.
save_train_fit_history – whether to save the training fitness history.
save_detailed_log – whether to save a per-generation population log with each individual string, size, fitness, and island index.
detailed_log_filename – file name used for detailed population logging.
early_stop_fitness_threshold – if set, stop evolution early when the best training fitness is less than or equal to this threshold.
output_path – directory where outputs are saved.
batch_size – batch size used for Ray-based fitness evaluation.
num_cpus – number of CPUs allocated to each Ray task.
max_calls – maximum number of tasks a Ray worker can execute before it is restarted. The default is 0, which means there is no limit.
custom_logger – user-defined logging function called with the best individuals.
multiprocessing – whether to use Ray for parallel fitness evaluation.
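Several operator arguments (select_args, mut_args, expr_mut_args, crossover_args) are passed as stringified dictionaries and evaluated at runtime, which is why the default select_args can refer to self.n_elitist. The sketch below shows the general idea with a hypothetical ConfigSketch class. The formula used for n_elitist (fraction times population size, truncated to an integer) is an assumption, not the library's definition.

```python
class ConfigSketch:
    """Hypothetical stand-in holding two of the regressor's parameters."""

    def __init__(self, num_individuals, frac_elitist):
        self.num_individuals = num_individuals
        self.frac_elitist = frac_elitist

    @property
    def n_elitist(self):
        # Assumed formula: fraction of the population, truncated to an int.
        return int(self.frac_elitist * self.num_individuals)

    def resolve_args(self, args_str):
        # Evaluate the stringified dictionary with `self` in scope, so the
        # string can reference attributes that are only known at runtime.
        return eval(args_str, {"self": self})


cfg = ConfigSketch(num_individuals=40, frac_elitist=0.25)
select_args = cfg.resolve_args("{'num_elitist': self.n_elitist, 'tournsize': 3}")
print(select_args)  # {'num_elitist': 10, 'tournsize': 3}
```

Since the strings are evaluated as Python code, they should come only from trusted configuration.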
- fit(X: ndarray[tuple[Any, ...], dtype[_ScalarT]] | Array, y: ndarray[tuple[Any, ...], dtype[_ScalarT]] | Array = None, X_val: ndarray[tuple[Any, ...], dtype[_ScalarT]] | Array = None, y_val: ndarray[tuple[Any, ...], dtype[_ScalarT]] | Array = None)[source]
Fits the training data using GP-based symbolic regression.
This method initializes the populations, evaluates the fitness of the individuals, and evolves the populations for the specified number of generations.
- Parameters:
X – training input data.
y – training targets. If None, the fitness function must not require targets.
X_val – validation input data.
y_val – validation targets.
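The fitness function passed to the regressor must return a single-element tuple such as (fitness_value,). The sketch below illustrates that return convention with a mean squared error. The argument list (a callable model, X, and y) is hypothetical and does not reflect the signature the regressor actually calls.

```python
from typing import Callable, Optional, Sequence, Tuple


def mse_fitness(
    model: Callable[[float], float],
    X: Sequence[float],
    y: Optional[Sequence[float]] = None,
) -> Tuple[float]:
    """Return the mean squared error wrapped in a one-element tuple."""
    if y is None:
        # This particular fitness needs targets; one that does not would
        # simply ignore y.
        raise ValueError("mse_fitness requires target values")
    errors = [(model(x) - target) ** 2 for x, target in zip(X, y)]
    return (sum(errors) / len(errors),)


perfect = mse_fitness(lambda x: 2 * x, [1.0, 2.0], [2.0, 4.0])
imperfect = mse_fitness(lambda x: x, [1.0, 2.0], [2.0, 4.0])
print(perfect, imperfect)  # (0.0,) (2.5,)
```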
- get_best_individuals(n_ind: int = 1)[source]
Returns the best individuals across all islands.
- Parameters:
n_ind – number of top individuals to return.
- Returns:
List of the best GP individuals.
- get_best_individuals_sympy(sympy_conversion_rules: Dict = {'add': <function <lambda>>, 'aq': <function <lambda>>, 'div': <function <lambda>>, 'log': <function <lambda>>, 'mul': <function <lambda>>, 'pow': <function <lambda>>, 'prot_log': <function <lambda>>, 'prot_pow': <function <lambda>>, 'square': <function <lambda>>, 'sub': <function <lambda>>}, special_term_name: str = 'c', n_ind: int = 1)[source]
Returns the SymPy expression of the best individuals.
- Parameters:
sympy_conversion_rules – mapping from GP primitives (DEAP) to SymPy primitives.
special_term_name – name used for constants during SymPy conversion.
n_ind – number of best individuals to convert to SymPy.
- Returns:
sympy representation of the best individuals.
- get_population_individuals(by_island: bool = False)[source]
Returns individuals from the current population.
- Parameters:
by_island – if True, return a list of populations (one per island); otherwise return a single flattened list.
- Returns:
current population individuals.
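The effect of by_island can be sketched with plain lists. The function below is a hypothetical stand-in that treats the population as a list of islands, each a list of individuals, which is an assumption about the internal layout.

```python
from itertools import chain


def population_individuals(islands, by_island=False):
    """Return per-island lists, or one flattened list of individuals."""
    if by_island:
        return [list(island) for island in islands]
    return list(chain.from_iterable(islands))


islands = [["add(x, c)", "mul(x, x)"], ["sub(x, c)"]]
flat = population_individuals(islands)
grouped = population_individuals(islands, by_island=True)
print(flat)     # ['add(x, c)', 'mul(x, x)', 'sub(x, c)']
print(grouped)  # [['add(x, c)', 'mul(x, x)'], ['sub(x, c)']]
```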
- get_train_fit_history()[source]
Returns the training fitness history.
- Returns:
list containing the training fitness values at each generation.
- get_val_score_history()[source]
Returns the validation score history.
- Returns:
list containing the validation scores at each generation.
- property n_elitist
Computes the number of elitist individuals in the population.
- predict(X: ndarray[tuple[Any, ...], dtype[_ScalarT]] | Array)[source]
Predict outputs using the best evolved individual.
- Parameters:
X – Input data.
- Returns:
predictions computed by the best individual.
- save_best_test_sols(X_test: ndarray[tuple[Any, ...], dtype[_ScalarT]] | Array, output_path: str)[source]
Compute and save the predictions corresponding to the best individual at the end of the evolution, evaluated over the test dataset.
- Parameters:
X_test – test input data.
output_path – path where the predictions should be saved (one .npy file for each sample in the test dataset).
- score(X: ndarray[tuple[Any, ...], dtype[_ScalarT]] | Array, y: ndarray[tuple[Any, ...], dtype[_ScalarT]] | Array = None)[source]
Compute the score of the best evolved individual. This method evaluates the user-provided score_func on the given dataset.
- Parameters:
X – input data.
y – target values.
- Returns:
score value returned by score_func.
- set_fit_request(*, X_val: bool | None | str = '$UNCHANGED$', y_val: bool | None | str = '$UNCHANGED$') → GPSymbolicRegressor
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
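- Parameters:
X_val – metadata routing for the X_val parameter in fit.
y_val – metadata routing for the y_val parameter in fit.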
- Returns:
self – The updated object.
- Return type:
flex.gp.sympy module
- flex.gp.sympy.deap_primitive_to_sympy_expr(prim: Primitive, conversion_rules: Dict, args: Tuple)[source]
Convert a DEAP primitive and its arguments into the corresponding sympy expression.
- Parameters:
prim – the primitive.
conversion_rules – a dictionary of conversion rules.
args – args of the primitive.
- Returns:
the sympy-compatible expression.
- flex.gp.sympy.stringify_for_sympy(f: PrimitiveTree, conversion_rules: Dict, special_term_name: str) → str[source]
Returns a sympy-compatible expression.
- Parameters:
f – the individual tree (DEAP format)
conversion_rules – a dictionary of conversion rules.
special_term_name – name of the constant placeholder.
- Returns:
the sympy-compatible expression.
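The role of the conversion rules can be illustrated without DEAP or SymPy. In the sketch below a tree is a nested tuple (primitive name followed by its arguments) and each rule is a callable that formats already-converted arguments into a string. The tuple representation, the function to_expr_string, and the specific output format are assumptions. Only the rule names (add, sub, mul, square) are taken from the default sympy_conversion_rules keys listed above.

```python
# Each rule receives the converted arguments and returns an expression string.
conversion_rules = {
    "add": lambda *a: f"({a[0]} + {a[1]})",
    "sub": lambda *a: f"({a[0]} - {a[1]})",
    "mul": lambda *a: f"({a[0]} * {a[1]})",
    "square": lambda *a: f"({a[0]})**2",
}


def to_expr_string(node, rules):
    """Recursively convert a nested-tuple tree into an expression string."""
    if isinstance(node, str):
        # Terminals (variables and the constant placeholder) are kept as-is.
        return node
    name, *children = node
    args = [to_expr_string(child, rules) for child in children]
    return rules[name](*args)


# "c" plays the role of the constant placeholder (special_term_name).
tree = ("add", ("mul", "c", "x"), ("square", "x"))
expr = to_expr_string(tree, conversion_rules)
print(expr)  # ((c * x) + (x)**2)
```

A string in this form could then be handed to a symbolic parser; that step is left out here to keep the sketch dependency-free.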