flex.gp namespace

Submodules

flex.gp.cochain_primitives module

class flex.gp.cochain_primitives.CochainBasePrimitive(base_name: str, base_fun: Callable, input: List[str], output: str, att_input: Dict, map_rule: Dict)[source]

Bases: object

A simple class to handle a cochain base primitive function.

Parameters:
  • base_name – name of the base primitive.

  • base_fun – callable base function.

  • input – a list containing the input types (str) of base_fun.

  • output – a string containing the output type of base_fun.

  • att_input – a dictionary with keys ‘complex’ (primal/dual), ‘dimension’ (0,1,2), and ‘rank’ (“SC”, i.e. scalar, “V”, “T”).

  • map_rule – a dictionary consisting of the same keys of att_input. In this case, each key contains a callable object that provides the map to get the output complex/dimension/rank given the input one.

class flex.gp.cochain_primitives.Complex(*values)[source]

Bases: Enum

Enum class for complex.

DUAL = 'D'
PRIMAL = 'P'

class flex.gp.cochain_primitives.Dimension(*values)[source]

Bases: IntEnum

Enum class for dimension.

ONE = 1
TWO = 2
ZERO = 0

class flex.gp.cochain_primitives.Rank(*values)[source]

Bases: Enum

Enum class for rank.

SCALAR = ''
TENSOR = 'T'
VECTOR = 'V'

flex.gp.cochain_primitives.compute_primitive_in_out_type(primitive: CochainBasePrimitive, eval_with_globals: Callable, in_complex: Complex, in_dim: Dimension, in_rank: Rank)[source]

Resolves the specific variant name and types for a primitive.

Based on the input complex (Primal/Dual), dimension (0, 1, 2), and rank (Scalar, Vector, Tensor), this function generates a unique name for the primitive variant, resolves the Python types for all input arguments, and calculates the resulting output type using defined mapping rules.

Parameters:
  • primitive – a CochainBasePrimitive object.

  • eval_with_globals – The evaluation function created by define_eval_with_suitable_imports.

  • in_complex – The current complex.

  • in_dim – The current dimension.

  • in_rank – The current rank.

Returns:

A tuple containing the concatenated name (e.g., “addP1V”), a list of resolved Python type objects for the inputs, and the resolved Python type object for the output.
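The naming scheme can be illustrated with a minimal, self-contained sketch. It mirrors the enum values listed on this page and assumes that the variant name is the base name followed by the complex, dimension, and rank values, which is consistent with the “addP1V” example. The helper variant_name and the sample map_rule are hypothetical and are not part of flex.gp; whether the real map rules receive enum members or raw values is also an assumption.

```python
from enum import Enum, IntEnum


# Local mirrors of the enums documented on this page (values copied from it).
class Complex(Enum):
    DUAL = "D"
    PRIMAL = "P"


class Dimension(IntEnum):
    ZERO = 0
    ONE = 1
    TWO = 2


class Rank(Enum):
    SCALAR = ""
    VECTOR = "V"
    TENSOR = "T"


def variant_name(base_name, in_complex, in_dim, in_rank):
    """Hypothetical helper: concatenate the base name and the input attributes."""
    return f"{base_name}{in_complex.value}{in_dim.value}{in_rank.value}"


# Hypothetical map_rule for an operator that raises the dimension by one
# and leaves complex and rank unchanged.
map_rule = {
    "complex": lambda c: c,
    "dimension": lambda d: Dimension(d + 1),
    "rank": lambda r: r,
}

name = variant_name("add", Complex.PRIMAL, Dimension.ONE, Rank.VECTOR)
out_dim = map_rule["dimension"](Dimension.ONE)
print(name, out_dim.name)  # addP1V TWO
```

Because Rank.SCALAR has an empty value, a scalar variant in this sketch carries no rank suffix (for example “addD0”).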

flex.gp.cochain_primitives.define_eval_with_suitable_imports(imports: Dict)[source]

Creates a scoped evaluation function with pre-loaded modules.

This prevents repetitive imports and ensures that string-based type definitions (like “IntP1V”) can be converted into actual class references during GP tree construction.

Parameters:

imports – A dictionary where keys are module paths and values are lists of function/class names to import.

Returns:

A function eval_with_globals(expression) that evaluates strings within the context of the imported components.
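One way such a factory could work is sketched below, using only the standard library. This is an illustration of the idea (import once, then evaluate strings against a fixed namespace), not the flex.gp implementation; define_eval_sketch is a hypothetical name, and standard-library modules stand in for the modules that define the cochain types.

```python
import importlib
from typing import Callable, Dict, List


def define_eval_sketch(imports: Dict[str, List[str]]) -> Callable[[str], object]:
    """Hypothetical stand-in: build a namespace once, reuse it for every eval."""
    namespace = {}
    for module_path, names in imports.items():
        module = importlib.import_module(module_path)
        for name in names:
            namespace[name] = getattr(module, name)

    def eval_with_globals(expression: str):
        # Names in the expression are looked up in the pre-loaded namespace.
        return eval(expression, namespace)

    return eval_with_globals


# Keys are module paths, values are the names to import from each module.
eval_with_globals = define_eval_sketch({"math": ["sqrt"], "fractions": ["Fraction"]})
print(eval_with_globals("sqrt(16.0)"))  # 4.0
print(eval_with_globals("Fraction"))    # the Fraction class itself
```

As with any use of eval, the strings passed to the returned function should come from trusted configuration only.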

flex.gp.cochain_primitives.generate_primitive_variants(primitive: CochainBasePrimitive, imports: Dict)[source]

Generate primitive variants given a typed primitive.

Parameters:
  • primitive – a CochainBasePrimitive object.

  • imports – dictionary whose keys and values are the modules and the functions to be imported in order to evaluate the input/output types of the primitive.

Returns:

a dict in which each key is the name of a primitive variant and each value is a PrimitiveParams object.
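The enumeration of variants can be pictured as a Cartesian product over the admissible input attributes. The sketch below only builds the variant names; the real function also resolves types and returns PrimitiveParams objects. The layout of att_input (a list of admissible values per key) and the rule that “SC” contributes an empty suffix are assumptions made for illustration.

```python
from itertools import product

# Hypothetical att_input: admissible values for each attribute of the input.
att_input = {"complex": ["P", "D"], "dimension": [0, 1], "rank": ["SC", "V"]}

# Assumption: "SC" contributes an empty suffix, matching Rank.SCALAR = ''.
RANK_SUFFIX = {"SC": "", "V": "V", "T": "T"}


def variant_names(base_name, att_input):
    """Enumerate one name per (complex, dimension, rank) combination."""
    combos = product(
        att_input["complex"], att_input["dimension"], att_input["rank"]
    )
    return [f"{base_name}{c}{d}{RANK_SUFFIX[r]}" for c, d, r in combos]


names = variant_names("add", att_input)
print(len(names))  # 8
print(names[:4])   # ['addP0', 'addP0V', 'addP1', 'addP1V']
```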

flex.gp.cochain_primitives.inv_scalar_mul(c: Cochain, f: float)[source]

Scalar multiplication between a cochain and the inverse of a float.

Parameters:
  • c – a cochain.

  • f – a float.

Returns:

the product of c and the scalar 1/f.
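The operation amounts to scaling every coefficient by 1/f. In the sketch below a plain list of floats stands in for the coefficients of a Cochain, and inv_scalar_mul_sketch is a hypothetical name used only for this illustration.

```python
from typing import List


def inv_scalar_mul_sketch(coeffs: List[float], f: float) -> List[float]:
    """Multiply every coefficient by 1/f (a plain list stands in for a Cochain)."""
    inv = 1.0 / f
    return [inv * x for x in coeffs]


print(inv_scalar_mul_sketch([2.0, 4.0, 6.0], 2.0))  # [1.0, 2.0, 3.0]
```

This sketch raises ZeroDivisionError for f = 0; how the library handles that case is not stated here.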

flex.gp.cochain_primitives.switch_complex(complex: Complex)[source]

Switch complex (from primal to dual or vice versa).

Parameters:

complex – a Complex object.

Returns:

the other complex.
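A minimal sketch of this behaviour, using a local mirror of the Complex enum (switch_complex_sketch is a hypothetical name). A function of this kind could plausibly serve as the ‘complex’ entry of a map_rule for operators that move between the primal and dual complexes, although that usage is an assumption.

```python
from enum import Enum


class Complex(Enum):  # local mirror of the enum documented on this page
    DUAL = "D"
    PRIMAL = "P"


def switch_complex_sketch(cplx: Complex) -> Complex:
    """Return the other member of the two-valued Complex enum."""
    return Complex.DUAL if cplx is Complex.PRIMAL else Complex.PRIMAL


print(switch_complex_sketch(Complex.PRIMAL))  # Complex.DUAL
```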

flex.gp.jax_primitives module

flex.gp.jax_primitives.inv_float(x)[source]
flex.gp.jax_primitives.protectedDiv(left, right)[source]
flex.gp.jax_primitives.protectedLog(x)[source]
flex.gp.jax_primitives.protectedSqrt(x)[source]
flex.gp.jax_primitives.square_mod(x)[source]

flex.gp.numpy_primitives module

flex.gp.primitives module

class flex.gp.primitives.PrimitiveParams(op: Callable, in_types: Type, out_type: Type)[source]

Bases: object

A simple class to handle a primitive function.

Parameters:
  • op – the callable function.

  • in_types – input types of the primitive.

  • out_type – output type of the primitive.

flex.gp.primitives.add_primitives_to_pset_from_dict(pset: PrimitiveSetTyped, primitives_dict: Dict)[source]

Add a given set of primitives to a PrimitiveSetTyped object.

Parameters:
  • pset – a primitive set.

  • primitives_dict – a dictionary composed of two keys: imports, containing the import location of the pre-defined primitives; used, containing a list of dictionaries (of the same structure as the one in add_primitives_to_pset).

Returns:

the updated primitive set

flex.gp.primitives.get_base_name(typed_name: str)[source]

Extracts the base name by removing P/D and rank/dim indicators.

Parameters:

typed_name – the full name of the primitive (e.g., “St1D1V”).

Returns:

the base name of the primitive (e.g., “St1”).
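The stripping step can be sketched with a regular expression. The pattern below is an assumption derived from the naming examples on this page (a complex letter P or D, a dimension digit 0 to 2, and an optional rank letter V or T); the library may use a different rule, and get_base_name_sketch is a hypothetical name.

```python
import re

# Assumption: a typed suffix is P or D, then a dimension digit 0-2,
# then an optional rank letter (V or T; scalars add nothing).
_SUFFIX = re.compile(r"[PD][012][VT]?$")


def get_base_name_sketch(typed_name: str) -> str:
    """Strip one trailing complex/dimension/rank suffix, if present."""
    return _SUFFIX.sub("", typed_name)


print(get_base_name_sketch("St1D1V"))  # St1
print(get_base_name_sketch("addP1V"))  # add
```

Anchoring the pattern at the end of the string keeps digits that belong to the base name itself, such as the “1” in “St1”.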

flex.gp.regressor module

class flex.gp.regressor.GPSymbolicRegressor(pset_config: PrimitiveSet | PrimitiveSetTyped, fitness: Callable, predict_func: Callable, score_func: Callable | None = None, select_fun: str = 'tools.selection.tournament_with_elitism', select_args: str = "{'num_elitist': self.n_elitist, 'tournsize': 3, 'stochastic_tourn': { 'enabled': False, 'prob': [0.8, 0.2] }}", mut_fun: str = 'gp.mutUniform', mut_args: str = "{'expr': toolbox.expr_mut, 'pset': pset}", expr_mut_fun: str = 'gp.genHalfAndHalf', expr_mut_args: str = "{'min_': 1, 'max_': 3}", crossover_fun: str = 'gp.cxOnePoint', crossover_args: str = '{}', min_height: int = 1, max_height: int = 3, max_length: int = 100, num_individuals: int = 10, generations: int = 1, num_islands: int = 1, remove_init_duplicates: bool = False, mig_freq: int = 10, mig_frac: float = 0.05, crossover_prob: float = 0.5, mut_prob: float = 0.2, variation_mechanism: str = 'varAnd', frac_elitist: float = 0.0, overlapping_generation: bool = False, common_data: Dict | None = None, validate: bool = False, preprocess_args: Dict | None = None, callback_func: Callable | None = None, seed_str: List[str] | None = None, print_log: bool = False, num_best_inds_str: int = 1, save_best_individual: bool = False, save_train_fit_history: bool = False, save_detailed_log: bool = False, detailed_log_filename: str = 'population_detailed_log.csv', early_stop_fitness_threshold: float | None = None, output_path: str | None = None, batch_size: int = 1, num_cpus: int = 1, max_calls: int = 0, custom_logger: Callable = None, multiprocessing: bool = True)[source]

Bases: RegressorMixin, BaseEstimator

Symbolic regression via Genetic Programming (GP).

This regressor evolves symbolic expressions represented as GP trees in order to minimize a user-defined fitness function. It is built on top of DEAP and follows the scikit-learn estimator interface.

The regressor supports:
  • Arbitrary user-defined fitness, prediction, and scoring functions

  • Multi-island evolution with periodic migration

  • Elitism and overlapping or non-overlapping generations

  • Parallel fitness evaluation using Ray

  • Validation-set monitoring

  • Conversion of the best individuals to a SymPy expression

Parameters:
  • pset_config – set of primitives and terminals (loosely or strongly typed).

  • fitness – fitness evaluation function. It must return a tuple containing a single scalar fitness value, e.g. (fitness_value,).

  • predict_func – function that returns a prediction given an individual and a test dataset as inputs.

  • score_func – score metric used for validation and for the score method.

  • select_fun – string representing the selection operator to use.

  • select_args – stringified dictionary of keyword arguments passed to the selection operator. The string is evaluated at runtime.

  • mut_fun – mutation operator.

  • mut_args – arguments for the mutation operator.

  • expr_mut_fun – expression generator used during mutation.

  • expr_mut_args – arguments for the mutation expression generator.

  • crossover_fun – crossover operator.

  • crossover_args – arguments for the crossover operator.

  • min_height – minimum height of GP trees at initialization.

  • max_height – maximum height of GP trees at initialization.

  • max_length – maximum number of nodes allowed in a GP tree.

  • num_individuals – population size per island.

  • generations – number of generations.

  • num_islands – number of islands (for a multi-island model).

  • remove_init_duplicates – whether to remove duplicate individuals from the initial populations.

  • mig_freq – migration frequency (in generations).

  • mig_frac – fraction of individuals exchanged during migration.

  • crossover_prob – probability of applying crossover.

  • mut_prob – probability of applying mutation.

  • variation_mechanism – variation operator used to generate offspring. Supported values are "varAnd" and "varOr".

  • frac_elitist – fraction of elite individuals preserved each generation.

  • overlapping_generation – True if the offspring competes with the parents for survival.

  • common_data – dictionary of arguments shared between fitness, prediction, and scoring functions.

  • validate – whether to use a validation dataset.

  • preprocess_args – configuration for a function applied to individuals prior to fitness evaluation. It must contain three keys: func, the callable to execute, which must accept an individual and the toolbox as its first two arguments; func_args, a dictionary of additional arguments for func; and callback, a function used to assign the resulting preprocessed values back to each individual.

  • callback_func – function called after fitness evaluation to perform custom processing.

  • seed_str – list of GP expressions used to seed the initial population.

  • print_log – whether to print the log containing the population statistics during the run.

  • num_best_inds_str – number of best individuals printed at each generation.

  • save_best_individual – whether to save the string representation of the best individual.

  • save_train_fit_history – whether to save the training fitness history.

  • save_detailed_log – whether to save a per-generation population log with each individual string, size, fitness, and island index.

  • detailed_log_filename – file name used for detailed population logging.

  • early_stop_fitness_threshold – if set, stop evolution early when the best training fitness is less than or equal to this threshold.

  • output_path – directory where outputs are saved.

  • batch_size – batch size used for Ray-based fitness evaluation.

  • num_cpus – number of CPUs allocated to each Ray task.

  • max_calls – maximum number of tasks a Ray worker can execute before it is restarted. The default is 0, which means no limit.

  • custom_logger – user-defined logging function called with the best individuals.

  • multiprocessing – whether to use Ray for parallel fitness evaluation.
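Several operator arguments above are passed as stringified dictionaries that are evaluated at runtime. The sketch below shows how such a string could be turned into keyword arguments, using the default values from the signature. The evaluation namespace (a self name bound to an object exposing n_elitist) is an assumption made for illustration, and a SimpleNamespace stands in for the regressor.

```python
from types import SimpleNamespace

# Stand-in for the regressor instance; only n_elitist is needed here.
regressor = SimpleNamespace(n_elitist=2)

select_args = (
    "{'num_elitist': self.n_elitist, 'tournsize': 3, "
    "'stochastic_tourn': {'enabled': False, 'prob': [0.8, 0.2]}}"
)
expr_mut_args = "{'min_': 1, 'max_': 3}"

# Assumption: the strings are evaluated with names such as `self` in scope.
select_kwargs = eval(select_args, {"self": regressor})
expr_mut_kwargs = eval(expr_mut_args)

print(select_kwargs["num_elitist"])  # 2
print(expr_mut_kwargs)               # {'min_': 1, 'max_': 3}
```

Deferring evaluation in this way lets a default such as select_args refer to values, like the number of elitists, that are only known once the estimator has been configured.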

fit(X: ndarray[tuple[Any, ...], dtype[_ScalarT]] | Array, y: ndarray[tuple[Any, ...], dtype[_ScalarT]] | Array = None, X_val: ndarray[tuple[Any, ...], dtype[_ScalarT]] | Array = None, y_val: ndarray[tuple[Any, ...], dtype[_ScalarT]] | Array = None)[source]

Fits the training data using GP-based symbolic regression.

This method initializes the populations, evaluates the fitness of the individuals, and evolves the populations for the specified number of generations.

Parameters:
  • X – training input data.

  • y – training targets. If None, the fitness function must not require targets.

  • X_val – validation input data.

  • y_val – validation targets.

get_best_individuals(n_ind: int = 1)[source]

Returns the best individuals across all islands.

Parameters:

n_ind – number of top individuals to return.

Returns:

List of the best GP individuals.

get_best_individuals_sympy(sympy_conversion_rules: Dict = {'add': <function <lambda>>, 'aq': <function <lambda>>, 'div': <function <lambda>>, 'log': <function <lambda>>, 'mul': <function <lambda>>, 'pow': <function <lambda>>, 'prot_log': <function <lambda>>, 'prot_pow': <function <lambda>>, 'square': <function <lambda>>, 'sub': <function <lambda>>}, special_term_name: str = 'c', n_ind: int = 1)[source]

Returns the SymPy expression of the best individuals.

Parameters:
  • sympy_conversion_rules – mapping from GP primitives (DEAP) to SymPy primitives.

  • special_term_name – name used for constants during SymPy conversion.

  • n_ind – number of best individuals to convert to SymPy.

Returns:

sympy representation of the best individuals.

get_last_gen()[source]

Returns the last generation index.

Returns:

the last generation.

get_params(deep: bool = True)[source]

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict

get_pop_stats()[source]

Get population stats.

get_population_individuals(by_island: bool = False)[source]

Returns individuals from the current population.

Parameters:

by_island – if True, return a list of populations (one per island); otherwise return a single flattened list.

Returns:

current population individuals.
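The two return shapes can be illustrated with a small stand-alone sketch in which strings stand in for GP individuals; population_view is a hypothetical helper, not a flex.gp function.

```python
from itertools import chain

# Two islands with string stand-ins for GP individuals.
islands = [["add(x, c)", "mul(x, x)"], ["sub(x, c)"]]


def population_view(islands, by_island=False):
    """Return per-island lists, or one flattened list when by_island is False."""
    if by_island:
        return [list(pop) for pop in islands]
    return list(chain.from_iterable(islands))


print(population_view(islands))                  # one flat list of 3 individuals
print(population_view(islands, by_island=True))  # a list of 2 per-island lists
```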

get_train_fit_history()[source]

Returns the training fitness history.

Returns:

list containing the training fitness values at each generation.

get_val_score_history()[source]

Returns the validation score history.

Returns:

list containing the validation scores at each generation.

property n_elitist

Compute the number of elitists in the population.

predict(X: ndarray[tuple[Any, ...], dtype[_ScalarT]] | Array)[source]

Predict outputs using the best evolved individual.

Parameters:

X – Input data.

Returns:

predictions computed by the best individual.

save_best_test_sols(X_test: ndarray[tuple[Any, ...], dtype[_ScalarT]] | Array, output_path: str)[source]

Compute and save the predictions corresponding to the best individual at the end of the evolution, evaluated over the test dataset.

Parameters:
  • X_test – test input data.

  • output_path – path where the predictions should be saved (one .npy file for each sample in the test dataset).

score(X: ndarray[tuple[Any, ...], dtype[_ScalarT]] | Array, y: ndarray[tuple[Any, ...], dtype[_ScalarT]] | Array = None)[source]

Compute the score of the best evolved individual. This method evaluates the user-provided score_func on the given dataset.

Parameters:
  • X – input data.

  • y – target values.

Returns:

score value returned by score_func.

set_fit_request(*, X_val: bool | None | str = '$UNCHANGED$', y_val: bool | None | str = '$UNCHANGED$') → GPSymbolicRegressor

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • X_val (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for X_val parameter in fit.

  • y_val (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_val parameter in fit.

Returns:

self – The updated object.

Return type:

object

flex.gp.sympy module

flex.gp.sympy.deap_primitive_to_sympy_expr(prim: Primitive, conversion_rules: Dict, args: Tuple)[source]

Convert a DEAP primitive and its arguments into the corresponding sympy expression.

Parameters:
  • prim – the primitive.

  • conversion_rules – a dictionary of conversion rules.

  • args – args of the primitive.

Returns:

the sympy-compatible expression.

flex.gp.sympy.stringify_for_sympy(f: PrimitiveTree, conversion_rules: Dict, special_term_name: str) → str[source]

Returns a sympy-compatible expression.

Parameters:
  • f – the individual tree (DEAP format)

  • conversion_rules – a dictionary of conversion rules.

  • special_term_name – name of the constant placeholder.

Returns:

the sympy-compatible expression.
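The role of the conversion rules can be sketched with a recursive walk over an expression tree. This is a simplified illustration: a DEAP PrimitiveTree stores nodes as a flat list in prefix order, whereas nested tuples are used here for brevity, and the form of the rules (callables that format already-converted argument strings) is an assumption. The names conversion_rules_sketch and to_expr_string are hypothetical.

```python
# Hypothetical conversion rules: primitive name -> callable that formats
# already-converted argument strings.
conversion_rules_sketch = {
    "add": lambda a, b: f"({a} + {b})",
    "mul": lambda a, b: f"({a} * {b})",
    "square": lambda a: f"({a})**2",
}


def to_expr_string(node, rules):
    """Recursively convert a nested-tuple tree into an infix string."""
    if isinstance(node, str):  # terminal: variable or constant placeholder
        return node
    name, *children = node
    args = [to_expr_string(child, rules) for child in children]
    return rules[name](*args)


# Tree for add(square(x), mul(c, x)); "c" plays the role of special_term_name.
tree = ("add", ("square", "x"), ("mul", "c", "x"))
print(to_expr_string(tree, conversion_rules_sketch))  # ((x)**2 + (c * x))
```

The resulting string uses ordinary infix syntax, so it could then be handed to a parser such as sympy.sympify if SymPy is available.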

flex.gp.util module

flex.gp.util.avg_func(values)[source]
flex.gp.util.compile_individual_with_consts(tree, toolbox, special_term_name='c')[source]
flex.gp.util.compile_individuals(toolbox, individuals_str_batch)[source]
flex.gp.util.detect_nested_trigonometric_functions(equation)[source]
flex.gp.util.dim_constructor(loader, node)[source]
flex.gp.util.dummy_fitness(individuals_str, toolbox, X, y)[source]
flex.gp.util.dummy_predict(individuals_str, toolbox, X)[source]
flex.gp.util.dummy_score(individuals_str, toolbox, X, y)[source]
flex.gp.util.fitness_value(ind)[source]
flex.gp.util.load_config_data(filename)[source]

Load problem settings from YAML file.

flex.gp.util.mapper(f, individuals, toolbox_ref, batch_size)[source]
flex.gp.util.max_func(values)[source]
flex.gp.util.min_func(values)[source]
flex.gp.util.rank_constructor(loader, node)[source]
flex.gp.util.std_func(values)[source]