gp¶
The gp
subpackage contains a genetic programming framework that produces
Push programs.
The genetic programming framework is heavily inspired by the scikit-learn machine learning framework. PyshGP is designed to be usable through its base classes alone, many of which are sklearn estimators. The framework aims to be easy to embed in data science and machine learning pipelines rather than to serve as a standalone tool.
Aside from the general goal of inductive program synthesis, the PyshGP genetic programming framework does not have a single intended application (e.g., regression or classification).
Base¶
TODO: Module docstring

class
pyshgp.gp.base.
PyshBase
(atom_generators='default', operators='default', error_threshold=0, max_generations=1000, population_size=300, selection_method='lexicase', n_jobs=1, initial_max_genome_size=50, program_growth_cap=100, verbose=0, epsilon='auto', tournament_size=7, simplification_steps=2000, keep_linear=False)¶ Base class for all PushGP evolvers.
TODO: Add validation checks.
Parameters: atom_generators : list or str, optional (default='default')
Atom generators used to generate random Push programs. If 'default', all atom generators are used.
operators : list or str, optional (default='default')
List of tuples. Each tuple contains a VariationOperator and a float. The float determines the relative probability of using the VariationOperator to produce a child. If 'default', a commonly used set of genetic operators is used.
error_threshold : int or float, optional (default=0)
If a program’s total error is ever less than or equal to this value, the program is considered a solution.
max_generations : int, optional (default=1000)
Max number of generations before stopping evolution.
population_size : int, optional (default=300)
Number of Individuals to have in the population at any given generation.
selection_method : str, optional (default=’lexicase’)
Method to use when selecting parents. Supported options are ‘lexicase’, ‘epsilon_lexicase’, and ‘tournament’.
n_jobs : int or str, optional (default=1)
Number of processes to run at once during program evaluation. If -1, the number of processes will be equal to the number of cores.
initial_max_genome_size : int, optional (default=50)
Max number of genes to have in each randomly generated genome.
program_growth_cap : int, optional (default=100)
TODO: Implement this feature.
verbose : int, optional (default=0)
If 0, prints nothing during evolution. If 1, prints minimal information while evolving. If 2, prints as much information as possible during evolution, which might slightly impact runtime.
epsilon : float or str, optional (default=’auto’)
The value of epsilon when using ‘epsilon_lexicase’ as the selection method. If ‘auto’, epsilon is set to be equal to the Median Absolute Deviation of each error.
tournament_size : int, optional (default=7)
The size of each tournament when using ‘tournament’ selection.
simplification_steps : int, optional (default=2000)
Number of steps of automatic program simplification to perform.

choose_genetic_operator
()¶ Normalizes operator probabilities so that values sum to 1.
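As a rough illustration of what normalizing operator probabilities means, the sketch below scales a list of (operator, weight) tuples so the weights sum to 1 and then draws one operator. The helper names and the string placeholders standing in for VariationOperator instances are assumptions made for this example and are not part of pyshgp.

```python
import random


def normalize_weights(operators):
    """Scale (operator, weight) pairs so that the weights sum to 1."""
    total = sum(weight for _, weight in operators)
    if total <= 0:
        raise ValueError("operator weights must sum to a positive number")
    return [(op, weight / total) for op, weight in operators]


def choose_operator(operators, rng=random):
    """Pick one operator with probability proportional to its weight."""
    normalized = normalize_weights(operators)
    ops = [op for op, _ in normalized]
    probs = [p for _, p in normalized]
    return rng.choices(ops, weights=probs, k=1)[0]


# Hypothetical operator list; strings stand in for VariationOperator objects.
operators = [("mutation", 2.0), ("alternation", 5.0), ("reproduction", 3.0)]
normalized = normalize_weights(operators)
chosen = choose_operator(operators, random.Random(0))
```

Because only relative weights matter, a caller can supply any positive numbers and leave the scaling to the normalization step.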

init_executor
()¶ Initializes a pool of processes.
This requires pathos.multiprocessing because the standard multiprocessing library does not support pickling lambdas and non-top-level functions. Pathos specifically makes use of the dill package.
TODO: If there is a way around using pathos, it would be great to remove this dependency.
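The pickling limitation mentioned above can be observed with the standard library alone. The following sketch uses only the built-in pickle module; the helper names are made up for this example, and it does not use pathos or dill.

```python
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def make_adder(n):
    """Return a nested (non-top-level) function."""
    def add(x):
        return x + n
    return add


square_lambda = lambda x: x * x  # a lambda has no importable name

builtin_ok = can_pickle(len)            # importable by name, so it pickles
lambda_ok = can_pickle(square_lambda)   # fails with the standard pickler
nested_ok = can_pickle(make_adder(1))   # fails with the standard pickler
```

The standard pickler stores functions by importable name, which lambdas and nested functions lack; this is the gap that a dill-based pool is meant to close.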

init_population
()¶ Generate random population of Individuals with Push programs.

make_spawner
(num_inputs)¶ Creates a spawner object used to generate random code.
Parameters: num_inputs : int
The number of input instructions to generate and add to the Spawner. This should be set to the number of input values (features) that will be supplied to Push programs during evaluation.
output_types : list
A list of pysh types. The spawner will include instructions which output a list of outputs with the corresponding type in each index.

print_monitor
(generation)¶ Prints a basic set of values that can be used to manually monitor run health.
TODO: Add validation check for if population exists.
Parameters: generation : int
The generation number.

print_monitor_verbose
(generation)¶ Prints all implemented values that can be used to manually monitor run health.
TODO: Add validation check for if population exists.
Parameters: generation : int
The generation number.


class
pyshgp.gp.base.
PyshEstimatorMixin
¶ A Mixin class for the Sklearn estimators included in pyshgp.

evolve
(X, y)¶ Main evolutionary loop for the sklearn estimators in pyshgp.
Parameters: X : {array-like, sparse matrix}, shape = (n_samples, n_features)
Samples.
y : {array-like, sparse matrix}, shape = (n_samples, 1)
Target values.
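For orientation, a generational loop of the kind described here can be sketched in plain Python. This is a toy under stated assumptions, not the pyshgp implementation: genomes are lists of digits rather than Plush genomes, parents are chosen by tournament, and all function names are hypothetical. The parameter names mirror those of PyshBase only to show where each one would plausibly act.

```python
import random


def evolve_sketch(random_genome, mutate, error_function,
                  population_size=300, max_generations=1000,
                  error_threshold=0, tournament_size=7, rng=random):
    """Return (best_genome, best_total_error) from a toy generational loop."""
    population = [random_genome(rng) for _ in range(population_size)]
    best_genome, best_error = None, float("inf")
    for generation in range(max_generations):
        # Evaluate: total error is the sum of each genome's error vector.
        scored = [(sum(error_function(g)), g) for g in population]
        gen_error, gen_best = min(scored, key=lambda pair: pair[0])
        if gen_error < best_error:
            best_genome, best_error = gen_best, gen_error
        # Stop as soon as a solution is found.
        if best_error <= error_threshold:
            break
        # Breed the next generation from tournament winners.
        next_population = []
        for _ in range(population_size):
            entrants = rng.sample(scored, min(tournament_size, len(scored)))
            parent = min(entrants, key=lambda pair: pair[0])[1]
            next_population.append(mutate(parent, rng))
        population = next_population
    return best_genome, best_error


# Toy problem: evolve a list of 4 digits that matches TARGET.
TARGET = [3, 1, 4, 1]


def random_genome(rng):
    return [rng.randint(0, 9) for _ in TARGET]


def mutate(genome, rng):
    child = list(genome)
    child[rng.randrange(len(child))] = rng.randint(0, 9)
    return child


def error_function(genome):
    return [abs(g - t) for g, t in zip(genome, TARGET)]


best, error = evolve_sketch(random_genome, mutate, error_function,
                            population_size=50, max_generations=200,
                            rng=random.Random(0))
```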


pyshgp.gp.base.
choice
(a, size=None, replace=True, p=None)¶ Generates a random sample from a given 1-D array.
New in version 1.7.0.
Parameters: a : 1-D array-like or int
If an ndarray, a random sample is generated from its elements. If an int, the random sample is generated as if a were np.arange(a).
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
replace : boolean, optional
Whether the sample is with or without replacement.
p : 1-D array-like, optional
The probabilities associated with each entry in a. If not given, the sample assumes a uniform distribution over all entries in a.
Returns: samples : single item or ndarray
The generated random samples.
Raises: ValueError
If a is an int and less than zero, if a or p are not 1-dimensional, if a is an array-like of size 0, if p is not a vector of probabilities, if a and p have different lengths, or if replace=False and the sample size is greater than the population size.
See also
randint, shuffle, permutation
Examples
Generate a uniform random sample from np.arange(5) of size 3:
>>> np.random.choice(5, 3)
array([0, 3, 4])
>>> #This is equivalent to np.random.randint(0,5,3)
Generate a non-uniform random sample from np.arange(5) of size 3:
>>> np.random.choice(5, 3, p=[0.1, 0, 0.3, 0.6, 0])
array([3, 3, 0])
Generate a uniform random sample from np.arange(5) of size 3 without replacement:
>>> np.random.choice(5, 3, replace=False)
array([3,1,0])
>>> #This is equivalent to np.random.permutation(np.arange(5))[:3]
Generate a non-uniform random sample from np.arange(5) of size 3 without replacement:
>>> np.random.choice(5, 3, replace=False, p=[0.1, 0, 0.3, 0.6, 0])
array([2, 3, 0])
Any of the above can be repeated with an arbitrary array-like instead of just integers. For instance:
>>> aa_milne_arr = ['pooh', 'rabbit', 'piglet', 'Christopher']
>>> np.random.choice(aa_milne_arr, 5, p=[0.5, 0.1, 0.1, 0.3])
array(['pooh', 'pooh', 'pooh', 'Christopher', 'piglet'], dtype='S11')

pyshgp.gp.base.
random
() → x in the interval [0, 1).¶
Population¶
Classes that represent Individuals and Populations in evolutionary algorithms.

class
pyshgp.gp.population.
Individual
(genome)¶ Holds all information about an individual in the PushGP framework.
The main role of an Individual is to hold a Push program that determines the Individual’s behavior. An Individual’s Push program comes from a Plush genome, which is also stored in the Individual. Genomes are what pyshgp’s VariationOperators manipulate. An Individual is created from a genome, and its program is set by translating the genome into a program.
Parameters: genome : list of genes
List of plush genes.
Attributes
genome
Plush Genome of individual.
program
Push program of individual.
error_vector (list)
A list of numeric error values.
total_error (float)
A single numeric error value. Generally some aggregate of the error_vector.
genome
¶ Plush Genome of individual.

program
¶ Push program of individual. Taken from Plush genome.

run_program
(inputs, output_types, print_trace=False)¶ Runs the Individual’s program.
Parameters: inputs : list
List of input values that can be accessed by the Individual’s program.
print_trace : bool, optional
If True, prints the current program element and the state of the stack at each step of executing the program.
output_types : list
A list of pysh types. The spawner will include instructions which output a list of outputs with the corresponding type in each index.
Returns: The final state of the push Interpreter after executing the program.


class
pyshgp.gp.population.
Population
¶ Pyshgp population of Individuals.

average_error
()¶ Returns: The average total error found in the population.

best_program
()¶ Returns: The program of the Individual with the lowest total error.

best_program_error_vector
()¶ Returns: The error vector of the Individual with the lowest total error.

epsilon_lexicase_selection
(epsilon='auto')¶ Returns an individual that does the best on the fitness cases when considered one at a time in random order.
Parameters: epsilon : float, array-like or str, optional (default=’auto’)
If an individual is within epsilon of being elite, it will remain in the selection pool. If ‘auto’, epsilon is set at the start of each selection event to be equal to the Median Absolute Deviation of each error.
Returns: individual : Individual
An individual from the population selected using epsilon lexicase selection.
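To make the role of epsilon concrete, the sketch below computes a Median Absolute Deviation for the errors on one fitness case and keeps every individual whose error is within that distance of the best. Both helper names are assumptions for this example, and whether pyshgp computes the deviation in exactly this way is not confirmed by this page.

```python
from statistics import median


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    values = list(values)
    center = median(values)
    return median([abs(v - center) for v in values])


def survivors_within_epsilon(case_errors, epsilon):
    """Indices whose error on one case is within epsilon of the best error."""
    best = min(case_errors)
    return [i for i, err in enumerate(case_errors) if err <= best + epsilon]


# Errors of five hypothetical individuals on a single fitness case.
case_errors = [0.0, 1.0, 5.0, 9.0, 10.0]
auto_epsilon = median_absolute_deviation(case_errors)
pool = survivors_within_epsilon(case_errors, auto_epsilon)
strict_pool = survivors_within_epsilon(case_errors, 0)
```

With epsilon set to 0 the filter keeps only individuals tied for the best error on the case, which is the behavior of plain lexicase selection.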

evaluate_by_dataset
(X, y, mode, pool=None)¶ Evaluates the population based on the specified mode.
Parameters: X : {array-like, sparse matrix}, shape = (n_samples, n_features)
Samples.
y : {array-like, sparse matrix}, shape = (n_samples, 1)
Target values.
mode : str
Valid options include “regression” and “classification”.
pool : pathos.multiprocessing.Pool, optional
Pool of processes to evaluate in parallel.

evaluate_by_function
(error_function, pool=None)¶ Evaluates every individual in the population, if the individual has not been previously evaluated.
Parameters: error_function : function
The error function, which takes a Push program as input and returns an error vector.
pool : pathos.multiprocessing.Pool, optional
Pool of processes to evaluate in parallel.

lexicase_selection
()¶ Returns an individual that does the best on the fitness cases when considered one at a time in random order.
http://faculty.hampshire.edu/lspector/pubs/lexicaseIEEETEC.pdf
Returns: individual : Individual
An individual from the population selected using lexicase selection.
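A minimal sketch of lexicase selection over plain error vectors is given below. It follows the description above (cases considered one at a time in random order, keeping only the best performers on each); the function name and the use of list indices instead of Individual objects are assumptions for this example.

```python
import random


def lexicase_select(error_vectors, rng=random):
    """Return the index of the individual chosen by lexicase selection."""
    candidates = list(range(len(error_vectors)))
    cases = list(range(len(error_vectors[0])))
    rng.shuffle(cases)  # consider the fitness cases in random order
    for case in cases:
        if len(candidates) <= 1:
            break
        best = min(error_vectors[i][case] for i in candidates)
        # Keep only candidates that are elite on this case.
        candidates = [i for i in candidates if error_vectors[i][case] == best]
    # Break any remaining ties at random.
    return rng.choice(candidates)


# Individuals 0 and 1 are each a specialist on one case; individual 2 is
# mediocre on both and can never be elite on the first case considered.
errors = [[0, 9], [9, 0], [5, 5]]
picks = {lexicase_select(errors, random.Random(seed)) for seed in range(50)}
```

The example shows why lexicase selection tends to preserve specialists: an individual with a middling total error is filtered out as soon as any case is examined.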

lowest_error
()¶ Returns: The lowest total error found in the population.

select
(method='lexicase', epsilon='auto', tournament_size=7, cap=2)¶ Selects an individual from the population with the given selection method.
Parameters: method : str, optional (default=’lexicase’)
The selection method to be used when selecting parents. Supported options are ‘lexicase’, ‘epsilon_lexicase’, and ‘tournament’.
epsilon : int, str, optional (default=’auto’)
The value of epsilon when using ‘epsilon_lexicase’ as the selection method. If ‘auto’, epsilon is set to be equal to the Median Absolute Deviation of each error.
tournament_size : int, optional (default=7)
The size of each tournament when using ‘tournament’ selection.

tournament_selection
(tournament_size=7)¶ Returns the individual with the lowest error within a random tournament.
Parameters: tournament_size : int, optional (default=7)
Size of each tournament.
Returns: individual : Individual
An individual from the population selected using tournament selection.
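Tournament selection can be sketched in a few lines. The version below draws entrants without replacement and works on a list of total errors; both choices, as well as the function name, are assumptions for this example rather than confirmed pyshgp behavior.

```python
import random


def tournament_select(total_errors, tournament_size=7, rng=random):
    """Return the index of the lowest-error entrant in a random tournament."""
    size = min(tournament_size, len(total_errors))
    entrants = rng.sample(range(len(total_errors)), size)
    return min(entrants, key=lambda i: total_errors[i])


total_errors = [4.0, 1.5, 3.0, 0.5, 2.0]
# A tournament that includes everyone always returns the overall best.
full_winner = tournament_select(total_errors, tournament_size=5,
                                rng=random.Random(0))
# A tournament of size 1 is equivalent to uniform random selection.
single_winner = tournament_select(total_errors, tournament_size=1,
                                  rng=random.Random(0))
```

Larger tournaments increase selection pressure, since the overall best individual is more likely to be among the entrants.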

unique
()¶ Returns: The number of unique programs found in the population.

Simplification¶
The simplification
module contains functions that help when
automatically simplifying Push genomes and Push programs.
TODO: function parameter docstrings

pyshgp.gp.simplification.
choice
(a, size=None, replace=True, p=None)¶ Generates a random sample from a given 1D array
The parameters, return value, and examples are identical to those listed for pyshgp.gp.base.choice in the Base section.

pyshgp.gp.simplification.
noop_n_random_genes
(genome, n)¶ Returns a new genome that is identical to input genome, with n genes replaced with noop instructions.
Parameters: genome : list of Genes
List of Plush genes.
n : int
Number of genes to switch to noop.
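The idea can be sketched with a stand-in gene type. The Gene dataclass below, its field names, and the "noop" instruction string are all hypothetical; the actual Plush gene representation in pyshgp may differ.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Gene:
    """Hypothetical stand-in for a Plush gene."""
    instruction: str
    closes: int = 0
    is_silent: bool = False


def noop_n_random_genes_sketch(genome, n, rng=random):
    """Return a copy of genome with n randomly chosen genes set to noop."""
    new_genome = list(genome)  # the input genome is left untouched
    count = min(n, len(genome))
    for i in rng.sample(range(len(genome)), count):
        new_genome[i] = replace(new_genome[i], instruction="noop")
    return new_genome


genome = [Gene("integer_add"), Gene("integer_dup"), Gene("float_mult"),
          Gene("boolean_not"), Gene("exec_if", closes=1)]
simplified = noop_n_random_genes_sketch(genome, 2, random.Random(0))
```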

pyshgp.gp.simplification.
randint
(low, high=None, size=None, dtype='l')¶ Return random integers from low (inclusive) to high (exclusive).
Return random integers from the “discrete uniform” distribution of the specified dtype in the “half-open” interval [low, high). If high is None (the default), then results are from [0, low).
Parameters: low : int
Lowest (signed) integer to be drawn from the distribution (unless high=None, in which case this parameter is one above the highest such integer).
high : int, optional
If provided, one above the largest (signed) integer to be drawn from the distribution (see above for behavior if high=None).
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
dtype : dtype, optional
Desired dtype of the result. All dtypes are determined by their name, i.e., ‘int64’, ‘int’, etc, so byteorder is not available and a specific precision may have different C types depending on the platform. The default value is ‘np.int’.
New in version 1.11.0.
Returns: out : int or ndarray of ints
size-shaped array of random integers from the appropriate distribution, or a single such random int if size not provided.
See also
random.random_integers
similar to randint, only for the closed interval [low, high], and 1 is the lowest value if high is omitted. In particular, this other one is the one to use to generate uniformly distributed discrete non-integers.
Examples
>>> np.random.randint(2, size=10)
array([1, 0, 0, 0, 1, 1, 0, 0, 1, 0])
>>> np.random.randint(1, size=10)
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
Generate a 2 x 4 array of ints between 0 and 4, inclusive:
>>> np.random.randint(5, size=(2, 4))
array([[4, 0, 2, 1],
       [3, 2, 2, 0]])

pyshgp.gp.simplification.
silent_n_random_genes
(genome, n)¶ Returns a new genome that is identical to input genome, with n genes marked as silent.
Parameters: genome : list of Genes
List of Plush genes.
n : int
Number of genes to switch to silent.

pyshgp.gp.simplification.
simplify_by_dataset
(individual, X, y, mode, steps=1000, verbose=0)¶ Simplifies the genome (and program) of the individual based on a dataset by randomly removing some elements of the program and confirming that the total error remains the same or lower. This is achieved by silencing some genes in the individual’s genome.
Parameters: individual : Individual
The individual to simplify.
X : {array-like, sparse matrix}, shape = (n_samples, n_features)
Samples.
y : {array-like, sparse matrix}, shape = (n_samples, 1)
Labels.
mode : str
Valid options include “regression” and “classification”
steps : int, optional (default=1000)
Number of simplification steps to perform.
verbose : int, optional (default=0)
When greater than 0, verbose printing is enabled.

pyshgp.gp.simplification.
simplify_by_function
(individual, error_function, steps=1000, verbose=0)¶ Simplifies the genome (and program) of the individual based on a function by randomly removing some elements of the program and confirming that the total error remains the same or lower. This is achieved by silencing some genes in the individual’s genome.
Parameters: individual : Individual
The individual to simplify.
error_function : function
Error function used to evaluate the individual’s program.
steps : int, optional (default=1000)
Number of simplification steps to perform.
verbose : int, optional (default=0)
When greater than 0, verbose printing is enabled.
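The simplification procedure described above is a form of hill climbing, which can be sketched without any pyshgp types. In this toy version a genome is a list of numbers, the "program" is the list of non-silenced genes, and between 1 and 3 active genes are silenced per step; a change is kept only if the total error does not increase. All names are hypothetical and the real functions operate on Individuals rather than lists.

```python
import random


def simplify_sketch(genome, error_function, steps=1000, rng=random):
    """Silence genes at random, keeping changes that do not raise the error."""
    silent = [False] * len(genome)

    def express(mask):
        # The expressed program is every gene that is not silenced.
        return [g for g, s in zip(genome, mask) if not s]

    best_error = sum(error_function(express(silent)))
    for _ in range(steps):
        active = [i for i, s in enumerate(silent) if not s]
        if not active:
            break
        n = rng.randint(1, min(3, len(active)))
        candidate = list(silent)
        for i in rng.sample(active, n):
            candidate[i] = True
        candidate_error = sum(error_function(express(candidate)))
        if candidate_error <= best_error:
            silent, best_error = candidate, candidate_error
    return express(silent)


def error_function(program):
    """Toy error vector: distance of the program's sum from 10."""
    return [abs(sum(program) - 10)]


simplified = simplify_sketch([4, 0, 6, 0, 0], error_function,
                             steps=200, rng=random.Random(0))
```

In this toy problem the zeros contribute nothing to the output, so they can be silenced freely, while silencing 4 or 6 would raise the error and is always rejected.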

pyshgp.gp.simplification.
simplify_once
(genome)¶ Silences or noops between 1 and 3 random genes.
Parameters: genome : list of Genes
List of Plush genes.
Variation¶
The variation
module defines classes for variation operators (aka
genetic operators). These operators are used in evolution to create new
children from selected parents.

class
pyshgp.gp.variation.
Alternation
(rate=0.01, alignment_deviation=10)¶ Uniformly alternates between the two parents.
More information can be found on this PushRedux page.
Parameters: rate : float, optional (default=0.01)
The probability of switching which parent program elements are being copied from. Must be 0 <= rate <= 1. Defaults to 0.01.
alignment_deviation : int, optional (default=10)
The standard deviation of how far alternation may jump between indices when switching between parents.

produce
(parents, spawner=None)¶ Produces a child using the Alternation operator.
Parameters: parents : list of Individuals
A list of parents to use when producing the child.
spawner : pyshgp.push.spawn.Spawner, optional
A spawner that can be used to create random Push code. Not used by this operator.
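One way to read the rate and alignment_deviation parameters is shown in the sketch below, which walks an index along the current parent, copies genes, and occasionally switches parents while jumping the index by a Gaussian offset. This loosely follows the alternation operator as commonly described for PushGP; the function name, the iteration budget, and the use of plain lists are assumptions, and pyshgp may differ in detail.

```python
import random


def alternation_sketch(genome_a, genome_b, rate=0.01,
                       alignment_deviation=10, rng=random):
    """Build a child by alternating between two parent genomes."""
    parents = (genome_a, genome_b)
    current = rng.randrange(2)  # which parent is being copied from
    child = []
    i = 0
    # Assumed safeguard: a step budget guarantees that the loop terminates.
    budget = len(genome_a) + len(genome_b)
    while i < len(parents[current]) and budget > 0:
        if rng.random() < rate:
            # Switch parents and shift the read position by a noisy offset.
            jump = round(rng.gauss(0, alignment_deviation))
            i = max(0, i + jump)
            current = 1 - current
        else:
            child.append(parents[current][i])
            i += 1
        budget -= 1
    return child


a = ["a0", "a1", "a2", "a3", "a4", "a5"]
b = ["b0", "b1", "b2", "b3", "b4", "b5"]
mixed = alternation_sketch(a, b, rate=0.3, alignment_deviation=1,
                           rng=random.Random(0))
clone = alternation_sketch(a, b, rate=0.0, rng=random.Random(0))
```

With a rate of 0 the child is a copy of whichever parent was picked first, and higher rates produce more frequent switches between the two genomes.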


class
pyshgp.gp.variation.
FlipBooleanMutation
(rate=0.01)¶ Randomly flips the boolean literal genes.

class
pyshgp.gp.variation.
Genesis
(max_genome_size)¶ Creates an entirely new (and random) genome.

class
pyshgp.gp.variation.
LiteralMutation
(pysh_type, rate=0.01)¶ Base class for all constant mutators.

produce
(parents, spawner)¶ Produces a child by mutating some literal genes in the parent.
Parameters: parents : list of Individuals
A list of parents to use when producing the child.
spawner : pyshgp.push.spawn.Spawner
A spawner that can be used to create random Push code.


class
pyshgp.gp.variation.
PerturbCloseMutation
(rate=0.01, standard_deviation=1)¶ Randomly perturbs the number of close markers on each gene.

produce
(parents, spawner=None)¶ Produces a child by perturbing the number of close markers on some genes in the parent.
Parameters: parents : list of Individuals
A list of parents to use when producing the child.
spawner : pyshgp.push.spawn.Spawner
A spawner that can be used to create random Push code.
Returns: A child Individual.


class
pyshgp.gp.variation.
PerturbFloatMutation
(rate=0.01, standard_deviation=1)¶ Randomly perturbs the genes containing float literals.
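A rate-gated Gaussian perturbation, which is one plausible reading of the rate and standard_deviation parameters, can be sketched as follows. The genome here is a plain list mixing float literals with other values; that representation and the function name are assumptions for this example.

```python
import random


def perturb_floats_sketch(genome, rate=0.01, standard_deviation=1, rng=random):
    """Add Gaussian noise to each float literal with probability rate."""
    child = []
    for gene in genome:
        if isinstance(gene, float) and rng.random() < rate:
            child.append(gene + rng.gauss(0, standard_deviation))
        else:
            child.append(gene)  # non-float genes are copied unchanged
    return child


parent = [1.5, "integer_add", 2.0, True, 7]
child = perturb_floats_sketch(parent, rate=1.0, rng=random.Random(1))
unchanged = perturb_floats_sketch(parent, rate=0.0)
```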

class
pyshgp.gp.variation.
PerturbIntegerMutation
(rate=0.01, standard_deviation=1)¶ Randomly perturbs the genes containing integer literals.

class
pyshgp.gp.variation.
RandomAdditionMutation
(rate=0.01)¶ Randomly adds new genes.

produce
(parents, spawner)¶ Produces a child by randomly adding new genes to the parent’s genome.
Parameters: parents : list of Individuals
A list of parents to use when producing the child.
spawner : pyshgp.push.spawn.Spawner
A spawner that can be used to create random Push code.


class
pyshgp.gp.variation.
RandomDeletionMutation
(rate=0.01)¶ Randomly removes some genes.

produce
(parents, spawner)¶ Produces a child by randomly removing some genes from the parent’s genome.
Parameters: parents : list of Individuals
A list of parents to use when producing the child.
spawner : pyshgp.push.spawn.Spawner
A spawner that can be used to create random Push code.


class
pyshgp.gp.variation.
RandomReplaceMutation
(rate=0.01)¶ Randomly replaces genes.

produce
(parents, spawner)¶ Produces a child by randomly replacing some genes in the parent’s genome.
Parameters: parents : list of Individuals
A list of parents to use when producing the child.
spawner : pyshgp.push.spawn.Spawner
A spawner that can be used to create random Push code.


class
pyshgp.gp.variation.
Reproduction
¶ Clones the parent genome.

class
pyshgp.gp.variation.
TweakStringMutation
(rate=0.01, char_tweak_rate=0.1)¶ Randomly tweaks the string values in string literal genes.

class
pyshgp.gp.variation.
UniformMutation
(rate=0.01, literal_tweak_rate=0.5, float_standard_deviation=1.0, int_standard_deviation=1.0, string_char_tweak_rate=0.1)¶ A simple mutation operator that mutates all genes.

produce
(parents, spawner)¶ Produces a child by mutating genes in the parent.
Parameters: parents : list of Individuals
A list of parents to use when producing the child.
spawner : pyshgp.push.spawn.Spawner
A spawner that can be used to create random Push code.


class
pyshgp.gp.variation.
VariationOperator
(num_parents)¶ The base class for all variation operators.
Parameters: num_parents : int
Number of parent Individuals the operator needs to produce a child Individual.

produce
(parents, spawner)¶ Produces a child.


class
pyshgp.gp.variation.
VariationOperatorPipeline
(operators)¶ Variation operator that chains together other variation operators.
Parameters: operators : list of VariationOperators
A list of operators to apply in order to produce the child Individual.

produce
(parents, spawner)¶ Produces a child using the VariationOperatorPipeline.
Parameters: parents : list of Individuals
A list of parents to use when producing the child.
spawner : pyshgp.push.spawn.Spawner
A spawner that can be used to create random Push code.
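Chaining operators in order can be sketched as follows. The two toy operators and the PipelineSketch class are hypothetical stand-ins that work on plain lists; the way the intermediate child is passed on as the first parent is an assumption about how such a pipeline could work, not a description of the pyshgp source.

```python
class DropLast:
    """Toy operator: remove the last gene of the first parent."""
    def produce(self, parents, spawner=None):
        return parents[0][:-1]


class Reverse:
    """Toy operator: reverse the genes of the first parent."""
    def produce(self, parents, spawner=None):
        return parents[0][::-1]


class PipelineSketch:
    """Apply a list of operators in order, feeding each result forward."""
    def __init__(self, operators):
        self.operators = list(operators)

    def produce(self, parents, spawner=None):
        child = parents[0]
        for operator in self.operators:
            # The running child replaces the first parent at every stage.
            child = operator.produce([child] + list(parents[1:]), spawner)
        return child


pipeline = PipelineSketch([DropLast(), Reverse()])
child = pipeline.produce([[1, 2, 3, 4]])
swapped = PipelineSketch([Reverse(), DropLast()]).produce([[1, 2, 3, 4]])
```

The two results differ, which illustrates that the order of operators in the list matters.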
