Transforms

Transforms make changes to a domain to improve optimisation quality or adapt a domain to the constraints of a strategy.

class summit.strategies.base.Transform(domain, **kwargs)[source]

Pre/post-processing of data for strategies

Parameters
  • domain (Domain) – The domain being used in the strategy

  • transform_descriptors (bool, optional) – Transform the descriptors into continuous variables. Default True.

Notes

This class can be overridden to create custom transformations as necessary.
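As a rough illustration of the override pattern, the sketch below uses a hypothetical stand-in class (`SketchTransform`) rather than Summit's actual `Transform`, and plain lists of dictionaries rather than a `DataSet`. The only things borrowed from this page are the method names `transform_inputs_outputs` and `un_transform`; everything else is an assumption made for the example.

```python
class SketchTransform:
    """Hypothetical stand-in for a transform; NOT Summit's Transform class."""

    def __init__(self, objective_names):
        self.objective_names = list(objective_names)

    def transform_inputs_outputs(self, rows):
        # Split each row into input columns and objective (output) columns.
        inputs = [
            {k: v for k, v in row.items() if k not in self.objective_names}
            for row in rows
        ]
        outputs = [{k: row[k] for k in self.objective_names} for row in rows]
        return inputs, outputs

    def un_transform(self, rows):
        # The base behaviour here is a no-op copy.
        return [dict(row) for row in rows]


class ScaledYield(SketchTransform):
    """Custom transform (illustrative): rescale a percentage objective to 0-1."""

    def transform_inputs_outputs(self, rows):
        inputs, outputs = super().transform_inputs_outputs(rows)
        for out in outputs:
            out["yield_"] = out["yield_"] / 100.0
        return inputs, outputs

    def un_transform(self, rows):
        rows = super().un_transform(rows)
        for row in rows:
            if "yield_" in row:
                row["yield_"] = row["yield_"] * 100.0
        return rows


transform = ScaledYield(["yield_"])
inputs, outputs = transform.transform_inputs_outputs(
    [{"temperature": 60, "yield_": 50}]
)
restored = transform.un_transform([{"temperature": 60, "yield_": 0.5}])
```

The subclass overrides both directions so that data returned to the user is in the original units.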

Below we describe the transforms available in Summit.

Chimera

class summit.strategies.base.Chimera(domain: summit.domain.Domain, hierarchy: dict, softness=0.001, absolutes=None)[source]

Scalarise a multiobjective problem using Chimera.

Chimera is a hierarchical multiobjective scalarisation function. You set the tolerance of each objective in the hierarchy parameter to weight its importance.

Parameters
  • domain (Domain) – The domain being used in the strategy

  • hierarchy (dict) – Dictionary with keys as the names of the objectives and values as dictionaries with the keys “hierarchy” and “tolerance” for the ranking and tolerance, respectively, on each objective. The hierarchy is indexed from zero (i.e., 0, 1, 2, etc.) with zero being the highest priority objective. A smaller tolerance means that the objective will be weighted more, while a larger tolerance indicates that the objective will be weighted less. The tolerance must be between zero and one.

  • softness (float, optional) – Smoothing parameter. Defaults to 1e-3 as recommended by Häse et al. Larger values result in a more smooth objective while smaller values will give a disjointed objective.

  • absolutes (array-like, optional) – Default is zeros.
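The structure of the hierarchy dictionary described above can be checked with a small helper. This is a minimal sketch and not part of Summit: the function name `check_hierarchy` is hypothetical, and treating the tolerance bounds as inclusive and the ranks as gap-free are assumptions read into the parameter description.

```python
def check_hierarchy(hierarchy):
    """Validate a Chimera-style hierarchy dict and return the objective
    names ordered from highest priority (rank 0) to lowest.

    Hypothetical helper for illustration; not part of Summit.
    """
    ranks = sorted(spec["hierarchy"] for spec in hierarchy.values())
    # Assumption: ranks are 0, 1, 2, ... with no gaps or repeats.
    if ranks != list(range(len(hierarchy))):
        raise ValueError("hierarchy ranks must be 0, 1, 2, ... without gaps")
    for name, spec in hierarchy.items():
        # Assumption: the bounds zero and one are inclusive.
        if not 0 <= spec["tolerance"] <= 1:
            raise ValueError(f"tolerance for {name!r} must be between 0 and 1")
    return sorted(hierarchy, key=lambda name: hierarchy[name]["hierarchy"])


hierarchy = {
    "de": {"hierarchy": 1, "tolerance": 1.0},
    "yield_": {"hierarchy": 0, "tolerance": 0.5},
}
priority_order = check_hierarchy(hierarchy)  # ["yield_", "de"]
```

Here `yield_` comes first because its rank is zero, which marks it as the highest priority objective.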

Examples

>>> from summit.domain import *
>>> from summit.strategies import SNOBFIT
>>> from summit.strategies.base import Chimera
>>> from summit.utils.dataset import DataSet
>>> # Create domain
>>> domain = Domain()
>>> domain += ContinuousVariable(name="temperature",description="reaction temperature in celsius", bounds=[50, 100])
>>> domain += ContinuousVariable(name="flowrate_a", description="flow of reactant a in mL/min", bounds=[0.1, 0.5])
>>> domain += ContinuousVariable(name="flowrate_b", description="flow of reactant b in mL/min", bounds=[0.1, 0.5])
>>> domain += ContinuousVariable(name="yield_", description="", bounds=[0, 100], is_objective=True, maximize=True)
>>> domain += ContinuousVariable(name="de",description="diastereomeric excess",bounds=[0, 100],is_objective=True,maximize=True)
>>> # Previous reactions
>>> columns = [v.name for v in domain.variables]
>>> values = {("temperature", "DATA"): 60,("flowrate_a", "DATA"): 0.5,("flowrate_b", "DATA"): 0.5,("yield_", "DATA"): 50,("de", "DATA"): 90}
>>> previous_results = DataSet([values], columns=columns)
>>> # Multiobjective transform
>>> hierarchy =  {"yield_": {"hierarchy": 0, "tolerance": 0.5}, "de": {"hierarchy": 1, "tolerance": 1.0}}
>>> transform = Chimera(domain, hierarchy=hierarchy)
>>> strategy = SNOBFIT(domain, transform=transform)
>>> next_experiments = strategy.suggest_experiments(5, previous_results)

Notes

The original paper on Chimera can be found at [1].

This code is based on the code for Gryffin [2], which can be found on GitHub.

Chimera turns problems into minimisation problems. This is done automatically by reading the type of objective from the domain.
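One common way to turn a mixed set of objectives into minimisation problems is to negate every objective that is meant to be maximised. The sketch below shows that idea on plain dictionaries; the function name and the `maximize_flags` mapping are hypothetical, and whether Summit performs exactly this step internally is an assumption.

```python
def to_minimisation(values, maximize_flags):
    """Negate objectives flagged for maximisation so that all objectives
    can be minimised. Illustrative only; not Summit's implementation."""
    return {
        name: (-value if maximize_flags[name] else value)
        for name, value in values.items()
    }


# "cost" is a made-up minimisation objective added for contrast.
observed = {"yield_": 50, "de": 90, "cost": 3}
flags = {"yield_": True, "de": True, "cost": False}
minimised = to_minimisation(observed, flags)
```

Objectives already marked for minimisation pass through unchanged.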

References

[1] Häse, F., Roch, L. M., & Aspuru-Guzik, A. “Chimera: enabling hierarchy based multi-objective optimization for self-driving laboratories.” Chemical Science, 2018, 9, 7642-7655.

[2] Häse, F., Roch, L. M., & Aspuru-Guzik, A. “Gryffin: An algorithm for Bayesian optimization for categorical variables informed by physical intuition with applications to chemistry.” arXiv preprint arXiv:2003.12127, 2020.

to_dict()[source]

Output a dictionary representation of the transform

transform_inputs_outputs(ds, copy=True, **kwargs)[source]

Transform data into inputs and outputs for a strategy

This will scalarise the objectives (outputs) using Chimera.

Parameters
  • ds (DataSet) – Dataset with columns corresponding to the inputs and objectives of the domain.

  • copy (bool, optional) – Copy the dataset internally. Defaults to True.

  • transform_descriptors (bool, optional) – Transform the descriptors into continuous variables. Default True.

Returns

Datasets with the input and output datasets

Return type

inputs, outputs

un_transform(ds, **kwargs)

Transform data back into its original representation after the strategy is finished

Parameters
  • ds (DataSet) – Dataset with columns corresponding to the inputs and objectives of the domain.

  • transform_descriptors (bool, optional) – Transform the descriptors into continuous variables. Default True.

  • standardize_inputs (bool, optional) – Standardize all input continuous variables. Default is False.

  • standardize_outputs (bool, optional) – Standardize all output continuous variables. Default is False.

  • categorical_method (str or None, optional) – The method for transforming categorical variables. Either “one-hot”, “descriptors” or None. Descriptors must be included in the categorical variables for the latter. Default is None.

Notes

Override this method to achieve custom untransformations

MultitoSingleObjective

class summit.strategies.base.MultitoSingleObjective(domain: summit.domain.Domain, expression: str, maximize=True)[source]

Transform a multiobjective problem into a single objective problem

Parameters
  • domain (Domain) – The domain being used in the strategy

  • expression (str) – An expression in terms of variable names used to convert the multiobjective problem into a single objective problem

Raises

ValueError – If domain does not have at least two objectives
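To make the role of the expression concrete, the sketch below evaluates an expression string against the objective values of a single experiment. It is an illustration only: `combine_objectives` is a hypothetical helper, the use of Python's built-in `eval` is an assumption (Summit may evaluate the expression differently), and `eval` should not be used on untrusted input.

```python
def combine_objectives(row, expression, objective_names):
    """Collapse several objective values into one number by evaluating
    an expression written in terms of the objective names.

    Hypothetical helper for illustration; not Summit's implementation.
    """
    if len(objective_names) < 2:
        raise ValueError("at least two objectives are required")
    namespace = {name: row[name] for name in objective_names}
    # Builtins are disabled so only the objective names are visible.
    return eval(expression, {"__builtins__": {}}, namespace)


row = {"temperature": 60, "yield_": 50, "de": 90}
single = combine_objectives(row, "(yield_+de)/2", ["yield_", "de"])  # 70.0
```

With a yield of 50 and a diastereomeric excess of 90, the expression `(yield_+de)/2` gives a single objective value of 70.0.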

Examples

>>> from summit.domain import *
>>> from summit.strategies import SNOBFIT, MultitoSingleObjective
>>> from summit.utils.dataset import DataSet
>>> # Create domain
>>> domain = Domain()
>>> domain += ContinuousVariable(name="temperature",description="reaction temperature in celsius", bounds=[50, 100])
>>> domain += ContinuousVariable(name="flowrate_a", description="flow of reactant a in mL/min", bounds=[0.1, 0.5])
>>> domain += ContinuousVariable(name="flowrate_b", description="flow of reactant b in mL/min", bounds=[0.1, 0.5])
>>> domain += ContinuousVariable(name="yield_", description="", bounds=[0, 100], is_objective=True, maximize=True)
>>> domain += ContinuousVariable(name="de",description="diastereomeric excess",bounds=[0, 100],is_objective=True,maximize=True)
>>> # Previous reactions
>>> columns = [v.name for v in domain.variables]
>>> values = {("temperature", "DATA"): 60,("flowrate_a", "DATA"): 0.5,("flowrate_b", "DATA"): 0.5,("yield_", "DATA"): 50,("de", "DATA"): 90}
>>> previous_results = DataSet([values], columns=columns)
>>> # Multiobjective transform
>>> transform = MultitoSingleObjective(domain, expression="(yield_+de)/2", maximize=True)
>>> strategy = SNOBFIT(domain, transform=transform)
>>> next_experiments = strategy.suggest_experiments(5, previous_results)

to_dict()[source]

Output a dictionary representation of the transform

transform_inputs_outputs(ds, **kwargs)[source]

Transform data into inputs and outputs for a strategy

This applies the expression to combine the objectives into a single objective.

Parameters
  • ds (DataSet) – Dataset with columns corresponding to the inputs and objectives of the domain.

  • copy (bool, optional) – Copy the dataset internally. Defaults to True.

  • transform_descriptors (bool, optional) – Transform the descriptors into continuous variables. Default True.

Returns

Datasets with the input and output datasets

Return type

inputs, outputs

un_transform(ds, **kwargs)

Transform data back into its original representation after the strategy is finished

Parameters
  • ds (DataSet) – Dataset with columns corresponding to the inputs and objectives of the domain.

  • transform_descriptors (bool, optional) – Transform the descriptors into continuous variables. Default True.

  • standardize_inputs (bool, optional) – Standardize all input continuous variables. Default is False.

  • standardize_outputs (bool, optional) – Standardize all output continuous variables. Default is False.

  • categorical_method (str or None, optional) – The method for transforming categorical variables. Either “one-hot”, “descriptors” or None. Descriptors must be included in the categorical variables for the latter. Default is None.

Notes

Override this method to achieve custom untransformations

Log Objectives

class summit.strategies.base.LogSpaceObjectives(domain: summit.domain.Domain)[source]

Log transform objectives

Parameters

domain (Domain) – The domain being used in the strategy

Raises

ValueError – When the domain has no objectives.

Examples

>>> from summit.domain import *
>>> from summit.strategies import SNOBFIT
>>> from summit.strategies.base import LogSpaceObjectives
>>> from summit.utils.dataset import DataSet
>>> # Create domain
>>> domain = Domain()
>>> domain += ContinuousVariable(name="temperature",description="reaction temperature in celsius", bounds=[50, 100])
>>> domain += ContinuousVariable(name="flowrate_a", description="flow of reactant a in mL/min", bounds=[0.1, 0.5])
>>> domain += ContinuousVariable(name="flowrate_b", description="flow of reactant b in mL/min", bounds=[0.1, 0.5])
>>> domain += ContinuousVariable(name="yield_", description="", bounds=[0, 100], is_objective=True, maximize=True)
>>> domain += ContinuousVariable(name="de",description="diastereomeric excess",bounds=[0, 100],is_objective=True,maximize=True)
>>> # Previous reactions
>>> columns = [v.name for v in domain.variables]
>>> values = {("temperature", "DATA"): 60,("flowrate_a", "DATA"): 0.5,("flowrate_b", "DATA"): 0.5,("yield_", "DATA"): 50,("de", "DATA"): 90}
>>> previous_results = DataSet([values], columns=columns)
>>> # Log transform of the objectives
>>> transform = LogSpaceObjectives(domain)
>>> strategy = SNOBFIT(domain, transform=transform)
>>> next_experiments = strategy.suggest_experiments(5, previous_results)

to_dict(**kwargs)

Output a dictionary representation of the transform

transform_inputs_outputs(ds, **kwargs)[source]

Transform data into inputs and outputs for a strategy

This will do a log transform on the objectives (outputs).

Parameters
  • ds (DataSet) – Dataset with columns corresponding to the inputs and objectives of the domain.

  • copy (bool, optional) – Copy the dataset internally. Defaults to True.

  • transform_descriptors (bool, optional) – Transform the descriptors into continuous variables. Default True.

Returns

Datasets with the input and output datasets

Return type

inputs, outputs

un_transform(ds, **kwargs)[source]

Untransform objectives from log space

Parameters
  • ds (DataSet) – Dataset with columns corresponding to the inputs and objectives of the domain.

  • copy (bool, optional) – Copy the dataset internally. Defaults to True.

  • transform_descriptors (bool, optional) – Transform the descriptors into continuous variables. Default True.
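The pairing of a log transform with its inverse can be sketched on a plain dictionary. This is a minimal sketch, not Summit's code: the helper names are hypothetical, and the use of the natural logarithm and the rejection of non-positive values are assumptions.

```python
import math


def log_objectives(row, objective_names):
    """Return a copy of row with each objective replaced by its natural log.
    Hypothetical helper; assumes strictly positive objective values."""
    out = dict(row)
    for name in objective_names:
        if out[name] <= 0:
            raise ValueError(f"{name!r} must be positive to take its log")
        out[name] = math.log(out[name])
    return out


def unlog_objectives(row, objective_names):
    """Inverse of log_objectives: exponentiate each objective."""
    out = dict(row)
    for name in objective_names:
        out[name] = math.exp(out[name])
    return out


row = {"temperature": 60, "yield_": 50, "de": 90}
objectives = ["yield_", "de"]
logged = log_objectives(row, objectives)
restored = unlog_objectives(logged, objectives)
```

Only the objective columns are changed, and applying the inverse recovers the original values up to floating-point rounding.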