Transforms
Transforms make changes to a domain to improve optimisation quality or adapt a domain to the constraints of a strategy.
class summit.strategies.base.Transform(domain, **kwargs)
Pre/post-processing of data for strategies.
- Parameters
domain (Domain) – The domain used by the strategy.
transform_descriptors (bool, optional) – Transform the descriptors into continuous variables. Default True.
Notes
This class can be overridden to create custom transformations as necessary.
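As a rough illustration of overriding, the sketch below subclasses a small stand-in base class rather than the real Transform, so that it runs without Summit installed. The stand-in class, the dictionary-based domain, the list-of-rows data and the PercentToFraction subclass are all assumptions made for this example; only the method names transform_inputs_outputs and un_transform come from this page.

```python
class TransformStandIn:
    """Hypothetical stand-in for summit.strategies.base.Transform (not the real class)."""

    def __init__(self, domain, **kwargs):
        # Here the "domain" is just a dict listing input and objective names.
        self.domain = domain

    def transform_inputs_outputs(self, rows, **kwargs):
        # Default behaviour: split each row into inputs and objectives, unchanged.
        inputs = [{k: r[k] for k in self.domain["inputs"]} for r in rows]
        outputs = [{k: r[k] for k in self.domain["objectives"]} for r in rows]
        return inputs, outputs

    def un_transform(self, rows, **kwargs):
        # Default behaviour: return a copy of the data, unchanged.
        return [dict(r) for r in rows]


class PercentToFraction(TransformStandIn):
    """Custom transform: rescale objectives from percent (0-100) to fractions (0-1)."""

    def transform_inputs_outputs(self, rows, **kwargs):
        inputs, outputs = super().transform_inputs_outputs(rows, **kwargs)
        outputs = [{k: v / 100.0 for k, v in o.items()} for o in outputs]
        return inputs, outputs

    def un_transform(self, rows, **kwargs):
        restored = super().un_transform(rows, **kwargs)
        for r in restored:
            for k in self.domain["objectives"]:
                if k in r:
                    r[k] = r[k] * 100.0  # undo the rescaling
        return restored


domain = {"inputs": ["temperature"], "objectives": ["yield_"]}
transform = PercentToFraction(domain)
inputs, outputs = transform.transform_inputs_outputs([{"temperature": 60, "yield_": 50}])
restored = transform.un_transform([{"temperature": 60, "yield_": 0.5}])
```

The pattern to note is that both directions are overridden together, so that whatever the strategy sees in transformed space can be mapped back to the original representation.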
Below we describe the transforms available in Summit.
Chimera
class summit.strategies.base.Chimera(domain: summit.domain.Domain, hierarchy: dict, softness=0.001, absolutes=None)
Scalarise a multiobjective problem using Chimera.
Chimera is a hierarchical multiobjective scalarisation function. You set the tolerances in the hierarchy parameter to weight the importance of each objective.
- Parameters
domain (Domain) – The domain used by the strategy.
hierarchy (dict) – Dictionary with keys as the names of the objectives and values as dictionaries with the keys “hierarchy” and “tolerance” for the ranking and tolerance, respectively, on each objective. The hierarchy is indexed from zero (i.e., 0, 1, 2, etc.) with zero being the highest priority objective. A smaller tolerance means that the objective will be weighted more, while a larger tolerance indicates that the objective will be weighted less. The tolerance must be between zero and one.
softness (float, optional) – Smoothing parameter. Defaults to 1e-3 as recommended by Häse et al. Larger values result in a smoother objective, while smaller values give a more disjointed objective.
absolutes (array-like, optional) – Default is zeros.
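To make the expected shape of the hierarchy dictionary concrete, here is a minimal sketch that checks a dictionary against the rules described above. The helper validate_hierarchy is hypothetical and not part of Summit. It also assumes that every objective must appear in the dictionary, that ranks have no gaps, and that the tolerance bounds are inclusive, none of which is stated on this page.

```python
def validate_hierarchy(hierarchy, objective_names):
    """Check a hierarchy dictionary (hypothetical helper, not part of Summit)."""
    # Assumption: every objective needs exactly one entry.
    if set(hierarchy) != set(objective_names):
        raise ValueError("hierarchy keys must match the objective names")
    # Ranks are indexed from zero; assumption: consecutive with no gaps.
    ranks = sorted(spec["hierarchy"] for spec in hierarchy.values())
    if ranks != list(range(len(hierarchy))):
        raise ValueError("hierarchy ranks must be 0, 1, 2, ... with no gaps")
    # Assumption: tolerance bounds are inclusive at both ends.
    for name, spec in hierarchy.items():
        if not 0 <= spec["tolerance"] <= 1:
            raise ValueError(f"tolerance for {name!r} must be between zero and one")
    # Return the objective names ordered from highest to lowest priority.
    return sorted(hierarchy, key=lambda name: hierarchy[name]["hierarchy"])


hierarchy = {
    "yield_": {"hierarchy": 0, "tolerance": 0.5},
    "de": {"hierarchy": 1, "tolerance": 1.0},
}
order = validate_hierarchy(hierarchy, ["yield_", "de"])
```

Here yield_ has rank zero, so it is the highest priority objective and comes first in the returned order.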
Examples
>>> from summit.domain import *
>>> from summit.strategies import SNOBFIT
>>> from summit.strategies.base import Chimera
>>> from summit.utils.dataset import DataSet
>>> # Create domain
>>> domain = Domain()
>>> domain += ContinuousVariable(name="temperature", description="reaction temperature in celsius", bounds=[50, 100])
>>> domain += ContinuousVariable(name="flowrate_a", description="flow of reactant a in mL/min", bounds=[0.1, 0.5])
>>> domain += ContinuousVariable(name="flowrate_b", description="flow of reactant b in mL/min", bounds=[0.1, 0.5])
>>> domain += ContinuousVariable(name="yield_", description="", bounds=[0, 100], is_objective=True, maximize=True)
>>> domain += ContinuousVariable(name="de", description="diastereomeric excess", bounds=[0, 100], is_objective=True, maximize=True)
>>> # Previous reactions
>>> columns = [v.name for v in domain.variables]
>>> values = {("temperature", "DATA"): 60, ("flowrate_a", "DATA"): 0.5, ("flowrate_b", "DATA"): 0.5, ("yield_", "DATA"): 50, ("de", "DATA"): 90}
>>> previous_results = DataSet([values], columns=columns)
>>> # Multiobjective transform
>>> hierarchy = {"yield_": {"hierarchy": 0, "tolerance": 0.5}, "de": {"hierarchy": 1, "tolerance": 1.0}}
>>> transform = Chimera(domain, hierarchy=hierarchy)
>>> strategy = SNOBFIT(domain, transform=transform)
>>> next_experiments = strategy.suggest_experiments(5, previous_results)
Notes
The original paper on Chimera is reference [1].
This code is based on the code for Gryffin [2], which can be found on GitHub.
Chimera turns problems into minimisation problems. This is done automatically by reading the type of objective from the domain.
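One common way to turn a maximisation objective into a minimisation objective is to negate its values. The sketch below illustrates that idea only. The function to_minimisation is hypothetical, and this page does not state whether Summit performs the conversion in exactly this way.

```python
def to_minimisation(values, maximize):
    """Negate the values of an objective that is to be maximised (illustrative helper)."""
    # For a maximised objective, the largest value becomes the smallest after negation.
    return [-v for v in values] if maximize else list(values)


# yield_ is maximised in the example domain, so it is negated.
yield_min = to_minimisation([50.0, 80.0], maximize=True)
# An objective that is already minimised is left unchanged.
cost_min = to_minimisation([3.0, 1.5], maximize=False)
```

After the conversion, the best yield (80) corresponds to the smallest value in yield_min, so a single minimiser can handle both kinds of objective.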
References
[1] Häse, F., Roch, L. M., & Aspuru-Guzik, A. “Chimera: enabling hierarchy based multi-objective optimization for self-driving laboratories.” Chemical Science, 2018, 9, 7642-7655.
[2] Häse, F., Roch, L. M., & Aspuru-Guzik, A. “Gryffin: An algorithm for Bayesian optimization for categorical variables informed by physical intuition with applications to chemistry.” arXiv preprint arXiv:2003.12127, 2020.
transform_inputs_outputs(ds, copy=True, **kwargs)
Transform data into inputs and outputs for a strategy.
This will apply the Chimera scalarisation to the objectives (outputs).
- Parameters
ds (DataSet) – Dataset with columns corresponding to the inputs and objectives of the domain.
copy (bool, optional) – Copy the dataset internally. Defaults to True.
transform_descriptors (bool, optional) – Transform the descriptors into continuous variables. Default True.
- Returns
The input and output datasets.
- Return type
inputs, outputs
un_transform(ds, **kwargs)
Transform data back into its original representation after the strategy is finished.
- Parameters
ds (DataSet) – Dataset with columns corresponding to the inputs and objectives of the domain.
transform_descriptors (bool, optional) – Transform the descriptors into continuous variables. Default True.
standardize_inputs (bool, optional) – Standardize all input continuous variables. Default is False.
standardize_outputs (bool, optional) – Standardize all output continuous variables. Default is False.
categorical_method (str or None, optional) – The method for transforming categorical variables. Either “one-hot”, “descriptors” or None. Descriptors must be included in the categorical variables for the latter. Default is None.
Notes
Override this method in a subclass to implement custom untransformations.
MultitoSingleObjective
class summit.strategies.base.MultitoSingleObjective(domain: summit.domain.Domain, expression: str, maximize=True)
Transform a multiobjective problem into a single objective problem.
- Parameters
domain (Domain) – The domain used by the strategy.
expression (str) – An expression in terms of variable names used to convert the multiobjective problem into a single objective problem.
- Raises
ValueError – If domain does not have at least two objectives
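The idea of an expression written in terms of variable names can be illustrated with plain Python. The sketch below evaluates such an expression against one row of objective values. The helper combine_objectives is hypothetical and is not how Summit necessarily evaluates the expression internally.

```python
def combine_objectives(expression, row):
    """Evaluate an arithmetic expression using the row's keys as variable names."""
    # Builtins are disabled so that only names present in `row` resolve.
    # This is not a full sandbox, so only evaluate expressions you trust.
    return eval(expression, {"__builtins__": {}}, dict(row))


row = {"yield_": 50, "de": 90}
score = combine_objectives("(yield_+de)/2", row)
```

With the values from the example data (a yield of 50 and a diastereomeric excess of 90), the expression "(yield_+de)/2" gives a single objective value of 70.0.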
Examples
>>> from summit.domain import *
>>> from summit.strategies import SNOBFIT, MultitoSingleObjective
>>> from summit.utils.dataset import DataSet
>>> # Create domain
>>> domain = Domain()
>>> domain += ContinuousVariable(name="temperature", description="reaction temperature in celsius", bounds=[50, 100])
>>> domain += ContinuousVariable(name="flowrate_a", description="flow of reactant a in mL/min", bounds=[0.1, 0.5])
>>> domain += ContinuousVariable(name="flowrate_b", description="flow of reactant b in mL/min", bounds=[0.1, 0.5])
>>> domain += ContinuousVariable(name="yield_", description="", bounds=[0, 100], is_objective=True, maximize=True)
>>> domain += ContinuousVariable(name="de", description="diastereomeric excess", bounds=[0, 100], is_objective=True, maximize=True)
>>> # Previous reactions
>>> columns = [v.name for v in domain.variables]
>>> values = {("temperature", "DATA"): 60, ("flowrate_a", "DATA"): 0.5, ("flowrate_b", "DATA"): 0.5, ("yield_", "DATA"): 50, ("de", "DATA"): 90}
>>> previous_results = DataSet([values], columns=columns)
>>> # Multiobjective transform
>>> transform = MultitoSingleObjective(domain, expression="(yield_+de)/2", maximize=True)
>>> strategy = SNOBFIT(domain, transform=transform)
>>> next_experiments = strategy.suggest_experiments(5, previous_results)
transform_inputs_outputs(ds, **kwargs)
Transform data into inputs and outputs for a strategy.
This will apply the multi-to-single-objective transform.
- Parameters
ds (DataSet) – Dataset with columns corresponding to the inputs and objectives of the domain.
copy (bool, optional) – Copy the dataset internally. Defaults to True.
transform_descriptors (bool, optional) – Transform the descriptors into continuous variables. Default True.
- Returns
The input and output datasets.
- Return type
inputs, outputs
un_transform(ds, **kwargs)
Transform data back into its original representation after the strategy is finished.
- Parameters
ds (DataSet) – Dataset with columns corresponding to the inputs and objectives of the domain.
transform_descriptors (bool, optional) – Transform the descriptors into continuous variables. Default True.
standardize_inputs (bool, optional) – Standardize all input continuous variables. Default is False.
standardize_outputs (bool, optional) – Standardize all output continuous variables. Default is False.
categorical_method (str or None, optional) – The method for transforming categorical variables. Either “one-hot”, “descriptors” or None. Descriptors must be included in the categorical variables for the latter. Default is None.
Notes
Override this method in a subclass to implement custom untransformations.
Log Objectives
class summit.strategies.base.LogSpaceObjectives(domain: summit.domain.Domain)
Log transform objectives.
- Parameters
domain (Domain) – The domain used by the strategy.
- Raises
ValueError – When the domain has no objectives.
Examples
>>> from summit.domain import *
>>> from summit.strategies import SNOBFIT
>>> from summit.strategies.base import LogSpaceObjectives
>>> from summit.utils.dataset import DataSet
>>> # Create domain
>>> domain = Domain()
>>> domain += ContinuousVariable(name="temperature", description="reaction temperature in celsius", bounds=[50, 100])
>>> domain += ContinuousVariable(name="flowrate_a", description="flow of reactant a in mL/min", bounds=[0.1, 0.5])
>>> domain += ContinuousVariable(name="flowrate_b", description="flow of reactant b in mL/min", bounds=[0.1, 0.5])
>>> domain += ContinuousVariable(name="yield_", description="", bounds=[0, 100], is_objective=True, maximize=True)
>>> domain += ContinuousVariable(name="de", description="diastereomeric excess", bounds=[0, 100], is_objective=True, maximize=True)
>>> # Previous reactions
>>> columns = [v.name for v in domain.variables]
>>> values = {("temperature", "DATA"): 60, ("flowrate_a", "DATA"): 0.5, ("flowrate_b", "DATA"): 0.5, ("yield_", "DATA"): 50, ("de", "DATA"): 90}
>>> previous_results = DataSet([values], columns=columns)
>>> # Log transform of the objectives
>>> transform = LogSpaceObjectives(domain)
>>> strategy = SNOBFIT(domain, transform=transform)
>>> next_experiments = strategy.suggest_experiments(5, previous_results)
to_dict(**kwargs)
Output a dictionary representation of the transform.
transform_inputs_outputs(ds, **kwargs)
Transform data into inputs and outputs for a strategy.
This will do a log transform on the objectives (outputs).
- Parameters
ds (DataSet) – Dataset with columns corresponding to the inputs and objectives of the domain.
copy (bool, optional) – Copy the dataset internally. Defaults to True.
transform_descriptors (bool, optional) – Transform the descriptors into continuous variables. Default True.
- Returns
The input and output datasets.
- Return type
inputs, outputs
un_transform(ds, **kwargs)
Untransform objectives from log space.
- Parameters
ds (DataSet) – Dataset with columns corresponding to the inputs and objectives of the domain.
copy (bool, optional) – Copy the dataset internally. Defaults to True.
transform_descriptors (bool, optional) – Transform the descriptors into continuous variables. Default True.
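The round trip between the original space and log space can be sketched with the standard library. The helpers log_objectives and exp_objectives below are hypothetical, operate on a plain dictionary instead of a DataSet, and assume the natural logarithm, which this page does not specify.

```python
import math


def log_objectives(row, objective_names):
    """Return a copy of the row with the named objectives log transformed."""
    out = dict(row)
    for name in objective_names:
        if out[name] <= 0:
            raise ValueError(f"objective {name!r} must be positive to take its log")
        out[name] = math.log(out[name])  # assumption: natural logarithm
    return out


def exp_objectives(row, objective_names):
    """Invert log_objectives by exponentiating the named objectives."""
    out = dict(row)
    for name in objective_names:
        out[name] = math.exp(out[name])
    return out


row = {"temperature": 60, "yield_": 50.0, "de": 90.0}
transformed = log_objectives(row, ["yield_", "de"])
restored = exp_objectives(transformed, ["yield_", "de"])
```

Only the objectives change in this sketch. The input column temperature passes through untouched, and exponentiating recovers the original objective values up to floating-point error.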