{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Note: you may need to restart the kernel to use updated packages.\n" ] } ], "source": [ "%pip install -q -U circuitree[distributed]==0.11.1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## Defining custom Grammars using `CircuitGrammar`\n", "\n", "For search spaces that can't expressed using a grammar in `circuitree.models`, you can make a custom grammar. A grammar will have rules for how to represent each distinct `state` in the design space and how different `actions` affect the `state`. \n", "\n", "You can define a grammar by subclassing `CircuitGrammar`. `CircuitGrammar` is an *abstract* class, which means that we must define certain methods in order to use it. Here's a description of each method's call signature and what it should do.\n", "\n", "``` \n", "is_terminal(state) -> bool # Return whether or not this state is terminal\n", "get_actions(state) -> list[str] # Return a list of actions that can be taken from this state\n", "do_action(state, action) -> str # Return a new state as a result of making this move\n", "get_unique_state(state) -> str # Return a unique representation of this state \n", "```\n", "\n", "Together, these functions allow us to explore the space of possible circuits on-the-fly, without needing to enumerate every possibility.\n", "\n", "As an example, let's consider an existing design space, explored by Angela Chau and colleagues in [this seminal paper](https://dx.doi.org/10.1016/j.cell.2012.08.040) in their study of single-cell polarization circuits. The authors were studying all possible network topologies that could be made with two membrane-bound enzymes A and B that have active and inactive forms. Each species can catalyze either the forward or reverse reaction of the other species and of itself (autocatalysis). In other words, any of the four pairwise interactions can be present, and they can be either activating or inactivating (inhibiting).\n", "\n", "First, let's decide how we could represent a circuit assembly as a string of characters that we will call the `state` string. (Any `Hashable` representation can be used, but strings are convenient.) Let's use the following convention, which is the same one used by `SimpleNetworkGrammar`:\n", "* Each component is an uppercase letter. `A` refers to enzyme A and `B` refers to enzyme B.\n", "* Each pairwise interaction is represented by three characters, two uppercase letters for the components and one lowercase letter referring to the type of interaction. \n", " - For instance, `ABa` means \"A activates B\", and `BBi` means \"B inhibits B\".\n", "* The `state` string consists of two parts. separated by `::`.\n", " - The left side says which components are present (`AB`).\n", " - The right side says which interactions are present, separated by underscores (`ABa_BBi`).\n", " - A terminal `state` starts with `*`, denoting that assembly is complete and the game is over.\n", "* The allowed `action`s are\n", " - Adding a new interaction\n", " - Terminating\n", "* When adding multiple interactions, the order does not matter\n", "\n", "For instance, a circuit where A and B both inhibit each other can be written as the state string `*AB::ABi_BAi` - a fully assembled circuit (`*`) with components `A` and `B` (`AB`) that inhibit each other (`ABi_BAi`). Since order does not matter, we want `get_unique_state(\"*AB::ABi_BAi\")` and `get_unique_state(\"*AB::BAi_ABi\")` to return the same string. We can achieve this by sorting the interactions alphabetically." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from circuitree import CircuitGrammar\n", "\n", "class PolarizationGrammar(CircuitGrammar):\n", " def __init__(self, *args, **kwargs):\n", " super().__init__(*args, **kwargs)\n", " \n", " def is_terminal(self, state) -> bool:\n", " return state.startswith(\"*\")\n", " \n", " def get_actions(self, state: str) -> list[str]:\n", " # If termination hasn't happened yet, it's always a possibiilty\n", " if self.is_terminal(state):\n", " return []\n", " actions = [\"*terminate*\"]\n", " \n", " # Get the part of the string that contains the interactions\n", " interactions = state.split(\"::\")[1]\n", " \n", " # We can add an interaction between any unused pair\n", " for pair in (\"AA\", \"AB\", \"BA\", \"BB\"):\n", " if pair not in interactions:\n", " actions.append(pair + \"a\")\n", " actions.append(pair + \"i\")\n", " return actions\n", "\n", " def do_action(self, state, action):\n", " if action == \"*terminate*\":\n", " return \"*\" + state\n", "\n", " prefix, interactions = state.split(\"::\")\n", " if len(interactions) == 0:\n", " return f\"{prefix}::{action}\"\n", " else:\n", " return f\"{prefix}::{interactions}_{action}\"\n", " \n", " def get_unique_state(self, state):\n", " prefix, interactions = state.split(\"::\")\n", " if len(interactions) == 0:\n", " return state # No interactions to sort\n", " else:\n", " interactions_list = interactions.split(\"_\")\n", " interactions = \"_\".join(sorted(interactions_list))\n", " return f\"{prefix}::{interactions}\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "Here are some example inputs to each function and the expected outputs.\n", "\n", "```\n", ">>> grammar = PolarizationGrammar()\n", ">>> grammar.is_terminal(\"*AB::AAa\") # True\n", ">>> grammar.get_actions(\"AB::ABa_BAi\") # ['*terminate*', 'AAa', 'AAi', 'BBa', 'BBi']\n", ">>> grammar.do_action(\"AB::ABa_BAi\", \"AAa\") # 'AB::ABa_BAi_AAa'\n", ">>> grammar.get_unique_state(\"*AB::ABa_BAi_AAa\") # '*AB::AAa_ABa_BAi'\n", "```\n", "The original paper enumerated all the possible topologies in this space and found 81. We can check our work by expanding all the possible `state` strings and printing the terminal ones. There should be 81. To do this, we can use the function `CircuiTree.grow_tree()`, which recursively builds the entire decision tree of possible states." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "# terminal states: 81\n", "\n", "All terminal topologies:\n", "['*AB::AAi_ABi_BAi_BBi',\n", " '*AB::AAa_ABi_BAi_BBi',\n", " '*AB::ABi_BAi_BBi',\n", " '*AB::AAi_ABa_BAi_BBi',\n", " '*AB::AAa_ABa_BAi_BBi',\n", " '*AB::ABa_BAi_BBi',\n", " '*AB::AAi_BAi_BBi',\n", " '*AB::AAa_BAi_BBi',\n", " '*AB::BAi_BBi',\n", " '*AB::AAi_ABi_BAa_BBi',\n", " '*AB::AAa_ABi_BAa_BBi',\n", " '*AB::ABi_BAa_BBi',\n", " '*AB::AAi_ABa_BAa_BBi',\n", " '*AB::AAa_ABa_BAa_BBi',\n", " '*AB::ABa_BAa_BBi',\n", " '*AB::AAi_BAa_BBi',\n", " '*AB::AAa_BAa_BBi',\n", " '*AB::BAa_BBi',\n", " '*AB::AAi_ABi_BBi',\n", " '*AB::AAa_ABi_BBi',\n", " '*AB::ABi_BBi',\n", " '*AB::AAi_ABa_BBi',\n", " '*AB::AAa_ABa_BBi',\n", " '*AB::ABa_BBi',\n", " '*AB::AAi_BBi',\n", " '*AB::AAa_BBi',\n", " '*AB::BBi',\n", " '*AB::AAi_ABi_BAi_BBa',\n", " '*AB::AAa_ABi_BAi_BBa',\n", " '*AB::ABi_BAi_BBa',\n", " '*AB::AAi_ABa_BAi_BBa',\n", " '*AB::AAa_ABa_BAi_BBa',\n", " '*AB::ABa_BAi_BBa',\n", " '*AB::AAi_BAi_BBa',\n", " '*AB::AAa_BAi_BBa',\n", " '*AB::BAi_BBa',\n", " '*AB::AAi_ABi_BAa_BBa',\n", " '*AB::AAa_ABi_BAa_BBa',\n", " '*AB::ABi_BAa_BBa',\n", " '*AB::AAi_ABa_BAa_BBa',\n", " '*AB::AAa_ABa_BAa_BBa',\n", " '*AB::ABa_BAa_BBa',\n", " '*AB::AAi_BAa_BBa',\n", " '*AB::AAa_BAa_BBa',\n", " '*AB::BAa_BBa',\n", " '*AB::AAi_ABi_BBa',\n", " '*AB::AAa_ABi_BBa',\n", " '*AB::ABi_BBa',\n", " '*AB::AAi_ABa_BBa',\n", " '*AB::AAa_ABa_BBa',\n", " '*AB::ABa_BBa',\n", " '*AB::AAi_BBa',\n", " '*AB::AAa_BBa',\n", " '*AB::BBa',\n", " '*AB::AAi_ABi_BAi',\n", " '*AB::AAa_ABi_BAi',\n", " '*AB::ABi_BAi',\n", " '*AB::AAi_ABa_BAi',\n", " '*AB::AAa_ABa_BAi',\n", " '*AB::ABa_BAi',\n", " '*AB::AAi_BAi',\n", " '*AB::AAa_BAi',\n", " '*AB::BAi',\n", " '*AB::AAi_ABi_BAa',\n", " '*AB::AAa_ABi_BAa',\n", " '*AB::ABi_BAa',\n", " '*AB::AAi_ABa_BAa',\n", " '*AB::AAa_ABa_BAa',\n", " '*AB::ABa_BAa',\n", " '*AB::AAi_BAa',\n", " '*AB::AAa_BAa',\n", " '*AB::BAa',\n", " '*AB::AAi_ABi',\n", " '*AB::AAa_ABi',\n", " '*AB::ABi',\n", " '*AB::AAi_ABa',\n", " '*AB::AAa_ABa',\n", " '*AB::ABa',\n", " '*AB::AAi',\n", " '*AB::AAa',\n", " '*AB::']\n" ] } ], "source": [ "from circuitree import CircuiTree\n", "\n", "class PolarizationTree(CircuiTree):\n", " def __init__(self, *args, **kwargs):\n", " \n", " # Specify the required keyword arguments\n", " kwargs = {\"grammar\": PolarizationGrammar(), \"root\": \"AB::\"} | kwargs\n", " super().__init__(*args, **kwargs)\n", " \n", " def get_reward(self, state):\n", " raise NotImplementedError # no need to implement this\n", "\n", "\n", "from pprint import pprint\n", "\n", "tree = PolarizationTree()\n", "tree.grow_tree()\n", "\n", "terminal_states = list(tree.terminal_states)\n", "print(f\"# terminal states: {len(terminal_states)}\")\n", "print()\n", "print(\"All terminal topologies:\")\n", "pprint(terminal_states)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "Note that if your design space is large, `grow_tree()` can take an extremely long time and/or cause an out-of-memory error." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Python implementation: CPython\n", "Python version : 3.10.8\n", "IPython version : 8.24.0\n", "\n", "circuitree: 0.11.1\n", "jupyterlab: 4.1.8\n", "watermark : 2.4.3\n", "\n" ] } ], "source": [ "%load_ext watermark\n", "%watermark -v -p circuitree,jupyterlab,watermark" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] } ], "metadata": { "colab": { "provenance": [] }, "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.8" }, "orig_nbformat": 4 }, "nbformat": 4, "nbformat_minor": 0 }