# 📣 Release Notes - 0.8.0
0.7.0 was a huge release. In 0.8.0 we address a number of issues from 0.7.0 and introduce a variety of new features as we prepare for a stable 1.0.0 release coming later this quarter!

⚠️ We highly recommend reading the 0.7.0 release notes first to learn how to migrate to the latest version!

⏰ Some statistics: this release contains 404 changed files, with 26,426 additions and 8,474 deletions.
## 💡 Summary
Type | Count |
---|---|
Breaking Changes | 3 |
Features | 13 |
Improvements | 4 |
Bug Fixes | 14 |
Documentation | 1 |
Maintenance | 1 |
## 💥 Breaking Changes

### Renaming `Runtime` to `Trainer`

We are renaming all runtime-related wording to `Trainer`.

To migrate, change:

```python
r = Runtime()
a = Agent()
r.train(a, train_cycles=5)
r.record(a, output_dir="/tmp/composabl_recordings")
```

to

```python
t = Trainer()
a = Agent()
t.train(a, train_cycles=5)
t.record(a, output_dir="/tmp/composabl_recordings")
```
### Renaming `train_iters` to `train_cycles`

We renamed `train_iters` to `train_cycles` for clarity.

To migrate, change:
```python
t = Trainer()
a = Agent()
t.train(a, train_iters=5)
t.record(a, output_dir="/tmp/composabl_recordings")
```

to

```python
t = Trainer()
a = Agent()
t.train(a, train_cycles=5)
t.record(a, output_dir="/tmp/composabl_recordings")
```
### Renaming `observation`, `observation_space` to `sensor`, `sensor_space`

We renamed the wording `observation` to `sensor` for clarity and consistency throughout the SDK.
#### Required SDK Changes

To migrate, for your Simulator implementations you have to change:

```python
async def observation_space_info(self) -> gym.Space:
    return self.env.observation_space
```

to

```python
async def sensor_space_info(self) -> gym.Space:
    return self.env.sensor_space
```

For your Perceptor, Controller, Teacher, and Coach implementations, you have to change:

```python
async def filtered_observation_space(self) -> List[str]:
```

to

```python
async def filtered_sensor_space(self) -> List[str]:
```

#### Required Sim Changes

On the simulator side, this means the interface changed:

```python
async def observation_space_info(self) -> gym.Space:
    return self.env.observation_space
```

becomes

```python
async def sensor_space_info(self) -> gym.Space:
    return self.env.sensor_space  # note: this is your sim variable; ideally it is named sensor_space as well
```
## Features

### Context Manager API

We have added a new Context Manager API to the Agent definition. It allows you to define an agent and add sensors, perceptors, and skills to it within a context manager block, the way you would naturally think about an agent.

For example, if we have the following agent structure:
```text
skill-selector-1
├── skill-1
├── skill-group-1
│   └── skill-2 -> skill-3
└── skill-selector-2
    ├── skill-4
    └── skill-5
```
Written in code, it would look like this:

```python
from composabl import Agent

# Note: Sensor, Skill, SkillGroup, SkillSelector, the *SpaceBox implementations and the
# scenarios are assumed to be imported or defined elsewhere in your project.
a = Agent()
a.add_sensors([
    Sensor("s-1", "my demo sensor #1"),
    Sensor("s-2", "my demo sensor #2"),
    Sensor("s-3", "my demo sensor #3"),
])

with SkillSelector("skill-selector-1", TeacherSpaceBox) as ss1:
    with Skill("skill-1", ControllerExpertBox) as s1:
        s1.add_scenario(scenario1)
        ss1.add_skill(s1)

    with SkillGroup() as sg1:
        sg1.set_first_skill(Skill("skill-2", TeacherSpaceBox))
        sg1.set_second_skill(Skill("skill-3", TeacherSpaceBox))
        ss1.add_skill_group(sg1)

with SkillSelector("skill-selector-2", ControllerExpertBox) as ss2:
    with Skill("skill-4", ControllerExpertBox) as s4:
        ss2.add_skill(s4)

    with Skill("skill-5", ControllerExpertBox) as s5:
        s5.add_scenario(scenario2)
        ss2.add_skill(s5)

# Passing the children is optional, as we fetch them from the selector
a.add_selector_skill(ss2, fixed_order=True, fixed_order_repeat=False)

# Here we do add them even though they are optional to prevent breaking changes
a.add_selector_skill(ss1, [s1, sg1, ss2], fixed_order=True, fixed_order_repeat=False)
```
### Goals

Gone are the days of writing complex reward functions, as we introduce a new concept called Goals. Goals are a way to define what you want the agent to achieve, and they let you express this in a more natural way. Instead of being defined on the Simulator side, goals are defined on the Skill side, making them more flexible.

Goals have three elements:

- Objective: what you want to achieve
- Variable: the variable you want to achieve the objective on; this is either a Sensor or a Perceptor
- Target: the target value you want to achieve (optional)

We currently support the following objectives:
Objective | Description | Example |
---|---|---|
Maximize | Value is maximized - push the value as high as possible. | "Maximize" (=objective) the "Speed" (=variable) |
Minimize | Value is minimized - push the value as low as possible. | "Minimize" (=objective) the "Speed" (=variable) |
Maintain | Value stays at a target or in a range - get to the target or range as quickly as possible and stay there. | "Maintain" (=objective) the "speed" (=variable) at "55" (=target) |
Avoid | Value stays away from the avoid value. | "Avoid" (=objective) a "speed" (=variable) of "55" (=target) |
Approach | Value reaches a specified target - get to the target range as quickly as possible. | "Approach" (=objective) a "speed" (=variable) of "55" (=target) |
Combining this into the agent structure, it would look like this:

```python
class BalanceTeacher(CoordinatedGoal):
    def __init__(self):
        pole_goal = MaintainGoal("pole_theta", "Maintain pole upright", target=0, stop_distance=0.418)
        cart_goal = MaintainGoal("cart_pos", "Maintain cart in the center", target=0, stop_distance=2.4)

        super().__init__([pole_goal, cart_goal], GoalCoordinationStrategy.AND)
```
### Visualize Agent (Text)

Not in the CLI, but in the SDK itself, you can now visualize the agent structure as text, allowing you to inspect the structure of the agent directly from your code:

```python
a = Agent()
a.draw_text()
```

Which will produce output such as:
```text
============================================================================
=============== Agent 'ea4f3627-fa63-4a84-9cee-bf0d440775a3' ===============
============================================================================
#Sensors: 8
#Perceptors: 0
#Skills: 3
#Skill Groups: 0
#Skill Selectors: 2
#Skill Coordinated: 0
================================== GRAPH: ==================================
skill-selector-2 (id: skill-selector-8f38c90d-df25-4396-94fc-b8ea8db8bda8)
├── skill-selector-1 (id: skill-selector-fda30e82-fd65-4571-8d91-35ae069c767c)
└── skill-3 (id: skill-e1e4ac32-311d-4db3-920b-e20f394537bd)
skill-selector-1 (id: skill-selector-fda30e82-fd65-4571-8d91-35ae069c767c)
├── skill-1 (id: skill-7868c26b-ce14-411d-8b21-e5beb803c33d)
└── skill-2 (id: skill-c9ed6cbf-3b2a-42db-b811-af6061a85a20)
skill-3 (id: skill-e1e4ac32-311d-4db3-920b-e20f394537bd)
skill-1 (id: skill-7868c26b-ce14-411d-8b21-e5beb803c33d)
skill-2 (id: skill-c9ed6cbf-3b2a-42db-b811-af6061a85a20)
============================================================================
```
### Record Agent Performance

An exciting new feature is the ability to record how well your agent performs once it has been trained! This lets you record the agent's performance on the environment as a video and GIF file, saved to `/tmp/composabl_recordings` by default.

```python
t = Trainer()
a = Agent()
t.train(a, train_cycles=5)
t.record(a, output_dir="/tmp/composabl_recordings")
```
### New Historian

After a lot of feedback from our customers, we decided that the Historian needed some love. It used to work by setting up three large dependencies (EMQX, TimescaleDB, and our Processor). This caused issues, as TimescaleDB sometimes had problems due to latency and, on top of that, is not ideal to work with when multiple training jobs are running (as well as for our no-code application coming soon).

That's why we decided to build a brand new Historian system! It is built in a similar way, where all events pass through EMQX (a large-scale MQTT broker), but instead of storing them in a database system, we now store them in a big-data flat-file format named Delta Lake. Delta Lake brings a lot of advantages, but the most important for us are:

- ACID Transactions: ensuring data integrity
- Schema Enforcement: ensuring data quality
- Time Travel (data versioning): providing data versioning and an audit history
- Flat File Storage: allowing us to store data in a flat file per training session, making it easier to maintain
- Cross-Language Support: allowing us to use Delta Lake from multiple languages

We are happy to announce that this Historian system is now available in the Composabl SDK and can be used as follows:
```bash
# Base
composabl historian start

# Specify the output path
composabl historian start --output-path /tmp/composabl-historian
```

When you now train an agent, a flat file will be created in the specified output path. This file contains all the events that happened during the training session. You can view the file location using the status command:

```bash
composabl historian status
```

which shows something like:
```text
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Service   ┃ Container Name ┃ Status  ┃ Connection Details                           ┃ Ports                                                                    ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ EMQX      │ emqx           │ running │ tcp://admin:PASS!@localhost:1883             │ 18083/tcp,1883/tcp,4370/tcp,5369/tcp,8080/tcp,8083/tcp,8084/tcp,8883/tcp │
│ Historian │ historian      │ running │ file:///tmp/composabl-historian/RUN_ID.delta │                                                                          │
└───────────┴────────────────┴─────────┴──────────────────────────────────────────────┴──────────────────────────────────────────────────────────────────────────┘
```
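Because each run is stored as a standard Delta table, you can open it with any Delta Lake reader for analysis. Below is a minimal sketch assuming the open-source `deltalake` Python package and the default output path from the status output above; the `RUN_ID` placeholder and the exact schema of the recorded events depend on your training session:

```python
# Illustrative sketch only: reading a Historian run with the open-source `deltalake` package.
# The RUN_ID placeholder and the event schema depend on your training session.
from deltalake import DeltaTable

table = DeltaTable("/tmp/composabl-historian/RUN_ID.delta")

# Load the recorded events into a pandas DataFrame for inspection
events = table.to_pandas()
print(events.head())

# Print the current table version; Delta Lake time travel lets you load earlier versions too
print(table.version())
```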
### Introduce Dry Run Mode

We are introducing a Dry Run Mode. When you run an agent with Dry Run Mode enabled, we now output the JSON of the agent. There are two ways of enabling dry run mode:

- Through environment flags:

  ```bash
  export COMPOSABL_FLAGS_DRY_RUN_ENABLED="1"
  export COMPOSABL_FLAGS_DRY_RUN_OUTPUT_DIR="/tmp/my-dir"
  export COMPOSABL_FLAGS_DRY_RUN_OUTPUT_FILE="output.json"
  ```

- Through the Trainer configuration:

  ```python
  r = Trainer({
      "target": {
          "local": {
              "address": "localhost:1337"
          }
      },
      "flags": {
          "dry_run_enabled": True,
          "dry_run_output_dir": temp_dir,
          "dry_run_output_file": "main.json"
      }
  })
  ```
### Introduce `impl_url` on Skills and Perceptors

Skills and Perceptors can now define an `impl_url` that points to the implementation of the Skill or Perceptor. This allows us to easily navigate to that implementation. From a user perspective, nothing changes and you are still able to define skills and perceptors as before:

```python
s = Skill("skill-1", "https://example.com/my-skill.tar.gz")
p = Perceptor("perceptor-1", "https://example.com/my-perceptor.tar.gz")
```
### CLI - New Commands

We have added a variety of new commands to the CLI to make it easier for you to interact with the SDK.

Command | Description |
---|---|
`composabl agent visualize agent.json` | Visualize the agent structure in a graph. |
`composabl login` | Log in to the NoCode application. |
`composabl sim new --name "My Sim" --description "This is my sim" --location "/tmp/my-path"` | Create a new Sim with a given name and description. |
`composabl sim run /tmp/my-path` | Run the sim that was created with the command above. |
`composabl skill new` | Create a new Skill with a given name, description, and implementation type. |
`composabl skill publish <path>` | Publish the skill that was created with the command above. |
`composabl perceptor new` | Create a new Perceptor with a given name, description, and implementation type. |
`composabl perceptor publish <path>` | Publish the perceptor that was created with the command above. |
`composabl version` | Show the installed version of the SDK. |
`composabl debug` | Show debug information about the SDK. |
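As a quick example, a minimal flow for scaffolding and publishing a new skill with these commands could look like the following sketch; the `./my-skill` path is an illustrative placeholder, and how `composabl skill new` collects the name, description, and implementation type is assumed here:

```bash
# Scaffold a new skill (name, description, and implementation type as described above)
composabl skill new

# After implementing the skill, publish it
# ./my-skill is a placeholder for the folder the skill was created in
composabl skill publish ./my-skill
```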
### Support Skill Groups for DRL to Controller

You can now define a Skill Group where a Teacher Skill is the first skill and a Controller Skill is the second skill. This allows you to define a Teacher Skill that is used to train the Controller Skill, which is useful in cases where you have a SetPoint scenario.

```python
agent_setpoint = Agent()
agent_setpoint.add_sensors(sensors_discrete)
agent_setpoint.add_skill(target_skill_custom_action_space)
agent_setpoint.add_skill(pass_through_skill_controller)

# Create the SkillGroup: the Teacher skill comes first, the Controller skill second
skill_group = SkillGroup(target_skill_custom_action_space, pass_through_skill_controller)
agent_setpoint.add_skill_group(skill_group=skill_group)
```
### Set Action Space and Observation Space for a Skill

You can now define the action space and observation space that a skill supports:

```python
s = Skill("my-skill", TeacherSpaceDiscrete, SkillOptions(
    action_space=spaces.Discrete(3),
    observation_space=spaces.Box(low=0, high=1, shape=(3,))
))
```
### Support for the `rollout_fragment_length` flag

You can now set the `rollout_fragment_length` flag in the Trainer configuration. It controls the length of the rollout fragment, which is useful for slow simulators.

```python
r = Trainer({
    "target": {
        "local": {
            "address": "localhost:1337"
        }
    },
    "trainer": {
        "rollout_fragment_length": 100
    }
})
```
### Support for the `sample_timeout_ms` flag

You can now set the `sample_timeout_ms` flag in the Trainer configuration. It sets the sample timeout in milliseconds, which is useful for slow simulators.

```python
r = Trainer({
    "target": {
        "local": {
            "address": "localhost:1337"
        }
    },
    "trainer": {
        "sample_timeout_ms": 100
    }
})
```
### Sim Validator

A sim validator was added that allows you to validate the sim you create. It prints out issues that should be resolved before you are able to run your sim.

Example:

```bash
composabl sim new --name "my-sim" --description "This is my sim" --location "/tmp/my-path"
composabl sim validate /tmp/my-path/my-sim
```

Prints:
````markdown
# Validation Results

## `init`

✅ No issues found

## `sensor_space_info`

✅ No issues found

**Result:**

```python
Box([ -400. -100. 0. -1000. -6.3 -3. ], [ 400. 100. 1000. 1000. 6.3 3. ], (6,), float32)
```

## `action_space_info`

✅ No issues found

**Result:**

```python
Box(-1.0, 1.0, (2,), float32)
```

## `make`

✅ No issues found

## `reset`

✅ No issues found

## `action_space_sample`

✅ No issues found

## `step`

✅ No issues found

## `close`

✅ No issues found

## `set_scenario`

✅ No issues found

## `get_scenario`

✅ No issues found

## `set_render_mode`

✅ No issues found

## `get_render_mode`

✅ No issues found

## `get_render`

- ⚠️ 'Env' object has no attribute 'get_render_frame'

**Stacktrace:**

```text
Traceback (most recent call last):
  File "/home/xanrin/composabl/sdk.composabl.ai/composabl_core/composabl_core/networking/grpc/exception_handler.py", line 23, in wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xanrin/composabl/sdk.composabl.ai/composabl_core/composabl_core/networking/grpc/server.py", line 201, in get_render
    res = await super().get_render(req)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xanrin/composabl/sdk.composabl.ai/composabl_core/composabl_core/networking/server_base.py", line 214, in get_render
    render = await self.server_impl.get_render()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/my-path/my-sim/my_sim/sim_impl.py", line 63, in get_render
    return self.env.get_render_frame()
           ^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'Env' object has no attribute 'get_render_frame'
```
````
## Improvements

- Added a normalization check to avoid double un-normalizing
- Added a check that warns the user if the sim is too slow
- Added a p-value test to see if the reward mean is improving (see the illustrative sketch below)
- We now run a sim validation step while initializing the Trainer to prevent sim issues
- We cleaned up the training pipeline for better logging output
- We reduced the error spree on the async_sync util
- We now provide default implementations for a skill's `filter` methods, making them optional
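The p-value check itself lives inside the SDK and is not shown here, but as a rough illustration of the idea (not the SDK's implementation), a one-sided test comparing an earlier and a recent window of episode rewards could look like this, assuming `scipy` is available:

```python
# Illustrative only: this is NOT the SDK's internal implementation, just a sketch of the idea
# behind testing whether the reward mean is improving.
from scipy import stats

def reward_mean_is_improving(earlier_rewards, recent_rewards, alpha=0.05) -> bool:
    """Return True if the recent episode rewards are significantly higher than the earlier ones."""
    # Welch's t-test, one-sided: H1 is mean(recent_rewards) > mean(earlier_rewards)
    result = stats.ttest_ind(recent_rewards, earlier_rewards, equal_var=False, alternative="greater")
    return result.pvalue < alpha

# Example: two windows of episode rewards taken from a training run
print(reward_mean_is_improving([1.0, 1.2, 0.9, 1.1], [1.4, 1.6, 1.5, 1.7]))
```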
## Bug Fixes

- Fixed an issue that prevented the agent from learning from experience, caused by a new API implementation in the underlying library
- Fixed an issue with action space range normalization where the agent was not able to normalize the action space range correctly due to a change in the underlying library
- Fixed an async/await issue where `.execute` was being called instead of `._execute` in the `SkillSelectorProcessor`
- Fixed an issue where Sensor mapping had issues with obs keys that were out of bounds
- Fixed an issue where Sensor mapping had issues with multiline sensor maps
- Fixed an issue that caused the agent to take the same action all the time
- Fixed an issue that didn't allow us to get the IP Address correctly on Mac
- Fixed an issue with inference not being possible when we didn't have a selector (controller only agent)
- Fixed resume training
- Fixed reward printing
- Fixed an issue with the Kubernetes sim manager that spawned multiple sims when only 1 is expected
- Fixed an issue with the Sim Manager not correctly marking sims as "recycle" when they are done
- Fixed an issue with port allocations that would cause training to stop if no port was available. We retry this a couple of times now.
- Fixed an issue with dry run not working if the directory didn't exist
## Documentation

We have heard your feedback and are excited to announce that we have completely overhauled the documentation! We have added a new CLI section, a new Building Agents section, and a new Runtime section. This should make it easier for you to get started with Composabl! Next to that, we have moved the Machine Teaching paradigm to its own webpage that you can find here.