
📣 Release Notes - 0.4.0

Release 0.4.0 brings a host of improvements to the Composabl framework, taking us one step closer to our 1.0.0 release.

During this release we have focused on:

  • Exciting new features (see below)
  • Stability improvements
  • Logging improvements

โฐ For some statistics! This release contains 121 changed files with 5,567 additions and 1,763 deletions.

💥 Breaking Change

New Runtime Target Configuration

The target configuration has been moved to its own dedicated section. Rather than configuring it under the env key, you now use the target key. This allows us to add more configuration options in the future.

In other words, we changed:

python
{
    "env": {
        "name": "lunar_lander_sim",
        "compute": "docker",
        "config": {
            "use_gpu": False,
            "image": "composabl/sim-lunar-lander:latest",
        },
    },
    "license": "<This is your license key>",  # License key
    "training": {},
}

To:

python
{
    "target": {
        "docker": {
            "image": "composabl/sim-demo:latest",
        },
    },
    "env": {
        "name": "sim-demo",
    },
    "license": "<This is your license key>",
    "training": {},
}

🚀 Features

Skill Groups

Skill groups allow you to group skills together and run them sequentially. For example, if you have a controller that should run before a skill, you can now group the two so they execute in order.

Example:

python
from composabl import Agent, Runtime, Scenario, Sensor, Skill, SkillGroup

config = {} # normal config

runtime = Runtime(config)
agent = Agent(runtime, config)

# Note: this is a dummy representation, see examples for the full code
increment_skill = Skill("increment", IncrementTeacher)
decrement_skill = Skill("decrement", DecrementController, trainable=False)

# Initialize the Skill Group
sg = SkillGroup(increment_skill, decrement_skill)
agent.add_skill_group(sg)

agent.train(train_iters=1)

Agent Export

You can now export your agent and continue from an exported agent, allowing you to resume training from where you left off.

Example:

python
from composabl import Agent, Runtime, Scenario, Sensor, Skill, SkillGroup

config = {} # normal config
runtime = Runtime(config)
agent = Agent(runtime, config)

# ... agent definition ...

# Train initially
agent.train(train_iters=1)

# Export the agent
agent.export("/my/directory")

# Load the agent
agent.load("/my/directory")

# Continue training
agent.train(train_iters=1)

Agent Loading & Inferencing

Once an agent is exported, it can be loaded back in and used for inferencing. This allows you to use the agent in production environments.

💡 Note: this feature is under active development, and we are working on making this API easier to use in an upcoming release.

Currently this feature is named execute and is subject to change.

Example:

python
from composabl import Agent, Runtime, Scenario, Sensor, Skill, SkillGroup

config = {} # normal config
runtime = Runtime(config)
agent = Agent(runtime, config)

# ... agent definition ...

agent.train(train_iters=1)

# Prepare the agent for inferencing
# note: this can be done after export/load (see above)
trained_agent = agent.prepare()

sim = SimEnv()
for _episode_idx in range(5):
    obs, _info = sim.reset()

    for _step_index in range(100):
        action = trained_agent.execute(obs)
        obs, _reward, done, _truncated, _info = sim.step(action)

Sim Rewards are Exposed

The compute_reward method in the teacher now receives the reward from the simulator, allowing you to incorporate the simulator's reward into your own reward function.

Example:

python
def compute_reward(self, transformed_obs, action, sim_reward):
    # sim_reward is the reward returned by the simulator
    return sim_reward
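
As a sketch of how the new argument can be used, the simulator's reward can be blended with a custom shaping term. The class name, the shaping penalty, and the 0.9 weighting below are illustrative assumptions, not part of the framework:

python
# Illustrative teacher blending the simulator's reward with a custom
# shaping term; the class name and weights are examples only.
class BlendedTeacher:
    def compute_reward(self, transformed_obs, action, sim_reward):
        # Hypothetical shaping: slightly penalize large actions.
        shaping = -0.1 * abs(action)
        # Weight the simulator's reward against the shaping term.
        return 0.9 * sim_reward + shaping

teacher = BlendedTeacher()
print(teacher.compute_reward({}, 2.0, 1.0))  # ≈ 0.7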

Worker Specification

You can now specify the number of workers to use for training, allowing you to scale training up and down. To use this, set runtime.ray.workers in your Runtime configuration.

Example:

python
config = {
    "runtime": {
        "ray": {
            "workers": 4  # Specify the number of workers
        }
    },
}

runtime = Runtime(config)

Historian Iteration Sinking

Iteration information and episode rewards are now sunk to the Historian. This allows for better logging and querying of the Historian so that you can create advanced plots.

Example:

sql
SELECT
    data->>'name' AS skill_name
    , data->'result'->>'done' AS is_done
    , data->>'iteration' AS training_iteration
    , data->>'iteration_total' AS training_iteration_total
    , data->'result'->'info'->>'num_agent_steps_trained' AS steps_trained
    , data->'result'->>'episode_reward_mean' AS episode_reward_mean
    , data->'result'->>'episode_reward_min' AS episode_reward_min
    , data->'result'->>'episode_reward_max' AS episode_reward_max
    , data->'result'->>'episode_len_mean' AS episode_len_mean
    , data->'result'->>'time_total_s' AS time_total_s
    , data->'result'->'hist_stats'->>'episode_reward' AS episode_rewards
    , data->'result'->'hist_stats'->>'episode_lengths' AS episode_length
FROM events
WHERE category = 'agent' AND category_sub = 'skill-training-iteration'
ORDER BY time;
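
As a sketch, the query above can also be consumed from Python for plotting. This assumes a PostgreSQL-backed Historian reachable via psycopg2 (both assumptions; adapt to your setup). The hist_stats columns come back as JSON-encoded text, so a small decoding helper is included:

python
import json

def decode_hist_stats(text):
    # hist_stats columns (e.g. episode_reward) arrive as JSON-encoded text.
    return json.loads(text)

# Illustrative fetch -- assumes a PostgreSQL Historian and psycopg2 installed:
# import psycopg2
# with psycopg2.connect("dbname=historian") as conn, conn.cursor() as cur:
#     cur.execute(QUERY)  # the SQL query shown above
#     for *_cols, episode_rewards, _episode_lengths in cur.fetchall():
#         print(decode_hist_stats(episode_rewards))

rewards = decode_hist_stats("[1.5, 2.0, 3.25]")
print(sum(rewards) / len(rewards))  # → 2.25 (mean episode reward)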

🧪 Experimental

  • Kubernetes: Kubernetes support has been added in experimental mode. Note that this feature is still under development and is not yet documented. If you are interested, please contact us at beta@composabl.io or on Discord.

๐Ÿ› Bug Fix โ€‹

  • Actors are now reused whenever possible, avoiding wasted resources and improving load time.
  • An issue with setting scenarios has been resolved.
  • A typo in the teacher processing that crashed the runtime has been fixed.

💖 Improvements

  • Lifecycle Management: Sims are now managed in a new and improved way, increasing stability and scalability.
  • Cleaned logging: Logging was verbose; during this release we focused on making it less so. If more verbosity is required, prefix your command with LOGLEVEL=debug (e.g., LOGLEVEL=debug python src/main.py).
  • Sim Watchdog: A watchdog has been added that removes simulators when they are not in use, cleaning up resources.

๐Ÿ“ Documentation โ€‹

  • Changelog: Changelogs are now included in the documentation.
  • Versioning: Versioning has been added for easy navigation between the development build, the latest stable build, and older versions.

🔧 Maintenance

  • Ray Upgrade: Ray has been upgraded to the latest version (2.7.1).