# Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

## [Unreleased]

### Added

### Changed

### Fixed

## v0.2.0

### Added
- Introduce cherry.nn.Policy, cherry.nn.ActionValue, and cherry.nn.StateValue.
- Algorithm class utilities for A2C, PPO, TRPO, DDPG, TD3, SAC, and DrQ/DrQv2.
- DMC examples for SAC, DrQ, and DrQv2.
- N-step return sampling in ExperienceReplay.
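
To make the new abstractions concrete, here is a minimal sketch of a policy built on the new `cherry.nn.Policy` interface. It assumes `Policy` behaves like a `torch.nn.Module` whose `forward` returns an action distribution, and the commented-out n-step sampling call uses hypothetical keyword names; check the library docs for the actual signatures.

```python
# Illustrative sketch only: assumes cherry.nn.Policy is a torch.nn.Module
# whose forward() returns an action distribution.
import torch
import cherry


class GaussianPolicy(cherry.nn.Policy):

    def __init__(self, state_size, action_size):
        super().__init__()
        self.features = torch.nn.Linear(state_size, 64)
        self.mean = torch.nn.Linear(64, action_size)
        self.log_std = torch.nn.Parameter(torch.zeros(action_size))

    def forward(self, state):
        # Map a state to a Normal distribution over actions.
        hidden = torch.relu(self.features(state))
        return torch.distributions.Normal(self.mean(hidden), self.log_std.exp())


policy = GaussianPolicy(state_size=8, action_size=2)
action = policy(torch.randn(1, 8)).sample()

replay = cherry.ExperienceReplay()
# ... append transitions during rollouts, then sample n-step returns:
# batch = replay.sample(32, nsteps=5, discount=0.99)  # hypothetical keyword names
```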

### Changed
- Discontinue most of cherry.wrappers.

### Fixed
- Fix the return value of the StateNormalizer and RewardNormalizer wrappers.
- Requirements for generating the docs.

## v0.1.4

### Fixed
- Support for torch 1.5 and the new `_parse_to` behavior in ExperienceReplay. (thanks @ManifoldFR)
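
For context, moving a replay buffer between devices goes through the same argument parsing as `torch.Tensor.to`, which is where the torch 1.5 `_parse_to` change surfaced. A minimal sketch, assuming `ExperienceReplay.to()` accepts the usual device strings and returns the moved replay:

```python
# Sketch only: assumes ExperienceReplay.to() mirrors torch.Tensor.to().
import torch
import cherry

replay = cherry.ExperienceReplay()
replay.append(
    torch.randn(4),       # state
    torch.randn(2),       # action
    torch.tensor(1.0),    # reward
    torch.randn(4),       # next state
    torch.tensor(False),  # done
)
if torch.cuda.is_available():
    replay = replay.to('cuda:0')  # device parsing relies on torch's _parse_to
replay = replay.to('cpu')
```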

## v0.1.3

### Added
- A CHANGELOG.md file.

### Changed
- Travis testing with different versions of Python (3.6, 3.7), torch (1.1, 1.2, 1.3, 1.4), and torchvision (0.3, 0.4, 0.5).

### Fixed
- Fix a bug in `torch_wrapper` when using a GPU, by calling `Tensor.cpu().detach().numpy()` to convert CUDA tensors to NumPy arrays. (@walkacross)
- Bugfix when using `td.discount` with replays coming from vectorized environments. (@galatolofederico)
- env.action_size and env.state_size when the number of vectorized environments is 1. (thanks @galatolofederico)
- Actor-critic integration test being too finicky.
- `cherry.onehot` support for numpy's float and integer types. (thanks @ngoby)
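
To make the fixed helpers concrete, the sketch below touches each of them; the exact signatures of `cherry.td.discount` and `cherry.onehot` used here are assumptions, so check the library documentation before relying on them.

```python
# Illustrative sketch; the cherry signatures below are assumptions.
import numpy as np
import torch
import cherry

# CUDA tensors must be moved to the CPU before conversion to NumPy,
# which is what the torch_wrapper fix does internally.
device = 'cuda' if torch.cuda.is_available() else 'cpu'
x = torch.randn(3, device=device)
x_np = x.cpu().detach().numpy()

# Discounted returns over a rollout (shape: time steps x num. envs).
rewards = torch.ones(5, 1)
dones = torch.zeros(5, 1)
returns = cherry.td.discount(0.99, rewards, dones, bootstrap=0.0)

# onehot accepting NumPy integer (and float) scalars.
encoded = cherry.onehot(np.int64(3), dim=5)
```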