Tag
This paper investigates memory-efficient meta-reinforcement learning architectures for adaptive safety-critical control in adversarial spacecraft proximity operations, finding that state-space models such as Mamba, combined with PPO, achieve higher task completion, improved safety, and greater fuel savings than LSTM and GRU.
This paper proposes Action-Conditioned Risk Gating, a lightweight reinforcement learning method for risk-sensitive control under partial observability that uses a compact finite-history proxy state and an action-conditioned near-term risk predictor to balance safety and performance.
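As a rough illustration of the gating idea summarized above, the following minimal sketch keeps a fixed-length window of recent observations as the proxy state and consults an action-conditioned risk estimate before choosing an action. Everything here is an assumption made for illustration and is not taken from the paper: the class name `RiskGatedPolicy`, the callables `value_fn` and `risk_fn`, the parameters `history_len` and `risk_threshold`, the hard-threshold gating rule, and the fallback to the least-risky action.

```python
from collections import deque


class RiskGatedPolicy:
    """Toy risk-gated action selection over a finite-history proxy state.

    All names and the gating rule are hypothetical; this is not the
    paper's implementation.
    """

    def __init__(self, actions, value_fn, risk_fn,
                 history_len=4, risk_threshold=0.5):
        self.actions = tuple(actions)
        self.value_fn = value_fn      # (proxy_state, action) -> estimated return
        self.risk_fn = risk_fn        # (proxy_state, action) -> near-term risk
        self.risk_threshold = risk_threshold
        # Only the last `history_len` observations are retained.
        self.history = deque(maxlen=history_len)

    def proxy_state(self):
        """Compact stand-in for the unobserved state: the recent observations."""
        return tuple(self.history)

    def act(self, observation):
        self.history.append(observation)
        state = self.proxy_state()
        risks = {a: self.risk_fn(state, a) for a in self.actions}
        # Gate: keep only actions whose predicted risk is acceptable.
        allowed = [a for a in self.actions if risks[a] <= self.risk_threshold]
        if not allowed:
            # Assumed fallback: take the least risky action available.
            return min(self.actions, key=lambda a: risks[a])
        # Among allowed actions, pick the one with the highest estimated value.
        return max(allowed, key=lambda a: self.value_fn(state, a))


# Toy 1-D setting (also an assumption): the observation is the distance to a
# hazard, and action `a` reduces that distance by `a`.
def toy_value(state, action):
    return action  # advancing is rewarded


def toy_risk(state, action):
    return 1.0 if state[-1] - action <= 1 else 0.0


policy = RiskGatedPolicy((-1, 0, 1), toy_value, toy_risk, history_len=2)
chosen = [policy.act(distance) for distance in (5, 2, 1)]
```

In this toy run the policy advances while far from the hazard, holds position once advancing would be flagged as risky, and retreats when holding is also flagged. A learned risk predictor and a soft penalty in place of the hard threshold would be natural variations.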