A discussion of how harness design can reduce token costs by structuring information rather than feeding everything into a language model's context. It cites the example of an RLM agent that processes many lines of logs while keeping only a few tokens active in context.
A developer's RLM agent efficiently processes ~80k lines of CloudWatch logs, inferring the service architecture and finding issues. The developer plans to open-source it soon.
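
To make the idea of structuring information concrete, here is a minimal Python sketch of a harness that keeps raw logs outside the model's context and passes along only a compact overview plus a handful of matching lines. It is not the developer's agent, whose design is not described here. `LogStore`, `overview`, `grep`, and `build_prompt` are hypothetical names, the log lines are synthetic, and character counts stand in as a rough proxy for tokens.

```python
import re
from collections import Counter


class LogStore:
    """Holds raw log lines outside the model's context (hypothetical helper)."""

    def __init__(self, lines):
        self.lines = list(lines)

    def overview(self):
        """Return a compact summary: total line count plus counts per log level."""
        levels = Counter()
        for line in self.lines:
            match = re.search(r"\b(DEBUG|INFO|WARN|ERROR)\b", line)
            levels[match.group(1) if match else "OTHER"] += 1
        return {"total_lines": len(self.lines), "levels": dict(levels)}

    def grep(self, pattern, limit=5):
        """Return at most `limit` lines matching the regex `pattern`."""
        regex = re.compile(pattern)
        hits = []
        for line in self.lines:
            if regex.search(line):
                hits.append(line)
                if len(hits) >= limit:
                    break
        return hits


def build_prompt(store, pattern):
    """Assemble the text that would be sent to the model: summary plus a few hits."""
    parts = [f"Overview: {store.overview()}", f"Matches for {pattern!r}:"]
    parts.extend(store.grep(pattern))
    return "\n".join(parts)


# Synthetic log lines for illustration only; not real CloudWatch output.
logs = [f"2024-01-01T00:00:{i % 60:02d} INFO request {i} ok" for i in range(1000)]
logs[500] = "2024-01-01T00:08:20 ERROR upstream timeout on request 500"

store = LogStore(logs)
prompt = build_prompt(store, r"ERROR")
raw_chars = sum(len(line) + 1 for line in logs)
print(len(prompt), "prompt characters vs", raw_chars, "raw characters")
```

In this sketch the model would see the overview and the single error line instead of all 1,000 lines. A real harness would presumably let the model choose the queries itself, which is beyond this example.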