Reinforcement Learning

  1.
    Researchers propose a new algorithm, Distributional DAgger, to improve reinforcement learning from rich feedback. This approach aims to reduce uncertainty in decision-making by leveraging distributional information.

  2.
    Researchers propose QUBRIC, a co-design framework for reinforcement learning (RL) that goes beyond verifiable rewards. QUBRIC combines queries and rubrics to enable RL in complex scenarios.

  3.
    Researchers propose a self-refining agentic reinforcement learning approach for vision-conditioned UAV navigation, which improves navigation performance and adaptability.

  4.
    Researchers propose a method to induce diverse behavior in reinforcement learning by incorporating reward uncertainty. This approach aims to improve exploration and decision-making in complex environments.

  5.
    Researchers found that reinforcement learning from human feedback (RLHF) can be configured in ways that optimize for misaligned biases in AI systems, according to a study posted on arXiv.