AI Safety

  1.
    0 points, 1 source, 1 minute ago

    Anthropic engineers have developed containment methods to limit the potential impact of Claude, the company's AI model used across multiple products.

  2.
    0 points, 1 source, 1 minute ago

    The Alignment Forum has announced the ARC White-Box Estimation Challenge, a competition to improve the estimation of AI models' capabilities. The challenge aims to advance the field of AI alignment and safety.

  3.
    0 points, 1 source, 1 minute ago

    OpenAI calls for global action on youth AI safety through a dedicated AI Safety Institute, emphasizing the need for safe and age-appropriate AI access to unlock new learning opportunities.

  4.
    0 points, 1 source, 1 minute ago

    Researchers at the Alignment Forum are testing Gemini models for potential scheming tendencies, a key concern in AI safety.

  5.
    0 points, 1 source, 3 days ago

    Researchers are searching for potential backdoors in Jane Street's large language models (LLMs), citing concerns about model safety and reliability. The investigation is ongoing, and no concrete findings have been reported yet.