This job posting has expired

Expired on April 1, 2026

Evaluation Scenario Writer - AI Agent Testing Specialist

Oman, Freelance, ? - 40 USD
Python, Git, JSON, YAML, LLM limitations knowledge

Job Description

Mindrift connects specialists with project-based AI opportunities for leading tech companies, focused on testing, evaluating, and improving AI systems. In this role, you will create structured test cases that simulate complex human workflows and define gold-standard behavior for AI agents.

Responsibilities

  • Create structured test cases that simulate complex human workflows
  • Define gold-standard behavior and scoring logic
  • Analyze agent logs, failure modes, and decision paths
  • Work with code repositories and test frameworks to validate scenarios

Qualifications

  • 3+ years of software development experience with strong Python focus
  • Experience with Git and code repositories
  • Familiarity with Docker
  • English proficiency - B2

Job Information

Posted: January 31, 2026

Status: Expired