Machine Learning Evaluation Specialist

$200–400/hr · Remote · Freelance · STEM

About the Role

What if your years of hard-earned research expertise could directly shape the future of AI? We're looking for domain experts with deep machine learning knowledge to design evaluation challenges that push state-of-the-art AI systems to their limits — the kind of problems only a true specialist could craft.

Your work won't sit in a drawer. It directly influences how the next generation of AI models is measured, trained, and improved.

  • Organization: Alignerr
  • Type: Hourly Contract
  • Location: Remote
  • Commitment: 10–40 hours/week

What You'll Do

  • Design complex, original machine learning problems rooted in your area of domain expertise
  • Create evaluation tasks that demand advanced knowledge well beyond standard ML pipelines
  • Draw from your own research experience to craft challenges that genuinely test highly capable AI models
  • Write clear problem statements, define evaluation criteria, and establish gold-standard solutions
  • Assess AI-generated solutions for correctness, creativity, and methodological rigor
  • Document problem difficulty, required domain knowledge, and expected failure modes
  • Collaborate asynchronously with a global team of researchers and engineers

Who You Are

  • Graduate-level expertise (MS or PhD preferred) in a scientific or technical discipline that intersects with machine learning
  • Strong working knowledge of ML methods — model selection, feature engineering, evaluation metrics, and pipeline design
  • Deep familiarity with active, open research problems in your field
  • A sharp eye for where general ML knowledge breaks down and specialized domain insight becomes essential
  • Experience publishing or conducting original research is highly valued
  • Excellent written communication — you can articulate complex, nuanced problems with precision and clarity
  • Self-motivated and energized by intellectually demanding, independent work

Example Domains

We welcome experts from a wide range of fields, including but not limited to:

  • Computational biology, genomics, or bioinformatics
  • Climate science and environmental modeling
  • Medical imaging and healthcare ML
  • Materials science and computational chemistry
  • Astrophysics and signal processing
  • Natural language processing for low-resource or specialized corpora
  • Robotics, control theory, or reinforcement learning in complex environments
  • Financial modeling and quantitative analysis

If your domain sits at the frontier of ML research, we want to hear from you.

Why Join Us

  • Work at the cutting edge — your challenges help define the boundaries of what AI can and cannot do
  • Make a real impact — your expertise directly shapes AI safety and evaluation research
  • Full autonomy — work on your own schedule, from anywhere in the world
  • Flexible commitment — scale hours up or down based on your availability
  • Ongoing opportunity — strong contributors are considered for contract extensions and deeper research involvement
  • Build your profile — establish yourself as a contributor to frontier AI development alongside top research labs