Machine Learning Evaluation Specialist

About the Role

What if your years of hard-earned research expertise could directly shape the future of AI? We're looking for domain experts with deep machine learning knowledge to design evaluation challenges that push state-of-the-art AI systems to their limits — the kind of problems only a true specialist could craft.

Your work won't sit in a drawer. It directly influences how the next generation of AI models is measured, trained, and improved.

Organization: Alignerr
Type: Hourly Contract
Location: Remote
Commitment: 10–40 hours/week

What You'll Do

Design complex, original machine learning problems rooted in your area of domain expertise
Create evaluation tasks that demand advanced knowledge well beyond standard ML pipelines
Draw from your own research experience to craft challenges that genuinely test highly capable AI models
Write clear problem statements, define evaluation criteria, and establish gold-standard solutions
Assess AI-generated solutions for correctness, creativity, and methodological rigor
Document problem difficulty, required domain knowledge, and expected failure modes
Collaborate asynchronously with a global team of researchers and engineers

Who You Are

Graduate-level expertise (MS or PhD preferred) in a scientific or technical discipline that intersects with machine learning
Strong working knowledge of ML methods — model selection, feature engineering, evaluation metrics, and pipeline design
Deep familiarity with active, open research problems in your field
A sharp eye for where general ML knowledge breaks down and specialized domain insight becomes essential
Experience publishing or conducting original research is highly valued
Excellent written communication — you can articulate complex, nuanced problems with precision and clarity
Self-motivated and energized by intellectually demanding, independent work

Example Domains

We welcome experts from a wide range of fields, including but not limited to:

Computational biology, genomics, or bioinformatics
Climate science and environmental modeling
Medical imaging and healthcare ML
Materials science and computational chemistry
Astrophysics and signal processing
Natural language processing for low-resource or specialized corpora
Robotics, control theory, or reinforcement learning in complex environments
Financial modeling and quantitative analysis

If your domain sits at the frontier of ML research, we want to hear from you.

Why Join Us

Work at the cutting edge — your challenges help define the boundaries of what AI can and cannot do
Make a real impact — your expertise directly shapes AI safety and evaluation research
Full autonomy — work on your own schedule, from anywhere in the world
Flexible commitment — scale hours up or down based on your availability
Ongoing opportunity — strong contributors are considered for contract extensions and deeper research involvement
Build your profile — establish yourself as a contributor to frontier AI development alongside top research labs